extractDomainName

http://jonathanaquino.com/extractDomainName.php?url=%s

SYNOPSIS
      extractDomainName [URL]

EXAMPLES
      extractDomainName http://www.amazon.com/
      returns: amazon.com

      extractDomainName eemadges.com
      returns: eemadges.com

      extractDomainName http://en.wikipedia.org?search=%s
      returns: en.wikipedia.org

      extractDomainName http://seek.sing365.com:8080/cgi-bin/s.cgi?q=ladytron 
      returns: sing365.com

      extractDomainName https://www.cia.gov/cia/publications/factbook/geos/.html 
      returns: cia.gov (thanks to Frank Raiser for noticing the https bug!)

DESCRIPTION
      Extracts the domain name from the given URL.

      It's a tad more complex than that. I built this command as a building block for another command (">"), so it has some quirks that fit my needs. For instance, I usually want the domain with all of its subdomains (e.g. en.wikipedia.org, not just wikipedia.org), unless the subdomain corresponds to a search subsection of a website (e.g. I prefer nytimes.com to query.nytimes.com). Details are in the code below.

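      Here's the basic regexp behind extractDomainName:
      def extractDomainName(url)
          # Capture the host (with any port) that follows an optional scheme.
          r = url=~(/^(?:\w+:\/\/)?([^\/?]+)(?:\/|\?|$)/) ? $1 : 'Not a valid URL!'
          # Drop a www/seek/query/search label; \2 is the rest of the host.
          r = r.gsub(/((?:www)|(?:seek)|(?:query)|(?:search))\.(([^\.]+)\.([^\.]+)(\.([^\.]+))?)/, '\2')
          # Strip the port. Plain gsub is used because gsub! returns nil when nothing changes.
          r.gsub(/\:\d+$/, '')
      end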
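
      The three steps (capture the host, drop a search-style prefix, strip the port) can be checked against the EXAMPLES above with a small script. This is only a sketch: the name domain_of is hypothetical, the query.nytimes.com URL is an assumed input based on the preference stated in the description, and the prefix regexp is a slightly condensed form of the one shown here rather than the deployed code.

```ruby
# Sketch only: `domain_of` is a hypothetical name, not the deployed command.
def domain_of(url)
  # Step 1: capture host[:port] after an optional scheme such as http://.
  return 'Not a valid URL!' unless url =~ /^(?:\w+:\/\/)?([^\/?]+)(?:\/|\?|$)/
  host = $1
  # Step 2: drop a www/seek/query/search label when at least two labels follow.
  host = host.gsub(/(?:www|seek|query|search)\.([^.]+\.[^.]+(?:\.[^.]+)?)/, '\1')
  # Step 3: strip a trailing port; non-bang gsub always returns a String.
  host.gsub(/:\d+$/, '')
end

# Inputs and expected results from the EXAMPLES section,
# plus one assumed URL for the query.nytimes.com case.
CASES = {
  'http://www.amazon.com/' => 'amazon.com',
  'eemadges.com' => 'eemadges.com',
  'http://en.wikipedia.org?search=%s' => 'en.wikipedia.org',
  'http://seek.sing365.com:8080/cgi-bin/s.cgi?q=ladytron' => 'sing365.com',
  'https://www.cia.gov/cia/publications/factbook/geos/.html' => 'cia.gov',
  'http://query.nytimes.com/' => 'nytimes.com'
}

CASES.each do |url, expected|
  status = domain_of(url) == expected ? 'ok' : 'MISMATCH'
  puts "#{status}: #{url} -> #{domain_of(url)}"
end
```

      Note that the prefix match is not anchored to the start of the host, so a host that merely ends a label with one of those words (say, a made-up mysearch.example.com) would also be rewritten. Anchoring the pattern would be one way to tighten it.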

      Please email me (ely[dot]parra[gmail]) if you find bugs or have suggestions. 

-elzr.com



==========
Old implementation:
http://eemadges.com/extractDomainName?id=%s

36522 uses - Created 2006-01-21 22:33:14 - Last used 2024-04-10 08:19:02
