extract domain from URL

Tags: regex, csharp, com, algorithms, tutorial
10 posts, 6 posters
    Eli Nurman wrote (#1):

    Hi guys. I'm going crazy searching the Internet for a solution. How do I extract the domain from a URL or Uri? The host name (Uri.Host) is not good enough, since I want to remove all sub-domains. I found a few regex solutions, but all of them are for single-label top-level domains like .com, .org and .net, and I need a solution for co.uk and all other domain types. I'm creating an app that only allows browsing to certain domains that the user chooses. The user could add "*.google.co.uk" to allow all sub-domains of google.co.uk, but I can't match "search.something.google.co.uk" against "*.google.co.uk" because I can't extract just the domain. If anyone knows a solution for this, or how to compare a host to a wildcard domain, please reply. Thanks

      musefan wrote (#2), replying to Eli Nurman (#1):

      I'm not sure what the Uri.Host string looks like or which kinds of addresses you could get, so I can't suggest a method for parsing. But you could split the string on '.' and then loop over the parts looking for a match against predefined values, i.e. if one part is 'google' then it's a Google website, etc. Again, I'm not sure how reliable this would be, but it could be a temporary solution until you find a better way.

        Eli Nurman wrote (#3), replying to musefan (#2):

        I still need a solution.

          musefan wrote (#4), replying to Eli Nurman (#3):

          OK then: make a list of all the '.com', '.co.uk' endings (whatever you call them) in the world. Then search the end of your host name for a match, and the domain is the part just before the '.com' etc., together with that ending. That's a solution, right?
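The list-based idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `DomainExtractor`, the method `GetRegistrableDomain`, and the tiny hard-coded suffix set are all hypothetical, and a real application would need a complete, maintained list of suffixes rather than the handful shown here.

```csharp
using System;
using System.Collections.Generic;

public static class DomainExtractor
{
    // Hypothetical, deliberately tiny suffix list for illustration only.
    // A real application would load a complete, maintained list instead.
    private static readonly HashSet<string> KnownSuffixes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "com", "org", "net", "uk", "co.uk", "org.uk"
        };

    // Returns the longest known suffix plus the one label in front of it.
    // Returns null when the host is itself a suffix or ends in no known suffix.
    public static string GetRegistrableDomain(string host)
    {
        string[] labels = host.TrimEnd('.').Split('.');

        // Try candidate suffixes from longest to shortest.
        for (int i = 0; i < labels.Length; i++)
        {
            string suffix = string.Join(".", labels, i, labels.Length - i);
            if (KnownSuffixes.Contains(suffix))
            {
                // i == 0 means the whole host is a suffix, so there is no domain.
                return i == 0 ? null : labels[i - 1] + "." + suffix;
            }
        }
        return null;
    }
}
```

With this list, `DomainExtractor.GetRegistrableDomain("search.something.google.co.uk")` yields `google.co.uk`, because `co.uk` is tried before the shorter `uk`. Checking longer suffixes first is what keeps multi-label endings from being cut down to their last label.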

            Steve_ wrote (#5), replying to musefan (#4):

            @musefan .com etc are known as top-level domains.

              Guffa wrote (#6), replying to Eli Nurman (#1):

              Getting the domain name from the host name is easy. It's just the last two labels separated by a period. Except for the .co.uk domains of course... (Are there any other weird domains like this in the system?) Why not create a regular expression from the wildcard string by replacing * with .+? and escaping the rest? That would give even greater flexibility, as the user could search for something like "www.*.google.com".

              Despite everything, the person most likely to be fooling you next is yourself.
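A minimal sketch of the wildcard-to-regex suggestion above, assuming patterns use `*` as the only wildcard character. The class and method names (`WildcardHost`, `ToRegex`, `IsMatch`) are hypothetical; the anchoring and case-insensitivity are choices made for this example, not something stated in the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WildcardHost
{
    // Builds an anchored, case-insensitive regex from a wildcard pattern:
    // every '*' becomes ".+?" and everything else is matched literally.
    public static Regex ToRegex(string wildcard)
    {
        string escaped = Regex.Escape(wildcard);      // "*" becomes "\*", "." becomes "\."
        string body = escaped.Replace(@"\*", ".+?");  // turn the escaped star back into a wildcard
        return new Regex("^" + body + @"\z",
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }

    // Convenience wrapper: does the host match the wildcard pattern?
    public static bool IsMatch(string wildcard, string host)
    {
        return ToRegex(wildcard).IsMatch(host);
    }
}
```

With this sketch, `"*.google.co.uk"` matches `search.something.google.co.uk` but not the bare `google.co.uk` (the `*` needs at least one character before the dot) and not `evilgoogle.co.uk` (the dot before `google` is literal). Because `.+?` can span dots, a single `*` covers any number of sub-domain levels.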

                musefan wrote (#7), replying to Steve_ (#5):

                TY

                  realJSOP wrote (#8), replying to Eli Nurman (#3):

                  This isn't hard at all...

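                  using System; // for StringSplitOptions
                  string url = "http://abc.com/default.aspx";
                  // Split(string) only exists on newer .NET; this overload compiles on older frameworks too.
                  string[] parts = url.Split(new[] { "//" }, StringSplitOptions.None);
                  // Everything up to the first '/' after the scheme is the host (plus any port).
                  string[] parts2 = parts[1].Split('/');
                  string domain = parts2[0];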

                  If you need to further parse this, just do domain.Split('.') and take the last two array items, and you get the root domain (as long as the ending is a single label like .com). All you have to do is *think*.
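For comparison, here is a hedged sketch of the same split-on-dots idea that lets `System.Uri` do the URL parsing and then keeps the last two labels of the host. The names `HostParsing` and `LastTwoLabels` are hypothetical. It is only correct for single-label endings such as .com, and it does not special-case IP addresses.

```csharp
using System;

public static class HostParsing
{
    // Lets System.Uri parse the URL, then keeps the last two labels of the host.
    // Only correct for single-label endings such as .com, .org or .net.
    public static string LastTwoLabels(string url)
    {
        string host = new Uri(url).Host;   // e.g. "www.abc.com"
        string[] labels = host.Split('.');
        if (labels.Length <= 2)
        {
            return host;                   // already "abc.com" or a single label
        }
        return labels[labels.Length - 2] + "." + labels[labels.Length - 1];
    }
}
```

For `http://www.abc.com/default.aspx` this gives `abc.com`, but for `http://search.google.co.uk/` it gives `co.uk`, which is exactly the multi-label case the original question is about.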

                  "Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997
                  -----
                  "...the staggering layers of obscenity in your statement make it a work of art on so many levels." - Jason Jystad, 10/26/2001

                    Eli Nurman wrote (#9), replying to Guffa (#6):

                    That is the answer I was looking for, thanks.

                      Henry Minute wrote (#10), replying to Eli Nurman (#1):

                      I found this[^] useful as a starting point when thinking about this type of thing. Hope it is useful.

                      Henry Minute Do not read medical books! You could die of a misprint. - Mark Twain Girl: (staring) "Why do you need an icy cucumber?"
