Using a C# regex to parse a domain name?
I need to parse the domain name from a string. The string can vary and I need the exact domain.
Examples of strings:
http://somename.de/
www.somename.de/
somename.de/
somename.de/somesubdirectory
www.somename.de/?pe=12
I need it in the following format with just the domain name, the TLD, and the www, if applicable:
www.somename.de
How do I do that using C#?
As an alternative to a regex solution, you can let the System.Uri class parse the string for you. You just have to make sure the string contains a scheme.
string uriString = "http://www.google.com/search";
if (!uriString.Contains(Uri.SchemeDelimiter))
{
uriString = string.Concat(Uri.UriSchemeHttp, Uri.SchemeDelimiter, uriString);
}
string domain = new Uri(uriString).Host;
This solution also filters out any port numbers and converts IPv6 addresses to their canonical form.
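To run the same idea over every input form listed in the question, the scheme check can be wrapped in a small helper. This is only a sketch: `DomainParser` and `TryGetHost` are hypothetical names, not part of the framework, and it assumes that input without a scheme should be treated as http. It uses `Uri.TryCreate` instead of the constructor so that malformed input returns false rather than throwing.

```csharp
using System;

public static class DomainParser
{
    // Hypothetical helper: extracts the host from a string that may or may not
    // start with a scheme. Returns false when the input cannot be parsed.
    public static bool TryGetHost(string input, out string host)
    {
        host = string.Empty;
        if (string.IsNullOrWhiteSpace(input))
        {
            return false;
        }

        string candidate = input.Trim();

        // Assumption: anything without "://" is treated as an http URL.
        if (!candidate.Contains(Uri.SchemeDelimiter))
        {
            candidate = string.Concat(Uri.UriSchemeHttp, Uri.SchemeDelimiter, candidate);
        }

        // TryCreate reports failure instead of throwing UriFormatException.
        if (Uri.TryCreate(candidate, UriKind.Absolute, out Uri uri))
        {
            host = uri.Host;
            return true;
        }

        return false;
    }
}
```

For example, calling `DomainParser.TryGetHost("www.somename.de/?pe=12", out string host)` should leave `host` set to `www.somename.de`, and `somename.de/somesubdirectory` should give `somename.de`.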
I simply used:
Uri uri = new Uri("http://www.google.com/search?q=439489");
string host = uri.Host; // Host is already a string, so no ToString() is needed
return host;
because this way you can be sure of the result.
I checked out Regular Expression Library, and it looks like something like this might work for you:
^(([\w][\w\-\.]*)\.)?([\w][\w\-]+)(\.([\w][\w\.]*))?$
Try this:
^(?:\w+://)?([^/?]*)
This is a weak regex: it doesn't validate the string but assumes it is already a URL, and it captures everything up to the first slash while ignoring the protocol. To get the domain, look at the first captured group, for example:
string url = "http://www.google.com/hello";
Match match = Regex.Match(url, @"^(?:\w+://)?([^/?]*)");
string domain = match.Groups[1].Value;
As a bonus, it also captures until the first ?, so the URL google.com?hello=world will work as expected.
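The same pattern can be put into a reusable method and tried against the strings from the question. This is a minimal sketch, with `RegexDomainParser` and `GetHost` as made-up names. Note that, unlike the System.Uri approach, the capture group keeps anything that sits before the first slash or question mark, so a port such as `:8080` stays in the result.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexDomainParser
{
    // Optional "scheme://" prefix, then capture everything up to the first '/' or '?'.
    private static readonly Regex HostPattern =
        new Regex(@"^(?:\w+://)?([^/?]*)", RegexOptions.Compiled);

    // Hypothetical helper: returns the first captured group, which is the host
    // (plus port, if the input has one). No validation is performed.
    public static string GetHost(string input)
    {
        Match match = HostPattern.Match(input);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }
}
```

With this, `RegexDomainParser.GetHost("http://somename.de/")` gives `somename.de`, and `RegexDomainParser.GetHost("google.com?hello=world")` gives `google.com`.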