Last Modified date of an External webpage
-
I am trying to retrieve the last modified date of an external webpage. I have found numerous ways to do this in VB, Java, etc., but not in C#. Here are some of them: VB - http://www.freevbcode.com/ShowCode.asp?ID=362 ASP (so basically the same thing) - http://www.webdeveloper.com/forum/archive/index.php/t-3706.html I've tried using HttpWebRequest and HttpWebResponse, since HttpWebResponse has a LastModified property, but this is when the response was last modified, not the page. So I'm stuck... Any help would be awesome :) --Peter
-
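You can call Response.GetResponseHeader("Last-Modified") to get the time, but then you have to parse it, and HTTP allows three standard date formats (RFC 1123, RFC 850, and asctime). Luckily I wrote a parser for a web server a while ago, so here it is:

    // The three date formats HTTP allows. This array was not shown in the
    // original post; the patterns below are an assumed reconstruction.
    static readonly string[] httpDateTimeFormats =
    {
        "ddd, dd MMM yyyy HH':'mm':'ss 'GMT'",  // RFC 1123
        "dddd, dd-MMM-yy HH':'mm':'ss 'GMT'",   // RFC 850
        "ddd MMM d HH':'mm':'ss yyyy"           // asctime
    };

    static DateTime ParseHttpTime(string str)
    {
        DateTime dt;
        try
        {
            dt = DateTime.ParseExact(str, httpDateTimeFormats, System.Globalization.DateTimeFormatInfo.InvariantInfo,
                System.Globalization.DateTimeStyles.AllowWhiteSpaces | System.Globalization.DateTimeStyles.AdjustToUniversal);
        }
        catch (FormatException)
        {
            // Fall back to the general-purpose parser for non-standard dates.
            dt = DateTime.Parse(str, System.Globalization.CultureInfo.InvariantCulture);
        }
        return dt;
    }

The method can throw an ArgumentNullException if the input is null and a FormatException if the input isn't formatted properly. Keep in mind that a missing or unparseable value happens a lot: most servers don't seem to return a Last-Modified value these days.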
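Since a missing Last-Modified header is so common, a non-throwing variant may be easier to call than ParseHttpTime. The following is only a rough sketch, not the parser from the web server mentioned above: the HttpDate class, its TryParse method, and the Formats array are names made up for this example, and the format strings are my assumption of the three HTTP date layouts.

```csharp
using System;
using System.Globalization;

static class HttpDate
{
    // Assumed patterns for the three HTTP date layouts (hypothetical list,
    // not taken from the original post).
    static readonly string[] Formats =
    {
        "ddd, dd MMM yyyy HH':'mm':'ss 'GMT'",  // Sun, 06 Nov 1994 08:49:37 GMT
        "dddd, dd-MMM-yy HH':'mm':'ss 'GMT'",   // Sunday, 06-Nov-94 08:49:37 GMT
        "ddd MMM d HH':'mm':'ss yyyy"           // Sun Nov  6 08:49:37 1994
    };

    // Returns false instead of throwing when the header value is
    // null, empty, or not in one of the formats listed above.
    public static bool TryParse(string header, out DateTime utc)
    {
        utc = DateTime.MinValue;
        if (string.IsNullOrEmpty(header))
            return false;

        // HTTP dates are always GMT, so treat the parsed value as UTC.
        return DateTime.TryParseExact(
            header, Formats, CultureInfo.InvariantCulture,
            DateTimeStyles.AllowWhiteSpaces
                | DateTimeStyles.AssumeUniversal
                | DateTimeStyles.AdjustToUniversal,
            out utc);
    }
}
```

You would pass it the string returned by GetResponseHeader("Last-Modified") and check the bool result before trusting the date. Unlike ParseHttpTime, this sketch does not fall back to DateTime.Parse, so non-standard dates are simply reported as unparseable.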
-
So if the server doesn't return a Last-Modified value, as you said most servers don't these days, is it still possible to retrieve the last modified date of a page by some other means? Or am I just out of luck? --Peter
I can't think of any other ways... :(