2005-02-24

An Exception to the Rule: .NET's HttpWebResponse

There's something funny about parts of the .NET Framework.

For one, I'd like to know why they waited until version 2.0 to add a high-resolution timing class. The "Stopwatch" class should tell you exactly how long parts of your program take to run. No more of this dt1 = DateTime.Now; do_something(); dt2 = DateTime.Now; ts = dt2.Subtract(dt1);

Or is it dt1.Subtract(dt2)? Hmmm. Stopwatch looks to be able to eliminate that problem with a vengeance.
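
(For the record, it's dt2.Subtract(dt1): later minus earlier.) Here's a minimal sketch of what the Stopwatch version should look like, assuming the System.Diagnostics.Stopwatch class as documented for 2.0. DoSomething() is a made-up placeholder for whatever you're timing, not anything from the framework.

  using System;
  using System.Diagnostics;
  using System.Threading;

  class TimingSketch {
    // Hypothetical stand-in for the code being timed.
    static void DoSomething() {
      Thread.Sleep(50);
    }

    static void Main() {
      Stopwatch sw = Stopwatch.StartNew();
      DoSomething();
      sw.Stop();

      // The stopwatch tracks elapsed time itself, so there is no
      // subtraction order to get wrong.
      Console.WriteLine("took {0} ms", sw.ElapsedMilliseconds);

      // False means the machine has no high-resolution counter and
      // Stopwatch is falling back to the ordinary system timer.
      Console.WriteLine("high-res timer: {0}", Stopwatch.IsHighResolution);
    }
  }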

Second thing: I've discovered, much to my chagrin, that HttpWebRequest.GetResponse() throws a WebException for no reason other than the response from the host wasn't a "200 OK". WTF?

Let's look at an example in C#.

  // We want to fetch a file called "somewhere.txt"
  HttpWebRequest req = (HttpWebRequest) WebRequest.Create("http://some/file/somewhere.txt");

  // But we don't want to wait more than 1 minute.
  // 1 minute is 60 seconds is 60000 milliseconds.
  req.Timeout = 60000;

  // And let's be nice to the host and not fetch the file unless it
  // is newer than our cached copy. The cached copy has a
  // modification time stored in dtLastFetched.
  req.IfModifiedSince = dtLastFetched;

  HttpWebResponse response = (HttpWebResponse) req.GetResponse();

Already, we're in trouble. Let's assume that "somewhere.txt" is infrequently updated, like a list of DNS root servers or a table of leap seconds. For reasons I cannot comprehend, req.GetResponse() will throw a WebException if the file has not been modified since the date given by req.IfModifiedSince. Shouldn't the response, in this case a perfectly understandable status code, still be considered a valid response? Why does "304 Not Modified", which isn't even an error, equal "drop everything and throw an exception"?

The If-Modified-Since header is designed to save bandwidth: it explicitly tells the host to skip the body and answer with a "304 Not Modified" status if the file requested isn't newer than the value given. And yet when GetResponse() sees the very 304 it asked for, it pitches a fit. It gets worse: you, the programmer, have to handle the WebException.

I've worked around it for now by writing a miserable wrapper around the exception and checking the status code there:

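  // Declared outside the try so it is still in scope afterwards.
  HttpWebResponse response = null;
  try {
    response = (HttpWebResponse) req.GetResponse();
  }
  catch (WebException we) {
    // we.Response is a plain WebResponse, and it is null when the
    // failure wasn't an HTTP-level one (timeout, DNS, and so on).
    HttpWebResponse re2 = we.Response as HttpWebResponse;
    if (re2 != null && re2.StatusCode == HttpStatusCode.NotModified) {
      Console.WriteLine("file not modified");
      re2.Close();
      continue; // or return, or break, or whatever
    }
    throw; // anything else is still a real error
  }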

What I would prefer is for the response to be just that: a response, one that says yes, in fact, something went wrong and everything is not "200 OK". Exceptions are expensive, and there is a good chance you can inadvertently break something trying to compensate for every possible exception that every class can throw. I think a better way to do it is to have GetResponse() return the HttpWebResponse anyway, with the error status and everything, and urge the programmer to acknowledge that their code isn't guaranteed to receive a "200 OK". This way the program keeps a good flow (all possible status codes are checked at one place in the code) and you don't have the complexity of try-throw-catch problems to deal with.

  HttpWebResponse response = (HttpWebResponse) req.GetResponse();

  if (response.StatusCode == HttpStatusCode.OK) {
    dosomething();
  }
  else if (response.StatusCode == HttpStatusCode.NotModified) {
    die_gracefully_notmodified();
  }
  else {
    die_gracefully_otherreason();
  }

  // Always try to close the connection.
  response.Close();
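
Until the framework behaves that way, you can get close with a small helper. This is only a sketch under my own assumptions: GetResponseNoThrow is a hypothetical name, not part of .NET, and the If-Modified-Since date below is a stand-in for dtLastFetched. It hands back the HttpWebResponse carried inside the WebException whenever the host actually answered, and rethrows only when there is no HTTP response at all (timeouts, DNS failures and the like).

  using System;
  using System.Net;

  class FetchSketch {
    // Hypothetical helper: returns the HTTP response even when the
    // status isn't a success code. Non-HTTP failures still throw.
    static HttpWebResponse GetResponseNoThrow(HttpWebRequest req) {
      try {
        return (HttpWebResponse) req.GetResponse();
      }
      catch (WebException we) {
        HttpWebResponse failed = we.Response as HttpWebResponse;
        if (failed == null) {
          throw; // the host never answered, so nothing to hand back
        }
        return failed;
      }
    }

    static void Main() {
      HttpWebRequest req = (HttpWebRequest) WebRequest.Create("http://some/file/somewhere.txt");
      req.Timeout = 60000;
      req.IfModifiedSince = DateTime.Now.AddDays(-1); // stand-in for dtLastFetched

      HttpWebResponse response = GetResponseNoThrow(req);
      try {
        if (response.StatusCode == HttpStatusCode.OK) {
          Console.WriteLine("got a fresh copy");
        }
        else if (response.StatusCode == HttpStatusCode.NotModified) {
          Console.WriteLine("file not modified");
        }
        else {
          Console.WriteLine("unexpected status: {0}", (int) response.StatusCode);
        }
      }
      finally {
        // Always try to close the connection.
        response.Close();
      }
    }
  }

The exception still gets thrown under the hood, so this doesn't make the 304 case any cheaper. It just keeps all the status checks in one place.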
