Web Authoring Statistics: HTTP Headers

The data we collected for HTTP headers was mostly an afterthought, and as such it isn't very reliable. Here are some things we noticed, though:

  • text/html documents without a charset parameter in the Content-Type header outnumber those with such a parameter by almost a factor of two (despite the HTML4 spec saying that UAs must not assume any default value for the "charset" parameter).

  • Documents with the text/xml MIME type outnumber documents with the application/xml MIME type by at least three to one (despite the fact that the former is discouraged by the XML standards community because of the rules for how to handle character sets with those MIME types).

  • There are only twice as many text/plain documents out there as application/msword documents (and that doesn't take into account the fact that text/plain is the default MIME type of some servers while many application/msword documents will end up labelled as something else).

  • The Set-Cookie header (which is one of the ten most-used headers) is present on about two orders of magnitude more pages than the Set-Cookie2 header (despite the former being considered insecure).

  • A significant number of pages include an X-Pingback header (more than the number of pages with the Set-Cookie2 header). In fact, X-Pingback was the 30th most-seen header in our data sample. The X-Pingback header is part of Pingback, a blogging technology for tracking responses, similar to Trackback.

  • There are pages that use the Window-Target header, and even some that use the Link header (though we haven't yet checked what for!). There are even some pages that include the Content-Style-Type header.
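
For readers who want to run similar counts over their own crawl, here is a minimal sketch of the kind of tally behind the observations above. It is not the code used for this survey: the input shape (one dict of header names to values per response) and the names parse_content_type and tally are assumptions made for illustration. It leans on Python's standard email.message.Message to split a Content-Type value into its MIME type and charset parameter.

```python
from collections import Counter
from email.message import Message


def parse_content_type(value):
    """Return (mime_type, charset or None) for a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_param("charset")


def tally(responses):
    """Count header names, MIME types, and text/html charset labelling.

    `responses` is an iterable of dicts mapping header names to values.
    This input shape is hypothetical, not the format of the original survey.
    """
    header_counts = Counter()
    mime_counts = Counter()
    html_charset = Counter()
    for headers in responses:
        # Header names are case-insensitive, so normalise before counting.
        normalised = {name.lower(): value for name, value in headers.items()}
        header_counts.update(normalised.keys())
        content_type = normalised.get("content-type")
        if content_type is None:
            continue
        mime, charset = parse_content_type(content_type)
        mime_counts[mime] += 1
        if mime == "text/html":
            key = "with charset" if charset else "without charset"
            html_charset[key] += 1
    return header_counts, mime_counts, html_charset
```

Calling header_counts.most_common(30) on the first return value gives a ranking of the sort used to place X-Pingback, and comparing mime_counts["text/xml"] with mime_counts["application/xml"] gives the ratio discussed above. Note that a dict keeps only one value per header name, so repeated headers such as Set-Cookie are counted once per page, which matches counting pages rather than occurrences.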