Web Authoring Statistics: The a element

The a element carries the attribute used on more pages than any other attribute on any element. The twenty attributes seen most often on a elements are:

href, target, class, name, title, onclick, onmouseover, onmouseout, style, id, accesskey, rel, alt, onfocus, "", tabindex, border, onmousedown, language, and ;.

In addition to the almost ubiquitous use of href, around half of the pages in the sample had at least one a element with a target attribute. Determining whether these are mostly attempts to make links open in new windows or tabs, or whether they indicate frames use, would require further research. The former seems more likely, however, since frames in general aren't used much.
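The further research mentioned above could start from something like the following sketch, which buckets target values by what they most plausibly mean. It is only an illustration: the class TargetCollector, the functions classify_target and summarize_targets, and the bucket names are all hypothetical, and Python's standard html.parser stands in for whatever parser the study actually used.

```python
from collections import Counter
from html.parser import HTMLParser


class TargetCollector(HTMLParser):
    """Collects the target values found on a elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # value is None when the attribute is written without "=value"
            if name == "target" and value is not None:
                self.targets.append(value.strip().lower())


def classify_target(value):
    """Assumed bucketing; the labels are illustrative, not from the study."""
    if value == "_blank":
        return "new window"
    if value in ("_top", "_parent"):
        return "frame navigation"
    if value == "_self":
        return "same context"
    # Any other name could be a named window or a named frame;
    # markup alone cannot tell the two apart.
    return "named window or frame"


def summarize_targets(html):
    """Counts how many a elements in one document fall into each bucket."""
    collector = TargetCollector()
    collector.feed(html)
    collector.close()
    return Counter(classify_target(t) for t in collector.targets)
```

Calling summarize_targets on a page's markup returns a Counter keyed by bucket. The "named window or frame" bucket stays ambiguous by design, which is one reason a real answer would also need to look at whether the page is part of a frameset.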

A lot of pages use the title attribute on a elements, which is good. On the other hand, the high frequency of occurrences of onmouseover on a elements is a little worrying; presumably those are mostly cases of the status bar being overridden. If so, the fact that there are noticeably fewer onmouseout attributes in use on a elements is probably a sign of rampant bad UI. Thankfully for users, most Web browsers these days prevent the status bar from being changed by scripts.

The rel attribute is not used all that much, but it is still used enough to matter. Here are the most common rel values:

nofollow, license, external, bookmark, tag, contents, start, help, alternate, and sidebar.

The nofollow value was originally announced by Google in conjunction with some other vendors, but has apparently now gotten much wider industry support. license is part of the rel-license microformat, and is propagated by the Creative Commons movement. external seems to be mainly propagated by WordPress, but people have long been asking for a way to label their links as being external vs internal. bookmark is the de-facto "permalink" link type (which we all pretend is what HTML4 meant by its ridiculously vague definition). tag is a big part of the recent "Web 2.0 collaborative remixability social tagging" trend. contents, start, and help are HTML4 values, with relatively well-defined meanings (though it is possible that the reason contents and start have roughly the same popularity is that a lot of people are writing <a rel="contents start">). alternate is mostly used for feeds. sidebar is an Opera and Firefox value that makes the link appear in the browser's sidebar.
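Because rel holds a space-separated list, a link such as <a rel="contents start"> contributes to two values at once. The sketch below shows one way to count rel values by number of pages rather than by number of links. It assumes each page is available as a string of markup; RelCollector and pages_per_rel_token are hypothetical names, and the study's actual counting method is not described in this article.

```python
from collections import Counter
from html.parser import HTMLParser


class RelCollector(HTMLParser):
    """Gathers the distinct rel tokens used on a elements in one page."""

    def __init__(self):
        super().__init__()
        self.tokens = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "rel" and value:
                # rel is a space-separated token list; compare case-insensitively
                self.tokens.update(value.lower().split())


def pages_per_rel_token(pages):
    """Maps each rel token to the number of pages that use it at least once."""
    counts = Counter()
    for html in pages:
        collector = RelCollector()
        collector.feed(html)
        collector.close()
        # tokens is a set, so a page counts once per token however often it repeats
        counts.update(collector.tokens)
    return counts
```

Under this counting, a site that writes rel="contents start" on every page raises both values by the same amount, which is consistent with the explanation suggested above for their similar popularity.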

The ""="" and ;="" attributes, which are apparently relatively common, are indicative of bad markup. There's a lot of that about. Another "attribute" that is seen a lot on a elements (though not in the top twenty) is http:. These come from things like this:

<a href http://www.example.com/; title "">...</a>

This example would result in an a element with six attributes: href, http:, www.example.com, ;, title and "". For similar reasons, the "attributes" location.href, return, and false are used measurably often on this element.
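To see how a lenient parser can turn missing equals signs and quotes into extra attributes, the tag above can be fed to a parser that reports attribute names. The sketch below uses Python's standard html.parser purely as a stand-in; it is not the parser used in the study, and other parsers may split the tag differently. AttributeNames and attribute_names are hypothetical names.

```python
from html.parser import HTMLParser


class AttributeNames(HTMLParser):
    """Records the attribute names seen on a elements, in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.names.extend(name for name, _value in attrs)


def attribute_names(markup):
    """Returns the attribute names a lenient parser finds on a elements."""
    parser = AttributeNames()
    parser.feed(markup)
    parser.close()
    return parser.names


# The broken tag from the text, and what the author presumably meant.
broken = '<a href http://www.example.com/; title "">...</a>'
well_formed = '<a href="http://www.example.com/" title="">...</a>'

print(attribute_names(well_formed))
print(attribute_names(broken))
```

The well-formed tag yields only href and title. For the broken tag, the parser has no equals signs to pair names with values, so fragments of the URL such as http: are reported as attribute names in their own right.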

Other strange attributes that are used on a elements include alt (a lot!), border, nowrap, width, and valign (none of which have any effect on browsers, but all of which were probably intended for other elements in the vicinity of the a). Interestingly, all of these are used on more pages than the rev attribute, which is used so rarely that it didn't even appear in the top fifty attributes on the a element in our study.

The coords attribute is, much like the rev attribute, basically never used. This indicates that the ability to use image maps with a elements instead of area elements is simply not used. In contrast, the type and hreflang attributes are used a little.

From the point of view of changes to the specifications, these findings are quite important. The rarity of rev and coords suggests that those features could be removed from HTML without any difficulty. In contrast, the ping attribute, proposed in HTML5, didn't appear on the list at all, so it is likely that adding it will not cause any problems on existing sites.