MOMspider --
Making Document Metainformation Visible

As MOMspider traverses an infostructure, it can read and parse any HTML metainformation identified via the META element, as proposed for the HTML 2.0 specification. This document contains two such examples:
    <META http-equiv="Owner" content="RTF">
    <META http-equiv="Reply-To" content="fielding@ics.uci.edu">
which can also be seen by viewing the HTML source.

Metainformation obtained in this manner can be stored by MOMspider and included in the HTML index it outputs. As shipped, MOMspider stores only the META elements tagged as "Expires", "Owner", and "Reply-To". However, it is very easy to extend MOMspider so that it will look for and store other named metainformation. Possibilities include IAFA index items for building site description files, graphical coordinates for building spatial maps of webspace, etc.
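
The selective storage described above can be illustrated with a short Python sketch. This is not MOMspider's own code; the names STORED_NAMES, MetaCollector, and collect_meta are hypothetical, and the sketch assumes that names are compared case-insensitively.

```python
from html.parser import HTMLParser

# Names stored by default, per the text above. Kept lowercase because this
# sketch assumes case-insensitive comparison.
STORED_NAMES = {"expires", "owner", "reply-to"}


class MetaCollector(HTMLParser):
    """Collects META name/content pairs whose name is in `wanted`."""

    def __init__(self, wanted=STORED_NAMES):
        super().__init__()
        self.wanted = {n.lower() for n in wanted}
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)  # HTMLParser delivers attribute names in lowercase
        name = a.get("http-equiv") or a.get("name")
        content = a.get("content")
        if name and content is not None and name.lower() in self.wanted:
            self.found[name.lower()] = content


def collect_meta(html_text, wanted=STORED_NAMES):
    """Return a dict of the wanted metainformation found in html_text."""
    parser = MetaCollector(wanted)
    parser.feed(html_text)
    parser.close()
    return parser.found
```

Under these assumptions, storing another kind of metainformation only requires passing a larger set of names, for example collect_meta(text, STORED_NAMES | {"indextype"}).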

MOMspider would be far better able to maintain large, distributed infostructures if this metainformation were parsed by the server and provided within the response headers to an HTTP HEAD request. This would allow metainformation to be obtained from remote servers without ever having to retrieve the file using a GET request.
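
As a rough sketch of what a client could do if servers behaved this way, the Python below issues a HEAD request and keeps only the headers of interest. The function names and the WANTED list are assumptions made for illustration, and the sketch presumes a server that actually emits such headers.

```python
import http.client

# Header names of interest (an assumption, mirroring the names MOMspider stores).
WANTED = ("Expires", "Owner", "Reply-To")


def select_metainfo(header_pairs, wanted=WANTED):
    """Pick the wanted headers out of (name, value) pairs, ignoring case."""
    wanted_lower = {w.lower() for w in wanted}
    return {
        name.lower(): value
        for name, value in header_pairs
        if name.lower() in wanted_lower
    }


def head_metainfo(host, path, port=80, wanted=WANTED):
    """Issue a HEAD request and return the wanted response headers.

    No document body is transferred, which is the point of using HEAD.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return select_metainfo(response.getheaders(), wanted)
    finally:
        conn.close()
```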

The META Element in HTML

Purpose

The META element can be used within the HEAD element to embed document metainformation not defined by other HTML elements. Such information can be extracted by servers/clients for use in identifying, indexing, and cataloging specialized document metainformation.

Although it is generally preferable to use named elements which have well-defined semantics for each type of metainformation (e.g. TITLE), this element is provided for situations where strict SGML parsing is necessary and the local DTD is not extensible.

In addition, HTTP servers can read the content of the document HEAD to generate response headers corresponding to any elements defining a value for the attribute HTTP-EQUIV. This provides document authors a mechanism (not necessarily the preferred one) for identifying information which should be included in the response headers for an HTTP request.

The attributes of the META element are:

HTTP-EQUIV
This attribute binds the element to an HTTP response header. It means that if you know the semantics of the HTTP response header named by this attribute, then you can process the contents based on a well-defined syntactic mapping, whether or not your DTD tells you anything about it. HTTP header names are not case sensitive. If HTTP-EQUIV is not present, the attribute NAME should be used to identify this metainformation, and the metainformation should not be used within an HTTP response header.
NAME
Metainformation name. If NAME is not present, the name can be assumed equal to the value of HTTP-EQUIV.
CONTENT
The metainformation content to be associated with the given name and/or HTTP response header.
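
The way the three attributes interact can be summarized in a small Python sketch. The MetaInfo type and the resolve_meta function are hypothetical names introduced here, and treating a missing CONTENT as an empty string is an assumption, not something the text specifies.

```python
from typing import NamedTuple, Optional


class MetaInfo(NamedTuple):
    name: str               # name under which the metainformation is known
    content: str            # value of the CONTENT attribute
    header: Optional[str]   # HTTP response header name, or None for no header


def resolve_meta(attrs):
    """Apply the attribute rules above to a dict of lowercase attribute names.

    NAME falls back to HTTP-EQUIV when absent; a header is produced only
    when HTTP-EQUIV itself is present. Returns None if neither is given.
    """
    http_equiv = attrs.get("http-equiv")
    name = attrs.get("name") or http_equiv
    if name is None:
        return None
    # Assumption: a missing CONTENT attribute is treated as an empty value.
    return MetaInfo(name, attrs.get("content", ""), http_equiv)
```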

Examples of use

If the document contains:

    <expires http-equiv="Expires">Tue, 04 Dec 1993 21:29:02 GMT</expires>
    <meta http-equiv="Keywords" content="Fred, Barney, Wilma">
    <meta http-equiv="Reply-to" content="fielding@ics.uci.edu (Roy Fielding)">
The server may include the headers:
    Expires: Tue, 04 Dec 1993 21:29:02 GMT
    Keywords: Fred, Barney, Wilma
    Reply-to: fielding@ics.uci.edu (Roy Fielding)
as part of the HTTP response to a GET or HEAD request for that document.

When the HTTP-EQUIV attribute is not present, the server should not generate an HTTP response header for this metainformation; e.g.,

    <meta name="IndexType" content="Service">
would not generate an HTTP response header but would still allow clients or other tools to make use of that metainformation.
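
A server-side reading of these examples might look like the following Python sketch. It is one possible interpretation rather than any server's actual behavior: HttpEquivParser and headers_from_head are hypothetical names, and the sketch assumes that an element without a CONTENT attribute supplies its value as element text, as the expires element above does.

```python
from html.parser import HTMLParser


class HttpEquivParser(HTMLParser):
    """Collects (header name, value) pairs from elements carrying HTTP-EQUIV."""

    def __init__(self):
        super().__init__()
        self.headers = []   # pairs in document order
        self._open = None   # (tag, header name) while collecting element text
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        name = a.get("http-equiv")
        if not name:
            return  # no HTTP-EQUIV, so no response header is generated
        if a.get("content") is not None:
            self.headers.append((name, a["content"]))
        else:
            # Assumption: the value is the element's text, up to its end tag.
            self._open = (tag, name)
            self._text = []

    def handle_data(self, data):
        if self._open:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open and tag == self._open[0]:
            self.headers.append((self._open[1], "".join(self._text).strip()))
            self._open = None


def headers_from_head(html_text):
    """Return the response headers implied by the document's HEAD content."""
    parser = HttpEquivParser()
    parser.feed(html_text)
    parser.close()
    return parser.headers
```

Fed the three elements from the first example, this returns the Expires, Keywords, and Reply-to pairs in that order; the IndexType element contributes nothing because it has no HTTP-EQUIV.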

Examples of misuse

One example of inappropriate use of the META element is to define information that should be associated with an already existing HTML element (here, TITLE), e.g.

    <meta name="Title" content="The Etymology of Dunsel">

A second example of inappropriate use is to give HTTP-EQUIV the name of a response header that should normally only be generated by the HTTP server. Example names that are inappropriate include "Server", "Date", and "Last-modified" -- the exact list of inappropriate names is dependent on the particular server implementation. It is recommended that servers ignore any META elements whose HTTP-EQUIV value matches (case-insensitively) one of their own reserved response headers.
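
The recommended filtering can be sketched in a few lines of Python. The RESERVED set below contains only the three names mentioned above and is an assumption; as noted, the real list depends on the server implementation. The function name drop_reserved is hypothetical.

```python
# Assumed reserved names, taken from the examples in the text above.
RESERVED = {"server", "date", "last-modified"}


def drop_reserved(header_pairs, reserved=RESERVED):
    """Discard (name, value) pairs whose name matches a reserved header.

    The comparison ignores case, since HTTP header names are not case sensitive.
    """
    reserved_lower = {r.lower() for r in reserved}
    return [(n, v) for n, v in header_pairs if n.lower() not in reserved_lower]
```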


Roy Fielding <fielding@ics.uci.edu>
Department of Information and Computer Science,
University of California, Irvine, CA 92717-3425
Last modified: Wed Aug 10 02:33:51 1994