Self-description Microformat

Version: Draft - February 10th 2007
URL: http://selfdescription.org/microformat/selfdescription-draft-10022007.html
Author: Tim Hodson (tim@timhodson.com)

The microformat components

There are 26 data elements (not including the vCard elements) that are considered suitable in a self description of a collection. These are broken down into four distinct sets:

Basic
- collection
- title
- description
- format
- location
- interface
- homepage
- contributor
  Used with hCard microformat
Desirable
Optional
Expert
- service
- authentication-method

Basic

collection

A class named collection is used to identify a block level (X)HTML element containing information about a collection. Can contain all other data elements.

Nested (X)HTML components imply a parent/child relationship between collections.

<body class="collection">
<!-- the main collection -->
	<div class="collection">
	<!-- A sub collection -->
	...
	</div>
	<div class="collection">
	<!-- A sub collection -->
	...
		<p class="collection">...</p>
		<p class="collection">...</p>
		<p class="collection">...</p>
		<!-- three collections within a sub collection -->
	</div>
</body>
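
As an illustration of how a parser might recover the implied parent/child relationships, here is a minimal sketch using Python's standard html.parser module. The class and function names (CollectionTree, collection_tree) are hypothetical and not part of this specification, and the sketch assumes well-formed markup in which every non-empty element is closed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class CollectionTree(HTMLParser):
    """Builds a tree of collections from nested class="collection" elements."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": None, "children": []}  # holds top-level collections
        self.open_collections = [self.root]
        self.is_collection = []  # one flag per currently open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        flag = "collection" in classes
        if flag:
            node = {"tag": tag, "children": []}
            self.open_collections[-1]["children"].append(node)
            self.open_collections.append(node)
        self.is_collection.append(flag)

    def handle_endtag(self, tag):
        if tag in VOID or not self.is_collection:
            return
        if self.is_collection.pop():
            self.open_collections.pop()


def collection_tree(html):
    """Returns the list of top-level collections, each with its children."""
    parser = CollectionTree()
    parser.feed(html)
    return parser.root["children"]


sample = """
<body class="collection">
  <div class="collection"></div>
  <div class="collection">
    <p class="collection">...</p>
    <p class="collection">...</p>
    <p class="collection">...</p>
  </div>
</body>
"""
tree = collection_tree(sample)
# One main collection with two sub collections; the second holds three more.
print(len(tree), [len(child["children"]) for child in tree[0]["children"]])
```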

title

The name of the collection. A class title is applied to an element, either a heading or an inline element, to indicate that it holds the name of the collection.

<h1 class="title"></h1> 
or
<p><b class="title">The Tattlebury-Smith</b> collection is well known ...</p>

description

A description of the collection. A class description applied to any element that contains within it a description of a collection.

<p class="description"> ... </p>
or
<ul class="description">
	<li> ... </li>
</ul>

format

The format of items within the collection. A class format is applied to elements that contain the format information. This is typically, but not limited to, a <span> element. As can be seen from the example, the format typically sits within a written description identified using the description element.

<p class="description">
A collection of <span class="format">pamphlets</span>, 
<span class="format">periodicals</span> 
and <span class="format">books</span> exploring ...
</p>
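
Because a class such as format may be repeated within a description, a consumer will usually want every occurrence. The following is a minimal sketch, again using Python's html.parser; the helper name collect_by_class is hypothetical, and the same approach would apply to any of the classes in this specification that mark up short runs of text.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class ClassText(HTMLParser):
    """Collects the text of every element carrying a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.values = []
        self.depth = 0    # nesting depth inside a matching element, 0 = outside
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self.depth:
            self.depth += 1
        elif self.class_name in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in VOID or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # Collapse runs of whitespace left over from the page layout.
            self.values.append(" ".join("".join(self.buffer).split()))

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def collect_by_class(html, class_name):
    parser = ClassText(class_name)
    parser.feed(html)
    return parser.values


description = """
<p class="description">
A collection of <span class="format">pamphlets</span>,
<span class="format">periodicals</span>
and <span class="format">books</span> exploring ...
</p>
"""
formats = collect_by_class(description, "format")
print(formats)
```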

location

The physical location at which a collection can be found. If the collection does not have a physical presence, i.e. it is digital, then no location should be given. Use interface or homepage instead.

A class location is given to an element that contains the name (and preferably the address and geospatial coordinates) of the location, encoded as a vCard. See the hCard microformat specification at http://microformats.org/wiki/hcard for specific details on the vCard elements.

<div class="location vcard">
<span class="fn org">Herefordshire Libraries</span><br /> 
<span class="organization-unit">Central Library</span><br /> 
<address class="adr"> <span class="street-address">Broad Street</span><br /> 
	<span class="locality">Hereford</span><br /> 
	<span class="region">Herefordshire</span><br /> 
	<span class="country-name">England</span><br /> 
	<span class="postal-code">HR4 9AU</span><br /> 
</address> 
<span class="geo"> 
	<span class="latitude">52.054813</span> 
	<span class="longitude">-2.717657</span> 
</span> 
</div>
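
To show how the geospatial coordinates in a location might be consumed, here is a small sketch that reads the first latitude/longitude pair inside a geo element and checks that it is in range. GeoParser and read_geo are hypothetical names, and the sketch assumes XHTML-style markup in which empty elements are written as <br />.

```python
from html.parser import HTMLParser


class GeoParser(HTMLParser):
    """Reads the first latitude/longitude pair inside a class="geo" element."""

    def __init__(self):
        super().__init__()
        self.geo_depth = 0   # greater than 0 while inside the geo element
        self.field = None    # "latitude" or "longitude" while inside that element
        self.coords = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.geo_depth:
            self.geo_depth += 1
            for name in ("latitude", "longitude"):
                if name in classes and name not in self.coords:
                    self.field = name
        elif "geo" in classes:
            self.geo_depth = 1

    def handle_endtag(self, tag):
        if self.geo_depth:
            self.geo_depth -= 1
            self.field = None

    def handle_data(self, data):
        if self.field and data.strip():
            self.coords[self.field] = float(data.strip())


def read_geo(html):
    """Returns (latitude, longitude) as floats, or None if either is missing."""
    parser = GeoParser()
    parser.feed(html)
    lat = parser.coords.get("latitude")
    lon = parser.coords.get("longitude")
    if lat is None or lon is None:
        return None
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range")
    return lat, lon


location = """
<div class="location vcard">
<span class="fn org">Herefordshire Libraries</span><br />
<span class="geo">
	<span class="latitude">52.054813</span>
	<span class="longitude">-2.717657</span>
</span>
</div>
"""
print(read_geo(location))
```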

interface

The location of a web page providing access to a search tool for the collection. A class interface is applied to an anchor element to indicate that the anchor link points to a search interface.

<a href="http://example.org/supercat/search.html" 
	title="Search the catalogue" class="interface">Link to the catalogue</a>

homepage

The location of a web page providing more information about the collection. This may be an 'about' page that provides a more in-depth history of the collection, the index page of a mini-site devoted to the collection, or the page on which this self description can be found.

BEWARE: If the homepage is a good description of the collection, and is not currently a page containing a self description microformat, it may be worth considering whether the homepage is better suited to hosting the self description.

A class homepage is applied to an anchor element to indicate that the anchor link points to the homepage of the collection.

<a href="http://example.org/collection/index.html"
	title="The Jones Collection" class="homepage">More information...</a>

contributor

The person who created this self description, and to whom enquiries about the self description should be addressed. It should be noted that a self description is unlikely to carry much authority unless the person who wrote it identifies themselves and provides a means by which contact can be made in the event of a disagreement over accuracy.

A class contributor is given to an element that contains the name and contact information of the person formatted as a vCard. See the hCard microformat specification at http://microformats.org/wiki/hcard for specific details on the vCard elements.

<div class="contributor vcard">
	<span class="fn n">Tim Hodson</span><br /> 
	<span class="title">Stock Officer</span> 
	<span class="org fn">Herefordshire Libraries</span><br /> 
	<span class="organization-unit">Central Library</span><br /> 
	<address class="adr"> 
		<span class="street-address">Broad Street</span><br /> 
		<span class="locality">Hereford</span><br /> 
		<span class="region">Herefordshire</span><br /> 
		<span class="country-name">England</span><br /> 
		<span class="postal-code">HR4 9AU</span><br /> 
	</address> 
	<p class="tel"> tel:<span class="work pref voice">01432 261644</span> </p> 
	<p class="email"> 
		<span class="work pref">timhodson@herefordshire.gov.uk</span><br /> 
		<span class="home">tim@timhodson.com</span>
	</p> 
</div>

Desirable

subject

The subjects covered by the collection.

A class subject is applied to elements that contain the subject information. This is typically, but not limited to, a <span> element.

<p class="description"> 
	A collection of books about <span class="subject">biology</span>, <span class="subject">phrenology</span> 
	and <span class="subject">trepanning</span> in the ...
</p>

spatial

References to geographic coverage of the collection.

A class spatial is applied to elements that contain the spatial information. This is typically, but not limited to, a <span> element.

<p class="description">
	A collection of books about life in 
	<span class="spatial">Herefordshire</span> ...
</p>

temporal

References to historical times covered by the collection.

A class temporal is applied to elements that contain the temporal information. This is typically, but not limited to, a <span> element.

<p class="description"> 
	A collection of books about life in 
	<span class="temporal">eighteenth century</span> ...
</p>

extent

References to the size of the collection.

A class extent is applied to elements that contain the extent information. This is typically, but not limited to, a <span> element.

Note that extent should include as much information as necessary to give a numerical figure meaning. In the example shown, 1,250 on its own would not have any meaning. The extent of '1,250 books' by its nature also implies that a format is present, hence the use of the format element.

<p class="description">
	<span class="extent">1,250 <span class="format">books</span></span> 
	and <span class="extent">100 <span class="format">pamphlets</span></span>
	relating to ...
</p>
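
The pairing of a figure with its nested format can be sketched as follows. This is an illustration only: ExtentParser is a hypothetical name, pulling the count out with a regular expression is an assumption about how a consumer might interpret the text, and the sketch assumes XHTML-style markup in which empty elements are written as <br />.

```python
import re
from html.parser import HTMLParser


class ExtentParser(HTMLParser):
    """Pairs each extent with the format nested inside it."""

    def __init__(self):
        super().__init__()
        self.open_classes = []    # class list of every currently open element
        self.extent_depth = None  # stack depth at which the current extent opened
        self.text = []
        self.formats = []
        self.extents = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.open_classes.append(classes)
        if self.extent_depth is None and "extent" in classes:
            self.extent_depth = len(self.open_classes)
            self.text, self.formats = [], []

    def handle_data(self, data):
        if self.extent_depth is None:
            return
        self.text.append(data)
        # Elements opened after the extent itself are its descendants.
        inner = self.open_classes[self.extent_depth:]
        if any("format" in classes for classes in inner):
            self.formats.append(data)

    def handle_endtag(self, tag):
        if not self.open_classes:
            return
        if len(self.open_classes) == self.extent_depth:
            text = " ".join("".join(self.text).split())
            number = re.search(r"\d[\d,]*", text)
            self.extents.append({
                "text": text,
                "count": int(number.group().replace(",", "")) if number else None,
                "format": " ".join("".join(self.formats).split()) or None,
            })
            self.extent_depth = None
        self.open_classes.pop()


sample = (
    '<p class="description">'
    '<span class="extent">1,250 <span class="format">books</span></span> and '
    '<span class="extent">100 <span class="format">pamphlets</span></span> '
    "relating to ...</p>"
)
parser = ExtentParser()
parser.feed(sample)
extents = parser.extents
print(extents)
```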

accumulation

A date or range of dates when material in the collection was accumulated. A class accumulation is applied to elements that contain the accumulation information. This is typically, but not limited to, a <span> element.

<p class="description">
... were collected <span class="accumulation">between 1870 and 1923</span> ... 
</p>

contents-date

A date or range of dates when material in the collection was created or published.

A class contents-date is applied to elements that contain information about the creation date of the items in the collection. This is typically, but not limited to, a <span> element. It may be necessary to include format information within the contents-date span if more than one contents-date is used.

<p class="description">
	... a collection of original <span class="contents-date">
	<span class="format">posters</span> 
	from 1940s - 1960s</span> and <span class="contents-date">
	<span class="format">Theatre playbills</span> 
	from 1920s to 1930s</span> 
	...
</p>

access-policy

Any restrictions on how, when and which users can access the collection. A class access-policy is applied to elements that contain the access policy information. This is typically, but not limited to, a <span> element.

<p class="access-policy">
	The Reading Room is open from 10am - 5pm, Monday - Thursday, 
	lending copies may be borrowed with 
	a valid Herefordshire Libraries library card
</p>

accrual-policy

Information relating to the current accrual status, periodicity or method. A class accrual-policy is applied to elements that contain the accrual policy information. This is typically, but not limited to, a <span> element.

<p class="collection">
	<span class="title">The periodical collection</span> contains 
	<span class="accrual-policy">current</span> 
	<span class="format">journals</span> and 
	<span class="format">newspapers</span>.
</p>

custodial-history

Information relating to the custody of the collection. This may include the provenance of the collection. A class custodial-history is applied to elements that contain the custody information. This is typically, but not limited to, a <span> element.

<p class="collection">
	<span class="title">The Benjamin Smith collection</span> 
	was <span class="custodial-history">bequeathed to the 
	library of Minnesota in 1969</span>.
</p>

super-collection and sub-collection

Both super-collection and sub-collection are implied through the use of nested (X)HTML elements when they are described on the same web page. See the explanation for collection.

However, you may have collection descriptions that are spread over a number of pages, and wish to identify a connection between them. This can be achieved using the rel attribute in either a link or anchor element.

Link - (inside <head>)

<link href="http://example.org/sub/collection.html" rel="sub-collection" /> 
<link href="http://example.org/super/collection.html" rel="super-collection" />

Anchor - (inside <body>)

<a href="http://example.org/sub/collection.html" rel="sub-collection" 
	title="another collection">The sub collection</a>
<a href="http://example.org/super/collection.html" 
  	rel="super-collection" title="another collection">The parent collection</a>

Use of these rel attribute link types requires that the <head> element carries the following profile attribute:

<head profile="http://www.purl.org/NET/selfdescription/collectionprofile.html"> ... </head>
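
As a rough sketch of how a consumer could follow these relationships across pages, the following reads the rel values from both link and anchor elements. RelationParser and related_collections are hypothetical names, and treating rel values as case-insensitive is an assumption made here rather than something this specification states.

```python
from html.parser import HTMLParser

RELATIONS = {"sub-collection", "super-collection"}


class RelationParser(HTMLParser):
    """Collects sub-collection and super-collection links from a page."""

    def __init__(self):
        super().__init__()
        self.relations = []  # (rel value, href) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        # rel may hold several space-separated link types.
        for rel in (attrs.get("rel") or "").lower().split():
            if rel in RELATIONS and href:
                self.relations.append((rel, href))


def related_collections(html):
    parser = RelationParser()
    parser.feed(html)
    return parser.relations


page = """
<head>
<link href="http://example.org/sub/collection.html" rel="sub-collection" />
</head>
<body>
<a href="http://example.org/super/collection.html"
	rel="super-collection" title="another collection">The parent collection</a>
<a href="http://example.org/other.html">Unrelated link</a>
</body>
"""
links = related_collections(page)
print(links)
```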

language

See Language under Optional for a full explanation.

Optional

An identifier

An identifier was considered to be an optional component of the microformat, on the grounds that it is only useful for the person writing the self description in their implementation.

There are several options available for choosing a style of identifier to use, but none of them are required. The microformat specification leaves it to the implementer to decide which identifier will be used. It is enough to require that an identifier be used where possible to differentiate or uniquely label collections on a page, or within an organisation. It is not considered necessary to attempt to uniquely identify the collection within a global scope.

The identifier is considered to be optional because a collection could be unambiguously identified by looking at three pieces of information:

- The name of the collection
- The location of the collection
- The name of the organisation (org) with custody of the collection
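
A consumer that has extracted these three values could compare collections with a composite key, as in the sketch below. The function name collection_key is hypothetical, and the normalisation applied (collapsing whitespace and ignoring case) is an assumption made for illustration; this specification does not define how the values should be compared.

```python
def collection_key(title, location, organisation):
    """Builds a comparison key from the three identifying values."""

    def normalise(value):
        # Collapse runs of whitespace and ignore differences of case.
        return " ".join(value.split()).casefold()

    return (normalise(title), normalise(location), normalise(organisation))


a = collection_key("The Thomas Bay  Collection", "Central Library", "Herefordshire Libraries")
b = collection_key("the thomas bay collection", "Central Library ", "herefordshire libraries")
print(a == b)
```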

The exception! Where a self description is written in more than one language, and the two languages reside on separate html pages or sections of a website, then an identifier must be used to identify that self description as being the same but in translation.

Example: The identifier would be entered in the web document in one of two ways.

1. by using an id attribute on a block element that contained a collection.

<div class="collection" id="thomas-bay-collection"> ... </div> 

or
<div class="collection" id="http://example.org/collections/thomasbay.html"> ... </div> 

or
<p class="collection" id="thomas-bay-collection"> ... </p> 

or
<tr class="collection" id="thomas-bay-collection"> ... </tr>

2. by using a named anchor as part of page navigation within a block element.

<div class="collection"> 
	<h1 class="title"><a name="thomas-bay-collection">The Thomas Bay Collection</a></h1>
	... 
</div>

subtitle

A subtitle may be used to further identify the collection. A class subtitle is applied to the (X)HTML element containing the subtitle information.

<h2 class="subtitle"> ... </h2>

language

Language information can be given for two things: the language of the description, and the language of the collection.

Language of the self description

An xml:lang attribute in each element identifies the language of the element. The attribute can be repeated. For example, multiple title elements may be present in different languages. The language is assumed to be English if no language attribute is given.

<h1 class="title">My Collection of stamps</h1>
<h2 class="title" xml:lang="fr">Ma Collection de timbres</h2>

Additionally, the use of semantic elements like h1 or h2 will indicate the importance of the language to the description. In the example above, the main language is English with a French equivalent. A parser could deduce that the French h2 element was a subtitle to the main title.

It would be up to contributors of descriptions to decide which languages they wished to write the self description in, and whether to present that information within one or more html pages. If separate html pages were used, the unique identifier becomes a required element, as the self description will need to be identified as being in translation.

Where the translation resides on the same HTML page, the translation may follow other elements in any sensible manner, but MUST be within the same (X)HTML block that contains the collection.

Example: translated element follows element (of course this may not lead to very readable web pages!!)

<h1 class="title">My Collection of stamps</h1> 
<h1 class="title" xml:lang="fr">Ma Collection de timbres</h1> 
	
or
<p class="description"> ... </p> 
<p class="description" xml:lang="fr"> ... </p>

Example: translation follows as fresh block of html

<div class="collection">
	<div xml:lang="en"> 
		<h1 class="title">My Collection of stamps</h1> 
		<p class="description"> ... </p> 
	</div> 
	<div xml:lang="fr"> 
		<h1 class="title">Ma Collection de timbres</h1> 
		<p class="description"> ... </p> 
	</div> 
</div>
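
The language rules above can be sketched as a small parser that records the language of each title. It uses the title class defined under Basic, falls back to English when no xml:lang attribute is present, and lets an xml:lang on an enclosing block apply to the elements inside it, as in the block example. TitleLanguages and title_languages are hypothetical names, and the sketch assumes XHTML-style markup in which empty elements are written as <br />.

```python
from html.parser import HTMLParser


class TitleLanguages(HTMLParser):
    """Records (language, text) for each element with class="title"."""

    def __init__(self, default_lang="en"):
        super().__init__()
        self.lang_stack = [default_lang]  # effective language of each open element
        self.capture_depth = None         # stack depth of the title being read
        self.titles = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # An element without xml:lang inherits the language of its parent.
        lang = attrs.get("xml:lang") or self.lang_stack[-1]
        self.lang_stack.append(lang)
        classes = (attrs.get("class") or "").split()
        if self.capture_depth is None and "title" in classes:
            self.capture_depth = len(self.lang_stack)
            self.titles.append([lang, ""])

    def handle_data(self, data):
        if self.capture_depth is not None:
            self.titles[-1][1] += data

    def handle_endtag(self, tag):
        if len(self.lang_stack) > 1:
            if len(self.lang_stack) == self.capture_depth:
                self.capture_depth = None
            self.lang_stack.pop()


def title_languages(html):
    parser = TitleLanguages()
    parser.feed(html)
    return [(lang, " ".join(text.split())) for lang, text in parser.titles]


block = """
<div class="collection">
	<div xml:lang="en">
		<h1 class="title">My Collection of stamps</h1>
	</div>
	<div xml:lang="fr">
		<h1 class="title">Ma Collection de timbres</h1>
	</div>
</div>
"""
print(title_languages(block))
```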

Language of the collection

Language of the collection should be identified using the class language, where the content of the (X)HTML element with that class is the language.

<p class="description">
	A description of many items in a collection 
	where the language of items is both 
	<span class="language">French</span>
	and <span class="language">Spanish</span>
</p>

Expert

service

Service is intended to support the automatic discovery of remote search services such as Z39.50, OAI harvesting, SRU/W and OpenURL, among others. In order to express all possible combinations of search parameters and types, it is suggested that implementers who are comfortable with defining how their search service operates should do so using the Information Environment Service Registry (IESR) or the Talis Silkworm Directory.

At this time, and quite possibly ever, it is not feasible to fully identify all the combinations of values that a system would need in order to run a search.

This is suggested as an avenue for further development.

It may be decided that a service element for a self description in the form of a microformat is not justified because of the microformat "humans first" principle. This would leave it to self descriptions in other formats, such as RDF or XML, to include a service description.

In this microformat it may only be necessary to point to the existence of a definition in an existing service registry. For example, a service that supports OpenURL may have an interaction set registered in the IESR or the Talis Silkworm Directory. A reference in a link or anchor element could point to the information source for an OpenURL resolver.

Thus:

<div class="service"> 
	<form method="get" action="http://api.talis.com/1/node/items/{silkworm identifier}/bib"> 
	Title: <input type="text" name="title" size="25"/> 
	<input type="submit" value="Search"/><br/> 
  	<input type="hidden" name="api_key" value="{CONTRIBUTOR_API_KEY}"/> 
  	</form>
</div>

The form is essentially the same as defined by the Talis Silkworm Deep Linking API. More information can be found at http://www.talis.com/tdn/node/1429.

authentication-method

Identify the method of authentication of an on-campus or off-campus user within the authentication-method data element. This may be a <span>, or an <abbr> with a title attribute to denote whether the method applies on-campus or off-campus.

<p class="access-policy"> 
The library is open from 9am - 5pm. 
Registered borrowers can renew their books and search the 
<a href="{URL}" class="interface">online catalogue</a> 
using their <span class="authentication-method" >library card and PIN</span>.
</p>

or
<p class="access-policy">
The FAME database is available off-campus using your 
<abbr class="authentication-method" title="off-campus">Athens Login</abbr>
</p>
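
As a sketch of how the two forms shown above might be read, the following collects each authentication method together with the value of its title attribute, where one is present. AuthMethodParser is a hypothetical name, reporting the title attribute as a "scope" is an interpretation made for illustration, and the sketch assumes XHTML-style markup in which empty elements are written as <br />.

```python
from html.parser import HTMLParser


class AuthMethodParser(HTMLParser):
    """Collects authentication methods and any on-campus/off-campus title."""

    def __init__(self):
        super().__init__()
        self.depth = 0     # greater than 0 while inside a matching element
        self.methods = []  # dicts with "method" and "scope"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
        elif "authentication-method" in (attrs.get("class") or "").split():
            self.depth = 1
            # scope is None when the element carries no title attribute.
            self.methods.append({"method": "", "scope": attrs.get("title")})

    def handle_data(self, data):
        if self.depth:
            self.methods[-1]["method"] += data

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1


policy = """
<p class="access-policy">
Registered borrowers can search the catalogue using their
<span class="authentication-method">library card and PIN</span>.
The FAME database is available off-campus using your
<abbr class="authentication-method" title="off-campus">Athens Login</abbr>
</p>
"""
auth = AuthMethodParser()
auth.feed(policy)
print(auth.methods)
```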

Changes

Added navigation aids.
Corrected error in 'spatial' definition.
Corrected error in 'contents-date' definition.
Expanded authentication-method.
Renamed name as title to use Dublin Core terminology.
Headings for elements now use the element name with correct capitalisation and punctuation.