3.2.7. Providing metadata – inside, outside documents, Simon Pockley
There is nothing new about the concept of metadata. Metadata is resource description; the kind of information found in a library catalogue. What is new in the digital world is the essential role that you, the creator, now play in providing this information. Good quality metadata is easy to provide at the point of creation but usually difficult, expensive or impossible to discover retrospectively.
At one level, this is because all digital resources depend in some way on electronic mediation by computers and software, and it is only at the point of creation that these dependencies and descriptions can be recorded. At another level, the sheer volume of creation alters the role of the librarian or custodian from cataloguer to metadata repository manager.
In an ideal world, all digital material would be created independent of proprietary hardware and software. In other words, everything would run on commonly available hardware using freely available (public domain) software such as a web browser.
In the real world, many content creators will be producing work, on-line or off-line, that is hardware dependent, software dependent, or both. Unfortunately, the costs of emulation, migration and licensing increase if resources are generated in proprietary or platform-dependent formats. If possible, try to use commonly available open source formats.
Metadata is information about these applications and formats, which allows for licensed versions to be archived so that the material can be displayed or accessed. In order to be able to provide long-term access to a digital resource, the NDLTD needs the following metadata:
• Information about the content creator (rights, contributors, publisher);
• Information about the content that will help it to be found or discovered (coverage, description, title, subject, relationships);
• Information about the resource (formats, system requirements, date, identification).
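As a rough illustration, the three groups of metadata listed above could be captured in a simple record at the point of creation. The following is a minimal sketch in Python, not an NDLTD specification: the class names, the `missing_fields` helper and all sample values are hypothetical, and only the field names are taken from the list above.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class CreatorInfo:
    """Information about the content creator."""
    rights: str = ""
    contributors: List[str] = field(default_factory=list)
    publisher: str = ""


@dataclass
class DiscoveryInfo:
    """Information that helps the content to be found or discovered."""
    title: str = ""
    description: str = ""
    subject: List[str] = field(default_factory=list)
    coverage: str = ""
    relationships: List[str] = field(default_factory=list)


@dataclass
class ResourceInfo:
    """Information about the resource itself."""
    formats: List[str] = field(default_factory=list)
    system_requirements: str = ""
    date: str = ""
    identification: str = ""


@dataclass
class MetadataRecord:
    creator: CreatorInfo
    discovery: DiscoveryInfo
    resource: ResourceInfo

    def missing_fields(self) -> List[str]:
        """Return dotted names of fields that were left empty."""
        return [
            f"{group}.{name}"
            for group, values in asdict(self).items()
            for name, value in values.items()
            if not value
        ]


# Hypothetical sample values, for illustration only.
record = MetadataRecord(
    creator=CreatorInfo(rights="Copyright held by the author",
                        publisher="Example University"),
    discovery=DiscoveryInfo(title="An Example Thesis", subject=["metadata"]),
    resource=ResourceInfo(formats=["text/html"], date="2001-01-01",
                          identification="example-0001"),
)
print(record.missing_fields())
```

A check like `missing_fields` reflects the point made earlier: gaps are cheap to fill while the creator is still at hand, and costly to reconstruct afterwards.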
Metadata can be stored in:
1. The object or document being described.
A growing number of file formats allow metadata to be embedded in the file itself. For example, a text format like HTML allows you to embed metadata in the header of the file, and recent versions of audiovisual formats such as MPEG include space for metadata. This has the advantage that the information is self-contained and truly transportable across systems. The major disadvantage is that systems accessing the object will have trouble catering for multiple views or meanings.
2. A separate file that can be externally accessed but is linked to the object or document.
This has the advantage that different communities can gather the metadata for different purposes. It has the disadvantage of being open to misinterpretation through syntax error or unrecognised schema.
3. A separate file stored in a database.
The NDLTD model encourages students to submit their metadata to a central repository for indexing in a database. The database then points to the object or document. This allows multiple instances of the metadata to exist for one document, and it provides the enhanced administrative tools that database systems normally offer. Advanced database systems could provide a very sophisticated management system. This is the most expensive method to implement, but it is significantly more flexible and provides administrative support from the outset.
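The three storage options above can be sketched side by side. This is a minimal Python illustration under stated assumptions, not the NDLTD implementation: the function names, the JSON layout of the linked file, the table schema, the `view` column and all sample values are hypothetical. The `DC.` prefix on the `<meta>` names follows the common convention for expressing Dublin Core elements in HTML.

```python
import json
import sqlite3
from html import escape

# Hypothetical metadata for one document; values are illustrative only.
metadata = {"title": "An Example Thesis", "creator": "A. Student",
            "date": "2001-01-01"}
document_url = "https://example.org/theses/example-0001.html"


def embed_in_html_head(meta: dict) -> str:
    """Option 1: render the metadata as <meta> tags for the HTML header."""
    tags = [f'<meta name="DC.{escape(key)}" content="{escape(value)}">'
            for key, value in meta.items()]
    return "<head>\n" + "\n".join(tags) + "\n</head>"


def as_linked_file(meta: dict, target: str) -> str:
    """Option 2: serialise the metadata to a separate file linked to the object."""
    return json.dumps({"describes": target, "metadata": meta}, indent=2)


def store_in_database(conn: sqlite3.Connection, meta: dict,
                      target: str, view: str) -> None:
    """Option 3: index the metadata in a database that points to the object.

    The 'view' column lets several metadata sets coexist for one document.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata "
        "(document_url TEXT, view TEXT, element TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?, ?)",
        [(target, view, key, value) for key, value in meta.items()],
    )


conn = sqlite3.connect(":memory:")
store_in_database(conn, metadata, document_url, view="author")
store_in_database(conn, {"subject": "metadata"}, document_url, view="librarian")

# Two independent metadata sets now describe the same document.
views = [row[0] for row in conn.execute(
    "SELECT DISTINCT view FROM metadata WHERE document_url = ? ORDER BY view",
    (document_url,),
)]
print(embed_in_html_head(metadata))
print(as_linked_file(metadata, document_url))
print(views)
```

The embedded form travels with the file but carries a single description, whereas the database form holds several descriptions of the same object at the cost of running and maintaining the database.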
See: European Projects such as Metadata Observatory. The aim of the Observatory is to maintain and promote a knowledge base for metadata for multimedia information to continually assess relationships between Dublin Core and other initiatives, especially undertaken in
Available [on-line] http://www.cenorm.be/isss/Workshop/metadata-observatory/Home%20Page.htm
See: The
Available [on-line] http://dublincore.org/