Metadata (computing)

Metadata (Greek: meta- + Latin: data "information"), literally "data about data", is information that describes another set of data. A common example is a library catalog card, which contains data about the contents and location of a book: It is data about the data in the book referred to by the card. Other common contents of metadata include the source or author of the described dataset, how it should be accessed, and its limitations. Another important type of data about data is the link or relationship between data. Some metadata schemes attempt to embrace this concept, such as the Dublin Core element link.

Since metadata is also data, it is possible to have metadata about metadata, or "meta-metadata." Machine-generated meta-metadata, such as the inverted index created by a free-text search engine, is generally not considered metadata, though.

Metadata that is embedded with content is called embedded metadata. A data repository typically stores the metadata detached from the data.

Uses
Metadata has become important on the World Wide Web because of the need to find useful information in the mass of information available. Manually created metadata adds value because it ensures consistency. If one webpage about a topic contains a word or phrase, then all webpages about that topic should contain that same word or phrase. It also ensures variety, so that if one topic has two names, each of these names will be used. For example, an article about sport utility vehicles would also be given the metadata keywords '4 wheel drives', '4WDs' and 'four wheel drives', as this is how they are known in some countries.

Sources of metadata for audio CDs include the MusicBrainz project and AMG's All Music Guide. Similarly, MP3 files carry metadata tags in a format called ID3.
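As an illustration of embedded audio metadata, the following is a minimal sketch of decoding the oldest and simplest variant of ID3, ID3v1, which occupies the final 128 bytes of a file and begins with the marker "TAG". The function names `parse_id3v1` and `read_id3v1` and the sample tag are assumptions made for this example; later ID3 versions use a different layout and are not handled here.

```python
import struct
from typing import Optional


def parse_id3v1(tail: bytes) -> Optional[dict]:
    """Decode an ID3v1 tag from the final 128 bytes of an MP3 file."""
    if len(tail) != 128 or tail[:3] != b"TAG":
        return None  # no ID3v1 tag present

    # Fixed-width fields: title, artist, album (30 bytes each), year (4),
    # comment (30), and a one-byte genre index.
    title, artist, album, year, comment, genre = struct.unpack(
        "<30s30s30s4s30sB", tail[3:]
    )

    def text(raw: bytes) -> str:
        # Fields are padded with NUL bytes (or spaces) up to their width.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,
    }


def read_id3v1(path: str) -> Optional[dict]:
    """Read the last 128 bytes of a file and decode them as ID3v1."""
    with open(path, "rb") as f:
        f.seek(0, 2)  # jump to the end to learn the file size
        if f.tell() < 128:
            return None
        f.seek(-128, 2)
        return parse_id3v1(f.read(128))


# A hand-built tag (hypothetical values) to exercise the parser.
sample_tag = (
    b"TAG"
    + b"Song".ljust(30, b"\x00")
    + b"Artist".ljust(30, b"\x00")
    + b"Album".ljust(30, b"\x00")
    + b"1999"
    + b"".ljust(30, b"\x00")
    + bytes([17])
)
tag = parse_id3v1(sample_tag)
```

Because the tag sits at a fixed position with fixed-width fields, a player can read it without decoding any audio data.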

Metadata is more properly called an ontology or schema when it is structured into a hierarchical arrangement. Both terms describe "what exists" for some purpose or to enable some action. For instance, the arrangement of subject headings in a library catalog serves not only as a guide to finding books on a particular subject in the stacks, but also as a guide to what subjects "exist" in the library's own ontology and how more specialized topics are related to or derived from the more general subject headings.

Metadata is frequently stored in a central location and used to help organizations standardize their data. This information is typically stored in a Metadata Registry.

Relational database metadata
Each relational database system has its own mechanisms for storing metadata. Examples of relational-database metadata include:
 * Tables of all tables in a database, with their names, sizes, and the number of rows in each table.
 * Tables of the columns in each database, recording which tables they are used in and the type of data stored in each column.

In database terminology, this set of metadata is referred to as the catalog. The SQL standard specifies a uniform means to access the catalog, called the information schema (INFORMATION_SCHEMA), but not all databases implement it, even if they implement other aspects of the SQL standard. For an example of database-specific metadata access methods, see Oracle metadata.
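As a further illustration of database-specific catalog access, the sketch below queries SQLite through Python's standard `sqlite3` module. SQLite exposes its catalog through its own `sqlite_master` table and the `PRAGMA table_info` statement rather than through the standard interface. The helper name `describe_schema` and the two sample tables are assumptions made for this example.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> dict:
    """Return {table_name: [(column_name, declared_type), ...]}."""
    # sqlite_master is SQLite's own catalog table; it lists every table,
    # index, view, and trigger in the database.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # Quote the identifier, since PRAGMA arguments cannot be bound
        # as query parameters.
        quoted = '"' + table.replace('"', '""') + '"'
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


# An in-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, pages INTEGER)")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
schema = describe_schema(conn)
```

The same information would be obtained differently on another database system, which is the portability problem that a uniform catalog interface is meant to address.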

Data warehouse metadata
Data warehouse metadata systems are sometimes separated into two sections:
 * back room metadata, which is used for extract, transform, load (ETL) functions that bring OLTP data into a data warehouse
 * front room metadata, which is used to label screens and create reports

Kimball lists the following types of metadata in a data warehouse:
 * source system metadata
   * source specifications, such as repositories and source schemas
   * source descriptive information, such as ownership descriptions, update frequencies, legal limitations, and access methods
   * process information, such as job schedules and extraction code
 * data staging metadata
   * data acquisition information, such as data transmission scheduling and results, and file usage
   * dimension table management, such as definitions of dimensions and surrogate key assignments
   * transformation and aggregation, such as data enhancement and mapping, DBMS load scripts, and aggregate definitions
   * audit, job logs and documentation, such as data lineage records and data transform logs
 * DBMS metadata, such as:
   * DBMS system table contents
   * processing hints

File system metadata
Nearly all file systems keep metadata about files out-of-band. Some systems keep metadata in directory entries; others keep it in specialized structures such as inodes, or even in the name of a file. Metadata can range from simple timestamps, mode bits, and other special-purpose information used by the implementation itself, to icons and free-text comments, to arbitrary attribute-value pairs.
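The simpler kinds of file system metadata described above can be read from most programming languages. The following is a minimal sketch using Python's standard `os.stat` call; the helper name `file_metadata` and the temporary file are assumptions made for this example, and the exact values reported (the inode number in particular) depend on the operating system and file system.

```python
import os
import stat
import tempfile
from datetime import datetime, timezone


def file_metadata(path: str) -> dict:
    """Collect a few out-of-band attributes the file system keeps for a file."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,
        # Render the mode bits in the familiar "-rw-r--r--" form.
        "mode": stat.filemode(st.st_mode),
        "modified": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
        "inode": st.st_ino,
    }


# Create a small throwaway file and inspect its metadata.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")
    with open(path, "w") as f:
        f.write("hello")
    info = file_metadata(path)
```

None of these values is stored in the file's contents: the file holds only the five bytes written to it, while the size, mode, and timestamp are maintained by the file system.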

With more complex and open-ended metadata, it becomes useful to search for files based on the metadata contents. The Unix find utility was an early example, although it is inefficient when scanning hundreds of thousands of files on a modern computer system. Apple Computer's current version of its Mac OS X operating system (Tiger) supports cataloging and searching for file metadata through a feature known as Spotlight. Microsoft Windows (Vista) is expected to include similar functionality via the WinFS file system. Linux supports arbitrary file metadata through extended file attributes.
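A search in the style of find can be sketched in a few lines, which also shows why it scales poorly: every file under the starting directory must be visited and its metadata read on each search. The function name `find_by_metadata`, its parameters, and the sample files below are assumptions made for this example, not the interface of the find utility itself.

```python
import os
import tempfile
import time


def find_by_metadata(root, min_size=0, modified_within=None):
    """Yield paths under root filtered by size and modification time.

    min_size is in bytes; modified_within is a maximum age in seconds.
    """
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_size < min_size:
                continue
            if modified_within is not None and now - st.st_mtime > modified_within:
                continue
            yield path


# Build a tiny directory tree and search it.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    with open(os.path.join(tmp, "small.txt"), "wb") as f:
        f.write(b"abc")
    with open(os.path.join(tmp, "sub", "big.bin"), "wb") as f:
        f.write(b"\x00" * 2048)
    matches = sorted(
        os.path.basename(p)
        for p in find_by_metadata(tmp, min_size=1024, modified_within=3600)
    )
```

The cost of this approach grows with the number of files scanned. A cataloging system avoids the repeated scan by maintaining an index of the metadata and consulting that index at search time.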

Image metadata
Examples of image file formats that carry metadata include Exchangeable Image File Format (EXIF) and Tagged Image File Format (TIFF).
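In TIFF, metadata is held in tagged entries grouped into an image file directory (IFD), a structure that EXIF also builds on. The following is a minimal sketch that reads single-valued SHORT and LONG entries from the first IFD. The function name `read_tiff_tags` and the hand-built sample bytes are assumptions made for this example; a real reader would handle every field type, values stored by offset, and chained directories.

```python
import struct

# Two baseline TIFF tags; unknown tags are reported by number.
TAG_NAMES = {256: "ImageWidth", 257: "ImageLength"}


def read_tiff_tags(data: bytes) -> dict:
    """Read single-valued SHORT/LONG entries from the first IFD of a TIFF."""
    if data[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")

    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")

    (count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(count):
        # Each entry is 12 bytes: tag, field type, value count, value/offset.
        entry = ifd_offset + 2 + 12 * i
        tag, field_type, n = struct.unpack_from(endian + "HHI", data, entry)
        if n != 1:
            continue  # arrays and strings are out of scope for this sketch
        if field_type == 3:  # SHORT: left-justified in the 4-byte value field
            (value,) = struct.unpack_from(endian + "H", data, entry + 8)
        elif field_type == 4:  # LONG
            (value,) = struct.unpack_from(endian + "I", data, entry + 8)
        else:
            continue
        tags[TAG_NAMES.get(tag, tag)] = value
    return tags


# A hand-built header and IFD with two entries (no pixel data).
sample = (
    b"II"
    + struct.pack("<HI", 42, 8)                 # magic, offset of first IFD
    + struct.pack("<H", 2)                      # number of entries
    + struct.pack("<HHIHH", 256, 3, 1, 640, 0)  # ImageWidth, SHORT
    + struct.pack("<HHII", 257, 4, 1, 480)      # ImageLength, LONG
    + struct.pack("<I", 0)                      # no further IFD
)
tags = read_tiff_tags(sample)
```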

Program metadata
Most executable file formats include metadata describing issues that need to be considered by the runtime or operating system when executing the program.

In DOS, the COM file format carries no such metadata, but the EXE file format does, as does the Windows PE format. This metadata can include the company that published the program, the date the program was created, the version number and more.
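As an illustration, the sketch below reads a few header fields from a PE file: the DOS "MZ" signature, the offset of the PE header stored at byte 0x3C, and the machine type, section count, and link timestamp from the COFF header that follows the "PE" signature. The function name `pe_header_metadata` and the hand-built sample bytes are assumptions made for this example. Company name and version number live elsewhere in the file (in its resources) and are not read here, and the timestamp is whatever the linker wrote, which may not be a true creation date.

```python
import struct
from datetime import datetime, timezone


def pe_header_metadata(data: bytes) -> dict:
    """Extract basic header metadata from the bytes of a PE executable."""
    if data[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' signature")

    # The 4-byte field at 0x3C holds the file offset of the PE header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing 'PE' signature")

    # The COFF header starts right after the signature:
    # Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4), ...
    machine, sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {
        "machine": machine,
        "sections": sections,
        "linked": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }


# A hand-built header with hypothetical values (not a runnable program).
buf = bytearray(0x80)
buf[0:2] = b"MZ"
struct.pack_into("<I", buf, 0x3C, 0x40)         # PE header lives at 0x40
buf[0x40:0x44] = b"PE\x00\x00"
struct.pack_into("<HHI", buf, 0x44, 0x14C, 3, 1_000_000_000)  # 0x14C = x86
info = pe_header_metadata(bytes(buf))
```

A COM file, by contrast, is a bare memory image with no header at all, so there is nowhere for such fields to be stored.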

In the Microsoft .NET executable format, extra metadata is included to allow reflection at runtime.

Other programs, such as Microsoft Word and the other Microsoft Office products, save metadata in their document files. This metadata can contain the name of the person who created the file (obtained from the operating system), the name of the person who last edited the file, how many times the file has been printed, and even how many revisions have been made to the file.

For a list of executable formats, see object file.