Digital library

A digital library is a library in which a significant proportion of the resources are available in machine-readable format (as opposed to print or microform), accessible by means of computers. The digital content may be locally held or accessed remotely via computer networks. In libraries, the process of digitization began with the catalog, moved to periodical indexes and abstracting services, then to periodicals and large reference works, and finally to book publishing. Some of the largest and most successful digital libraries are Project Gutenberg, ibiblio and the Internet Archive.

Advantages
Traditional libraries are limited by storage space; digital libraries have the potential to store much more information, simply because digital information requires very little physical space to contain it. In principle, this can make a digital library cheaper to maintain than a traditional one, which must spend large sums of money on staff, book maintenance, rent, and additional books. Digital libraries avoid many of these costs, although they incur others of their own.

Digital libraries can quickly adopt innovations in technology, providing users with improvements in electronic and audio book technology as well as new forms of communication such as wikis and blogs.


 * No physical boundary. The user of a digital library need not go to the library physically; people from all over the world can gain access to the same information, as long as an Internet connection is available.
 * Round-the-clock availability. A major advantage of digital libraries is that people can gain access to the information at any time, night or day.
 * Multiple access. The same resources can be used at the same time by a number of users.
 * Structured approach. Digital libraries provide access to much richer content in a more structured manner; for example, a user can easily move from the catalog to a particular book, then to a particular chapter, and so on.
 * Information retrieval. The user is able to use any search term (a word or phrase) to search the entire collection. Digital libraries can provide very user-friendly interfaces, giving clickable access to their resources.
 * Preservation and conservation. An exact copy of the original can be made any number of times without any degradation in quality.
 * Space. Whereas traditional libraries are limited by storage space, digital libraries have the potential to store much more information, simply because digital information requires very little physical space to contain it. When a library has no space for extension, digitization is the only solution.
 * Networking. A particular digital library can very easily provide links to the resources of other digital libraries; thus seamlessly integrated resource sharing can be achieved.
 * Cost. In theory, the cost of maintaining a digital library is lower than that of a traditional library. A traditional library must spend large sums of money paying for staff, book maintenance, rent, and additional books. Although digital libraries do away with these fees, it has since been found that they can be no less expensive to operate in their own way. Digital libraries can and do incur large costs for the conversion of print materials into digital format, for the technical skills of the staff who maintain them, and for maintaining online access (e.g. servers and bandwidth). Also, the information in a digital library must often be "migrated" every few years to the latest digital media. This process can incur very large costs in hardware and skilled personnel (see data migration).

Disadvantages
Some critics argue that digital libraries are hampered by copyright law, because works cannot be shared over different periods of time in the manner of a traditional library. As a result, the content is, in many cases, limited to public domain or self-generated material. Some digital libraries, such as Project Gutenberg, work to digitize out-of-copyright works and make them freely available to the public. An estimate has been made of the number of distinct books still extant in library catalogues from 2000 B.C. to 1960.

Digital libraries cannot reproduce the environment of a traditional library. Many people also find reading printed material easier than reading material on a computer screen, although this depends heavily on presentation as well as personal preference. Also, as technology changes, some of a digital library's content can become out of date and its data may become inaccessible.

Digital libraries are wholly dependent on cheap, abundant sources of electricity, a secondary form of energy that is produced primarily from fossil fuels and to a lesser extent from nuclear and "green" sources. Without electricity, the content cannot be accessed. Hence, any threat to the energy security of a society also threatens the very existence of the digital library.

Academic Repositories
Many academic libraries are actively involved in building institutional repositories of the institution's books, papers, theses, and other works that can be digitized. Many of these repositories are made available to the academic community or the general public. Institutional repositories are often referred to as digital libraries.

Digital Archives
Archives differ from libraries in several ways. Traditionally, archives were defined as:


 * 1) Containing primary sources of information (typically letters and papers directly produced by an individual or organization) rather than the secondary sources found in a library (books, etc.);
 * 2) Having their contents organized in groups rather than as individual items. Whereas books in a library are cataloged individually, items in an archive are typically grouped by provenance (the individual or organization that created them) and original order (the order in which the materials were kept by the creator);
 * 3) Having unique contents. Whereas a book may be found at many different libraries, depending on its rarity, the records in an archive are usually one of a kind and cannot be found or consulted anywhere except at the archive that holds them.

The technology used to create digital libraries has been even more revolutionary for archives, since it breaks down the second and third of these general rules. The use of search engines, Optical Character Recognition and metadata allows digital copies of individual items (e.g. letters) to be cataloged, and the ability to access digital copies remotely has removed the necessity of physically going to a particular archive to find a particular set of records.

Cornell University and the Wisconsin State Historical Society are considered leaders in the field of digital archive creation and management.

Searching
Most digital libraries provide a search interface which allows resources to be found. These resources are typically deep web (or invisible web) resources since they frequently cannot be located by search engine crawlers. Some digital libraries create special pages or sitemaps to allow search engines to find all their resources. Digital libraries frequently use the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to expose their metadata to other digital libraries, and search engines like Google can also use OAI-PMH to find these deep web resources.
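As a rough illustration of how metadata is exposed through OAI-PMH, the following Python sketch builds a ListRecords request URL and extracts identifiers and titles from a response. The verb, the metadataPrefix and from arguments, and the XML namespaces come from the OAI-PMH 2.0 specification; the base URL, the helper function names, and the hand-written sample response are assumptions made purely for illustration, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL (base_url is a hypothetical endpoint)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Optional selective harvesting: only records changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return identifier and first Dublin Core title for each record."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        identifier = record.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        title = next((el.text for el in record.iter(DC_NS + "title")), None)
        records.append({"identifier": identifier, "title": title})
    return records


# Hand-written sample response (an assumption, not output from a real server).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2005-01-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://example.org/oai"))
print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also need to follow resumption tokens for large result sets and handle error responses, both of which are omitted here.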

There are two general strategies for searching a federation of digital libraries:
 * 1) distributed searching, and
 * 2) searching previously harvested metadata.

Distributed searching typically involves a client sending multiple search requests in parallel to a number of servers in the federation. The results are gathered, duplicates are eliminated or clustered, and the remaining items are sorted and presented back to the client. Scalability and performance issues tend to plague distributed searching for large federations of digital libraries. Protocols like Z39.50 are frequently used in distributed searching.
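The gather, deduplicate and sort steps of distributed searching can be sketched in Python as follows. This is a minimal sketch under stated assumptions: each "server" is an in-memory stand-in function rather than a real Z39.50 target, duplicates are detected by normalized title only, and the names make_server and distributed_search are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_server(name, catalog):
    """Return a stand-in search function for one library in the federation."""
    def search(query):
        q = query.lower()
        # Tag each hit with the library it came from.
        return [dict(item, source=name) for item in catalog
                if q in item["title"].lower()]
    return search


def distributed_search(servers, query):
    """Query every server in parallel, then merge, deduplicate and sort."""
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        # pool.map returns the result lists in the same order as `servers`.
        result_lists = list(pool.map(lambda server: server(query), servers))

    merged = {}
    for results in result_lists:
        for item in results:
            # Assumption: two items with the same normalized title are duplicates.
            key = item["title"].strip().lower()
            merged.setdefault(key, item)  # keep the first copy seen

    return sorted(merged.values(), key=lambda item: item["title"].lower())


library_a = make_server("A", [{"title": "Moby Dick"}, {"title": "Walden"}])
library_b = make_server("B", [{"title": "moby dick"}, {"title": "Dickens: A Life"}])

results = distributed_search([library_a, library_b], "dick")
for item in results:
    print(item["title"], "from library", item["source"])
```

Because the client must wait for the slowest server before it can merge results, the sketch also hints at why this approach scales poorly as the federation grows.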

Searching over previously harvested metadata requires the pooling of metadata collected from every digital library in the federation. This solution scales better than distributed search, but it introduces the problem of data freshness; digital libraries need to be re-harvested on a periodic basis to discover new and updated resources. OAI-PMH is frequently used by digital libraries for harvesting metadata.
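The freshness problem can be illustrated with a small Python sketch of a pooled index that is re-harvested incrementally. Everything here is an assumption made for illustration: the HarvestedIndex class, the record fields, and the in-memory list standing in for a library's metadata feed. A real harvester would typically fetch records over OAI-PMH instead.

```python
class HarvestedIndex:
    """Pooled metadata from several libraries, searched locally."""

    def __init__(self):
        self.records = {}        # identifier -> record
        self.last_harvest = {}   # library name -> latest datestamp seen

    def harvest(self, name, library_records):
        """Copy records newer than the previous harvest; return how many."""
        since = self.last_harvest.get(name, "")
        # ISO 8601 dates compare correctly as plain strings.
        # Simplification: a strict comparison at day granularity would miss
        # a record added later on the same day as the previous harvest.
        fresh = [r for r in library_records if r["datestamp"] > since]
        for record in fresh:
            self.records[record["identifier"]] = record
        if fresh:
            self.last_harvest[name] = max(r["datestamp"] for r in fresh)
        return len(fresh)

    def search(self, query):
        """Search only the pooled copy; no library is contacted."""
        q = query.lower()
        return sorted(r["title"] for r in self.records.values()
                      if q in r["title"].lower())


library = [{"identifier": "oai:a:1", "title": "Walden", "datestamp": "2005-01-10"}]
index = HarvestedIndex()
first = index.harvest("A", library)

# The library adds a record after the first harvest.
library.append({"identifier": "oai:a:2", "title": "Walden Two",
                "datestamp": "2005-03-01"})

stale = index.search("walden")    # the new record is not visible yet
second = index.harvest("A", library)
fresh = index.search("walden")    # visible after re-harvesting
print(stale, fresh)
```

Searches are answered from the local pool alone, which is why this strategy scales well, but results are only as current as the most recent harvest.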

The future
Large-scale digitization projects are underway at Google, the Million Book Project, MSN, and Yahoo. With continued improvements in book handling and presentation technologies such as Optical Character Recognition and e-books, and with many alternative depositories and business models, digital libraries are rapidly growing in popularity, as demonstrated by the efforts of Google, Yahoo, and MSN. And, just as libraries have ventured into audio and video collections, so have digital libraries such as the Internet Archive.