Knowledge organization

The term knowledge organization (KO) (or "organization of knowledge", "organization of information" or "information organization") designates a field of study related to Library and Information Science (LIS). In this meaning, KO is about activities such as document description, indexing and classification performed in libraries, databases, archives etc. These activities are done by librarians, archivists, subject specialists as well as by computer algorithms. KO as a field of study is concerned with the nature and quality of such knowledge organizing processes (KOP) as well as the knowledge organizing systems (KOS) used to organize documents, document representations and concepts.

There are different historical and theoretical approaches to organizing knowledge, which are related to different views of knowledge, cognition, language, and social organization. Each of these approaches tends to answer the question “What is knowledge organization?” differently. LIS professionals have often concentrated on applying new technology and standards, and may not have seen their work as involving interpretation and analysis of meaning. That is why library classification has been

Traditional human-based activities are increasingly challenged by computer-based retrieval techniques. It is appropriate to investigate the relative contributions of different approaches; the current challenges make it imperative to reconsider this understanding.

The leading journal in this field is Knowledge Organization published by the International Society for Knowledge Organization (ISKO). See also "Lifeboat for Knowledge Organization".

A broad introduction to knowledge organization can be found in Hoetzlein (2007).

Theoretical approaches
Hjørland (2008) provided an overview of approaches to KO:

Traditional approaches
Among the major figures in the history of KO who can be classified as “traditional” are Melvil Dewey (1851-1931) and Henry Bliss (1870-1955).

Dewey’s business approach hardly offers an intellectual foundation for KO understood as an academic discipline. His interest was not in finding an optimal system to support users of libraries, but rather in finding an efficient way to manage library collections. He wanted to develop a standardized system that could be used in many libraries.

An important assumption of Henry Bliss (and of many contemporary thinkers in KO) was that the sciences tend to reflect the order of nature and that library classification should reflect the order of knowledge as uncovered by science: Natural order --> Scientific classification --> Library classification (KO)

The implication is that librarians, in order to classify books, should know about scientific developments. This should also be reflected in their education: “Again from the standpoint of the higher education of librarians, the teaching of systems of classification. . . would be perhaps better conducted by including courses in the systematic encyclopedia and methodology of all the sciences, that is to say, outlines which try to summarize the most recent results in the relation to one another in which they are now studied together. . . .” (Ernest Cushing Richardson, quoted from Bliss, 1935, p. 2).

Other principles that may be attributed to the traditional approach to KO include:


 * Principle of controlled vocabulary
 * Cutter’s rule about specificity
 * Hulme’s principle of literary warrant (1911)
 * Principle of organizing from the general to the specific

Today, after more than 100 years of research and development in LIS, the “traditional” approach still has a strong position in KO and in many ways its principles still dominate.

Facet analytic approaches
The date of the foundation of this approach may be chosen as the publication of S. R. Ranganathan’s Colon Classification in 1933. The approach has been further developed by, in particular, the British Classification Research Group. In many ways this approach has dominated what might be termed “modern classification theory.”

This approach is probably best explained through its analytico-synthetic methodology. “Analysis” means breaking down each subject into its basic concepts. “Synthesis” means combining the relevant units and concepts to describe the subject matter of the information package in hand.

Given subjects (as they appear in, for example, book titles) are first analyzed into a few common categories, which are termed “facets”. Ranganathan proposed his PMEST formula: Personality, Matter, Energy, Space and Time:


 * Personality is the distinguishing characteristic of a subject
 * Matter is the physical material of which a subject may be composed
 * Energy is any action that occurs with respect to the subject
 * Space is the geographic component of the location of a subject
 * Time is the period associated with a subject
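The analytico-synthetic method described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the class `FacetedSubject`, the function `synthesize`, the example subject, and the single " : " separator are all hypothetical, and the sketch does not reproduce the actual notation of the Colon Classification, which uses a distinct connecting symbol for each facet.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class FacetedSubject:
    """A subject analyzed into the five PMEST facets (hypothetical helper)."""
    personality: Optional[str] = None
    matter: Optional[str] = None
    energy: Optional[str] = None
    space: Optional[str] = None
    time: Optional[str] = None


def synthesize(subject: FacetedSubject, separator: str = " : ") -> str:
    """Combine the facets that are filled in, in PMEST order, into one string.

    The single separator is an assumption made for readability only.
    """
    values = [getattr(subject, f.name) for f in fields(subject)]
    return separator.join(v for v in values if v)


# Analysis: an invented subject is broken down into its facets.
example = FacetedSubject(
    personality="furniture",
    matter="wood",
    energy="design",
    space="Denmark",
    time="1950s",
)

# Synthesis: the facets are recombined into a single subject description.
print(synthesize(example))  # furniture : wood : design : Denmark : 1950s
```

Facets that do not apply to a subject are simply left empty, so a subject with only a Personality and a Time facet is synthesized from those two parts alone.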

The information retrieval tradition (IR)
Important in the IR-tradition have been, among others, the Cranfield experiments, which began in the 1950s, and the TREC experiments (Text Retrieval Conferences), which started in 1992. It was the Cranfield experiments that introduced the well-known measures “recall” and “precision” as evaluation criteria for system efficiency. The Cranfield experiments found that classification systems such as UDC and facet-analytic systems were less efficient than free-text searches or low-level indexing systems (“UNITERM”). According to Ellis (1996, 3-6), the Cranfield I test produced the following results:


 * UNITERM: 82.0% recall
 * Alphabetical subject headings: 81.5% recall
 * UDC: 75.6% recall
 * Facet classification scheme: 73.8% recall

Although these results have been criticized and questioned, the IR-tradition became much more influential while library classification research lost influence. The dominant trend has been to regard only statistical averages. What has largely been neglected is to ask: Are there certain kinds of questions in relation to which other kinds of representation, for example, controlled vocabularies, may improve recall and precision?
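For readers unfamiliar with the two measures, the following Python sketch shows how recall and precision are commonly computed for a single query, and how averaging over queries can hide differences between them. The function name `precision_recall`, the query labels, and the document identifiers are invented for illustration and are not taken from the Cranfield data.

```python
from typing import Dict, Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented toy data: query -> (documents retrieved, documents judged relevant).
queries: Dict[str, Tuple[Set[str], Set[str]]] = {
    "q1": ({"d1", "d2", "d5"}, {"d1", "d2", "d3", "d4"}),
    "q2": ({"d7", "d8"}, {"d7", "d8"}),
}

per_query = {q: precision_recall(ret, rel) for q, (ret, rel) in queries.items()}
mean_recall = sum(r for _, r in per_query.values()) / len(per_query)

for q, (p, r) in per_query.items():
    print(f"{q}: precision={p:.2f} recall={r:.2f}")
print(f"mean recall={mean_recall:.2f}")
```

In this toy data the mean recall is 0.75, although one query reaches full recall and the other only half, which is the kind of per-question difference that a statistical average conceals.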

User-oriented and cognitive views
The best way to define this approach is probably by method: Systems based upon user-oriented approaches must specify how the design of a system is made on the basis of empirical studies of users.

User studies demonstrated very early that users prefer verbal search systems to systems based on classification notations. This is one example of a principle derived from empirical studies of users. Adherents of classification notations may, of course, still have an argument: that notations are well-defined and that users may miss important information by not considering them.

Folksonomies are a recent kind of KO based on indexing by users rather than by librarians or subject specialists.

Bibliometric approaches
These approaches are primarily based on using bibliographical references to organize networks of papers, mainly by bibliographic coupling (introduced by Kessler 1963) or co-citation analysis (independently suggested by Marshakova 1973 and Small 1973). In recent years it has become a popular activity to construct bibliometric maps as representations of the structure of research fields.
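The two techniques can be sketched in a few lines of Python. Bibliographic coupling links two citing papers by the references they share, whereas co-citation links two cited works by the papers that cite both. The data and the function names `bibliographic_coupling` and `co_citation` below are hypothetical and serve only as a minimal illustration.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set, Tuple

# Invented toy data: each citing paper maps to the set of works it references.
references: Dict[str, Set[str]] = {
    "A": {"X", "Y", "Z"},
    "B": {"Y", "Z"},
    "C": {"Z", "W"},
}


def bibliographic_coupling(refs: Dict[str, Set[str]]) -> Dict[Tuple[str, str], int]:
    """Coupling strength of two citing papers: the number of references they share."""
    strengths: Dict[Tuple[str, str], int] = {}
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared:
            strengths[(a, b)] = shared
    return strengths


def co_citation(refs: Dict[str, Set[str]]) -> Counter:
    """Co-citation count of two cited works: the number of papers citing both."""
    counts: Counter = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


print(bibliographic_coupling(references))
print(co_citation(references).most_common(1))
```

With this toy data, papers A and B are the most strongly coupled pair (two shared references), and Y and Z are the most frequently co-cited pair (cited together by two papers). A bibliometric map would typically be built by clustering or laying out papers according to such pairwise strengths.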

Two considerations are important in considering bibliometric approaches to KO:


 * 1) The level of indexing depth is partly determined by the number of terms assigned to each document. In citation indexing this corresponds to the number of references in a given paper. On average, scientific papers contain 10-15 references, which provides quite a high level of depth.
 * 2) The references, which function as access points, are provided by those with the highest subject expertise: the experts writing in the leading journals. This expertise is much higher than that which library catalogs or bibliographical databases are typically able to draw on.

The domain analytic approach
Domain analysis is a sociological-epistemological standpoint. The indexing of a given document should reflect the needs of a given group of users or a given ideal purpose. In other words, any description or representation of a given document is more or less suited to the fulfillment of certain tasks. A description is never objective or neutral, and the goal is not to standardize descriptions or make one description once and for all for different target groups.

The development of the Danish library “KVINFO” may serve as an example that explains the domain-analytic point of view.

KVINFO was founded by the librarian and writer Nynne Koch, and its history goes back to 1965. Nynne Koch was employed at the Royal Library in Copenhagen in a position without influence on book selection. She was interested in women’s studies and began personally to collect printed catalog cards of books in the Royal Library that were considered relevant for women’s studies. She developed a classification system for this subject. Later she became the head of KVINFO and got a budget for buying books and journals, and still later, KVINFO became an independent library. The important theoretical point is that the Royal Library had an official systematic catalog of a high standard. Normally it is assumed that such a catalog is able to identify relevant books for users whatever their theoretical orientation. This example demonstrates, however, that for a specific user group (feminist scholars), an alternative way of organizing catalog cards was important. In other words: different points of view need different systems of organization.

Domain analysis (DA) is the only approach to KO that has seriously examined epistemological issues in the field, i.e. comparing the assumptions made in different approaches to KO and examining the questions regarding subjectivity and objectivity in KO. Subjectivity is not just about individual differences. Such differences are of minor interest because they cannot be used as guidelines for KO. What seems important are collective views shared by many users. This kind of subjectivity, shared by many users, is related to philosophical positions. In any field of knowledge different views are always at play. In the arts, for example, different views of art are always present. Such views determine views on art works, writing on art works, how art works are organized in exhibitions and how writings on art are organized in libraries (see Ørom 2003). In general it can be stated that different philosophical positions on any issue have implications for relevance criteria, information needs and for criteria of organizing knowledge.