Knowledge representation

Knowledge representation is needed for library classification and for processing concepts in an information system. In the field of artificial intelligence, problem solving can be simplified by an appropriate choice of knowledge representation. Representing the knowledge in one way may make the solution simple, while an unfortunate choice of representation may make the solution difficult or obscure. The analogy is to computing with Hindu-Arabic numerals versus Roman numerals: long division is simple in the former and hard in the latter. Likewise, no single representation can serve all purposes or make every problem equally approachable.

History
The term "Knowledge Representation" (KR) is most commonly used to refer to representations intended for processing by modern computers, and particularly to representations consisting of explicit objects (the class of all elephants, or Clyde, a certain individual) and of assertions or claims about them ('Clyde is an elephant', or 'all elephants are grey'). Representing knowledge in such explicit form enables computers to draw conclusions from knowledge already stored ('Clyde is grey').
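
The Clyde example can be sketched in a few lines of Python. This is only an illustration, not how any particular KR system works: the dictionary layout and the names `instance_of`, `class_properties`, and `infer` are assumptions made for this sketch.

```python
# Hypothetical mini knowledge base; the structure and names are illustrative only.
instance_of = {"Clyde": "elephant"}                  # 'Clyde is an elephant'
class_properties = {"elephant": {"color": "grey"}}   # 'all elephants are grey'

def infer(individual, attribute):
    """Return the value an individual inherits from its class, or None if unknown."""
    cls = instance_of.get(individual)
    return class_properties.get(cls, {}).get(attribute)

print(infer("Clyde", "color"))  # derives 'Clyde is grey' from the two stored facts
```

The conclusion is never stored directly; it follows from combining the assertion about the individual with the assertion about its class.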

Many KR methods were tried in the 1970s and early 1980s, such as heuristic question-answering, neural networks, theorem proving, and expert systems, with varying success. Medical diagnosis (e.g., Mycin) was a major application area, as were games such as chess.

In the 1980s formal computer knowledge representation languages and systems arose. Major projects attempted to encode wide bodies of general knowledge; for example the "Cyc" project went through a large encyclopedia, encoding not the information itself, but the information a reader would need in order to understand the encyclopedia: naive physics; notions of time, causality, motivation; commonplace objects and classes of objects. The Cyc project is managed by Cycorp, Inc.; much but not all of the data is now freely available.

Through such work, the difficulty of KR came to be better appreciated. In computational linguistics, meanwhile, much larger databases of language information were being built, and these, along with great increases in computer speed and capacity, made deeper KR more feasible.

Several programming languages oriented to KR have been developed. Prolog, developed in 1972 (see http://www.aaai.org/AITopics/bbhist.html#mod) but popularized much later, represents propositions and basic logic, and can derive conclusions from known premises. KL-One (1980s) is more specifically aimed at knowledge representation itself.

In the electronic document world, languages were being developed to represent the structure of documents more explicitly, such as SGML and later XML. These facilitated information retrieval and data mining efforts, which have in recent years begun to relate to KR. The Web community is now especially interested in the Semantic Web, in which XML-based KR languages such as RDF, Topic Maps, and others can be used to make KR information available to Web systems.

Links and structures
While hyperlinks have come into widespread use, the closely related semantic link is not yet widely used. The mathematical table has been used since Babylonian times. More recently, these tables have been used to represent the outcomes of logic operations, such as truth tables, which were used to study and model Boolean logic, for example. Spreadsheets are yet another tabular representation of knowledge. Other knowledge representations are trees, by means of which the connections among fundamental concepts and derivative concepts can be shown.
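
As a small illustration of the tabular representation mentioned above, a truth table can be generated mechanically for any Boolean operation. The helper name `truth_table` is made up for this sketch.

```python
from itertools import product

def truth_table(op, arity=2):
    """List (inputs, output) rows for a Boolean function of the given arity."""
    return [(values, op(*values)) for values in product([False, True], repeat=arity)]

# Print the table for logical AND, one row per combination of inputs.
for inputs, result in truth_table(lambda p, q: p and q):
    print(inputs, result)
```

Each row pairs one combination of inputs with the outcome of the operation, which is exactly the knowledge the table encodes.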

Storage and manipulation
One problem in knowledge representation is how to store and manipulate knowledge in an information system in a formal way so that it may be used by mechanisms to accomplish a given task. Examples of applications are expert systems, machine translation systems, computer-aided maintenance systems and information retrieval systems (including database front-ends).

Language and notation
Some people think it would be best to represent knowledge in the same way that it is represented in the human mind, which is the only known working intelligence so far, or to represent knowledge in the form of human language. Unfortunately, we don't know how knowledge is represented in the human mind, or how to manipulate human languages the same way that the human mind does. One clue is that primates know how to use point-and-click user interfaces; thus the gesture-based interface appears to be part of our cognitive apparatus, a modality which is not tied to verbal language, and which exists in other animals besides humans.

For this reason, various artificial languages and notations have been proposed for representing knowledge. They are typically based on logic and mathematics, and have easily parsed grammars to ease machine processing.

Notation
The recent fashion in knowledge representation languages is to use XML as the low-level syntax. This tends to make the output of these KR languages easy for machines to parse, at the expense of human readability.

First-order predicate calculus is commonly used as a mathematical basis for these systems, to avoid excessive complexity. However, even simple systems based on this simple logic can be used to represent data that is well beyond the processing capability of current computer systems: see computability for reasons.

Examples of notations:
 * DATR is a notation for representing lexical knowledge
 * RDF is a simple notation for representing relationships between and among objects
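
The idea behind RDF, stating relationships as subject, predicate, object triples, can be sketched without any RDF tooling. The following is not RDF syntax and uses no RDF library; the triples and the `match` helper are invented for illustration.

```python
# Illustrative triples; the subject, predicate, and object names are made up.
triples = {
    ("Clyde", "is_a", "Elephant"),
    ("Elephant", "has_color", "grey"),
    ("Clyde", "lives_in", "Zoo"),
}

def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

print(match(s="Clyde"))  # every stored statement about Clyde
```

Querying is then a matter of pattern matching over the stored triples.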

Language
Examples of artificial languages intended for knowledge representation include:
 * CycL
 * Loom
 * OWL
 * KM: the Knowledge Machine (frame-based language used for knowledge representation work)

Techniques of knowledge representation
Semantic networks may be used to represent knowledge. Each node represents a concept and arcs are used to define relations between the concepts.
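
A minimal sketch of such a network, assuming a simple adjacency-list layout; the class name `SemanticNetwork`, its methods, and the example concepts are hypothetical.

```python
from collections import defaultdict

class SemanticNetwork:
    """Nodes are concepts; labelled arcs are relations between them."""

    def __init__(self):
        self.arcs = defaultdict(list)  # source node -> list of (relation, target)

    def add(self, source, relation, target):
        self.arcs[source].append((relation, target))

    def related(self, source, relation):
        """Return all targets reachable from source over arcs with this label."""
        return [t for r, t in self.arcs.get(source, []) if r == relation]

net = SemanticNetwork()
net.add("canary", "is_a", "bird")
net.add("bird", "is_a", "animal")
net.add("bird", "has_part", "wings")

print(net.related("bird", "has_part"))
```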

From the 1960s, the knowledge frame or just frame has been used. A frame consists of slots which contain values; for instance, the frame for house might contain a color slot, number of floors slot, etc.

Frames can behave something like object-oriented programming languages, with inheritance of features described by the "is-a" link. However, there has been no small amount of inconsistency in the usage of the "is-a" link: Ronald J. Brachman wrote a paper titled "What IS-A is and isn't", wherein 29 different semantics were found in projects whose knowledge representation schemes involved an "is-a" link. Other links include the "has-part" link.
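
The house frame and the inheritance behaviour of the "is-a" link can be sketched as follows. This adopts just one of the many possible readings of "is-a" (slot lookup falls back to the parent frame); the `Frame` class and the example slot values are assumptions for illustration.

```python
class Frame:
    """A named collection of slots with an optional "is-a" parent frame."""

    def __init__(self, name, is_a=None, **slots):
        self.name = name
        self.is_a = is_a          # parent frame, or None
        self.slots = dict(slots)  # slot name -> value

    def get(self, slot):
        """Look up a slot locally, then along the chain of is-a links."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.is_a
        return None

building = Frame("building", has_part="roof")
house = Frame("house", is_a=building, color="white", floors=2)
cottage = Frame("cottage", is_a=house, floors=1)

print(cottage.get("floors"), cottage.get("color"), cottage.get("has_part"))
```

Here the cottage overrides the number of floors but inherits its color from the house frame and its "has-part" value from the building frame.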

Frame structures are well suited to the representation of schematic knowledge and stereotypical cognitive patterns. The elements of such schematic patterns are weighted unequally, with higher weights attributed to the more typical elements of a schema. A pattern is activated by certain expectations: if a person sees a big bird, he or she will classify it as a sea eagle rather than a golden eagle, assuming that his or her "sea schema" is currently activated and his or her "land schema" is not.

Frame representations are more object-centered than semantic networks: all the facts and properties of a concept are located in one place, so there is no need for costly search processes in the database.

A script is a type of frame that describes what happens temporally; the usual example given is that of describing going to a restaurant. The steps include waiting to be seated, receiving a menu, ordering, etc.
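
The restaurant script might be sketched as a frame whose main slot is an ordered list of steps. The first three steps come from the description above; the remaining steps, the dictionary layout, and the `next_step` helper are assumptions added for illustration.

```python
# A script as a frame with an ordered "steps" slot.
# Steps after "order" are assumed; they stand in for the "etc." in the description.
restaurant_script = {
    "name": "restaurant visit",
    "steps": ["wait to be seated", "receive menu", "order", "eat", "pay", "leave"],
}

def next_step(script, current):
    """Return the step expected after `current`, or None at the end of the script."""
    steps = script["steps"]
    i = steps.index(current)  # raises ValueError if the step is not in the script
    return steps[i + 1] if i + 1 < len(steps) else None

print(next_step(restaurant_script, "receive menu"))
```

Because the steps are ordered, the script supports expectations about what should happen next in a situation.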