Writing system



A writing system is a type of symbolic system used to represent elements or statements expressible in language.

General properties
Writing systems are distinguished from other possible symbolic communication systems in that one must usually understand something of the associated language in order to successfully read and comprehend the text. Contrast this with other possible symbolic systems such as information signs, painting, maps, and mathematics, which do not necessarily depend upon prior knowledge of a given language in order to extract their meaning.

Every human community possesses language, a feature regarded by many as an innate and defining condition of humankind. However, the development and adoption of writing systems has occurred only sporadically. Once established, writing systems are on the whole modified more slowly than their spoken counterparts, and often preserve features and expressions which are no longer current in the discourse of the speech community. The great benefit conferred by writing systems is their ability to maintain a persistent record of information expressed in a language, which can be retrieved independently of the initial act of formulation.

All writing systems require:
 * a set of defined base elements or symbols, individually termed characters or graphemes, and collectively called a script;
 * a set of rules and conventions understood and shared by a community, which arbitrarily assign meaning to the base elements, their ordering, and relations to one another;
 * a language (generally a spoken language) whose constructions are represented and able to be recalled by the interpretation of these elements and rules;
 * some physical means of distinctly representing the symbols by application to a permanent or semi-permanent medium, so that they may be interpreted (usually visually, but tactile systems have also been devised).

Basic terminology


The study of writing systems has developed along partially independent lines in the examination of individual scripts, and as such the terminology employed differs somewhat from field to field.

The generic term text may be used to refer to an individual product of a writing system. The act of composing a text may be referred to as writing, and the act of interpreting the text as reading. In the study of writing systems, orthography refers to the method and rules of observed writing structure (literal meaning, "correct writing"), and in particular for alphabetic systems, includes the concept of spelling.

A grapheme is the technical term coined to refer to the specific base or atomic units of a given writing system. Graphemes are the minimally significant elements which taken together comprise the set of "building blocks" out of which texts of a given writing system may be constructed, along with rules of correspondence and use. The concept is similar to that of the phoneme used in the study of spoken languages. For example, in the Latin-based writing system of standard contemporary English, examples of graphemes include the majuscule and minuscule forms of the twenty-six letters of the alphabet (corresponding to various phonemes), marks of punctuation (mostly non-phonemic), and a few other symbols such as those for numerals (logograms for numbers).

Note that an individual grapheme may be represented in a wide variety of ways, where each variation is visually distinct in some regard, but all are interpreted as representing the "same" grapheme. These individual variations are known as allographs of a grapheme (compare with the term allophone used in linguistic study). For example, the minuscule letter a has different allographs when written as a cursive, block, or typed letter. The selection between different allographs may be influenced by the medium used, the writing instrument, the stylistic choice of the writer, and the largely unconscious features of an individual's handwriting.

The terms glyph, sign and character are sometimes used to refer to a grapheme. Common usage varies from discipline to discipline; compare cuneiform sign, Maya glyph, Chinese character. The glyphs of most writing systems are made up of lines (or strokes) and are therefore called linear, but some writing systems are non-linear, with glyphs made up of other types of marks, such as the wedges of cuneiform and the raised dots of Braille.

Writing systems are conceptual systems, as are the languages to which they refer. Writing systems may be regarded as complete according to the extent to which they are able to represent all that may be expressed in the spoken language.

History of writing systems
Writing systems were preceded by proto-writing, systems of ideographic and/or early mnemonic symbols. The best known examples are:
 * Jiahu Script Symbols on tortoise shells in Jiahu, ca. 6600 BC
 * Vinca script (Tărtăria tablets), ca. 4500 BC
 * Early Indus script, ca. 3500 BC

The invention of the first writing systems is roughly contemporary with the beginning of the Bronze Age in the late Neolithic of the late 4th millennium BC. The Sumerian archaic cuneiform script and the Egyptian hieroglyphs are generally considered the earliest writing systems, both emerging out of their ancestral proto-literate symbol systems from ca. 3200 BC with earliest coherent texts from about 2600 BC.

The Chinese script may have originated independently of the Middle Eastern scripts, around 1600 BC. The pre-Columbian Mesoamerican writing systems (including among others Olmec and Maya scripts) are also generally believed to have had independent origins.

It is thought that the first true alphabetic writing appeared around 2000 BC, as a representation of language developed by Semitic workers in Egypt (see History of the alphabet). Most other alphabets in the world today either descended from this one innovation, many via the Phoenician alphabet, or were directly inspired by its design.

Types of writing systems
See List of writing systems for a list of writing systems grouped by type.

The oldest-known forms of writing were primarily logographic in nature, based on pictographic and ideographic elements. Most writing systems can be broadly divided into three categories: logographic, syllabic and alphabetic (or segmental); however, all three may be found in any given writing system in varying proportions, often making it difficult to categorise a system uniquely. The term complex system is sometimes used to describe those where the admixture makes classification problematic.

See also: phonemic and phonetic orthography.

Logographic writing systems
A logogram is a single written character which represents a complete grammatical word. Most Chinese characters are classified as logograms.

As each character represents a single word (or, more precisely, a morpheme), many logograms are required to write all the words of a language. The vast number of logograms, and the need to memorize what each one means, is the major disadvantage of logographic systems compared with alphabetic systems. However, since the meaning is inherent to the symbol, the same logographic system can theoretically be used to represent different languages. In practice, this is only true for closely related languages, such as the Chinese languages, as syntactical constraints reduce the portability of a given logographic system. Japanese uses Chinese logograms extensively in its writing system, with most of the symbols carrying the same or similar meanings. However, the semantics, and especially the grammar, are different enough that a long Chinese text is not readily understandable to a Japanese reader without any knowledge of basic Chinese grammar, though short and concise phrases such as those on signs and in newspaper headlines are much easier to comprehend.

While most languages do not use wholly logographic writing systems, many languages use some logograms. The Hindu-Arabic numerals are a good example of modern western logograms: everyone who uses those symbols understands what 1 means, whether he or she calls it one, eins, uno, yi, ichi or ehad. Other western logograms include the ampersand &, used for and, the at sign @, used in many contexts for at, the percent sign %, and the many signs representing units of currency ($, ¢, €, £, ¥ and so on).

Logograms are sometimes called ideograms, a word that refers to symbols which graphically represent abstract ideas, but linguists avoid this use, as Chinese characters are often semantic–phonetic compounds, symbols which include an element that represents the meaning and an element that represents the pronunciation. Some nonlinguists distinguish between lexigraphy and ideography, where symbols in lexigraphies represent words, and symbols in ideographies represent words or morphemes.

The most important (and, to a degree, the only surviving) modern logographic writing system is the Chinese one, whose characters are used, with varying degrees of modification, in Chinese, Japanese, Korean, Vietnamese, and other east Asian languages. Ancient Egyptian hieroglyphics and the Mayan writing system are also systems with certain logographic features, although they have marked phonetic features as well, and are no longer in current use.

See List of writing systems for a list of predominantly-logographic writing systems.

Syllabic writing systems
Whereas logographic writing systems use a single symbol for an entire word, a syllabary is a set of written symbols that represent (or approximate) syllables, which make up words. A symbol in a syllabary typically represents a consonant sound followed by a vowel sound, or just a vowel alone. In a true syllabary there is no systematic graphic similarity between phonetically related characters (though some do have graphic similarity for the vowels). That is, the characters for "ke", "ka", and "ko" have no similarity to indicate their common "k"-ness. Compare abugida, where each grapheme typically represents a syllable but where characters representing related sounds are similar graphically (typically, a common consonantal base is annotated in a more or less consistent manner to represent the vowel in the syllable).

Syllabaries are best suited to languages with relatively simple syllable structure, such as Japanese. The English language, on the other hand, allows complex syllable structures, with a relatively large inventory of vowels and complex consonant clusters, making it cumbersome to write English words with a syllabary. To write English using a syllabary, every possible syllable in English would have to have a separate symbol, and whereas the number of possible syllables in Japanese is no more than about fifty to sixty, in English there are many thousands.

Other languages that use syllabic writing include Mycenaean Greek (Linear B) and Native American languages such as Cherokee. Several languages of the Ancient Near East used forms of cuneiform, which is a syllabary with some non-syllabic elements.

See List of writing systems for a list of syllabaries.

Alphabetic writing systems
An alphabet is a small set of letters (basic written symbols), each of which roughly represents, or historically represented, a phoneme of a spoken language. The word alphabet is derived from alpha and beta, the first two symbols of the Greek alphabet.

In a perfectly phonetic alphabet, the phonemes and letters would correspond perfectly in two directions: a writer could predict the spelling of a word given its pronunciation, and a speaker could predict the pronunciation of a word given its spelling. Each language has general rules that govern the association between letters and phonemes, but, depending on the language, these rules may or may not be consistently followed.

Perfectly phonetic alphabets are very easy to use and learn, and languages that have them (for example Serbian) have much lower barriers to literacy than languages such as English, which has a very complex and irregular spelling system. As languages often evolve independently of their writing systems, and writing systems have been borrowed for languages they were not designed for, the degree to which letters of an alphabet correspond to phonemes of a language varies greatly from one language to another and even within a single language. In modern times, when linguists invent a writing system for a language that did not previously have one, the goal is usually to develop a phonetic alphabet. The International Phonetic Alphabet (IPA) is an example of an alphabet designed on this principle, although it is intended for transcribing the sounds of any language rather than for writing one language in particular.

See alphabet for more information about alphabets. See List of writing systems for a list of alphabetic writing systems.

Abjads
The first type of alphabet to be developed was the abjad. An abjad is an alphabetic writing system in which there is one symbol per consonant. Abjads differ from regular alphabets in that they have characters only for consonantal sounds; vowels are not usually marked.

All known abjads (except perhaps Tifinagh) belong to the Semitic family of scripts, and derive from the original Northern Linear Abjad. The reason for this is that Semitic languages and the related Berber languages have a morphemic structure which makes the denotation of vowels redundant in most cases.

Some abjads (like Arabic and Hebrew) have markings for vowels as well, but only use them in special contexts, such as for teaching. Many scripts derived from abjads have been extended with vowel symbols to become full alphabets, the most famous case being the derivation of the Greek alphabet from the Phoenician abjad. This has mostly happened when the script was adapted to a non-Semitic language.
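The optional nature of vowel marking in such scripts can be illustrated with a minimal sketch. It assumes the text is encoded in Unicode, where Hebrew points are stored as separate nonspacing combining marks (general category Mn); the function name strip_vowel_marks and the sample word are illustrative choices, not taken from any standard. Note that category Mn also covers marks that are not vowels (such as the dot distinguishing shin from sin), so this is an approximation.

```python
import unicodedata

def strip_vowel_marks(text: str) -> str:
    """Drop nonspacing combining marks (category Mn), keeping the base letters.

    Assumption: the points are encoded as separate combining characters,
    as they are for Hebrew and Arabic in Unicode.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

# Hebrew "shalom" written with points, as it might appear in teaching material:
# shin + shin dot + qamats, lamed, vav + holam, final mem.
pointed = "\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD"
plain = strip_vowel_marks(pointed)

print(pointed, "->", plain)
print(len(pointed), "code points ->", len(plain), "code points")
```

The unpointed result is the form in which the word is ordinarily written; the reader supplies the vowels from knowledge of the language.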

The term abjad takes its name from the old order of the Arabic alphabet's consonants Alif, Bá, Jim, Dál, though the word may have earlier roots in Phoenician or Ugaritic.

Abjad is still the word for alphabet in Arabic, Malay, and Indonesian.

See List of writing systems for a list of abjad-based writing systems.

Abugidas
An abugida is an alphabetic writing system whose basic signs denote consonants with an inherent vowel, and in which consistent modifications of the basic sign indicate following vowels other than the inherent one.

Thus, in an abugida there is no sign for "k", but instead one for "ka" (if "a" is the inherent vowel), and "ke" is written by modifying the "ka" sign in a way that is consistent with how one would modify "la" to get "le". In many abugidas the modification is the addition of a vowel sign, but other possibilities are imaginable (and used), such as rotation of the basic sign, addition of diacritical marks, and so on.
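The "ka" to "ke" pattern can be sketched using Devanagari, chosen here only as a convenient illustration (it is one of the Brahmic scripts). The sketch assumes Unicode text, in which the base sign carries the inherent vowel and a dependent vowel sign is appended after it; the helper name with_e is hypothetical.

```python
import unicodedata

VOWEL_SIGN_E = "\u0947"  # DEVANAGARI VOWEL SIGN E, a dependent vowel sign

def with_e(base: str) -> str:
    """Replace the inherent vowel of a base sign with "e" by appending the sign."""
    return base + VOWEL_SIGN_E

ka = "\u0915"  # DEVANAGARI LETTER KA (consonant k with inherent vowel a)
la = "\u0932"  # DEVANAGARI LETTER LA (consonant l with inherent vowel a)

for base in (ka, la):
    syllable = with_e(base)
    parts = [unicodedata.name(ch) for ch in syllable]
    print(base, "->", syllable, parts)
```

The same modification is applied to both base signs, which is the consistency that distinguishes an abugida from a true syllabary.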

The obvious contrast is with syllabaries, which have one distinct symbol per possible syllable, and in which the signs for each syllable have no systematic graphic similarity. In abugidas, the graphic similarity comes from the fact that most are derived from abjads: the consonant letters become the base symbols carrying the inherent vowel, and the new vowel symbols are markings added to the base symbol.



The Ethiopic script is an abugida, although the vowel modifications in Ethiopic are not entirely systematic. Canadian Aboriginal Syllabics can be considered abugidas, although they are rarely thought of in those terms. The largest single group of abugidas, however, is the Brahmic family of scripts, which includes nearly all the scripts used in India and Southeast Asia.

The name abugida is derived from the first four characters of an order of the Ge'ez script used in some religious contexts. The term was coined by Peter T. Daniels.

See List of writing systems for a list of abugida-based writing systems.

Featural writing systems
A featural script represents finer detail than an alphabet. Here symbols do not represent whole phonemes, but rather the elements (features) that make up the phonemes, such as voicing or its place of articulation. Theoretically, each feature could be written with a separate letter; and abjads or abugidas, or indeed syllabaries, could be featural, but the only prominent system of this sort is Korean Hangul. In Hangul, the featural symbols are combined into alphabetic letters, and these letters are in turn joined into syllabic blocks, so that the system combines three levels of phonological representation.
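The step from alphabetic letters to syllabic blocks can be sketched in code. The example assumes Unicode's encoding of Hangul, in which each precomposed syllable block decomposes canonically into its letters and can be computed arithmetically from letter indices; the function names are hypothetical. The featural level itself (how letter shapes reflect articulation) is graphic and is not captured by this encoding.

```python
import unicodedata

SYLLABLE_BASE = 0xAC00  # first precomposed Hangul syllable in Unicode
VOWEL_COUNT = 21        # number of medial vowels
TAIL_COUNT = 28         # number of final consonants, counting "no final" as 0

def decompose_hangul(syllable: str) -> list:
    """Return the Unicode names of the letters inside one syllable block."""
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", syllable)]

def compose_hangul(lead: int, vowel: int, tail: int = 0) -> str:
    """Build a syllable block from zero-based letter indices."""
    return chr(SYLLABLE_BASE + (lead * VOWEL_COUNT + vowel) * TAIL_COUNT + tail)

block = compose_hangul(18, 0, 4)  # initial h, vowel a, final n
print(block)
for name in decompose_hangul(block):
    print(" ", name)
```

Composing the block directly and recombining its letters with unicodedata.normalize("NFC", ...) should give the same character, since both follow the same layout.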

Directionality
Different scripts are written in different directions. The early alphabet could be written in any direction: either horizontal (left-to-right or right-to-left) or vertical (up or down). It could also be written boustrophedon: starting horizontally in one direction, then turning at the end of the line and reversing direction. Egyptian hieroglyphs are another script of variable direction: the beginning of a horizontally written line is indicated by the direction in which the animal and human figures face.

The Greek alphabet and its successors settled on a left-to-right pattern, from the top to the bottom of the page. Other scripts, such as Arabic and Hebrew, came to be written right-to-left. Scripts that incorporate Chinese characters have traditionally been written vertically (top-to-bottom), from the right to the left of the page, but nowadays are frequently written left-to-right, top-to-bottom, due to Western influences, a growing need to accommodate terms in the Roman alphabet, and technical limitations in popular electronic document formats. The Mongolian alphabet is the only script written top-to-bottom with columns running left-to-right; this direction originated from an ancestral Semitic direction by rotating the page 90° counter-clockwise to conform to the appearance of Chinese writing. Scripts with lines written away from the writer, from bottom to top, also exist, such as several used in the Philippines and Indonesia.
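On computers, horizontal direction is usually recorded as a property of each character. A minimal sketch, assuming Unicode's bidirectional classes as exposed by Python's unicodedata module (the helper name direction_class and the choice of sample letters are illustrative):

```python
import unicodedata

def direction_class(ch: str) -> str:
    """Return the Unicode bidirectional class of a single character."""
    return unicodedata.bidirectional(ch)

samples = {
    "Latin a": "a",
    "Greek alpha": "\u03B1",
    "Hebrew alef": "\u05D0",
    "Arabic alef": "\u0627",
}

for label, ch in samples.items():
    # "L" is left-to-right, "R" right-to-left, "AL" right-to-left (Arabic letter).
    print(f"{label}: {direction_class(ch)}")
```

These classes describe horizontal direction only; vertical writing is normally handled by page layout rather than by a character property.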

Writing systems on the computer
Historically, separate ISO/IEC standards were defined for individual writing systems so that they could be implemented on computers (or in electronic form). Most of those character sets are now covered by a single collective standard, ISO/IEC 10646 (the Universal Character Set), which is kept synchronized with Unicode. In Unicode, each character, from all languages' writing systems, is given a unique identification number, known as its code point. The computer's operating system interprets the codes for the characters stored in a file and retrieves the corresponding glyphs from a font file, so that the characters can be displayed on the page or screen.
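The relationship between a character and its code point can be inspected directly. The following is a minimal sketch using Python's built-in ord function and unicodedata module; the helper name describe and the sample characters are illustrative, and UTF-8 is shown only as one common way of storing code points in a file.

```python
import unicodedata

def describe(ch: str) -> tuple:
    """Return (code point label, Unicode name, UTF-8 bytes) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8"))

# One character each from the Latin, Greek, Hebrew and Chinese scripts.
for ch in ("A", "\u03B1", "\u05D0", "\u4E2D"):
    label, name, encoded = describe(ch)
    print(ch, label, name, encoded)
```

The code point identifies the character itself; its visual shape comes from whichever font is used to display it.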

A keyboard is the device most commonly used for writing via computer. The keyboard generates specific standardized codes when keys are pressed. Pressing keys in combination with modifier keys (Ctrl, Alt, Shift, Option, Command, and so on), with function keys or the numeric keypad, or with lock states such as Caps Lock and Num Lock, generates further codes, which are sent to the computer. The operating system intercepts these signals and converts them to the appropriate characters, based on the keyboard layout, code page, and input method configured on that computer, and then delivers the resulting characters to the running application software, which displays them on the screen using the available fonts and the video adapter.

In computers and telecommunication systems, graphemes and other grapheme-like units that are required for text processing are represented by "characters" that typically manifest in encoded form. For technical aspects of computer support for various writing systems, see the articles UCS (Universal Character Set), CJK (Chinese, Japanese, Korean) and Bi-directional text, as well as Category:Character encoding.
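The distinction between a grapheme and the encoded "characters" that represent it can be seen in a small sketch. It assumes Unicode text and uses the standard normalization forms available in Python's unicodedata module; the variable names are illustrative.

```python
import unicodedata

# The same grapheme, e with an acute accent, in two encoded forms.
decomposed = "e\u0301"                               # letter e + COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # single precomposed character

print(len(decomposed), [unicodedata.name(ch) for ch in decomposed])
print(len(composed), [unicodedata.name(ch) for ch in composed])

# The two strings differ as raw sequences but are canonically equivalent.
print(decomposed == composed)
print(unicodedata.normalize("NFD", composed) == decomposed)
```

A reader sees one unit in both cases, while text-processing software sees either one or two encoded characters, which is why comparisons are often made after normalization.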