Identification of the varieties of Chinese

Chinese forms part of the Sino-Tibetan family of languages. About one-fifth of the people in the world speak some variety of Chinese as their native language. Internal diversity in Chinese, with respect to grammar, vocabulary, and syntax, is comparable to that of the Romance languages, and greater than that of the Germanic and Slavic languages. However, owing to China's sociopolitical and cultural situation, whether these variants should be known as languages or dialects is a subject of ongoing debate. Some people call Chinese a language and its subdivisions dialects, while others call Chinese a language family and its subdivisions languages.

From a purely descriptive point of view, "languages" and "dialects" are simply arbitrary groups of similar idiolects, and the distinction is irrelevant to linguists who are primarily concerned with describing regional speeches scientifically. However, the language/dialect distinction has far-reaching implications in socio-political issues, such as the national identity of China, regional identities within China, and the very nature of the (Han) Chinese "nation" or "race". As a result, it has become a subject of contention.

Self-descriptions of speakers of regional variants
Although linguists have made great progress in describing and classifying the regional varieties of Chinese over the last century, their classification does not necessarily correspond to how these regional variants have traditionally been viewed and categorized. Thus, although the first-level divisions of Chinese are often referred to as "languages", they do not always correspond to linguistic or cultural self-identity.

It is customary in China to refer to the speeches of cities and provinces, even though these provincial boundaries correspond poorly to the groupings devised by linguists. For example, the various dialects within Anhui Province are often called "Anhui dialect", even though they are scattered across four of the "Chinese languages" recognized by linguists: Mandarin, Wu, Hui, and Gan. Similarly, the dialects that linguists consider to be part of the Wu language are spoken across Zhejiang Province, Jiangsu Province, Anhui Province, and Shanghai Municipality, and so can be described variously as "Zhejiang dialect", "Jiangsu dialect", "Anhui dialect", and "Shanghai dialect". Another example is that although the Sichuan dialect is considered to be distinct from the Beijing dialect, linguists consider both to be part of the Mandarin group. Given this mismatch between geography-based and linguistics-based classification, linguistic self-identity is also complex.

There is a tendency to regard dialects as "variations" of a single written Chinese language. This is partly because speakers of different varieties of Chinese have historically used one single formal written language. Before the 20th century, Classical Chinese was used, an archaic form of Chinese with grammar and style different from all modern Chinese languages; thus, it was possible to regard the common written language as detached and "above" all of the spoken languages. However, the 20th century saw the replacement of Classical Chinese with "Vernacular Chinese", a written standard that is based on the modern Mandarin group of dialects and used by all Chinese-speakers regardless of dialect group. This development has complicated the idea that all Chinese languages, Mandarin or not, share one single written language, as this one single written language is now based on one particular spoken group of dialects. This "Standard Written Chinese" is essentially consistent in terms of grammar and vocabulary when written by speakers of different Chinese languages, and differs only in the pronunciation of characters in the local Chinese language. However, the spoken Chinese languages are generally not mutually intelligible with Standard Written Chinese even when recited with the local language's pronunciation, since the written language, being based on Mandarin, may not use the same grammar and vocabulary. Proponents of Chinese as a single language with many dialects describe grammatical/lexical deviations of the local language from the single written language as "slang", even if these differences persist in the acrolectal (formal) level.

At the same time, regions with strong senses of regional cohesiveness have become more aware of regional groupings of dialects in recent times, and have formed self-identities connected to these linguistic groupings. In some self-identified linguistic groups, such as Wu or Hakka, these groups correspond well to those devised by linguists. In other self-identified linguistic groups, such as Teochew and Taiwanese, the correspondences are not as exact.

It is notable that in Chinese, whether the standard or the regional languages, there is typically no conscious distinction between "language" and "dialect" when referring to any of the languages, unless the subject matter necessitates the distinction (and even then, the distinction is not always made). If, for example, a Guangdong inhabitant refers to the Suzhou dialect, he talks about "Suzhou speech", not "Suzhou dialect" or the like.

Implications of the language / dialect distinction
The idea of a single language has major overtones in politics and self-identity, which explains the amount of emotion over this issue. The idea of Chinese as a language family may suggest that China consists of several different nations, challenge the notion of a single Han Chinese nationality, and legitimize secessionist movements. This is why some Chinese are uncomfortable with it, while supporters of Taiwan independence tend to be strong promoters of Min- and Hakka-language education. Furthermore, for some, suggesting that Chinese is more correctly described as multiple languages implies that the notion of a single Chinese language and a single Chinese state or nationality is artificial.

However, the links between ethnicity, politics, and language can be complex. Many Wu, Min, Hakka, and Cantonese speakers consider their own varieties to be separate spoken languages, but the Han Chinese nationality to be one entity. They do not regard these two positions as contradictory, but consider the Han Chinese an entity of great internal diversity. Moreover, the government of the People's Republic of China officially states that China is a multinational state, and that the term "Chinese" refers to a broader concept, Zhonghua Minzu, that incorporates groups that do not natively speak Chinese, such as Tibetans, Uyghurs, and Mongols. (Groups that do speak Chinese are properly called Han Chinese, and are regarded as one component of a multi-ethnic whole.) This is seen as an ethnic and cultural concept, not a political one. Similarly, in Taiwan, some supporters of Chinese reunification promote the local language, while some supporters of Taiwan independence have little interest in the topic. In addition, the Taiwanese identity incorporates Taiwanese aborigines, who are not considered Han Chinese because they speak Austronesian languages, predate Han Chinese settlement, and are culturally and genetically linked to other Austronesian-speaking peoples such as Filipinos, Malays and Polynesians.

Comparison with Europe
Differences in the socio-political context of Chinese and European languages gave rise to differences in linguistic perception between the two cultures. In Western Europe, Latin remained the written standard for centuries after the spoken language diverged and began shifting into distinct Romance languages, in a situation not unlike the use of Classical Chinese. However, political fragmentation gave rise to independent states roughly the size of Chinese provinces. This eventually generated a political desire to create separate cultural and literary standards to differentiate nation-states and standardize the language within a nation-state. In China, a single cultural and literary standard (Classical Chinese and, later, Vernacular Chinese) continued to exist. Meanwhile, owing to the scale of the country and the obstruction of communication by geography, the spoken language continued to diverge between different cities and counties, much as European languages diverged.

The diverse Chinese spoken forms and common written form constitute a very different linguistic situation from that in Europe. In Europe, linguistic differences sharpened as the language of each nation-state was standardized. For example, a farmer on the French side of the border would start to model his speech and writing after Paris, while his neighbour on the Spanish side would model his after Madrid. The use of local speech became stigmatized. In China, standardization of spoken dialects was weaker, and mostly due to cultural influence. Although, as in Europe, dialects of regional political or cultural capitals were still prestigious and widely used as the region's lingua franca, their linguistic influence depended more on the capital's status and wealth than solely on the political boundaries of the region.

Comparison with India
China's linguistic situation can be compared to that of Northern India. Like Classical Chinese, Sanskrit long played a role as a common written language. However, India has historically been unified for shorter periods of time and was not a single state when intense contact with European countries began. As a result, in contrast to the varieties of Chinese, the descendants of Sanskrit are often recognized as separate languages, twelve of which are official national languages.