Phylogenetic nomenclature

Phylogenetic nomenclature is a method of nomenclature for taxa in biology that uses phylogenetic definitions for taxon names as explained below. This contrasts with the traditional approach, in which taxon names are defined by a type and a rank. It is currently not regulated, but the International Code of Phylogenetic Nomenclature (PhyloCode) is intended to regulate it once it is ratified.

Definitions
In the traditional system (often called Linn(a)ean nomenclature or rank-based nomenclature), as implemented in the codes of biological nomenclature that are currently in force, names are defined by a type and a rank. For example, Tyrannosaurus rex is defined as the taxon that contains CM 9380 (its type specimen) and has the rank of species, Tyrannosaurus is defined as the taxon that contains T. rex (its type species) and has the rank of genus, and Tyrannosauridae is defined as the taxon that contains Tyrannosaurus (its type genus) and has the rank of family. The ranks are not defined (except in relation to each other).

The rank of a taxon influences its name: species names must consist of two words, the first of which is a genus name; the names of genera must be a single word in the singular; the names of families must be composed of the name of their type genus and the ending -idae (zoology) or -aceae (elsewhere); and so on. Thus, when the rank of a taxon is changed (as every taxonomist is free to do), the name of that taxon, of taxa it contains, and of taxa that contain it must often change as well. Some biologists feel that this is unsatisfactory, because it creates instability in nomenclature that is caused by what they view as subjective judgments, not by changes in our knowledge of phylogeny.

Phylogenetic nomenclature, on the other hand, uses phylogenetic definitions to tie a name to a clade, a taxon consisting solely of an ancestor and all its descendants, in such a manner that the meaning of the name is fixed under any phylogenetic hypothesis: the choice of hypothesis will affect what organisms are thought to be included in a named taxon, but it will not affect what organisms the name actually applies to. The name is independent of theory revision.
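The difference between a fixed definition and hypothesis-dependent contents can be illustrated with a minimal Python sketch. It assumes a definition of the form "the last common ancestor of A and B, and all descendants of that ancestor" (one of the forms described in the next section) and applies it to two hypothetical trees, H1 and H2, that differ only in where a taxon D is placed. The tree encoding (a child-to-parent map), the node labels and the function names are illustrative assumptions and are not taken from any code of nomenclature.

```python
def lineage(parent, node):
    """Return node followed by its ancestors, ordered toward the root."""
    chain = [node]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def node_based_clade(parent, a, b):
    """Apply 'the last common ancestor of a and b, and all its descendants'."""
    b_line = set(lineage(parent, b))
    lca = next(x for x in lineage(parent, a) if x in b_line)
    nodes = set(parent) | set(parent.values())
    return {x for x in nodes if lca in lineage(parent, x)}


# Hypothesis 1 (hypothetical): D branches off before A and B split.
H1 = {"A": "n", "B": "n", "n": "r", "D": "r"}
# Hypothesis 2 (hypothetical): D is the closest relative of A.
H2 = {"A": "m", "D": "m", "m": "n", "B": "n", "n": "r"}

for label, tree in (("H1", H1), ("H2", H2)):
    print(label, sorted(node_based_clade(tree, "A", "B")))
```

The definition passed to both calls is identical; only the inferred contents differ. D falls outside the named clade under H1 and inside it under H2.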

Phylogenetic definitions of clade names
All phylogenetic definitions for the names of clades are based on the definition of "clade": "an ancestor and all its descendants". This can be done in different ways, as explained in Article 9 of the PhyloCode.

1) The ancestor could be specified directly: "X and all its descendants", where X is a specimen, a breeding pair, a population, a species or some other taxon. This would be called an ancestor-based definition. Although it is an important theoretical possibility, it has probably never been attempted in reality, because the ancestors of the vast majority of taxa that one might wish to name are unknown.

2) The ancestor can be indicated by its relation to two or more specifiers that are mentioned explicitly. This can be done in three ways, which are shown in the diagram:


 * A node-based definition could read: "the last common ancestor of A and B, and all descendants of that ancestor". Thus, the entire line below the junction of A and B does not belong to the clade to which the name with this definition refers.
 * A branch-based definition, often called a stem-based definition, could read: "the first ancestor of A which is not also an ancestor of C, and all descendants of that ancestor". Thus, the entire line below the junction of A and B does belong to the clade to which the name with this definition refers.
 * An apomorphy-based definition could read: "the first ancestor of A to possess derived trait M as inherited by A, and all descendants of that ancestor". In the diagram, M evolves at the intersection of the horizontal line with the tree. Thus, the clade to which the name with this definition refers contains that part of the line below the last common ancestor of A and B which corresponds to ancestors possessing the apomorphy M. The rest of the line is excluded.
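The three kinds of definition with explicit specifiers can be compared on a toy tree. The following Python sketch is only an illustration under stated assumptions: the tree is stored as a child-to-parent map, the labels s1 and s2 stand for hypothetical successive ancestors on the line below the junction of A and B (labelled n), trait M is assumed to arise in s2, and a specifier is treated as the first member of its own lineage. None of the function names come from the PhyloCode.

```python
# Toy tree as a child -> parent map; all node labels are hypothetical.
PARENT = {
    "C": "root",
    "s1": "root",  # earliest ancestor of A that is not an ancestor of C
    "s2": "s1",    # ancestor in which trait M is assumed to arise
    "n": "s2",     # last common ancestor of A and B
    "A": "n",
    "B": "n",
}
HAS_M = {"s2", "n", "A", "B"}  # assumed distribution of trait M


def lineage(node):
    """Return node followed by its ancestors, ordered toward the root."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def clade(ancestor):
    """Return the ancestor together with all of its descendants."""
    nodes = set(PARENT) | set(PARENT.values())
    return {x for x in nodes if ancestor in lineage(x)}


def node_based(a, b):
    """Last common ancestor of a and b, and all its descendants."""
    b_line = set(lineage(b))
    lca = next(x for x in lineage(a) if x in b_line)
    return clade(lca)


def branch_based(a, c):
    """First ancestor of a that is not an ancestor of c, and all its descendants."""
    c_line = set(lineage(c))
    own = [x for x in lineage(a) if x not in c_line]
    return clade(own[-1])  # the entry closest to the root is the earliest


def apomorphy_based(a, has_trait):
    """First ancestor of a to possess the trait as inherited by a, and its descendants."""
    if a not in has_trait:
        raise ValueError("the specifier itself must possess the trait")
    first = a
    for x in lineage(a):
        if x not in has_trait:
            break  # the unbroken run of trait-bearing ancestors ends here
        first = x
    return clade(first)


print(sorted(node_based("A", "B")))
print(sorted(branch_based("A", "C")))
print(sorted(apomorphy_based("A", HAS_M)))
```

On this tree the node-based clade contains only n, A and B; the branch-based clade additionally contains s1 and s2; and the apomorphy-based clade contains s2 but not s1, because s1 is assumed to lack M.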

3) The ancestor can be indicated by its relation to two or more specifiers that are not mentioned explicitly, but described as members of another taxon that fulfill a certain criterion. Such definitions first describe an unnamed clade and then use its extant members as specifiers for a node-based definition. Examples are:


 * A branch-modified node-based definition could read: "the last common ancestor of all extant specimens/populations/species that share a more recent common ancestor with A than with C".
 * An apomorphy-modified node-based definition could read: "the last common ancestor of all extant specimens/populations/species that possess derived trait M as inherited by A".
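Both modified node-based definitions can be sketched in the same way. The Python example below is a rough illustration, assuming a hypothetical toy tree with one extinct side branch X, a hand-written set of extant taxa, and a trait set that records only occurrences of M homologous with those in A. The definitions are read as continuing with "and all descendants of that ancestor"; all names are illustrative.

```python
# Toy tree as a child -> parent map; all node labels are hypothetical.
PARENT = {
    "C": "root",
    "s1": "root",
    "X": "s1",   # extinct side branch on the line leading to A and B
    "n": "s1",   # last common ancestor of A and B
    "A": "n",
    "B": "n",
}
EXTANT = {"A", "B", "C"}            # assumption: X and all ancestors are extinct
HAS_M = {"s1", "X", "n", "A", "B"}  # assumption: M arises in s1


def lineage(node):
    """Return node followed by its ancestors, ordered toward the root."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def clade(ancestor):
    """Return the ancestor together with all of its descendants."""
    nodes = set(PARENT) | set(PARENT.values())
    return {x for x in nodes if ancestor in lineage(x)}


def last_common_ancestor(nodes):
    """Return the most recent ancestor shared by every node given."""
    nodes = list(nodes)
    shared = lineage(nodes[0])
    for other in nodes[1:]:
        other_line = set(lineage(other))
        shared = [x for x in shared if x in other_line]
    return shared[0]


def branch_modified_node_based(a, c):
    """Clade of the LCA of all extant taxa closer to a than to c."""
    c_line = set(lineage(c))
    earliest = [x for x in lineage(a) if x not in c_line][-1]
    specifiers = clade(earliest) & EXTANT  # implicit specifiers
    return clade(last_common_ancestor(specifiers))


def apomorphy_modified_node_based(has_trait):
    """Clade of the LCA of all extant taxa that possess the trait."""
    specifiers = EXTANT & has_trait  # implicit specifiers
    return clade(last_common_ancestor(specifiers))


print(sorted(branch_modified_node_based("A", "C")))
print(sorted(apomorphy_modified_node_based(HAS_M)))
```

In both cases the implicit specifiers are A and B, so the named clade starts at n. The extinct taxon X is excluded, although it is closer to A than to C and possesses M.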

Other forms of phylogenetic definition are possible as well.

Phylogenetic definitions of the names of para- and polyphyletic taxa
Traditionally, only clades are named in phylogenetic nomenclature. However, it is also possible to create phylogenetic definitions for the names of paraphyletic taxa. Assuming Mammalia and Aves are defined, Reptilia could for instance be defined as "the most recent common ancestor of birds and mammals and all its descendants except birds and mammals". This makes it possible to name taxa that are not currently named – and even taxa that cannot be named – under the rank-based codes without seriously disrupting existing classifications, such as "all organisms that share a more recent common ancestor with Homo sapiens than with birds and plesiomorphically keep laying eggs". Names of polyphyletic taxa could be defined by referring to the sum of two or more clades or paraphyletic taxa.
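Such definitions amount to set operations on clades, as the following Python sketch suggests. The tree is a greatly simplified, hypothetical stand-in for amniote phylogeny: the nodes labelled Mammalia and Aves represent the ancestors of those already-defined clades, and the other labels, the function names and the variable `sum_taxon` are illustrative assumptions.

```python
# Greatly simplified toy tree as a child -> parent map (illustrative only).
PARENT = {
    "Mammalia": "amniote_ancestor",
    "sauropsid_ancestor": "amniote_ancestor",
    "lizards": "sauropsid_ancestor",
    "archosaur_ancestor": "sauropsid_ancestor",
    "crocodiles": "archosaur_ancestor",
    "Aves": "archosaur_ancestor",
    "platypus": "Mammalia",
    "humans": "Mammalia",
    "ostriches": "Aves",
    "sparrows": "Aves",
}


def lineage(node):
    """Return node followed by its ancestors, ordered toward the root."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def clade(ancestor):
    """Return the ancestor together with all of its descendants."""
    nodes = set(PARENT) | set(PARENT.values())
    return {x for x in nodes if ancestor in lineage(x)}


def last_common_ancestor(a, b):
    """Return the most recent ancestor shared by a and b."""
    b_line = set(lineage(b))
    return next(x for x in lineage(a) if x in b_line)


# Paraphyletic taxon: a clade minus two of its subclades.
mrca = last_common_ancestor("Aves", "Mammalia")
reptilia = clade(mrca) - clade("Aves") - clade("Mammalia")

# Polyphyletic taxon: the sum of two clades (the name is hypothetical).
sum_taxon = clade("Aves") | clade("Mammalia")

print(sorted(reptilia))
print(sorted(sum_taxon))
```

Here `reptilia` keeps the common ancestor, the lizards and the crocodiles but none of the birds or mammals, while `sum_taxon` contains birds and mammals without their common ancestor.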

Ranks
Under the rank-based nomenclature codes, taxa that are not explicitly associated with a rank cannot be formally named, because taxon names are defined by a type and a rank. The number of generally recognized ranks is limited, however. Especially in recent decades (due to advances in phylogenetics), taxonomists have named many taxa (often nested ones) for which no rank is available – especially if the now widespread convention is followed that only clades should be named. Gauthier et al. (1988) claimed that a classification which uses the common array of ranks, while including Aves within a monophyletic Reptilia, keeping Reptilia at its traditional rank of class, and recognizing well-established names for taxa within Reptilia, is forced to demote Aves substantially, perhaps to the rank of genus. All ~12,000 known species of extant and extinct birds would then have to be incorporated into such a genus. Various solutions have been proposed. Patterson and Rosen (1977) suggested nine new ranks between family and superfamily in order to be able to classify a clade of herrings, and McKenna and Bell (1997) introduced a large array of new ranks in order to cope with the diversity of Mammalia; these have not been widely adopted.

The current codes also each have rules stating that names must have certain endings if they are applied to taxa that have certain ranks. When a taxon changes rank from one classification to another, its name must change its suffix. Ereshefsky (1997:512) gave an example: "The Linnaean rule of assigning rank-specific suffices [sic] gives rise to even more confusing cases. Simpson (1963, 29–30) and Wiley (1981, 238) agree that the genus Homo belongs to a particular taxon. They disagree, however, on that taxon's rank. Acting in accord with the Linnaean system, they attach different suffixes to the root Homini [actually Homin-] and give the taxon in question different names: Wiley calls it 'Hominini' [tribe rank] and Simpson calls it 'Hominidae' [family rank]. Their disagreement does not stop there. Wiley believes that the taxon just cited is a part of a more inclusive taxon which is a family. Using the root Homini, and following the rules of the Linnaean system [more precisely, the zoological code], he names the more inclusive taxon 'Hominidae.' So for Wiley and Simpson, the name 'Hominidae' refers to two different taxa. In brief, the Linnaean system causes Wiley and Simpson to assign different names to what they agree is the same taxon, and it causes them to give the same name to what they agree are different taxa."
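The mechanism behind Ereshefsky's example can be sketched in a few lines of Python. The suffix table lists the standard family-group endings of the zoological code; the labels T (the taxon that both authors agree contains Homo) and U (the more inclusive taxon recognized by Wiley), as well as the function and variable names, are hypothetical and serve only as an illustration.

```python
# Rank-specific endings for family-group names under the zoological code.
ZOOLOGICAL_SUFFIX = {
    "superfamily": "oidea",
    "family": "idae",
    "subfamily": "inae",
    "tribe": "ini",
}


def rank_based_name(stem, rank):
    """Form a family-group name from the stem of the type genus and a rank."""
    return stem + ZOOLOGICAL_SUFFIX[rank]


# Ranks assigned by each author; T and U are hypothetical taxon labels.
simpson_ranks = {"T": "family"}
wiley_ranks = {"T": "tribe", "U": "family"}

simpson_names = {t: rank_based_name("Homin", r) for t, r in simpson_ranks.items()}
wiley_names = {t: rank_based_name("Homin", r) for t, r in wiley_ranks.items()}

print(simpson_names)
print(wiley_names)
```

The same taxon T comes out as Hominidae for Simpson and as Hominini for Wiley, while the name Hominidae ends up attached to T in one classification and to U in the other.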

In phylogenetic nomenclature, ranks have no bearing on the spelling of taxon names (see e.g. Gauthier (1994) and the PhyloCode). Ranks are, however, not altogether forbidden in phylogenetic nomenclature. They are merely decoupled from nomenclature: they do not influence which names can be used, which taxa are associated with which names, and which names can refer to nested taxa.

Philosophy
Rank-based nomenclature is theory-free. Phylogenetic nomenclature assumes that there is a phylogeny; in other words, it assumes evolution. It takes the tree of life as given and can be said to tie labels to precisely defined places on it.

The inherent vagueness of definitions in terms of types and undefined ranks has led Laurin (2008) to compare the nomenclature of biological taxa to the nomenclatures of other sciences and similar endeavors, such as chemistry, stratigraphy, and geopolitics. On stratigraphy, Laurin (p. 226) remarks: "In geology, time units were initially defined on the basis of a type-section, which formed the etymological basis of the period name. For instance, the type-section of the Permian is near Perm, Russia, and that of the Devonian is in Devon, UK. In this respect, early geological nomenclature was similar to RN [rank-based nomenclature] because time units were defined on the basis of a single type (section) and a rank (era, period, stage, etc.). However, the meaning of these ranks was not objectively defined (e.g. the amount of time ascribed to an ‘era’). As a result, the relevant boundaries could not be objectively recognized, but relied on consensus. However, as in PN [sic – error for RN], this did not yield stability, as the Great Devonian Controversy attests (Rudwick 1985). Thus, it is no surprise that, more recently, geologists have moved from this type of nomenclature to one based on boundary stratotypes, which precisely delimit successive (rock and therefore) time units (Gradstein et al. 2004: 20–21). Thus, the limits between successive time units are objectively fixed on the basis of a real section, at a precise layer, rather than by an arbitrary rank (period, stage, etc.). This approach is more similar to PN [phylogenetic nomenclature] than to RN, since boundaries are objectively and precisely fixed (which is deliberately avoided in RN; see above)."

History
Ultimately, phylogenetic nomenclature is a result of Darwin's discovery that the diversity and history of life is best represented in tree-shaped diagrams. This discovery immediately led to changes in the existing classifications. For example, John Hogg proposed the term Protoctista in 1860 for organisms that did not seem closely related to either animals or plants. In 1866, the controversial biologist Ernst Haeckel for the first time reconstructed a single tree of all life (see figure) and immediately proceeded to translate it into a classification. This classification was rank-based, as usual at the time, but did not contain taxa that Haeckel considered polyphyletic; in it, Haeckel introduced the rank of phylum which carries a connotation of monophyly in its name (literally meaning "stem").

Ever since, it has been debated in which ways and to what extent the phylogeny of life should be used as a basis for its classification, with views ranging from "numerical taxonomy" (phenetics) through "evolutionary taxonomy" (gradistics) to "phylogenetic systematics" (cladistics – today, the term "cladistics" is only used for the method of phylogeny reconstruction, but its inventor, Willi Hennig, regarded this method as a mere tool for the purpose of classification). From the 1960s onwards, rankless classifications were occasionally proposed, but in general the principles of rank-based nomenclature were used by all three schools of thought.

Most of the basic tenets of phylogenetic nomenclature (lack of obligatory ranks, and something close to phylogenetic definitions) can, however, be traced to 1916, when Edwin Goodrich interpreted the name Sauropsida, erected 40 years earlier by T. H. Huxley, to include the birds (Aves) as well as part of Reptilia, and coined the new name Theropsida to include the mammals as well as another part of Reptilia. Goodrich did not give them ranks, and treated them exactly as if they had phylogenetic definitions, using neither contents nor diagnostic characters to decide whether a given animal should belong to Theropsida, Sauropsida, or something else once its phylogenetic position was agreed upon. Goodrich also opined that the name Reptilia should be abandoned once the phylogeny of the reptiles was better known. The lack of compatibility of his scheme with the existing rank-based classifications (despite agreement on the phylogeny in all but details), and the lack of a method of phylogenetics at that time, are the most likely reasons why Goodrich's suggestions were largely ignored.

The principle that only clades should be formally named became popular in the second half of the 20th century. It spread together with the methods for discovering clades (cladistics) and is an integral part of phylogenetic systematics (see above). At the same time, it became apparent that the obligatory ranks that are part of the traditional systems of nomenclature produced problems. Some authors suggested abandoning them altogether, starting with Willi Hennig's abandonment of his earlier proposal to define ranks as geological age classes.

The first use of phylogenetic nomenclature in a publication can be dated to 1986. Theoretical papers outlining the principles of phylogenetic nomenclature, as well as further publications containing applications of phylogenetic nomenclature (mostly to vertebrates), soon followed (see Literature section).

In an attempt to avoid a schism in the biologist community, "Gauthier suggested to two members of the ICZN to apply formal taxonomic names ruled by the zoological code only to clades (at least for supraspecific taxa) and to abandon Linnean ranks, but these two members promptly rejected these ideas" (Laurin, 2008: 224). This led Kevin de Queiroz and the botanist Philip Cantino to start drafting their own code of nomenclature, originally called the PhyloCode, for regulating phylogenetic nomenclature in 2000.

The International Code of Phylogenetic Nomenclature
The ICPN, or PhyloCode, is a draft code of rules and recommendations for phylogenetic nomenclature.


 * The ICPN will only regulate clade names. Names for para- or polyphyletic taxa, and names for species (which may or may not be clades), will not be considered, at least not at first. This means that the regulation of species names will be left, for the time being, to the rank-based codes of nomenclature.
 * The Principle of Priority will be introduced for names and for definitions. The starting point for priority will be the publication date of the ICPN.
 * Definitions for existing names, and new names along with their definitions, will have to be published in peer-reviewed works (on or after the starting date) and will have to be registered in an online database in order to be valid.

The number of supporters for widespread adoption of the PhyloCode is still small, and it is uncertain (as of 2012) when the code will be implemented and how widely it will be followed.