Homology (biology)

In biology, two or more structures are said to be homologous if they are alike because of shared ancestry. This could be evolutionary ancestry, meaning that the structures evolved from some structure in a common ancestor (the wings of bats and the arms of humans are homologous in this sense), or developmental ancestry, meaning that the structures arose from the same tissue in embryonic development (the ovaries of female humans and the testicles of male humans are homologous in this sense).

Homology must be distinguished from analogy. For instance, the wings of insects, the wings of bats and the wings of birds are analogous but not homologous; this phenomenon is known as homoplasy. These similar structures evolved through different developmental pathways, in a process known as convergent evolution.

Homology of sequences in genetics
In genetics, homology is used in reference to protein or DNA sequences, meaning that the given sequences share a common ancestor. Sequence homology may also indicate common function. Asking whether two sequences are homologous is a yes-or-no question: there is no such condition as "degrees of homology." Sequence regions that are homologous may also be called conserved.

Homology among proteins and DNA is often concluded on the basis of sequence similarity, especially in bioinformatics. For example, in general, if two genes have an almost identical DNA sequence, it is likely that they are homologous. However, it may be that the sequence similarity did not arise from their sharing a common ancestor; short sequences may be similar by chance, or sequences may be similar because both were selected to bind to a particular protein, such as a transcription factor. Such sequences are similar but not homologous.

The phrase "percent homology", as sometimes used by those outside the fields of evolutionary biology or bioinformatics, is incorrect. The phrases "percent identity" or "percent similarity" should be used to quantify the similarity between biomolecule sequences. For two naturally occurring sequences, percent identity is a factual measurement, whereas homology is a hypothesis supported by evidence. One can, however, refer to partial homology when a fraction of the compared sequences shares (or is presumed to share) descent while the rest does not.
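To make the distinction concrete, percent identity can be computed directly from an alignment, whereas homology cannot. The following is a minimal sketch in Python, assuming the two sequences are already aligned to equal length with "-" marking gaps. The function name and the choice to ignore gapped columns are illustrative assumptions; other conventions for the denominator are also in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap.

    Assumes the inputs are already aligned (equal length). Ignoring gapped
    columns is only one of several conventions for the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns without a gap in either sequence
    matches = 0   # compared columns holding the same residue
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment: 8 ungapped columns, 7 of them identical.
print(percent_identity("ACGT-ACGT", "ACGTTACGA"))  # 87.5
```

The result is a measurement of the alignment only; whether the two sequences are homologous remains a separate inference.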

Many algorithms exist to cluster protein sequences into sequence families, which are sets of mutually homologous sequences. (See sequence clustering and sequence alignment.)
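As a rough illustration of the idea, the sketch below groups sequences by single-linkage clustering on pairwise identity. It is a toy under strong assumptions: the sequences are pre-aligned and of equal length, the 0.8 threshold is arbitrary, and the names `identity`, `cluster` and the example sequences are hypothetical. Practical tools use proper alignments and statistical scores, and high similarity still only suggests homology.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster(seqs: dict[str, str], threshold: float = 0.8) -> list[set[str]]:
    """Single-linkage clustering: join any two sequences at or above threshold."""
    parent = {name: name for name in seqs}  # union-find forest

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical toy sequences, not real proteins.
toy = {
    "seq1": "MKTAYIAKQR",
    "seq2": "MKTAYIAKQK",
    "seq3": "MKTAYLAKQK",
    "seq4": "GGSPLWWTNE",
}
families = cluster(toy, threshold=0.8)
print(families)
```

With these inputs, seq1, seq2 and seq3 end up in one candidate family and seq4 in a family of its own.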

Orthology and paralogy
Homology of sequences can be of two types: orthology or paralogy. Homologous sequences are orthologous if they were separated by a speciation event: if a gene exists in a species, and that species diverges into two species, then the copies of this gene in the resulting species are orthologous. Homologous sequences are paralogous if they were separated by a gene duplication event: if a gene in an organism is duplicated, then the two copies are paralogous. A pair of sequences that are orthologous to each other are called orthologs (see Orthologue); a pair that are paralogous are called paralogs.
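The definitions above can be sketched as a small procedure: given a gene tree whose internal nodes are labelled as speciation or duplication events, two genes are orthologs if their most recent common ancestor node is a speciation, and paralogs if it is a duplication. The Python below is a minimal sketch under that assumption. The `Node` class, the gene labels and the toy tree are hypothetical, and the topology is a deliberate simplification rather than a reconstructed globin phylogeny.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    event: Optional[str] = None  # "speciation" or "duplication"; None for leaves
    children: list["Node"] = field(default_factory=list)


def path_to(root: Node, target: str) -> Optional[list[Node]]:
    """Return the list of nodes from root down to the node named target."""
    if root.name == target:
        return [root]
    for child in root.children:
        sub = path_to(child, target)
        if sub is not None:
            return [root] + sub
    return None


def relationship(root: Node, gene_a: str, gene_b: str) -> str:
    """Classify two distinct leaf genes by the event at their last common ancestor."""
    if gene_a == gene_b:
        raise ValueError("need two different genes")
    path_a, path_b = path_to(root, gene_a), path_to(root, gene_b)
    if path_a is None or path_b is None:
        raise KeyError("gene not found in tree")

    last_common = None
    for x, y in zip(path_a, path_b):
        if x is not y:
            break
        last_common = x  # deepest node shared by both paths so far

    if last_common.event == "speciation":
        return "orthologs"
    if last_common.event == "duplication":
        return "paralogs"
    raise ValueError("last common ancestor has no event label")


# Toy tree: one ancient duplication, then a speciation in each gene lineage.
tree = Node("ancestral_globin", "duplication", [
    Node("hemoglobin_lineage", "speciation",
         [Node("human_hemoglobin"), Node("chimp_hemoglobin")]),
    Node("myoglobin_lineage", "speciation",
         [Node("human_myoglobin"), Node("chimp_myoglobin")]),
])

print(relationship(tree, "human_hemoglobin", "chimp_hemoglobin"))  # orthologs
print(relationship(tree, "human_hemoglobin", "chimp_myoglobin"))   # paralogs
```

Note that the classification depends only on the event separating the two genes, not on whether they sit in the same species.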

Orthologs will typically have the same or similar function. This is not always true for paralogs: because one copy of a duplicated gene is no longer under the original selective pressure, that copy is free to mutate and acquire new functions.

The genes encoding myoglobin and hemoglobin are considered to be ancient paralogs. Similarly, the globin chains that distinguish the main classes of human hemoglobin (hemoglobin A, hemoglobin A2 and hemoglobin F) are encoded by paralogous genes; hemoglobin S, by contrast, is a variant allele of the adult beta-globin gene rather than a separate paralog. While each of these genes serves the same basic function of oxygen transport, they have already diverged slightly in function: fetal hemoglobin (hemoglobin F) has a higher affinity for oxygen than adult hemoglobin.

Another example can be found in rodents such as rats and mice. Rodents have a pair of paralogous insulin genes, although it is unclear if any divergence in function has occurred.

Paralogous genes often belong to the same species, but this is not necessary: for example, the hemoglobin gene of humans and the myoglobin gene of chimpanzees are paralogs. This is a common problem in bioinformatics: when the genomes of different species have been sequenced and homologous genes have been found, one cannot immediately conclude that these genes have the same or similar function, as they could be paralogs whose function has diverged.

Homologous chromosome pairs
A homologous pair of chromosomes in a diploid cell is a matching pair of chromosomes, one derived from each parent of the organism. Except for the sex chromosomes, the chromosomes of each homologous pair share significant sequence similarity across their entire length, and thus typically contain the same sequence of genes. The sex chromosomes have a shorter region of sequence similarity. Based on the sequence similarity and our knowledge of biology, we can presume that the chromosomes are paralogous.