Tertiary structure

In biochemistry, the tertiary structure of a protein is its overall three-dimensional shape, also known as its fold. Protein molecules are linear chains of amino acids that typically assume a specific three-dimensional structure in which they perform their biological function. The study of protein tertiary structure is a central part of structural biology.

Relationship to Primary Sequence
Tertiary structure is considered to be largely determined by the protein's primary sequence, that is, the sequence of amino acids of which it is composed. Efforts to predict tertiary structure from the primary sequence are known generally as protein structure prediction. However, the environment in which a protein is synthesized and allowed to fold is a significant determinant of its final shape and is usually not directly taken into account by current prediction methods. (Most such methods do rely on comparisons between the sequence to be predicted and sequences of known structure in the Protein Data Bank, and thus account for environment indirectly, on the assumption that the target and template sequences share similar cellular contexts.) A large-scale experiment known as CASP (Critical Assessment of protein Structure Prediction) directly compares the performance of state-of-the-art prediction methods and is held every two years.

Determinants of Tertiary Structure
In globular proteins, tertiary interactions are frequently stabilized by the sequestration of hydrophobic amino acid residues in the protein core, from which water is excluded, and by the consequent enrichment of charged or hydrophilic residues on the protein's water-exposed surface. In secreted proteins that do not spend time in the cytoplasm, disulfide bonds between cysteine residues help to maintain the protein's tertiary structure. A variety of common and stable tertiary structures appear in a large number of proteins that are unrelated in both function and evolution. For example, many proteins are shaped like a TIM barrel, named for the enzyme triosephosphate isomerase. Another common structure is the highly stable coiled coil, in which two or more alpha helices wind around one another. Proteins are classified by the folds they represent in databases such as SCOP and CATH.
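The tendency described above, with hydrophobic residues buried in the core and charged or hydrophilic residues on the surface, is often approximated from sequence alone with a hydropathy profile. The following is a minimal Python sketch that assumes the Kyte-Doolittle hydropathy scale and a simple sliding-window average. The function name, the window length, and the example sequence are illustrative choices, not something the text prescribes, and a high score only suggests that a stretch may be buried.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=3):
    """Return the mean hydropathy of every full window along a one-letter sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    # Look up each residue; an unknown letter raises KeyError on purpose.
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence: three isoleucines followed by three aspartates.
    for position, score in enumerate(hydropathy_profile("IIIDDD", window=3)):
        print(position, round(score, 2))
```

Windows with strongly positive averages are candidates for the hydrophobic core, and strongly negative ones for the solvent-exposed surface. Real analyses typically use longer windows and combine this signal with structural data.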

Not every polypeptide chain has a well-defined tertiary structure. Some proteins, especially short proteins, are natively disordered and exist as random coils under standard physiological conditions. Disordered regions can also occur in otherwise well-structured proteins, especially at the termini and in loop or linker regions connecting domains whose relative orientation can change depending on the environment.

Stability of Native States
The most typical conformation of a protein in its cellular environment is generally referred to as the native state or native conformation. It is commonly assumed that this most-populated state is also the most thermodynamically stable conformation attainable for a given primary sequence. This is a reasonable first approximation, but it assumes that folding is not under kinetic control; that is, that the time required for the protein to reach its native conformation after translation is short.
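The link between thermodynamic stability and the most-populated state can be made concrete with a small calculation. The sketch below assumes a two-state model (only a native state N and an unfolded state U, at equilibrium), which is a simplification not stated in the text. Under that assumption the equilibrium constant is K = exp(-ΔG/RT), where ΔG is the free energy of N minus that of U, and the native fraction is K/(1 + K). The function name and the example values are illustrative.

```python
import math

# Molar gas constant in kJ/(mol*K).
R_KJ_PER_MOL_K = 8.314462618e-3


def fraction_native(delta_g_kj_per_mol, temperature_k=298.15):
    """Equilibrium fraction of molecules in the native state.

    delta_g_kj_per_mol is G(native) - G(unfolded); negative values favor
    the native state. Assumes a two-state system under thermodynamic control.
    """
    rt = R_KJ_PER_MOL_K * temperature_k
    # K / (1 + K) with K = exp(-dG/RT), rewritten as 1 / (1 + exp(dG/RT)).
    return 1.0 / (1.0 + math.exp(delta_g_kj_per_mol / rt))


if __name__ == "__main__":
    # Hypothetical folding free energies in kJ/mol.
    for delta_g in (-20.0, -5.0, 0.0, 5.0):
        print(delta_g, round(fraction_native(delta_g), 4))
```

This calculation applies only when the populations have had time to equilibrate. For a kinetically trapped protein, the observed populations can differ from these equilibrium values.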

In the cell, a variety of protein chaperones assist a newly synthesized polypeptide in attaining its native conformation. Some such proteins are highly specific in their function, such as protein disulfide isomerase; others are very general and can assist most globular proteins. The prokaryotic GroEL/GroES system and the homologous eukaryotic Hsp60/Hsp10 system fall into the latter category.

Some proteins exploit the fact that they can become kinetically trapped in a relatively high-energy conformation. Influenza hemagglutinin, for example, is synthesized as a single polypeptide chain that acts as a kinetic trap. The "mature" activated protein is proteolytically cleaved to form two polypeptide chains that are trapped in a high-energy conformation. Upon encountering a drop in pH, the protein undergoes an energetically favorable conformational rearrangement that enables it to penetrate a host cell membrane.

Experimental Determination
The majority of protein structures known to date have been solved by X-ray crystallography, which typically provides high-resolution data but no time-dependent information on the protein's conformational flexibility. A second common way of solving protein structures uses nuclear magnetic resonance (NMR) spectroscopy, which in general provides somewhat lower-resolution data and is limited to relatively small proteins, but can provide time-dependent information about the motion of a protein in solution. More is known about the tertiary structural features of soluble globular proteins than about membrane proteins because the latter class is extremely difficult to study using these methods.

History
Because knowledge of a protein's tertiary structure is important in biochemistry, and because experimental structure determination is relatively difficult, protein structure prediction has been a long-standing problem. The first predicted structure of globular proteins was the cyclol model of Dorothy Wrinch, but this was quickly discounted as being inconsistent with experimental data. Modern methods are sometimes able to predict the tertiary structure de novo to within 5 Å for small proteins (fewer than 120 residues) under favorable conditions, such as when confident secondary structure predictions are available.
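An accuracy figure such as "within 5 Å" is commonly expressed as a root-mean-square deviation (RMSD) between corresponding atoms of the predicted and experimental structures. The text does not say which measure is meant, so treating it as an RMSD is an assumption. The Python sketch below also assumes that the two coordinate sets are already superimposed and list the same atoms in the same order; it does not perform an optimal superposition. The function name and the coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already superimposed and that position i in
    both lists refers to the same atom. Units follow the input (e.g. angstroms).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates in angstroms; every atom is displaced by (3, 4, 0).
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    predicted = [(x + 3.0, y + 4.0, z) for x, y, z in reference]
    print(rmsd(reference, predicted))
```

In practice the structures are first superimposed by a rigid-body fit to minimize the RMSD, and the calculation is often restricted to alpha-carbon atoms. Both steps are omitted here for brevity.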