3. Tertiary structure:

The elements of secondary structure are usually folded into a compact shape, connected by a variety of loops and turns, to form the tertiary structure of the protein. Formation of the tertiary structure is usually driven by non-local interactions, most commonly the burial of hydrophobic residues to form a hydrophobic core. Other interactions, such as hydrogen bonds, ionic interactions, disulphide bonds and even post-translational modifications, can also stabilize it. The tertiary structure encompasses all the noncovalent interactions that are not considered secondary structure. It defines the overall fold of the protein and is usually indispensable for the protein's function.

The term "tertiary structure" is often used synonymously with the term "fold", and it is the tertiary structure that governs the basic function of the protein. As a protein folds to the tertiary level, several motifs pack together to form compact, local, semi-independent units called domains. The resulting overall 3D structure of the polypeptide chain is referred to as the protein's 'tertiary structure'. Domains are the fundamental units of tertiary structure; each domain contains an individual hydrophobic core built from secondary structural units connected by loop regions. Packing of the polypeptide is usually much tighter in the interior of a domain than at its exterior, producing a solid-like core and a fluid-like surface. Core residues are often conserved within a protein family, whereas residues in loops are less conserved unless they are involved in the protein's function. Protein tertiary structure can be divided into four main classes based on the secondary structural content of the domain:

  • All-α domains have a domain core built exclusively from α-helices. This class is dominated by small folds, many of which form a simple bundle with helices running up and down.
  • All-β domains have a core composed of antiparallel β-sheets, usually two sheets packed against each other. Various patterns can be identified in the arrangement of the strands, often giving rise to recurring motifs, for example the Greek key motif.
  • α+β domains are a mixture of all-α and all-β motifs. Classifying proteins into this class is difficult because it overlaps with the other three classes, and it is therefore not used in the CATH domain database.
  • α/β domains are made from a combination of β-α-β motifs that predominantly form a parallel β-sheet surrounded by amphipathic α-helices. The secondary structures are arranged in layers or barrels.
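
The idea of assigning a class from secondary structural content can be illustrated with a minimal Python sketch. It assumes a per-residue string in which 'H' marks helix, 'E' marks strand and any other letter marks coil. The function name `classify_by_content` and the 15% cut-off `MIN_FRACTION` are hypothetical choices made for illustration, not published values. Content alone cannot separate α+β from α/β domains, since that distinction depends on how the strands are arranged, so the sketch reports both as a single mixed class.

```python
from collections import Counter

# Hypothetical cut-off: the minimum fraction of residues a secondary
# structure type must cover to count as present. Not a published value.
MIN_FRACTION = 0.15


def classify_by_content(ss: str, min_fraction: float = MIN_FRACTION) -> str:
    """Assign a coarse structural class from a per-residue string.

    `ss` holds one letter per residue: 'H' for helix, 'E' for strand,
    and any other letter for coil.
    """
    if not ss:
        raise ValueError("empty secondary-structure string")

    counts = Counter(ss.upper())
    helix_fraction = counts["H"] / len(ss)
    strand_fraction = counts["E"] / len(ss)

    has_helix = helix_fraction >= min_fraction
    has_strand = strand_fraction >= min_fraction

    if has_helix and has_strand:
        # alpha+beta and alpha/beta differ in strand topology,
        # which a content-only test cannot see.
        return "mixed alpha and beta"
    if has_helix:
        return "all-alpha"
    if has_strand:
        return "all-beta"
    return "little regular secondary structure"
```

For example, `classify_by_content("HHHHHHHHCC")` falls into the all-alpha class under these assumptions. A real classification would also need the strand arrangement (parallel or antiparallel, layers or barrels) described in the list above.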

The CATH domain database classifies domains into approximately 800 fold families. Ten of these folds are highly populated and are referred to as 'superfolds'. Superfolds are defined as folds for which there are at least three structures without significant sequence similarity. The most populated is the α/β-barrel superfold described previously.

Domains have limits on size. The size of individual structural domains varies from 36 residues in E-selectin to 692 residues in lipoxygenase-1, but the majority (90%) have fewer than 200 residues, with an average of approximately 100 residues. Very short domains, of fewer than 40 residues, are often stabilized by metal ions or disulphide bonds. Larger domains, of more than 300 residues, are likely to consist of multiple hydrophobic cores. Domains are the common material used by nature to generate new sequences; they can be thought of as genetically mobile units, referred to as 'modules'.
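
The size ranges quoted above can be collected into a small Python sketch. The cut-offs of 40, 200 and 300 residues come from the text. The function name `describe_domain_size` and the labels "typical" and "intermediate" are hypothetical, introduced only for illustration.

```python
def describe_domain_size(n_residues: int) -> str:
    """Map a domain length to the size ranges quoted in the text."""
    if n_residues < 1:
        raise ValueError("a domain must contain at least one residue")

    if n_residues < 40:
        # Often stabilized by metal ions or disulphide bonds.
        return "very short"
    if n_residues > 300:
        # Likely to consist of multiple hydrophobic cores.
        return "large"
    if n_residues < 200:
        # About 90% of domains have fewer than 200 residues.
        return "typical"
    # 200 to 300 residues: illustrative label, not from the text.
    return "intermediate"
```

Under these assumptions, the 36-residue domain of E-selectin is labelled "very short" and the 692-residue domain of lipoxygenase-1 is labelled "large".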