COLLAGEN

[Figure: collagen molecule]

Collagen is an extracellular protein organized into insoluble fibers of great tensile strength. A single molecule of Type I collagen is ~14 Å wide and ~3000 Å long. It is composed of three polypeptide chains and has the shape of a rod: if it had the thickness of a pencil, it would be about 1.5 m long. This rod is reinforced by crosslinking bonds.
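
The pencil comparison can be checked with a short calculation. The sketch below scales the quoted molecular dimensions to a pencil; the pencil thickness of 7 mm is an assumption made for illustration and is not given in the text, and the function name is hypothetical.

```python
# Dimensions of a Type I collagen molecule as quoted in the text (angstroms).
WIDTH_A = 14.0
LENGTH_A = 3000.0

# Assumed pencil thickness in millimetres (illustrative, not from the text).
PENCIL_THICKNESS_MM = 7.0


def scaled_length_m(width_a, length_a, thickness_mm):
    """Length in metres of a rod with the same aspect ratio and the given thickness."""
    aspect_ratio = length_a / width_a
    return aspect_ratio * thickness_mm / 1000.0


aspect_ratio = LENGTH_A / WIDTH_A
pencil_length_m = scaled_length_m(WIDTH_A, LENGTH_A, PENCIL_THICKNESS_MM)

print(f"length-to-width ratio: about {aspect_ratio:.0f}")
print(f"pencil-scaled length: about {pencil_length_m:.1f} m")
```

With these numbers the rod is roughly two hundred times longer than it is wide, which is where the figure of about 1.5 m comes from.
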
A single chain of collagen is called an α-chain. Each collagen molecule consists of three α-chains, usually identical. The only known exception is Type I collagen, which consists of two identical chains (α1) and one different chain (α2) and is denoted [α1(I)]₂α2. It is the only heteropolymer among the collagens. The index I is used because the chains of the individual collagen types differ slightly in their amino acid composition.

The amino acid sequence is a characteristic feature of a protein and determines its structure as a whole. Collagen contains 19 amino acids, two of which, hydroxyproline and hydroxylysine, do not occur in other proteins. In addition, collagen contains more glycine than most other proteins, but it contains no tryptophan and no cysteine or cystine (with the exception of collagen III).

The unique shape and properties of the collagen molecule are due to its distinctive amino acid composition and sequence: a repeating Gly-X-Y triplet in which X is often proline and Y is often 4-hydroxyproline (Hyp), with some 3-hydroxyproline and some 5-hydroxylysine. Hyp confers stability upon collagen, probably through intramolecular hydrogen bonds that may involve bridging water molecules.

[Figure: hydroxyproline]

Pro residues are converted to Hyp in a reaction catalyzed by prolyl hydroxylase. If collagen is synthesized under conditions that inactivate prolyl hydroxylase, it loses its native conformation (denatures) at 24 °C, whereas normal collagen denatures at 39 °C (denatured collagen is known as gelatin). Prolyl hydroxylase requires ascorbic acid (vitamin C) to maintain its activity. In vitamin C deficiency, the disease scurvy, collagen cannot form fibers properly, which results in skin lesions and poor wound healing.

The typical features of collagen are: 
1.  The number of glycine residues amounts to 1/3 of all amino acid residues.
2.  The number of imino acid residues is 1/5 of all amino acid residues in mammals and birds. (The name imino acid is still used in biochemistry although it is not quite correct, since these compounds are derivatives of pyrrolidine, not imines. The systematic name of proline is pyrrolidine-α-carboxylic acid and that of hydroxyproline is γ-hydroxypyrrolidine-α-carboxylic acid.)
3.  The presence of two specific hydroxyamino acids: hydroxyproline and hydroxylysine.
4.  The presence of a certain amount of aldehyde groups (participating in crosslinking bonds).
5.  The presence of hexoses bound to protein side chains.
6.  The occurrence of characteristic hydrophilic and hydrophobic space groupings in a chain.
7.  The average molecular weight of one residue is 90.7.
8.  The number of amino acids in a chain amounts to about 1,000 on average.
9.  The average molecular weight of one chain amounts to about 90,000.
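
Items 1, 2 and 7 to 9 of this list are linked by simple arithmetic. The following is a minimal sketch that uses only the round numbers quoted above; the variable names are illustrative, and the mass of the whole three-chain molecule is an extrapolation that the text does not state.

```python
# Round figures quoted in the list above.
RESIDUES_PER_CHAIN = 1000       # item 8
AVG_RESIDUE_MASS = 90.7         # item 7
CHAINS_PER_MOLECULE = 3

# Item 9: mass of one chain = number of residues x average residue mass.
chain_mass = RESIDUES_PER_CHAIN * AVG_RESIDUE_MASS

# Items 1 and 2: expected residue counts per chain.
glycine_residues = RESIDUES_PER_CHAIN / 3
imino_acid_residues = RESIDUES_PER_CHAIN / 5

# Extrapolation (not stated in the text): mass of the three-chain molecule.
molecule_mass = CHAINS_PER_MOLECULE * chain_mass

print(f"mass of one chain: about {chain_mass:,.0f}")
print(f"glycine residues per chain: about {glycine_residues:.0f}")
print(f"imino acid residues per chain: about {imino_acid_residues:.0f}")
print(f"mass of one molecule: about {molecule_mass:,.0f}")
```

The chain mass comes out at about 90,700, in line with the rounded value of 90,000 given in item 9.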

Collagen is at present a large protein of known sequence. Details of this sequence are given in monographs.

In general terms, the sequence can be described as follows:
1.  The collagen α-chain consists of a central helical part containing 1011-1047 amino acid residues, of which every third must be glycine.
2.  The helical part contains ~20% imino acids in the second or third positions, if we divide the molecule into tripeptides, each of which starts with glycine (Gly-X-Y). In mammalian collagen about 2/3 of the imino acids are hydroxylated and are always in the Y position (4-hydroxyproline). The only exception is 3-hydroxyproline, which occurs in the X position, but only once or twice in the chain.
3.  The nonhelical extensions are relatively rich in hydrophobic amino acids and contain a lysine residue which can be enzymatically oxidized and serves as a functional group for the formation of intramolecular and intermolecular crosslinks.
4.  Hydroxylysine occurs exclusively in collagen. It is the only amino acid that is glycosylated, at several sites but not at every residue in the chain. Lysine, like proline, is hydroxylated only when it is in the Y position.
5.  The average content of proline plus hydroxyproline is equal throughout the chain, except for the C-terminus, which ends with 5 consecutive Gly-Pro-Hyp tripeptides. This suggests an exceptional stability of the C-terminal helical region of the molecule.
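
The Gly-X-Y rule from point 1 and the imino acid content from point 2 are easy to test on a sequence. The sketch below is only an illustration: it assumes one-letter residue codes with "O" standing for hydroxyproline (a common convention, not used in this text), and both function names are hypothetical.

```python
def follows_gly_x_y(sequence: str) -> bool:
    """Return True if the sequence is a whole number of triplets, each starting with Gly (G)."""
    if len(sequence) == 0 or len(sequence) % 3 != 0:
        return False
    return all(sequence[i] == "G" for i in range(0, len(sequence), 3))


def imino_acid_fraction(sequence: str) -> float:
    """Fraction of residues that are proline (P) or hydroxyproline (O)."""
    return sum(residue in "PO" for residue in sequence) / len(sequence)


# C-terminal end of the helical region as described in point 5:
# five consecutive Gly-Pro-Hyp tripeptides.
c_terminal = "GPO" * 5

print(follows_gly_x_y(c_terminal))        # every triplet starts with G
print(follows_gly_x_y("GPOAPO"))          # second triplet starts with A
print(f"{imino_acid_fraction(c_terminal):.2f}")
```

In the Gly-Pro-Hyp stretch two residues out of three are imino acids, far above the ~20% average for the helical part, which fits the suggestion that this region is exceptionally stable.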

Conformation of the collagen chain:
X-ray studies show that collagen’s three polypeptide chains are parallel and wind around each other with a gentle, right-handed, rope-like twist to form a triple-helical structure. Every third residue of each polypeptide chain passes through the center of the triple helix, which is so crowded that only a Gly side chain can fit there. The three polypeptide chains are also staggered, so that Gly, X and Y residues from the three chains occur at similar levels. The staggered peptide groups are oriented such that the N-H of each Gly makes a strong H-bond with the carbonyl oxygen of an X residue on a neighboring chain. The bulky and relatively inflexible Pro and Hyp residues confer rigidity on the entire assembly.

[Figure: collagen helix]
As with the twisted fibers of a rope, the extended and twisted polypeptide chains of collagen convert a longitudinal tensional force into a more easily supported lateral compressional force on the almost incompressible triple helix. This occurs because the opposite twist directions of collagen’s polypeptide chains and of its triple helix prevent the twists from being pulled out under tension.
The repetitive sequence in collagen, which is called the helical region, consists of an infinite set of points lying on a screw line and separated by a constant axial translation.
Constant axial translation: h (unit height)
Angular separation: t (unit twist)
Radius of helix: r₀

Pitch: P = 2πh / t (t in radians)

P/h may be expressed as the rational fraction n/v, which means that the discontinuous helix has n points in v turns.
The number of points per turn, N, is found from the expression
N = 2π / t = P / h = n / v, N being negative for a left-handed helix.

Fraser 1979:
h = 2.98 Å
t = 107°
N = 3.36

Ramachandran:
h = 2.91 Å
t = 111°
N = 3.25

Synthetic polytripeptide (Gly-Pro-Pro)n:
h = 2.87 Å
t = 108°
N = 3.33
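
The listed values of N can be recomputed from the unit twist, and the pitch follows from the unit height. This is a minimal sketch that reads the twist values as degrees (107°, 111°, 108°); the dictionary and function names are illustrative, and the sign convention for left-handed helices is left out.

```python
import math

# (unit height h in angstroms, unit twist t in degrees), as listed above.
HELIX_PARAMETERS = {
    "Fraser 1979": (2.98, 107.0),
    "Ramachandran": (2.91, 111.0),
    "(Gly-Pro-Pro)n": (2.87, 108.0),
}


def residues_per_turn(twist_deg: float) -> float:
    """N = 2*pi / t with t in radians, i.e. 360 / t with t in degrees."""
    return 360.0 / twist_deg


def pitch(unit_height: float, twist_deg: float) -> float:
    """P = 2*pi*h / t with t in radians; same length unit as h."""
    return 2.0 * math.pi * unit_height / math.radians(twist_deg)


for name, (h, t) in HELIX_PARAMETERS.items():
    n = residues_per_turn(t)
    p = pitch(h, t)
    print(f"{name}: N = {n:.2f}, P = {p:.2f} A, P/h = {p / h:.2f}")
```

For 107° and 108° the result reproduces the listed N to two decimals. For 111° it gives about 3.24 rather than 3.25, a small difference that is presumably due to rounding of the quoted twist.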

The non-integer number of residues per turn could not be explained until the suggestion of Ramachandran and Kartha was accepted. It states that the molecule has the form of a three-strand rope in which the individual chains have a left-handed helical conformation and the three chains are twisted around a common axis with a right-handed rope twist. In this model two H-bonds per tripeptide are assumed.

Ramachandran and Chandrasekharan suggest that
“Collagen has a one-bonded structure which contains water bridges.”
Rich and Crick suggest a model with t = 108° and N = -10/3, in which P is 30 unit heights of the basic helix (86 Å long). The water bound to the chains does not affect the symmetry if it is accepted that more than one water molecule is involved in a bridge.
Considering the optimal interactions of adjacent α1(I) chains, the molecules align with an axial stagger of 233 residues, which is consistent with the quarter-stagger hypothesis.
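
The stagger of 233 residues can be converted into a length with the unit heights listed earlier, and the result compared with the fibril periodicity of 680 Å and the molecular length of ~3000 Å quoted elsewhere in this text. The sketch below is a rough consistency check under the assumption that the axial rise per residue equals the unit height h; the names are illustrative.

```python
# Unit heights h (angstroms per residue) from the three parameter sets above.
UNIT_HEIGHTS_A = {
    "Fraser 1979": 2.98,
    "Ramachandran": 2.91,
    "(Gly-Pro-Pro)n": 2.87,
}

STAGGER_RESIDUES = 233            # axial stagger quoted in the text
HELICAL_RESIDUES = (1011, 1047)   # length range of the helical part of one chain


def axial_length(residues: int, unit_height: float) -> float:
    """Axial length in angstroms spanned by a given number of residues."""
    return residues * unit_height


stagger_lengths = {
    name: axial_length(STAGGER_RESIDUES, h) for name, h in UNIT_HEIGHTS_A.items()
}

for name, h in UNIT_HEIGHTS_A.items():
    shortest = axial_length(HELICAL_RESIDUES[0], h)
    longest = axial_length(HELICAL_RESIDUES[1], h)
    print(f"{name}: stagger = {stagger_lengths[name]:.0f} A, "
          f"helical length = {shortest:.0f}-{longest:.0f} A")
```

All three parameter sets give a stagger of roughly 670 to 695 Å, close to the 680 Å banding period, and a helical length close to 3000 Å.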

Many authors have approached the question of the energetics of the collagen molecule by investigating its thermal stability and the thermodynamics of its denaturation (as has been done for globular proteins). For a denaturation process involving over 30 residues, the micro process (micro unfolding) has a Gibbs energy of the order of 7-11 kJ/mol, and the macro process (macro unfolding) an energy of 200-400 kJ/mol. The total values of ΔH were found to be 4,000-6,500 kJ/mol, with ΔS = 14-21 kJ/(mol·K).
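
The total enthalpy and entropy values quoted above can be cross-checked against each other. The sketch below assumes that the entropy is expressed per kelvin, that the Gibbs energy change is approximately zero at the denaturation temperature (so that the entropy change is roughly the enthalpy change divided by the temperature), and that the relevant temperature is the 39 °C quoted earlier for normal collagen. None of these assumptions is stated in the text, and the function name is illustrative.

```python
# Denaturation temperature of normal collagen quoted earlier in the text.
DENATURATION_TEMP_C = 39.0
DENATURATION_TEMP_K = DENATURATION_TEMP_C + 273.15

# Range of total denaturation enthalpy quoted above, in kJ/mol.
DELTA_H_RANGE_KJ_PER_MOL = (4000.0, 6500.0)


def transition_entropy(delta_h_kj_per_mol: float, temperature_k: float) -> float:
    """Entropy change in kJ/(mol*K), assuming delta G = 0 at the transition."""
    return delta_h_kj_per_mol / temperature_k


delta_s_low = transition_entropy(DELTA_H_RANGE_KJ_PER_MOL[0], DENATURATION_TEMP_K)
delta_s_high = transition_entropy(DELTA_H_RANGE_KJ_PER_MOL[1], DENATURATION_TEMP_K)

print(f"estimated entropy change: {delta_s_low:.1f}-{delta_s_high:.1f} kJ/(mol*K)")
```

The estimate of roughly 13 to 21 kJ/(mol·K) is of the same size as the quoted entropy range, although its lower end falls slightly below the quoted 14.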

In addition to the enthalpy ΔH, there are two main criteria for estimating the strength of an H-bond in the A-H…B system: the A-H stretching frequency, or its relative shift (ν₀ - ν)/ν₀ (where ν₀ is the stretching frequency of the free A-H group), and the distances R(A-H) and R(A…B). According to these criteria, H-bonds may be regarded as weak, intermediate or strong. For O-H…O bonds this approximate classification is as follows:

H-bond    Δν/ν₀ (%)    R(O…O) (Å)    ΔH (kcal/mol)    ΔH (kJ/mol)
Weak      12           2.7           5                21
Medium    12-22        2.7-2.6       6-8              25-33
Strong    25-83        2.6-2.4       8                33
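
The two enthalpy columns of the table differ only by a unit conversion, which can be verified directly. This is a minimal sketch that assumes the thermochemical calorie (1 kcal = 4.184 kJ); the function name is illustrative.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie, assumed here


def kcal_to_kj(kcal_per_mol: float) -> float:
    """Convert an enthalpy from kcal/mol to kJ/mol."""
    return kcal_per_mol * KJ_PER_KCAL


# Enthalpy values from the kcal/mol column of the table.
table_kcal = (5, 6, 8)
table_kj = [round(kcal_to_kj(value)) for value in table_kcal]

for kcal, kj in zip(table_kcal, table_kj):
    print(f"{kcal} kcal/mol is about {kj} kJ/mol")
```

The rounded results (21, 25 and 33 kJ/mol) match the last column of the table.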

The length of the H-bonds in collagen is approx. 3 Å.
The most frequently occurring ones are:

C=O…H-N
and also C-H…O=C
and N-H…N

If the A-H…B system has a potential energy curve with two wells (I and II), the bond is strong or moderate. For the A⁻…H-B⁺ system, well II is deeper than well I. Finally, the potential energy curve may be symmetric; when the potential barrier is small or equal to zero, a “hesitating proton” is involved. Thus we distinguish an asymmetric double minimum, a symmetric double minimum, and a symmetric single minimum with
R(A-H) = ½ R(A…B) (then usually A = B).

Knowledge of the character and properties of the crosslinking bonds is of great importance to tanning chemistry. Splitting these bonds increases the solubility of collagen and decreases its shrinkage temperature. An increase in the number of these bonds, which is equivalent to tanning, has the opposite effect.
Typical components of the reducible covalent crosslinking bonds are shown below (only 2 examples are given here).

Dehydro-hydroxylysino-norleucine (occurs in skin):

COOH       OH                      COOH
|          |                       |
CH-CH2-CH2-CH-CH2-N=CH-CH2-CH2-CH2-CH
|                                  |
NH2                                NH2

Hydroxylysino-5-keto-norleucine (occurs in cartilage):

COOH       OH                O         COOH
|          |                 ||        |
CH-CH2-CH2-CH-CH2-NH-CH2-CH2-C-CH2-CH2-CH
|                                      |
NH2                                    NH2

Collagen is organized into distinctive banded fibrils with a periodicity of 680 Å (with hole zones and overlap zones). Collagen contains covalently attached carbohydrates in amounts that range from ~0.4 to 12% by weight, depending on the collagen’s tissue of origin. The carbohydrates, which consist mostly of glucose, galactose and their disaccharides, are covalently attached to collagen at its 5-hydroxylysyl residues by specific enzymes. They are located in the “hole” regions of collagen fibrils.
The supposed existence of an ester-type bond via a hexose residue probably derives from the fact that saccharide units have been found in collagen, attached to hydroxylysine by a glycosidic linkage in the helical region of the molecule, either as galactosyl-hydroxylysine or as glucosyl-galactosyl-hydroxylysine.
Type I and III collagens contain about 0.4% carbohydrate and type II contains about 4%. The major sites of glycosylation are those involved in the intramolecular crosslink. To date no experimental evidence has been presented that would demonstrate the function of these carbohydrates. It has been thought that they may regulate the formation of crosslinks and the aggregation of collagen molecules into the quarter-stagger arrangement.

Collagen’s insolubility in solvents is explained by the observation that it is covalently cross-linked both intramolecularly and intermolecularly. The cross-links cannot be disulfide links, as in keratin, because collagen is almost devoid of Cys residues. Rather, they are derived from Lys and His side chains. Up to four side chains can be covalently bonded to each other. The cross-links do not form at random but tend to occur near the N- and C-termini of the collagen molecules. Crosslinking is closely related to the ageing of the molecule: the degree of crosslinking increases with the age of the animal (which is why the meat of older animals is tougher).
In early postnatal tissues the amount of reducible crosslinks is high, and it decreases as physical maturity progresses. The stable crosslinks that replace the reducible ones have not yet been determined with certainty. The alterations of the physical and chemical properties of collagen fibers due to ageing are very distinct. The fibers become increasingly insoluble, and their ability to swell in acid solution and their susceptibility to enzyme attack decrease, whereas their mechanical strength and stiffness increase. The stiffness increases throughout the whole lifetime, creating brittleness, which results in a decrease of tensile strength. When artificially introduced crosslinks give rise to more than the optimum number of crosslinks, the connective tissue becomes brittle.
No position in the central part of the molecule is susceptible to attack by proteolytic enzymes (peptidases and proteinases) such as pronase, pepsin or trypsin.
