Serpin

Serpins are a group of proteins with similar structures that were first identified as a set of proteins able to inhibit proteases. The name serpin is derived from this activity - serine protease inhibitors.

The first members of the serpin superfamily to be extensively studied were the human plasma proteins antithrombin and antitrypsin, which play key roles in controlling blood coagulation and inflammation, respectively. Initially, research focused upon their role in human disease: antithrombin deficiency results in thrombosis and antitrypsin deficiency causes emphysema. In 1980, Hunt and Dayhoff made the surprising discovery that both these molecules share significant amino acid sequence similarity with the major protein in chicken egg white, ovalbumin, and they proposed a new protein superfamily. Over 1000 serpins have now been identified; these include 36 human proteins, as well as molecules in plants, bacteria, archaea and certain viruses. Serpins are thus the largest and most diverse family of protease inhibitors.

While most serpins control proteolytic cascades, certain serpins do not inhibit enzymes but instead perform diverse functions such as storage (ovalbumin, in egg white), hormone carriage (thyroxine-binding globulin, cortisol-binding globulin) and tumour suppression (maspin). The term serpin is used to describe these latter members as well, despite their non-inhibitory function.

As serpins control processes such as coagulation and inflammation, these proteins are the target of medical research. However, serpins are also of particular interest to the structural biology and protein folding communities, because they undergo a unique and dramatic change in shape (or conformational change) when they inhibit target proteases. This is unusual - most classical protease inhibitors function as simple "lock and key" molecules that bind to and block access to the protease active site (see, for example, bovine pancreatic trypsin inhibitor). While the serpin mechanism of protease inhibition confers certain advantages, it also has drawbacks: serpins are vulnerable to mutations that result in protein misfolding and the formation of inactive long-chain polymers (serpinopathies). Serpin polymerisation reduces the amount of active inhibitor, and the accumulation of serpin polymers causes cell death and organ failure. For example, the serpin antitrypsin is primarily produced in the liver, and antitrypsin polymerisation causes liver cirrhosis. Understanding serpinopathies also provides insights into protein misfolding in general, a process common to many human diseases, such as Alzheimer's disease and Creutzfeldt-Jakob disease (CJD).

Proteases inhibited by serpins
Most inhibitory serpins target chymotrypsin-like serine proteases (see Table 1). These enzymes are defined by the presence of a nucleophilic serine residue in their catalytic site. Examples include thrombin, trypsin and human neutrophil elastase.

Some serpins inhibit other classes of protease and are termed "cross-class inhibitors". For example, squamous cell carcinoma antigen 1 (SCCA-1) and the avian serpin myeloid and erythroid nuclear termination stage-specific protein (MENT) both inhibit papain-like cysteine proteases.

The viral serpin crmA is a suppressor of the inflammatory response through inhibition of IL-1 and IL-18 processing by the cysteine protease caspase-1. Cysteine proteases differ from serine proteases in that they are defined by the presence of a nucleophilic cysteine residue, rather than a serine residue, in their catalytic site. Nonetheless, the enzymatic chemistry is similar, and serpins most likely inhibit both classes of enzyme in a similar fashion.

Localisation and general biological roles
Approximately two thirds of human serpins perform extracellular roles. For example, extracellular serpins regulate the proteolytic cascades central to blood clotting (antithrombin), the inflammatory response (antitrypsin, antichymotrypsin and C1 inhibitor) and tissue remodelling (PAI-1). Non-inhibitory extracellular serpins also perform important roles. Thyroxine-binding globulin and cortisol-binding globulin transport the hormones thyroxine and cortisol, respectively. The protease renin cleaves a ten amino acid N-terminal peptide from the non-inhibitory serpin angiotensinogen to produce the peptide hormone angiotensin I. Table 1 at the bottom of this article provides a brief summary of human serpin function as well as some of the diseases that result from serpin deficiency.

The first intracellular members of the serpin superfamily were identified in the early 1990s. As all nine serpins in Caenorhabditis elegans lack signal sequences, they are probably intracellular. Based upon these data, it seems likely that the ancestor of the human serpins was an intracellular molecule.

The protease targets of intracellular inhibitory serpins have been more difficult to identify. Characterisation is complicated by these molecules appearing to perform overlapping roles, as well as the lack of precise functional equivalents of human serpins in model organisms such as the mouse. An important function of intracellular serpins may be to protect against the inappropriate activity of proteases inside the cell. For example, one of the best characterised human intracellular serpins is SERPINB9, which inhibits the cytotoxic granule protease granzyme B. In doing so, SERPINB9 may protect against inadvertent release of granzyme B and premature or unwanted activation of cell death pathways.

Intracellular serpins also perform roles distinct from protease inhibition. For example, maspin, a non-inhibitory serpin, is important for preventing metastasis in breast and prostate cancers. Another example is the avian nuclear cysteine protease inhibitor MENT, which acts as a chromatin remodelling molecule in avian red blood cells.

Phylogenetic studies show that most intracellular serpins belong to a single clade (see table 1). Exceptions include the non-inhibitory heat shock serpin HSP47, which is a chaperone essential for proper folding of collagen and cycles between the cis-Golgi and the endoplasmic reticulum.

Structure


Structural biology has played a central role in the understanding of serpin function and biology. Over eighty serpin structures, in a variety of different conformations (described below), have been determined to date. Although the function of serpins varies widely, these molecules all share a common structure (or fold).

The structures of the non-inhibitory serpin ovalbumin and the inhibitory serpin antitrypsin revealed the archetypal native serpin fold. Serpins typically have three β-sheets (termed A, B and C) and eight or nine α-helices (hA-hI) (see figure 1). Serpins also possess an exposed region termed the reactive centre loop (RCL) that, in inhibitory molecules, includes the specificity-determining region and forms the initial interaction with the target protease. In antitrypsin, the RCL is held at the top of the molecule and is not pre-inserted into the A β-sheet (figure 1, left panel). This conformation commonly exists in dynamic equilibrium with a partially inserted native conformation seen in other inhibitory serpins (see figure 1, right panel).

Conformational change and inhibitory mechanism
Early studies on serpins revealed that the mechanism by which these molecules inhibit target proteases appeared distinct from the lock-and-key-type mechanism utilised by small protease inhibitors such as the Kunitz-type inhibitors (e.g. bovine pancreatic trypsin inhibitor). Indeed, serpins form covalent complexes with target proteases. Structural studies on serpins also revealed that inhibitory members of the family undergo an unusual conformational change, termed the Stressed to Relaxed (S to R) transition. During this structural transition the RCL inserts into β-sheet A (in red in figures 1 and 2) and forms an extra (fourth) β-strand. The serpin conformational change is key to the mechanism of inhibition of target proteases.

When attacking a substrate, serine proteases catalyse peptide bond cleavage in a two-step process. Initially, the catalytic serine performs a nucleophilic attack on the peptide bond of the substrate (Figure 3). This releases the new N-terminus and forms an ester bond between the enzyme and the substrate. This covalent enzyme-substrate complex is called an acyl-enzyme intermediate. Subsequently, this ester bond is hydrolysed and the new C-terminus is released. The RCL of a serpin acts as a substrate for its cognate protease. However, after the RCL is cleaved, but prior to hydrolysis of the acyl-enzyme intermediate, the serpin rapidly undergoes the S to R transition. Since the RCL is still covalently attached to the protease via the ester bond, the S to R transition moves the protease from the top to the bottom of the serpin. At the same time, the protease is distorted into a conformation in which the acyl-enzyme intermediate is hydrolysed extremely slowly. The serpin thus remains covalently attached to the target protease, which is thereby inhibited. Further, since the serpin has to be cleaved to inhibit the target protease, inhibition consumes the serpin as well. Serpins are therefore irreversible enzyme inhibitors. The serpin mechanism of inhibition is illustrated in figure 2.

Conformational modulation of serpin activity
The conformational mobility of serpins provides a key advantage over static lock and key protease inhibitors. In particular, the function of inhibitory serpins can be readily controlled by specific cofactors. The X-ray crystal structures of antithrombin, heparin co-factor II, MENT and murine antichymotrypsin reveal that these serpins adopt a conformation where the first two amino acids of the RCL are inserted into the top of the A β-sheet (see figures 1 and 4). The partially inserted conformation is important because co-factors are able to conformationally switch partially inserted serpins into a fully expelled form. This conformational rearrangement makes the serpin a more effective inhibitor.

The archetypal example of this situation is antithrombin, which circulates in plasma in a partially inserted, relatively inactive state. The primary specificity-determining residue (the P1 arginine) points towards the body of the serpin and is unavailable to the protease (Figure 4). Upon binding a high-affinity heparin pentasaccharide sequence within long-chain heparin, antithrombin undergoes a conformational change that expels the RCL and exposes the P1 arginine. The heparin pentasaccharide-bound form of antithrombin is thus a more effective inhibitor of thrombin and factor Xa (Figure 4). Furthermore, both of these coagulation proteases contain binding sites (called exosites) for heparin. Heparin therefore also acts as a template for binding of both protease and serpin, further dramatically accelerating the interaction between the two parties (Figure 4). After the initial interaction, the final serpin complex is formed and the heparin moiety is released. This interaction is physiologically important. For example, after injury to the blood vessel wall heparin is exposed, and antithrombin is thus activated to control the clotting response. Understanding the molecular basis of this interaction underpinned the development of fondaparinux, a synthetic form of the heparin pentasaccharide used as an anti-clotting drug.



Certain serpins spontaneously undergo the S to R transition as part of their function, to form a conformation termed the latent state (Figure 5). In latent serpins the first strand of the C-sheet has to peel off to allow full RCL insertion. Latent serpins are unable to interact with proteases and are not protease inhibitors. The transition to latency represents a control mechanism for the serpin PAI-1. PAI-1 is released in the inhibitory conformation but undergoes a conformational change to the latent state unless it is bound to the cofactor vitronectin. Thus PAI-1 contains an "auto-inactivation" mechanism. Similarly, antithrombin can also spontaneously convert to the latent state as part of its normal function. Finally, the N-terminus of tengpin, a serpin from Thermoanaerobacter tengcongensis, is required to lock the molecule in the native inhibitory state. Disruption of interactions made by the N-terminal region results in spontaneous conformational change of this serpin to the latent conformation.



Serpin receptor interactions
In humans, extracellular serpin-enzyme complexes are rapidly cleared from circulation. One mechanism by which this occurs is via the low density lipoprotein receptor-related protein (LRP receptor), which binds to inhibitory complexes made by antithrombin, PAI-1 and neuroserpin, causing uptake and subsequent signalling events. Thus, as a consequence of the conformational change during serpin-enzyme complex formation, serpins may act as signalling molecules that alert cells to the presence of protease activity. The fate of intracellular serpin-enzyme complexes remains to be characterised.

Conformational change and non-inhibitory function
Certain non-inhibitory serpins also use the serpin conformational change as part of their function. For example, the native (S) form of thyroxine-binding globulin has high affinity for thyroxine, whereas the cleaved (R) form has low affinity. Similarly, native (S) cortisol-binding globulin (CBG) has higher affinity for cortisol than its cleaved (R) counterpart. Thus, in these serpins, RCL cleavage and the S to R transition have been commandeered to allow for ligand release, rather than protease inhibition.

Serpins, serpinopathies and human disease
The complexity of the serpin mechanism renders these molecules vulnerable to inactivating mutations that promote inappropriate conformational change (or misfolding) and diseases ("serpinopathies"). Well characterised serpinopathies include emphysema, cirrhosis, thrombosis and dementia. Serpins thus belong to a large group of molecules such as the prion proteins and the glutamine repeat containing proteins that are susceptible to misfolding, causing conformational disease.

The ability to map the mutations in serpins that cause serpinopathies onto a structural framework aided understanding of the mechanism of normal serpin conformational changes, as well as serpin dysfunction. In particular, many serpin mutations that cause disease localise to two distinct regions of the molecule (highlighted in figure 1a) termed the shutter and the breach. The shutter and the breach contain highly-conserved residues and underlie the path of RCL insertion.

Serpin misfolding results in two common outcomes, both of which stem from the instability of the native (S) conformation. Firstly, pathogenic mutations in serpins can promote inappropriate transition to the monomeric latent state. This causes disease because it reduces the amount of active inhibitory serpin. For example, the disease-linked antithrombin variants wibble and wobble both promote formation of the latent state.

Secondly, and more insidiously, mutations in serpins may cause polymerisation. While the X-ray crystal structure of an intact serpin polymer remains to be determined, much biochemical, biophysical and structural data suggest that serpins "domain swap" with one another and form long-chain polymers. This may occur by the RCL of one serpin inserting into the A-sheet of another serpin to form a chain, rather than inserting into its "own" A-sheet (see figure 6a for a model). The polymeric form is inactive and causes pathology. Serpin polymerisation causes disease in two ways. Firstly, the lack of active serpin results in uncontrolled protease activity and tissue destruction, as seen in antitrypsin deficiency. Secondly, the polymers themselves clog up the endoplasmic reticulum of cells that synthesise serpins, eventually resulting in cell death and tissue damage. In the case of antitrypsin deficiency, antitrypsin polymers cause the death of liver cells, eventually resulting in liver damage and cirrhosis.

Finally, it is worth highlighting a structure of a disease-linked human antichymotrypsin variant that demonstrates the extraordinary flexibility of the serpin scaffold. The structure of antichymotrypsin (leucine 55 to proline) revealed a novel "delta" conformation that may represent an intermediate between the native and latent states (Figure 6b). In the delta conformation, four residues of the RCL are inserted into the top of β-sheet A. The bottom half of the sheet is filled as a result of one of the α-helices (the F-helix) partially switching to a strand-like conformation, completing the β-sheet hydrogen bonding. It is unclear whether other serpins can adopt this conformer, or whether this conformation has a functional role. However, this conformation may be important for thyroxine release by thyroxine-binding globulin.

Other mechanisms of serpin-related disease
In humans, simple deficiency of many serpins (e.g. through a null mutation) may result in disease (see table 1).

Rarely, a single amino acid change in the RCL of a serpin alters the specificity of the inhibitor and allows it to target the wrong protease. For example, the Antitrypsin-Pittsburgh mutation (methionine 358 to arginine) allowed the serpin to inhibit thrombin, thus causing a bleeding disorder.
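The effect of such a substitution can be illustrated with a deliberately coarse sketch in Python. It assumes that specificity is decided by the P1 residue alone, which is a simplification, and the preference table and the function name candidate_targets are hypothetical, introduced only for illustration. The table encodes two textbook generalisations: thrombin cleaves after arginine, and neutrophil elastase favours small or medium hydrophobic residues.

```python
# Coarse, illustrative P1 preferences (an assumption of this sketch; real
# serpin specificity depends on more than the P1 residue alone).
P1_PREFERENCES = {
    "neutrophil elastase": {"A", "V", "M"},  # small/medium hydrophobic side chains
    "thrombin": {"R"},                       # arginine
}


def candidate_targets(p1_residue):
    """Return the proteases in the toy table that accept this P1 residue."""
    p1 = p1_residue.upper()
    return sorted(name for name, allowed in P1_PREFERENCES.items() if p1 in allowed)


wild_type_p1 = "M"    # antitrypsin carries methionine at position 358
pittsburgh_p1 = "R"   # the Pittsburgh variant replaces it with arginine

print("wild type :", candidate_targets(wild_type_p1))
print("Pittsburgh:", candidate_targets(pittsburgh_p1))
```

Under these assumptions the wild-type residue matches only neutrophil elastase, while the Pittsburgh residue matches only thrombin, mirroring the switch in target described above.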

Serpins are suicide inhibitors, the RCL acting as a "bait". Certain disease-linked mutations in the RCL of human serpins permit true substrate-like behaviour and cleavage without complex formation. Such variants are speculated to affect the rate or the extent of RCL insertion into the A-sheet. These mutations effectively result in serpin deficiency through a failure to properly control the target protease.
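The competition between RCL insertion and completion of the substrate reaction can be sketched as a simple kinetic partition. The Python example below is a minimal sketch that assumes the acyl-enzyme intermediate has only two fates, each described by a single first-order rate constant: trapping through RCL insertion (k_insert) or hydrolysis that releases active protease and cleaved serpin (k_deacyl). Both names and all numerical values are hypothetical and chosen for illustration; they are not measured constants.

```python
from typing import Tuple


def partition(k_insert: float, k_deacyl: float) -> Tuple[float, float]:
    """Return (fraction_trapped, stoichiometry_of_inhibition).

    fraction_trapped: share of cleavage events that end in a stable
        serpin-protease complex.
    stoichiometry_of_inhibition: average number of serpin molecules consumed
        per protease molecule inhibited (1.0 for a perfectly efficient serpin).
    """
    if k_insert <= 0 or k_deacyl < 0:
        raise ValueError("k_insert must be positive and k_deacyl non-negative")
    fraction_trapped = k_insert / (k_insert + k_deacyl)
    return fraction_trapped, 1.0 / fraction_trapped


# Hypothetical efficient serpin: insertion far outpaces hydrolysis.
efficient = partition(k_insert=99.0, k_deacyl=1.0)

# Hypothetical substrate-like variant: slowed insertion lets hydrolysis win.
substrate_like = partition(k_insert=1.0, k_deacyl=9.0)

print("efficient serpin      :", efficient)
print("substrate-like variant:", substrate_like)
```

In this picture a mutation that slows RCL insertion lowers the trapped fraction, so more serpin is consumed per protease inhibited, which amounts to a functional deficiency even when the protein is present at normal levels.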

Several non-inhibitory serpins play key roles in important human diseases. Most notably, maspin functions as a tumour suppressor in breast and prostate cancer. The mechanism of maspin function remains to be fully understood. Murine knockouts of maspin are lethal; these data suggest that maspin plays a key role in development.

Evolution
Serpins were initially believed to be restricted to eukaryotic organisms, but have since been found in a number of bacteria and archaea. It remains unclear whether these prokaryotic genes are the descendants of an ancestral prokaryotic serpin or whether they are the product of lateral gene transfer (genetic transfer between organisms not by evolutionary descent). Rawlings et al. showed that serpins are the most widely distributed and largest family of protease inhibitors.

Human serpins
The human genome encodes 36 serpins (see Law et al. (2006) for a review). Table 1 lists each human serpin, together with brief notes on each molecule's function and the consequence (where known) of dysfunction or deficiency.

Insect serpins
Studies on Drosophila serpins reveal that Serpin-27A inhibits the Easter protease (the final protease in the Nudel, Gastrulation Defective, Snake and Easter proteolytic cascade) and thus controls dorsoventral patterning. Easter functions to cleave Spätzle (a chemokine-type ligand), which results in Toll-mediated signalling. In addition to its central role in embryonic patterning, Toll signalling is also important for the innate immune response in insects. Accordingly, Serpin-27A additionally functions to control the insect immune response.

Worm serpins
The genome of the nematode worm C. elegans contains nine serpins; however, only five of these molecules appear to function as protease inhibitors. One of these serpins, SRP-6, has been shown to perform a protective function and guard against stress-induced, calpain-associated lysosomal disruption. Further, SRP-6 functions to inhibit lysosomal cysteine proteases released after lysosomal rupture. Accordingly, worms lacking SRP-6 are sensitive to stress. Most notably, SRP-6 knockout worms die when placed in water (the hypo-osmotic stress lethal phenotype, or Osl). Based on these data it is suggested that lysosomes play a general and controllable role in determining cell fate.

Plant serpins
The presence of serpins in plants has long been recognised - indeed, barley Z serpin is the major protein component in beer. The genome sequence of Arabidopsis thaliana is predicted to encode 29 serpins. Plant serpins are able to inhibit serine proteases in vitro. However, the absence of close relatives of chymotrypsin-like proteases in plants suggests that these molecules may instead perform an alternative function. Indeed, Arabidopsis serpin1 inhibits metacaspase-like proteases in vivo and may control cell death pathways.

Prokaryote serpins
Predicted serpin genes are sporadically distributed in prokaryotes. In vitro studies on some of these molecules have revealed that they are able to inhibit proteases, and it is suggested that they function as inhibitors in vivo. Interestingly, several prokaryote serpins are found in extremophiles. Accordingly, and in contrast to mammalian serpins, these molecules possess elevated resistance to heat denaturation. The precise role of most bacterial serpins remains obscure; however, Clostridium thermocellum serpin localises to the cellulosome, a large extracellular multiprotein complex that breaks down cellulose. It is suggested that the role of cellulosome-associated serpins may be to prevent unwanted protease activity against the cellulosome.

Classification
In 2001, a serpin nomenclature was established. The naming system is based upon a phylogenetic analysis of ~500 serpins. This work classified the serpins into sixteen major clades, with several orphan sequences. The serpin family continues to grow - to date over 1000 serpins have been identified.