EP1141263A4 - Production of soluble recombinant trypsinogen analogs - Google Patents

Production of soluble recombinant trypsinogen analogs

Info

Publication number
EP1141263A4
Authority
EP
European Patent Office
Prior art keywords
trypsinogen
trypsin
amino acid
analog
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP99951445A
Other languages
German (de)
French (fr)
Other versions
EP1141263A1 (en)
Inventor
Jose Michael Hanquier
Charles Lee Hershberger
Dominique Desplancq
Jeffery Lynn Larson
Paul Robert Rosteck Jr
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Universite Louis Pasteur Strasbourg I
Eli Lilly and Co
Original Assignee
Universite Louis Pasteur Strasbourg I
Eli Lilly and Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Universite Louis Pasteur Strasbourg I, Eli Lilly and Co
Publication of EP1141263A1
Publication of EP1141263A4

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/64 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue
    • C12N9/6421 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue from mammals
    • C12N9/6424 Serine endopeptidases (3.4.21)
    • C12N9/6427 Chymotrypsins (3.4.21.1; 3.4.21.2); Trypsin (3.4.21.4)

Definitions

  • the invention relates generally to recombinant DNA technology. More specifically, the present invention relates to recombinant trypsinogen analogs as well as methods for producing recombinant trypsin.
  • Trypsin is a widely used serine protease which cleaves the peptide bond on the carboxy-terminus of basic amino acid residues such as lysine and arginine.
  • trypsin plays a pivotal role among pancreatic enzymes in the activation of endopeptidases.
  • pancreatic enzymes are secreted through the pancreatic duct into the duodenum of the small intestine in response to a hormone signal generated when food passes from the stomach. They are not, however, synthesized in their final active form. Rather, they are made as slightly longer catalytically inactive molecules called zymogens.
  • zymogens include trypsinogen, chymotrypsinogen, proelastase, and procarboxypeptidase. These zymogens must themselves be cleaved proteolytically to yield active enzymes.
  • the first step of the activation cascade is the activation of trypsin from trypsinogen in the duodenum.
  • Enteropeptidase also known as enterokinase
  • enterokinase is a protease produced by duodenal epithelial cells which activates pancreatic trypsinogen to trypsin by excising a hexapeptide leader sequence from the amino-terminus of trypsinogen. Trypsin in turn autocatalytically activates more trypsinogen to trypsin and also acts on other proenzymes, thus, for example, liberating the endopeptidases chymotrypsin and elastase as well as carboxypeptidases A and B.
  • This battery of enzymes works together with pepsin produced in the stomach and other proteases secreted by the intestinal wall cells to digest most ingested proteins into free amino acids, which can be absorbed by the intestinal epithelium.
  • the enzymes themselves are continually subjected to autodigestion and other degradative processes so that high levels of these enzymes never accumulate in the intestine. In the pancreas, several factors oppose trypsinogen autoactivation, whereas in the duodenum, all the conditions favorable for trypsinogen activation by enteropeptidase are present.
  • a hexapeptide leader sequence which consists of Valine-Aspartate-Aspartate-Aspartate-Aspartate-Lysine ((Asp)4-Lys), on the amino-terminus of trypsinogen is enzymatically removed. Trypsinogens of many different species have been cloned and characterized. The pattern of (Asp)4-Lys at the amino-terminus, however, is well-conserved in all of these precursors. Mireaux Rovery, Limited Proteolyses in Pancreatic Chymotrypsinogens and Trypsinogens, 70 Biochimie 1131 (1988).
  • Serine proteases such as trypsin have a variety of uses. They are useful for the characterization of other proteins as well as in the manufacturing process of other recombinant bioproducts. For example, small recombinant proteins are often expressed first as fusion proteins to facilitate their purification and enhance their stability.
  • the fusion proteins can be engineered such that a leader sequence can be cleaved from the native protein sequence by trypsin. Any internal lysines or arginines that are not part of the leader sequence can be chemically protected from cleavage by trypsin.
  • BSE bovine spongiform encephalopathy
  • the present invention also provides an efficient and relatively inexpensive process to manufacture recombinant trypsin.
  • the process of the present invention is significantly different from past approaches by providing for the expression of an inactive zymogen form that is soluble and properly folded yet is not activated until after purification from fermentation broth or cell extracts.
  • This is accomplished through the expression of a single chain trypsinogen analog wherein the leader sequence is modified such that it lacks a trypsin-like enzyme cleavage site.
  • the trypsinogen analogs of the present invention lack a lysine or arginine in the N-terminal leader sequence of the protein to prevent auto-activation or activation by endogenous host cell enzymes. Once expressed in a particular cell type, the trypsinogen analog can then be secreted outside of the cell and isolated from the culture medium or alternatively expressed in the cytoplasm.
  • the cells can be collected by centrifugation, lysed, and the trypsinogen isolated. Once the trypsinogen is isolated, stability can be enhanced by lowering the pH to about 3.0 and then subsequently activating the trypsinogen with an aminopeptidase such as dipeptidylaminopeptidase (DAP).
  • DAP dipeptidylaminopeptidase
  • the present invention provides trypsinogen analogs which have a modified amino-terminal leader sequence which results in recombinant trypsinogen that is stable and easy to activate.
  • This invention provides a means to move away from animal-sourced trypsin and avoid the problems of degradation, instability, and damage to cell membranes which occur during expression and/or secretion of recombinantly produced trypsinogen.
  • the invention makes possible the secretion of a folded trypsinogen, therefore eliminating costly and time consuming refolding steps.
  • the present invention provides a trypsinogen analog comprising a protein having trypsin activity fused to a leader sequence having at least 2 amino acids wherein the amino acids are any amino acid except Lys or Arg.
  • the invention provides a trypsinogen analog represented by X-AA-Y, wherein Y is a protein having trypsin activity, AA is an amino acid other than Lys or Arg and X-AA is a leader sequence having at least 2 amino acids.
  • the invention further provides analogs that can be activated by cleavage with an aminopeptidase.
  • the mature trypsin protein, which is the active enzyme, is derived from a single polypeptide which comprises a leader sequence fused to a polypeptide representing the active enzyme.
  • the leader sequence must be cleaved to generate the active trypsin molecule. Still further, the invention provides an isolated trypsinogen analog represented by X-AA-Y, wherein: Y is a protein having trypsin activity; AA is any amino acid except Lys or Arg; and X-AA is a leader sequence having at least 2 amino acids.
  • the invention encompasses leader sequences fused to trypsin which cannot be cleaved by trypsin, enteropeptidases, or any other endogenous trypsin-like enzymes that may be present during expression, secretion, or isolation of the protein.
  • the leader sequence of the trypsinogen analog is selected from the group consisting of: a) from 2 to 20 non-basic amino acid residues; b) from 2 to 20 amino acid residues wherein the amino acid residues can be any residue except Lys or Arg; c) Val-Asp-Asp-Asp-Asp-N, wherein N is any amino acid residue except Lys, Arg, or Pro; d) amino acid residues 5 through 10 of SEQ ID NO:2; e) from 2 to 20 amino acid residues wherein the residues can be any hydrophilic amino acid residue that is not Lys or Arg; f) amino acid residues 5 through 8 of SEQ ID NO:2; and g) from 2 to 20 amino acid residues, wherein the number of residues is an even number of residues and the residues are any non-basic amino acid except proline.
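Two of the leader-sequence options listed above lend themselves to a simple check: the general "2 to 20 residues, none of them Lys or Arg" rule and the Val-Asp-Asp-Asp-Asp-N pattern. The following Python sketch is illustrative only; the function names are hypothetical, sequences are assumed to be written in standard one-letter amino acid codes, and the options that depend on SEQ ID NO:2 or on hydrophilicity are not modeled.

```python
# Illustrative sketch only: checks two of the listed leader-sequence options.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard amino acids
BASIC_RESIDUES = {"K", "R"}                      # lysine and arginine


def is_non_basic_leader(leader: str, min_len: int = 2, max_len: int = 20) -> bool:
    """True if the leader has 2 to 20 standard residues and no Lys or Arg."""
    seq = leader.upper()
    return (
        min_len <= len(seq) <= max_len
        and set(seq) <= STANDARD_RESIDUES
        and not (set(seq) & BASIC_RESIDUES)
    )


def matches_val_asp4_n(leader: str) -> bool:
    """True if the leader is Val-Asp-Asp-Asp-Asp-N with N not Lys, Arg, or Pro."""
    seq = leader.upper()
    return (
        len(seq) == 6
        and seq[:5] == "VDDDD"
        and seq[5] in STANDARD_RESIDUES - {"K", "R", "P"}
    )


print(is_non_basic_leader("VDDDDD"))  # True
print(is_non_basic_leader("VDDDDK"))  # False, contains Lys
print(matches_val_asp4_n("VDDDDP"))   # False, Pro is excluded at position 6
```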
  • this invention provides an inactive trypsin precursor comprising the sequence which is SEQ ID NO:2.
  • the invention also provides proteins having the amino acid sequences of SEQ ID NO: 2 and SEQ ID NO: 4, including derivatives without the first three amino acids of SEQ ID NO: 2 and the first two amino acids of SEQ ID NO: 4.
  • the invention further provides a trypsinogen analog having a secretion signal sequence of amino acids operably linked to the amino-terminal end of the leader sequence wherein said signal sequence allows the trypsinogen to be secreted into the host cell growth medium.
  • the invention encompasses both RNA and DNA which encode the trypsinogen analog proteins of the present invention.
  • Preferred polynucleotides of the current invention include the nucleic acids of SEQ ID NO:1 and SEQ ID NO:3 and variants thereof.
  • This invention also provides recombinant vectors comprising the above-described nucleic acids as well as host cells harboring said recombinant vectors.
  • the invention further provides essentially chymotrypsin-free trypsin preparations.
  • the invention also provides nucleic acids sharing at least about 65 % identity with the nucleic acids of SEQ ID NO: 1 and SEQ ID NO: 3.
  • the invention includes methods for making recombinant trypsinogen analogs comprising the steps of: expressing the trypsinogen analogs described above in a host cell and isolating the expressed trypsinogen. Methods are also provided for making recombinant trypsin further comprising the step of activating the isolated trypsinogen with an aminopeptidase.
  • methods comprising the steps of: expressing trypsinogen analogs in Pichia pastoris; isolating the trypsinogen analogs in a buffer; and incubating the isolated trypsinogen with a dipeptidylaminopeptidase in a buffer with a pH from about 2.0 to about 6.5 such that the trypsinogen is converted to active trypsin.
  • Figure 1 is a restriction map of the plasmid vector pRMG5 containing the bovine trypsin gene.
  • Figure 2 is a restriction map of the plasmid vector pLGD43 containing the bovine trypsin gene.
  • FIG. 3 summarizes the production of ValAsp5 trypsinogen in the two expression systems described herein.
  • Base pair refers to DNA or RNA.
  • the abbreviations A,C,G, and T correspond to the 5'-monophosphate forms of the deoxyribonucleosides (deoxy)adenosine, (deoxy)cytidine, (deoxy)guanosine, and thymidine, respectively, when they occur in DNA molecules.
  • the abbreviations U,C,G, and A correspond to the 5'- monophosphate forms of the ribonucleosides uridine, cytidine, guanosine, and adenosine, respectively when they occur in RNA molecules.
  • base pair may refer to a partnership of A with T or C with G.
  • heteroduplex base pair may refer to a partnership of A with U or C with G. (See the definition of “complementary”, infra.)
  • the terms “digestion” or “restriction” of DNA refer to the catalytic cleavage of the DNA with a restriction enzyme that acts only at certain sequences in the DNA (“sequence-specific endonucleases”).
  • restriction enzymes used herein are commercially available and their reaction conditions, cofactors, and other requirements were used as would be known to one of ordinary skill in the art. Appropriate buffers and substrate amounts for particular restriction enzymes are specified by the manufacturer or can be readily found in the literature.
  • Ligation refers to the process of forming phosphodiester bonds between two double stranded nucleic acid fragments. Unless otherwise provided, ligation may be accomplished using known buffers and conditions with a DNA ligase, such as T4 DNA ligase.
  • Plasmid refers to an extrachromosomal (usually) self-replicating genetic element. Plasmids are generally designated by a lower case “p” followed by letters and/or numbers.
  • the starting plasmids herein are either commercially available, publicly available on an unrestricted basis, or can be constructed from available plasmids in accordance with published procedures. In addition, equivalent plasmids to those described are known in the art and will be apparent to the ordinarily skilled artisan.
  • reading frame means the nucleotide sequence from which translation occurs “read” in triplets by the translational apparatus of transfer RNA (tRNA) and ribosomes and associated factors, each triplet corresponding to a particular amino acid.
  • tRNA transfer RNA
  • the triplet codons corresponding to the desired polypeptide must be aligned in multiples of three from the initiation codon, i.e. the correct “reading frame” being maintained.
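The reading-frame requirement above can be made concrete with a short Python sketch that splits a coding sequence into triplets starting from the first nucleotide. The function name and the twelve-nucleotide example are hypothetical; the example is not taken from SEQ ID NO:1 or SEQ ID NO:3.

```python
# Illustrative sketch only: splits a coding sequence into in-frame triplets.
def split_codons(coding_sequence: str) -> list[str]:
    """Split a coding sequence into codons, reading from the first nucleotide.

    Raises ValueError if the length is not a multiple of three, since the
    sequence could then not be read completely in a single frame.
    """
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


print(split_codons("ATGGTTGATTAA"))  # ['ATG', 'GTT', 'GAT', 'TAA']
```

Inserting or deleting a single nucleotide upstream shifts every following triplet, which is why the frame must be maintained from the initiation codon onward.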
  • Recombinant DNA cloning vector refers to any autonomously replicating agent, including, but not limited to, plasmids and phages, comprising a DNA molecule to which one or more additional DNA segments can or have been added.
  • recombinant DNA expression vector refers to any recombinant DNA cloning vector in which a promoter to control transcription of the inserted DNA has been incorporated.
  • expression vector system refers to a recombinant DNA expression vector in combination with one or more trans-acting factors that specifically influence transcription, stability, or replication of the recombinant DNA expression vector.
  • the trans-acting factor may be expressed from a co-transfected plasmid, virus, or other extrachromosomal element, or may be expressed from a gene integrated within the chromosome.
  • Transcription refers to the process whereby information contained in a nucleotide sequence of DNA is transferred to a complementary RNA sequence.
  • transfection refers to the taking up of an expression vector by a host cell whether or not any coding sequences are in fact expressed. Numerous methods of transfection are known to the ordinarily skilled artisan, for example, calcium phosphate co-precipitation, and electroporation. Successful transfection is generally recognized when any indication of the operation of this vector occurs within the host cell.
  • transformation means the introduction of DNA into an organism so that the DNA is replicable, either as an extrachromosomal element or by chromosomal integration.
  • Methods of transforming bacterial and eukaryotic hosts are well known in the art, many of which methods, such as nuclear injection, protoplast fusion, or calcium treatment using calcium chloride, are summarized in J. Sambrook, et al., Molecular Cloning: A Laboratory Manual, (1989). Generally, when introducing DNA into yeast the term transformation is used as opposed to the term transfection.
  • translation refers to the process whereby the genetic information of messenger RNA is used to specify and direct the synthesis of a polypeptide chain.
  • vector refers to a nucleic acid compound used for the transfection and/or transformation of cells in gene manipulation bearing polynucleotide sequences corresponding to appropriate protein molecules which, when combined with appropriate control sequences, confer specific properties on the host cell to be transfected and/or transformed. Plasmids, viruses, and bacteriophage are suitable vectors. Artificial vectors are constructed by cutting and joining DNA molecules from different sources using restriction enzymes and ligases.
  • vector as used herein includes Recombinant DNA cloning vectors and Recombinant DNA expression vectors.
  • complementarity refers to pairs of bases (purines and pyrimidines) that associate through hydrogen bonding in a double stranded nucleic acid.
  • the following base pairs are complementary: guanine and cytosine; adenine and thymine; and adenine and uracil.
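The base-pairing rules just listed can be expressed as small lookup tables. The Python sketch below is illustrative only; the function names are hypothetical, and both functions return the complementary strand written 5' to 3', which is a presentation choice made here rather than something specified in the text.

```python
# Illustrative sketch only: complementary strands from the listed base pairs.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}      # DNA:DNA pairing
DNA_TO_RNA_PAIRS = {"A": "U", "T": "A", "G": "C", "C": "G"}  # DNA:RNA pairing


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, written 5' to 3'."""
    return "".join(DNA_PAIRS[base] for base in reversed(dna.upper()))


def complementary_rna(dna_template: str) -> str:
    """Return the RNA strand complementary to a DNA template, written 5' to 3'.

    Adenine in the template pairs with uracil in the RNA.
    """
    return "".join(DNA_TO_RNA_PAIRS[base] for base in reversed(dna_template.upper()))


print(reverse_complement("ATGC"))  # GCAT
print(complementary_rna("ATGC"))   # GCAU
```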
  • hybridization refers to a process in which a strand of nucleic acid joins with a complementary strand through base pairing.
  • the conditions employed in the hybridization of two non-identical, but very similar, complementary nucleic acids vary with the degree of complementarity of the two strands and the length of the strands. Such techniques and conditions are well known to practitioners in this field.
  • isolated amino acid sequence refers to any amino acid sequence, however constructed or synthesized, which is locationally distinct from the naturally occurring sequence.
  • isolated DNA compound refers to any DNA sequence, however constructed or synthesized, which is locationally distinct from its natural location in genomic DNA.
  • isolated nucleic acid compound refers to any RNA or DNA sequence, however constructed or synthesized, which is locationally distinct from its natural location.
  • a “primer” is a nucleic acid fragment which functions as an initiating substrate for enzymatic or synthetic elongation.
  • promoter refers to a DNA sequence which directs transcription of DNA to RNA.
  • a “probe” as used herein is a nucleic acid compound or a fragment thereof which hybridizes with another nucleic acid compound.
  • stringency refers to a set of hybridization conditions which may be varied in order to vary the degree of nucleic acid affinity for other nucleic acid. (See the definition of “hybridization”, supra.)
  • PCR refers to the widely-known polymerase chain reaction employing a thermally-stable DNA polymerase.
  • leader sequence refers to a sequence of amino acids which can be enzymatically or chemically removed to produce the desired polypeptide of interest.
  • processed polypeptide refers to a polypeptide or protein wherein the N- terminal leader sequence has been removed to yield the desired polypeptide of interest.
  • secretion signal sequence refers to a sequence of amino acids generally present at the N-terminal region of a larger polypeptide functioning to initiate association of that polypeptide with the cell membrane and secretion of that polypeptide through the cell membrane.
  • trypsinogen refers to an inactive serine protease zymogen which can be converted to trypsin by removal of a hexapeptide leader sequence which generally comprises the sequence Val(Asp) 4 Lys (SEQ ID NO:?).
  • trypsin or “trypsin-like enzymes” refer to proteases which have the ability to cleave the peptide bond on the carboxy-terminus of basic amino acid residues such as lysine and arginine.
  • enterokinase or “enteropeptidase” refer to proteases generally produced in epithelial cells that activate trypsinogen to trypsin by cleaving off the trypsinogen leader sequence from the amino terminus of the protein.
  • trypsinogen analog refers to trypsinogen which has been mutated such that it cannot be converted to active trypsin by the action of trypsin or trypsin-like enzymes.
  • autoactivation or “autocatalytic” refers to the ability of trypsin to activate trypsinogen by cleaving the leader sequence to produce more active trypsin.
  • endogenous trypsin-like enzyme refers to proteases which have the ability to cleave the peptide bond on the carboxy-terminus of basic amino acid residues such as lysine and arginine and which are normally expressed in the particular cell type of interest.
  • percent identity is used with reference to the Blast 2 algorithm, which is available at the NCBI (http://www.ncbi.nlm.nih.gov/BLAST), using default parameters. References pertaining to this algorithm include: those found at http://www.ncbi.nlm.nih.gov/BLAST/blast_references.html; Altschul, S.F., Gish, W., Miller, W., Myers, E. W. & Lipman, D.J. (1990) “Basic local alignment search tool.” J. Mol. Biol. 215:403-410; Gish, W. & States, D.J.
  • the present invention avoids the premature activation of recombinant trypsinogen to trypsin that can occur during expression, secretion, and isolation.
  • Prior experimentation involving recombinant trypsinogen indicates that it is generally quite difficult to provide a process for producing stable recombinant trypsinogen.
  • Attempts by researchers experimenting in the field of recombinant protein expression to make recombinant trypsin were not successful, in part, because a mixture of different species of active and inactive trypsin was produced.
  • the final product was not fully active mature recombinant trypsin. Trypsin contains three internal trypsin cleavage sites in addition to the cleavage site in the leader sequence.
  • Trypsin has a strong affinity for itself. Prior processes were problematic because, during expression and/or secretion, trypsinogen was activated to trypsin by endogenous enzymes. These activated trypsin molecules then could cleave other recombinantly produced trypsin enzymes at internal cleavage sites and render these enzymes inactive. The resulting mixture of recombinant trypsin peptides contained only a small percentage of intact active trypsin. In addition, activation of trypsin during expression or secretion can damage the cell membrane of the host. This can also contribute to low yields of recombinant trypsinogen or trypsin.
  • trypsinogen analogs described herein circumvent the premature activation problem because they are mutated at the activation site to prevent autoactivation or activation by endogenous trypsin-like enzymes.
  • classical activation of trypsinogen into trypsin by enterokinase or trypsin would not be possible.
  • Trypsin proteins having trypsin activity have been exceptionally good subjects for a molecular analysis of protein structure and function. Trypsin proteins have been isolated and characterized from numerous species including bovine, rat, and humans. Le Huerou et al. (1990) Isolation and nucleotide sequence of cDNA clone for bovine pancreatic anionic trypsinogen. Structural identity within the trypsin family, Eur. J. Biochem. 193, 767-773; Craik, C.S. et al. (1984) J. Biol. Chem. 259:14255-14264; Fletcher, T.S. et al. (1987) Biochemistry 26:3081-3086.
  • a protein having trypsin activity includes a large group of enzymes which are well-conserved between species and which function by cleaving the peptide bond on the carboxy-terminus of basic amino acid residues such as lysine and arginine.
  • Wild-type trypsinogens generally contain a hexapeptide leader sequence consisting of Val-Asp-Asp-Asp-Asp-Lys. Trypsin and endogenous trypsin-like enzymes generally cleave the C-terminal peptide bond of Arg and Lys.
  • the present invention provides trypsinogens that have modified leader sequences which maintain the molecule as an inactive zymogen that cannot be activated by trypsin-like enzymes.
  • leader sequence may be used in place of the native leader peptide to prevent activation by trypsin-like enzymes.
  • the number of amino acids in the N-terminal leader sequence can be as small as two amino acids which will maintain the enzyme in an inactive state. Amino acid chains longer than six are also possible. N-terminal leader sequences, however, longer than about 20 amino acids slow down the subsequent aminopeptidase activation process.
  • the leader sequence be between four and six amino acids in length to maintain inactivity and provide stability while at the same time allowing efficient subsequent activation by aminopeptidases.
  • an odd number of amino acid residues making up the leader sequence will not generally be processed cleanly by DAP.
  • the leader sequence constitute four or six amino acids if DAP is to be used as the activating aminopeptidase.
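The preference for an even-length leader when DAP is the activating enzyme follows from the enzyme removing residues two at a time. The Python sketch below is a toy model under stated assumptions: the function name is hypothetical, the enzyme is assumed to remove successive dipeptides from the amino-terminus and to stop at the leader/mature boundary, and the mature sequence is a four-residue placeholder. Real substrate preferences and stop signals of the enzyme are not modeled.

```python
# Illustrative toy model only: stepwise dipeptide removal from a leader.
def dap_trim(leader: str, mature: str) -> tuple[str, int]:
    """Remove dipeptides from the leader until fewer than two residues remain.

    Returns the resulting sequence and the number of leader residues left
    attached to the mature protein (0 means clean processing in this model).
    """
    leftover = leader.upper()
    while len(leftover) >= 2:
        leftover = leftover[2:]  # one dipeptide removed per step
    return leftover + mature.upper(), len(leftover)


# Six-residue leader: three dipeptides are removed and nothing is left over.
print(dap_trim("VDDDDD", "IVGG"))  # ('IVGG', 0)
# Five-residue leader: one leader residue remains on the product.
print(dap_trim("VDDDD", "IVGG"))   # ('DIVGG', 1)
```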
  • the present invention also contemplates leader sequences having a variable sequence of amino acids.
  • the leader sequence is preferably exposed to the solvent which enables it to be cleaved by DAP or other aminopeptidases.
  • a leader sequence consisting of amino acids which will facilitate exposure of the leader to the solvent and allow for subsequent removal by DAP is preferred.
  • a preferred leader sequence will contain an even number from two to twenty hydrophilic amino acids which cannot be cleaved by trypsin-like enzymes, but which can be removed by DAP or other aminopeptidases.
  • a more preferred leader sequence provides a single amino acid substitution in the native trypsinogen leader sequence wherein the first lysine encountered, for example, position 6 of the bovine trypsinogen zymogen, is substituted with any amino acid except Arg or Pro.
  • An even more preferred leader sequence comprises the sequence Val-Asp-Asp-Asp-Asp-Asp (amino acids 5 through 10 of SEQ ID NO:2) immediately N-terminal to the first amino acid of the mature active trypsin enzyme.
  • Wild-type trypsinogen genes can be obtained by a plurality of recombinant DNA techniques including, for example, hybridization, polymerase chain reaction (PCR) amplification, or de novo DNA synthesis. (See e.g., T. MANIATIS ET AL., MOLECULAR CLONING: A LABORATORY MANUAL (2d ed. 1989).) The isolated gene can then be modified or mutated to encode the trypsinogen analogs described above.
  • PCR polymerase chain reaction
  • the isolated nucleic acids of the present invention can be prepared by direct chemical synthesis by methods such as the phosphotriester method of Narang, et al., Meth. Enzymol. 68:90-99 (1979); the phosphodiester method of Brown, et al., Meth. Enzymol. 68:109-151 (1979); the diethylphosphoramidite method of Beaucage, et al., Tetra. Letts. 22:1859-1862 (1981); the solid phase phosphoramidite triester method described by Beaucage and Caruthers, Tetra.
  • the trypsinogen cDNA can be isolated from a library constructed from any tissue in which said gene is expressed.
  • Methods for constructing cDNA libraries in a suitable vector such as a plasmid or phage for propagation in prokaryotic or eukaryotic cells are well known to those skilled in the art. (See e.g., MANIATIS ET AL., supra). Suitable cloning vectors are well known and are widely available.
  • mRNA is isolated from a suitable tissue, and first strand cDNA synthesis is carried out.
  • a second round of DNA synthesis can be carried out for the production of the second strand.
  • the double-stranded cDNA can be cloned into any suitable vector, for example, a plasmid, thereby forming a cDNA library.
  • a variety of different cDNA libraries can be purchased commercially (Clontech Laboratories Inc., Palo Alto, California).
  • Oligonucleotide primers targeted to any suitable region of the trypsinogen gene can be used for PCR amplification. See e.g. PCR PROTOCOLS: A GUIDE TO METHOD AND APPLICATION (M. Innis et al. eds., 1990).
  • the PCR amplification comprises template DNA, suitable enzymes, primers, and buffers, and is conveniently carried out in a DNA Thermal Cycler (Perkin Elmer Cetus, Norwalk, CT). A positive result is determined by detecting an appropriately-sized DNA fragment following agarose gel electrophoresis.
  • An object of the present invention is to provide a process whereby the trypsinogen analogs described herein can be expressed by recombinant methods. Recombinant protein expression is preferred to obtain a high yield of highly pure protein, especially when the goal is to use these proteins in the manufacturing process for other therapeutic biomolecules.
  • the basic steps in the recombinant production of desired proteins are: a) construction of a synthetic or semi-synthetic DNA encoding the protein of interest; b) integrating said DNA into an expression vector in a manner suitable for the expression of the protein of interest, either alone or as a fusion protein; c) transforming an appropriate eukaryotic or prokaryotic host cell with said expression vector; d) culturing said transformed or transfected host cell in a manner to express the protein of interest; and e) recovering and purifying the recombinantly produced protein of interest.
  • the invention further provides trypsinogen-encoding nucleic acids. These are exemplified by the sequences shown in SEQ ID NO: 1 and SEQ ID NO: 3, and their complements. Also included in the inventive nucleic acids are those that are closely related to the sequences of SEQ ID NO: 1 and SEQ ID NO: 3, yet retain the ability to be activated to form mature trypsin. Generally, these sequences share at least about 65 percent identity with SEQ ID NOS. 1 and 2, but more typically share at least about 70 percent identity. More preferred embodiments share at least about 75% identity, and some share at least about 80% identity. Even more preferred nucleic acids share at least about 85% identity or at least about 90% identity. Most preferred nucleic acids share at least about 95% identity, with some sharing at least about 98 percent or 99 percent identity.
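The identity thresholds above are defined in the text with reference to the Blast 2 algorithm. The Python sketch below is not that algorithm: it is a much simpler position-by-position comparison of two sequences that are assumed to be already aligned and of equal length, shown only to make the arithmetic behind a percentage threshold concrete. The function names and example sequences are hypothetical.

```python
# Illustrative sketch only: naive identity over pre-aligned, equal-length
# sequences. This is NOT the Blast 2 calculation referenced in the text.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned positions at which the two sequences match."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(aligned_a.upper(), aligned_b.upper()))
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 65.0) -> bool:
    """True if the naive identity is at least the given percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Nine of ten positions match.
print(percent_identity("ATGGTTGATG", "ATGGTTGACG"))       # 90.0
print(meets_threshold("ATGGTTGATG", "ATGGTTGACG", 95.0))  # False
```

A real comparison would first align the sequences and handle gaps, which is what an alignment tool does before reporting identity.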
  • the present invention also relates to vectors that include isolated nucleic acid molecules of the present invention, host cells that are genetically engineered with the recombinant vectors, and the production of trypsinogen analog polypeptides or fragments thereof by recombinant techniques.
  • the nucleotides encoding trypsinogen analogs can optionally be joined to a vector containing a selectable marker for propagation in a host.
  • a plasmid vector is introduced in a precipitate, such as a calcium phosphate precipitate, or in a complex with a charged lipid, or by other methods that are well known to those with ordinary skill in the art.
  • the vector is a viral vector, it can be introduced directly into mammalian host cells or introduced using viral supernatant produced by packaging in vitro using an appropriate packaging cell line.
  • Bacterial viral vectors can also be packaged in vitro using packaging cell extracts commercially available and then transfected into host bacterial cells.
  • the DNA insert should be operatively linked to an appropriate promoter, such as the phage lambda PL promoter, the E. coli lac, trp and tac promoters, the SV40 early and late promoters and promoters of retroviral LTRs, as well as the glyceraldehyde phosphate dehydrogenase (GAPDH) and alcohol oxidase (AOX) promoters to name a few. Other suitable promoters will be known to the skilled artisan.
  • the expression constructs will further contain sites for transcription initiation, termination and, in the transcribed region, a ribosome binding site for translation.
  • the coding portion of the mature transcripts expressed by the constructs will preferably include a translation initiating codon at the beginning and a termination codon (e.g., UAA, UGA or UAG) appropriately positioned at the end of the mRNA to be translated.
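The requirement that a coding region begin with an initiation codon and end with a termination codon can be checked mechanically. The Python sketch below is illustrative only: the function name is hypothetical, AUG is assumed as the initiation codon (the text names only the termination codons UAA, UGA, and UAG), and DNA-style input with T is converted to U for convenience.

```python
# Illustrative sketch only: checks start codon, stop codon, and frame.
STOP_CODONS = {"UAA", "UGA", "UAG"}  # termination codons named in the text
START_CODON = "AUG"                  # assumption: standard initiation codon


def is_complete_coding_region(mrna: str) -> bool:
    """True if the sequence starts with AUG, ends with a stop codon, has a
    length that is a multiple of three, and has no earlier in-frame stop."""
    seq = mrna.upper().replace("T", "U")
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return (
        codons[0] == START_CODON
        and codons[-1] in STOP_CODONS
        and not any(codon in STOP_CODONS for codon in codons[1:-1])
    )


print(is_complete_coding_region("AUGGUUGAUUAA"))  # True
print(is_complete_coding_region("AUGUAAGAUUAA"))  # False, premature stop
```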
  • Expression vectors will preferably include at least one selectable marker.
  • markers include, e.g., dihydrofolate reductase or neomycin resistance for mammalian cell culture, neomycin resistance or complementation of auxotrophic markers for yeasts, and tetracycline, ampicillin, kanamycin, or chloramphenicol resistance genes for culturing in E. coli and other bacteria.
  • Representative examples of appropriate hosts include, but are not limited to, bacterial cells, such as E. coli, Streptomyces and Salmonella typhimurium cells; fungal cells such as Aspergillus niger; yeast cells such as Pichia pastoris and Saccharomyces cerevisiae; insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS and Bowes melanoma cells; and plant cells.
  • Appropriate culture mediums and conditions for the above-described host cells are known in the art.
  • Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9, available from Qiagen; pBS vectors, Phagescript vectors, Bluescript vectors, pNH8A, pNH16a, pNH18A, pNH46A, available from Stratagene; and ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia.
  • Preferred eucaryotic vectors include pWLNEO, pSV2CAT, pOG44, pXT1 and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available from Pharmacia.
  • Preferred vectors for expression in Pichia pastoris include pLDG vectors (figure 1) and pPIC vectors commercially available from Invitrogen, Inc. Other suitable vectors will be readily apparent to the skilled artisan.
  • Introduction of a vector construct into a host cell can be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, transformation or other methods. Such methods are described in many standard laboratory manuals, such as Sambrook, supra, Chapters 1-4 and 16-18; Ausubel, supra, Chapters 1, 9, 13, 15, 16.
  • Trypsinogen analogs of the present invention can be expressed in a modified form, such as a fusion protein, and can include not only secretion signals, but also additional heterologous functional regions. For instance, a region of additional amino acids can be added to the N-terminus of an analog to improve stability and persistence in the host cell, during purification, or during subsequent handling and storage. Also, peptide moieties can be added to facilitate purification. Such regions can be removed prior to final preparation of an active enzyme. Such methods are described in many standard laboratory manuals, such as Sambrook, supra, Chapters 17.29-17.42 and 18.1-18.74; Ausubel, supra, Chapters 16, 17 and 18.
  • nucleic acids of the present invention may express a protein of the present invention in a recombinantly engineered cell, such as bacteria, yeast, insect, or mammalian cells.
  • the cells produce the protein in a non-natural condition (e.g., in quantity, composition, location, and/or time), because they have been genetically altered through human intervention to do so.
  • the expression of isolated nucleic acids encoding a protein of the present invention will typically be achieved by operably linking, for example, the DNA or cDNA to a promoter (which is either constitutive or inducible), followed by incorporation into an expression vector.
  • the vectors can be suitable for replication and integration in either prokaryotes or eukaryotes.
  • Typical expression vectors contain transcription and translation terminators, initiation sequences and promoters useful for regulation of the expression of the DNA encoding a protein of the present invention.
  • expression vectors which contain, at the minimum, a strong promoter to direct transcription, a ribosome binding site for translational initiation, and a transcription/translation terminator.
  • modifications are well known to those of skill in the art and include, for example, a methionine added at the amino terminus to provide an initiation site, or additional amino acids (e.g., poly His) placed on either terminus to facilitate purification of the protein or other cleavages to create conveniently located restriction sites or termination codons.
  • nucleic acids of the present invention can be expressed in a host cell by turning on (by manipulation) in a host cell that contains endogenous DNA encoding a trypsinogen analog of the present invention.
  • Such methods are well known in the art, e.g., as described in US Patent Nos. 5,580,734, 5,641,670, 5,733,746, and 5,733,761, entirely incorporated herein by reference.
  • Prokaryotic cells may be used as hosts for expression. Prokaryotes most frequently are represented by various strains of E. coli; however, other microbial strains may also be used. Commonly used prokaryotic control sequences which are defined herein to include promoters for transcription initiation, optionally with an operator, along with ribosome binding site sequences, include such commonly used promoters as the beta lactamase (penicillinase) and lactose (lac) promoter systems (Chang, et al., Nature 198:1056 (1977)), the tryptophan (trp) promoter system (Goeddel, et al., Nucleic Acids Res.
  • the bacteriophage T7 promoter and RNA polymerase, and the bacteriophage lambda-derived PL promoter and N-gene ribosome binding site (Shimatake, et al., Nature 292:128 (1981)).
  • selection markers include genes specifying resistance to ampicillin, tetracycline, kanamycin, or chloramphenicol.
  • Bacterial vectors are typically of plasmid or phage origin. Appropriate bacterial cells are infected with phage vector particles or transfected with naked phage vector DNA. If a plasmid vector is used, the bacterial cells are transformed with the plasmid vector DNA. Expression systems for expressing a protein of the present invention are available using Bacillus sp. and Salmonella (Palva, et al., Gene 22:229-235 (1983); Mosbach, et al., Nature 302:543-545 (1983)).
  • eukaryotic expression systems such as yeast, insect cell lines, plant and mammalian cells, are known to those of skill in the art. As explained briefly below, a nucleic acid of the present invention can be expressed in these eukaryotic systems.
  • Synthesis of heterologous proteins in yeast is well known.
  • F. Sherman, et al., Methods in Yeast Genetics, Cold Spring Harbor Laboratory (1982) is a well-recognized work describing the various methods available to produce the protein in yeast.
  • Two widely utilized yeast for production of eukaryotic proteins are Saccharomyces cerevisiae and Pichia pastoris.
  • Vectors, strains, and protocols for expression in Saccharomyces and Pichia are known in the art and available from commercial suppliers (e.g., Invitrogen).
  • Suitable vectors usually have expression control sequences, such as promoters, including 3-phosphoglycerate kinase or alcohol oxidase, and an origin of replication, termination sequences and the like as desired.
  • sequences encoding proteins of the present invention can also be ligated to various expression vectors for use in transfecting cell cultures of, for instance, mammalian, insect, or plant origin.
  • Illustrative of cell cultures useful for the production of the peptides are mammalian cells. Mammalian cell systems often will be in the form of monolayers of cells although mammalian cell suspensions may also be used.
  • a number of suitable host cell lines capable of expressing intact proteins have been developed in the art, and include the HEK293, BHK21, and CHO cell lines.
  • Expression vectors for these cells can include expression control sequences, such as an origin of replication, a promoter (e.g., the CMV promoter, a HSV tk promoter or pgk (phosphoglycerate kinase) promoter), an enhancer (Queen, et al., Immunol. Rev. 89:49 (1986)), and processing information sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites (e.g., an SV40 large T Ag poly A addition site), and transcriptional terminator sequences.
  • Other animal cells useful for production of proteins of the present invention are available, for instance, from the American Type Culture Collection Catalogue of Cell Lines and Hybridomas (7th edition, 1992).
  • Appropriate vectors for expressing proteins of the present invention in insect cells are usually derived from the SF9 baculovirus.
  • suitable insect cell lines include mosquito larvae, silkworm, army worm, moth and Drosophila cell lines such as a Schneider cell line (See Schneider, J. Embryol. Exp. Morphol. 27:353-365 (1987)).
  • polyadenylation or transcription terminator sequences are typically incorporated into the vector.
  • An example of a terminator sequence is the polyadenylation sequence from the bovine growth hormone gene. Sequences for accurate splicing of the transcript may also be included.
  • An example of a splicing sequence is the VP1 intron from SV40 (Sprague, et al., J. Virol. 45:773-781 (1983)).
  • gene sequences to control replication in the host cell may be incorporated into the vector such as those found in bovine papilloma virus type- vectors. M.
  • Signal peptides may be used to facilitate the extracellular discharge of proteins in both prokaryotic and eukaryotic environments. It has been shown that the addition of a heterologous signal peptide to a normally cytosolic protein may result in the extracellular transport of the normally cytosolic protein. Alternate signal peptide sequences may function with heterologous coding sequences.
  • Signal peptides are well known in the art and can be incorporated into the modified trypsinogen structure to facilitate extracellular translocation or intracellular destination.
  • the signal peptide used is a signal peptide native to a secretory protein of the host cell line.
  • the signal peptide is the alpha factor signal sequence. This signal sequence is fused to the N-terminal end of the trypsinogen analog leader sequence and will result in the extracellular transport of trypsinogen analogs expressed in yeast.
  • HSA human serum albumin
  • Once an expression vector carrying the trypsinogen analog gene is transfected into a suitable host cell using standard methods, cells that contain the vector are propagated under conditions suitable for expression of the recombinant trypsinogen analog protein.
  • suitable growth conditions would incorporate the appropriate inducer.
  • the recombinantly-produced protein may be purified from cellular extracts of transformed cells by any suitable means.
  • Trypsinogen analogs of the present invention can be recovered and purified from recombinant cell cultures by well-known methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, reversed-phase chromatography, hydroxylapatite chromatography, and lectin chromatography. Most preferably, ultra-filtration is employed coupled with cation exchange chromatography.
  • the trypsinogen analog may be fused at its N-terminal end to several histidine residues.
  • This "histidine tag" enables a single-step protein purification method referred to as "immobilized metal ion affinity chromatography" (IMAC), essentially as described in U.S. Patent 4,569,794, which hereby is incorporated by reference.
  • trypsinogen analog which has been secreted into the culture medium is concentrated by ammonium sulfate precipitation (70%).
  • the precipitate is resuspended in 20 mM sodium acetate, pH 5.0 and dialyzed against the same buffer.
  • the trypsinogen analog can then be loaded on a CM-Sepharose CL6B Column (Pharmacia, Uppsala, Sweden) and eluted with a gradient of NaCl.
  • An alternative purification process from cell culture growth medium begins with centrifugation of the growth medium to pellet cells.
  • the pH of the supernatant is adjusted to about 3.0 preferably with an acetate buffer.
  • the adjusted supernatant is then subjected to ultra-filtration (tangential flow filtration with 300 kDa/10 kDa molecular weight cutoffs).
  • the filtrate can optionally be further purified using cation exchange chromatography.
  • the purified isolate then is processed using an aminopeptidase such as DAP or MAP.
  • the reaction is then concentrated again using ultrafiltration resulting in yields around 50%.
  • the present invention provides a process for activating the trypsinogen analog following expression of the trypsinogen in bacteria, yeast, or higher eukaryotic cells. Following digestion with an aminopeptidase such as a mono- or diaminopeptidase, an active trypsin protein with an N-terminus identical to that of wild type trypsin can be obtained. The conformation of trypsinogen will prevent aminopeptidases from cleaving beyond the first amino acid of the mature trypsin molecule. Numerous aminopeptidases exist and can be used to activate the trypsinogen analogs of the present invention. Watson et al. (1976) Methods Microb. 9:1-14 describe different aminopeptidases present in different bacteria including E. coli and is entirely herein incorporated by reference.
  • Before processing the trypsinogen analogs using an aminopeptidase, however, the trypsinogen may be purified from a cell lysate or, in the case of a secreted protein, from the culture medium. A variety of purification steps may be employed including ammonium sulfate precipitation followed by dialysis and column chromatography.
  • DAP dipeptidylaminopeptidase
  • Dipeptidylaminopeptidases are enzymes which hydrolyze the penultimate amino terminal peptide bond releasing dipeptides from the unblocked amino-termini of peptides and proteins.
  • There are currently four classes of dipeptidylaminopeptidases (designated DAP-I, DAP-II, DAP-III and DAP-IV) that differ based on their physical characteristics and the rates at which they react with their substrates.
  • DAP I is a relatively non-specific DAP that catalyzes the release of many dipeptide combinations from the unblocked amino termini of peptides and proteins.
  • DAP I shows little or no activity if the emergent dipeptide is Pro-X, or X-Pro (where X is any amino acid).
  • DAP II shows a preference for amino terminal dipeptide sequences that begin with Arg-X or Lys-X, and to a lesser extent, X-Pro.
  • DAP-II exhibits significantly lower reaction rates versus most other dipeptide combinations.
  • DAP III appears to have a propensity toward amino terminal dipeptide sequences of the form Arg- Arg and Lys-Lys.
  • DAP IV shows its highest rate of hydrolytic activity toward dipeptide sequences of the form X-Pro.
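The substrate preferences listed for the four DAP classes can be summarized as a small lookup, sketched below in Python. It encodes only the qualitative statements made above; the labels, the function name, and the "not stated" placeholder for combinations the text does not address are illustrative assumptions, not measured rates.

```python
def dap_preferences(dipeptide: str) -> dict:
    """Qualitative activity of each DAP class on an N-terminal dipeptide.

    `dipeptide` is two one-letter amino acid codes, e.g. "RK" for Arg-Lys.
    """
    if len(dipeptide) != 2 or not dipeptide.isalpha():
        raise ValueError("expected two one-letter amino acid codes")
    first, second = dipeptide.upper()
    prefs = {}

    # DAP I: broad specificity, but little or no activity on Pro-X or X-Pro.
    if "P" in (first, second):
        prefs["DAP-I"] = "little or no activity"
    else:
        prefs["DAP-I"] = "broad specificity (many combinations)"

    # DAP II: prefers Arg-X or Lys-X, then X-Pro; much slower on most others.
    if first in ("R", "K"):
        prefs["DAP-II"] = "preferred"
    elif second == "P":
        prefs["DAP-II"] = "lesser preference"
    else:
        prefs["DAP-II"] = "significantly lower rate"

    # DAP III: propensity toward Arg-Arg and Lys-Lys.
    prefs["DAP-III"] = "preferred" if first + second in ("RR", "KK") else "not stated"

    # DAP IV: highest rate on X-Pro.
    prefs["DAP-IV"] = "highest rate" if second == "P" else "not stated"
    return prefs
```

For instance, `dap_preferences("AP")` flags DAP-IV as the fastest class and DAP-I as essentially inactive, matching the description of X-Pro substrates.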
  • the DAP enzymes, particularly DAP-I and DAP-IV have been shown to be useful in processing proteins.
  • leader sequence present on any protein molecule could be removed by an aminopeptidase as long as the secondary structure of the protein was such that the peptidase could not cleave beyond the first critical residue of the desired end product protein.
  • the activation reaction can be carefully controlled to ensure that cleavage does not occur beyond the first amino acid of the mature protein.
  • the DAP reaction that converts inactive trypsinogen analogs into active trypsin is generally conducted in an aqueous medium suitably buffered to obtain and maintain a pH from about 2.0 to about 6.5.
  • the pH of the medium ranges from about 3.0 to about 4.5, and most preferably, from about 3.0 to about 3.5.
  • Using dDAP to remove dipeptides from trypsinogen is advantageous because dDAP's pH optimum of 3.5 allows the reaction to be run at acidic pH.
  • a solubilizing agent such as urea, sodium dodecylsulfate, guanidine, and the like, may be employed.
  • Another product of the invention is a highly purified preparation of trypsin.
  • Trypsin is defined below.
  • Because the present trypsin is produced in a recombinant system that results in secretion into the medium, it is anticipated that the resulting preparation should be highly purified and have negligible levels of contaminating chymotrypsin activity.
  • This highly purified trypsin meets at least one of the following criteria of purity. Generally such preparations are greater than about 90 percent pure. More typically, however, these preparations are more than about 95 percent pure and preferably they are at least about 99 percent pure, and generally at least about 99.5, 99.9 or 99.99 percent pure. Some preparations have no detectable contaminants.
  • Purity may also be evaluated by reverse phase high performance liquid chromatography method as presented generally below in the Examples. In such a case, percent purity is evaluated by comparing the integration of the trypsin peak to all peaks.
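The peak-integration calculation described above amounts to dividing the trypsin peak area by the total area of all peaks. A minimal Python sketch follows; the peak names and area values in the usage note are invented for illustration, and the function name is hypothetical.

```python
def percent_purity(peak_areas: dict, product_peak: str = "trypsin") -> float:
    """Percent purity = area of the product peak / sum of all peak areas x 100."""
    if product_peak not in peak_areas:
        raise KeyError(f"no peak named {product_peak!r}")
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * peak_areas[product_peak] / total
```

With hypothetical integrated areas of 990 for trypsin and 10 for everything else, the function returns 99 percent.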
  • the preparations of the invention generally show no chymotrypsin by this method.
  • Purity may also be as measured by specific activity using the TAME assay, detailed below. Reference pure material should yield about 235 U/mg.
  • preferred trypsin preparations are "essentially chymotrypsin-free."
  • "essentially chymotrypsin-free" denotes a preparation that has less than about 0.01% chymotrypsin (by weight, relative to trypsin). More preferred essentially chymotrypsin-free preparations generally have less than about 0.005%, and even more preferred preparations have less than about 0.001% chymotrypsin and most preferred preparations have less than about 0.0005% chymotrypsin.
  • Some highly purified preparations are expected to meet the foregoing criteria of purity when measured using the ultra-sensitive glucagon-based high performance liquid chromatography (HPLC) method presented below in Example 6. In such an assay, preferred compositions should show no detectable chymotryptic glucagon peaks.
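The "essentially chymotrypsin-free" tiers defined above can be expressed as a simple threshold lookup. The Python sketch below treats each "less than about" limit as a strict cutoff, which is a simplifying assumption; the function names and labels are illustrative only.

```python
# (upper limit in weight percent relative to trypsin, label), strictest first.
CHYMOTRYPSIN_TIERS = (
    (0.0005, "most preferred (<0.0005%)"),
    (0.001, "even more preferred (<0.001%)"),
    (0.005, "more preferred (<0.005%)"),
    (0.01, "essentially chymotrypsin-free (<0.01%)"),
)


def chymotrypsin_percent(chymotrypsin_mg: float, trypsin_mg: float) -> float:
    """Chymotrypsin content by weight, relative to trypsin, in percent."""
    if trypsin_mg <= 0:
        raise ValueError("trypsin mass must be positive")
    return 100.0 * chymotrypsin_mg / trypsin_mg


def chymotrypsin_tier(percent: float) -> str:
    """Return the strictest tier from the text that the preparation satisfies."""
    for limit, label in CHYMOTRYPSIN_TIERS:
        if percent < limit:
            return label
    return "not essentially chymotrypsin-free"
```

A preparation with 0.002 mg chymotrypsin per gram of trypsin (0.0002%) would fall in the most preferred tier under these assumptions.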
  • the trypsin preparations of the invention also are free of any cellular components derived from Aspergillus, or related organisms, such as proteins and toxins.
  • the trypsin preparations of the invention generally are essentially mammalian protein-free.
  • essentially mammalian protein-free compositions refer to any of the inventive compositions that are essentially free of all mammalian proteins, except, of course, for the recombinantly-produced product. In general, this is achieved by avoiding the addition of mammalian proteins, like casein (added to bacterial cultures), and producing the proteins in a non-mammalian host. Excluding the recombinant protein, typical compositions have less than about 1% mammalian protein, but usually have less than about 0.5% mammalian protein or less than about 0.1% mammalian protein.
  • compositions have less than about 0.01% mammalian protein and most preferred compositions have no detectable mammalian protein.
  • ELISAs: enzyme-linked immunosorbent assays
  • the essentially chymotrypsin-free trypsin of the invention may be assembled into "commercial units" that are suitable for sale.
  • Each commercial unit generally comprises a bulk quantity of essentially chymotrypsin-free trypsin.
  • a bulk quantity usually comprises at least about 10 mg of essentially chymotrypsin-free trypsin per unit. More typically, however, larger quantities will be present, such as at least about 50 mg per unit, at least about 100 mg per unit, at least about 500 mg per unit or at least about 1 gram per unit. Even larger quantities, e.g., a unit of at least about fifty grams, a unit of at least about a hundred grams, or a unit of at least about a kilogram, also are contemplated.
  • commercial unit also contemplates assemblages of smaller commercial units to form larger ones.
  • a one kilogram commercial unit of essentially chymotrypsin-free trypsin may be provided as a thousand one-gram commercial units.
  • a commercial unit will contain essentially chymotrypsin-free trypsin as a liquid solution. It may, however, be present in a solid form, such as a freeze-dried (lyophilized) powder. Bulking agents and stabilizers (like calcium) are optionally included.
  • a commercial unit also includes the packaging containing the essentially chymotrypsin-free trypsin, and optionally includes printed product specifications, an inventory control number and/or instructions for use.
  • Wild type bovine trypsinogen was mutated to destroy the trypsin cleavage site.
  • the Lys residue present in the leader sequence of the native bovine trypsinogen protein was mutated to an Asp residue (position 10 of SEQ ID NO:2).
  • At the DNA level the mutation generates an EcoRV site which is useful for subsequent modifications of different constructs. All constructs were first assembled using the vector pRMG5 (Fig. 1) and then transferred to Pichia pastoris expression vectors.
  • the expression vector pPIC9 supplied by Invitrogen was used to create a trypsinogen analog fused to the α-factor signal sequence.
  • the DNA encoding the fusion protein was cloned downstream of the methanol inducible AOX1 promoter.
  • the pPIC9 vector contains the AOX1 promoter cloned 5' to the α-factor signal sequence which in turn is cloned 5' to a multiple cloning sequence.
  • the vector was constructed such that DNA encoding a Glu-Ala-Glu-Ala (amino acids 1 through 4 of SEQ ID NO:2) peptide was inserted between the C-terminus of the alpha factor signal sequence and the N-terminus of the trypsinogen analog leader sequence to improve the yield of secreted protein.
  • oligonucleotide final concentrations were about 2.5 μg/μl.
  • each oligonucleotide was phosphorylated with 4 Units of T4 polynucleotide kinase for 1 hour at 37°C in a buffer containing 6 mM MgCl2, 6 mM DTT, 0.6 mM ATP and 120 mM Tris, pH 8.0.
  • the phosphorylated oligonucleotides were then pooled and annealed.
  • the resulting double stranded oligonucleotides were then cloned into pRMG5 to create pRMG5-αF(EA)2VD5.
  • the pRMG5 vector was prepared by digesting the vector with XhoI and NarI. After digestion, the vector was dephosphorylated with 0.1 U of calf intestinal phosphatase (CIP) at 37°C for 30 minutes. The vector was then ligated to the double stranded phosphorylated oligonucleotides. The resulting clone was used for subcloning the trypsinogen analog in pPIC9.
  • the pPIC9 expression vector was prepared in the dam minus E. coli GM82 strain and digested with AvrII (XbaI cohesive) and XhoI and then dephosphorylated with 0.1 U of CIP.
  • the trypsinogen analog was extracted from pRMG5-αF(EA)2VD5 as a XhoI/XbaI fragment and subcloned into pPIC9.
  • the resulting plasmid, pPIC9-αF(EA)2VD5, contained DNA encoding the alpha factor signal sequence (αF) fused to the Glu-Ala-Glu-Ala ((EA)2) insertion fused to the leader Val(Asp)5 (VD5) fused to bovine trypsin.
  • Trypsinogen was also fused directly to the C-terminus of the alpha factor without the (GluAla)2 insertion.
  • Two oligonucleotides, 5'-TC GAG AAA AGA GTC GAC GAT GAT GAC GAT ATC GTT GGA GTT TAT ACA TGT GG-3' and 5'-CGC CAC ATG TAT AAC CTC CAA CGA TAT CGT CAT CAT CGT CGA CTC TTT TC-3' encoding a C-terminal portion of the alpha factor signal sequence and the Val(Asp)5 leader sequence were synthesized, prepared as described above, and ligated into the XhoI and NarI sites of pRMG5.
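As a quick plausibility check on the first oligonucleotide, the Python sketch below translates it with the standard genetic code, using the reading frame implied by the codon spacing printed above (that is, skipping the two leading bases "TC"). The frame offset is an assumption taken from the printed spacing, the sequence is copied as printed without correction, and the helper name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# First oligonucleotide exactly as printed in the text (spaces removed).
OLIGO = "TC GAG AAA AGA GTC GAC GAT GAT GAC GAT ATC GTT GGA GTT TAT ACA TGT GG".replace(" ", "")


def translate_in_frame(dna: str, offset: int = 0) -> str:
    """Translate complete codons starting at `offset`; a trailing partial codon is ignored."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3))


peptide = translate_in_frame(OLIGO, offset=2)
print(peptide)                 # one-letter translation of the printed oligo
print("VDDDDD" in peptide)     # Val(Asp)5 leader encoded in this frame
print("GATATC" in OLIGO)       # EcoRV recognition sequence present
```

In this frame the translation contains a valine followed by five aspartates, consistent with the Val(Asp)5 leader, and the oligonucleotide carries the EcoRV recognition sequence GATATC that the text attributes to the Lys-to-Asp mutation.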
  • a positive clone with the correct sequence was then used to subclone the trypsinogen gene into pPIC9 as described above.
  • the resulting clone was named pPIC9-αFVD5 and contained DNA encoding the alpha factor signal sequence fused to the leader Val(Asp)5 fused to bovine trypsin.
  • the trypsinogen analog gene is under the control of the glyceraldehyde phosphate dehydrogenase (GAPDH) promoter and the HSA secretion signal sequence is used to direct the protein to the outside of the cell.
  • the cloning strategy involves first ligating a trypsinogen analog coding sequence into a pRMG5 construct and then further subcloning into the expression vector pLGD43.
  • Two different trypsinogen analog gene sequences were cloned into pLGD43. The two gene sequences encoded a trypsinogen analog with either four or six amino acids in the leader sequence wherein the lysine is replaced with aspartate (Val(Asp)3-trypsin or Val(Asp)5-trypsin, respectively).
  • the trypsinogen analog having the four amino acid leader sequence was cloned using two oligonucleotides, 5'-TC GAG GGT AAC CTT TAT TTC CCT TCT TTT TCT CTT TAG CTC GGC TTA TTC CAG GGG TGT GTT TCG TCG AGT CGA CGA CGA T-3' and 5'-ATC GTC GTC GAC TCG ACG AAA CAC ACC CCT GGA ATA AGC CGA GCT AAA GAG AAA AAG AAG GGA AAT AAA GGT TAC CC-3', encoding the HSA signal sequence and the four amino acid leader sequence.
  • the oligos were phosphorylated, annealed and ligated into the XhoI and EcoRV sites of the pRMG5-αF(EA)2VD5 construct described above to create pRMG5HSAVD3.
  • the resulting pRMG5 vector was then used for transferring the DNA sequence encoding the HSA signal sequence fused to the four amino acid leader sequence to pLGD43 as a BstEII/BamHI fragment to create pLGD43HSAVD3.
  • the trypsinogen analog having the six amino acid leader sequence was cloned by taking the trypsinogen analog gene cassette from pRMG5-αF(EA)2VD5 as a SalI/BamHI fragment into the vector pRMG5HSAVD3 also digested with SalI and BamHI.
  • the resulting vector was named pRMG5HSAVD5.
  • This modified trypsinogen was then subcloned from pRMG5HSAVD5 into pLGD43 as described above for the modified trypsinogen containing the four amino acid leader sequence.
  • the resulting vector was named pLGD43HSAVD5.
  • Pichia pastoris GS115 and SMD1163 protease minus strains were transfected using the spheroplast method or the electroporation method.
  • the expression vectors described in Example 1 were linearized with BglII for pPIC9 derivatives and NotI for pLGD43 derivatives to facilitate homologous recombination of an expression cassette on the Pichia chromosome.
  • This cassette consisted of a promoter (AOX1 or GAPDH) controlling the expression of a trypsinogen analog and the yeast HIS4 gene which was used as a selectable marker.
  • Yeast strains used for transformation have a mutated HIS4 gene and are unable to grow in medium lacking histidine.
  • Transfected cultures were plated on minimal medium and only those cells which had integrated the HIS4 gene in their chromosome were able to grow on minimal medium lacking histidine (HIS+ clones).
  • Clones having integrated the cassette in the AOX1 locus were then isolated. Homologous recombination directed to the AOX1 locus leads to the deletion of the AOX1 gene. Strains carrying a deletion of the AOX1 gene are still able to grow on methanol using AOX2 alcohol oxidase but the growth is much slower compared to cells with an intact AOX1 gene. Therefore, HIS+ transformants were screened for integration in the AOX1 locus by plating them on minimal medium containing methanol as the sole carbon source. Slow growing clones having integrated the expression cassette in the AOX1 locus were isolated and designated mut s (methanol utilization slow).
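The two-step screen described above (histidine prototrophy first, then growth rate on methanol) can be written as a small decision function. This Python sketch is only an illustration of the logic; the input encoding ("slow" or "normal") and the returned labels are hypothetical, and the label for normally growing HIS+ clones is an inference rather than a designation used in the text.

```python
def classify_transformant(grows_without_histidine: bool, methanol_growth: str) -> str:
    """Classify a clone from the HIS4 selection and the methanol growth screen.

    methanol_growth: "slow" or "normal" growth on minimal medium with methanol
    as the sole carbon source.
    """
    if not grows_without_histidine:
        # No functional HIS4 gene integrated; the clone is not selected.
        return "his- (not selected)"
    if methanol_growth == "slow":
        # AOX1 deleted by the integration; growth relies on AOX2.
        return "HIS+ mut s (cassette integrated at the AOX1 locus)"
    if methanol_growth == "normal":
        return "HIS+ with AOX1 apparently intact"
    raise ValueError("methanol_growth must be 'slow' or 'normal'")
```

Only clones in the "HIS+ mut s" category correspond to the slow-growing isolates that the text carries forward.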
  • HIS+/mut s cells, which were originally transfected with the expression vectors containing trypsinogen analogs driven by the AOX1 promoter, were first grown in glycerol to generate biomass and then transferred to medium containing methanol for induction. Clones to be screened were stored as 1 ml glycerol stocks at -80°C.
  • BMGY: 100 mM phosphate buffer pH 6.0, 1.34% bacto yeast nitrogen base (YNB), 10% yeast extract, 20% bactopeptone, 1% glycerol.
  • BMMY: 100 mM phosphate buffer pH 6.0, 1.34% YNB, 10% yeast extract, 20% bactopeptone, 0.5% methanol.
  • 125 mls of pure methanol was added to keep the level of methanol constant in the culture medium. The cultures were centrifuged and the supernatant was isolated to determine the concentration of trypsinogen analog.
  • the GAPDH promoter is induced when glucose becomes limited in the culture medium.
  • the induction of expression of trypsinogen analog from the GAPDH promoter was carried out essentially as described for the AOX1 promoter.
  • Subconfluent cultures of HIS+/mut cells which were originally transfected with expression vectors containing a trypsinogen analog driven by the GAPDH promoter, were grown in BMGlcY (100 mM phosphate buffer pH 6, 1.34% YNB, 10% yeast extract, 20% bactopeptone, 5% glucose). Cells were harvested and transferred to BMGY medium depleted in glucose (100 mM phosphate buffer pH 6, 1.34% YNB, 10% yeast extract, 20% bactopeptone). The supernatant was harvested 48 hours after induction.
  • Example 4 Selection and characterization of clones expressing trypsinogen analogs: The trypsinogen present in the supernatant of Pichia pastoris was first concentrated by ammonium sulfate precipitation (70%). The precipitate was resuspended in 20 mls of 20 mM sodium acetate pH 5 and dialyzed overnight against the same buffer at 4°C. The dialyzed solution was then loaded on CM-Sepharose CL6B (Pharmacia, Uppsala, Sweden) equilibrated with 20 mM acetate buffer pH 5. Trypsinogen analog was eluted with a gradient of NaCl (0-50 mM). Fractions were analyzed by SDS-PAGE and fractions containing the trypsinogen analogs were pooled and concentrated.
  • the purified trypsinogen was then activated with 500 U dDAP per gram of trypsinogen analog in 50 mM acetate buffer pH 3.0 and the resulting trypsin activity was determined using the Tosyl-Arg methyl esterase (TAME) assay as described in B.C.W. Hummel, 37 Can. J. Biochem. Physiol., 1393 (1959).
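The activation conditions above scale linearly with the amount of trypsinogen analog, so the enzyme requirement is straightforward to compute. The Python sketch below uses the 500 U per gram ratio from this step and the pH windows stated earlier for the DAP reaction; treating the "about" limits as exact bounds, and the function names, are assumptions for illustration.

```python
DDAP_UNITS_PER_GRAM = 500.0  # dDAP dose stated in the text


def ddap_units_required(trypsinogen_mg: float) -> float:
    """Units of dDAP needed to activate the given mass of trypsinogen analog."""
    if trypsinogen_mg < 0:
        raise ValueError("mass must be non-negative")
    return DDAP_UNITS_PER_GRAM * trypsinogen_mg / 1000.0


def reaction_ph_window(ph: float) -> str:
    """Place a reaction pH in the narrowest window named for the DAP reaction."""
    if 3.0 <= ph <= 3.5:
        return "most preferred (about 3.0 to 3.5)"
    if 3.0 <= ph <= 4.5:
        return "preferred (about 3.0 to 4.5)"
    if 2.0 <= ph <= 6.5:
        return "general (about 2.0 to 6.5)"
    return "outside stated range"
```

For example, 250 mg of purified trypsinogen analog would call for 125 U of dDAP, and the pH 3.0 acetate buffer used here falls in the most preferred window.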
  • FIG. 3 summarizes the production of ValAsp5 trypsinogen in the two expression systems described in the preceding Examples.
  • N-terminal sequencing was carried out on purified trypsinogen analog expressed from two different constructs.
  • the 25 kDa and 30 kDa bands observed on SDS gels were sequenced which confirmed that the two proteins were trypsinogen.
  • Both the 30 kDa and the 25 kDa proteins were found with an N-terminus corresponding to modified trypsinogen.
  • the trypsinogen produced by Pichia pastoris is secreted into the culture medium. Both the alpha factor and the HSA signal sequence are correctly processed during secretion, but as mentioned above the GluAlaGluAla sequence inserted between the signal sequence and the trypsinogen is only partially removed.
  • Trypsinogen analog was treated with dDAP and trypsin activity followed with time. Trypsinogen purified from Pichia culture supernatant was activatable into trypsin with dDAP. In the absence of dDAP, no trypsin activity was detected. Following incubation with dDAP, the trypsin activity increases and reaches a plateau once all activatable trypsinogen has been digested with dDAP. Animal sourced trypsinogen purchased from Sigma Chemical Company was also activated into trypsin by dDAP but the trypsin activity started to decrease after a few hours. The Sigma trypsin produced by dDAP activation at pH 3.0 was much less stable than the trypsin produced from trypsinogen analog expressed in Pichia.
  • N-terminal sequencing was done on the trypsin produced following dDAP digestion, and the N-terminus was found to be identical to wild type bovine trypsin.
  • the dDAP removes only amino acids up to Ile 11 (SEQ ID NO:2) and does not digest further.
  • This example sets out methods useful in assessing the quality of the inventive products, especially the essentially chymotrypsin-free products.
  • the purity of r-trypsin is measured by SDS-PAGE under non-reducing conditions, followed by densitometric analysis of the bands.
  • the potential impurities that can be measured are small fragments of trypsin or yeast polypeptides.
  • the method is designed to measure contamination of the r-trypsin by yeast proteins from the fermentation. The method also detects trypsin, two chain trypsin, and two other proteolysis products of trypsin. These variants of trypsin are counted as product in the calculation of purity. All bands included as trypsin product have been identified as bovine trypsin sequences as measured by direct sequence analysis of bands from control sample of bovine trypsin eluted from gels. Bands found at all other molecular weights are considered impurities.
  • the activity of trypsin can be measured by following the degradation of a synthetic substrate (TAME or Tosyl-Arg-Methyl-Ester) as measured by a change in absorbance at 247 nm over time. This method is essentially as described by Hummel, Can. J. Biochem. Physiol. 37: 1393 (1959).
  • the specific activity (expressed in Units/mg) is calculated as the ratio between the activity (expressed in Units per ml) and the protein concentration (expressed in mg/ml).
  • high quality trypsin will have a specific activity of >190U/mg.
  • the specific activity should exceed 210U/mg, with superior preparations exceeding 220 U/mg.
  • the specificity of a lot of r-trypsin is determined by performing a complete tryptic digest of a reference standard of glandular glucagon at room temperature using a ratio of 1 mg of trypsin per 100 mg of substrate over a period of 2 hours, as described below. This exposure-ratio combination is equivalent to the reaction conditions in most processes using trypsin. The reaction is monitored by reversed-phase HPLC, as described below. The results are reported as the ratio of the sum of the area for the tryptic fragments obtained with a sample versus that obtained with a standard (% specificity). This assay was designed to measure any degradation of the substrate resulting from contaminating proteases (particularly chymotrypsin) which could have co-purified with r-trypsin throughout the purification process.
  • Glucagon contains several chymotryptic sites and the assay can detect very low levels of chymotryptic contamination of trypsin (>0.01% by comparing the area of chymotryptic peaks to the area of tryptic peaks).
  • Example 6 HPLC Method for Detecting Trypsin versus Chymotrypsin This assay is useful in measuring the purity of trypsin and simultaneously measuring chymotryptic activity.
  • Glucagon Substrate Preparation One vial of glucagon reference standard is needed to assay two trypsin samples in triplicate. To each vial of glucagon reference standard, add 0.5 mL of 0.001 M HCl. Mix gently, and transfer the solution to a 15 mL polypropylene tube. Add 5 mL of the 50 mM borate buffer. Add 28 μL of the 1 M CaCl2 stock solution and mix. Measure the pH and adjust to 8.0 +/- 0.1 with 1 N NaOH if necessary. Hold the solution on ice until needed. The same substrate solution must be used for analysis of both the bovine trypsin (bTrp) control and the recombinant trypsin (rTrp) samples.
  • bTrp bovine trypsin
  • rTrp recombinant trypsin
  • Glucagon Standard Preparation Aliquot 250 μL of the glucagon solution prepared in step a) into a 1.5 mL microfuge tube. Add 750 μL of the 50 mM borate buffer and mix. Add 50 μL of the 5 N HCl and mix. Hold on ice until needed.
  • bTrp Standard Preparation Weigh out approximately 1 mg of the Sigma bovine trypsin. Dissolve in 1 mL of 0.05 M HOAc. Determine the concentration of the bTrp solution by measuring the A280 on the Spectrophotometer. See Data Analysis section for calculations. Dilute the solution to 0.5 mg/mL with 0.05 M HOAc. Hold on ice until needed.
  • rTrp Sample Preparation Determine the concentration of the rTrp sample by measuring the A280 on the Spectrophotometer. See Data Analysis for calculations. Dilute the solution to 0.5 mg/mL with 0.05 M HOAc. Hold on ice until needed.
  • Enzyme reaction For each sample, transfer a 1 mL aliquot of the glucagon reference standard substrate solution into a 1.5 mL Eppendorf tube. Add 10 μL of the 0.5 mg/mL bTrp standard solution or rTrp sample solution and vortex. Place each tube in a 25°C water bath and incubate for 2 hours. After 2 hours, quench the reaction by adding 50 μL of 5 N HCl. Samples must be held between 4 and 9°C prior to analysis or precipitation will occur. Samples should be analyzed within 12 hours after quenching. The bTrp standard must be analyzed using the same glucagon substrate solution and in the same HPLC sequence as the rTrp samples.
  • Needle Wash 1000 μL after each injection. See Materials #8, above, for wash solution makeup.
  • a typical HPLC run of trypsin-digested material yields 4 tryptic peaks at retention time of about 6.1 (3), 9.1 (2), 17.4 (1) and 18.7 (4) minutes (parentheticals indicate the ranking of the peaks in order of area, largest to smallest).
  • The same protocol used with chymotrypsin yielded 5 peaks having retention times of about 8.0 (3), 8.1 (2), 8.7 (5), 13.5 (1) and 13.6 (4) minutes (parentheticals indicate the ranking of the peaks in order of area, largest to smallest).
  • all chymotryptic and tryptic peaks resulting from glucagon digestion are clearly resolvable.
  • Standard trypsin samples were spiked with 1%, 0.1% and 0.01% (by weight) of chymotrypsin (Sigma) and the reaction resolved by the same HPLC method. Peak resolution was obtained at all spiking levels, demonstrating that the detection limit of the assay is well below 0.01% by weight.
  • the sensitivity may be enhanced even further.
  • the inherent limit on sensitivity is interference with the chymotryptic peaks by the tryptic peaks. Sensitivity may be increased, it is believed, to levels of less than 0.005%, 0.001% and even 0.0005%, merely by increasing the amount of material injected into the HPLC system.
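The specific-activity and % specificity calculations described in the quality assessment above can be sketched as follows. This is a minimal illustration, not the assay software used by the inventors: the function names, the grade labels, and the numbers in the usage note are hypothetical, and only the thresholds (190, 210 and 220 U/mg) and the two ratios come from the text.

```python
def specific_activity(activity_u_per_ml: float, protein_mg_per_ml: float) -> float:
    """Specific activity (U/mg) = activity (U/ml) / protein concentration (mg/ml)."""
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return activity_u_per_ml / protein_mg_per_ml


def activity_grade(u_per_mg: float) -> str:
    """Compare a specific activity with the thresholds quoted in the text.

    The label strings are hypothetical; only the cut-offs are from the source.
    """
    if u_per_mg > 220:
        return "superior"
    if u_per_mg > 210:
        return "preferred"
    if u_per_mg > 190:
        return "high quality"
    return "below threshold"


def percent_specificity(sample_tryptic_areas, standard_tryptic_areas) -> float:
    """Sum of tryptic peak areas for the sample as a percentage of the standard."""
    standard_total = sum(standard_tryptic_areas)
    if standard_total <= 0:
        raise ValueError("standard peak areas must sum to a positive value")
    return 100.0 * sum(sample_tryptic_areas) / standard_total
```

For instance, a hypothetical lot measured at 110 U/ml and 0.5 mg/ml gives `specific_activity(110.0, 0.5)`, which is then passed to `activity_grade`. The peak areas would come from the four tryptic peaks of the HPLC run.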

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Biomedical Technology (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Abstract

The present invention relates to recombinantly produced trypsin and trypsinogen analogs, including nucleic acids which encode trypsin and trypsinogen analogs. The trypsinogen analogs of the present invention contain modifications of the trypsinogen leader sequence such that the trypsinogen cannot be cleaved by trypsin or trypsin-like enzymes. The present invention also includes processes for producing recombinant trypsinogen analogs which can subsequently be activated.

Description

PRODUCTION OF SOLUBLE RECOMBINANT TRYPSINOGEN ANALOGS
Field of the Invention
The invention relates generally to recombinant DNA technology. More specifically, the present invention relates to recombinant trypsinogen analogs as well as methods for producing recombinant trypsin.
Background of the Invention
Trypsin is a widely used serine protease which cleaves the peptide bond on the carboxy-terminus of basic amino acid residues such as lysine and arginine. In animals, trypsin plays a pivotal role among pancreatic enzymes in the activation of endopeptidases. These pancreatic enzymes are secreted through the pancreatic duct into the duodenum of the small intestine in response to a hormone signal generated when food passes from the stomach. They are not, however, synthesized in their final active form. Rather, they are made as slightly longer catalytically inactive molecules called zymogens. The names given to some of these zymogens include trypsinogen, chymotrypsinogen, proelastase, and procarboxypeptidase. These zymogens must themselves be cleaved proteolytically to yield active enzymes.
The first step of the activation cascade is the activation of trypsin from trypsinogen in the duodenum. Enteropeptidase (also known as enterokinase) is a protease produced by duodenal epithelial cells which activates pancreatic trypsinogen to trypsin by excising a hexapeptide leader sequence from the amino-terminus of trypsinogen. Trypsin in turn autocatalytically activates more trypsinogen to trypsin and also acts on other proenzymes, thus, for example, liberating the endopeptidases chymotrypsin and elastase as well as carboxypeptidases A and B.
This battery of enzymes works together with pepsin produced in the stomach and other proteases secreted by the intestinal wall cells to digest most ingested proteins into free amino acids, which can be absorbed by the intestinal epithelium. The enzymes themselves are continually subjected to autodigestion and other degradative processes so that high levels of these enzymes never accumulate in the intestine. In the pancreas, several factors oppose trypsinogen autoactivation, whereas in the duodenum, all the conditions favorable for trypsinogen activation by enteropeptidase are present.
To activate trypsin from its inactive precursor, a hexapeptide leader sequence, which consists of Valine-Aspartate-Aspartate-Aspartate-Aspartate-Lysine ((Asp)4-Lys), on the amino-terminus of trypsinogen is enzymatically removed. Trypsinogens of many different species have been cloned and characterized. The pattern of (Asp)4-Lys at the amino-terminus, however, is well-conserved in all of these precursors. Mireille Rovery, Limited Proteolyses in Pancreatic Chymotrypsinogens and Trypsinogens, 70 Biochimie 1131 (1988).
Serine proteases such as trypsin have a variety of uses. They are useful for the characterization of other proteins as well as in the manufacturing process of other recombinant bioproducts. For example, small recombinant proteins are often expressed first as fusion proteins to facilitate their purification and enhance their stability. The fusion proteins can be engineered such that a leader sequence can be cleaved from the native protein sequence by trypsin. Any internal lysines or arginines that are not part of the leader sequence can be chemically protected from cleavage by trypsin.
Trypsinogens from various species have been isolated and characterized. Craik, C.S. et al. (1984) J. Biol. Chem. 259:14255-14264; Fletcher, T.S. et al. (1987) Biochemistry 26:3081-3086. However, bovine trypsin, isolated from bovine pancreas, is now largely used in research laboratories and is the trypsin of choice for protein processing in the pharmaceutical industry. Even after extensive purification of animal-sourced trypsin, however, there are contaminating activities in most preparations that can have undesirable consequences for both experimental research and pharmaceutical therapeutic protein processing. For example, the emergence of diseases such as bovine spongiform encephalopathy (BSE) has raised concerns about the use of enzymes of animal origin in industrial processes. In addition, strict guidelines and regulations issued by the Food and Drug Administration as well as other national and international regulatory bodies have led to a need for pure trypsin of recombinant origin. The present invention addresses this need by providing trypsinogen analogs which can be recombinantly expressed and purified in large quantities. These analogs can be stably produced in a variety of expression systems and subsequently activated to provide pure trypsin for use in both experimental research and industrial therapeutic protein processing. Attempts to efficiently manufacture large quantities of recombinant trypsin which can then be used in the manufacture of protein pharmaceuticals have been problematic. Problems have stemmed from such things as the instability of mature trypsin in expression systems, the activation of trypsinogen during expression by endogenous host cell enzymes and subsequent damage to cell membranes, the low solubility of expressed protein in bacterial host cell systems, and improper folding of the protein in various host cell systems.
It has been a goal of researchers to provide an efficient and inexpensive means to produce recombinant trypsin which can then be used to safely manufacture other protein therapeutics. Thus, the present invention also provides an efficient and relatively inexpensive process to manufacture recombinant trypsin.
One type of bacterial expression system has been developed for rat anionic trypsin. Vasquez, J.R. et al., An Expression System for Trypsin, J. of Cell. Bioch., 39:265-276 (1989). In this system the rat trypsin hexapeptide leader sequence is replaced with the phoA signal peptide which directs the secretion of trypsin to the periplasmic space of E. coli. The signal peptide is removed from the fusion protein during secretion into the periplasmic space. Thus, unlike the present invention, active trypsin is secreted rather than an inactive zymogen. This can be problematic in that active trypsin is less stable than the inactive precursor and can result in toxicity to host cells and/or damage to cell membranes during the secretion process.
The process of the present invention is significantly different from past approaches by providing for the expression of an inactive zymogen form that is soluble and properly folded yet is not activated until after purification from fermentation broth or cell extracts. This is accomplished through the expression of a single chain trypsinogen analog wherein the leader sequence is modified such that it lacks a trypsin-like enzyme cleavage site. Specifically, the trypsinogen analogs of the present invention lack a lysine or arginine in the N-terminal leader sequence of the protein to prevent auto-activation or activation by endogenous host cell enzymes. Once expressed in a particular cell type, the trypsinogen analog can then be secreted outside of the cell and isolated from the culture medium or alternatively expressed in the cytoplasm. If the trypsinogen is expressed in the cytoplasm and not secreted, the cells can be collected by centrifugation, lysed, and the trypsinogen isolated. Once the trypsinogen is isolated, stability can be enhanced by lowering the pH to about 3.0 and then subsequently activating the trypsinogen with an aminopeptidase such as dipeptidylaminopeptidase (DAP).
The present invention provides trypsinogen analogs which have a modified amino-terminal leader sequence which results in recombinant trypsinogen that is stable and easy to activate. This invention provides a means to move away from animal-sourced trypsin and avoid the problems of degradation, instability, and damage to cell membranes which occur during expression and/or secretion of recombinantly produced trypsinogen. In addition, the invention makes possible the secretion of a folded trypsinogen, therefore eliminating costly and time-consuming refolding steps.
Summary of the Invention
The present invention provides a trypsinogen analog comprising a protein having trypsin activity fused to a leader sequence having at least 2 amino acids wherein the amino acids are any amino acid except Lys or Arg. In another embodiment, the invention provides a trypsinogen analog represented by X-AA-Y, wherein Y is a protein having trypsin activity, AA is an amino acid other than Lys or Arg and X-AA is a leader sequence having at least 2 amino acids. The invention further provides analogs that can be activated by cleavage with an aminopeptidase. The mature trypsin protein, which is the active enzyme, is derived from a single polypeptide which comprises a leader sequence fused to a polypeptide representing the active enzyme. The leader sequence must be cleaved to generate the active trypsin molecule. Still further, the invention provides an isolated trypsinogen analog represented by X-AA-Y, wherein: Y is a protein having trypsin activity; AA is any amino acid except Lys or Arg; and X-AA is a leader sequence having at least 2 amino acids.
The invention encompasses leader sequences fused to trypsin which cannot be cleaved by trypsin, enteropeptidases, or any other endogenous trypsin-like enzymes that may be present during expression, secretion, or isolation of the protein. Preferably, the leader sequence of the trypsinogen analog is selected from the group consisting of: a) from 2 to 20 non-basic amino acid residues; b) from 2 to 20 amino acid residues wherein the amino acid residues can be any residue except Lys or Arg; c) Val-Asp-Asp-Asp-Asp-N, wherein N is any amino acid residue except Lys, Arg, or Pro; d) amino acid residues 5 through 10 of SEQ ID NO:2; e) from 2 to 20 amino acid residues wherein the residues can be any hydrophilic amino acid residue that is not Lys or Arg; f) amino acid residues 5 through 8 of SEQ ID NO:2; and g) from 2 to 20 amino acid residues, wherein the number of residues is an even number of residues and the residues are any non-basic amino acid except proline.
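Criteria (b) and (g) of the group above are simple enough to check mechanically. The following is a minimal sketch under stated assumptions, not part of the disclosed method: the function names are hypothetical, leaders are written in one-letter amino acid codes, "non-basic" is assumed to exclude His as well as Lys and Arg, and the input is not checked against the amino acid alphabet.

```python
# Assumption: "basic" residues are Lys (K), Arg (R) and His (H).
BASIC = set("KRH")


def leader_meets_option_b(leader: str) -> bool:
    """Option (b): 2 to 20 residues, any residue except Lys or Arg."""
    leader = leader.upper()
    return 2 <= len(leader) <= 20 and not (set(leader) & set("KR"))


def leader_meets_option_g(leader: str) -> bool:
    """Option (g): 2 to 20 residues, even count, non-basic, no proline."""
    leader = leader.upper()
    return (
        2 <= len(leader) <= 20
        and len(leader) % 2 == 0
        and not (set(leader) & BASIC)
        and "P" not in leader
    )
```

With these definitions, `VDDDDD` (Val-Asp-Asp-Asp-Asp-Asp) satisfies both checks, whereas the native leader `VDDDDK` fails both because of its terminal lysine.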
More preferably, this invention provides an inactive trypsin precursor comprising the sequence which is SEQ ID NO:2.
The invention also provides proteins having the amino acid sequences of SEQ ID NO:2 and SEQ ID NO:4, including derivatives without the first three amino acids of SEQ ID NO:2 and the first two amino acids of SEQ ID NO:4.
The invention further provides a trypsinogen analog having a secretion signal sequence of amino acids operably linked to the amino-terminal end of the leader sequence wherein said signal sequence allows the trypsinogen to be secreted into the host cell growth medium.
The invention encompasses both RNA and DNA which encode the trypsinogen analog proteins of the present invention. Preferred polynucleotides of the current invention include the nucleic acids of SEQ ID NO:1 and SEQ ID NO:3 and variants thereof. This invention also provides recombinant vectors comprising the above-described nucleic acids as well as host cells harboring said recombinant vectors.
The invention further provides essentially chymotrypsin-free trypsin preparations.
The invention also provides nucleic acids sharing at least about 65 % identity with the nucleic acids of SEQ ID NO: 1 and SEQ ID NO: 3.
Finally, the invention includes methods for making recombinant trypsinogen analogs comprising the steps of: expressing the trypsinogen analogs described above in a host cell and isolating the expressed trypsinogen. Methods are also provided for making recombinant trypsin further comprising the step of activating the isolated trypsinogen with an aminopeptidase. Specifically, methods are provided comprising the steps of: expressing trypsinogen analogs in Pichia pastoris; isolating the trypsinogen analogs in a buffer; and incubating the isolated trypsinogen with a dipeptidylaminopeptidase in a buffer with a pH from about 2.0 to about 6.5 such that the trypsinogen is converted to active trypsin.
Brief Description of the Drawings
Figure 1 is a restriction map of the plasmid vector pRMG5 containing the bovine trypsin gene.
Figure 2 is a restriction map of the plasmid vector pLGD43 containing the bovine trypsin gene.
Figure 3 summarizes the production of ValAsp5 trypsinogen in the two expression systems described herein.
Detailed Description of the Invention
The terms and abbreviations used in this document have their normal meanings unless otherwise designated. For example, "°C" refers to degrees Celsius; "mmol" refers to millimole or millimoles; "mg" refers to milligrams; "ml" refers to milliliters; "μg" refers to micrograms; and "μl" refers to microliters.
Amino acids abbreviations are as set forth in 37 C.F.R. § 1.822 (b)(2) (1994).
All nucleic acid sequences, unless otherwise designated, are written in the direction from the 5' end to the 3' end, frequently referred to as "5' to 3'".
All amino acid or protein sequences, unless otherwise designated, are written commencing with the amino terminus ("N-terminus") and concluding with the carboxy terminus ("C-terminus").
"Base pair" or "bp" as used herein refers to DNA or RNA. The abbreviations A, C, G, and T correspond to the 5'-monophosphate forms of the deoxyribonucleosides (deoxy)adenosine, (deoxy)cytidine, (deoxy)guanosine, and thymidine, respectively, when they occur in DNA molecules. The abbreviations U, C, G, and A correspond to the 5'-monophosphate forms of the ribonucleosides uridine, cytidine, guanosine, and adenosine, respectively, when they occur in RNA molecules. In double stranded DNA, base pair may refer to a partnership of A with T or C with G. In a DNA/RNA heteroduplex, base pair may refer to a partnership of A with U or C with G. (See the definition of "complementary", infra.)
The terms "digestion" or "restriction" of DNA refer to the catalytic cleavage of the DNA with a restriction enzyme that acts only at certain sequences in the DNA ("sequence-specific endonucleases"). The various restriction enzymes used herein are commercially available and their reaction conditions, cofactors, and other requirements were used as would be known to one of ordinary skill in the art. Appropriate buffers and substrate amounts for particular restriction enzymes are specified by the manufacturer or can be readily found in the literature.
"Ligation" refers to the process of forming phosphodiester bonds between two double stranded nucleic acid fragments. Unless otherwise provided, ligation may be accomplished using known buffers and conditions with a DNA ligase, such as T4 DNA ligase.
The term "plasmid" refers to an extrachromosomal (usually) self-replicating genetic element. Plasmids are generally designated by a lower case "p" followed by letters and/or numbers. The starting plasmids herein are either commercially available, publicly available on an unrestricted basis, or can be constructed from available plasmids in accordance with published procedures. In addition, equivalent plasmids to those described are known in the art and will be apparent to the ordinarily skilled artisan.
The term "reading frame" means the nucleotide sequence from which translation occurs, "read" in triplets by the translational apparatus of transfer RNA (tRNA) and ribosomes and associated factors, each triplet corresponding to a particular amino acid. To ensure against improper translation, the triplet codons corresponding to the desired polypeptide must be aligned in multiples of three from the initiation codon, i.e. the correct "reading frame" being maintained.
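The effect of a frame shift can be seen in a short sketch. The helper name `codons` and the nine-base example sequence are hypothetical illustrations and are not taken from SEQ ID NO:1 or SEQ ID NO:3.

```python
def codons(seq: str, start: int = 0) -> list[str]:
    """Split a nucleotide sequence into complete triplets, reading from `start`.

    Trailing bases that do not fill a whole triplet are dropped.
    """
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
```

Read in frame, `codons("ATGGTTGAT")` gives the triplets ATG, GTT and GAT, whereas starting one base later with `codons("ATGGTTGAT", 1)` gives TGG and TTG, a different set of codons from the same DNA.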
"Recombinant DNA cloning vector" as used herein refers to any autonomously replicating agent, including, but not limited to, plasmids and phages, comprising a DNA molecule to which one or more additional DNA segments can or have been added.
The term "recombinant DNA expression vector" as used herein refers to any recombinant DNA cloning vector in which a promoter to control transcription of the inserted DNA has been incorporated.
The term "expression vector system" as used herein refers to a recombinant DNA expression vector in combination with one or more trans-acting factors that specifically influence transcription, stability, or replication of the recombinant DNA expression vector. The trans-acting factor may be expressed from a co-transfected plasmid, virus, or other extrachromosomal element, or may be expressed from a gene integrated within the chromosome.
"Transcription" as used herein refers to the process whereby information contained in a nucleotide sequence of DNA is transferred to a complementary RNA sequence.
The term "transfection" as used herein refers to the taking up of an expression vector by a host cell whether or not any coding sequences are in fact expressed. Numerous methods of transfection are known to the ordinarily skilled artisan, for example, calcium phosphate co-precipitation, and electroporation. Successful transfection is generally recognized when any indication of the operation of this vector occurs within the host cell.
The term "transformation" as used herein means the introduction of DNA into an organism so that the DNA is replicable, either as an extrachromosomal element or by chromosomal integration. Methods of transforming bacterial and eukaryotic hosts are well known in the art, many of which methods, such as nuclear injection, protoplast fusion or calcium treatment using calcium chloride, are summarized in J. Sambrook, et al., Molecular Cloning: A Laboratory Manual, (1989). Generally, when introducing DNA into yeast the term transformation is used as opposed to the term transfection.
The term "translation" as used herein refers to the process whereby the genetic information of messenger RNA is used to specify and direct the synthesis of a polypeptide chain.
The term "vector" as used herein refers to a nucleic acid compound used for the transfection and/or transformation of cells in gene manipulation bearing polynucleotide sequences corresponding to appropriate protein molecules which, when combined with appropriate control sequences, confer specific properties on the host cell to be transfected and/or transformed. Plasmids, viruses, and bacteriophage are suitable vectors. Artificial vectors are constructed by cutting and joining DNA molecules from different sources using restriction enzymes and ligases. The term "vector" as used herein includes Recombinant DNA cloning vectors and Recombinant DNA expression vectors.
The terms "complementary" or "complementarity" as used herein refers to pairs of bases (purines and pyrimidines) that associate through hydrogen bonding in a double stranded nucleic acid. The following base pairs are complementary: guanine and cytosine; adenine and thymine; and adenine and uracil.
The term "hybridization" as used herein refers to a process in which a strand of nucleic acid joins with a complementary strand through base pairing. The conditions employed in the hybridization of two non-identical, but very similar, complementary nucleic acids varies with the degree of complementarity of the two strands and the length of the strands. Such techniques and conditions are well known to practitioners in this field.
"Isolated amino acid sequence" refers to any amino acid sequence, however constructed or synthesized, which is locationally distinct from the naturally occurring sequence.
"Isolated DNA compound" refers to any DNA sequence, however constructed or synthesized, which is locationally distinct from its natural location in genomic DNA.
"Isolated nucleic acid compound" refers to any RNA or DNA sequence, however constructed or synthesized, which is locationally distinct from its natural location.
A "primer" is a nucleic acid fragment which functions as an initiating substrate for enzymatic or synthetic elongation.
The term "promoter" refers to a DNA sequence which directs transcription of DNA to RNA.
A "probe" as used herein is a nucleic acid compound or a fragment thereof which hybridizes with another nucleic acid compound.
The term "stringency" refers to a set of hybridization conditions which may be varied in order to vary the degree of nucleic acid affinity for other nucleic acid. (See the definition of "hybridization", supra.)
The term "PCR" as used herein refers to the widely-known polymerase chain reaction employing a thermally-stable DNA polymerase.
The term "leader sequence" refers to a sequence of amino acids which can be enzymatically or chemically removed to produce the desired polypeptide of interest.
The term "processed polypeptide" refers to a polypeptide or protein wherein the N-terminal leader sequence has been removed to yield the desired polypeptide of interest.
The term "secretion signal sequence" refers to a sequence of amino acids generally present at the N-terminal region of a larger polypeptide functioning to initiate association of that polypeptide with the cell membrane and secretion of that polypeptide through the cell membrane.
The term "trypsinogen" refers to an inactive serine protease zymogen which can be converted to trypsin by removal of a hexapeptide leader sequence which generally comprises the sequence Val(Asp)4Lys (SEQ ID NO:?). The term "trypsin" or "trypsin-like enzymes" refer to proteases which have the ability to cleave the peptide bond on the carboxy-terminus of basic amino acid residues such as lysine and arginine.
The term "enterokinase" or "enteropeptidase" refers to proteases generally produced in epithelial cells that activate trypsinogen to trypsin by cleaving off the trypsinogen leader sequence from the amino terminus of the protein.
The term "trypsinogen analog" refers to trypsinogen which has been mutated such that it cannot be converted to active trypsin by the action of trypsin or trypsin-like enzymes.
The term "autoactivation" or "autocatalytic" refers to the ability of trypsin to activate trypsinogen by cleaving the leader sequence to produce more active trypsin.
The term "endogenous trypsin-like enzyme" refers to proteases which have the ability to cleave the peptide bond on the carboxy-terminus of basic amino acid residues such as lysine and arginine and which are normally expressed in the particular cell type of interest.
As used herein, "percent identity" is used with reference to the Blast 2 algorithm, which is available at the NCBI (http://www.ncbi.nlm.nih.gov/BLAST), using default parameters. References pertaining to this algorithm include: those found at http://www.ncbi.nlm.nih.gov/BLAST/blast_references.html; Altschul, S.F., Gish, W., Miller, W., Myers, E. W. & Lipman, D.J. (1990) "Basic local alignment search tool." J. Mol. Biol. 215:403-410; Gish, W. & States, D.J. (1993) "Identification of protein coding regions by database similarity search." Nature Genet. 3:266-272; Madden, T.L., Tatusov, R.L. & Zhang, J. (1996) "Applications of network BLAST server" Meth. Enzymol. 266:131-141; Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 25:3389-3402; and Zhang, J. & Madden, T.L. (1997) "PowerBLAST: A new network BLAST application for interactive or automated sequence analysis and annotation." Genome Res. 7:649-656.
Trypsinogen analogs:
The present invention avoids the premature activation of recombinant trypsinogen to trypsin that can occur during expression, secretion, and isolation. Prior experimentation involving recombinant trypsinogen indicates that it is generally quite difficult to provide a process for producing stable recombinant trypsinogen. Attempts by researchers experimenting in the field of recombinant protein expression to make recombinant trypsin were not successful, in part, because a mixture of different species of active and inactive trypsin was produced. Using prior art processes, the final product was not fully active mature recombinant trypsin. Trypsin contains three internal trypsin cleavage sites in addition to the cleavage site in the leader sequence. Trypsin has a strong affinity for itself. Prior processes were problematic because, during expression and/or secretion, trypsinogen was activated to trypsin by endogenous enzymes. These activated trypsin molecules then could cleave other recombinantly produced trypsin enzymes at internal cleavage sites and render these enzymes inactive. The resulting mixture of recombinant trypsin peptides contained only a small percentage of intact active trypsin. In addition, activation of trypsin during expression or secretion can damage the cell membrane of the host. This can also contribute to low yields of recombinant trypsinogen or trypsin.
The trypsinogen analogs described herein circumvent the premature activation problem because they are mutated at the activation site to prevent autoactivation or activation by endogenous trypsin-like enzymes. Thus, classical activation of trypsinogen into trypsin by enterokinase or trypsin would not be possible. It is also an object of the invention, however, to provide a means of activating the recombinant trypsinogen with an aminopeptidase such as di- or monoaminopeptidase once it is isolated from a cell lysate or, in the case of secreted enzymes, the cell culture medium.
Proteins having trypsin activity have been exceptionally good subjects for a molecular analysis of protein structure and function. Trypsin proteins have been isolated and characterized from numerous species including bovine, rat, and humans. Le Huerou et al. (1990) Isolation and nucleotide sequence of cDNA clone for bovine pancreatic anionic trypsinogen. Structural identity within the trypsin family, Eur. J. Biochem. 193, 767-773; Craik, C.S. et al. (1984) J. Biol. Chem. 259:14255-14264; Fletcher, T.S. et al. (1987) Biochemistry 26:3081-3086. Detailed mechanisms for the catalytic hydrolysis of peptide and ester substrates by serine proteases have been established and trypsin is a well-researched member of this class of enzymes. A large number of trypsin mutants have been made in order to elucidate the catalytic mechanism leading to proteolysis. Knights, R.J. et al. (1976) J. Biol. Chem. 251:222-228. In addition, many biochemistry textbooks use trypsin as an example when discussing general enzyme structure and function. Mathews van Holde, Biochemistry 355 (1990). Thus, a protein having trypsin activity includes a large group of enzymes which are well-conserved between species and which function by cleaving the peptide bond on the carboxy-terminus of basic amino acid residues such as lysine and arginine.
Wild-type trypsinogens generally contain a hexapeptide leader sequence consisting of Val-Asp-Asp-Asp-Asp-Lys. Trypsin and endogenous trypsin-like enzymes generally cleave the C-terminal peptide bond of Arg and Lys. Thus, the present invention provides trypsinogens that have modified leader sequences which maintain the molecule as an inactive zymogen that cannot be activated by trypsin-like enzymes.
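The cleavage rule stated above can be illustrated with a minimal sketch. It assumes the simplified rule given in the text (a candidate site after every Lys or Arg) and one-letter amino acid codes. The function name `trypsin_like_sites` is hypothetical, and the all-Asp leader is the analog leader named elsewhere in this specification.

```python
from typing import List


def trypsin_like_sites(peptide: str) -> List[int]:
    """Return 1-based positions of residues whose C-terminal bond is a
    candidate site for trypsin-like cleavage (simplified: every K or R)."""
    return [pos for pos, residue in enumerate(peptide.upper(), start=1)
            if residue in ("K", "R")]


NATIVE_LEADER = "VDDDDK"   # Val-Asp-Asp-Asp-Asp-Lys (wild-type hexapeptide)
ANALOG_LEADER = "VDDDDD"   # Val-Asp-Asp-Asp-Asp-Asp (Lys replaced by Asp)

print(trypsin_like_sites(NATIVE_LEADER))  # [6]
print(trypsin_like_sites(ANALOG_LEADER))  # []
```

Under this simplified rule the native leader presents one site, at the Lys in position 6, whereas the analog leader presents none.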
In the broad practice of the invention, it is contemplated that a variety of alternate leader peptides may be used in place of the native leader peptide to prevent activation by trypsin-like enzymes. For example, the number of amino acids in the N-terminal leader sequence can be as small as two amino acids which will maintain the enzyme in an inactive state. Amino acid chains longer than six are also possible. N-terminal leader sequences longer than about 20 amino acids, however, slow down the subsequent aminopeptidase activation process. Thus, it is preferred that the leader sequence be between four and six amino acids in length to maintain inactivity and provide stability while at the same time allowing efficient subsequent activation by aminopeptidases. In addition, an odd number of amino acid residues making up the leader sequence will not generally be processed cleanly by DAP. Thus, it is preferred that the leader sequence constitute four or six amino acids if DAP is to be used as the activating aminopeptidase.
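The preference for an even-length leader when DAP is used can be seen from a simple counting sketch. The model below is an assumption made for illustration only: it removes two residues at a time and ignores both DAP sequence preferences and the protein conformation. The function name `dap_steps` and the four-residue mature N-terminus `IVGG` are hypothetical placeholders.

```python
from typing import List, Tuple


def dap_steps(leader: str, mature: str) -> Tuple[List[str], str]:
    """Strip dipeptides from the N-terminus until the leader is used up.

    Returns the released dipeptides and the remaining chain. This is a
    counting model only; it does not model enzyme specificity or folding.
    """
    chain = leader + mature
    released = []
    removed = 0
    while removed < len(leader):
        released.append(chain[:2])   # DAP releases one dipeptide per step
        chain = chain[2:]
        removed += 2
    return released, chain


MATURE_START = "IVGG"  # illustrative mature N-terminus (assumption)

print(dap_steps("VDDDDD", MATURE_START))  # (['VD', 'DD', 'DD'], 'IVGG')
print(dap_steps("VDDDD", MATURE_START))   # (['VD', 'DD', 'DI'], 'VGG')
```

With a six-residue leader the last dipeptide ends exactly at the leader boundary. With a five-residue leader the last dipeptide straddles the boundary, so the enzyme would have to either stop one residue short or remove the first residue of the mature sequence.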
The present invention also contemplates leader sequences having a variable sequence of amino acids. The leader sequence is preferably exposed to the solvent which enables it to be cleaved by DAP or other aminopeptidases. A leader sequence consisting of amino acids which will facilitate exposure of the leader to the solvent and allow for subsequent removal by DAP is preferred. A preferred leader sequence will contain an even number from two to twenty hydrophilic amino acids which cannot be cleaved by trypsin-like enzymes, but which can be removed by DAP or other aminopeptidases. A more preferred leader sequence provides a single amino acid substitution in the native trypsinogen leader sequence wherein the first lysine encountered, for example, position 6 of the bovine trypsinogen zymogen, is substituted with any amino acid except Arg or Pro. An even more preferred leader sequence comprises the sequence Val-Asp-Asp-Asp-Asp-Asp (amino acids 5 through 10 of SEQ ID NO:2) immediately N-terminal to the first amino acid of the mature active trypsin enzyme.
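The single-substitution embodiment described above can be enumerated with a short sketch. It assumes the 20 standard amino acids in one-letter code and treats a substitution as excluding Lys itself, which is an interpretation rather than a statement from the text. The function name `leader_variants` is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues
NATIVE_LEADER = "VDDDDK"               # Val-Asp-Asp-Asp-Asp-Lys


def leader_variants(native: str = NATIVE_LEADER, excluded: str = "RP") -> list:
    """Replace the first Lys in the leader with every residue except the
    excluded ones (Arg and Pro per the text) and Lys itself (assumption)."""
    pos = native.index("K")  # first lysine encountered
    return [native[:pos] + aa + native[pos + 1:]
            for aa in AMINO_ACIDS if aa != "K" and aa not in excluded]


variants = leader_variants()
print(len(variants))          # 17
print("VDDDDD" in variants)   # True
```

Of the 20 standard residues, 17 remain once Lys, Arg, and Pro are set aside, and the all-Asp leader is one of the resulting candidates.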
Expression of recombinant trypsinogen analogs:
Wild-type trypsinogen genes can be obtained by a plurality of recombinant DNA techniques including, for example, hybridization, polymerase chain reaction (PCR) amplification, or de novo DNA synthesis. (See e.g., T. MANIATIS ET AL., MOLECULAR CLONING: A LABORATORY MANUAL (2d ed. 1989)). The isolated gene can then be modified or mutated to encode the trypsinogen analogs described above.
The isolated nucleic acids of the present invention can be prepared by direct chemical synthesis by methods such as the phosphotriester method of Narang, et al., Meth. Enzymol. 68:90-99 (1979); the phosphodiester method of Brown, et al., Meth. Enzymol. 68:109-151 (1979); the diethylphosphoramidite method of Beaucage, et al., Tetra. Letts. 22:1859-1862 (1981); the solid phase phosphoramidite triester method described by Beaucage and Caruthers, Tetra. Letts. 22(20):1859-1862 (1981), e.g., using an automated synthesizer, e.g., as described in Needham-VanDevanter, et al., Nucleic Acids Res. 12:6159-6168 (1984); and the solid support method of U.S. Patent No. 4,458,066. Chemical synthesis generally produces a single-stranded oligonucleotide, which may be converted into double-stranded DNA by hybridization with a complementary sequence, or by polymerization with a DNA polymerase using the single strand as a template.
The trypsinogen cDNA can be isolated from a library constructed from any tissue in which said gene is expressed. Methods for constructing cDNA libraries in a suitable vector such as a plasmid or phage for propagation in prokaryotic or eukaryotic cells are well known to those skilled in the art. (See e.g., MANIATIS ET AL., supra). Suitable cloning vectors are well known and are widely available.
In one method, mRNA is isolated from a suitable tissue, and first strand cDNA synthesis is carried out. A second round of DNA synthesis can be carried out for the production of the second strand. If desired, the double-stranded cDNA can be cloned into any suitable vector, for example, a plasmid, thereby forming a cDNA library. In addition, a variety of different cDNA libraries can be purchased commercially (Clontech Laboratories Inc., Palo Alto, California).
Oligonucleotide primers targeted to any suitable region of the trypsinogen gene can be used for PCR amplification. See e.g. PCR PROTOCOLS: A GUIDE TO METHOD AND APPLICATION (M. Innis et al. eds., 1990). The PCR amplification comprises template DNA, suitable enzymes, primers, and buffers, and is conveniently carried out in a DNA Thermal Cycler (Perkin Elmer Cetus, Norwalk, CT). A positive result is determined by detecting an appropriately-sized DNA fragment following agarose gel electrophoresis.
An object of the present invention is to provide a process whereby the trypsinogen analogs described herein can be expressed by recombinant methods. Recombinant protein expression is preferred to obtain a high yield of highly pure protein, especially when the goal is to use these proteins in the manufacturing process for other therapeutic biomolecules.
The basic steps in the recombinant production of desired proteins are: a) construction of a synthetic or semi-synthetic DNA encoding the protein of interest; b) integrating said DNA into an expression vector in a manner suitable for the expression of the protein of interest, either alone or as a fusion protein; c) transforming an appropriate eukaryotic or prokaryotic host cell with said expression vector; d) culturing said transformed or transfected host cell in a manner to express the protein of interest; and e) recovering and purifying the recombinantly produced protein of interest.
Trypsinogen-Encoding Nucleic Acids
The invention further provides trypsinogen-encoding nucleic acids. These are exemplified by the sequences shown in SEQ ID NO: 1 and SEQ ID NO: 3, and their complements. Also included in the inventive nucleic acids are those closely related to the sequences of SEQ ID NO: 1 and SEQ ID NO: 3 that retain the ability to be activated to form mature trypsin. Generally, these sequences share at least about 65 percent identity with SEQ ID NOS. 1 and 2, but more typically share at least about 70 percent identity. More preferred embodiments share at least about 75% identity, and some share at least about 80% identity. Even more preferred nucleic acids share at least about 85% identity or at least about 90% identity. Most preferred nucleic acids share at least about 95% identity, with some sharing at least about 98 percent or 99 percent identity.
It will be apparent to the skilled artisan that a certain amount of deviation in identity can be generated merely by taking into account "wobble," which causes the nucleic acid sequence to vary, but does not alter the encoded protein sequence. Some preferred nucleic acids eliminate or alter positions 1-12 or 1-30 of SEQ ID NO: 1 and others eliminate or alter positions 1-3 or 1-9 of SEQ ID NO: 3.
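The effect of wobble on nucleic acid identity can be illustrated with a minimal sketch. The two coding sequences below are hypothetical and are not taken from SEQ ID NO: 1 or SEQ ID NO: 3. The codon table is deliberately partial, and `percent_identity` assumes sequences that are already aligned and of equal length, which is a simplification of how identity is normally determined.

```python
# Partial codon table: only the codons used in this example.
CODON_TABLE = {
    "GTT": "V", "GTC": "V", "GTA": "V", "GTG": "V",   # valine
    "GAT": "D", "GAC": "D",                           # aspartate
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the partial table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


seq_a = "GTTGATGATGATGATGAT"   # hypothetical coding sequence for VDDDDD
seq_b = "GTCGACGACGACGACGAC"   # same peptide, different third-position bases

print(translate(seq_a) == translate(seq_b))      # True
print(round(percent_identity(seq_a, seq_b), 1))  # 66.7
```

In this constructed case every codon differs at its third position, so the two sequences share only about two thirds of their nucleotides while encoding the same peptide.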
Vectors and Host Cells:
The present invention also relates to vectors that include isolated nucleic acid molecules of the present invention, host cells that are genetically engineered with the recombinant vectors, and the production of trypsinogen analog polypeptides or fragments thereof by recombinant techniques.
The nucleotides encoding trypsinogen analogs can optionally be joined to a vector containing a selectable marker for propagation in a host. Generally, with respect to mammalian cell hosts, a plasmid vector is introduced in a precipitate, such as a calcium phosphate precipitate, or in a complex with a charged lipid, or by other methods that are well known to those with ordinary skill in the art. If the vector is a viral vector, it can be introduced directly into mammalian host cells or introduced using viral supernatant produced by packaging in vitro using an appropriate packaging cell line. Bacterial viral vectors can also be packaged in vitro using packaging cell extracts commercially available and then transfected into host bacterial cells.
The DNA insert should be operatively linked to an appropriate promoter, such as the phage lambda PL promoter, the E. coli lac, trp and tac promoters, the SV40 early and late promoters and promoters of retroviral LTRs, as well as the glyceraldehyde phosphate dehydrogenase (GAPDH) and alcohol oxidase (AOX) promoters to name a few. Other suitable promoters will be known to the skilled artisan. The expression constructs will further contain sites for transcription initiation, termination and, in the transcribed region, a ribosome binding site for translation. The coding portion of the mature transcripts expressed by the constructs will preferably include a translation initiating codon at the beginning and a termination codon (e.g., UAA, UGA or UAG) appropriately positioned at the end of the mRNA to be translated.
Expression vectors will preferably include at least one selectable marker. Such markers include, e.g., dihydrofolate reductase or neomycin resistance for mammalian cell culture, neomycin resistance or complementation of auxotrophic markers for yeasts, and tetracycline, ampicillin, kanamycin, or chloramphenicol resistance genes for culturing in E. coli and other bacteria. Representative examples of appropriate hosts include, but are not limited to, bacterial cells, such as E. coli, Streptomyces and Salmonella typhimurium cells; fungal cells, such as Aspergillus niger; yeast cells, such as Pichia pastoris and Saccharomyces cerevisiae; insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS and Bowes melanoma cells; and plant cells. Appropriate culture media and conditions for the above-described host cells are known in the art. Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9, available from Qiagen; pBS vectors, Phagescript vectors, Bluescript vectors, pNH8A, pNH16a, pNH18A, pNH46A, available from Stratagene; and ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia. Preferred eucaryotic vectors include pWLNEO, pSV2CAT, pOG44, pXT1 and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available from Pharmacia. Preferred vectors for expression in Pichia pastoris include pLDG vectors (figure 1) and pPIC vectors commercially available from Invitrogen, Inc. Other suitable vectors will be readily apparent to the skilled artisan.
Introduction of a vector construct into a host cell can be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, transformation or other methods. Such methods are described in many standard laboratory manuals, such as Sambrook, supra, Chapters 1-4 and 16-18; Ausubel, supra, Chapters 1, 9, 13, 15, 16.
Trypsinogen analogs of the present invention can be expressed in a modified form, such as a fusion protein, and can include not only secretion signals, but also additional heterologous functional regions. For instance, a region of additional amino acids can be added to the N-terminus of an analog to improve stability and persistence in the host cell, during purification, or during subsequent handling and storage. Also, peptide moieties can be added to facilitate purification. Such regions can be removed prior to final preparation of an active enzyme. Such methods are described in many standard laboratory manuals, such as Sambrook, supra, Chapters 17.29-17.42 and 18.1-18.74; Ausubel, supra, Chapters 16, 17 and 18.
Expression of Proteins in Host Cells
Using nucleic acids of the present invention, one may express a protein of the present invention in a recombinantly engineered cell, such as bacteria, yeast, insect, or mammalian cells. The cells produce the protein in a non-natural condition (e.g., in quantity, composition, location, and/or time), because they have been genetically altered through human intervention to do so.
It is expected that those of skill in the art are knowledgeable in the numerous expression systems available for expression of a nucleic acid encoding a protein of the present invention. No attempt to describe in detail the various methods known for the expression of proteins in prokaryotes or eukaryotes will be made.
In brief summary, the expression of isolated nucleic acids encoding a protein of the present invention will typically be achieved by operably linking, for example, the DNA or cDNA to a promoter (which is either constitutive or inducible), followed by incorporation into an expression vector. The vectors can be suitable for replication and integration in either prokaryotes or eukaryotes. Typical expression vectors contain transcription and translation terminators, initiation sequences and promoters useful for regulation of the expression of the DNA encoding a protein of the present invention. To obtain high level expression of a cloned gene, it is desirable to construct expression vectors which contain, at the minimum, a strong promoter to direct transcription, a ribosome binding site for translational initiation, and a transcription/translation terminator. One of skill would recognize that modifications can be made to a protein of the present invention without diminishing its biological activity. Some modifications may be made to facilitate the cloning, expression, incorporation of the targeting molecule into a fusion protein, or purification of the protein. Such modifications are well known to those of skill in the art and include, for example, a methionine added at the amino terminus to provide an initiation site, or additional amino acids (e.g., poly His) placed on either terminus to facilitate purification of the protein or other cleavages to create conveniently located restriction sites or termination codons.
Alternatively, nucleic acids of the present invention can be expressed in a host cell by turning on (by manipulation) endogenous DNA in the host cell that encodes a trypsinogen analog of the present invention. Such methods are well known in the art, e.g., as described in US Patent Nos. 5,580,734, 5,641,670, 5,733,746, and 5,733,761, entirely incorporated herein by reference.
Expression in Prokaryotes
Prokaryotic cells may be used as hosts for expression. Prokaryotes most frequently are represented by various strains of E. coli; however, other microbial strains may also be used. Commonly used prokaryotic control sequences which are defined herein to include promoters for transcription initiation, optionally with an operator, along with ribosome binding site sequences, include such commonly used promoters as the beta lactamase (penicillinase) and lactose (lac) promoter systems (Chang, et al., Nature 198:1056 (1977)), the tryptophan (trp) promoter system (Goeddel, et al., Nucleic Acids Res. 8:4057 (1980)), the bacteriophage T7 promoter and RNA polymerase, and the bacteriophage lambda derived PL promoter and N-gene ribosome binding site (Shimatake, et al., Nature 292:128 (1981)). The inclusion of selection markers in DNA vectors transfected in E. coli is also useful. Examples of such markers include genes specifying resistance to ampicillin, tetracycline, kanamycin, or chloramphenicol.
The vector is selected to allow introduction into the appropriate host cell. Bacterial vectors are typically of plasmid or phage origin. Appropriate bacterial cells are infected with phage vector particles or transfected with naked phage vector DNA. If a plasmid vector is used, the bacterial cells are transformed with the plasmid vector DNA. Expression systems for expressing a protein of the present invention are available using Bacillus sp. and Salmonella (Palva, et al., Gene 22:229-235 (1983); Mosbach, et al., Nature 302:543-545 (1983)).
Expression in Eukaryotes
A variety of eukaryotic expression systems such as yeast, insect cell lines, plant and mammalian cells, are known to those of skill in the art. As explained briefly below, a nucleic acid of the present invention can be expressed in these eukaryotic systems.
Synthesis of heterologous proteins in yeast is well known. F. Sherman, et al., Methods in Yeast Genetics, Cold Spring Harbor Laboratory (1982) is a well-recognized work describing the various methods available to produce the protein in yeast. Two widely utilized yeast for production of eukaryotic proteins are Saccharomyces cerevisiae and Pichia pastoris. Vectors, strains, and protocols for expression in Saccharomyces and Pichia are known in the art and available from commercial suppliers (e.g., Invitrogen). Suitable vectors usually have expression control sequences, such as promoters, including 3-phosphoglycerate kinase or alcohol oxidase, and an origin of replication, termination sequences and the like as desired.
The sequences encoding proteins of the present invention can also be ligated to various expression vectors for use in transfecting cell cultures of, for instance, mammalian, insect, or plant origin. Illustrative of cell cultures useful for the production of the peptides are mammalian cells. Mammalian cell systems often will be in the form of monolayers of cells although mammalian cell suspensions may also be used. A number of suitable host cell lines capable of expressing intact proteins have been developed in the art, and include the HEK293, BHK21, and CHO cell lines. Expression vectors for these cells can include expression control sequences, such as an origin of replication, a promoter (e.g., the CMV promoter, a HSV tk promoter or pgk (phosphoglycerate kinase) promoter), an enhancer (Queen, et al., Immunol. Rev. 89:49 (1986)), and processing information sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites (e.g., an SV40 large T Ag poly A addition site), and transcriptional terminator sequences. Other animal cells useful for production of proteins of the present invention are available, for instance, from the American Type Culture Collection Catalogue of Cell Lines and Hybridomas (7th edition, 1992).
Appropriate vectors for expressing proteins of the present invention in insect cells are usually derived from the SF9 baculovirus. Suitable insect cell lines include mosquito larvae, silkworm, army worm, moth and Drosophila cell lines such as a Schneider cell line (See Schneider, J. Embryol. Exp. Morphol. 27:353-365 (1987)).
As with yeast, when higher animal or plant host cells are employed, polyadenylation or transcription terminator sequences are typically incorporated into the vector. An example of a terminator sequence is the polyadenylation sequence from the bovine growth hormone gene. Sequences for accurate splicing of the transcript may also be included. An example of a splicing sequence is the VP1 intron from SV40 (Sprague, et al., J. Virol. 45:773-781 (1983)). Additionally, gene sequences to control replication in the host cell may be incorporated into the vector such as those found in bovine papilloma virus type-vectors. M. Saveria-Campo, Bovine Papilloma Virus DNA, a Eukaryotic Cloning Vector in DNA Cloning Vol. II, a Practical Approach, D. M. Glover, Ed., IRL Press, Arlington, VA, pp. 213-238 (1985).
Signal Peptides:
Signal peptides may be used to facilitate the extracellular discharge of proteins in both prokaryotic and eukaryotic environments. It has been shown that the addition of a heterologous signal peptide to a normally cytosolic protein may result in the extracellular transport of the normally cytosolic protein. Alternate signal peptide sequences may function with heterologous coding sequences.
Signal peptides are well known in the art and can be incorporated into the modified trypsinogen structure to facilitate extracellular translocation or intracellular destination. In the preferred practice of the invention, the signal peptide used is a signal peptide native to a secretory protein of the host cell line. In the most preferred practice of the invention as exemplified herein, the signal peptide is the alpha factor signal sequence. This signal sequence is fused to the N-terminal end of the trypsinogen analog leader sequence and will result in the extracellular transport of trypsinogen analogs expressed in yeast. An insertion of amino acids such as Glu-Ala-Glu-Ala (amino acids x through x of SEQ ID NO:2) between the C-terminus of the alpha factor signal sequence and the N-terminus of the leader sequence may improve the yield of secreted protein in some systems. An additional signal sequence exemplified herein is the human serum albumin (HSA) signal sequence which can also be used to target the analog to the cell membrane for extracellular secretion.
Protein Purification:
Once an expression vector carrying the trypsinogen analog gene is transfected into a suitable host cell using standard methods, cells that contain the vector are propagated under conditions suitable for expression of the recombinant trypsinogen analog protein. For example, if the recombinant gene has been placed under the control of an inducible promoter, suitable growth conditions would incorporate the appropriate inducer. The recombinantly-produced protein may be purified from cellular extracts of transformed cells by any suitable means.
Trypsinogen analogs of the present invention can be recovered and purified from recombinant cell cultures by well-known methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, reversed-phase chromatography, hydroxylapatite chromatography, and lectin chromatography. Most preferably, ultra-filtration is employed coupled with cation exchange chromatography.
Additionally, the trypsinogen analog may be fused at its N-terminal end to several histidine residues. This "histidine tag" enables a single-step protein purification method referred to as "immobilized metal ion affinity chromatography" (IMAC), essentially as described in U.S. Patent 4,569,794, which hereby is incorporated by reference. The IMAC method enables rapid isolation of substantially pure protein starting from a crude extract of cells that express a recombinant protein, as described above.
In one embodiment of the invention, trypsinogen analog which has been secreted into the culture medium is concentrated by ammonium sulfate precipitation (70%). The precipitate is resuspended in 20 mM sodium acetate, pH 5.0 and dialyzed against the same buffer. The trypsinogen analog can then be loaded on a CM-Sepharose CL6B Column (Pharmacia, Uppsala, Sweden) and eluted with a gradient of NaCl.
An alternative purification process from cell culture growth medium begins with centrifugation of the growth medium to pellet cells. The pH of the supernatant is adjusted to about 3.0, preferably with an acetate buffer. The adjusted supernatant is then subjected to ultra-filtration (tangential flow filtration with 300 kDa/10 kDa molecular weight cutoffs). The filtrate can optionally be further purified using cation exchange chromatography. The purified isolate then is processed using an aminopeptidase such as DAP or MAP. The reaction is then concentrated again using ultrafiltration, resulting in yields around 50%.
Activation by aminopeptidases:
The present invention provides a process for activating the trypsinogen analog following expression of the trypsinogen in bacteria, yeast, or higher eukaryotic cells. Following digestion with an aminopeptidase such as a mono- or diaminopeptidase, an active trypsin protein with an N-terminus identical to that of wild type trypsin can be obtained. The conformation of trypsinogen will prevent aminopeptidases from cleaving beyond the first amino acid of the mature trypsin molecule. Numerous aminopeptidases exist and can be used to activate the trypsinogen analogs of the present invention. Watson et al. (1976) Methods Microb. 9:1-14 describe different aminopeptidases present in different bacteria including E. coli and is entirely herein incorporated by reference.
Before processing the trypsinogen analogs using an aminopeptidase, however, the trypsinogen may be purified from a cell lysate or, in the case of a secreted protein, from the culture medium. A variety of purification steps may be employed including ammonium sulfate precipitation followed by dialysis and column chromatography.
It is a preferred method of the present invention to activate trypsinogen analogs using dipeptidylaminopeptidase (DAP). Dipeptidylaminopeptidases are enzymes which hydrolyze the penultimate amino terminal peptide bond releasing dipeptides from the unblocked amino-termini of peptides and proteins. There are currently four classes of dipeptidylaminopeptidases (designated DAP-I, DAP-II, DAP-III and DAP-IV) that differ based on their physical characteristics and the rates at which they react with their substrates. DAP I is a relatively non-specific DAP that catalyzes the release of many dipeptide combinations from the unblocked amino termini of peptides and proteins. DAP I shows little or no activity if the emergent dipeptide is Pro-X, or X-Pro (where X is any amino acid). DAP II shows a preference for amino terminal dipeptide sequences that begin with Arg-X or Lys-X, and to a lesser extent, X-Pro. DAP-II exhibits significantly lower reaction rates versus most other dipeptide combinations. DAP III appears to have a propensity toward amino terminal dipeptide sequences of the form Arg-Arg and Lys-Lys. DAP IV shows its highest rate of hydrolytic activity toward dipeptide sequences of the form X-Pro. The DAP enzymes, particularly DAP-I and DAP-IV, have been shown to be useful in processing proteins.
Processing of precursor polypeptides containing leader sequences by bovine dipeptidylaminopeptidase is disclosed in Becker et al., United States Patent 5,126,249 and is herein incorporated by reference. One particular DAP commonly used to process precursor polypeptides is that derived from the slime mold Dictyostelium discoideum. The synthesis, purification and use of this protease, often abbreviated as dDAP, are described in European Patent Publication 595,476, published May 4, 1994, and United States Patent Applications 08/301,519, filed September 7, 1994, and 08/445,308, filed May 19, 1995, all of which are herein incorporated by reference.
It was unexpected to find that addition of dDAP to the trypsinogen analogs of the present invention resulted in a full-length active trypsin enzyme. The unique secondary structure of trypsin prevents dDAP from cleaving amino acids beyond the first few amino acids of the mature protein. DAP cleavage of trypsinogen analog derived from the bovine sequence does not progress beyond the isoleucine at position 11 (SEQ ID NO:2) due to the disulfide bond formed by the cysteine residue at position 17 (SEQ ID NO:2). One could conceive that a leader sequence present on any protein molecule could be removed by an aminopeptidase as long as the secondary structure of the protein was such that the peptidase could not cleave beyond the first critical residue of the desired end product protein. In addition, the activation reaction can be carefully controlled to ensure that cleavage does not occur beyond the first amino acid of the mature protein.
The DAP reaction that converts inactive trypsinogen analogs into active trypsin is generally conducted in an aqueous medium suitably buffered to obtain and maintain a pH from about 2.0 to about 6.5. Preferably, the pH of the medium ranges from about 3.0 to about 4.5, and most preferably, from about 3.0 to about 3.5. Using dDAP to remove dipeptides from trypsinogen is advantageous because dDAP's pH optimum of 3.5 allows the reaction to be run at acidic pH. These acidic conditions are important in maintaining solubility, preventing degradation by various proteases, and allowing for the production of single-chain trypsin. In some cases, however, a solubilizing agent such as urea, sodium dodecylsulfate, guanidine, and the like, may be employed.
The following examples more fully describe the present invention. Those skilled in the art will recognize that the particular reagents, equipment, and procedures described are merely illustrative and are not intended to limit the present invention in any manner.
These examples discuss the expression and secretion of various modified trypsinogens in Pichia pastoris expression systems. One of these expression systems is commercially sold by Invitrogen, Inc. (Leek, Netherlands) and utilizes a vector containing the methanol inducible alcohol oxidase (AOX1) promoter. Secretion of the expressed protein to the culture medium is directed by the alpha factor signal sequence. Clones were established in the GS115 strain also supplied by Invitrogen. An additional expression system that was tested employs the glyceraldehyde phosphate dehydrogenase (GAPDH) promoter which is induced when glucose becomes limiting in the culture medium. The human serum albumin (HSA) signal sequence directs the expressed protein to the culture medium. The production strain SMD1163 used in this expression system has mutations in two vacuolar proteases which have been shown to cause a decrease in the degradation of expressed proteins.
Chymotrypsin-Free Trypsinogen
Another product of the invention is a highly purified preparation of trypsin. (Trypsin is defined below.) Because the present trypsin is produced in a recombinant system that results in secretion into the medium, it is anticipated that the resulting preparation should be highly purified and have negligible levels of contaminating chymotrypsin activity.
This highly purified trypsin meets at least one of the following criteria of purity. Generally such preparations are greater than about 90 percent pure. More typically, however, these preparations are more than about 95 percent pure and preferably they are at least about 99 percent pure, and generally at least about 99.5, 99.9 or 99.99 percent pure. Some preparations have no detectable contaminants.
One method of estimating purity is sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) with silver staining and densitometry. Percent purity is expressed as a ratio of the trypsin peak area to total peak area.
Purity may also be evaluated by reverse phase high performance liquid chromatography method as presented generally below in the Examples. In such a case, percent purity is evaluated by comparing the integration of the trypsin peak to all peaks. The preparations of the invention generally show no chymotrypsin by this method.
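The peak-area calculation used by both the densitometry and chromatography methods can be written as a short sketch. The function name `percent_purity` and the peak areas are hypothetical and serve only to show the arithmetic; they are not measured data.

```python
def percent_purity(trypsin_area: float, other_areas: list) -> float:
    """Trypsin peak area as a percentage of the total integrated area."""
    total = trypsin_area + sum(other_areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * trypsin_area / total


# Hypothetical integration results in arbitrary units.
print(percent_purity(995.0, [3.0, 2.0]))  # 99.5
```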
Purity may also be measured by specific activity using the TAME assay, detailed below. Reference pure material should yield about 235 U/mg.
Importantly, preferred trypsin preparations are "essentially chymotrypsin-free." As used herein, "essentially chymotrypsin-free" denotes a preparation that has less than about 0.01% chymotrypsin (by weight, relative to trypsin). More preferred essentially chymotrypsin-free preparations generally have less than about 0.005%, and even more preferred preparations have less than about 0.001% chymotrypsin and most preferred preparations have less than about 0.0005% chymotrypsin. Some highly purified preparations are expected to meet the foregoing criteria of purity when measured using the ultra-sensitive glucagon-based high performance liquid chromatography (HPLC) method presented below in Example 6. In such an assay, preferred compositions should show no detectable chymotryptic glucagon peaks.
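The chymotrypsin limits recited above can be expressed as a simple lookup. This is a minimal sketch: the limits come from the text, while the function name `chymotrypsin_tier`, the informal tier labels, and the example masses are assumptions chosen for illustration.

```python
# Upper limits in percent chymotrypsin (by weight, relative to trypsin),
# taken from the text; the labels are informal shorthand.
TIERS = [
    (0.0005, "most preferred"),
    (0.001, "even more preferred"),
    (0.005, "more preferred"),
    (0.01, "essentially chymotrypsin-free"),
]


def chymotrypsin_tier(chymotrypsin_mg: float, trypsin_mg: float) -> str:
    """Classify a preparation by its chymotrypsin content relative to trypsin."""
    percent = 100.0 * chymotrypsin_mg / trypsin_mg
    for limit, label in TIERS:
        if percent < limit:
            return label
    return "not essentially chymotrypsin-free"


# Hypothetical preparations, each containing 1000 mg of trypsin.
print(chymotrypsin_tier(0.002, 1000.0))  # most preferred
print(chymotrypsin_tier(0.08, 1000.0))   # essentially chymotrypsin-free
print(chymotrypsin_tier(0.5, 1000.0))    # not essentially chymotrypsin-free
```

The tiers are checked from the strictest limit upward, so a preparation is always assigned the most stringent category it satisfies.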
The trypsin preparations of the invention also are free of any cellular components derived from Aspergillus, or related organisms, such as proteins and toxins.
The trypsin preparations of the invention generally are essentially mammalian protein-free. As used herein, "essentially mammalian protein-free" compositions refer to any of the inventive compositions that are essentially free of all mammalian proteins, except, of course, for the recombinantly produced product. In general, this is achieved by avoiding the addition of mammalian proteins, like casein (added to bacterial cultures), and producing the proteins in a non-mammalian host. Excluding the recombinant protein, typical compositions have less than about 1% mammalian protein, but usually have less than about 0.5% mammalian protein or less than about 0.1% mammalian protein. Again, excluding the recombinant protein, more preferred compositions have less than about 0.01% mammalian protein and most preferred compositions have no detectable mammalian protein. The skilled artisan will be aware of numerous methods, such as enzyme-linked immunosorbent assays (ELISAs), for detecting contaminating mammalian proteins.
Conveniently, the essentially chymotrypsin-free trypsin of the invention may be assembled into "commercial units" that are suitable for sale. Each commercial unit generally comprises a bulk quantity of essentially chymotrypsin-free trypsin. A bulk quantity usually comprises at least about 10 mg of essentially chymotrypsin-free trypsin per unit. More typically, however, larger quantities will be present, such as at least about 50 mg per unit, at least about 100 mg per unit, at least about 500 mg per unit or at least about 1 gram per unit. Even larger quantities, e.g., a unit of at least about fifty grams, a unit of at least about a hundred grams, or a unit of at least about a kilogram, also are contemplated.
The term "commercial unit" also contemplates assemblages of smaller commercial units to form larger ones. For example, a one kilogram commercial unit of essentially chymotrypsin-free trypsin may be provided as a thousand one-gram commercial units. Generally, a commercial unit will contain essentially chymotrypsin-free trypsin as a liquid solution. It may, however, be present in a solid form, such as a freeze-dried (lyophilized) powder. Bulking agents and stabilizers (like calcium) are optionally included. A commercial unit also includes the packaging containing the essentially chymotrypsin-free trypsin, and optionally includes printed product specifications, an inventory control number and/or instructions for use.
Example 1: Construction of Expression Vectors Containing Trypsinogen Analogs
Wild type bovine trypsinogen was mutated to destroy the trypsin cleavage site. The Lys residue present in the leader sequence of the native bovine trypsinogen protein was mutated to an Asp residue (position 10 of SEQ ID NO:2). At the DNA level the mutation generates an EcoRV site which is useful for subsequent modifications of different constructs. All constructs were first assembled using the vector pRMG5 (Fig. 1) and then transferred to Pichia pastoris expression vectors.
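The effect of the Lys-to-Asp substitution on the leader can be illustrated with a short sketch. It assumes the wild-type bovine leader is Val-(Asp)4-Lys, so that the mutated leader is the Val-(Asp)5 sequence described in this Example; position 6 of the six-residue leader corresponds to position 10 of SEQ ID NO:2 when the Glu-Ala-Glu-Ala peptide precedes it. The helper names are hypothetical.

```python
WILD_TYPE_LEADER = "VDDDDK"  # assumed wild-type bovine leader, one-letter code


def mutate(sequence, position, new_residue):
    """Return a copy of sequence with the 1-based position replaced."""
    index = position - 1
    return sequence[:index] + new_residue + sequence[index + 1:]


def has_basic_residue(sequence):
    """True if the sequence contains Lys (K) or Arg (R), the residues trypsin cleaves after."""
    return any(residue in "KR" for residue in sequence)


analog_leader = mutate(WILD_TYPE_LEADER, 6, "D")
print(analog_leader, has_basic_residue(analog_leader))
```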
The expression vector pPIC9 supplied by Invitrogen was used to create a trypsinogen analog fused to the α-factor signal sequence. The DNA encoding the fusion protein was cloned downstream of the methanol inducible AOX1 promoter. The pPIC9 vector contains the AOX1 promoter cloned 5' to the α-factor signal sequence which in turn is cloned 5' to a multiple cloning sequence. The vector was constructed such that DNA encoding a Glu-Ala-Glu-Ala (amino acids 1 through 4 of SEQ ID NO:2) peptide was inserted between the C-terminus of the alpha factor signal sequence and the N-terminus of the trypsinogen analog leader sequence to improve the yield of secreted protein.
An oligonucleotide encoding the alpha factor signal sequence, the Glu-Ala-Glu-Ala peptide, and the leader sequence comprising amino acids 5 through 10 of SEQ ID NO:2 was first cloned into the vector pRMG5 (Fig. 1). The following lyophilized oligonucleotides were resuspended in 200 μl of sterile H2O: 5'-TC GAG AAA AGA GAG GCT GAA GCT GTC GAC GAT GAT GAC GAT ATC GTT GGA GGT TAT ACA TGT GG-3' and 5'-C GCC ACA TGT ATA ACC TCC AAC GAT ATC GTC ATC ATC GTC GAC AGC CTT CGA CTC TCT TTT C-3' (oligonucleotide final concentrations were about 2.5 μg/μl). Approximately 10 μg of each oligonucleotide was phosphorylated with 4 Units of T4 polynucleotide kinase for 1 hour at 37°C in a buffer containing 6 mM MgCl2, 6 mM DTT, 0.6 mM ATP and 120 mM Tris, pH 8.0.
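As a check on what the first (top-strand) oligonucleotide encodes, it can be translated with the standard genetic code. The sketch below is illustrative: the reading frame (starting after the first two bases, which form part of the 5' overhang) is an assumption inferred from the peptides the text says are encoded, and the helper names are hypothetical.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate complete codons of dna, starting at a 0-based offset."""
    dna = dna.replace(" ", "").upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


TOP_OLIGO = ("TC GAG AAA AGA GAG GCT GAA GCT GTC GAC GAT GAT GAC "
             "GAT ATC GTT GGA GGT TAT ACA TGT GG")

peptide = translate(TOP_OLIGO, offset=2)  # frame offset is an assumption
has_ecorv_site = "GATATC" in TOP_OLIGO.replace(" ", "")
print(peptide, has_ecorv_site)
```

In this frame the oligonucleotide reads Glu-Lys-Arg, then Glu-Ala-Glu-Ala, then Val-(Asp)5, followed by the first residues of the trypsin coding sequence, and it contains the EcoRV recognition sequence GATATC created by the Lys-to-Asp change.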
The phosphorylated oligonucleotides were then pooled and annealed. The resulting double stranded oligonucleotides were then cloned into pRMG5 to create pRMG5-αF(EA)2VD5. The pRMG5 vector was prepared by digesting the vector with XhoI and NarI. After digestion, the vector was dephosphorylated with 0.1 U of calf intestinal phosphatase (CIP) at 37°C for 30 minutes. The vector was then ligated to the double stranded phosphorylated oligonucleotides. The resulting clone was used for subcloning the trypsinogen analog in pPIC9.
The pPIC9 expression vector was prepared in the dam minus E. coli GM82 strain and digested with AvrII (XbaI cohesive) and XhoI and then dephosphorylated with 0.1 U of CIP. The trypsinogen analog was extracted from pRMG5-αF(EA)2VD5 as a XhoI/XbaI fragment and subcloned into pPIC9. The resulting plasmid, pPIC9-αF(EA)2VD5, contained DNA encoding the alpha factor signal sequence (αF) fused to the Glu-Ala-Glu-Ala ((EA)2) insertion fused to the leader Val(Asp)5 (VD5) fused to bovine trypsin.
Trypsinogen was also fused directly to the C-terminus of the alpha factor without the (GluAla)2 insertion. Two oligonucleotides, 5'-TC GAG AAA AGA GTC GAC GAT GAT GAC GAT ATC GTT GGA GTT TAT ACA TGT GG-3' and 5'-CGC CAC ATG TAT AAC CTC CAA CGA TAT CGT CAT CAT CGT CGA CTC TTT TC-3', encoding a C-terminal portion of the alpha factor signal sequence and the Val(Asp)5 leader sequence, were synthesized, prepared as described above, and ligated into the XhoI and NarI sites of pRMG5. A positive clone with the correct sequence was then used to subclone the trypsinogen gene into pPIC9 as described above. The resulting clone was named pPIC9-αFVD5 and contained DNA encoding the alpha factor signal sequence fused to the leader Val(Asp)5 fused to bovine trypsin.
Additional constructs were made employing the vector pLGD43 (Figure 2). In this vector, the trypsinogen analog gene is under the control of the glyceraldehyde phosphate dehydrogenase (GAPDH) promoter and the HSA secretion signal sequence is used to direct the protein to the outside of the cell. The cloning strategy involves first ligating a trypsinogen analog coding sequence into a pRMG5 construct and then further subcloning into the expression vector pLGD43.
Two different trypsinogen analog gene sequences were cloned into pLGD43. The two gene sequences encoded a trypsinogen analog with either four or six amino acids in the leader sequence, wherein the lysine is replaced with aspartate (Val(Asp)3-trypsin or Val(Asp)5-trypsin, respectively).
The trypsinogen analog having the four amino acid leader sequence was cloned using two oligonucleotides, 5'-TC GAG GGT AAC CTT TAT TTC CCT TCT TTT TCT CTT TAG CTC GGC TTA TTC CAG GGG TGT GTT TCG TCG AGT CGA CGA CGA T-3' and 5'-ATC GTC GTC GAC TCG ACG AAA CAC ACC CCT GGA ATA AGC CGA GCT AAA GAG AAA AAG AAG GGA AAT AAA GGT TAC CC-3', encoding the HSA signal sequence and the four amino acid leader sequence. The oligos were phosphorylated, annealed and ligated into the XhoI and EcoRV sites of the pRMG5-αF(EA)2VD5 construct described above to create pRMG5HSAVD3. The resulting pRMG5 vector was then used for transferring the DNA sequence encoding the HSA signal sequence fused to the four amino acid leader sequence to pLGD43 as a BstEII/BamHI fragment to create pLGD43HSAVD3.
The trypsinogen analog having the six amino acid leader sequence was cloned by taking the trypsinogen analog gene cassette from pRMG5-αF(EA)2VD5 as a SalI/BamHI fragment into the vector pRMG5HSAVD3 also digested with SalI and BamHI. The resulting vector was named pRMG5HSAVD5. This modified trypsinogen was then subcloned from pRMG5HSAVD5 into pLGD43 as described above for the modified trypsinogen containing the four amino acid leader sequence. The resulting vector was named pLGD43HSAVD5.
Example 2: Selection of Pichia pastoris transformants
Transformation of Pichia pastoris:
Pichia pastoris GS115 and SMD1163 protease minus strains were transfected using the spheroplast method or the electroporation method. Before transfection, the expression vectors described in Example 1 were linearized with BglII for pPIC9 derivatives and NotI for pLGD43 derivatives to facilitate homologous recombination of an expression cassette on the Pichia chromosome. This cassette consisted of a promoter (AOX1 or GAPDH) controlling the expression of a trypsinogen analog and the yeast HIS4 gene, which was used as a selectable marker.
Yeast strains used for transformation have a mutated HIS4 gene and are unable to grow in medium lacking histidine. Transfected cultures were plated on minimal medium and only those cells which had integrated the HIS4 gene in their chromosome were able to grow on minimal medium lacking histidine (HIS+ clones).
Clones having integrated the cassette in the AOX1 locus were then isolated. Homologous recombination directed to the AOX1 locus leads to the deletion of the AOX1 gene. Strains carrying a deletion of the AOX1 gene are still able to grow on methanol using AOX2 alcohol oxidase, but the growth is much slower compared to cells with an intact AOX1 gene. Therefore, HIS+ transformants were screened for integration in the AOX1 locus by plating them on minimal medium containing methanol as the sole carbon source. Slow growing clones having integrated the expression cassette in the AOX1 locus were isolated and designated mut^s (methanol utilization slow).
Example 3: Induction of trypsinogen analog expression
Induction of trypsinogen analogs using the AOX1 promoter:
The AOX1 promoter is inducible in the presence of methanol. HIS+/mut cells, which were originally transfected with the expression vectors containing trypsinogen analogs driven by the AOX1 promoter, were first grown in glycerol to generate biomass and then transferred to medium containing methanol for induction. Clones to be screened were stored as 1 ml glycerol stocks at -80°C. One aliquot (1 ml) was thawed and used to inoculate 75 ml BMGY (100 mM phosphate buffer pH 6.0, 1.34% bacto yeast nitrogen base (YNB), 10% yeast extract, 20% bactopeptone, 1% glycerol). Cultures were grown for 24 hours at 30°C by shaking at 250 RPM. After 24 hours, the cells were harvested by centrifugation and resuspended in BMMY (100 mM phosphate buffer pH 6.0, 1.34% YNB, 10% yeast extract, 20% bactopeptone, 0.5% methanol) in a 500 ml shaker flask. Cells were incubated for an additional 48 hours at 30°C to induce trypsinogen analog expression. Twenty-four hours after induction, 125 ml of pure methanol was added to keep the level of methanol constant in the culture medium. The cultures were centrifuged and the supernatant was isolated to determine the concentration of trypsinogen analog.
Induction of trypsinogen analog using the GAPDH promoter:
The GAPDH promoter is induced when glucose becomes limited in the culture medium.
The induction of expression of trypsinogen analog from the GAPDH promoter was carried out essentially as described for the AOX1 promoter. Subconfluent cultures of HIS+/mut cells, which were originally transfected with expression vectors containing a trypsinogen analog driven by the GAPDH promoter, were grown in BMGlcY (100 mM phosphate buffer pH 6, 1.34% YNB, 10% yeast extract, 20% bactopeptone, 5% glucose). Cells were harvested and transferred to BMGY medium depleted in glucose (100 mM phosphate buffer pH 6, 1.34% YNB, 10% yeast extract, 20% bactopeptone). The supernatant was harvested 48 hours after induction. Glycerol was added to 0.5% 24 hours after induction.
Example 4: Selection and characterization of clones expressing trypsinogen analogs:
The trypsinogen present in the supernatant of Pichia pastoris was first concentrated by ammonium sulfate precipitation (70%). The precipitate was resuspended in 20 ml of 20 mM sodium acetate pH 5 and dialyzed overnight against the same buffer at 4°C. The dialyzed solution was then loaded on CM-Sepharose CL6B (Pharmacia, Uppsala, Sweden) equilibrated with 20 mM acetate buffer pH 5. Trypsinogen analog was eluted with a gradient of NaCl (0-50 mM). Fractions were analyzed by SDS-PAGE and fractions containing the trypsinogen analogs were pooled and concentrated.
The purified trypsinogen was then activated with 500 U dDAP per gram of trypsinogen analog in 50 mM acetate buffer pH 3.0 and the resulting trypsin activity was determined using the Tosyl-Arg methyl esterase (TAME) assay as described in B.C.W. Hummel, 37 Can. J. Biochem. Physiol., 1393 (1959).
The presence of the Glu-Ala-Glu-Ala linker peptide positioned between the signal sequence and trypsinogen analogs causes an increase in the yield of secreted protein. However, this sequence is not completely processed during secretion, but is effectively removed by dDAP during the activation process.
Figure 3 summarizes the production of ValAsp5 trypsinogen in the two expression systems described in the preceding Examples.
N-terminal sequencing of recombinant modified trypsinogen:
In order to confirm that the protein expressed and isolated following induction was modified trypsinogen, N-terminal sequencing was carried out on purified trypsinogen analog expressed from two different constructs. The 25 kDa and 30 kDa bands observed on SDS gels were sequenced, which confirmed that both proteins were trypsinogen. Both the 30 kDa and the 25 kDa proteins were found to have an N-terminus corresponding to modified trypsinogen.
The trypsinogen produced by Pichia pastoris is secreted into the culture medium. Both the alpha factor and the HSA signal sequence are correctly processed during secretion, but as mentioned above the GluAlaGluAla sequence inserted between the signal sequence and the trypsinogen is only partially removed.
Analysis of the amino acid sequence of trypsinogen analogs showed the presence of a potential glycosylation site on Asn 169, which explains the presence of the 30 kDa species.
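A scan for potential N-linked glycosylation sites of the kind referred to above can be sketched as follows. The Asn-X-Ser/Thr rule (X being any residue except Pro) is the general consensus motif and is an assumption here, since the text does not state which criterion was applied; the test sequence is a toy string, not SEQ ID NO:2.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr. The lookahead
# keeps matches non-consuming so that overlapping motifs are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def potential_n_glycosylation_sites(protein):
    """Return 1-based positions of Asn residues in N-X-S/T motifs."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


toy_sequence = "AANGSAANPTAANAT"  # hypothetical sequence for illustration
print(potential_n_glycosylation_sites(toy_sequence))
```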
Purified trypsinogen analog was treated with dDAP and trypsin activity followed with time. Trypsinogen purified from Pichia culture supernatant was activatable into trypsin with dDAP. In the absence of dDAP, no trypsin activity was detected. Following incubation with dDAP, the trypsin activity increases and reaches a plateau once all activatable trypsinogen has been digested with dDAP. Animal sourced trypsinogen purchased from Sigma Chemical Company was also activated into trypsin by dDAP but the trypsin activity started to decrease after a few hours. The Sigma trypsin produced by dDAP activation at pH 3.0 was much less stable than the trypsin produced from trypsinogen analog expressed in Pichia.
N-terminal sequencing was done on the trypsin produced following dDAP digestion and the N-terminus was found to be identical to wild type bovine trypsin. The dDAP removes only amino acids up to Ile 11 (SEQ ID NO:2) and does not digest further.
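The stepwise removal of the leader in two-residue units can be illustrated with a short bookkeeping sketch. It does not predict where dDAP stops: the leader length is supplied as an input, matching the observation that digestion ends at Ile 11. The precursor string joins the Glu-Ala-Glu-Ala and Val(Asp)5 segments to the first residues encoded downstream of the leader in the Example 1 oligonucleotide; the function name is hypothetical.

```python
def remove_dipeptides(protein, leader_length):
    """Split off an even-length leader as successive N-terminal dipeptides."""
    if leader_length % 2 != 0:
        raise ValueError("a dipeptidyl aminopeptidase removes residues in pairs")
    removed = [protein[i:i + 2] for i in range(0, leader_length, 2)]
    return removed, protein[leader_length:]


# Glu-Ala-Glu-Ala + Val(Asp)5 leader (10 residues), then the start of trypsin.
precursor = "EAEAVDDDDD" + "IVGGYTC"
dipeptides, product = remove_dipeptides(precursor, leader_length=10)
print(dipeptides, product)
```

With a ten-residue leader, five dipeptides are released and the product begins at the residue that was at position 11.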
Example 5: Product Characterization
This example sets out methods useful in assessing the quality of the inventive products, especially the essentially chymotrypsin-free products.
Purity
The purity of r-trypsin is measured by SDS-PAGE under non-reducing conditions, followed by densitometric analysis of the bands. The potential impurities that can be measured are small fragments of trypsin or yeast polypeptides. The method is designed to measure contamination of the r-trypsin by yeast proteins from the fermentation. The method also detects trypsin, two chain trypsin, and two other proteolysis products of trypsin. These variants of trypsin are counted as product in the calculation of purity. All bands included as trypsin product have been identified as bovine trypsin sequences by direct sequence analysis of bands eluted from gels of a control sample of bovine trypsin. Bands found at all other molecular weights are considered impurities.
Specific Activity
The activity of trypsin can be measured by following the degradation of a synthetic substrate (TAME or Tosyl-Arg-Methyl-Ester) as measured by a change in absorbance at 247 nm over time. This method is essentially as described by Hummel, Can. J. Biochem. Physiol. 37: 1393 (1959). The specific activity (expressed in Units/mg) is calculated as the ratio between the activity (expressed in Units per ml) and the protein concentration (expressed in mg/ml). Generally, high quality trypsin will have a specific activity of >190U/mg. Preferably, the specific activity should exceed 210U/mg, with superior preparations exceeding 220 U/mg.
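The specific activity calculation and the quality thresholds stated above can be sketched as follows. The function names, grade labels and example values are hypothetical, and the thresholds are applied as strict cutoffs for illustration.

```python
def specific_activity(activity_units_per_ml, protein_mg_per_ml):
    """Specific activity in Units/mg = (Units/ml) / (mg/ml)."""
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return activity_units_per_ml / protein_mg_per_ml


def grade(units_per_mg):
    """Map a specific activity onto the thresholds given in the text."""
    if units_per_mg > 220:
        return "superior"
    if units_per_mg > 210:
        return "preferred"
    if units_per_mg > 190:
        return "high quality"
    return "below 190 U/mg"


# Hypothetical measurement: 115 U/ml at 0.5 mg/ml protein.
value = specific_activity(115.0, 0.5)
print(value, grade(value))
```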
Specificity
The specificity of a lot of r-trypsin is determined by performing a complete tryptic digest of a reference standard of glandular glucagon at room temperature using a ratio of 1 mg of trypsin per 100 mg of substrate over a period of 2 hours, as described below. This exposure-ratio combination is equivalent to the reaction conditions in most processes using trypsin. The reaction is monitored by reversed-phase HPLC, as described below. The results are reported as the ratio of the sum of the area for the tryptic fragments obtained with a sample versus that obtained with a standard (% specificity). This assay was designed to measure any degradation of the substrate resulting from contaminating proteases (particularly chymotrypsin) which could have co-purified with r-trypsin throughout the purification process.
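The percent specificity described above is a ratio of summed tryptic fragment areas. A minimal sketch, with hypothetical peak areas and a hypothetical function name:

```python
def percent_specificity(sample_tryptic_areas, standard_tryptic_areas):
    """100 * (sum of tryptic fragment areas, sample) / (same sum, standard)."""
    standard_total = sum(standard_tryptic_areas)
    if standard_total <= 0:
        raise ValueError("standard peak areas must sum to a positive value")
    return 100.0 * sum(sample_tryptic_areas) / standard_total


# Hypothetical integrated areas for the tryptic fragments (arbitrary units).
standard = [300.0, 400.0, 500.0, 100.0]
sample = [297.0, 396.0, 495.0, 99.0]
print(percent_specificity(sample, standard))
```

A value below 100 would indicate that part of the substrate was converted to products other than the expected tryptic fragments.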
Glucagon contains several chymotryptic sites and the assay can detect very low levels of chymotryptic contamination of trypsin (>0.01% by comparing the area of chymotryptic peaks to the area of tryptic peaks).
Example 6: HPLC Method for Detecting Trypsin versus Chymotrypsin
This assay is useful in measuring the purity of trypsin and simultaneously measuring chymotryptic activity.
Equipment
1) A suitable HPLC system capable of gradient elution, equipped with a UV detector and column heater, and chilled autosampler capable of keeping samples between 4 and 9°C.
2) Beckman DU70 Spectrophotometer or equivalent
3) Zorbax SB-C18, 0.46 x 15 cm, 80 Å, 5 micron packing (cat. # 883975-902)
4) Pipets able to dispense accurately from 10 to 5000μL
5) Stir plate
6) pH Meter
7) Vortex
8) Analytical Balance
9) Polypropylene microfuge tubes
10) Polypropylene 15 mL tubes or equivalent
11) 25°C water bath
12) 0.45 micron mobile phase filtration assembly
Materials
1) Glandular Glucagon Reference Standard 3 mg
2) Sigma Bovine Trypsin (bTrp) (Cat. No. T 8003)
3) 0.001 M HCl
4) 0.05 M HOAc
5) 50 mM Borate Buffer, pH 8.5 (see materials 12 and 13)
6) 1 M CaCl2
7) 5 N HCl
8) Needle Wash Solution - 50% ACN in Milli-Q water
9) Trifluoroacetic acid
10) Acetonitrile
11) Phosphoric Acid
12) Boric Acid
13) Concentrated NaOH
14) Ice
15) Milli-Q water or equivalent
Sample Preparation
Glucagon Substrate Preparation: One vial of glucagon reference standard is needed to assay two trypsin samples in triplicate. To each vial of glucagon reference standard, add 0.5 mL of 0.001 M HCl. Mix gently, and transfer the solution to a 15 mL polypropylene tube. Add 5 mL of the 50 mM borate buffer. Add 28 μL of the 1 M CaCl2 stock solution and mix. Measure the pH and adjust to 8.0 +/- 0.1 with 1 N NaOH if necessary. Hold the solution on ice until needed. The same substrate solution must be used for analysis of both the bovine trypsin (bTrp) control and the recombinant trypsin (rTrp) samples.
Glucagon Standard Preparation: Aliquot 250 μL of the glucagon solution prepared in step a) into a 1.5 mL microfuge tube. Add 750 μL of the 50 mM borate buffer and mix. Add 50 μL of the 5 N HCl and mix. Hold on ice until needed.
bTrp Standard Preparation: Weigh out approximately 1 mg of the Sigma bovine trypsin. Dissolve in 1 mL of 0.05 M HOAc. Determine the concentration of the bTrp solution by measuring the A280 on the Spectrophotometer. See Data Analysis section for calculations. Dilute the solution to 0.5 mg/mL with 0.05 M HOAc. Hold on ice until needed.
rTrp Sample Preparation: Determine the concentration of the rTrp sample by measuring the A280 on the Spectrophotometer. See Data Analysis for calculations. Dilute the solution to 0.5 mg/mL with 0.05 M HOAc. Hold on ice until needed.
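The dilution of the bTrp or rTrp solution to 0.5 mg/mL follows from conservation of mass (C1 x V1 = C2 x V2). A minimal sketch, in which the function name and the example stock concentration are hypothetical:

```python
def diluent_volume_ml(stock_mg_per_ml, stock_volume_ml, target_mg_per_ml=0.5):
    """Volume of 0.05 M HOAc to add so the solution reaches the target concentration."""
    if stock_mg_per_ml < target_mg_per_ml:
        raise ValueError("stock is already more dilute than the target")
    final_volume_ml = stock_mg_per_ml * stock_volume_ml / target_mg_per_ml
    return final_volume_ml - stock_volume_ml


# Hypothetical example: 1.0 mL of a stock measured at 0.92 mg/mL.
print(round(diluent_volume_ml(0.92, 1.0), 3))
```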
Enzyme reaction: For each sample, transfer a 1 mL aliquot of the glucagon reference standard substrate solution into a 1.5 mL Eppendorf tube. Add 10 μL of the 0.5 mg/mL bTrp standard solution or rTrp sample solution and vortex. Place each tube in a 25°C water bath and incubate for 2 hours. After 2 hours, quench the reaction by adding 50 μL of 5 N HCl. Samples must be held between 4 and 9°C prior to analysis or precipitation will occur. Samples should be analyzed within 12 hours after quenching. The bTrp standard must be analyzed using the same glucagon substrate solution and in the same HPLC sequence as the rTrp samples.
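The substrate-to-enzyme weight ratio implied by these volumes can be checked against the roughly 100:1 ratio stated in Example 5. The sketch below assumes that one vial holds the 3 mg of glucagon listed under Materials and ignores any volume added during pH adjustment; the function and parameter names are hypothetical.

```python
def substrate_to_enzyme_ratio(glucagon_mg=3.0, hcl_ml=0.5, borate_ml=5.0,
                              cacl2_ml=0.028, aliquot_ml=1.0,
                              enzyme_mg_per_ml=0.5, enzyme_ul=10.0):
    """Weight ratio of glucagon to trypsin in one reaction tube."""
    substrate_volume_ml = hcl_ml + borate_ml + cacl2_ml
    substrate_mg = glucagon_mg / substrate_volume_ml * aliquot_ml
    enzyme_mg = enzyme_mg_per_ml * enzyme_ul / 1000.0
    return substrate_mg / enzyme_mg


print(round(substrate_to_enzyme_ratio(), 1))
```

Under these assumptions each tube holds about 0.54 mg of glucagon and 0.005 mg of trypsin, i.e. close to 1 mg of trypsin per 100 mg of substrate.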
HPLC Conditions
1) Mobile Phase A - 0.1% phosphoric acid and 0.025% TFA
2) Mobile Phase B - 0.1% phosphoric acid and 0.025% TFA in ACN
3) Column: Zorbax C18, 5 μm, 80 Å, 15 cm x 0.46 cm
4) Injection Volume: 20 μL
5) Flow Rate: 1 mL/min
6) Detector: UV at 214 nm
7) Autosampler Temperature: 8°C
8) Column Oven Temperature: 60°C +/- 2°C
9) Gradient set as follows:
10) Needle Wash: 1000 μL after each injection. See Materials #8, above, for wash solution makeup.
11) Integrate all peaks from 250 to 1150 seconds
A typical HPLC run of trypsin-digested material yields 4 tryptic peaks at retention time of about 6.1 (3), 9.1 (2), 17.4 (1) and 18.7 (4) minutes (parentheticals indicate the ranking of the peaks in order of area, largest to smallest). The same protocol used with chymotrypsin yielded 5 peaks having retention time of about 8.0 (3), 8.1 (2), 8.7 (5) , 13.5 (1) and 13.6 (4) minutes (parentheticals indicate the ranking of the peaks in order of area, largest to smallest). Thus, all chymotryptic and tryptic peaks resulting from glucagon digestion are clearly resolvable.
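Assignment of observed peaks to the tryptic or chymotryptic set can be sketched by matching retention times against the values listed above. The matching tolerance of 0.2 minutes is an assumption chosen for illustration, and the function names are hypothetical; the integration window is the 250 to 1150 second range given under HPLC Conditions.

```python
TRYPTIC_RT_MIN = (6.1, 9.1, 17.4, 18.7)
CHYMOTRYPTIC_RT_MIN = (8.0, 8.1, 8.7, 13.5, 13.6)


def in_integration_window(rt_min, start_s=250.0, end_s=1150.0):
    """True if the retention time falls inside the integrated range."""
    return start_s <= rt_min * 60.0 <= end_s


def classify_peak(rt_min, tolerance_min=0.2):
    """Label a peak by its nearest reference retention time, if close enough."""
    references = [(ref, "tryptic") for ref in TRYPTIC_RT_MIN]
    references += [(ref, "chymotryptic") for ref in CHYMOTRYPTIC_RT_MIN]
    distance, label = min((abs(rt_min - ref), name) for ref, name in references)
    return label if distance <= tolerance_min else "unassigned"


for rt in (6.0, 8.05, 11.0, 13.5, 17.5):
    print(rt, classify_peak(rt), in_integration_window(rt))
```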
Standard trypsin samples were spiked with 1%, 0.1% and 0.01% (by weight) of chymotrypsin (Sigma) and the reaction resolved by the same HPLC method. Peak resolution was obtained at all spiking levels, demonstrating that the detection limit of the assay is well below 0.01% by weight.
By injecting more of the reaction mixture, the sensitivity may be enhanced even further. The inherent limit on sensitivity is interference with the chymotryptic peaks by the tryptic peaks. Sensitivity may be increased, it is believed, to levels of less than 0.005%, 0.001% and even 0.0005%, merely by increasing the amount of material injected into the HPLC system.

Claims

WE CLAIM:
1. An isolated trypsinogen analog comprising: a) a protein having trypsin activity; and b) a leader sequence having at least 2 amino acids wherein the amino acids are any amino acid except Lys or Arg.
2. The isolated trypsinogen analog of Claim 1, wherein the trypsinogen analog can be activated by cleavage with an aminopeptidase.
3. The isolated trypsinogen analog of Claim 2, wherein the aminopeptidase is a diaminopeptidase.
4. The trypsinogen analog according to Claim 1, wherein the protein having trypsin activity is bovine trypsin.
5. The trypsinogen analog according to Claim 1, wherein the leader sequence of said trypsinogen is selected from the group consisting of: a) from 2 to 20 non-basic amino acid residues; b) from 2 to 20 amino acid residues wherein the amino acid residues can be any residue except Lys or Arg; c) Val-Asp-Asp-Asp-Asp-N, wherein N is any amino acid residue except Lys, Arg, or Pro; d) amino acid residues 5 through 10 of SEQ ID NO:2; e) from 2 to 20 amino acid residues wherein the residues can be any hydrophilic amino acid residue that is not Lys or Arg; f) amino acid residues 5 through 8 of SEQ ID NO:2; and g) from 2 to 20 amino acid residues, wherein the number of residues is an even number of residues and the residues are any non-basic amino acid except proline.
6. The trypsinogen analog according to Claim 5 wherein the amino acid sequence is SEQ ID NO:2.
7. The trypsinogen analog according to Claim 6, wherein the amino acid sequence is amino acids 5 through 231 of SEQ ID NO:2.
8. The trypsinogen analog according to Claim 5 further comprising: a secretion signal sequence of amino acids operably linked to the amino-terminal end of the leader sequence wherein said signal sequence allows the trypsinogen to be secreted into the host cell growth medium.
9. An isolated nucleic acid compound encoding the trypsinogen analog of Claim 1.
10. An isolated nucleic acid compound encoding the trypsinogen analog of Claim 5.
11. An isolated nucleic acid compound encoding a trypsinogen analog, wherein the nucleic acid sequence is SEQ ID NO:1.
12. An isolated nucleic acid compound encoding the trypsinogen of Claim 8.
13. An isolated nucleic acid compound encoding trypsin wherein the nucleic acid consists essentially of bases 4 through 699 of SEQ ID NO:3.
14. An expression vector comprising the isolated nucleotide compound of Claim 13 operably linked to an expression control sequence.
15. A host cell transformed with a vector of Claim 14.
16. The host cell of Claim 15, wherein said host cell is a yeast.
17. The host cell of Claim 16, wherein said host cell is Pichia pastoris.
18. A method for producing recombinant trypsin comprising the steps of: a) expressing the trypsinogen of Claim 1 in a host cell; b) isolating the expressed trypsinogen; and c) activating the isolated trypsinogen with an aminopeptidase.
19. The method as claimed in Claim 18 wherein the aminopeptidase in the activation step is a diaminopeptidase.
20. The method as claimed in Claim 18 further comprising the step of secreting the trypsinogen into the growth medium.
21. A method for producing recombinant trypsin comprising the steps of: a) expressing a trypsinogen analog comprising SEQ ID NO:2 in Pichia pastoris; b) isolating the trypsinogen in a buffer; c) incubating the isolated trypsinogen with dipeptidylaminopeptidase in a buffer with a pH from about 2.0 to about 6.5 such that the trypsinogen is converted to active trypsin.
22. A recombinantly produced trypsin consisting essentially of the amino acid sequence of SEQ ID NO:4.
23. An isolated trypsinogen analog represented by X-AA-Y, wherein: a) Y is a protein having trypsin activity; b) AA is any amino acid except Lys or Arg; and c) X-AA is a leader sequence having at least 2 amino acids.
24. An isolated nucleic acid comprising the sequence of SEQ ID NO: 1 or nucleic acids sharing at least about 65 percent identity therewith.
25. An isolated nucleic acid comprising the sequence of SEQ ID NO: 3 or nucleic acids sharing at least about 65 percent identity therewith.
26. Essentially chymotrypsin-free trypsin.
27. The trypsin of claim 26 that contains less than about 0.01% chymotrypsin.
EP99951445A 1998-09-21 1999-09-15 Production of soluble recombinant trypsinogen analogs Withdrawn EP1141263A4 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US10121398P 1998-09-21 1998-09-21
US101213P 1998-09-21
PCT/US1999/021047 WO2000017332A1 (en) 1998-09-21 1999-09-15 Production of soluble recombinant trypsinogen analogs

Publications (2)

Publication Number Publication Date
EP1141263A1 (en) 2001-10-10
EP1141263A4 (en) 2002-08-28

Family

ID=22283537

Family Applications (1)

Application Number Title Priority Date Filing Date
EP99951445A Withdrawn EP1141263A4 (en) 1998-09-21 1999-09-15 Production of soluble recombinant trypsinogen analogs

Country Status (6)

Country Link
EP (1) EP1141263A4 (en)
AR (1) AR021818A1 (en)
AU (1) AU6388499A (en)
CA (1) CA2343966A1 (en)
PE (1) PE20001068A1 (en)
WO (1) WO2000017332A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU7053500A (en) * 1999-09-15 2001-04-17 Eli Lilly And Company Chymotrypsin-free trypsin
US7276605B2 (en) 2001-02-01 2007-10-02 Roche Diagnostics Operations, Inc. Method for producing recombinant trypsin
WO2004020612A1 (en) 2002-08-30 2004-03-11 Novozymes Biotech, Inc. Methods for producing mammalian trypsins
CN104312933B (en) * 2014-10-17 2017-03-29 江南大学 A kind of method that optimization signal peptide improves the expression of trypsin exocytosiss
JP2018537976A (en) 2015-11-25 2018-12-27 アカデミシュ・ジークンホイス・ライデン Recombinant serine protease
CN116445462A (en) * 2023-04-20 2023-07-18 西安麦博泰克生物科技有限公司 Purification preparation method of recombinant porcine pepsin and recombinant porcine pepsin

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0597681A1 (en) * 1992-11-13 1994-05-18 Eli Lilly And Company Expression vectors for the bovine trypsin and trypsinogen and host cells transformed therewith

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5763215A (en) * 1984-08-16 1998-06-09 Bio-Technology General Corporation Method of removing N-terminal amino acid residues from eucaryotic polypeptide analogs and polypeptides produced thereby
TW257792B (en) * 1992-10-01 1995-09-21 Lilly Co Eli
US5773248A (en) * 1995-11-13 1998-06-30 Uab Research Foundation Nucleic acid encoding a human α3(IX) collagen protein and method of producing the protein recombinantly
JPH09294583A (en) * 1996-03-08 1997-11-18 Ajinomoto Co Inc Aminopeptidase gx and hydrolysis of protein using the same

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0597681A1 (en) * 1992-11-13 1994-05-18 Eli Lilly And Company Expression vectors for the bovine trypsin and trypsinogen and host cells transformed therewith

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of WO0017332A1 *

Also Published As

Publication number Publication date
CA2343966A1 (en) 2000-03-30
EP1141263A1 (en) 2001-10-10
AU6388499A (en) 2000-04-10
AR021818A1 (en) 2002-08-07
PE20001068A1 (en) 2000-10-18
WO2000017332A1 (en) 2000-03-30

Similar Documents

Publication Publication Date Title
CA2153254C (en) Cloning of enterokinase and method of use
Gakh et al. Mitochondrial processing peptidases
JP5027812B2 (en) Cleavage of insulin precursors by trypsin mutants.
EP1873251A1 (en) Expression vector(s) for enhanced expression of a protein of interest in eukaryotic or prokaryotic host cells
KR102049900B1 (en) Modified factor X polypeptides and uses thereof
Choi et al. Recombinant enterokinase light chain with affinity tag: expression from Saccharomyces cerevisiae and its utilities in fusion protein technology
CA2437342A1 (en) Method for producing recombinant trypsin
EP2507258B1 (en) Novel peptidyl alpha-hydroxyglycine alpha-amidating lyases
US6746859B1 (en) Cloning of enterokinase and method of use
US5989890A (en) Compositions and methods for PACE 4 and 4.1 gene and polypeptides in cells
WO2000017332A1 (en) Production of soluble recombinant trypsinogen analogs
JP2001514003A (en) Autocatalytically activatable zymogen precursors of proteases and their use
EP1618135B1 (en) Cleavage of fusion proteins using granzyme b protease
WO2001019970A2 (en) Chymotrypsin-free trypsin
Smeekens et al. The biosynthesis and processing of neuroendocrine peptides: identification of proprotein convertases involved in intravesicular processing
Ledgerwood et al. Endoproteolytic processing of recombinant proalbumin variants by the yeast Kex2 protease
JP2003529330A (en) Method for producing active serine protease and inactive derivative
Hay et al. Enhanced expression of a furin-cleavable proinsulin
KR100714116B1 (en) Production of insulin with pancreatic procarboxypeptidase B
JPH0638771A (en) Expression of human protein disulfide isomerase gene and production of polypeptide by co-expression with the gene
Pozzuolo et al. Efficient bacterial expression of fusion proteins and their selective processing by a recombinant Kex-1 protease
JP3172968B2 (en) Multimeric forms of IL-16, methods for producing them and their use
EP1326890B1 (en) Shrimp alkaline phosphatase
Baker et al. Cloning, expression, purification, and activity of dog (Canis familiaris) and monkey (Saimiri boliviensis) cathepsin S
WO2001051624A2 (en) Carboxypeptidase b free of animal products and contaminating enyzme activity

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20010711

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

A4 Supplementary search report drawn up and despatched

Effective date: 20020715

AK Designated contracting states

Kind code of ref document: A4

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

17Q First examination report despatched

Effective date: 20021021

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20030301