MXPA99008622A - Shuffling of heterologous DNA sequences
- Publication number: MXPA99008622A
- Authority: MX (Mexico)
Abstract
The present invention relates to a new method of shuffling especially heterologous polynucleotide sequences, to screening and/or selection of new recombinant proteins resulting therefrom having a desired biological activity, and especially to production and identification of novel proteases exhibiting desired properties. The method comprises the following steps:
i) identification of at least one conserved region between the heterologous sequences of interest;
ii) generating fragments of each of the heterologous sequences of interest, wherein said fragments comprise the conserved region(s), in a preferred embodiment due to the use of parts of the region(s) as primers; and
iii) shuffling/recombining said fragments using the conserved region(s) as (a) homologous linking point(s).
Description
SHUFFLING OF HETEROLOGOUS DNA SEQUENCES
FIELD OF THE INVENTION
The present invention relates to a novel method for intermixing (shuffling) polynucleotide sequences, particularly heterologous ones, to screening and/or selecting new recombinant proteins resulting therefrom that have a desired biological activity, and especially to the production and identification of novel proteases exhibiting desired properties.
BACKGROUND OF THE INVENTION
It is generally found that a protein performing a certain bioactivity exhibits a certain variation between genera, and differences may exist even between members of the same species. This variation is even more pronounced at the genomic level.
This natural genetic diversity among genes encoding proteins that have basically the same bioactivity has been generated in nature over billions of years and reflects a natural optimization of the encoded proteins with respect to the environment of the organism in question.
However, it has generally been found that naturally occurring bioactive molecules are not optimized for the various uses to which they are put by mankind, especially when used for industrial purposes.
It has therefore been of interest to industry to identify bioactive proteins that exhibit optimal properties with respect to their intended use.
This has been done for many years by screening natural sources or by using mutagenesis. For example, within the technical field of enzymes for use in, e.g., detergents, the washing and/or dishwashing performance of, e.g., naturally occurring proteases, lipases, amylases and cellulases has been significantly improved by in vitro modification of the enzymes.
In most cases, these improvements have been obtained through site-directed mutagenesis resulting in the substitution, deletion or insertion of specific amino acid residues that have been chosen either on the basis of their type or on the basis of their location in the secondary or tertiary structure of the mature enzyme (see, for example, US Pat. No. 4,518,584).
Prior Art
Numerous methods for creating genetic diversity, such as random or site-directed mutagenesis, have been proposed and described in the scientific literature as well as in patent applications. For further details, reference is made to the related art section of WO 95/22625, where a review is provided.
A method for intermixing homologous DNA sequences has been described by Stemmer (Stemmer, (1994), Proc. Natl. Acad. Sci. USA, Vol. 91, 10747-10751; Stemmer, (1994), Nature, Vol. 370, 389-391). The method concerns the intermixing of homologous DNA sequences using in vitro PCR techniques. Positive recombinant genes containing intermixed DNA sequences are selected from a DNA library based on the improved function of the expressed proteins.
WO 95/22625 is believed to be the most pertinent reference in relation to the present invention in its "gene shuffling" aspect. WO 95/22625 describes a method for intermixing homologous DNA sequences. An important step in that method is to cut the double-stranded homologous template polynucleotide into random fragments of a desired size, followed by reassembly of the fragments into full-length genes.
An inherent disadvantage of the method of WO 95/22625 is, however, that the diversity generated through such a method is limited due to the use of homologous gene sequences (as defined in WO 95/22625).
Another disadvantage of the method of WO 95/22625 lies in the production of random fragments by cutting the double-stranded template polynucleotide.
A further reference of interest is WO 95/17413, which describes a method for intermixing genes or DNA by recombining DNA sequences, either by recombination of synthesized double-stranded fragments or by recombination of PCR-generated sequences. According to the method described in WO 95/17413, recombination has to be carried out between DNA sequences with sufficient sequence homology to allow hybridization of the different sequences to be recombined.
WO 95/17413 therefore also suffers from the disadvantage that the diversity generated is relatively limited.
The present invention does not comprise any step involving the production of random fragments by cutting the double-stranded template polynucleotide, as described in WO 95/22625.
In addition, WO 95/22625 relates to the intermixing of homologous genes, while the present invention relates to the intermixing of heterologous genes.
BRIEF DESCRIPTION OF THE INVENTION:
The problem to be solved by the present invention is to avoid the limitation of intermixing only homologous DNA sequences, by providing a method for intermixing/recombining heterologous sequences of interest.
The solution is to use at least one "conserved sequence region", within which there is a sufficient degree of homology between the heterologous sequences to be intermixed, as a "linking point" between the heterologous sequences.
Accordingly, a first aspect of the invention relates to a method for intermixing heterologous sequences of interest comprising the following steps:
i) identification of at least one conserved region between the heterologous sequences of interest;
ii) generating fragments of each of the heterologous sequences of interest, wherein the fragments comprise the conserved region(s); and
iii) intermixing/recombining the fragments using the conserved region(s) as (a) homologous linking point(s).
In a second aspect, the invention relates to a method for producing an intermixed protein having a desired biological activity comprising, in addition to the steps of the first aspect, the following steps:
iv) expressing the numerous different recombinant proteins encoded by the numerous different intermixed sequences of step iii); and
v) screening or selecting the numerous different recombinant proteins from step iv) in an appropriate screening or selection system for one or more recombinant proteins having the desired activity.
The term "conserved region" denotes a sequence region (preferably of at least 10 bp) in which there is a relatively high sequence identity between the heterologous sequences.
For the conserved region to be usable as the "linking point" between the heterologous sequences, the sequence identity between the heterologous sequences within the conserved region must be sufficiently high to allow hybridization of the heterologous sequences, with the conserved region serving as the hybridization site ("linking point").
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1: Illustrates the general concept of the invention, wherein
a) the black boxes define the mutual (common) conserved regions of the sequences of interest,
b) the PCR primers named "a", "a'", "b", "b'", etc. are primers directed to the conserved regions; the primer pairs ("a'" and "b"), ("b'" and "c"), etc. have a sequence overlap of preferably at least 10 bp, and
c) primers "z" and "z'" are primers directed to the parts flanking (surrounding) the sequence region of the sequences of interest that are intermixed according to the method of the invention.
Fig. 2: Shows an alignment of protease (subtilase) DNA sequences. A number of conserved regions are present, such as the common partial sequences numbered 1-5.
Fig. 3: Shows an alignment of different lipases.
DEFINITIONS
Before discussing this invention in further detail, the following terms will be defined.
"Intermixing" (shuffling): The term "intermixing" means recombination of nucleotide sequences between two or more DNA sequences of interest, resulting in output DNA sequences (i.e., DNA sequences that have been subjected to an intermixing cycle) having a number of nucleotides exchanged in comparison to the input DNA sequences (i.e., the starting DNA sequences of interest).
Alternatively, "intermixing" may be termed "recombination".
"Homology of DNA sequences": In the present context, the degree of DNA sequence homology is determined as the degree of identity between two sequences, indicating a derivation of the first sequence from the second. The homology may suitably be determined by means of computer programs known in the art, such as GAP provided in the GCG program package (Program Manual for the Wisconsin Package, Version 8, August 1994, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin, USA 53711) (Needleman, S.B. and Wunsch, C.D., (1970), Journal of Molecular Biology, 48, 443-453).
"Homologous": The term "homologous" means that one single-stranded nucleic acid sequence may hybridize to a complementary single-stranded nucleic acid sequence. The degree of hybridization may depend on a number of factors, including the amount of identity between the sequences and the hybridization conditions, such as temperature and salt concentration, as discussed below (vide infra).
Using the GAP computer program (vide supra) with the following settings for DNA sequence comparison: GAP creation penalty of 5.0 and GAP extension penalty of 0.3, it is in the present context believed that two DNA sequences will be able to hybridize (using medium stringency hybridization conditions as defined below) if they mutually exhibit a degree of identity of at least 50%, more preferably at least 60%, more preferably at least 70%, more preferably at least 80%, more preferably at least 85%, and even more preferably at least 90%.
"Heterologous": Two DNA sequences are said to be heterologous if one of them comprises a partial sequence of at least 40 bp that does not exhibit a degree of identity greater than 50%, more preferably greater than 70%, more preferably greater than 80%, more preferably greater than 85%, more preferably greater than 90%, and even more preferably greater than 95%, to any partial sequence in the other. More preferably, the first partial sequence is at least 60 bp, more preferably at least 80 bp, even more preferably at least 120 bp, and most preferably at least 500 bp.
"Hybridization": Appropriate experimental conditions for determining whether two or more DNA sequences of interest hybridize or not, are defined herein as hybridization to medium stringency as described in detail below.
A low stringency hybridization protocol between two DNA sequences of interest involves pre-rinsing a filter containing the DNA fragments to hybridize in 5 x SSC (sodium chloride / sodium citrate, Sambrook et al., 1989 ) for 10 min, and prehybridization of the filter in a solution of 5 x SSC, 5 x Denhardt's solution (Sambrook et al., 1989), 0.5% SDS and 100 μg / ml denatured sonicated salmon sperm DNA (Sambrook) et al., 1989), followed by hybridization in the same solution containing a concentration of 10 ng / ml of a randomly primed probe (DNA sequence) (Feinberg, AP and Vogeístein, B. (1983) Anal. Biochem. : 6-13), labeled in 32P-dCTP (specific activity> 1 x 109 cpm / μg) for 12 hours at approx. 45 ° C. The filter is then washed twice for 30 minutes in 2 x SSC, 0.5% SDS at at least 55 ° C, more preferably at least 60 ° C, and even more preferably at least 65 ° C (high astringency).
The molecules to which the oligonucleotide probe hybridizes under these conditions of astringency are detected using an X-ray film.
"Alignment": The term "alignment" used herein in connection with an alignment of a number of DNA and / or amino acid sequences, means that the sequences of interest are aligned to identify the mutual / common sequences of homology / identity among the sequences of interest. This procedure is used to identify the common "conserved regions" (vi de i n fra) among the sequences of interest. An alignment could be appropriately determined by means of computer programs known in the art, such as PILEUP provided in the GCG program package (Program Manual for the Wisconsin Package, version 8, August 1994, Genetics, Computer Group, 575 Science Drive , Madison, Wisconsin, USA 53711) (Neddleman, SB and Wunsch, CD, (1970), Journal of Molecular Biology, 48, 443-453).
"Conserved regions": The term "conserved region" is used herein in connection with a "conserved region" between the DNA sequences and / or amino acids of interest, means a region of the mutual sequence, common to two or more sequences of interest, where there is a relatively high degree of sequence identity between two or more of the heterologous sequences of interest. In the ÉB < present context, a conserved region is preferably at least 10 base pairs (bp), more preferably at least 20 bp and even more preferably at least 30 bp.
Using the computer program GAP (vi of s upra) with 10 the following settings for the comparison of the DNA sequence: penalty of creation GAP of 5.0 and penalty of extension GAP of 0.3, the degree of identity of the sequence of DNA within the conserved region, between two or more of the heterologous sequences of interest, is preferably at least 80%, more preferably at least 85%, more preferably at least 90%, and even more preferably at least 95%.
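As an illustration of the length and identity thresholds for a conserved region, the following Python sketch scans sequences that are assumed to be already aligned (for example, exported from an alignment program with "-" as the gap character). It is a simplification, not the GAP-based measure named in the text: identity is taken as the fraction of alignment columns in which all sequences carry the same base, and the function name `conserved_regions` is hypothetical.

```python
from typing import List, Tuple


def conserved_regions(aligned: List[str], min_len: int = 10,
                      min_identity: float = 0.80) -> List[Tuple[int, int]]:
    """Return merged half-open (start, end) column ranges in which every
    window of `min_len` columns reaches `min_identity`."""
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be pre-aligned to equal length")
    n = len(aligned[0])
    # A column counts as a match when all sequences share the same non-gap base.
    matches = [
        aligned[0][i] != "-" and all(s[i] == aligned[0][i] for s in aligned)
        for i in range(n)
    ]
    regions: List[Tuple[int, int]] = []
    for start in range(n - min_len + 1):
        window = matches[start:start + min_len]
        if sum(window) / min_len >= min_identity:
            end = start + min_len
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend the previous region
            else:
                regions.append((start, end))
    return regions


seq1 = "ACGTACGTAC" + "GGATCCGGATCC" + "TTTTTTTTTT"
seq2 = "TGCATGCATG" + "GGATCCGGATCC" + "AAAAAAAAAA"
print(conserved_regions([seq1, seq2], min_identity=1.0))  # [(10, 22)]
```

With the default 80% threshold, the reported range widens to include a few mismatched columns on either side, which is why a stricter threshold may be preferable when choosing primer sites.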
"Primer": The term "primer" used here,
Especially in connection with a PCR reaction, it is a primer (especially a "PCR primer") defined and constructed according to the general standard specification known in the art ("PCR A practical approach" IRL Press, (1991)). "A primer directed to a sequence": The term "a primer directed to a sequence" means that the primer (preferably to be used in a PCR reaction) is constructed to exhibit at least 80% of the degree of sequence identity to the part of the sequence of interest most preferably at least 90% of the degree of identity of the sequence to the part of the sequence of interest, wherein the primer is consequently "directed to".
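The identity requirement for "a primer directed to a sequence" can be checked with a short Python sketch. It assumes an ungapped, position-by-position comparison of the primer with an equally long target site, and that an antisense primer is compared after taking its reverse complement; the helper names `reverse_complement`, `percent_identity` and `is_directed_to` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def percent_identity(primer: str, site: str) -> float:
    """Percentage of identical positions between a primer and an equally long site."""
    if not primer or len(primer) != len(site):
        raise ValueError("primer and site must be non-empty and of equal length")
    matches = sum(p == s for p, s in zip(primer.upper(), site.upper()))
    return 100.0 * matches / len(primer)


def is_directed_to(primer: str, site: str, threshold: float = 80.0) -> bool:
    """True if the primer reaches the identity threshold (80%, preferably 90%)."""
    return percent_identity(primer, site) >= threshold


print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))  # 90.0
```

For an antisense primer, the sense-strand site would be compared against `reverse_complement(primer)` under the same assumption.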
"Reaction of extended-sequence extension PCR (SOE-PCR)": The term "SOE-PCR" is a standard PCR reaction protocol known in the art and in the present context is defined and developed according to the standard protocols defined in art ("PCR A practical approach" IRL Press, (1991)).
"Around": The term "around" used here in connection with the DNA sequences comprised in a PCR fragment, means the partial sequences of the plus end. of the PCR fragment, both at the 5 'and 3' ends of the PCR fragment.
"Subtilases": A serine protease is an enzyme that catalyzes the hydrolysis of peptide bonds and in which there is an essential serine residue at the active site (White, Handler and Smith, 1973, "Principles of Biochemistry", Fifth Edition, McGraw-Hill Book Company, NY, pp. 271-272).
Bacterial serine proteases have molecular weights in the range of 20,000 to 45,000 Daltons. They are inhibited by diisopropylfluorophosphate. They hydrolyze simple terminal esters and are similar in activity to eukaryotic chymotrypsin, which is also a serine protease. A narrower term, alkaline protease, covering a subgroup, reflects the high pH optimum of some of the serine proteases, from pH 9.0 to 11.0 (for a review, see Priest (1977) Bacteriological Rev. 41 711-753).
A subgroup of the serine proteases, tentatively designated subtilases, has been proposed by Siezen et al., Protein Engng. 4 (1991) 719-737. They are defined by homology analysis of more than 40 amino acid sequences of serine proteases previously referred to as subtilisin-like proteases.
DETAILED DESCRIPTION OF THE INVENTION
A method for intermixing the heterologous sequences of interest
In a preferred embodiment, the fragments generated in step ii) of the first aspect of the invention are generated by the use of PCR technology.
Accordingly, one embodiment of the invention relates to a method for intermixing heterologous DNA sequences of interest, according to the first aspect of the invention, which comprises the following steps:
i) identification of one or more conserved regions (hereinafter called "A", "B", "C", etc.) in two or more of the heterologous sequences;
ii) construction of at least two sets of PCR primers (each set comprising a sense and an antisense primer) for one or more of the conserved regions identified in i), wherein
in one set the sense primer (called "a" = sense primer) is directed to a sequence region 5' (sense strand) of the conserved region (e.g., conserved region "A"), and the antisense primer (called "a'" = antisense primer) is directed either to a sequence region 3' (sense strand) of the conserved region or to a sequence region at least partially within the conserved region, and
in the other set the sense primer (called "b" = sense primer) is directed either to a sequence region 5' (sense strand) of the conserved region or to a sequence region at least partially within the conserved region, and the antisense primer (called "b'" = antisense primer) is directed to a sequence region 3' (sense strand) of the conserved region (e.g., conserved region "A"), and
the two sequence regions defined by the regions between primers "a" and "a'" and between primers "b" and "b'" (both regions including the actual primer sequences) have a homologous sequence overlap of at least 10 base pairs (bp) within the conserved region;
iii) for one or more of the conserved regions of interest identified in step i), two PCR amplification reactions are performed with the heterologous DNA sequences of step i) as template, wherein one of the PCR reactions uses the 5' primer set identified in step ii) (e.g., called "a", "a'") and the second PCR reaction uses the 3' primer set identified in step ii) (e.g., called "b", "b'");
iv) isolation of the PCR fragments generated as described in step iii) for one or more of the conserved regions identified in step i);
v) pooling of two or more of the isolated PCR fragments of step iv) and performing a sequence overlap extension PCR reaction (SOE-PCR) using the isolated PCR fragments as templates; and
vi) isolation of the PCR fragment obtained in step v), wherein the isolated PCR fragment comprises numerous different intermixed sequences containing an intermixed mixture of the PCR fragments isolated in step iv), and wherein the intermixed sequences are characterized in that the partial DNA sequences originating from the homologous sequence overlaps in step ii) have at least 80% identity to one or more partial sequences in one or more of the original heterologous DNA sequences of step i).
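Steps ii) to vi) can be pictured with a small in silico sketch in Python. It is a toy model under stated assumptions, not the laboratory protocol: every parent is assumed to contain an exactly identical conserved region (the text only requires high identity), the "a"/"a'" fragment is taken as everything from the 5' end through the conserved region, the "b"/"b'" fragment as everything from the conserved region to the 3' end, and SOE-PCR is modeled as joining one fragment of each kind through the shared overlap. All function names are hypothetical.

```python
from itertools import product
from typing import List, Tuple


def split_at_conserved(seq: str, conserved: str) -> Tuple[str, str]:
    """Model the two PCR products: upstream and downstream fragments that
    both include the conserved region (the homologous overlap)."""
    idx = seq.find(conserved)
    if idx < 0:
        raise ValueError("conserved region not found in sequence")
    upstream = seq[: idx + len(conserved)]   # product of primers a / a'
    downstream = seq[idx:]                   # product of primers b / b'
    return upstream, downstream


def soe_join(upstream: str, downstream: str, overlap: str) -> str:
    """Model overlap extension: fuse two fragments through their shared overlap."""
    if not (upstream.endswith(overlap) and downstream.startswith(overlap)):
        raise ValueError("fragments do not share the expected overlap")
    return upstream + downstream[len(overlap):]


def shuffle_pool(parents: List[str], conserved: str) -> List[str]:
    """Pool all fragments and return every upstream/downstream combination."""
    ups, downs = zip(*(split_at_conserved(p, conserved) for p in parents))
    return sorted({soe_join(u, d, conserved) for u, d in product(ups, downs)})


conserved = "GATTACAGAT"  # 10 bp, the minimum overlap named in step ii)
parents = ["AAAA" + conserved + "CCCC", "GGGG" + conserved + "TTTT"]
for variant in shuffle_pool(parents, conserved):
    print(variant)
```

With two parents and one conserved region, the pool contains the two parental sequences plus two chimeras; additional conserved regions ("B", "C", etc.) would multiply the number of possible combinations.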
A method for producing one or more recombinant proteins that have a desired biological activity
In a second aspect, the invention relates to a method for producing an intermixed protein having a desired biological activity, comprising, in addition to steps i) to vi) immediately above, the following further steps:
vii) expressing the numerous different recombinant proteins encoded by the numerous different intermixed sequences of step vi); and
viii) screening or selecting the numerous different recombinant proteins of step vii) in an appropriate screening or selection system for one or more recombinant proteins having a desired activity.
Heterologous DNA sequences
The method of the present invention could be used to intermix basically all the heterologous DNA sequences of interest.
Preferably, heterologous DNA sequences encoding an enzymatic activity, such as amylase, lipase, cutinase, cellulase, oxidase, phytase or protease activity, are used for intermixing.
A further advantage of the present method is that it makes it possible to intermix heterologous sequences encoding different activities, e.g. different enzymatic activities.
The method of the invention is particularly suitable for intermixing heterologous DNA sequences encoding a protease activity, in particular a subtilase activity.
A number of subtilase DNA sequences are published in the art. A number of these subtilase DNA sequences are, in the present context, heterologous DNA sequences, and are in general believed to be mutually too heterologous to be intermixed by the intermixing methods currently known in the art (WO 95/17413, WO 95/22625). However, the method according to the invention allows such sequences to be intermixed. For further details, reference is made to a working example (vide infra).
In addition, the present invention is suitable for intermixing different lipase sequences. For further details, reference is made to a working example (vide infra).
The heterologous DNA sequences used as templates could in advance have been cloned into the appropriate vectors, such as a plasmid. Alternatively, a PCR reaction could be performed directly on the known microorganisms comprising the DNA sequence of interest, according to standard PCR protocols known in the art.
Identification of one or more conserved regions in heterologous sequences:
The identification of the conserved regions could be done by an alignment of the heterologous sequences using standard computer programs (vide supra).
Alternatively, the method could be performed on completely new sequences, where the relevant "conserved regions" are chosen as regions known in the art to be conserved for this particular class of proteins.
E.g., the method could be used to intermix completely unknown subtilase sequences, which are known to be highly conserved in, e.g., the regions around the active site amino acids. The PCR reaction could then be performed directly on the new unknown strands with primers directed to the conserved regions.
PCR primers
PCR primers are constructed according to standard descriptions in the art. Preferably, they are 10-75 base pairs (bp) in length.
Superposition of the homologous sequence
In step ii) of claim 3 of the invention, the two regions of the sequence defined by the regions between the primer group "a" and "a'" and "b" and "b'" (both regions including the actual primer sequences) have a homologous sequence overlap of at least 10 base pairs (bp) within the conserved region.
The homologous sequence overlap is more preferably at least 15 bp, more preferably at least 20 bp, and even more preferably at least 35 bp.
The homologous sequence overlaps in step ii) of claim 3 have at least 80% identity to one or more partial sequences in one or more original heterologous DNA sequences in step i) of the claim, more preferably at least 90% identity, and even more preferably at least 95% identity to one or more partial sequences in one or more original heterologous DNA sequences in step i) of the claim.
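The two thresholds stated for the overlap (length and identity) can be checked with a few lines of code. The following is only a minimal sketch in Python: it assumes the overlap and the corresponding partial sequence of a parent are already aligned without gaps and have equal length, and the function names and the example sequences are hypothetical, not taken from the examples of this document.

```python
def percent_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length, gap-free sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return matches / len(a)


def overlap_is_acceptable(overlap: str, parent_segment: str,
                          min_length: int = 10, min_identity: float = 0.80) -> bool:
    """Apply the two thresholds from the text: at least 10 bp and at least 80% identity."""
    return (len(overlap) >= min_length
            and percent_identity(overlap, parent_segment) >= min_identity)


# Hypothetical 20 bp overlap with two mismatches against a parent segment.
overlap = "GGAACTCATGTAGCAGGAAC"
parent = "GGAACGCATGTTGCAGGAAC"
print(percent_identity(overlap, parent))
print(overlap_is_acceptable(overlap, parent))
print(overlap_is_acceptable(overlap, parent, min_identity=0.95))
```

Raising `min_length` or `min_identity` corresponds to the more preferred embodiments (15, 20 or 35 bp; 90% or 95% identity).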
PCR reactions
Unless otherwise mentioned, the PCR reaction performed according to the invention is carried out according to standard protocols known in the art.
The term "isolation of the PCR fragment" is intended to cover an aliquot containing the PCR fragment. However, the PCR fragment is preferably isolated to a degree that removes the excess of primers, nucleotides, etc.
In addition, the fragments used for SOE-PCR in step v) of claim 3 could alternatively be generated by processes other than the PCR amplification processes described in step iii) of the claim. The appropriate fragments used for SOE-PCR in step v) could, e.g., be generated by cutting out the appropriate fragments by restriction enzyme digestion at appropriate sites (e.g. restriction sites located on each side of a conserved region identified in step i)). Such alternative processes for generating the appropriate fragments for use in the SOE-PCR in step v) are considered within the scope of the invention.
In one embodiment of the invention, the DNA fragment by PCR is prepared under conditions that result in a low, medium or high random mutagenesis frequency.
To obtain a low mutagenesis frequency, the DNA sequences (comprising the DNA fragments) could be prepared by a standard PCR amplification method (US 4,683,202 or Saiki et al., (1988), Science 239, 487-491).
A medium or high mutagenesis frequency could be obtained by performing the PCR amplification under conditions that increase the misincorporation of nucleotides, for example as described by Deshler, (1992), GATA 9 (4), 103-106; Leung et al., (1989), Technique, Vol. 1, 11-15.
Final intermixed sequences
One of the advantages of the present invention is that the final "intermixed sequences" in step vi) of claim 3 of the present invention only comprise sequence information that is originally derived from the original heterologous sequences of interest in step i) of this claim. The present invention does not use "linker sequences" to recombine one or more of the heterologous sequences, which is a strategy known in the art, for example, for intermixing different domains in proteins, where each domain is encoded by different heterologous sequences (WO 95/17413).
Therefore, the invention relates to a method characterized in that, in each of the intermixed sequences, the partial DNA sequences originating from the homologous sequence overlaps in step ii) contain only sequence information that is originally derived from the original heterologous sequences in step i) (in the first and third aspects of the invention), e.g. the "homologous sequence overlaps" in step ii) have at least 80% identity to one or more partial sequences in one or more of the heterologous DNA sequences in step i).
More preferably, the "homologous sequence overlaps" in step ii) have at least 90% identity to one or more partial sequences in one or more of the original heterologous DNA sequences in step i), even more preferably at least 95% identity, and most preferably the "homologous sequence overlaps" in step ii) have 100% identity to one or more partial sequences in one or more of the original heterologous DNA sequences in step i).
Expression of the recombinant protein from the intermixed sequences
Expression of the recombinant protein encoded by the intermixed sequence of the present invention could be accomplished by the use of standard expression vectors and corresponding expression systems known in the art.
Appropriate screening or selection system
In its second aspect, the present invention relates to a method for producing one or more recombinant proteins having a desired biological activity.
An appropriate screening or selection system will depend on the desired biological activity.
A number of appropriate screening or selection systems are described in the art for screening or selecting for a desired biological activity. Examples are:
Strausberg et al. (Biotechnology 13: 669-673 (1995)), which describes a screening system for subtilisin variants that have calcium-independent stability;
Bryan et al. (Proteins 1: 326-334 (1986)), which describes a screening test for proteases that have improved thermal stability; and
WO 97/04079, which describes a screening test for lipases that have an improved wash performance in washing detergents.
A preferred embodiment of the invention comprises screening or selecting the recombinant proteins, wherein the desired biological activity is performance in dishwashing or laundry detergents. Examples of dishwashing or laundry detergents are described in WO 97/04079 and WO 95/30011.
The invention is described in further detail in the following examples, which are not intended in any way to limit the scope of the invention.
MATERIALS AND METHODS
Strains
E. coli strain: DH10B (Life Technologies)
Bacillus subtilis strain: DN1885 amyE. A derivative of B. subtilis 168RUB200 (J. Bacteriology 172: 4315-4321 (1990)).
Plasmids
pKH400: constructed from pJS3 (E. coli-B. subtilis shuttle vector containing a synthetic gene encoding subtilase 309, described by Jacob Schiodt et al. in Protein and Peptide Letters 3: 39-44 (1996)) by introducing two BamHI sites at positions 1841 and 3992.
Sequences of proteases used to intermix
Unless otherwise mentioned, manipulations and transformations of DNA were performed using standard methods of molecular biology (Sambrook et al. (1989) Molecular cloning: A laboratory manual, Cold Spring Harbor lab., Cold Spring Harbor, NY; Ausubel, F. M. et al. (eds.) "Current protocols in Molecular Biology". John Wiley and Sons, 1995; Harwood, C.R., and Cutting, S.M. (eds.) "Molecular Biological Methods for Bacillus". John Wiley and Sons, 1990).
Enzymes for DNA manipulation were used according to the specifications of the suppliers.
Enzymes for DNA manipulation
Unless otherwise mentioned, all enzymes for DNA manipulations, e.g. restriction endonucleases, ligases, etc., are obtained from New England Biolabs, Inc.
EXAMPLES
EXAMPLE 1
A) Construction of the vector
1) Amplification of the pre-pro sequences
Host cells harboring the plasmid DNA encoding the enzymes A13050_1 (GenBank), SUBT_BACAM P00782 (Swiss-Prot), D26542 (GenBank), A22550 (GenBank) and PD498 (Patent Application No. WO 96/34963) were the starting material. Purified plasmid DNA was obtained by standard mini-prep isolation. With these DNAs as templates, 5 standard PCRs were carried out to amplify the respective pre-pro sequences. The fragments were generated using the proofreading Pwo DNA polymerase (Boehringer Mannheim) and the following groups of primers directed against the N- and C-termini of the respective pre-pro sequences.
A13050_1:
TiK111: 5' GAG GAG GGA AAC CGA ATG AGG AAA AAG AGT TTT TGG
TiK117: 5' CGC GGT CGG GTA CCG TTT GCG CCA AGG CAT G
SUBT_BACAM P00782:
TiK112: 5' GAG GAG GGA AAC CGA ATG AGA GGC AAA AAA GTA TGG
TiK118: 5' CGC GGT CGG GTA CCG ACT GCG CGT ACG CAT G
D26542:
TiK110: 5' GAG GAG GGA AAC CGA ATG AGA CAA AGT CTA AAA GTT ATG
TiK116: 5' CGC GGT CGG GTA CCG TTT GAC TGA TGG TTA CTT C
A22550:
TiK109: 5' GAG GAG GGA AAC CGA ATG AAG AAA CCG TTG GGG
TiK115: 5' CGC GGT CGG GTA CCG ATT GCG CCA TTG TCG TTA C
PD498:
TiK113: 5' GAG GAG GGA AAC CGA ATG AAG TTC AAA AAA ATA GCC
TiK119: 5' CGC GGT CGG GTA CCG CAG AAT AGT AAG GGT CAT TC
DNA fragments between 300 and 400 bp in length were purified by agarose gel electrophoresis with subsequent gel extraction (QIAGEN) and subjected to assembly by splicing by overlap extension PCR (SOE-PCR).
2) SOE-PCR
The pre-pro fragments were then separately joined by SOE-PCR to the 3 'part of the pKH400 vector promoter. The 3 'part of the promoter was obtained by standard PCR with the Pwo DNA polymerase using 1 ng of pKH400 as the template and the primers
TiK106: 5' CGA CGG CCA GCA TTG G
TiK107: 5' CAT TGG GTT TCC CTC CTC
The resulting 160 bp fragment was purified by gel. Subsequently, 5 SOE-PCRs were performed under standard conditions (Pwo DNA polymerase) using as the template each of the 5 pre-pro sequences mixed with equal molar amounts of the 3' part of the promoter. The assembly primers were:
TiK120: 5' CTT TGA TAC GTT TAA ACT ACC
TiK121: 5' CGC GGT CGG GTA CCG
The fragments obtained were also purified by gel.
3) Insertion of the pre-pro sequences into the intermixed vector pKH400
The vector pKH400 was cut with Pme I and Acc65 I to remove the existing linker sequence. The 5 fragments purified by SOE-PCR of 2) were also digested with the same enzymes and purified by gel.
Only with the SOE-PCR of the SUBT_BACAM P00782 pre-pro sequence was special care required during digestion, because the sequence contained an internal Pme I site. In separate standard ligation mixtures, the pre-pro fragments were then ligated to vector pKH400. After transformation of E. coli DH10B cells, colonies were selected on medium containing ampicillin. Clones were identified by control digestion and sequenced. The vectors thus obtained were named pTK4001-4005.
B) Preparation of small fragments of proteases A13050_1 (GenBank), SUBT_BACAM P00782 (Swiss-Prot), D26542 (GenBank), A22550 (GenBank) and PD498 (Patent Application No. WO 96/34963).
1) Standard PCR reactions were set up with 0.5 μl of the mini-prep DNA of each protease gene as the template. Because each of these five protease genes is to be divided into six fragments (I-VI), 30 PCRs are required (see Fig. 1). Ampli-Taq polymerase (5 U) was used in combination with the following groups of primers (the numbering in parentheses corresponds to the amino acid position in A22550). Where primers are marked .1, .2, etc., equal molar amounts of these are mixed before the PCR and treated as one primer in the PCR:
Group I)
TiK122.1 (116-124): 5' CCG GCG CAG GCG GTA CCX TRS GGX ATW XCX CXX RTX MAA GC
TiK122.2 (116-124): 5' CCG GCG CAG GCG GTA CCX TRS GGX ATW XCA WWC ATX WAT AC
TiK123 (174-180): 5' GTT CCX GCX ACR TGX GTX CC
Group II)
TiK124 (174-180): 5' GGX ACX CAY GTX GCX GGA AC
TiK125.1 (217-223): 5' GCC CAC TSX AKX CCG YTX AC
TiK125.2 (217-223): 5' GCC CAC TSX AKX CCT YGX GC
TiK125.3 (217-223): 5' GCC CAX TSR AKX CCK XXX RCW AT
Group III)
TiK126.1 (217-223): 5' GTX ARC GGX MTX SAG TGG GC
TiK126.2 (217-223): 5' GCX CRA GGX MTX SAG TGG GC
TiK126.3 (217-223): 5' TWG CYC AAG GWW TXS AXT GKR
TiK126.5 (217-223): 5' TWG CTC AAG GHH THS ART GG
TiK127.1 (255-261): 5' GCX GCX ACX ACX ASX ACX CC
TiK127.2 (255-261): 5' GCY SCW AYW AMX AGW AYA YCA
Group IV)
TiK128.1 (255-261): 5' GGX GTX STX GTX GTX GCX GC
TiK128.2 (255-261): 5' TGR TRT WCT MKT WRT WGS RGG
TiK129.1 (292-299): 5' GBX CCX ACR YTX GAR AAW GAX G
TiK129.2 (292-299): 5' GBX CCR TAC TGX GAR AAR CTX G
TiK129.3 (292-299): 5' GKX CCA TAC KKA GAR AAR YTT G
TiK129.5 (292-299): 5' GKR CCA TAC KKA GAR AAG YTT G
Group V)
TiK130.1 (292-299): 5' CXT CWT TYT CXA RYG TXG GXV C
TiK130.2 (292-299): 5' CXA GYT TYT GXC AGT AYG GXV C
TiK130.3 (292-299): 5' CA GYT TCT CTM MGT ATG GSM C
TiK130.5 (292-299): 5' CAA GTT TCT CTC AGT ATG GGA C
TiK131.1 (324-330): 5' GGX GWX GCC ATX GAY GTX GC
TiK131.2 (324-330): 5' GGA GTA GCC ATX GAX GTW GC
Group VI)
TiK132.1 (324-330): 5' GGX ACR TCX ATG GCX WCX CC
TiK132.2 (324-330): 5' GGW ACX TCX ATG GCA WCX CC
TiK133.1 (375-380): 5' CGG CCC CGA CGC GTT TAC YGX RYX GCX SYY TSX RC
TiK133.2 (375-380): 5' CGG CCC CGA CGC GTT TAT CKT RYX GCX XXY TYW G
TiK133.3 (375-380): 5' CGG CCC CGA CGC GTT TAT CKT RCX GCX GCX TYT GMR TT
TiK133.4 (375-380): 5' CGG CCC CGA CGC GTT TAT CTT ACG GCA GCC TCA GC
(X = deoxy-inosine, Y = 50% C + 50% T, R = 50% A + 50% G, S = 50% C + 50% G, W = 50% A + 50% T, K = 50% T + 50% G, M = 50% A + 50% C, B = 33.3% C + 33.3% G + 33.3% T, V = 33.3% C + 33.3% G + 33.3% A, H = 33.3% C + 33.3% A + 33.3% T)
After 30 cycles at annealing temperatures in the range of 40-60 °C, the amplified fragments were gel purified and recovered.
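The degeneracy codes of the legend determine how many distinct oligonucleotides a primer mixture contains, and the primers at a fragment junction are reverse complements of each other, which is what creates the overlap used later in the SOE-PCR. The sketch below illustrates both points in Python. It is an illustration under assumptions: X (deoxy-inosine) is counted as a single residue and is paired with itself for bookkeeping only, the code "D" is added solely as the complement of "H", and the function names are hypothetical.

```python
from math import prod

# Base sets for the codes in the legend. "D" does not occur in the legend but is
# needed as the complement of "H". X (deoxy-inosine) is treated as one residue,
# which is an assumption of this sketch.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "X": "X",
    "Y": "CT", "R": "AG", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "V": "ACG", "H": "ACT", "D": "AGT",
}
# Complement of every code; inosine is mapped to itself for bookkeeping only.
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C", "X": "X",
    "Y": "R", "R": "Y", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "H": "D", "D": "H",
}


def clean(primer: str) -> str:
    """Remove the triplet spacing used in the primer listing."""
    return primer.replace(" ", "").upper()


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(CODES[base]) for base in clean(primer))


def reverse_complement(primer: str) -> str:
    """Reverse complement that keeps the degeneracy codes."""
    return "".join(COMPLEMENT[base] for base in reversed(clean(primer)))


TIK123 = "GTT CCX GCX ACR TGX GTX CC"        # antisense primer of group I
TIK124 = "GGX ACX CAY GTX GCX GGA AC"        # sense primer of group II
TIK125_3 = "GCC CAX TSR AKX CCK XXX RCW AT"  # antisense primer of group II

print(degeneracy(TIK124))
print(degeneracy(TIK125_3))
print(reverse_complement(TIK124) == clean(TIK123))
```

Under these assumptions TiK124 is the exact reverse complement of TiK123, so fragments I and II share the full 20-nucleotide primer region as their overlap.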
2) SOE-PCR to randomly assemble the small fragments
Equimolar amounts of each of the purified fragments were taken and mixed in one tube as templates for assembly in an otherwise standard SOE-PCR with Ampli-Taq polymerase. The external primers used are:
TiK134.1: CCG GCG CAG GCG GTA CC
TiK135.1: CGG CCC CGA CGC GTT TA
The following primer pairs can also be used:
TiK134.2: GGC GCA GGC GGT AC
TiK135.2: GCC CCG ACG CGT TTA
TiK134.3: CGC AGG CGG TAC
TiK135.3: CCC GAC GCG TT
The annealing temperatures are in the range of 40 °C to 70 °C.
Re-assembly is also achieved by sequentially re-assembling all conceivable combinations of fragments, e.g.: in tube 1, all seven fragments obtained by PCR with the primers of group I (see above, B1-2) are mixed; in tube 2, the fragments obtained by PCR with the primers of group II are mixed; in tube 3, the fragments obtained by PCR with the primers of group III are mixed; in tube 4, the fragments obtained by PCR with the primers of group IV are mixed; in tube 5, the fragments obtained by PCR with the primers of group V are mixed; in tube 6, the fragments obtained by PCR with the primers of group VI are mixed.
Then, an SOE-PCR is performed by mixing aliquots of tubes 1 and 2 and using the resulting mixture as the template for a primary SOE-PCR with the corresponding external primers. The same is done with mixtures of aliquots of tubes 3 and 4 as well as tubes 5 and 6. The respective external primer pairs are TiK134.#/125.# for fragments 1 and 2, TiK126.#/129.# for fragments 3 and 4, and TiK130.#/135.# for fragments 5 and 6. The amplified assembled fragments of approximately 340, 260 and 280 bp length, respectively, are purified by agarose gel electrophoresis. In a secondary SOE-PCR, the obtained fragments are mixed and assembled using the primer pair TiK134.#/135.# as the external primers. The full-length protease genes obtained are purified by gel as described above.
In another example, aliquots of tubes 1, 2 and 3 are mixed and reassembled by a primary SOE-PCR with the primer pair TiK134.#/127.#. Aliquots of tubes 4, 5 and 6 are also mixed in another tube and reassembled by another SOE-PCR using the primers TiK128.#/135.#. The generated fragments of approximately 450 bp length are purified as described above, mixed and reassembled in a secondary SOE-PCR with the external primers TiK134.#/135.#. The full-length protease genes obtained are purified by gel as described above.
In principle, each combination of the fragments could be assembled in separate SOE-PCRs. In subsequent SOE-PCRs, the assembled units obtained are joined into larger units until the full gene length is obtained. The overall number of SOE-PCRs used for such a purpose is limited only by experimental capacity. The only prerequisite inherent to SOE-PCR is that the fragments to be assembled must contain a sequence overlap as previously defined.
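The bookkeeping behind the assembly schemes above can be sketched as follows. This is an illustrative Python sketch only: the function name is hypothetical, and the hybrid count assumes that every parent gene contributes every one of the six fragments and that all junctions recombine freely, ignoring any additional variation introduced by the degenerate primers.

```python
from itertools import product

# Parent genes and fragment count as used in Example 1.
PARENTS = ["A13050_1", "P00782", "D26542", "A22550", "PD498"]
N_FRAGMENTS = 6  # fragments I-VI


def primary_groups(n_fragments, group_size):
    """Split fragment positions 1..n into consecutive groups for the primary SOE-PCRs."""
    positions = list(range(1, n_fragments + 1))
    return [tuple(positions[i:i + group_size])
            for i in range(0, n_fragments, group_size)]


# One PCR per parent gene and fragment position.
fragment_pcrs = len(PARENTS) * N_FRAGMENTS

# Every hybrid picks one parent for each fragment position (idealised count).
hybrids = sum(1 for _ in product(PARENTS, repeat=N_FRAGMENTS))

print(fragment_pcrs)
print(hybrids)
print(primary_groups(N_FRAGMENTS, 2))  # tubes 1+2, 3+4, 5+6
print(primary_groups(N_FRAGMENTS, 3))  # tubes 1+2+3, 4+5+6
```

The two calls to `primary_groups` reproduce the pairwise and the three-by-three primary assemblies; a secondary SOE-PCR with the outermost primers then joins the products of either scheme.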
C) Cloning of the full-length protease hybrids derived from SOE-PCR to produce library #1
The full-length hybrid protease genes of step B2) as well as the new intermixing vectors pTK4001-4005 of A3) are digested separately with Acc65 I and Mlu I. In standard ligation procedures, the protease genes are ligated separately to each of the five vectors pTK4001-4005 and transformed into E. coli DH10B. The selection of correctly transformed cells is carried out with ampicillin. The DNA of these clones is prepared and designated library #1. The size of the library is approximately 10^6 independent transformants.
D) Screening of library # 1
Aliquots of library #1 are used to transform Bacillus subtilis DN1885 cells. The transformants are screened for the desired properties.
Using this method and a standard protease activity test to screen for the desired property in step D) above, a number of new intermixed subtilisins with a desired property were identified.
The results are indicated in Table 1 below
Table 1
Clone  Pre-pro  Frag 1 (5')  Frag 2  Frag 3  Frag 4  Frag 5  Frag 6 (3')
-      BPN      Sav          Sav     Sav     Sav     Sav     Sav
-      Alc      Sav          Sav     Sav     Sav     Sav     Sav
12     Esp      Sav          Sav     Sav     Sav     Sav     Sav
-      PD498    Sav          Sav     Sav     Sav     Sav     Sav
4      Esp      PD138        Esp     Esp     Esp     Esp     JA16
22     Alc      PD138        Esp     Esp     Esp     Esp     JA16
11     PD498    PD138        Esp     Esp     Esp     Esp     JA16
1      Alc      PD138        Esp     PD138   Esp     Esp     JA16
3      BPN      PD138        Esp     Esp     PD138   Sav     Sav
17     Esp      PD138        PD138   Esp     Esp     Esp     JA16
19     PD498    Alc          BPN     Esp     Esp     Esp     JA16
16     Alc      Alc          BPN     Esp     PD138   Esp     JA16
Identity of clones:
Alcalase: A13050_1 (GenBank)
BPN': P00782 (SwissProt)
Esperase: D26542 (GenBank)
Savinase: A22550 (GenBank)
PD498: WO 96/34963
JA16: WO 92/17576
PD138: WO 93/18140
We identified 23 clones that have protease activity, of which 12 were different. Clones 8, 9, 18, 20, 23 were the same; clones 6, 15, 21 were the same, clones 12, 14 were the same, clones 10, 13 were the same and clones 4, 7 were the same. With regard to mature enzymes, 7 different ones were identified.
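The count of different mature enzymes can be reproduced from the rows of Table 1 by ignoring the pre-pro column and comparing only the six fragment origins. The sketch below is an illustration in Python; the row labels "row1", "row2" and "row4" are placeholders for table rows whose clone number is not given, and "Alc" is used as the abbreviation for Alcalase.

```python
# Rows transcribed from Table 1: (clone label, pre-pro origin, origins of fragments 1-6).
# "row1", "row2" and "row4" are placeholder labels, not clone numbers from the source.
SAV6 = ("Sav",) * 6
TABLE_1 = [
    ("row1", "BPN", SAV6),
    ("row2", "Alc", SAV6),
    ("12", "Esp", SAV6),
    ("row4", "PD498", SAV6),
    ("4", "Esp", ("PD138", "Esp", "Esp", "Esp", "Esp", "JA16")),
    ("22", "Alc", ("PD138", "Esp", "Esp", "Esp", "Esp", "JA16")),
    ("11", "PD498", ("PD138", "Esp", "Esp", "Esp", "Esp", "JA16")),
    ("1", "Alc", ("PD138", "Esp", "PD138", "Esp", "Esp", "JA16")),
    ("3", "BPN", ("PD138", "Esp", "Esp", "PD138", "Sav", "Sav")),
    ("17", "Esp", ("PD138", "PD138", "Esp", "Esp", "Esp", "JA16")),
    ("19", "PD498", ("Alc", "BPN", "Esp", "Esp", "Esp", "JA16")),
    ("16", "Alc", ("Alc", "BPN", "Esp", "PD138", "Esp", "JA16")),
]

# Different clones: pre-pro origin plus fragment origins.
distinct_clones = {(prepro, frags) for _, prepro, frags in TABLE_1}
# Different mature enzymes: fragment origins only (the pre-pro part is cleaved off).
distinct_mature = {frags for _, _, frags in TABLE_1}

print(len(distinct_clones))
print(len(distinct_mature))
```

With the rows as transcribed here, the table yields 12 different clones and 7 different mature enzymes, in agreement with the numbers stated in the text.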
From Table 1 it is observed that the process of the invention makes it possible to obtain active proteins that represent combinations of very distantly related proteins.
Example 2
The same methods can be used as described in Example 1 for the amplification of the fragments by PCR from the fungal lipases.
The fungal lipases of the following fungi are aligned using the Geneworks alignment program (using the following parameters: cost to open a gap = 5, cost to lengthen a gap = 25, Minimum Diagonal Length = 4, Maximum Diagonal Length = 10, consensus cutoff = 50%): Rhizomucor miehei (LIP_RHIMI from the Swiss Prot database), Penicillium camenbertii (MDLA_PENCA from the Swiss Prot database), Absidia reflexa (WO 96/13578) and Humicola lanuginosa (US 5536661).
Primers for the amplification of the lipase genes of Absidia (Absidia), Rhizopus (LIP_RHIDL) and Rhizomucor (LIP_RHIMI) for intermixing
N, according to the IUPAC nomenclature, means all 4 bases (A, T, G, C).
Group 1)
5' primer for YCRT/SVI/VPG: TAY TGY MGR ACN GTN ATH CCN GG or TAY TGY MGR AGY/TCN GTN GTN GTN CCN GG
3' primer for VFRGT/S: NSW NCC YCK RAA NAC
Group 2)
5' primer for VFRGT/S: GTN TTY MGR GGN WSN
3' primer for KVHK/AGF: RAA NCC YTT RTG NAC YTT or RAA NCC NGC RTG NAC YTT
Group 3)
5' primer for KVHK/AGF: AAR GTN CAY AAR GGN TTY or AAR GTN CAY GCN GGN TTY
3' primer for VTGHSLGG: CC NCC YAR NGA RTG NCC NGT NAC or CC NCC YAR RCT RTG NCC NGT NAC
Group 4)
5' primer for VTGHSLGG: GTN ACN GGN CAY TCN YTR GGN GG or GTN ACN GGN CAY AGY YTR GGN GG
3' primer for FGFLH: RTG YAR RAA NCC RAA
Group 5)
5' primer for FGFLH: TTY GGN TTY YTR CAY
3' primer for IVPFT: NGT RAA NGG NAC DAT
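Each 5' primer above is a degenerate back-translation of a conserved peptide motif. The following Python sketch shows one way to check that a degenerate primer encodes its motif: every degenerate codon is expanded and translated with the standard genetic code. The function names are hypothetical, the motif is written as one string of allowed residues per position, and incomplete trailing codons are ignored; this is an illustration, not a primer-design procedure taken from the source.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide codes used in the primer listing.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def encoded_residues(primer):
    """For each complete codon, return the set of residues the degenerate codon can encode."""
    seq = primer.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3
    result = []
    for i in range(0, usable, 3):
        options = (IUPAC[base] for base in seq[i:i + 3])
        result.append({CODON_TABLE["".join(codon)] for codon in product(*options)})
    return result


def covers_motif(primer, motif):
    """True if every allowed residue at every motif position can be encoded by the primer."""
    encoded = encoded_residues(primer)
    return len(encoded) >= len(motif) and all(
        set(allowed) <= residues for allowed, residues in zip(motif, encoded)
    )


# Group 2 5' primer for the motif VFRGT/S.
primer = "GTN TTY MGR GGN WSN"
motif = ["V", "F", "R", "G", "TS"]
print(encoded_residues(primer)[:4])
print(covers_motif(primer, motif))
```

The last codon (WSN) also encodes residues other than T and S, which is the usual price of covering two residues with one degenerate codon.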
Primers for the amplification of the lipase genes of Humicola lanuginosa (Humicola) and Penicillium camenbertii (MDLA_PENCA) for intermixing
Group 1)
5' primer for CPEVE: TGY CCN GAR GTN GAR
3' primer for VLS/AFRG: NCC YCK RAA NGM YAR NAC
Group 2)
5' primer for VLS/AFRG: GTN YTR KCN TTY MGR GGN
3' primer for GFT/WSSW: CCA NGA NGA NGT RAA NCC or CCA RSW RSW CCA RAA NCC
Group 3)
5' primer for GFT/WSSW: GGN TTY ACN TCN TCN TGG or GGN TTY TGG WSY WSY TGG
3' primer for GHSLGG/AA: NGC NSC NCC YAR NGA RTG NCC or NGC NSC NCC YAR RCT RTG NCC
Group 4)
5' primer for GHSLGG/AA: GGN CAY TCN YTR GGN GSN GCN or GGN CAY AGY YTR GGN GSN GCN
3' primer for PRVGN: RTT NCC NAC YCK NGG
Group 5)
5' primer for PRVGN: CCN MGR GTN GGN AAY
3' primer for THTND: RTC RTT NGT RTG NGT
Group 6)
5' primer for THTND: ACN CAY ACN AAY GAY
3' primer for PEYWI: DAT CCA RTA YTC NGG
Group 7)
5' primer for PEYWI: CCN GAR TAY TGG ATH
3' primer for AHL/IWYF: RAA RTA CCA DAK RTG NGC
Primers to intermix all five genes:
Group 1)
5' primer for AN/TA/SYCR: GCN AMY KCN TAY TGY MG for the sequences of Absidia, Rhizopus and Rhizomucor
5' primer for AN/TA/SYCGKNNDA: GCN AMY KCN TAT TGY GGN AAR AAY AAY GAY GC for Humicola
5' primer for AN/TA/SYCEADYTA: GCN AMY KCN TAY TGY GAR GCN GAY TAY ACN GC for P. camenbertii
3' primer for E/QKTIY: RTA DAT NGT YTT YTS for the sequences of Absidia, Rhizopus and Rhizomucor
3' primer for ALDNTE/QKTIY: RTA DAT NGT YTT YTS NGT RTT RTC YAR NGC for Humicola
3' primer for AVDHTE/QKTIY: RTA DAT NGT YTT YTS NGT RTG RTC NAC NGC for P. camenbertii
Group 2)
5' primer for E/QKTIY: SAR AAR ACN ATH TAY for the sequences of Absidia, Rhizopus and Rhizomucor
5' primer for E/QKTIYLA/SFRG: SAR AAR ACN ATH TAY YTR KCN TTY MGR GGN for the other two sequences
3' primer for KVHK/AGF: RAA NCC YTT RTG NAC YTT or RAA NCC NGC RTG NAC YTT for the sequences of Absidia, Rhizopus and Rhizomucor
3' primer for ICSGCKVHK/AGF: RAA NCC YTT RTG NAC YTT RCA NCC NGA RCA DAT or RAA NCC NGC RTG NAC YTT RCA NCC NGA RCA DAT for Humicola
3' primer for LCDGCKVHK/AGF: RAA NCC YTT RTG NAC YTT RCA NCC RTC YAR or RAA NCC NGC RTG NAC YTT RCA NCC RTC RCA YAR for P. camenbertii
Group 3)
5' primer for KVHK/AGF: AAR GTN CAY AAR GGN TTY or AAR GTN CAY GCN GGN TTY for the sequences of Absidia, Rhizopus and Rhizomucor
5' primer for KVHK/AGFTSSW: AAR GTN CAY AAR GGN TTY ACN TCN TCN TGG or AAR GTN CAY GCN GGN TTY ACN TCN TCN TGG for Humicola
5' primer for KVHK/AGFWSSW: AAR GTN CAY AAR GGN TTY TGG WSY WSY TGG or AAR GTN CAY GCN GGN TTY TGG WSY WSY TGG for P. camenbertii
3' primer for GHSLGG/AA: NGC NSC NCC YAR NGA RTG NCC or NGC NSC NCC YAR RCT RTG NCC for all five sequences
Group 4)
5' primer for GHSLGG/AA: GGN CAY TCN YTN GGN GSN GCN or GGN CAY AGY YTN GGN GSN GCN for all five sequences
3' primer for PRVGN/D: RTY NCC NAC YCK NGG for all genes except Absidia
3' primer for TQGQPRVGN/D: RTY NCC NAC YCK NGG YTG NCC TYG NGT for Absidia
Group 5)
5' primer for PRVGN/D: CCN MGR GTN GGN RAY for all genes except Absidia
5' primer for PRVGN/DPAFA: CCN MGR GTN GGN RAY CCN GCN TTY GCN for Absidia
3' primer for RDIVPH/R/K: YK NGG NAC DAT RTC YCK for the sequences of Absidia, Rhizopus and Rhizomucor
3' primer for I/FTHTRDIVPH/R/K: YK NGG NAC DAT RTC YCK NGT RTG NGT RAW for the other two sequences
Group 6)
5' primer for RDIVPH/R/K: MGR GAY ATH GTN CCN MR for the sequences of Absidia, Rhizopus and Rhizomucor
5' primer for RDIVPH/R/KLP: MGR GAY ATH GTN CCN MRN YTR CCN for the other two sequences
3' primer for EYWIK/T: YKT DAT CCA RTA YTC for the sequences of Rhizomucor, Humicola and P. camenbertii
3' primer for PGVEYWIK/T: YKT DAT CCA RTA YTC NAC NCC NGG for Rhizopus
3' primer for AGEEYWIK/T: YKT DAT CCA RTA YTC YTC NCC NGC for Absidia
Group 7)
5' primer for EYWIK/T: GAR TAY TGG ATH AAR or GAR TAY TGG ATH ACN for Rhizomucor, Humicola and P. camenbertii
5' primer for EYWIKSGT: GAR TAY TGG ATH AAR WSY GGN ACN for Rhizopus
5' primer for EYWIKKDSS: GAR TAY TGG ATH AAR AAR GAY WSY WSY for Absidia
3' primer for DHLSY: RTA NGA/RCT YAR RTG RTC for the sequences of Absidia, Rhizopus and Rhizomucor
3' primer for IPDIPDHLSY: RTA NGA/RCT YAR RTG RTC NGG DAT RTC NGG DAT for Humicola
3' primer for TDFEDHLSY: RTA NGA/RCT YAR RTG RTC YTC RAA RTC NGT for P. camenbertii
The 5' primers of the first primer group and the 3' primers of the last primer group can be used for the SOE-PCR.
The SOE-PCR fragments can then be combined with a lipase 5' end and 3' end, once the 5' and 3' ends have been generated by PCR. The 5' end can be generated by PCR using specific 5' primers (containing a sequence for the BamHI recognition site at the 5' end) for the 5' end of the genes of interest and using the complementary sequence of the 5' primer of the first group of primers as the 3' primer. The 3' end can be generated by PCR using specific 3' primers (containing a sequence for the XbaI recognition site at the 5' end) for the 3' end of the genes of interest and the complementary sequence of the 3' primer of the last group of primers as the 5' primer.
A second SOE is then used to generate the complete sequence, using the 5' and 3' primers specific for the genes of interest.
The genes can then be cloned into the yeast vector pJS026 as a BamHI-XbaI fragment (see WO 97/07205).
Example 3
The same overall method as described in Example 2 can be used for the amplification and recombination of PCR fragments of Pseudomonas lipases. The term "same overall method" means that it could be advantageous to use slightly different vectors compared with Example 2. Based on the sequence and primer information shown below, it is a matter of routine for a person skilled in the art to modify the vectors etc. of Example 2 in order to recombine the Pseudomonas lipases mentioned below according to an intermixing method of the invention.
The Pseudomonas lipases mentioned below are aligned using the Geneworks alignment program (using the following parameters: cost to open a gap = 5, cost to lengthen a gap = 25, Minimum Diagonal Length = 4, Maximum Diagonal Length = 10, consensus cutoff = 50%).
Pseudomonas lipases
Pseudomonas aeruginosa TE3285 (row ate3285d)
Pseudomonas pseudoalcaligenes M1 (Lipomax wt) (row pseudm1d)
Pseudomonas sp. SD705 (mature) (row spsd705d)
Pseudomonas wisconsinensis (row wisconsd)
Proteus vulgaris K80 (row provulgd)
Pseudomonas fragi IFO 12049 (row fr12049d)
Appropriate primers for intermixing the Pseudomonas lipases (I = inosine; the numbers refer to the positions in the alignment (see Figure 4); S means sense strand; the corresponding antisense oligonucleotides are of course also used):
109-131 S1: 5'-TA(C/T)CCIAT(C/T)(G/T)I(C/T)T(G/A)(G/A)(C/)ICA(C/T)GG-3'
250-269 S2: 5'-GA(G/A)(G/C)IICGIGGIG(A/C)I(G/C)A(G/A)(T/C)T-3'
318-343 S3: 5'-GT(C/A)AA(C/T)(C/T)T(G/A)ITCGG(C/T)CA(C/T)AG(C/T)CAIGG-3'
607-628 S4: 5'-TIAA(C/T)(G/C/A)(G/C/A)(C/T/A)(A/C)(A/G)I(T/C)(A/T)(C/T)CCI(C/T)(A/G)(T/G/A)GG-3'
801-817 S5: 5'-AA(C/T)GA(C/T)GG(C/T)(C/A/T)TGGT(C/T/G)GG-3'
871-890 S6: 5'-CA(C/T)(C/G)T(C/G)GA(C/T)(G/A)(A/C/T)(G/C)(G/A)(G/C/A)AACCA-3'
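The bracket notation used for the Pseudomonas primers can be converted into the single-letter IUPAC codes used for the other primers in this document, which also makes it easy to compare primer length with the stated alignment positions. The Python sketch below is illustrative only; the function name is hypothetical, inosine (I) is left unchanged, and incomplete brackets are left as they are.

```python
import re

# IUPAC code for each set of alternative bases.
SET_TO_CODE = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def to_iupac(bracketed):
    """Replace every complete '(X/Y[/Z])' group by its IUPAC code; keep all other characters."""
    compact = re.sub(r"\s+", "", bracketed.upper())

    def replace_group(match):
        return SET_TO_CODE[frozenset(match.group(1).split("/"))]

    return re.sub(r"\(([ACGT](?:/[ACGT])+)\)", replace_group, compact)


s5 = to_iupac("AA(C/T)GA(C/T)GG(C/T)(C/A/T)TGGT(C/T/G)GG")    # positions 801-817
s2 = to_iupac("GA(G/A)(G/C)IICGIGGIG(A/C)I(G/C)A(G/A)(T/C)T")  # positions 250-269
print(s5, len(s5))
print(s2, len(s2))
```

For S5 and S2 the converted primers are 17 and 20 residues long, which agrees with the position ranges 801-817 and 250-269 given in the listing.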
It is noted that in relation to this date, the best method known by the applicant to carry out the aforementioned invention, is the conventional one for the manufacture of the objects to which it relates.
Having described the invention as above, the content of the following is claimed as property.
Claims (8)
1. A method for intermixing heterologous sequences of interest, characterized in that it comprises the following steps:
i) identification of at least one conserved region between the heterologous sequences of interest;
ii) generating fragments of each of the heterologous sequences of interest, wherein the fragments comprise the conserved regions; and
iii) intermixing/recombining the fragments using the conserved regions as a homologous binding site.
2. A method for producing an intermixed protein having a desired biological activity, characterized in that it comprises, in addition to the steps of claim 1, the following additional steps:
iv) expressing the numerous different recombinant proteins encoded by the numerous different intermixed sequences of step iii) (in claim 1); and
v) screening or selecting the numerous different recombinant proteins from step iv) in an appropriate screening or selection system for one or more recombinant proteins having the desired activity.
3. The method for intermixing heterologous DNA sequences of interest according to claim 1, the sequences having at least one conserved region, characterized in that it comprises the following steps:
i) identification of one or more conserved regions (subsequently called "A, B, C", etc.) in two or more of the heterologous sequences;
ii) constructing at least two groups of PCR primers (each group comprising a sense primer and an antisense primer) for one or more of the conserved regions identified in i), wherein in one group the sense primer (named "a" = sense primer) is directed to a sequence region 5' (sense strand) of the conserved region (e.g. conserved region "A"), and the antisense primer (named "a'" = antisense primer) is directed to a sequence region 3' (sense strand) of the conserved region or to a sequence region at least partially within the conserved region, and in the second group the sense primer (named "b" = sense primer) is directed to a sequence region 5' (sense strand) of the conserved region or to a sequence region at least partially within the conserved region, and the antisense primer (named "b'" = antisense primer) is directed to a sequence region 3' (sense strand) of the conserved region (e.g. conserved region "A"), and the two sequence regions defined by the regions between the primer group "a" and "a'" and "b" and "b'" (both regions including the actual primer sequences) have a homologous sequence overlap of at least 10 base pairs (bp) within the conserved region;
iii) for one or more identified conserved regions of interest in step i), performing two PCR amplification reactions with the heterologous DNA sequences in step i) as the template, wherein one of the PCR reactions uses the 5' primer group identified in step ii) (e.g. called "a", "a'") and the second PCR reaction uses the 3' primer group identified in step ii) (e.g. called "b", "b'");
iv) isolation of the PCR fragments generated as described in step iii) for one or more of the conserved regions identified in step i);
v) forming a pool of two or more of the PCR fragments isolated in step iv) and performing a sequence overlap extension PCR reaction (SOE-PCR) using the isolated PCR fragments as templates; and
vi) isolation of the PCR fragment obtained in step v), wherein the isolated PCR fragment comprises numerous different intermixed sequences containing an intermixed mixture of the PCR fragments isolated in step iv), wherein the intermixed sequences are characterized in that the partial DNA sequences originating from the homologous sequence overlaps in step ii) have at least 80% identity to one or more of the partial sequences in one or more of the original heterologous DNA sequences in step i).
4. The method for producing one or more recombinant proteins having a desired biological activity, according to claim 2, characterized in that it comprises intermixing heterologous DNA sequences, which have at least one conserved region and which encode a protein, by the following steps:
i) identifying one or more conserved regions (hereafter named "A", "B", "C", etc.) in two or more of the heterologous sequences;
ii) constructing at least two sets of PCR primers (each set comprising a sense primer and an antisense primer) for one or more of the conserved regions identified in i), wherein in one set the sense primer (named "a" = sense primer) is directed to a sequence region 5' (sense strand) of the conserved region (e.g. conserved region "A"), and the antisense primer (named "a'" = antisense primer) is directed either to a sequence region 3' (sense strand) of the conserved region or to a sequence region at least partially within the conserved region; and in the second set the sense primer (named "b" = sense primer) is directed either to a sequence region 5' (sense strand) of the conserved region or to a sequence region at least partially within the conserved region, and the antisense primer (named "b'" = antisense primer) is directed to a sequence region 3' (sense strand) of the conserved region (e.g. conserved region "A"); and the two sequence regions defined by the regions between primer set "a" and "a'" and primer set "b" and "b'" (both regions include the actual primer sequences) have a homologous sequence overlap of at least 10 base pairs (bp) within the conserved region;
iii) for one or more of the conserved regions of interest identified in step i), performing two PCR amplification reactions with the heterologous DNA sequences of step i) as template, wherein one of the PCR reactions uses the 5' primer set identified in step ii) (e.g. named "a", "a'") and the second PCR reaction uses the 3' primer set identified in step ii) (e.g. named "b", "b'");
iv) isolating the PCR fragments generated as described in step iii) for one or more of the conserved regions identified in step i);
v) pooling two or more of the PCR fragments isolated in step iv) and performing a sequence overlap extension PCR reaction (SOE-PCR) using the isolated PCR fragments as templates;
vi) isolating the PCR fragment obtained in step v), wherein the isolated PCR fragment comprises numerous different intermixed sequences containing an intermixed mixture of the PCR fragments isolated in step iv), wherein the intermixed sequences are characterized in that the partial DNA sequences originating from the homologous sequence overlaps of step ii) have at least 80% identity to one or more of the partial sequences in one or more of the original heterologous DNA sequences of step i);
vii) expressing the numerous different recombinant proteins encoded by the numerous different intermixed sequences of step vi); and
viii) screening or selecting the numerous different recombinant proteins of step vii) in a suitable screening or selection system for one or more recombinant proteins having a desired activity.
5. The method according to any one of claims 1-4, characterized in that the heterologous sequences of interest encode an enzyme.
6. The method according to claim 5, characterized in that the enzyme is a protease, preferably a serine protease, and in particular a subtilase; or a lipase.
7. The method according to any of claims 3 and 4, characterized in that the PCR amplification process in step iii) is carried out under conditions that result in a low, medium or high frequency of random mutagenesis.
8. The method according to any of claims 2 and 4, characterized in that the desired activity is the performance of the recombinant proteins in a dishwashing or laundry detergent.
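For illustration only, and not forming part of the claims: the two numerical conditions recited in claims 3 and 4 (a homologous sequence overlap of at least 10 bp within the conserved region, and at least 80% identity of the partial sequences) can be sketched in Python as follows. All function names and coordinates are hypothetical. The sketch assumes that regions are given as 0-based, half-open intervals on the sense strand and that identity is computed over an ungapped, equal-length comparison; the claims themselves do not prescribe how identity is to be calculated.

```python
from typing import Tuple

# Assumption: a region is a 0-based, half-open interval [start, end)
# expressed in sense-strand coordinates of one template sequence.
Interval = Tuple[int, int]


def overlap_within_conserved(region_a: Interval,
                             region_b: Interval,
                             conserved: Interval) -> int:
    """Length (bp) shared by the a/a' region, the b/b' region and the
    conserved region; 0 if the three intervals have no common part."""
    start = max(region_a[0], region_b[0], conserved[0])
    end = min(region_a[1], region_b[1], conserved[1])
    return max(0, end - start)


def percent_identity(seq1: str, seq2: str) -> float:
    """Ungapped percent identity of two equal-length sequences.
    (Assumption: no alignment gaps; a real comparison may need an aligner.)"""
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq1.upper(), seq2.upper()))
    return 100.0 * matches / len(seq1)


def has_sufficient_overlap(region_a: Interval,
                           region_b: Interval,
                           conserved: Interval,
                           min_overlap_bp: int = 10) -> bool:
    """True if the overlap within the conserved region is at least 10 bp."""
    return overlap_within_conserved(region_a, region_b, conserved) >= min_overlap_bp


def has_sufficient_identity(partial: str,
                            original: str,
                            min_identity: float = 80.0) -> bool:
    """True if the partial sequence is at least 80% identical to the original."""
    return percent_identity(partial, original) >= min_identity


if __name__ == "__main__":
    # Hypothetical coordinates: a/a' fragment, b/b' fragment, conserved region "A".
    region_a = (0, 130)
    region_b = (110, 300)
    conserved_a = (100, 140)
    print(overlap_within_conserved(region_a, region_b, conserved_a))
    print(has_sufficient_overlap(region_a, region_b, conserved_a))
    print(has_sufficient_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

With the hypothetical coordinates shown, the a/a' and b/b' regions share positions 110 to 130, all of which lie within the conserved region, so the 10 bp condition is met. A gapped alignment could be substituted for `percent_identity` where the partial sequences differ in length.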
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DK0304/97 | 1997-03-18 | ||
DK0432/97 | 1997-04-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
MXPA99008622A (en) | 2000-09-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU738461B2 (en) | Shuffling of heterologous DNA sequences | |
JP4263248B2 (en) | Library creation method by DNA shuffling | |
EP0948615B1 (en) | In vivo recombination | |
US6368805B1 (en) | Methods of producing polynucleotide variants | |
AU741834B2 (en) | An (in vitro) method for construction of a dna library | |
JP3471797B2 (en) | Stabilizing enzymes and detergents | |
US7098017B2 (en) | Protease variants and compositions | |
CN109312353A (en) | Improve microorganism by CRISPR- inhibition | |
CN106922154A (en) | The gene editing of the engineered nucleic acid enzyme guided using RNA derived from campylobacter jejuni CRISPR/CAS systems | |
US6159687A (en) | Methods for generating recombined polynucleotides | |
JP6552969B2 (en) | Library preparation method for directed evolution | |
JPS6070075A (en) | Carbonyl hydrolyzing enzyme of eukaryote | |
NZ322304A (en) | Molecular cloning by multimerization of plasmids using two nucleic acids and a single PCR | |
WO2019046703A1 (en) | Methods for improving genome editing in fungi | |
AU2022204029A1 (en) | Oleic acid-enriched plant body having genetically modified FAD2 and production method thereof | |
US6291165B1 (en) | Shuffling of heterologous DNA sequences | |
CA2386090A1 (en) | Production of functional hybrid genes and proteins | |
MXPA99008622A (en) | Shuffling of heterologous dna sequences | |
US9856470B2 (en) | Process for generating a variant library of DNA sequences | |
CN115427560A (en) | CRISPR-AID Using catalytically inactive RNA-guided endonucleases | |
WO2005021719A2 (en) | Libraries of recombinant chimeric proteins | |
WO2003054127A2 (en) | Subtilisin variants with improved characteristics | |
CA2442096A1 (en) | Methods for the preparation of polynucleotide librairies and identification of library members having desired characteristics | |
Ninkovic et al. | High-fidelity in vitro recombination using a proofreading polymerase | |
JP4312832B2 (en) | In vivo recombination |