MXPA00005640A - Method for creating polynucleotide and polypeptide sequences - Google Patents

Method for creating polynucleotide and polypeptide sequences

Publication number
MXPA00005640A
Authority
MX
Mexico
Prior art keywords
polynucleotide
vector
heteroduplexes
polynucleotides
variants
Application number
MXPA/A/2000/005640A
Other languages
Spanish (es)
Inventor
Frances Arnold
Zhixin Shao
Alexander Volkov
Original Assignee
Frances Arnold
California Institute Of Technology
Zhixin Shao
Alexander Volkov
Application filed by Frances Arnold, California Institute Of Technology, Zhixin Shao, Alexander Volkov

Abstract

The invention provides methods for evolving a polynucleotide toward acquisition of a desired property. Such methods entail incubating a population of parental polynucleotide variants under conditions to generate annealed polynucleotides comprising heteroduplexes. The heteroduplexes are then exposed to a cellular DNA repair system to convert the heteroduplexes to parental polynucleotide variants or recombined polynucleotide variants. The resulting polynucleotides are then screened or selected for the desired property.

Description

METHOD FOR CREATING POLYNUCLEOTIDE AND POLYPEPTIDE SEQUENCES
TECHNICAL FIELD
The invention lies in the technical field of genetics, and more specifically, the forced molecular evolution of polynucleotides to acquire desired properties.
BACKGROUND
A variety of approaches, including rational design and directed evolution, have been used to optimize protein functions (1, 2). The choice of approach for a given optimization problem depends, in part, on how well the relationships between sequence, structure and function are understood. Rational design typically requires extensive knowledge of a structure-function relationship. Directed evolution requires little or no specific knowledge of the structure-function relationship; the essential requirement is a means of evaluating the function to be optimized. Directed evolution involves generating libraries of mutant molecules followed by selection or screening for the desired function. Gene products that show improvement with respect to the desired property or set of properties are identified by selection or screening. The genes encoding those products can be subjected to further cycles of the process to accumulate beneficial mutations. This evolution can involve few or many generations, depending on how far one wishes to progress and on the effects of the mutations typically observed in each generation. Such approaches have been used to create novel functional nucleic acids (3, 4), peptides and other small molecules (3), antibodies (3), as well as enzymes and other proteins (5, 6, 7). These procedures are very tolerant of inaccuracies and noise in the evaluation of function (7).
Several publications have discussed the role of genetic recombination in directed evolution (see WO 97/07205, WO 98/42727, US 5,807,723, US 5,721,367, US 5,776,744, WO 98/41645, US 5,811,238, WO 98/41622, WO 98/41623 and US 5,093,257).
One group of PCR-based recombination methods consists of DNA shuffling [5, 6], the staggered extension process [87, 90] and random-priming recombination [87]. Such methods typically involve the synthesis of significant amounts of DNA during the assembly/recombination step and the subsequent amplification of the final products, and the efficiency of amplification decreases as gene size increases.
Yeast cells, which possess an active system for homologous recombination, have been used for in vivo recombination. Cells transformed with a vector and partially overlapping inserts efficiently join the inserts in the regions of homology and restore a functional, covalently closed plasmid [91]. This method does not require PCR amplification at any stage of recombination and is therefore free of the size considerations inherent in PCR-based methods. However, the number of crossovers introduced in one recombination event is limited by the efficiency of transforming one cell with multiple inserts. Other in vivo recombination methods involve recombination between two parental genes cloned on the same plasmid in a tandem orientation. One method relies on the homologous recombination machinery of bacterial cells to produce chimeric genes [92]. The first gene in the tandem provides the N-terminal part of the target protein, and the second provides the C-terminal part. However, only one crossover can be generated by this approach. Another in vivo recombination method uses the same tandem organization of substrates in a vector [93]. Before transformation into E. coli cells, the plasmids are linearized by endonuclease digestion between the parental sequences. Recombination is carried out in vivo by the enzymes responsible for double-strand break repair. The ends of the linear molecules are degraded by a 5'-3' exonuclease activity, followed by annealing of the complementary single-stranded 3' ends and restoration of the double-stranded plasmid [94]. This method has advantages and disadvantages similar to those of tandem recombination on circular plasmids.
BRIEF DESCRIPTION OF THE INVENTION
The invention provides methods for evolving a polynucleotide toward acquisition of a desired property. Such methods entail incubating a population of parental polynucleotide variants under conditions to generate annealed polynucleotides comprising heteroduplexes. The heteroduplexes are then exposed to a cellular DNA repair system to convert the heteroduplexes to parental polynucleotide variants or recombined polynucleotide variants. The resulting polynucleotides are then screened or selected for the desired property. In some methods, the heteroduplexes are exposed to a DNA repair system in vitro. A suitable repair system can be prepared in the form of cellular extracts. In other methods, the products of annealing, including heteroduplexes, are introduced into host cells. The heteroduplexes are thus exposed to the host cells' DNA repair system in vivo. In several methods, the introduction of annealed products into host cells selects for heteroduplexes relative to transformed cells comprising homoduplexes. Such selection can be achieved, for example, by providing a first polynucleotide variant as a component of a first vector and a second polynucleotide variant as a component of a second vector. The first and second vectors are converted to linearized forms in which the first and second polynucleotide variants occur at opposite ends. In the incubation step, single-stranded forms of the first linearized vector reanneal with each other to form the linear first vector, single-stranded forms of the second linearized vector reanneal with each other to form the linear second vector, and single-stranded linearized forms of the first and second vectors anneal with each other to form a circular heteroduplex bearing a nick in each strand. Introduction of the products into cells thus selects for circular heteroduplexes relative to the linear first and second vectors.
Optionally, in the above methods, the first and second vectors can be converted to linearized forms by PCR. Alternatively, the first and second vectors can be converted to linearized forms by digestion with first and second restriction enzymes. In some methods, polynucleotide variants are provided in double-stranded form and are converted to single-stranded form before the annealing step. Optionally, such conversion is carried out by asymmetric amplification of the first and second double-stranded polynucleotide variants to amplify a first strand of the first polynucleotide variant and a second strand of the second polynucleotide variant. The first and second strands anneal in the incubation step to form a heteroduplex. In some methods, a population of polynucleotides comprising first and second polynucleotides is provided in double-stranded form, and the method further comprises incorporating the first and second polynucleotides as components of first and second vectors, so that the first and second polynucleotides occupy opposite ends of the first and second vectors. In the incubation step, single-stranded forms of the first linearized vector reanneal with each other to form the linear first vector, single-stranded forms of the second linearized vector reanneal with each other to form the linear second vector, and single-stranded linearized forms of the first and second vectors anneal with each other to form a circular heteroduplex bearing a nick in each strand. In the introduction step, transformed cells comprising the circular heteroduplexes are selected relative to the linear first and second vectors. In some methods, the first and second polynucleotides are obtained from chromosomal DNA. In some methods, the polynucleotide variants encode variants of a polypeptide. In some methods, the population of polynucleotide variants comprises at least 20 variants. In some methods, the polynucleotide variants are at least 10 kb in length.
In some methods, the polynucleotide variants comprise natural variants. In other methods, the polynucleotide variants comprise variants generated by mutagenic PCR or cassette mutagenesis. In some methods, the host cells into which the heteroduplexes are introduced are bacterial cells. In some methods, the population of polynucleotide variants comprises at least 5 polynucleotides having at least 90% sequence identity with one another.
Some methods further comprise a step of at least partially demethylating the variant polynucleotides. Demethylation can be achieved by PCR amplification or by passing the variants through methylation-deficient host cells. Some methods include a further step of sealing one or more nicks in heteroduplex molecules before exposing the heteroduplexes to a DNA repair system. Nicks can be sealed by treatment with DNA ligase. Some methods further comprise a step of isolating a screened recombinant polynucleotide variant. In some methods, the polynucleotide variant is screened to produce a recombinant protein or a secondary metabolite whose production is catalyzed by it. In some methods, the recombinant protein or secondary metabolite is formulated with a carrier to form a pharmaceutical composition. In some methods, the polynucleotide variants encode enzymes selected from the group consisting of proteases, lipases, amylases, cutinases, cellulases, oxidases, peroxidases and phytases. In other methods, the polynucleotide variants encode a polypeptide selected from the group consisting of insulin, ACTH, glucagon, somatostatin, somatotropin, thymosin, parathyroid hormone, pigmentary hormones, somatomedin, erythropoietin, luteinizing hormone, chorionic gonadotropin, hypothalamic releasing factors, antidiuretic hormones, thyroid stimulating hormone, relaxin, interferon, thrombopoietin (TPO), and prolactin. In some methods, each polynucleotide in the population of variant polynucleotides encodes a plurality of enzymes forming a metabolic pathway.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 illustrates the process of heteroduplex formation using the polymerase chain reaction (PCR) with one set of primers for each different sequence to amplify the target sequence and vector.
Figure 2 illustrates the process of heteroduplex formation using restriction enzymes to linearize the target sequence and vector.
Figure 3 illustrates a process of heteroduplex formation using asymmetric or single-primer PCR with one set of primers for each different sequence to amplify the target sequence and vector.
Figure 4 illustrates heteroduplex recombination using unique restriction enzymes (X and Y) to remove homoduplexes.
Figure 5 shows the amino acid sequences of FlaA from R. lupini (SEQ ID NO: 1) and R. meliloti (SEQ ID NO: 2).
Figures 6A and 6B show the location of the unique restriction sites used to linearize pRL20 and pRM40.
Figures 7A, B, and C show the DNA sequences of four mosaic flaA genes created by in vitro heteroduplex formation followed by in vivo repair ((a) is SEQ ID NO: 3, (b) is SEQ ID NO: 4, (c) is SEQ ID NO: 5 and (d) is SEQ ID NO: 6).
Figure 8 illustrates how the heteroduplex repair process created mosaic flaA genes containing sequence information from both parental genes.
Figure 9 shows physical maps of Actinoplanes utahensis ECB deacylase mutants with increased specific activity ((a) is pM7-2 for Mutant 7-2, and (b) is pM16 for Mutant 16).
Figure 10 illustrates the process used in Example 2 to recombine the mutations in Mutant 7-2 and Mutant 16 to produce a recombined ECB deacylase with increased specific activity.
Figure 11 shows the specific activities of wild-type ECB deacylase and of the improved mutants Mutant 7-2, Mutant 16 and recombined Mutant 15.
Figure 12 shows the positions of the DNA base changes and amino acid substitutions in recombined ECB deacylase Mutant 15 with respect to the parental sequences of Mutant 7-2 and Mutant 16.
Figures 13A, B, C, D and E show the DNA sequence of the A. utahensis ECB deacylase gene mutant M-15 created by in vitro heteroduplex formation followed by in vivo repair (SEQ ID NO: 7).
Figure 14 illustrates the process used in Example 3 to recombine the mutations in RC1 and RC2 to produce thermostable subtilisin E.
Figure 15 illustrates the sequences of RC1 and RC2 and of ten clones picked at random from the transformants of the heteroduplex formation reaction products described in Example 3. Each x marks a base position that differs between RC1 and RC2. The mutation at 995 corresponds to an amino acid substitution at 181, while the mutation at 1107 corresponds to an amino acid substitution at 218 in the subtilisin protein sequence.
Figure 16 shows the results of screening 40 clones from the library created by heteroduplex formation and repair for initial activity (Ai) and residual activity (Af). The Ai/Af ratio was used to estimate the thermostability of the enzymes. The data for active variants were sorted and plotted in descending order. Approximately 12.9% of the clones exhibit a phenotype corresponding to the double mutant containing both the N181D and N218S mutations.
DEFINITIONS
Screening is, in general, a two-step process in which one first physically separates the cells and then determines which cells do and do not possess a desired property. Selection is a form of screening in which identification and physical separation are achieved simultaneously by expression of a selection marker, which, in some genetic circumstances, allows cells expressing the marker to survive while other cells die (or vice versa). Exemplary screening markers include luciferase, β-galactosidase and green fluorescent protein. Selection markers include drug and toxin resistance genes. Although spontaneous selection can and does occur in the course of natural evolution, in the present methods selection is performed by man.
An exogenous DNA segment is one that is foreign (or heterologous) to the cell, or homologous to the cell but at a position within the host cell nucleic acid in which the element is not ordinarily found. Exogenous DNA segments are expressed to yield exogenous polypeptides.
The term "gene" is used broadly to refer to any DNA segment associated with a biological function. Thus, genes include coding sequences and/or the regulatory sequences required for their expression. Genes also include nonexpressed DNA segments that, for example, form recognition sequences for other proteins.
The term "wild-type" means that the nucleic acid fragment does not comprise any mutation. A "wild-type" protein means that the protein will be active at a level of activity found in nature and will typically comprise the amino acid sequence found in nature. In one aspect, the term "wild-type" or "parental sequence" can indicate a starting or reference sequence prior to a manipulation of the invention.
"Substantially pure" means that an object species is the predominant species present (i.e., on a molar basis it is more abundant than any other individual macromolecular species in the composition), and preferably a substantially pure fraction is a composition in which the object species comprises at least about 50 percent (on a molar basis) of all macromolecular species present. Generally, a substantially pure composition will comprise more than about 80 to 90 percent of all macromolecular species present in the composition. Most preferably, the object species is purified to essential homogeneity (contaminant species cannot be detected in the composition by conventional detection methods), so that the composition consists essentially of a single macromolecular species. Solvent species, small molecules (<500 Daltons), and elemental ion species are not considered macromolecular species.
Percent sequence identity is calculated by comparing two optimally aligned sequences over the comparison window, determining the number of positions at which the identical nucleic acid base occurs in both sequences to yield the number of matched positions, and dividing the number of matched positions by the total number of positions in the comparison window, with the result expressed as a percentage. Optimal alignment of sequences over a comparison window can be carried out by computerized implementations of the GAP, BESTFIT, FASTA, and TFASTA algorithms in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, WI.
The term "naturally occurring" is used to describe an object that can be found in nature, as distinct from one artificially produced by man. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses), that can be isolated from a source in nature, and that has not been intentionally modified by man in the laboratory is naturally occurring. In general, the term naturally occurring refers to an object as present in a non-pathological (non-diseased) individual, such as would be typical for the species.
A nucleic acid is operably linked when it is placed into a functional relationship with another nucleic acid sequence. For example, a promoter or enhancer is operably linked to a coding sequence if it increases the transcription of the coding sequence. Operably linked means that the DNA sequences being linked are typically contiguous and, where necessary to join two protein coding regions, contiguous and in reading frame. However, since enhancers generally function when separated from the promoter by several kilobases, and intronic sequences may be of variable length, some polynucleotide elements may be operably linked but not contiguous.
A specific binding affinity between, for example, a ligand and a receptor means a binding affinity of at least 1 x 10^6 M^-1.
The term "cognate" as used herein refers to a gene sequence that is evolutionarily and functionally related between species. For example, but without limitation, in the human genome the human CD4 gene is the cognate gene to the mouse CD4 gene, since the sequences and structures of these two genes indicate that they are highly homologous and both genes encode a protein that functions in signaling T cell activation through MHC class II-restricted antigen recognition.
The term "heteroduplex" refers to hybrid DNA generated by base pairing between complementary single strands derived from different parental duplex molecules, whereas the term "homoduplex" refers to double-stranded DNA generated by base pairing between complementary single strands derived from the same parental duplex molecules. The term "nick" in duplex DNA refers to the absence of a phosphodiester bond between two adjacent nucleotides on one strand. The term "gap" in duplex DNA refers to the absence of one or more nucleotides in one strand of the duplex. The term "loop" in duplex DNA refers to one or more unpaired nucleotides in one strand. A mutant or variant sequence is a sequence showing substantial variation from a wild-type or reference sequence, differing from the wild-type or reference sequence at one or more positions.
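The percent sequence identity calculation defined in this section can be illustrated with a minimal sketch. It assumes the two sequences have already been optimally aligned to the same length (the alignment itself, for example by GAP or BESTFIT, is not reproduced here), and the function name `percent_identity` is a hypothetical one chosen for illustration.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Assumption: seq_a and seq_b are already aligned, so position i of one
    corresponds to position i of the other.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("comparison window is empty")
    # Count positions at which the identical base occurs in both sequences.
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    # Divide by the total number of positions in the window, as a percentage.
    return 100.0 * matches / len(seq_a)


print(percent_identity("ATGCCGTA", "ATGACGTT"))  # 6 of 8 positions match
```

The sketch treats every column of the window equally; how gap characters should be scored is left open, since the definition above does not specify it.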
DETAILED DESCRIPTION
I. General Aspects
The present invention provides methods for evolving a polynucleotide toward acquisition of a desired property. The substrates for the method are a population of at least two polynucleotide variant sequences that contain regions of similarity with each other but also have points or regions of divergence. The substrates are annealed in vitro at the regions of similarity. Annealing can regenerate the initial substrates or can form heteroduplexes, in which the component strands originate from different parents. The products of annealing are exposed to DNA repair enzymes, and optionally a replication system, which repair the mismatched pairings. Exposure can be in vivo, as when annealed products are transformed into host cells and exposed to the host DNA repair system. Alternatively, exposure can be in vitro, as when annealed products are exposed to cellular extracts containing functional DNA repair systems. Exposure of heteroduplexes to a DNA repair system results in DNA repair at bulges in the heteroduplexes caused by DNA mismatching. The repair process differs from homologous recombination in that it promotes nonreciprocal exchange of diversity between strands. The DNA repair process is typically carried out on both component strands of a heteroduplex molecule, and at any particular mismatch it is typically random which strand is repaired. The resulting population can thus contain recombinant polynucleotides encompassing an essentially random reassortment of the points of divergence between the parental strands. The population of recombinant polynucleotides is then screened for acquisition of a desired property. The property can be a property of the polynucleotide per se, such as the capacity of a DNA molecule to bind to a protein, or can be a property of an expression product thereof, such as mRNA or a protein.
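The random, mismatch-by-mismatch outcome of repair described above can be pictured with a toy simulation. This is only a sketch under simplifying assumptions that are not part of the description: both strands are written in the same orientation so positions can be compared directly, every mismatch is resolved independently of its neighbors, and the function name `repair_heteroduplex` and the example sequences are hypothetical.

```python
import random


def repair_heteroduplex(strand_a: str, strand_b: str, rng: random.Random) -> str:
    """Toy repair: at each mismatch keep the base from a randomly chosen strand."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must be the same length")
    repaired = []
    for base_a, base_b in zip(strand_a, strand_b):
        if base_a == base_b:
            repaired.append(base_a)  # matched position: nothing to repair
        else:
            repaired.append(rng.choice((base_a, base_b)))  # random strand wins
    return "".join(repaired)


parent_a = "ATGGCTAAC"
parent_b = "ATGACTAAT"  # differs from parent_a at positions 3 and 8 (0-based)
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
products = {repair_heteroduplex(parent_a, parent_b, rng) for _ in range(200)}
print(sorted(products))  # parental sequences plus recombinants of the two sites
```

With two points of divergence there are at most four distinct products (two parental, two recombinant); with n independent points of divergence the toy model allows up to 2^n.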
II. Substrates for Shuffling
The substrates for shuffling are variants of a reference polynucleotide that show some regions of similarity with the reference and other regions or points of divergence. The regions of similarity should be sufficient to support annealing of the polynucleotides, so that stable heteroduplexes can be formed. Variant forms often show substantial sequence identity with each other (e.g., at least 50%, 75%, 90% or 99%). There should be at least sufficient diversity between substrates that recombination can generate products more diverse than the starting materials. Thus, there must be at least two substrates differing in at least two positions. The degree of diversity depends on the length of the substrate being recombined and the extent of the functional change to be evolved. Diversity at between 0.1% and 25% of positions is typical. Recombination of mutations from very closely related genes, or even of whole sections of sequence from more distantly related genes or sets of genes, can increase the rate of evolution and the acquisition of desirable new properties. Recombination to create chimeric or mosaic genes can be useful for combining desirable features of two or more parents into a single gene or set of genes, or for creating novel functional features not found in the parents. The number of different substrates to be combined can vary widely, from two to 10, 100, 1000, to more than 10^5, 10^7, or 10^9 members.
The initial small population of specific nucleic acid sequences having mutations can be created by a number of different methods. Mutations can be created by error-prone PCR. Error-prone PCR uses low-fidelity polymerization conditions to introduce a low level of point mutations randomly over a long sequence. Alternatively, mutations can be introduced into the template polynucleotide by oligonucleotide-directed or site-directed mutagenesis.
In oligonucleotide-directed mutagenesis, a short sequence of the polynucleotide is removed using restriction enzyme digestion and replaced with a synthetic polynucleotide in which various bases of the original sequence have been altered. The polynucleotide sequence can also be altered by chemical mutagenesis. Chemical mutagens include, for example, sodium bisulfite, nitrous acid, hydroxylamine, hydrazine or formic acid. Other agents, which are analogues of nucleotide precursors, include nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine. Generally, these agents are added to the PCR reaction in place of the nucleotide precursor, thereby mutating the sequence. Intercalating agents such as proflavine, acriflavine, quinacrine, and the like can also be used. Random mutagenesis of the polynucleotide sequence can also be achieved by irradiation with X-rays or ultraviolet light. Generally, plasmid DNA mutagenized in this way is introduced into E. coli and propagated as a pool or library of mutant plasmids.
Alternatively, the small mixed population of specific nucleic acids can be found in nature in the form of different alleles of the same gene or the same gene from different related species (i.e., cognate genes). Alternatively, the substrates can be related but nonallelic genes, such as the immunoglobulin genes. Diversity can also be the result of previous recombination or shuffling. Diversity can also result from resynthesizing genes encoding natural proteins with alternative codon usage.
The initial substrates encode variant forms of the sequences to be evolved. In some methods, the substrates encode variant forms of a protein for which evolution of a new or modified property is desired. In other methods, the substrates can encode variant forms of a plurality of genes constituting a multigenic pathway.
In such methods, variation can occur in one or any number of the component genes. In other methods, the substrates can contain variant segments to be evolved as DNA or RNA binding sequences. In methods in which the initial substrates contain coding sequences, any essential regulatory sequences, such as a promoter and polyadenylation sequence, required for expression can be present as a component of the substrate. Alternatively, such regulatory sequences can be provided as components of the vectors used for cloning the substrates.
The initial substrates can vary in length from about 50, 250, 1000, 10,000, 100,000, 10^6 or more bases. The initial substrates can be provided in double- or single-stranded form. The initial substrates can be DNA or RNA and analogs thereof. If DNA, the initial substrates can be genomic or cDNA. If the substrates are RNA, they are typically reverse-transcribed to cDNA before heteroduplex formation. The substrates can be provided as cloned fragments, chemically synthesized fragments or PCR amplification products. Substrates can be derived from chromosomal, plasmid or viral sources. In some methods, the substrates are provided in concatemeric form.
III. Procedures for Generating Heteroduplexes
Heteroduplexes are generated from double-stranded DNA substrates by denaturing the DNA substrates and incubating under annealing conditions. Hybridization conditions for heteroduplex formation are sequence-dependent and differ in different circumstances. Longer sequences hybridize specifically at higher temperatures. In general, hybridization conditions are selected to be approximately 25 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium.
Exemplary conditions for the denaturation and renaturation of double-stranded substrates are as follows. Equimolar concentrations (~1.0-5.0 nM) of the substrates are mixed in 1x SSPE buffer (80 mM NaCl, 1.0 mM EDTA, 10 mM NaH2PO4, pH 7.4). After heating at 96 °C for 10 minutes, the reaction mixture is immediately cooled at 0 °C for 5 minutes. The mixture is then incubated at 68 °C for 2-6 hours. Denaturation and reannealing can also be carried out by the addition and removal of a denaturant such as NaOH. The procedure is the same for single-stranded DNA substrates, except that the denaturing step can be omitted for short sequences.
By appropriate design of the substrates for heteroduplex formation, it is possible to achieve selection for heteroduplexes relative to reformed parental homoduplexes. Homoduplexes merely reconstruct the parental substrates and effectively dilute the recombinant products in subsequent screening steps. In general, selection is achieved by designing the substrates so that heteroduplexes are formed as open circles, whereas homoduplexes are formed as linear molecules.
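Setting up the equimolar, nanomolar-range mixture in the exemplary annealing conditions requires converting a measured mass concentration into molarity. The following sketch shows that arithmetic together with the "about 25 degrees below Tm" rule. It assumes an average mass of 660 g/mol per base pair of double-stranded DNA, a common approximation that is not stated in the description, and all function names are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (assumed approximation)


def dsdna_nanomolar(ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    # 1 ng/uL = 1e-3 g/L; divide by molar mass, then scale mol/L to nmol/L.
    return ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def ng_per_ul_for(target_nm: float, length_bp: int) -> float:
    """Mass concentration (ng/uL) needed to reach a target molarity (nM)."""
    return target_nm * AVG_BP_MASS * length_bp / 1e6


def annealing_temperature(tm_celsius: float, offset: float = 25.0) -> float:
    """Hybridization temperature chosen about 25 degrees C below the Tm."""
    return tm_celsius - offset


# Two linearized vectors of different length brought to the same 2 nM.
for name, length in (("vector 1", 5000), ("vector 2", 6500)):
    print(name, round(ng_per_ul_for(2.0, length), 2), "ng/uL for 2 nM")
```

Because molarity depends on length, substrates of different size need different mass concentrations to be equimolar, which is why the conversion matters for this step.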
A subsequent transformation step results in substantial enrichment (for example, 100-fold) of the circular heteroduplexes.
Figure 1 shows a method in which two substrate sequences in separate vectors are amplified by PCR using two different sets of primers (P1, P2 and P3, P4). Typically, the first and second substrates are inserted into separate copies of the same vector. The two different primer pairs initiate amplification at different points on the two vectors. Figure 1 shows an arrangement in which the P1/P2 primer pair initiates amplification at one of the two boundaries between the vector and the substrate, and the P3/P4 primer pair initiates replication at the other boundary in a second vector. The two primers in each primer pair prime amplification in opposite directions around a circular plasmid. The amplification products are double-stranded linearized vector molecules in which the first and second substrates occur at opposite ends of the vector. The amplification products are mixed, denatured and annealed. Mixing and denaturation can be carried out in either order. Reannealing generates two linear homoduplexes and an open circular heteroduplex containing one nick in each strand, at the initiation point of the PCR amplification. Introduction of the amplification products into host cells selects for the heteroduplexes relative to the homoduplexes because the former transform much more efficiently than the latter.
It is not essential in the above scheme that amplification be initiated at the interface between the substrate and the rest of the vector. Rather, amplification can be initiated at any points on the two substrate-bearing vectors, provided that it is initiated at different points in the two vectors.
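The geometry of the Figure 1 scheme can be sketched as a small model: each linearized vector is characterized only by the map position at which it was opened, and a top strand annealed to a bottom strand gives a linear duplex when both were opened at the same point and an open circle with one strand break (nick) per strand otherwise. The class name, field names and map positions below are hypothetical illustrations, not values from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strand:
    vector: str  # which linearized vector the strand came from
    start: int   # map position (bp) at which that vector was opened
    sense: str   # "top" or "bottom"


def anneal(top: Strand, bottom: Strand) -> dict:
    """Classify the duplex formed by one top strand and one bottom strand."""
    if top.sense != "top" or bottom.sense != "bottom":
        raise ValueError("need one top strand and one bottom strand")
    kind = "homoduplex" if top.vector == bottom.vector else "heteroduplex"
    if top.start == bottom.start:
        # Strand ends coincide, so the duplex has two free ends: linear.
        return {"kind": kind, "topology": "linear", "nicks": ()}
    # Strand ends are offset, so each strand bridges the other's break.
    return {"kind": kind, "topology": "open circle",
            "nicks": (top.start, bottom.start)}


strands = [
    Strand("vector1", 0, "top"), Strand("vector1", 0, "bottom"),
    Strand("vector2", 1200, "top"), Strand("vector2", 1200, "bottom"),
]
for top in (s for s in strands if s.sense == "top"):
    for bottom in (s for s in strands if s.sense == "bottom"):
        print(top.vector, "+", bottom.vector, "->", anneal(top, bottom))
```

In this model the two same-vector pairings come out linear and the two cross pairings come out as open circles, which is the basis for the enrichment of heteroduplexes on transformation.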
In the general case, such amplification generates two linearized vectors in which the first and second substrates respectively occupy different positions relative to the rest of the vector. Denaturation and reannealing generate heteroduplexes similar to those shown in Figure 1, except that the nicks occur within the vector component rather than at the interface between plasmid and substrate. Initiating amplification outside the substrate component of a vector has the advantage that it is not necessary to design primers specific for the substrate carried by the vector.
Although only two substrates are shown in Figure 1, the above scheme can be extended to any number of substrates. For example, an initial population of substrate-bearing vectors can be divided into two pools. One pool is amplified by PCR from one set of primers, and the other pool from another set. The amplification products are denatured and annealed as before. Heteroduplexes can form containing one strand from any substrate in the first pool and one strand from any substrate in the second pool. Alternatively, three or more substrates cloned into multiple copies of a vector can be amplified, with amplification in each vector starting at a different point. For each substrate, this process generates amplification products that vary in how the flanking vector DNA is divided between the two sides of the substrate. For example, one amplification product might have most of the vector on one side of the substrate, another amplification product might have most of the vector on the other side of the substrate, and a further amplification product might have an equal division of vector sequence flanking the substrate. In the subsequent annealing step, a strand of a substrate can form a circular heteroduplex with a strand of any other substrate, but strands of the same substrate can only reanneal with each other to form a linear homoduplex.
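The two extensions described above can be enumerated with a short sketch: with two pools, circular heteroduplexes pair a strand from one pool with a strand from the other; with three or more substrates each opened at its own point, any two different substrates can pair. The substrate labels and function names are hypothetical, and the count treats the two strand orientations of a pair as distinct.

```python
from itertools import permutations, product


def two_pool_heteroduplexes(pool_a, pool_b):
    """(top-strand source, bottom-strand source) pairs spanning the two pools."""
    return list(product(pool_a, pool_b)) + list(product(pool_b, pool_a))


def distinct_start_heteroduplexes(substrates):
    """Ordered pairs of different substrates, each opened at its own point."""
    return list(permutations(substrates, 2))


pool_a = ["A1", "A2"]
pool_b = ["B1", "B2", "B3"]
print(len(two_pool_heteroduplexes(pool_a, pool_b)))            # 2 * 2 * 3 = 12
print(len(distinct_start_heteroduplexes(["S1", "S2", "S3"])))  # 3 * 2 = 6
```

Note the design difference between the schemes: in the two-pool version substrates within the same pool share a start point and so cannot form circles with each other, whereas giving every substrate its own start point allows all pairwise combinations.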
In a further variation, multiple substrates can be recombined by performing multiple iterations of the scheme in Figure 1. After the first iteration, the recombinant polynucleotides in a vector undergo heteroduplex formation with a third substrate incorporated into an additional copy of the vector. The vector containing the recombinant polynucleotides and the vector containing the third substrate are separately amplified by PCR from different primer pairs. The amplification products are then denatured and annealed. The process can be repeated further times to allow recombination with additional substrates.
An alternative scheme for heteroduplex formation is shown in Figure 2. Here, the first and second substrates are incorporated into separate copies of a vector. The two copies are then digested with different restriction enzymes. Figure 2 shows an arrangement in which the restriction enzymes cut at opposite boundaries between substrate and vector, but all that is needed is two different restriction enzymes that cut at different places. The digestion generates linearized first and second vectors containing the first and second substrates, with the two substrates occupying different positions relative to the remaining vector sequences. Denaturation and annealing generate open circular heteroduplexes and linear homoduplexes. The scheme can be extended to recombination between more than two substrates using strategies analogous to those described with respect to Figure 1. In one variation, two groups of substrates are formed, and each is separately cloned into the vector. The two groups are then cut with different enzymes, and annealing proceeds as for two substrates. In another variation, three or more substrates can be cloned into three or more copies of the vector, and the three or more resulting molecules cut with three or more enzymes that cut at three or more sites. This generates three or more linearized vector forms that differ in how the vector sequences flanking the substrate portion are divided. Alternatively, any number of substrates may be recombined pairwise in an iterative fashion, with the products of one round of recombination annealing with a fresh substrate in each round. In a further variation, heteroduplexes of substrate molecules can be formed in vector-free form, and the heteroduplexes subsequently cloned into vectors. This can be achieved by asymmetric amplification of the first and second substrates, as shown in Figure 3.
Asymmetric or single-primer PCR amplifies only one strand of a duplex. By appropriate selection of primers, opposite strands of two different substrates can be amplified. On annealing the products, heteroduplexes are formed from the opposite strands of the two substrates. Because only one strand of each substrate is amplified, annealing does not form homoduplexes (other than small quantities from unamplified substrate). The process can be extended to allow recombination of any number of substrates using strategies analogous to those described with respect to Figure 1. For example, the substrates can be divided into two groups, and each group subjected to the same asymmetric amplification, so that the amplification products of one group can anneal only with the amplification products of the other group, and not with each other. Alternatively, shuffling can proceed pairwise in an iterative manner, in which recombinants formed from heteroduplexes of the first and second substrates are subsequently subjected to heteroduplex formation with a third substrate. Point mutations can also be introduced at a desired level during PCR amplification. Figure 4 shows another method for selecting heteroduplexes over homoduplexes. The first and second substrates are isolated by PCR amplification from separate vectors. The substrates are denatured and allowed to anneal, forming both heteroduplexes and reconstructed homoduplexes. The annealing products are digested with restriction enzymes X and Y. X has a site in the first substrate but not the second, and vice versa for Y. Enzyme X cuts the reconstructed homoduplex of the first substrate and enzyme Y cuts the reconstructed homoduplex of the second substrate. Neither enzyme cuts the heteroduplex.
The heteroduplex can be effectively separated from the homoduplex restriction fragments by further cleavage with enzymes A and B, which have sites near the ends of both the first and second substrates, and ligation of the products into a vector having cohesive ends compatible with the ends resulting from digestion with A and B. Only the heteroduplexes cut with A and B can be ligated into the vector. Alternatively, the heteroduplexes can be separated from the restriction fragments of the homoduplexes by size selection on a gel. The above process can be generalized to N substrates by cleaving the mixture of heteroduplexes and homoduplexes with N enzymes, each of which cuts a different substrate and no other substrate. Heteroduplexes can also be formed by directional cloning. Two substrates for heteroduplex formation can be obtained by PCR amplification of chromosomal DNA and joined to opposite ends of a linear vector. Directional cloning can be achieved by digesting the vector with two different enzymes, and digesting or adapting the first and second substrates so that each is compatible with the cohesive end left by only one of the two enzymes used to cut the vector. The first and second substrates can thus be ligated to opposite ends of a linearized vector fragment. This scheme can be extended to any number of substrates using principles analogous to those described for Figure 1. For example, substrates can be divided into two sets before ligation to the vector. Alternatively, the recombinant products formed by heteroduplex formation between the first and second substrates can subsequently be subjected to heteroduplex formation with a third substrate.
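The restriction-based selection of Figure 4, generalized to N substrates, can be sketched as a set calculation. The sketch assumes, in line with the statement that no enzyme cuts the heteroduplex, that an enzyme cleaves an annealed duplex only when its recognition site is carried by both strands. The substrate names and enzyme labels are hypothetical placeholders.

```python
def uncut_duplexes(sites, enzymes):
    """Return the (top, bottom) strand pairings that survive digestion.

    `sites` maps each substrate name to the set of enzyme sites unique to
    it. Modelling assumption: a duplex is cut only if a site recognized by
    one of `enzymes` is present on both of its strands.
    """
    active = set(enzymes)
    survivors = []
    for top in sites:
        for bottom in sites:
            intact_on_both = sites[top] & sites[bottom]
            if not (intact_on_both & active):
                survivors.append((top, bottom))
    return survivors

# Three substrates, each carrying one site that the others lack.
sites = {"S1": {"X"}, "S2": {"Y"}, "S3": {"Z"}}
survivors = uncut_duplexes(sites, ["X", "Y", "Z"])
print(survivors)
```

Under this assumption only pairings of strands from different substrates remain uncut, which is the population subsequently trimmed with enzymes A and B and ligated into the vector.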
IV. Vectors and Transformation
In general, substrates are incorporated into vectors either before or after the heteroduplex formation step. A variety of cloning vectors typically used in genetic engineering are suitable. The vectors containing the DNA segments of interest can be transferred into the host cell by standard methods, depending on the type of cellular host. For example, calcium chloride transformation is commonly used for prokaryotic cells, whereas calcium phosphate treatment, lipofection, or electroporation may be used for other cellular hosts. Other methods used to transform mammalian cells include the use of Polybrene, protoplast fusion, liposomes, electroporation, microinjection, and biolistics (see generally, Sambrook et al., supra). Viral vectors can also be packaged in vitro and introduced by infection. The choice of vector depends on the host cells. In general, a suitable vector has an origin of replication recognized in the desired host cell, a selectable marker capable of being expressed in the intended host cells and/or regulatory sequences to support expression of genes within the substrates being shuffled.
V. Types of Host Cells
In general, any type of cell that supports repair and replication of the heteroduplex DNA introduced into it can be used. Cells of particular interest are the standard cell types commonly used in genetic engineering, such as bacteria, particularly E. coli (16, 17). Suitable E. coli cells include E. coli mutS, mutL, dam-, and/or recA+ strains, E. coli XL-10-Gold ([Tetr Δ(mcrA)183 Δ(mcrCB-hsdSMR-mrr)173 endA1 supE44 thi-1 recA1 gyrA96 relA1 lac Hte] [F' proAB lacIqZΔM15 Tn10 (Tetr) Amy Camr]), and E. coli ES1301 mutS [Genotype: lacZ53, mutS201::Tn5, thyA36, rha-5, metB1, deoC, IN(rrnD-rrnE)] (20, 24, 28-82). A preferred E. coli strain is E. coli SCS110 [Genotype: rpsL (Strr), thr, leu, endA, thi-1, lacY, galK, galT, ara, tonA, tsx, dam, dcm, supE44, Δ(lac-proAB)], [F' traD36, proA+B+, lacIqZΔM15], which has a normal cellular mismatch repair system (17). This type of strain repairs mismatches and unmatched regions in the heteroduplex. In addition, because this strain is dam- and dcm-, plasmid isolated from it is unmethylated and therefore particularly suitable for further rounds of DNA duplex formation and mismatch repair (see below). Other suitable bacterial cells include gram-negative and gram-positive cells, such as Bacillus, Pseudomonas, and Salmonella. Eukaryotic organisms are also able to carry out mismatch repair (43-48). Mismatch repair systems in both prokaryotes and eukaryotes play an important role in maintaining genetic fidelity during DNA replication. Some of the genes that play important roles in mismatch repair in prokaryotes, particularly mutS and mutL, have homologs in eukaryotes that are involved in the outcome of genetic recombination and in genome stability. Wild-type or mutant S. cerevisiae has been shown to carry out mismatch repair of heteroduplexes (49-56), as have COS-1 monkey cells (57).
The preferred yeast cells are Pichia and Saccharomyces. Mammalian cells have been shown to have the capacity to repair G-T to G-C base pairs by a short-patch mechanism (38, 58-63). Mammalian cells (e.g., mouse, hamster, primate, human), both cell lines and primary cultures, can also be used. Such cells include proliferating cells, including embryonic proliferating cells, zygotes, fibroblasts, lymphocytes, Chinese hamster ovary (CHO) cells, mouse fibroblasts (NIH3T3), and kidney, liver, muscle and skin cells. Other eukaryotic cells of interest include plant cells, such as maize, rice, wheat, cotton, soybean, sugarcane, tobacco and Arabidopsis; fish; algae; fungi (Aspergillus, Podospora, Neurospora); and insects (e.g., Lepidoptera) (see Winnacker, "From Genes to Clones," VCH Publishers, NY, (1987), which is incorporated herein by reference). In vivo repair occurs in a wide variety of prokaryotic and eukaryotic cells. The use of mammalian cells is advantageous in certain applications in which the substrates encode polypeptides that are expressed only in mammalian cells or that are intended for use in mammalian cells. However, bacterial and yeast cells are advantageous for screening large libraries because of the higher transformation frequencies attainable in those strains.
V. In Vitro DNA Repair Systems
As an alternative to introducing annealed products into host cells, the annealed products can be exposed to a DNA repair system in vitro. The DNA repair system can be obtained as extracts from repair-competent E. coli, yeast or other cells (64-67). The repair-competent cells are lysed in an appropriate buffer and the extract is supplemented with nucleotides. The DNA is incubated in this cell extract and then transformed into whole cells for replication.
V. Screening and Selection
After introduction of the annealed products into host cells, the host cells are typically cultured to allow repair and replication to occur and, optionally, to allow the genes encoded by the polynucleotides to be expressed. The recombinant polynucleotides can be subjected to further rounds of recombination using the heteroduplex methods described above, or the other shuffling methods described below. However, whether after one cycle of recombination or several, the recombinant polynucleotides are subjected to screening or selection for a desired property. In some cases, screening or selection is performed in the same host cells that are used for DNA repair. In other cases, the recombinant polynucleotides, their expression products or secondary metabolites produced by the expression products are isolated from such cells and screened in vitro. In other cases, the recombinant polynucleotides are isolated from the host cells in which recombination occurs and are screened or selected in other host cells. For example, in some methods it is advantageous to allow DNA repair to occur in a bacterial host cell, but to screen an expression product of the recombinant polynucleotides in eukaryotic cells. The recombinant polynucleotides that survive screening or selection are sometimes useful products in themselves. In other cases, such recombinant polynucleotides are subjected to further recombination with each other or with other substrates. Such recombination can be effected by the heteroduplex methods described above or by any other shuffling method. Further rounds of recombination are followed by further rounds of screening or selection on an iterative basis. Optionally, the stringency of selection can be increased in each round. The nature of the screening or selection depends on the desired property sought to be acquired.
Desirable properties of enzymes include high catalytic activity, capacity to confer resistance to drugs, high stability, the ability to accept a wider (or narrower) range of substrates, and the ability to function in non-natural environments such as organic solvents. Other desirable properties of proteins include the capacity to bind a selected target, secretory capacity, the capacity to generate an immune response to a given target, lack of immunogenicity, and toxicity to pathogenic microorganisms. Desirable properties of DNA or RNA polynucleotide sequences include the capacity to bind specifically to a given protein target, and the capacity to regulate expression of operably linked coding sequences. Some of the above properties, such as drug resistance, can be selected by culturing cells on the drug. Other properties, such as the influence of a regulatory sequence on expression, can be screened by detecting the appearance of the expression product of a reporter gene linked to the regulatory sequence. Other properties, such as the capacity of a protein to be secreted, can be screened by FACS, using a labeled antibody to the protein. Other properties, such as immunogenicity or lack thereof, can be screened by isolating protein from individual cells or pools of cells, and analyzing the protein in vitro or in a laboratory animal.
VII. Variations
1. Demethylation
Most cell types methylate DNA in some manner, with the pattern of methylation differing between cell types. Sites of methylation include 5-methylcytosine (m5C), N4-methylcytosine (m4C), N6-methyladenine (m6A), 5-hydroxymethylcytosine (hm5C) and 5-hydroxymethyluracil (hm5U). In E. coli, methylation is effected by the Dam and Dcm enzymes. The methylase specified by the dam gene methylates the N6 position of the adenine residue in the sequence GATC, and the methylase specified by the dcm gene methylates the C5 position of the internal cytosine residue in the sequence CCWGG. DNA from plants and mammals is frequently subject to CG methylation, meaning that CG and CNG sequences are methylated. The possible effects of methylation on cellular repair are discussed in references 8-20. In some methods, DNA substrates for heteroduplex formation are at least partially demethylated on one or both strands, preferably the latter. Demethylation of substrate DNA promotes efficient and random repair of the heteroduplexes. In heteroduplexes formed from one dam-methylated strand and one unmethylated strand, repair is biased toward the unmethylated strand, with the methylated strand serving as the template for correction. If neither strand is methylated, mismatch repair occurs but shows insignificant strand preference (23, 24). Demethylation can be performed in a variety of ways. In some methods, substrate DNA is demethylated by PCR amplification. In some instances, DNA demethylation is accomplished in one of the PCR steps in the heteroduplex formation methods described above. In other methods, an additional PCR step is performed to effect demethylation. In other methods, demethylation is effected by passaging substrate DNA through methylation-deficient host cells (e.g., an E. coli dam- dcm- strain). In other methods, substrate DNA is demethylated in vitro using a demethylating enzyme.
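To make the Dam and Dcm recognition rules concrete, the following sketch locates the bases that would be methylated on one strand of a sequence. The function name and the example sequence are illustrative assumptions, not part of the described methods. Positions are 0-based, and because both GATC and CCWGG are palindromic, the complementary strand carries a site wherever the scanned strand does.

```python
import re

# Lookahead patterns so that overlapping sites are all reported.
DAM_SITE = re.compile(r"(?=GATC)")        # Dam methylates N6 of the A in GATC
DCM_SITE = re.compile(r"(?=CC[AT]GG)")    # Dcm methylates C5 of the internal C in CCWGG (W = A or T)

def methylation_sites(seq):
    """Return 0-based positions of the methylated base on the given strand."""
    seq = seq.upper()
    dam = [m.start() + 1 for m in DAM_SITE.finditer(seq)]  # A is the 2nd base of GATC
    dcm = [m.start() + 1 for m in DCM_SITE.finditer(seq)]  # internal C is the 2nd base of CCWGG
    return {"dam_m6A": dam, "dcm_m5C": dcm}

example = "AAGATCCCAGGTT"  # hypothetical sequence with one site of each kind
found = methylation_sites(example)
print(found)
```

A sequence for which both lists are empty would be unaffected by Dam or Dcm regardless of the host strain. For other sequences, PCR amplification or passage through a dam- dcm- host, as described above, yields unmethylated substrate.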
The demethylated DNA is used for heteroduplex formation using the same procedures described above. The heteroduplexes are subsequently introduced into cells that are proficient in DNA repair but deficient in restriction enzymes, to prevent degradation of the unmethylated heteroduplexes.
2. Sealing Nicks
Several of the methods of heteroduplex formation described above result in circular heteroduplexes bearing a nick in each strand. These nicks can be sealed before introducing the heteroduplexes into host cells. Sealing can be effected by treatment with DNA ligase under standard ligation conditions. Ligation forms a phosphodiester bond linking the two adjacent bases separated by a nick in one strand of the DNA double helix. Sealing the nicks increases the frequency of recombination after introduction of the heteroduplexes into host cells.
3. Error-Prone PCR Accompanying Amplification
Several of the formats described above include a PCR amplification step. Optionally, such a step can be performed under mutagenic conditions to induce additional diversity among the substrates.
VIII. Other Shuffling Methods
The methods of heteroduplex formation described above can be used in conjunction with other shuffling methods. For example, one cycle of heteroduplex shuffling and of screening or selection may be performed, followed by a cycle of shuffling by another method, followed by a further cycle of screening or selection. Other shuffling formats are described in WO 95/22625; US 5,605,793; US 5,811,238; WO 96/19256; Stemmer, Science 270, 1510 (1995); Stemmer et al., Gene, 164, 49-53 (1995); Stemmer, Bio/Technology, 13, 549-553 (1995); Stemmer, Proc. Natl. Acad. Sci. USA 91, 10747-10751 (1994); Stemmer, Nature 370, 389-391 (1994); Crameri et al., Nature Medicine, 2(1): 1-3, (1996); Crameri et al., Nature Biotechnology 14, 315-319 (1996); WO 98/42727; WO 98/61622; WO 98/05764 and WO 98/42728, WO 98/239 (each of which is incorporated herein by reference in its entirety for all purposes).
IX. Protein Analogs
Proteins isolated by the methods also serve as lead compounds for the development of derivative compounds. The derivative compounds can include chemical modifications of amino acids or replacement of amino acids with chemical structures. The analogs should have a stabilized electronic configuration and molecular conformation that allows key functional groups to be presented in substantially the same way as in the lead protein. In particular, the non-peptidic compounds have spatial electronic properties comparable to those of the polypeptide binding region, but will typically be much smaller molecules than the polypeptides, frequently having a molecular weight below about 2 kD, and preferably below 1 kD. Identification of such non-peptidic compounds can be performed through several standard methods such as self-consistent field (SCF) analysis, configuration interaction (CI) analysis, and normal mode dynamics analysis. Computer programs for implementing these techniques are readily available. See Rein et al., Computer-Assisted Modeling of Receptor-Ligand Interactions (Alan Liss, New York, 1989).
IX. Pharmaceutical Compositions
Polynucleotides, their expression products, and secondary metabolites whose formation is catalyzed by the expression products, generated by the above methods, are optionally formulated as pharmaceutical compositions. Such compositions comprise one or more active agents and a pharmaceutically acceptable carrier. A variety of aqueous carriers can be used, for example, water, buffered water, phosphate-buffered saline (PBS), 0.4% saline, 0.3% glycine, human albumin solution and the like. These solutions are sterile and generally free of particulate matter. The compositions may contain pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions, such as pH adjusting and buffering agents, tonicity adjusting agents and the like, for example, sodium acetate, sodium chloride, potassium chloride and calcium chloride. The concentrations are selected primarily on the basis of fluid volumes, viscosities and so forth, in accordance with the particular mode of administration selected.
EXAMPLES
EXAMPLE 1. Novel Rhizobium flaA Genes from the Recombination of Rhizobium lupini flaA and Rhizobium meliloti flaA
Bacterial flagella have a helical filament, a proximal hook and a basal body with the flagellar motor (68). This basic design has been extensively examined in E. coli and S. typhimurium and is widely applicable to many other bacteria as well as some archaea. The long helical filaments are polymers assembled from flagellin subunits, whose molecular weights range from 20,000 to 65,000, depending on the bacterial species (69). Two types of flagellar filaments, called plain and complex, have been distinguished by their surface structures as determined by electron microscopy (70). Plain filaments have a smooth surface with helical lines, whereas complex filaments exhibit a conspicuous helical pattern of alternating ridges and grooves. These characteristics of complex flagellar filaments are considered to be responsible for the brittle and (by implication) rigid structure that allows the bacteria to propel themselves efficiently in viscous media (71-73). Whereas flagella with plain filaments can alternate between clockwise and counterclockwise rotation (68), all known flagella with complex filaments rotate only clockwise with intermittent stops (74). Since this latter navigation pattern is found across bacteria and archaea, it has been suggested that complex flagella may reflect the common antecedent of a basic, ancestral motility design (69). Differing from plain bacterial flagella in the fine structure of their filaments, which is dominated by conspicuous helical bands, and in their fragility, complex filaments are also resistant to thermal decomposition (72). Schmitt et al. (75) showed that bacteriophage 7-7-1 specifically adsorbs to the complex flagella of R. lupini H13-3 and requires motility for a productive infection of its host. Although the flagellins of R. meliloti and R. lupini are very similar, bacteriophage 7-7-1 does not infect R. meliloti. So far, complex flagella have been observed in only three species of soil bacteria: Pseudomonas rhodos (73), R. meliloti (76), and R. lupini H13-3 (70, 72). Cells of R. lupini H13-3 possess 5 to 10 peritrichously inserted complex flagella, which were first isolated and analyzed by high-resolution electron microscopy and by optical diffraction (70). Murayama et al. (77) further found that a high content of hydrophobic amino acid residues in the complex filament may be one of the main reasons for the unusual properties of complex flagella. By measuring mass per unit length and obtaining three-dimensional reconstructions from electron micrographs, Trachtenberg et al. (73, 78) suggested that the complex filaments of R. lupini are composed of functional dimers. Figure 6 shows the comparison between the deduced amino acid sequence of FlaA from R. lupini H13-3 and the deduced amino acid sequence of FlaA from R. meliloti. Perfect matches are indicated by vertical lines, and conservative exchanges are indicated by colons. The overall identity is 56%. The flaA genes of R. lupini and R. meliloti were subjected to in vitro heteroduplex formation followed by in vivo repair to create novel FlaA molecules and structures.
A. Methods
pRL20, which contains the flaA gene of R. lupini H13-3, and pRM40, which contains the flaA gene of R. meliloti, are shown in Figures 6A and 6B. These plasmids were isolated from E. coli SCS110 (free of dam- and dcm-type methylation). Approximately 3.0 μg of unmethylated pRL20 and pRM40 DNA were digested with Bam HI and Eco RI, respectively, at 37 °C for 1 hour. After separation on an agarose gel, the linearized DNA was purified with a Wizard PCR Prep Kit (Promega, WI, USA). Equimolar concentrations (2.5 nM) of linearized unmethylated pRL20 and pRM40 were mixed in 1 x SSPE buffer (180 mM NaCl, 1 mM EDTA, 10 mM NaH2PO4, pH 7.4). After heating at 98 °C for 10 minutes, the reaction mixture was immediately cooled to 0 °C for 5 minutes. The mixture was then incubated at 68 °C for 2 hours to allow heteroduplexes to form.
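Preparing an equimolar mixture requires converting the target molarity into a mass concentration for each linearized plasmid. The sketch below uses the common approximation of 650 g/mol per base pair of double-stranded DNA. The plasmid lengths in the example are hypothetical, since the sizes of pRL20 and pRM40 are not stated here.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair; a widely used approximation for dsDNA

def ng_per_ul(nanomolar, length_bp):
    """Mass concentration (ng/μl) of dsDNA of `length_bp` at `nanomolar` nM."""
    grams_per_litre = nanomolar * 1e-9 * length_bp * AVG_BP_MASS
    return grams_per_litre * 1e3  # 1 g/L equals 1000 ng/μl

# Hypothetical plasmid lengths, for illustration only.
for name, length in [("plasmid 1", 5000), ("plasmid 2", 6200)]:
    print(f"{name}: {ng_per_ul(2.5, length):.3f} ng/μl at 2.5 nM")
```

Because mass concentration scales with length at fixed molarity, the longer plasmid needs proportionally more DNA by mass to reach the same 2.5 nM.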
One microliter of the reaction mixture was used to transform 50 μl of E. coli ES1301 mutS, E. coli SCS110 and E. coli JM109 competent cells. The transformation efficiency with E. coli JM109 competent cells was approximately seven times higher than with E. coli SCS110 and ten times higher than with E. coli ES1301 mutS, although the overall transformation efficiencies were 10-200 times lower than those of control transformations with closed, covalent, circular pUC19 plasmid. Two clones were selected at random from the E. coli SCS110 transformants and two from the E. coli ES1301 mutS transformants, and plasmid DNA was isolated from these four clones for further DNA sequence analysis.
B. Results
Figure 7 shows (a) the sequence of SCS01 (clone #1 from the E. coli SCS110 transformant library), (b) the sequence of SCS02 (clone #2 from the E. coli SCS110 transformant library), (c) the sequence of ES01 (clone #1 from the E. coli ES1301 transformant library), and (d) the sequence of ES02 (clone #2 from the E. coli ES1301 transformant library). All four sequences were different from the wild-type R. lupini flaA and R. meliloti flaA sequences. Clones SCS02, ES01 and ES02 all contained a complete open reading frame, but SCS01 was truncated. Figure 8 shows that recombination occurred mainly in the loop regions (unmatched regions). The flaA mutant library generated from R. meliloti flaA and R. lupini flaA can be transformed into E. coli SCS110, ES1301, XL10-Gold and JM109, and the transformants screened to obtain functional FlaA recombinants.
EXAMPLE 2. Directed Evolution of ECB Deacylase for Variants with Enhanced Specific Activity
Streptomyces are among the most important industrial microorganisms because of their ability to produce numerous important secondary metabolites (including many antibiotics) as well as large amounts of enzymes. The methods described herein can be used with little modification for the directed evolution of native Streptomyces enzymes, of some or all of the genes in a metabolic pathway, and of other heterologous enzymes expressed in Streptomyces. New antifungal agents are critically needed because of the growing number of immunocompromised AIDS, organ transplant and cancer chemotherapy patients who suffer from opportunistic infections. Echinocandin B (ECB), a lipopeptide produced by some Aspergillus species, has been studied extensively as a potential antifungal. Several antifungal agents with significantly reduced toxicity have been generated by replacing the linoleic acid side chain of A. nidulans echinocandin B with different aryl side chains (79-83). The cyclic hexapeptide ECB nucleus, the precursor for chemical acylation, is obtained by enzymatic hydrolysis of ECB using ECB deacylase from Actinoplanes utahensis. To maximize the conversion of ECB into intact nucleus, this reaction is carried out at pH 5.5 with a small amount of miscible organic solvent to solubilize the ECB substrate. The cyclic hexapeptide nucleus product is unstable at pH above 5.5 during the long incubation required to fully deacylate ECB (84). The pH optimum of ECB deacylase, however, is 8.0-8.5, and its activity is reduced at pH 5.5 and in the presence of more than 2.5% ethanol (84). To improve production of the ECB nucleus, it is necessary to increase the activity of ECB deacylase under these process-relevant conditions. Relatively little is known about ECB deacylase.
The enzyme is a heterodimer whose two subunits are derived by processing of a single precursor protein (83). The 19.9 kD α subunit is separated from the 60.4 kD β subunit by a 15-amino-acid spacer peptide that is removed along with a signal peptide and another spacer peptide in the native organism. The polypeptide is also expressed and processed into functional enzyme in Streptomyces lividans, the organism used for large-scale conversion of ECB by recombinant ECB deacylase. The three-dimensional structure of the enzyme has not been determined, and its sequence shows so little similarity to other possibly related enzymes, such as penicillin acylase, that a structural model reliable enough to guide a rational effort to engineer ECB deacylase would be difficult to build. We therefore decided to use directed evolution (85) to improve this important activity. Protocols suitable for mutagenic PCR and random-priming recombination of the 2.4 kb ECB deacylase gene (73% G+C) have been described recently (86). Here, we further describe the use of heteroduplex recombination to generate new ECB deacylase with enhanced specific activity. In this case, two mutants of Actinoplanes utahensis ECB deacylase, M7-2 and M16, which show higher specific activity at pH 5.5 and in the presence of 10% MeOH, were recombined using the technique of in vitro heteroduplex formation and in vivo mismatch repair. Figure 12 shows the physical maps of plasmids pM7-2 and pM16, which contain the genes for the ECB deacylase mutants M7-2 and M16. Mutant M7-2 was obtained through mutagenic PCR performed directly on whole Streptomyces lividans cells containing the wild-type ECB deacylase gene, expressed from plasmid pSHP150-2*. Streptomyces with pM7-2 show 1.5 times the specific activity of cells expressing the wild-type ECB deacylase (86). Clone pM16 was obtained using the random-priming recombination technique as described (86, 87).
It shows 2.4 times the specific activity of the wild-type ECB deacylase clone.
A. Methods
M7-2 and M16 plasmid DNA (pM7-2 and pM16) (Figure 9) was purified from E. coli SCS110 (in separate reactions). Approximately 5.0 μg of unmethylated M7-2 and M16 DNA were digested with Xho I and Psh AI, respectively, at 37 °C for 1 hour (Figure 10). After separation on an agarose gel, the linearized DNA was purified using a Wizard PCR Prep Kit (Promega, WI, USA). Equimolar concentrations (2.0 nM) of linearized unmethylated pM7-2 and pM16 DNA were mixed in 1 x SSPE buffer (1 x SSPE: 180 mM NaCl, 1.0 mM EDTA, 10 mM NaH2PO4, pH 7.4). After heating at 96 °C for 10 minutes, the reaction mixture was immediately cooled to 0 °C for 5 minutes. The mixture was incubated at 68 °C for 3 hours to promote heteroduplex formation. One microliter of the reaction mixture was used to transform 50 μl of E. coli ES1301 mutS, E. coli SCS110 and JM109 competent cells. All transformants from E. coli ES1301 mutS were pooled, and all transformants from E. coli SCS110 were pooled. A plasmid pool was isolated from each pooled library, and this pool was used to transform S. lividans TK23 protoplasts to form a mutant library for deacylase activity screening. Transformants from the S. lividans libraries were screened for ECB deacylase activity with an in situ plate assay. Transformed protoplasts were allowed to regenerate on R2YE agar plates for 24 hours at 30 °C and to develop in the presence of thiostrepton for 48 hours. When the colonies had grown to the proper size, 6 ml of 0.7% agarose solution containing 0.5 mg/ml ECB in 0.1 M sodium acetate buffer (pH 5.5) was poured on top of each R2YE agar plate, and the plates were allowed to develop for a further 18-24 hours at 30 °C. Colonies surrounded by a clearing zone larger than that of a control colony containing the wild-type plasmid pSHP150-2* were selected for further characterization.
The selected transformants were inoculated into 20 ml of medium containing thiostrepton and grown aerobically at 30 °C for 48 hours, at which point they were analyzed for ECB deacylase activity by HPLC. 100 μl of whole broth was used for a reaction at 30 °C for 30 minutes in 0.1 M NaAc buffer (pH 5.5) containing 10% (volume/volume) MeOH and 200 μg/ml of ECB substrate. The reactions were stopped by adding 2.5 volumes of methanol, and 20 μl of each sample was analyzed by HPLC on a 100 x 4.6 mm polyhydroxyethyl aspartamide column (PolyLC Inc., Columbia, MD, USA) at room temperature using a linear acetonitrile gradient starting with 50:50 A:B (A = 93% acetonitrile, 0.1% phosphoric acid; B = 70% acetonitrile, 0.1% phosphoric acid) and ending with 30:70 A:B over 22 minutes at a flow rate of 2.2 ml/min. The areas of the ECB and ECB nucleus peaks were calculated and subtracted from the corresponding peak areas from a sample culture of S. lividans containing pIJ702* to estimate ECB deacylase activity. 20 ml of positive mutant precultures were used to inoculate 50 ml of medium and allowed to grow at 30 °C for 96 hours. The supernatants were further concentrated to 1/30 of their original volume using an Amicon filtration unit (Beverly, MA, USA) with a molecular weight cut-off of 10 kD. The resulting enzyme samples were diluted with an equal volume of 50 mM KH2PO4 buffer (pH 6.0) and applied to a Hi-Trap ion exchange column (Pharmacia Biotech, Piscataway, USA). The binding buffer was 50 mM KH2PO4 (pH 6.0), and the elution buffer was 50 mM KH2PO4 (pH 6.0) containing 1.0 M NaCl. A linear gradient from 0 to 1.0 M NaCl was applied over 8 column volumes at a flow rate of 2.7 ml/min. The ECB deacylase fraction eluting at 0.3 M NaCl was concentrated and the buffer was exchanged for 50 mM KH2PO4 (pH 6.0) using Centricon-10 units.
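The mobile-phase composition during the HPLC run follows directly from the stated A:B ratios. The sketch below assumes the gradient is linear in the fraction of solvent B across the full 22 minutes and computes the overall acetonitrile percentage at a given time. The function name is illustrative.

```python
ACN_IN_A = 93.0  # % acetonitrile in solvent A
ACN_IN_B = 70.0  # % acetonitrile in solvent B

def acetonitrile_percent(t_min, total_min=22.0, b_start=0.50, b_end=0.70):
    """Overall % acetonitrile at time t_min for a linear 50:50 -> 30:70 A:B gradient.

    Times outside the gradient window are clamped to its start or end.
    """
    t = min(max(t_min, 0.0), total_min)
    frac_b = b_start + (b_end - b_start) * t / total_min
    return (1.0 - frac_b) * ACN_IN_A + frac_b * ACN_IN_B

for t in (0, 11, 22):
    print(f"t = {t:2d} min: {acetonitrile_percent(t):.1f}% acetonitrile")
```

Under these assumptions the acetonitrile content falls only from 81.5% to 76.9% over the run, so the gradient is shallow, even though the A:B ratio changes substantially.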
The purity of the enzyme was verified by SDS-PAGE using Coomassie blue staining, and the concentration was determined using the Bio-Rad Protein Assay Reagent (Hercules, CA, USA). A modified HPLC assay was used to determine the activities of the ECB deacylase mutants on the ECB substrate (84). Four μg of each purified ECB deacylase mutant was used for the activity assay reaction at 30 °C for 30 minutes in 0.1 M NaAc buffer (pH 5.5) containing 10% (v/v) MeOH and different concentrations of ECB substrate. The assays were performed in duplicate. The reactions were stopped by adding 2.5 volumes of methanol, and the HPLC assays were carried out as described above. The absorbance values were recorded, and the initial rates were calculated by least-squares regression of the time progress curves, from which Km and kcat were calculated. Activities as a function of pH were measured for the purified ECB deacylases at 30 °C at different pH values: 5, 5.5 and 6 (0.1 M acetate buffer); 7, 7.5, 8 and 8.5 (0.1 M phosphate buffer); 9 and 10 (0.1 M carbonate buffer) using the HPLC assay. The stabilities of the purified ECB deacylases were determined at 30 °C in 0.1 M NaAc buffer (pH 5.5) containing 10% methanol. Samples were taken at different time intervals, and the residual activity was measured in the same buffer with the HPLC assay described above.
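The text states only that Km and kcat were obtained from least-squares regression of initial rates. As a minimal sketch of one way such parameters can be estimated, the Python code below fits the Michaelis-Menten equation through the Hanes-Woolf linearization ([S]/v plotted against [S]). The choice of linearization, the function names and the example data are assumptions made for illustration and are not taken from the source.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def michaelis_menten_fit(substrate, velocity, enzyme_conc):
    """Estimate Km, Vmax and kcat from initial rates.

    Uses the Hanes-Woolf form  [S]/v = [S]/Vmax + Km/Vmax,
    so slope = 1/Vmax and intercept = Km/Vmax.
    kcat = Vmax / [E]; units follow those of the inputs.
    """
    ratios = [s / v for s, v in zip(substrate, velocity)]
    slope, intercept = linear_fit(substrate, ratios)
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax, vmax / enzyme_conc


if __name__ == "__main__":
    # Hypothetical, noise-free data generated with Km = 80 and Vmax = 12.
    s_values = [25.0, 50.0, 100.0, 200.0, 400.0]
    v_values = [12.0 * s / (80.0 + s) for s in s_values]
    km, vmax, kcat = michaelis_menten_fit(s_values, v_values, enzyme_conc=0.5)
    print(f"Km = {km:.2f}, Vmax = {vmax:.2f}, kcat = {kcat:.2f}")
```

With real, noisy data a direct nonlinear fit is often preferred, because linearized forms weight the measurement errors unevenly.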
B. Results. Figure 11 shows that after one round of applying this heteroduplex repair technique to the mutant M7-2 and M16 genes, one mutant (M15) out of approximately 500 original transformants was found to possess 3.1 times the specific activity of the wild type. The wild-type and evolved M15 ECB deacylases were purified and their kinetic parameters for the deacylation of ECB were determined by HPLC. The evolved M15 deacylase has a catalytic rate constant, kcat, increased by 205%. The catalytic efficiency (kcat/Km) of M20 is increased by a factor of 2.9 over the wild-type enzyme. The initial deacylation rates of the wild type and M15 at different pH values from 5 to 10 were determined at 200 μg/ml ECB. The recombined M15 is more active than the wild type at pH 5-8. Although the pH dependence of the enzyme activity in this assay is not strong, there is a definite shift of 1.0-1.5 units in the optimum toward lower pH, compared with the wild type. The time courses of deactivation of the purified ECB deacylase mutant M15 were measured in 0.1 M NaAc (pH 5.5) at 30 °C. No significant difference in stability was observed between the wild type and the M15 mutant.
The DNA mutations with respect to the wild-type ECB deacylase sequence and the positions of the amino acid substitutions in the evolved variants M7-2, M16 and M15 are summarized in Figure 12. The heteroduplex recombination technique can recombine parent sequences to create novel progeny. Recombination of the M7-2 and M16 genes yielded M15, whose activity is higher than that of either of its parents (Figure 13). Of the six base substitutions in M15, five (at positions α50, α171, β57, β129 and β340) were inherited from M7-2, and the other one (β30) came from M16. This method provides an alternative to existing methods of DNA recombination and is particularly useful for recombining large genes or whole operons. It can be used to create recombinant proteins to improve their properties or to study structure-function relationships.
EXAMPLE 3. Novel Thermostable Bacillus subtilis Subtilisin E Variants. This example shows the use of in vitro heteroduplex formation followed by in vivo repair to combine sequence information from two different sequences in order to improve the thermostability of Bacillus subtilis subtilisin E. Genes RC1 and RC2 encode thermostable B. subtilis subtilisin E variants (88). The mutations at base positions 1107 in RC1 and 995 in RC2 (Figure 14), which give rise to the amino acid substitutions Asn218/Ser (N218S) and Asn181/Asp (N181D), lead to improvements in subtilisin E thermostability; the remaining mutations, both synonymous and nonsynonymous, have no detectable effect on thermostability. At 65 °C, the single variants N181D and N218S have half-lives approximately 3-fold and 2-fold longer, respectively, than wild-type subtilisin E, and variants containing both mutations have half-lives that are 8-fold longer (88). The different half-lives in a population of subtilisin E variants can therefore be used to estimate the efficiency with which sequence information is combined. In particular, recombination between these two mutations (in the absence of point mutations affecting thermostability) should generate a library in which 25% of the population exhibits the thermostability of the double mutant. Similarly, 25% of the population should exhibit wild-type-like stability, since N181D and N218S are eliminated at equal frequencies. We used the fractions of the recombined population as a diagnostic.
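The 25% expectation stated above follows from treating the two markers as independently inherited, each with probability 1/2. The Python sketch below enumerates the four possible genotypes under that assumption; the function name and the genotype labels are illustrative only.

```python
from fractions import Fraction
from itertools import product


def genotype_fractions(p_n181d=Fraction(1, 2), p_n218s=Fraction(1, 2)):
    """Expected genotype fractions for two independently repaired markers.

    p_n181d and p_n218s are the probabilities that a repaired clone keeps
    the N181D and N218S mutation, respectively. Random, unbiased repair
    corresponds to 1/2 for each (an idealized assumption).
    """
    labels = {
        (True, True): "double mutant",
        (True, False): "N181D only",
        (False, True): "N218S only",
        (False, False): "wild-type-like",
    }
    result = {}
    for has_181, has_218 in product((True, False), repeat=2):
        p = (p_n181d if has_181 else 1 - p_n181d) * \
            (p_n218s if has_218 else 1 - p_n218s)
        result[labels[(has_181, has_218)]] = p
    return result


if __name__ == "__main__":
    for genotype, fraction in genotype_fractions().items():
        print(f"{genotype:15s} {float(fraction):.0%}")
```

Deviations of the observed double-mutant fraction from 25% therefore indicate that repair at the two positions is not fully random or not fully independent.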
A. Methods. The strategy underlying this example is shown in Figure 15. The thermostable subtilisin E mutant genes RC1 and RC2 (Figure 14) are 986-bp fragments that include 45 nt of the subtilisin E prosequence, the entire mature sequence and 113 nt after the stop codon. The genes were cloned between Bam HI and Nde I in the E. coli/B. subtilis shuttle vector pBE3, resulting in pBE3-1 and pBE3-2, respectively. Plasmid DNA of pBE3-1 and pBE3-2 was isolated from E. coli SCS110. Approximately 5.0 μg of unmethylated pBE3-1 and pBE3-2 DNA were digested with Bam HI and Nde I, respectively, at 37 °C for one hour. After separation on an agarose gel, equimolar concentrations (2.0 nM) of linearized unmethylated pBE3-1 and pBE3-2 were mixed in 1x SSPE buffer (180 mM NaCl, 1.0 mM EDTA, 10 mM NaH2PO4, pH 7.4). After heating at 96 °C for 10 minutes, the reaction mixture was immediately cooled at 0 °C for 5 minutes. The mixture was incubated at 68 °C for 2 hours to form heteroduplexes. One microliter of the mixture was used to transform 50 μl of E. coli ES1301 mutS, E. coli SCS110 and E. coli HB101 competent cells. The transformation efficiency with E. coli HB101 competent cells was approximately 10 times higher than with E. coli SCS110 and 15 times higher than with E. coli ES1301 mutS. In all cases, however, the transformation efficiencies were 10-250 times lower than that of transformation with the covalently closed circular control plasmid pUC19. Five clones from the E. coli SCS110 mutant library and five from the E. coli ES1301 mutS library were randomly chosen, and plasmid DNA was isolated using a QIAprep spin plasmid miniprep kit for further DNA sequence analysis. Approximately 2000 random clones from the E. coli HB101 mutant library were pooled and total plasmid DNA was isolated using a QIAGEN-100 column. 0.50-4.0 μg of the isolated plasmid was used to transform Bacillus subtilis DB428 as previously described (88).
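Preparing an equimolar 2.0 nM mixture requires converting a molar concentration of linear double-stranded DNA into a mass concentration. The Python sketch below shows that conversion. It assumes an average mass of about 650 g/mol per base pair, a commonly used approximation that is not stated in the source, and the 5000-bp plasmid length in the example is hypothetical because the text does not give the plasmid sizes.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one base pair (assumption)


def dsdna_ng_per_ul(conc_nM, length_bp, bp_mass=AVG_BP_MASS_G_PER_MOL):
    """Convert a dsDNA concentration from nM to ng/ul.

    molar mass (g/mol) = length_bp * bp_mass
    nM * g/mol = 1e-9 g/L, and 1 g/L equals 1000 ng/ul,
    hence ng/ul = conc_nM * length_bp * bp_mass * 1e-6.
    """
    if conc_nM < 0 or length_bp <= 0:
        raise ValueError("concentration must be >= 0 and length must be > 0")
    return conc_nM * length_bp * bp_mass * 1e-6


if __name__ == "__main__":
    # Hypothetical 5000-bp linearized plasmid at 2.0 nM.
    print(f"{dsdna_ng_per_ul(2.0, 5000):.2f} ng/ul")
```

Because the mass per mole scales with length, two plasmids of different sizes need different mass concentrations to reach the same molarity.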
Approximately 400 transformants from the Bacillus subtilis DB428 library were subjected to screening. Screening was performed using the previously described assay (88) on succinyl-Ala-Ala-Pro-Phe-p-nitroanilide. B. subtilis DB428 containing the plasmid library were grown on LB plates containing kanamycin (20 μg/ml). After 18 hours at 37 °C, single colonies were picked into 96-well plates containing 200 μl of kanamycin-containing medium per well. These plates were incubated with shaking at 37 °C for 24 hours to let the cells grow to saturation. The cells were spun down, and the supernatants were sampled for the thermostability assay. Two replicate 96-well assay plates were prepared for each growth plate by transferring 10 μl of supernatant into the replica plates. Subtilisin activities were then measured by adding 100 μl of activity assay solution (0.2 mM succinyl-Ala-Ala-Pro-Phe-p-nitroanilide, 100 mM Tris-HCl, 10 mM CaCl2, pH 8.0, 37 °C). Reaction velocities were measured at 405 nm over 1.0 min in a ThermoMax microplate reader (Molecular Devices, Sunnyvale, CA). The activity measured at room temperature was used to calculate the fraction of active clones (clones with activity lower than 10% of that of the wild type were scored as inactive). Initial activity (Ai) was measured after incubating one assay plate at 65 °C for 10 minutes by immediately adding 100 μl of prewarmed (37 °C) assay solution (0.2 mM succinyl-Ala-Ala-Pro-Phe-p-nitroanilide, 100 mM Tris-HCl, 10 mM CaCl2, pH 8.0) to each well. Residual activity (Ar) was measured after 40 minutes of incubation.
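The ratio of residual to initial activity can be turned into an apparent half-life if thermal inactivation is assumed to follow first-order kinetics. That assumption, the function names and the example numbers below are illustrative and are not part of the original protocol; the elapsed time between the two measurements is left as an explicit argument because the text can be read in more than one way.

```python
import math


def residual_fraction(initial_activity, residual_activity):
    """Return Ar / Ai, the fraction of activity remaining after heating."""
    if initial_activity <= 0:
        raise ValueError("initial activity must be positive")
    return residual_activity / initial_activity


def half_life_minutes(initial_activity, residual_activity, elapsed_min):
    """Apparent half-life assuming first-order decay: A(t) = Ai * exp(-k t).

    k = ln(Ai / Ar) / elapsed_min and t_half = ln(2) / k.
    """
    if not 0 < residual_activity < initial_activity:
        raise ValueError("need 0 < residual activity < initial activity")
    rate_constant = math.log(initial_activity / residual_activity) / elapsed_min
    return math.log(2.0) / rate_constant


if __name__ == "__main__":
    # Hypothetical clone: activity falls from 100 to 25 units in 30 minutes.
    print(residual_fraction(100.0, 25.0))
    print(half_life_minutes(100.0, 25.0, 30.0))
```

Ranking clones by Ar/Ai gives the same order as ranking by half-life under this model, which is why the simple ratio is sufficient for screening.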
B. Results. In vitro heteroduplex formation and in vivo repair were carried out as described above. Five clones from the E. coli SCS110 mutant library and five from the E. coli ES1301 mutS library were selected at random and sequenced. Figure 14 shows that four of the ten clones were different from the parent genes. The frequency of occurrence of a particular point mutation from parent RC1 or RC2 in the resulting genes ranged from 0% to 50%, and the ten point mutations in the heteroduplex were repaired without strong strand-specific preference. Since none of the ten mutations is located within a dcm site, the mismatch repair appears generally to be done via the E. coli long-patch mismatch repair systems. The system repairs different mismatches in a strand-specific manner, using the state of N6-methylation of adenine in GATC sequences as the major mechanism for determining the strand to be repaired. With heteroduplexes methylated at GATC sequences on only one DNA strand, repair was shown to be highly biased toward the unmethylated strand, with the methylated strand serving as the template for correction. If neither strand was methylated, mismatch repair occurred but showed little strand preference (23, 24). These results show that it is preferable to demethylate the DNA to be recombined in order to promote efficient and random repair of the heteroduplexes. The rates of subtilisin E thermoinactivation at 65 °C were estimated by analyzing the 400 random clones from the Bacillus subtilis DB428 library. The thermostabilities obtained from one 96-well plate are shown in Figure 16, plotted in descending order. Approximately 12.9% of the clones exhibited thermostability comparable to that of the mutant with the double mutations N181D and N218S.
Since this fraction is only half of that expected for random recombination of these two markers, it indicates that the two mismatches at positions 995 and 1107 within the heteroduplexes have been repaired with lower position randomness. Sequence analysis of the clone exhibiting the highest thermostability among the 400 screened transformants from the E. coli SCS110 heteroduplex library confirmed the presence of both the N181D and N218S mutations. Among the 400 transformants from the B. subtilis DB428 library that were screened, approximately 91% of the clones expressed N181D- and/or N218S-type enzyme stabilities, while approximately 8.0% of the transformants showed only wild-type subtilisin E stability. Less than 1.0% of the clones were inactive, indicating that few new point mutations were introduced in the recombination process. This is consistent with the fact that no new point mutations were identified in the ten sequenced genes (Figure 14). Although point mutations may provide useful diversity for some in vitro evolution applications, they can also be problematic for the recombination of beneficial mutations, especially when the mutation rate is high.
EXAMPLE 4. Optimizing Conditions for Heteroduplex Recombination. We have found that the efficiency of heteroduplex recombination can differ considerably from gene to gene [17, 57]. In this example, we investigate and optimize a variety of parameters that improve recombination efficiency. The DNA substrates used in this example were site-directed mutants of green fluorescent protein from Aequorea victoria. The GFP mutants had a stop codon introduced at different locations along the sequence that abolished their fluorescence. Fluorescent wild-type protein could only be restored by recombination between two or more mutations. The fraction of fluorescent colonies was used as a measure of recombination efficiency.
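The efficiency measure described above is a simple proportion of colony counts. The Python sketch below computes it together with a binomial standard error, which gives a rough sense of how precisely a plate of a given size determines the efficiency. The function name and the colony counts in the example are hypothetical, and the standard error is an addition for illustration that the source does not report.

```python
import math


def recombination_efficiency(fluorescent_colonies, total_colonies):
    """Return (fraction, standard_error) of fluorescent colonies.

    The standard error uses the binomial approximation sqrt(p * (1 - p) / n).
    """
    if total_colonies <= 0:
        raise ValueError("total colony count must be positive")
    if not 0 <= fluorescent_colonies <= total_colonies:
        raise ValueError("fluorescent count must lie between 0 and the total")
    p = fluorescent_colonies / total_colonies
    standard_error = math.sqrt(p * (1.0 - p) / total_colonies)
    return p, standard_error


if __name__ == "__main__":
    # Hypothetical plate: 30 fluorescent colonies out of 400.
    p, se = recombination_efficiency(30, 400)
    print(f"efficiency = {p:.1%} +/- {se:.1%}")
```

Counting more colonies narrows the uncertainty, which matters when comparing conditions whose efficiencies differ by only a few percent.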
A. Methods. About 2-4 μg of each parent plasmid was used in one recombination experiment. One parent plasmid was digested with Pst I endonuclease and the other parent with EcoRI. The linearized plasmids were mixed together and 20x SSPE buffer was added to a final concentration of 1x (180 mM NaCl, 1 mM EDTA, 10 mM NaH2PO4, pH 7.4). The reaction mixture was heated at 96 °C for 4 minutes, immediately transferred to ice for 4 minutes, and the incubation was continued for 2 hours at 68 °C. The target genes were amplified in a PCR reaction with primers corresponding to the vector sequence of the pGFP plasmid. Forward primer: 5'-CCGACTGGAAAGCGGGCAGTG-3'; reverse primer: 5'-CGGGGCTGGCTTAACTATGCGG-3'. The PCR products were mixed together and purified using the Qiagen PCR purification kit. The purified products were mixed with 20x SSPE buffer and hybridized as described above. The annealed products were precipitated with ethanol or purified on Qiagen columns and digested with the EcoRI and PstI enzymes. The digested products were ligated into PstI- and EcoRI-digested pGFP vector. dUTP was added to the PCR reaction at final concentrations of 200 μM, 40 μM, 8 μM, 1.6 μM and 0.32 μM. The PCR reaction and subsequent cloning procedures were carried out as described above. The recombinant plasmids were transformed into E. coli strain XL10 by a modified chemical transformation method. Cells were plated on LB agar plates containing ampicillin and grown overnight at 37 °C, followed by incubation at room temperature or at 4 °C until fluorescence developed.
B. Results.
1. Effect of ligation on recombination efficiency. Two experiments were performed to test the effect of breaks in the DNA heteroduplex on the efficiency of recombination. In one experiment, the heteroduplex plasmid was treated with DNA ligase to close all existing single-strand breaks and was transformed under the same conditions as an unligated sample (see Table 1). The ligated samples showed up to a 7-fold improvement in recombination efficiency over the unligated samples. In another experiment, dUTP was added to the PCR reaction to introduce additional breaks into the DNA upon repair by uracil N-glycosylase in the host cells. Table 2 shows that dUMP incorporation significantly suppressed recombination, with the extent of suppression increasing with increasing dUTP concentration.
2. Effect of plasmid size on the efficiency of heteroduplex formation. Plasmid size was a significant factor affecting recombination efficiency. Two plasmids, pGFP (3.3 kb) and the Bacillus shuttle vector pCT1 (approximately 9 kb), were used to prepare circular heteroduplex-like plasmids following the traditional heteroduplex protocol. For the purpose of this experiment (to study the effect of plasmid size on duplex formation), both parents had the same sequences. While pGFP formed approximately 30-40% circular plasmid, the shuttle vector yielded less than 10% of this form. Increasing the plasmid size decreases the concentration of the ends in the vicinity of each other and makes annealing of very long (>0.8 kb) single-stranded ends more difficult. This difficulty is avoided by the procedure shown in Figure 3, in which heteroduplex formation occurs between substrates in vector-free form, and the heteroduplexes are subsequently inserted into a vector. 3.
Efficiency of Recombination versus Distance between Mutations. A series of GFP variants was recombined pairwise to study the effect of the distance between mutations on recombination efficiency. The parent genes were amplified by PCR, annealed and ligated back into the pGFP vector. The heteroduplexes were transformed into E. coli strain XL10. The first three columns in Table 3 show the results of three independent experiments and demonstrate the dependence of recombination efficiency on the distance between mutations. As expected, recombination becomes less and less efficient as the mutations lie closer together. It is nevertheless remarkable that long-patch repair was able to recombine mutations separated by only 27 bp. The last line in Table 3 represents recombination between a single and a double mutant. Wild-type GFP could only be restored in the event of a double crossover, with each individual crossover occurring within a distance of only 99 bp, demonstrating the ability of this method to recombine multiple closely spaced mutations.
4. Elimination of Parental Double Strands from Heteroduplex Preparations. Annealing substrates in vector-free form offers size advantages over annealing substrates as components of vectors, but it does not allow selection of heteroduplexes over homoduplexes simply by transformation into a host. Asymmetric PCR reactions with only one primer per parent, seeded with an appropriate amount of previously amplified and purified gene fragment, were run for 10 cycles, ensuring a 100-fold excess of one strand over the other. The products of these asymmetric reactions were mixed and annealed together, producing only a minor amount of nonrecombinant duplexes. The last column of Table 3 shows the recombination efficiency obtained from these enriched heteroduplexes.
Comparison of the first three columns with the fourth shows the improvement achieved by asymmetric synthesis of the parental strands. Although the invention has been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention. All publications and patent documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication or patent document were so individually denoted.
References 1. Shao, Z. and Arnold, F. H. 1996. Engineering new functions and altering existing functions. Curr. Opin. Struct. Biol. 6: 513-518. 2. Kuchner, O. and Arnold, F. H. 1997. Directed evolution of enzyme catalysts. Trends Biotechnol. 15: 523-530. 3. Abelson, J. N. (ed.) 1996. Combinatorial chemistry. Methods Enzymol. 267, Academic Press, Inc., San Diego. 4. Joyce, G. F. 1992. Directed molecular evolution. Scientific American 267: 90-97. 5. Stemmer, W. P. C. 1994a. Rapid evolution of a protein in vitro by DNA shuffling. Nature 370: 389-391. 6. Stemmer, W. P. C. 1994b. DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution. Proc. Natl. Acad. Sci. USA 91: 10747-10751. 7. Moore, J. C. and Arnold, F. H. 1996. Directed evolution of a para-nitrobenzyl esterase for aqueous-organic solvents. Nature Biotech. 14: 458-467. 8. Holland, J. H. 1975. Adaptation in natural and artificial systems. The University Press, Ann Arbor. 9. Goldberg, D. E. 1989. Genetic algorithms in search, optimization and machine learning. Addison-Wesley, Reading. 10. Eigen, M. 1971. Self-organization of matter and the evolution of biological macromolecules.
Naturwissenschaften 58: 465-523. 11. Rechenberg, I. 1973. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog, Stuttgart. 12. Brady, R. M. 1985. Optimization strategies gleaned from biological evolution. Nature 317: 804-806. 13. Mühlenbein, H. 1991. The parallel genetic algorithm as function optimizer. Parallel Computing 17: 619-632. 14. Pál, K. F. 1993. Genetic algorithms for the traveling salesman problem based on a heuristic crossover operation. Biol. Cybern. 69: 539-546. 15. Pál, K. F. 1995. Genetic algorithm with local optimization. Biol. Cybern. 73: 335-341. 16. Cami, B., P. Chambon and P. Kourilsky. 1984.
Correction of complex heteroduplexes made of mouse H-2 gene sequences in E. coli K-12. Proc. Natl. Acad. Sci. USA 81: 503-507. 17. Westmoreland, J., G. Porter, M. Radman and M. A. Resnick. 1997. Highly mismatched molecules resembling recombination intermediates efficiently transform mismatch repair proficient E. coli. Genetics 145: 29-38. 18. Kramer, B., W. Kramer and H.-J. Fritz. 1984. Different base/base mismatches are corrected with different efficiencies by the methyl-directed DNA mismatch-repair system of E. coli. Cell 38: 879-887. 19. Lu, A.-L., S. Clark and P. Modrich. 1983. Methyl-directed repair of DNA base pair mismatches in vitro. Proc. Natl. Acad. Sci. USA 80: 4639-4643. 20. Carraway, M. and Marinus, M. G. 1993. Repair of heteroduplex DNA molecules with multibase loops in Escherichia coli. J. Bacteriol. 175: 3972-3980. 21. Cooper, D. L., Lahue, R. S. and Modrich, P. 1993. Methyl-directed mismatch repair is bidirectional. J. Biol. Chem. 268: 11823-11829. 22. Au, K. G., Welsh, K. and Modrich, P. 1992. Initiation of methyl-directed mismatch repair. J. Biol. Chem. 267: 12142-12148. 23. Meselson, M. 1988. Methyl-directed repair of DNA mismatches, p. 91-113. In K. B. Low (ed.), Recombination of the Genetic Material. Academic Press, Inc., San Diego, Calif. 24. Fishel, R. A., Siegel, E. C. and Kolodner, R. 1986. Gene conversion in Escherichia coli. Resolution of heteroallelic mismatched nucleotides by co-repair. J. Mol. Biol. 188: 147-157. 25. Pukkila, P. J., J. Peterson, G. Herman, P. Modrich and M. Meselson. 1983. Effects of high levels of DNA adenine methylation on methyl-directed mismatch repair in Escherichia coli. Genetics 104: 571-582. 26. Radman, M., R. E. Wagner, B. W. Glickman and M. Meselson. 1980. DNA methylation, mismatch correction and genetic stability, p. 121-130. In M. Alacevic (ed.), Progress in Environmental Mutagenesis. Elsevier/North-Holland Biochemical Press, Amsterdam. 27. Sambrook, J., Fritsch, E. F. and Maniatis, T. 1989.
Molecular Cloning: A Laboratory Manual. 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. 28. Allen, D. J., Makhov, A., Grilley, M., Taylor, J., Thresher, R., Modrich, P. and Griffith, J. D. 1997. MutS mediates heteroduplex loop formation by a translocation mechanism. EMBO J. 16: 4467-4476. 30. Tsai-Wu, J. J. and Lu, A. L. 1994. Escherichia coli mutY-dependent mismatch repair involves DNA polymerase I and a short repair tract. Mol. Gen. Genet. 244: 444-450. 31. Worth, L. Jr., Clark, S., Radman, M. and Modrich, P. 1994. Mismatch repair proteins MutS and MutL inhibit RecA-catalyzed strand transfer between diverged DNAs. Proc. Natl. Acad. Sci. USA 91: 3238-3241. 32. Fox, M. S., Radicella, J. P. and Yamamoto, K. 1994. Some features of base pair mismatch repair and its role in the formation of genetic recombinants. Experientia 50: 253-260. 33. Radicella, J. P., Clark, E. A., Chen, S. and Fox, M. S. 1993. Patch length of localized repair events: role of DNA polymerase I in mutY-dependent mismatch repair. J. Bacteriol. 175: 7732-7736. 34. Krackiciewicz-Dowjat, A. and Fishel, R. 1990. RecB-recC-dependent processing of heteroduplex DNA stimulates recombination of an adjacent gene in Escherichia coli. J.
Bacteriol. 172: 172-178. 35. Radman, M. 1989. Mismatch repair and the fidelity of genetic recombination. Genome 31: 68-73. 36. Raposa, S. and Fox, M. S. 1987. Some features of base pair mismatch and heterology repair in Escherichia coli. Genetics 117: 381-390. 37. Jones, M., Wagner, R. and Radman, M. 1987. Mismatch repair and recombination in E. coli. Cell 50: 621-626. 38. Langle-Rouault, F., Maenhaut-Michel, G. and Radman, M. 1987. GATC sequences, DNA nicks and the MutH function in Escherichia coli mismatch repair. EMBO J. 6: 1121-1127. 39. Glazer, P. M., Sarkar, S. N., Chisholm, G. E. and Summers, W. C. 1987. DNA mismatch repair detected in human cell extracts. Mol. Cell. Biol. 7: 218-224. 40. Laengle-Rouault, F., Maenhaut-Michel, G. and Radman, M. 1986. GATC sequence and mismatch repair in Escherichia coli. EMBO J. 5: 2009-2013. 41. Bauer, J., Krammer, G. and Knippers, R. 1981. Asymmetric repair of bacteriophage T7 heteroduplex DNA. Mol. Gen. Genet. 181: 541-547. 42. Wildenberg, J. and Meselson, M. 1975. Mismatch repair in heteroduplex DNA. Proc. Natl. Acad. Sci. USA 72: 2202-2206. 43. Kirkpatrick, D. T. and Petes, T. D. 1997. Repair of DNA loops involves DNA-mismatch and nucleotide-excision repair proteins. Nature 387: 929-3. 44. Leung, W., Malkova, A. and Haber, J. E. 1997. Gene targeting by linear duplex DNA frequently occurs by assimilation of a single strand that is subject to preferential mismatch correction. Proc. Natl. Acad. Sci. USA 94: 6851-6856. 45. Hunter, N. and Borts, R. H. 1997. Mlh1 is unique among mismatch repair proteins in its ability to promote crossing-over during meiosis. Genes Dev. 11: 0890-9369. 46. Alani, E., Lee, S., Kane, M. F., Griffith, J. and Kolodner, R. D. 1997. Saccharomyces cerevisiae MSH2, a mispaired base recognition protein, also recognizes Holliday junctions in DNA. J. Mol. Biol. 265: 289-301. 47. Varlet, I., Canard, B., Brooks, P., Cerovic, G. and Radman, M. 1996.
Mismatch repair in Xenopus egg extracts: DNA strand breaks act as signals rather than excision points. Proc. Natl. Acad. Sci. USA 93: 10156-10161. 48. Nicolas, A. and Petes, T. D. 1994. Polarity of meiotic gene conversion in fungi: contrasting views. Experientia 50: 242-52. 49. Bishop, D. K., J. Andersen and R. D. Kolodner. 1989. Specificity of mismatch repair following transformation of Saccharomyces cerevisiae with heteroduplex plasmid DNA. Proc. Natl. Acad. Sci. USA 86: 3713-3717. 50. Kramer, B., W. Kramer, M. S. Williamson and S. Fogel. 1989. Heteroduplex DNA correction in Saccharomyces cerevisiae is mismatch specific and requires functional PMS genes. Mol. Cell. Biol. 9: 4432-0. 51. Baynton, K., Bresson-Roy, A. and Fuchs, R. P. 1998. Analysis of damage tolerance pathways in Saccharomyces cerevisiae: a requirement for Rev3 DNA polymerase in translesion synthesis. Mol. Cell. Biol. 18: 960-966. 52. Alani, E., Reenan, R. A. and Kolodner, R. D. 1994. Interaction between mismatch repair and genetic recombination in Saccharomyces cerevisiae. Genetics 137: 19-39. 54. Bishop, D. K., Williamson, M. S., Fogel, S. and Kolodner, R. D. 1987. The role of heteroduplex correction in gene conversion in Saccharomyces cerevisiae. Nature 328: 362-364. 55. Bishop, D. K. and Kolodner, R. D. 1986. Repair of heteroduplex plasmid DNA after transformation into Saccharomyces cerevisiae. Mol. Cell. Biol. 6: 3401-3409. 56. White, J. H., Lusnak, K. and Fogel, S. 1985. Mismatch-specific post-meiotic segregation frequency in yeast suggests a heteroduplex recombination intermediate. Nature 315: 350-352. 57. Abastado, J.-P., B. Cami, T. H. Dinh, J. Igolen and P. Kourilsky. 1984. Processing of complex heteroduplexes in E. coli and Cos-1 monkey cells. Proc. Natl. Acad. Sci. USA 81: 5792-5796. 58. Brown, T. C. and J. Jiricny. 1987. A specific mismatch repair event protects mammalian cells from loss of 5-methylcytosine. Cell 50: 945-950. 59. Sibghat-Ullah and R. S. Day. 1993.
DNA-substrate sequence specificity of human G:T mismatch repair activity. Nucleic Acids Res. 21: 1281-1287. 60. Miller, E. M., Hough, H. L., Cho, J. W. and Nickoloff, J. A. 1997. Mismatch repair by efficient nick-directed, and less efficient mismatch-specific, mechanisms in homologous recombination intermediates in Chinese hamster ovary cells. Genetics 147: 743-753. 61. Deng, W. P. and Nickoloff, J. A. 1994. Mismatch repair of heteroduplex DNA intermediates of extrachromosomal recombination in mammalian cells. Mol. Cell. Biol. 14: 400-406. 62. Thomas, D. C., Roberts, J. D. and Kunkel, T. A. 1991. Heteroduplex repair in extracts of human HeLa cells. J. Biol. Chem. 266: 3744-51. 63. Folger, K. R., Thomas, K. and Capecchi, M. R. 1985. Efficient correction of mismatched bases in plasmid heteroduplexes injected into cultured mammalian cell nuclei. Mol. Cell. Biol. 5: 70-74. 64. Fang, W., Wu, J. Y. and Su, M. J. 1997. Methyl-directed repair of mismatched small heterologous sequences in cell extracts from Escherichia coli. J. Biol. Chem. 272: 22714-22720. 65. Smith, J. and Modrich, P. 1997. Removal of polymerase-produced mutant sequences from PCR products. Proc. Natl. Acad. Sci. USA 94: 6847-50. 66. Su, S. S., Grilley, M., Thresher, R., Griffith, J. and Modrich, P. 1989. Gap formation is associated with methyl-directed mismatch correction under conditions of restricted DNA synthesis. Genome 31: 104-11. 67. Muster-Nassal, C. and Kolodner, R. 1986. Mismatch correction catalyzed by cell-free extracts of Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 83: 7618-7622. 68. Macnab, R. M. 1992. Genetics and biogenesis of bacterial flagella. Annu. Rev. Genet. 26: 131-158. 69. Wilson, D. R. and Beveridge, T. J. 1993.
Bacterial flagellar filaments and their component flagellins. Can. J. Microbiol. 39: 451-472. 70. Schmitt, R., Raska, I. and Mayer, F. 1974. Plain and complex flagella of Pseudomonas rhodos: analysis of fine structure and composition. J. Bacteriol. 117: 844-857. 71. Götz, R., Limmer, N., Ober, K. and Schmitt, R. 1982. Motility and chemotaxis in two strains of Rhizobium with complex flagella. J. Gen. Microbiol. 128: 789-798. 72. Schmitt, R., Bamberger, I., Acker, G. and Mayer, F. 1974. Fine structure analysis of the complex flagella of Rhizobium lupini H13-3. Arch. Microbiol. 100: 145-162. 73. Trachtenberg, S., DeRosier, D. J. and Macnab, R. M. 1987. Three-dimensional structure of the complex flagellar filament of Rhizobium lupini and its relation to the structure of the plain filaments. J. Mol. Biol. 195: 603-620. 74. Götz, R. and Schmitt, R. 1987. Rhizobium meliloti swims by unidirectional intermittent rotation of right-handed flagellar helices. J. Bacteriol. 169: 3146-3150. 75. Lotz, W., Acker, G. and Schmitt, R. 1977. Bacteriophage 7-7-1 adsorbs to the complex flagella of Rhizobium lupini H13-3. J. Gen. Virol. 34: 9-17. 76. Krupski, G., Götz, R., Ober, K., Pleier, E. and Schmitt, R. 1985. Structure of complex flagellar filaments in Rhizobium meliloti. J. Bacteriol. 162: 361-366. 77. Murayama, M., Lodderstaedt, G. and Schmitt, R. 1978. Purification and biochemical properties of complex flagella isolated from Rhizobium lupini H13-3. Biochim. Biophys. Acta 535: 110-124. 78. Trachtenberg, S., DeRosier, D. J., Aizawa, S.-I. and Macnab, R. M. 1986. Pairwise perturbation of flagellin subunits. The structural basis for the differences between plain and complex bacterial flagellar filaments. J. Mol. Biol. 190: 569-576. 79. Gordee, R. S., Zeckner, D. J., Ellis, L. F., Thakkar, A. L. and Howard, L. C. 1984. In vitro and in vivo anti-Candida activity and toxicity of LY121019. J. Antibiotics 37: 1054-1065. 80. Debono, M., Willard, K. E., Kirst, H.
A., Wind, J. A., Crouse, G. D., Tao, E. V., Vicenzi, J. T., Counter, E. T., Ott, J. L., Ose, E. E. and Omura, S. 1989. Synthesis of new analogs of echinocandin B by enzymatic deacylation and chemical reacylation of the echinocandin B peptide: synthesis of the antifungal agent cilofungin (LY121019). J. Antibiotics 42 (3): 389-397. 81. Debono, M. and Gordee, R. S. 1994. Antibiotics that inhibit fungal cell-wall development. Annu. Rev. Microbiol. 48: 471-497. 82. Debono, M., Turner, W. W., Lagrandeur, L., Burkhardt, F. J., Nissen, J. S., Nichols, K. K., Rodriguez, M. J., Zweifel, M. J., Zeckner, D. J., Gordee, R. S., Tang, J. and Parr, T. R. 1995. Semisynthetic chemical modification of the antifungal lipopeptide echinocandin B (ECB): structure-activity studies of the lipophilic and geometric parameters of polyarylated acyl analogs of ECB. J. Med. Chem. 38 (17): 3271-3281. 83. Yeh, W. K. 1997. Evolving enzyme technology for pharmaceutical applications: case studies. J. Ind.
Microbiol. Biotechnol. 19 (5-6): 334-343. 84. Boeck, L. D., Fukuda, D., Abbott, B. J. and Debono, M. 1989. Deacylation of echinocandin B by Actinoplanes utahensis. J. Antibiotics 42 (3): 382-388. 85. Arnold, F. H. 1998. Design by directed evolution. Accts. Chem. Res. 31: 125-131. 86. Shao, Z., Callahan, M. and Arnold, F. H. 1998.
Directed enzyme evolution of Actinoplane utahensis ECB deacylase in Steptomyces lividans for enhanced specific activity, Manuscript submited. 87. Shao. Z., Zhao. U., Giver. L. and Arnold. F.M. 1998. Random-priming in vitro recombination: an effective tool for directed evolution. Nucleic Acids Res. 26 (2): 681-683. 88. Zhao. U. and Arnold. FH. 1997. Functional and I nonfunctional mutations distinguished by random recombination of homologous genes. Proc. Nati Acad. Sci. USA 94: 7997-8000. 89. Zhao, H., Giver. L., Shao, Z., Affholter, J.A., and Arnold, F.M. 1998. Molecular evolution by staggered extension process (StEP) in vitro recombination. Nat. Biotechnol. 16: 258-261. 90. Judo. M. S. B., Wedel. A. B. and Wilson. C. 1998. Stimulation and suppression of PCR-mediated recombination. Nucleic Acids Res. 26: 1819-1825. 91. Olkkels. J.S. 1997. Method for preparing polypeptide variants. PCT application WO 97/07205. 92. Gray, G. L. 1992. Hybrid prokaryotic polypeptides produced by in vivo homologous recombination.
U.S. Patent 5,093,257. 93. Weber, M. and Weissmann, C. 1983. Formation of genes coding for hybrid proteins by recombination between related. cloned genes in E. coli. Nucí Acids Res. 11: 5661-5669. 94. Maryon E. and Carroll. D. 1991. Characterization of recombination intermediates from DNA injected into Xenopus laevis oocytes: evidence for a nonconservative mechanism of homologous recombination. Mol. Cell. Biol. 11: 3278-3287.

Claims (34)

CLAIMS CHAPTER Having described the invention, it is considered a novelty and, therefore, what is contained in the following claims is claimed:
1. A method for evolving a polynucleotide towards the acquisition of a desired property, characterized in that it comprises (a) incubating a population of parental polynucleotide variants under conditions to generate annealed polynucleotides comprising heteroduplexes; (b) exposing the heteroduplexes to a cellular DNA repair system to convert the heteroduplexes to parental polynucleotide variants or recombined polynucleotide variants; (c) screening or selecting the recombined polynucleotide variants for the desired property.
2. The method according to claim 1, characterized in that the heteroduplexes are exposed to the cellular DNA repair system in vitro.
3. The method according to claim 2, characterized in that the cellular DNA repair system comprises cell extracts.
4. The method according to claim 1, characterized in that it further comprises introducing the heteroduplexes into cells, whereby the heteroduplexes are exposed to the cellular DNA repair system in vivo.
5. The method according to claim 4, characterized in that the annealed polynucleotides further comprise homoduplexes and the introduction step selects transformed cells comprising the heteroduplexes in relation to transformed cells comprising homoduplexes.
6. The method according to claim 4, characterized in that a first polynucleotide variant is provided as a component of a first vector, and a second polynucleotide variant is provided as a component of a second vector, and the method further comprises converting the first and second vectors to linearized forms in which the first and second polynucleotide variants occur at opposite ends, whereby in the incubation step the single-stranded forms of the first linearized vector reanneal with each other to form the first linear vector, the single-stranded forms of the second linearized vector reanneal with each other to form the second linear vector, and single-stranded forms of the first and second linearized vectors anneal to each other to form a circular heteroduplex containing a nick in each strand, and the introduction step selects transformed cells comprising the circular heteroduplexes in relation to the first and second linear vectors.
7. The method according to claim 6, characterized in that the first and second vectors are converted to linearized forms by PCR.
8. The method according to claim 6, characterized in that the first and second vectors are converted to linearized forms by digestion with first and second restriction enzymes.
9. The method according to claim 1, characterized in that the population of polynucleotide variants is provided in double-stranded form, and the method further comprises converting the double-stranded polynucleotides to single-stranded polynucleotides before the annealing step.
10. The method according to claim 1, characterized in that the conversion step comprises: conducting asymmetric amplification of first and second double-stranded polynucleotide variants to amplify a first strand of the first polynucleotide variant and a second strand of the second polynucleotide variant, whereby the first and second strands are annealed in the incubation step to form a heteroduplex.
11. The method according to claim 10, characterized in that the first and second double-stranded polynucleotide variants are provided in vector-free form, and the method further comprises incorporating the heteroduplex into a vector.
12. The method according to claim 4, characterized in that the polynucleotide population comprises first and second polynucleotides provided in double-stranded form, and the method further comprises incorporating the first and second polynucleotides as components of the first and second vectors, whereby the first and second polynucleotides occupy opposite ends of the first and second vectors, whereby in the incubation step the single-stranded forms of the first linearized vector are annealed together to form the first linear vector, the single-stranded forms of the second linearized vector are annealed together to form the second linear vector, and linearized single-stranded forms of the first and second vectors are annealed together to form a circular heteroduplex containing a nick in each strand, and the introduction step selects the transformed cells comprising the circular heteroduplex in relation to the first and second linear vectors.
13. The method according to claim 4, characterized in that it further comprises sealing the nicks in the heteroduplexes to form covalently closed circular heteroduplexes before the introduction step.
14. The method according to claim 11, characterized in that the first and second polynucleotides are obtained from chromosomal DNA.
15. The method according to claim 1, characterized in that it further comprises repeating steps (a) - (c) whereby the incubation step in a subsequent cycle is carried out on recombinant variants of a previous cycle.
16. The method according to claim 1, characterized in that the polynucleotide variants code for a polypeptide.
17. The method according to claim 1, characterized in that the population of polynucleotide variants comprises at least 20 variants.
18. The method according to claim 1, characterized in that the population of polynucleotide variants is at least 10 kb in length.
19. The method according to claim 1, characterized in that the population of polynucleotide variants comprises natural variants.
20. The method according to claim 1, characterized in that the population of polynucleotides comprises variants generated by mutagenic PCR.
21. The method according to claim 1, characterized in that the population of polynucleotide variants comprises variants generated by site-directed mutagenesis.
22. The method according to claim 1, characterized in that the cells are bacterial cells.
23. The method according to claim 1, characterized in that it further comprises at least partially demethylating the population of variant polynucleotides.
24. The method according to claim 23, characterized in that the at least partial demethylation step is carried out by PCR amplification of the population of variant polynucleotides.
25. The method according to claim 23, characterized in that the at least partial demethylation step is carried out by amplifying the population of variant polynucleotides in host cells.
26. The method according to claim 25, characterized in that the host cells are defective in a gene encoding a methylase enzyme.
27. The method according to claim 1, characterized in that the population of polynucleotide variants comprises at least 5 polynucleotides having a sequence identity of at least 90% relative to each other.
28. The method according to claim 1, characterized in that it further comprises isolating a screened recombinant variant.
29. The method according to claim 28, characterized in that it further comprises expressing a screened recombinant variant to produce a recombinant protein.
30. The method according to claim 29, characterized in that it further comprises formulating the recombinant protein with a carrier to form a pharmaceutical composition.
31. The method according to claim 1, characterized in that the polynucleotide variants code for enzymes selected from the group consisting of proteases, lipases, amylases, cutinases, cellulases, amylases, oxidases, peroxidases and phytases.
32. The method according to claim 1, characterized in that the polynucleotide variants encode a polypeptide selected from the group consisting of insulin, ACTH, glucagon, somatostatin, somatotropin, thymosin, parathyroid hormone, pigment hormones, somatomedin, erythropoietin, luteinizing hormone, chorionic gonadotropin, hypothalamic releasing factor, antidiuretic hormones, thyroid stimulating hormone, relaxin, interferon, thrombopoietin (TPO), and prolactin.
33. The method according to claim 1, characterized in that the polynucleotide variants code for a plurality of enzymes that form a metabolic pathway.
34. The method according to claim 1, characterized in that the polynucleotide variants are in concatameric form.
MXPA/A/2000/005640A 1997-12-08 2000-06-08 Method for creating polynucleotide and polypeptide sequences MXPA00005640A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US60/067,908 1997-12-08

Publications (1)

Publication Number Publication Date
MXPA00005640A true MXPA00005640A (en) 2001-07-31

Similar Documents

Publication Publication Date Title
US7790381B2 (en) Method for creating polynucleotide and polypeptide sequences
AU2021201257B2 (en) Cas9 variants and uses thereof
KR101817482B1 (en) Genome editing using campylobacter jejuni crispr/cas system-derived rgen
EP0911396B1 (en) Methods for generating polynucleotides having desired characteristics by iterative selective and recombination
JP4448619B2 (en) Method for obtaining recombinant polynucleotide sequences in vitro, a sequence bank and the sequences thus obtained
EP0996718B1 (en) Method for constructing a library using dna shuffling
CN111448313A (en) Compositions and methods for improving the effectiveness of Cas 9-based knock-in strategies
US7563578B2 (en) Method for in vitro molecular evolution of protein function
JPH11513562A (en) Method for producing recombinant plasmid
WO1998056926A1 (en) System for expressing hyperthermostable protein
Potter et al. DNA recombination: in vivo and in vitro studies
AU2002234572B2 (en) A method for in vitro molecular evolution of protein function
Dunny et al. Group II introns and expression of conjugative transfer functions in lactic acid bacteria
MXPA00005640A (en) Method for creating polynucleotide and polypeptide sequences
KR100650960B1 (en) Mutant staphylococcus aureus v8 proteases
AU2005232282B2 (en) Method for creating polynucleotide and polypeptide sequences
KR20180128864A (en) Gene editing composition comprising sgRNAs with matched 5' nucleotide and gene editing method using the same
CN1946844B (en) Generation of recombinant genes in prokaryotic cells by using two extrachromosomal elements
Arnold et al. Method for creating polynucleotide and polypeptide sequences
Lai et al. A simple and efficient method for site-directed mutagenesis with double-stranded plasmid DNA
Szardenings et al. A phasmid optimised for protein design projects: pMAMPF
US20100041033A1 (en) Site specific system for generating diversity protein sequences
WO9119805 Patent bibliography