MXPA01003404A - A method for analyzing polynucleotides - Google Patents

A method for analyzing polynucleotides

Info

Publication number
MXPA01003404A
Authority
MX
Mexico
Prior art keywords
modified
polynucleotide
nucleotide
fragments
cleavage
Prior art date
Application number
MXPA/A/2001/003404A
Other languages
Spanish (es)
Inventor
Vincent P Stanton Jr
Jia Liu Wolfe
Tomohiko Kawate
Gregory Verdine
Original Assignee
Variagenics Inc
Priority date
Filing date
Publication date
Application filed by Variagenics Inc

Abstract

The present invention relates to methods for the analysis of polynucleotides including detection of variance in nucleotide sequence without the need for full sequence determination, full sequence determination of a polynucleotide, genotyping of DNA and labeling a polynucleotide fragment during the process of cleaving it into fragments.

Description

A METHOD FOR ANALYZING POLYNUCLEOTIDES
FIELD OF THE INVENTION
The present invention relates in general to organic chemistry, analytical chemistry, biochemistry, molecular biology, genetics, diagnostics and medicine. In particular, it relates to a method for analyzing polynucleotides; that is, for determining the complete nucleotide sequence of a polynucleotide, for detecting variation in nucleotide sequence and for genotyping DNA.
BACKGROUND OF THE INVENTION
The following is offered as background information only and is not intended or admitted to be prior art to the present invention.
DNA is the carrier of the genetic information of all living cells. The genetic and physical characteristics of an organism, its genotype and its phenotype, respectively, are controlled by precise sequences of nucleotides in the organism's DNA. The sum total of all the sequence information present in an organism's DNA is called the "genome" of the organism. The nucleic acid sequence of a DNA molecule consists of a linear polymer of four "nucleotides".
The four nucleotides are tripartite molecules, each consisting of (1) one of four heterocyclic bases, adenine (abbreviated "A"), cytosine ("C"), guanine ("G") and thymine ("T"); (2) the pentose sugar 2-deoxyribose, which is attached via its 1' carbon atom to a ring nitrogen atom of the heterocyclic base; and (3) a monophosphate monoester formed between a phosphoric acid molecule and the 5'-hydroxy group of the sugar. The nucleotides polymerize by the formation of diesters between the 5'-phosphate of one nucleotide and the 3'-hydroxy group of another nucleotide to give a single strand of DNA. In nature, two of these single strands interact through hydrogen bonds between complementary nucleotides (A is complementary to T and C is complementary to G) to form "base pairs", which results in the formation of the well-known DNA "double helix" of Watson and Crick. RNA is similar to DNA, except that the base thymine is replaced by uracil ("U") and the pentose sugar is ribose instead of deoxyribose. In addition, RNA exists in nature predominantly as a single strand; that is, two strands do not usually combine to form a double helix.
When reference is made to nucleotide sequences in a polynucleotide, it is customary to use the abbreviation of the base, that is, A, C, G and T (or U), to represent the entire nucleotide containing that base. For example, a polynucleotide sequence denoted "ACG" means that an adenine nucleotide is linked via a phosphate ester linkage to a cytosine nucleotide, which is linked via another phosphate ester linkage to a guanine nucleotide. If the polynucleotide being described is DNA, then "A" is understood to refer to an adenine nucleotide containing a deoxyribose sugar. If there is any possibility of ambiguity, the "A" of a DNA molecule can be designated "deoxyA" or simply "dA". The same is true for C and G.
Since T occurs only in DNA and not in RNA, there can be no ambiguity, so it is not necessary to refer to deoxyT or dT.
As a first approximation, it can be said that the number of genes an organism has is proportional to the phenotypic complexity of the organism; that is, to the number of gene products needed to replicate the organism and allow it to function. The human genome, which is currently considered one of the most complex, consists of approximately 60,000 to 100,000 genes and approximately 3.3 billion base pairs. Each of these genes codes for an RNA, most of which in turn encode a particular protein that performs a specific biochemical or structural function. A variant, also known as a polymorphism or mutation, in the genetic code of any of these genes can result in the production of a gene product, usually a protein or an RNA, with altered biochemical activity or with no activity at all. This can result from a change as small as the addition, deletion or substitution (transition or transversion) of a single nucleotide in the DNA comprising a particular gene, which is sometimes referred to as a "single nucleotide polymorphism" or "SNP". The consequences of such a mutation in the genetic code range from harmless to debilitating to fatal. There are currently more than 6,700 human disorders believed to have a genetic component. For example, hemophilia, Alzheimer's disease, Huntington's disease, Duchenne muscular dystrophy and cystic fibrosis are known to be related to variants in the nucleotide sequence of the DNA comprising certain genes. In addition, evidence is accumulating that changes in certain DNA sequences may predispose an individual to a variety of abnormal conditions, such as obesity, diabetes, cardiovascular disease, central nervous system disorders, autoimmune disorders and cancer.
Variations in the DNA sequence of specific genes have also been implicated in the differences observed in patient responses to, for example, drugs, radiation therapy, nutritional status and other medical interventions. Thus, the ability to detect variants in the DNA sequence of an organism's genome is an important aspect of investigating the relationships between such variants and medical disorders and responses to medical interventions. Once an association has been established, the ability to detect variants in a patient's genome can be an extremely useful diagnostic tool. It may even be possible, through early detection of a variant, to diagnose and potentially treat or even prevent a disorder before it manifests itself physically. In addition, variant detection can be a valuable research tool that may lead to the discovery of the genetic basis of disorders whose causes were hitherto unknown or thought to be other than genetic. Variant detection can also be useful in guiding the selection of an optimal therapy where patients differ in their responses to one or more proposed therapies.
While the benefits of being able to detect variants in the genetic code are clear, the practical aspects of doing so are daunting: it is estimated that sequence variations in human DNA occur with a frequency of approximately one per 100 nucleotides when 50 to 100 individuals are compared. Nickerson, D.A., Nature Genetics, 1998, 223-240. This translates into as many as thirty million variants in the human genome. Not all of these variants, in fact very few, have a measurable effect on the physical well-being of humans. Detecting these 30 million variants and then determining which of them are relevant to human health is clearly a formidable task.
In addition to variant detection, knowledge of the complete nucleotide sequence of an organism's genome would contribute immensely to the understanding of the organism's overall biology; that is, it would lead to the identification of every gene product, its organization and arrangement in the organism's genome, and the sequences required for controlling gene expression (that is, the production of each gene product) and replication. In fact, the quest for such knowledge and understanding is the raison d'être of the Human Genome Project, an international effort directed at sequencing the entire human genome. Once the sequence of a single genome of any organism is available, it becomes useful to obtain the partial or complete sequence of other organisms of that species, particularly those that exhibit different characteristics, in order to identify the DNA sequence differences that correlate with the different characteristics. For microbial organisms, these different characteristics may include pathogenicity on the negative side, or the ability to produce a particular polymer or to remediate pollution on the positive side.
Differences in growth rate, nutrient content or pest resistance are potential differences that may be observed among plants. Even among humans, a difference in susceptibility to disease or in response to a particular therapy may be related to a genetic variant, that is, a variant in the DNA sequence. As a result of the enormous potential utility to be derived from DNA sequence information, in particular from the identification of DNA sequence variants between individuals of the same species, a significant future increase can be expected in the demand for rapid, inexpensive and automated procedures for DNA sequencing and variant detection.
Once the DNA sequence of a DNA segment (for example, a gene, a cDNA or, on a larger scale, a chromosome or a whole genome) has been determined, the existence of sequence variants in that segment among members of the same species can be explored. Complete DNA sequencing is the definitive procedure for this task. Thus, it is possible to determine the complete sequence of a copy of a DNA segment obtained from a different member of the species and simply compare that complete sequence with the one previously obtained. However, present DNA sequencing technology is expensive and time consuming and, to reach high levels of accuracy, must be highly redundant. Most major sequencing projects require 5- to 10-fold coverage of each nucleotide to achieve an acceptable error rate of 1 per 2,000 to 1 per 10,000 bases. In addition, DNA sequencing is an inefficient way to detect variants. For example, a variation between any two copies of a gene, for example when two chromosomes are being compared, may occur as rarely as once every thousand or more bases. Thus, only a small portion of the sequence, that in which the variant exists, is of interest.
However, if full sequencing is used, an enormous number of nucleotides have to be sequenced to arrive at the desired information contained in that small portion. Consider, for example, the comparison of ten versions of a 3,000 nucleotide DNA sequence in order to detect, among them, four variants. Even if only two-fold redundancy is used (each strand of the double-stranded 3,000 nucleotide DNA segment from each individual is sequenced once), 60,000 nucleotides (10 x 3,000 x 2) would have to be sequenced. In addition, it is more than likely that problem areas requiring additional runs with new primers will be encountered during sequencing; thus, the project could entail sequencing as many as 100,000 nucleotides to determine four variants.
Over the past 15 years, a variety of procedures have been developed to identify sequence differences and provide some information about the location of variant sites (Table 1). Using such a procedure, it would only be necessary to sequence four relatively short portions of the 3,000 nt (nucleotide) sequence. In addition, only a few samples from each region would have to be sequenced, because each variant produces a characteristic change (Table 1); thus if, for example, 22 of 50 samples exhibit such a characteristic change with the variant detection procedure, sequencing as few as four of the 22 would provide information about the other 18. The length of the segments that require sequencing could, depending on the variant detection procedure used, be as short as 50 to 100 nt. In this way, the scale of the sequencing project could be reduced to 4 (sites) x 50 (nt per site) x 2 (strands per individual) x 2 (individuals per site); that is, only approximately 800 nucleotides. This represents approximately 1% of the sequencing required in the absence of a preceding variant detection step.
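The arithmetic in the preceding example can be checked with a short sketch. The function and parameter names below are illustrative only and do not come from the patent; the numbers are the ones quoted in the text.

```python
def full_sequencing_load(individuals, segment_nt, strands=2):
    """Nucleotides read when every copy is sequenced once on each strand."""
    return individuals * segment_nt * strands


def targeted_sequencing_load(sites, nt_per_site, strands=2, individuals_per_site=2):
    """Nucleotides read when only short regions around detected variants are sequenced."""
    return sites * nt_per_site * strands * individuals_per_site


full = full_sequencing_load(individuals=10, segment_nt=3000)   # 10 x 3,000 x 2
targeted = targeted_sequencing_load(sites=4, nt_per_site=50)   # 4 x 50 x 2 x 2

print(full, targeted)
print(f"fraction of two-fold full sequencing: {targeted / full:.1%}")
print(f"fraction of the 100,000 nt worst case: {targeted / 100_000:.1%}")
```

Against the 60,000 nt baseline the targeted effort is about 1.3%, and against the 100,000 nt worst case about 0.8%, which brackets the "approximately 1%" stated above.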
As currently practiced, the techniques for determining the complete nucleotide sequence of a polynucleotide and for detecting previously unknown variants or mutations in related polynucleotides end up being the same; that is, even when the issue is the presence or absence of a single nucleotide variant between related polynucleotides, the complete sequences of at least one segment of the related polynucleotides are determined and then compared. The only difference is that, as a first step to reduce the amount of complete sequencing necessary in the detection of unknown variants, a variant detection method, such as one of those described in Table 1, can be used.
Methods for the discovery of new variants in the DNA sequence
[Table 1 body not recoverable. Legible column groups: single-strand DNA folding; heteroduplex DNA formation and melting; recognition of DNA mismatches (binding; enzymatic or chemical cleavage); sequencing methods (electrophoretic; by hybridization; by mass spectrometry). Legible method entries: SSCP, SFLP, DGGE, DHPLC, SCGE, HA, MutS, T4E7, T7E1, osmium tetroxide or hydroxylamine cleavage, Sanger and Maxam-Gilbert sequencing.]
Table 1. Summary of the methods commonly used for the discovery of variants in the DNA sequence. In the lower part, the physical basis or result of each method is schematically represented. Electrophoretic sequencing methods include a variety of enzymatic procedures for generating partial sequence ladders (e.g., the use of UTP and uracil glycosylase, or exonuclease degradation in the presence of boronated nucleotides).
Methods for genotyping (i.e., for testing specifically for the presence of a previously identified polymorphism) include many of those listed above as well as others.
The two classical methods for complete nucleotide sequencing are the chemical procedure of Maxam and Gilbert (Proc. Nat. Acad. Sci. USA, 74, 560-564 (1977)) and the chain termination method of Sanger, et al. (Proc. Nat. Acad. Sci. USA, 74, 5463-5467 (1977)).
The Maxam-Gilbert method of complete nucleotide sequencing includes end-labeling a DNA molecule with, for example, 32P, followed by one of two discrete reaction sequences involving two reactions each; that is, four reactions in total. One of these reaction sequences includes the selective methylation of the purine nucleotides guanine (G) and adenine (A) in the polynucleotide being investigated, which, in most cases, is an isolated, naturally occurring polynucleotide such as DNA. The N7 position of guanine is methylated approximately five times faster than the N3 position of adenine. When heated in the presence of aqueous base, the methylated bases are lost and a break occurs in the polynucleotide chain. The reaction is more effective with methylated guanine than with methylated adenine, so when the reaction product is electrophoresed on polyacrylamide gels, G cleavage ladders predominate. Under acidic conditions, on the other hand, both methylated bases are removed effectively. Piperidine treatment cleaves the DNA at these abasic sites, generating sequencing ladders corresponding to A + G. In this way, the four chemical reactions, followed by electrophoretic analysis of the resulting ladders of end-labeled cleavage products, reveal the exact nucleotide sequence of a DNA molecule. It is important in the Maxam-Gilbert sequencing method that only partial cleavage occurs, on the order of 1 to 2% at each susceptible position. This is because electrophoresis separates the fragments by size. To be meaningful, the fragments produced must represent, on average, a single modification and cleavage per molecule.
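The reason partial cleavage matters can be illustrated with a simple probability sketch. It assumes, purely for illustration, that every susceptible position is cleaved independently with the same probability p; the patent does not state such a model, and the function names are hypothetical.

```python
def band_intensity(p, k):
    """Probability that the end-labeled fragment stops at the k-th susceptible
    site counted from the label: sites 1..k-1 stay intact, site k is cleaved."""
    return p * (1 - p) ** (k - 1)


def mean_cuts(p, n_sites):
    """Expected number of cleavages per molecule with n_sites susceptible positions."""
    return p * n_sites


for p in (0.02, 0.5):
    near = band_intensity(p, 1)    # rung closest to the labeled end
    far = band_intensity(p, 50)    # 50th susceptible site from the label
    print(f"p={p}: mean cuts over 50 sites = {mean_cuts(p, 50):.1f}, "
          f"far/near band ratio = {far / near:.3g}")
```

Under this assumption, cleavage at about 2% per site gives roughly one cut per molecule over 50 susceptible sites and keeps distant rungs at a usable fraction of the intensity of near ones, whereas heavy cleavage leaves essentially no labeled fragments from distant sites.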
Then, when the fragments from the four reactions are aligned according to size, the exact sequence of the target DNA can be determined.
The Sanger method for determining complete nucleotide sequences consists of preparing, by enzymatic polymerization, four series of labeled, base-specifically chain-terminated DNA fragments. As in the Maxam-Gilbert procedure, four separate reactions are performed. In the Sanger method, each of the four reaction mixtures contains the same oligonucleotide template (either single-stranded or double-stranded DNA), the four nucleotides A, G, C and T (one of which can be labeled), a polymerase and a primer, the polymerase and the primer being present to effect polymerization of the nucleotides into a complement of the template oligonucleotide. To one of the four reaction mixtures is added an empirically determined amount of the dideoxy derivative of one of the nucleotides. To a second reaction mixture is added a small amount of the dideoxy derivative of one of the three remaining nucleotides, and so on, resulting in four reaction mixtures each containing a different dideoxy nucleotide. The dideoxy derivatives, by virtue of their missing 3'-hydroxyl groups, terminate the enzymatic polymerization reaction upon incorporation into the nascent oligonucleotide chain. Thus, in a reaction mixture containing, say, dideoxyadenosine triphosphate (ddATP), a series of oligonucleotide fragments is produced that all terminate in ddA and that, when resolved by electrophoresis, produce a series of bands corresponding to the size of the fragment created up to the point at which the chain-terminating ddA was incorporated in the polymerization reaction. Corresponding fragment ladders can be obtained from each of the other reaction mixtures, in which the oligonucleotide fragments end in C, G and T.
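The four termination reactions described above can be sketched in a few lines. This is an idealized illustration, not the patent's procedure: fragment lengths are counted from the first nucleotide added to the primer, every position is assumed to yield a detectable fragment, and the function names are hypothetical.

```python
def sanger_ladders(synthesized):
    """Return, for each dideoxy reaction, the lengths of its chain-terminated fragments.

    `synthesized` is the newly made strand (the complement of the template),
    written 5'->3' starting at the first nucleotide added to the primer.
    """
    ladders = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized, start=1):
        ladders[base].append(length)   # the ddN reaction stops wherever N is incorporated
    return ladders


def read_ladder(ladders):
    """Merge the four lanes by fragment size and read the sequence rung by rung."""
    rungs = sorted((length, base) for base, lengths in ladders.items() for length in lengths)
    return "".join(base for _, base in rungs)


ladders = sanger_ladders("GATTACA")
print(ladders["A"])          # fragment lengths seen in the ddATP lane
print(read_ladder(ladders))  # sequence recovered from the combined ladder
```

Sorting all fragments by size across the four lanes is the computational equivalent of reading a gel from the shortest band upward.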
The four sets of fragments create a "sequence ladder", each rung of which represents the next nucleotide in the sequence of bases comprising the target DNA. In this way, the exact nucleotide sequence of the DNA can simply be read from the electrophoresis gel after autoradiography, or from computer analysis of the chromatograms in the case of an automated DNA sequencing instrument. As mentioned above, dye-labeled chain-terminating dideoxynucleotides and modified polymerases that efficiently incorporate modified nucleotides constitute an improved method of chain-termination sequencing.
Both the Maxam-Gilbert and the Sanger procedures have their drawbacks. Both are slow, labor-intensive (particularly the Maxam-Gilbert procedure, which, unlike the Sanger procedure, has not been automated), costly (for example, the most highly optimized versions of the Sanger procedure require very expensive reagents) and require a high degree of technical expertise to ensure proper operation and reliable results. In addition, the Maxam-Gilbert procedure suffers from a lack of specificity of the chemical modifications, which can produce artifactual fragments that result in false ladder readings from the gel. The Sanger method, on the other hand, is susceptible to the formation of template secondary structure, which can interfere with the polymerization reaction. This causes termination of polymerization at sites of secondary structure (called "stops"), which can result in erroneous fragments appearing in the sequence ladder and make some parts of the sequence unreadable, although this problem is reduced by the use of dye-labeled dideoxy terminators.
In addition, both sequencing methods are susceptible to "compressions", another result of DNA secondary structure, which can affect fragment mobility during electrophoresis and thus render the sequence ladder unreadable or subject to misinterpretation in the vicinity of the secondary structure. Furthermore, both methods are plagued by uneven ladder intensity and by non-specific background interference. These problems are magnified when the issue is variant detection. To discern a single nucleotide variant, the procedure used must be extremely accurate; an "error" in reading a single nucleotide can result in a false positive, that is, an indication of a variant where none exists. Neither the Maxam-Gilbert nor the Sanger procedure is capable of such accuracy in a single run. In fact, the error frequency in a "one-pass" sequencing experiment is equal to or greater than 1%, which is on the order of ten times the frequency of actual DNA variants when any two versions of a sequence are compared. The situation can be ameliorated somewhat by performing multiple runs (usually in the context of a random sequencing procedure) for each polynucleotide being compared, although this simply increases the cost in terms of equipment, reagents, labor and time. The high cost of sequencing becomes even less acceptable when it is considered that, in looking for nucleotide sequence variants between related polynucleotides, it is often not necessary to determine the complete sequence of the target polynucleotides or even the exact nature of the variant (although, as will be seen, in some cases this is still discernible using the method of this invention); detection of the variant alone may be sufficient.
While they do not avoid all of the problems associated with the Maxam-Gilbert and Sanger procedures, several techniques have been developed to make one or the other of the procedures more efficient. One of these approaches has been to develop ways to avoid gel electrophoresis, one of the slowest steps in the procedures. For example, in U.S. Patent Nos. 5,003,059 and 5,174,962, the Sanger method is used; however, the dideoxy derivative of each of the nucleotides used to terminate the polymerization reaction is uniquely labeled with a sulfur isotope, 32S, 33S, 34S or 36S. Once the polymerization reactions are complete, the chain-terminated sequences are separated by capillary zone electrophoresis, which, compared with gel electrophoresis, increases resolution, reduces run time and allows very small samples to be analyzed. The separated chain-terminated sequences are then combusted to convert the incorporated isotopic sulfur into isotopic sulfur dioxides (32SO2, 33SO2, 34SO2 and 36SO2). The isotopic sulfur dioxides are then subjected to mass spectrometry. Since each sulfur isotope is uniquely related to one of the four sets of base-specifically chain-terminated fragments, the nucleotide sequence of the target DNA can be determined from the mass spectrogram.
Another method, disclosed in U.S. Patent No. 5,580,733, also incorporates the Sanger technique while eliminating gel electrophoresis. The method includes taking each of the four populations of base-specifically chain-terminated oligonucleotides from the Sanger reactions and forming a mixture with a matrix that absorbs visible laser light, such as 3-hydroxypicolinic acid. The mixtures are then illuminated with visible laser light and vaporized, which occurs without further fragmentation of the chain-terminated nucleic acid fragments.
The vaporized molecules, which are charged, are then accelerated in an electric field and the mass-to-charge ratio (m/z) of the ionized molecules is determined by time-of-flight mass spectrometry (TOF-MS). The molecular weights are then aligned to determine the exact sequence of the target DNA. By measuring the mass difference between successive fragments of each of the mixtures, the lengths of the fragments ending in A, G, C or T can then be inferred. An important limitation of present MS instruments is that, in routine use, polynucleotide fragments longer than 100 nucleotides (in many instruments, 50 nucleotides) cannot be detected efficiently, especially if the fragments are part of a complex mixture. This serious limitation on the size of the fragments that can be analyzed has limited the development of polynucleotide analysis by MS. Thus, there is a need for a method that adapts large polynucleotides, such as DNA, to the capabilities of present MS instruments. The present invention provides such a method.
An additional approach to nucleotide sequencing is disclosed in U.S. Patent No. 5,547,835. Again, the starting point is Sanger's sequencing strategy. The four series of base-specifically chain-terminated fragments are "conditioned" by, for example, purification, cation exchange and/or mass modification. The molecular weights of the conditioned fragments are then determined by mass spectrometry and the sequence of the starting nucleic acid is determined by aligning the base-specifically terminated fragments according to molecular weight.
Each of the above methods includes complete Sanger sequencing of a polynucleotide before analysis by mass spectrometry. To detect genetic mutations, that is, variants, the entire sequence can be compared to a known nucleotide sequence.
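The idea of reading sequence information from mass differences between successive fragments can be sketched as follows. This is a simplified variation on the approach described above: instead of analyzing four separate mixtures, it assigns a base to each step of a single mass ladder. The residue masses are approximate average masses of unmodified deoxynucleotide monophosphate residues, and the tolerance and function names are assumptions made for illustration.

```python
# Approximate average mass (Da) added to a DNA chain by each nucleotide residue.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def call_base(delta, tolerance=2.0):
    """Assign the base whose residue mass is closest to a measured mass step."""
    base, mass = min(RESIDUE_MASS.items(), key=lambda item: abs(item[1] - delta))
    if abs(mass - delta) > tolerance:
        raise ValueError(f"mass step {delta:.2f} Da matches no single nucleotide")
    return base


def sequence_from_masses(masses):
    """Read the bases added between successive fragments of a mass ladder."""
    ordered = sorted(masses)
    return "".join(call_base(heavier - lighter)
                   for lighter, heavier in zip(ordered, ordered[1:]))


# Hypothetical ladder: a 5000.00 Da starting fragment extended by G, A, T, C.
ladder = [5000.00, 5329.21, 5642.42, 5946.62, 6235.80]
print(sequence_from_masses(ladder))
```

The smallest gap between two residue masses (A versus T, about 9 Da) indicates the mass accuracy such a reading would need, which is one reason fragment size limits matter for MS-based analysis.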
When the sequence is not known, comparison with the nucleotide sequence of the same DNA isolated from another organism of the same species that does not exhibit the abnormalities observed in the subject organism will likely reveal mutations. Of course, this approach requires running the Sanger procedure twice; that is, eight separate reactions. In addition, if a potential variant is detected, in most cases the entire procedure would be run again, sequencing the opposite strand with a different primer to ensure that a false positive has not been obtained. When the specific nucleotide variant or mutation related to a particular disorder is known, there is a wide variety of known methods for detecting the variant without complete sequencing. For example, U.S. Patent No. 5,605,798 describes such a method. The method includes obtaining a nucleic acid molecule containing the target sequence of interest from a biological sample, optionally amplifying the target sequence and then hybridizing the target sequence with a detector oligonucleotide that is specifically designed to be complementary to the target sequence. Either the detector oligonucleotide or the target sequence is "conditioned" by mass modification prior to hybridization. Unhybridized detector oligonucleotide is removed and the remaining reaction product is volatilized and ionized. Detection of the detector oligonucleotide by mass spectrometry indicates the presence of the target nucleic acid sequence in the biological sample and thus confirms the diagnosis of the disorder related to the variant.
Variant detection procedures can be divided into two general categories, although there is a considerable degree of overlap. One category, variant discovery procedures, is useful for examining DNA segments for the existence, location and characteristics of new variants. To achieve this, variant discovery procedures can be combined with DNA sequencing.
The second group of procedures, variant typing procedures (sometimes referred to as genotyping), is useful for the repetitive determination of one or more nucleotides at a particular site in a DNA segment when the location of the variant or variants has been previously identified and characterized. In this type of analysis, it is often possible to design a very sensitive test of the status of a particular nucleotide or nucleotides. Of course, this technique is not well suited to the discovery of new variants.
As indicated above, Table 1 is a list of several existing techniques for nucleotide testing. Most of these are used mainly for the determination of new variants. There is a variety of other methods, not shown, for genotyping. Like the Maxam-Gilbert sequencing procedures, these techniques are generally time consuming, tedious and require a relatively high level of skill to achieve the highest degree of accuracy possible with each procedure. Even so, some of the techniques listed are, even at their best, inherently less accurate than is desirable.
The methods in Table 1, although intended mainly for variant discovery, can also be used when a variant nucleotide has already been identified and the goal is to determine its status in one or more unknown DNA samples (variant typing or genotyping).
Some of the methods that have been developed specifically for genotyping include: (1) primer extension methods, in which dideoxynucleotide termination of the primer extension reaction occurs at the variant site, generating extension products of different lengths or with different terminal nucleotides, which can then be determined by electrophoresis, mass spectrometry or fluorescence in a plate reader; (2) hybridization methods, in which oligonucleotides corresponding to the two possible sequences at the variant site are bound to a solid surface and hybridized with probes from the unknown sample; (3) restriction fragment length polymorphism analysis, in which a restriction endonuclease recognition site includes the polymorphic nucleotide in such a way that the site is cleavable with one variant nucleotide but not with another; (4) methods such as "TaqMan" that involve differential hybridization and consequent differential 5' nuclease digestion of labeled oligonucleotide probes, in which there is fluorescence resonance energy transfer (FRET) between two fluors on the probe that is abrogated by nuclease digestion of the probe; (5) other FRET-based methods involving labeled oligonucleotide probes, called molecular beacons, that take advantage of allele-specific hybridization; (6) ligation-dependent methods that require the enzymatic ligation of two oligonucleotides across a polymorphic site that is perfectly matched to only one of them; and (7) allele-specific oligonucleotide priming in a polymerase chain reaction (PCR). U. Landegren, et al., 1998, Reading Bits of Genetic Information: Methods for Single-nucleotide Polymorphism Analysis, Genome Research 8(8): 769-76.
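Of the genotyping approaches listed above, restriction fragment length polymorphism analysis, item (3), is perhaps the simplest to illustrate. The sketch below uses the EcoRI recognition sequence GAATTC, which is cut after the first G, and two made-up alleles that differ at a single nucleotide inside the site. It models one strand only, and the allele sequences and function name are hypothetical.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting `sequence` at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream fragment
    (1 for EcoRI, which cuts between G and AATTC).
    """
    lengths, start = [], 0
    index = sequence.find(site)
    while index != -1:
        cut = index + cut_offset
        lengths.append(cut - start)
        start = cut
        index = sequence.find(site, index + 1)
    lengths.append(len(sequence) - start)
    return lengths


flank_5 = "ACGTACGTACGT"
flank_3 = "TTGCATTGCA"
allele_cut = flank_5 + "GAATTC" + flank_3     # site intact
allele_uncut = flank_5 + "GAGTTC" + flank_3   # A>G change destroys the site

print(digest(allele_cut))     # two fragments
print(digest(allele_uncut))   # one full-length fragment
```

The two alleles are then distinguished by fragment size alone, which is what makes the assay suitable for typing a known variant but not for discovering new ones.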
When complete sequencing of large templates is desired, such as the complete genome of a virus, a bacterium or a eukaryote (e.g., higher organisms, including humans), or the repeated sequencing of a large DNA region or of DNA regions from different strains or individuals of a particular species for comparison purposes, it is necessary to implement strategies to prepare template libraries for DNA sequencing. This is because sequencing by conventional chain termination (i.e., the Sanger procedure) is limited by the resolving power of the analytical procedure used to create the nucleotide ladder of the target polynucleotide. For gels, the resolving power is approximately 500 to 800 nt at a time. For mass spectrometry, the limitation is the length of a polynucleotide that can be vaporized efficiently before detection in the instrument. Although larger fragments have been analyzed by highly specialized procedures and instrumentation, the limit is currently approximately 50 to 60 nt. However, in large-scale sequencing projects, such as the Human Genome Project, the "markers" (DNA segments of known chromosomal location whose presence can be determined relatively easily by the polymerase chain reaction (PCR) technique and which can therefore be used as reference points to map new areas of the genome) are currently separated by approximately 100 kilobases (Kb). Markers at 100 Kb intervals must be connected using efficient sequencing strategies. If the analytical method used is gel electrophoresis, then hundreds of sequencing reactions would be required to sequence a 100 Kb stretch of DNA. A fundamental question is how to divide the 100 Kb segment (or whatever size is being dealt with) to optimize the process; that is, to minimize the number of sequencing reactions and the sequence assembly work necessary to generate a complete sequence with the desired level of precision.
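As a rough illustration of the scale described above, the following Python sketch computes a lower bound on the number of gel reads needed to span a 100 Kb interval at the 500 to 800 nt read lengths quoted in the text. The function name `min_reads` and its `redundancy` parameter are illustrative assumptions and are not part of the disclosure; the bound ignores overlap between reads and failed reactions, so real projects need more.

```python
import math


def min_reads(target_nt: int, read_nt: int, redundancy: float = 1.0) -> int:
    """Lower bound on the number of reads of length read_nt needed to
    sequence target_nt nucleotides `redundancy` times over.

    Hypothetical helper: assumes perfectly tiled, non-overlapping reads.
    """
    return math.ceil(target_nt * redundancy / read_nt)


if __name__ == "__main__":
    for read_nt in (500, 800):
        # Single-pass bound for a 100 Kb marker interval.
        print(read_nt, "nt reads:", min_reads(100_000, read_nt))
```

Even this optimistic single-pass bound is in the hundreds of reactions, which is consistent with the estimate given in the text.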
An important point in this regard is how to initially fragment the DNA in such a way that the fragments, once sequenced, can be reassembled correctly to recreate the full-length target DNA. Currently, two general approaches provide both sequence-ready fragments and the information necessary to recombine the sequences into the full-length target DNA: "random sequencing" (see, e.g., Venter, JC, et al., Science, 1998, 280: 1540-1542; Weber, JL and Myers, EW, Genome Research, 1997, 7: 401-409; Andersson, B. et al., DNA Sequence, 1997, 7: 63-70) and "directed DNA sequencing" (see, for example, Voss, H., et al., Biotechniques, 1993, 15: 714-721; Kaczorowski, T., et al., Anal. Biochem., 1994, 221: 127-135; Lodhi, MA et al., Genome Research, 1996, 6: 10-18). Random sequencing includes the creation of a large library of random fragments or "clones" in a sequence-ready vector, such as a plasmid or a phagemid. To arrive at a library in which all portions of the original sequence are represented relatively equally, the DNA to be subjected to random sequencing is frequently fragmented by physical procedures, such as sonication, which has been shown to produce almost random fragmentation. Then, for sequencing, clones are selected at random from the library. The complete DNA sequence is then assembled by identifying overlapping sequences in short random sequences (of approximately 500 nt). To ensure that the entire target region of the DNA is represented among the randomly selected clones and to reduce the frequency of errors (incorrectly assigned overlaps), a high degree of redundancy is necessary in the sequencing; for example, 7 to 10 times. Even with this high redundancy, additional sequencing is often required to fill gaps in coverage.
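The relationship between redundancy and residual gaps can be sketched with a simple Poisson coverage approximation. This model is an assumption introduced here for illustration and is not taken from the text: it treats read start points as uniformly random, so that a given position remains uncovered with probability exp(-c) at c-fold redundancy. The function names are hypothetical.

```python
import math


def expected_uncovered_fraction(redundancy: float) -> float:
    """Expected fraction of the target left unsequenced at the given
    fold redundancy, under an idealized Poisson model (an assumption:
    uniform random read placement, no cloning bias, no repeats)."""
    return math.exp(-redundancy)


def expected_uncovered_nt(target_nt: int, redundancy: float) -> float:
    """Expected number of uncovered nucleotides in a target of target_nt."""
    return target_nt * expected_uncovered_fraction(redundancy)


if __name__ == "__main__":
    for c in (1, 3, 7, 10):
        print(f"{c}x: about {expected_uncovered_nt(100_000, c):.1f} nt uncovered in 100 Kb")
```

Under this idealized model very little sequence remains uncovered at 7 to 10 times redundancy, so gaps that persist in practice would reflect departures from random sampling, such as clones that are under-represented in the library or repeats that cannot be placed.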
Even then, the presence of repeated sequences such as Alu (a sequence of 300 base pairs that occurs in 500,000 to 1,000,000 copies per haploid genome) and LINEs (long interspersed DNA sequence elements, which can be 7,000 bases long and can occur in as many as 100,000 copies per haploid genome), either of which may occur in different locations in multiple clones, can make the reassembly of the DNA sequence problematic. For example, different members of these sequence families may be more than 90% identical, which can sometimes make it very difficult to determine the sequence relationships on opposite sides of these repeats. Figure X illustrates the difficulties of the random approach in a hypothetical 10 kb sequence modeled after the sequence reported in Martin-Gallardo, et al., Nature Genetics, (1992) 1: 34-39. Directed DNA sequencing, the second general approach, also involves preparing a library of clones, often with large inserts (e.g.,
libraries of cosmids, P1, PAC or BAC). In this procedure, the location of the clones in the region to be sequenced is then mapped to obtain a set of clones that constitutes a minimum overlapping tiling path spanning the region to be sequenced. The clones of this minimal set are then sequenced by methods such as "primer-directed" sequencing (see, for example, Voss, supra). In this procedure, the end of a sequence is used to select a new sequencing primer with which the next sequencing reaction begins. The end of the second sequence is used to select the next primer, and so on. The assembly of a complete DNA sequence is easier with directed sequencing and less sequencing redundancy is required, because the order of the clones and the completeness of the coverage are known from the clone map. On the other hand, the assembly of the map itself requires considerable effort. In addition, the speed with which new sequencing primers can be synthesized, and the cost of doing so, is often a limiting factor in primer-directed sequencing. While a variety of methods have been tried to simplify the construction of new primers (see, for example, Kaczorowski, et al. and Lodhi, et al., supra), directed DNA sequencing remains a valuable, but often expensive and time-consuming procedure. Most large-scale sequencing projects use aspects of both random sequencing and directed sequencing. For example, a detailed map can be prepared from a large-insert library (e.g., BACs) to identify a minimum set of clones that provides full coverage of the target region, although sequencing of each of the large inserts is then done by the random approach; for example, by fragmenting the large insert and re-cloning the fragments into a better sequencing vector (see, for example, Chen, C. N., Nucleic Acids Research, 1996, 24: 4034-4041).
Random and directed procedures are also used in a complementary fashion, in which specific regions not covered by an initial random experiment are subsequently determined by directed sequencing. Thus, there are significant limitations in both the random approach and the directed approach to the complete sequencing of large molecules, such as that required in genomic DNA sequencing projects. However, both procedures would benefit if the usable read length of contiguous DNA were expanded beyond the current 500 to 800 nt that can be sequenced effectively by the Sanger method. For example, directed sequencing could be significantly improved by reducing the need for high-resolution maps, which could be achieved by longer read lengths, which in turn would allow greater distances between reference marks. An important limitation of current sequencing procedures is the high error rate (Kristensen, T., et al., DNA Sequencing, 2: 243-346, 1992; Kurshid, F. and Beck, S., Analytical Biochemistry, 208: 138-143, 1993; Fichant, GA and Quentin, Y., Nucleic Acid Research, 23: 2900-2908, 1995). It is well known that many of the errors associated with the Maxam-Gilbert procedures are systematic; that is, the errors are not random but occur repeatedly. To address this, two mechanistically different sequencing methods can be used, so that the systematic errors of one can be detected and thus corrected by the second, and vice versa. Since a significant fraction of the cost of current sequencing methods is associated with the need for high redundancy to reduce sequencing errors, the use of two procedures can reduce the overall cost of obtaining a very precise DNA sequence. The production and/or chemical cleavage of polynucleotides composed of ribonucleotides and deoxyribonucleotides have been previously described.
In particular, mutant polymerases have been described that incorporate both ribonucleotides and deoxyribonucleotides into a polynucleotide; the production by polymerization of mixed polynucleotides containing ribonucleotides and deoxyribonucleotides has been described; and the generation of sequence ladders from these mixed polynucleotides, taking advantage of the well-known lability of the ribo sugar to chemical base, has also been described. However, the use of these methods has been limited as follows: (i) one ribonucleotide and three deoxyribonucleotides are incorporated into the polynucleotides; (ii) cleavage at the ribonucleotides is carried out using a chemical base; (iii) only partial cleavage of the ribonucleotide-containing polynucleotides is carried out; and (iv) the utility of the procedure is confined to the production of sequence ladders, which are resolved electrophoretically. In addition, the chemical synthesis of polynucleotide primers containing a single ribonucleotide, which in a subsequent step is virtually completely cleaved by a chemical base, has also been reported. The size of the primer extension product is then determined by mass spectrometry or by other methods.
SUMMARY OF THE INVENTION It is clear from the foregoing that there is a need for a simple, low-cost, fast and yet sensitive and accurate method for analyzing polynucleotides such as, without limitation, DNA, to determine both the complete nucleotide sequence and the presence of variants. In addition, there is a need for methods that allow the assembly of very long DNA sequences across repeat-dense regions. The methods of the present invention meet each of these needs. In general, the present invention provides new methods for genotyping, DNA sequencing and variant detection, based on the specific cleavage of DNA and other polynucleotides that have been modified by the enzymatic incorporation of chemically modified nucleotides.
Thus, in one aspect, the invention relates to a method for cleaving a polynucleotide, comprising: a. replacing a natural nucleotide at virtually every point of occurrence in a polynucleotide with a modified nucleotide to form a modified polynucleotide, wherein the modified nucleotide is not a ribonucleotide; b. contacting the modified polynucleotide with a reagent or reagents that cleave the modified polynucleotide at virtually every point of occurrence of the modified nucleotide. In another aspect, this invention relates to the method described above for use in detecting variants in a nucleotide sequence in related polynucleotides, by the additional steps of: c. determining the masses of the fragments obtained in step b; and, d. comparing the masses of the fragments with the masses of the fragments expected from the cleavage of a related polynucleotide of known sequence or e. repeating steps a-c with one or more related polynucleotides of unknown sequence and comparing the masses of the polynucleotide fragments with the masses of fragments obtained from the related polynucleotides. A further aspect of this invention is the use of the first method above to determine the nucleotide sequence, by the additional steps of: c. determining the masses of the fragments obtained from step b; d. repeating steps a, b and c, each time replacing a different natural nucleotide in the polynucleotide with a modified nucleotide, until each natural nucleotide in the polynucleotide has been replaced with a modified nucleotide, each modified polynucleotide has been cleaved and the masses of the cleavage fragments have been determined; and, e. constructing the nucleotide sequence of the polynucleotide from the masses of the fragments.
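The variant-detection steps described above (complete cleavage at one nucleotide, determination of fragment masses, comparison with the masses expected from a known related sequence) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the invention as implemented: the residue masses are approximate average masses of unmodified DNA residues and ignore the mass of the modification and of the fragment end groups, the cut is assumed to fall immediately 3' of each modified nucleotide, and all function names are hypothetical.

```python
from collections import Counter

# Approximate average residue masses (Da) of unmodified DNA nucleotides.
# Assumption for illustration only: real fragments carry modified
# nucleotides and cleavage-specific end groups that shift these values.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def cleave_at(seq: str, base: str) -> list[str]:
    """Complete cleavage: split seq after every occurrence of base.
    (Cutting on the 3' side is an assumed convention.)"""
    fragments, start = [], 0
    for i, nt in enumerate(seq):
        if nt == base:
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


def fragment_masses(seq: str, base: str) -> Counter:
    """Multiset of fragment masses after complete cleavage at base."""
    return Counter(
        round(sum(RESIDUE_MASS[nt] for nt in frag), 2)
        for frag in cleave_at(seq, base)
    )


def mass_differences(reference: str, sample: str, base: str):
    """Return (expected masses missing from sample, unexpected masses in sample)."""
    expected = fragment_masses(reference, base)
    observed = fragment_masses(sample, base)
    return expected - observed, observed - expected


if __name__ == "__main__":
    reference = "GATTACAGGCTA"
    sample = "GATTACCGGCTA"  # hypothetical A-to-C substitution
    missing, extra = mass_differences(reference, sample, "A")
    print("missing:", dict(missing))
    print("unexpected:", dict(extra))
```

In this toy case the substitution removes one cleavage site, so two expected fragments disappear and one larger fragment appears; a non-empty difference signals a variant without reading the full sequence.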
Another aspect of this invention is the use of the aforementioned first method to genotype a polynucleotide known to contain a polymorphism or a mutation, by: using as the natural nucleotide to be replaced a nucleotide known to be involved in the polymorphism or mutation; replacing the natural nucleotide by amplification of the portion of the polynucleotide, using a modified nucleotide to form a modified polynucleotide; cleaving the modified polynucleotide into fragments at each point of occurrence of the modified nucleotide; and analyzing the fragments to determine the genotype. In the immediately preceding method, one aspect of this invention is the analysis of the fragments by electrophoresis, mass spectrometry or FRET detection. Another aspect of this invention is a method for cleaving a polynucleotide, comprising: a. replacing a first natural nucleotide at virtually every point of occurrence in a polynucleotide with a first modified nucleotide to form a once-modified polynucleotide; b. replacing a second natural nucleotide at virtually every point of occurrence in the once-modified polynucleotide with a second modified nucleotide to form a twice-modified polynucleotide; and, c. contacting the twice-modified polynucleotide with a reagent or reagents that cleave the twice-modified polynucleotide at each point in the twice-modified polynucleotide where the first modified nucleotide is immediately followed by, and linked by a phosphodiester or modified phosphodiester bond to, the second modified nucleotide. One aspect of this invention is, in the method immediately above, that a variant in the nucleotide sequence of related polynucleotides is detected by the additional steps of: d. determining the masses of the fragments obtained from step c; and, e. comparing the masses of the fragments with the fragment masses expected from the cleavage of a related polynucleotide of known sequence or f.
repeating steps a-d with one or more related polynucleotides of unknown sequence and comparing the masses of the fragments with the masses of fragments obtained from the cleavage of the related polynucleotides. One aspect of this invention is a method for detecting a variant in the nucleotide sequence of related polynucleotides, comprising: a. replacing three of the four natural nucleotides at virtually every point of occurrence in a polynucleotide with three stabilizing modified nucleotides to form a modified polynucleotide that has one remaining natural nucleotide; b. cleaving the modified polynucleotide into fragments at virtually every point of occurrence of the one remaining natural nucleotide; c. determining the masses of the fragments; and, d. comparing the masses of the fragments with the fragment masses expected from the cleavage of a related polynucleotide of known sequence or e. repeating steps a-c with one or more related polynucleotides of unknown sequence and comparing the masses of the fragments with the masses obtained from the cleavage of the related polynucleotides. Another aspect of this invention is, in the immediately preceding method, replacing the remaining natural nucleotide with a destabilizing modified nucleotide. A further aspect of this invention is a method for detecting a variant in the nucleotide sequence of related polynucleotides, comprising: a. replacing two or more natural nucleotides at virtually every point of occurrence in a polynucleotide with two or more modified nucleotides, wherein each modified nucleotide has cleavage characteristics different from those of each of the other modified nucleotides, to form a modified polynucleotide; b. cleaving the modified polynucleotide into first fragments at virtually every point of occurrence of a first of the two or more modified nucleotides; c. cleaving the first fragments into second fragments at each point of occurrence of a second of the two or more modified nucleotides in the first fragments; d.
determining the masses of the first fragments and the second fragments; and, e. comparing the masses of the first fragments and the second fragments with the masses of the first fragments and the second fragments expected from the cleavage of a related polynucleotide of known sequence, or f. repeating steps a-d with one or more related polynucleotides of unknown sequence and comparing the masses of the first and second fragments with the masses obtained from the cleavage of the related polynucleotides. It is an aspect of this invention that, in the above method, the steps are repeated using a modified polynucleotide obtained by replacing different pairs of natural nucleotides with modified nucleotides; that is, given four natural nucleotides, 1, 2, 3 and 4, replacing 1 and 3 in one experiment, 2 and 4 in another, 1 and 4 in another, 2 and 3 in another or 3 and 4 in a final experiment with modified nucleotides. It is an aspect of this invention that the modified polynucleotides obtained by the above methods can be cleaved in a mass spectrometer, in particular in a tandem mass spectrometer. A further aspect of this invention is a method for determining the nucleotide sequence of a polynucleotide, comprising: a. replacing a natural nucleotide at a percentage of points of occurrence in a polynucleotide with a modified nucleotide to form a modified polynucleotide, wherein the modified nucleotide is not a ribonucleotide; b. cleaving the modified polynucleotide into fragments at virtually every point of occurrence of the modified nucleotide; c. repeating steps a and b, each time replacing a different natural nucleotide in the polynucleotide with a modified nucleotide; and, d. determining the masses of the fragments obtained from each cleavage; and, e. constructing the polynucleotide sequence from the masses, or f. analyzing a sequence ladder obtained from the fragments of step c.
Another aspect of this invention is a method for determining the nucleotide sequence of a polynucleotide, comprising: a. replacing a natural nucleotide at a first percentage of points of occurrence in a polynucleotide with a modified nucleotide to form a modified polynucleotide, wherein the modified nucleotide is not a ribonucleotide; b. cleaving the modified polynucleotide into fragments at a second percentage of the points of occurrence of the modified nucleotide, such that the combination of the first percentage and the second percentage results in partial cleavage of the modified polynucleotide; c. repeating steps a and b, each time replacing a different natural nucleotide of the polynucleotide with a modified nucleotide; d. determining the masses of the fragments obtained from each cleavage reaction; and e. constructing the polynucleotide sequence from the masses, or f. analyzing a sequence ladder obtained from the fragments of steps a and b. One aspect of this invention is a method for determining the nucleotide sequence of a polynucleotide, comprising: a. replacing two or more natural nucleotides at virtually every point of occurrence in a polynucleotide with two or more modified nucleotides to form a modified polynucleotide; b. separating the modified polynucleotide into two or more aliquots, the number of aliquots being the same as the number of natural nucleotides replaced in step a; and c. cleaving the modified polynucleotide in each aliquot into fragments at virtually every point of occurrence of a different one of the modified nucleotides, such that each aliquot contains fragments from cleavage at a different modified nucleotide than each of the other aliquots; d. determining the masses of the fragments; and e. constructing the nucleotide sequence from the masses; or f.
cleaving the modified polynucleotide in each of the aliquots into fragments at a percentage of points of occurrence of a different modified nucleotide, such that each of the aliquots contains fragments from cleavage at a different modified nucleotide than each of the other aliquots; and g. analyzing a sequence ladder obtained from the fragments of step f. In addition, an aspect of this invention is a method for determining the nucleotide sequence of a polynucleotide, comprising: a. replacing a first natural nucleotide at a percentage of points of incorporation in a polynucleotide with a first modified nucleotide to form a first partially modified polynucleotide, wherein the first modified nucleotide is not a ribonucleotide; b. cleaving the first partially modified polynucleotide into fragments using a cleavage procedure of known cleavage efficiency to form a first set of nucleotide-specific cleavage products; c. repeating steps a and b, replacing a second, a third and a fourth natural nucleotide with a second, third and fourth modified nucleotide to form a second, third and fourth partially modified polynucleotide which, after cleavage, produce a second, third and fourth set of nucleotide-specific cleavage products; d. performing gel electrophoresis on the first, second, third and fourth sets of nucleotide-specific cleavage products to form a sequence ladder; and e. reading the polynucleotide sequence from the sequence ladder.
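The ladder-reading method just described can be illustrated with a short sketch: four nucleotide-specific sets of cleavage products are treated as four gel lanes, and the sequence is read by walking up the combined ladder in order of fragment length. The sketch assumes that every fragment carries a label at the same (5') end and terminates at the cleaved nucleotide, and that partial cleavage samples every site; these conventions and the function names are assumptions for illustration.

```python
def cleavage_ladder(seq: str, base: str) -> set[int]:
    """Lengths of end-labeled fragments from partial cleavage at `base`.
    Assumes each labeled fragment ends at one occurrence of `base`."""
    return {i + 1 for i, nt in enumerate(seq) if nt == base}


def read_ladder(lanes: dict[str, set[int]]) -> str:
    """Read a sequence from four nucleotide-specific lanes by ordering
    all bands by fragment length and calling the lane of each band."""
    calls = {}
    for base, lengths in lanes.items():
        for length in lengths:
            calls[length] = base
    return "".join(calls[length] for length in sorted(calls))


if __name__ == "__main__":
    target = "GATTACAGGCTA"  # hypothetical target sequence
    lanes = {base: cleavage_ladder(target, base) for base in "ACGT"}
    print(read_ladder(lanes))
```

Because each position of the target contributes a band to exactly one lane, sorting the bands by length recovers the sequence; in a real gel the readable length is bounded by the resolving power discussed in the background.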
Another aspect of this invention is a method for cleaving a polynucleotide during polymerization, comprising mixing together: four different nucleotides, one or two of which are modified nucleotides; and two or more polymerases, at least one of which produces or enhances cleavage at the points where the modified nucleotide is incorporated or, if two modified nucleotides are used, at points where the modified nucleotides are incorporated adjacent to one another and in the appropriate spatial relationship; provided that, when only one modified nucleotide is used, it does not contain ribose as its sole modifying characteristic. In the above method, when two modified nucleotides are used, it is an aspect of this invention that one of them is a ribonucleotide and the other is a 5'-amino-2',5'-dideoxynucleotide.
Furthermore, in the above method, using the specified modified nucleotides, it is an aspect of this invention to use two polymerases, one being Klenow (exo-) polymerase and the other being the E710A mutant of Klenow (exo-) polymerase. In any of the above methods, it is an aspect of this invention that all natural nucleotides that are not replaced with modified nucleotides can be replaced with mass-modified nucleotides. It is also an aspect of all the methods of this invention that the polynucleotide to be modified is selected from the group consisting of DNA and RNA. Another aspect of all the above methods is the detection of the masses of the fragments by mass spectrometry. The presently preferred types of mass spectrometry are electrospray ionization mass spectrometry and matrix-assisted laser desorption/ionization (MALDI) mass spectrometry. In the above methods that require the generation of a sequence ladder, this generation can be achieved using gel electrophoresis. Furthermore, in the above method relating to the determination of a polynucleotide sequence by partially replacing a natural nucleotide with a modified nucleotide, cleaving the first, second, third and fourth partially modified polynucleotides obtained in step "a" with one or more restriction enzymes, labeling the ends of the resulting restriction fragments and purifying the restriction fragments before performing step "b" is another aspect of this invention. One aspect of this invention is a method for cleaving a polynucleotide such that virtually all fragments obtained from the cleavage carry a label, which comprises: a. replacing a natural nucleotide, partially or at virtually every point of occurrence in a polynucleotide, with a modified nucleotide to form a modified polynucleotide; b.
contacting, in the presence of a phosphine covalently bound to a label, the modified polynucleotide with a reagent or reagents that cleave the partially or virtually completely modified polynucleotide at virtually every point of occurrence of the modified nucleotide. In a presently preferred embodiment of this invention, the phosphine of the above method is tris(carboxyethyl)phosphine (TCEP). Also in the above method, the label is a fluorescent label or a radioactive label in another aspect of this invention. It is an aspect of this invention that the above methods can be used to diagnose a genetically related disease. The methods can also be used as a means of obtaining a prognosis of a genetically related disease or disorder. They can also be used to determine whether a particular patient is eligible for medical treatment by procedures applicable to genetically related diseases or disorders. One aspect of this invention is a method for detecting a variant in a nucleotide sequence of a polynucleotide, for sequencing a polynucleotide or for genotyping a polynucleotide that is known to contain a polymorphism or a mutation, comprising: a. replacing one or more natural nucleotides in the polynucleotide with one or more modified nucleotides, of which one or more comprises a modified base; b. contacting the modified polynucleotide with a reagent or reagents that cleave the modified polynucleotide into fragments at the sites of incorporation of the modified nucleotide; c. analyzing the fragments to detect the variant, to construct the sequence or to genotype the polynucleotide.
The modified base of the above method can be a modified adenine in another aspect of this invention. It can also be 7-deaza-7-nitroadenine. A polynucleotide modified in this way can be cleaved into fragments by contact with a chemical base in another aspect of this invention. In the above method, cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a phosphine in yet another aspect of this invention. The use of TCEP as the phosphine in the above method is another aspect of this invention. The modified base in the above method can also be a modified cytosine such as, without limitation, azacytosine or cytosine substituted at position 5 with an electron-withdrawing group, wherein the electron-withdrawing group is, also without limitation, nitro or halo. Once again, polynucleotides modified as indicated above can be cleaved with a chemical base. The inclusion of TCEP in the immediately preceding cleavage reaction is another aspect of this invention. The modified base of the above method can also be a modified guanine such as, but not limited to, 7-methyl-guanine, and the cleavage can be carried out with a chemical base. The modified guanine is N2-allylguanine in a further aspect of this invention. The cleavage of this modified guanine by contacting the modified polynucleotide with an electrophile such as, without limitation, iodine is another aspect of this invention. In another aspect of this invention, the modified base in the above method can also be a modified thymine or a modified uracil. A presently preferred embodiment of this invention is the use of 5-hydroxyuracil in place of either thymine or uracil. When 5-hydroxyuracil is used, cleavage is achieved by: a. contacting the polynucleotide with a chemical oxidant; and then b. contacting the polynucleotide with a chemical base.
Another aspect of this invention is a method for detecting a variant in a nucleotide sequence of a polynucleotide, sequencing a polynucleotide or genotyping a polynucleotide, which comprises replacing one or more natural nucleotides in the polynucleotide with one or more modified nucleotides, of which one or more comprise a modified sugar, provided that, when only one nucleotide is replaced, the modified sugar is not ribose. The modified sugar is a 2-keto sugar in a further aspect of this invention. The keto sugar can be cleaved with a chemical base. The modified sugar can also be arabinose, which is also susceptible to chemical base. The modified sugar can also be a sugar substituted with a 4-hydroxymethyl group which, likewise, produces a polynucleotide susceptible to cleavage with a chemical base. On the other hand, the modified sugar can be hydroxycyclopentane, in particular 1-hydroxy- or 2-hydroxycyclopentane. Hydroxycyclopentanes can also be cleaved with a chemical base. The modified sugar can be an azido sugar, for example, without limitation, a 2'-azido, 4'-azido or 4'-azidomethyl sugar. The cleavage of an azido sugar can be carried out in the presence of TCEP. The sugar may also be substituted with a group capable of photolysis to form a free radical, such as, but not limited to, a phenylselenyl group or a t-butylcarboxy group. These groups make the polynucleotide susceptible to cleavage with ultraviolet light.
The sugar can also be a cyano-sugar. In a presently preferred embodiment, the cyano-sugar is 2'-cyano-sugar or 2'-cyano-sugar. Cyano-sugar-modified polynucleotides can be cleaved with a chemical base. A sugar substituted with an electron-withdrawing group such as, for example, fluorine, azido, methoxy or nitro at the 2', 2" or 4' position of the modified sugar is another aspect of this invention. These modified sugars make the modified polynucleotide susceptible to cleavage with a chemical base. On the other hand, a sugar can be modified by the inclusion of an electron-withdrawing element in the sugar ring. Nitrogen is an example of such an element. Nitrogen can replace the oxygen in the sugar ring or a ring carbon, and the resulting modified sugar is cleavable with a chemical base. In a further aspect of this invention, the modified sugar can be a sugar containing a mercapto group. A mercapto group at the 2' position of the sugar is a presently preferred embodiment; such a sugar is cleavable by a chemical base. In particular, the modified sugar can be a 5'-methylene-sugar, a 5'-keto-sugar or a 5',5'-difluoro-sugar, all of which are cleavable with a chemical base. Another aspect of this invention is a method for detecting a variant in the nucleotide sequence of a polynucleotide, sequencing a polynucleotide or genotyping a polynucleotide known to contain a polymorphism or mutation, which comprises replacing one or more natural nucleotides in the polynucleotide with one or more modified nucleotides, of which one or more comprises a modified phosphate ester. The modified phosphate ester can be a phosphorothioate. In one embodiment, the sulfur of the phosphorothioate is not covalently bound to the sugar ring. In this case, the cleavage of the modified polynucleotide into fragments comprises: a. contacting the sulfur of the phosphorothiolate with an alkylating agent; and b. then contacting the modified polynucleotide with a chemical base.
In a presently preferred embodiment of this invention, the alkylating agent is methyl iodide. In another aspect of this invention, the modified phosphorothioate-containing polynucleotide can be cleaved into fragments by contacting the sulfur of the phosphorothioate with β-mercaptoethanol in a chemical base such as, but not limited to, sodium methoxide in methanol. On the other hand, the sulfur atom of the phosphorothiolate may be covalently bound to a sugar ring in another embodiment of this invention. The cleavage of a polynucleotide modified in this way can be carried out with a chemical base. The modified phosphate ester can also be a phosphoramidate. The cleavage of a phosphoramidate-containing polynucleotide can be effected using acid.
It is an aspect of this invention that the modified phosphate ester comprises a group selected from the group consisting of alkyl phosphonate and alkyl phosphorotriester, wherein the alkyl group is preferably methyl. This modified polynucleotide can also be cleaved with acid. Another aspect of this invention is a method for detecting a variant in the nucleotide sequence of a polynucleotide, sequencing a polynucleotide or genotyping a polynucleotide known to contain a polymorphism or mutation, which comprises replacing a first and a second nucleotide of the polynucleotide with a first and a second modified nucleotide, such that the polynucleotide can be specifically cleaved at sites where the first modified nucleotide is immediately followed in the modified polynucleotide sequence by the second modified nucleotide. In the above method, the first modified nucleotide is covalently linked at its 5' position to a sulfur atom of a phosphorothioate group and the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous to, and 5' of, the first modified nucleotide. This dinucleotide pair is cleavable with a chemical base. Also in the above method, the first modified nucleotide can be covalently linked at its 3' position to a sulfur atom of a phosphorothioate group, wherein the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous to, and 3' of, the first modified nucleotide. This pair of modified nucleotides can also be cleaved with a chemical base. It is also an aspect of this invention that, in the above method, the first modified nucleotide is covalently linked at its 5' position to a first oxygen atom of a phosphorothioate group, the second modified nucleotide is substituted at its 2' position with a leaving group and the second modified nucleotide is covalently linked at its 3' position to a second oxygen of the phosphorothioate group.
Any leaving group can be used; examples are fluorine, chlorine, bromine and iodine. The polynucleotide thus modified can be cleaved with a chemical base. A non-limiting example of a useful chemical base is sodium methoxide. In another embodiment of this invention, the first modified nucleotide is covalently linked at its 5' position to a first oxygen atom of a phosphorothioate group, the second modified nucleotide is substituted at its 4' position with a leaving group and the second modified nucleotide is covalently linked at its 3' position to a second oxygen of the phosphorothioate group. Here, again, any suitable leaving group can be used, non-limiting examples of which are fluorine, chlorine, bromine and iodine. These groups likewise render the modified polynucleotide susceptible to cleavage by a chemical base such as, but not limited to, sodium methoxide. In a further embodiment of this invention, the first modified nucleotide is covalently linked at its 5' position to a first oxygen atom of a phosphorothioate group, the second modified nucleotide is substituted at its 2' position with one or two fluorine atoms and the second modified nucleotide is covalently linked at its 3' position to a second oxygen of the phosphorothioate group. This modified polynucleotide can be cleaved by: a. contacting the modified polynucleotide with ethylene sulfide or β-mercaptoethanol; and then, b. contacting the modified polynucleotide with a chemical base such as, but not limited to, sodium methoxide. In another embodiment of this invention, the first modified nucleotide is covalently linked at its 5' position to a first oxygen atom of a phosphorothioate group, the second modified nucleotide is substituted at its 2' position with a hydroxy group and the second modified nucleotide is covalently linked at its 3' position to a second oxygen of the phosphorothioate group. Here, cleavage can be effected by: a.
contacting the modified polynucleotide with a metal oxidant and then, b. contacting the modified polynucleotide with a chemical base. Non-limiting examples of metal oxidants are Cu(II) and Fe(III), and non-limiting examples of useful bases are dilute hydroxide, piperidine and dilute ammonium hydroxide. It is also an embodiment of this invention that the first modified nucleotide is covalently linked at its 5' position to a nitrogen atom of a phosphoramidate group and that the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous to, and 5' of, the first modified nucleotide. This type of modification makes the modified polynucleotide susceptible to acid cleavage. A further embodiment of this invention is one in which the first modified nucleotide is covalently linked at its 3' position to a nitrogen atom of a phosphoramidate group and the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous to, and 3' of, the first modified nucleotide. Again, this substitution pattern is cleavable with acid. It may also be that the first modified nucleotide is covalently linked at its 5' position to an oxygen atom of an alkylphosphonate group or an alkylphosphorotriester group and the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous with the first modified nucleotide. This alternative dinucleotide grouping is also cleavable with acid. Another cleavable dinucleotide grouping is one in which the first modified nucleotide has an electron withdrawing group at its 4' position and the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous to, and 5' of, the first modified nucleotide. Again, the cleavage can be carried out by contact with acid.
Another aspect of this invention is a method for detecting a variant in the nucleotide sequence of a polynucleotide, for sequencing a polynucleotide or for genotyping a polynucleotide known to contain a polymorphism or mutation, comprising: a. replacing one or more natural nucleotides in the polynucleotide with one or more modified nucleotides, wherein each modified nucleotide is modified with one or more modifications selected from the group consisting of a modified base, a modified sugar and a modified phosphate ester, provided that if only one modified nucleotide is used, the modified nucleotide is not a ribonucleotide; b. contacting the modified polynucleotide with a reagent or reagents that cleave the modified polynucleotide into fragments at the modified nucleotide incorporation sites; and c. analyzing the fragments to detect the variant, to construct the sequence or to genotype the polynucleotide. One aspect of this invention is a compound that has the chemical structure: [structure not shown] wherein R1 is selected from the group consisting of: [structures not shown]. A compound that has the chemical structure: [structure not shown] wherein the "Base" is selected from the group consisting of cytosine, guanine, inosine and uracil is another aspect of this invention. Another aspect of this invention is a compound that has the chemical structure: [structure not shown] wherein the "Base" is selected from the group consisting of adenine, cytosine, guanine, inosine and uracil. A further aspect of this invention is a compound that has the chemical structure: [structure not shown] wherein the "Base" is selected from the group consisting of adenine, cytosine, guanine, inosine, thymine and uracil.
A polynucleotide comprising a dinucleotide sequence selected from the group consisting of: [structures not shown] wherein each "Base" is independently selected from the group consisting of adenine, cytosine, guanine and thymine; one substituent is an electron withdrawing group; X is a leaving group; and R is an alkyl group, preferably a lower alkyl group, is also an aspect of this invention. In another aspect of this invention, the electron withdrawing group is selected from the group consisting of F, Cl, Br, I, NO2, C≡N, -C(O)OH and OH and, in a further aspect, the leaving group is selected from the group consisting of Cl, Br, I and OTs. One aspect of this invention is a method for synthesizing a polynucleotide comprising mixing a compound having the chemical structure: [structure not shown] wherein R1 is selected from the group consisting of: [structures not shown] with adenosine triphosphate, guanosine triphosphate and thymidine triphosphate or uridine triphosphate in the presence of one or more polymerases. A method for synthesizing a polynucleotide comprising mixing a compound having the chemical structure: [structure not shown] wherein R1 is selected from the group consisting of: [structures not shown] with adenosine triphosphate, cytidine triphosphate and guanosine triphosphate in the presence of one or more polymerases is also an aspect of this invention. A method for synthesizing a polynucleotide, which comprises mixing a compound having the chemical structure: [structure not shown] wherein R1 is selected from the group consisting of: [structures not shown] with cytidine triphosphate, guanosine triphosphate and thymidine triphosphate in the presence of one or more polymerases is a further aspect of this invention. One aspect of this invention is a method for synthesizing a polynucleotide, which comprises mixing a compound having the chemical structure: [structure not shown] wherein R1 is selected from the group consisting of: [structures not shown] with adenosine triphosphate, cytidine triphosphate and thymidine triphosphate in the presence of one or more polymerases.
Another aspect of this invention is a method for synthesizing a polynucleotide, which comprises mixing a compound selected from the group consisting of: a compound having the chemical structure: [structure not shown] wherein the "Base" is selected from the group consisting of cytosine, guanine, inosine and uracil; a compound having the chemical structure: [structure not shown] wherein the "Base" is selected from the group consisting of adenine, cytosine, guanine, inosine and uracil; and a compound having the chemical structure: [structure not shown] wherein the "Base" is selected from the group consisting of adenine, cytosine, guanine or inosine and thymine or uracil, with whichever three of the four nucleoside triphosphates (adenosine triphosphate, cytidine triphosphate, guanosine triphosphate and thymidine triphosphate) do not contain the Base (or its substitute), in the presence of one or more polymerases. Another aspect of this invention is a method for synthesizing a polynucleotide, which comprises mixing one of the following pairs of compounds: [structures not shown] wherein: Base1 is selected from the group consisting of adenine, cytosine, guanine or inosine and thymine or uracil; Base2 is selected from the group consisting of the three remaining bases that are not Base1; R3 is O⁻-P(=O)(O⁻)-O-P(=O)(O⁻)-O-P(=O)(O⁻)-O-; one substituent is an electron withdrawing group; X is a leaving group; a second electron withdrawing group or X shown in parentheses on the same carbon atom means that a single electron withdrawing group or X group can be at either position on the sugar, or that both electron withdrawing groups or both X groups can be present at the same time; and R is a lower alkyl group; with whichever two of the four nucleoside triphosphates (adenosine triphosphate, cytidine triphosphate, guanosine triphosphate and thymidine triphosphate) do not contain Base1 or Base2 (or their substitutes), in the presence of one or more polymerases.
One aspect of this invention is a mutant polymerase that has the ability to catalyze the incorporation of a modified nucleotide into a polynucleotide, where the modified nucleotide is not a ribonucleotide. In another aspect of this invention, the polymerase is obtained by a process comprising DNA shuffling. The process that includes DNA shuffling can comprise the following steps: a. selecting one or more known polymerases; b. performing the DNA shuffling; c. transforming the shuffled DNA into a host cell; d. growing colonies of host cells; e. forming a lysate of the host cell colony; f. adding a DNA template containing a detectable reporter sequence, the modified nucleotide or nucleotides whose incorporation into a polynucleotide is desired and the natural nucleotides that are not replaced by the modified nucleotides; and g. examining the lysate to detect the presence of the detectable reporter. The process that includes DNA shuffling may also comprise: a. selecting a known polymerase or two or more known polymerases having different sequences or different biochemical properties or both; b. performing the DNA shuffling; c. transforming the shuffled DNA into a host to form a library of transformants in host cell colonies; d. preparing first separate pools of the transformants by plating the host cell colonies; e. forming a lysate from each of the first separate pools of host cell colonies; f. removing all natural nucleotides from each lysate; g. combining each lysate with: i. a single-stranded DNA template comprising a sequence corresponding to an RNA polymerase promoter, followed by a reporter sequence; ii. a single-stranded DNA primer complementary to one end of the template; iii. the modified nucleotide or nucleotides that it is desired to incorporate into the polynucleotide; iv. each natural nucleotide that is not replaced by the modified nucleotide or nucleotides; h.
adding RNA polymerase to each combined lysate; i. examining each combined lysate to detect the presence of the reporter sequence; j. creating second separate pools of transformants in host cell colonies from each of the first separate pools of host cell colonies in which the presence of the reporter was detected; k. forming a lysate from each second separate pool of host cell colonies; l. repeating steps g, h, i, j, k and l to form further separate pools of transformants in host cell colonies until only one host cell colony containing the polymerase remains; and m. re-cloning the polymerase from that host cell colony into a protein expression vector. A polymerase that has the ability to catalyze the incorporation of a modified nucleotide into a polynucleotide, wherein the modified nucleotide is not a ribonucleotide, obtained by a process comprising selection by cellular senescence is another aspect of this invention. The process of selection by cellular senescence can include the following steps: a. mutagenizing a known polymerase to form a library of mutant polymerases; b. cloning the library into a vector; c. transforming the vector into host cells selected so that they are susceptible to being killed by a selected chemical compound only when the cells are actively growing; d. adding a modified nucleotide; e. growing the host cells; f. treating the host cells with the selected chemical compound; g. separating living cells from dead cells; and h. isolating the polymerase or polymerases from the living cells. In another aspect of this invention, steps d to h of the above method can be repeated one or more times to refine the selection of the polymerase. The cell senescence procedure for obtaining a polymerase can also comprise the steps of: a. mutagenizing a known polymerase to form a library of mutant polymerases; b. cloning the mutant polymerase library into a plasmid vector; c.
transforming with the plasmid vector bacterial cells which, when growing, are susceptible to an antibiotic; d. selecting the transfectants using the antibiotic; e. introducing a modified nucleotide, as the corresponding nucleoside triphosphate, into the bacterial cells; f. growing the cells; g. adding an antibiotic that will kill the bacterial cells that are actively growing; h. isolating the bacterial cells; i. growing the bacterial cells in fresh medium that does not contain an antibiotic; j. selecting living cells from growing colonies; k. isolating the plasmid vector from the living cells; l. isolating the polymerase; and m. testing or analyzing the polymerase. Repeating steps c to k of the above process one or more additional times before proceeding to step l is another aspect of this invention. A polymerase can also be obtained by a process comprising phage display. The phage display process may comprise the steps of: a. selecting a DNA polymerase; b. expressing the polymerase in a bacteriophage vector as a fusion with a phage coat protein; c. attaching an oligonucleotide to the surface of the phage; d. forming a primer and template complex, either by adding a second oligonucleotide complementary to the oligonucleotide of c or by forming a self-priming complex using the intramolecular complementarity of the oligonucleotide of c; e. performing primer extension in the presence of the modified nucleotide or nucleotides whose incorporation into a polynucleotide is desired and the natural nucleotides that are not replaced by the modified nucleotides, wherein successful primer extension results in the presence of a detectable reporter sequence; and f. sorting the phage that have the detectable reporter from those that do not.
In another aspect of this invention, the detectable reporter sequence is formed by the incorporation of one or more dye-labeled natural or modified nucleotides in the primer extension reaction. In a further aspect of this invention, the sorting procedure may comprise the use of a fluorescence-activated cell sorter.
One aspect of this invention is that the detectable reporter in the above method is a restriction endonuclease cleavage site and the sorting procedure involves restriction endonuclease digestion. Another aspect of this invention is that the polymerase obtained by the above methods is a thermally stable polymerase. A further aspect is the polymerase obtained by any of the above methods, wherein the modified nucleotide to be incorporated is selected from the group consisting of: a compound having the chemical structure: [structure not shown] wherein R1 is selected from the group consisting of: [structures not shown]; a compound having the chemical structure: [structure not shown] wherein the "Base" is selected from the group consisting of cytosine, guanine, inosine and uracil; a compound having the chemical structure: [structure not shown] wherein the "Base" is selected from the group consisting of adenine, cytosine, guanine, inosine and uracil; a compound having the chemical structure: [structure not shown] wherein the "Base" is selected from the group consisting of adenine, cytosine, guanine, inosine, thymine and uracil; and a compound selected from the group consisting of: [structures not shown] wherein: Base1 is selected from the group consisting of adenine, cytosine, guanine or inosine and thymine or uracil; Base2 is selected from the group consisting of the three remaining bases that are not Base1; R3 is O⁻-P(=O)(O⁻)-O-P(=O)(O⁻)-O-P(=O)(O⁻)-O-; one substituent is an electron withdrawing group; X is a leaving group; a second electron withdrawing group or X shown in parentheses on the same carbon atom means that a single electron withdrawing group or X group may be at either position on the sugar, or that both electron withdrawing groups or both X groups may be present at the same time; and R is a lower alkyl group. A final aspect of this invention is a kit comprising: one or more modified nucleotides; one or more polymerases capable of incorporating the one or more modified nucleotides into a polynucleotide to form a modified polynucleotide; and a reagent or reagents capable of cleaving the modified
polynucleotide at each point of occurrence of the one or more modified nucleotides in the polynucleotide. As used herein, a "chemical method" refers to a combination of one or more modified nucleotides and one or more reagents which, when the modified nucleotides are incorporated into a polynucleotide by partial or complete replacement of the natural nucleotides and the modified polynucleotide is exposed to the reagents, results in selective cleavage of the modified polynucleotide at the points of incorporation of the modified nucleotides. By "analysis" is meant either the detection of a variant in the nucleotide sequence between two or more related polynucleotides or, alternatively, the determination of the complete nucleotide sequence of a polynucleotide. By "reagent" is meant a chemical or physical force that causes the cleavage of a modified polynucleotide at the point of incorporation of a modified nucleotide in place of a natural nucleotide; this reagent may be, but is not limited to, a chemical compound or a combination of chemical compounds, normal or coherent (laser) visible or ultraviolet light, high energy ion bombardment and irradiation. In addition, a reagent may consist of a protein such as, but not limited to, a polymerase. "Related" polynucleotides are polynucleotides obtained from genetically similar sources, such that the nucleotide sequences of the polynucleotides would be expected to be exactly the same in the absence of a variant, or would be expected to have a region of overlap which, in the absence of a variant, would be exactly the same, wherein the overlap region is greater than 35 nucleotides. A "variant" is a difference in the nucleotide sequence between related polynucleotides.
The difference may be the deletion of one or more nucleotides from the sequence of a polynucleotide compared to the sequence of a related polynucleotide, the addition of one or more nucleotides or the replacement of one nucleotide by another. Herein, the terms "mutation", "polymorphism" and "variant" are used interchangeably. As used herein, the term "variant" in the singular shall be construed as including multiple variants; that is, two or more additions, deletions and/or substitutions in the same polynucleotide. A "point mutation" refers to a single substitution of one nucleotide for another. A "sequence" or "nucleotide sequence" refers to the order of the nucleotide residues in a nucleic acid. As indicated above, one aspect of the chemical method of the present invention consists of modified nucleotides that can be incorporated into a polynucleotide in place of the natural nucleotides. A "nucleoside" refers to a base linked to a sugar. The base can be adenine (A), guanine (G) (or its substitute, inosine (I)), cytosine (C) or thymine (T) (or its substitute, uracil (U)). The sugar can be ribose (the sugar of a natural nucleotide in RNA) or 2-deoxyribose (the sugar of a natural nucleotide in DNA). A "nucleoside triphosphate" refers to a nucleoside linked to a triphosphate group (O⁻-P(=O)(O⁻)-O-P(=O)(O⁻)-O-P(=O)(O⁻)-O-nucleoside). The triphosphate group has four formal negative charges that require counterions, that is, positively charged ions. Any positively charged ion can be used, for example, without limitation, Na+, K+, NH4+, Mg2+, etc. Na+ is one of the most commonly used counterions. A convention accepted in the art is to omit the counterion, which is understood to be present, when nucleoside triphosphates are shown, and that convention is followed in this application.
As used herein, unless otherwise expressly indicated, the term "nucleoside triphosphate", or reference to any specific nucleoside triphosphate (for example, adenosine triphosphate, guanosine triphosphate or cytidine triphosphate), refers to the triphosphate prepared using either a ribonucleoside or a 2'-deoxyribonucleoside. A "nucleotide" refers to a nucleoside linked to a single phosphate group or, by convention, when referring to incorporation into a polynucleotide, serves as an abbreviated form of nucleoside triphosphate, which is the species that actually polymerizes in the presence of a polymerase. A "natural nucleotide" refers to an A, C, G or U nucleotide when referring to RNA and to dA, dC, dG and dT (the "d" indicates that the sugar is a deoxyribose) when referring to DNA. A "natural" nucleotide also refers to a nucleotide that may have a structure different from the above, but which is naturally incorporated into a polynucleotide sequence by the organism that is the source of the polynucleotide. As used herein, inosine (I) refers to a purine ribonucleoside containing the hypoxanthine base. As used herein, a "substitute" for a nucleoside triphosphate refers to a molecule in which a different nucleoside may be naturally substituted for A, C, G or T. Thus, inosine is a natural substitute for guanosine and uridine is a natural substitute for thymidine. As used herein, a "modified nucleotide" is characterized by two criteria. First, a modified nucleotide is a "non-natural" nucleotide. In one aspect, a "non-natural" nucleotide can be a natural nucleotide that is placed in a non-natural environment. For example, in a polynucleotide that is naturally composed of deoxyribonucleotides, a ribonucleotide would constitute a "non-natural" nucleotide when incorporated into that polynucleotide.
Conversely, in a polynucleotide that is naturally composed of ribonucleotides, a deoxyribonucleotide incorporated into that polynucleotide would constitute a non-natural nucleotide. In addition, a "non-natural" nucleotide can be a natural nucleotide that has been chemically altered, for example, without limitation, by the addition of one or more chemical substituent groups to the nucleotide molecule, the deletion of one or more chemical substituent groups from the molecule, or the replacement of one or more atoms or chemical substituents in the nucleotide by other atoms or chemical substituents. Finally, a "modified" nucleotide can be a molecule that resembles a natural nucleotide only slightly, if at all, but that is nevertheless capable of being incorporated by a polymerase into a polynucleotide in place of a natural nucleotide. The second criterion by which a "modified" nucleotide, as this term is used herein, is characterized is that it alters the cleavage properties of the polynucleotide into which it is incorporated. For example, without limitation, the incorporation of a ribonucleotide into a polynucleotide composed predominantly of deoxyribonucleotides imparts a susceptibility to alkaline cleavage that does not exist in natural deoxyribonucleotides. This second criterion of a "modified" nucleotide can be fulfilled by a single non-natural nucleotide that replaces a single natural nucleotide (for example, the substitution of a ribonucleotide for a deoxyribonucleotide described above) or by the combination of two or more non-natural nucleotides which, when subjected to selected reaction conditions, do not individually alter the cleavage properties of a polynucleotide but, rather, interact with each other to impart altered cleavage properties to the polynucleotide (termed "dinucleotide cleavage").
When reference is made herein to the incorporation of a single modified nucleotide into a polynucleotide and the subsequent cleavage of the modified polynucleotide, the modified nucleotide can be a ribonucleotide. By "having different cleavage characteristics", when referring to modified nucleotides, it is meant that modified nucleotides incorporated into the same modified polynucleotide can each be cleaved under reaction conditions that leave intact the sites of incorporation, in that modified polynucleotide, of each of the other modified nucleotides. As used herein, a "stabilizing modified nucleotide" refers to a modified nucleotide that imparts increased resistance to cleavage at the site of incorporation of the modified nucleotide. The majority of the modified nucleotides described herein provide increased lability toward cleavage when incorporated into a modified polynucleotide. However, the differential lability of the modified nucleotides with respect to the natural nucleotides in a modified polynucleotide is not always sufficient to allow complete cleavage at the modified nucleotides while avoiding any cleavage at the natural nucleotides. Modified nucleotides that reduce lability (stabilizing nucleotides) therefore serve a useful function, in that the presence of stabilizing nucleotides in a polynucleotide that also contains nucleotides that increase lability toward a particular cleavage procedure (labilizing nucleotides) can provide increased discrimination between the cleaved and non-cleaved nucleotides in a cleavage procedure. The preferred way to use stabilizing nucleotides in a polynucleotide is to replace all nucleotides that are not labilizing nucleotides with stabilizing nucleotides.
In the case of mononucleotide cleavage, this would involve using three stabilizing nucleotides and one labilizing nucleotide; in the case of dinucleotide cleavage, this would involve using two stabilizing nucleotides and two (different) labilizing nucleotides. As used herein, the term "stabilizing nucleotide" refers to a modified nucleotide which, when incorporated into a polynucleotide and subjected to a cleavage procedure, reduces cleavage at the stabilizing nucleotides relative to mononucleotide or dinucleotide cleavage at the other (non-stabilizing) nucleotides of the polynucleotide, whether those other nucleotides are natural nucleotides or labilizing nucleotides.
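The distinction between mononucleotide and dinucleotide cleavage drawn above can be illustrated with a minimal sketch. It assumes a polynucleotide represented as a plain string of single-letter bases in which every occurrence of a chosen base has been replaced by its labilizing analog, and it assumes, purely as an illustrative convention, that the break falls on the 3' side of the labilizing nucleotide; the actual position of the break depends on the chemistry used. The function names are hypothetical.

```python
def mononucleotide_cleave(seq, modified):
    """Split seq at every occurrence of the labilizing nucleotide.

    Assumed convention: the break falls immediately 3' of the modified
    nucleotide, so each fragment (except possibly the last) ends with it.
    """
    fragments, start = [], 0
    for i, base in enumerate(seq):
        if base == modified:
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


def dinucleotide_cleave(seq, first, second):
    """Split seq only where `first` is immediately followed by `second`.

    The break is placed between the two nucleotides of the pair; isolated
    occurrences of either nucleotide are left intact.
    """
    fragments, start = [], 0
    for i in range(len(seq) - 1):
        if seq[i] == first and seq[i + 1] == second:
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])
    return fragments


seq = "GATTACAGATC"
mono = mononucleotide_cleave(seq, "A")    # cleaves at every A
di = dinucleotide_cleave(seq, "A", "T")   # cleaves only at A followed by T
```

In this sketch dinucleotide cleavage yields fewer, longer fragments than mononucleotide cleavage of the same sequence, because only two of the four A residues are followed by T.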
As used herein, a "destabilizing modified nucleotide" or a "labilizing modified nucleotide" refers to a modified nucleotide that imparts a greater susceptibility to cleavage than a natural nucleotide at the sites of incorporation of the destabilizing modified nucleotide in a polynucleotide. As used herein, "determining a mass" refers to the use of a mass spectrometer to determine the mass of a molecule. Mass spectrometers in general measure the mass-to-charge ratio (m/z) of the analyte ions, from which the mass can be inferred. When the charge state of the analyte polynucleotide is +1 or -1, the m/z ratio and the mass are numerically equal after a correction for the proton mass (an extra proton is added to positively charged ions and a proton is subtracted from negatively charged ions), but when the charge is > +1 or < -1 the m/z ratio will normally be less than the actual mass. In some cases, the software provided with the mass spectrometer calculates the mass from m/z so that the user does not need to be aware of the difference. As used herein, a "label" or "tag" refers to a molecule that, when attached, for example and without limitation by covalent attachment or hybridization, to another molecule, for example a polynucleotide or polynucleotide fragment, provides or improves a means of detecting the other molecule. A fluorescent label or tag emits detectable light at a particular wavelength when excited at a different wavelength. A radiolabel or radioactive tag emits radioactive particles detectable with an instrument such as a scintillation counter. A "mass-modified" nucleotide is a nucleotide in which an atom or chemical substituent has been added, deleted or substituted, but this addition, deletion or substitution does not create modified nucleotide properties, as defined herein; that is, the only effect of the addition, deletion or substitution is the modification of the mass of the nucleotide. A "polynucleotide" refers to a linear chain of nucleotides connected by a phosphodiester bond between the 3'-hydroxyl group of one nucleoside and the 5'-hydroxyl group of a second nucleoside, which in turn is linked through its 3'-hydroxyl group to the 5'-hydroxyl group of a third nucleoside, and so on, to form a polymer consisting of nucleosides linked by a phosphodiester backbone. The polynucleotide can be, without limitation, single or double stranded DNA or RNA or any other structure known in the art. A "modified polynucleotide" refers to a polynucleotide in which one or more natural nucleotides have been partially or substantially completely replaced with modified nucleotides. A "modified DNA fragment" refers to a DNA fragment synthesized under Sanger dideoxy termination conditions in which one of the natural nucleotides, other than the one that is partially substituted with its dideoxy analog, is replaced with a modified nucleotide as defined herein. The result is a set of Sanger fragments; that is, a set of fragments ending in ddA, ddC, ddG or ddT, depending on the dideoxy nucleotide used, with each fragment also containing modified nucleotides (provided, of course, that the natural nucleotide corresponding to the modified nucleotide occurs in that particular Sanger fragment). As used herein, "altering the cleavage properties" of a polynucleotide means rendering the polynucleotide differentially cleavable, or resistant to cleavage, at the point of incorporation of the modified nucleotide relative to sites consisting of other non-natural or natural nucleotides.
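The relationship between m/z and mass described under "determining a mass" above can be sketched as follows. The sketch assumes that the ions are formed only by gain or loss of protons (no metal adducts) and that the charge state is known; the function names are hypothetical, and the proton mass is the standard value in daltons.

```python
PROTON_MASS = 1.007276  # Da; standard value, rounded

def mz_from_mass(neutral_mass, charge):
    """m/z of an ion formed by adding (charge > 0) or removing
    (charge < 0) |charge| protons from the neutral molecule."""
    if charge == 0:
        raise ValueError("charge must be non-zero")
    return (neutral_mass + charge * PROTON_MASS) / abs(charge)

def mass_from_mz(mz, charge):
    """Invert mz_from_mass: recover the neutral mass from an observed
    m/z value and a known, signed charge state."""
    if charge == 0:
        raise ValueError("charge must be non-zero")
    return mz * abs(charge) - charge * PROTON_MASS

# For charge -1 the observed m/z differs from the mass only by one proton;
# for charge -2 it is roughly half the mass.
mz_single = mz_from_mass(1000.0, -1)
mz_double = mz_from_mass(1000.0, -2)
```

Under these assumptions a fragment of mass 1000 Da appears near m/z 999 at charge -1 and near m/z 499 at charge -2, which is the behavior the passage describes for multiply charged ions.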
It is currently preferred to "alter the cleavage properties" by making the polynucleotide more susceptible to cleavage at the modified nucleotide incorporation sites than at any other sites in the molecule. As used herein, the use of the singular when referring to the substitution of a nucleotide should be interpreted to include substitution at each point of occurrence of the natural nucleotide unless otherwise expressly stated. As used herein, a "template" refers to a target polynucleotide strand, for example, without limitation, an unmodified, naturally occurring DNA strand, which a polymerase uses as a means of recognizing which nucleotide should next be incorporated into a growing strand in order to polymerize the complement of the naturally occurring strand. This DNA strand can be single stranded or can be part of a double stranded DNA template. In applications of the present invention that require repeated cycles of polymerization, for example the polymerase chain reaction (PCR), the template strand itself can become modified by the incorporation of modified nucleotides and still serve as a template for a polymerase to synthesize additional polynucleotides. A "primer" is a short oligonucleotide whose sequence is complementary to a segment of the template that is to be replicated and which the polymerase uses as the starting point for the replication process. By "complementary" it is meant that the nucleotide sequence of a primer is such that the primer can form a stable hydrogen-bonded complex with the template; that is, the primer can hybridize to the template by virtue of the formation of base pairs over a length of at least ten base pairs.
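The definition of "complementary" given above can be illustrated with a simplified check. The sketch below assumes DNA written as a 5'-to-3' string over A, C, G and T, and it requires perfect Watson-Crick pairing along the whole primer, which is stricter than the stated criterion of at least ten base pairs and ignores the stability of partially mismatched duplexes. The function names and the `min_pairs` parameter are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a 5'->3' DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def find_priming_site(template, primer, min_pairs=10):
    """Return the 0-based index on the template (5'->3') of the segment
    to which the primer anneals, or None if there is no such segment.

    Simplification: the primer must pair perfectly along its full length
    and must be at least min_pairs nucleotides long.
    """
    if len(primer) < min_pairs:
        return None
    index = template.find(reverse_complement(primer))
    return index if index >= 0 else None

template = "GGCATTACGGATCCGTTAGC"
primer = reverse_complement(template[8:20])   # 12-mer annealing at position 8
site = find_priming_site(template, primer)
```

A primer shorter than ten nucleotides is rejected outright in this sketch, mirroring the minimum pairing length stated in the definition.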
As used herein, a "polymerase" refers, without limitation, to molecules such as DNA or RNA polymerases, reverse transcriptases, and mutant DNA or RNA polymerases mutagenized by nucleotide addition, nucleotide deletion, one or more point mutations or the technique known to those skilled in the art as "DNA shuffling" (q.v., infra), or by joining portions of different polymerases to produce chimeric polymerases. Combinations of these mutagenizing techniques can also be used. A polymerase catalyzes the polymerization of nucleotides to form polynucleotides. Methods for producing, identifying and using polymerases capable of efficiently incorporating modified nucleotides together with natural nucleotides into a polynucleotide are set forth herein and are an aspect of this invention. The polymerases can be used either to extend a primer once or repeatedly, or to amplify a polynucleotide by repetitively priming two complementary strands using two primers. Amplification methods include, without limitation, polymerase chain reaction (PCR), NASBA, SDA, 3SR, TSA and rolling circle replication. It should be understood that, in any method for producing a polynucleotide containing given modified nucleotides, one or more polymerases or amplification methods may be used.
A "heat stable polymerase" or "thermostable polymerase" refers to a polymerase that retains sufficient activity to carry out primer extension reactions after being subjected to elevated temperatures, such as those necessary to denature double-stranded nucleic acids.
The selection of optimal polymerization conditions depends on the application.
In general, a form of primer extension may be most suitable for sequencing or variance detection methods that rely on dinucleotide cleavage and mass spectrometric analysis, while either primer extension or amplification (e.g., PCR) will be suitable for sequencing methods that rely on electrophoretic analysis. Genotyping methods are best suited to polynucleotides produced by amplification. Any type of polymerization may be suitable for the variance detection methods of this invention.
A "restriction enzyme" refers to an endonuclease (an enzyme that cleaves phosphodiester bonds within a polynucleotide chain) that cleaves DNA in response to a recognition site in the DNA. The recognition site (restriction site) consists of a specific nucleotide sequence, normally about 4 to 8 nucleotides long.
As used herein, "electrophoresis" refers to the technique known in the art as gel electrophoresis; for example, slab gel electrophoresis, capillary electrophoresis and automated versions thereof, such as the use of an automated DNA sequencer or an automated, simultaneous multi-channel capillary DNA sequencer, or electrophoresis in an etched channel such as can be produced in glass or other materials.
"Mass spectrometry" refers to a technique for mass analysis known in the art, which includes, but is not limited to, matrix-assisted laser desorption ionization (MALDI) and electrospray ionization (ESI) mass spectrometry, optionally employing, without limitation, time-of-flight, quadrupole or Fourier transform detection techniques. While the use of mass spectrometry constitutes a preferred embodiment of this invention, it will be apparent that other instrumental techniques are, or may become, available for the determination of the mass, or the comparison of the masses, of oligonucleotides.
One aspect of the present invention is the determination and comparison of masses, and any instrumental procedure capable of such determination and comparison is considered to be within the scope and spirit of this invention.
As used herein, "FRET" refers to fluorescence resonance energy transfer, a distance-dependent interaction between the electronic excited states of two dye molecules in which excitation is transferred from one dye (the donor) to another dye (the acceptor) without emission of a photon. A series of fluorogenic procedures has been developed to take advantage of FRET. In the present invention, the two dye molecules are generally located on opposite sides of a cleavable modified nucleotide, such that cleavage will alter the proximity of the dyes to each other and thus change the fluorescence output of the dyes on the polynucleotide.
As used herein, "constructing a gene sequence" refers to the process of inferring partial or complete information about the DNA sequence of a subject polynucleotide by analyzing the masses of its fragments obtained by a cleavage procedure. The process of constructing a gene sequence generally entails comparing a set of experimentally determined cleavage masses with the known or predicted masses of all possible polynucleotides that could be obtained from the polynucleotide in question, given only the constraints of the modified nucleotide or nucleotides incorporated into the polynucleotide and the chemical reaction mechanisms used, both of which affect the range of possible constituent masses. Various analytical deductions can then be used to extract the greatest amount of sequence information from the masses of the cleavage fragments.
The most sequence information can generally be inferred when the polynucleotide in question is modified and cleaved, in separate reactions, by means of two or more modified nucleotides or sets of modified nucleotides, because the range of deductions that can be made from the analysis of several sets of cleavage fragments is greater.
As used herein, a "sequence ladder" is a collection of overlapping polynucleotides, prepared from a single DNA or RNA template, which share a common end, usually the 5' end, but which differ in length because they terminate at different sites at the opposite end. The termination sites coincide with the sites of occurrence of one of the four nucleotides, A, G, C or T/U, in the template. In this way, the lengths of the polynucleotides collectively specify the intervals at which one of the four nucleotides occurs in the template DNA fragment. A set of four such sequence ladders, one specific for each of the four nucleotides, specifies the intervals at which all four nucleotides occur and therefore provides the complete sequence of the template DNA fragment. As used herein, the term "sequence ladder" also refers to the set of four sequence ladders required to determine a complete DNA sequence. The process of obtaining the four sequence ladders to determine a complete DNA sequence is called "generation of a sequence ladder".
As used herein, "cell senescence selection" refers to a process by which cells that are susceptible to being killed by a particular chemical agent only when they are actively growing (for example, without limitation, bacteria that can be killed by antibiotics only when they are growing) are used to find a polymerase that will incorporate a modified nucleotide into a polynucleotide.
The procedure requires that, when a particular polymerase that has been introduced into the cell line incorporates a modified nucleotide, that incorporation produces changes in the cells that cause them to senesce, that is, to stop growing. When colonies of cells, some members of which contain the polymerase that incorporates the modified nucleotide and some members of which do not, are exposed to the chemical agent, only those cells that do not contain the polymerase are killed. The cells are then placed in a medium in which cell growth resumes, that is, a medium without the chemical agent or the modified nucleotide, and those cells that grow are separated and the polymerase is isolated from them.
As used herein, a "chemical oxidant" refers to a reagent capable of increasing the oxidation state of a group in a molecule. For example, without limitation, a hydroxyl group (-OH) can be oxidized to a keto group. Examples of chemical oxidants include, without limitation, potassium permanganate, t-butyl hypochlorite, m-chloroperbenzoic acid, hydrogen peroxide, sodium hypochlorite, ozone, peracetic acid, potassium persulfate and sodium hypobromite.
As used herein, a "chemical base" refers to a chemical that, in an aqueous medium, has a pK greater than 7.0. Examples of chemical bases are, without limitation, alkali metal hydroxides (sodium, potassium, lithium) and alkaline earth metal hydroxides (calcium, magnesium, barium), sodium carbonate, sodium bicarbonate, trisodium phosphate, ammonium hydroxide and nitrogen-containing organic compounds such as pyridine, aniline, quinoline, morpholine, piperidine and pyrrole. These can be used as mild aqueous solutions (usually by dilution) or as strong (concentrated) solutions. A chemical base also refers to a strong non-aqueous organic base; examples of these bases include, without limitation, sodium methoxide, sodium ethoxide and potassium t-butoxide.
As used herein, the term "acid" refers to a substance that dissociates in aqueous solution to produce one or more hydrogen ions. The acid may be strong, which in general means that it is highly concentrated, or it may be weak, which in general refers to a dilute solution. It should of course be understood that acids inherently differ in strength; for example, sulfuric acid is much stronger than acetic acid, and this factor can also be taken into consideration when selecting a suitable acid for use with the methods described herein. The proper choice of acid will be apparent to those skilled in the art from the teachings herein. Preferably, the acids used in the methods of this invention are weak. Examples of inorganic acids are, without limitation, hydrochloric acid, sulfuric acid, phosphoric acid, nitric acid and boric acid. Examples, without limitation, of organic acids are formic acid, acetic acid, benzoic acid, p-toluenesulfonic acid, trifluoroacetic acid, naphthoic acid, uric acid and phenol.
An "electron withdrawing group" refers to a chemical group that, by virtue of its greater electronegativity, inductively draws electron density away from nearby groups and toward itself, leaving the less electronegative group with a partial positive charge. This partial positive charge, in turn, can stabilize a negative charge on an adjacent group, thereby facilitating any reaction that involves a negative charge, either formal or in a transition state, on the adjacent group. Examples of electron withdrawing groups include, without limitation, cyano (-C≡N), azido (-N3), nitro (-NO2), halo (F, Cl, Br, I), hydroxy (-OH), thiohydroxy (-SH) and ammonium (-NH3+).
An "electron withdrawing element", as used herein, refers to an atom that is more electronegative than carbon so that, when placed in a ring, the atom draws electrons toward itself, which, as with an electron withdrawing group, leaves nearby atoms with a partial positive charge. This makes the nearby atoms susceptible to nucleophilic attack. It also tends to stabilize, and therefore favor the formation of, negative charges on other atoms attached to the positively charged atom.
An "electrophile" or "electrophilic group" refers to a group that, when it reacts with a molecule, takes a pair of electrons from the molecule. Examples of some common electrophiles are, without limitation, iodine and aromatic nitrogen cations.
An "alkyl" group, as used herein, refers to an unsubstituted straight or branched chain group of 1 to 20 carbon atoms. Preferably, the group consists of a chain of 1 to 10 carbon atoms; more preferably, it is a chain of 1 to 4 carbon atoms. As used herein, "1 to 20" carbon atoms, etc., means 1 or 2 or 3 or 4, etc., up to 20 carbon atoms in the chain.
A "mercapto" group refers to an -SH group.
An "alkylating agent" refers to a molecule that is capable of introducing an alkyl group into a molecule. Examples, without limitation, of alkylating agents include methyl iodide, dimethyl sulfate, diethyl sulfate, ethyl bromide and butyl iodide.
As used herein, the terms "selective", "selectively", "practically", "essentially", "uniformly" and the like mean that the indicated event occurs to a particular degree.
In particular, the percentage of incorporation of a modified nucleotide is greater than 90%, preferably greater than 95%, more preferably greater than 99%; or the selectivity for cleavage at a modified nucleotide is greater than 10X, preferably greater than 25X, more preferably greater than 100X that of other natural or modified nucleotides; or the percentage of cleavage at a modified nucleotide is greater than 90%, preferably greater than 95%, more preferably greater than 99%.
As used herein, "diagnosis" refers to the determination of the nature of a disease or disorder. The methods of this invention can be used in any form of diagnosis, including, without limitation, clinical diagnosis (a diagnosis made from a study of the signs and symptoms of a disease or disorder, where the sign or symptom is the presence of a variance), differential diagnosis (determining which of two or more diseases with similar symptoms is the one from which a patient is suffering), etc.
By "prognosis", as used herein, is meant a prediction of the probable course and/or outcome of a disease. In the context of this invention, the methods described herein can be used to track the effect of a genetic variance or variances on the progression of a disease or on the response to treatment. It should be noted that the use of the methods of this invention as a prognostic tool does not require knowledge of the biological impact of a variance. The detection of a variance in an individual suffering from a particular disorder, or the statistical association of the variance with the disorder, is sufficient. The progress or response to treatment of patients with a particular variance can then be followed throughout the course of the disorder to guide therapy or other decisions for the management of the disorder.
By "having a genetic component" is meant that a particular disease, disorder or response to treatment is known or suspected to be related to a variance or variances in the genetic code of an individual afflicted with the disease or disorder.
As used herein, an "individual" refers to any higher life form, including reptiles and mammals, particularly humans. However, the methods of this invention are useful for the analysis of the nucleic acids of any biological organism.
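The definition of a "sequence ladder" given above can be illustrated with a short sketch. The Python code below is not part of the original disclosure; it is a minimal illustration, assuming an idealized case in which every fragment shares the same starting end and each ladder lists the lengths of the fragments that terminate at one of the four nucleotides. The function names `ladders_from_template` and `sequence_from_ladders` are hypothetical.

```python
# Minimal sketch (illustrative assumption, not the patented procedure):
# four idealized sequence ladders, one per nucleotide, jointly determine
# the sequence because every position appears in exactly one ladder.

def ladders_from_template(seq):
    """Return {nucleotide: sorted fragment lengths} for an idealized ladder.

    A fragment of length n is taken to end at position n (1-based) of seq.
    """
    ladders = {base: [] for base in "ACGT"}
    for position, base in enumerate(seq, start=1):
        ladders[base].append(position)
    return ladders


def sequence_from_ladders(ladders):
    """Rebuild the sequence from the four ladders of fragment lengths."""
    length_to_base = {}
    for base, lengths in ladders.items():
        for n in lengths:
            if n in length_to_base:
                raise ValueError(f"length {n} appears in two ladders")
            length_to_base[n] = base
    if not length_to_base:
        raise ValueError("no fragments supplied")
    total = max(length_to_base)
    missing = [n for n in range(1, total + 1) if n not in length_to_base]
    if missing:
        raise ValueError(f"no fragment ends at positions {missing}")
    return "".join(length_to_base[n] for n in range(1, total + 1))


if __name__ == "__main__":
    ladders = ladders_from_template("GATTACA")
    print(ladders)
    print(sequence_from_ladders(ladders))
```

In practice the fragment lengths would be read from an electrophoretic separation, so gaps or collisions between ladders (reported here as errors) correspond to the ambiguities that additional experiments are meant to resolve.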
BRIEF DESCRIPTION OF THE TABLES
Table 1 is a description of several procedures currently in use for the detection of variances in DNA.
Table 2 shows the molecular weights of the four DNA nucleotide monophosphates and the difference in mass between each pair of nucleotides.
Table 3 shows the masses of all possible 2mers, 3mers, 4mers and 5mers of the DNA nucleotides of Table 2.
Table 4 shows the masses of all possible 2mers, 3mers, 4mers, 5mers, 6mers and 7mers that would be produced by cleavage at one of the four nucleotides, and the mass differences between neighboring oligonucleotides.
Table 5 shows the mass changes that will occur for all possible point mutations (replacement of one nucleotide by another) and the maximum theoretical size of a polynucleotide in which a point mutation should be detectable by mass spectrometry using mass spectrometers of varying resolving power.
Table 6 shows the actual molecular weight differences observed in an oligonucleotide using the method of this invention; the difference reveals a hitherto unknown variance in the oligonucleotide.
Table 7 shows all the masses obtained by cleavage of an exemplary 20mer in four separate reactions, each reaction being specific for one of the DNA nucleotides; that is, at A, C, G and T.
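The kind of information summarized in Tables 2 and 3 can be sketched computationally. The Python code below is not taken from the tables of this document; it is a minimal illustration that assumes standard average molecular weights for the free-acid 2'-deoxynucleoside 5'-monophosphates and a simple linear oligonucleotide in which each phosphodiester bond removes one water. The masses of real cleavage fragments depend on the modified nucleotide and the end groups left by the cleavage chemistry. The names `DNMP_MASS`, `pairwise_differences`, `oligo_mass` and `composition_masses` are hypothetical.

```python
from itertools import combinations, combinations_with_replacement

# Assumed average molecular weights (Da) of the free-acid
# 2'-deoxynucleoside 5'-monophosphates; not copied from Table 2.
DNMP_MASS = {"A": 331.22, "C": 307.20, "G": 347.22, "T": 322.21}
WATER = 18.015  # lost for each phosphodiester bond formed


def pairwise_differences(masses=DNMP_MASS):
    """Mass difference between every pair of nucleotides."""
    return {
        a + b: round(abs(masses[a] - masses[b]), 2)
        for a, b in combinations(sorted(masses), 2)
    }


def oligo_mass(composition):
    """Approximate mass of a linear oligonucleotide of the given composition."""
    n = len(composition)
    return round(sum(DNMP_MASS[b] for b in composition) - (n - 1) * WATER, 2)


def composition_masses(k):
    """Mass of every distinct base composition of length k (order ignored)."""
    return {
        "".join(c): oligo_mass(c)
        for c in combinations_with_replacement("ACGT", k)
    }


if __name__ == "__main__":
    print(pairwise_differences())
    for k in (2, 3, 4, 5):
        print(k, len(composition_masses(k)), "distinct compositions")
```

Under these assumed values the smallest pairwise difference is the one between A and T, which is why substitutions exchanging A and T place the greatest demand on the resolving power of the mass spectrometer.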
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows the detection of a single base change (a T to C) in 66 base pair fragments obtained by PCR.
Figure 2 shows the molecular weights of the major fragments expected from cleavage of a polynucleotide modified by incorporation of the modified nucleotide 7-methylguanine in place of G.
Figure 3 shows polyacrylamide gel analysis of polynucleotides with modified G before and after cleavage. Two polynucleotides that differ by a single nucleotide (RFC versus RFC mut) were analyzed.
Figure 4 is a mass spectrogram, with enlarged inset, of the 66 base pair PCR fragment amplified in the presence of RFC.
Figure 5 shows the mass spectrogram, with enlarged inset, of the cleavage products from a 66 base polynucleotide with complete substitution of 7-methylG for G and subsequent cleavage at G.
Figure 6 is a mass spectrogram of two oligonucleotides that differ by a single nucleotide; that is, a G is present only in the larger oligonucleotide.
Figure 7 shows a sequencing gel of a linearized single-stranded M13 template. The template was extended to 87 nucleotides in the presence of 5'-amino dTTP using exo-minus Klenow polymerase and partially cleaved with acetic acid.
Figure 8 shows a purified full-length extension product of the fragment of Figure 7 before and after chemical cleavage.
Figure 9 shows the results of a restriction endonuclease digestion of the fully extended template/primer complex of Figures 7 and 8 and also shows the extension of the primer in the presence of 5'-amino T to form a 7.2 Kb polynucleotide.
Figure 10 shows the resolution obtained during separation by high performance liquid chromatography (HPLC) of phiX174 DNA restricted with Hae III.
Figure 11 shows the sequence ladder obtained from a polynucleotide in which T was replaced with 5'-amino T, followed by cleavage with acetic acid and denaturing polyacrylamide gel electrophoresis.
Figure 12 shows an example of dinucleotide cleavage in which a ribonucleotide is 5' of a bridging thiol ester.
Figure 13 shows the efficiency of complete mononucleotide cleavage or complete dinucleotide cleavage for the detection of variance in polynucleotides of 50, 100, 150, 200 and 250 nucleotides.
Figures 14 to 18 show various aspects of long-range DNA sequencing using chemically cleavable modified nucleotides.
Figure 14 shows a hypothetical random sequencing analysis of a 10 kb clone and illustrates the principle and advantages of long-range DNA sequencing by chemical cleavage of mononucleotides incorporated by polymerase.
Figure 15 illustrates the sequencing of a 2.7 kb plasmid by primer extension in the presence of the 4 dNTPs and one 5'-amino-dNTP, followed by restriction endonuclease digestion, end labeling, chemical cleavage and electrophoretic resolution of the resulting sequence ladder.
Figure 16 shows the separation by HPLC of HincII restriction endonuclease fragments partially substituted with 5'-aminoT.
Figure 17 is a comparison of sequence ladders produced by dideoxy termination and by acid cleavage of primer extension products partially substituted with 5'-amino nucleotides. The chemical cleavage process results in a uniform distribution of labeled products across approximately 4000 nucleotides or more.
Figure 18 is a comparison of sequence ladders produced by dideoxy termination and by acid cleavage of primer extension products partially substituted with 5'-amino nucleotides, as visualized on an autoradiogram.
Figure 19 is an illustration of the DNA fragments produced by restriction endonuclease cleavage of a 700 nt DNA fragment compared to the fragments produced by chemical dinucleotide cleavage.
Figure 20 shows a dinucleotide cleavage employing a ribonucleotide and a 5'-amino-nucleotide in a 5' to 3' orientation.
Figure 21 compares the cleavage products obtained by base cleavage of a DNA fragment substituted with a ribonucleotide and a 5'-aminonucleotide with the cleavage products obtained by acid cleavage.
Figure 22 shows the results of the cleavage of a DNA fragment substituted with ribo-G and 5'-amino-TTP. The autoradiogram shows complete cleavage at GT and no background cleavage at G or T.
Figure 23 shows the results of the cleavage of a DNA fragment incorporating ribo-A and 5'-amino-TTP. Again, the autoradiogram shows complete and entirely site-specific cleavage.
Figure 24 is a mass spectrogram of the cleavage products of the DNA fragment of Figure 23. All fragments are observed except the 2 nt fragment.
Figure 25 illustrates the results of the dinucleotide cleavage of a 257 nt primer extension product into which ribo-A and 5'-amino-TTP have been incorporated.
Figure 26 is a MALDI-TOF mass spectrogram of the cleavage products of the primer extension product of Figure 25.
Figures 27 to 33 demonstrate the application of mononucleotide cleavage to genotyping by mass spectrometry, capillary electrophoresis and FRET.
Figure 27 is a schematic illustration of genotyping (detection of variance at a known variant site).
Figure 28 shows the results of the genotyping of a dA versus dG variance in the transferrin receptor by PCR amplification in the presence of modified ddA followed by chemical cleavage at the modified nucleotide.
Figure 29 exemplifies genotyping by modified nucleotide incorporation and chemical cleavage followed by mass spectrometric analysis of the resulting fragments.
Figure 30 demonstrates the genotyping of a transferrin receptor sequence containing a modified nucleotide by chemical cleavage followed by MALDI-TOF.
Figure 31 demonstrates distinguishing features of MALDI-TOF genotyping.
Figure 32 demonstrates the genotyping of a transferrin receptor polymorphism by chemical cleavage of a transferrin receptor sequence containing a modified nucleotide, followed by capillary or slab gel electrophoresis.
Figure 33 schematically illustrates the FRET detection of variant polynucleotides after chemical cleavage of a modified polynucleotide.
DETAILED DESCRIPTION OF THE INVENTION
In one aspect, this invention relates to a method for detecting a variance in nucleotide sequence among related polynucleotides by replacing a natural nucleotide in a polynucleotide at virtually every point of occurrence of the natural nucleotide with a modified nucleotide, cleaving the modified polynucleotide at virtually every point of incorporation of the modified nucleotide, determining the masses of the fragments obtained and then comparing those masses with the masses expected for a related polynucleotide of known sequence or, if the sequence of a related polynucleotide is not known, repeating the preceding steps with a second related polynucleotide and then comparing the masses of the fragments obtained from the two related polynucleotides. Of course, it is understood that the methods of this invention are not limited to any particular number of related polynucleotides; as many as are needed or desired may be used.
In another aspect, this invention relates to a method for detecting a variance in nucleotide sequence among related polynucleotides by replacing two natural nucleotides in a polynucleotide with two modified nucleotides, the modified nucleotides being selected so that, under the chosen reaction conditions, neither individually imparts selective cleavage properties to the modified polynucleotide. Instead, when the two modified nucleotides are contiguous, that is, when the natural nucleotides being replaced were contiguous in the unmodified polynucleotide, they act in concert to impart selective cleavage properties to the modified polynucleotide.
In addition to mere proximity, it may also be necessary, depending on the modified nucleotides and the reaction conditions selected, that the modified nucleotides be in the proper spatial relationship. For example, without limitation, 5'A-3'G might be susceptible to cleavage while 5'G-3'A might not be. As above, once the modified nucleotides have been substituted for the natural nucleotides, the modified polynucleotide is cleaved at the modified nucleotide pair, the masses of the fragments are determined and the masses are compared either with the masses expected for a polynucleotide of known sequence or, if the sequence of at least one of the related polynucleotides is not known, with the masses obtained when the procedure is repeated with other related polynucleotides.
In another aspect, this invention relates to methods for detecting mono- or dinucleotide cleavage products by electrophoresis or by fluorescence resonance energy transfer (FRET). In FRET-based assays, the presence or absence of fluorescence is monitored over a specific wavelength range. These two methods are particularly well suited to detecting variance at a single site in a polynucleotide where the variance has been previously identified. Knowledge of the particular variance allows the design of FRET or electrophoretic reagents and procedures specifically suited to the rapid, automated, low-cost determination of the status of the variant nucleotides. Examples of FRET and electrophoretic detection of cleavage products are described below and in the Figures.
The use of the variance detection methods of this invention for the development of diagnostic or prognostic tools, and their use for the detection of predisposition to certain diseases and disorders, is another aspect of this invention.
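The mononucleotide and dinucleotide cleavage aspects described above can be sketched in a few lines of Python. This is a minimal illustration, not the method as practiced: it assumes that cleavage occurs on the 5' side of every occurrence of the chosen nucleotide (or between the two members of an oriented dinucleotide), and it uses assumed average masses of unmodified nucleotides, ignoring the mass of the modification and the end groups left by the cleavage chemistry. All function names and the example sequences are hypothetical.

```python
from collections import Counter

# Assumed average masses (Da) of unmodified free-acid dNMPs; a real
# analysis would use the masses of the modified nucleotides and of the
# end groups produced by the chosen cleavage chemistry.
DNMP_MASS = {"A": 331.22, "C": 307.20, "G": 347.22, "T": 322.21}
WATER = 18.015


def cleave_at_mononucleotide(seq, base):
    """Cut on the 5' side of every occurrence of `base` (an assumption)."""
    fragments, start = [], 0
    for i, b in enumerate(seq):
        if b == base and i > start:
            fragments.append(seq[start:i])
            start = i
    fragments.append(seq[start:])
    return fragments


def cleave_at_dinucleotide(seq, five_prime, three_prime):
    """Cut between the two nucleotides of each 5'-XY-3' occurrence."""
    fragments, start = [], 0
    for i in range(len(seq) - 1):
        if seq[i] == five_prime and seq[i + 1] == three_prime:
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])
    return fragments


def fragment_mass(fragment):
    """Approximate mass of a linear fragment (see assumptions above)."""
    total = sum(DNMP_MASS[b] for b in fragment)
    return round(total - (len(fragment) - 1) * WATER, 2)


def differing_masses(reference, variant, base):
    """Masses seen only in the reference and only in the variant."""
    ref = Counter(fragment_mass(f) for f in cleave_at_mononucleotide(reference, base))
    var = Counter(fragment_mass(f) for f in cleave_at_mononucleotide(variant, base))
    return sorted((ref - var).elements()), sorted((var - ref).elements())


if __name__ == "__main__":
    reference, variant = "ACGTAGGA", "ACGTAGCA"  # hypothetical G-to-C variance
    print(cleave_at_mononucleotide(reference, "A"))
    print(differing_masses(reference, variant, "A"))
    # Orientation matters for dinucleotide cleavage: 5'A-3'G vs 5'G-3'A.
    print(cleave_at_dinucleotide(reference, "A", "G"))
    print(cleave_at_dinucleotide(reference, "G", "A"))
```

Comparing the two fragment mass sets as multisets means that the presence of a variance is reported without any need to know its position, which mirrors the point made above that detection does not require full sequence determination.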
In the development of diagnostic tools, the methods of this invention will be used to compare the DNA of a test subject who is showing symptoms of a particular disease or disorder known or suspected to be genetically related, or who is showing desirable characteristics such as an improvement in health or economically valuable traits such as growth rate, resistance to parasites, crop yield, etc., with the DNA of healthy members of the same population and/or members of the population that exhibit the same disease, disorder or trait. The test subject can be, without limitation, a human; any other mammal, for example a rat, mouse, dog, cat, horse, cow, pig, sheep, goat, etc.; a cold-blooded species such as fish; or an agriculturally important crop such as wheat, corn, cotton or soybeans. The detection of a statistically significant variance between the healthy members of the population and the members of the population with the disease or disorder would serve as substantial evidence of the usefulness of the test for identifying subjects who have, or are at risk of having, the disease or disorder. This could lead to very useful diagnostic tests.
In using the methods of this invention as a diagnostic or prognostic tool, it is entirely unnecessary to know anything about the variance that is being sought; that is, its exact location, whether it is an addition, deletion or substitution, or which nucleotides have been added, deleted or substituted. Mere detection of the presence of the variance accomplishes the desired task: to diagnose or predict the incidence of a disease or disorder in a test subject. In most cases, however, it would be preferable to create a specific genotyping test for a particular variance with diagnostic or prognostic utility.
Particularly useful aspects of the genotyping methods described herein are the ease of assay design, the low cost of the reagents and the suitability of the cleavage products for detection by a variety of methods, including, without limitation, electrophoresis, mass spectrometry and fluorescent detection.
In another aspect of this invention, the complete sequence of a polynucleotide can be determined by repeating the method described above, which involves replacement of a natural nucleotide at each point of occurrence of the natural nucleotide with a modified nucleotide, followed by cleavage and mass detection. In this embodiment, the procedure is carried out four times, once with each of the natural nucleotides; that is, in the case of DNA, for example but without limitation, each of dA, dC, dG and dT is replaced with a modified nucleotide in four separate experiments. The masses obtained from the four cleavage reactions can then be used to determine the complete sequence of the polynucleotide. This method is applicable to polynucleotides prepared by primer extension or by amplification, for example by PCR; in the latter case, both strands undergo modified nucleotide replacement.
An additional experiment may be necessary if the above procedure leaves any nucleotide positions in the sequence ambiguous (see, for example, the Examples section, infra). This additional experiment may be a repetition of the above procedure using the complementary strand of the DNA being studied, if the method involves primer extension. The additional experiment may also be the use of the method described above in which two natural nucleotides are replaced with two modified nucleotides, the polynucleotide is cleaved where the modified nucleotides are contiguous and the masses of the fragments obtained are then determined. Knowledge of the position of contiguous nucleotides in the target polynucleotide can resolve the ambiguity.
Another experiment that could be used to resolve any ambiguity that might arise in the main experiment is one-pass Sanger sequencing followed by gel electrophoresis, which is quick and easy but which would not, on its own, provide highly accurate sequencing. Therefore, together with the methods of this invention, an alternative sequencing method known in the art could, in the case of a specific ambiguity, provide the information necessary to resolve the ambiguity. Combinations of these procedures could also be used. The value of using different procedures rests on the generally recognized observation that each sequencing method has certain associated artifacts that compromise its performance, but that the artifacts differ from one procedure to another. Thus, when the goal is highly accurate sequencing, the use of two or more sequencing techniques that would tend to cancel out each other's artifacts would be very useful. Other additional experiments that could resolve an ambiguity will, based on the disclosures herein and on the specific sequence ambiguity in question, be apparent to those skilled in the art and are therefore considered to be within the scope of this invention.
In still another aspect of this invention, the modified nucleotide cleavage reactions described herein can result in the formation of a covalent bond between one of the cleavage fragments and another molecule. This molecule can serve a variety of purposes. It may contain a directly detectable label or an entity that improves the detection of the cleavage products during fluorogenic, electrophoretic or mass spectrometric analysis. For example, without limitation, the entity can be a dye, a radioisotope, a trapped ion to improve ionization efficiency, an excitable group that can alter desorption efficiency, or simply a large molecule that globally alters the desorption and/or ionization characteristics. The labeling reaction may be partial or complete.
An example of the use of homogeneously labeled DNA fragments of controllable size is in DNA hybridization, for example as hybridization probes for high-density DNA arrays, where small DNA fragments are used.
A further aspect of this invention is the replacement of a natural nucleotide with a modified nucleotide at only a percentage of the points of occurrence of that natural nucleotide in a polynucleotide. This percentage may be between about 0.01% and 95%, preferably between about 0.01% and 50%, more preferably between about 0.01% and 10% and most preferably between about 0.01% and 1%. The percent replacement is selected to be complementary to the efficiency of the selected cleavage reaction. That is, if a low efficiency cleavage reaction is selected, then a higher percentage of substitution is acceptable; if a high efficiency cleavage reaction is selected, then a low percentage of replacement is preferred. The desired result is that, on average, each individual polynucleotide chain is cleaved once, so that a sequencing ladder such as that described for the Maxam-Gilbert and Sanger procedures can be developed. Since the cleavage reactions described herein are of relatively high efficiency, low replacement percentages are preferred in order to achieve the desired single cleavage per polynucleotide chain. Low replacement percentages can also be achieved more easily with available polymerases. However, based on the teachings herein, other cleavage reactions of varying degrees of efficiency will be apparent to those skilled in the art and are likewise within the scope of this invention.
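The trade-off between percent replacement and cleavage efficiency described above can be made concrete with a short calculation. The Python sketch below is an illustration under a simplifying assumption that is not stated in this document: each occurrence of the natural nucleotide is substituted, and each substituted site is cleaved, independently of all others. The function names are hypothetical.

```python
# Minimal sketch, assuming independent substitution and cleavage events.

def expected_cleavages(n_sites, substitution_fraction, cleavage_efficiency):
    """Average number of cuts per chain with n_sites occurrences of the nucleotide."""
    return n_sites * substitution_fraction * cleavage_efficiency


def substitution_for_single_cut(n_sites, cleavage_efficiency):
    """Substitution fraction giving, on average, one cut per chain (capped at 1)."""
    if n_sites <= 0 or cleavage_efficiency <= 0:
        raise ValueError("n_sites and cleavage_efficiency must be positive")
    return min(1.0, 1.0 / (n_sites * cleavage_efficiency))


def prob_exactly_one_cut(n_sites, substitution_fraction, cleavage_efficiency):
    """Binomial probability that a chain is cut exactly once."""
    p = substitution_fraction * cleavage_efficiency
    return n_sites * p * (1.0 - p) ** (n_sites - 1)


if __name__ == "__main__":
    # Hypothetical chain with 250 occurrences of the replaced nucleotide.
    for efficiency in (1.0, 0.5, 0.01):
        f = substitution_for_single_cut(250, efficiency)
        print(efficiency, f, prob_exactly_one_cut(250, f, efficiency))
```

Because only the product of the substitution fraction and the cleavage efficiency enters the calculation, a fully substituted polynucleotide cleaved at low efficiency and a sparsely substituted polynucleotide cleaved at high efficiency give the same expected number of cuts per chain under this model.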
In fact, it is an aspect of this invention that, by using cleavage reactions of sufficiently low efficiency, which, in terms of percentage of cleavage at the points of incorporation of a modified nucleotide in a modified polynucleotide, can be between about 0.01% and 50%, preferably between about 0.01% and 10% and most preferably between about 0.01% and 1%, a polynucleotide in which a natural nucleotide has been replaced with a modified nucleotide at virtually every point of occurrence can still be used to generate the sequencing ladder. At the most preferred efficiency level, between about 0.01% and 1%, each chain of a completely modified polynucleotide should, on average, be cleaved only once.
In another aspect, this invention relates to methods for producing and identifying polymerases with novel properties with respect to the incorporation and cleavage of modified nucleotides.
A. Nucleotide modification and cleavage
(1) Base modification and cleavage
A modified nucleotide may contain a modified base, a modified sugar, a modified phosphate ester linkage or a combination thereof. Base modification is the chemical modification of the adenine, cytosine, guanine or thymine (or, in the case of RNA, uracil) moiety of a nucleotide such that the resulting chemical structure renders the modified nucleotide more susceptible to attack by a reagent than a nucleotide containing the unmodified base. The following are examples, without limitation, of base modification. Other such base modifications will be readily apparent to those skilled in the art in light of the teachings herein and are therefore considered to be within the scope of this invention (e.g., the use of difluorotoluene; Liu, D., et al., Chem. Biol., 4: 919-929, 1997; Moran, S., et al., Proc. Natl. Acad. Sci. USA, 94: 10506-10511, 1997). Some examples, without limitation, of these modified bases are described below.
1.
Adenine (1) can be replaced with 7-deaza-7-nitroadenine (2). 7-Deaza-7-nitroadenine is readily incorporated into polynucleotides by enzyme-catalyzed polymerization. The 7-nitro group activates C-8 to attack by chemical base, for example and without limitation, aqueous sodium hydroxide or aqueous piperidine, which eventually results in specific strand cleavage (Verdine, et al., JACS, 1996, 118: 6116-6120).
We have found that cleavage with piperidine is not always complete, whereas complete cleavage is the desired result. However, when the cleavage reaction is carried out in the presence of a phosphine derivative, for example and without limitation, tris(2-carboxyethyl)phosphine (TCEP), and a base, complete cleavage is obtained. An example of this cleavage reaction is as follows: DNA modified by the incorporation of 7-nitro-7-deaza-2'-deoxyadenosine is treated with 0.2 M TCEP / 1 M piperidine / 0.5 M Tris base at 95 °C for one hour. Analysis by denaturing polyacrylamide gel (20%) showed complete cleavage. Other bases, such as, without limitation, NH4OH, may be used in place of the piperidine and the Tris base. This method, that is, the use of a phosphine in conjunction with a base, should be applicable to any cleavage reaction in which the target polynucleotide has been substituted with a modified nucleotide that is labile to piperidine.
The product of cleavage with TCEP and base is unique. Mass spectrometric analysis was consistent with a structure having a phosphate-ribose-TCEP adduct at the 3' end and a phosphate moiety at the 5' end, i.e., structure 3.
The manner in which TCEP participates in the fragmentation of a modified polynucleotide is not currently known; however, without being bound to any particular theory, we believe that the mechanism may be as follows:
[Scheme: proposed mechanism]
The incorporation of TCEP (or another phosphine) into the cleavage product should be a very useful method for labeling fragmented polynucleotides at the same time that cleavage is taking place. By using an appropriately functionalized phosphine that remains capable of forming an adduct with the 3'-terminal ribose as described above, functionalities such as, without limitation, mass labels, fluorescent labels, radioactive labels and ion trap labels could be incorporated into a fragmented polynucleotide. Phosphines containing one or more labels and capable of covalently binding to a cleavage fragment constitute another aspect of this invention. Similarly, the use of these labeled phosphines as a method for labeling polynucleotide fragments is another aspect of this invention.
While other phosphines, which may become apparent to those skilled in the art based on the teachings herein, can be used to prepare labeled phosphines for incorporation into polynucleotide fragments, TCEP is a particularly good candidate for labeling. For example, the carboxy (-C(O)OH) groups can be modified directly by various techniques, for example and without limitation, by reaction with an amine, alcohol or mercaptan in the presence of a carbodiimide to form an amide, ester or mercapto ester, as shown in the following reaction scheme:
[Reaction scheme: tris(2-carboxyethyl)phosphine is treated with an alcohol, amine or thiol (R1M1H, then R2M2H) and dicyclohexylcarbodiimide (DCC) to give the mono-modified derivative and the bis-modified derivative.]
When a carboxy group is reacted with a carbodiimide in the absence of a nucleophile (the amine in this case), the adduct between the carbodiimide and the carboxy group can rearrange to form a stable N-acylurea. If the carbodiimide contains a fluorophore, the resulting phosphine will then carry that fluorophore, as shown in the following reaction scheme, wherein M1 and M2 are independently O, NH, NR or S, and R1 and R2 are mass labels, fluorescent labels, radioactive labels, ion trap labels or combinations thereof.
Fluorophores containing an amino group, such as, for example, fluoresceinyl glycine amide (5-(aminoacetamido)fluorescein), 7-amino-4-methylcoumarin, 2-aminoacridone, 5-aminofluorescein, 1-pyrenemethylamine and 5-aminoeosin, can be used to prepare labeled phosphines by this method. Amino derivatives of Lucifer Yellow and Cascade Blue can also be used, as can amino derivatives of biotin. In addition, hydrazine derivatives, such as, for example, those of rhodamine and Texas Red, may also be useful in this method.
Fluorescent diazoalkanes, such as, for example and without limitation, 1-pyrenyldiazomethane, may also be useful for forming esters with TCEP.
Fluorescent alkyl halides can also react with the anion of the carboxyl group, i.e., the C(O)O- group, to form esters. Among the halides that could be used are, without limitation, panacyl bromide, 3-bromoacetyl-7-diethylaminocoumarin, 6-bromoacetyl-2-diethylaminonaphthalene, 5-bromomethylfluorescein, BODIPY® 493/503 methyl bromide and monobromobimanes; iodoacetamides, such as coumarin iodoacetamide, can likewise serve as effective label-carrying moieties that will bind covalently to TCEP.
The naphthalimide sulfonate ester reacts rapidly with the anions of carboxylic acids in acetonitrile to give adducts that can be detected by absorption at 259 nm down to 100 femtomoles and by fluorescence at 394 nm down to four femtomoles.
There are also countless amine-reactive fluorescent probes available, and it is possible to convert TCEP to a primary amine by means of the following reaction:
(CH3)3CO-C(=O)-NH(CH2)nNH2 + P[(CH2)2COOH]3
--EDAC--> (CH3)3CO-C(=O)-NH(CH2)nNH-C(=O)-(CH2)2P[(CH2)2COOH]2
--CF3COOH--> H2N(CH2)nNH-C(=O)-(CH2)2P[(CH2)2COOH]2
The aminophosphine can then be used to form label-containing aminophosphines for use in the cleavage/labeling method described herein.
The dyes described above, and the procedures for covalently linking them to TCEP, are only a few examples of the possible adducts that can be formed. A valuable source of these reagents and of additional procedures is the catalog of Molecular Probes, Inc. Based on the disclosures herein and on resources such as the Molecular Probes catalog, many other ways of modifying phosphines, in particular TCEP, will be apparent to those skilled in the art. Those other ways of modifying phosphines for use in incorporating labels into polynucleotide fragments during chemical cleavage of the polynucleotide are within the scope of this invention.
2. Cytosine (4) can be replaced with 5-azacytosine (5). 5-Azacytosine is likewise efficiently incorporated into polynucleotides by enzyme-catalyzed polymerization. 5-Azacytosine is susceptible to cleavage by chemical base, particularly aqueous base, such as, for example, aqueous piperidine or aqueous sodium hydroxide (Verdine, et al., Biochemistry, 1992, 31: 11265-11273).
3(a). Guanine (6) can be replaced with 7-methylguanine (7)
which can similarly be readily incorporated into polynucleotides by polymerases (Verdine, et al., JACS, 1991, 113: 5104-5106) and is susceptible to attack by chemical base, for example and without limitation, aqueous piperidine (Siebenlist, et al., Proc. Natl. Acad. Sci. USA, 1980, 77: 122); or,
3(b). Gupta and Kool, Chem. Commun., 1997, pp. 1425-26, have shown that N6-allyl-dideoxyadenine, when incorporated into a DNA strand, will cleave on treatment with a mild electrophile, E+, in their case iodine. The proposed mechanism is shown in Scheme 1:
Scheme 1
A similar procedure could be employed with guanine using the previously unreported 2-allylaminoguanine derivative 8, which can be prepared by the procedure shown in Scheme 2:
Scheme 2
Other ways of synthesizing compound 8 will become apparent based on the teachings herein; such syntheses are considered to be within the spirit and scope of this invention.
Incorporation of N2-allylguanosine triphosphate into a polynucleotide chain should render the chain susceptible to cleavage in a manner similar to the N6-allyladenine nucleotide of Gupta, i.e., by the mechanism shown in Scheme 3:
Scheme 3
4. Either thymine (9) or uracil (10) can be replaced with 5-hydroxyuracil (11) (Verdine, JACS, 1991, 113: 5104). As with the modified bases mentioned above, the nucleotide prepared from 5-hydroxyuracil can also be incorporated into a polynucleotide by enzyme-catalyzed polymerization (Verdine, et al., JACS, 1993, 115: 374-375). Specific cleavage is accomplished by first treating the 5-hydroxyuracil with an oxidizing agent, for example, aqueous permanganate, and then with a chemical base such as, for example and without limitation, aqueous piperidine (Verdine, ibid).
5. Pyrimidines substituted at the 5-position with an electron-withdrawing group, for example and without limitation, a nitro, halo or cyano group, should be susceptible to nucleophilic attack at the 6-position followed by base-catalyzed ring opening and subsequent degradation of the phosphate ester bond. An example, which is not to be construed as limiting the scope of this technique in any way, is shown in Scheme 4 using 5-substituted cytidine. If the cleavage is carried out in the presence of tris(carboxyethyl)phosphine (TCEP), adduct 10 can be obtained and, if the TCEP is functionalized with an appropriate moiety (q.v. infra), labeled polynucleotide fragments can be obtained.
Scheme 4
(2) Sugar modification and cleavage
Modification of the sugar portion of a nucleotide can also give a modified polynucleotide that is susceptible to selective cleavage at the sites of incorporation of that modification. In general, the sugar is modified to include one or more functional groups that render the 3' and/or 5' phosphate ester linkage more labile, that is, more susceptible to cleavage, than the 3' and/or 5' phosphate ester linkage of a natural nucleotide. The following are examples, without limitation, of such sugar modifications. Other sugar modifications will be readily apparent to those skilled in the art in light of the teachings herein and are therefore considered to be within the scope of this invention. In the following formulas, B and B' refer to any base and may be the same or different.
1. In a deoxyribose-based polynucleotide, replacement of one or more of the deoxyribonucleosides with a ribose analogue, for example and without limitation, the substitution of adenosine (12) for deoxyadenosine (13), renders the resulting modified polynucleotide susceptible to selective cleavage by chemical base, for example and without limitation, aqueous sodium hydroxide or concentrated ammonium hydroxide, at each point of occurrence of adenosine in the modified polynucleotide (Scheme 5);
Scheme 5
2. A 2'-ketosugar (14, synthesis: JACS, 1967, 89: 2697) can be substituted for the sugar of a deoxynucleotide; on treatment with chemical base, for example and without limitation, aqueous hydroxide, the keto group equilibrates with its ketal form (15), which then attacks the phosphate ester bond, effecting cleavage (Scheme 6);
Scheme 6
3.
A deoxyribose nucleotide can be replaced with its arabinose analogue, i.e., a sugar containing a 2"-hydroxy group (16). Again, treatment with mild (dilute aqueous) chemical base effects the intermolecular displacement of a phosphate ester bond, resulting in cleavage of the polynucleotide (Scheme 7):
Scheme 7
4. A deoxyribose nucleotide can be replaced by its 4'-hydroxymethyl analogue (17, synthesis: Helv. Chim. Acta, 1966, 79: 1980) which, on treatment with mild chemical base, for example and without limitation, dilute aqueous hydroxide, similarly displaces a phosphate ester bond, causing cleavage of the polynucleotide as shown in Scheme 8:
Scheme 8
5. A deoxyribose nucleotide can be replaced by its 4'-hydroxy carbocyclic analogue, i.e., a 4-hydroxymethylcyclopentane derivative (18) which, on treatment with aqueous base, results in cleavage of the polynucleotide at a phosphate ester linkage, as shown in Scheme 9:
Scheme 9
6. A sugar ring can be replaced with its analogue that is substituted with a hydroxyl group (19). Depending on the stereochemical positioning of the hydroxyl group on the ring, either a 3' or a 5' phosphate ester linkage can be selectively cleaved on treatment with mild chemical base (Scheme 10):
Scheme 10
7. In each of examples 1, 3, 4, 5 and 6 above, the hydroxy group that attacks the phosphate ester linkage can be replaced with an amino group (-NH2). The amino group can be generated in situ from the corresponding azido sugar by treatment with tris(2-carboxyethyl)phosphine (TCEP) after the azido-modified polynucleotide has been formed (Scheme 11). The amino group, once formed, spontaneously attacks the phosphate ester bond, resulting in cleavage.
Scheme 11
8. A sugar can be substituted with a functional group that is capable of generating a free radical, for example and without limitation, a t-butyl ester group (tBuC(=O)-) or a phenylselenyl group (PhSe-) (Angew. Chem. Int. Ed. Engl., 1993, 32: 1742-43). Treatment of the modified sugar with ultraviolet light under anaerobic conditions results in the formation of a C4' radical whose fragmentation causes cleavage of the modified nucleotide and thereby cleavage of the polynucleotide at the modified nucleotide (Scheme 12). Free radicals can be generated either before or during the laser desorption/ionization process of MALDI mass spectrometric analysis. Modified nucleotides with other photolabile 4' substituents can also be used as precursors to form a C4' radical, for example and without limitation, 2-nitrobenzyl groups or 3-nitrophenyl groups (Synthesis, 1980, 1-26) and bromo or iodo groups.
Scheme 12
9. An electron-withdrawing group can be incorporated into the sugar so that either the nucleotide becomes susceptible to β-elimination (when W is cyano, a "cyano-sugar" 20) or the oxyanion formed by hydrolysis of the 3'-phosphate ester linkage is stabilized, so that mild chemical base hydrolysis occurs preferentially at the modified sugar. These electron-withdrawing groups include, without limitation, cyano (-C≡N), nitro (-NO2), halo (in particular, fluoro), azido (-N3) or methoxy (-OCH3) (Scheme 13):
Scheme 13
A cyano-sugar can be prepared by a variety of techniques, one of which is shown in Scheme 14. Other methods will undoubtedly be apparent to those skilled in the art based on the teachings herein; such alternative routes to cyano-sugars (or to sugars substituted with other electron-withdrawing groups) are within the spirit and scope of this invention.
Scheme 14
10. The ring oxygen of a sugar can be replaced with another atom, for example and without limitation, a nitrogen to form a pyrrole ring (21). Alternatively, another heteroatom can be placed in the sugar ring in place of one of the ring carbon atoms, for example and without limitation, a nitrogen atom to form an oxazole ring (22). In either case, the purpose of the different or additional heteroatom is to make the phosphate ester linkage of the resulting non-natural nucleotide more labile than that of the natural nucleotide (Scheme 15):
Scheme 15
11. A group such as, for example and without limitation, a mercapto group can be incorporated at the 2" position of a sugar ring, which group, on treatment with mild chemical base, forms a ring with elimination of the 3'-phosphate ester (Scheme 16).
Scheme 16
12. A keto group can be incorporated at the 5' position so that the resulting phosphate has the lability of an anhydride, i.e., structure 23. A nucleotide triphosphate such as 23 can be synthesized by the procedure shown in Scheme 17. It is recognized that other routes to such nucleotide triphosphates may become apparent to those skilled in the art based on the teachings herein; such syntheses are within the spirit and scope of this invention.
[Scheme 17: synthesis of nucleotide triphosphate 23. Reagents legible in the source: tBuMe2SiCl, imidazole, DMF; camphorsulfonic acid, methanol, H2O; TEMPO, PhI(OAc)2, CH3CN, H2O; AcOH-THF-H2O.]
Scheme 17
Polynucleotides into which nucleotide triphosphates of structure 23 have been incorporated should, like the analogous mixed anhydrides, be susceptible to alkaline hydrolysis as shown in Scheme 18:
Scheme 18
13. The phosphate ester bond could be converted to the relatively more labile enol ester linkage by the incorporation of a double bond at the 5' position; that is, a nucleotide triphosphate of structure 24 could be used. A nucleotide triphosphate of structure 24 can be prepared by the procedure shown in Scheme 19. It is again understood that other ways of producing structure 24 may become apparent to those skilled in the art based on the teachings herein; as above, these alternative syntheses are also within the spirit and scope of this invention.
[Scheme 19: synthesis of nucleotide triphosphate 24, in two parts. "Oxidation of alcohol to aldehyde, for example": reagents legible in the source include tBuMe2SiCl, DMF; camphorsulfonic acid, methanol, H2O; dicyclohexylcarbodiimide, CF3COOH, pyridine, dimethyl sulfoxide. "Oxidation of alcohol to ketone, for example": reagents legible in the source include (COCl)2, Et3N, dimethyl sulfoxide; MeMgBr, THF; ClPO(OCH2CCl3)2, THF.]
Scheme 19
The enol ester would be susceptible to alkaline cleavage in accordance with Scheme 20.
Scheme 20
14. Difluoro substitution at the 5' position would increase the lability of the phosphate ester linkage and would also drive the reaction to completion by virtue of the hydrolysis of the intermediate difluorohydroxy group to an acid group, as shown in Scheme 22. The dihalo derivative could be synthesized by the procedure shown in Scheme 21. Again, the route shown in Scheme 21 is not the only possible way to make the difluoronucleotide triphosphate. However, as above, these other routes would become apparent based on the disclosures herein and would be within the spirit and scope of this invention.
Scheme 21
Scheme 22
(3) Phosphate ester modification and cleavage
Modification of the phosphate ester of a nucleotide results in modification of the phosphodiester linkages between the 3'-hydroxy group of one nucleotide and the 5'-hydroxy group of the adjacent nucleotide such that one or the other of the modified 3'- or 5'-phosphate ester bonds becomes substantially more susceptible to cleavage than the corresponding unmodified linkage. Since the phosphodiester linkage forms the backbone of a polynucleotide, this method of modification will be referred to herein, alternatively, as "backbone modification". The following are non-limiting examples of backbone modification. Other such modifications will become apparent to those skilled in the art based on the teachings herein and are therefore considered to be within the scope of this invention.
1.
The replacement of an oxygen in the phosphate ester linkage with a sulfur, that is, the creation of a phosphorothiolate linkage (25a, 25b, 25c), results in selective cleavage of the phosphothioester linkage either directly on treatment with mild base (Schemes 23(a) and 23(b)) or on treatment with an alkylating agent, such as, for example, methyl iodide, followed by treatment with strong non-aqueous organic base, for example methoxide (Scheme 23(c)). Alternatively, phosphorothiolate linkages such as those of Formula 14 can also be selectively cleaved by laser photolysis during MALDI mass spectrometric analysis. This in-source fragmentation procedure (Internat'l J. of Mass Spec. and Ion Process., 1997, 169/170: 331-350) consolidates polynucleotide cleavage and analysis into one step;
Scheme 23(a)
Scheme 23(b)
Scheme 23(c)
2. The replacement of an oxygen in the phosphate ester linkage with a nitrogen creates a phosphoramidate linkage (26) which, on treatment with, for example and without limitation, dilute aqueous acid, will result in selective cleavage (Scheme 24);
Scheme 24
3. The replacement of one of the free oxygen atoms attached to the phosphorus of the phosphate backbone with an alkyl group, for example and without limitation, a methyl group, forms a methylphosphonate linkage which, on treatment with strong non-aqueous organic base, for example and without limitation, methoxide, will similarly result in selective cleavage (Scheme 25).
Scheme 25
4. Alkylation of the free oxyanion of a phosphate ester linkage with an alkyl group, for example and without limitation, a methyl group, followed by treatment with strong non-aqueous organic base, for example and without limitation, methoxide, will result in selective cleavage of the resulting alkylphosphotriester linkage (Scheme 26).
Scheme 26
5. Treatment of a phosphorothioate with β-mercaptoethanol in a strong base, such as, for example and without limitation, methanolic sodium methoxide, in which the mercaptoethanol exists mainly as the disulfide, could result in the formation of a mixed disulfide, which would then degrade, with or without rearrangement, to give the cleavage products shown in Scheme 27.
Scheme 27
(4) Dinucleotide modification and cleavage
The above substitutions are all single substitutions; that is, one modified nucleotide is substituted for one natural nucleotide wherever the natural nucleotide occurs in the target polynucleotide or, if desired, at a fraction of those sites. In a further aspect of this invention, multiple substitutions can be used. That is, two or more different modified nucleotides can be substituted for two or more different natural nucleotides, respectively, wherever the natural nucleotides occur in a subject polynucleotide. The modified nucleotides and the cleavage conditions are selected such that, under the appropriate cleavage conditions, the modified nucleotides do not individually confer selective cleavage properties on a polynucleotide. When, however, the appropriate cleavage conditions are applied and the modified nucleotides are incorporated into the polynucleotide in a particular spatial relationship to one another, they interact to jointly render the polynucleotide selectively cleavable. Preferably, two modified nucleotides are substituted for two natural nucleotides in a polynucleotide; this method is therefore referred to herein as "dinucleotide modification". It is important to note that, individually, each of the two modified nucleotides may elicit specific and selective cleavage of a polynucleotide, albeit under quite different, typically more vigorous, chemical conditions.
As used herein, "spatial relationship" refers to the three-dimensional relationship between two or more modified nucleotides after substitution into a polynucleotide. In a preferred embodiment of this invention, two modified nucleotides must be contiguous in a modified polynucleotide in order to impart altered cleavage properties to the modified polynucleotide.
By employing two modified nucleotides in this manner and then cleaving the modified polynucleotide, the relationship between two natural nucleotides in a target polynucleotide can be established, depending on the nature of the multiple substitution selected. That is, the natural nucleotides that were replaced must also have been adjacent to each other in the natural polynucleotide. For example and without limitation, if a modified A and a modified G are substituted at each point of occurrence of the natural A and the natural G, respectively, the modified polynucleotide will become selectively cleavable only where the natural A and G were directly adjacent, that is, AG or GA (but not both), in the naturally occurring polynucleotide. As shown below, appropriate selection of the modified nucleotides will also reveal the exact relationship of the nucleotides, i.e., in the example above, whether the nucleotide sequence in the natural polynucleotide was AG or GA.
The following are non-limiting examples of multiple substitutions. Other multiple substitutions will be apparent to those skilled in the art based on the disclosures herein and are therefore considered to be within the scope of this invention.
1. One modified nucleotide can contain a functional group capable of effecting nucleophilic substitution, while its partner modified nucleotide is modified so as to make it a selective leaving group. The nucleophile and the leaving group may be in a 5'-3' orientation or in a 3'-5' orientation with respect to each other. A non-limiting example of this is shown in Scheme 28. The 2' or 2" hydroxy group of one modified nucleotide becomes a good nucleophile when treated with mild chemical base. The other modified nucleotide contains a 3' or 5' thiohydroxy (-SH) group, which forms a 3' or 5' phosphorothiolate linkage when incorporated into the modified polynucleotide.
This phosphorothiolate linkage is selectively more labile than a normal phosphodiester linkage. When treated with mild base, the oxyanion formed from the hydroxy group of one modified nucleotide selectively displaces the thiophosphate linkage to the other modified nucleotide, resulting in cleavage. As shown in Schemes 28(a) and 28(b), depending on the stereochemical relationship between the hydroxy group and the thiophosphate linkage, cleavage will occur either on the 3' side or on the 5' side of the hydroxy-containing modified nucleotide. The exact relationship of the natural nucleotides in the naturally occurring polynucleotide is thereby revealed.
Scheme 28(a)
Scheme 28(b)
2(a). If one modified nucleotide contains a 3' or 5' amino (-NH2) group and the other modified nucleotide contains a 5' or 3' hydroxy group, respectively, treatment of the resulting phosphoramidate-linked polynucleotide with mild acid results in protonation of the amino group of the phosphoramidate linkage, which then becomes a good leaving group. Again, depending on the spatial relationship between the hydroxy group of one modified nucleotide and the amino group of the other modified nucleotide, the exact relationship of the nucleotides in the naturally occurring polynucleotide can be determined, as shown in Schemes 29(a) and 29(b).
Scheme 29(a)
Scheme 29(b)
Dinucleotide cleavage of a 5'-3' ribonucleotide/5'-aminonucleotide linkage is a presently preferred embodiment of this invention. Examples of this method are shown in Figures 21-26.
2(b). When the amino group of the modified nucleotide is 5', a ribonucleotide/5'-amino-2',5'-dideoxynucleotide pair can be cleaved during the polymerization process. For example and without limitation, cleavage occurs during the incorporation of adenine ribonucleotide and 5'-aminodideoxythymidine nucleotide into a polynucleotide using a combination of wild-type Klenow (exo-) and mutant E710A Klenow (exo-) polymerases. E710A is a mutant Klenow (exo-) polymerase in which the glutamate at residue 710 has been replaced by alanine. The E710A mutant is more efficient than Klenow (exo-) at incorporating both ribonucleotides and deoxyribonucleotides into a single nascent polynucleotide chain. Other polymerases with similar properties will be apparent to those skilled in the art based on the teachings herein, and their use for incorporation of a ribonucleotide and a 5'-amino-2',5'-dideoxynucleotide into a polynucleotide, with subsequent cleavage during the polymerization reaction, is within the scope of this invention.
When a 5'-end radiolabeled primer was extended using a mixture of Klenow (exo-) and E710A Klenow (exo-), only one fragment (the 5'-end fragment) was observed, indicating complete cleavage at the ribonucleotide-5'-aminonucleotide sites. We have shown (Figures 21-26) that polymerization and cleavage occur in the same step. That is, cleavage is induced during protein-DNA contact. The figures show that the polymerases continue to extend along the template even after cleavage, which also suggests that the cleavage is the result of protein-DNA contact.
While USB-brand Klenow polymerase (Amersham) was also able to incorporate both nucleotides, it was not as efficient as the polymerase mixture; multiple product bands were therefore observed, indicating incomplete cleavage at the AT sites.
The foregoing is, of course, a specific example of a general concept. That is, other wild-type polymerases, mutant polymerases or combinations thereof should likewise be able to cleave, or facilitate the cleavage of, modified nucleotides or dinucleotides during the polymerization process. The procedure for determining the exact combinations of nucleotide modifications and polymerases that result in cleavage will, based on the teachings herein, be apparent to those skilled in the art. For example, as described below, it may be useful to generate a library of mutant polymerases and to select specifically those that induce dinucleotide cleavage. Therefore, a polymerase or a combination of polymerases that causes cleavage of a forming modified polynucleotide during the polymerization process is yet another aspect of this invention, as is the method of cleaving a modified polynucleotide during the polymerization process using a polymerase or a combination of polymerases and the modified nucleotide(s) necessary for cleavage to occur.
3. An electron-withdrawing group can be placed on a sugar carbon adjacent to the carbon that is attached to the hydroxy group participating in the ester linkage of a methylphosphonate (Scheme 30(a)) or methylphosphotriester (Scheme 30(b)) backbone. This will result in greater stability of the oxyanion formed when the phosphate group is hydrolyzed with mild chemical base (Scheme 30) and, therefore, in selective hydrolysis of those phosphate ester linkages in comparison with phosphate ester linkages not adjacent to such hydroxy groups.
Scheme 30(a)
Scheme 30(b)
4. An electron-withdrawing group can be placed on the 4'-carbon of a nucleotide that is linked through its 5'-hydroxy group to the 3'-hydroxy group of an adjacent ribonucleotide. Treatment with dilute base will result in cleavage, as shown in Scheme 31.
Scheme 31
5. A 2' or 4' leaving group on a sugar may be susceptible to attack by the sulfur of a phosphorothioate, as shown in Schemes 32 and 33, to provide the desired cleavage:
Scheme 32   Scheme 33
6. Ethylene sulfide could effect the cleavage of a 2'-fluoro derivative of a sugar in combination with a phosphorothioate according to Scheme 34:
Scheme 34
β-Mercaptoethanol or a similar reagent can be used in place of ethylene sulfide.
7. A phosphorothioate could be coordinated with a metal oxidant, for example and without limitation Cu(II) or Fe(III), which would thereby be held in close proximity to the 2'-hydroxy group of an adjacent ribonucleotide. Selective oxidation of the 2'-hydroxy group to a ketone should make the adjacent phosphate ester bond more susceptible to cleavage under basic conditions than those of the corresponding ribonucleotides or deoxyribonucleotides, as shown in Scheme 35:
Scheme 35
The preceding cleavage reactions can be carried out so as to cause cleavage at virtually all points of occurrence of the modified nucleotide or, in the case of multiple substitutions, at all points of occurrence of two or more modified nucleotides in the appropriate spatial relationship. Alternatively, by controlling the amount of cleavage reagent and the reaction conditions, the cleavage may be made partial; that is, cleavage will occur at only a fraction of the points of occurrence of a modified nucleotide or of pairs of modified nucleotides.
B. Fragmentation of modified polynucleotides in mass spectrometers
The above discussion relates to chemical methods for cleaving polynucleotides at sites where modified nucleotides have been incorporated. However, in addition to the chemical fragmentation of polynucleotide molecules in solution, it is a further aspect of this invention that the fragmentation is carried out within a mass spectrometer using physical or chemical means. In addition, by manipulating the conditions within the mass spectrometer, the degree of fragmentation can be controlled. The ability to control the degree of fragmentation of chemically modified oligonucleotides can be very useful in determining the relationships between adjacent sequences. This is because, while mass spectrometric (MS) analysis of a completely cleaved polynucleotide provides the masses, and therefore the nucleotide content, of each fragment polynucleotide, determining the order in which these fragment polynucleotides are linked in the original (analyte) polynucleotide is a difficult problem. By relaxing the stringency of cleavage, one can generate fragments that each correspond to two or more fragments of the complete cleavage set. The mass of such a compound fragment provides the information that allows the inference that its two component fragments are adjacent in the original polynucleotide. By determining that multiple pairs or triplets of complete cleavage fragments are adjacent to one another, a much longer sequence can eventually be assembled than if one had to rely solely on the analysis of complete cleavage fragments.
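The adjacency inference described above can be illustrated with a minimal sketch. It assumes that the mass of a compound fragment is simply the sum of the masses of its two component fragments plus an optional linker correction; the function name, the tolerance, the linker term and the example masses are all hypothetical and are not taken from this disclosure.

```python
from itertools import combinations

def adjacent_candidates(complete_masses, partial_mass, tol=0.5, linker=0.0):
    """Return index pairs of complete-cleavage fragments whose combined mass
    matches a mass observed under partial cleavage (assumed additive model)."""
    hits = []
    for (i, m1), (j, m2) in combinations(enumerate(complete_masses), 2):
        if abs(m1 + m2 + linker - partial_mass) <= tol:
            hits.append((i, j))
    return hits

# Hypothetical masses (Da) of five complete-cleavage fragments.
complete = [931, 937, 329, 1235, 313]

# A partial-cleavage mass of 1868 Da is explained only by fragments 0 and 1,
# suggesting that those two fragments are adjacent in the analyte.
print(adjacent_candidates(complete, 1868))
```

In practice the linker correction would depend on the cleavage mechanism, and several pairs may match a single mass, in which case additional partial cleavage conditions would be needed to resolve the ambiguity.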
The ability to control and manipulate the fragmentation conditions in the mass spectrometer is particularly advantageous because, in contrast to the iterative generation and subsequent testing of partial cleavage reactions in a test tube, the effect of various partial cleavage conditions can be observed directly in real time and adjusted instantaneously to provide the optimal partial cleavage data sets. For some purposes, the use of several partial cleavage conditions can be very useful, since successive levels of partial cleavage will provide a cumulative picture of the relationships between progressively larger fragments. The specific mechanisms for the fragmentation of modified polynucleotides are described below. First, by selecting appropriate ionization methods, fragmentation can be induced during the ionization process. Alternatively, in the tandem mass spectrometry (MS/MS) technique, ions with mass-to-charge (m/z) ratios of interest can be selected and then activated by a variety of procedures, including collision with molecules, ions or electrons, or the absorption of photons of various wavelengths, leading to fragmentation of the ions. In one aspect, ionization and fragmentation of the polynucleotide molecules can be achieved with fast atom bombardment (FAB). In this technique, the modified polynucleotide molecules are dissolved in a liquid matrix, for example glycerol, thioglycerol or other glycerol analogues. The solution is deposited on a metal surface. Particles with kinetic energies of thousands of electron volts are directed at the liquid droplet. Depending on the modification of the polynucleotides, partial fragmentation or complete fragmentation at each modified nucleotide can be achieved.
In another aspect, ionization and fragmentation can be effected by matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS). In MALDI-MS, a solution of modified polynucleotide molecules is mixed with a matrix solution, for example 3-hydroxypicolinic acid in aqueous solution. An aliquot of the mixture is deposited on a solid support, usually a metal surface with or without modification. Lasers, preferably with a wavelength of between 6 μm and 10.6 μm, are used to irradiate the modified polynucleotide/matrix mixture. To analyze in-source fragmentation (ISF) products, delayed extraction can be used. To analyze post-source decay (PSD) products, an ion reflector can be used. In another technique, ionization and fragmentation can be achieved by electrospray ionization (ESI). In this procedure, the solution of modified DNA is sprayed through the orifice of a needle to which a voltage of a few kilovolts is applied. Fragmentation of the modified polynucleotide molecules would occur during the desolvation process in the nozzle-skimmer (NS) region. The degree of fragmentation will depend on the nature of the modification as well as on factors such as the voltage between the nozzle and the skimmer and the flow rate and temperature of the drying gas. If a capillary is used to aid desolvation, then it is the voltage between the capillary exit and the skimmer and the capillary temperature that need to be controlled to achieve the desired degree of fragmentation. In another technique, the modified polynucleotide molecules can be selectively activated and dissociated. Activation can be achieved by accelerating the precursor ions to a kinetic energy of a few hundred to a few million electron volts and then causing them to collide with neutral molecules, preferably noble gas atoms. In the collision, part of the kinetic energy of the precursor ions is converted into internal energy and causes fragmentation.
Activation can also be achieved by allowing accelerated precursor ions to collide with a conductive or semiconducting surface. Activation can also be achieved by allowing accelerated precursor ions to collide with ions of opposite polarity. In another approach, activation can be achieved by electron capture. In this technique, the precursor ions are allowed to collide with thermalized electrons. Activation can also be achieved by irradiating the precursor ions with photons of various wavelengths, preferably in the range of 193 nm to 10.6 μm. Activation can also be achieved by heating the vacuum chamber in which the ions are trapped; heating of the vacuum chamber walls causes blackbody IR irradiation (Williams, E.R., Anal. Chem., 1998, 70: 179A-185A). The presence of modified nucleotides in a polynucleotide could also increase the rate constant of the fragmentation reaction, shortening the 10-1000 second duration required by the blackbody IR irradiation approach for unmodified polynucleotides. As noted above, tandem mass spectrometry is another tool that can be employed to advantage with the methods of this invention. In tandem mass spectrometry, precursor ions with m/z of interest are selected and subjected to activation. Depending on the activation technique used, some or all of the precursor ions can be fragmented to give product ions. When this is done within a suitable mass spectrometer (e.g., Fourier transform ion cyclotron resonance mass spectrometers and ion trap mass spectrometers), product ions with m/z of interest can in turn be selected and subjected to activation and fragmentation, giving further product ions. The masses of both the precursor and the product ions can be determined.
To control the degree of fragmentation at different stages of activation, two or more different types of modified nucleotides, which for purposes of discussion will be called Type I and Type II and which have different sensitivities to different activation techniques, could be incorporated (with complete replacement of the corresponding natural nucleotides) into a target polynucleotide. Such a polynucleotide can be fragmented with high efficiency by the Type I activation technique at each position where a Type I modified nucleotide is incorporated. The resulting fragment ions, which still contain Type II modified nucleotides, can then be selected and fragmented by a Type II activation technique to generate a set of subfragments from which the nucleotide content can be inferred more easily. A technique of this kind can be used for the detection of variance. For example, a 500-mer polynucleotide can first be fragmented into 10 to 50 fragments using a Type I fragmentation technique. The m/z of each fragment, when compared to the predicted set of fragment masses, will reveal whether that fragment contains a variance. Once the fragments containing a variance are identified, the remaining fragment ions are ejected from the ion trapping device, while the fragment ions of interest are subjected to activation. By controlling the degree of fragmentation of these fragment ions, a set of smaller DNA fragments can be generated that allows the order of the nucleotides and the position of the variance to be determined. Compared to the technique involving one type of modified nucleotide and one stage of fragmentation, this technique has the advantage that the number of experimental steps and the amount of data that need to be processed are significantly reduced.
Compared to the technique involving one type of modified nucleotide but two stages of partial fragmentation, this technique has the advantage that the fragmentation efficiency in the second stage is more controllable, thus reducing the chance of sequence gaps. Although the aforementioned activation schemes can be applied to all kinds of mass spectrometers, ion trap mass spectrometers (ITMS) and Fourier transform ion cyclotron resonance mass spectrometers (FT-ICRMS) are particularly suitable for the electron capture, photon activation and blackbody IR irradiation techniques.
C. Incorporation of Modified Nucleotides
Several examples of the polymerase-catalyzed incorporation of a modified nucleotide into polynucleotides are described in the Examples section, below. It may be, however, that a particular polymerase will not incorporate all of the modified nucleotides described above, or others similar to them that fall within the scope of this invention, with the same ease and efficiency. Also, while a particular polymerase may be able to incorporate one modified nucleotide efficiently, it may be less efficient at incorporating a second modified nucleotide directly adjacent to the first. In addition, currently available polymerases may not be capable of inducing or facilitating cleavage at modified nucleotides or nucleotide pairs, which would be an extremely convenient way to achieve cleavage (see above). There are, however, several techniques for obtaining polymerases that are capable of incorporating the modified nucleotides and contiguous pairs of modified nucleotides of this invention and, potentially, of inducing or facilitating specific cleavage at that modified nucleotide or those modified nucleotides. One approach to finding polymerases with the appropriate capabilities is to take advantage of the diversity inherent among naturally occurring polymerases, including, without limitation, RNA polymerases, DNA polymerases and reverse transcriptases. Naturally occurring polymerases are known to have different affinities for non-natural nucleotides, and it is likely that a natural polymerase can be identified that will effect the desired incorporation. In some cases, the use of a mixture of two or more naturally occurring polymerases having different properties with respect to the incorporation of one or more non-natural nucleotides may be advantageous. For example, W. Barnes has reported (Proc. Natl. Acad. Sci. USA, 1994, 91: 2216-2220) the use of two polymerases, an exonuclease-free N-terminal deletion mutant of Taq DNA polymerase and a thermostable DNA polymerase having 3'-exonuclease activity, to achieve improved polymerization of long DNA templates. Naturally occurring polymerases from thermophilic organisms are preferred for applications in which amplification by thermal cycling, e.g., PCR, is the most convenient way to produce modified polynucleotides. Another approach is to employ current knowledge of polymerase structure-function relationships (see, for example, Delarue, M., et al., Protein Engineering, 1990, 3: 461-467; Joyce, C. M., Proc. Natl. Acad. Sci. USA, 1997, 94: 1619-1622) to identify, or assist in the rational design of, a polymerase that can accomplish the incorporation of a particular modified nucleotide. For example, the amino acid residues of DNA polymerases that provide specificity for deoxyribo-NTPs (dNTPs: deoxyribonucleotide triphosphates) while excluding ribo-NTPs (rNTPs) have been examined in some detail. Phenylalanine residue 155 of the reverse transcriptase of Moloney Murine Leukemia Virus appears to provide a steric barrier that blocks the entry of ribo-NTPs. A similar role is played by phenylalanine residue 762 of the Klenow fragment of E. coli DNA polymerase I and by tyrosine residue 115 of HIV-1 reverse transcriptase. Mutation of this latter amino acid, or its equivalent, in several different polymerases has the effect of altering the fidelity of the polymerase and its sensitivity to nucleotide inhibitors. The corresponding site in RNA polymerases has also been investigated and appears to play a similar role in discriminating ribonucleotides from deoxyribonucleotides.
For example, it has been shown that mutation of tyrosine 639 of T7 RNA polymerase to phenylalanine reduces the specificity of the polymerase for rNTPs by approximately 20-fold and almost eliminates the Km difference between rNTPs and dNTPs. The result is that the mutant T7 RNA polymerase can polymerize a mixed dNTP/rNTP chain. See, for example, Huang, Y., Biochemistry, 1997, 36: 13718-13728. These results illustrate the use of structure-function information in the design of polymerases that will readily incorporate one or more modified nucleotides. In addition, chemical modification or site-directed mutagenesis of specific amino acids, or genetic engineering, can be used to create truncated, chimeric or mutant polymerases with particular properties. For example, chemical modification has been used to modify T7 DNA polymerase (Sequenase®, Amersham) to increase its processivity and its affinity for non-natural nucleotides (Tabor, S., et al., Proc. Natl. Acad. Sci. USA, 1987, 84: 4767-4771). Similarly, site-directed mutagenesis has been employed to examine how E. coli DNA polymerase I (Klenow fragment) distinguishes between deoxy- and dideoxynucleotides (Astatke, M., et al., J. Mol. Biol., 1998, 278: 147-165). In addition, the development of a polymerase with optimal characteristics can be achieved by random mutagenesis of one or more known polymerases coupled with an assay that reveals the desired characteristics in the mutated polymerase. A particularly useful method for carrying out this mutagenesis is termed "DNA shuffling" (see Harayama, S., Trends Biotechnol., 1998, 16: 76-82). For example, using only three rounds of DNA shuffling and assaying for β-lactamase activity, a variant with 16,000-fold higher resistance to the antibiotic cefotaxime than the wild-type gene was created (Stemmer, W.P.C., Nature, 1994, 370: 389-391).
A novel method, which is a further aspect of this invention, for creating and selecting polymerases capable of efficiently incorporating a modified nucleotide or a contiguous pair of modified nucleotides of this invention is described in the Examples section, below.
D. Fragment analysis
Once a modified nucleotide or nucleotides have been partially or completely substituted for one or more natural nucleotides in a polynucleotide, and cleavage of the resulting modified polynucleotide has been completed, analysis of the fragments obtained can be carried out. If the goal is the complete sequencing of a polynucleotide, the aforementioned partial incorporation of modified nucleotides into a polynucleotide, or partial cleavage of a polynucleotide fully substituted with a modified nucleotide, can be used to create fragment ladders similar to those obtained with the Maxam-Gilbert or Sanger methods. In such a case, a sequencing ladder can be constructed using slab, capillary or miniaturized gel electrophoresis techniques. The advantage of the method of this invention over the Maxam-Gilbert method is that the placement of the modified nucleotides in the modified polynucleotide is precise, as is the cleavage, whereas the post-synthesis modification of a full-length polynucleotide required by the Maxam-Gilbert reactions is susceptible to error. For example, the wrong nucleotides could be modified, so that erroneous cleavage could occur, or the target nucleotides may not be completely modified, so that cleavage may be insufficient or may not occur where it was expected. The advantages over the Sanger procedure are several. First, the full-length clone can be purified after extension and before cleavage, so that fragments terminated prematurely because of stops caused by polymerase error or by template secondary structure can be removed before gel electrophoresis, resulting in cleaner cleavage bands.
In fact, it may not even be necessary to perform this cleanup, because prematurely terminated polymerase extension fragments will themselves be cleaved if they contain a modified nucleotide, and those correctly cleaved fragments will simply augment the other fragments obtained from cleavage of the full-length clone (although this augmentation is confined to fragments shorter than the premature termination site). Second, the chemical method produces sequence ladder products of equal intensity, in contrast to dye-terminator sequencing, where substantial differences in the characteristics of different dye-terminator molecules, or in the interaction of dye-modified dideoxynucleotides with polymerase-template complexes, result in non-uniform signal strength in the resulting sequence ladders. These differences can lead to errors and make identification of heterozygotes difficult. Third, the chemical methods described here allow the production of homogeneous sequence ladders over distances of multiple kb, in contrast to the Sanger chain termination method, which generates labeled fragments over a substantially shorter interval. This is demonstrated in Figures 17 and 18. The production of long sequence ladders can be coupled with restriction endonuclease digestion to accomplish 1X sequencing of long templates. The utility of this technique for sequencing genomic DNA is described in Figure 14 and its execution in Figures 15 and 16. These methods are of particular utility in the sequencing of repeat-rich genomes such as, for example and without limitation, the human genome. A particular advantage of the methods described here for the use of mass spectrometry in polynucleotide sequence determination is the speed, reproducibility, low cost and automation associated with mass spectrometry, especially in comparison with gel electrophoresis. See, for example, Fu, D.J., et al., Nature Biotechnology, 1998, 16: 381-384.
Therefore, although some aspects of this invention may employ gel analysis, the preferred embodiments are those using mass spectrometry. When detection of variance between two or more related polynucleotides is the objective, the ability of mass spectrometry to differentiate between masses within a few atomic mass units (amu), or even a single amu, allows detection without the need to determine the complete nucleotide sequences of the polynucleotides being compared; that is, the masses of the oligonucleotides provide their nucleotide content. The use of mass spectrometry in this way constitutes yet another aspect of this invention. This use of mass spectrometry to identify and determine the chemical nature of variances is based on the unique molecular weight characteristics of the four deoxynucleotides and their oligomers. Table 2 shows the differences in mass between the four deoxynucleotide monophosphates. Table 3A then shows the calculated masses of all possible 2mers, 3mers, 4mers and 5mers by nucleotide composition alone; that is, without consideration of nucleotide order. As can be seen, only two of the 121 possible oligomers from 2mer to 5mer have the same mass. Thus, the nucleotide composition of all the 2mers, 3mers and 4mers, and of all but two of the 5mers, created by cleavage of a polynucleotide can be determined immediately by mass spectrometry using an instrument with sufficient resolving power. For the masses of Table 3A, an instrument with a resolution (full width at half maximum) of 1500 to 2000 would be sufficient; mass spectrometers with resolution up to 10,000 are available commercially. However, when cleavage is performed at all sites of modified nucleotide substitution, it is not necessary to consider the masses of all possible 2mers, 3mers, 4mers, etc. This is because there can be no internal occurrences of the cleavage nucleotide in any of the cleavage fragments.
That is, if G is the cleavage nucleotide, then all the resulting cleavage fragments will have 0 or 1 G, depending on the cleavage mechanism, and, if there is 1 G, that G must occur either at the 3' end or at the 5' end of the fragment, again depending on the cleavage mechanism. Put another way, a fragment cannot contain an internal G because, if it did, that fragment would necessarily have been cleaved at the internal G. Thus, if the cleavage chemistry leaves a G at one end of every G cleavage fragment, the mass of G can be subtracted from the mass of each fragment and the resulting masses can be compared. The same can be done with A, C and T. Table 4 shows the masses of all 2mers to 7mers that lack one nucleotide. This calculation has been made for polynucleotides of up to 30mers, and it has been shown that there are only 8 sets of isobaric oligonucleotides (oligonucleotides with masses within 0.01% of each other) below a mass of 5000 Da. The eight sets of isobaric oligonucleotides are shown in Table 3B. Inspection of Table 3B reveals that every set except Set 2 involves a polynucleotide with multiple G residues. Therefore, G cleavage would eliminate all isobaric masses except one pair, d(T8) versus d(C3A5), which could not be resolved by mass spectrometry at a resolution of 0.01%. However, cleavage at either C or A would eliminate the latter polynucleotide. Table 4 shows that cleavage at A or T consistently produces fragments with larger mass differences between the closest possible cleavage fragments. Cleavage at A produces mass differences of 5, 10, 15, 20 or 25 Da between the closest fragments, while cleavage at T produces mass differences of 8, 18 or 24 Da, although at the cost of a few more isobaric fragments.
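The point that complete cleavage leaves no internal cleavage nucleotide, and that a second cleavage specificity can resolve an isobaric pair, can be sketched as follows. The sketch assumes nucleotide residue masses rounded to the nearest whole number (A 313, C 289, G 329, T 304 Da) and a mechanism that cuts 3' of the cleavage nucleotide; the example sequence and the function names are hypothetical.

```python
# Assumed residue masses (Da), rounded to whole numbers for illustration.
MASS = {"A": 313, "C": 289, "G": 329, "T": 304}

def cleave(seq, base):
    """Cut 3' of every occurrence of `base` (assumed mechanism), so that
    the cleavage nucleotide remains at the 3' end of its fragment."""
    fragments, current = [], ""
    for nt in seq:
        current += nt
        if nt == base:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments

def mass(fragment):
    """Sum of rounded residue masses; end-group corrections are ignored."""
    return sum(MASS[nt] for nt in fragment)

# Hypothetical analyte: no G cleavage fragment contains an internal G.
g_fragments = cleave("ACGTTGGCATGA", "G")
no_internal_g = all("G" not in f[:-1] for f in g_fragments)

# d(T8) and d(C3A5) are isobaric at this rounding, but cleavage at A
# breaks up d(C3A5) while leaving d(T8) intact.
isobaric = mass("TTTTTTTT") == mass("CCCAAAAA")
a_fragments = cleave("CCCAAAAA", "A")
```

A mechanism that instead leaves the cleavage nucleotide at the 5' end, or removes it altogether, would change the fragment boundaries but not the conclusion about internal occurrences.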
Table 2. Panel A. The masses of the four deoxynucleotide residues are shown at the top, and the calculated molecular weight differences between each pair of nucleotide residues are shown in the table. Note that chemically modified nucleotides will generally have masses different from those shown above for the natural nucleotides. The difference in mass between a particular modified nucleotide and the other nucleotides will depend on the modification. See the description of the cleavage mechanisms and of the specific nucleotide modifications for details of the cleavage products. Panel B. The mass differences between the natural nucleotides and 2-chloroadenine (right-most column) are shown. The smallest mass difference is 17.3 Da instead of the 9 Da of Panel A, providing advantageous discrimination of nucleotides by mass spectrometry.
TABLE 3a
Table 3. The masses of all possible compositions of 2mers, 3mers, 4mers and 5mers, in order of mass in Daltons (Da), rounded to the nearest whole number for ease of presentation. (Other nucleotide orders are possible for many of the oligonucleotides.) The column of 5mers is continued on the left side under the 2mers. Note that two 5mers with different nucleotide content have the same mass (AAAAA and CCGGG, shaded at the lower right, both weigh 1504). Molecular masses are provided; ionization will change the masses. More generally, these masses are illustrative; the actual masses will differ depending on the chemical modification, the cleavage mechanism and the ionization polarity.
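The composition count and the single mass coincidence noted in the legend of Table 3 can be checked with a short enumeration. This is a sketch under stated assumptions: residue masses are rounded to whole numbers (A 313, C 289, G 329, T 304 Da), and 61 Da is subtracted from each oligomer following the convention stated in the legend of Table 4; that the same convention applies to Table 3 is an assumption.

```python
from itertools import combinations_with_replacement
from collections import defaultdict

# Assumed rounded residue masses (Da) and phosphate correction.
MASS = {"A": 313, "C": 289, "G": 329, "T": 304}
PHOSPHATE = 61

def composition_masses(min_len=2, max_len=5):
    """Group every nucleotide composition (order ignored) by its mass."""
    by_mass = defaultdict(list)
    for n in range(min_len, max_len + 1):
        for combo in combinations_with_replacement("ACGT", n):
            composition = "".join(combo)
            m = sum(MASS[b] for b in composition) - PHOSPHATE
            by_mass[m].append(composition)
    return by_mass

by_mass = composition_masses()
total = sum(len(comps) for comps in by_mass.values())
collisions = {m: comps for m, comps in by_mass.items() if len(comps) > 1}

print(total)       # number of distinct compositions from 2mer to 5mer
print(collisions)  # masses shared by more than one composition
```

With exact rather than rounded masses the two coincident 5mers differ slightly, as Table 3B indicates, so the coincidence is a matter of instrument resolution rather than strict identity.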
Thus, for a given target analyte polynucleotide whose sequence is known, it is possible to determine whether cleavage at one or more of the nucleotides would produce any of the confounding artifacts described above and then, by judicious selection of the experimental conditions, to avoid or resolve them. Based on the above analysis, it can be seen that any difference in nucleotide sequence between two or more similar polynucleotides from different members of a population will result in a difference in the pattern of fragments obtained by cleavage of the polynucleotides and, therefore, a difference in the masses observed in the mass spectrum. Each variance will result in two mass changes: the disappearance of one mass and the appearance of a new mass. In addition, if a double-stranded polynucleotide is being analyzed, or if two strands are being analyzed separately, the variance will result in a change in mass in each of the two complementary strands of the target DNA, giving four mass changes in total (the disappearance of one mass and the appearance of another in each strand). The presence of a second strand showing mass changes provides a useful internal corroboration of the presence of a variance. Further, the sets of mass changes in fragments from complementary strands can provide additional information about the nature of the variance. Figures 27 to 30 exemplify the detection of a mass difference in both strands of a polynucleotide after complete substitution with, and cleavage at, modified dA, for a variant position in the transferrin receptor gene. Table 5 shows the sets of mass changes expected in complementary strands for all possible point mutations (transitions and transversions).
Once the mass spectrum is obtained, it will be immediately apparent whether the variance was an addition of one or more nucleotides to a fragment (an increase of approximately 300+ amu in the fragment mass), a deletion of one or more nucleotides from a fragment (a decrease of approximately 300+ amu in the fragment mass) or a substitution of one or more nucleotides for one or more other nucleotides (the differences are shown in Table 5). In addition, if the variance is a substitution, the exact nature of that substitution can also be determined.
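The reasoning above, in which a single substitution causes one fragment mass to disappear and one new mass to appear, can be sketched for one strand as follows. The sketch assumes rounded residue masses (A 313, C 289, G 329, T 304 Da) and cleavage 3' of G; the reference and variant sequences, the function names and the notation "A>C" are hypothetical illustrations, not the contents of Table 5.

```python
from collections import Counter

# Assumed rounded residue masses (Da).
MASS = {"A": 313, "C": 289, "G": 329, "T": 304}

# Mass shift expected for each single-nucleotide substitution, e.g. -24: "A>C".
SUBSTITUTION_DELTAS = {
    MASS[y] - MASS[x]: f"{x}>{y}" for x in MASS for y in MASS if x != y
}

def cleave(seq, base):
    """Cut 3' of every occurrence of `base` (assumed mechanism)."""
    fragments, current = [], ""
    for nt in seq:
        current += nt
        if nt == base:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments

def mass_changes(ref_seq, var_seq, base):
    """Return (masses that disappeared, masses that appeared)."""
    ref = Counter(sum(MASS[n] for n in f) for f in cleave(ref_seq, base))
    var = Counter(sum(MASS[n] for n in f) for f in cleave(var_seq, base))
    return sorted((ref - var).elements()), sorted((var - ref).elements())

def classify(lost, gained):
    """Name the substitution when exactly one mass is lost and one gained."""
    if len(lost) == 1 and len(gained) == 1:
        return SUBSTITUTION_DELTAS.get(gained[0] - lost[0])
    return None

lost, gained = mass_changes("ACGTTGGCATGA", "ACGTTGGCCTGA", "G")
call = classify(lost, gained)
```

A substitution that creates or destroys a cleavage site changes more than two masses, and an insertion or deletion shifts a mass by roughly one residue, so this simple classifier returns `None` in those cases.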
TABLE 3b
Set      Polynucleotide   Mass
Set 1    d(C2G3)          1566.016
         d(A5)            1566.068
Set 2    d(C5G3)          2433.584
         d(T8)            2433.603
         d(C3A5)          2433.636
Set 3    d(A?G7)          2617.707
         d(C?T1)          2617.711
Set 4    d(C?oT?)         3196.090
         d(G?o)           3196.137
Set 5    d(C6T1A4)        3292.134
         d(C13)           3292.190
Set 6    d(C13)           3759.457
         d(T7A1G4)        3759.472
Set 7    d(C5T9)          4183.751
         d(A6G7)          4183.779
Set 8    d(T7G7)          4433.899
         d(C11A4)         4433.936

TABLE 4 (part 1)
Columns: Cleavage at G, Cleavage at C, Cleavage at A, Cleavage at T
Table 4 (part 1 of 2). The masses resulting from cleavage of oligonucleotides at specific nucleotides (G, C, A or T, as indicated). Cleavage at G will produce fragments without internal G residues; depending on the cleavage mechanism, there may be a G at the 5' or 3' end of the cleaved fragment. In this table, the G has been omitted from the G cleavage fragments for ease of presentation (each fragment could therefore be considered one nucleotide longer); note that adding a G to each of the G cleavage fragments would have no effect on the mass differences between the fragments (Δ mass). Similar considerations apply to the C, A and T cleavage fragments. Two 5mers with the same T cleavage mass are shaded. The masses were calculated by adding nucleotide masses rounded to the nearest whole number (and are therefore not exact, but the pattern of the results is not affected); 61 Daltons, the mass of a phosphate group, is subtracted from all fragments, since most cleavage mechanisms will result in the removal of a phosphate group.
TABLE 4 (part 2)
Table 4 (part 2 of 2). The masses resulting from cleavage of oligonucleotides at specific nucleotides (G, C, A or T, as indicated). See the legend of part 1 of this Table. Note that the coincidence between the two 5mers with the same T cleavage mass (part 1) continues to propagate through the T cleavage masses (shaded).
E. Serial cleavage
The foregoing discussion focuses primarily on the use of a single cleavage reaction with any given modified polynucleotide. However, it is also possible, and is another aspect of this invention, to serially cleave a polynucleotide in which two or more natural nucleotides have been replaced with two or more modified nucleotides having different cleavage characteristics. That is, a polynucleotide containing two or more types of modified nucleotides, either fully or partially substituted, can be cleaved by serial exposure to different cleavage conditions, whether chemical, physical or both. A preferred embodiment of this approach is tandem mass spectrometry, in which the fragmented molecular species produced by one procedure can be retained in a suitable mass spectrometer (e.g., a Fourier transform ion cyclotron resonance mass spectrometer or an ion trap mass spectrometer) for subsequent exposure to a second physical and/or chemical procedure that results in activation and cleavage at a second modified nucleotide. The product ions can be subjected to a third and even a fourth cleavage condition, directed at specific modifications on a third and a fourth nucleotide, which allows observation of the precursor-product relationship between the ions that enter (precursors) and those that are generated during each cycle of cleavage. A stepped or continuous gradient of cleavage conditions of increasing efficiency can be used to enhance the elucidation of the precursor-product relationships between the ions. Producing a polynucleotide that contains several modified nucleotides reduces the need to perform several polymerizations on the same template to produce a group of polynucleotides each having a different single modified nucleotide; that is, one for cleavage at A, one for G, one for T and one for C. Also, the serial application of cleavage procedures specific for different nucleotides of a single polynucleotide improves the detection of precursor-product relationships, which is useful for determining DNA sequence. Figure 21 shows the production of a modified polynucleotide by complete replacement of dGTP with riboGTP and of dTTP with 5'-amino-TTP, followed by cleavage with base, which results in cleavage at G, or cleavage with acid, which results in cleavage at T. Subsequent acid treatment of the base-cleaved fragments, or vice versa, causes further fragmentation into doubly cleaved (G and T) fragments. This would be useful, for example and without limitation, for identifying a variance at position 27 (dA) of the sequence (Figure 21). That is, as can be seen in Figure 21, cleavage at G alone produces the fragment ACTTCACCG (position 27 is highlighted), which contains two dA residues.
A change of -24 Da in the mass of this fragment, which indicates an A to C change, does not reveal which of the two dA residues changed to dC. Similarly, cleavage at T alone gives the fragment TCACCGGCACCA, which contains three dA residues and likewise does not reveal which dA changed. However, double cleavage at G and T produces the fragment TCACCG, which undergoes the -24 Da mass change and, because it contains only one dA, allows definitive assignment of the variant nucleotide. Schemes that employ this approach to precisely detect variations at other nucleotides will be apparent to those skilled in the art based on the disclosures herein and fall within the scope of the present invention. Another aspect of this invention is the algorithm or algorithms that allow the use of computers to directly infer DNA sequence, or the presence of variations, from mass spectrometry data.
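The logic of serial cleavage can be illustrated with a minimal sketch. The three fragments quoted above are reproduced from a short sequence whose flanking bases are hypothetical (the full sequence of Figure 21 is not reproduced here). The cut polarity (3' of ribo-G for base, 5' of 5'-amino-T for acid) is an assumption inferred from those fragments, and `cleave` is an illustrative helper rather than part of the disclosed method.

```python
def cleave(fragments, base, side):
    """Cut each fragment at every occurrence of `base`.

    side="3prime": cut after the base (the fragment ends with it).
    side="5prime": cut before the base (the fragment starts with it).
    """
    out = []
    for frag in fragments:
        current = ""
        for nt in frag:
            if nt == base and side == "5prime" and current:
                out.append(current)
                current = ""
            current += nt
            if nt == base and side == "3prime":
                out.append(current)
                current = ""
        if current:
            out.append(current)
    return out


# Hypothetical context sequence; only the internal fragments come from the text.
seq = "GACTTCACCGGCACCAT"

g_only = cleave([seq], "G", "3prime")   # base cleavage at ribo-G (assumed polarity)
t_only = cleave([seq], "T", "5prime")   # acid cleavage at 5'-amino-T (assumed polarity)
double = cleave(g_only, "T", "5prime")  # serial cleavage: base first, then acid

# Fragments with exactly one dA localize an A-to-C change unambiguously.
single_a = [f for f in double if f.count("A") == 1]
```

In this sketch the G-only and T-only digests contain ACTTCACCG and TCACCGGCACCA respectively, while the double digest contains TCACCG, the only one of the three with a single dA.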
F. Parallel cleavage

In the same way, it is possible, and is another aspect of this invention, for a polynucleotide substituted with two or more modified nucleotides, each susceptible to a different cleavage procedure, to be analyzed in parallel. That is, the polynucleotide can be divided into aliquots and each aliquot exposed to a cleavage procedure specific for one of the modified nucleotides. This avoids the effort of carrying out independent polymerization reactions for each modified nucleotide. This approach can be used to generate sequence ladders or to generate complete cleavage products for variation detection. As discussed in Example 5, complete cleavage at two different nucleotides (performed independently), followed by mass spectrometry, greatly increases the efficiency of variation detection compared with cleavage at a single nucleotide. For example, consider a single polynucleotide substituted with ribo-A, 5'-amino-C and 5'-(bridging) thio-G nucleotides. All three modified nucleotides are known to be incorporated by polymerases. Sequence ladders can be produced from such a modified polynucleotide by exposing one aliquot to acid, which results in cleavage at C; a second aliquot to base, which results in cleavage at A; and a third aliquot to silver or mercury salts, which results in cleavage at G. A polynucleotide produced with the above three modified nucleotides plus 4'-C-acyl T could also be (separately) exposed to UV light to produce cleavage at T, resulting in a complete set of sequencing reactions from a single polymerization product.
G. Combination of modified nucleotide cleavage and chain termination

Another application of modified nucleotide incorporation and cleavage is to combine it with a chain termination procedure. By incorporating one or more modified nucleotides (e.g., without limitation, modified A) in a polymerization procedure together with a different chain-terminating nucleotide, for example dideoxy-G, a Sanger-type ladder of fragments ending at the dideoxy nucleotide can be generated. Subsequent exposure of this fragment ladder to a chemical that cleaves at the modified A will result in further fragmentation, with the resulting fragments bounded at the 5' end by A and at the 3' end by either A (most of the time) or G (in one fragment per chain termination product). Comparing the resulting fragment set with the set produced by substitution and cleavage at the modified nucleotide (A) alone is instructive: all fragments will be the same except for additional fragments in the chain termination set that end in 3' G. Mass spectrometric analysis of these provides the mass (and by inference the nucleotide content) of every fragment in which A is followed (directly or after some interval) by G without an intervening A. Deriving similar data with other chain-terminating nucleotides and other cleavage nucleotides will cumulatively provide a set of data useful for determining the sequence of polymerization products.
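The comparison described above can be sketched in a few lines. The example sequence is hypothetical, cleavage is assumed to fall immediately 5' of each modified A, and `cleave_before` and `extra_fragments` are illustrative names, not part of the disclosed method.

```python
def cleave_before(seq, base):
    """Split seq immediately 5' of every occurrence of base (assumed polarity)."""
    frags, current = [], ""
    for nt in seq:
        if nt == base and current:
            frags.append(current)
            current = ""
        current += nt
    if current:
        frags.append(current)
    return frags


def extra_fragments(seq, cleave_base="A", term_base="G"):
    """Fragments seen only when chain termination is combined with cleavage."""
    full = set(cleave_before(seq, cleave_base))
    ladder = set()
    for i, nt in enumerate(seq):
        if nt == term_base:
            # Product terminated by the dideoxy nucleotide at position i.
            ladder.update(cleave_before(seq[: i + 1], cleave_base))
    return ladder - full


extended = "ACGTAGGCAT"  # hypothetical extension product
extras = extra_fragments(extended)
```

For this toy sequence the additional fragments are ACG, AG and AGG: each starts at an A and ends at a G with no intervening A, as the text describes.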
H. Substitution of cleavage-resistant modified nucleotides and mass-shifting nucleotides

The foregoing embodiments of this invention relate primarily to the substitution into a polynucleotide of one or more modified nucleotides that increase the susceptibility of the polynucleotide to cleavage at the site or sites of incorporation of the modified nucleotide(s), compared with unmodified nucleotides. However, it is entirely possible, and is another aspect of this invention, that a modified nucleotide, when incorporated into a polynucleotide, reduces the susceptibility to cleavage at its site of incorporation compared with unmodified sites. In this scenario, cleavage would then occur at unmodified sites in the polynucleotide. Alternatively, a combination of cleavage-resistant and cleavage-sensitive modified nucleotides could be incorporated into the same polynucleotide to maximize the differential between cleavable and non-cleavable sites. An example of a modified nucleotide that imparts this type of cleavage resistance is the 2'-fluoro derivative of any natural nucleotide. The 2'-fluoro derivative has been shown to be significantly less susceptible to fragmentation in a mass spectrometer than unsubstituted natural nucleotides.

As shown in Table 2, the mass differences between the naturally occurring nucleotides range from 9 to 40 Da and are sufficient to resolve single-nucleotide differences in all but the larger fragments. However, it would be desirable to increase the mass difference between the four nucleotides, or between any pair of nucleotides, to simplify their detection by mass spectrometry. This is illustrated in Table 2 for dA and its analog 2-chloroadenine.
That is, substitution with 2-chloroadenine, mass 347.7, increases the A-T mass difference from 9 Da to 42.3 Da, the A-C difference from 24 to 57.3 Da and the A-G difference from 16 to 17.3 Da. Other mass-shifting nucleotide analogs are known in the art, and it is an aspect of this invention that they can be used to benefit the mass spectrometric methods of this invention.
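The three figures quoted above are mutually consistent, as a short check shows. The only inputs are the differences given in the text; the sign convention (G is heavier than A, while T and C are lighter) and the variable names are assumptions of this sketch.

```python
# Signed mass of dA minus each other natural nucleotide, in Da (from the text).
# G is heavier than A, so the A-G difference is negative in this convention.
natural_diff = {"T": 9.0, "C": 24.0, "G": -16.0}

# Mass increase of the analog over dA implied by the quoted A-T figures.
shift = 42.3 - 9.0

# Signed mass of the 2-chloro analog minus each other nucleotide.
analog_diff = {nt: round(d + shift, 1) for nt, d in natural_diff.items()}
```

The same shift of about 33.3 Da reproduces all three quoted values. The A-G gap barely grows because the analog moves from 16 Da below G to 17.3 Da above it.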
I. Applications

Various applications of the methods of the present invention are described below. It is understood that these descriptions are exemplary only and are not intended, nor are they to be construed, as limiting the scope of this invention in any way. Thus, other applications of the methods described herein will be apparent to those skilled in the art based on the present disclosure; such applications are within the scope of this invention.

a. Total substitution, total extension and complete cleavage

In one aspect of the present invention, at least one of the four nucleotides of which the target polynucleotide is composed is completely replaced with a modified nucleotide (either on one strand by primer extension or on both strands by a DNA amplification procedure), a full-length polynucleotide is made and substantially complete cleavage is carried out. The result is cleavage of the modified polynucleotide into fragments with an average length of about four nucleotides. This is because the abundance of A, T, G and C nucleotides is approximately equal in most genomes and their distribution is semi-random. Therefore, a particular nucleotide appears about once every four nucleotides in a natural polynucleotide sequence. There will, of course, be a size distribution, with considerable deviation from the average size, owing to the non-random nature of biological polynucleotide sequences and the unequal proportions of A:T and G:C base pairs in different genomes. The extended primer (whether from primer extension or amplification) will not be cleaved until the first occurrence of a modified nucleotide after the end of the primer, which results in fragments greater than 15 nt (i.e., greater than the length of the primer). Often, these primer-containing fragments will be the longest, or among the longest, produced.
This can be an advantage in the design of genotyping assays. That is, primers can be designed so that the first occurrence of a polymorphic nucleotide position is after the primer. After cleavage, the genotype can be determined from the length of the primer-containing fragment. This is illustrated in Figures 27-32. Because of this variation in analyte mass, it is essential that the mass spectrometer be able to detect polynucleotides ranging up to 20mers or even 30mers, with a level of resolution and mass accuracy consistent with unambiguous determination of the nucleotide content of each mass. As discussed below, this requirement has different implications depending on whether the polynucleotide sequence of the analyte is already known (as will generally be the case with variation detection or genotyping) or not (as will be the case with de novo DNA sequencing).

i. Applications for variation detection

Variation detection is usually carried out on an analyte DNA or cDNA sequence for which at least one reference sequence is available. The aim of variation detection is to examine a set of corresponding sequences from different individuals (sample sequences) in order to identify sequence differences between the reference and the sample sequences or among the sample sequences. Such sequence variations will be identified and characterized by the existence of different masses among the cleaved sample polynucleotides. Depending on the scope of the variation detection procedure, analyte fragments of different lengths may be optimal. For genotyping, it is desirable that one primer be near the known variant site.
Usually, an analyte fragment of at least 50 nucleotides, more preferably at least 100 nucleotides and even more preferably at least 200 nucleotides is produced by polymerase incorporation of modified nucleotides (either A, G, C or T), followed by cleavage at the sites of incorporation of the modified nucleotides and mass spectrometric analysis of the resulting products. Given the frequency of nucleotide variations (estimated between one in 200 and one in 100 nucleotides in the human genome), there will usually be zero or only one or two cleavage fragments that differ between any two samples. The fragments that differ between samples can range in size from a monomer to a 10mer, less frequently up to a 20mer, or rarely a fragment of even greater length; however, as noted above, the average cleavage fragment will be approximately 4 nucleotides. Information about the reference sequence can be used to avoid cleavage schemes that would generate very large cleavage products and, more generally, to increase the sensitivity of detection of any sequence variation that may exist among the samples, by calculating the efficiency of variation detection at each nucleotide position for all possible cleavage schemes, as described below. However, large fragments do not really present a problem when a reference sequence is available and the analyte is only several hundred nucleotides long. This is because it is very unlikely that any analyte will contain two large cleavage fragments whose masses are very close. In general, if there are only a few large fragments, they can be easily identified and, as shown in Table 5, even with a MALDI instrument with a mass resolution of only 1000, the most difficult substitution, an A <-> T change that results in a shift of 9 amu (atomic mass units), can be detected in a 27mer.
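The 27mer figure can be rationalized with a back-of-the-envelope sketch. It assumes that two species are resolved when their mass difference is at least m/R, where R is the FWHM resolution, and it assumes an average mass of about 330 Da per nucleotide. Both the criterion and the 330 Da value are assumptions made for illustration; Table 5 may have been computed with different values.

```python
def max_resolvable_length(delta_da, resolution, avg_residue_mass=330.0):
    """Largest fragment (nt) in which a mass shift of delta_da is resolvable.

    Resolution R = m / delta_m, so the heaviest resolvable ion has
    m = delta_da * R. avg_residue_mass is an assumed per-nucleotide mass.
    """
    max_mass = delta_da * resolution
    return int(max_mass // avg_residue_mass)


# A <-> T substitution (9 Da) on an instrument with resolution 1000.
a_t_limit = max_resolvable_length(9, 1000)
```

Under these assumptions the 9 Da A <-> T shift is resolvable up to 27 nt at a resolution of 1000, and the limit scales linearly with both the mass shift and the resolution.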
Table 5. This table summarizes the relationship between mass spectrometer resolution and nucleotide changes in determining the maximum fragment size in which a given base change can be identified. The maximum size of DNA fragment (in nucleotides; nt) in which a base substitution can theoretically be resolved is given in the four columns on the right (last 6 rows) for each possible nucleotide substitution, listed in the left column. As is evident from the table, the mass difference generated by each substitution (Δ, measured in Daltons) and the resolving power of the mass spectrometer determine the size limit of fragments that can be successfully analyzed. Commercially available MALDI instruments can resolve between 1 part in 1,000 and 1 part in 5,000 (FWHM), while available ESI instruments can resolve 1 part in 10,000. Modified ESI MS instruments have at least 10-fold greater mass resolution capability. (The theoretical resolution numbers in the table do not take into account limitations on actual resolution imposed by the isotopic heterogeneity of molecular species and the technical difficulty of efficiently obtaining large ions.) FWHM: full width at half maximum height, a standard measure of mass resolution. (For more information see, for example: Siuzdak, G., Mass Spectrometry for Biotechnology, Academic Press, San Diego, 1996.)

In order to select experimental conditions for variation detection that maximize the probability of success, the reference sequence can be used to predict the fragments that would be produced by cleavage at A, G, C or T in advance of experimental work. Based on that analysis, the optimal modified nucleotide substitution and cleavage scheme can be selected for each DNA or cDNA sequence to be analyzed.
Such an analysis can be carried out as follows:
• For each nucleotide of the test polynucleotide, substitute each of the other three possible nucleotides and generate the associated mass change. For example, if the test polynucleotide begins with A at position 1, generate hypothetical polynucleotides beginning with T, G and C. Then move to position 2 of the test sequence and again make the three possible substitutions, and so on for all positions of the test polynucleotide. If the test polynucleotide is 100 nucleotides long, this procedure will generate a total of 300 new hypothetical fragments on one strand and 300 more on the complementary strand. Each set of three substitutions can then be analyzed together.
• Generate the masses that would be produced by cleaving at T, C, G or A each of the three hypothetical test fragments obtained by substituting T, C or G for A at position 1. Compare these sets of masses with the set of masses obtained from the reference sequence (which in the example has A at position 1). For each of the four cleavages (T, C, G, A), determine whether the disappearance of an existing mass or the generation of a new mass would produce a difference in the total set of masses. If a difference occurs, determine whether it is a single difference or two differences (i.e., the disappearance of one mass and the appearance of another). Also determine the magnitude of the mass difference relative to the set of masses generated by cleavage of the reference sequence. Carry out this same analysis for each of the 100 positions of the test sequence, in each case examining the consequences of each of the four possible base-specific cleavages, i.e., for DNA, at A, C, G and T.
• Generate a correlation score for each of the four possible base-specific cleavages.
The correlation score increases in proportion to the fraction of the 300 possible deviations from the reference sequence that produce one or more mass changes (with a higher score for two mass differences) and in proportion to the magnitude of the mass differences (larger mass differences score higher than small ones).
• In the case of primer extension, the analysis is performed for one strand; in the case of amplification, the calculation is carried out on the cleavage products of both strands.

The aforementioned method can be extended to the use of combinations of substitutions and cleavages. For example, cleavage at T on each strand of the analyte polynucleotide (either independent or simultaneous cleavage of the two strands at T), or cleavage at T and A on one strand (again, either independent or simultaneous cleavage), or cleavage of one strand at T and of the complementary strand at A, and so on. Based on the correlation scores generated for each of the different schemes, an optimal scheme can be determined in advance of experimental work.

A computer program can be developed to carry out the above task. This program can also be extended to encompass the analysis of experimental cleavage masses. That is, the program can be designed to compare all masses in the experimentally determined mass spectrum with the cleavage masses expected from cleavage of the reference sequence and to flag any new or missing masses. If there are new or missing masses, the set of experimental masses can be compared with the masses generated in the computational analysis of all possible nucleotide substitutions, insertions or deletions associated with the experimental cleavage conditions. However, nucleotide substitutions are about ten times more common than insertions or deletions, so an analysis of substitutions alone should be useful.
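The procedure outlined in the bullets above can be sketched as a small program. This is a simplified illustration, not the inventors' program: it scores each base-specific cleavage only by the fraction of single-base substitutions that change the set of fragment masses, and it omits the weighting by number and magnitude of mass differences. The residue masses are standard average values for nucleotides within a chain (not taken from Table 2), end groups are ignored, and cleavage is assumed to fall 5' of the modified nucleotide.

```python
# Assumed average masses (Da) of deoxynucleotide residues within a chain.
MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def mass_set(seq, base):
    """Set of fragment masses after complete cleavage 5' of every `base`."""
    frags, current = [], ""
    for nt in seq:
        if nt == base and current:
            frags.append(current)
            current = ""
        current += nt
    if current:
        frags.append(current)
    return {round(sum(MASS[n] for n in f), 2) for f in frags}


def detection_fraction(ref, base):
    """Fraction of all single-base substitutions that alter the mass set."""
    ref_masses = mass_set(ref, base)
    detected = total = 0
    for i, original in enumerate(ref):
        for sub in "ACGT":
            if sub == original:
                continue
            total += 1
            variant = ref[:i] + sub + ref[i + 1:]
            if mass_set(variant, base) != ref_masses:
                detected += 1
    return detected / total


reference = "TATATC"  # hypothetical reference sequence
scores = {base: detection_fraction(reference, base) for base in "ACGT"}
```

For this toy reference, cleavage at T detects 16 of the 18 possible substitutions. The two misses are the A to C changes that convert one TA fragment into a second TC fragment, which leaves the set of observed masses unchanged.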
In one embodiment, the computational analysis data for all possible nucleotide insertions, deletions and substitutions can be stored in a look-up table. The set of computed masses that matches the experimental data then provides the sequence of the new variant, or at least the restricted set of possible sequences of the new variant. (The location and chemical nature of a substitution may not be uniquely specified by one cleavage experiment.) Resolving all ambiguities regarding the nucleotide sequence of a variant sample may in some cases require another substitution and cleavage experiment (see Section E, Serial cleavage, and the DNA sequencing applications described below), or the ambiguities may be resolved by some other sequencing method (e.g., conventional sequencing methods or sequencing by hybridization). It may be desirable to routinely perform multiple substitution and cleavage experiments on all samples, to maximize the fraction of variations that can be precisely assigned to a specific nucleotide.

The inventors have performed a computational analysis of natural polynucleotides of 50, 100, 150, 200 and 250 nucleotides and have found that combinations of two nucleotide cleavages (e.g., cleaving at A on one strand and at G on the complementary strand) result in 99 to 100% detection efficiency, considering all possible substitutions up to 250 nt. Potentially useful assays, although sometimes with less than 100% sensitivity, can be performed on larger fragments of up to 1000 nt. See Example 5 for the details of this analysis.
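The look-up table embodiment can be sketched as follows. This is a minimal illustration restricted to substitutions (insertions and deletions are omitted). It uses assumed standard average residue masses, ignores end groups and assumes cleavage 5' of the modified nucleotide; `build_lookup` and the example sequence are hypothetical.

```python
from collections import defaultdict

# Assumed average masses (Da) of deoxynucleotide residues within a chain.
MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def cleavage_masses(seq, base):
    """Hashable set of fragment masses after cleavage 5' of every `base`."""
    frags, current = [], ""
    for nt in seq:
        if nt == base and current:
            frags.append(current)
            current = ""
        current += nt
    if current:
        frags.append(current)
    return frozenset(round(sum(MASS[n] for n in f), 2) for f in frags)


def build_lookup(ref, base):
    """Map each predicted mass set to the substitutions that would produce it."""
    table = defaultdict(list)
    for i, original in enumerate(ref):
        for sub in "ACGT":
            if sub != original:
                variant = ref[:i] + sub + ref[i + 1:]
                # Positions are reported 1-based, as in the text.
                table[cleavage_masses(variant, base)].append((i + 1, original, sub))
    return table


reference = "TATATC"  # hypothetical reference sequence
lookup = build_lookup(reference, "T")

# Simulate an experiment on a sample carrying A -> G at position 2.
observed = cleavage_masses("TGTATC", "T")
candidates = lookup[observed]
```

Here the observed mass set matches two entries, A to G at position 2 and A to G at position 4. This illustrates why a single cleavage experiment may restrict, but not uniquely specify, the variant position.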
ii. Applications to DNA sequencing

Yet another aspect of this invention uses the chemical methods disclosed herein together with mass spectrometry to determine de novo the complete nucleotide sequence of a polynucleotide. The procedure involves the same reactions described above for variation detection; that is, complete replacement of one of the four nucleotides in a polynucleotide with a modified nucleotide, followed by substantially complete cleavage of the modified polynucleotide at each and every occurrence of the modified nucleotide, and then determination of the masses of the fragments obtained. In this case, however, it may be necessary to routinely perform four sets of cleavage reactions, with a different natural nucleotide replaced by a modified nucleotide in each reaction, so that all four natural nucleotides are in turn replaced with modified nucleotides, the resulting modified polynucleotides are cleaved and the masses of the cleavage products are determined. It may also be necessary to employ one or more multiple nucleotide substitutions, as discussed above, to resolve sequencing ambiguities that may arise. While the number of reactions required for each sequence determination experiment is thus similar to that required for Maxam-Gilbert or Sanger sequencing, the method of this invention has the advantages of eliminating radiolabels or dyes, providing greater speed and accuracy, permitting automation and eliminating artifacts, including compressions, associated with Maxam-Gilbert and Sanger sequencing or any other gel-based method. This last consideration can be of great importance, since mass spectrometry currently allows cleavage reactions to be analyzed in seconds or minutes (and, in the future, in milliseconds), compared with the hours required by gel electrophoresis procedures.
In addition, the inherent accuracy of mass spectrometry, together with the control over construction of the modified polynucleotide that can be achieved using the methods of this invention, will greatly reduce the need for sequencing redundancy. A representative complete sequencing experiment is discussed in the Examples section, below.

The process of inferring DNA sequence from the pattern of masses obtained by cleavage of analyte molecules is considerably more complicated than the process of detecting sequence variations and inferring their chemical nature. In the case of sequencing by complete cleavage and mass analysis, the following should be done:
• Determine the length of the sequence. From the experimentally determined masses, infer the nucleotide content of each cleavage fragment in the manner described elsewhere herein. This analysis is performed for each of the four sets of experimental cleavage masses. The shortcoming of this analysis is that two or more fragments (particularly short ones) can have identical masses and therefore be counted as one, which leads to an erroneous count of the length of the sequence. However, this is not a serious experimental problem, because the fragment masses can be summed and compared across the four cleavages; if the sums do not agree, there must be two or more overlapping masses among the fragments. Thus, determining all fragment masses in the four cleavage reactions essentially eliminates this source of error. First, the set of cleavage masses that gives the greatest length can be taken as a starting point. Then the nucleotide content of all the masses in the other three cleavage reactions can be tested for compatibility with the nucleotide content of the masses associated with the cleavage set of greatest length.
If they are not compatible, then there must be an undercount even in the set associated with the greatest length. Comparison of the sequence contents will usually allow the uncounted bases to be identified and the total length of the sequence to be determined.
• The next aspect of the analysis may include: (a) determining the intervals at which the A, C, G and T nucleotides occur, based on the sizes of the respective cleavage products; (b) analyzing the nucleotide content of the largest fragments of each cleavage set to identify sets of nucleotides that must be together; (c) comparing the nucleotide content of fragments between the different sets to determine which fragments are compatible (i.e., one could be included in the other or they could overlap) or incompatible (no nucleotides in common); (d) beginning to integrate the results of these different analyses to restrict the number of ways in which the fragments can be joined. The elimination of possibilities is as useful as the identification of possible relationships. A detailed illustration of the logic required to resolve the sequence of a short oligonucleotide is given in Example 4.

One way to provide additional information about local sequence relationships is to reduce the degree of nucleotide substitution or the completeness of cleavage (see below) in order to obtain sets of incompletely (but still substantially) cleaved fragments. Mass analysis of these fragments can be quite useful, in combination with the sets of completely cleaved fragments, for identifying fragments that are adjacent to each other. Only a limited amount of such information is needed to complete the entire puzzle of assembling the cleavage fragments into a continuous sequence.
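Inferring nucleotide content from a fragment mass, the first step of the sequencing procedure above, can be sketched by brute-force enumeration. The residue masses are assumed standard average values (not taken from Table 2), end groups are ignored, and `compositions`, `max_len` and `tol` are illustrative names and parameters.

```python
from itertools import product

# Assumed average masses (Da) of deoxynucleotide residues within a chain.
MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def compositions(mass, max_len=10, tol=0.5):
    """All base compositions (up to max_len nt) within tol Da of `mass`."""
    hits = []
    for a, c, g, t in product(range(max_len + 1), repeat=4):
        if 0 < a + c + g + t <= max_len:
            m = a * MASS["A"] + c * MASS["C"] + g * MASS["G"] + t * MASS["T"]
            if abs(m - mass) <= tol:
                hits.append({"A": a, "C": c, "G": g, "T": t})
    return hits


# Mass of the fragment TCACCG under the assumed residue masses.
target = sum(MASS[n] for n in "TCACCG")
unique = compositions(target, max_len=8, tol=0.5)
loose = compositions(target, max_len=8, tol=1.0)
```

With a 0.5 Da tolerance the mass maps to a single composition (one A, three C, one G, one T). Loosening the tolerance to 1.0 Da admits two further compositions that differ by 0.98 Da, which shows why mass accuracy governs the reliability of this step.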
Three additional approaches to facilitating DNA sequence inference from the analysis of complete substitution and cleavage masses are: (a) dinucleotide cleavage mass analysis (see below), which can provide a framework for grouping the small masses associated with mononucleotide substitution and cleavage into a smaller number of intermediate-size collections. Dinucleotide cleavage also gives the location of dinucleotide sequences at intervals along the entire sequence; in fact, dinucleotide cleavage at all possible dinucleotides is an alternative sequencing method; (b) mononucleotide substitution and cleavage of the complementary strand using one or more modified nucleotides, which can provide valuable complementary information on fragment length and overlap; (c) combined substitution and cleavage schemes employing simultaneous di- and mononucleotide cleavages, or two different simultaneous mononucleotide cleavages, which can provide unambiguous information on sequence order.

In the foregoing descriptions, it has been assumed that the modified nucleotide is selectively more susceptible to chemical cleavage under appropriate conditions than the three unmodified nucleotides. However, an alternative approach to mononucleotide cleavage is to use three modified nucleotides that are resistant to cleavage under physical or chemical conditions sufficient to induce cleavage at an unmodified natural nucleotide. Thus, in another aspect of the present invention, mononucleotide cleavage can be accomplished by selective cleavage at an unmodified nucleotide. One chemical modification of nucleotides that has been shown to make them more stable to fragmentation during mass spectrometric analysis is the 2'-fluoro modification (Ono, T., et al., Nucleic Acids Research, 1997, 25: 4581-4588).
The utility of 2'-fluoro substituted DNA for extending the accessible mass range of Sanger sequencing reactions (which is generally limited by fragmentation) is recognized, but it is an aspect of the present invention that this chemistry also has utility for effecting nucleotide-specific cleavage by fully substituting three modified nucleotides that are resistant to the physical or chemical cleavage procedure. Another chemical modification that has been shown to increase the stability of nucleotides during MALDI-MS analysis is the 7-deaza analog of adenine and guanine (Schneider, K. and Chait, B.T., Nucleic Acids Research, 1995, 23: 1570-1575). In another aspect of this invention, cleavage-resistant modified nucleotides can be used together with cleavage-sensitive modified nucleotides to increase the degree of selectivity in the cleavage step.

iii. Applications in genotyping

As DNA sequence data accumulate from various species, there is a growing demand for accurate, high-throughput, automatable and inexpensive methods for determining the status of a specific nucleotide or nucleotides in a biological sample, where variation at a specific nucleotide (either polymorphism or mutation) has previously been discovered. This procedure (the determination of the nucleotide at a particular location in a DNA sequence) is known as genotyping. Genotyping is in many ways a special case of DNA sequencing (or of variation detection where only one position is interrogated), but the sequence of only one nucleotide position is determined. Because only one nucleotide position need be analyzed, genotyping methods do not entirely overlap with DNA sequencing methods. The methods of this invention provide the basis for novel and useful genotyping procedures. The basis of these methods is the polymerization of a polynucleotide that includes the polymorphic site.
Polymerization can be done either by PCR (polymerase chain reaction) or by primer extension, but preferably by PCR. The polymerization is carried out in the presence of three natural nucleotides and one chemically modified nucleotide, such that the chemically modified nucleotide corresponds to one of the nucleotides at the polymorphic or mutant site. For example, if an A/T polymorphism is to be genotyped, the cleavable nucleotide could be A or T. If a G/A polymorphism is to be genotyped, the cleavable nucleotide could be A or G. Alternatively, the assay could be set up for the complementary strand, where T and C lie opposite A and G. The polymerization product is then chemically cleaved by treatment with acid, base or another cleavage scheme. This results in two products from the two possible alleles, one longer than the other as a consequence of the presence of the cleavable nucleotide at the polymorphic site in one allele but not in the other. There is also a change in mass, but not in length, on the opposite strand. One restriction is that one of the primers used to produce the polynucleotide must be located such that the first occurrence of the cleavable nucleotide after the end of the primer is at the polymorphic site. This usually requires that one of the primers be near the polymorphic site. An alternative method is to simultaneously incorporate two cleavable nucleotides, one for a polymorphic nucleotide on the (+) strand and another for a polymorphic site on the (-) strand. For example, cleavable dA could be incorporated on the (+) strand (to detect an A-G polymorphism) and cleavable dC on the (-) strand (to positively detect the presence of the G allele on the (+) strand). In this case, it would be desirable to have both primers near the variant site.
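The length difference between the two allelic primer-containing products can be illustrated with a minimal sketch. The primer and extension sequences are hypothetical. The primer is assumed to contain only natural nucleotides, so it is not cleaved, and the cut is assumed to fall immediately 5' of the first modified nucleotide.

```python
def primer_fragment_length(primer, downstream, cleavable):
    """Length (nt) of the primer-containing fragment after complete cleavage.

    The fragment runs from the primer's 5' end up to, but not including,
    the first cleavable nucleotide incorporated after the primer.
    """
    first = downstream.find(cleavable)
    if first == -1:
        # No cleavable nucleotide downstream: the product stays full length.
        return len(primer) + len(downstream)
    return len(primer) + first


primer = "GATCCTGGCTCACGA"  # hypothetical 15-nt primer
allele_a = "CCATCGGA"       # hypothetical extension, A at the polymorphic site
allele_g = "CCGTCGGA"       # hypothetical extension, G at the polymorphic site

len_a = primer_fragment_length(primer, allele_a, "A")
len_g = primer_fragment_length(primer, allele_g, "A")
```

With modified A as the cleavable nucleotide, the A allele gives a 17-nt primer fragment and the G allele a 22-nt fragment, so the genotype can be read from fragment length or mass.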
The two allelic products of different sizes can be separated by electrophoretic means, for example, without limitation, capillary electrophoresis. They could also be separated by mass using, without limitation, mass spectrometry. In addition, a FRET (fluorescence resonance energy transfer) assay can be used to detect them, as described below. Any of these three assay formats is compatible with multiplexing by means known in the art.

One way of carrying out FRET detection of the presence or absence of the allelic cleavage product is to introduce a probe bearing a fluor or quencher moiety, such that the probe hybridizes differentially to the cleaved strand (representing one allele) relative to the uncleaved strand (representing the other allele; see Figure 2 for an illustration of several possible schemes). Such differential hybridization is easily achieved because one strand is longer than the other, by at least one and often by several nucleotides. If a fluor or quencher group is also placed on the primer used to produce the cleavable polynucleotide (by PCR or primer extension) so that there is an appropriate FRET interaction between the moiety on the probe and the moiety on the primer, that is, the absorbing and emitting wavelengths of the two moieties and the distance and orientation between them are optimized by methods known to those skilled in the art, then a strong signal will be obtained with one allele but not with the other when the probe and primer are brought to the temperature that gives maximum hybridization discrimination. Ideally, the probe is synthesized so as to take maximum advantage of the different lengths of the cleaved and uncleaved alleles. For example, the probe should hybridize to the region that is removed by cleavage in one allele but is present in the other allele.
When selecting primers for PCR or primer extension, one experimental design consideration is to position the primer so that the difference in length between the two alleles is maximized. Another means of maximizing discrimination is a "molecular beacon" strategy, in which the ends of the probe are complementary to each other and form a stem, except in the presence of the uncleaved allele, where the uncleaved segment is complementary to the stem of the probe and therefore effectively competes with intramolecular stem formation in the probe molecule (Figures 32 and 33). The above FRET methods can be performed in a single tube, for example as follows: (1) PCR; (2) addition of cleavage reagent (and heat if necessary); (3) addition of the probe; and (4) a temperature ramp, if necessary, in an instrument such as the ABI Prism that is capable of excitation and fluorescence detection in 96-well plates. Another way to produce a FRET signal that discriminates between the two variant alleles is to incorporate a nucleotide carrying a dye that interacts with a dye on the primer. The key to achieving differential FRET is that the dye-modified nucleotide must first occur (after the 3' end of the primer) beyond the polymorphic site, so that after cleavage the nucleotide dye of one allele (cleaved) is no longer within the distance of the primer dye required for resonance, whereas in the other allele (uncleaved) the proper distance is maintained and FRET occurs. The only disadvantage of this method is that it requires a purification step to remove unincorporated dye molecules, which can produce a background signal that interferes with FRET detection.
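The dependence of the FRET signal on dye separation can be made concrete with the standard Förster relation, E = 1 / (1 + (r / R0)^6). The sketch below is illustrative only: the Förster radius of 5 nm is a hypothetical value (real values depend on the dye pair), and the helper `dye_separation_nm` assumes dyes spaced along an ideal B-form duplex at about 0.34 nm per base pair, ignoring linker length and helical geometry.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float = 5.0) -> float:
    """Standard Forster relation: E = 1 / (1 + (r / R0)**6).

    forster_radius_nm (R0) is the distance at which transfer is 50% efficient.
    The 5 nm default is a hypothetical value for illustration only.
    """
    ratio = distance_nm / forster_radius_nm
    return 1.0 / (1.0 + ratio ** 6)


RISE_PER_BP_NM = 0.34  # approximate axial rise per base pair in B-form DNA


def dye_separation_nm(n_base_pairs: int) -> float:
    """Rough dye separation along the helix axis (ignores linkers and twist)."""
    return n_base_pairs * RISE_PER_BP_NM


for n_bp in (5, 10, 15, 30):
    r = dye_separation_nm(n_bp)
    print(f"{n_bp:>2} bp  r = {r:4.1f} nm  E = {fret_efficiency(r):.3f}")
```

Because efficiency falls off with the sixth power of distance, a dye that is released by cleavage and diffuses away from the primer dye contributes essentially no signal, while a dye retained on the uncleaved product within a few nanometers transfers efficiently.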
A non-limiting example of the experimental steps involved in practicing this method is: (1) PCR with a dye-labeled primer and either a cleavable modified nucleotide that also carries a dye, or a cleavable modified nucleotide together with a dye-labeled nucleotide. The dye may be on the cleavable nucleotide if the cleavage mechanism results in separation of the dye from the primer, as for example in the case of the 5'-amino substitution, which results in cleavage proximal to the sugar and base of the nucleotide; (2) cleavage at the cleavable nucleotide; (3) purification to remove free nucleotides; and (4) FRET detection. As noted earlier in this discussion, it has been shown that polynucleotides containing 7-nitro-7-deaza-2'-deoxyadenosine in place of 2'-deoxyadenosine can be specifically and completely cleaved using piperidine/TCEP/Tris base. There are many other examples of chemistries in which both PCR amplification and chemical cleavage are possible. In a putative genotyping assay, a PCR reaction is carried out with one cleavable nucleotide analog together with the three other nucleotides. The PCR primers can be designed such that the polymorphic base is close to one of the primers (P) and there is no cleavable base between the primer and the polymorphic base. If the cleavable base is one of the polymorphic bases, the P-containing cleavage product from this allele is expected to be shorter than the product from the other allele. The schematic (Figure 27) and the experimental data (Figures 28 to 31) are examples of this arrangement. If the cleavable base is different from either of the polymorphic bases, the P-containing fragment would have the same length but a different molecular weight for the two alleles.
In this case, mass spectrometry would be the preferred analytical tool, although it has been observed that oligonucleotides differing by a single base can migrate differently when analyzed by capillary electrophoresis. In a specific example, an 82 bp fragment of the transferrin receptor gene was amplified by PCR using 7-nitro-7-deaza-2'-deoxyadenosine in place of 2'-deoxyadenosine. The polymorphic base pair is A:T to G:C. The PCR amplification generated fully substituted product in yields similar to those obtained with natural DNA (Figure 28). MALDI-TOF mass spectrometry analysis revealed the polymorphism in two regions of the spectrum: the first between 7000 Da and 9200 Da, and the second between 3700 Da and 4600 Da (Figure 30, panel A). The first region showed the difference in primer-containing fragments of different lengths (Figure 30, panel B). The second region showed the opposite DNA strand, which contained the polymorphism in fragments of the same length but different mass (Figure 30, panel C). The fragments common to the two alleles can serve as mass references. Capillary electrophoresis analysis can also be used (Figure 31). The mobility difference between the two fragments of different length was easily detected in the test sample, as expected. In addition, a mobility difference was observed between two polymorphic fragments (11 nt) of the same length but differing in one base (C vs. T), which provides supporting evidence from the opposite strand. Figure 32 illustrates schemes for FRET detection of the same polymorphic site.

b.
Total substitution, total extension and complete cleavage at dinucleotides

In another aspect of the present invention, two of the four nucleotides of which the polynucleotide in question is composed are completely replaced by modified nucleotides (either on one strand, using primer extension, or on both strands, using an amplification procedure), and substantially complete cleavage is then carried out, preferentially at dinucleotide sites comprising the two different modified nucleotides. In general, given the steric constraints of most cleavage mechanisms, the two modified nucleotides will be cleaved only when they occur in a specific order. For example, if T and C are modified, the sequence 5' TpC 3' would be cleaved but 5' CpT 3' would not (5' and 3' indicate the polarity of the polynucleotide strand and p indicates an internal phosphate group). The rationale for dinucleotide cleavage is that mononucleotide cleavage is not ideally suited to the analysis of polynucleotides longer than 300 to 400 nucleotides, because the number of fragments that must be detected and resolved by the mass spectrometer becomes a limiting factor, and the increased probability that two or more cleavage fragments will coincide in mass limits the efficiency of the method. This latter problem is especially acute for mono-, di-, tri- and tetranucleotides of the same composition, which can mask the appearance or disappearance of fragments because mass spectrometry (MS) is not quantitative. In contrast, capillary electrophoresis, while not providing mass and hence nucleotide content, is a quantitative method that allows detection of variation in the numbers of di-, tri- and tetranucleotides. Cleavage at modified dinucleotides should yield fragments averaging sixteen nucleotides in length.
This is because, given four nucleotides, there are 4² = 16 possible dinucleotides, so any particular dinucleotide occurs on average once every 16 positions, assuming that the nucleotide frequencies are equal and that no biological selection is imposed on any class of dinucleotide (that is, that their occurrence is random). Neither of these assumptions is entirely accurate, however, so in practice there will be a wide size distribution of cleavage fragments, with considerable deviation in average fragment size depending on the pair of nucleotides selected for substitution and cleavage. However, the available information on the frequency of the various dinucleotides in mammalian, invertebrate and prokaryotic genomes can be used to select appropriate dinucleotides. It is well known, for example, that 5' CpG 3' dinucleotides are under-represented in mammalian genomes; they can be avoided if relatively frequent cleavage intervals are desired.

i. Variant detection applications

If the sequence of the analyte polynucleotide is known, then an optimal dinucleotide cleavage scheme can be selected based on analysis of the predicted cleavage fragment masses. For example, cleavage fragments that fall within the optimal size range for mass spectrometry analysis can be selected by analyzing the fragment sizes produced by all possible dinucleotide cleavage schemes. In addition, the theoretical efficiency of variance detection associated with all possible dinucleotide cleavage schemes can be determined as described above for total substitution and cleavage of mononucleotides, that is, by determining the ease of detection of every possible nucleotide substitution in the complete analyte fragment. In some cases two or more independent dinucleotide cleavage reactions can produce complementary results, or a second dinucleotide cleavage experiment can be run for corroboration.
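The selection of a dinucleotide cleavage scheme for a known analyte sequence can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the examples: cleavage is modelled as a cut between the two bases of every occurrence of the chosen 5'-XpY-3' dinucleotide, schemes are ranked by the fraction of the sequence that ends up in fragments inside a size window, and the window limits (4 to 30 nucleotides), the function names and the test sequence are all hypothetical.

```python
from itertools import product


def cleave_at_dinucleotide(seq: str, dinuc: str) -> list[str]:
    """Cut seq between the two bases of every 5'-XpY-3' occurrence of dinuc."""
    fragments = []
    start = 0
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == dinuc:
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])
    return fragments


def rank_schemes(seq: str, min_len: int = 4, max_len: int = 30):
    """Rank the 12 dinucleotides made of two different bases.

    Score = fraction of the sequence contained in fragments whose length
    falls inside [min_len, max_len]. The window is a hypothetical stand-in
    for the size range best suited to the mass spectrometer.
    """
    scores = {}
    for x, y in product("ACGT", repeat=2):
        if x == y:  # the method uses two *different* modified nucleotides
            continue
        frags = cleave_at_dinucleotide(seq, x + y)
        covered = sum(len(f) for f in frags if min_len <= len(f) <= max_len)
        scores[x + y] = covered / len(seq)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical analyte sequence, for illustration only.
analyte = "GATTCAGGCTTCGAGTCCATGCCGTTAGCATCGGATTACCGTAGGCTTAACG"
for dinuc, score in rank_schemes(analyte)[:3]:
    lengths = [len(f) for f in cleave_at_dinucleotide(analyte, dinuc)]
    print(dinuc, f"{score:.2f}", lengths)
```

A fuller treatment would score predicted fragment masses rather than lengths and would also check that every possible substitution shifts at least one fragment detectably, as the text describes.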
Given the length of the dinucleotide cleavage fragments (16mers on average), it will often not be possible to determine the precise location of a variant nucleotide from a single dinucleotide cleavage experiment. For example, if a mass difference of 15 Da in a 14mer is detected between samples, then there must be a C <-> T variance (Table 2) in the 14mer, with the heavier alleles containing T at a position where the lighter alleles contain C. However, unless there is only one C in the lighter variant fragment or only one T in the heavier variant fragment, it is impossible to determine which C or T is the variant one. This ambiguity regarding the precise nucleotide that varies can be resolved in several ways. First, a second mono- or dinucleotide substitution and cleavage experiment, or a combination of such cleavage experiments, can be designed to divide the original variant fragment into pieces that allow unambiguous assignment of the polymorphic residue. Second, an alternative sequencing procedure can be used as independent verification of the results, for example Sanger sequencing or sequencing by hybridization.

ii. Applications to DNA sequencing

As an isolated procedure, dinucleotide substitution and cleavage can provide useful information on the nucleotide content of DNA fragments that average 16 nucleotides in length but range up to 30, 40 or even 50 or more nucleotides. However, as described above, the main application of dinucleotide cleavage to DNA sequencing is in conjunction with mononucleotide cleavage. The comparatively large DNA fragments produced by dinucleotide cleavage can be very useful for sorting the smaller fragments produced by mononucleotide cleavage into sets of fragments that must fit together. The additional constraints imposed by these groupings may be sufficient to allow the complete sequence to be determined even for relatively long fragments.
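The positional ambiguity described above for a 15 Da shift can be illustrated with a short sketch. It assumes rounded average residue masses for the four natural deoxynucleotides (A 313.2, C 289.2, G 329.2, T 304.2 Da), which reproduce the 15 Da C/T and 9 Da A/T differences quoted in the text; Table 2 itself is not reproduced here, and the 14mer sequences and function names are hypothetical.

```python
# Rounded average residue masses (Da) of the natural deoxynucleotides in DNA.
# These are assumed illustrative values, consistent with the differences
# quoted in the text (C/T = 15 Da, A/T = 9 Da, C/G = 40 Da).
RESIDUE_MASS = {"A": 313.2, "C": 289.2, "G": 329.2, "T": 304.2}


def substitutions_for_shift(shift_da: float, tol: float = 0.5):
    """Return (light_base, heavy_base) pairs whose mass difference matches."""
    hits = []
    for light, m_light in RESIDUE_MASS.items():
        for heavy, m_heavy in RESIDUE_MASS.items():
            if heavy != light and abs((m_heavy - m_light) - shift_da) <= tol:
                hits.append((light, heavy))
    return hits


def candidate_positions(light_fragment: str, shift_da: float):
    """List every position in the lighter fragment that could be the variant."""
    candidates = []
    for light, heavy in substitutions_for_shift(shift_da):
        for i, base in enumerate(light_fragment):
            if base == light:
                candidates.append((i, light, heavy))
    return candidates


# Hypothetical 14mers: three C residues (ambiguous) versus a single C.
print(substitutions_for_shift(15.0))
print(candidate_positions("AGCTTACGGATCAG", 15.0))  # three candidate sites
print(candidate_positions("AGGTTACGGATTAG", 15.0))  # one site: unambiguous
```

When more than one candidate position is returned, a second cleavage experiment that splits the fragment between the candidates, or an independent sequencing method, is needed to assign the variant residue.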
Example 4 shows the steps required to infer the nucleotide sequence of a 20mer using four mononucleotide substitution and cleavage reactions. The procedures described in Example 4 can be carried out on a series of 1mer to 30mers whose sequence content has already been defined, or at least constrained, by a dinucleotide cleavage procedure. In this way, the sequence of a much larger fragment can be obtained. Note that as the length of a nucleotide fragment increases, the relationship between the mass of the fragment and its sequence content becomes more ambiguous; that is, there are more and more possible sequences that could produce the given mass. However, if the number of nucleotides making up the mass is known, the number of possible nucleotide contents decreases significantly (Pomerantz, S.C., et al., J. Am. Soc. Mass Spectrom., 1993, 4:204-209). In addition, sequence constraints, such as the absence of internal dinucleotide sequences of a particular type, further reduce the number of possible nucleotide contents, as illustrated in Table 4 for sets of mononucleotides.

c. Total substitution with modified nucleotide and partial cleavage
Partial substitution with modified nucleotide and total cleavage
Partial substitution with modified nucleotide and partial cleavage

These approaches produce partially cleaved polynucleotides by different strategies; each of these methods has utility in specific embodiments of the invention. However, total substitution with a modified nucleotide followed by partial cleavage is the preferred method of producing partial cleavage products for mass spectrometric analysis. The reason is that with total substitution the degree of partial cleavage can be varied over a very broad range, from cleavage at 1 in 100 nucleotides to cleavage at 99 in 100 nucleotides. Partial substitution, even with total cleavage, does not allow this range of degrees of cleavage.
However, for modified nucleotides that are not efficiently incorporated by polymerases, lower degrees of substitution are preferred. As the degree of cleavage is reduced, the relationship between cleavage fragments over a longer range becomes apparent. On the other hand, as the degree of cleavage is increased, the ability to obtain accurate mass data and to assign nucleotide content unambiguously also increases. The combination of light, intermediate and extensive cleavage provides an integrated picture of a complete polynucleotide, whether the application is variance detection or sequencing. Small polynucleotides of defined nucleotide content can be assembled into larger and larger groups of defined order. Partial substitution with total cleavage and partial substitution with partial cleavage are useful for the preparation of sequencing ladders. If a modified nucleotide is not efficiently incorporated into polynucleotides by the available polymerases, then low-percentage partial substitution may be optimal for efficient production of polynucleotides containing the modified nucleotide. However, a low degree of substitution may then require complete cleavage in order to produce sufficient cleavage fragments for rapid detection. Partial substitution with partial cleavage is usually a preferred approach, since the conditions for complete cleavage can be harsh and thereby result in some non-specific cleavage or modification of the polynucleotides. Likewise, partial substitution at relatively high levels (that is, at 5% or more of the occurrences of the nucleotide) allows a range of partial cleavage efficiencies to be analyzed. As with MS analysis, there are advantages in being able to test several degrees of cleavage.
For example, it is well known that in Sanger sequencing there are trade-offs in producing very long sequencing ladders; usually the beginning of the ladder, with the shorter fragments, is difficult to read, as is the end of the ladder, with the longer fragments. Likewise, the ability to manipulate the conditions of partial cleavage with the polynucleotides of this invention will allow a series of sequencing ladders to be produced from the same polynucleotide, providing clear sequence data either close to the primer or at some distance from it. As shown in Figure 17, sequencing ladders produced by chemical cleavage have a better distribution of labeled fragments than dideoxy termination at distances of up to 4 kb and more. Partial cleavage can also be obtained by substituting the cleavage-resistant modified nucleotides described previously for all but one natural nucleotide, which then provides the cleavage sites. In addition, as already described, combinations of cleavage-resistant modified nucleotides and cleavage-sensitive modified nucleotides can be used. While any technique that permits determination of the mass of relatively large molecules without causing non-specific disintegration of the molecules in the process can be used with the methods of this invention, a preferred technique is MALDI mass spectrometry, since it is well suited to the analysis of complex mixtures of analytes. Commercially available MALDI instruments can determine masses with an accuracy on the order of 0.1% to 0.05%; that is, under optimal conditions these instruments are capable of resolving molecules that differ in molecular weight by as little as one part in two thousand. It is very likely that advances in MALDI MS technology will increase the resolution of commercial instruments over the next few years.
Considering the smallest difference that can occur between two strands containing a variance (an A-T transversion, a molecular weight difference of 9; see Table 5) and given a MALDI instrument with a resolution of 2,000 (that is, a machine capable of distinguishing an ion with an m/z (mass/charge) of 2,000 from an ion with an m/z of 2,001), the largest DNA fragment in which the A-T transversion would be detectable is approximately 18,000 Daltons, that is, 9 Da × 2,000 (a "Dalton" is a unit of molecular weight used when describing the size of large molecules; for all intents and purposes it is equivalent to the molecular weight of the molecule). Under experimental conditions, the practical resolving power of an instrument may be limited by the isotopic heterogeneity of carbon (carbon exists in nature as carbon-12 and carbon-13) as well as by other factors. Assuming an approximately even distribution of the four nucleotides in the DNA fragment, this translates into detection of an A-T transversion in an oligonucleotide of about 55 nucleotides. At the other end of the spectrum, a single C-G transversion, which results in a molecular weight difference of 40, could be detected by MALDI mass spectrometry in an oligonucleotide of approximately 246 nucleotides. The size of an oligonucleotide in which an A-T transversion would be detectable could be increased by substituting a heavier non-natural nucleotide for either A or T; for example, without limitation, substituting 7-methyl-A for A increases the molecular weight change to 23. Table 5 shows the approximate size of an oligonucleotide in which each possible single point mutation could be detected by mass spectrometers of different resolving powers, without any modification of molecular weight. A variety of chemical modifications of nucleotides have been described with respect to their utility for increasing the ease of detection of mass differences during MS analysis.
A mass modification that is especially useful for the methods of this invention is the purine analog 2-chloroadenine, which has a mass of 364.5. As shown in Table 2, Block B, this has a favorable effect on the mass differences between all the nucleotides and A. Most importantly, it changes the T-A difference from 9 Da to 42.3 Da. In addition, it has been shown that 2-chloroadenine can be incorporated into polynucleotides by DNA polymerase from Thermus aquaticus. Complete substitution on one strand has been described (Hentosh, P., Anal. Biochem., 1992, 201:277-281).
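The arithmetic behind the size limits quoted above can be written out as a short sketch. The largest fragment in which a mass difference is resolvable is taken as the mass difference multiplied by the instrument resolution. The average mass per nucleotide of 325 Da is an assumption back-calculated from the figures in the text (18,000 Da corresponding to about 55 nucleotides, and 40 Da × 2,000 corresponding to about 246 nucleotides); it is not a value stated in the source, and the function names are hypothetical.

```python
# Assumed average mass per nucleotide (Da), back-calculated from the figures
# quoted in the text (18,000 Da ~ 55 nt); not a value stated in the source.
AVG_NUCLEOTIDE_MASS = 325.0


def max_detectable_mass(delta_da: float, resolution: float) -> float:
    """Largest fragment mass at which a shift of delta_da is still resolved."""
    return delta_da * resolution


def max_detectable_length(delta_da: float, resolution: float,
                          avg_mass: float = AVG_NUCLEOTIDE_MASS) -> int:
    """Approximate fragment length (nt) corresponding to that mass."""
    return int(max_detectable_mass(delta_da, resolution) // avg_mass)


# Mass differences taken from the text, at a resolution of 2,000.
cases = {
    "A-T transversion (9 Da)": 9.0,
    "C-G transversion (40 Da)": 40.0,
    "A-T with 7-methyl-A (23 Da)": 23.0,
    "T-A with 2-chloroadenine (42.3 Da)": 42.3,
}
for label, delta in cases.items():
    mass = max_detectable_mass(delta, 2000)
    print(f"{label}: up to {mass:,.0f} Da, about "
          f"{max_detectable_length(delta, 2000)} nt")
```

Under these assumptions the sketch reproduces the limits of about 55 and 246 nucleotides given in the text and shows how a mass-modified nucleotide extends the fragment length over which an A/T difference remains detectable.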
E. Examples

1. Polymerase development

A variety of mutant polymerases have been shown to have altered catalytic properties with respect to modified nucleotides. Mutant polymerases with reduced discrimination between ribonucleotides and deoxyribonucleotides have been studied extensively. Mutant human DNA polymerase β enzymes that discriminate against the incorporation of azidothymidine (AZT) have been isolated by genetic selection. Thus, it is quite likely that polymerases able to incorporate any of the modified nucleotides of this invention better than natural polymerases can be produced and selected. The following procedure can be employed to obtain an optimal polymerase for the incorporation of a particular modified nucleotide or nucleotides into a polynucleotide. It is understood that modifications of the following procedure will be readily apparent to those skilled in the art; such modifications are within the scope of this invention.

a. A polymerase is selected as the starting material. Alternatively, several polymerases having different sequences and/or different capabilities with respect to incorporation of a modified nucleotide or nucleotides into a polynucleotide can be selected. For example, without limitation, two polymerases can be selected, one of which efficiently incorporates a nucleotide having a sugar modification and the other of which efficiently incorporates a nucleotide having a modification in the phosphate backbone. The coding sequence or sequences of the polymerases are then cloned in a prokaryotic host. It may be advantageous to incorporate a protein tag into the polymerase during cloning, the protein tag being selected for its ability to direct the polymerase into the periplasmic space of the host. A non-limiting example of such a tag is thioredoxin.
Proteins in the periplasmic space can be obtained in a semi-pure state by heat shock (or other methods known in the art) and are less likely to be incorporated into inclusion bodies.

b. Several rounds of shuffling (preferably three or more) are then carried out (Stemmer, supra).

c. After each round of shuffling, the resulting DNA is transformed into the host. The library of transformants obtained is divided into pools of host cell colonies (approximately 10-1000 colonies per pool) for screening by sib selection. A lysate is then made from each pool. The host can be prokaryotic, for example, without limitation, a bacterium, or a unicellular eukaryote such as a yeast. The following description involves the use of a bacterial prokaryotic host, but other prokaryotic hosts will be apparent to those skilled in the art and are within the scope of this invention.

d. The lysates are dialyzed against a low molecular weight cutoff membrane to remove substantially all natural nucleotides. This is necessary because the assay for polymerases with the desired characteristics involves polymerase extension of a primer in the presence of modified nucleotides. The presence of the corresponding natural nucleotides would produce a background in the assay that could obscure the results. An alternative procedure is degradation of all natural nucleotides with a phosphatase, for example shrimp alkaline phosphatase.

e. The following are added to the lysate: a single-stranded DNA template, a single-stranded DNA primer complementary to one end of the template, the modified nucleotide or nucleotides whose incorporation into the DNA is desired, and the natural nucleotides that are not being replaced by the modified nucleotides. If it is desired that the polymerase be able to incorporate two contiguous modified nucleotides, then the template should be selected to contain one or more complementary contiguous sequences.
For example, without limitation, if a polymerase able to incorporate the sequence modified-C-modified-T (5' to 3') is desired, the template should contain one or more G-A sequences read 3' to 5' (that is, A-G read 5' to 3'). Downstream of (that is, 5' to) the segment of the template strand designed to test the ability of the polymerase to incorporate the modified nucleotide or nucleotides is a segment of the template strand that produces a detectable sequence when copied by the polymerase. The sequence can be detected in several ways. One possibility is to use a template having a homopolymeric segment of nucleotides complementary to one of the natural nucleotides. Then, for example, if the objective is to identify a polymerase that incorporates modified C, detection may involve polymerization of a consecutive series of A, G or T, provided that the nucleotide used for detection does not occur earlier in the polymerized sequence complementary to the template sequence. The detection nucleotide could be a radiolabeled or dye-labeled nucleotide that would only be incorporated by a mutant polymerase that has already traversed the segment of the template requiring incorporation of the modified nucleotide or nucleotides. Another way to detect the homopolymer would be to make a radiolabeled or dye-labeled complementary probe that would hybridize to the homopolymer produced only in those pools containing a polymerase capable of incorporating the modified nucleotide or nucleotides. Hybridization could then be detected, for example, by spotting the primer extension products from each pool onto a nylon filter, then denaturing, drying and adding the homopolymer probe, which would hybridize to the complementary strand of the polymerization product.
Of course, a homopolymer or other sequence that is not present in the genome of the host cell or in an episome should be used, in order to minimize background hybridization to host sequences present in all pools. Yet another detection method would be to incorporate into the template a sequence corresponding to an RNA polymerase promoter, for example, without limitation, the T7 promoter, followed by a reporter sequence. These sequences should be located downstream of the 3' end of the primer and of the template sequence that requires the incorporation of modified nucleotides. The T7 promoter will be inactive until it becomes double-stranded as a consequence of polymerization; however, polymerization of the T7 promoter sequence will occur only if the mutant polymerase being tested is capable of incorporating the modified nucleotide or the modified nucleotide sequence located 5' of the T7 promoter sequence. The reporter sequence may include a homopolymeric sequence of one nucleotide (for example, T) whose complement (in this case, A) is labeled with a dye or a radioactive label. Thus, high levels of T7 polymerase-mediated transcription will result in large amounts of labeled high molecular weight polymer (that is, polymer precipitable with trichloroacetic acid). An alternative reporter sequence could be a ribozyme capable of cleaving an exogenously added marker oligonucleotide that allows easy discrimination of cleaved from uncleaved products. For example, again without limitation, one end of the oligonucleotide may be biotinylated and the other end may carry a fluorescent dye. Such systems are capable of amplifying a signal 1000-fold or more. In this approach, it would first be necessary to demonstrate that promoter function is not altered by the presence of a modified nucleotide, or to create a version of the promoter that lacks the nucleotide being modified.

f.
Any pool of lysed bacterial colonies that contains a polymerase capable of incorporating the selected modified nucleotide or contiguous modified nucleotides will produce detectable homopolymer, or will contain a double-stranded T7 RNA polymerase promoter upstream of a reporter sequence, as a result of polymerization through the modified nucleotide or contiguous modified nucleotides, through the T7 promoter and through the reporter sequence. Addition of T7 RNA polymerase to the mixture (or, alternatively, expression of T7 RNA polymerase from a plasmid) will result in transcription of the reporter sequence, which can then be detected by a method appropriate to the reporter system selected. It may not be necessary to select or design a promoter that either lacks the modified nucleotide or nucleotides or that can function effectively with the modified nucleotides.

g. Bacterial colonies containing a polymerase with the desired properties are then identified and purified from the pools of bacterial colonies by sib selection. In each round of selection, the pool or pools with the desired properties are divided into subpools and each subpool is tested for activity as described above. The subpool showing the highest level of activity is selected and divided into a second round of subpools, and the process is repeated. This is continued until only one colony remains, which contains the desired polymerase. The polymerase can then be recloned into a protein expression vector, and large amounts of polymerase can be expressed and purified.

Another approach to polymerase development exploits the well-known propensity of some antibiotics to kill only growing cells, for example, penicillin and related drugs, which kill by interfering with bacterial cell wall synthesis in growing cells but do not affect resting cells.
The approach would be to introduce a modified nucleotide into bacterial cells that have been genetically altered to express one or more mutant polymerases, preferably a library of mutant polymerases. An ideal host strain would be one in which the endogenous polymerase has been inactivated but is complemented by a plasmid-encoded polymerase. A polymerase library could then be created on a second plasmid carrying a different selectable marker, for example, antibiotic resistance. The library would then be introduced into the host cell in the presence of negative selection against the first (non-mutant) polymerase-encoding plasmid, leaving the cells with only the mutant polymerases. If one or more of the mutant polymerases is able to incorporate the modified nucleotide into the genetic material of the cells, the expression of the modified gene or genes will be altered and/or a series of host cell responses will be provoked, such as the SOS response, that affect cell growth. The desired effect would be reversible growth arrest, that is, a cytostatic rather than a cytocidal effect. The cells would then be treated with an antibiotic that kills only actively growing cells. The cells are then removed from the presence of the antibiotic and placed in fresh culture medium. Cells whose growth had been arrested by incorporation of the modified nucleotide into their genetic material, and which were therefore unaffected by the antibiotic, would form colonies. Plasmids encoding the polymerase that catalyzed the incorporation of the modified nucleotide into the genetic material of the cells are then isolated, and the procedure is repeated for additional rounds of selection. Once a sufficient number of rounds of selection have been performed, the polymerase is isolated and characterized. An exemplary, but not limiting, procedure that can be used to carry out the above is as follows:

1.
Select a polymerase or set of polymerases for mutagenesis. The starting polymerase or polymerases may include, without limitation, a mutant polymerase such as Klenow fragment E710A, wild-type polymerases, thermostable or thermolabile polymerases, polymerases known to complement E. coli DNA Pol I, etc.

2. Prepare a library of mutant polymerases using techniques such as "dirty PCR", shuffling, site-directed mutagenesis or other diversity-generating procedures.

3. Clone the library into a plasmid vector.

4. Transform bacteria with the plasmid library and isolate transformants by selection on an appropriate antibiotic. Preferably, the host strain has an inactivated chromosomal polymerase, and selection can be applied to ensure that only the mutant polymerases are expressed in the host cell, as described above. Only cells harboring plasmids that encode functional polymerases will survive this step.

5. Add the modified nucleotide triphosphate to the medium. It may be necessary to use a cell permeabilization procedure such as electroporation, addition of calcium chloride or rubidium chloride, heat shock, etc., in order to facilitate entry of the modified nucleotide into the cells. The cells are then grown in the presence of the modified nucleotide triphosphate until incorporation of the modified nucleotide(s) induces growth arrest in the selected cells.

6. Add penicillin, ampicillin, nalidixic acid or any other antibiotic that selectively kills actively dividing cells. Continue to grow the cells for a selected time.

7. Centrifuge the cells, resuspend them in fresh LB medium and plate them. Grow for an empirically determined time.

8. Pick colonies, isolate the plasmids and repeat steps 4 to 7 for additional rounds of selection or, alternatively, use a biochemical assay for incorporation of the modified nucleotide to screen individual colonies or pools of colonies.
Such an assay could involve the polymerization of a template in the presence of a radiolabelled modified nucleotide in individual clones or in mixtures of clones in a clan selection scheme. 9. Then characterize the polymerases determined to have the desired activity, by P1266 the test of step 8. 10. Repeat again the mutagenesis of the polymerase or the polymerases obtained in step 8 and repeat the selection procedure from step 3. 11. When an acceptable level of ability to incorporate the modified nucleotide, isolate and characterize the polymerase. Another method for selecting active polymerases for the incorporation of modified nucleotide involves the use of a bacteriophage that has been described for the selection of an active enzyme (Pedersen et al., Proc. Natl. Acad. Sci. USA, 1998, 95: 10523 -8). A modification of that procedure could be used for the selection of mutant polymerase. that is, oligonucleotides that are covalently bound to phage surfaces can be amplified by mutant polymerases expressed on the surface of the phage. Modified nucleotides labeled with dye could be used for the extension of the primer. After removing the nucleotides that were not incorporated, the phage carrying the dye-modified nucleotide could be identified using fluorescence-activated cell sorting methods. Alternatively, if a suitable template design is used, the fluorescence label can be attached to another nucleotide that would only be incorporated into the P1266 to the 3 'end of a modified nucleoside extension. In yet another approach identifying active polymerases for the incorporation of modified nucleotide, X-ray detectable crystal structures of available polymerases bound to the template DNA and a nucleotide substrate would be used. Based on the observed or predicted interactions in the polymerase / substrate complex, logical amino acid changes could be created to accommodate the structural deviation with respect to given modified nucleotides. 
For example, based on structural information on a complex of T7 polymerase with its substrates, for which the X-ray crystal structure shows the amino acids located in the polymerase active site (Doublie et al., Nature, 1998, 391: 251-258), site-directed mutagenesis can be designed for the structurally similar Klenow fragment to increase its specific activity for the incorporation of ribonucleotides (rNTPs) and/or 5'-amino-nucleotides (5'-amino-dNTPs). The E710A mutant of the Klenow fragment (Astatke et al., Proc. Natl. Acad. Sci. USA, 1998, 95: 3402-3407) has an increased ability to incorporate rNTPs compared with the wild-type Klenow fragment, probably because the mutation removes the steric gate against the 2'-hydroxyl group of rNTPs. However, this mutation decreases the activity of the mutant for the incorporation of natural dNTPs and 5'-amino-dNTPs. In this case, use of the E710S mutation could lead to improved activity because E710S possibly forms H bonds with the 2'-OH of rNTP substrates. The E710A or E710S mutation could also be used in combination with Y766F, a previously described mutant which by itself has little effect on polymerase activity (Astatke et al., J. Biol. Chem., 1995, 270: 1945-54). The crystal structure reveals that the hydroxyl group of Y766 forms hydrogen bonds with the E710 side chain, which may affect the activity of the polymerase when E710 is truncated to Ala. On the other hand, E710 mutations in combination with F762A may improve activity by holding the sugar ring in a defined position. Likewise, better incorporation of the 5'-amino analogs may be achieved by relaxing the binding of the polymerase on the nucleotide substrate, since the 5' nitrogen changes the conformation of the nucleotide and thus the alignment of the alpha-phosphorus atom. Initially, the focus could be on the mutagenesis of a limited number of residues that contact the sugar and the phosphates of the nucleotide substrate, for example, residues R668, H734 and F762. Residue H881 may also be a useful target. Although it is farther from the dNTP binding site, an Ala substitution at this position influences the fidelity of dNTP incorporation (Polesky et al., J. Biol. Chem., 1990, 265: 14579-91). These residues can be targeted by cassette mutagenesis to identify the amino acid substitution with the greatest effect, followed by selection of active polymerases as described. The R668K substitution is of particular interest, because it should eliminate contact with the dNTP while retaining the minor groove interaction with the 3'-NMP of the primer. On the other hand, although R754 and R758 make contact with the alpha and beta phosphates, changes at those positions are likely to severely impair catalysis. Histidine or lysine at these positions could preserve the interactions with the phosphates and might retain activity. Another method for selecting polymerases active in the incorporation of modified nucleotides involves the use of the phage display system, which allows foreign proteins to be expressed on the surface of bacteriophage as fusions with phage surface proteins (Kay, B.K., Winter, J. and J. McCafferty (Editors), Phage Display of Peptides and Proteins: A Laboratory Manual, Academic Press, 1996). Establishing an experimental system for the detection of a mutant polymerase would involve expressing mutant polymerases on the surface of a phage library and then isolating phage carrying polymerases with the desired polymerase activity. Aspects of such a system for the selection of an active nuclease enzyme have been described (Pedersen et al., Proc. Natl. Acad. Sci. USA, 1998, 95: 10523-8). A modification of that procedure can be used for the selection of mutant polymerases. That is, oligonucleotides that are covalently bound to proteins on the surface of the phage can be extended by mutant polymerases expressed on the surface of the same phage. 
The oligonucleotides can fold back on themselves to provide a primer-template complex recognizable by the polymerase or, alternatively, a primer complementary to the oligonucleotide can be provided separately. In either case, the portion of the oligonucleotide that serves as the template for polymerization will contain nucleotides complementary to the modified nucleotide or nucleotides for which an efficient polymerase is sought. The template oligonucleotide can also be designed so that the extension product is easily detectable, as a result of the templated incorporation of a labeled nucleotide that occurs only after polymerization through the segment of the template that requires incorporation of the modified nucleotide or nucleotides.
A method for selectively enriching phage carrying polymerases with the desired catalytic properties involves the use of a fluorescence-activated cell sorter (FACS). Dye-labeled nucleotides would be incorporated in a primer extension reaction only after incorporation of the modified test nucleotides. After removing the nucleotides that were not incorporated, the phage carrying the attached dye-labeled nucleotides (which must encode mutant polymerases able to incorporate the modified nucleotide or nucleotides) could be identified and enriched in one or more rounds using fluorescence-activated cell sorting methods (Daugherty PS, et al., Antibody affinity maturation using bacterial surface display, Protein Eng. 11: 825-32, 1998). Alternatively, the modified nucleotides themselves can be labeled with dye, and the selection likewise made by FACS sorting of the dye-labeled phage. This process has the disadvantage that the dye may interfere with polymerization; however, one skilled in the art will recognize that the dye can be attached to the modified nucleotide via a linkage that is unlikely to inhibit polymerization. Through the use of an appropriate template design, the fluorescent label can be attached to another nucleotide that would be incorporated only after a stretch of modified nucleosides has been incorporated. Yet another approach to identifying polymerases active in the incorporation of modified nucleotides would be to use available X-ray crystal structures of polymerases bound to template DNA and a nucleotide substrate.
Based on the observed or predicted interactions in the polymerase/substrate complex, rational amino acid changes could be designed to accommodate the structural deviation of given modified nucleotides. For example, based on structural information on a complex of T7 polymerase with its substrates, for which the X-ray crystal structure shows the amino acids located in the polymerase active site (Doublie et al., Nature, 1998, 391: 251-258), site-directed mutagenesis can be designed for the structurally similar Klenow fragment to increase its specific activity for the incorporation of ribonucleotides (rNTPs) and/or 5'-amino-nucleotides (5'-amino-dNTPs). The E710A mutant of the Klenow fragment (Astatke et al., Proc. Natl. Acad. Sci. USA, 1998, 95: 3402-3407) has an increased ability to incorporate rNTPs compared with the wild-type Klenow fragment, probably because the mutation removes the steric gate against the 2'-hydroxyl group of rNTPs. However, this mutation decreases the activity of the mutant for the incorporation of natural dNTPs and 5'-amino-dNTPs. In this case, use of the E710S mutation could lead to improved activity because E710S possibly forms H bonds with the 2'-OH of rNTP substrates. The E710A or E710S mutation could also be used in combination with Y766F, a previously described mutant which by itself has little effect on polymerase activity (Astatke et al., J. Biol. Chem., 1995, 270: 1945-54). The crystal structure reveals that the hydroxyl group of Y766 forms hydrogen bonds with the side chain of E710, which may affect the activity of the polymerase when E710 is truncated to Ala. On the other hand, E710 mutations in combination with F762A may improve activity by holding the sugar ring in a defined position. 
Likewise, better incorporation of the 5'-amino analogs may be achieved by relaxing the binding of the polymerase on the nucleotide substrate, since the 5' nitrogen changes the conformation of the nucleotide and thus the alignment of the alpha-phosphorus atom. Initially, the focus could be on the mutagenesis of a limited number of residues that contact the sugar and the phosphates of the nucleotide substrate, for example, residues R668, H734 and F762. Residue H881 may also be a useful target. Although it is farther from the dNTP binding site, an Ala substitution at this position influences the fidelity of dNTP incorporation (Polesky et al., J. Biol. Chem., 1990, 265: 14579-91). These residues can be targeted by cassette mutagenesis to identify the amino acid substitution with the greatest effect, followed by selection of active polymerases as described. The R668K substitution is of particular interest, because it should eliminate contact with the dNTP while retaining the minor groove interaction with the 3'-NMP of the primer. On the other hand, although R754 and R758 make contact with the alpha and beta phosphates, changes at those positions are likely to severely impair catalysis. Histidine or lysine at these positions could preserve the interactions with the phosphates and might retain activity. One skilled in the art will recognize that the collection of preferred amino acid modifications of Klenow polymerase described above can be applied to other polymerases to produce useful mutant versions of those polymerases. This can be done by aligning the amino acid sequences of the other polymerases with that of Klenow polymerase to determine the location of the corresponding amino acids in the other polymerases and/or, when a crystal structure is available, by comparing the three-dimensional structures of the other polymerases with that of Klenow polymerase to identify orthologous amino acids. 
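The sequence-alignment step described above, in which a residue of interest in Klenow polymerase is located in another polymerase, can be illustrated with a minimal Python sketch. It assumes a pairwise alignment has already been produced by some alignment tool and is supplied as two equal-length gapped strings. The function name, the numbering parameters and the toy sequences are hypothetical and are not taken from the specification or from any real polymerase.

```python
def map_aligned_position(ref_aln, other_aln, ref_pos, ref_start=1, other_start=1):
    """Map a residue number in the reference to the aligned residue elsewhere.

    ref_aln and other_aln are equal-length gapped strings ('-' marks a gap).
    ref_start / other_start give the residue number of the first non-gap
    residue of each sequence. Returns (residue_number, residue_letter) for the
    other sequence, or None when the reference residue aligns to a gap or is
    not covered by the alignment.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")
    ref_num, other_num = ref_start - 1, other_start - 1
    for ref_char, other_char in zip(ref_aln, other_aln):
        if ref_char != "-":
            ref_num += 1
        if other_char != "-":
            other_num += 1
        if ref_char != "-" and ref_num == ref_pos:
            return None if other_char == "-" else (other_num, other_char)
    return None


# Toy alignment invented for illustration only (not real polymerase sequences).
REF_ALN = "MKEL-FYG"
OTHER_ALN = "M-ELAFYG"

print(map_aligned_position(REF_ALN, OTHER_ALN, ref_pos=3))
print(map_aligned_position(REF_ALN, OTHER_ALN, ref_pos=2))
```

In this toy case reference residue 3 maps to residue 2 of the other sequence, while reference residue 2 falls opposite a gap and has no counterpart. Where a crystal structure is available, a structural superposition would be expected to give a more reliable correspondence than sequence alignment alone.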
Methods for performing site-directed mutagenesis and expressing mutant polymerases in prokaryotic vectors are known in the art (Ausubel, F.M., et al., Current Protocols in Molecular Biology, John Wiley & Sons, 1998). In addition to producing and screening for mutant polymerases capable of incorporating modified nucleotides, it may also be useful in some cases to screen for other properties of the polymerase. In general, the additional desirable polymerase properties described below are more difficult to assay than the incorporation of modified nucleotides, so assays for these additional properties can be performed as a secondary screen of mutant polymerases with demonstrated ability to incorporate modified nucleotides. One aspect of this invention is that cleavage at the modified nucleotides can be caused or enhanced by contact between the modified nucleotides and a polymerase (see Example and Figures 20-26). This is a preferred cleavage mode because it avoids a separate cleavage step. It is therefore useful to assay mutant polymerases for properties that increase or enhance cleavage. A simple assay for such properties is a primer extension in which the sequence to be extended following the primer includes the nucleotide susceptible to cleavage, followed by the first occurrence of a different nucleotide that is detectably labeled. In the case of polymerase-assisted cleavage, the labeled molecule will be separated from the primer, resulting in a smaller labeled molecule that can be detected by electrophoresis or by other methods. A second useful property of mutant polymerases is the ability to recognize a modified nucleotide or nucleotides in a template strand and to catalyze the incorporation of the appropriate complementary nucleotide (natural or modified) into the nascent complementary strand. 
This property is a necessary condition for a polymerase to be used in a cyclic process such as PCR, where the newly synthesized polynucleotides serve as templates in successive rounds of amplification. A simple assay for this property is a short primer extension in which the template strand is synthesized with the modified nucleotide or nucleotides located shortly after the end of the primer, so that a primer extension reaction will soon encounter the modified nucleotide or nucleotides. Successful polymerization through the template, which indicates that the modified nucleotides can be used as templates, will result in a longer extension product than that obtained when the modified nucleotides cannot be used as templates. The extension product can be made readily detectable by designing the template so that templated incorporation of a labeled nucleotide occurs only after passage through the modified nucleotide or nucleotides. The sequence of the extension product can then be determined to confirm that the nucleotides incorporated in the extension strand opposite the modified nucleotides are correct. Still other attractive properties of polymerases include high fidelity, thermostability and processivity. Assays for these properties are known in the art.
Example 2. Detection of variation by mononucleotide restriction
The following procedure is an example of detecting nucleotide sequence variation in a polynucleotide without the need to obtain the complete sequence of the polynucleotide. While the modified nucleotide used in this example is 7-methylguanine (7-methyl G) and the polynucleotide under analysis is a 66 base pair fragment of a specific DNA, it is understood that the described technique can be employed using any of the modified nucleotides discussed above or any other modified nucleotide which, as noted above, is within the scope of this invention. The polynucleotide can be a polynucleotide of any length that can be produced by a polymerase. A 66 base pair region of the cDNA for the 38 kDa subunit of replication factor C (RFC) was amplified by PCR (polymerase chain reaction). Three primers were used in two separate amplification reactions. The forward primer (RFC bio) was biotinylated. This allows isolation of a single-stranded template using streptavidin-coated beads, on which the primer can then be extended by the exo- Klenow fragment of E. coli DNA polymerase to incorporate 7-methyl G. It also allows cleanup of the 7-methyl G DNA after the extension and before the cleavage. Two reverse primers were used in separate amplification reactions; one matched the natural sequence of the RFC gene (RFC), the other (RFC mut) introduced a single base mutation (T to C) into the 66 base pair RFC sequence. The primers and corresponding products are also labeled RFC 4.4 and RFC 4.4 mut in some of the figures herein. Using PCR and the primers mentioned above, fragments of 66 base pairs were produced (Figure 1). The two fragments differ at one position: a T to C change in the biotinylated strand and an A to G change in the complementary strand (encoded by the two reverse primers). 
The PCR products were purified on streptavidin agarose, and the non-biotinylated strand of each PCR product was eluted, leaving the biotinylated strand to be used as a template for primer extension. The biotinylated RFC bio primer was extended on these templates in the presence of dATP, dCTP, dTTP and 7-methyl dGTP. The single-stranded DNA attached to streptavidin agarose was then incubated with piperidine for 30 minutes at 90 °C to cleave at the sites of 7-methyl G incorporation in the modified DNA fragment. This treatment also resulted in separation of the biotinylated fragment from streptavidin. The reaction mixture was centrifuged and the supernatant containing the polynucleotide was transferred to a new tube. The DNA was dried in a Speedvac apparatus and resuspended in deionized water. The sample was then subjected to MALDI mass spectrometry. Figure 2 shows the molecular weights of the fragments of interest expected from cleavage of the biotinylated DNA strand at each site of 7-methyl G incorporation. These fragments and their molecular weights are: a 27-mer (8772.15), a 10-mer (3069.92), an 8-mer (2557.6) and one of the following 10-mers, depending on the reverse primer used in the PCR reaction: RFC (3054.9) or RFC mut (3039.88). The biotinylated 20-mer primer is also present because it was supplied in excess in the extension reaction. The 10-mer fragments for RFC and RFC mut, which differ by 15 daltons, are those that would be detected and resolved by mass spectrometry, thus revealing the point mutation. Figure 3 shows a denaturing polyacrylamide sequencing gel analysis of the Klenow polymerase extension fragments of RFC and RFC mut before and after cleavage with piperidine. All the expected fragments were present in both cases. Most of the additional minor bands are the result of incomplete cleavage of the DNA strand by piperidine. 
Complete cleavage can be achieved through two cycles of piperidine treatment using freshly distilled piperidine for 30 minutes at 90 °C in each cycle, followed by drying and washing of the samples (data not shown). The RFC mut cleavage band (lane 4 of Figure 3) running between the 8-mer and the 10-mer is the only band that is not explained by complete or incomplete cleavage. Figure 4 is the mass spectrum of the RFC sample. The peak on the far right is the biotinylated primer, which was used as a standard to calculate the molecular weights of the other peaks. The left side of the spectrum reveals the three expected cleavage fragments (two 10-mers and one 8-mer). The inset in Figure 4 is an enlarged view of the region surrounding the two 10-mer fragments and the 8-mer fragment. The molecular weights in this region were uniformly shifted by approximately 20 daltons because the primer used for calibration was shifted by 20 daltons. However, the mass differences between the peaks were exactly as predicted. Figure 5 shows the mass spectrum, and an enlarged portion thereof, of the RFC mut sample. Two peaks should remain the same between the RFC and RFC mut samples: one of the 10-mer fragments (3089.67) and the 8-mer (2576.93). The molecular weight of the remaining 10-mer should decrease in RFC mut by 15.02 Da (from 3054.9 to 3039.88) due to the change from T to C, and the mass difference between this 10-mer and the unchanged 10-mer should be 30.04 Da (3039.88 vs. 3069.92). However, the mass difference actually obtained for RFC mut was 319.73 Da. This could be explained by the deletion of a C from the 10-mer fragment corresponding to nucleotides 57 to 66. This would also explain the anomalous 9-mer fragment in the RFC mut sequencing gel (Figure 3). For this to be the case, the commercially obtained primer used in the amplification reaction would have to be missing a G. 
The expected molecular weights for the RFC primer, the RFC mut primer and the RFC mut primer with a single G deletion are shown in Table 6. To test the hypothesis that an error arose in the synthesis of the RFC mut oligonucleotide primer, the RFC and RFC mut oligonucleotides were combined and subjected to mass spectrometry. As can be seen from the mass differences obtained (Figure 6 and Table 6), the hypothesis was correct: the RFC mut primer was indeed missing a G. The power of the method of this invention is dramatically demonstrated by the foregoing experiment. What began as a controlled test of the method using a known sequence and a known nucleotide variation actually detected an unknown variation in an unexpected place, the RFC mut primer.
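The principle of Example 2, in which cleavage at every modified G yields short fragments and a single T to C change shifts one fragment by about 15 Da, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the two sequences are hypothetical and are not the RFC sequence, the cleaved G is treated as destroyed, and fragment masses are approximated as sums of standard average nucleotide residue masses, ignoring end groups and any cleavage adducts. Only the mass difference between corresponding fragments, which does not depend on the end groups, is meaningful here.

```python
# Average masses (Da) of nucleotide residues within a DNA strand.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def fragments_between(seq, cleavable="G"):
    """Return the non-empty stretches that remain when every cleavable base is removed."""
    return [frag for frag in seq.split(cleavable) if frag]


def residue_sum(fragment):
    """Sum of residue masses; end groups and cleavage adducts are ignored."""
    return round(sum(RESIDUE_MASS[base] for base in fragment), 2)


def shifted_fragments(seq_a, seq_b, cleavable="G"):
    """List (fragment_a, fragment_b, mass difference) for fragments that differ.

    Assumes the variation does not create or destroy a cleavage site, so both
    sequences give the same number of fragments.
    """
    frags_a = fragments_between(seq_a, cleavable)
    frags_b = fragments_between(seq_b, cleavable)
    if len(frags_a) != len(frags_b):
        raise ValueError("variation changes the number of cleavage sites")
    return [
        (a, b, round(residue_sum(a) - residue_sum(b), 2))
        for a, b in zip(frags_a, frags_b)
        if a != b
    ]


# Hypothetical sequences differing by a single T -> C change.
WILD_TYPE = "ACTGTTCAGCATTCG"
VARIANT = "ACTGTTCAGCACTCG"

print(fragments_between(WILD_TYPE))
print(shifted_fragments(WILD_TYPE, VARIANT))
```

Only the fragment carrying the variant base changes mass, and the difference equals the T minus C residue mass difference, consistent with the 15.02 Da shift between the RFC and RFC mut 10-mers quoted above.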
Example 3. Detection of variation by dinucleotide restriction
A restriction enzyme that has a recognition site of four base pairs will cleave DNA specifically with a statistical frequency of one cleavage every 256 (4^4) bases, which results in fragments that are often too large to be analyzed by mass spectrometry (Figure 19A). Our strategy of chemical dinucleotide restriction, on the other hand, would result in much smaller fragments of the same polynucleotide. The average size of the fragments obtained is 16 (4^2) bases (Figure 19B), which is fully amenable to mass spectrometry analysis. An example of this principle of chemical restriction is illustrated in Figure 20. This figure depicts a dinucleotide pair in which a ribonucleotide and a 5'-aminonucleotide are joined in 5' to 3' orientation, so that the 2'-hydroxyl group is located in close proximity to the phosphoramidate linkage. The lability of the phosphoramidate linkage is enhanced because the hydroxyl group can attack the phosphorus atom to form a cyclic 2',3'-phosphate, which causes DNA cleavage at this particular dinucleotide site. Figure 21 shows an actual application of this approach. A 5'-32P-labeled 20nt primer was extended with a mixture of exo- Klenow and E710A exo- Klenow polymerases using a single-stranded 87nt template in a Tris buffer at pH 9. Primer extension was performed with riboGTP (lane 1), 5'-aminoTTP (lane 3), or riboGTP/5'-aminoTTP (lane 5) in place of the corresponding natural nucleotides. After extension, the reaction mixtures were purified on a G25 column. The extension product containing riboG was cleaved with aqueous base to generate a G sequencing ladder (lane 2). The product containing 5'-aminoT, on the other hand, was acid labile and was cleaved to give a T sequencing ladder (lane 4). 
Under the conditions of the extension reaction with riboGTP/5'-aminoTTP (lane 5), a 64nt product was obtained instead of the expected 87nt product. Interestingly, the 64nt fragment is one of the dinucleotide cleavage products expected from GT restriction, and the only one that should be visible by autoradiography. Acid cleavage of this product generated a T ladder (lane 6), while base cleavage generated a G ladder (lane 7), which indicates the successful incorporation of both riboGTP and 5'-aminoTTP into the polynucleotide. From these results it can be concluded that GT restriction cleavage occurred during the extension and/or workup procedures, most likely due to the synergistic lability of the two modified nucleotides. In order to visualize all three expected restriction fragments, the same extension/cleavage experiment was performed in the presence of α-32P-dCTP. As shown in Figure 22, three GT restriction fragments with the expected relative mobility and specific radioactivity were observed. The versatility of this dinucleotide restriction approach is demonstrated by AT restriction of the same DNA. Specific AT restriction was observed by polyacrylamide gel electrophoresis (PAGE) analysis (Figure 23). A non-radioactive product generated in a similar way was analyzed by MALDI-TOF mass spectrometry (Figure 24). All expected restriction fragments were observed except for a 2nt fragment that was lost during purification on the G25 column. The general applicability of this technology is further demonstrated by the use of a different, longer DNA template (Figures 25 and 26). Primer extension with riboATP and 5'-aminoTTP followed by AT restriction generated the expected oligonucleotides, as observed by PAGE analysis (Figure 25) or MALDI-TOF mass spectrometry analysis (Figure 26).
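The fragment-size estimate at the start of this example (one site every 4^4 = 256 bases for a four-base recognition site, versus every 4^2 = 16 bases for a dinucleotide) and the idea of cleaving between the two nucleotides of a chosen dinucleotide can be sketched as follows. This is an illustrative Python sketch only: it assumes equal frequencies of the four bases, and the function names and the short test sequence are hypothetical.

```python
def expected_spacing(site_length, alphabet_size=4):
    """Average distance between occurrences of one specific site, assuming
    all bases are equally frequent and independent."""
    return alphabet_size ** site_length


def cleave_between(seq, dinucleotide):
    """Cut seq between the two bases of every occurrence of the dinucleotide.

    For GT restriction, for example, the cut falls after the G (the
    ribonucleotide) and before the T (the 5'-aminonucleotide).
    """
    first, second = dinucleotide
    fragments, start = [], 0
    for i in range(len(seq) - 1):
        if seq[i] == first and seq[i + 1] == second:
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])
    return fragments


# Hypothetical sequence used only to show where the cuts fall.
EXAMPLE_SEQ = "AAGTCCGTAGT"

print(expected_spacing(4), expected_spacing(2))
print(cleave_between(EXAMPLE_SEQ, "GT"))
```

Real sequences deviate from the equal-frequency assumption, so the actual fragment lengths for a given template, such as the 64nt fragment described above, depend on where the dinucleotide happens to occur.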
Example 4. Genotyping by complete substitution / complete cleavage
The following procedure for genotyping by chemical restriction is an alternative to other genotyping methods, with many advantages including accuracy and speed. In general, this method involves PCR amplification of genomic DNA using chemically modified nucleotides, followed by chemical cleavage of the resulting amplicons at the modified bases. A schematic representation of this technique is shown in Figure 27. One of the primers (Primer 1) is designed to be close to the polymorphic site of interest, so that one of the polymorphic bases (e.g., A) can be selected as the first cleavable nucleotide. After PCR amplification with the chemically modified nucleotide (supplemented with the other three natural nucleotides), only one of the two alleles would be cleaved at the polymorphic site. Treatment with chemical reagents would give cleavage products comprising Primer 1, whose length reveals the genotype of the sample. The analysis can be carried out either by mass spectrometry or by electrophoresis to identify the expected length difference. In addition, analysis by mass spectrometry can reveal the single base difference in the complementary strand of the DNA containing the polymorphism, providing built-in redundancy and greater accuracy. The chemical cleavage and analysis procedures used to genotype the transferrin receptor (TR) gene are illustrated in Figures 28 to 31. An 82bp DNA sequence of the TR gene was selected, based on the location of the polymorphism and on amplification efficiency (Figure 28). The polymorphic base (A or G) is located 3 bases from the 3' end of Primer 1. For the A allele it is the first modified nucleotide incorporated; for the G allele, the first cleavable base is 6 bases from the primer. As a consequence, fragments of different lengths are produced by the chemical cleavage. 
PCR amplification reactions (50 μl each) were carried out in standard buffer with AmpliTaq Gold polymerase (0.1 unit/μl) in a thermal cycler (MJ Research PTC-200) using 35 cycles of amplification (1 minute of denaturation, 1.5 minutes of annealing and 5 minutes of extension). Analysis of the PCR products on a non-denaturing polyacrylamide gel (stained with Stains-All from Sigma) showed that 7-deaza-7-nitro-dATP can replace dATP for efficient amplification by PCR (Figure 28). To the 7-deaza-7-nitro-dATP PCR products were directly added piperidine, tris-(2-carboxyethyl)phosphine (TCEP) and Tris base to give final concentrations of 1 M, 0.2 M and 0.5 M, respectively, in a total volume of 100 μl. After incubation at 95 °C for 1 hour, 1 ml of 0.2 M triethylammonium acetate (TEAA) was added to each reaction mixture and the resulting solution was purified on an OASIS column (Waters). The eluted products were concentrated to dryness in a Speedvac apparatus and the residues were analyzed by mass spectrometry or electrophoresis. Figure 29 shows the sequences of selected fragments expected from cleavage at 7-deaza-7-nitro-dA. The sequences are grouped according to their lengths and molecular weights. The first group contains the longer fragments that extend from the primers. The 22nt fragment is invariant and can be used as an internal reference. The 25nt and 28nt fragments are expected from the A and G alleles, respectively. The group of shaded sequences comes from the complementary strand of the DNA and includes the invariant 13nt and 11nt fragments, which can be used as internal references, and a pair of 11nt fragments expected from the two allelic forms of the TR gene with a mass difference of 15 Da. Figure 30(a) shows a MALDI-TOF spectrum of chemical cleavage products from a heterozygous 82bp TR DNA sample. Two regions containing the fragments depicted in Figure 29 are highlighted in the spectrum. 
Each purified cleavage sample was mixed with 3-hydroxypicolinic acid and subjected to MALDI-TOF analysis on a Perceptive Biosystems Voyager-DE mass spectrometer. The mass spectrum in the region of 7000 to 9000 daltons was recorded, and the results for the three TR genotypes are shown in Figure 30(b). The spectra were aligned using the peak representing the invariant 22nt fragment (7189 Da). Two additional peaks were observed for the AG heterozygote sample, one corresponding to the A allele (8057 Da) and the other to the G allele (9005 Da). As expected, only one additional peak was observed for the homozygous GG or AA samples, in each case with the molecular weight of the cleavage fragment of the G or A allele, respectively. Figure 31(a) shows a mass spectrum of a heterozygous AG sample in the region of 3700 to 4600 Da. With the 3807 Da and 4441 Da fragments as internal references, the genotype of this sample was confirmed by the observation of two peaks in the middle of the spectrum with a mass difference of 15 Da. The molecular weights observed by mass spectrometry indicated that phosphate-deoxyribose-TCEP adducts were uniformly formed during the cleavage reaction, which resulted in fragments modified at the 3' end (Figure 31(b)). The data shown in Figures 30 and 31 also illustrate that the combination of chemical restriction with mass spectrometry can provide corroborating genotype information from both DNA strands, thereby ensuring the accuracy of the analysis. Alternatively, samples subjected to chemical restriction can be analyzed by electrophoresis to detect the predicted length difference resulting from the two alleles. 
The capillary electrophoresis (CE) analysis was carried out on a home-built instrument with a UV detector and a capillary containing denaturing linear polyacrylamide gel. Figure 32(a) shows the CE chromatograms obtained from TR samples of various genotypes. As predicted, each genotype showed a different elution pattern with the expected cleavage product lengths. While the AA homozygote produced a 25nt fragment and the GG homozygote generated a 28nt fragment, the AG heterozygote sample gave both the 25nt and 28nt products. After being labeled at the 5' end with 32P, the cleavage samples were subjected to PAGE analysis. The resulting autoradiogram, shown in Figure 32(b), showed that the cleavage is specific, with little or no background, and that the genotyping results are unambiguous. Another alternative detection method involves the application of fluorescence resonance energy transfer (FRET). The FRET technique has been successfully applied to the detection of polymorphisms in TaqMan assays (Todd JA et al., 1995, Nature Genetics, 3: 341-342) and with Molecular Beacons (Tyagi, S. et al., 1998, Nature Biotechnology, 16: 49-53). However, when longer probes are necessary to achieve hybridization to the target sequences (e.g., AT-rich sequences), it becomes very difficult to distinguish the small difference that results from a single nucleotide mismatch. The advantage of chemical restriction in this respect is illustrated in Figure 33. As in the example above, a modified nucleotide analog of one of the polymorphic bases (e.g., A) was used in the PCR amplification in place of its natural counterpart. Primer 1 was designed to be close to the polymorphic site so that the polymorphic base A was the first cleavable nucleotide for the A allele. Primer 1 was also labeled with a fluorescent group (F1) located near the 3' end (Figure 33(a)). 
After amplification and chemical restriction, a probe covalently attached to another fluorophore, F2 (shown in Figure 33(b)), can be added and FRET between the two fluorophores measured. Because one of the alleles is cleaved closer to the 3' end of primer 1 than the other, the difference in hybridization between them is expected to be greater than that caused by a single-nucleotide mismatch and can be exploited to distinguish the two allelic targets. As shown in Figure 33(c), the experimental temperature can be adjusted so that only the fragment of the G allele can hybridize with the probe, producing FRET. Since in this system a "NO FRET" result could be interpreted either as the A allele or as a failure of the PCR amplification, it is necessary to measure the fluorescence of each sample at various temperatures to ensure positive detection of the short fragment of the A allele at a lower temperature. Alternatively, positive detection can be achieved through the use of a hairpin probe such as that shown in Figure 33(d). The probe has a 5' tail that folds back to form a hairpin, in addition to a fluorophore, F3, at the 5' end. With the short cleavage fragment from the A allele, the hairpin probe can form a duplex as depicted, which generates detectable FRET between F1 and F3. Only with the longer fragment of the G allele can inter-strand hybridization compete with the stability of the hairpin and cause loss of FRET between F1 and F3.
Example 5. Complete Sequencing by Partial Substitution / Partial Cleavage. Using the following procedure, it is entirely possible to sequence, in a single set of sequencing reactions, a polynucleotide consisting of 10,000, 20,000 or even more bases: polymerization in the presence of modified nucleotides, enzymatic restriction of the polymerization products, purification of the restriction fragments, and chemical degradation to produce sequence ladders from each fragment. The procedure is limited only by the size of the template and the processivity (the ability to continue the polymerization reaction) of the polymerase used to extend the primer. Unlike a shotgun cloning library, in which there is a normal distribution of sequence inserts that requires highly redundant sequencing, with the method described here each nucleotide is sampled once and only once. Repeating the procedure with a second or even a third restriction enzyme mixture provides the sequence information necessary to reassemble the sequences determined from the initial restriction in the proper order, reconstructing the full-length polynucleotide sequence, while also providing the redundancy needed to ensure the accuracy of the results. In the description below, a variety of options are provided for carrying out each step. As mentioned above, it is understood that other modifications of the described process will be apparent to the person skilled in the art; such modifications are within the scope of this invention.
TABLE 6
Primer     Molecular Weight   Mass Difference
RFCC       6099.6
RFC mut    6115.9             +16
RFC mut    5786.7             -313.2

a. Annealing of template and primer. The template used can be a large or small cloning vector or an amplification product, for example a PCR fragment; it can also be single-stranded or double-stranded. For example, and without limitation, the template can be a plasmid, phagemid, cosmid, or a PAC, BAC or YAC clone. Ideally the template is linearized before extension to ensure that all extension products end at the same place. This can be done by digesting the template with a restriction endonuclease. For example, the templates can be prepared in a vector having restriction sites for one or more rare cutters on each side of the cloning site, so that a linear template can be routinely prepared using a rare-cutting enzyme (that is, an enzyme that cleaves at, for example, a 7- or 8-nucleotide recognition sequence). Many plasmid vectors, such as, without limitation, Bluescript (Stratagene, Inc.), have these features. A primer that anneals to a sequence in the vector can be selected, for example the M13 universal primer sequences. This allows a library of clones to be sequenced using only one or two primers (one on each side of the insert). Alternatively, a series of insert-specific primers (at intervals of approximately 5 to 20 kb) can be used in a forward-walking approach. b. Primer extension in the presence of the four natural deoxyribonucleotides and a modified nucleotide corresponding to one of the natural nucleotides. The procedures described above are used to extend the primer over the full length of the template using one of the modified nucleotides described above or any other modified nucleotide that is capable of imparting selective cleavage properties to the modified polynucleotide.
In general, the ratio of the modified nucleotide to its natural counterpart can vary over a considerable range, from very little substitution (approximately 1%) to complete substitution (>99%). The controlling factor is the efficiency of the subsequent chemical cleavage reaction: the more efficient the cleavage reaction, the lower the level of incorporation can be. The objective is to have approximately one modified nucleotide per restriction fragment, so that after cleavage each molecule in the reaction mixture contributes to the sequencing ladder. Figure 7 shows one such modified polynucleotide, a linearized single-stranded M13 template extended to 87 nucleotides in the presence of the modified nucleotide 5'-amino dTTP by the exo-minus Klenow fragment of E. coli DNA polymerase. Figure 9 shows an extension product of 7.2 kb, again produced from the M13 template in the presence of 5'-amino-dTTP and dTTP in a molar ratio of 100:1 (Block A, extension product).
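As a rough numerical illustration of the goal of about one cleavable site per restriction fragment, the sketch below estimates what fraction of a given natural nucleotide would need to be replaced in the extension product. The model is an assumption made for illustration only: it treats the expected number of cleavages per fragment as fragment length x base frequency x substituted fraction x cleavage efficiency, and it ignores any difference between the ratio of nucleotides in the reaction mixture and the ratio actually incorporated by the polymerase. The function name and parameters are hypothetical.

```python
def substitution_fraction(fragment_length, base_frequency=0.25,
                          cleavage_efficiency=1.0, target_cleavages=1.0):
    """Estimate the fraction of one natural nucleotide to replace.

    Assumed model (illustrative, not from the source):
        cleavages per fragment =
            fragment_length * base_frequency * fraction * cleavage_efficiency
    The result is capped at 1.0 (complete substitution).
    """
    if fragment_length <= 0:
        raise ValueError("fragment_length must be positive")
    if not 0 < base_frequency <= 1:
        raise ValueError("base_frequency must be in (0, 1]")
    if not 0 < cleavage_efficiency <= 1:
        raise ValueError("cleavage_efficiency must be in (0, 1]")
    candidate_sites = fragment_length * base_frequency
    fraction = target_cleavages / (candidate_sites * cleavage_efficiency)
    return min(1.0, fraction)
```

Under these assumptions a 400-nucleotide fragment with fully efficient cleavage calls for about 1% substitution, and halving the cleavage efficiency doubles the required substitution, which is the inverse relationship between cleavage efficiency and incorporation level described above.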
c. Purification of the full-length primer extension product. In order to remove prematurely terminated polymerase extension products (i.e., those of less than full length), and thereby ensure a homogeneous sequencing ladder on electrophoresis after cleavage, it may be desirable to purify the full-length or nearly full-length extension products. It is noted, however, that purification of the restriction fragments after digestion (step f, below) achieves essentially the same goal and in most cases is likely to be sufficient. In any case, the removal of short extension products can be carried out by various methods, for example spin column chromatography or high performance liquid chromatography (HPLC). Figure 8 shows a purified full-length extension product before (Block A) and after (Block B) chemical cleavage with acid. d. Cleavage of the primer extension product with one or more restriction enzymes. As noted earlier, the optimal size for DNA sequencing templates (in this case, for restriction products) is approximately 300 to 800 nucleotides when they are to be used for the creation of sequencing ladders. Restriction endonucleases must therefore be employed to reduce the full-length extension product of 10 kb or more to a manageable size. Numerous endonucleases of this type are known in the art. For example, many four-base restriction endonucleases are known, which will generally give restriction products in the desired range. Shorter restriction fragments, for example those of fewer than 300 nucleotides, can also be sequenced, but to make gel runs more efficient it is convenient to separate the restriction fragments into sets according to their length. Shorter fragments will require relatively short sequencing run times, while longer fragments will require longer gels and/or longer run times.
Two or more pools of restriction endonucleases, each containing one or more restriction endonucleases and a compatible buffer, can be used to provide the overlapping sequence information necessary to reassemble the complete polynucleotide sequence from the restriction fragments. Figure 9 shows an exemplary restriction endonuclease digestion of a primer/template complex extended in the presence of dTTP and the modified nucleotide 5'-amino dTTP. As can be seen in Figure 9, complete cleavage was obtained using the restriction endonuclease Msc I. No other Msc I restriction products are observed because only the 5' end of the primer extension product was labeled with 32P. e. Labeling of the restriction endonuclease products. To visualize the DNA sequencing ladder generated by this method, it is necessary to label the restriction endonuclease products with a detectable label. Many such labels are known in the art; any of them can be used with the methods of this invention. Among these are, without limitation, radioactive labels and chemical fluorophores. For example, 35S-dATP (Amersham Pharmacia Biotech, Inc.) or rhodamine-dUTP (Molecular Probes) can be incorporated in the primer extension step. Alternatively, the DNA can be labeled after restriction by modifying the ends of the restriction fragments, for example, without limitation, with T4 polynucleotide kinase or by filling in recessed ends with a DNA polymerase and a labeled nucleotide. Such end labeling is well known in the art (see, for example, Ausubel, F.M., et al., Current Protocols in Molecular Biology, John Wiley & Sons, 1998). End labeling has the advantage of placing one label molecule on each DNA fragment, which generates homogeneous sequencing ladders. Labeling of the template strand has no consequences, since it will not be cleaved during the chemical cleavage reaction owing to the absence of modified nucleotide in that strand.
Thus no sequencing ladder will be produced for the template strand. f. Separation of the labeled restriction endonuclease products. The restriction fragments must be separated before chemical cleavage. Numerous methods for doing so are known in the art (see, for example, Ausubel, F.M., op. cit.). A particularly useful technique is HPLC, which is fast, simple and efficient and can be automated. For example, Figure 10 shows the resolution obtained by HPLC on PhiX174 DNA digested with Hae III. Two preferred separation methods are ion-pair reverse-phase HPLC and ion exchange HPLC.
g. Cleavage of the separated, labeled restriction endonuclease fragments at the sites of modified nucleotide incorporation. Depending on the modified nucleotide incorporated, one of the cleavage reactions already described herein is used, or any other cleavage reaction that cleaves selectively at the site of incorporation of the modified nucleotide; such other cleavage reactions are within the scope of this invention. h. Determination of the sequence of the fragment. Figure 11 shows the sequence ladder obtained from a polynucleotide in which T has been replaced with 5'-amino T. This ladder, of course, indicates only where T appears in the complete sequence of the target polynucleotide. To obtain the entire sequence, the above procedure would be repeated three more times, in each case replacing one of the remaining nucleotides, A, C and G, with the corresponding modified nucleotide, for example 5'-amino-dATP, 5'-amino-dCTP or 5'-amino-dGTP. Once the four individual fragment ladders are in hand, the complete sequence of the polynucleotide can be readily reconstructed by analyzing and comparing the gel sequencing data.
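The final reconstruction in step h, combining four base-specific ladders into one sequence, can be sketched as follows. This is a minimal illustration that assumes the ladder bands have already been converted into integer positions (1-based, counted from the labeled end); reading band positions off a gel is outside its scope, and the function name `merge_ladders` is hypothetical.

```python
def merge_ladders(ladders, length):
    """Combine base-specific ladders into a single sequence string.

    ladders: dict mapping a base ('A', 'C', 'G' or 'T') to the 1-based
             positions at which cleavage at that base was observed.
    length:  number of positions in the fragment being read.
    Positions not covered by any ladder are reported as 'N'.
    """
    sequence = ["N"] * length
    for base, positions in ladders.items():
        for pos in positions:
            if not 1 <= pos <= length:
                raise ValueError(f"position {pos} is outside the fragment")
            if sequence[pos - 1] != "N":
                # Two ladders claiming the same position indicates a
                # misread band or non-specific cleavage.
                raise ValueError(f"conflicting calls at position {pos}")
            sequence[pos - 1] = base
    return "".join(sequence)
```

For example, ladders with T at positions 1, 2 and 5, A at 3, C at 4 and G at 6 merge to TTACTG. Reporting conflicts and gaps explicitly, rather than guessing, reflects the point that redundancy from additional restriction digests is what ensures accuracy.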
Example 6. Complete sequencing by complete substitution / essentially complete cleavage, in combination with mass spectrometry. The above procedure for the complete sequencing of a polynucleotide still requires the use of gel electrophoresis to create fragment ladders from which the sequence is read. As mentioned before, gel electrophoresis is a laborious and time-consuming process that also requires considerable skill to carry out with reasonable assurance of reproducible and accurate results. One aspect of this invention is that the use of gel electrophoresis can be eliminated entirely and replaced with mass spectrometry, which is relatively simpler, faster, more sensitive and more accurate and can be automated. The basis of this aspect of the invention is the uniqueness, already mentioned, of the molecular weights of virtually all 2-mers through 14-mers, except for the 8 fragment pairs described above (and other fragment pairs derived by adding identical sets of nucleotides to those 8 fragment pairs). The following is an example of how the procedure could be carried out. While the example is described in terms of a person performing the specific analyses at each step, it will be evident to those skilled in the art that a computer program can be devised that completely automates the analytical procedure and further increases the speed of this aspect of the invention. The use of such a computer program is, therefore, within the scope of the present invention. The procedure for determining the complete nucleotide sequence by mass spectrometry would consist of the following steps: a. essentially complete replacement of a natural nucleotide in a polynucleotide with a modified nucleotide to form a modified polynucleotide. This would be achieved by an amplification procedure or by primer extension, using the polymerase reaction described above.
Optionally, the procedure set forth above could be used to arrive at the optimal polymerase or set of polymerases for preparing the desired modified polynucleotide; b. cleavage of the modified polynucleotide under conditions that favor substantially complete cleavage essentially only at the sites of incorporation of the modified nucleotide into the modified polynucleotide; and c. determination of the masses of the fragments obtained in the preceding cleavage reaction.
The three preceding steps are then repeated three more times, each time with a different modified nucleotide corresponding to each of the remaining natural nucleotides. The result is a series of masses from which all or almost all of the original polynucleotide sequence can be determined. Any ambiguity in the sequence remaining after the main analysis should be easily resolved using one or more reactions that involve substitution and cleavage at a contiguous dinucleotide, or by a conventional DNA sequencing procedure. The following is an example of how the analysis of a fragment can proceed. Starting from the following natural 20-nucleotide oligomer extended from a 16-mer primer: 5'-primer-TTACTGCATCGATATTAGTC-3', polymerization in the presence of dTTP, dCTP, dGTP and a modified dATP will yield, after essentially complete cleavage, five fragments whose masses are shown in Table 7. Carrying out the procedure another three times for the three remaining natural nucleotides yields three more sets of fragments, the masses of which are also shown in Table 7.
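The four cleavage sets for this example can be generated with a short Python sketch. It assumes cleavage immediately 5' of every occurrence of the modified nucleotide and uses rounded residue masses (A 313, C 289, G 329, T 304 Da), which are an assumption chosen because they reproduce the masses quoted in the worked analysis (for example 608 Da for TT and 1235 Da for a fragment containing one each of A, C, G and T). Real masses depend on the modification and cleavage chemistry. The first piece of each set is the stretch that remains attached to the primer, so its mass is the excess over the known primer mass.

```python
# Rounded residue masses in Da; an assumption consistent with the masses
# quoted in the worked example, not values measured for a modified nucleotide.
RESIDUE_MASS = {"A": 313, "C": 289, "G": 329, "T": 304}

TARGET = "TTACTGCATCGATATTAGTC"  # the 20 nucleotides following the primer


def cleave_5prime(sequence, base):
    """Split `sequence` immediately 5' of every occurrence of `base`.

    The first piece stays attached to the primer and may be empty.
    """
    pieces, current = [], ""
    for nt in sequence:
        if nt == base:
            pieces.append(current)
            current = ""
        current += nt
    pieces.append(current)
    return pieces


def mass(fragment):
    """Sum of the rounded residue masses of a fragment."""
    return sum(RESIDUE_MASS[nt] for nt in fragment)


if __name__ == "__main__":
    for base in "ACGT":
        pieces = cleave_5prime(TARGET, base)
        print(base, [(piece, mass(piece)) for piece in pieces])
```

For the A reaction this gives the pieces TT, ACTGC, ATCG, AT, ATT and AGTC. ATCG and AGTC have the same composition and therefore the same mass, which is why six pieces appear as only five distinct masses.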
From these masses, the nucleotide content (although not yet the sequence) of every fragment can be determined uniquely. The actual sequence is determined by analyzing the four cleavage results jointly. For example, on inspecting the masses of all the fragments in Table 7, it can readily be determined that only one mass in each cleavage set comprises more than 16 nucleotides, that all other fragments lie 3' of the primer (since the fragment containing the primer must be at least 16 nucleotides long), and that there are two nucleotides after the primer in the A cleavage column, three in the C column, five in the G column and none in the T column. Therefore, the sequence must start with TT followed by an A, then a C, an unknown nucleotide and then a G. The sequence must start with two T residues since no cleavage at A, C or G is present in the initial interval. Also, by adding the masses of the fragments in the different cleavage sets, it can be seen that the length of the unknown region is 20 nucleotides. The number of nucleotides in each of the four cleavage sets can also be easily determined: set A: (primer + 2) + 5 + 4 + 3 + 2 = 16; set C: (primer + 3) + 10 + 3 + 3 + 1 = 20; set G: (primer + 5) + 7 + 5 + 3 = 20; set T: 4 + 3 + 3 + 2 + 2 + 1 = 15. From this information it is clear that there must be fragments of overlapping (identical) mass in sets A and T.
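Determining nucleotide content from a fragment mass amounts to finding the counts of A, C, G and T whose residue masses sum to the observed value. The sketch below does this by exhaustive search. It assumes the rounded residue masses A 313, C 289, G 329, T 304 Da (chosen to match the masses quoted in this example) and exact integer masses; a practical implementation would use exact masses for the actual cleavage chemistry and a tolerance window. The function name and the `max_length` parameter are hypothetical.

```python
# Rounded residue masses in Da (an assumption matching the worked example).
MASS_A, MASS_C, MASS_G, MASS_T = 313, 289, 329, 304


def compositions(mass, max_length=14):
    """Return all base compositions whose residue masses sum to `mass`.

    Each result is a dict of counts for A, C, G and T. Fragments longer
    than `max_length` nucleotides are not considered.
    """
    results = []
    for a in range(max_length + 1):
        for c in range(max_length + 1 - a):
            for g in range(max_length + 1 - a - c):
                rest = mass - (a * MASS_A + c * MASS_C + g * MASS_G)
                if rest < 0:
                    break  # adding more G only increases the total
                t, remainder = divmod(rest, MASS_T)
                total = a + c + g + t
                if remainder == 0 and 0 < total <= max_length:
                    results.append({"A": a, "C": c, "G": g, "T": t})
    return results
```

With these masses, 608 Da resolves only to two T's, 1235 Da only to one each of A, C, G and T, and 1514 Da only to one A, one C and three T's, in line with the assignments made in the analysis. A mass that returned more than one composition would flag one of the ambiguous fragment pairs mentioned earlier.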
By subtracting the known mass of the primer from that of the fragments containing the primer, the nucleotide content of the sequence immediately following the primer is found. Thus, in the A lane, the residual mass of 608 daltons, which per Table 3 corresponds to TT, must correspond to the first two nucleotides of the unknown fragment. The sequence that follows the primer is therefore already known to be TTAC_G. From the mass of the 5-mer in the G lane (1514 daltons), it can be seen that the 5-mer contains three T's, an A and a C. Therefore, the missing nucleotide must be a T; the leading sequence is TTACTG.
Table 7: Nucleotide-specific cleavage patterns for the sequence shown at the top, consisting of a primer of unspecified sequence and length followed by 20 nucleotides of "unknown" sequence for the purposes of this example. The cleavages in this example occur by a mechanism that breaks the 5' phosphodiester linkage of the modified nucleotide. Each cleavage set includes a fragment containing the primer plus however many nucleotides follow the primer up to the first occurrence of the modified nucleotide. The known mass of the primer can be subtracted from this (larger) mass to obtain the difference, which gives the mass and therefore the nucleotide content of the sequence immediately 3' of the primer. The masses provided in the table reflect the presence of an external phosphate group in each cleavage mass; however, it should be recognized that, depending on the chemical nature of the nucleotide modification and the cleavage reaction, the actual masses will most likely differ from those shown in the table. These differences are expected to be systematic, however, and therefore do not invalidate the analysis presented. Turning now to the masses shown in the T lane of Table 7, the mass of 906 daltons must contain a T, an A and a C. As a TAC sequence is already known, this can tentatively be taken as confirmation of that part of the sequence from the overlap of the A and T cleavages. Of course, the possibility that there is another 3-mer containing T, A and C in the fragment cannot yet be ruled out, so this assignment must remain tentative for now. The next T cleavage fragment must contain, at a minimum, a T and a G. Two T cleavage masses allow this: 946 and 1235. Therefore, the additional sequence must be either G followed by T (if the mass of 946 is the next mass) or G followed by a C and an A, in unknown order, and then a T.
We now know that the sequence is either TTACTGGT or TTACTG(C,A)T (parentheses with a comma between the nucleotides are used to indicate an unknown order). Returning to the A cleavage reaction, it can be seen that the next cleavage mass after TT contains ACTG. Two masses, 1235 Da and 1524 Da, meet this criterion. If 1235 Da is correct, the seventh nucleotide of the sequence is A, since the cleavage must have occurred at that nucleotide. If 1524 Da is correct, then the sequence continues CA. CA is consistent with one of the two possibilities mentioned above; therefore the overall sequence so far is TTACTGCAT. Looking now at the masses of the C cleavage reaction, it can be seen that the first mass after the initial TTA must be CTG(C,A). Since cleavage occurs 5' of every C, the possibilities are CTG or CTGA, and only the first of these is supported by the masses in the C lane. Therefore, the second mass fragment in the C lane must be CTG followed by another C (since cleavage occurred at this point). The third mass in the C lane (906 Da) must contain a C, an A and a T, which confirms the previous CAT sequence. This leaves only two possibilities for the remaining sequence: a C followed by the 10-mer, or the 10-mer followed by a terminal C. However, if the first case held, then a cleavage fragment from one of the other lanes, A, G or T, would have to show a 3-mer, 4-mer or 5-mer containing two C's. As none of those masses allows this type of oligomer, the lone C must be at the 3' end of the unknown fragment and the 10-mer follows CAT, giving the following sequence: TTACTGCATC C. Returning again to the G cleavages, it is known that a fragment containing at least GCATC must exist. Of the available masses, this can be GCATC itself (1524 Da) or the 7-mer (2180 Da).
If the mass of the 5-mer is subtracted from the mass of the 7-mer, the remaining mass, 656 Da, does not correspond to any known oligonucleotide. Therefore, the 7-mer cannot be next, GCATC is the correct sequence, and the next nucleotide must be a G (since cleavage occurred there to give the 5-mer). The sequence is now TTACTGCATCG C. The next mass in the T cleavage series should start with TCG. The only T cleavage mass that allows this combination is 1235 Da, which corresponds to a TCGA sequence. This sequence must be followed by a T, since cleavage has to occur at that point. Therefore, the overall sequence is: TTACTGCATCGAT C. There is only one mass among the available T cleavage series containing a C, the 593 Da mass of TC. Therefore, the nucleotide preceding the terminal C must be a T. Likewise, the only mass in the A cleavage series that contains TC without containing two C's, which is known not to be allowed, is 1235, or (A,G)TC. The mass of 1235 has already been used once (nucleotides 8-11), but it is also known that there is a fragment of overlapping mass, since the A series accounts for a total of only 16 nucleotides. The sequence is now known to be TTACTGCATCGAT (A,G)TC. However, if the terminal sequence were ATC, there would have to be a mass of 906 Da among the A cleavage fragments, and there is not. On the other hand, if the terminal sequence is GTC, a mass of 922 Da must be found among the G cleavage fragments, and there is one. Therefore, the sequence can be established as TTACTGCATCGAT AGTC. There is only one available T cleavage mass containing A and G but not C, the mass of 946 Da consisting of T(A,G). This mass must account for the AG at positions 17 and 18. Therefore, position 16 must be a T, and the sequence is known to be TTACTGCATCGAT TAGTC. Only two masses remain available in the A cleavage set, 617 (AT) and 921 (ATT). These complete the overall sequence in two ways, ATATT or ATTAT. None of the masses allows this ambiguity to be resolved.
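The residual ambiguity can be checked computationally: two candidate sequences cannot be told apart by complete cleavage if, for each of the four bases, they give the same collection of fragment masses. The sketch below makes that comparison, assuming cleavage 5' of each occurrence of the modified base and the rounded residue masses A 313, C 289, G 329, T 304 Da used in this example. The function names are hypothetical.

```python
from collections import Counter

# Rounded residue masses in Da (an assumption matching the worked example).
RESIDUE_MASS = {"A": 313, "C": 289, "G": 329, "T": 304}


def fragment_masses(sequence, base):
    """Multiset of fragment masses after cleavage 5' of every `base`.

    The first piece (possibly empty, mass 0) represents the nucleotides
    left attached to the primer.
    """
    pieces = sequence.replace(base, " " + base).split(" ")
    return Counter(sum(RESIDUE_MASS[nt] for nt in piece) for piece in pieces)


def indistinguishable(seq1, seq2):
    """True if all four base-specific cleavages give identical mass sets."""
    return all(
        fragment_masses(seq1, base) == fragment_masses(seq2, base)
        for base in "ACGT"
    )
```

Applied to TTACTGCATCGATATTAGTC and TTACTGCATCGATTATAGTC, which differ only in the ATATT versus ATTAT arrangement, the comparison reports the two as indistinguishable, so additional information of a different kind is needed to choose between them.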
Nevertheless, all 20 nucleotides of the target oligonucleotide have been unambiguously identified in a single experiment, and 18 of the 20 nucleotides have been unambiguously sequenced. As to ambiguity in general, whether a single ambiguity, as in the preceding example, or more than one, as would be the case when sequencing longer fragments, it can easily be resolved with an additional experiment using one of several available procedures, depending on the nature of the ambiguity and the environment in which it exists, i.e., the nucleotides on both sides of it. For example, an experiment using the nucleotide cleavage method of this invention could provide the additional information needed to resolve the ambiguity. Alternatively, some relaxation of the essentially complete cleavage conditions could result in a mass ladder in which a known mass is linked to an adjacent ambiguous mass, clarifying the position and order of the ambiguous mass relative to the known mass. Another possibility is to use low-accuracy, single-pass Sanger sequencing. On its own, this quick and relatively easy version of Sanger sequencing would not provide much valuable information, but as a complement to the method of this invention it would probably provide enough information to resolve the ambiguity (and, to the extent that the sequencing ladder obtained can be read unambiguously, would provide partial redundancy for verifying the mass spectrometry data).
CONCLUSION Thus, it will be appreciated that the method of this invention provides versatile tools for the detection of variance in polynucleotides, for the determination of complete nucleotide sequences of polynucleotides and for the genotyping of DNA. Although certain embodiments and examples have been used to describe the invention, it will be apparent to the skilled person that changes can be made to the embodiments and examples shown without departing from the scope of the invention. Other embodiments are within the scope of the following claims.

Claims (151)

  1. A method for cleaving a polynucleotide, comprising: a. replacing one or more natural nucleotides at essentially every point of occurrence in a polynucleotide with modified nucleotides to form a modified polynucleotide, provided that when only one natural nucleotide is replaced, the modified nucleotide is not a ribonucleotide or a nucleoside α-thiotriphosphate; b. contacting the modified polynucleotide with a reagent or reagents that cleave the modified polynucleotide at essentially each point of occurrence of the one or more modified nucleotides.
  2. The method according to claim 1, wherein variance in nucleotide sequence among related polynucleotides is detected, further comprising: c. determining the masses of the fragments obtained from step b; and d. comparing the masses of the fragments with the masses of the fragments expected from cleavage of a related polynucleotide of known sequence; or e. repeating steps a through c with one or more related polynucleotides of unknown sequence and comparing the masses of the fragments of the polynucleotide with the masses of the fragments obtained from the related polynucleotides.
  3. The method according to claim 1, wherein the nucleotide sequence of the polynucleotide is determined, further comprising: c. determining the masses of the fragments obtained from step b; and d. repeating steps a, b and c, each time replacing a different natural nucleotide in the polynucleotide with a modified nucleotide, until each natural nucleotide of the polynucleotide has been replaced with a modified nucleotide, each modified polynucleotide has been cleaved and the masses of the cleavage fragments have been determined; and e. constructing the nucleotide sequence of the polynucleotide from the masses of the fragments.
  4. The method according to claim 1, by which the genotype of a known polynucleotide containing a polymorphism or mutation is determined, comprising: c. using as the natural nucleotide to be replaced a nucleotide known to be involved in the polymorphism or mutation; d. replacing the natural nucleotide at essentially each point of occurrence by amplifying the polynucleotide using a modified nucleotide to form a modified polynucleotide; e. cleaving the modified polynucleotide into fragments at essentially each point of occurrence of the modified nucleotide; f. analyzing the fragments to determine the genotype.
  5. The method according to claim 4, wherein the analysis of the fragments comprises using electrophoresis, mass spectrometry or FRET detection.
  6. The method according to claim 1, comprising: a. replacing a first natural nucleotide at essentially each point of occurrence in a polynucleotide with a first modified nucleotide to form a modified polynucleotide; b. replacing a second natural nucleotide at essentially each point of occurrence in the modified polynucleotide with a second modified nucleotide to form a doubly modified polynucleotide; and c. contacting the doubly modified polynucleotide with a reagent or reagents that cleave the doubly modified polynucleotide at each site where the first modified nucleotide is followed immediately in sequence by the second modified nucleotide.
  7. The method according to claim 6, wherein variance in nucleotide sequence among related polynucleotides is detected, comprising: d. determining the masses of the fragments obtained from step c; e. comparing the masses of the fragments with the masses of the fragments expected from cleavage of a related polynucleotide of known sequence, or f. repeating steps a through d with one or more related polynucleotides of unknown sequence and comparing the masses of those fragments with the masses of the fragments obtained from cleavage of the related polynucleotides.
  8. The method according to claim 1, wherein variance in nucleotide sequence among related polynucleotides is detected, comprising: a. replacing three of the four natural nucleotides at essentially each point of occurrence in a polynucleotide with three stabilizing modified nucleotides to form a modified polynucleotide having one remaining natural nucleotide; b. cleaving the modified polynucleotide into fragments at essentially each point of occurrence of the remaining natural nucleotide; c. determining the masses of the fragments; and d. comparing the masses of the fragments with the masses of the fragments expected from cleavage of a related polynucleotide of known sequence, or e. repeating steps a through c with one or more related polynucleotides of unknown sequence and comparing the masses of the fragments with the masses obtained from cleavage of the related polynucleotides.
  9. The method according to claim 8, further comprising replacing one of the remaining natural nucleotides with a destabilizing modified nucleotide.
  10. The method according to claim 1, wherein variance in nucleotide sequence among related polynucleotides is detected, comprising: a. replacing two or more natural nucleotides at essentially every point of occurrence in a polynucleotide with two or more modified nucleotides, wherein each modified nucleotide has a cleavage characteristic different from that of the others, to form a modified polynucleotide; b. cleaving the modified polynucleotide into first fragments at essentially each point of occurrence of a first modified nucleotide of the two or more modified nucleotides; c. cleaving the first fragments into second fragments at each point of occurrence in the first fragments of a second modified nucleotide of the two or more modified nucleotides; d. determining the masses of the first fragments and the second fragments; and e. comparing the masses of the first fragments and the second fragments with the masses of the first fragments and the second fragments expected from cleavage of a related polynucleotide of known sequence; or f. repeating steps a through d with one or more related polynucleotides of unknown sequence and comparing the masses of the first and second fragments with the masses obtained from cleavage of the related polynucleotides.
  11. The method according to claim 10, wherein the steps are repeated using a modified polynucleotide obtained by replacing different pairs of natural nucleotides with modified nucleotides, that is, replacing the first and third, the second and fourth, the first and fourth, the second and third, or the third and fourth natural nucleotides with modified nucleotides.
  12. The method according to claim 10, wherein the excision comprises using a mass spectrometer.
  13. The method according to claim 12, wherein the mass spectrometer is a battery or tandem mass spectrometer.
  14. A method for determining the nucleotide sequence of a polynucleotide, comprising: a. replacing a natural nucleotide at a percentage of its points of occurrence in a polynucleotide with a modified nucleotide to form a modified polynucleotide, wherein the modified nucleotide is not a ribonucleotide; b. cleaving the modified polynucleotide into fragments at virtually every point of occurrence of the modified nucleotide; c. repeating steps a and b, each time replacing a different natural nucleotide of the polynucleotide with a modified nucleotide; and d. determining the masses of the fragments obtained from the cleavage reactions; and e. constructing the polynucleotide sequence from the masses; or f. analyzing a sequence ladder obtained from the fragments of step c.
  15. A method for determining the nucleotide sequence of a polynucleotide, comprising: a. replacing a natural nucleotide at a first percentage of its points of occurrence in a polynucleotide with a modified nucleotide to form a modified polynucleotide, wherein the modified nucleotide is not a ribonucleotide or a nucleoside α-thiotriphosphate; b. cleaving the modified polynucleotide into fragments at a second percentage of the points of occurrence of the modified nucleotide, such that the combination of the first percentage and the second percentage results in partial cleavage; c. repeating steps a and b, each time replacing a different natural nucleotide of the polynucleotide with a modified nucleotide; d. determining the masses of the fragments obtained from the cleavage reactions; and e. constructing the polynucleotide sequence from the masses; or f. analyzing a sequence ladder obtained from the fragments of steps a and b.
  16. The method according to claim 1, wherein the nucleotide sequence of a polynucleotide is determined, comprising: a. replacing two or more natural nucleotides at virtually every point of occurrence in a polynucleotide with two or more modified nucleotides to form a modified polynucleotide; b. separating the modified polynucleotide into two or more aliquots, the number of aliquots being equal to the number of natural nucleotides replaced in step a; and c. cleaving the modified polynucleotide in each of the aliquots into fragments at essentially each point of occurrence of a different one of the modified nucleotides, such that each aliquot contains fragments from cleavage at a different modified nucleotide than each of the other aliquots; d. determining the masses of the fragments; and e. constructing the nucleotide sequence from the masses; or f. cleaving the modified polynucleotide in each of the aliquots into fragments at a percentage of the points of occurrence of a different modified nucleotide, such that each aliquot contains fragments from cleavage at a different modified nucleotide than the other aliquots; and g. analyzing a sequence ladder obtained from the fragments of step f.
  17. A method for determining the nucleotide sequence of a polynucleotide, comprising: a. replacing a first natural nucleotide at a percentage of its points of incorporation in a polynucleotide with a first modified nucleotide, wherein the first modified nucleotide is not a ribonucleotide or a nucleoside α-thiotriphosphate, to form a first partially modified polynucleotide; b. cleaving the first partially modified polynucleotide into fragments, using a cleavage procedure of known cleavage efficiency, to form a first set of nucleotide-specific cleavage products; c. repeating steps a and b, replacing a second, a third and a fourth natural nucleotide with a second, a third and a fourth modified nucleotide to form second, third and fourth partially modified polynucleotides that, when cleaved, produce second, third and fourth sets of nucleotide-specific cleavage products; d. performing gel electrophoresis on the first, second, third and fourth sets of nucleotide-specific cleavage products to form a sequence ladder; and e. reading the polynucleotide sequence from the sequence ladder.
  18. A method for cleaving a polynucleotide during polymerization, comprising: mixing together four different nucleotides, one or two of which are modified nucleotides, and two or more polymerases, at least one of which produces an enhanced cleavage at the points where the modified nucleotide is being incorporated or, if two modified nucleotides are used, at the points where one of the modified nucleotides is followed immediately in sequence by the other modified nucleotide.
  19. The method according to claim 18, wherein two modified nucleotides are used, one of which is a ribonucleotide and the other of which is a 5'-amino-2',5'-dideoxynucleotide.
  20. The method according to claim 19, wherein two polymerases are used, one being Klenow (exo-) polymerase and the other being the mutant Klenow (exo-) polymerase E710A.
  21. The method according to any of claims 1, 6, 8, 10, 14, 15, 16, 17 or 18, wherein the natural nucleotides that are not replaced with modified nucleotides are replaced with mass-modified nucleotides.
  22. The method according to any of claims 1, 6, 8, 10, 14, 15, 16, 17 or 18, wherein the polynucleotide is selected from the group consisting of DNA and RNA.
  23. The method according to any of claims 1, 6, 8, 10, 14, 15, 16, 17 or 18, wherein the detection of the masses of the fragments comprises using mass spectrometry.
  24. The method according to claim 23, wherein the mass spectrometry is electrospray ionization mass spectrometry.
  25. The method according to claim 23, wherein the mass spectrometry is matrix-assisted laser desorption/ionization (MALDI) mass spectrometry.
  26. The method according to claim 14, 15 or 16, wherein the analysis of a sequence ladder comprises gel electrophoresis.
  27. The method according to claim 17, further comprising: c. cleaving the first, second, third and fourth partially modified polynucleotides obtained in step (a) with one or more restriction enzymes to form restriction fragments; d. labeling the ends of the restriction fragments; and e. purifying the labeled restriction fragments before carrying out step (b) of claim 17.
  28. A method for cleaving a polynucleotide such that virtually all fragments obtained from the cleavage bear a label, comprising: a. replacing a natural nucleotide, partially or at essentially each point of occurrence in a polynucleotide, with a modified nucleotide to form a modified polynucleotide; b. contacting, in the presence of a phosphine covalently bound to a label, the modified polynucleotide with a reagent or reagents that cleave the modified polynucleotide partially or at essentially each point of occurrence.
  29. The method according to claim 28, wherein the phosphine is tris(carboxyethyl)phosphine.
  30. The method according to claim 28, wherein the label is selected from the group consisting of a fluorescent label and a radioactive label.
  31. A method for detecting a variance in the nucleotide sequence of a polynucleotide, for sequencing a polynucleotide, or for determining the genotype of a polynucleotide known to contain a polymorphism or mutation, comprising: a. replacing one or more natural nucleotides of the polynucleotide with one or more modified nucleotides, wherein each modified nucleotide is modified with one or more modifications selected from the group consisting of a modified base, a modified sugar and a modified phosphate ester, provided that if only one of the natural nucleotides is being replaced, the modified nucleotide is not a ribonucleotide or a nucleoside α-thiotriphosphate; b. contacting the modified polynucleotide with a reagent or reagents that cleave the modified polynucleotide into fragments at the sites of incorporation of the modified nucleotide; c. analyzing the fragments to detect the variance, to construct the sequence, or to determine the genotype of the polynucleotide.
  32. The method according to claim 31, wherein the modified nucleotide comprises a modified base.
  33. The method according to claim 32, wherein the modified base comprises a modified adenine.
  34. The method according to claim 33, wherein the modified adenine is 7-deaza-7-nitroadenine.
  35. The method according to claim 34, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  36. The method according to claim 34, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a phosphine.
  37. The method according to claim 36, wherein the step of contacting the modified polynucleotide with a phosphine comprises contacting the modified polynucleotide with tris(2-carboxyethyl)phosphine.
  38. The method according to claim 32, wherein the modified base comprises a modified cytosine.
  39. The method according to claim 38, wherein the modified cytosine comprises azacytosine.
  40. The method according to claim 38, wherein the modified cytosine is a cytosine substituted at the 5-position with an electron-withdrawing group.
  41. The method according to claim 40, wherein the electron-withdrawing group is selected from the group consisting of nitro and halo.
  42. The method according to claim 39, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  43. The method according to claim 42, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with tris(2-carboxyethyl)phosphine.
  44. The method according to claim 32, wherein the modified base comprises a modified guanine.
  45. The method according to claim 44, wherein the modified guanine is 7-methylguanine.
  46. The method according to claim 45, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  47. The method according to claim 44, wherein the modified guanine is N2-allylguanine.
  48. The method according to claim 47, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with an electrophile.
  49. The method according to claim 48, wherein the electrophile is iodine.
  50. The method according to claim 32, wherein the modified base is selected from the group consisting of modified thymine and modified uracil.
  51. The method according to claim 50, wherein the modified thymine or the modified uracil is 5-hydroxyuracil.
  52. The method according to claim 51, wherein the cleavage of the modified polynucleotide into fragments comprises: a. contacting the polynucleotide with a chemical oxidant; and then b. contacting the polynucleotide with a chemical base.
  53. The method according to claim 31, wherein the modified nucleotide comprises a modified sugar, provided that when only one type of modified nucleotide is being used, it is not a ribonucleotide or a nucleoside α-thiotriphosphate.
  54. The method according to claim 53, wherein the modified sugar comprises a 2-keto sugar.
  55. The method according to claim 54, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  56. The method according to claim 53, wherein the modified sugar comprises arabinose.
  57. The method according to claim 56, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  58. The method according to claim 53, wherein the modified sugar comprises a 4-hydroxymethyl group.
  59. The method according to claim 58, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  60. The method according to claim 53, wherein the modified sugar comprises hydroxycyclopentane.
  61. The method according to claim 60, wherein the hydroxycyclopentane comprises 1-hydroxy- or 2-hydroxycyclopentane.
  62. The method according to claim 60, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  63. The method according to claim 53, wherein the modified sugar comprises an azido sugar.
  64. The method according to claim 63, wherein the azido sugar comprises a 2'-azido, 4'-azido or 4'-azidomethyl sugar.
  65. The method according to claim 63, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the polynucleotide with tris(2-carboxyethyl)phosphine (TCEP).
  66. The method according to claim 53, wherein the modified sugar comprises a group capable of being photolyzed to form a free radical.
  67. The method according to claim 66, wherein the group capable of being photolyzed to form a free radical is selected from the group consisting of phenylselenyl and t-butylcarboxy.
  68. The method according to claim 66, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with ultraviolet light.
  69. The method according to claim 53, wherein the modified sugar comprises a cyano sugar.
  70. The method according to claim 69, wherein the cyano sugar is selected from the group consisting of a 2'-cyano sugar and a 2-cyano sugar.
  71. The method according to claim 69, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  72. The method according to claim 53, wherein the modified sugar comprises an electron-withdrawing group.
  73. The method according to claim 72, wherein the electron-withdrawing group is selected from the group consisting of fluorine, azido, methoxy and nitro.
  74. The method according to claim 73, wherein the electron-withdrawing group is located at the 2', 2'' or 4' position of the modified sugar.
  75. The method according to claim 72, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  76. The method according to claim 53, wherein the modified sugar comprises an electron-withdrawing element in the sugar ring.
  77. The method according to claim 76, wherein the electron-withdrawing element comprises nitrogen.
  78. The method according to claim 77, wherein the nitrogen replaces the ring oxygen of the modified sugar.
  79. The method according to claim 77, wherein the nitrogen replaces a ring carbon of the modified sugar.
  80. The method according to claim 78, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  81. The method according to claim 79, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  82. The method according to claim 53, wherein the modified sugar comprises a mercapto group.
  83. The method according to claim 82, wherein the mercapto group is located at the 2' position of the sugar.
  84. The method according to claim 82, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  85. The method according to claim 53, wherein the modified sugar is selected from the group consisting of a 5'-methylene sugar, a 5'-keto sugar and a 5',5'-difluoro sugar.
  86. The method according to claim 85, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  87. The method according to claim 31, wherein the modified nucleotide comprises a modified phosphate ester, provided that, when only one type of modified nucleotide is used, it is not a nucleoside α-thiotriphosphate.
  88. The method according to claim 87, wherein the modified phosphate ester comprises a phosphorothioate.
  89. The method according to claim 88, wherein the sulfur atom of the phosphorothioate is not covalently bound to a sugar ring.
  90. The method according to claim 89, wherein the cleavage of the modified polynucleotide into fragments comprises: a. contacting the sulfur of the phosphorothioate with an alkylating agent; and b. contacting the modified polynucleotide with a chemical base.
  91. The method according to claim 90, wherein the alkylating agent is methyl iodide.
  92. The method according to claim 89, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the sulfur of the phosphorothioate with β-mercaptoethanol in a chemical base.
  93. The method according to claim 92, wherein the chemical base comprises sodium methoxide in ethanol.
  94. The method according to claim 88, wherein the sulfur atom of the phosphorothiolate is covalently bound to a sugar ring.
  95. The method according to claim 94, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  96. The method according to claim 87, wherein the modified phosphate ester comprises a phosphoramidate.
  97. The method according to claim 96, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with an acid.
  98. The method according to claim 87, wherein the modified phosphate ester comprises a group selected from the group consisting of alkyl phosphonate and alkyl phosphorotriester.
  99. The method according to claim 98, wherein the alkyl is methyl.
  100. The method according to claim 96, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with an acid.
  101. The method according to claim 31, which comprises replacing a first and a second natural nucleotide with a first and a second modified nucleotide, such that the polynucleotide can be cleaved specifically at sites where the first modified nucleotide is followed immediately in sequence by the second modified nucleotide.
  102. The method according to claim 101, wherein: the first modified nucleotide is covalently linked at its 5' position to a sulfur atom of a phosphorothioate group; and the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous with the first modified nucleotide and is toward the 5' end thereof.
  103. The method according to claim 102, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  104. The method according to claim 101, wherein: the first modified nucleotide is covalently linked at its 3' position to a sulfur atom of a phosphorothioate group; and the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous with the first modified nucleotide and is toward the 3' end thereof.
  105. The method according to claim 104, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  106. The method according to claim 101, wherein: the first modified nucleotide is covalently linked at its 5' position to a first oxygen atom of a phosphorothioate group; the second modified nucleotide is substituted at its 2' position with a leaving group; and the second modified nucleotide is covalently linked at its 3' position to a second oxygen atom of the phosphorothioate group.
  107. The method according to claim 106, wherein the leaving group is selected from the group consisting of fluorine, chlorine, bromine and iodine.
  108. The method according to claim 106, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  109. The method according to claim 108, wherein the chemical base comprises sodium methoxide.
  110. The method according to claim 101, wherein: the first modified nucleotide is covalently linked at its 5' position to a first oxygen atom of a phosphorothioate group; the second modified nucleotide is substituted at its 4' position with a leaving group; and the second modified nucleotide is covalently linked at its 3' position to a second oxygen atom of the phosphorothioate group.
  111. The method according to claim 110, wherein the leaving group is selected from the group consisting of fluorine, chlorine, bromine and iodine.
  112. The method according to claim 110, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with a chemical base.
  113. The method according to claim 112, wherein the chemical base comprises sodium methoxide.
  114. The method according to claim 101, wherein: the first modified nucleotide is covalently linked at its 5' position to a first oxygen atom of a phosphorothioate group; the second modified nucleotide is substituted at its 2' position with one or two fluorine atoms; and the second modified nucleotide is covalently linked at its 3' position to a second oxygen atom of the phosphorothioate group.
  115. The method according to claim 114, wherein the cleavage of the modified polynucleotide into fragments comprises: a. contacting the modified polynucleotide with ethylene sulfide or β-mercaptoethanol; and then b. contacting the modified polynucleotide with a chemical base.
  116. The method according to claim 115, wherein the chemical base comprises sodium methoxide.
  117. The method according to claim 101, wherein: the first modified nucleotide is covalently linked at its 5' position to a first oxygen atom of a phosphorothioate group; the second modified nucleotide is substituted at its 2' position with a hydroxy group; and the second modified nucleotide is covalently linked at its 3' position to a second oxygen atom of the phosphorothioate group.
  118. The method according to claim 117, wherein the cleavage of the modified polynucleotide into fragments comprises: a. contacting the modified polynucleotide with a metal oxidant; and then b. contacting the modified polynucleotide with a chemical base.
  119. The method according to claim 118, wherein the metal oxidant is selected from the group consisting of Cu(II) and Fe(III).
  120. The method according to claim 118, wherein the chemical base is selected from the group consisting of dilute hydroxide, piperidine and dilute ammonium hydroxide.
  121. The method according to claim 101, wherein: the first modified nucleotide is covalently linked at its 5' position to a nitrogen atom of a phosphoramidate group; and the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous with the first modified nucleotide and toward the 5' end thereof.
  122. The method according to claim 121, wherein the cleavage of the modified polynucleotide comprises contacting the modified polynucleotide with an acid.
  123. The method according to claim 101, wherein: the first modified nucleotide is covalently linked at its 5' position to a nitrogen atom of a phosphoramidate group; and the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous with the first modified nucleotide and toward the 3' end thereof.
  124. The method according to claim 123, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with an acid.
  125. The method according to claim 101, wherein: the first modified nucleotide is covalently linked at its 5' position to an oxygen atom of an alkyl phosphonate or an alkyl phosphorotriester group; and the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous with the first modified nucleotide.
  126. The method according to claim 125, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with an acid.
  127. The method according to claim 101, wherein: the first modified nucleotide has an electron-withdrawing group at its 4' position; and the second modified nucleotide, which is modified with a 2'-hydroxy group, is contiguous with the first modified nucleotide and toward the 5' end thereof.
  128. The method according to claim 127, wherein the cleavage of the modified polynucleotide into fragments comprises contacting the modified polynucleotide with an acid.
  129. A compound having the chemical structure: [chemical structures not reproduced] wherein R1 is selected from the group consisting of: [structures not reproduced]; R2 is selected from the group consisting of cytosine, guanine, inosine and uracil; and "Base" is selected from the group consisting of cytosine, guanine, inosine, thymine and uracil.
  130. A polynucleotide comprising a dinucleotide sequence selected from the group consisting of: [chemical structures not reproduced] wherein each "Base" is independently selected from the group consisting of adenine, cytosine, guanine and thymine; W is an electron-withdrawing group; X is a leaving group; and R is a lower alkyl group; wherein a second W or X shown on the same carbon atom indicates that a single W or X may be in either of the positions, or that two W or two X may be present simultaneously.
  131. The compound according to claim 130, wherein the electron-withdrawing group is selected from the group consisting of F, Cl, Br, I, NO2, C≡N, -C(O)OH and OH.
  132. The compound according to claim 130, wherein the leaving group is selected from the group consisting of Cl, Br, I and OTs.
  133. A method for synthesizing a polynucleotide, comprising mixing a compound having the chemical structure: [chemical structure not reproduced] wherein R1 is selected from the group consisting of: [structures not reproduced] with adenosine triphosphate, guanosine triphosphate and thymidine or uridine triphosphate in the presence of one or more polymerases.
  134. A method for synthesizing a polynucleotide, comprising mixing a compound having the chemical structure: [chemical structure not reproduced] wherein R1 is selected from the group consisting of: [structures not reproduced] with adenosine triphosphate, cytidine triphosphate and guanosine triphosphate in the presence of one or more polymerases.
  135. A method for synthesizing a polynucleotide, comprising mixing a compound having the chemical structure: [chemical structure not reproduced] wherein R1 is selected from the group consisting of: [structures not reproduced] with cytidine triphosphate, guanosine triphosphate and thymidine triphosphate in the presence of one or more polymerases.
  136. A method for synthesizing a polynucleotide, comprising mixing a compound having the chemical structure: [chemical structure not reproduced] wherein R1 is selected from the group consisting of: [structures not reproduced] with adenosine triphosphate, cytidine triphosphate and thymidine triphosphate in the presence of one or more polymerases.
  137. A method for synthesizing a polynucleotide, comprising mixing a compound of claim 129 with whichever three of the four nucleoside triphosphates, adenosine triphosphate, cytidine triphosphate, guanosine triphosphate and thymidine triphosphate, do not contain the base (or its substitute) present in the compound of claim 129 employed, in the presence of one or more polymerases.
  138. A method for synthesizing a polynucleotide, comprising mixing one of the following pairs of compounds: [chemical structures not reproduced] wherein: Base1 is selected from the group consisting of adenine, cytosine, guanine or inosine, and thymine or uracil; Base2 is selected from the group consisting of the three remaining bases that are not Base1; R3 is O⁻-P(=O)(O⁻)-O-P(=O)(O⁻)-O-P(=O)(O⁻)-O-; R is a lower alkyl group; W is an electron-withdrawing group; X is a leaving group, wherein a second W or X shown on the same carbon atom indicates that a single W or X may be in either of the positions, or that two W or two X may be present simultaneously; with whichever two of the nucleoside triphosphates, adenosine triphosphate, cytidine triphosphate, guanosine triphosphate and thymidine triphosphate, do not contain Base1 or Base2 (or their substitutes), in the presence of one or more polymerases.
  139. A mutant polymerase that is capable of catalyzing the incorporation of a modified nucleotide into a polynucleotide, wherein the modified nucleotide is not a ribonucleotide, obtained by a process comprising DNA shuffling.
  140. The polymerase of claim 139, wherein the DNA shuffling process comprises: a. selecting one or more known polymerases; b. performing DNA shuffling; c. transforming the shuffled DNA into a host cell; d. culturing colonies of the host cell; e. forming a lysate from the host cell colonies; f. adding a DNA template containing a detectable reporter sequence, the modified nucleotide or nucleotides whose incorporation into the polynucleotide is desired, and the natural nucleotides that have not been replaced by the modified nucleotides; and g. examining the lysate for the presence of the detectable reporter sequence.
  141. The polymerase of claim 139, wherein the DNA shuffling process comprises: a. selecting a known polymerase, or two or more known polymerases having different sequences or different biochemical properties, or both; b. performing DNA shuffling; c. transforming the shuffled DNA into a host to form a library of transformants in host cell colonies; d. preparing first separate groupings of the transformants by plating the host cell colonies; e. forming a lysate from each of the first separate groupings of host cell colonies; f. removing all natural nucleotides from each lysate; g. combining each lysate with: i. a single-stranded DNA template comprising a sequence corresponding to an RNA polymerase promoter followed by an indicator sequence; ii. a single-stranded DNA primer complementary to one end of the template; iii. the modified nucleotide or nucleotides whose incorporation into the polynucleotide is desired; iv. each natural nucleotide that has not been replaced by the modified nucleotide or nucleotides; h. adding RNA polymerase to each of the combined lysates; i. examining the combined lysates for the presence of the indicator sequence; j. creating second separate groupings of transformants in host cell colonies from each of the first separate groupings of host cell colonies in which the presence of the indicator sequence was detected; k. forming a lysate from each of the second separate groupings of host cell colonies; l. repeating steps g, h, i, j, k and l to form separate groupings of transformants in host cell colonies until only one colony remains, which contains the polymerase; and m. recloning the polymerase from the host cell colony into a protein expression vector.
  142. 142. A mutant polymerase that is capable of catalyzing the incorporation of a modified nucleotide into a polynucleotide, wherein the modified nucleotide is not a ribonucleotide obtained by a process comprising the senescence selection of the cells.
  143. 143. The polymerase according to claim 142, wherein the senescence selection of the cell comprises: a. mutagenizing a known polymerase to form a library of mutant polymerases; b. clone the library within a vector; c. transforming the vector into the selected host cells so that it is capable of being exterminated by a selected chemical agent, P1266 only when the cell is actively cultured; d. add a modified nucleotide; and. cultivate the host cells; F. treat the host cells with the selected chemical agent; g. separate living cells from dead cells; and h. Isolate polymerase or polymerases from living cells.
  144. 144. The polymerase according to claim 143, wherein steps d to g are repeated one or more times.
  145. 145. The polymerase according to claim 142, wherein the process comprises: a. mutagenesis of the known polymerase to form a library of mutant polymerases; b. cloning the library of mutant polymerases within a plasmid vector; c. transform the bacterial cells of the plasmid vector which, when cultured, are susceptible to an antibiotic; d. select the transfectants using the antibiotic; and. introducing a modified nucleotide, such as the corresponding nucleoside triphosphate, into the bacterial cell; P1266 f. cultivate the cells; g. add an antibiotic that will exterminate bacterial cells that are actively grown; h. isolate bacterial cells; i. cultivate the bacterial cells in a new medium that does not contain an antibiotic; j selecting live cells from the growing colonies; k. isolate the plasmid vector from living cells; 1. Isolate the polymerase; and m. do a polymerase assay.
  146. The polymerase according to claim 145, wherein steps c through k of the process are repeated one or more times before proceeding to step l.
  147. The polymerase according to claim 139, wherein the polymerase is a thermostable polymerase.
  148. A mutant polymerase that is capable of catalyzing the incorporation of a modified nucleotide into a polynucleotide, wherein the modified nucleotide is not a ribonucleotide, the polymerase being obtained by a process comprising phage display.
  149. The mutant polymerase according to claim 148, wherein the phage display comprises: (a) selecting a DNA polymerase; (b) expressing the polymerase in a bacteriophage vector as a fusion to a bacteriophage coat protein; (c) attaching an oligonucleotide to the surface of the phage; (d) forming a primer-template complex either by the addition of a second oligonucleotide complementary to the oligonucleotide of (c) or by the formation of a self-priming complex using the intramolecular complementarity of the oligonucleotide of (c); (e) performing primer extension in the presence of a modified nucleotide or nucleotides and of those natural nucleotides to be replaced by the modified nucleotide or nucleotides, one of the natural nucleotides being labeled with a detectable reporter; and (f) sorting the phage bearing the detectable reporter from the phage without the detectable reporter.
  150. The polymerase according to claim 139, 142 or 148, wherein the modified nucleotide is selected from the group consisting of: a compound having the chemical structure: [chemical structure not reproduced] wherein R1 is selected from the group consisting of: [chemical structures not reproduced]; a compound having the chemical structure: [chemical structure not reproduced] wherein the "Base" is selected from the group consisting of cytosine, guanine, inosine and uracil; a compound having the chemical structure: [chemical structure not reproduced] wherein the "Base" is selected from the group consisting of adenine, cytosine, guanine, inosine and uracil; a compound having the chemical structure: [chemical structure not reproduced] wherein the "Base" is selected from the group consisting of adenine, cytosine, guanine, inosine, thymine and uracil; and a compound having the chemical structure: [chemical structure not reproduced].
  151. A kit comprising: one or more modified nucleotides; one or more polymerases capable of incorporating those modified nucleotides into a polynucleotide so as to form a modified polynucleotide; and a reagent or reagents capable of cleaving the modified polynucleotide at each point of occurrence of those modified nucleotides within the polynucleotide.
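
The cleavage recited in claim 151 (cutting a modified polynucleotide at every position where the modified nucleotide occurs) can be illustrated with a small toy model. The sketch below is only an illustration and is not taken from the patent: the function name cleave_at_modified, the example sequences, and the choice to cut immediately after each modified base are all assumptions. Real cleavage chemistry may cut on either side of the modified nucleotide or remove it, and fragments would normally be compared by mass rather than by length.

```python
def cleave_at_modified(sequence: str, modified_base: str) -> list[str]:
    """Split `sequence` after every occurrence of `modified_base`.

    Hypothetical model: assumes the modified nucleotide fully replaces one
    natural base and that cleavage occurs on its 3' side, leaving the
    modified base at the end of each fragment.
    """
    fragments = []
    current = ""
    for base in sequence:
        current += base
        if base == modified_base:
            # A cut happens right after each modified base.
            fragments.append(current)
            current = ""
    if current:
        # Trailing stretch that contains no modified base.
        fragments.append(current)
    return fragments


# Made-up example sequences differing at a single position (A -> G).
reference = "GATTACAGGCTA"
variant = "GATTACGGGCTA"

ref_fragments = cleave_at_modified(reference, "A")
var_fragments = cleave_at_modified(variant, "A")

print(ref_fragments)  # ['GA', 'TTA', 'CA', 'GGCTA']
print(var_fragments)  # ['GA', 'TTA', 'CGGGCTA']
```

In this toy case the single-base change removes one cleavage site, so two reference fragments merge into one longer variant fragment. A change of this kind in the fragment set is what allows a sequence variance to be detected without determining the full sequence.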
MXPA/A/2001/003404A 1998-10-01 2001-04-02 A method for analyzing polynucleotides MXPA01003404A (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US60/102,724 1998-10-01
US60/149,533 1999-08-17
US09/394,387 1999-09-10
US09/394,774 1999-09-10
US09394457 1999-09-10
US09394467 1999-09-10

Publications (1)

Publication Number Publication Date
MXPA01003404A (en) 2002-06-05
