CN105602937A - Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids - Google Patents


Publication number
CN105602937A
Authority
CN
China
Prior art keywords
dna
gvt
target
target dna
skeleton
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610028288.2A
Other languages
Chinese (zh)
Inventor
骆树恩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Versitech Ltd
Original Assignee
Versitech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Versitech Ltd
Priority claimed from CN2009801359358A (CN102165073A)
Publication of CN105602937A

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1093: General methods of preparing gene libraries, not provided for in other subgroups
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/64: General methods for preparing the vector, for introducing it into the cell or for selecting the vector-containing host
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/66: General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease

Abstract

A method of juxtaposing sequence tags (GVTs) that are unique positional markers along the length of a population of target nucleic acid molecules is provided, the method comprising: fragmenting the target nucleic acid molecules to form target DNA inserts; ligating the target DNA inserts to a DNA vector or backbone to create circular molecules; digesting the target DNA inserts with an endonuclease to cleave each target DNA insert at a distance from each of its ends, yielding two GVTs comprising terminal sequences of the target DNA insert attached to an undigested linear backbone; recircularizing the linear backbone with the attached GVTs to obtain a circular DNA containing a GVT-pair having two juxtaposed GVTs; and recovering the GVT-pair DNA by nucleic acid amplification or by digestion with an endonuclease having sites flanking the GVT-pair. Cosmid vectors are provided for creating GVT-pairs of approximately 45- to 50-kb separation that can be sequenced on next-generation DNA sequencers.

Description

Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids
This application is a divisional application. The original application was filed on July 9, 2009, under application number 200980135935.8 (PCT/CN2009/000777), and is entitled "Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids".
Cross-reference to related applications
This application claims priority to the following applications: U.S. Patent Application No. 60/756,417, filed January 4, 2006; U.S. Patent Application No. 60/792,926, filed April 17, 2006; U.S. Patent Application No. 60/814,378, filed June 15, 2006; U.S. Patent Application No. 61/129,660, filed July 10, 2008; U.S. Patent Application No. 61/193,442, filed December 1, 2008; U.S. Patent Application No. 11/649,587, filed January 3, 2007; and U.S. Patent Application No. 11/954,947, filed December 12, 2007. All of these applications are incorporated herein by reference in their entirety.
Field of the invention
The present invention relates generally to methods for high-throughput analysis of fine-structural-variations in nucleic acids. In particular, the present invention relates to novel strategies, vectors, and other components for producing pairs of linked nucleic acid tags, wherein the constituent members of a linked tag pair are separated by a user-defined distance and/or are markers of nucleic acid positions that demarcate adjacent cleavage sites for one or more different restriction endonucleases along the length of a target nucleic acid molecule. In one preferred embodiment, the present invention is used to identify genomic variations or markers that correlate with a phenotype. In another preferred embodiment, the present invention is used to create high-resolution genomic maps to aid genome assembly from shotgun DNA sequencing.
Background of the invention
While the most abundant and most thoroughly studied type of human genomic variation is the single nucleotide polymorphism (SNP), it is increasingly clear that so-called "fine-structural-variations", comprising copy number alterations (insertions, deletions, and duplications), inversions, translocations, and other sequence rearrangements, are integral features of the human genome and of other genomes. These types of variation appear to be present in the general population much more frequently than originally thought. Accumulating evidence indicates that structural variants can comprise millions of nucleotides of heterogeneity within each individual. Understanding the role of fine-structural-variations in genome evolution, interaction with the environment, phenotypic diversity, and disease is one of the most active areas of current genomic research. For reviews, see Feuk et al. (2006), Redon et al. (2006), Check (2005), Cheng et al. (2005), and Bailey et al. (2002).
In comparison with SNP analysis, efficient high-throughput methods for analyzing fine-structural-variations are not well developed. An important first step is the technique of array comparative genomic hybridization (array CGH) (Pinkel et al., 1998; Pinkel et al., U.S. Pat. Nos. 5,830,645 and 6,159,685), which quantifies relative copy number between target DNA and reference DNA. Array CGH allows reliable detection of deoxyribonucleic acid (DNA) copy-number differences between DNA samples at the resolution of individual arrayed bacterial artificial chromosome (BAC) clones (Snijders et al., 2001; Albertson et al., 2000; Pinkel et al., 1998). Adaptation of array CGH to cDNA (Heiskanen et al., 2000; Pollack et al., 1999) and to high-density oligonucleotide array platforms (Bignell et al., 2004; Brennan et al., 2004; Hung et al., 2004; Lucito et al., 2003) has further extended the resolution and utility of the approach. Through its use, array CGH has led to the identification of copy number alterations associated with tumors (Pinkel and Albertson, 2005; Inazawa et al., 2004; Albertson and Pinkel, 2003; Pollack et al., 2002) and with disease progression (Gonzalez et al., 2005).
1. Fosmid paired-end mapping
Although useful for determining copy number, array CGH is not suited to identifying other types of genomic structural variation, most notably inversions, translocations, and other types of nucleic acid rearrangements. Tuzun et al. (2005) attempted to address these limitations with an approach termed "fosmid paired-end mapping". This approach relies on the head-full mechanism of fosmid packaging to produce genomic DNA libraries from test subjects with reasonably uniform genomic inserts of approximately 40 kilobases (kb). Experimentally, the actual fragments ranged from 32 kb to 48 kb, within < 3 standard deviations of the mean of 39.9 +/- 2.76 kb. End-terminal sequencing of randomly selected ~40-kb library inserts produces pairs of short sequence tags, in which each tag pair marks two genomic positions separated by approximately 40 kb along the length of the target DNA. The tag pairs are then computationally aligned to a reference genomic assembly; any discordance with either their expected orientation or their ~40-kb separation distance denotes the presence of at least one structural difference between target and reference nucleic acids spanning that region. Tag pairs whose map positions are separated by more than 40 kb denote a deletion in the target DNA relative to the reference; map positions separated by less than 40 kb denote an insertion of DNA in the target. Discordance in the orientation of a mapped tag pair denotes a potential DNA inversion or other complex chromosomal rearrangement. Assignment of the two members of a tag pair to two different chromosomes in the reference sequence denotes a chromosomal translocation. Analysis of over one million individually purified fosmid clone inserts by conventional DNA sequencing enabled Tuzun et al. (2005) to identify nearly 300 sites of structural variation between the test subject and the reference genomic assembly.
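The interpretation rules described above for mapped tag pairs can be sketched in a few lines of Python. This is an illustrative sketch only, not code from Tuzun et al. (2005) or from the present disclosure: the names `TagHit` and `classify_pair` are hypothetical, the 32-48 kb concordance window is taken from the fragment range quoted above, and the convention that a concordant pair maps as a forward-strand tag followed by a reverse-strand tag is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TagHit:
    """Reference alignment of one end tag (hypothetical record layout)."""
    chrom: str
    pos: int      # reference coordinate of the tag, in bp
    strand: str   # "+" or "-"


def classify_pair(left: TagHit, right: TagHit,
                  expected: int = 40_000, tolerance: int = 8_000) -> str:
    """Classify one mapped tag pair against the expected ~40-kb spacing.

    Assumption: a concordant pair maps to one chromosome as a "+" tag
    followed by a "-" tag, 32-48 kb apart (expected +/- tolerance).
    """
    if left.chrom != right.chrom:
        return "translocation"
    first, second = sorted((left, right), key=lambda h: h.pos)
    if (first.strand, second.strand) != ("+", "-"):
        return "inversion_or_rearrangement"
    span = second.pos - first.pos
    if span > expected + tolerance:
        return "deletion_in_target"    # tags map farther apart than the insert
    if span < expected - tolerance:
        return "insertion_in_target"   # tags map closer together than the insert
    return "concordant"
```

A real pipeline would also need to handle multiply mapping tags and sequencing errors, which this sketch ignores.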
The authors did not teach or disclose other methods for producing tag pairs, for producing tag pairs of different spacing to vary the spatial resolution of the analysis, for improving the uniformity of insert length in their libraries, or for improving economy through the use of next-generation DNA sequencers. Nor did they disclose methods for producing other types of sequence tag pairs, such as the sequence tag pairs of the present invention, which demarcate genomic positions based on the location of and/or separation distance between pairs of adjacent endonuclease cleavage sites.
Many types of fine-structural-variation are not resolved by the fixed ~40-kb resolution window of fosmid paired-end mapping. Fosmid paired-end mapping has other limitations. Fosmid vectors propagate in host cells at very low copy number, a property used to minimize the potential recombination, rearrangement, and other artifacts encountered during propagation of certain genomic sequences in a microbial host. Although amplifiable versions of fosmid vectors are currently in use (Szybalski, U.S. Pat. No. 5,874,259), end-sequencing of fosmid clones to generate sequence tags remains very uneconomical because of the low DNA yield compared with conventional plasmids, which makes it difficult to sustain high-throughput automated template production and sequencing. Furthermore, two separate sequencing reactions are required to generate a tag-pair sequence from a single fosmid DNA template, further reducing economy.
Although fosmid paired-end mapping is a useful start for identifying fine-structural-variations in the human genome, the immense cost and logistical effort required to purify and sequence millions of fosmid insert ends for each test subject preclude its use in broad population and cohort surveys to identify genomic variations that may be associated with complex diseases or with responses to environmental factors and the like. In addition, fosmid vectors and their variants generally propagate in host cells at very low copy number, making reliable automated DNA production and sequencing difficult to sustain. Hence, there is a need for an efficient, robust, high-throughput, and low-cost method for identifying fine-structural-variations in genomic and association studies, in order to link these genetic elements to disease, disease progression, and disease susceptibility.
2. Existing methods for producing genomic tags
A variety of DNA-based fingerprinting methods have been described in the art for characterizing and comparing genomes (Wimmer et al., 2002; Kozdroj and van Elsas, 2001; Rouillard et al., 2001; Schloter et al., 2000). All of these methods use some combination of restriction endonuclease digestion of the target DNA, PCR amplification, or gel electrophoretic separation. Commonly, the laborious need to extract candidate DNA fragments from gels has hampered the use of these methods for DNA sequencing. An advance was made by the work of Dunn et al. (2002), who described a method that uses the type IIS/type IIG restriction endonuclease MmeI to produce "Genomic Signature Tags" (GSTs) for analyzing genomic DNA. GSTs are produced by ligating adaptors bearing an MmeI recognition site to genomic DNA fragments. These fragments are generated by an initial digestion of the target genomic DNA with a type II restriction endonuclease, followed by a second digestion with a frequent cutting tagging enzyme. Digestion of the adaptor-ligated DNA with MmeI creates a 21-bp tag (GST) at a position in the DNA that is fixed relative to the site recognized by the initial restriction digestion. After PCR amplification, the purified GSTs are oligomerized for cloning and DNA sequencing. The identity of the tags and their relative abundance are used to create a high-resolution "GST sequence profile" of genomic DNA, which can be used to identify and quantify the genome of origin within a given complex DNA isolate. Using Yersinia pestis as a model system, Dunn et al. (2002) were able to define regions in a relatively simple genome that might have undergone changes that added or deleted restriction sites. However, the method of Dunn et al. (2002) has limited utility in complex genomes such as the human genome, where most structural variations are not revealed by the simple gain or loss of a small number of the restriction endonuclease sites under investigation. Furthermore, the number of GSTs required to cover a large genome or to analyze multiple samples is prohibitive, even for a single restriction site. In contrast, the GVT-pairs of the present invention provide the economy and analytical power to profile complex genomes or to analyze multiple DNA samples extensively.
Various versions of a method termed Serial Analysis of Gene Expression (SAGE), first described by Velculescu et al. (1995) and Kinzler et al. (1995) (U.S. Pat. No. 5,695,937), also make use of type IIS or type IIG restriction endonucleases to generate DNA tags (Ng et al., 2005; Wei et al., 2004; Saha et al., 2002). The so-called "SAGE tags" are generated from cDNA templates to provide an assessment of the complexity and relative abundance of cDNA species in a biological sample. A later version of SAGE, termed "LongSAGE", makes use of MmeI digestion to create sequence tags of 21 bp for marking mRNA transcripts (Saha et al., 2002). The most recent refinement, termed "SuperSAGE", makes use of the type III restriction endonuclease EcoP15I to produce longer tags of 25 bp to 27 bp for improved assignment of mRNA to the genome (Matsumura et al., 2003). Although the present invention also makes use of type IIS, type IIG, or type III restriction endonucleases to generate sequence tags, the resulting GVT-pairs of the present invention are fundamentally distinct from the aforementioned SAGE and GST tags in their method of production and in their improved informational content. For producing high-resolution physical maps, which are particularly useful for characterizing novel genomes or annotating fine-structural-variations in genomes and DNA samples, the spatially linked tag pairs of the present invention markedly improve efficiency and analytical power over the use of unlinked single tags.
The recent work of Ng et al. (2005) described a further development of the SAGE method. The investigators made use of a method pioneered by Collins and Weissman (1984), in which circularization of DNA fragments (also referred to as intra-molecular ligation) is used to link distal DNA segments together into a vector, producing the so-called "genomic jumping libraries" (Collins et al., 1987). Ng et al. circularized individual cDNAs to link their 5'- and 3'-derived SAGE tags together, producing "Paired-End diTags" (PETs), which are then oligomerized to facilitate efficient sequencing. PETs are useful for genomic annotation: by identifying the transcription start sites and polyadenylation sites of transcription units, they demarcate gene boundaries and help identify flanking regulatory sequences. While both the GVT-pairs of the present invention and PETs rely on intra-molecular ligation to achieve linkage of DNA markers, only the GVT-pairs of the present invention integrate physical distance and other useful information, such as the linkage of adjacent restriction sites, which makes GVT-pairs unique and useful for detailed genomic structural analysis. Ng et al. (2005) did not teach methods for producing spatially defined tags or tags based on other criteria as described in the present disclosure. Nor did they disclose how their PET method could be used to reveal genomic fine-structural-variations, or disclose methods of producing sequence tags other than the sole use of the type IIS restriction endonuclease MmeI. Finally, Ng et al. (2005) did not foresee methods that make efficient use of next-generation short-read DNA sequencers.
Berka et al. (2006) (U.S. Patent Application 2006/0292611) and Kobel et al. (2007) recently described a DNA paired-end mapping approach that is functionally similar to the present invention, but their approach differs fundamentally in the spatial orientation of the final tagged DNA product and has certain important disadvantages. In the approach of Kobel et al. (2007) and Berka et al. (2006), the workers ligated a biotinylated hairpin adaptor to each end of the target DNA insert, after which the molecule was circularized by joining the adaptor sequences together. This brings the original ends of the target DNA into close proximity, one on either side of the newly juxtaposed pair of biotinylated adaptors. The circular molecule was then randomly sheared to produce exposed ends at random distances from the original ends of the target DNA insert. The resulting linear DNA fragments were recovered by avidin affinity chromatography and sequenced along their entire length.
Kobel et al. (2007) made use of the next-generation DNA sequencer GENOME SEQUENCER FLX (Roche Diagnostics, Indianapolis, IN; 454 Life Science Corp, Bradford, CT), often referred to as the "454 sequencer", to obtain the original end sequences of the target DNA insert. However, the products so produced cannot be interrogated efficiently on the SOLEXA GENOME ANALYZER (Illumina, San Diego, CA), often referred to as the "SOLEXA sequencer", on the SOLiD sequencer (Applied Biosystems, Foster City, CA), or on any other next-generation sequencing platform that produces "short sequence reads". The DNA products of Kobel et al. (2007) and Berka et al. (2006) adopt a so-called "outside-in" topology, whereby the original ends of the target DNA insert (the "outside") are oriented in inverted positions ("in") and are separated by the newly juxtaposed pair of biotinylated adaptors, which lies at a random position within the length of the resulting DNA fragment. Because of this "outside-in" topology relative to the original target DNA ends, determining the end sequences of the original target DNA fragment requires sequencing several hundred bases or more through the DNA product, across the biotinylated adaptor pair and into the sequence on the other side. Most of the products so produced fall within the 400-bp read length of the 454 sequencer. Short-read DNA sequencers such as the SOLEXA cost one tenth or less as much to operate as the 454 sequencer but typically support read lengths of 50 bases, which is insufficient to interrogate accurately the products of the methods of Berka et al. (2006) and Kobel et al. (2007). Berka et al. (2006) described a variant of their method in which the type IIS restriction endonuclease MmeI is used to produce tags of approximately 20 bases corresponding to the end sequences of the original DNA insert. By this means, the workers fixed the tag length within the sequencing capability of SOLEXA-type DNA sequencers. However, the tags are still in the "outside-in" topology, and the fixed tags of approximately 20 bases produced by MmeI digestion are too short to map unambiguously to complex genomes for use as genomic tools or as aids to sequence assembly. Moreover, tags fixed at 20 bases cannot benefit from recent improvements in the read length of next-generation short-read DNA sequencers. The read length currently supported by SOLEXA is 50 bases from each end of the DNA template and is expected to increase to 76 bases later in 2009.
The present invention overcomes the aforementioned limitations through: (1) the ability to produce GVT-pairs in which the spacing of the tag-pair members on the target DNA can be engineered from below 1 kb to several hundred kb or more, so that the detection resolution suits the analysis of different types of nucleic acid and any given experimental design; (2) markedly more accurate and uniform spacing between tag-pair members, for greater analytical precision; (3) the ability to produce genomic tag pairs based on criteria other than separation distance, for example based on the location of and/or relative separation distance between adjacent endonuclease cleavage sites, creating tag pairs for improved interrogation of target nucleic acid samples; and (4) for greater economy, the suitability of the methods of the invention for use on massively parallel next-generation DNA sequencers. The invention adopts a so-called "outside-out" topology, whereby the juxtaposed end-sequence tags (GVT-pairs) retain the same spatial orientation as the original ends of the target DNA insert, and uses frequent cutting type II restriction endonucleases to produce GVTs averaging 100-200 bp in length. As a result, the SOLEXA "paired-end-read" platform can directly read even longer GVT sequences, limited only by the actual read length of the instrument.
Summary of the invention
The present invention relates to systems, methods, compositions, vectors, vector components, and kits for producing pairs of linked genomic sequence tags for the rapid generation of high-resolution genomic maps. The invention produces pairs of juxtaposed short sequence tags, termed Genomic Variation Tags (GVTs), wherein the constituent members of a GVT-pair are separated by a user-defined distance and/or are markers of positions that demarcate adjacent cleavage sites for one or more different restriction endonucleases along the length of the nucleic acid molecule under investigation.
When the individual GVTs of a GVT-pair are computationally aligned to a reference sequence, any discordance of their expected identity, separation distance, and/or orientation with the reference sequence denotes the presence of one or more fine-structural differences between target and reference nucleic acids in the region spanned by the GVT-pair. In this way, a comprehensive library of GVT-pairs represents a high-resolution genomic profile that can be used to generate high-resolution structural maps for identifying fine-structural-variations between nucleic acid populations. Another aspect of the invention enables the user to define and alter the separation distance marked by the GVT-pairs on the nucleic acid population, allowing the production of GVT-pair libraries tailored to detect fine-structural-variations at different spatial resolutions and physical coverage. Another aspect of the invention produces GVT-pairs that are markers of positions immediately proximal to pairs of adjacent and cleavable recognition sites for one or more different restriction endonucleases along the length of the nucleic acid population under investigation. Accordingly, the invention can be used to study the methylation state of a DNA population by producing sequence tags from differential digestion with methylation-sensitive restriction endonucleases. Another aspect of the invention produces GVT-pairs that are markers of pairs of adjacent and cleavable recognition sites for one or more different restriction endonucleases and that are separated by a user-defined distance along the length of the nucleic acid population under investigation. Another aspect of the invention provides methods, vectors, and DNA backbones for producing GVT-pairs with separation distances of up to about 50 kb or more on the target DNA. Another aspect of the invention provides methods for producing GVT-pairs that can be sequenced efficiently on massively parallel next-generation DNA sequencers. For reviews of next-generation DNA sequencers, see Morozova and Marra (2008) and Mardis (2008).
According to one aspect of the invention, target DNA populations for analysis are fragmented either randomly or at defined sites. The fragmented target DNA inserts are ligated into a suitable vector or DNA backbone. The ligated target inserts are then digested with one or more frequent cutting type II restriction endonucleases, which cleave each insert at a useful distance from each end, releasing the intervening sequence and leaving a pair of GVTs attached to the undigested vector or DNA backbone. Typically, digestion with a frequent cutting type II restriction endonuclease having a 4-base recognition site produces GVTs of 100-200 bp in length, corresponding to the average distance between the end of a target DNA insert and the position of the first cleavage site. The newly created vector-GVT complex is recircularized by ligating the GVTs together to form a GVT-pair, which represents the juxtaposed terminal regions of the original target DNA insert in the same relative orientation as in that insert. The GVT-pair is released from the vector or DNA backbone by digestion at restriction endonuclease sites flanking the GVT-pair or by PCR using suitable primers flanking the GVT-pair. When the individual GVTs of a GVT-pair are computationally aligned to a reference sequence, any discordance of their expected identity, separation distance, or orientation with those aligned on the reference denotes the presence of one or more fine-structural differences between target and reference nucleic acids in the region spanned by the GVT-pair. Hence, the tabulated sequences of a plurality of GVT-pairs constitute a detailed genomic profile of the target nucleic acid population relative to the reference sequence.
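The digestion and recircularization steps described above can be illustrated with a small in-silico sketch. It is a simplification under stated assumptions, not the procedure of the disclosure: the function name `gvt_pair` and the parameter `cut_offset` are hypothetical, the insert is modeled as a single-strand string, sticky-end chemistry and the vector itself are ignored, and the enzyme is assumed to cut at a fixed offset from the start of each recognition site.

```python
from typing import Optional, Tuple


def gvt_pair(insert: str, site: str, cut_offset: int = 0) -> Optional[Tuple[str, str]]:
    """Return the two terminal tags (GVTs) left after digesting an insert.

    The enzyme is assumed to cut `cut_offset` bases after the start of each
    occurrence of `site`. Everything between the first and last cut is the
    released intervening sequence; the two remaining ends are the GVTs.
    Returns None when the insert has fewer than two recognition sites,
    because then no intervening sequence is released.
    """
    first = insert.find(site)
    last = insert.rfind(site)
    if first == -1 or first == last:
        return None
    left_gvt = insert[: first + cut_offset]
    right_gvt = insert[last + cut_offset:]
    return left_gvt, right_gvt


# Toy insert with two GATC sites; real inserts are kilobases long.
toy_insert = "AAACC" + "GATC" + "TTTTTT" + "GATC" + "GGGTT"
tags = gvt_pair(toy_insert, "GATC")
if tags is not None:
    # Joining the tags in their original order mimics the "outside-out"
    # juxtaposition obtained on recircularization.
    juxtaposed = tags[0] + tags[1]
```

In this model the left tag precedes the right tag in the joined product, so both keep the orientation they had in the insert.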
According to another aspect of the invention, fragmented target DNA is cloned into the novel cosmid vectors pSLGVT-28, pSLGVT-35, pSLGVT-36, pSLGVT-37, or pSLGVT-38 to produce GVT-pairs of 45-50 kb separation distance for sequencing on the next-generation SOLEXA, SOLiD, or 454 DNA sequencers. These and other aspects of the invention will become apparent upon reference to the following detailed description. In addition, various references (including patents, patent applications, and journal articles) are identified below and are incorporated herein by reference.
Useful applications offered by the present invention or its derived products include, but are not limited to, the rapid construction of high-resolution genomic maps that can be used to: (1) identify fine-structural-variations of the genome that contribute to human diversity and that might be causal to disease, disease progression, disease susceptibility, or other observed traits, for use as diagnostics or as targets for therapeutic intervention; (2) enable the design and creation of oligonucleotide microarrays or other assay methods for rapid and massively parallel interrogation of fine-structural-variants in DNA samples, for medical diagnosis, genotyping, and other such useful applications; (3) facilitate accurate and rapid DNA assembly from whole-genome or shotgun DNA sequencing approaches; (4) identify fine-structural-variations of RNA transcripts resulting from differential RNA processing, to aid genomic annotation, functional genomic studies, and potential disease diagnosis; (5) create genomic profiles to facilitate comparative genomics and phylogenetic studies and to aid the differential identification of closely related organisms; and (6) create genomic profiles of related strains, races, biotypes, variants, breeds, or species to identify genomic elements that might be causal to any observable phenotype of academic, medical, or commercial interest.
Detailed Description Of The Invention
The invention provides high throughput method, carrier and the carrier component of novel improved, screening andFine structure in qualification nucleic acid group changes. The present invention includes and produce sequence label arranged side by side (GVT)In vitro and in vivo method, two compositions of label to (GVT to) in described sequence label arranged side by sideMember for limit spacing distance unique location mark and/or be the mark of nucleic acid position, its alongThe length of multiple target nucleic acid molecules is divided the phase of one or more different restriction endonucleasesAdjacent cleavage site. Described method comprises: target nucleic acid molecule fragmentation is inserted to form target DNAEnter thing; Target DNA insert is connected with DNA vector or skeleton, to produce ring molecule;With the preferably II type restriction endonuclease digestion target of frequent cutting of one or more nucleasesDNA insert, to cut target on each end of target DNA insertDNA insert, thus produce two sequence labels (GVT), its comprise be connected to indigestedThe target DNA insert end sequence of wire carrier or DNA skeleton; And make to have connectionThe wire carrier of GVT or DNA skeleton recirculation, obtain containing having two GVT arranged side by sideThe right ring-shaped DNA molecule of GVT; By nucleic acid amplification or with thering is GVT to flankThe restriction endonuclease digestion in site, reclaims GVT to DNA.
When the individual GVTs of a GVT-pair are aligned computationally onto a reference sequence, any discordance with their expected identity, separation distance, and/or orientation relative to the reference sequence denotes the presence of one or more fine-structural differences between the target and reference nucleic acids in the region spanned by the GVT-pair. By this method, a comprehensive library of GVT-pairs represents a high-resolution genomic profile that can be used to generate high-resolution structural maps for identifying fine-structural-variations between nucleic acid populations, and to produce genomic scaffolds to aid genome assembly and structural analysis.
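By way of illustration only, the comparison described above can be sketched in Python. The `TagHit` structure, the function name, the 5% spacing tolerance, and the default expected strand pair are assumptions introduced for this sketch and are not part of the disclosure; the alignment step itself is assumed to have been performed already.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TagHit:
    """Position of one aligned GVT on the reference (hypothetical structure)."""
    contig: str   # name of the reference sequence
    start: int    # leftmost aligned position
    strand: str   # '+' or '-'


def classify_gvt_pair(left: Optional[TagHit],
                      right: Optional[TagHit],
                      expected_spacing: int,
                      tolerance: float = 0.05,
                      expected_strands: Tuple[str, str] = ("+", "-")) -> str:
    """Compare one GVT-pair with its expected identity, orientation and spacing."""
    # Identity: each tag must be found in the reference, on the same sequence.
    if left is None or right is None:
        return "unmapped tag"
    if left.contig != right.contig:
        return "different reference sequences"
    # Orientation: the strand pair must match what the library design predicts.
    if (left.strand, right.strand) != expected_strands:
        return "orientation discordance"
    # Spacing: the distance on the reference must match the insert length.
    observed = abs(right.start - left.start)
    if observed > expected_spacing * (1 + tolerance):
        # Tags lie farther apart on the reference than in the target,
        # which is consistent with a deletion in the target.
        return "spacing larger than expected"
    if observed < expected_spacing * (1 - tolerance):
        # Tags lie closer on the reference, consistent with an insertion.
        return "spacing smaller than expected"
    return "concordant"
```

A pair whose tags map 50 kb apart on one reference sequence with the expected strands is reported as concordant; every other outcome flags the spanned region for closer inspection.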
1. Preparation and fragmentation of nucleic acids for GVT-pair production
As described herein, the present invention provides methods for producing high-resolution genomic maps that can be used to characterize unknown genomes, to aid the assembly of unknown genomes, or to identify fine-structural-variations between a target nucleic acid population and a reference sequence. Target nucleic acids suitable for analysis include, but are not limited to: genomic DNA of eukaryotic and prokaryotic organisms; microbial DNA; plastid DNA; plasmid and phagemid DNA; viral DNA and RNA; complementary DNA (cDNA) derived from ribonucleic acid (RNA); and DNA produced by in vitro amplification, for example by PCR among other methods. Methods for isolating DNA from the aforementioned sources, for synthesizing cDNA from RNA, and for amplifying nucleic acids are known to those skilled in the art.
For certain embodiments, the physical distance spanned by the GVT-pair along the length of the target DNA determines the resolution level of the analysis. The smaller the spacing between GVTs, the higher the spatial resolution for mapping and for detecting fine-structural-variations in the target nucleic acid population. Larger GVT spacing requires fewer GVT-pairs to physically cover a DNA sample of a given complexity, but with a concomitant decrease in the spatial resolution for detecting small genomic structural variants. Large GVT-pair spacing spans larger repetitive regions, which facilitates de novo genome assembly and the analysis of large structural variations in DNA. The ability to produce GVT-pairs with separation distances of 5 kb, 10 kb, 25 kb, 50 kb, 100 kb, or more allows the end user to choose a functional trade-off between GVT spacing, the resolution level required to detect different types of DNA structural variation, and the number of GVT-pairs needed to provide adequate physical coverage of a genome of a given complexity. The optimal number and proportion of GVT-pairs of different spacing can be modeled computationally for a particular application.
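The trade-off between spacing and the number of GVT-pairs can be approximated with a simple calculation. The sketch below assumes the usual definition of physical coverage (number of pairs multiplied by spacing, divided by genome size); the function name, the 3 Gb genome size, and the 10-fold coverage target are illustrative assumptions, not values taken from this disclosure.

```python
import math


def gvt_pairs_needed(genome_size_bp: int, spacing_bp: int,
                     physical_coverage: float) -> int:
    """Number of GVT-pairs giving the requested physical coverage.

    Assumes physical coverage = pairs * spacing / genome size.
    """
    return math.ceil(physical_coverage * genome_size_bp / spacing_bp)


# Illustrative comparison for a hypothetical 3 Gb genome at 10-fold coverage.
for spacing in (5_000, 10_000, 25_000, 50_000, 100_000):
    pairs = gvt_pairs_needed(3_000_000_000, spacing, 10)
    print(f"{spacing:>7} bp spacing: {pairs:>9} GVT-pairs")
```

Under these assumptions a tenfold increase in spacing reduces the number of pairs tenfold, at the cost of coarser resolution for small variants.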
As stated above, the physical length of the target DNA insert used to construct the GVT-pair governs the separation distance between the resident GVTs of the GVT-pair and thereby sets the resolution level of the analysis. Methods for creating and purifying populations of fragmented nucleic acid molecules of near-uniform size have been described in the art. Fragmentation of a target DNA population to the desired insert length can be accomplished enzymatically with a variety of restriction endonucleases under conditions of partial or complete digestion. Restriction endonucleases with recognition sites of six or more base pairs are useful for producing longer DNA fragments. One or more restriction endonucleases with differing sensitivity to DNA methylation can be used to assess the DNA methylation status of the target DNA population. Frequent-cutting type II restriction endonucleases such as MboI and HaeIII cut DNA once every 256 bp on average (assuming a random distribution and equal representation of the four bases in the target DNA); the use of these enzymes to produce DNA fragments of various sizes by partial digestion is well known in the art. The restriction endonuclease CviJI, which under relaxed conditions cuts DNA at GC dinucleotide positions (Fitzgerald et al., 1992), is particularly useful for producing a useful continuum of DNA fragment sizes under partial digestion conditions. In certain embodiments, randomly generated DNA fragments are useful. Methods for generating random DNA fragments include: (1) digestion with bovine pancreatic deoxyribonuclease I (DNase I), which makes random double-strand cleavages in DNA in the presence of manganese ions (Melgar and Goldwait, 1968; Heffron et al., 1978); (2) physical shearing (Shriefer et al., 1990); and (3) sonication (Deininger, 1983).
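The 256 bp figure follows from the length of the recognition site: under the stated assumption of random sequence with equal base representation, a site of n specified bases is expected once every 4^n bp. A minimal sketch, in which the function name and the optional base-frequency argument are introduced here for illustration:

```python
def expected_cut_spacing(site, base_freq=None):
    """Expected average distance (bp) between occurrences of a recognition site.

    Assumes independent bases; defaults to equal representation of A, C, G, T.
    """
    if base_freq is None:
        base_freq = {base: 0.25 for base in "ACGT"}
    probability = 1.0
    for base in site.upper():
        probability *= base_freq[base]
    return 1.0 / probability


print(expected_cut_spacing("GATC"))    # MboI, 4-base site
print(expected_cut_spacing("GGCC"))    # HaeIII, 4-base site
print(expected_cut_spacing("GAATTC"))  # a 6-base site, for comparison
print(expected_cut_spacing("GC"))      # GC dinucleotide (relaxed CviJI)
```

Real genomes depart from these assumptions (for example through skewed base composition), so the values are order-of-magnitude guides rather than predictions.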
Conditions for partial enzymatic digestion are determined empirically by varying one or more of the following parameters: reaction volume, enzyme concentration, enzyme-to-substrate ratio, incubation time, and temperature. For high-resolution analysis requiring GVT separation of 5 kb or less, fragmentation methods that are not sequence-dependent are preferred. Bovine pancreatic DNase I makes random double-strand cleavages in DNA in the presence of manganese ions (Melgar and Goldwait, 1968; Heffron et al., 1978) and can be used for this purpose. Likewise, DNA fragmentation by mechanical means, such as sonication or the selective application of shear forces, can also be used. The HYDROSHEAR instrument (Genomic Solutions Inc, Ann Arbor, MI) and the COVARIS instrument (Covaris Inc, Woburn, MA), which employs Adaptive Focused Acoustics, are particularly useful for generating random DNA fragments of a defined size range. Random DNA fragments can also be generated by the use of random primers during cDNA synthesis or during PCR, alone or in combination with the other fragmentation methods described. The progress of fragmentation toward products of the desired length is readily monitored by gel electrophoresis. Once a suitable DNA size distribution has been produced, T4 DNA polymerase is used to repair or blunt the ends of the target DNA in preparation for blunt-end ligation to a vector, DNA backbone, or GVT-adaptor for producing the GVT-pairs of the present invention. Where DNA is fragmented by partial or complete digestion with one or more endonucleases that leave cohesive ends, repair is unnecessary, but the GVT-adaptor, vector, or DNA backbone must be designed to accommodate the particular cohesive ends generated by the fragmentation enzyme. 
Because ligation of one target DNA insert to other target DNA inserts destroys the co-linearity of the insert with the sample and corrupts the construction of the genomic map, the 5' phosphate groups of the target DNA are removed with a phosphatase to prevent the creation of chimeric DNA inserts during ligation to the GVT-adaptor or DNA backbone.
2. Size fractionation and purification of size-selected DNA
For certain embodiments, the dephosphorylated DNA inserts are fractionated by gel electrophoresis or by high-performance liquid chromatography (HPLC) and purified to yield DNA inserts of the desired size. Polyacrylamide gels are best used to fractionate DNA of 50 bp to 1 kb. For fragments of about 250 bp to about 50 kb, 0.4% to 3% agarose gels are suitable. Pulsed-field gel electrophoresis is suitable for fractionating DNA of about 10 kb to several hundred kb. These procedures are described in the references cited herein (Rickwood and Hames (eds.), in: Gel electrophoresis of nucleic acids - A practical approach, Oxford University Press, New York, 1990; Hamelin and Yelle, 1990; Birren and Lai, in: Pulse field electrophoresis: A practical guide, Academic Press, San Diego, 1993). DNA is sized using suitable size markers electrophoresed in parallel with the sample and is visualized by staining. Gel slices containing DNA of the desired size are excised with a scalpel, after which the DNA is recovered from the gel matrix by electro-elution or by enzymatic or chemical degradation of the gel matrix. The recovered DNA fragments for analysis should be nearly homogeneous in size. Gel systems and electrophoretic conditions for maximizing separation resolution are known in the art. Two or more rounds of gel electrophoresis can be used to obtain greater size homogeneity of the sample. Samples whose size varies by more than 2.5%-5% from the mean length may contribute unacceptable noise when used in the present invention.
3. GVT-adaptor design and ligation of target DNA to the vector or DNA backbone
In certain embodiments, the target DNA insert is first ligated to adaptors to facilitate its ligation to a suitable vector or DNA backbone. In other embodiments, the target DNA insert is ligated directly to the vector or DNA backbone without a ligation intermediate. In still other embodiments, an adaptor is first ligated to each end of the target DNA, and the free ends of the newly attached adaptors are then joined by recircularization to form a functional DNA backbone for subsequent GVT-pair production. Adaptors may incorporate moieties such as biotin groups to aid affinity purification of the desired DNA products. Adaptors may also incorporate restriction endonuclease recognition sites for excising the created GVT-pair from the DNA backbone, or recognition sites for type IIS, type IIG, or type III endonucleases for creating GVTs by cleavage of the attached target DNA insert. For GVT production in which the target DNA insert is ligated directly to a vector or DNA backbone, suitable recognition sites for the aforementioned type IIS, type IIG, or type III restriction endonucleases can be incorporated into the design of the vector or DNA backbone. Another aspect of the present invention uses one or more type II restriction endonucleases to digest the attached target DNA insert, creating a GVT attached to each end of the vector or DNA backbone, wherein the vector or DNA backbone is designed to be free of these digestion sites and therefore remains undigested.
Those skilled in the art will recognize that a variety of GVT-adaptor designs are suitable for practicing the present invention. In general, a suitable GVT-adaptor has the following physical properties: (1) a short top strand and a short bottom strand of 5'-phosphorylated oligonucleotides capable of stable complementary base-pairing to yield a duplex structure; (2) one end of the GVT-adaptor has a cohesive extension (preferably non-palindromic) for ligation to a vector, a DNA backbone, or another adaptor with a complementary sequence; (3) the other adaptor end has a blunt-end structure or another suitable end structure to enable efficient ligation to target DNA fragments (preferably dephosphorylated target DNA); (4) for certain embodiments, the adaptor end flanking the target DNA insert may bear a suitable type IIS, type IIG, or type III restriction endonuclease recognition site, oriented so that the site directs cleavage within the target DNA at a fixed and useful distance from the target DNA end to produce a GVT (for reviews of type IIS, type IIG, and type III restriction endonucleases, see Sistla and Rao (2004), Bujnicki (2001), and Szybalski et al. (1991)); and (5) the adaptor may bear a second restriction endonuclease site for excising the created GVT-pair from the vector.
Methods for ligating adaptors to DNA inserts and for the general ligation of nucleic acid molecules are known to those skilled in the art. See, for example, Ausubel et al. (eds.) (in: Short Protocols in Molecular Biology, 3rd edition, John Wiley and Sons, New York, 1995). Typical ligation conditions for efficient blunt-end ligation of adaptors to DNA inserts require an approximately 50-fold to several-hundred-fold molar excess of adaptor over target DNA, a high T4 DNA ligase concentration, or the inclusion of a volume exclusion agent such as polyethylene glycol (Hayashi et al., 1986; Pheiffer and Zimmerman, 1983; Zimmerman and Pheiffer, 1983). Efficient ligation of adaptors to cohesive-end target DNA requires an approximately 5-fold molar excess. The GVT-adaptor-ligated DNA inserts are passed through a CHROMA SPIN column (Clontech, Mountain View, CA) to remove excess adaptors and are then purified and size-selected by gel electrophoresis. To produce GVT-pairs by intramolecular ligation, the purified adaptor-ligated target DNA inserts are ligated into one of the several plasmid vectors or DNA backbones described below.
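The molar excesses quoted above can be converted into adaptor quantities once the mass and length of the target DNA are known. The following sketch assumes the commonly used approximation of about 650 g/mol per base pair of double-stranded DNA and takes the excess relative to target DNA molecules; both function names and the example quantities are hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (approximation, assumed here)


def dsdna_pmol(mass_ug, length_bp):
    """Picomoles of double-stranded DNA molecules in a given mass."""
    return mass_ug * 1e6 / (length_bp * AVG_BP_MASS)


def adaptor_pmol_needed(insert_ug, insert_bp, fold_excess):
    """Picomoles of adaptor for the requested molar excess over insert molecules."""
    return fold_excess * dsdna_pmol(insert_ug, insert_bp)


# Example: 1 ug of 10 kb blunt-ended insert at a 100-fold excess,
# and the same insert with cohesive ends at a 5-fold excess.
print(round(adaptor_pmol_needed(1.0, 10_000, 100), 2))
print(round(adaptor_pmol_needed(1.0, 10_000, 5), 2))
```

Because each insert has two ends, a protocol that defines the excess relative to DNA ends rather than molecules would double these amounts.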
According to one aspect of the present invention, any restriction endonuclease, preferably a frequent-cutting type II restriction endonuclease, that preferentially cleaves the target DNA insert but not the vector, the DNA backbone, or any adaptor attached to the target DNA is suitable for producing GVTs and GVT-pairs. The REBASE restriction enzyme database provides information on type II restriction endonucleases, isoschizomers, neoschizomers, recognition sequences, commercial availability, and references (rebase.neb.com). Preferred type II restriction endonucleases are those that cut the target DNA insert frequently, for example enzymes that recognize 4-base-pair sites and thereby produce GVTs averaging 100-300 bp in length. The type II restriction endonucleases FspBI and Csp6I, alone or in combination, are particularly suitable for producing GVTs in the present invention because both enzymes cut frequently and produce identical complementary cohesive ends, allowing the GVT-pairs of the present invention to be produced directly by intramolecular ligation without end modification. Other restriction endonucleases that cleave only the target DNA insert and not the vector, the DNA backbone, or the adaptors attached to the target DNA insert are considered to be within the scope and spirit of the present invention for producing GVTs and GVT-pairs.
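The compatibility of the FspBI and Csp6I ends can be checked with a short calculation. The sketch assumes the recognition sequences and top-strand cut positions C^TAG for FspBI and G^TAC for Csp6I, which should be confirmed against REBASE; the function name and the handling of only palindromic sites with 5' overhangs are simplifications made for this example.

```python
def five_prime_overhang(site, cut_index):
    """5' overhang left by a palindromic site cut cut_index bases into the top strand.

    Returns an empty string for a blunt cut; 3' overhangs are not handled.
    """
    n = len(site)
    if cut_index > n - cut_index:
        raise ValueError("3' overhang: not handled in this sketch")
    # The bottom strand is cut at the mirror position, so the single-stranded
    # extension is the stretch of the site between the two cut positions.
    return site[cut_index:n - cut_index]


ENZYMES = {"FspBI": ("CTAG", 1), "Csp6I": ("GTAC", 1)}  # assumed cut positions

overhangs = {name: five_prime_overhang(*spec) for name, spec in ENZYMES.items()}
print(overhangs)
print("compatible ends:", len(set(overhangs.values())) == 1)
```

Both enzymes leave the same two-base 5' extension under these assumptions, which is why ends produced by either enzyme can be joined to ends produced by the other without end repair.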
4. Vectors and DNA backbones for GVT-pair production
In certain embodiments where large GVT spacing is required, it may be necessary to propagate the target DNA in host cells before GVT production. Rearrangement or loss of target DNA segments containing AT-rich or GC-rich sequences, repeats, hairpins, strong promoters, toxic genes, and other problematic sequences is a concern during propagation in host cells. DNA rearrangements and other cloning artifacts can be mistaken for structural variations in the target nucleic acid. In addition, cloning bias can limit insert size and can cause important regions of the genome under study to be under-represented. This problem has been addressed by the recent development of conditional-amplification fosmid and BAC vectors (Szybalski, U.S. Pat. No. 5,874,259), in which DNA propagation is kept at 1-2 copies per host cell until induced to higher levels for analysis. Improved stability of genomic inserts of 15 kb to more than 100 kb has been reported, and conditional-amplification vectors are now in routine use for genomic studies. Conditional-amplification fosmid/BAC vectors such as pCC1FOS (Epicentre, Madison, WI) and pSMART-VC (Lucigen, Middleton, WI), and their variants, are suitable for producing GVT-pairs with GVT spacing of 10 kb to 200 kb. However, conventional low-copy plasmid vectors appear to be sufficient for the stable maintenance of large DNA fragments without the need for BAC, PAC, or fosmid-type vectors (Feng et al., 2002; Tao and Zhang, 1998). The pSMART series of vectors offers low-copy-number propagation and has the additional feature of transcription terminators on the vector to reduce the potential effects of transcriptional interference, which may further improve DNA stability (Mead and Godiska, U.S. Pat. No. 6,709,861). 
For producing GVT-pairs with GVT spacing of 10 kb or greater, a variety of established and widely used low-copy plasmid-based vectors are suitable, including pBR322 (Bolivar et al., 1977), pACYC177 (Chang and Cohen, 1978), and other vectors described in this disclosure.
To practice the present invention, the vector or DNA backbone ligated to the target DNA must be free of cleavage sites for the restriction endonuclease used to produce GVTs from the target DNA insert. Cleavage of the vector or DNA backbone would destroy the spatial linkage between the GVTs and thereby prevent formation of the GVT-pair by intramolecular ligation. Vector backbones free of unwanted restriction sites can be prepared by site-directed mutagenesis using standard methods. See, for example, McPherson (ed.) (in: Directed Mutagenesis - A Practical Approach, Oxford University Press, New York, 1991) and Lok (U.S. Pat. No. 6,730,500). Typically, a substantial portion of the vector DNA or DNA backbone can be altered by single-base-pair changes to eliminate unwanted restriction endonuclease recognition sites without affecting functionality. Within protein-coding sequences, single-nucleotide changes are targeted to codon wobble positions to preserve the native protein coding. Changes made elsewhere on the vector or DNA backbone should be functionally validated before use. Many restriction endonucleases are sensitive to methylation of their recognition sites; in particular, methylation at the 5-carbon position of deoxycytosine can render these sites on the vector or DNA backbone resistant to digestion. DNA methylation can be achieved by direct incorporation of 5-methyl-dCTP via PCR, by passaging the DNA through suitable host cells having a different restriction-modification system, or by treatment with specific methylases, so that restriction sites on the vector or DNA backbone are no longer cleaved enzymatically. The REBASE restriction enzyme database provides information on the methylation sensitivity of restriction endonucleases (rebase.neb.com).
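The search for single-nucleotide changes that remove a restriction site while preserving the encoded protein can be sketched as follows. The sketch assumes the standard genetic code, an in-frame coding sequence whose length is a multiple of three, and a palindromic site, so that only one strand needs to be scanned; the function names and the 12-base example sequence are hypothetical and are not taken from the pSLGVT vectors.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame coding sequence (length a multiple of 3)."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def silent_site_removals(cds, site):
    """List (position, old_base, new_base) changes that remove every copy
    of `site` from the top strand without altering the protein."""
    protein = translate(cds)
    changes = []
    for pos, old in enumerate(cds):
        for new in "ACGT":
            if new == old:
                continue
            mutant = cds[:pos] + new + cds[pos + 1:]
            if site not in mutant and translate(mutant) == protein:
                changes.append((pos, old, new))
    return changes


# Hypothetical coding fragment Met-Lys-Leu-Gly containing a HindIII site (AAGCTT).
print(silent_site_removals("ATGAAGCTTGGC", "AAGCTT"))
```

In this example every acceptable change falls on the third (wobble) base of the Lys or Leu codon, which matches the strategy described above. A fuller implementation would also weigh host codon usage, as was done for the Kan module.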
DNA backbones for forming GVTs and GVT-pairs by intramolecular ligation can also be produced to any desired specification by direct chemical synthesis. Subsequent bulk preparation of the DNA backbone can be carried out by chemical synthesis or, in part or in whole, from a template by PCR. The DNA backbone may include a replication origin and a selectable marker for propagation in a microbial host. Alternatively, the DNA backbone may comprise only minimal sequences, consisting essentially of a spatially linked adaptor pair. Each adaptor is first ligated to an end of the target DNA insert, and the free adaptor ends are then joined to reconstitute the DNA backbone, thereby forming a circular molecule for GVT production. In certain other embodiments, the adaptors may incorporate recognition sites for type IIS, type IIG, or type III restriction endonucleases, oriented to direct cleavage of the target DNA at a defined distance from the target DNA end to produce a GVT. Biotin and other moieties may also be incorporated into the DNA backbone to enable affinity purification of DNA intermediates at different steps of in vitro GVT-pair production. One particularly useful design comprises a synthetic DNA backbone that lacks all or most of the 16 possible 4-base-pair palindromes. Such a DNA backbone allows GVTs to be produced by digesting the attached target DNA insert with almost any 4-base-recognition restriction endonuclease, alone or in combination, without cleaving the DNA backbone or adaptors. Another particularly useful DNA backbone design incorporates sequences compatible with the DNA amplification and sequencing primers of next-generation DNA sequencing platforms, for massively parallel, high-throughput sequencing of GVT-pair DNA. 
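A candidate backbone sequence can be screened for the 16 possible 4-base-pair palindromes with a few lines of Python. This is a sketch only; the function names and the test sequences are invented for illustration, and a real design would also need to consider sites recreated at ligation junctions.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


# A 4-bp site is palindromic when it equals its own reverse complement;
# the first two bases fix the last two, giving 4 * 4 = 16 such sites.
FOUR_BP_PALINDROMES = sorted(
    "".join(p) for p in product("ACGT", repeat=4)
    if "".join(p) == reverse_complement("".join(p))
)


def palindromes_present(backbone):
    """Return the 4-bp palindromes found in a backbone sequence.

    Scanning one strand is sufficient because a palindromic site
    reads identically on the opposite strand.
    """
    seq = backbone.upper()
    return [p for p in FOUR_BP_PALINDROMES if p in seq]


print(len(FOUR_BP_PALINDROMES), FOUR_BP_PALINDROMES)
print(palindromes_present("GGGGCCCCAAAA"))  # hypothetical sequence
```

An empty result indicates that no enzyme recognizing a 4-bp palindrome would cut the backbone, which is the property the design described above relies on.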
The DNA backbone is preferably long enough to provide primer-binding sites for amplifying the GVT-pairs formed, to enable affinity purification, to permit efficient ligation to the target DNA, or to provide a unique identifier that serves as a reference point.
5. GVT-pair production vectors pSLGVT-1, pSLGVT-2, pSLGVT-28, pSLGVT-35, pSLGVT-36, pSLGVT-37, and pSLGVT-38
The pSLGVT series of vectors comprises two chemically synthesized DNA modules that provide the basic maintenance functions of drug selection and plasmid replication, respectively. The vector modules bear unique type IIS restriction endonuclease sites at their termini that produce unique asymmetric cohesive ends, allowing subsequent rapid reconfiguration of the vector components to add or replace modules or DNA cassettes for new functionality.
The first vector module contains a modified P15A origin of replication. Plasmids bearing the P15A replicon propagate at a low number of about 15 copies per host cell (Sambrook et al., in: Molecular Cloning - A Laboratory Manual, 2nd edition, CSH Laboratory Press, Cold Spring Harbor, New York, 1989), thereby optimizing the stability of cloned genomic inserts. The MmeI sites in the P15A replicon were removed by making each possible single-nucleotide change that eliminates the two sites and then screening each mutant for replication competence, yielding the functional "P15A-m replicon module" used to construct pSLGVT-1. The EcoP15I site in the P15A replicon was removed by a simple single-sequence change to yield the "P15A-e module" used to construct plasmid pSLGVT-2.
The second vector module comprises a modified Kan gene from transposon Tn903, which confers resistance to the antibiotic kanamycin (Grindley et al., 1980). Using wobble positions, and conforming as far as possible to optimal codon usage in Escherichia coli, four MmeI sites in the Kan gene coding region were removed, together with two NciI and NsiI sites and single sites for Esp3I, PstII, and HindIII, to yield the "Kan module".
Cosmid vector pSLGVT-28 offers a unique benefit for preparing GVT-pairs with a spatial separation of 45-50 kb for next-generation DNA sequencing platforms. GVT-pairs of this spacing are particularly useful for providing efficient physical coverage of genomic DNA to identify fine-structural-variations, and for spanning large repetitive DNA regions to produce genomic scaffolds that facilitate de novo sequencing of complex genomes. pSLGVT-28 was derived from pSLGVT-2 by the following steps: (1) incorporation of the COS site from bacteriophage lambda for in vitro phage packaging, which enables efficient and accurate biological size selection of target DNA inserts to produce complex libraries of GVT-pairs with a precise separation of about 45-50 kb; (2) removal by site-directed mutagenesis of all FspBI and Csp6I restriction endonuclease sites on the vector, thereby allowing GVTs, and subsequently GVT-pairs, to be produced by digesting the attached target DNA insert with those enzymes alone or in combination; (3) creation of a cloning site for target DNA located between the "Adaptor-A" and "Adaptor-B" sequences of Illumina Corporation, to allow solid-phase DNA amplification and sequencing of the resulting GVT-pairs on the SOLEXA "paired-end-read" sequencing platform.
The efficient formation of GVT-pairs with 45-50 kb separation, together with massively parallel DNA sequencing on the SOLEXA "paired-end-read" platform, offers a great advance in cost and efficiency over the low-throughput fosmid paired-end mapping method of Tuzun et al. (2005) for identifying genomic variations and for preparing long-range scaffolds to aid DNA assembly.
Cosmid vector pSLGVT-35 is a derivative of pSLGVT-28 in which a pair of inverted BciVI restriction endonuclease sites is situated between the SOLEXA "Adaptor-A" and "Adaptor-B" sequences of Illumina Corporation. BciVI is a type IIS restriction endonuclease that produces a one-base 3' extension located 6 base pairs from the enzyme recognition site. BciVI digestion is used to produce single 3' thymidine overhangs flanking Adaptor-A and Adaptor-B on the vector, to receive adenine-tailed target DNA inserts prepared according to the SOLEXA DNA preparation kit.
Cosmid vector pSLGVT-36 is a derivative of pSLGVT-28 in which the SOLEXA Adaptor-A and Adaptor-B sequences are replaced by Adaptor-A and Adaptor-B of the 454 platform (GS FLX TITANIUM) from Roche Diagnostics, for sequencing GVT-pairs directly on that platform.
Cosmid vector pSLGVT-37 is another derivative of pSLGVT-28, in which the SOLEXA Adaptor-A and Adaptor-B are replaced by the Internal Adaptor of the SOLiD "Mate-Pair Library" system from Applied Biosystems, for sequencing GVT-pairs directly on the SOLiD platform.
Cosmid vector pSLGVT-38 is another derivative of pSLGVT-28, in which the SOLEXA Adaptor-A and Adaptor-B are replaced by the 454 Internal Adaptor of Roche Diagnostics, to produce GVT-pairs in the "outside-in" configuration suitable for sequencing on the 454 platform.
6. GVT-pair production
In certain embodiments, the target DNA population for GVT-pair production is randomly fragmented by mechanical or enzymatic means to produce fragments of the desired size. In other embodiments, the target DNA population is digested to completion with one or more restriction endonucleases, in separate reactions or in combination, to cleave the target DNA at specified positions. In another embodiment, the target DNA is digested to completion with one or more restriction endonucleases and then fractionated to the desired size. For target DNA digested with enzymes that produce cohesive ends, the dephosphorylated target DNA can be cloned directly into a suitably modified vector or DNA backbone. Fragmented target DNA with "ragged" ends is repaired with T4 DNA polymerase or mung bean nuclease and then dephosphorylated to prevent chimeric target DNA inserts. Likewise, target DNA with cohesive ends is also dephosphorylated to prevent chimeric inserts. When adaptors are used to ligate the target DNA to the vector or DNA backbone, a CHROMA SPIN column (Clontech, Mountain View, CA) is used to remove unligated adaptors before the adaptor-ligated target DNA is ligated to the GVT production vector. In certain embodiments, the target DNA is size-selected to the desired length by gel electrophoresis or other means before GVT production.
As used herein, cosmids, fosmids, phagemids, BACs, and other episomal elements are collectively referred to as plasmids or DNA backbones. For DNA segments within a certain range of fragment lengths, ligation conditions have been described for optimizing intermolecular ligation of the vector or DNA backbone to the insert, followed by intramolecular ligation to produce circular molecules (Collins and Weissman, 1984; Dugaiczyk et al., 1975; Wang and Davidson, 1966). General methods for ligating nucleic acid molecules, transfecting them into host cells, and constructing plasmid-based libraries are known to those skilled in the art. See, for example, Sambrook et al. (in: Molecular Cloning: A Laboratory Manual, 2nd edition, CSH Press, New York, 1989); Ausubel et al. (eds.) (in: Short Protocols in Molecular Biology, 3rd edition, John Wiley and Sons, New York, 1995); Birren et al. (in: Bacterial artificial chromosomes in genome analysis - A laboratory manual, CSH Press, New York, 1999). The ligated target DNA is introduced into host cells by electroporation or transfection. Alternatively, target DNA inserts of 45-50 kb are ligated to a suitable cosmid vector such as pSLGVT-28, pSLGVT-35, pSLGVT-36, pSLGVT-37, pSLGVT-38, or derivatives thereof, and transduced into host cells following in vitro phage packaging with a suitable commercially available packaging extract (Stratagene, La Jolla, CA). Propagation of methylated target DNA, such as genomic DNA or cDNA synthesized by certain protocols that use methylated nucleotide analogues, requires host cell strains with inactivated mcr and mrr alleles. Suitable host strains include 10G (Lucigen, Middleton, WI), XL1-Blue MR, and XL2-Blue MRF' (Stratagene, La Jolla, CA). 
Under suitable drug selection, the electroporated, transfected, or transduced cells are plated onto 10 cm diameter agar plates at a density of approximately 20,000-50,000 colonies per plate to produce the primary library. An alternative approach is to grow the transduced or transfected cells in liquid culture, taking care not to overgrow the cells, which would encourage undesirable clonal selection. The total number of clones in culture should reflect the number of GVT-pairs required by the study design. Cells are harvested and plasmids are isolated for the subsequent steps described below.
In one aspect of the present invention, pSLGVT-28, pSLGVT-35, pSLGVT-36, pSLGVT-37, pSLGVT-38, or any other functionally equivalent vector or DNA backbone bearing target DNA inserts is digested to completion with FspBI or Csp6I (Fermentas Inc, Hanover, MD) to produce GVTs. The digestion cleaves the insert DNA to produce GVTs but does not cleave the attached vector or DNA backbone. GVTs produced in this way vary in size, depending on the average frequency of cleavage sites within the target DNA and on the distance of the first cleavage site from the target DNA end. GVTs produced by FspBI or Csp6I digestion of randomly fragmented human DNA inserts are expected to average 100-200 bp in length. The linearized vector or DNA backbone with the newly created GVTs attached is purified away from the digested insert DNA fragments by gel electrophoresis or affinity chromatography. The purified linear product is circularized to obtain the primary GVT-pair library. GVT-pairs can be recovered from the circularized template by DNA amplification for direct DNA sequencing. Alternatively, the circularized vector bearing the GVT-pair is introduced into host cells, which are then plated under selection at a density of approximately 20,000-50,000 clones per 10 cm plate, or grown in liquid medium, to yield the primary plasmid GVT-pair library. Purified plasmids from the primary plasmid GVT-pair library are digested with an enzyme that cleaves on both sides of the GVT-pair, to excise the GVT-pairs from the vector for direct DNA sequencing.
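The dependence of GVT length on the position of the outermost cleavage sites can be illustrated with an in silico digestion. The sketch assumes the cut positions C^TAG (FspBI) and G^TAC (Csp6I), considers only the top strand, and ignores the vector; the function name and the 20-base example insert are hypothetical.

```python
import re

# Assumed recognition sites; both enzymes cut after the first base of the site.
SITES = {"FspBI": "CTAG", "Csp6I": "GTAC"}


def terminal_gvts(insert, enzymes=("FspBI", "Csp6I")):
    """Return the (left, right) terminal tags left after complete digestion.

    The left GVT runs from the insert start to the first cut, the right GVT
    from the last cut to the insert end. Returns None if the insert is uncut.
    """
    seq = insert.upper()
    cuts = sorted(
        match.start() + 1
        for name in enzymes
        # A lookahead is used so that overlapping sites are all reported.
        for match in re.finditer(f"(?={SITES[name]})", seq)
    )
    if not cuts:
        return None
    return seq[:cuts[0]], seq[cuts[-1]:]


example_insert = "AAAA" + "CTAG" + "GGGG" + "GTAC" + "TTTT"
print(terminal_gvts(example_insert))
```

Each tag ends at the site nearest its end of the insert, so tag length is governed by the local density of sites rather than by the insert length, consistent with the variable GVT sizes noted above.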
7. In vitro GVT-pair production
The production of GVTs and GVT-pairs in vitro, without a propagation step through host cells, is considered to be within the scope and principle of the present invention. Typically, a DNA backbone suitable for producing GVTs without propagation through host cells should be at least 50-100 bp or longer, so that it has sufficient segmental flexibility to form, by intramolecular ligation, the circular molecules that yield GVT-pairs. DNA backbones for in vitro GVT-pair production need not include a replication origin or a drug-selectable marker. Such DNA backbones should have suitable PCR primer-binding sites flanking the GVT-pair for amplifying the GVT-pairs produced. The DNA backbone may be derived, in part or in whole, from restriction endonuclease digestion of an engineered plasmid. Suitable DNA backbones may also be prepared, in part or in whole, by PCR or by direct chemical oligonucleotide synthesis. Where the DNA backbone is derived from PCR or chemical synthesis, modified nucleotides can be incorporated into the DNA backbone for additional functionality. For example, a biotin moiety can be incorporated into the DNA backbone to enable affinity purification of DNA intermediates at different steps of in vitro GVT-pair production. One particularly useful design comprises a DNA backbone that is substantially free of the 16 possible 4-base-pair palindromes, or from which they have been eliminated, thereby allowing GVTs to be produced by digesting the attached target DNA insert with nearly any 4-base-recognition restriction endonuclease. The DNA backbone may also include primer-binding sites and other sequences for clonal amplification of DNA templates, for DNA sequencing on next-generation sequence analyzers.
Although in vitro GVT-pair preparation offers the possibility of producing more complex GVT-pair libraries and avoids the inconvenience of a propagation step through microbial host cells, the propagation step in a microbial host is advantageous in some applications where the presence of artifacts must be minimized. The main source of artifacts is the production of unwanted molecules in which two different target DNA molecules are ligated to each end of the vector or DNA backbone. A second source of artifacts arises during the intramolecular ligation step that produces GVT pairs, when GVTs on two different vectors or DNA backbones are joined by intermolecular ligation. In particular, upon PCR amplification, GVTs from two different target DNAs become joined and form artifactual GVT pairs. For DNA segments within a certain range of fragment lengths, general ligation conditions for optimizing intermolecular and intramolecular ligation have been described (Collins and Weissman, 1984; Dugaiczyk et al., 1975; Wang and Davidson, 1966), and these can be used to obtain optimal conditions for producing circular molecules for in vitro GVT preparation. In practice, however, the probability of unwanted ligation events cannot be eliminated entirely. Most artifactual GVT pairs can nevertheless be removed by a passage step through bacteria. Linear DNA and large concatenated DNA vectors cannot be efficiently transformed into and propagated in microbial cells, which makes this the method of choice for applications such as de novo genome assembly, where the sequence colinearity of GVT pairs is critical.
8. Sequencing GVT pairs using next-generation massively parallel DNA sequencers
Three new commercial systems are currently available for ultra-high-throughput, massively parallel DNA sequencing: the GENOME SEQUENCER FLX system, commonly called the 454 sequencer (Roche Diagnostics, Indianapolis, IN); SOLEXA (Illumina, San Diego, CA); and the SOLiD system (Applied BioSystems, Foster City, CA). The throughput of these new instruments can exceed billions of base calls per run, more than 15,000 times that of current-generation 96-lane capillary electrophoresis sequencing instruments. Within the scope and principle of the present invention, it is contemplated that these new sequencing platforms may be used to characterize GVT pairs. The GVT pairs of the present invention can be sequenced on the new instruments without excessive modification of the operating protocols.
The 454 technology is based on pyrosequencing chemistry carried out on DNA templates clonally amplified on microbeads, which are loaded individually into the etched wells of a high-density optical flow cell (Margulies et al., 2005). The signal produced by each base extension is captured by a dedicated optical fiber. A typical 454 instrument run comprises 500,000 individual reads of 500 bases, a length sufficient to characterize the GVT pairs of the present invention.
The SOLiD platform of Applied Biosystems for massively parallel DNA sequencing is based on sequential cycles of DNA ligation. In this method, immobilized DNA templates are clonally amplified on beads, which are plated at high density onto the surface of a glass flow cell in which the sequencing reactions take place. Sequence determination is achieved by successive cycles of ligation of short, defined, labeled probes to a series of primers hybridized to the immobilized template. A SOLiD instrument run comprises more than 100 million individual reads of 50 bases.
On the SOLEXA platform, sequencing templates are immobilized on a proprietary flow cell surface, where they are clonally amplified in situ to form discrete clusters of sequencing template at densities of up to 10 million or more template clusters per square centimeter. SOLEXA sequencing is carried out by primer-mediated DNA synthesis in a stepwise manner in the presence of four proprietary modified nucleotides, each having a reversible 3' dideoxynucleotide moiety and a cleavable chromofluor. Before each extension cycle, the 3' dideoxynucleotide moiety and the chromofluor are chemically removed. The cycles of stepwise nucleotide addition at each template cluster are detected by laser excitation followed by image capture, and base calling is performed from the captured images. A current instrument run comprises up to 100 million paired-end reads of 76 bases, which is ideally suited to sequencing GVT pairs produced by cutting target DNA with the frequently cutting type II restriction endonucleases FspBI or Csp6I.
Preparation of GVT pairs with 45-50 kb spatial spacing for the SOLEXA platform
Of the three main platforms, SOLEXA is the only one in which both template strands are present on the flow cell and in which a DNA template can be sequenced directly from both ends. The present invention is therefore directly suited to the unique "paired-end-read" capability of the SOLEXA platform. When used with the cosmid vector pSLGVT-35 or a derivative thereof, the present invention provides the ability to produce GVT pairs with 45-50 kb spatial spacing from a target DNA population. Compared with the size fractionation achievable by agarose gel separation alone, use of the head-full packaging mechanism of the bacterial virus greatly improves the precision of target DNA size fractionation. The precise 45-50 kb spacing provides economical physical coverage of the genome, both for identifying fine-scale variation and for spanning repeat regions of the target DNA, which facilitates the production of genome scaffolds for de novo genome sequencing. Compared with the fosmid paired-end mapping method of Tuzun et al. (2005), the present invention provides a substantial advance in the economy and depth of physical coverage.
The SOLEXA adaptors provide three sets of overlapping primer binding sites. The first set directs PCR amplification to produce progeny sequencing templates flanked by the Adaptor-A and Adaptor-B sequences. The second set mediates solid-phase isothermal amplification of the resulting progeny templates, producing template clusters immobilized on the surface of the sequencing flow cell. The last set provides binding sites for the sequencing primers for each of the two DNA strands. The present invention uses the paired-end-read capability of the SOLEXA platform to sequence the GVT pairs produced. As illustrated by pSLGVT-35 and its derivatives, the SOLEXA adaptors are engineered into the DNA vector backbone, one on each side of the target DNA cloning site. In this way, newly produced GVT pairs can be sequenced directly on the SOLEXA platform. The 152 bases of GVT sequence are derived from two separate single reads of 76 bases, one from each end of the DNA template. The effective read length for GVT pairs produced by FspBI and Csp6I is expected to follow the SOLEXA read length as it improves beyond the current 76-base reads. Single paired-end reads of more than 100 bases are expected to be supported by the end of 2009.
pSLGVT-35 is a 2.6 kb vector comprising a kanamycin selection marker, a low-copy-number P15A origin of replication for stable propagation of genomic DNA, and a COS site for bacteriophage lambda packaging. The cleavage sites for the restriction endonucleases FspBI and Csp6I on the vector have been eliminated by site-directed mutagenesis, so that these enzymes can be used according to the method of the invention to prepare GVTs, and subsequently GVT pairs, from the target DNA insert. The target DNA cloning site is flanked by a pair of inverted BciVI restriction endonuclease sites, located directly between the SOLEXA "Adaptor-A" and "Adaptor-B" sequences of Illumina Corporation on the vector. BciVI is a type IIS restriction endonuclease that produces a one-base 3' extension located 6 base pairs from the enzyme recognition site. Digestion of the vector at the inverted BciVI sites produces single 3' thymidine overhangs flanked by Adaptor-A and Adaptor-B, which receive the adenine-tailed target DNA inserts prepared according to the SOLEXA DNA template preparation kit.
Target DNA is sheared to a fragment size of 40-55 kb, the ends are repaired with T4 DNA polymerase, and a single adenine is added as a tail using exo-minus Klenow polymerase in the presence of dATP. DNA fragments of 45-50 kb are purified from an agarose gel and ligated to the thymidine-tailed pSLGVT-35 vector. Ligation of the cosmid vector to the target DNA is carried out at an equal molar ratio of linearized vector to target DNA insert and at high DNA concentration (typically more than 2-3 µg total DNA per µl), which drives the formation of long concatemers of alternating vector and target DNA fragments. The ligated product is packaged into phage particles using a commercially available packaging extract (Stratagene, La Jolla, CA). Propagation of methylated target DNA, such as genomic DNA, requires a host cell strain with inactivated mcr and mrr alleles. Suitable host strains include 10G (Lucigen, Middleton, WI) and XL1-Blue MR and XL2-Blue MRF' (Stratagene, La Jolla, CA). Under kanamycin selection, the infected cells are plated at a density of approximately 20,000-50,000 colonies per plate on 10 cm diameter agar plates to produce the primary cosmid library, which comprises target DNA inserts averaging 45-50 kb flanked on one side by SOLEXA Adaptor-A and on the other by SOLEXA Adaptor-B. An alternative approach is to grow the infected cells in liquid medium, taking care not to overgrow the cells, which would promote unwanted clonal selection. The total number of clones in culture should reflect the number of GVT pairs required by the study design. The cells are harvested and the cosmid DNA is isolated for GVT preparation. The purified cosmid DNA carrying target DNA inserts is digested to completion with FspBI or Csp6I.
The digestion product is passed through a CHROMA SPIN 1000 column (Clontech, Mountain View, CA) to remove the bulk of the digested target DNA insert. The flow-through material is electrophoresed on an agarose gel. A DNA fragment of approximately 2.6-3 kb is recovered from the gel; it corresponds to the intact linear cosmid vector with two attached GVTs, which correspond to the ends of the target DNA insert. The recovered material is diluted to below 25 ng/µl for intramolecular ligation to produce GVT pairs. The junction between the newly juxtaposed GVTs is demarcated by the regenerated restriction endonuclease site of the enzyme used to produce the GVTs, and it sets the boundary between the GVTs of a GVT pair for subsequent data analysis. The resulting GVT pairs are recovered from the vector backbone by DNA amplification using SOLEXA Adaptor-A and Adaptor-B primers. The recovered GVT pairs, flanked by the SOLEXA adaptors, are amplified on the flow cell surface for paired-end sequencing on the SOLEXA platform.
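Because complete digestion leaves no internal site within either GVT, the regenerated site at the junction should occur exactly once in a GVT-pair sequence, which makes it usable as a delimiter during data analysis. The Python sketch below illustrates this idea. It assumes the recognition sequence CTAG for FspBI (GTAC would be used for Csp6I; neither is given in the text), treats the whole site as the boundary, and uses a hypothetical function name.

```python
def split_gvt_pair(read, site="CTAG"):
    """Split a GVT-pair sequence into its two GVTs at the regenerated
    restriction site.

    Assumption: `site` is the recognition sequence of the enzyme used to
    produce the GVTs (CTAG assumed here for FspBI). Returns None when the
    site is absent or occurs more than once, since such a read is not a
    well-formed GVT pair under this simple model.
    """
    seq = read.upper()
    if seq.count(site) != 1:
        return None
    left, right = seq.split(site)
    return left, right


if __name__ == "__main__":
    # Toy read: GVT "AAAC", regenerated junction "CTAG", GVT "GTTT".
    print(split_gvt_pair("AAACCTAGGTTT"))
```

Reads rejected by this check (no site, or more than one) could be set aside for inspection as possible incomplete digests or ligation artifacts.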
Within the scope and principle of the present invention, it is contemplated that GVTs and GVT pairs with other spatial spacings may be prepared with or without in vitro viral packaging, and with or without a step of propagation through host cells. In the latter case, target DNA inserts carrying a SOLEXA adaptor at each end are cloned into a suitable DNA backbone having a COS site and then packaged into phage heads, as described, using a commercially available packaging extract (Stratagene, La Jolla, CA). The DNA backbone may be labeled with a purification moiety such as biotin to assist affinity purification of the desired DNA product. Unpackaged DNA is degraded with nuclease, and the protected, packaged DNA is then purified by phenol extraction. The target DNA insert in the resulting circular DNA molecule is cut with a suitable restriction endonuclease (FspBI or Csp6I) to produce a linear molecule comprising the DNA backbone with attached GVTs. The desired linear DNA is purified by affinity chromatography. The exposed GVT ends are recircularized by intramolecular ligation using DNA ligase, which produces the GVT pair and seals the DNA at the COS site to yield a stable circular molecule. The GVT pairs are recovered from the ligation mixture by PCR using Adaptor-A and Adaptor-B primers, for SOLEXA "paired-end" sequencing.
Preparation of GVT pairs with 45-50 kb spatial spacing in outside-out topology for the 454 platform
The present invention is especially well suited to preparing GVT pairs that can be sequenced on the 454 platform of Roche Diagnostics without recourse to the methods of Berka et al. (2006) (U.S. Patent Application 2006/0292611) and Kobel et al. (2007). The methods of Berka et al. (2006) and Kobel et al. (2007) currently available for the 454 platform are functionally limited to spatial distances of no more than a few kilobases and adopt a so-called "outside-in" topology, a term that describes the reversed orientation of the original ends of the target DNA. The present invention provides a method of preparing tags with a spatial distance of 45-50 kb while maintaining an "outside-out" topology, so that the target DNA end sequences keep the same relative orientation. Although both template strands are not present on the 454 flow cell, the 500-base read length of the current GS FLX Titanium instrument is sufficient to determine, in a single direct read from one template strand, the sequence of a GVT pair produced by cutting target DNA with the frequently cutting type II restriction endonucleases FspBI or Csp6I.
The cosmid vector pSLGVT-36 makes it possible to prepare, for the 454 platform, GVT pairs with 45-50 kb spatial spacing in "outside-out" topology. The precise 45-50 kb tag spacing provides economical physical coverage of the genome for identifying fine-scale variation and for spanning repeat regions, which facilitates the production of genome scaffolds for de novo genome sequencing and for mapping fine-scale genomic variation. Sixty thousand GVT pairs with 50 kb spatial spacing represent 1-fold physical coverage of a human-sized genome. At the current capacity of the 454 instrument, a single run can provide 20-fold physical coverage of the human genome at 50 kb resolution, a substantial advance in the economy and depth of physical coverage compared with the fosmid paired-end mapping method of Tuzun et al. (2005).
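The coverage figures above follow from simple arithmetic: physical coverage is the number of GVT pairs multiplied by their spacing, divided by the genome size. The Python sketch below reproduces the calculation, assuming a human-sized genome of 3 × 10^9 bp (a round figure not stated in the text). The function names are hypothetical.

```python
import math

HUMAN_GENOME_BP = 3_000_000_000  # assumed round figure for a human-sized genome


def physical_coverage(n_pairs, spacing_bp, genome_bp=HUMAN_GENOME_BP):
    """Fold physical coverage provided by n_pairs GVT pairs of the given spacing."""
    return n_pairs * spacing_bp / genome_bp


def pairs_needed(coverage, spacing_bp, genome_bp=HUMAN_GENOME_BP):
    """Number of GVT pairs required to reach the requested fold coverage."""
    return math.ceil(coverage * genome_bp / spacing_bp)


if __name__ == "__main__":
    print(physical_coverage(60_000, 50_000))  # 60,000 pairs at 50 kb spacing
    print(pairs_needed(20, 50_000))           # pairs for 20-fold coverage
```

Under these assumptions, 60,000 pairs at 50 kb spacing give 1-fold coverage, and 20-fold coverage corresponds to 1.2 million pairs.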
The cosmid vector pSLGVT-36 is a 2.6 kb vector comprising a kanamycin selection marker, a low-copy-number P15A origin of replication for stable propagation of genomic DNA, and a COS site for bacteriophage lambda packaging. The cleavage sites for the restriction endonucleases FspBI and Csp6I on the vector have been eliminated by site-directed mutagenesis, so that these enzymes can be used according to the method of the invention to produce GVTs, and subsequently GVT pairs, from the target DNA insert. The target DNA cloning site of the vector is flanked by a pair of "Adaptor-A" and "Adaptor-B" sequences of Roche Diagnostics, making it possible to recover the GVT pairs produced by PCR using 454-Adaptor-A and 454-Adaptor-B primers. The recovered, amplified GVT pairs flanked by the Adaptor-A and Adaptor-B sequences are amplified by emulsion PCR to prepare templates for 454 sequencing.
In operation, target DNA for producing 45-50 kb GVT pairs for the 454 platform is sheared to a fragment size of 40-60 kb, and the ends are repaired with T4 DNA polymerase. The repaired target DNA is ligated to the pSLGVT-36 vector. Ligation of the cosmid vector to the target DNA is carried out at an equal molar ratio of linearized vector to target DNA insert and at high DNA concentration (typically more than 2-3 µg total DNA per µl), which drives the formation of long concatemers of alternating vector and target DNA fragments. The ligated product is packaged into phage particles using a commercially available packaging extract (Stratagene, La Jolla, CA). Propagation of methylated target DNA, such as genomic DNA, requires a host cell strain with inactivated mcr and mrr alleles. Suitable host strains include 10G (Lucigen, Middleton, WI) and XL1-Blue MR and XL2-Blue MRF' (Stratagene, La Jolla, CA). Under kanamycin selection, the infected cells are plated at a density of approximately 20,000-50,000 colonies per plate on 10 cm diameter agar plates to produce the primary cosmid library, which comprises target DNA inserts averaging 45-50 kb flanked on one side by 454-Adaptor-A and on the other by 454-Adaptor-B. An alternative approach is to grow the infected cells in liquid medium, taking care not to overgrow the cells, which would promote unwanted clonal selection. The total number of clones in culture should reflect the number of GVT pairs required by the study design. The cells are harvested and the cosmid is isolated for GVT preparation. The purified cosmid DNA carrying target DNA is digested to completion with FspBI or Csp6I. The digestion product is passed through a CHROMA SPIN 1000 column (Clontech, Mountain View, CA) to remove the bulk of the digested target DNA insert. The flow-through material is electrophoresed on an agarose gel.
A DNA fragment of approximately 2.6-3 kb is recovered from the gel; it corresponds to the intact linear cosmid vector with two attached GVTs, which correspond to the ends of the target DNA. The recovered material is diluted to below 25 ng/µl for intramolecular ligation to produce GVT pairs. The junction between the newly juxtaposed GVTs is demarcated by the regenerated restriction endonuclease site of the enzyme used to produce the GVTs. The regenerated restriction site, which is now unique on the molecule, sets the boundary between the GVTs of a GVT pair in subsequent data analysis. The resulting GVT pairs are recovered from the vector backbone by DNA amplification using Adaptor A and B primers. The amplified GVT pairs, flanked by the 454 adaptors, are amplified directly on beads by emulsion PCR for 454 sequencing.
Within the scope and principle of the present invention, it is contemplated that GVTs and GVT pairs with other spatial spacings may be prepared with or without in vitro viral packaging, and with or without a step of propagation through host cells. In the latter case, target DNA inserts carrying the specific 454 adaptors at each end are cloned into a suitable DNA backbone having a COS site and then packaged into phage heads using a commercially available packaging extract (Stratagene, La Jolla, CA). The DNA backbone may be labeled with a purification moiety such as biotin to assist affinity purification of the desired DNA product. Unpackaged DNA is degraded with nuclease, and the protected, packaged DNA is then purified by phenol extraction. The target DNA in the resulting circular DNA molecule is cut with a suitable restriction endonuclease to produce a linear molecule comprising the DNA backbone with attached GVTs. The desired linear DNA is purified by affinity chromatography. The exposed GVT ends are recircularized by intramolecular ligation using DNA ligase, which produces the GVT pair and seals the DNA at the COS site to yield a stable circular molecule. The GVT pairs are recovered from the ligation mixture by PCR using Adaptor-A and Adaptor-B primers, for 454 sequencing.
Preparation of GVT pairs with 45-50 kb spatial spacing in outside-in topology for the 454 platform
In combination with phage packaging, it is also contemplated within the scope and principle of the present invention that GVT pairs with an "outside-in" topology may be prepared, since this topology relates to the methods described by Berka et al. (2006) (U.S. Patent Application 2006/0292611) and Kobel et al. (2007), in which the end tags adopt a reversed orientation.
The cosmid vector pSLGVT-38 or a derivative thereof is used to prepare, from a target DNA population, GVT pairs with 45-50 kb spacing in the so-called "outside-in" topology for DNA sequencing on the 454 platform. pSLGVT-38 is a 2.6 kb vector comprising a kanamycin selection marker, a low-copy-number P15A origin of replication for stable propagation of genomic DNA, and a COS site for bacteriophage lambda packaging. The cleavage sites for the restriction endonucleases FspBI and Csp6I on the vector have been eliminated by site-directed mutagenesis, so that these enzymes can be used according to the method of the invention to prepare GVTs, and subsequently GVT pairs, from any target DNA insert. The target DNA cloning site of the vector is flanked by a pair of 454 "Internal Adaptor-A" and 454 "Internal Adaptor-B" sequences of Roche Diagnostics, making it possible to recover the GVT pairs produced by PCR using 454-Internal Adaptor-A and 454-Internal Adaptor-B primers. pSLGVT-38 also contains a matched pair of rare-cutting 8-base restriction sites on each side of 454-Internal Adaptor-A and 454-Internal Adaptor-B, making it possible to recover the GVT pairs with their flanking Internal Adaptor sequences by enzymatic digestion.
In operation, target DNA for producing 45-50 kb GVT pairs for the 454 platform is sheared to a fragment size of 40-55 kb, and the ends are repaired with T4 DNA polymerase. The repaired target DNA is ligated to the pSLGVT-38 vector. Ligation of the cosmid vector to the target DNA is carried out at an equal molar ratio of linearized vector to target DNA insert and at high DNA concentration (typically more than 2-3 µg total DNA per µl), which drives the formation of long concatemers of alternating vector and target DNA fragments. The ligated product is packaged into phage particles using a commercially available packaging extract (Stratagene, La Jolla, CA). Propagation of methylated target DNA, such as genomic DNA, requires a host cell strain with inactivated mcr and mrr alleles. Suitable host strains include 10G (Lucigen, Middleton, WI) and XL1-Blue MR and XL2-Blue MRF' (Stratagene, La Jolla, CA). Under kanamycin selection, the infected cells are plated at a density of approximately 20,000-50,000 colonies per plate on 10 cm diameter agar plates to produce the primary cosmid library, which comprises target DNA inserts averaging 45-50 kb flanked on one side by 454-Internal Adaptor-A and on the other by 454-Internal Adaptor-B. An alternative approach is to grow the infected cells in liquid medium, taking care not to overgrow the cells, which would promote unwanted clonal selection. The total number of clones in culture should reflect the number of GVT pairs required by the study design. The cells are harvested and the cosmid is isolated for GVT preparation. The purified cosmid DNA carrying target DNA is digested to completion with FspBI or Csp6I. The digestion product is passed through a CHROMA SPIN 1000 column (Clontech, Mountain View, CA) to remove the bulk of the digested target DNA insert.
The flow-through material is electrophoresed on an agarose gel. A DNA fragment of approximately 2.6-3 kb is recovered from the gel; it corresponds to the intact linear cosmid vector with two attached GVTs, which correspond to the ends of the target DNA. The recovered material is diluted to below 25 ng/µl for intramolecular ligation to produce GVT pairs. The junction between the newly juxtaposed GVTs is demarcated by the regenerated restriction endonuclease site of the enzyme used to produce the GVTs. The regenerated restriction site, which is now unique on the molecule, sets the boundary between the GVTs of a GVT pair in subsequent data analysis. The resulting GVT pairs are recovered from the vector backbone by DNA amplification using 454-Internal Adaptor-A and 454-Internal Adaptor-B primers. The resulting product is recircularized through the Internal Adaptors and then digested with the type II restriction endonuclease (FspBI or Csp6I) used to produce the GVTs. The linear molecule now contains a GVT pair in "outside-in" topology, in which the original ends of the target DNA insert are in reversed orientation, with one GVT on each side of the newly joined Internal Adaptors. The linear molecules so produced are ligated to 454-Adaptor-A and 454-Adaptor-B for sequencing on the 454 platform.
Preparation of GVT pairs with 45-50 kb spatial spacing for the SOLiD platform
The SOLiD platform of Applied Biosystems for massively parallel DNA sequencing is based on sequential cycles of DNA ligation. In this method, immobilized DNA templates are clonally amplified on beads, which are plated at high density onto the surface of a glass flow cell in which sequencing takes place. Sequence determination is achieved by successive cycles of ligation of short, defined, labeled probes to a series of primers hybridized to the immobilized template. A current SOLiD instrument run comprises more than 200 million individual reads of 50 bases.
Although the SOLiD platform provides the largest number of base calls per instrument run, the platform is limited by its short read length and by the absence from the flow cell of both template strands for sequencing. The SOLiD "mate-pair" system for paired-end reads therefore relies on EcoP15I digestion to produce a pair of short 25-base DNA tags (each representing one end of the target DNA) and adopts an "outside-in" topology, similar to the methods of Berka et al. (2006) (U.S. Patent Application 2006/0292611) and Kobel et al. (2007), in order to create an internal DNA sequencing primer binding site for sequencing the other member of the tag pair. The spatial distance between tags provided by the current "mate-pair" system is only a few kilobases, and the system could benefit from the 45-50 kb spatial distance of the GVT pairs of the present invention.
In combination with phage packaging, it is contemplated within the scope and principle of the present invention that GVT pairs with an "outside-in" topology may be prepared, since this topology relates to the methods described by Berka et al. (2006) (U.S. Patent Application 2006/0292611) and Kobel et al. (2007), in which the end tags adopt a reversed orientation. In addition, the present invention offers the advantage of GVTs with an average length of 100-200 bases, a considerable advance over the 25-base tags that the existing mate-pair system prepares by EcoP15I digestion.
The cosmid vector pSLGVT-37 or a derivative thereof is used to prepare, from a target DNA population, GVT pairs with 45-50 kb spacing in the so-called "outside-in" topology for DNA sequencing on the SOLiD platform. pSLGVT-37 is a 2.6 kb vector comprising a kanamycin selection marker, a low-copy-number P15A origin of replication for stable propagation of genomic DNA, and a COS site for bacteriophage lambda packaging. The cleavage sites for the restriction endonucleases FspBI and Csp6I on the vector have been eliminated by site-directed mutagenesis, so that these enzymes can be used according to the method of the invention to prepare GVTs, and subsequently GVT pairs, from any target DNA insert. The target DNA cloning site of the vector is flanked by a pair of "Internal Adaptor-A" and "Internal Adaptor-B" sequences of Applied Biosystems (ABI), making it possible to recover the GVT pairs produced by PCR using ABI-Internal Adaptor-A and ABI-Internal Adaptor-B primers. pSLGVT-37 also contains a matched pair of rare-cutting 8-base restriction sites on each side of ABI-Internal Adaptor-A and ABI-Internal Adaptor-B, making it possible, if necessary, to recover the GVT pairs with their flanking Internal Adaptor sequences by enzymatic digestion.
In operation, target DNA for producing 45-50 kb GVT pairs for the ABI SOLiD platform is sheared to a fragment size of 40-55 kb, and the ends are repaired with T4 DNA polymerase. The repaired target DNA is ligated to the pSLGVT-37 vector. Ligation of the cosmid vector to the target DNA is carried out at an equal molar ratio of linearized vector to target DNA insert and at high DNA concentration (typically more than 2-3 µg total DNA per µl), which drives the formation of long concatemers of alternating vector and target DNA fragments. The ligated product is packaged into phage particles using a commercially available packaging extract (Stratagene, La Jolla, CA). Propagation of methylated target DNA, such as genomic DNA, requires a host cell strain with inactivated mcr and mrr alleles. Suitable host strains include 10G (Lucigen, Middleton, WI) and XL1-Blue MR and XL2-Blue MRF' (Stratagene, La Jolla, CA). Under kanamycin selection, the infected cells are plated at a density of approximately 20,000-50,000 colonies per plate on 10 cm diameter agar plates to produce the primary cosmid library, which comprises target DNA inserts averaging 45-50 kb flanked on one side by ABI-Internal Adaptor-A and on the other by ABI-Internal Adaptor-B. An alternative approach is to grow the infected cells in liquid medium, taking care not to overgrow the cells, which would promote unwanted clonal selection. The total number of clones in culture should reflect the number of GVT pairs required by the study design. The cells are harvested and the cosmid is isolated for GVT preparation. The purified cosmid DNA carrying target DNA is digested to completion with FspBI or Csp6I. The digestion product is passed through a CHROMA SPIN 1000 column (Clontech, Mountain View, CA) to remove the bulk of the digested target DNA insert. The flow-through material is electrophoresed on an agarose gel.
A DNA fragment of approximately 2.6-3 kb is recovered from the gel; it corresponds to the intact linear cosmid vector with two attached GVTs, which correspond to the ends of the target DNA. The recovered material is diluted to below 25 ng/µl for intramolecular ligation to produce GVT pairs. The junction between the newly juxtaposed GVTs is demarcated by the regenerated restriction endonuclease site of the enzyme used to produce the GVTs. The regenerated restriction site, which is now unique on the molecule, sets the boundary between the GVTs of a GVT pair in subsequent data analysis. The resulting GVT pairs are recovered from the vector backbone by DNA amplification using ABI-Internal Adaptor-A and ABI-Internal Adaptor-B primers. The resulting product is recircularized through the Internal Adaptors and then digested with the type II restriction endonuclease (FspBI or Csp6I) used to produce the GVTs. The linear molecule now contains a GVT pair in "outside-in" topology, in which the original ends of the target DNA insert are in reversed orientation, with one GVT on each side of the newly joined Internal Adaptors. The linear molecules so produced are ligated to ABI-Adaptor-P1 and ABI-Adaptor-P2 for sequencing on the ABI SOLiD mate-pair platform.
In a preferred embodiment, the present invention identifies fine structural variation in a target genome by producing a plurality of GVT pairs, which serve as unique genomic positional identifiers of defined spatial distance and orientation. Collectively, the plurality of GVT pairs represents the genomic profile of the subject; when compared with a reference sequence or with similarly produced genomic profiles of other target genomes, it indicates the presence of fine structural differences between the nucleic acid populations. The genomic fine structural variations detectable by the present invention include deletions and insertions, duplications, inversions, translocations, and other chromosomal rearrangements. The present invention provides methods for identifying these genomic features at a user-defined level of resolution dictated by the experimental design.
The invention provides for the production of GVTs with average lengths of hundreds of bases, limited only by the effective read length of the DNA sequencing platform. Assuming a uniform abundance and random distribution of the four bases, the current 76-base read length of the SOLEXA platform predicts that a sequence of this length would occur by chance on average once every 4^76 base pairs, and should therefore represent a unique sequence identifier in the human and other complex genomes. In many complex genomes, however, the unequal representation of the four bases and the existence of large regions of repetitive DNA mean that, in practice, a significant portion of short DNA tags of this size cannot be assigned to a unique genomic position. Unambiguous assignment of a GVT of a given length to the genome improves with knowledge of a linked second GVT and of their separation distance. For example, a GVT pair comprising two spatially linked 76-bp GVTs, produced from a size-fractionated population of target DNA, is effectively a 152-bp sequence tag. Despite the longer effective tag length, many GVTs or GVT pairs may still not be assignable to a unique genomic position, for example GVT pairs that lie entirely within very long repetitive genomic regions. Nevertheless, the present invention offers a substantive improvement in providing mappable paired-end reads. The regions expected to be unanalyzable by the present invention are very few, owing mainly to the ability of the invention to produce GVT pairs with a separation distance of 40-50 kb or longer, which would span most localized regions of repetitive DNA.
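The uniqueness argument above can be illustrated with a short calculation. The following is a minimal sketch under the same idealized assumption stated in the text (four bases equally abundant and randomly distributed); the 3.2 Gb genome size and the function names are assumptions made for illustration only and are not part of the disclosure.

```python
from fractions import Fraction


def expected_random_hits(tag_length: int, genome_size_bp: int) -> Fraction:
    """Expected chance occurrences of one specific tag on one strand.

    Idealized model: every base is independent and each of the four bases
    has probability 1/4, so a given tag matches a given start position
    with probability 1 / 4**tag_length.
    """
    start_positions = max(genome_size_bp - tag_length + 1, 0)
    return Fraction(start_positions, 4 ** tag_length)


def min_unique_tag_length(genome_size_bp: int,
                          max_expected_hits: Fraction = Fraction(1)) -> int:
    """Shortest tag whose expected number of chance hits is <= the limit."""
    length = 1
    while expected_random_hits(length, genome_size_bp) > max_expected_hits:
        length += 1
    return length


# Assumed haploid genome size of about 3.2 Gb (illustrative value only).
GENOME_SIZE_BP = 3_200_000_000

single_tag = expected_random_hits(76, GENOME_SIZE_BP)    # one 76-base GVT
paired_tag = expected_random_hits(152, GENOME_SIZE_BP)   # two linked GVTs
print(float(single_tag), float(paired_tag))
print(min_unique_tag_length(GENOME_SIZE_BP))
```

Under this model even a single 76-base tag is far longer than needed for uniqueness, which is why the practical limitation described in the text comes from base composition bias and repetitive DNA rather than from tag length itself.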
The common framework sequences present on each GVT-pair monomer allow GVT-pair sequences to be extracted unambiguously from high-throughput sequencing data. Alignment with MEGABLAST (Zhang et al., 2000) or a similar computer program reveals discordance between the map positions of a GVT pair and the map positions in one or more reference sequences. Discordance in GVT-pair separation distance or orientation relative to the reference that exceeds a threshold level signals a structural difference between the target and reference DNA. The threshold level is set by the experimental design; a deviation of two standard deviations from the mean GVT separation distance is a reasonable default. A deletion in the target DNA relative to the reference sequence can be defined by two or more GVT pairs that span more than two standard deviations above the mean separation distance. Likewise, an insertion in the target DNA can be defined as a site where two or more GVT pairs span less than two standard deviations below the mean separation distance when compared with the reference sequence. An inversion in the target DNA is defined as a site where the GVT orientation of two or more GVT pairs is discordant with the reference sequence. Discordant GVT pairs are manually curated and assessed, and are then validated by PCR, by Southern blot hybridization analysis, or by insert isolation and sequencing.
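The classification rules above can be sketched in code. This is an illustrative sketch only, not the method of the disclosure: it assumes each GVT pair has already been aligned to the reference, that the mean and standard deviation of the separation distance are known from the size-fractionated library, and that pairs have been grouped under a hypothetical `locus` label. All names are assumptions introduced for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class MappedPair:
    locus: str             # hypothetical label for the reference region hit
    ref_distance: int      # separation of the two GVTs on the reference, bp
    orientation_ok: bool   # True if GVT orientation agrees with the reference


def classify_pair(pair: MappedPair, mean_dist: float, sd_dist: float,
                  n_sd: float = 2.0) -> str:
    """Apply the default two-standard-deviation threshold to one pair."""
    if not pair.orientation_ok:
        return "inversion"
    if pair.ref_distance > mean_dist + n_sd * sd_dist:
        return "deletion"    # pair spans more reference than expected
    if pair.ref_distance < mean_dist - n_sd * sd_dist:
        return "insertion"   # pair spans less reference than expected
    return "concordant"


def call_variants(pairs: Iterable[MappedPair], mean_dist: float,
                  sd_dist: float, n_sd: float = 2.0,
                  min_support: int = 2) -> Dict[Tuple[str, str], int]:
    """Keep only calls supported by at least `min_support` discordant pairs."""
    support: Counter = Counter()
    for pair in pairs:
        label = classify_pair(pair, mean_dist, sd_dist, n_sd)
        if label != "concordant":
            support[(pair.locus, label)] += 1
    return {key: n for key, n in support.items() if n >= min_support}


pairs = [
    MappedPair("A", 50_000, True), MappedPair("A", 51_000, True),
    MappedPair("B", 30_000, True),                      # single pair only
    MappedPair("C", 40_500, False), MappedPair("C", 39_800, False),
    MappedPair("D", 41_000, True),                      # within threshold
]
calls = call_variants(pairs, mean_dist=40_000, sd_dist=2_000)
print(calls)
```

Candidate calls produced this way would still go through the manual curation and experimental validation steps described in the text.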
The target genomic nucleic acid used in the present invention can be derived from any source, including the genomic DNA of eukaryotes, prokaryotes, microorganisms, plastids and viruses. The target genomic nucleic acid can also be derived from the RNA genomes of organisms such as RNA viruses, by converting the RNA to DNA through reverse transcription. The choice of target nucleic acid for investigation may be influenced by prior knowledge, described in the scientific literature, that a particular chromosome or chromosomal region is associated with certain disease conditions. The present invention can use target DNA from isolated chromosomes or chromosomal regions. The present invention can also be used for broad whole-genome scans of patient cohorts at a range of resolutions to suit the study design. Methods for purifying chromosomes, chromosome segments, genomic DNA and RNA are known in the art. Also known in the art are methods for amplifying nucleic acids by PCR or by other means to produce target DNA for analysis by the present invention.
Methods for cleaving target DNA and fractionating it to the size required to set the spatial distance between the GVTs of a GVT pair are described above. Hydrodynamic shearing, adaptive focused acoustics, or partial enzymatic digestion of DNA with frequently cutting enzymes can be used to produce a population of highly overlapping DNA fragments that covers substantially every region of the target DNA. Alternatively, the target DNA can be digested to completion with several restriction endonucleases in separate cleavage reactions and then size-fractionated into the size classes required for GVT-pair production. GVT pairs produced from size-selected target DNA prepared by complete digestion with a single restriction endonuclease are non-overlapping and cover only a portion of the target DNA complexity. Size-selected DNA fragments obtained by complete enzymatic digestion with one or more other restriction endonucleases can be used to provide overlapping sequence coverage. The physical parameters of the experiment, such as the DNA fragmentation method, the GVT separation distance and combinations thereof needed to cover a genome of a given complexity, base composition or distribution of repetitive elements, can be modeled computationally by those skilled in the art to arrive at an optimal study design. Enzymes such as BamHI, HindIII, PstI, SpeI and XbaI are insensitive to CpG methylation and are expected to cleave mammalian genomic DNA at every site, producing GVT pairs that accurately represent pairs of adjacent recognition sites for those enzymes. 
Other suitable enzymes that are insensitive to CpG methylation, overlapping CpG methylation, or other kinds of DNA modification that may affect nucleic acid analysis by the present invention are described in the literature (McClelland et al., 1994; Geier and Modrich, 1979; Kan et al., 1979; Hattman et al., 1978; Buryanov et al., 1978; May and Hattman, 1975) and by major suppliers of restriction endonucleases (Fermentas, Hanover, MD; New England Biolabs, Ipswich, MA). In certain embodiments, enzymes whose cleavage of target DNA is sensitive to DNA modification can be used to demarcate sites of epigenomic modification in the target DNA. For example, the present invention can identify sites of DNA methylation, which is known to regulate gene expression. For such an application, the target DNA is digested to completion with a methylation-sensitive restriction enzyme, and GVT pairs are produced from the digested DNA. Sites of methylation are identified by discordance between the resulting GVT pairs and the adjacent restriction sites in the reference sequence.
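The comparison of observed GVT pairs against adjacent restriction sites in the reference can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of the disclosure: the reference is a plain string, each observed GVT pair is reduced to the reference coordinates of the two recognition sites it marks, and any reference site lying strictly between the two ends of a pair is taken as a candidate uncut (possibly methylated) site. The toy sequence and function names are hypothetical; CCGG is the recognition sequence of the methylation-sensitive enzyme HpaII.

```python
import bisect
from typing import Iterable, List, Tuple


def find_sites(reference: str, recognition: str) -> List[int]:
    """Return the sorted start positions of every recognition site."""
    sites = []
    pos = reference.find(recognition)
    while pos != -1:
        sites.append(pos)
        pos = reference.find(recognition, pos + 1)  # allow overlapping sites
    return sites


def candidate_uncut_sites(ref_sites: List[int],
                          observed_pairs: Iterable[Tuple[int, int]]) -> List[int]:
    """Reference sites skipped by an observed pair.

    A pair from complete digestion should join two adjacent sites. If other
    reference sites lie strictly between its ends, the enzyme did not cut
    there in the target DNA. `ref_sites` must be sorted in ascending order.
    """
    skipped = set()
    for left, right in observed_pairs:
        lo = bisect.bisect_right(ref_sites, left)   # first site after `left`
        hi = bisect.bisect_left(ref_sites, right)   # first site at/after `right`
        skipped.update(ref_sites[lo:hi])
    return sorted(skipped)


# Toy reference with three CCGG sites (illustrative sequence only).
reference = "AACCGGTTTTCCGGAAAACCGGTT"
sites = find_sites(reference, "CCGG")

unmethylated = candidate_uncut_sites(sites, [(2, 10), (10, 18)])
methylated = candidate_uncut_sites(sites, [(2, 18)])
print(sites, unmethylated, methylated)
```

A skipped site could also reflect a sequence difference that destroys the recognition site in the target, so candidates from such a comparison would need independent confirmation.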
Discordant GVT pairs are first manually curated and then passed through a series of hierarchical filters for validation. Where the discordant GVT pairs were produced from size-selected DNA derived from complete restriction endonuclease digestion, Southern blot analysis of target DNA and reference DNA digested with the same restriction endonuclease can be used to validate differences in marker distance between the target and reference DNA. The GVTs are long enough to serve as specific PCR primers for isolating the intervening genomic sequence for shotgun sequencing, in order to determine the precise nature of the structural variation.
It is generally believed that the study of structural variation will shed further light on complex diseases, such as obesity and diabetes, whose development is triggered by the interaction of genes, genetic elements and the environment. The choice of nucleic acid for analysis by the present invention may be influenced by prior knowledge, described in the scientific literature, that a particular chromosome or chromosomal region is associated with certain disease conditions. The present invention can target, at high resolution, DNA from isolated chromosomes, chromosomal regions or tissue samples. Alternatively, the present invention can be used for broad whole-genome scans of patient cohorts at a range of resolutions to suit the study design. Fosmid paired-end mapping (Tuzun et al., 2005) requires more than 21000000 conventional Sanger dideoxy sequencing reads to analyze one individual at a moderate level of resolution and coverage. This limits its use for scanning large populations in association studies aimed at finding biomarkers that are diagnostic or prognostic of disease outcome, as well as potential drug targets for medical intervention. The present invention provides a solution to these limitations and therefore has the potential to yield new methods of medical diagnosis and to aid drug discovery.
In another preferred embodiment, the fine-structural variations identified by the present invention are used to design oligonucleotide array assays, microarray assays, PCR-based assays and other diagnostic assays known in the art for detecting differences between nucleic acid populations. Current microarrays and oligonucleotide arrays are effective platforms for detecting changes in nucleic acid copy number and single or few nucleotide polymorphisms, but they are not suited to detecting other genomic changes that may contribute to or cause disease. The products identified by the present invention make it possible to design oligonucleotide and microarray assays, or other diagnostic assays known in the art, that screen for the translocation, insertion, deletion and inversion junctions demarcating the fine-structural variations identified by the present invention. These assays can then be used to screen the general population and large patient cohorts to determine the role of fine-structural variation in complex diseases such as obesity, diabetes and many cancers, whose development is triggered by the interaction of multiple genetic and environmental factors. Other applications of these assays include, but are not limited to, the diagnosis or differentiation of closely related species, strains, races or biotypes of organisms, with utility in the fields of medical diagnostics, industrial microbiology and phylogenetics.
In another preferred embodiment, the present invention is used to produce high-resolution genomic maps that aid de novo genome assembly from shotgun DNA sequencing. Shotgun sequencing was introduced by Sanger et al. (1977): genomic DNA is randomly fragmented into small fragments that are sequenced individually, and the sequences are then assembled to construct the genomic sequence. For complex genomes the shotgun approach is controversial, because repetitive sequences can create false overlaps. Two approaches are used to handle complex genomes. The hierarchical approach involves generating an overlapping set of intermediate-sized clones such as BACs, selecting a tiling path of these clones, and then subjecting each clone to shotgun sequencing. In this way a large genome is broken down into smaller, more manageable "genomes". The second approach is termed whole-genome shotgun (WGS), in which computational methods generate the entire genome sequence in one fell swoop directly from short overlapping sequence reads. Two developments made WGS feasible: (1) Edwards et al. (1990) introduced the use of paired-end reads, in which sequencing the ends of inserts of known approximate size provides linking information with a distance constraint between the two sequence reads; and (2) assembly algorithms were developed that can exploit paired-end sequence information (Huang et al., 2006; Warren et al., 2006; Pop et al., 2004; Havlak et al., 2004; Jaffe et al., 2003; Mullikin and Ning, 2003; Huang et al., 2003; Batzoglou et al., 2002; Pevzner and Tang, 2001; Myers et al., 2000). Clone lengths are supplied to the WGS assembly program as constraints on the permissible distance between sequence reads. 
This information is key to resolving repetitive sequences, because it allows the construction of scaffolds that link, order and orient sequence contigs, increasing the long-range contiguity of the resulting sequence assembly. The plasmid paired-end reads of Edwards et al. (1990) were later supplemented with BAC paired-end reads to build more ordered scaffolds (Warren et al., 2006; Zhao, 2000; Mahairas et al., 1999). Yet despite the extensive use of paired-end reads, most draft genome sequences contain thousands of misassemblies (Salzberg and Yorke, 2005). Assembly errors stem from a combination of software flaws, difficult repeated regions in the genome, the diploid nature of most large genomes, and scaffolds of insufficient resolution and coverage. Insufficient scaffold resolution stems largely from the imprecise separation distance of paired-end reads derived from plasmid or BAC inserts, because the size of each sequenced clone cannot be determined with current experimental protocols. In addition, scaffolds are constructed without optimizing the number and spacing of the elements needed to obtain the necessary spatial resolution. The present invention provides methods for producing high-resolution scaffolds that enable genome assembly, especially the de novo assembly of uncharacterized genomes, for which prior structural information is usually unavailable. In particular, the present invention provides improved methods for producing GVT pairs, which in one embodiment represent an improved functional equivalent of the classical paired-end reads of Edwards et al. (1990), Zhao (2000) and Tuzun (2005). 
Compared with classical paired-end reads, GVT pairs have a separation distance that can be tailored precisely to any required configuration and, more importantly, the ability to mark adjacent restriction endonuclease sites in the genome, which provides independent confirmation of the accuracy of the resulting genome assembly. GVT pairs are suitable for high-throughput DNA sequencing with conventional Sanger dideoxy sequencing chemistry or on next-generation 454 instruments (Roche Diagnostics, Indianapolis, IN), SOLEXA instruments (Illumina, San Diego, CA) or SOLiD instruments (Applied Biosystems, Foster City, CA), to provide complete and cost-effective coverage of the target genome. The present invention therefore provides a comprehensive set of unique genetic markers with defined separation distances or adjacent restriction endonuclease sites to facilitate whole-genome shotgun sequencing efforts.
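The way a known separation distance constrains a scaffold can be shown with a small calculation. This is a generic scaffolding sketch, not an algorithm taken from the disclosure or from any of the cited assemblers. It assumes both contigs are in forward orientation, that the left GVT maps to contig A and the right GVT to contig B, and that `separation` is the outer distance between the two GVTs on the target DNA; all names are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class LinkingPair:
    left_tag_start: int   # start of the left GVT on contig A (0-based)
    right_tag_end: int    # end of the right GVT on contig B (0-based, exclusive)
    separation: int       # outer distance between the GVTs on the target, bp


def estimate_gap(contig_a_length: int, pairs: Sequence[LinkingPair]) -> float:
    """Estimate the unsequenced gap between contig A and contig B.

    Each pair spans: (tail of contig A after the left tag start)
                     + gap
                     + (head of contig B up to the right tag end).
    Solving for the gap and averaging over all linking pairs gives the
    estimate. Raises statistics.StatisticsError if `pairs` is empty.
    """
    estimates = [
        p.separation - (contig_a_length - p.left_tag_start) - p.right_tag_end
        for p in pairs
    ]
    return mean(estimates)


links = [
    LinkingPair(left_tag_start=9_000, right_tag_end=2_000, separation=40_000),
    LinkingPair(left_tag_start=9_500, right_tag_end=2_500, separation=40_000),
]
gap = estimate_gap(contig_a_length=10_000, pairs=links)
print(gap)
```

The tighter the distribution of separation distances in the library, the narrower the uncertainty on such a gap estimate, which is the resolution advantage the text attributes to precisely size-fractionated GVT pairs.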
It is expected that a large number of the GVT pairs produced by the present invention that are discordant with the current build of the human genome assembly (Build 36, April 2006) may not represent fine-structural variation in the target DNA, but instead reflect errors or gaps in the current human genome assembly. Compounding the problem, the existing genome assembly is derived from the pooled DNA of multiple donors. Reference sequences derived from a large number of single individuals representing the range of human diversity are needed to move the field of genomics forward. The utility offered by the present invention provides a means of doing so economically.
In another preferred embodiment, the present invention is used to produce high-resolution genomic maps to facilitate phylogenetic studies and to determine the genetic and functional relationships between closely related organisms. An aspect of the invention that is especially suited to this application uses GVT pairs produced from target DNA digested to completion with one or more restriction endonucleases, alone or in useful combination, without a DNA size-fractionation step. In essence, the GVT pairs so produced constitute a genomic profile comprising pairs of positional markers that demarcate adjacent restriction endonuclease sites along the length of the target DNA. The identity of the GVT pairs and their relative abundance can be used to produce high-resolution genomic profiles for identifying, differentiating and quantifying the genomes of origin in complex medical or environmental DNA isolates. The GVT pairs produced can also be applied in industrial microbiology to identify, in closely related strains, biotypes or races of genetically modified organisms, the genomic differences that give rise to desirable traits such as favorable growth rates and the production of useful secondary metabolites and recombinant proteins. The present invention can therefore aid strain improvement in industrial production by microorganisms or mammalian host cells. 
The high-resolution genomic maps produced by the present invention also provide a low-cost and effective means of surveying the nucleic acids of closely related pathogens to identify regions of variation for detailed sequence analysis, and thereby to identify pathogenic determinants that can be used for diagnosis and as drug targets for medical intervention.
In another preferred embodiment, the present invention can be used for the genetic dissection of phenotypic diversity in farm animals and agricultural crops to facilitate marker-assisted breeding. Farm animals are of particular interest for identifying the complex genetic elements that contribute to the control of growth, energy metabolism, development, body composition, reproduction and behavior, as well as other traits sought through classical breeding. For a review, see Andersson (2001). Most agricultural traits of interest are multifactorial and are usually controlled by an unknown number of quantitative trait loci (QTL). Microsatellite maps for genome scans have been developed for some farm animals. Association studies using these markers and the candidate gene approach are the two main strategies for identifying QTL. Cloning QTL is challenging because the relationship between genotype and phenotype is considerably more complex than for monogenic traits. It is nevertheless possible to determine QTL indirectly by progeny testing, in which the segregation of the QTL is inferred from genetic marker data and phenotypic variation among the progeny. At present the molecular basis of most QTL remains unknown. QTL mapping in Drosophila suggests that QTL are often associated with sequence variation in noncoding regions (Mackay, 2001). As in humans, fine-structural variation in the genomes of farm animals and crop plants is expected to play an important role in phenotypic expression and in the interaction of the genome with the environment. The present invention provides a low-cost means of tabulating the broad range of genomic structural diversity in farm animals and crop plants. The tabulated information can then be used to produce oligonucleotide microarrays and other diagnostic platforms for association and linkage studies, in order to identify and characterize the actual QTL and so enable marker-assisted breeding.
As major pollinators, honeybees play a critical role in agriculture in many parts of the world. Apiculture is another field that stands to benefit from the present invention. The honeybee is an economically important species that is well suited to the use of genetic technology in breed development. Honeybees have a short generation time and produce large numbers of progeny. Lines are also readily propagated by artificial insemination. Bee strains show broad phenotypic variation in productivity, disease resistance and behavioral traits, many of which are under complex genetic control. Important behavioral traits under genetic control include aggression, as exemplified by many African strains, foraging habits, honey yield and the so-called "hygienic" behavior. The "hygienic" trait is regulated by at least seven as yet undefined loci, which together cause hive members to perform the cleaning behavior of removing the dead or diseased from the colony as a primary defense against fungal and mite infestation, the two major economic pathogens of honeybees. A primary goal is to develop reliable diagnostic molecular markers that can be used in marker-assisted breeding to identify the desired progeny lines quickly and efficiently, without the need for complicated and time-consuming breeding experiments and field tests. Using the genetic map and reference sequence of the 200-megabase genome of Apis mellifera strain DH4 (Weinstock, 2006), the present invention provides efficient and low-cost methods for surveying the genomes of multiple bee strains for fine-structural variation at high resolution, in order to correlate desired phenotypes with genotypes. The ability to survey multiple strains cost-effectively is a key advantage offered by the present invention.
In another preferred embodiment, the present invention can be used to identify the genetic causes underlying neurological disorders and traits. It is generally believed that many neurological disorders, such as autism, bipolar disorder and schizophrenia, have at least in part a complex non-Mendelian genetic component (Craddock and Jones, 2001; Owen and Craddock, 1996; Holzman and Matthysse, 1990). Complementing the linkage and association studies currently used to identify the genomic components, the present invention provides methods for assessing the contribution of genomic fine-structural variation to neurological disorders and may lead to new methods for diagnosis, prognosis and patient management.
In another preferred embodiment, the present invention can be used to identify the genetic causes underlying cancer, thereby yielding methods for diagnosis, prognosis and therapeutic intervention. Virtually all cancers are attributable to abnormalities in DNA sequence, either inherited or acquired through somatic mutation during life. The prevailing tenet of oncogenesis is that accumulated inherited and somatic DNA mutations, together with environmental factors, alter gene expression or gene function beyond a critical functional threshold that allows clonal expansion, cellular invasion of surrounding tissue and the initiation of metastasis. In Western countries one person in three will develop cancer and one in five will die as a direct result of the disease, making cancer the most common genetic disease. Historically, the field began with the identification of potent oncogenes or tumor suppressor genes, in which a simple loss or gain of function caused by a small number of nucleotide changes at a locus was the main contributing factor to cancer. The field later expanded to gene dosage, in which the duplication or deletion of DNA segments that alters gene copy number is the presumed cause of oncogenesis. Array CGH has been particularly useful for detecting changes in DNA copy number and loss of heterozygosity in cancer cell lines and primary tumors. A comprehensive review of copy number analysis in cancer, a catalogue of somatic mutations in cancer, and the references therein can be found at the Cancer Genome Project of the Sanger Institute (http://www.sanger.ac.uk/genetics/CGP/).
More recently, the important role of genomic fine-structural variation in oncogenesis has been recognized. During oncogenesis the tumor genome accumulates a large number of rearrangements, including amplifications, deletions, translocations and inversions, many of which contribute directly to tumor progression (Gray and Collins, 2000). Volik et al. (2006) used a modification of fosmid paired-end mapping to detect all of the changes in genomic architecture in a progressing tumor, in particular translocation and inversion events that cannot be detected by array CGH. Their attempt to dissect the breast cancer genome was informative, but the investigators acknowledged that it was limited by the expense and resources needed to obtain the end sequences of a large number of BAC clones for each sample. The present invention provides low-cost, high-resolution methods that overcome these shortcomings and identify genomic fine-structural variations that are not amenable to detection by array CGH. When coupled with next-generation DNA sequencers, the cost of the present invention is low enough to allow its use in broad surveys of cancer patient cohorts and in tracking the accumulation of genomic changes during tumor progression in individual patients. The ability to track genomic changes during tumor progression would have profound predictive value for clinical outcome and would provide a significant improvement in patient treatment.
It is to be understood that, given the disclosure herein, various other modifications will be apparent to those skilled in the art and can readily be made by them without departing from the scope and spirit of the present invention.
References
The following documents, and all other articles, patents and published applications mentioned throughout this application, are incorporated herein by reference in their entirety:
Albertson DG and Pinkel D, 2003. Genomic microarrays in human genetic disease and cancer. Hum Mol Gen 12 Spec No 2: R145-R152.
Albertson DG et al., 2000. Quantitative mapping of amplicon structure by array CGH identifies CYP24 as a candidate oncogene. Nat Genet 25: 144-146.
Andersson L, 2001. Genetic dissection of phenotypic diversity in farm animals. Nat Rev 2: 130-138.
Bailey AB et al., 2002. Recent segmental duplications in the human genome. Science 297: 1003-1007.
Batzoglou S et al., 2002. ARACHNE: A whole-genome shotgun assembler. Genome Res 12: 177-189.
Berka J et al., 2006. Paired end sequencing. U.S. Patent Application No. US 2006/0292611.
Bignell GR et al., 2004. High-resolution analysis of DNA copy number using oligonucleotide microarrays. Genome Res 14: 287-295.
Bolivar F et al., 1977. Construction and characterization of new cloning vehicles. II multipurpose system. Gene 2: 95-113.
Brennan C et al., 2004. High-resolution global profiling of genomic alterations with long oligonucleotide microarray. Cancer Res 64: 4744-4748.
Bujnicki JM, 2001. Understanding the evolution of restriction-modification systems: Clues from sequence and structure comparisons. Acta Biochimica Polonica 48: 935-967.
Buryanov YI et al., 1978. Site specific and chromatographics properties of E coli K12 and EcoRII DNA-cytosine methylases. FEBS Lett 88: 251-254.
Chang ACY and Cohen SN, 1978. Construction and characterization of amplifiable multicopy DNA cloning vehicles derived from the P15A cryptic miniplasmid. J Bacteriology 134: 1141-1156.
Check E, 2005. Patchwork people. Nature 437: 1084-1096.
Cheng Z et al., 2005. A genome-wide comparison of recent chimpanzee and human segmental duplications. Nature 437: 88-93.
Collins FS et al., 1987. Construction of a general human chromosome-jumping library, with application in cystic fibrosis. Science 235: 1046-1049.
Collins FS and Weissman SM, 1984. Directional cloning of DNA fragments at a large distance from an initial probe: A circularization method. Proc Natl Acad Sci (USA) 81: 6812-6816.
Craddock N and Jones I, 2001. Molecular genetics of bipolar disorder. Br J Psychiatry Suppl 41: S128-S133.
Deininger PL, 1983. Random subcloning of sonicated DNA: Application to shotgun DNA sequence analysis. Analyt Biochem 129: 216-223.
Dugaiczyk A et al., 1975. Ligation of EcoRI endonuclease-generated DNA fragments into linear and circular structures. J Mol Biol 96: 171-178.
Dunn JL et al., 2002. Genomic signature tags (GSTs): A system for profiling genomics DNA. Genome Res 12: 1756-1765.
Edwards A et al., 1990. Automated DNA sequencing of the human HPRT locus. Genomics 6: 593-608.
Feng T et al., 2002. Increased efficiency of cloning large DNA fragments using a lower copy number plasmid. BioTechniques 32: 992-998.
Feuk L et al., 2006. Structural variation in the human genome. Nature Rev 7: 85-97.
Fitzgerald MC et al., 1992. Rapid shotgun cloning utilizing the two base recognition endonuclease CviJI. Nuc Acid Res 20: 3753-3762.
Geier GE and Modrich P, 1979. Recognition sequence of the dam methylase of Escherichia coli K12 and mode of cleavage of DpnI endonuclease. J Biol Chem 254: 1408-1413.
Gonzalez E et al., 2005. The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 307: 1434-1440.
Gray JW and Collins C, 2000. Genome changes and gene expression in human solid tumors. Carcinogenesis 21: 443-452.
Grindley NDF and Joyce CM, 1980. Genetic and DNA sequence analysis of the kanamycin resistance transposon Tn903. Proc Natl Acad Sci (USA) 77: 7176-7180.
Hamelin C and Yelle J, 1990. Gel and buffer effects on the migration of DNA molecules in agarose. Appl Theor Electrophor 1: 225-231.
Hattman S et al., 1978. Sequence specificity of the P1 modification methylase (M.EcoP1) and the DNA methylase (M.Ecodam) controlled by the Escherichia coli dam gene. J Mol Biol 126: 367-380.
Havlak P et al., 2004. The atlas genome assembly system. Genome Res 14: 721-732.
Hayashi K et al., 1986. Regulation of inter- and intermolecular ligation with T4 DNA ligase in the presence of polyethylene glycol. Nuc Acids Res 14: 7617-7630.
Heffron F et al., 1978. In vitro mutagenesis of a circular DNA molecule by using synthetic restriction sites. Proc Natl Acad Sci (USA) 74: 6012-6016.
Heiskanen MA et al., 2000. Detection of gene amplification by genomic hybridization to cDNA microarrays. Cancer Res 60: 799-802.
Holzman PS and Matthysse S, 1990. The genetics of schizophrenia: A review. Psychol Sci 1: 179-286.
Huang J et al., 2004. Whole genome DNA copy number changes by high density oligonucleotides arrays. Hum Genomics 1: 287-299.
Huang X et al., 2006. Application of a superword array in genome assembly. Nuc Acids Res 34: 201-205.
Huang X et al., 2003. PCAP: A whole-genome assembly program. Genome Res 13: 2164-2170.
Inazawa J et al., 2004. Comparative genomic hybridization (CGH)-arrays pave the way for identification of novel cancer-related genes. Cancer Sci 95: 559-563.
Jaffe DB et al., 2003. Whole-genome sequence assembly for mammalian genomes: ARACHNE 2. Genome Res 13: 91-96.
Kan NC et al., 1979. The nucleotide sequence recognized by the Escherichia coli K12 restriction and modification enzymes. J Mol Biol 130: 191-209.
Kinzler KW et al., 1995. Method for serial analysis of gene expression. U.S. Patent No. 5,695,937 (issued December 9, 1997).
Korbel JO et al., 2007. Paired-end mapping reveals extensive structure variation in the Human genome. Science 318: 420-426.
Kozdroj J and van Elsas JD, 2001. Structural diversity of microorganisms in chemically perturbed soil assessed by molecular and cytochemical approaches. J Microl Meth 43: 187-212.
Lok S, 2001. Methods for generating a continuous nucleotide Sequence from non-contiguous nucleotide sequences. U.S. Patent No. 6,730,500 (issued May 4, 2004).
Lucito R et al., 2003. Representational oligonucleotide microarray analysis: A high-resolution method to detect genome copy number variation. Genome Res 13: 2291-2305.
Mackay TFC, 2001. Quantitative trait loci in Drosophila. Nat Rev Genet 2: 11-20.
Mahairas GG et al., 1999. Sequence-tagged connectors: A sequence approach to mapping and scanning the human genome. Proc Natl Acad Sci (USA) 96: 9739-9744.
MardisER,2008.Next-generationDNAsequencingmethods.AnnuRevGenomicsHumGenet9:387-402.
MarguliesM etc., 2005.Genomesequencinginmicrofabricatedhigh-densitypicrolitrereactors.Nature437:376-380.
MatsumuraH etc., 2003.Geneexpressionanalysisofplanthost-pathogeninteractionsbySuperSAGE.ProcNatlAcadSci(USA)100:15718-15723.
MayMA and HattmanS, 1975.Analysisofbacteriophagedeoxyribonucleicacidsequencesmethylatedbyhost-andR-factor-controlledenzymes.JBacteriology123:768-770.
McClellandM etc., 1994.Effectofsite-specificmodificationonendonucleasesandDNAmodificationmethyltransferases.NucAcidsRes22:3640-3659.
Mead DA and Godiska R, 2001. Cloning vectors and vector Components. U.S. Patent No. 6,709,861 (issued March 23, 2004).
Melgar E and Goldthwait DA, 1968. Deoxyribonucleic acid nucleases: II. The effect of metals on the mechanism of action of deoxyribonuclease I. J Biol Chem 243:4409-4416.
Morozova O and Marra MA, 2008. Applications of the next-generation sequencing technologies in functional genomics. Genomics 92:255-262.
Mullikin JC and Ning Z, 2003. The PHUSION assembler. Genome Res 13:81-90.
Myers EW et al., 2000. A whole-genome assembly of Drosophila. Science 287:2196-2204.
Ng P et al., 2005. Gene identification signature (GIS) analysis for transcriptome characterization and genome annotation. Nat Meth 2:105-111.
Owen MJ and Craddock N, 1996. Modern molecular genetic approaches to complex traits: Implications for psychiatric disorders. Mol Psychiatry 1:21-26.
Pevzner PA and Tang H, 2001. Fragment assembly with double-barreled data. Bioinformatics 17 Suppl 1:S225-S233.
Pheiffer BH and Zimmerman SB, 1983. Polymer-stimulated ligation: Enhanced blunt- or cohesive-end ligation of DNA or deoxyribooligonucleotides by T4 DNA ligase in polymer solutions. Nuc Acids Res 11:7853-7871.
Pinkel D and Albertson DG, 2005. Array comparative genomic hybridization and its application in cancer. Nat Genet Suppl 37:S11-S17.
Pinkel D et al., 1998. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat Genet 20:207-211.
Pinkel D et al., 1997. Comparative genomic hybridization. U.S. Patent No. 6,159,685 (issued December 12, 2000).
Pinkel D et al., 1994. Comparative fluorescence hybridization to Nucleic acid arrays. U.S. Patent No. 5,830,645 (issued November 3, 1998).
Pollack JR et al., 2002. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci (USA) 99:12963-12968.
Pollack JR et al., 1999. Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nat Genet 23:41-46.
Pop M et al., 2004. Comparative genome assembly. Briefings in Bioinformatics 5:237-248.
Redon R et al., 2006. Global variation in copy number in the human genome. Nature 444:444-454.
Rouillard J-M et al., 2001. Virtual genome scan: A tool for restriction landmark-based scanning of the human genome. Genome Res 11:1453-1459.
Saha S et al., 2002. Using the transcriptome to annotate the genome. Nat Biotech 19:508-512.
Salzberg SL and Yorke JA, 2005. Beware of mis-assembled genomes. Bioinformatics 21:4320-4321.
Sanger F et al., 1977. DNA sequencing with chain terminating inhibitors. Proc Natl Acad Sci (USA) 74:5463-5467.
Schloter M et al., 2000. Ecology and evolution of bacterial microdiversity. FEMS Microbiol Rev 21:647-660.
Schriefer LA et al., 1990. Low pressure DNA shearing: A method for random DNA sequence analysis. Nuc Acids Res 18:7455.
Sistla S and Rao DN, 2004. S-adenosyl-L-methionine-dependent restriction enzymes. Crit Rev Biochem Mol Biol 39:1-19.
Snijders AM et al., 2001. Assembly of microarrays for genome-wide measurement of DNA copy numbers. Nat Genet 29:263-264.
Szybalski W, 1997. Conditionally amplifiable BAC vector. U.S. Patent No. 5,874,259 (issued February 23, 1999).
Szybalski E et al., 1991. Class-IIS restriction enzymes - A review. Gene 100:13-26.
Tao Q and Zhang H-B, 1998. Cloning and stable maintenance of DNA fragments over 300 kb in Escherichia coli with conventional plasmid-based vectors. Nuc Acids Res 21:4901-4909.
Tuzun E et al., 2005. Fine-scale structural variation of the human genome. Nat Genet 37:727-732.
Velculescu VE et al., 1995. Serial analysis of gene expression. Science 270:484-487.
Volik S et al., 2006. Decoding the fine-scale structure of a breast cancer genome and transcriptome. Genome Res 16:394-404.
Wang JC and Davidson N, 1966. On the probability of ring closure of lambda DNA. J Mol Biol 19:469-482.
Warren RL et al., 2006. Physical map-assisted whole-genome shotgun sequence assemblies. Genome Res 16:768-775.
Wei C-L et al., 2004. 5' long serial analysis of gene expression (LongSAGE) and 3' LongSAGE for transcriptome characterization and genome annotation. Proc Natl Acad Sci (USA) 101:11701-11706.
Weinstock GM et al., 2006. Insights into social insects from the genome of the honeybee Apis mellifera. Nature 443:931-949.
Wimmer K et al., 2002. Combined restriction landmark genomic scanning and virtual genome scans identify a novel human homeobox gene, ALX3, that is hypermethylated in neuroblastoma. Genes Chromosomes & Cancer 33:285-294.
Zhang Z et al., 2000. A greedy algorithm for aligning DNA sequencing. J Computational Biol 7:203-214.
Zhao S, 2000. Human BAC ends. Nuc Acids Res 28:129-132.
Zimmerman SB and Pheiffer BH, 1983. Macromolecular crowding allows blunt-end ligation by DNA ligases from rat liver or Escherichia coli. Proc Natl Acad Sci (USA) 80:5852-5856.

Claims (34)

1. A method for preparing juxtaposed sequence tags (GVTs), wherein the two constituent members of a tag pair (GVT pair) are positional markers of a defined separation distance, or positional markers of two adjacent, cleavable restriction endonuclease sites for one or more restriction endonucleases, located along the length of a population of target nucleic acid molecules, the method comprising:
fragmenting a large nucleic acid molecule to form target DNA inserts;
ligating a target DNA insert directly to a linear DNA backbone at the terminal cloning sites of the target DNA insert, without the use of a linking intermediate, to produce a circular molecule comprising the target DNA insert;
digesting the target DNA insert in the circular molecule with at least one endonuclease that cuts the insert at a distance from each terminal cloning site of the target DNA insert, thereby producing a linear molecule comprising two sequence tags (GVTs), the sequence tags comprising the terminal sequences of the target DNA insert, with one of the two GVTs attached to each end of the undigested DNA backbone;
recircularizing the linear DNA backbone with the attached GVTs to produce a circular DNA molecule, thereby producing a GVT pair comprising two juxtaposed GVTs in the same relative orientation as in the target DNA insert;
isolating the GVT pair produced, either by nucleic acid amplification from primer sites on the DNA backbone or by digestion with an endonuclease at sites that lie on the DNA backbone and flank the GVT pair produced;
wherein the restriction endonuclease used to digest the target DNA insert to produce the GVTs is a type II restriction endonuclease that recognizes a 4-base-pair site.
2. The method of claim 1, wherein the GVTs of the isolated GVT pair are placed in the opposite orientation relative to the target DNA insert by a method further comprising the steps of:
recircularizing the isolated GVT pair by intramolecular ligation;
digesting the resulting circular molecule with a restriction endonuclease that cuts the GVT pair, to obtain a linear molecule having GVTs in the opposite orientation.
3. The method of claim 1, wherein the GVT pair produced comprises the two terminal regions of a target DNA insert, separated by a distance of less than 250 kb, less than 100 kb, less than 50 kb, less than 25 kb, less than 10 kb, less than 5 kb, or less than 2.5 kb.
4. The method of claim 1, wherein the target DNA insert is genomic DNA, genomic DNA from isolated chromosomes, DNA isolated from isolated chromosomal regions, cDNA, mitochondrial DNA, chloroplast DNA, viral DNA, microbial DNA, plastid DNA, chemically synthesized DNA, a DNA product of nucleic acid amplification, or DNA transcribed from RNA.
5. The method of claim 1, wherein the nucleic acid molecule is randomly fragmented to form target DNA inserts by selective application of mechanical force, alone or in combination with partial digestion by one or more nucleases, or by complete digestion with one or more nucleases, alone or in combination.
6. The method of claim 1, wherein the target nucleic acid molecule is fragmented to form target DNA inserts using one or more nucleases that are sensitive to DNA methylation status.
7. The method of claim 1, wherein the target DNA insert is size-fractionated.
8. The method of claim 1, wherein the target DNA insert is not size-fractionated.
9. The method of claim 1, wherein the target DNA insert is at least 250 kb, at least 100 kb, at least 50 kb, at least 25 kb, at least 10 kb, at least 5 kb, or at least 2.5 kb in length.
10. The method of claim 1, wherein the type II restriction endonuclease is FspBI, Csp6I, or any isoschizomer or neoschizomer thereof, alone or in combination.
11. The method of claim 1, wherein the DNA backbone is less than 25 kb, less than 10 kb, less than 5 kb, less than 1 kb, less than 500 bp, less than 250 bp, less than 100 bp, or less than 50 bp in length.
12. The method of claim 1, wherein the DNA backbone is assembled from two, three, or more DNA segments before, during, or after ligation to the target DNA insert.
13. The method of claim 1, wherein the DNA backbone comprises one or more sequences capable of directing DNA amplification of the GVT pair produced.
14. The method of claim 13, wherein the DNA backbone comprises one or more sequences capable of directing DNA amplification of the GVT pair produced on a solid support.
15. The method of claim 14, wherein the DNA backbone comprises one or more sequences capable of directing isothermal DNA amplification of the GVT pair produced on a solid support.
16. The method of claim 1, wherein the GVT pair produced is one of a plurality of GVT pairs similarly produced from a population of target DNA, which together represent a library of linked genomic tags.
17. The method of claim 1, wherein the DNA backbone comprises one or more nucleotides conjugated to a moiety capable of producing a detectable signal that can be read by an instrument or by visual inspection.
18. The method of claim 1, wherein the DNA backbone comprises one or more nucleotides conjugated to an affinity purification tag.
19. The method of claim 18, wherein the affinity purification tag is biotin.
20. The method of claim 19, comprising the step of isolating nucleic acid fragments by affinity purification using a solid support coated with avidin or streptavidin.
21. The method of claim 1, wherein the DNA backbone contains no palindromic sequence 4 bases or longer.
22. The method of claim 1, wherein the DNA backbone contains no recognition or cleavage site for the type II restriction endonucleases FspBI and Csp6I, or for any isoschizomer or neoschizomer thereof.
23. The method of claim 1, wherein methylation of the DNA backbone prevents cleavage by one or more restriction endonucleases.
24. The method of claim 1, wherein the ends of the DNA backbone are produced by digestion with a type IIS restriction endonuclease that generates nucleotide overhangs, to facilitate ligation to target DNA inserts bearing complementary nucleotide overhangs.
25. The method of claim 24, wherein the ends of the DNA backbone are produced by digestion with a type IIS restriction endonuclease that generates a single-base 3' nucleotide overhang at each end, to facilitate ligation to target DNA inserts bearing complementary 3' nucleotide overhangs and thereby produce circular molecules.
26. The method of claim 24, wherein the single-nucleotide 3' extension of the DNA backbone is a thymine base and the complementary single-nucleotide 3' extension on the target DNA insert is an adenine base.
27. The method of claim 24, wherein the type IIS restriction endonuclease is BciVI or any isoschizomer thereof.
28. The method of claim 24, wherein the DNA backbone contains no recognition site for the type IIS restriction endonuclease BciVI or for any isoschizomer or neoschizomer thereof.
29. The method of claim 1, wherein the DNA backbone is a DNA vector capable of propagating in a cell.
30. The method of claim 1, wherein the DNA backbone is a bacterial artificial chromosome vector or a yeast artificial chromosome vector.
31. The method of claim 1, wherein the DNA backbone is a DNA vector selected from the group consisting of a plasmid, a phagemid, a cosmid, and a fosmid.
32. The method of claim 1, wherein the DNA backbone comprises one or more sequences capable of mediating phage packaging.
33. The method of claim 32, wherein the phage packaging sequence is the cos sequence derived from phage λ.
34. The method of claim 1, wherein the DNA backbone comprises a selectable marker gene.
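
The digestion and recircularization steps of claim 1, together with the backbone constraint of claim 22, can be illustrated with a minimal in silico sketch. It is an illustration under stated assumptions, not the claimed laboratory procedure: the function names (`cut_positions`, `gvt_pair`, `backbone_is_site_free`) and the example sequences are hypothetical, digestion is assumed to be complete, and only the top strand is modeled. The recognition sites used are those of the two enzymes named in claim 10 (FspBI cuts C^TAG, Csp6I cuts G^TAC).

```python
# Illustrative model only: all names and example sequences are hypothetical.
# Each entry maps an enzyme to (recognition site, top-strand cut offset).
ENZYMES = {"FspBI": ("CTAG", 1), "Csp6I": ("GTAC", 1)}


def cut_positions(seq, enzymes=ENZYMES):
    """Return the sorted top-strand cut positions of every enzyme site in seq."""
    seq = seq.upper()
    cuts = set()
    for site, offset in enzymes.values():
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)  # also catches overlapping sites
    return sorted(cuts)


def gvt_pair(insert, enzymes=ENZYMES):
    """Model complete digestion of a cloned insert.

    The segment from the left cloning site to the first cut and the segment
    from the last cut to the right cloning site stay attached to the
    backbone; these are the two GVTs. Returns None if the insert has no site.
    """
    cuts = cut_positions(insert, enzymes)
    if not cuts:
        return None
    insert = insert.upper()
    return insert[:cuts[0]], insert[cuts[-1]:]


def backbone_is_site_free(backbone, enzymes=ENZYMES):
    """Check that the backbone itself would survive the digestion (claim 22)."""
    return not cut_positions(backbone, enzymes)


if __name__ == "__main__":
    insert = "AAAACTAGGGGGGTACTTTT"  # one FspBI site, one Csp6I site
    left, right = gvt_pair(insert)
    # Recircularization juxtaposes the two GVTs in their original orientation.
    print(left, right, left + right)
    print(backbone_is_site_free("ATGCATGC"))
```

In this simplified model the internal portion of the insert between the first and last cut is discarded, so the juxtaposed pair retains only the two insert ends. Partial digestion, the size distribution of the resulting tags, and the bottom strand are deliberately left out.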
CN201610028288.2A 2008-07-10 2009-07-09 Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids Pending CN105602937A (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US12966008P 2008-07-10 2008-07-10
US61/129660 2008-07-10
US19344208P 2008-12-01 2008-12-01
US61/193442 2008-12-01
CN2009801359358A CN102165073A (en) 2008-07-10 2009-07-09 Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN2009801359358A Division CN102165073A (en) 2008-07-10 2009-07-09 Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids

Publications (1)

Publication Number Publication Date
CN105602937A true CN105602937A (en) 2016-05-25

Family

ID=56014424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610028288.2A Pending CN105602937A (en) 2008-07-10 2009-07-09 Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids

Country Status (1)

Country Link
CN (1) CN105602937A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005042781A2 (en) * 2003-10-31 2005-05-12 Agencourt Personal Genomics Corporation Methods for producing a paired tag from a nucleic acid sequence and methods of use thereof
CN1720331A (en) * 2002-10-30 2006-01-11 纽韦卢森公司 Method for obtaining a bifunctional complex
WO2007076726A1 (en) * 2006-01-04 2007-07-12 Si Lok Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids and utilities

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113667727A (en) * 2020-05-13 2021-11-19 华中科技大学 Method for obtaining linear plasmid DNA break position sequence information
CN113667727B (en) * 2020-05-13 2023-05-16 华中科技大学 Method for obtaining linear plasmid DNA breaking position sequence information

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160525