Primer composition and uses thereof
Technical field
The present invention relates to the field of biotechnology and, in particular, to a primer composition and uses thereof. More specifically, the present invention relates to a primer composition, a method for constructing a target-region nucleic acid sequencing library of a test sample, a method for determining the target-region nucleic acid sequence of a test sample, a method for simultaneously determining the target-region nucleic acid sequences of multiple test samples, a system for determining the target-region nucleic acid sequence of a test sample, and a system for simultaneously determining the target-region nucleic acid sequences of multiple test samples.
Background art
Target-region enrichment sequencing technologies currently in use mainly adopt two strategies: probe hybridization capture and PCR amplification enrichment. With probe hybridization capture, the whole workflow takes approximately 4 to 7 days, the cost per target region typically ranges from several hundred to several thousand yuan, and the probe design and synthesis cycle is generally 6 to 8 weeks or even longer. Target-region enrichment based on the PCR principle usually requires only PCR and adapter addition before the product can be loaded onto the sequencer. The whole workflow generally takes only 1 to 2 days, the upfront primer design and synthesis cycle is generally 2 to 4 weeks, primer design is relatively flexible (anywhere from tens to several thousand primer pairs is feasible), and the cost per sample can generally be as low as tens to hundreds of yuan. PCR-based enrichment therefore has certain advantages over probe hybridization capture.
However, current methods for enriching target regions by PCR amplification still leave room for improvement.
Summary of the invention
The present invention was completed based on the following findings of the inventors:
In current methods for enriching target regions by PCR amplification, an adapter must be added to the product after amplification before subsequent sequencing can be performed. This requires a two-step PCR reaction with a purification step in between. The procedure therefore involves many steps and remains relatively complex, which is unfavorable for PCR amplification enrichment of larger numbers of samples and more target regions.
The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, one object of the present invention is to provide, through continuous optimization of the primer design, the PCR reaction system, the PCR thermal cycling program and the like, a method for enriching target regions by PCR amplification that, compared with existing methods, has fewer steps, is simpler to operate, takes less time, and is more flexible in the choice of sample number and primer number. When the method is implemented in combination with a high-throughput chip, target-region enrichment of multiple regions in multiple samples can easily be carried out simultaneously.
Thus, according to one aspect of the present invention, the invention first provides a primer composition. According to embodiments of the present invention, the primer composition comprises: a first primer set, comprising a first forward primer and a first reverse primer, wherein the first forward primer consists of a target-region-specific forward primer and a first nucleotide sequence located at the 5' end of the target-region-specific forward primer, the first reverse primer consists of a target-region-specific reverse primer and a second nucleotide sequence located at the 5' end of the target-region-specific reverse primer, the first nucleotide sequence and the second nucleotide sequence are common sequences, and the first nucleotide sequence differs from the second nucleotide sequence; and a second primer set, comprising a second forward primer and a second reverse primer, wherein the second forward primer comprises the first nucleotide sequence and a first sequencing adapter located at the 5' end of the first nucleotide sequence, and the second reverse primer comprises the second nucleotide sequence and a second sequencing adapter located at the 5' end of the second nucleotide sequence. The inventors surprisingly found that, when the primer composition of the present invention is used to perform PCR amplification on a nucleic acid sample of a test sample that contains the target region, enrichment of the target-region nucleic acid is achieved in a single step. Moreover, the PCR amplification product, that is, the enriched target-region nucleic acid, directly constitutes a sequencing library, so that no additional steps of sequencing adapter ligation and ligation product purification are needed. The library can be sequenced directly, at low cost and high efficiency, and the sequencing results are accurate, reliable and highly reproducible.
According to another aspect of the present invention, the invention also provides a method for constructing a target-region nucleic acid sequencing library of a test sample. According to embodiments of the present invention, the method comprises: using the primer composition described above to perform PCR amplification on a nucleic acid sample of the test sample that contains the target region, so as to obtain an amplification product, wherein the amplification product constitutes the target-region nucleic acid sequencing library of the test sample. According to embodiments of the present invention, this method achieves enrichment of the target-region nucleic acid of the test sample and construction of the target-region nucleic acid sequencing library in a single step. It is simple to operate, takes little time, and is flexible in the choice of sample number and primer number. When the method is implemented in combination with a high-throughput chip, target-region enrichment of multiple regions in multiple samples can easily be carried out simultaneously. In addition, according to embodiments of the present invention, the method is low in cost and high in efficiency, and the target-region nucleic acid sequencing library obtained can be sequenced directly, with accurate, reliable and highly reproducible sequencing results. Furthermore, the method of the present invention is suitable for target-region nucleic acid enrichment and library construction from trace DNA samples, especially nanoliter-scale samples.
According to a further aspect of the present invention, the invention also provides a method for determining the target-region nucleic acid sequence of a test sample. According to embodiments of the present invention, the method comprises the following steps: constructing the target-region nucleic acid sequencing library of the test sample according to the library construction method described above; sequencing the target-region nucleic acid sequencing library of the test sample, so as to obtain a sequencing result; and determining the sequence of the target-region nucleic acid of the test sample based on the sequencing result. The inventors found that the method of the present invention can conveniently determine the target-region nucleic acid sequence of a test sample, is simple to operate, takes little time, and is flexible in the choice of sample number and primer number, and that the sequencing results are accurate, reliable and highly reproducible.
According to yet another aspect of the present invention, the invention also provides a method for simultaneously determining the target-region nucleic acid sequences of multiple test samples. According to embodiments of the present invention, the method comprises the following steps: for each of the multiple test samples, independently constructing the target-region nucleic acid sequencing library of the test sample according to the library construction method described above, wherein a sequence label is further included between the first nucleotide sequence and the first sequencing adapter of the second forward primer, and/or between the second nucleotide sequence and the second sequencing adapter of the second reverse primer, the sequence labels of the multiple test samples differ from one another, and "multiple" means at least 2; mixing the target-region nucleic acid sequencing libraries of the multiple test samples, so as to obtain a mixed library; sequencing the mixed library, so as to obtain a sequencing result, wherein the sequencing result includes the sequences of the target-region nucleic acid sequencing libraries of the multiple test samples and the sequence labels; and distinguishing the sequences of the target-region nucleic acid sequencing libraries of the multiple test samples based on the sequence labels, and determining the target-region nucleic acid sequence of each of the multiple test samples. The inventors found that the method of the present invention can determine the target-region nucleic acid sequences of multiple test samples simultaneously, is simple to operate, takes little time, and is flexible in the choice of sample number and primer number, and that the sequencing results are accurate, reliable and highly reproducible.
According to another aspect of the present invention, the invention also provides a system for determining the target-region nucleic acid sequence of a test sample. According to embodiments of the present invention, the system comprises: a library construction device, in which the primer composition described above is provided, for constructing the target-region nucleic acid sequencing library of the test sample according to the library construction method described above; a first sequencing device, connected to the library construction device, for sequencing the target-region nucleic acid sequencing library of the test sample, so as to obtain a sequencing result; and a first nucleic acid sequence determining device, connected to the first sequencing device, for determining the sequence of the target-region nucleic acid of the test sample based on the sequencing result. According to embodiments of the present invention, the system of the present invention can determine the target-region nucleic acid sequence of a test sample quickly and conveniently, is flexible in the choice of sample number and primer number, and gives accurate, reliable and highly reproducible sequencing results.
According to a further aspect of the present invention, the invention also provides a system for simultaneously determining the target-region nucleic acid sequences of multiple test samples. According to embodiments of the present invention, the system comprises: a library construction and mixing device, for independently constructing, for each of the multiple test samples, the target-region nucleic acid sequencing library of the test sample according to the library construction method described above, and for mixing the target-region nucleic acid sequencing libraries of the multiple test samples, so as to obtain a mixed library, wherein a sequence label is further included between the first nucleotide sequence and the first sequencing adapter of the second forward primer, and/or between the second nucleotide sequence and the second sequencing adapter of the second reverse primer, the sequence labels of the multiple test samples differ from one another, and "multiple" means at least 2; a second sequencing device, connected to the library construction and mixing device, for sequencing the mixed library, so as to obtain a sequencing result, wherein the sequencing result includes the sequences of the target-region nucleic acid sequencing libraries of the multiple test samples and the sequence labels; and a second nucleic acid sequence determining device, connected to the second sequencing device, for distinguishing the sequences of the target-region sequencing libraries of the multiple test samples based on the sequence labels, and determining the target-region nucleic acid sequence of each of the multiple test samples. According to embodiments of the present invention, the system of the present invention can determine the target-region nucleic acid sequences of multiple test samples simultaneously, is flexible in the choice of sample number and primer number, is simple to operate and takes little time, and gives accurate, reliable and highly reproducible sequencing results.
In addition, it should be noted that, by optimizing the primer design, the PCR reaction system and the PCR thermal cycling program, the present invention uses a single PCR and can achieve target-region enrichment both in a single tube and on a high-throughput chip. No separate adapter addition is needed, and the purification between the two PCR steps is also eliminated. More importantly, multiple regions of multiple samples can be captured, and only a single purification step is needed after capture before high-throughput sequencing can be performed. Compared with the prior-art method that completes library construction with two rounds of PCR, the method has fewer steps and is more flexible in the choice of sample number and primer number.
Additional aspects and advantages of the present invention will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 shows a structural schematic diagram of the first forward primer, the first reverse primer, the second forward primer and the second reverse primer of the primer composition according to an embodiment of the present invention;
Fig. 2 shows a schematic flow chart of the method for determining the target-region nucleic acid sequence of a test sample according to an embodiment of the present invention;
Fig. 3 shows a schematic flow chart of the method for simultaneously determining the target-region nucleic acid sequences of multiple test samples according to an embodiment of the present invention;
Fig. 4 shows a structural schematic diagram of the system for determining the target-region nucleic acid sequence of a test sample according to an embodiment of the present invention;
Fig. 5 shows a structural schematic diagram of the system for simultaneously determining the target-region nucleic acid sequences of multiple test samples according to an embodiment of the present invention;
Fig. 6 shows the electrophoresis detection results obtained when the inner and outer primers for 63 SNP sites are used simultaneously, according to an embodiment of the present invention;
Fig. 7 shows the coverage of 41 SNP sites detected in 24 samples, according to an embodiment of the present invention;
Fig. 8 shows the capture specificity results for 41 SNP sites detected in 24 samples, according to an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below. The embodiments described below with reference to the accompanying drawings are exemplary, are intended only to explain the present invention, and shall not be construed as limiting the present invention.
It should be noted that the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more of that feature. Further, in the description of the present invention, unless otherwise stated, "multiple" means two or more.
Primer composition
According to one aspect of the present invention, the invention provides a primer composition. According to embodiments of the present invention, the primer composition comprises a first primer set and a second primer set. Specifically, with reference to Fig. 1, the first primer set comprises a first forward primer and a first reverse primer, wherein the first forward primer consists of a target-region-specific forward primer and a first nucleotide sequence located at the 5' end of the target-region-specific forward primer, the first reverse primer consists of a target-region-specific reverse primer and a second nucleotide sequence located at the 5' end of the target-region-specific reverse primer, the first nucleotide sequence and the second nucleotide sequence are common sequences, and the first nucleotide sequence differs from the second nucleotide sequence. The second primer set comprises a second forward primer and a second reverse primer, wherein the second forward primer comprises the first nucleotide sequence and a first sequencing adapter located at the 5' end of the first nucleotide sequence, and the second reverse primer comprises the second nucleotide sequence and a second sequencing adapter located at the 5' end of the second nucleotide sequence. According to embodiments of the present invention, when the primer composition of the present invention is used to perform PCR amplification on a nucleic acid sample of a test sample that contains the target region, enrichment of the target-region nucleic acid is achieved in a single step. Moreover, the PCR amplification product, that is, the enriched target-region nucleic acid, directly constitutes a sequencing library, so that no additional steps of sequencing adapter ligation and ligation product purification are needed. The library can be sequenced directly, at low cost and high efficiency, and the sequencing results are accurate, reliable and highly reproducible.
According to embodiments of the present invention, the target-region-specific forward primer and the target-region-specific reverse primer are each 18-25 nt in length. Thus, primer specificity is high and the amplification effect is good.
According to embodiments of the present invention, the first nucleotide sequence is 5'-ACACTGACGACATGGTTCTACA-3' (SEQ ID NO: 1), and the second nucleotide sequence is 5'-TACGGTAGCAGAGACTTGGTCT-3' (SEQ ID NO: 2).
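The construction of the first primer set can be sketched as follows. This is a minimal illustration only: the two common sequences are SEQ ID NO: 1 and SEQ ID NO: 2 as given in the text, whereas the function name make_inner_primer and the two target-region-specific primers are hypothetical and are not taken from the present disclosure.

```python
# Common sequences as given in the text (SEQ ID NO: 1 and SEQ ID NO: 2).
CS1 = "ACACTGACGACATGGTTCTACA"
CS2 = "TACGGTAGCAGAGACTTGGTCT"


def make_inner_primer(common_seq: str, target_specific: str) -> str:
    """Return 5'-[common sequence][target-region-specific primer]-3'.

    Hypothetical helper: it enforces the 18-25 nt length range stated
    in the text for the target-region-specific part.
    """
    target_specific = target_specific.upper()
    if set(target_specific) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G and T")
    if not 18 <= len(target_specific) <= 25:
        raise ValueError("target-region-specific primer should be 18-25 nt")
    # The common sequence is placed at the 5' end of the specific primer.
    return common_seq + target_specific


# Hypothetical target-region-specific primers, for illustration only.
fwd_specific = "GATTACAGATTACAGATTAC"    # 20 nt
rev_specific = "CCTGAAGTCCTGAAGTCCTGAA"  # 22 nt

first_forward = make_inner_primer(CS1, fwd_specific)
first_reverse = make_inner_primer(CS2, rev_specific)
print(first_forward)
print(first_reverse)
```

Keeping the common sequence as a separate argument reflects the point that the same two common sequences are shared by the primers of every target region, while only the specific part changes.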
According to embodiments of the present invention, a sequence label is further included between the first nucleotide sequence and the first sequencing adapter of the second forward primer, and/or between the second nucleotide sequence and the second sequencing adapter of the second reverse primer. Thus, after PCR amplification, the PCR products of multiple samples can be mixed and sequenced together, and the sample of origin of each sequence can then be distinguished based on the differences between the sequence labels.
According to some embodiments of the present invention, the sequence label is 6-11 nt in length.
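The two constraints stated for sequence labels (a length of 6-11 nt, and labels that differ from one sample to another) can be checked with a short sketch such as the following. The function name validate_labels and the example labels are hypothetical. The sketch only checks that labels are not identical; it does not address a shorter label being a prefix of a longer one, which a practical design would probably also want to exclude.

```python
from itertools import combinations


def validate_labels(labels: dict[str, str]) -> None:
    """Raise ValueError if any sequence label breaks the stated constraints.

    `labels` maps a sample name to its sequence label (hypothetical layout).
    """
    for sample, seq in labels.items():
        if set(seq) - set("ACGT"):
            raise ValueError(f"{sample}: label contains non-ACGT characters")
        if not 6 <= len(seq) <= 11:
            raise ValueError(f"{sample}: label length {len(seq)} is outside 6-11 nt")
    # Labels of different samples must be mutually different.
    for (s1, a), (s2, b) in combinations(labels.items(), 2):
        if a == b:
            raise ValueError(f"{s1} and {s2} share the same label {a}")


# Hypothetical labels, for illustration only.
validate_labels({"sample_A": "AACCGG", "sample_B": "TTGGCCAA"})
```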
According to some embodiments of the present invention, the first sequencing adapter and the second sequencing adapter are respectively the P5 sequencing adapter and the P7 sequencing adapter of the Illumina sequencing platform. According to some specific examples of the present invention, when the first sequencing adapter and the second sequencing adapter are respectively the P5 and P7 sequencing adapters of the Illumina sequencing platform, a third nucleotide sequence 4 nt in length is further included between the sequence label and the first sequencing adapter P5; preferably, the third nucleotide sequence is 5'-ACAC-3'. Thus, the sequencing success rate can be significantly improved, the sequencing results are accurate and reliable, and the sequencing efficiency is high.
According to other embodiments of the present invention, the first sequencing adapter and the second sequencing adapter are respectively the A sequencing adapter and the P sequencing adapter of the Ion Torrent sequencing platform.
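The order of the elements in a primer of the second primer set, read from 5' to 3', is: sequencing adapter, optional 4 nt third nucleotide sequence, optional sequence label, common sequence. A minimal sketch of this assembly is given below. The adapter strings used here are placeholders and are not the real P5/P7 or A/P adapter sequences, which should be taken from the platform manufacturer's documentation. The function name make_outer_primer and the example label are likewise hypothetical.

```python
# Common sequences as given in the text (SEQ ID NO: 1 and SEQ ID NO: 2).
CS1 = "ACACTGACGACATGGTTCTACA"
CS2 = "TACGGTAGCAGAGACTTGGTCT"


def make_outer_primer(adapter: str, common_seq: str,
                      label: str = "", spacer: str = "") -> str:
    """Return 5'-[adapter][spacer][label][common sequence]-3'.

    `spacer` stands for the optional third nucleotide sequence (for example
    ACAC) located between the sequence label and the first adapter.
    """
    return adapter + spacer + label + common_seq


# Placeholder adapters: NOT real platform sequences, illustration only.
ADAPTER_1 = "GGGGCCCCGGGGCCCC"
ADAPTER_2 = "CCCCGGGGCCCCGGGG"
sample_label = "AACCGG"  # hypothetical 6 nt sequence label

second_forward = make_outer_primer(ADAPTER_1, CS1, label=sample_label, spacer="ACAC")
second_reverse = make_outer_primer(ADAPTER_2, CS2)
print(second_forward)
print(second_reverse)
```

Because the 3' portion of each outer primer is identical to the 5' portion of the corresponding inner primer, the outer primers can extend on the products of the inner primers within the same reaction.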
According to a specific example of the present invention, the target regions and the corresponding target-region-specific forward and reverse primer sequences are as shown in the following table:
Uses
According to another aspect of the present invention, the invention also provides a method for constructing a target-region nucleic acid sequencing library of a test sample. According to embodiments of the present invention, the method comprises: using the primer composition described above to perform PCR amplification on a nucleic acid sample of the test sample that contains the target region, so as to obtain an amplification product, wherein the amplification product constitutes the target-region nucleic acid sequencing library of the test sample.
According to embodiments of the present invention, the final concentration ratio of the first primer set to the second primer set is 1:6 to 1:10, preferably 1:8. Thus, the PCR amplification effect is good, and when the amplification product is used directly for sequencing, the sequencing efficiency is high and the results are accurate.
According to embodiments of the present invention, calculated on the basis of a 100 nl reaction system, the PCR amplification reaction system comprises: 10 nl of template DNA at 8 ng/µl to 20 ng/µl; 50 nl of 2X PCR master mix; 10 nl of 4 µM second forward primer; 10 nl of 4 µM second reverse primer; 10 nl of 500 nM first forward primer; and 10 nl of 500 nM first reverse primer. Thus, the PCR amplification effect is good, and when the amplification product is used directly for sequencing, the sequencing efficiency is high and the results are accurate.
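The stated volumes and stock concentrations can be checked against the preferred 1:8 ratio with simple dilution arithmetic. The following sketch uses only the figures given above; the function name final_conc is hypothetical, and per-primer concentrations are compared (one first-set primer against one second-set primer).

```python
TOTAL_NL = 100.0  # total reaction volume stated in the text


def final_conc(stock_conc: float, added_nl: float, total_nl: float = TOTAL_NL) -> float:
    """Final concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * added_nl / total_nl


# The six components add up to the 100 nl reaction volume.
volumes_nl = [10, 50, 10, 10, 10, 10]
assert sum(volumes_nl) == TOTAL_NL

inner_nM = final_conc(500, 10)    # each first-set primer: 500 nM stock
outer_nM = final_conc(4000, 10)   # each second-set primer: 4 uM = 4000 nM stock
master_mix_x = final_conc(2, 50)  # 2X master mix diluted to working strength

# Template mass per reaction: (ng/ul) * nl / 1000 gives ng.
template_ng_low = 8 * 10 / 1000
template_ng_high = 20 * 10 / 1000

print(inner_nM, outer_nM, outer_nM / inner_nM)
print(master_mix_x, template_ng_low, template_ng_high)
```

Under these figures each first-set primer ends up at 50 nM and each second-set primer at 400 nM, which corresponds to the preferred 1:8 ratio, and each 100 nl reaction contains roughly 0.08 ng to 0.2 ng of template DNA.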
According to embodiments of the present invention, the reaction program of the PCR amplification is:
Thus, the PCR amplification effect is good, the efficiency is high, and the amplification product can be used directly for sequencing.
According to some specific examples of the present invention, the method of the present invention for constructing a target-region nucleic acid sequencing library of a test sample may further comprise the following steps:
(1) For each target region, design target-region-specific forward and reverse primers suitable for amplifying the nucleic acid of that target region, add a common sequence (the first nucleotide sequence and the second nucleotide sequence, respectively) to the 5' end of each, and synthesize them, so as to obtain the first forward primer and the first reverse primer, that is, the first primer set (sometimes also referred to herein as the "inner primers").
(2) Synthesize the second forward primer comprising the first nucleotide sequence and the first sequencing adapter, and the second reverse primer comprising the second nucleotide sequence and the second sequencing adapter, so as to obtain the second primer set (sometimes also referred to herein as the "outer primers").
(3) Use the first primer set to perform PCR amplification and gel electrophoresis detection, so as to verify whether the first primer set meets the requirements, the criterion being whether a single band of the designed size can be amplified. After the first primer set has passed verification, mix the first primer set and the second primer set at a final concentration ratio of 1:6 to 1:10, preferably 1:8, and perform a pre-PCR reaction and gel electrophoresis detection, so as to determine whether the target region can be amplified effectively and specifically.
(4) Dilute the genomic DNA sample to be tested (generally to 10 ng/µL).
(5) Using the micro-dispensing instrument matched to the chip, load the first primer set and the second primer set that have passed pre-PCR verification, together with the sample and the PCR reaction reagents, into the chip, and carry out the corresponding PCR reaction. Before loading, the first primer set is mixed with one portion of the PCR reaction reagents to give what is called the detection-site reagent premix, and the sample, the second primer set carrying the sequence label uniquely corresponding to that sample, and another portion of the PCR reaction reagents are mixed to give what is called the sample premix. During loading, the sample premix is first added to each microwell on the chip, a step that divides the chip into several regions, and the corresponding detection-site reagent premix is then added to each microwell that already contains sample premix. Each microwell thus contains one pair of inner primers, one pair of outer primers carrying the sequence label corresponding to the sample, the sample, and the PCR reagents, which together form a complete PCR reaction system. After the PCR is finished, the PCR products can easily be pooled by inverting the chip and centrifuging, and are recovered into a centrifuge tube.
(6) The recovered, pooled PCR products of the samples are then purified with AMPure XP magnetic beads at a certain ratio (0.8 to 1 volume), and the purified product is the nucleic acid library.
(7) The nucleic acid library constructed above is subjected to library quality control. For example, an Agilent 2100 Bioanalyzer or Caliper Bioanalyzer and an ABI StepOnePlus Real-Time PCR System can be used to determine the mass concentration, fragment size distribution and molar concentration. Once the library has passed quality control, it can be loaded onto the sequencer.
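The chip loading of step (5), in which every microwell receives exactly one sample premix and one detection-site reagent premix, can be sketched as a simple enumeration of sample and target-region pairs. This is only an illustration under stated assumptions: the function name plan_wells and the linear well numbering are hypothetical and do not describe the addressing scheme of any particular dispensing instrument. The well count of 5184 is the figure given in this document for one example chip, and the 24 samples and 41 SNP sites correspond to the experiment shown in Fig. 7 and Fig. 8.

```python
from itertools import product

CHIP_WELLS = 5184  # microwell count given in the text for one example chip


def plan_wells(samples, targets, chip_wells=CHIP_WELLS):
    """Assign one microwell to every (sample, target region) pair.

    Returns a list of (well_index, sample, target) tuples. The linear
    well_index is a hypothetical numbering used only for illustration.
    """
    needed = len(samples) * len(targets)
    if needed > chip_wells:
        raise ValueError(f"{needed} reactions exceed the {chip_wells} wells of one chip")
    # Consecutive wells share a sample, mirroring the regions created
    # when the sample premix is dispensed first.
    return [(i, s, t) for i, (s, t) in enumerate(product(samples, targets))]


samples = [f"S{i}" for i in range(1, 25)]    # 24 samples
targets = [f"SNP{i}" for i in range(1, 42)]  # 41 SNP sites
layout = plan_wells(samples, targets)
print(len(layout), layout[0], layout[-1])
```

Under these assumptions the 24-sample, 41-site experiment occupies 984 of the 5184 wells, so several such experiments, or replicates, would fit on one chip.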
According to a further aspect of the present invention, the invention also provides a method for determining the target-region nucleic acid sequence of a test sample. According to embodiments of the present invention, with reference to Fig. 2, the method comprises the following steps:
S101: Construct the target-region nucleic acid sequencing library of the test sample
The target-region nucleic acid sequencing library of the test sample is constructed according to the library construction method described above.
S102: Sequence the target-region nucleic acid sequencing library
The target-region nucleic acid sequencing library of the test sample is sequenced, so as to obtain a sequencing result.
According to embodiments of the present invention, the sequencing adapters should be selected according to the sequencing platform to be used. According to some embodiments of the present invention, when the first sequencing adapter and the second sequencing adapter are respectively the P5 and P7 sequencing adapters of the Illumina sequencing platform, the sequencing is performed on the Illumina sequencing platform. According to other embodiments of the present invention, when the first sequencing adapter and the second sequencing adapter are respectively the A and P sequencing adapters of the Ion Torrent sequencing platform, the sequencing is performed on the Ion Torrent sequencing platform. Thus, the PCR product obtained, that is, the nucleic acid library, can be used directly for sequencing, and the sequencing results are accurate, reliable and highly reproducible.
S103: Determine the target-region nucleic acid sequence of the test sample
The sequence of the target-region nucleic acid of the test sample is determined based on the sequencing result.
The inventors surprisingly found that determining the target-region nucleic acid sequence of a test sample by this method is simple to operate and takes little time, that the sample number and primer number can be chosen flexibly, that target-region enrichment of multiple regions in multiple samples can easily be carried out simultaneously, and that the sequencing results are accurate, reliable and highly reproducible.
According to yet another aspect of the present invention, the invention also provides a method for simultaneously determining the target-region nucleic acid sequences of multiple test samples. The inventors found that this method makes it easy to carry out target-region enrichment and sequencing of multiple regions in multiple samples simultaneously, is simple to operate and takes little time, and gives accurate, reliable and highly reproducible sequencing results.
According to some specific examples of the present invention, with reference to Fig. 3, the method of the present invention for simultaneously determining the target-region nucleic acid sequences of multiple test samples may comprise the following steps:
S201: Independently construct the target-region nucleic acid sequencing library of each of the multiple test samples
For each of the multiple test samples, the target-region nucleic acid sequencing library of the test sample is constructed independently according to the library construction method described above, wherein a sequence label is further included between the first nucleotide sequence and the first sequencing adapter of the second forward primer, and/or between the second nucleotide sequence and the second sequencing adapter of the second reverse primer, the sequence labels of the multiple test samples differ from one another, and "multiple" means at least 2.
According to embodiments of the present invention, sequencing library construction is carried out on the multiple test samples simultaneously in a multi-well plate, wherein each well of the multi-well plate independently constitutes one reaction system. According to some specific examples of the present invention, the multi-well plate is a chip with 5184 microwells, and the reaction volume of each microwell of the chip is 100 nanoliters.
S202: the nucleic acid sequencing library, target area of multiple testing sample is mixed
The nucleic acid sequencing library, target area of described multiple testing sample is mixed, to obtain mixing library.
S203: checked order in mixing library
Check order to described mixing library, to obtain sequencing result, described sequencing result comprises the sequence in the nucleic acid sequencing library, target area of described multiple testing sample and described sequence label.
According to embodiments of the invention, when utilizing Illumina order-checking platform to carry out described order-checking, described first sequence measuring joints and described second sequence measuring joints are respectively P5 sequence measuring joints and the P7 sequence measuring joints of Illumina order-checking platform.Wherein, according to concrete examples more of the present invention, when utilizing Illumina order-checking platform to carry out described order-checking, between described sequence label and described first sequence measuring joints P5, comprise the 3rd nucleotide sequence that length is 4nt further, preferably described 3rd nucleotide sequence is: 5 '-ACAC-3 '.Thereby, it is possible to significantly improve order-checking success ratio, and sequencing result accurately, reliably, and order-checking efficiency is high.
According to embodiments of the invention, when utilizing IonTorrent order-checking platform to carry out described order-checking, described first sequence measuring joints and described second sequence measuring joints are respectively A sequence measuring joints and the P sequence measuring joints of IonTorrent order-checking platform.
S204: Determining the target-region nucleic acid sequence of each of the multiple test samples
Based on the tag sequences, the sequences of the target-region nucleic acid sequencing libraries of the multiple test samples are distinguished from one another, and the target-region nucleic acid sequence of each of the multiple test samples is determined.
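The distinguishing step of S204 can be illustrated with a minimal Python sketch. It assumes, for simplicity, that each read of the mixed library begins with its tag sequence at a fixed position; the tag sequences, sample names and reads below are hypothetical placeholders and are not taken from the sequence listing.

```python
from collections import defaultdict

def demultiplex(reads, tag_to_sample, tag_start=0):
    """Group reads by the tag found at a fixed offset.

    Reads whose tag is not in tag_to_sample are collected under 'unassigned'.
    The tag is trimmed off, leaving only the target-region sequence.
    """
    tag_len = len(next(iter(tag_to_sample)))
    bins = defaultdict(list)
    for read in reads:
        tag = read[tag_start:tag_start + tag_len]
        sample = tag_to_sample.get(tag, "unassigned")
        bins[sample].append(read[tag_start + tag_len:])
    return dict(bins)

# Hypothetical tags and reads, for illustration only.
tags = {"AAGGTTCC": "sample_1", "CCTTGGAA": "sample_2"}
reads = ["AAGGTTCCACGTACGT", "CCTTGGAATTGCA", "GGGGGGGGACGT"]
binned = demultiplex(reads, tags)
print(binned)
```

On a real sequencing platform the tag may be read separately from the insert; the fixed-offset lookup here is only meant to show how mutually different tags allow reads of a mixed library to be assigned back to their samples.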
According to a further aspect of the present invention, the present invention also provides a system for determining the target-region nucleic acid sequence of a test sample. According to embodiments of the present invention, with reference to Fig. 4, the system 100 comprises: a library construction device 101, a first sequencing device 102 and a first nucleic acid sequence determining device 103. According to embodiments of the present invention, determining the target-region nucleic acid sequence of a test sample with this system is simple to operate and takes little time, and the sequencing results are accurate, reliable and reproducible.
Specifically, the library construction device 101 is provided with the primer composition described above and is used to construct the target-region nucleic acid sequencing library of the test sample according to the method described above for constructing a target-region nucleic acid sequencing library of a test sample; the first sequencing device 102 is connected to the library construction device 101 and is used to sequence the target-region nucleic acid sequencing library of the test sample to obtain a sequencing result; the first nucleic acid sequence determining device 103 is connected to the first sequencing device 102 and is used to determine the sequence of the target-region nucleic acid of the test sample based on the sequencing result.
According to some embodiments of the present invention, the first sequencing device 102 is the Illumina sequencing platform, and the first sequencing adapter and the second sequencing adapter are the P5 sequencing adapter and the P7 sequencing adapter of the Illumina sequencing platform, respectively.
According to other embodiments of the present invention, the first sequencing device 102 is the Ion Torrent sequencing platform, and the first sequencing adapter and the second sequencing adapter are the A sequencing adapter and the P sequencing adapter of the Ion Torrent sequencing platform, respectively.
According to another aspect of the present invention, the present invention also provides a system for simultaneously determining the target-region nucleic acid sequences of multiple test samples. According to embodiments of the present invention, with reference to Fig. 5, the system 200 comprises: a library construction and mixing device 201, a second sequencing device 202 and a second nucleic acid sequence determining device 203. According to embodiments of the present invention, this system makes it easy to carry out target-region enrichment and sequencing of multiple regions of multiple samples simultaneously; the system has a simple structure, is easy to operate, takes little time and has low cost, and the sequencing results are accurate, reliable and reproducible.
Specifically, the library construction and mixing device 201 is used to construct, for each of the multiple test samples independently, a target-region nucleic acid sequencing library according to the method described above for constructing a target-region nucleic acid sequencing library of a test sample, and to mix the target-region nucleic acid sequencing libraries of the multiple test samples to obtain a mixed library, wherein a tag sequence is further included between the first nucleotide sequence and the first sequencing adapter of the second forward primer, and/or between the second nucleotide sequence and the second sequencing adapter of the second reverse primer; the tag sequences of the multiple test samples are different from one another, and "multiple" means at least 2. The second sequencing device 202 is connected to the library construction and mixing device 201 and is used to sequence the mixed library to obtain a sequencing result, wherein the sequencing result comprises the sequences of the target-region nucleic acid sequencing libraries of the multiple test samples and the tag sequences. The second nucleic acid sequence determining device 203 is connected to the second sequencing device 202 and is used to distinguish the nucleic acid sequences of the target-region sequencing libraries of the multiple test samples based on the tag sequences, and to determine the target-region nucleic acid sequence of each of the multiple test samples.
According to some embodiments of the present invention, a multi-well plate is provided in the library construction and mixing device 201, so that sequencing libraries are constructed for the multiple test samples simultaneously using the multi-well plate, wherein each well of the multi-well plate forms an independent reaction system. According to some specific examples of the present invention, the multi-well plate is a chip having 5184 microwells, and the reaction volume of each microwell of the chip is 100 nanoliters.
According to some embodiments of the present invention, the second sequencing device 202 is the Illumina sequencing platform, and the first sequencing adapter and the second sequencing adapter are the P5 sequencing adapter and the P7 sequencing adapter of the Illumina sequencing platform, respectively. According to some specific examples of the present invention, when the second sequencing device 202 is the Illumina sequencing platform, a third nucleotide sequence of 4 nt in length is further included between the tag sequence and the first sequencing adapter P5; preferably, the third nucleotide sequence is 5'-ACAC-3'. This can significantly improve the sequencing success rate, and the sequencing results are accurate and reliable and the sequencing efficiency is high.
According to other embodiments of the present invention, the second sequencing device 202 is the Ion Torrent sequencing platform, and the first sequencing adapter and the second sequencing adapter are the A sequencing adapter and the P sequencing adapter of the Ion Torrent sequencing platform, respectively.
In addition, according to some embodiments of the present invention, combining the method of the present invention with high-throughput chip technology allows multiple regions of multiple samples to be enriched and sequenced quickly and easily. Specifically, the chip format may be 72*72 = 5184 wells, but is not limited to this format; for example, it may also be 36*36 wells, 48*48 wells or 60*60 wells. According to one embodiment of the present invention, the 72*72 = 5184-well chip of WaferGen (USA) is used for target-region enrichment. This chip is made of aluminium, which has good thermal conductivity, and is coated with a layer of inert material; all of its microwells have the same volume, on the nanoliter scale, so that each microwell can carry out a single PCR reaction.
Furthermore, in actual experimental operation, multiple samples and multiple target regions can be combined in any reasonable manner. For example, 24 samples and 216 target regions may be combined as follows: the chip is divided into 24 regions according to the number of samples, and each region uses one sample and one tag (as described above, the tag is located in the second primer set); the samples and tags used in the 24 regions are all different from one another, so that the 24 regions carry 24 samples and 24 tag sequences. Each of the 24 regions comprises 216 microwells; within one region, the sample and tag used in its 216 microwells are identical, while the target-region-specific primers differ. Each of the 24 regions can thus amplify and enrich 216 target-region nucleic acid sequences from one sample, so that the whole chip amplifies and enriches 216 target-region nucleic acid sequences from each of 24 samples, i.e. 24*216 = 5184 target regions are enriched in total. Because each microwell is an independent PCR reaction system, i.e. each amplification and enrichment reaction is independent, the specificity and efficiency of nucleic acid enrichment and capture are ensured. Moreover, because the amplification products of the 24 samples all carry mutually different tag sequences, all 5184 amplification products can be mixed and sequenced together, and the tag sequences are finally used to distinguish and identify the different samples.
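The 24-sample by 216-target combination described above can be sketched as a simple mapping from well position to sample and target. The text does not specify how the 24 regions are laid out on the chip; the sketch below assumes, purely for illustration, that each region is a band of 3 consecutive rows of the 72*72 grid (3*72 = 216 wells).

```python
ROWS = COLS = 72                 # 72*72 = 5184 wells
N_SAMPLES, N_TARGETS = 24, 216   # combination described in the text

def well_assignment(row, col):
    """Map a 0-based (row, col) well to 0-based (sample, target) indices.

    Assumed layout: each sample region is a band of ROWS // N_SAMPLES rows.
    """
    rows_per_region = ROWS // N_SAMPLES            # 3 rows per sample region
    sample = row // rows_per_region
    target = (row % rows_per_region) * COLS + col  # 0..215 within the region
    return sample, target

layout = {(r, c): well_assignment(r, c) for r in range(ROWS) for c in range(COLS)}
print(len(layout), "wells,", len(set(layout.values())), "distinct sample/target pairs")
```

Every well receives a distinct sample/target pair, which mirrors the statement that each microwell is an independent single PCR reaction.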
It should also be noted that the method and system of the present invention can be combined with a high-throughput chip so that, through a one-step PCR reaction, multiple (for example 5184) target fragment regions of interest can be enriched from the genomic DNA of multiple samples, and the enriched amplification products directly form a DNA library ready for sequencing (because the reaction is a one-step PCR with no intermediate purification step, the whole library construction process can be completed in the microwell chip). Compared with existing PCR enrichment and amplification techniques, the experimental workflow and time are shortened, making the method more convenient and faster. Compared with conventional target-region capture techniques, the advantages are prominent: preparing a DNA library by conventional target-region capture generally takes 4-7 days (probe hybridization kits from Agilent and NimbleGen), whereas the present invention needs only 3-4 hours; DNA library construction by conventional target-region capture comprises multiple steps such as target-region enrichment and library construction, whereas the present technique needs only a one-step PCR reaction and can handle trace DNA samples as low as 50 ng. A 100 nl system is sufficient to capture one DNA fragment of about 200 bp; for example, the 100 nl reaction system comprises: template DNA (10 ng/μl) 10 nl, 2× PCR master mix 50 nl, second forward primer (4 μM) 10 nl, second reverse primer (4 μM) 10 nl, first primer set (250 nM each) 20 nl, so that reagent cost is significantly reduced. Compared with single-tube multiplex PCR amplification enrichment technology (Ampli-seq, Life Technologies), the present method uses a highly integrated amplification format in which each reaction system carries out one PCR, which maintains throughput while effectively solving the problem of low efficiency caused by mutual interference between multiplex PCR primers. Compared with conventional methods, the library construction method of the present invention can be carried out on a high-throughput chip, where the volume of each PCR reaction is on the nanoliter scale (for example a 100 nl reaction system), so that the consumption of sample and reagents is greatly reduced, which makes the method particularly suitable for nucleic acid enrichment and sequencing of trace samples. In summary, the method and system of the present invention combine speed, simple and convenient operation, low cost, high throughput and high enrichment efficiency, and offer advantages that other techniques in the field of library construction for enrichment sequencing of small target regions cannot match.
The solution of the present invention is explained below with reference to embodiments. Those skilled in the art will understand that the following embodiments serve only to illustrate the present invention and should not be regarded as limiting its scope. Where specific techniques or conditions are not indicated in the embodiments, they are carried out according to the techniques or conditions described in the literature in this field (for example, J. Sambrook et al., "Molecular Cloning: A Laboratory Manual", translated by Huang Peitang et al., third edition, Science Press) or according to the product instructions. Reagents or instruments whose manufacturer is not indicated are all conventional products that are commercially available.
Embodiment 1:
With reference to Fig. 2 and Fig. 3, following the method of the present invention for constructing and sequencing a target-region nucleic acid sequencing library of a test sample, variant detection and target-region sequencing of 63 SNP sites were carried out on the Illumina sequencing platform (MiSeq sequencer) for 24 human genomic DNA samples with known whole-genome sequencing results, as follows:
Multiple pairs of inner primers were designed for 63 human SNP sites and then tested on the genomic DNA of multiple samples. The specific operations are as follows:
(1) Target-region-specific forward and reverse primers were designed for the region containing each SNP site, ensuring that the SNP site lies in the middle of the region defined by the start position covered by the forward and reverse primers plus the sequencing read length; the primer length is 18-25 nt, the Poly-X number is less than or equal to 3, and the primer Tm is in the range of 58-60 °C;
(2) A common sequence was added to the 5' end of each designed target-region-specific forward and reverse primer and the resulting primers were synthesized, to obtain the inner primers (i.e. the first primer set: the first forward primer and the first reverse primer):
First forward primer: 5'-ACACTGACGACATGGTTCTACA-3' + 5'-NNNNNNNNNNNNNNNNNNNN-3', wherein 5'-ACACTGACGACATGGTTCTACA-3' is the first nucleotide sequence and NNNNNNNNNNNNNNNNNNNN represents the target-region-specific forward primer sequence;
First reverse primer: 5'-TACGGTAGCAGAGACTTGGTCT-3' + 5'-NNNNNNNNNNNNNNNNNNNN-3', wherein 5'-TACGGTAGCAGAGACTTGGTCT-3' is the second nucleotide sequence and NNNNNNNNNNNNNNNNNNNN represents the target-region-specific reverse primer sequence.
The specific sequences of the first forward primer and the first reverse primer for each SNP site are shown in Table 1.
Table 1
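The design constraints of step (1) and the inner-primer layout of step (2) can be expressed as a small check. This is a minimal sketch under stated assumptions: "Poly-X number" is interpreted here as the longest run of one repeated base, the Tm value is supplied by the caller because the text does not say how Tm is calculated, and the target-specific sequences used are hypothetical.

```python
import re

CS1 = "ACACTGACGACATGGTTCTACA"  # first nucleotide sequence (from the text)
CS2 = "TACGGTAGCAGAGACTTGGTCT"  # second nucleotide sequence (from the text)

def max_homopolymer(seq):
    """Length of the longest run of a single repeated base (assumed Poly-X)."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)

def passes_design_rules(specific, tm_celsius):
    """Length 18-25 nt, Poly-X <= 3, Tm within 58-60 degrees C."""
    return (18 <= len(specific) <= 25
            and max_homopolymer(specific) <= 3
            and 58 <= tm_celsius <= 60)

def inner_primer(common, specific):
    """Common sequence at the 5' end, target-specific sequence at the 3' end."""
    return common + specific

# Hypothetical target-specific sequences and Tm values, for illustration only.
good = "GATTACAGCTAGGCTTCAGT"
bad = "AAAAGCTAGCTAGCTAGCTA"   # contains a run of four A
print(passes_design_rules(good, 59.0), passes_design_rules(bad, 59.0))
print(inner_primer(CS1, good))
```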
(3) Primers carrying a sequencing adapter and the common sequence were synthesized to obtain the outer primers (i.e. the second primer set: the second forward primer and the second reverse primer); the outer primer sequences are the same for every SNP site:
Second forward primer: 5'-AATGATACGGCGACCACCGAGATCT-3' + 5'-ACAC-3' + NNNNNNNN + 5'-ACACTGACGACATGGTTCTACA-3', wherein 5'-AATGATACGGCGACCACCGAGATCT-3' is the first sequencing adapter (SEQ ID NO: 266, the P5 sequencing adapter of the Illumina sequencing platform) and NNNNNNNN represents the tag sequence, which is 8 nt in length; in addition, a third nucleotide sequence of 4 nt in length, 5'-ACAC-3', is included between the tag sequence and the first sequencing adapter (P5 sequencing adapter);
Second reverse primer: 5'-CAAGCAGAAGACGGCATACGAGAT-3' + NNNNNNNN + 5'-TACGGTAGCAGAGACTTGGTCT-3', wherein 5'-CAAGCAGAAGACGGCATACGAGAT-3' is the second sequencing adapter (SEQ ID NO: 267, the P7 sequencing adapter of the Illumina sequencing platform) and NNNNNNNN represents the tag sequence, which is 8 nt in length.
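The layout of the outer primers in step (3) can be assembled programmatically, which makes it easy to confirm that the 3' portion of each outer primer is identical to the common sequence at the 5' end of the corresponding inner primer. The adapter, spacer and common sequences below are taken from the text; the 8 nt tags are hypothetical placeholders, since the actual tag sequences are given in the sequence listing.

```python
P5 = "AATGATACGGCGACCACCGAGATCT"   # first sequencing adapter (SEQ ID NO: 266)
P7 = "CAAGCAGAAGACGGCATACGAGAT"    # second sequencing adapter (SEQ ID NO: 267)
SPACER = "ACAC"                    # third nucleotide sequence, 4 nt
CS1 = "ACACTGACGACATGGTTCTACA"     # first nucleotide sequence
CS2 = "TACGGTAGCAGAGACTTGGTCT"     # second nucleotide sequence

def second_forward_primer(tag):
    """P5 adapter + ACAC + tag + first nucleotide sequence (5' to 3')."""
    return P5 + SPACER + tag + CS1

def second_reverse_primer(tag):
    """P7 adapter + tag + second nucleotide sequence (5' to 3')."""
    return P7 + tag + CS2

# Hypothetical 8 nt tags, for illustration only.
fwd = second_forward_primer("AAGGTTCC")
rev = second_reverse_primer("CCTTGGAA")
print(len(fwd), fwd)
print(len(rev), rev)
```

With 8 nt tags the assembled primers are 59 nt and 54 nt long, respectively.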
In this embodiment, 8 second forward primer tag sequences (SEQ ID NO: 255-262) are combined with 3 second reverse primer tag sequences (SEQ ID NO: 263-265) to form 8*3 = 24 tag combinations, and the 24 tag combinations are assigned one-to-one to the 24 samples. Thus, the combination of second forward primer tag sequence and second reverse primer tag sequence used for each sample is one of these tag combinations, and the tag combinations of the 24 samples are all different from one another. Consequently, after library construction and sequencing are completed, the 24 samples can easily be distinguished based on the tag combination (i.e. the second forward primer tag sequence and the second reverse primer tag sequence).
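The 8*3 = 24 combinatorial tagging scheme can be sketched as follows. The identifiers F1-F8 and R1-R3 stand in for the tag sequences of SEQ ID NO: 255-262 and SEQ ID NO: 263-265, and the order in which combinations are assigned to samples is an assumption made only for illustration.

```python
from itertools import product

forward_tags = [f"F{i}" for i in range(1, 9)]   # placeholders for SEQ ID NO: 255-262
reverse_tags = [f"R{i}" for i in range(1, 4)]   # placeholders for SEQ ID NO: 263-265
samples = [f"sample_{i}" for i in range(1, 25)]

# Every (forward tag, reverse tag) pair is unique, giving 8 * 3 = 24 combinations.
combos = list(product(forward_tags, reverse_tags))
sample_to_combo = dict(zip(samples, combos))
combo_to_sample = {combo: sample for sample, combo in sample_to_combo.items()}

print(len(combos), "tag combinations")
print(sample_to_combo["sample_1"], sample_to_combo["sample_24"])
```

Only 11 tagged outer primers (8 forward and 3 reverse) are needed to label 24 samples, and the reverse lookup `combo_to_sample` is what allows reads to be assigned to samples after sequencing.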
The specific sequences of the 8 second forward primer tag sequences (SEQ ID NO: 255-262) are as follows:
The specific sequences of the 3 second reverse primer tag sequences (SEQ ID NO: 263-265) are as follows:
(4) Using a human genomic DNA standard as template, PCR amplification with the first primer set and gel electrophoresis were carried out to verify whether the first primer set meets the requirements, the criterion being whether a single band of the designed size can be amplified.
The total volume of the PCR system is 10 μl, and the system comprises:
template DNA (10 ng/μl) 1 μl, 2× PCR master mix 5 μl, first forward primer (10 μM) 0.5 μl, first reverse primer (10 μM) 0.5 μl, ddH2O 3 μl.
The PCR program is shown in Table 2 below:
Table 2: PCR program
Then, the first primer set that passed verification and the second primer set were mixed at a final concentration ratio of 1:8, and a pre-PCR reaction and gel electrophoresis were carried out with the human genomic DNA standard as template, to determine whether the target region can be amplified effectively and specifically. The total volume of the PCR reaction system is 10 μl, comprising: template DNA (10 ng/μl) 1 μl, 2× PCR master mix 5 μl, second forward primer (10 μM) 0.4 μl, second reverse primer (10 μM) 0.4 μl, first primer set (one pair of inner primers) (250 nM each) 2 μl, ddH2O 1.2 μl. The PCR program is as shown in Table 2.
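The 1:8 final concentration ratio can be checked directly from the stated stock concentrations and volumes. The sketch below applies the usual dilution relation (final concentration = stock concentration × added volume / total volume) to the 10 μl pre-PCR system and to the 100 nl chip system given in the text; the function and variable names are chosen for illustration.

```python
def final_conc_nM(stock_nM, added_volume, total_volume):
    """Final concentration after dilution; both volumes must share one unit."""
    return stock_nM * added_volume / total_volume

# 10 ul pre-PCR system (volumes in ul), values from the text.
pre_pcr = {"template": 1, "master_mix": 5, "outer_fwd": 0.4,
           "outer_rev": 0.4, "inner_pair": 2, "water": 1.2}
pre_total = sum(pre_pcr.values())
outer_pre = final_conc_nM(10_000, pre_pcr["outer_fwd"], pre_total)  # 10 uM stock
inner_pre = final_conc_nM(250, pre_pcr["inner_pair"], pre_total)    # 250 nM stock

# 100 nl chip system (volumes in nl), values from the text.
chip = {"template": 10, "master_mix": 50, "outer_fwd": 10,
        "outer_rev": 10, "inner_pair": 20}
chip_total = sum(chip.values())
outer_chip = final_conc_nM(4_000, chip["outer_fwd"], chip_total)    # 4 uM stock
inner_chip = final_conc_nM(250, chip["inner_pair"], chip_total)

print(f"pre-PCR: inner {inner_pre:.0f} nM, outer {outer_pre:.0f} nM")
print(f"chip:    inner {inner_chip:.0f} nM, outer {outer_chip:.0f} nM")
```

Both systems work out to roughly 50 nM of each inner primer and 400 nM of each outer primer, which is the 1:8 ratio stated above; the component volumes also sum to the stated totals of 10 μl and 100 nl.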
Fig. 6 shows the electrophoresis results obtained when the inner and outer primers of each SNP site are used together. The correspondence between each lane band in Fig. 6 and the target-region first primer set used (i.e. the lane number and the first primer set of the target region used for that lane) is shown in the table below:
The results show that the first primer sets corresponding to lanes 133,134,145,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,155,156,157,158,159,161,167,170,171,173,175,176,177,178,180,181,182,187,188,189,190,193 (i.e. the inner primers of 41 target regions) are qualified and can be used for subsequent library construction. The first primer sets corresponding to the remaining lanes are unqualified and are not used further for library construction.
The genomic DNA of the samples to be tested was diluted to 10 ng/μl for later use.
(5) Then, using the 72*72 = 5184-well chip of WaferGen (USA), PCR reactions were carried out in the following 100 nl reaction system: template DNA (10 ng/μl) 10 nl, 2× PCR master mix 50 nl, second forward primer (4 μM) 10 nl, second reverse primer (4 μM) 10 nl, first primer set (one pair of inner primers) (250 nM each) 20 nl. Sample loading and the PCR reaction were carried out with the micro-dispensing instrument matched to the chip (MultiSample NanoDispenser, supplied by WaferGen) and a PCR instrument (Bio-Rad, model T-100). Specifically:
The first primer sets that passed pre-PCR verification, the second primer sets, the samples and the PCR reagents were loaded into the chip, and the corresponding PCR reactions were carried out. Before loading, the first primer set was mixed with one part of the PCR reagents to give the detection-site reagent premix, and the sample, the tagged second primer set uniquely corresponding to that sample and the other part of the PCR reagents were mixed to give the sample premix. Before loading, 24 regions were selected on the chip, each region having 41 microwells (corresponding to 41 target regions, i.e. 41 SNP sites). During loading, the sample premix was first added to every microwell of the 24 regions on the chip, and then the corresponding detection-site reagent premix was added to each microwell already containing sample premix. Each microwell thus contains one pair of inner primers, one pair of tagged outer primers corresponding to the sample, the sample and the PCR reagents, which together form a complete PCR reaction system. After PCR is finished, the PCR products can be conveniently recovered into a centrifuge tube by inverting the chip and centrifuging. The PCR program is the same as in Table 2.
(6) After PCR was completed, the sealing film was removed from the chip, and the chip was inverted and centrifuged, whereupon the PCR products in the different wells were pooled by centrifugation and recovered into a centrifuge tube. The pooled product was then purified with 1 volume of Ampure XP magnetic beads to remove the PCR reagents and unreacted primers; the purified product is the constructed nucleic acid library.
(7) Library quality control: the quality and yield of the constructed library were assessed using the Agilent 2100 Bioanalyzer and the ABI StepOnePlus Real-Time PCR System.
(8) After the library passed quality control, it was sequenced on the MiSeq sequencer with a read length of 150 bp; the required sequencing depth is 800×, and the data volume per amplification product is 0.24 Mb.
(9) Information analysis workflow: first, the software fqclean was used to filter non-target base sequences, such as adapters and low-quality reads, out of the raw data obtained from sequencing, giving clean data. The clean data were aligned to the reference genome hg19 with the alignment software BWA, and data quality control statistics were compiled (the sequencing data of each sample being distinguished based on the tag sequence of that sample). The clean data were also input into the nucleotide variation detection software SAMtools for SNP calling (finding base positions that carry a variation). Finally, the position information of the 41 SNPs (sites of nucleotide variation) of interest in this embodiment was used to filter the SNPs and obtain the final genotypes, and the genotypes obtained were then compared with the whole-genome sequencing data of the corresponding samples to analyze the accuracy of SNP detection.
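The last two operations of step (9), filtering the called SNPs to the positions of interest and comparing the resulting genotypes with whole-genome sequencing data, can be sketched in plain Python. The data structures, positions and genotypes below are hypothetical and are not the output format of any of the named tools.

```python
def filter_to_sites(calls, sites_of_interest):
    """Keep only the calls whose (chromosome, position) is a site of interest."""
    return {site: gt for site, gt in calls.items() if site in sites_of_interest}

def concordance(panel_genotypes, wgs_genotypes):
    """Fraction of shared sites with identical genotypes; None if no shared site."""
    shared = panel_genotypes.keys() & wgs_genotypes.keys()
    if not shared:
        return None
    matches = sum(panel_genotypes[s] == wgs_genotypes[s] for s in shared)
    return matches / len(shared)

# Hypothetical calls for one sample, keyed by (chromosome, position).
calls = {("chr1", 1000): "A/G", ("chr2", 2000): "C/C", ("chr3", 3000): "T/T"}
sites = {("chr1", 1000), ("chr2", 2000)}
wgs = {("chr1", 1000): "A/G", ("chr2", 2000): "C/T"}

filtered = filter_to_sites(calls, sites)
print(filtered)
print(concordance(filtered, wgs))
```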
(10) The coverage of the 41 SNP sites and the capture specificity were calculated for the 24 samples; the results are shown in Fig. 7 and Fig. 8. The final results for the 800× data show that the mean coverage of each sample was 95% and the capture specificity was greater than 90% (as shown in Fig. 7 and Fig. 8). Specifically, in Fig. 7, coverage = number of sites detected / number of sites of interest. As shown in Fig. 7, the coverage of the 41 SNP sites in the 24 samples is above 0.91 in every case, indicating that the sites are detected well; the coverage for sites detected more than 10 times is above 0.94 in every case, and the coverage for sites detected more than 50 times is above 0.91 in every case, indicating that the results obtained are very uniform. In Fig. 8, capture specificity = number of sequences identical to the sequences of interest / number of all sequences obtained by sequencing; the higher this value, the higher the usability of the data obtained by sequencing. As shown in Fig. 8, the lowest capture specificity among the 24 samples is still close to 0.9, showing that the usability of the sequencing data is very high.
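The two metrics defined in step (10) translate directly into code. In the sketch below, a site counts as detected when it has been observed more than a given number of times, so the same function yields the overall coverage and the coverage at the 10-fold and 50-fold thresholds; the per-site depths and read counts are hypothetical.

```python
def coverage(site_depths, more_than=0):
    """Sites detected more than `more_than` times / sites of interest."""
    detected = sum(1 for depth in site_depths.values() if depth > more_than)
    return detected / len(site_depths)

def capture_specificity(on_target_reads, total_reads):
    """Sequences matching the sequences of interest / all sequences obtained."""
    return on_target_reads / total_reads

# Hypothetical per-site depths for one sample, for illustration only.
depths = {"snp1": 820, "snp2": 35, "snp3": 0, "snp4": 12}
print(coverage(depths), coverage(depths, more_than=10), coverage(depths, more_than=50))
print(capture_specificity(900, 1000))
```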
The variant detection results for the 41 SNP sites of the 24 samples were compared for consistency with the known whole-genome sequencing results of those samples; the results are shown in Table 3 below. This shows that the variant detection results of this embodiment for the 41 SNP sites of the 24 samples are accurate and reliable.
Table 3: Consistency between SNP detection results and whole-genome sequencing results for some of the samples
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principle and spirit of the present invention, the scope of which is defined by the claims and their equivalents.