Method for constructing a DNA library usable for noninvasive prenatal detection of monogenic diseases
Technical field
The invention belongs to the field of molecular diagnostics, and in particular relates to a method for constructing a DNA library usable for noninvasive prenatal detection of monogenic diseases.
Background technology
Cell-free fetal DNA (cffDNA) is present in the plasma of pregnant women, which makes it possible to examine the genetic information of the fetus by noninvasive means. Accordingly, a growing number of methods are being used for noninvasive detection of hereditary disease in the fetus. Among them, screening for chromosomal numerical abnormalities by next-generation sequencing, including trisomies of chromosomes 13, 18 and 21, has received national regulatory approval and has been industrialized. More recently, as the technology has developed, noninvasive means have also become usable for detecting chromosomal microdeletions and microduplications. Noninvasive detection of monogenic diseases, however, still requires further technical development, because the presence of maternal cfDNA (cell-free DNA) readily interferes with determination of the fetal genotype, particularly where the mother carries a heterozygous pathogenic mutation.
At present, noninvasive detection of monogenic diseases relies mainly on digital PCR and haplotype linkage analysis. Both approaches have their own drawbacks. Digital PCR is not accurate for determining the fetal genotype, and deviations or errors are common: a template can be enriched by the primers only if it covers both the upstream and the downstream primer, which requires the template itself to be sufficiently long, whereas cffDNA molecules in plasma are already scarce; taking amplification efficiency into account as well, few data points can ultimately be detected. Haplotype linkage analysis can determine the fetal genotype more accurately, but it is a large, time-consuming and labor-intensive undertaking that is difficult to apply to pregnant women who urgently need clinical results.
To detect monogenic diseases, the prior art generally proceeds by the following steps:
(1) After end repair, each cfDNA molecule is ligated to an adapter containing a Unique ID (composed of multiple N bases, also called a barcode), and a first amplification is carried out.
(2) The amplified product is denatured and circularized as single molecules with Taq ligase. Linear molecules that have not circularized are removed by digestion with EXO I and EXO III.
(3) The mutation site is captured with specific back-to-back primers, and a second amplification is carried out.
(4) The adapter sequences of the Illumina platform are added to the captured product.
In the prior art, however, the denatured ssDNA can renature during circularization, so that a considerable fraction of the molecules fail to circularize and are removed by the subsequent digestion of linear molecules. In addition, circularized single-stranded molecules are not easy to quantify accurately, so the success of circularization can only be evaluated after sequencing is complete, which may cause considerable waste.
Summary of the invention
In view of the above deficiencies of the prior art, an object of the invention is to provide a method for constructing a DNA library in which circularization is more efficient and the outcome of circularization is easy to monitor.
The present inventors made intensive studies to solve the above technical problem and found the following: starting from the prior art, if a restriction site is introduced at the 5' end of the amplification primers used in the first amplification, and in a subsequent step the amplified product is cut with the corresponding restriction endonuclease and then joined with a ligase so that it circularizes as double-stranded DNA, circularization efficiency can be improved and the outcome of circularization becomes easy to monitor. The invention was completed on this basis.
That is, the invention includes:
1. A method for constructing a DNA library for sequencing, comprising:
Step D: subjecting adapter-ligated DNA fragments to PCR amplification to obtain a first amplification product, wherein the 5' end of each primer used in the PCR amplification carries a restriction site of a restriction endonuclease;
Step E: digesting the first amplification product with the restriction endonuclease to obtain a restriction-digested product;
Step F: subjecting the restriction-digested product to self-ligation to obtain a DNA mixture comprising double-stranded circularized product;
Step G: digesting the DNA mixture with exonuclease to obtain a digestion product;
Step H: subjecting the digestion product to PCR amplification to obtain a second amplification product, wherein this PCR amplification uses back-to-back primers as amplification primers; and
Step I: subjecting the second amplification product to PCR amplification to obtain a third amplification product, wherein this PCR amplification uses primers carrying sequencing adapter sequences as amplification primers.
2. The method for constructing a DNA library for sequencing according to item 1, further comprising:
Step C, carried out before step D: ligating adapters to 3'-A-tailed DNA fragments to obtain the adapter-ligated DNA fragments.
3. The method for constructing a DNA library for sequencing according to item 2, further comprising:
Step B, carried out before step C: adding an A to the 3' ends of blunt-ended DNA fragments to obtain the 3'-A-tailed DNA fragments.
4. The method for constructing a DNA library for sequencing according to item 3, further comprising:
Step A, carried out before step B: performing end repair on cell-free DNA from a biological sample to obtain the blunt-ended DNA fragments.
5. The method for constructing a DNA library for sequencing according to item 4, wherein a step of isolating and purifying cell-free DNA from the biological sample is carried out before step A.
6. The method for constructing a DNA library for sequencing according to item 4 or 5, wherein the biological sample is maternal peripheral blood plasma, a paraffin-embedded sample, saliva, a buccal swab or peripheral blood plasma.
7. The method for constructing a DNA library for sequencing according to item 4, further comprising a purification step between step A and step B, between step B and step C, between step C and step D, between step D and step E, between step E and step F, between step F and step G, between step G and step H, and/or after the steps.
8. The method for constructing a DNA library for sequencing according to item 1, wherein the recognition sequence of the restriction endonuclease is 4 to 8 bases long.
9. The method for constructing a DNA library for sequencing according to item 1, wherein the exonucleases used in step G are a plasmid-safe ATP-dependent DNase and exonuclease III.
10. The method for constructing a DNA library for sequencing according to item 1, wherein the back-to-back primers in step H are used to capture loci associated with monogenic diseases.
Effect of the invention
According to the invention, single-stranded circularization is replaced by double-stranded circularization, so that circularization efficiency is improved and the outcome of circularization is easy to monitor. The DNA library construction method of the invention is applicable to noninvasive prenatal detection of monogenic diseases.
Brief description of the drawings
Fig. 1 is a schematic diagram of the experimental workflow of an embodiment of the invention.
Detailed description of the invention
First, in one aspect, the invention provides a method for constructing a DNA library for sequencing (the library construction method of the invention), comprising:
Step D: subjecting adapter-ligated DNA fragments to PCR amplification to obtain a first amplification product, wherein the 5' end of each primer used in the PCR amplification carries a restriction site of a restriction endonuclease;
Step E: digesting the first amplification product with the restriction endonuclease to obtain a restriction-digested product;
Step F: subjecting the restriction-digested product to self-ligation to obtain a DNA mixture comprising double-stranded circularized product;
Step G: digesting the DNA mixture with exonuclease to obtain a digestion product;
Step H: subjecting the digestion product to PCR amplification to obtain a second amplification product, wherein this PCR amplification uses back-to-back primers as amplification primers; and
Step I: subjecting the second amplification product to PCR amplification to obtain a third amplification product, wherein this PCR amplification uses primers carrying sequencing adapter sequences as amplification primers.
In the conventional method, the DNA fragments to be sequenced undergo end repair, A-tailing and adapter ligation and are then subjected to a first PCR amplification to obtain a first amplification product. The method of the invention for constructing a DNA library for sequencing is essentially the same in this respect, except that the primers used in the first PCR amplification carry at their 5' ends a restriction site of a restriction endonuclease, so that both ends of the first amplification product carry the restriction site of that restriction endonuclease. The restriction endonuclease is not particularly limited; a restriction endonuclease that recognizes 4 to 8 base pairs is preferably used.
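The primer design described above can be illustrated with the ClaI-tailed primers of the examples (primer-1.1 and primer-1.2). The following is a minimal Python sketch; the function name `add_site_tail` and the reading of the leading `CC` as extra 5' padding bases are assumptions made for illustration and are not stated in the specification. The ClaI recognition sequence ATCGAT is standard enzyme knowledge rather than something taken from the text.

```python
# ClaI recognition sequence; it appears as the 5' tail of primer-1.1 and primer-1.2.
CLAI_SITE = "ATCGAT"

def add_site_tail(primer: str, site: str = CLAI_SITE, pad: str = "CC") -> str:
    """Return `primer` with `pad` + `site` prepended at its 5' end.

    `pad` stands for extra 5' bases placed in front of the site (assumption:
    the leading "CC" of primer-1.1 / primer-1.2 plays this role).
    """
    return pad + site + primer.upper()

# Primer-1 and primer-2 of the single-stranded workflow (5' phosphate omitted).
primer_1 = "ACTGTCATGCTCTTCCGATC"
primer_2 = "CTGAACCTCCTAGTGTAACG"

primer_1_1 = add_site_tail(primer_1)
primer_1_2 = add_site_tail(primer_2)
print(primer_1_1)
print(primer_1_2)
```

With these inputs the sketch reproduces the primer-1.1 and primer-1.2 sequences listed in workflow 1.2, which suggests that the tailed primers differ from primer-1 and primer-2 only by the added 5' segment.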
Next, the first amplification product is digested with the restriction endonuclease to obtain a restriction-digested product. The conditions of the digestion reaction can be selected as appropriate according to the properties of the restriction endonuclease used. For example, the reaction can generally be carried out in 0.1× to 10× restriction digestion buffer at 0 to 80 °C (preferably 10 to 40 °C) for about 1 minute to 100 hours (preferably 10 minutes to 10 hours). The resulting digested product is a double-stranded DNA molecule with two cohesive ends. The number of bases in each cohesive end is not particularly limited, but is preferably 2 to 6 from the standpoint of facilitating self-ligation in the subsequent step.
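How a palindromic restriction site at each end yields two compatible cohesive ends can be sketched as follows. The example uses ClaI, the enzyme of the working examples, whose cut position (AT^CGAT, 2-base 5' overhang) is taken from standard enzyme documentation and not from this specification; the toy amplicon and the function name `digest_ends` are hypothetical.

```python
def digest_ends(seq: str, site: str = "ATCGAT", cut: int = 2):
    """Cut a linear top-strand sequence at its first and last occurrence of a
    palindromic site; return (top-strand fragment, 5' overhang length).

    `cut` is the offset of the top-strand cut inside the site. For a
    palindromic site the bottom strand is cut at len(site) - cut, so the
    5' overhang is len(site) - 2 * cut bases long.
    """
    first = seq.find(site)
    last = seq.rfind(site)
    if first == -1 or first == last:
        raise ValueError("a restriction site is needed near each end")
    fragment = seq[first + cut : last + cut]
    overhang = len(site) - 2 * cut
    return fragment, overhang

# Hypothetical amplicon: pad + site + insert + site + pad.
amplicon = "CC" + "ATCGAT" + "ACGTACGTAC" + "ATCGAT" + "GG"
fragment, overhang = digest_ends(amplicon)
print(fragment, overhang, fragment[:overhang])
```

Both ends of the digested molecule carry the same short 5' overhang, so they can anneal to each other in the self-ligation step; the 2-base overhang falls within the preferred range of 2 to 6 bases.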
Next, the restriction-digested product is subjected to self-ligation to obtain a DNA mixture comprising double-stranded circularized product. In general, in addition to the double-stranded circularized product, the DNA mixture may also contain uncircularized double-stranded DNA, single-stranded DNA and the like. Self-ligation can be carried out with a DNA ligase having end-joining activity, for example T4 DNA ligase, T3 DNA ligase, E. coli DNA ligase or a thermostable DNA ligase. The amounts of enzyme and substrate used in the ligation reaction and the reaction conditions can be selected as appropriate by those skilled in the art. For example, the reaction can generally be carried out in 0.1× to 10× ligase buffer at 0 to 80 °C (preferably 10 to 40 °C) for about 1 minute to 200 hours (preferably 1 to 30 hours).
Next, the DNA mixture is digested with exonuclease to obtain a digestion product. The exonuclease used is not particularly limited; for example, exonuclease III, exonuclease I or a plasmid-safe ATP-dependent DNase (such as Plasmid-Safe™ ATP-Dependent DNase) can be used. A combination of an ATP-dependent DNase and exonuclease III is preferred, because it preserves the circular double-stranded DNA while removing linear DNA. The digestion conditions can be selected as appropriate according to the properties of the exonuclease used. For example, the reaction can generally be carried out in 0.1× to 10× digestion buffer at 0 to 80 °C for about 1 minute to 100 hours (preferably 10 minutes to 10 hours).
Existing methods circularize single-stranded DNA molecules (ssDNA). However, denatured ssDNA may renature during circularization, so that a considerable fraction of the single-stranded molecules cannot circularize and are removed by the subsequent digestion. In contrast, the method of the invention for constructing a DNA library for sequencing circularizes double-stranded DNA molecules (dsDNA), which can markedly improve circularization efficiency. Moreover, simple and accurate means of detection are lacking for ssDNA, whereas dsDNA can be measured accurately by the conventional ultraviolet absorption method. This makes circularization easier to monitor in the method of the invention, which provides an important indicator for later evaluation of capture efficiency.
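As an illustration of monitoring by ultraviolet absorption, the sketch below converts an A260 reading into a dsDNA concentration using the conventional factor of about 50 ng/µL per absorbance unit (1 cm path length) and expresses the DNA surviving exonuclease digestion as a fraction of the input. The factor is general laboratory convention, not a value from this specification; the function names and all numerical readings are hypothetical.

```python
# Conventional conversion: an A260 of 1.0 (1 cm path) is about 50 ng/uL of dsDNA.
DSDNA_NG_PER_UL_PER_A260 = 50.0

def dsdna_concentration(a260: float, dilution: float = 1.0) -> float:
    """dsDNA concentration in ng/uL estimated from an A260 reading."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution

def recovery(mass_before_ng: float, mass_after_ng: float) -> float:
    """Fraction of DNA mass remaining after a step (e.g. exonuclease digestion)."""
    if mass_before_ng <= 0:
        raise ValueError("input mass must be positive")
    return mass_after_ng / mass_before_ng

# Hypothetical readings: ligation input of 400 ng, digested eluate of 20 uL.
conc_after = dsdna_concentration(0.2)
mass_after = conc_after * 20
print(conc_after, recovery(400.0, mass_after))
```

Because linear molecules are removed by the exonuclease step, the recovered fraction can serve as a rough proxy for the proportion of molecules that circularized, under the assumption that purification losses are comparable between samples.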
In addition, the conventional method uses "splint circularization". If residual auxiliary DNA probe remains after splint circularization, it can bind to the single-stranded circular DNA and form a partially double-stranded structure. Because the digestion step requires the exonuclease EXO III, the circular DNA may then be damaged by EXO III, which may ultimately reduce circularization efficiency. In contrast, the method of the invention for constructing a DNA library for sequencing does not use splint circularization and uses no auxiliary DNA probe, and therefore effectively avoids this problem.
Next, the digestion product is subjected to PCR amplification to obtain a second amplification product, wherein this PCR amplification uses back-to-back primers as amplification primers. In this specification, "back-to-back amplification" means the following: with circularized DNA as the template, the upstream and downstream primers are designed near the target position and oriented in opposite directions, so that during amplification the primers extend away from each other and replicate the entire circular molecule. Back-to-back primers are primers used for back-to-back amplification. In particular, the back-to-back primers can be used to capture gene loci associated with monogenic diseases.
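Back-to-back amplification on a circular template can be sketched in a few lines of Python. The circle, the primers and the function names below are toy values chosen for illustration, not sequences from the examples; the sketch only models where the product starts and ends, not the PCR chemistry.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def back_to_back_product(circle: str, fwd: str, rev: str) -> str:
    """Linear product obtained from a circular template with back-to-back primers.

    `circle` is the top strand of the circle written from an arbitrary origin.
    `fwd` matches the top strand; `rev` matches the bottom strand, so its
    binding site on the top strand reads revcomp(rev).
    """
    doubled = circle + circle          # lets matches run across the origin
    start = doubled.find(fwd)
    if start == -1:
        raise ValueError("forward primer not found")
    rev_site = revcomp(rev)
    end = doubled.find(rev_site, start + len(fwd))
    if end == -1 or end + len(rev_site) > start + len(circle):
        raise ValueError("reverse primer site not found within one turn")
    return doubled[start : end + len(rev_site)]

# Toy circle; the primers abut at the target position and point away from each other.
circle = "GATTACAGGCCTTAACGT"
fwd = circle[9:15]             # extends clockwise from the target position
rev = revcomp(circle[3:9])     # extends counter-clockwise from the target position
product = back_to_back_product(circle, fwd, rev)
print(product, len(product) == len(circle))
```

When the two primers abut, the product spans the whole circle, which is why an uncircularized linear molecule, where the primers point away from each other, gives no exponential product.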
Next, the second amplification product is subjected to PCR amplification to obtain a third amplification product, wherein this PCR amplification uses primers carrying sequencing adapter sequences as amplification primers. This is a conventional step in methods for constructing DNA libraries for sequencing.
Preferably, the method of the invention for constructing a DNA library for sequencing further comprises step C, carried out before step D: ligating adapters to 3'-A-tailed DNA fragments to obtain the adapter-ligated DNA fragments. The adapter contains a barcode sequence that is used to identify the source of reads in post-sequencing data analysis.
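A minimal sketch of how such a barcode could be recovered from a read is given below, using the layout of barcode-Adapter 2 from the examples (a fixed 20-base segment, six N bases, then a fixed 13-base segment). The assumption that the read contains the adapter in exactly this orientation, the function name and the toy read are illustrative only.

```python
import re

# Layout of barcode-Adapter 2: fixed segment, 6 random bases, fixed segment.
BARCODE_RE = re.compile(r"CTGAACCTCCTAGTGTAACG([ACGT]{6})GCTCTTCCGATCT")

# Six random positions give 4**6 distinct barcodes.
N_POSSIBLE_BARCODES = 4 ** 6

def extract_barcode(read: str):
    """Return the 6-base barcode if the adapter layout is found, else None."""
    match = BARCODE_RE.search(read.upper())
    return match.group(1) if match else None

# Toy read: adapter with barcode ACGTGA followed by a short insert.
read = "CTGAACCTCCTAGTGTAACG" + "ACGTGA" + "GCTCTTCCGATCT" + "TTGCA"
print(extract_barcode(read), N_POSSIBLE_BARCODES)
```

Reads that share a barcode can then be treated as copies of one original molecule, which is the basis of the per-barcode counts used in the information analysis of the examples.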
More preferably, the method of the invention for constructing a DNA library for sequencing further comprises step B, carried out before step C: adding an A to the 3' ends of blunt-ended DNA fragments to obtain the 3'-A-tailed DNA fragments.
More preferably, the method of the invention for constructing a DNA library for sequencing further comprises step A, carried out before step B: performing end repair on cell-free DNA from a biological sample to obtain the blunt-ended DNA fragments. The biological sample can be, for example, a sample in which the DNA content is low or the DNA is severely degraded, such as maternal plasma, a paraffin-embedded sample, saliva, a buccal swab or peripheral blood plasma; it can also be a sample available in large quantity. More preferably, the method of the invention can further comprise, before step A, a step of isolating and purifying cell-free DNA from the biological sample.
Preferably, the method of the invention for constructing a DNA library for sequencing further comprises a purification step between step A and step B, between step B and step C, between step C and step D, between step D and step E, between step E and step F, between step F and step G, between step G and step H, and/or after the steps.
Steps A, B and C, the step of isolating and purifying cell-free DNA from a biological sample, and the purification steps can all be carried out by methods conventional in the art.
In another aspect, the invention provides a sequencing method (the sequencing method of the invention) in which the object sequenced is a DNA library for sequencing constructed by the library construction method of the invention.
Apart from using the DNA library for sequencing of the invention as its object, the sequencing method of the invention can be carried out by methods conventional in the art.
Examples
The invention is described more specifically by the following examples. It should be understood that the examples described here serve to explain the invention and are not intended to limit it.
Sample selection
First, a folate-metabolism-related SNP (rs1801394) in fetal DNA and maternal blood DNA was genotyped by first-generation sequencing following the workflow of Fig. 1. Plasma from two samples was selected for the noninvasive experiment. The maternal genotype is AG for both samples, while the fetal genotypes are AA (sample 1) and GG (sample 2), respectively.
Noninvasive single-gene detection
1.1 Workflow for single-stranded circularization capture
(1) Extraction of cfDNA (plasma cell-free DNA): 1 mL of maternal plasma was extracted with the MagMAX™ Cell-Free DNA Isolation Kit, yielding about 10 ng.
(2) First amplification: the cfDNA was processed by the workflow "end repair, A-tailing, barcode adapter ligation, PCR amplification".
End repair:
Reaction conditions: 20 °C, 30 minutes; purification with 1.8× magnetic beads, two washes with 70% ethanol, elution of the DNA in 20 µL ddH2O with the beads retained, to give the purified end-repaired DNA product.
A-tailing:
Reaction conditions: 37 °C, 30 minutes; the product of this step was not purified, giving the A-tailed DNA product.
Barcode adapter ligation:
barcode-Adapter 1 sequence:
5'-pGATCGGAAGAGCATGACAGT-3'
barcode-Adapter 2 sequence:
5'-CTGAACCTCCTAGTGTAACGNNNNNNGCTCTTCCGATCT-3'
The two oligonucleotides were annealed by a routine annealing program to form the Y-shaped adapter barcode-Adapter.
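Why these two oligonucleotides form a Y shape can be checked from the sequences themselves. The sketch below measures the base-paired stem between the 5' end of barcode-Adapter 1 and the 3' end of barcode-Adapter 2. Treating the 3'-terminal T of barcode-Adapter 2 as the overhang that pairs with the A-tailed insert, and treating the random N positions as unpaired, are interpretations made for illustration.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def revcomp(seq: str) -> str:
    """Reverse complement; N stays N."""
    return seq.translate(_COMPLEMENT)[::-1]

def stem_length(adapter_1: str, adapter_2: str) -> int:
    """Number of consecutive base pairs between the 5' end of adapter_1 and the
    3' end of adapter_2, after dropping the single 3' T of adapter_2."""
    partner = revcomp(adapter_2[:-1])   # adapter_2 read 3'->5' as its complement
    length = 0
    for a, b in zip(adapter_1, partner):
        if a != b or a == "N":
            break
        length += 1
    return length

adapter_1 = "GATCGGAAGAGCATGACAGT"
adapter_2 = "CTGAACCTCCTAGTGTAACGNNNNNNGCTCTTCCGATCT"
stem = stem_length(adapter_1, adapter_2)
print(stem, adapter_1[stem:], adapter_2[: len(adapter_2) - 1 - stem])
```

The two strands pair only over a short stem next to the ligation end; the remaining bases of each oligonucleotide, including the barcode, form the two unpaired arms of the Y.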
Reaction conditions: 20 °C, 15 minutes; purification with 1.8× magnetic bead buffer, two washes with 70% ethanol, elution of the DNA in 18 µL ddH2O with the beads retained, to give the barcode-Adapter-ligated product.
PCR amplification:
Primer-1 sequence: 5'-pACTGTCATGCTCTTCCGATC-3'
Primer-2 sequence: 5'-pCTGAACCTCCTAGTGTAACG-3'
Reaction program: 94 °C, 2 minutes; (94 °C 15 seconds, 62 °C 30 seconds, 72 °C 30 seconds) × 13 cycles; 72 °C, 10 minutes; hold at 4 °C. The PCR product was purified with 0.9× magnetic bead buffer, washed twice with 70% ethanol, and the DNA eluted in 30 µL ddH2O to give the amplified product.
(3) Splint circularization: the barcoded amplification product is circularized. Because every molecule carrying the given adapters has the same end structure after amplification, a single auxiliary DNA molecule can be used to bind both ends, and single-stranded circularization is then promoted by the action of the ligase.
The circularization system is as follows:
Primer-3 sequence:
5'-GATCGGAAGAGCATGACAGTCTGAACCTCCTAGTGTAACG-3'
Reaction program: 95 °C for 10 minutes, then 16 °C for 16 to 30 hours; the ligase was added once the temperature had dropped to 16 °C after denaturation. The product was purified with 1.8× magnetic beads, washed twice with 70% ethanol and eluted in 43 µL ddH2O to give the single-stranded circularized product.
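The relationship between primer-3 and the ends of the amplified strand can be verified from the listed sequences. The sketch below assumes that the circularized strand begins with primer-1 at its 5' end and ends with the reverse complement of primer-2 at its 3' end; this is inferred from the sequences rather than stated in the specification, and the function name is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def splint_for(primer_5p: str, primer_3p: str) -> str:
    """Splint that bridges the two ends of the strand
    5'-primer_5p ... revcomp(primer_3p)-3' when they are brought together."""
    # Junction on the strand after circularization: 3' end followed by 5' end.
    junction = revcomp(primer_3p) + primer_5p
    return revcomp(junction)

primer_1 = "ACTGTCATGCTCTTCCGATC"
primer_2 = "CTGAACCTCCTAGTGTAACG"
primer_3 = "GATCGGAAGAGCATGACAGTCTGAACCTCCTAGTGTAACG"

print(splint_for(primer_1, primer_2) == primer_3)
```

Under this assumption primer-3 is exactly complementary to the junction formed when the two ends of the single strand meet, which is what allows it to hold them in place for ligation.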
DNA digestion:
Reaction conditions: 37 °C, 60 minutes; 80 °C, 20 minutes. The product was purified with 1.8× magnetic beads, washed twice with 70% ethanol and eluted in 20 µL ddH2O to give the digested product.
(4) Second amplification: the locus of interest is captured with back-to-back primers. The reaction system is as follows:
Primer-4 sequence:
5'-ACACGACGCTCTTCCGATCTAAGGCCATCGCAGAAGAAAT-3'
Primer-5 sequence:
5'-AGACGTGTGCTCTTCCGATCTATGGCCTTTGCCTGTCCCT-3'
Reaction program: 94 °C, 2 minutes; (94 °C 15 seconds, 55 °C 30 seconds, 72 °C 30 seconds) × 25 cycles; 72 °C, 10 minutes; hold at 4 °C. The PCR product was purified with 0.9× magnetic beads, washed twice with 70% ethanol, and the DNA eluted in 20 µL ddH2O to give the back-to-back product.
Product concentrations quantified with Qubit HS were, respectively:
(5) Third amplification
This amplification adds the adapter sequences of the Illumina sequencing platform to the captured product so that it can be loaded on the sequencer.
Ann universal primer:
5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3'
Ann Index:
5'-CAAGCAGAAGACGGCATACGAGATNNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3'
The Ann Index sequence differs between samples: NNNNNNN is replaced by one of the following sequences and the remainder is unchanged:
Single-stranded circularization, sample 1: index 67, sequence GTTGCAAC, replaces NNNNNNN;
Single-stranded circularization, sample 2: index 68, sequence CTCAATTA, replaces NNNNNNN;
Double-stranded circularization, sample 1: index 71, sequence CAAGTCTA, replaces NNNNNNN;
Double-stranded circularization, sample 2: index 72, sequence ACAACCTA, replaces NNNNNNN.
Reaction program: 94 °C, 2 minutes; (94 °C 15 seconds, 62 °C 30 seconds, 72 °C 30 seconds) × 5 cycles; 72 °C, 10 minutes; hold at 4 °C. The PCR product was purified with 0.9× magnetic bead buffer, washed twice with 70% ethanol, and the DNA eluted in 30 µL EB solution. The indexes used here were those with in-house numbers 67 and 68.
Product concentrations quantified with Qubit HS were, respectively:
(6) Sequencing
The libraries were checked on an Agilent 2100 and by qPCR and then sequenced: PE125, 0.5 G raw data per library.
1.2 Workflow for restriction-digestion double-stranded circularization capture
Except that the primers used in the first amplification are primer-1.1 and primer-1.2 and that the circularization and digestion steps differ, the remaining steps are identical to those of the single-stranded circularization workflow. That is, only step (3) of the experimental procedure differs; it is denoted (3') here.
Primer-1.1: 5'-CCATCGATACTGTCATGCTCTTCCGATC-3'
Primer-1.2: 5'-CCATCGATCTGAACCTCCTAGTGTAACG-3'
(3') Double-stranded circularization:
ClaI digestion system:
Reaction conditions: 37 °C, 60 minutes; 80 °C, 20 minutes. The product was purified with 1.8× magnetic beads, washed twice with 70% ethanol, and the DNA eluted in 30 µL ddH2O to give the restriction-digested product.
Product concentrations quantified with Qubit HS were, respectively:
Ligation:
Reaction conditions: 16 °C overnight (16 to 30 hours). The product was purified with 1.8× magnetic beads, washed twice with 70% ethanol, and the DNA eluted in 30 µL ddH2O to give the circularized product.
Product concentrations quantified with Qubit HS were, respectively:
Exonuclease digestion:
Reaction conditions: 37 °C, 30 minutes; 75 °C, 20 minutes; then 2 µL of 0.5 M EDTA was added. The product was purified with 1.8× magnetic beads, washed twice with 70% ethanol, and the DNA eluted in 20 µL EB to give the digestion product.
Product concentrations quantified with Qubit HS were, respectively:
As in workflow 1.1, the circularized product became the final library after the second and third rounds of amplification (the indexes used were those with in-house numbers 71 and 72).
Concentrations of the back-to-back primer capture products (volume 20 µL) quantified with Qubit HS were, respectively:
Product concentrations quantified with Qubit HS were, respectively:
As can be seen from the above workflows, double-stranded circularization offers more quality-control points, and under the same conditions, judging from the amount of captured product, the capture performance of double-stranded circularization is better than that of single-stranded circularization.
1.3 Information analysis
The four libraries were analysed after PE sequencing; the results are shown in the tables below. In the tables:
Target number: the number of all reads containing primer-4;
All number: the number of all reads obtained by sequencing;
Number of reads with tag: how many distinct barcodes are carried in total by the reads counted in the first row;
Number of reads with the 5 bp near the tag correct: among the above, how many distinct barcodes belong to reads in which all 5 bases of the captured region are correct;
1× genotype: how many distinct barcodes were captured for each of the two genotypes;
200× genotype: how many distinct barcodes have more than 200 reads (only counts above this threshold are considered credible rather than chance occurrences), and which genotype each of them represents.
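The per-barcode counts defined above can be sketched as follows. The input format (one `(barcode, allele)` pair per target read), the majority rule used to assign one allele to each barcode, and the toy data are assumptions made for illustration; the specification does not state how conflicting reads within a barcode are resolved.

```python
from collections import Counter, defaultdict

def summarize_barcodes(reads, min_reads=200):
    """Summarize (barcode, allele) pairs.

    Returns (number of distinct barcodes,
             barcodes per allele at 1x,
             barcodes per allele with more than `min_reads` reads).
    """
    per_barcode = defaultdict(Counter)
    for barcode, allele in reads:
        per_barcode[barcode][allele] += 1

    one_x = Counter()
    deep = Counter()
    for alleles in per_barcode.values():
        # Assumption: a barcode is assigned its most frequent allele.
        allele, _ = alleles.most_common(1)[0]
        one_x[allele] += 1
        if sum(alleles.values()) > min_reads:
            deep[allele] += 1
    return len(per_barcode), one_x, deep

# Toy data: four barcodes with 250, 3, 201 and 200 reads.
reads = (
    [("AAAAAA", "A")] * 250
    + [("CCCCCC", "G")] * 3
    + [("GGGGGG", "A")] * 201
    + [("TTTTTT", "G")] * 200
)
n_barcodes, one_x, deep = summarize_barcodes(reads)
print(n_barcodes, dict(one_x), dict(deep))
```

Counting barcodes rather than reads means each original plasma molecule contributes once, so amplification bias between molecules does not distort the allele ratio.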
The two libraries with indexes 67 and 71 come from the same plasma sample and were prepared by single-stranded and double-stranded circularization, respectively; the libraries with indexes 68 and 72 come from the other sample and were likewise prepared by single-stranded and double-stranded circularization, respectively.
The numbers of usable reads after final filtering are as follows:
The genotype counts obtained for the site by the two experimental workflows were then calculated:
From these, the genotypes of the mother and the fetus at the site were analysed:
It can be seen that the SNP polymorphism can be detected in the plasma samples. For sample 2 the results of the two workflows agree, but for sample 1 they do not. According to the first-generation sequencing result, the fetal genotype of sample 1 should be AA rather than GG, which shows that double-stranded circularization reflects differences in allele frequency in plasma more accurately and is better suited to detecting the genetic status of the fetus.
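The reasoning behind inferring the fetal genotype from allele frequencies in plasma can be made explicit with a simple mixture model. In the sketch below the fetal fraction of 10% is a hypothetical value, and the model (plasma cfDNA as a mixture of maternal and fetal DNA contributing alleles in proportion to genotype dosage) is a common simplification, not a calculation given in the specification.

```python
def expected_a_fraction(maternal: str, fetal: str, fetal_fraction: float) -> float:
    """Expected fraction of A alleles in plasma cfDNA at a biallelic A/G site."""
    if not 0.0 <= fetal_fraction <= 1.0:
        raise ValueError("fetal_fraction must lie between 0 and 1")

    def a_dose(genotype: str) -> float:
        # Proportion of A among the two alleles of a genotype such as "AG".
        return genotype.count("A") / 2

    return (1 - fetal_fraction) * a_dose(maternal) + fetal_fraction * a_dose(fetal)

f = 0.10  # hypothetical fetal fraction
for fetal_genotype in ("AA", "AG", "GG"):
    print(fetal_genotype, round(expected_a_fraction("AG", fetal_genotype, f), 3))
```

For a heterozygous mother the expected A fraction shifts above one half when the fetus is AA and below one half when the fetus is GG, by half the fetal fraction in each case; the library must therefore preserve allele ratios accurately for such a small shift to be detectable.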
It should also be noted that, provided it is practicable and does not clearly depart from the gist of the invention, any technical feature or combination of technical features described in this specification as a component of one technical solution can equally be applied to other technical solutions; likewise, provided it is practicable and does not clearly depart from the gist of the invention, technical features described as components of different technical solutions can be combined in any manner to form further technical solutions. The invention also encompasses the technical solutions obtained by such combination, and these technical solutions are regarded as being described in this specification.
The above description shows and describes preferred embodiments of the invention. As stated above, it should be understood that the invention is not limited to the forms disclosed here; these should not be regarded as excluding other embodiments, and the invention can be used in various other combinations, modifications and environments and can be altered within the scope of the inventive concept set out here through the above teachings or the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the invention all fall within the scope of protection of the appended claims.
Industrial applicability
According to the invention, a method for constructing a DNA library is provided in which circularization is more efficient and the outcome of circularization is easy to monitor.