WO2020232635A1 - Method, system and application for constructing a sequencing library based on a target region of methylated DNA - Google Patents

Method, system and application for constructing a sequencing library based on a target region of methylated DNA

Info

Publication number
WO2020232635A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
universal
primer
sequencing
dna sample
Prior art date
Application number
PCT/CN2019/087824
Other languages
English (en)
French (fr)
Inventor
杨林
张艳艳
王其伟
卢佳
陈芳
蒋慧
Original Assignee
深圳华大智造科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳华大智造科技有限公司
Priority to EP19929647.6A (published as EP3950956A4, en)
Priority to CN201980092935.8A (published as CN113811618B, zh)
Priority to JP2022502317A (published as JP7203276B2, ja)
Priority to PCT/CN2019/087824 (published as WO2020232635A1, zh)
Publication of WO2020232635A1
Priority to US17/493,991 (published as US20220056519A1, en)

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1093: General methods of preparing gene libraries, not provided for in other subgroups

Definitions

  • the invention relates to the field of gene sequencing, in particular to a method, system and application for constructing a sequencing library based on a target region of methylated DNA.
  • DNA methylation is an epigenetic regulatory modification that participates in the regulation of protein synthesis without changing the base sequence.
  • DNA methylation is a remarkable chemical modification. Care from relatives, aging of the body, smoking, alcoholism and even obesity are all faithfully recorded on the genome by methylation. The genome is like a diary, and methylation is the text in which the experiences of the human body are recorded.
  • DNA methylation is an important epigenetic mark. Obtaining methylation-level data for all C sites across the whole genome is of great significance for studying the spatio-temporal specificity of epigenetics.
  • Mapping whole-genome DNA methylation levels and analyzing high-precision methylation modification patterns of specific species will be a milestone in epigenomics research.
  • Whole-genome methylation sequencing (WGBS, Whole Genome Bisulfite Sequencing)
  • Bisulfite treatment renders the DNA single-stranded and causes severe damage to it.
  • After bisulfite treatment, unmethylated C bases are converted to U bases, so the GC content of the entire genome changes drastically, which introduces strong bias in subsequent amplification.
  • Library construction requires microgram-level starting DNA, and there is no very effective library construction method for trace DNA.
  • Whole-genome methylation sequencing is complicated and too expensive. Targeted methylation sequencing technology can effectively solve these problems.
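The effect of bisulfite conversion on base composition can be illustrated in silico. The following is a minimal sketch, assuming a single top-strand sequence and a known set of methylated positions; the function names, the example template and the methylated position are illustrative assumptions, not taken from the patent. Uracil is written as T, which is how it is read after PCR.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the sequence after in-silico bisulfite conversion.

    Unmethylated C becomes T (U is read as T after amplification);
    C at a position listed in methylated_positions is left unchanged.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Hypothetical template; only the C at index 1 is assumed to be methylated.
template = "ACGTCCGATCGGCA"
converted = bisulfite_convert(template, methylated_positions={1})

print(converted)
print(round(gc_content(template), 3), round(gc_content(converted), 3))
```

Because most genomic cytosines are unmethylated, the converted strand is strongly depleted of C, which is the source of the GC shift and amplification bias described above.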
  • Targeted methylation sequencing technology can be divided into probe-capture-based and multiplex-PCR-based sequencing technologies.
  • Probe capture requires a high starting amount, so trace samples such as plasma cell-free DNA are difficult to capture. Moreover, probe design and the capture workflow are complicated, the detection cycle is long, and the cost is high. Multiplex PCR based on bisulfite-treated DNA requires a low starting amount, is simple to operate and is highly sensitive, but this technology needs further improvement.
  • an object of the present invention is to provide a method, system and application for constructing a sequencing library based on a target region of methylated DNA.
  • the method provided by the present invention builds a library for the target region of the methylated DNA sample, and during library construction only one strand of the methylated DNA sample is amplified.
  • the target product is obtained by amplification with specific primers and universal primers, which can effectively solve the problem of primer dimers.
  • multiple specific primers are used to amplify the target region of the same methylated DNA template to ensure the specificity of amplification.
  • the inventors of the present invention noticed during their research that multiplex PCR based on bisulfite-treated DNA is simple to operate and highly sensitive but technically demanding. It has previously been reported that single-molecule BS-PCR using droplet technology can detect about nine thousand targets simultaneously, but the starting amount is relatively high, requiring 2 μg of DNA. In 2015, Lu Wen and colleagues used the characteristic sequences of CpG islands as primer binding sites to develop the PCR-based MCTA-seq, which can simultaneously detect the methylation signals of a large number of CpG island regions. This technology is extremely sensitive and can detect as little as 7.5 pg of gDNA, but MCTA-seq is more like a fixed CGI panel, and as a targeted sequencing platform it is somewhat less flexible. Therefore, developing a targeted methylation technology with a low starting-amount requirement and high flexibility is the future direction of targeted methylation.
  • The difficulty of multiplex PCR on bisulfite-treated DNA is mainly due to the formation of severe primer dimers during PCR.
  • after bisulfite treatment of the DNA, unmethylated cytosine is converted to uracil, and most of the cytosine in the genome is unmethylated, so most of the sequence changes from the original four-base composition A/T/C/G to a three-base composition A/T/G.
  • one primer is designed for the positive strand and the other is designed for the complementary strand. Therefore, one of the primers used for PCR is an ATG-rich sequence, and the other is an ATC-rich sequence.
  • These "naturally complementary" primer sequences can easily form primer dimers. As the number of primer pairs increases, the formation of primer dimers also increases sharply. In multiplex PCR, too many primers are consumed by primer-dimer formation, causing the multiplex PCR to fail. Therefore, solving the problem of bisulfite multiplex PCR means solving the problem that primers easily form primer dimers.
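The complementarity argument above can be checked with a small script. This is a rough sketch, assuming a deliberately simple dimer criterion (the longest 3'-terminal stretch of one primer that pairs perfectly, in antiparallel orientation, somewhere on another primer); the primer sequences and function names are hypothetical, and real dimer prediction would use thermodynamic models instead.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_3prime_anneal(primer_a, primer_b):
    """Length of the longest 3'-terminal stretch of primer_a that can pair
    perfectly (antiparallel) with some stretch of primer_b."""
    a, b = primer_a.upper(), primer_b.upper()
    best = 0
    for k in range(1, len(a) + 1):
        # If a k-base 3' suffix cannot pair, no longer suffix can either.
        if reverse_complement(a[-k:]) in b:
            best = k
        else:
            break
    return best


# Hypothetical primers: two ATG-only primers and one ATC-only primer.
atg_primer_1 = "GTTAGGATTGAGGTTG"
atg_primer_2 = "TGGATTAGGTGATTGG"
atc_primer = "CAACCTCAATCCTAAC"

print(longest_3prime_anneal(atg_primer_1, atc_primer))
print(longest_3prime_anneal(atg_primer_1, atg_primer_2))
```

In a pool that contains only ATG-type primers, G never finds a C partner on another primer, so any pairing between them is limited to A/T stretches; mixing ATG-rich and ATC-rich primers removes that protection.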
  • the present invention provides the following technical solutions:
  • the present invention provides a method for constructing a sequencing library based on a target region of methylated DNA, including: (1) based on the methylated DNA sample, obtaining a DNA sample that has a universal sequence connected to at least one end and has been treated with bisulfite, so as to obtain a transformed DNA sample with the universal sequence; (2) using a first specific primer and a first universal primer to perform a first amplification on the transformed DNA sample with the universal sequence to obtain a first amplification product; wherein the first specific primer is located upstream of the target region, the first universal primer matches or overlaps at least part of the universal sequence, and the universal sequence is located downstream of the target region; (3) using a second specific primer, a second universal primer and a tag primer to perform a second amplification on the first amplification product to obtain a second amplification product, so as to obtain a sequencing library; wherein the second specific primer is located downstream of the first specific primer and upstream of the target region, and the second universal primer and at least a partial sequence of the second
  • the method for constructing a sequencing library based on a methylated DNA target region designs specific primers for one strand of a methylated DNA template, so as to enrich the target region and build a library.
  • the first specific primer can match one strand of the DNA sample, and the first universal primer can match the universal sequence, thereby achieving specific amplification.
  • the designed first specific primers are sequences rich in bases A, T and G or in bases A, T and C, so dimers will not form between them.
  • the first universal primer contains four bases A, T, C, and G, and will not form a primer dimer with the first specific primer, so the formation of primer dimer can be completely avoided.
  • a second specific primer is designed downstream of the first specific primer and upstream of the target region, or downstream of the target region; the second specific primer, the second universal primer and the tag primer are then used to perform a second amplification on the first amplification product to obtain the second amplification product and thus the required sequencing library.
  • the above-mentioned method for constructing a sequencing library based on a target region of methylated DNA may further include the following technical features:
  • the 5' end of the second specific primer overlaps at least part of the sequence of the 3' end of the second universal primer, and the 3' end of the tag primer overlaps a partial sequence of the 5' end of the first universal primer.
  • the 5' end sequence of the second specific primer can overlap at least part of the 3' end sequence of the second universal primer, and its 3' end sequence can match the template region on the DNA template downstream of the first specific primer and upstream of the target region, so specific amplification of the target region can be achieved based on the first amplification product.
  • the 5' end of the second specific primer overlaps at least part of the sequence of the 3' end of the tag primer, and the 3' end of the second universal primer overlaps a partial sequence of the 5' end of the first specific primer.
  • the 5' end sequence of the second specific primer overlaps at least part of the 3' end sequence of the tag primer, and its 3' end sequence can match the template region located downstream of the target region on the DNA template, thereby achieving specific amplification of the target region.
  • the tag primers contain tag sequences. These tag sequences may be those commonly used by sequencing platforms to distinguish different samples, so as to facilitate simultaneous sequencing of multiple pooled samples. According to embodiments, the length of the tag sequence can be 8-12 bp, for example 10 bp or 8 bp.
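How a tag sequence separates pooled samples can be sketched as follows. This is an illustrative example only: it assumes, for simplicity, that the tag occupies the first `tag_len` bases of each read (on real platforms the tag is usually read separately), and the tag sequences, sample names and reads are made up.

```python
from collections import defaultdict


def demultiplex(reads, tag_to_sample, tag_len=10):
    """Group reads by sample according to their leading tag sequence.

    reads: iterable of read strings whose first tag_len bases are the tag.
    tag_to_sample: dict mapping tag sequence -> sample name.
    Reads with an unknown tag are collected under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        bins[tag_to_sample.get(tag, "unassigned")].append(insert)
    return dict(bins)


# Hypothetical 10 bp tags and reads.
tags = {"ACGTACGTAC": "sample_A", "TTGCATGCAA": "sample_B"}
reads = ["ACGTACGTACTTGGA", "TTGCATGCAAGGCCA", "GGGGGGGGGGAAAAA"]

result = demultiplex(reads, tags)
print(result)
```

Exact matching is used here; a production pipeline would typically tolerate one mismatch per tag, which is one reason tags of 8-12 bp are chosen with sufficient pairwise distance.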
  • step (1) further comprises: (1-a) processing the methylated DNA sample with bisulfite to obtain a transformed DNA sample; (1-b) using DNA polymerase and a random primer carrying a first sequencing sequence to replicate the transformed DNA sample to obtain the transformed DNA sample with a universal sequence, wherein the 3' end of the random primer is a random base sequence and the 5' end of the random primer is a universal sequence.
  • the random base sequence is 6-12 bases long, and each random base is A, T, C or G.
  • the random base sequence is 6-12 bases long, and each random base is A, T or C.
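The structure of such a random primer (universal sequence at the 5' end, 6-12 random bases at the 3' end) can be sketched as below. The function name, the example universal sequence and the use of a seeded generator are assumptions for illustration; "N" and "H" follow the IUPAC degenerate-base codes (N = A/T/C/G, H = A/T/C). A plausible reason for the H variant is that the converted template contains few C bases, so G in the random tail rarely finds a partner.

```python
import random

# IUPAC degenerate-base alphabets used for the random 3' tail.
ALPHABETS = {"N": "ATCG", "H": "ATC"}


def make_random_primer(universal_seq, n_random=8, kind="N", rng=None):
    """Build one concrete primer: universal sequence (5') + random tail (3').

    n_random must be within the 6-12 range stated in the text.
    kind selects the alphabet: "N" (A/T/C/G) or "H" (A/T/C).
    """
    if not 6 <= n_random <= 12:
        raise ValueError("random tail must be 6-12 bases long")
    rng = rng or random.Random()
    tail = "".join(rng.choice(ALPHABETS[kind]) for _ in range(n_random))
    return universal_seq + tail


# Hypothetical universal sequence; seeded generator for reproducibility.
universal = "AAGTCGGA"
primer = make_random_primer(universal, n_random=8, kind="H", rng=random.Random(0))
print(primer)
```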
  • the universal sequence is a sequencing linker sequence or a fixed sequence.
  • the cytosine in the sequencing linker sequence or the fixed sequence is a methylated modified cytosine.
  • step (1) further includes: (1-1) performing end repair and A-tailing on the methylated DNA sample to obtain a repaired DNA sample; (1-2) connecting a universal sequence to at least one end of the repaired DNA sample to obtain a DNA sample with a universal sequence; (1-3) using bisulfite to process the DNA sample with a universal sequence to obtain the transformed DNA sample with universal sequence.
  • the universal sequence is selected from at least one of the following: sequencing adapter sequence or modified sequencing adapter sequence.
  • the modified sequencing linker sequence is a sequencing linker sequence in which the cytosines of one strand are methylated, the cytosines of one strand are not methylated, and the 3'-end base of one strand carries a non-hydroxyl modification.
  • the random sequence is a molecular tag sequence.
  • the number of original DNA templates can be counted through a large number of different molecular tag sequences; by subsequently tallying the molecular tag sequences, the original templates can be traced and errors generated during sequencing or PCR can be corrected, thereby enabling precise detection and quantitative study of DNA templates.
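Counting templates and correcting errors with molecular tags can be sketched as follows. This is a simplified illustration, assuming reads arrive as (molecular tag, sequence) pairs and that reads sharing a tag derive from one original template; the function name and example data are hypothetical, and the consensus is a plain per-position majority vote.

```python
from collections import Counter, defaultdict


def count_templates(reads):
    """Group reads by molecular tag and build a majority-vote consensus.

    reads: iterable of (molecular_tag, sequence) pairs.
    Returns (number_of_distinct_tags, {tag: consensus_sequence}).
    """
    groups = defaultdict(list)
    for tag, seq in reads:
        groups[tag].append(seq)

    consensus = {}
    for tag, seqs in groups.items():
        length = min(len(s) for s in seqs)
        # At each position keep the most frequent base among the duplicates.
        consensus[tag] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return len(groups), consensus


# Hypothetical reads: three PCR duplicates of one template (one carrying an
# error at the third base) and a single read from a second template.
reads = [
    ("AACGT", "TTGA"),
    ("AACGT", "TTGA"),
    ("AACGT", "TTCA"),
    ("GGTAC", "TTGA"),
]
n_templates, consensus = count_templates(reads)
print(n_templates, consensus)
```

Four reads collapse to two original templates, and the isolated G-to-C error in the third read is outvoted by its duplicates.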
  • step (1) further includes: 1) using a transposase to fragment and transpose the DNA sample, so as to obtain a DNA sample with a universal sequence, the universal sequence being embedded in the transposase; 2) using bisulfite to process the DNA sample with the universal sequence to obtain the transformed DNA sample with the universal sequence.
  • the universal sequence is a transposase effector sequence or a transposase effector sequence with a sequencing linker, preferably a transposase effector sequence; the transposase can be Tn5, MuA or another transposase with a similar function, and is preferably Tn5 transposase.
  • the cytosine in the transposase effector sequence is methylated modified cytosine.
  • the conversion of unmethylated cytosine to uracil is not 100% efficient, so a given cytosine may or may not be converted, which makes subsequent amplification with universal primers less predictable.
  • methylated cytosine will not be converted to uracil under the subsequent bisulfite treatment, so the sequence information remains unchanged. Therefore, for more accurate sequencing, the cytosine in the transposase effector sequence can be methylated. Of course, the cytosine may also be left unmethylated.
  • the methylated DNA sample is genomic DNA, fragmented genomic DNA, or free DNA.
  • the present invention provides a system for constructing a sequencing library based on a target region of methylated DNA, comprising: a universal transformation module, which, based on the methylated DNA sample, produces a DNA sample that has a universal sequence connected to at least one end and has been bisulfite-treated, so as to obtain a transformed DNA sample with a universal sequence; a first amplification module, which is connected to the universal transformation module and uses the first specific primer and the first universal primer to perform a first amplification on the transformed DNA sample with the universal sequence, so as to obtain a first amplification product; wherein the first specific primer is located upstream of the target region, and the first universal primer matches or overlaps at least partially with the universal sequence; and a second amplification module, which is connected to the first amplification module and uses a second specific primer, a second universal primer and a
  • the second specific primer is located downstream of the first specific primer and upstream of the target region, and the second universal primer and the second specific primer overlap in at least a partial sequence
  • the tag primer contains a tag sequence
  • the tag primer overlaps with the partial sequence of the first universal primer
  • the second specific primer is located downstream of the target region, and at least a partial sequence of the second universal primer overlaps with the first specific primer
  • the tag primer contains a tag sequence
  • the tag primer overlaps with a partial sequence of the second specific primer.
  • the aforementioned system for constructing a sequencing library based on a target region of methylated DNA may further include the following technical features:
  • the 5' end of the second specific primer in the second amplification module overlaps at least part of the sequence of the 3' end of the second universal primer, and the 3' end of the tag primer overlaps a partial sequence of the 5' end of the first universal primer.
  • the 5' end of the second specific primer in the second amplification module overlaps at least part of the sequence of the 3' end of the tag primer, and the 3' end of the second universal primer overlaps a partial sequence of the 5' end of the first specific primer.
  • the length of the tag sequence is 8-12 bp.
  • the universal transformation module further includes: a transformation unit that uses bisulfite to process the methylated DNA sample to obtain a transformed DNA sample;
  • the amplification unit is connected to the transformation unit, and the amplification unit uses DNA polymerase and the first sequencing primer to replicate the transformed DNA sample so as to obtain the transformed DNA sample with a universal sequence.
  • the 3'end of the first sequencing primer is a random base
  • the 5'end of the first sequencing primer is a universal sequence.
  • the random base sequence is 6-12 bases long, and each random base is A, T, C or G.
  • the random base sequence in the above system is 6-12 bases long, and each random base is A, T or C.
  • the universal sequence is a sequencing linker sequence or a fixed sequence.
  • the cytosine in the sequencing linker sequence or the fixed sequence is a methylated modified cytosine.
  • the universal transformation module further includes: a repair unit for performing end repair and A-tailing on the methylated DNA sample to obtain a repaired DNA sample; a connection unit, which is connected to the repair unit and is used to connect a universal sequence to at least one end of the repaired DNA sample, so as to obtain a DNA sample with a universal sequence; and a transformation unit, which is connected to the connection unit and uses bisulfite to process the DNA sample with the universal sequence, so as to obtain the transformed DNA sample with the universal sequence.
  • the universal sequence is selected from at least one of the following: a sequencing adapter sequence or a modified sequencing adapter sequence.
  • the modified sequencing linker sequence is: a sequencing adapter sequence in which the cytosines of one strand are methylated, the cytosines of one strand are not methylated, and the 3'-end base of one strand is modified by a non-hydroxyl group; a sequencing adapter sequence with a fixed sequence and a random sequence; or a sequencing adapter sequence with a fixed sequence and a random sequence in which the 3'-end base of one strand is modified by a non-hydroxyl group.
  • the random sequence is a molecular tag sequence.
  • the number of original DNA templates can be counted through a large number of different molecular tag sequences; by subsequently tallying the molecular tag sequences, the original templates can be traced and errors generated during sequencing or PCR can be corrected, thereby enabling precise detection and quantitative study of DNA templates.
  • the universal transformation module further includes: a transposition unit that uses a transposase to transpose the DNA sample to obtain a DNA sample with a universal sequence, the universal sequence being embedded in the transposase; and a transformation unit, which is connected to the transposition unit and uses bisulfite to process the DNA sample with the universal sequence, so as to obtain the transformed DNA sample with universal sequence.
  • the universal sequence is a transposase effector sequence or a transposase effector sequence with a sequencing linker, preferably a transposase effector sequence.
  • the cytosine in the transposase effector sequence is a methylated modified cytosine.
  • the methylated DNA sample is genomic DNA, fragmented genomic DNA or free DNA.
  • the present invention provides a method for sequencing a methylated DNA sample, including:
  • a sequencing library is constructed according to the method described in any embodiment of the first aspect of the present invention or using the system described in any embodiment of the second aspect of the present invention; high-throughput sequencing is performed on the sequencing library to obtain sequencing results.
  • a sequencing platform is used to perform high-throughput sequencing on the sequencing library, and the sequencing platform is selected from at least one of MGISEQ, Illumina, and Proton.
  • the present invention provides a method for determining the methylation status of a methylated DNA sample, including:
  • a sequencing library is constructed according to the method described in any embodiment of the first aspect of the present invention or using the system described in any embodiment of the second aspect of the present invention; high-throughput sequencing is performed on the sequencing library to obtain a sequencing result; the sequencing result is compared with a reference genome to determine the methylation status of the methylated DNA sample.
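The comparison with a reference can be sketched for the simplest case. The example below is a toy illustration, assuming gap-free reads of the same length as the reference, already aligned at offset 0 and derived from the top strand; the function name and sequences are hypothetical, and real analyses rely on a dedicated bisulfite-aware aligner. At each reference C, a read showing C counts as methylated and a read showing T counts as unmethylated (converted).

```python
def call_methylation(reference, aligned_reads):
    """Per-cytosine methylation level from top-strand bisulfite reads.

    reference: reference sequence (unconverted).
    aligned_reads: reads of the same length, aligned at offset 0.
    Returns {position: methylated / (methylated + unmethylated)} for every
    reference C covered by at least one informative read.
    """
    levels = {}
    for pos, ref_base in enumerate(reference.upper()):
        if ref_base != "C":
            continue
        methylated = sum(1 for r in aligned_reads if r[pos] == "C")
        unmethylated = sum(1 for r in aligned_reads if r[pos] == "T")
        if methylated + unmethylated:
            levels[pos] = methylated / (methylated + unmethylated)
    return levels


# Hypothetical reference and four aligned reads.
reference = "ACGTCGA"
aligned = ["ACGTTGA", "ACGTTGA", "ATGTCGA", "ACGTTGA"]

levels = call_methylation(reference, aligned)
print(levels)
```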
  • the reference genome is the human genome hg19 or Yanhuang genome.
  • the present invention provides a kit comprising: a universal sequence, a tag primer, a first universal primer, a second universal primer and a conventional methylation detection reagent; wherein the tag primer contains a tag sequence, the first universal primer matches or overlaps at least part of the universal sequence, the first universal primer is SEQ ID NO: 1, and the second universal primer is SEQ ID NO: 22.
  • the conventional methylation detection reagent can be, for example, a bisulfite detection reagent or a corresponding kit.
  • the kit described above further includes the following additional technical features:
  • the tag primer is shown in SEQ ID NO: 23.
  • the kit further includes: a first specific primer and a second specific primer, wherein the first specific primer includes the sequences shown in SEQ ID NO: 1 to SEQ ID NO: 10, and the second specific primer includes the sequences shown in SEQ ID NO: 11 to SEQ ID NO: 20.
  • the kit utilizes the method described in the first aspect of the present invention to construct a sequencing library based on methylated DNA target regions.
  • Fig. 1 is a flowchart of random primer library construction according to an embodiment of the present invention.
  • Fig. 2 is a flow chart of adapter ligation library construction according to an embodiment of the present invention.
  • Fig. 3 is a flow chart of transposon library construction according to an embodiment of the present invention.
  • Figure 4 is a schematic diagram of different linker sequences provided according to an embodiment of the present invention.
  • Figure 5 is a quality inspection diagram of a sequencing library provided according to an embodiment of the present invention.
  • Fig. 6 is a result diagram of the sequencing depth of each amplicon provided according to an embodiment of the present invention.
  • Fig. 7 is a quality inspection diagram of a sequencing library provided according to an embodiment of the present invention.
  • Fig. 8 is a result diagram of the sequencing depth of each amplicon provided according to an embodiment of the present invention.
  • Fig. 9 is a schematic structural diagram of a system for constructing a sequencing library based on a target region of methylated DNA according to an embodiment of the present invention.
  • Fig. 10 is a schematic structural diagram of a universal conversion module according to an embodiment of the present invention.
  • Fig. 11 is a schematic structural diagram of a universal conversion module according to an embodiment of the present invention.
  • Fig. 12 is a schematic structural diagram of a universal conversion module according to an embodiment of the present invention.
  • "upstream" and "downstream" are defined relative to the 5'-3' direction of the nucleotide sequence. When two or more nucleic acid sequences are compared, the recognition or matching region of the nucleic acid sequence located upstream is closer to the 5' end of the template sequence than that of the nucleic acid sequence located downstream.
  • Because the lengths of different nucleic acid sequences may differ, the lengths of the regions they recognize or match may also differ.
  • When nucleic acid sequence A is located downstream of nucleic acid sequence B, this only requires that the 3'-end recognition or binding site of A is closer to the 3' end of the template sequence than the recognition or binding site of B.
  • When two nucleic acid sequences are said to "match", complementary pairing occurs between the bases of the two nucleic acid sequences. When two nucleic acid sequences are said to at least partially overlap, the two sequences share at least one stretch of identical sequence.
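The distinction between "match" (complementary pairing) and "overlap" (shared identical sequence) can be made concrete with two small helpers. This is only a sketch of one reading of these definitions: "match" is checked as exact reverse complementarity over the full length, and "overlap" as the longest 3' suffix of one primer that equals the 5' prefix of another, which is the arrangement described for the second universal primer and the second specific primer. The function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def matches(seq_a, seq_b):
    """True if seq_a pairs base-for-base with seq_b (antiparallel)."""
    return seq_a.upper() == reverse_complement(seq_b)


def overlap_length(upstream, downstream):
    """Length of the longest 3' suffix of `upstream` that is identical to
    the 5' prefix of `downstream` (0 if they share none)."""
    a, b = upstream.upper(), downstream.upper()
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


# Hypothetical sequences: the 3' end of a universal primer is repeated at
# the 5' end of a specific primer.
universal_primer = "GGCTCACAGAAC"
specific_primer = "CAGAACTTGGA"

print(matches("AAGC", "GCTT"))
print(overlap_length(universal_primer, specific_primer))
```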
  • "bisulfite" refers to a reagent or process that deaminates cytosine in DNA to uracil. Therefore, treatments described as bisulfite treatment, sulfite treatment or hydrogen sulfite treatment are all included in the protection scope of the present invention.
  • the present invention provides a unidirectional primer amplification method, in which primers are designed for only one strand of the DNA template; the designed specific primers contain only A, T, G or only A, T, C, so it is difficult for them to form primer dimers with each other.
  • specific primers are then designed on the products of the first round of amplification for a further amplification, to further ensure the specificity of amplification.
  • the sequencing library thus prepared meets the requirements of sequencing.
  • genomic DNA is transposed by a Tn5 transposon, or fragmented gDNA or cell-free DNA (cfDNA) molecules are ligated to a linker, or the DNA is copied with random primers.
  • In this way a universal sequence is introduced onto the original DNA.
  • the DNA is subjected to bisulfite treatment (BS treatment) to obtain the bisulfite-converted DNA sequence (unmethylated cytosine (C) in the original DNA is converted to uracil (U)).
  • Universal primers are designed based on the introduced universal sequence, and specific primers are designed upstream of the target region of the transformed DNA sequence. The specific primers are designed for only one strand of the DNA template, and PCR amplification with the universal primers and specific primers yields the PCR product.
  • Nested primers are designed downstream of the above specific primers, or specific primers are designed downstream of the target region; these are likewise designed for only one strand of the DNA template.
  • The PCR product of the first step is amplified in a second step with the nested primers or downstream specific primers together with universal primers, finally giving a PCR amplification product from the bisulfite-treated template (BS-PCR).
  • the present invention provides a method for constructing a sequencing library based on a target region of methylated DNA, comprising: (1) based on the methylated DNA sample, obtaining a DNA sample that has a universal sequence connected to at least one end and has been treated with bisulfite, so as to obtain a transformed DNA sample with a universal sequence; (2) using the first specific primer and the first universal primer to perform a first amplification on the transformed DNA sample with universal sequence to obtain a first amplification product; wherein the first specific primer is located upstream of the target region, the first universal primer overlaps or matches at least partially with the universal sequence, and the universal sequence is located downstream of the target region; (3) using the second specific primer, the second universal primer and the tag primer to perform a second amplification on the first amplification product to obtain the second amplification product, so as to obtain a sequencing library; wherein the second specific primer is located downstream of the first specific primer and upstream of the target region, and the second universal primer and the second specific primer are at least
  • the universal sequence is introduced by the following method: 1. for gDNA, fragmented gDNA or cfDNA, the DNA molecule is first treated with bisulfite; the template is then replicated using DNA polymerase and the first sequencing primer, that is, a primer whose 3' end carries 6-12 random N bases (degenerate bases composed of A/T/C/G) or 6-12 random H bases (degenerate bases composed of A/T/C) and whose 5' end carries a partial or complete sequencing linker sequence or a fixed sequence (in which the cytosines are preferably methylated cytosines), to obtain a bisulfite-treated DNA template with a universal sequence at the 5' end (shown in Figure 1).
  • the available sequencing adapter sequences include, but are not limited to, the sequencing adapters of the MGI platform and the sequencing adapter sequences of the Illumina and Proton platforms.
  • the available DNA polymerase can be conventional rTaq, Fusion, or Bst or phi29.
  • the universal sequence is introduced by the following method:
  • the fragmented gDNA or cfDNA is end-repaired and A-tailed, and then a specific linker sequence is added.
  • the sequence can be a partial or complete sequencing linker sequence or a modified sequencing linker sequence.
  • These modified sequencing linker sequences can be a sequencing adapter sequence in which the 3'-end base of one strand is modified by a non-hydroxyl group, a sequencing adapter sequence with a fixed sequence, or a sequencing adapter sequence with a fixed sequence in which the 3'-end base of one strand is modified by a non-hydroxyl group, such as number 1, number 2, number 3 and number 4 shown in FIG.
  • the product with the universal sequence is treated with bisulfite to obtain the transformed DNA template (Figure 2).
  • the universal sequence is introduced by the following method:
  • Tn5 transposase is loaded with an adapter sequence.
  • The adapter can be the 19 bp specific sequence on which Tn5 transposase itself acts, or a combination of that sequence with other sequences (such as a sequencing adapter sequence); the 19 bp specific sequence is preferred, and its cytosines are preferably methylated cytosines. Tn5 transposition adds the specific adapter to the gDNA.
  • After purification, the product carrying the specific adapter is treated with bisulfite to obtain the converted DNA template (shown in Figure 3).
  • PCR amplification with unidirectional specific primers yields the sequencing library; either of the following amplification schemes may be used:
  • The sequencing library is obtained by PCR amplification as follows:
  • The first specific primer and the first universal primer are used for the first-step PCR amplification of the bisulfite-treated DNA.
  • The 3'-end sequence of the first universal primer is partially or completely complementary to, or overlaps, the introduced universal sequence.
  • The 5' end of the first universal sequence is part or all of the sequencing adapter sequence (preferably part of it).
  • The binding site of the first specific primer lies upstream of the target region to be amplified and is designed against the bisulfite-treated DNA template sequence; the product is purified and then subjected to second-step PCR amplification with the second specific primer (also called the nested primer in the examples below), the second universal primer and the tag primer.
  • In the first cycle of the second-step PCR, the second specific primer and the tag primer carry out the PCR; in subsequent cycles the second specific primer, the second universal primer and the tag primer together carry out multiple rounds of PCR.
  • The 5' end of the second specific primer overlaps part or all of the sequence at the 3' end of the second universal primer.
  • The 3' end of the second specific primer is a specific sequence designed between the first specific primer and the target region; the second universal primer can be part or all of the universal sequencing adapter, and its 3' end is partly or completely identical to the 5' end of the second specific primer; the 3' end of the tag primer is partly or completely identical to the 5' end of the first universal primer, and the tag primer carries a fixed tag sequence of 8-12 bp in the middle (the tag sequence each platform uses to distinguish pooled samples) for subsequent multi-sample pooled sequencing (Figure 1A, Figure 2A, Figure 3A).
  • Alternatively, the sequencing library is obtained by PCR amplification as follows.
  • The first specific primer (also referred to as the upstream specific primer in the examples below) and the first universal primer are used for the first-step PCR amplification of the bisulfite-treated DNA.
  • The 3'-end sequence of the first universal primer is partially or completely complementary to, or overlaps, the introduced universal sequence (here the universal sequence is preferably a fixed sequence other than a sequencing adapter sequence). The specific sequence at the 3' end of the first specific primer is designed upstream of the target region to be amplified and against the bisulfite-treated DNA template sequence, and its 5' end is part or all of the sequencing adapter sequence (preferably part of it).
  • After purification, the product is subjected to second-step PCR amplification with the second specific primer (correspondingly also referred to as the downstream specific primer in the examples below), the second universal primer and the tag primer.
  • In the first cycle of the second-step PCR, the second specific primer and the second universal primer carry out the amplification; in subsequent cycles the second specific primer, the second universal primer and the tag primer together carry out multiple rounds of PCR.
  • The 5' end of the downstream specific primer overlaps part or all of the sequence at the 3' end of the tag primer.
  • The 3' end of the second specific primer is a specific sequence designed downstream of the target region.
  • The second universal primer can be part or all of the sequencing adapter sequence, and its 3' end overlaps part or all of the sequence at the 5' end of the first specific primer; the 3' end of the tag primer is partly or completely identical to the 5' end of the second specific primer, and the tag primer carries a fixed tag sequence of 8-12 bp in the middle (the tag sequence each platform uses to distinguish pooled samples) for subsequent multi-sample pooled sequencing (Figure 1B, Figure 2B, Figure 3B).
  • The present invention provides a system for constructing a sequencing library based on a target region of methylated DNA which, as shown in Figure 9, includes a universal conversion module, a first amplification module and a second amplification module, connected in sequence.
  • Based on the methylated DNA sample, the universal conversion module constructs a bisulfite-treated DNA sample with a universal sequence attached to at least one end of the methylated DNA sample, so as to obtain a converted DNA sample carrying the universal sequence.
  • The first amplification module uses the first specific primer and the first universal primer to perform the first amplification on the converted DNA sample carrying the universal sequence, so as to obtain a first amplification product, wherein the first specific primer is located upstream of the target region and the first universal primer at least partially matches or overlaps the universal sequence.
  • The second amplification module uses a second specific primer, a second universal primer and a tag primer to perform a second amplification on the first amplification product, so as to obtain a second amplification product, i.e. the sequencing library; wherein the second specific primer, the second universal primer and the tag primer are as in (i) or (ii): (i) the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least part of the sequence of the second specific primer, and the tag primer contains a tag sequence and overlaps part of the sequence of the first universal primer; (ii) the second specific primer is located downstream of the target region, the second universal primer overlaps at least part of the sequence of the first specific primer, and the tag primer contains a tag sequence and overlaps part of the sequence of the second specific primer.
  • In one form, the universal conversion module includes a conversion unit and an amplification unit connected to the conversion unit.
  • The conversion unit treats the methylated DNA sample with bisulfite so as to obtain a converted DNA sample.
  • The amplification unit uses a DNA polymerase and a first sequencing primer to replicate the converted DNA sample, so as to obtain the converted DNA sample carrying a universal sequence; the 3' end of the first sequencing primer consists of random bases and its 5' end is the universal sequence.
  • In another form, the universal conversion module includes a repair unit, a ligation unit and a conversion unit, connected in sequence.
  • The repair unit performs end repair and A-tailing on the methylated DNA sample to obtain a repaired DNA sample.
  • The ligation unit ligates at least one end of the repaired DNA sample to a universal sequence, so as to obtain a DNA sample carrying the universal sequence.
  • The conversion unit treats the DNA sample carrying the universal sequence with bisulfite, so as to obtain the converted DNA sample carrying the universal sequence.
  • In a further form, the universal conversion module includes a transposition unit and a conversion unit connected to the transposition unit.
  • The transposition unit uses a transposase loaded with a universal sequence to fragment and transpose the DNA sample, so as to obtain a DNA sample carrying the universal sequence.
  • The conversion unit treats the DNA sample carrying the universal sequence with bisulfite, so as to obtain the converted DNA sample carrying the universal sequence.
  • Experimental design: 100 ng of YanHuang genomic DNA was treated with bisulfite, a targeted DNA methylation library was then prepared following the steps of the invention, and the library was sequenced on an MGISEQ-2000 sequencer (sequencing type PE100). The data were then analyzed for data utilization, mapping rate, amplicon specificity, uniformity and other performance metrics.
  • CT Conversion Reagent solution: take the CT Conversion Reagent (solid mixture) out of the kit, add 900 μL of water, 50 μL of M-Dissolving Buffer and 300 μL of M-Dilution Buffer, dissolve at room temperature and vortex for 10 minutes or shake on a shaker for 10 minutes.
  • The random primer (the first sequencing primer referred to herein) has the sequence CGCTTGGCCTCCGACTTNNNNNNNN (SEQ ID NO: 24), where N is a random base drawn from the four bases A/T/C/G.
  • Samples 1 to 3 are three replicates of the same sample.
  • The mapping rate is the proportion of reads mapped to the genome.
  • Specificity is the proportion of reads in the target regions among all sequenced reads.
  • Uniformity is the proportion of target regions whose depth exceeds 0.1 times the mean depth of the target regions, out of the total number of target regions.
  • Experimental design: YanHuang genomic DNA fragmented to 200-300 bp was used to prepare a targeted DNA methylation library according to the method provided by the present invention, and the library was sequenced on an MGISEQ-2000 sequencer (sequencing type PE100). The data were then analyzed for data utilization, mapping rate, amplicon specificity, uniformity and other performance metrics.
  • Adapter 1: 5’/5Phos/AGTCGGAGGCCAAGCGGT (SEQ ID NO: 25)
  • Adapter 2: 5’ACATGGCTACGATCCGACTddT (SEQ ID NO: 26)
  • All C bases in the adapter 1 sequence are protected by methylation.
  • The adapter 2 sequence may or may not be protected by methylation.
  • The last base at the 3' end of adapter 2 carries a blocking modification (a dideoxy modification) that prevents it from being ligated to the template.
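The two oligonucleotides SEQ ID NO: 25 and SEQ ID NO: 26 can be checked for base pairing with a few lines of Python. This is only an illustrative sketch: it assumes the oligos anneal through the 5' end of the first and the 3' end of the second, and that the terminal dideoxy-T is an overhang facing the A-tailed insert; the source does not state either point explicitly. The sequences are copied as printed, and the function and variable names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def duplex_length(top: str, bottom: str) -> int:
    """Longest 5' prefix of `top` that base-pairs with the 3' end of `bottom`."""
    best = 0
    for k in range(1, min(len(top), len(bottom)) + 1):
        if reverse_complement(top[:k]) == bottom[-k:]:
            best = k
    return best


# Sequences as printed in the source (SEQ ID NO: 25 and SEQ ID NO: 26).
OLIGO_1 = "AGTCGGAGGCCAAGCGGT"        # carries the 5' phosphate
OLIGO_2_BODY = "ACATGGCTACGATCCGACT"  # without the terminal dideoxy-T
OVERHANG = "T"                        # 3' ddT, assumed to face the A-tail

paired = duplex_length(OLIGO_1, OLIGO_2_BODY)
print(f"{paired} nt of oligo 1 pair with the 3' end of oligo 2; overhang: {OVERHANG}")
```

Under these assumptions the printed sequences pair over a short stretch next to the T overhang, which is consistent with a T-overhang adapter ligated to an A-tailed fragment.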
  • The ligated DNA was treated with bisulfite using the EZ DNA Methylation-Gold Kit™ (ZYMO).
  • A Bioanalyzer analysis system (Agilent, Santa Clara, USA) was used to determine the size and amount of the library inserts; the results are shown in Figure 7.
  • The library was sequenced on the MGISEQ-2000 platform (sequencing type PE100). After sequencing, the data were aligned and the basic parameters were tabulated, including raw data, usable data, mapping rate and specificity; the results are shown in Table 2. The sequencing depth of each amplicon is shown in Figure 8.
  • The adapter filtering ratio is around 1%, there are few primer dimers, and the mapping rate is between 84% and 86%.
  • The specificity is between 89% and 90%, performance is good, and coverage depth is uniform across amplicons.
  • The first specific primer pool is an equimolar mixture of the above primers; Y is a degenerate base (C/T).
  • The second specific primer pool is an equimolar mixture of the above primers; Y is a degenerate base (C/T).
  • The N bases are the barcode sequence of the MGI sequencing platform.
  • The terms "first", "second", etc. are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature.
  • "A plurality of" means at least two, for example two or three, unless specifically defined otherwise.
  • Unless expressly specified and limited otherwise, terms such as "connected", "linked" and "fixed" are to be understood broadly: the connection may be fixed, detachable or integral; it may be mechanical or electrical, or the parts may communicate with each other; it may be direct, or indirect through an intermediate medium; and it may be an internal communication between two components or an interaction between two components.
  • Those of ordinary skill in the art can understand the specific meaning of the above terms in the present invention according to the specific circumstances.

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Microbiology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Immunology (AREA)
  • Plant Pathology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

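Provided are a method and a system for constructing a sequencing library based on a target region of methylated DNA, and applications thereof. The method comprises: obtaining a bisulfite-converted DNA sample carrying a universal sequence; amplifying it with a first specific primer and a first universal primer, wherein the first specific primer is located upstream of the target region and the first universal primer at least partially matches or overlaps the universal sequence; and amplifying with a second specific primer, a second universal primer and a tag primer to obtain the sequencing library; wherein either the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least part of the sequence of the second specific primer, and the tag primer overlaps part of the sequence of the first universal primer; or the second specific primer is located downstream of the target region, the second universal primer overlaps at least part of the sequence of the first specific primer, and the tag primer overlaps part of the sequence of the second specific primer.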

Description

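Construction of a Sequencing Library Based on a Target Region of Methylated DNA, and System and Application Thereof
Priority Information
None.
Technical Field
The present invention relates to the field of gene sequencing, and in particular to a method and a system for constructing a sequencing library based on a target region of methylated DNA, and applications thereof.
Background
DNA methylation is an epigenetic regulatory modification that participates in regulating the level of protein synthesis without changing the base sequence. For humans, DNA methylation is a remarkable chemical modification: care from family members, aging of the body, smoking, alcohol abuse and even obesity are all faithfully recorded on the genome by methylation. The genome is like a diary in which methylation is the writing that records a person's experiences. DNA methylation is important epigenetic marker information, and obtaining methylation-level data for all C sites across the whole genome is of great significance for studying the spatiotemporal specificity of epigenetics. Mapping genome-wide DNA methylation levels on next-generation high-throughput sequencing platforms, and analyzing high-precision methylation patterns of specific species, will be a milestone in epigenomics research and will lay a foundation for basic mechanistic research on cell differentiation and tissue development, as well as for animal and plant breeding and research on human health and disease.
Whole-genome bisulfite sequencing (WGBS) is the most commonly used means of studying methylation in organisms; it can cover all methylation sites and yields a more comprehensive methylation map. However, it faces many challenges in high-throughput sequencing: 1. bisulfite treatment renders DNA single-stranded and causes severe damage; 2. after bisulfite treatment, unmethylated C bases are converted to U bases, so the GC content of the whole genome changes drastically, causing strong bias in subsequent amplification; 3. library construction requires microgram-level input DNA, and effective library construction methods for trace amounts of DNA are hard to come by. For clinical testing and certain specific studies, whole-genome methylation sequencing is complicated to perform and still too expensive, whereas targeted methylation sequencing can effectively address these problems.
Targeted methylation sequencing can be divided into probe-capture-based and multiplex-PCR-based techniques. Probe capture requires a high input amount, which makes capture difficult for trace samples such as plasma cell-free DNA; probe design and the workflow are also overly complex, the turnaround time is long, and the cost is high. Multiplex PCR after bisulfite treatment of DNA requires little input, is simple to perform and is highly sensitive, but the technique still needs further improvement.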
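Summary of the Invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art. To this end, one object of the present invention is to provide a method and a system for constructing a sequencing library based on a target region of methylated DNA, and applications thereof. With the method provided by the present invention, a library is constructed for the target region of a methylated DNA sample, and during library construction only one strand of the methylated DNA sample is amplified. Amplifying with suitably designed specific primers and universal primers yields the target product and effectively solves the primer-dimer problem. At the same time, amplifying the target region of the same methylated DNA template with more than one specific primer ensures amplification specificity.
During their research, the inventors noted that multiplex PCR after bisulfite treatment of DNA is simple to perform and highly sensitive, but technically demanding. It was previously reported that single-molecule BS-PCR using droplet technology can detect about nine thousand targets simultaneously, but the input is high, requiring 2 μg of DNA. In 2015, Lu Wen and colleagues used characteristic sequences of CpG islands as primer binding sites to develop the PCR-based MCTA-seq, which can detect methylation signals of a large number of CpG island regions simultaneously. The technique is extremely sensitive and can detect 7.5 pg of gDNA; however, MCTA-seq is more like a fixed CGI panel and, as a targeted sequencing platform, is somewhat lacking in flexibility. Developing a targeted methylation technique with low input requirements and high flexibility is therefore the future direction of targeted methylation.
The inventors found through research that the main bottleneck is how to perform ultra-high-multiplex target amplification effectively. Even genomic amplicon sequencing at the ten-thousand-plex level is very challenging, let alone multiplex methylation PCR on bisulfite-converted sequences, mainly because severe primer dimers form during PCR. When multiplex PCR is performed on bisulfite-treated DNA, the unmethylated cytosines have been converted to uracil; since most cytosines in the genome are unmethylated, the base composition of most sequences changes from the four bases A/T/C/G to A/T/G. In conventional PCR, one primer is designed against the plus strand and the other against the complementary strand, so one primer used for PCR is an ATG-rich sequence and the other is an ATC-rich sequence; such "naturally complementary" primer sequences readily form primer dimers. As the number of primer pairs increases, primer-dimer formation increases sharply, and in multiplex PCR too many primers are used up in primer dimers, causing the multiplex PCR to fail. To solve the problem of multiplex bisulfite PCR, the tendency of the primers to form primer dimers must therefore be solved first.
To address the primer-dimer problem, we devised a unidirectional primer amplification method in which specific primers are designed against only one of the two strands of the DNA template. All specific primers contain only A/T/G or only A/T/C, so they can hardly form primer dimers with one another. Amplifying with these unidirectional specific primers and a few fixed universal primers yields the target product and effectively solves the primer-dimer problem.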
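The base-composition argument above can be illustrated in silico. The sketch below is not part of the patent: the template and the three primers are hypothetical sequences, the conversion assumes a fully unmethylated strand, and the dimer check is a crude test for complementarity at the 3' end rather than a thermodynamic model.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def bisulfite_convert_unmethylated(seq: str) -> str:
    """Fully unmethylated strand: every C is read as T after conversion and PCR."""
    return seq.upper().replace("C", "T")


def three_prime_dimer(p1: str, p2: str, k: int = 4) -> bool:
    """True if the last k bases of p1 can base-pair somewhere within p2."""
    return reverse_complement(p1[-k:]) in p2


template = "ACGTTCGGATCCAGTCCGTAACGGTCA"        # hypothetical, unmethylated
converted = bisulfite_convert_unmethylated(template)

# A primer that anneals to the converted strand is its reverse complement,
# so it is built from A/T/C only (no G).
annealing_primer = reverse_complement(converted[-12:])

# Hypothetical primers: two in the same direction (A/T/C only) and one
# conventional opposite-direction primer (A/T/G only).
same_dir_a = "ATCCTAACTACC"
same_dir_b = "TTACCAATCCAC"
opposite = "TTGGTAGTTAGG"

print(sorted(set(converted)), sorted(set(annealing_primer)))
print(three_prime_dimer(same_dir_a, opposite))    # opposite directions can pair
print(three_prime_dimer(same_dir_a, same_dir_b))  # same direction: no G to pair with C
```

Two primers drawn from the same three-letter alphabet can still pair through stretches made of A and T alone, which is why the text says such dimers are hard to form rather than impossible.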
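Specifically, the present invention provides the following technical solutions:
According to a first aspect of the present invention, there is provided a method for constructing a sequencing library based on a target region of methylated DNA, comprising: (1) based on the methylated DNA sample, attaching a universal sequence to at least one end of the methylated DNA sample and treating the DNA sample with bisulfite, so as to obtain a converted DNA sample carrying the universal sequence; (2) performing a first amplification on the converted DNA sample carrying the universal sequence using a first specific primer and a first universal primer, so as to obtain a first amplification product; wherein the first specific primer is located upstream of the target region, the first universal primer at least partially matches or overlaps the universal sequence, and the universal sequence is located downstream of the target region; (3) performing a second amplification on the first amplification product using a second specific primer, a second universal primer and a tag primer, so as to obtain a second amplification product, i.e. the sequencing library; wherein the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least part of the sequence of the second specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the first universal primer; or wherein the second specific primer is located downstream of the target region, the second universal primer overlaps at least part of the sequence of the first specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the second specific primer.
In the method provided by the present invention, specific primers are designed against one strand of the methylated DNA template in order to enrich the target region and build the library. First, a universal sequence is introduced at at least one end of the methylated DNA template, followed by bisulfite treatment; alternatively, bisulfite treatment may be carried out first and the universal sequence introduced afterwards. Either way, a converted DNA sample carrying a universal sequence is obtained first. Primers are then designed against only one strand of this DNA sample: the first specific primer and the first universal primer amplify one strand of the DNA sample, the first specific primer matching that strand and the first universal primer matching the universal sequence, which achieves specific amplification. Moreover, because the DNA template is a bisulfite-converted sample, the designed first specific primers are sequences rich in A, T and G or in A, T and C, and do not form dimers with one another. The first universal primer contains all four bases A, T, C and G and likewise does not form primer dimers with the first specific primers, so primer-dimer formation can be completely avoided.
In addition, to ensure amplification specificity, a second specific primer is designed either downstream of the first specific primer and upstream of the target region, or downstream of the target region. The second specific primer, the second universal primer and the tag primer are used to perform a second amplification on the first amplification product, yielding the second amplification product, i.e. the desired sequencing library.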
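According to embodiments of the present invention, the method described above for constructing a sequencing library based on a target region of methylated DNA may further include the following technical features:
In some embodiments of the present invention, in step (3), the 5' end of the second specific primer overlaps at least part of the sequence at the 3' end of the second universal primer, and the 3' end of the tag primer overlaps part of the sequence at the 5' end of the first universal primer. The 5'-end sequence of the second specific primer overlaps at least part of the 3'-end sequence of the second universal primer, and its 3'-end sequence matches the template region located downstream of the first specific primer and upstream of the target region, so that the target region can be specifically amplified from the first amplification product.
In some embodiments of the present invention, in step (3), the 5' end of the second specific primer overlaps at least part of the sequence at the 3' end of the tag primer, and the 3' end of the second universal primer overlaps part of the sequence at the 5' end of the first specific primer. The 5'-end sequence of the second specific primer overlaps at least part of the 3'-end sequence of the tag primer, and its 3'-end sequence matches the template region located downstream of the target region, so that the target region can be specifically amplified.
In some embodiments of the present invention, the tag primer contains a tag sequence. Such tag sequences may be those commonly used on sequencing platforms to distinguish different samples, which makes it convenient to sequence multiple pooled samples at the same time. According to embodiments, the tag sequences may be 8 to 12 bp long, for example 10 bp or 8 bp.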
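How a tag sequence separates pooled samples can be sketched as a simple exact-match lookup. The barcodes, sample names and function names below are hypothetical and chosen only to respect the 8 to 12 bp length stated above; real platforms typically tolerate mismatches and use their own barcode sets.

```python
def build_barcode_index(sample_barcodes: dict) -> dict:
    """Map each barcode to its sample after checking length and uniqueness."""
    index = {}
    for sample, barcode in sample_barcodes.items():
        if not 8 <= len(barcode) <= 12:
            raise ValueError(f"{sample}: tag must be 8-12 bp, got {len(barcode)}")
        if barcode in index:
            raise ValueError(f"{sample}: tag {barcode} is already in use")
        index[barcode] = sample
    return index


def demultiplex(read_barcodes, index: dict) -> dict:
    """Count reads per sample by exact barcode match; the rest are unassigned."""
    counts = {sample: 0 for sample in index.values()}
    counts["unassigned"] = 0
    for barcode in read_barcodes:
        counts[index.get(barcode, "unassigned")] += 1
    return counts


# Hypothetical 10 bp tags for two pooled libraries.
index = build_barcode_index({"sample1": "ACGTACGTAC", "sample2": "TTGCAATGCA"})
observed = ["ACGTACGTAC", "TTGCAATGCA", "ACGTACGTAC", "GGGGGGGGGG"]
counts = demultiplex(observed, index)
print(counts)
```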
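In some embodiments of the present invention, step (1) further comprises: (1-a) treating the methylated DNA sample with bisulfite to obtain a converted DNA sample; (1-b) replicating the converted DNA sample using a DNA polymerase and a random primer carrying a first sequencing sequence, so as to obtain the converted DNA sample carrying a universal sequence, wherein the 3' end of the random primer is a random base sequence and the 5' end of the random primer is the universal sequence.
In some embodiments of the present invention, the random base sequence consists of 6 to 12 random bases, each being A, T, C or G.
In some embodiments of the present invention, the random base sequence consists of 6 to 12 random bases, each being A, T or C.
In some embodiments of the present invention, the universal sequence is a sequencing adapter sequence or a fixed sequence.
In some embodiments of the present invention, the cytosines in the sequencing adapter sequence or the fixed sequence are methylated cytosines.
In some embodiments of the present invention, step (1) further comprises: (1-1) performing end repair and A-tailing on the methylated DNA sample to obtain a repaired DNA sample; (1-2) ligating at least one end of the repaired DNA sample to a universal sequence to obtain a DNA sample carrying the universal sequence; (1-3) treating the DNA sample carrying the universal sequence with bisulfite to obtain the converted DNA sample carrying the universal sequence.
In some embodiments of the present invention, the universal sequence is selected from at least one of the following: a sequencing adapter sequence or a modified sequencing adapter sequence.
In some embodiments of the present invention, the modified sequencing adapter sequence is a sequencing adapter sequence in which the cytosines of one strand are methylated and the cytosines of the other strand are not, a sequencing adapter sequence in which the 3'-terminal base of one strand bears a non-hydroxyl modification, a sequencing adapter sequence carrying a fixed sequence and a random sequence, or a sequencing adapter sequence carrying a fixed sequence and a random sequence in which the 3'-terminal base of one strand bears a non-hydroxyl modification.
In some embodiments of the present invention, the random sequence is a molecular tag sequence. A large number of different molecular tag sequences make it possible to count the original DNA templates: subsequent statistics on the molecular tag sequences trace back the number of original templates and correct errors arising during sequencing or PCR, which enables accurate detection and quantitative study of the DNA templates.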
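Counting original templates from molecular tags can be sketched as follows. The reads, tag values and function names are hypothetical; the sketch groups reads by amplicon and tag, counts distinct tags per amplicon as an estimate of original templates, and takes a per-position majority vote within each group to suppress PCR or sequencing errors. It assumes error-free tags and equal-length reads, which real pipelines do not.

```python
from collections import Counter, defaultdict


def group_by_tag(reads):
    """Group read sequences by (amplicon, molecular tag)."""
    groups = defaultdict(list)
    for amplicon, tag, sequence in reads:
        groups[(amplicon, tag)].append(sequence)
    return groups


def count_templates(groups) -> Counter:
    """Number of distinct molecular tags observed for each amplicon."""
    return Counter(amplicon for amplicon, _tag in groups)


def consensus(sequences) -> str:
    """Per-position majority vote over equal-length reads sharing one tag."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*sequences))


# Hypothetical reads: (amplicon, molecular tag, read sequence).
reads = [
    ("amp1", "AACG", "TTGA"),
    ("amp1", "AACG", "TTGA"),
    ("amp1", "AACG", "TCGA"),  # one read with an error at position 2
    ("amp1", "GGTA", "TTGA"),
    ("amp2", "AACG", "CCAT"),
]
groups = group_by_tag(reads)
templates = count_templates(groups)
print(dict(templates), consensus(groups[("amp1", "AACG")]))
```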
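In some embodiments of the present invention, step (1) further comprises: ① fragmenting and transposing the DNA sample with a transposase loaded with a universal sequence, so as to obtain a DNA sample carrying the universal sequence; ② treating the DNA sample carrying the universal sequence with bisulfite, so as to obtain the converted DNA sample carrying the universal sequence.
In some embodiments of the present invention, the universal sequence is a transposase effector sequence or a transposase effector sequence carrying a sequencing adapter, preferably a transposase effector sequence; the transposase may be Tn5, MuA or another transposase with a similar function, preferably Tn5 transposase.
In some embodiments of the present invention, the cytosines in the transposase effector sequence are methylated cytosines. Conversion of unmethylated cytosine to uracil is not a 100% process: a given cytosine may or may not be converted, which adds uncertainty to the subsequent amplification with universal primers. Methylated cytosines, by contrast, are not converted to uracil under the subsequent bisulfite treatment, so the sequence information is preserved. For more accurate sequencing, the cytosines in the transposase effector sequence may therefore be methylated. The cytosines may, of course, also be left unmethylated.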
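The effect of methylating the cytosines of a universal sequence can be shown with a small in-silico conversion. The sketch below uses the adapter sequence given later in Example 2 (SEQ ID NO: 25) purely as an illustration; the function name and the idea of passing protected positions as a set of indices are assumptions, and the model treats conversion as complete, whereas the text notes that real conversion is not 100%.

```python
def bisulfite_convert(seq: str, methylated_positions=frozenset()) -> str:
    """Convert every unprotected C to T; C at a methylated index is kept."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq.upper())
    )


ADAPTER = "AGTCGGAGGCCAAGCGGT"  # SEQ ID NO: 25, as printed in Example 2
all_c = {i for i, base in enumerate(ADAPTER) if base == "C"}

protected = bisulfite_convert(ADAPTER, all_c)  # all cytosines methylated
unprotected = bisulfite_convert(ADAPTER)       # no cytosine methylated

print(protected)    # unchanged, so a universal primer designed on it still fits
print(unprotected)  # every C now reads as T
```

With methylated cytosines the primer-binding site is identical before and after treatment, so the universal primer can be designed once against the known sequence.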
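In some embodiments of the present invention, the methylated DNA sample is genomic DNA, fragmented genomic DNA, or cell-free DNA.
According to a second aspect of the present invention, there is provided a system for constructing a sequencing library based on a target region of methylated DNA, comprising: a universal conversion module which, based on the methylated DNA sample, constructs a bisulfite-treated DNA sample with a universal sequence attached to at least one end of the methylated DNA sample, so as to obtain a converted DNA sample carrying the universal sequence; a first amplification module, connected to the universal conversion module, which performs a first amplification on the converted DNA sample carrying the universal sequence using a first specific primer and a first universal primer, so as to obtain a first amplification product, wherein the first specific primer is located upstream of the target region and the first universal primer at least partially matches or overlaps the universal sequence; and a second amplification module, connected to the first amplification module, which performs a second amplification on the first amplification product using a second specific primer, a second universal primer and a tag primer, so as to obtain a second amplification product, i.e. the sequencing library; wherein the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least part of the sequence of the second specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the first universal primer; or wherein the second specific primer is located downstream of the target region, the second universal primer overlaps at least part of the sequence of the first specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the second specific primer.
According to embodiments of the present invention, the system described above may further include the following technical features:
In some embodiments of the present invention, in the second amplification module, the 5' end of the second specific primer overlaps at least part of the sequence at the 3' end of the second universal primer, and the 3' end of the tag primer overlaps part of the sequence at the 5' end of the first universal primer.
In some embodiments of the present invention, in the second amplification module, the 5' end of the second specific primer overlaps at least part of the sequence at the 3' end of the tag primer, and the 3' end of the second universal primer overlaps part of the sequence at the 5' end of the first specific primer.
In some embodiments of the present invention, in the above system, the tag sequence is 8 to 12 bp long.
In some embodiments of the present invention, the universal conversion module further comprises: a conversion unit which treats the methylated DNA sample with bisulfite to obtain a converted DNA sample; and an amplification unit, connected to the conversion unit, which replicates the converted DNA sample using a DNA polymerase and a first sequencing primer, so as to obtain the converted DNA sample carrying a universal sequence, wherein the 3' end of the first sequencing primer consists of random bases and the 5' end of the first sequencing primer is the universal sequence.
In some embodiments of the present invention, in the above system, there are 6 to 12 random bases, each being A, T, C or G.
In some embodiments of the present invention, in the above system, there are 6 to 12 random bases, each being A, T or C.
In some embodiments of the present invention, in the above system, the universal sequence is a sequencing adapter sequence or a fixed sequence.
In some embodiments of the present invention, in the above system, the cytosines in the sequencing adapter sequence or the fixed sequence are methylated cytosines.
In some embodiments of the present invention, the universal conversion module further comprises: a repair unit which performs end repair and A-tailing on the methylated DNA sample to obtain a repaired DNA sample; a ligation unit, connected to the repair unit, which ligates at least one end of the repaired DNA sample to a universal sequence to obtain a DNA sample carrying the universal sequence; and a conversion unit, connected to the ligation unit, which treats the DNA sample carrying the universal sequence with bisulfite to obtain the converted DNA sample carrying the universal sequence.
In some embodiments of the present invention, in the universal conversion module, the universal sequence is selected from at least one of the following: a sequencing adapter sequence or a modified sequencing adapter sequence.
In some embodiments of the present invention, in the universal conversion module, the modified sequencing adapter sequence is a sequencing adapter sequence in which the cytosines of one strand are methylated and the cytosines of the other strand are not, a sequencing adapter sequence in which the 3'-terminal base of one strand bears a non-hydroxyl modification, a sequencing adapter sequence carrying a fixed sequence and a random sequence, or a sequencing adapter sequence carrying a fixed sequence and a random sequence in which the 3'-terminal base of one strand bears a non-hydroxyl modification.
In some embodiments of the present invention, in the universal conversion module, the random sequence is a molecular tag sequence. A large number of different molecular tag sequences make it possible to count the original DNA templates: subsequent statistics on the molecular tag sequences trace back the number of original templates and correct errors arising during sequencing or PCR, which enables accurate detection and quantitative study of the DNA templates.
In some embodiments of the present invention, the universal conversion module further comprises: a transposition unit which transposes the DNA sample using a transposase loaded with a universal sequence, so as to obtain a DNA sample carrying the universal sequence; and a conversion unit, connected to the transposition unit, which treats the DNA sample carrying the universal sequence with bisulfite to obtain the converted DNA sample carrying the universal sequence.
In some embodiments of the present invention, in the above transposition unit, the universal sequence is a transposase effector sequence or a transposase effector sequence carrying a sequencing adapter, preferably a transposase effector sequence.
In some embodiments of the present invention, in the above transposition unit, the cytosines in the transposase effector sequence are methylated cytosines.
In some embodiments of the present invention, the methylated DNA sample is genomic DNA, fragmented genomic DNA or cell-free DNA.
The above description of the advantages and technical features of the method in any embodiment of the present invention applies equally to the system in any embodiment of the present invention and is not repeated here.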
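According to a third aspect of the present invention, there is provided a method for sequencing a methylated DNA sample, comprising:
constructing a sequencing library from the methylated DNA sample by the method of any embodiment of the first aspect of the present invention or using the system of any embodiment of the second aspect of the present invention; and performing high-throughput sequencing on the sequencing library to obtain sequencing results.
In some embodiments of the present invention, the sequencing library is sequenced on a sequencing platform selected from at least one of MGISEQ, Illumina and Proton.
According to a fourth aspect of the present invention, there is provided a method for determining the methylation status of a methylated DNA sample, comprising:
constructing a sequencing library from the methylated DNA sample by the method of any embodiment of the first aspect of the present invention or using the system of any embodiment of the second aspect of the present invention; performing high-throughput sequencing on the sequencing library to obtain sequencing results; and aligning the sequencing results to a reference genome to determine the methylation status of the methylated DNA sample.
In some embodiments of the present invention, the reference genome is the human genome hg19 or the YanHuang genome.
According to a fifth aspect of the present invention, there is provided a kit comprising: a universal sequence, a tag primer, a first universal primer, a second universal primer and conventional methylation detection reagents; wherein the tag primer contains a tag sequence, the first universal primer at least partially matches or overlaps the universal sequence, the first universal primer is SEQ ID NO: 1, and the second universal primer is SEQ ID NO: 22. The conventional methylation detection reagents may be, for example, bisulfite detection reagents or a corresponding kit.
According to embodiments of the present invention, the kit described above further includes the following additional technical features:
In some embodiments of the present invention, the tag primer is as shown in SEQ ID NO: 23.
In some embodiments of the present invention, the kit further comprises first specific primers and second specific primers, the first specific primers comprising the sequences shown in SEQ ID NO: 1 to SEQ ID NO: 10, and the second specific primers comprising the sequences shown in SEQ ID NO: 11 to SEQ ID NO: 20.
In some embodiments of the present invention, the kit is used to construct a sequencing library based on a target region of methylated DNA by the method of the first aspect of the present invention.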
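Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments in conjunction with the drawings, in which:
Figure 1 is a flowchart of library construction with random primers according to an embodiment of the present invention.
Figure 2 is a flowchart of library construction by adapter ligation according to an embodiment of the present invention.
Figure 3 is a flowchart of library construction with a transposon according to an embodiment of the present invention.
Figure 4 is a schematic diagram of different adapter sequences according to an embodiment of the present invention.
Figure 5 is a quality-control plot of the sequencing library according to an embodiment of the present invention.
Figure 6 shows the sequencing depth of each amplicon according to an embodiment of the present invention.
Figure 7 is a quality-control plot of the sequencing library according to an embodiment of the present invention.
Figure 8 shows the sequencing depth of each amplicon according to an embodiment of the present invention.
Figure 9 is a schematic structural diagram of the system for constructing a sequencing library based on a target region of methylated DNA according to an embodiment of the present invention.
Figure 10 is a schematic structural diagram of a universal conversion module according to an embodiment of the present invention.
Figure 11 is a schematic structural diagram of a universal conversion module according to an embodiment of the present invention.
Figure 12 is a schematic structural diagram of a universal conversion module according to an embodiment of the present invention.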
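Detailed Description
Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, where identical or similar reference numerals throughout denote identical or similar elements or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and are not to be construed as limiting it.
For a more intuitive understanding of the present application, the terms used herein are explained below. Those skilled in the art should understand that these explanations are given only for ease of understanding and should not be regarded as limiting the scope of protection of the present application. Herein, unless otherwise specified, when two nucleic acid sequences are said to be linked, they are linked by a 3'-5' phosphodiester bond. Unless otherwise specified, when denoting a base, N or n represents any of the bases A, T, C or G.
Herein, the terms "upstream" and "downstream" refer to the 5'-to-3' order of nucleotides: when two or more nucleic acid sequences are compared, the recognition or matching region of the upstream sequence is closer to the 5' end of the template sequence than that of the downstream sequence. Since different nucleic acid sequences may differ in length, the lengths of their recognition or matching regions may also differ. When nucleic acid sequence A is said to be downstream of nucleic acid sequence B, it is only required that the recognition or binding site of the 3' end of A be closer to the 3' end of the template sequence than the recognition or binding site of the 3' end of B.
Herein, two nucleic acid sequences "match" when their bases pair complementarily. Two nucleic acid sequences overlap in at least part of their sequence when they share at least one identical stretch of nucleic acid sequence.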
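The two definitions above, "match" (complementary base pairing) and "overlap" (a shared identical stretch), can be expressed as small helper functions. This is a sketch with hypothetical names and example sequences; it treats matching as perfect antiparallel complementarity over the full length, which is stricter than the partial matching the text allows.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def matches(a: str, b: str) -> bool:
    """True if a and b base-pair along their full length (antiparallel)."""
    return reverse_complement(a) == b.upper()


def longest_shared_stretch(a: str, b: str) -> str:
    """Longest identical contiguous stretch found in both sequences."""
    best_len, best_end = 0, 0
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                current[j] = previous[j - 1] + 1
                if current[j] > best_len:
                    best_len, best_end = current[j], i
        previous = current
    return a[best_end - best_len:best_end]


# Hypothetical sequences, for illustration only.
print(matches("ATGC", "GCAT"))
print(longest_shared_stretch("AAGTCGG", "GTCGGTT"))
```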
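Herein, "bisulfite", "sulfite" and "hydrogen sulfite" treatment all refer to a reagent or process that deaminates cytosine in DNA to uracil. Treatment described by any of these terms therefore falls within the scope of protection of the present invention.
To solve the primer-dimer problem among multiple pairs of methylation-specific primers when amplifying methylated DNA, the present invention provides a unidirectional primer amplification method in which primers are designed against only one strand of the DNA template. The specific primers so designed contain only A, T and G or only A, T and C, and can hardly form primer dimers with one another. In addition, to ensure amplification specificity, specific primers are designed on the product of the first round of amplification and used in the second round of PCR, further ensuring specificity. The sequencing library prepared in this way meets the requirements for sequencing.
In detail, a universal sequence is introduced onto the original DNA either by Tn5 transposition of genomic DNA (gDNA), or by adapter ligation or random-primed DNA replication on fragmented gDNA or cell-free DNA (cfDNA) molecules. The DNA carrying the universal sequence is treated with bisulfite (BS treatment) to obtain the bisulfite-converted DNA sequence, in which the unmethylated cytosines (C) of the original DNA have been converted to uracil (U). A universal primer is designed on the introduced universal sequence, and a specific primer is designed upstream of the target region of the converted DNA sequence; the specific primer is designed against only one strand of the DNA template. PCR amplification with the universal primer and the specific primer gives a PCR product. To increase amplification specificity, a nested primer is designed downstream of the above specific primer, or a specific primer is designed downstream of the target region; the nested primer or specific primer is likewise designed against only one strand of the DNA template. The product of the first-step PCR is amplified in a second step with the nested primer or downstream specific primer and a universal primer, finally giving the PCR amplification product for the bisulfite-treated template (BS-PCR).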
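In one aspect of the present invention, there is provided a method for constructing a sequencing library based on a target region of methylated DNA, comprising: (1) based on the methylated DNA sample, constructing a bisulfite-treated DNA sample with a universal sequence attached to at least one end of the methylated DNA sample, so as to obtain a converted DNA sample carrying the universal sequence; (2) performing a first amplification on the converted DNA sample carrying the universal sequence using a first specific primer and a first universal primer, so as to obtain a first amplification product; wherein the first specific primer is located upstream of the target region, the first universal primer at least partially overlaps or matches the universal sequence, and the universal sequence is located downstream of the target region; (3) performing a second amplification on the first amplification product using a second specific primer, a second universal primer and a tag primer, so as to obtain a second amplification product, i.e. the sequencing library; wherein the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least part of the sequence of the second specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the first universal primer; or wherein the second specific primer is located downstream of the target region, the second universal primer overlaps at least part of the sequence of the first specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the second specific primer.
In obtaining the converted DNA sample carrying a universal sequence, different approaches may be taken as needed, depending on the order in which the universal sequence is introduced and the bisulfite treatment is performed:
In at least some embodiments of the present invention, the universal sequence is introduced as follows: gDNA, fragmented gDNA or cfDNA is first treated with bisulfite; the template is then copied using a DNA polymerase and the first sequencing primer, i.e. a primer whose 3' end carries 6-12 random N bases (degenerate bases drawn from A/T/C/G) or 6-12 random H bases (degenerate bases drawn from A/T/C) and whose 5' end carries part or all of a sequencing adapter sequence or a fixed sequence (the cytosines in this sequence are preferably methylated cytosines). This yields a bisulfite-treated DNA template with a universal sequence at its 5' end (as shown in Figure 1). The usable sequencing adapter sequences include, but are not limited to, the sequencing adapters of the MGI platform and the sequencing adapter sequences of the Illumina and Proton platforms. In at least some embodiments, the usable DNA polymerase may be conventional rTaq or Fusion, or may be Bst, phi29 or the like.
In at least some embodiments of the present invention, the universal sequence is introduced as follows:
Fragmented gDNA or cfDNA is end-repaired and A-tailed, and a specific adapter sequence is then added. The sequence may be part or all of a sequencing adapter sequence, or a modified sequencing adapter sequence. These modified sequencing adapter sequences may be a sequencing adapter sequence carrying a fixed sequence in which the 3'-terminal base of one strand bears a non-hydroxyl modification, or a sequencing adapter sequence carrying a fixed sequence, or a sequencing adapter sequence carrying a fixed sequence in which the 3'-terminal base of one strand bears a non-hydroxyl modification, as shown by labels 1, 2, 3 and 4 in Figure 4. After purification, the product carrying the universal sequence is treated with bisulfite to obtain the converted DNA template (Figure 2).
In other embodiments of the present invention, the universal sequence is introduced as follows:
Tn5 transposase is loaded with an adapter sequence. The adapter may be the 19 bp specific sequence on which Tn5 transposase itself acts, or a combination of that sequence with other sequences (such as a sequencing adapter sequence); the 19 bp specific sequence is preferred, and its cytosines are preferably methylated cytosines. The specific adapter is added to the gDNA by Tn5 transposition; after purification, the product carrying the specific adapter is treated with bisulfite to obtain the converted DNA template (as shown in Figure 3).
After the converted DNA sample carrying the universal sequence has been obtained, PCR amplification with unidirectional specific primers yields the sequencing library; either of the following amplification schemes may be used:
In at least some embodiments of the present invention, the sequencing library is obtained by PCR amplification as follows:
The first specific primer and the first universal primer are used for the first-step PCR amplification of the bisulfite-treated DNA. The 3'-end sequence of the first universal primer is partially or completely complementary to, or overlaps, the introduced universal sequence. For example, the 5' end of the first universal sequence is part or all of the sequencing adapter sequence (preferably part of it). The binding site of the first specific primer lies upstream of the target region to be amplified and is designed against the bisulfite-treated DNA template sequence. The product is purified and then subjected to second-step PCR amplification with the second specific primer (also called the nested primer in the examples below), the second universal primer and the tag primer. In the first cycle of the second-step PCR, the second specific primer and the tag primer carry out the PCR; in subsequent cycles the second specific primer, the second universal primer and the tag primer together carry out multiple rounds of PCR. The 5' end of the second specific primer overlaps part or all of the sequence at the 3' end of the second universal primer, and the 3' end of the second specific primer is a specific sequence designed between the first specific primer and the target region. The second universal primer may be part or all of the universal sequencing adapter, and its 3' end is partly or completely identical to the 5' end of the second specific primer. The 3' end of the tag primer is partly or completely identical to the 5' end of the first universal primer, and the tag primer carries a fixed tag sequence of 8-12 bp in the middle (the tag sequence each platform uses to distinguish pooled samples) for subsequent multi-sample pooled sequencing (Figure 1A, Figure 2A, Figure 3A).
In other embodiments of the present invention, the sequencing library is obtained by PCR amplification as follows.
The first specific primer (also called the upstream specific primer in the examples below) and the first universal primer are used for the first-step PCR amplification of the bisulfite-treated DNA. The 3'-end sequence of the first universal primer is partially or completely complementary to, or overlaps, the introduced universal sequence (here the universal sequence is preferably a fixed sequence other than a sequencing adapter sequence). The specific sequence at the 3' end of the first specific primer is designed upstream of the target region to be amplified and against the bisulfite-treated DNA template sequence, and its 5' end is part or all of the sequencing adapter sequence (preferably part of it). The product is purified and then subjected to second-step PCR amplification with the second specific primer (correspondingly also called the downstream specific primer in the examples below), the second universal primer and the tag primer. In the first cycle of the second-step PCR, the second specific primer and the second universal primer carry out the amplification; in subsequent cycles the second specific primer, the second universal primer and the tag primer together carry out multiple rounds of PCR. The 5' end of the downstream specific primer overlaps part or all of the sequence at the 3' end of the tag primer, and the 3' end of the second specific primer is a specific sequence designed downstream of the target region. The second universal primer may be part or all of the sequencing adapter sequence, and its 3' end overlaps part or all of the sequence at the 5' end of the first specific primer. The 3' end of the tag primer is partly or completely identical to the 5' end of the second specific primer, and the tag primer carries a fixed tag sequence of 8-12 bp in the middle (the tag sequence each platform uses to distinguish pooled samples) for subsequent multi-sample pooled sequencing (Figure 1B, Figure 2B, Figure 3B).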
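According to another aspect of the present invention, there is provided a system for constructing a sequencing library based on a target region of methylated DNA which, as shown in Figure 9, includes a universal conversion module, a first amplification module and a second amplification module, connected in sequence. Based on the methylated DNA sample, the universal conversion module constructs a bisulfite-treated DNA sample with a universal sequence attached to at least one end of the methylated DNA sample, so as to obtain a converted DNA sample carrying the universal sequence. The first amplification module performs a first amplification on the converted DNA sample carrying the universal sequence using a first specific primer and a first universal primer, so as to obtain a first amplification product, wherein the first specific primer is located upstream of the target region and the first universal primer at least partially matches or overlaps the universal sequence. The second amplification module performs a second amplification on the first amplification product using a second specific primer, a second universal primer and a tag primer, so as to obtain a second amplification product, i.e. the sequencing library; wherein the second specific primer, the second universal primer and the tag primer are as in (i) or (ii): (i) the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least part of the sequence of the second specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the first universal primer; (ii) the second specific primer is located downstream of the target region, the second universal primer overlaps at least part of the sequence of the first specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the second specific primer.
In at least some embodiments of the present invention, the universal conversion module, as shown in Figure 10, includes a conversion unit and an amplification unit connected to the conversion unit. The conversion unit treats the methylated DNA sample with bisulfite to obtain a converted DNA sample. The amplification unit replicates the converted DNA sample using a DNA polymerase and a first sequencing primer, so as to obtain the converted DNA sample carrying a universal sequence; the 3' end of the first sequencing primer consists of random bases and its 5' end is the universal sequence.
In at least some embodiments of the present invention, the universal conversion module, as shown in Figure 11, includes a repair unit, a ligation unit and a conversion unit, connected in sequence. The repair unit performs end repair and A-tailing on the methylated DNA sample to obtain a repaired DNA sample. The ligation unit ligates at least one end of the repaired DNA sample to a universal sequence to obtain a DNA sample carrying the universal sequence. The conversion unit treats the DNA sample carrying the universal sequence with bisulfite to obtain the converted DNA sample carrying the universal sequence.
In at least some embodiments of the present invention, the universal conversion module, as shown in Figure 12, includes a transposition unit and a conversion unit connected to the transposition unit. The transposition unit fragments and transposes the DNA sample using a transposase loaded with a universal sequence, so as to obtain a DNA sample carrying the universal sequence. The conversion unit treats the DNA sample carrying the universal sequence with bisulfite to obtain the converted DNA sample carrying the universal sequence.
The solutions of the present invention are explained below with reference to examples. Those skilled in the art will understand that the following examples serve only to illustrate the present invention and should not be regarded as limiting its scope. Where specific techniques or conditions are not indicated in the examples, the techniques or conditions described in the literature in the field or the product instructions are followed. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.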
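Example 1: Methylation multiplex PCR library construction and sequencing
Experimental design: 100 ng of YanHuang genomic DNA was treated with bisulfite, a targeted DNA methylation library was then prepared following the steps of the invention, and the library was sequenced on an MGISEQ-2000 sequencer (sequencing type PE100). The data were then analyzed for data utilization, mapping rate, amplicon specificity, uniformity and other performance metrics.
1. Bisulfite treatment
The above DNA was treated with bisulfite using the EZ DNA Methylation-Gold Kit™ (ZYMO, USA, catalog no. D5005).
Solution preparation:
Preparation of the CT Conversion Reagent solution: take the CT Conversion Reagent (solid mixture) out of the kit, add 900 μL of water, 50 μL of M-Dissolving Buffer and 300 μL of M-Dilution Buffer, dissolve at room temperature and vortex for 10 minutes or shake on a shaker for 10 minutes.
Preparation of M-Wash Buffer: add 24 mL of 100% ethanol to the M-Wash Buffer and set aside.
The specific steps are as follows:
(1) Add 130 μL of the CT Conversion Reagent solution and the above DNA to a PCR tube, and mix the sample by flicking the tube or pipetting.
Then place the sample tube in a PCR instrument and run the following program:
98 °C for 5 minutes, then 64 °C for 2.5 hours.
After completing the above, proceed immediately to the next step.
(2) Place a Zymo-Spin IC™ Column into a Collection Tube and add 600 μL of M-Binding Buffer.
Then add the bisulfite-treated sample to the Zymo-Spin IC™ Column containing the M-Binding Buffer, close the cap and mix by inversion.
Centrifuge at full speed (>10,000 × g) for 30 seconds and discard the flow-through in the collection tube. Add 100 μL of M-Wash Buffer to the column, centrifuge at full speed (>10,000 × g) for 30 seconds, and discard the liquid in the collection tube.
Add 200 μL of M-Desulphonation Buffer to the column, leave at room temperature for 15 min, centrifuge at full speed (>10,000 × g) for 30 s, and discard the liquid in the collection tube.
Add 200 μL of M-Wash Buffer to the column, centrifuge at full speed (>10,000 × g) for 30 s, discard the liquid in the collection tube, and repeat this step once more.
Place the Zymo-Spin IC™ Column in a new 1.5 mL EP tube, add 40 μL of M-Elution Buffer to the column matrix, leave at room temperature for 2 min, and centrifuge at full speed (>10,000 × g) to elute the target DNA fragments.
2. DNA replication
(1) In a PCR tube, replicate the bisulfite-treated DNA using the following reaction system:
Figure PCTCN2019087824-appb-000001
Figure PCTCN2019087824-appb-000002
The random primer (the first sequencing primer referred to herein) has the sequence CGCTTGGCCTCCGACTTNNNNNNNN (SEQ ID NO: 24), where N is a random base drawn from the four bases A/T/C/G.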
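The degenerate 3' tail of the random primer can be reasoned about with a few lines of code. The sketch below is illustrative only: the IUPAC lookup covers just the codes used in this document (N for A/C/G/T, H for A/C/T), the function names are hypothetical, and the H-tailed variant is a made-up example of the 6 to 12 H-base option mentioned in the description, not a sequence from the patent.

```python
import random

# Only the degenerate codes that appear in this document.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "H": "ACT"}


def n_variants(primer: str) -> int:
    """Number of distinct concrete sequences a degenerate primer stands for."""
    total = 1
    for base in primer:
        total *= len(IUPAC[base])
    return total


def instantiate(primer: str, rng: random.Random) -> str:
    """Draw one concrete sequence from the degenerate primer."""
    return "".join(rng.choice(IUPAC[base]) for base in primer)


RANDOM_PRIMER = "CGCTTGGCCTCCGACTTNNNNNNNN"  # SEQ ID NO: 24, as printed
FIXED_PART = RANDOM_PRIMER.rstrip("N")
H_TAILED = FIXED_PART + "H" * 6              # hypothetical H-base variant

rng = random.Random(0)
print(n_variants(RANDOM_PRIMER), n_variants(H_TAILED))
print(instantiate(RANDOM_PRIMER, rng))
```

Each N position multiplies the number of primer species by four and each H position by three, so the pool is a mixture of many sequences sharing the same fixed 5' part.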
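(2) Place the above reaction system in a PCR instrument and react at 65 °C for 10 minutes.
(3) After the reaction, purify with 1.5× AMPure magnetic beads (Beckman AMPure XP, catalog no. A63881) and dissolve the purified product in 22 μL of elution buffer.
3. First round of PCR
(1) Prepare the PCR system in a PCR tube according to the following reaction system:
Figure PCTCN2019087824-appb-000003
(2) The PCR reaction conditions are:
Figure PCTCN2019087824-appb-000004
(3) After the reaction, purify with 1.5× AMPure magnetic beads and dissolve the purified product in 22 μL of elution buffer.
4. Second round of PCR
(1) Prepare the PCR system in a PCR tube according to the following reaction system, where the nested primer pool is shown in Table 4 below and the tag primer is shown in Table 5 below.
Figure PCTCN2019087824-appb-000005
(2) PCR reaction conditions:
Figure PCTCN2019087824-appb-000006
(3) After the reaction, purify with 1.0× AMPure magnetic beads and dissolve the purified product in 22 μL of elution buffer.
5. Library quality check:
A Bioanalyzer analysis system (Agilent, Santa Clara, USA) was used to determine the size and amount of the library inserts; the results are shown in Figure 5.
6. Sequencing
The library was subjected to high-throughput sequencing on the MGISEQ-2000 platform (sequencing type PE100). After sequencing, the data were aligned and the basic parameters were tabulated, including raw data, usable data, mapping rate and GC content; the results are shown in Table 1 below. The depth of each amplicon is shown in Figure 6, where the horizontal axis represents the different CpG sites.
Table 1. Sequencing results
Sample | Raw data | Adapter filtering ratio | Mapping rate | Specificity | 0.1× uniformity
Sample 1 | 136227 | 1.3% | 89.6% | 78.6% | 90%
Sample 2 | 115298 | 1.0% | 88.5% | 77.5% | 90%
Sample 3 | 114045 | 0.9% | 88.1% | 78.7% | 90%
In Table 1, samples 1 to 3 are three replicates of the same sample; the mapping rate is the proportion of reads mapped to the genome; specificity is the proportion of reads in the target regions among all sequenced reads; and uniformity is the proportion of target regions whose depth exceeds 0.1 times the mean depth of the target regions, out of the total number of target regions.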
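The three metrics defined above can be written out as a short calculation. This is a sketch under stated assumptions: the function and argument names are hypothetical, the mapping rate and specificity are both taken relative to the total number of reads, and the input numbers are invented for illustration rather than taken from Table 1.

```python
def library_metrics(total_reads, mapped_reads, on_target_reads,
                    region_depths, fraction=0.1):
    """Mapping rate, specificity and uniformity as defined in the text.

    Uniformity is the share of target regions whose depth exceeds
    `fraction` times the mean depth over all target regions.
    """
    mean_depth = sum(region_depths) / len(region_depths)
    above = sum(1 for depth in region_depths if depth > fraction * mean_depth)
    return {
        "mapping_rate": mapped_reads / total_reads,
        "specificity": on_target_reads / total_reads,
        "uniformity": above / len(region_depths),
    }


# Invented numbers for illustration only.
metrics = library_metrics(
    total_reads=1000,
    mapped_reads=900,
    on_target_reads=800,
    region_depths=[100, 120, 80, 5, 95],
)
print(metrics)
```

In this invented case the mean depth is 80, so the threshold is 8 and four of the five regions pass, giving a 0.1× uniformity of 80%.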
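As Table 1 shows, the adapter filtering ratio of each sample is around 1%; together with the library quality-control result in Figure 5, this indicates that very few primer dimers formed. The mapping rates are 88.1-89.6% and the specificity is 77.5-78.7%, indicating good performance, and depth is uniform across the amplicons.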
实施例2:甲基化多重PCR建库测序
实验设计:采用打断到200-300bp的炎黄基因组DNA,然后按照本发明所提供的方法对DNA靶向甲基化文库制备,文库到MGISEQ-2000测序仪上进行上机测序,测序类型PE100,然后进行数据分析,包括数据利用率、比对率、扩增子特异性、均一性等性能。
1、末端修复
(1)将上一步获得的DNA片段按照下表在1.5mL的离心管中配制末端修复反应体系:
Figure PCTCN2019087824-appb-000007
(2)将上述反应体系置于20℃的Thermomixer(Eppendorf)上,进行反应30min。反应完后用AMPure磁珠进行纯化,最后将纯化产物溶于34μl洗脱缓冲液。上述试剂均采用enzymatic公司的试剂。
2、末端添加碱基A:
(1)将上一步得到的DNA按下表在1.5mL的离心管中配制添加碱基A的反应体系:
Figure PCTCN2019087824-appb-000008
(2)将上述反应体系置于37℃的Thermomixer(Eppendorf)上,进行反应30min。反应完后用AMPure磁珠进行纯化,最后将纯化产物溶于20μl洗脱缓冲液。
2、连接甲基化接头1:
(1)将上一步得到的DNA按下表配制甲基化接头(有时也称为“甲基化标签接头”)的连接反应体系:
Figure PCTCN2019087824-appb-000009
*甲基化接头序列为:
接头1:5’/5Phos/AGTCGGAGGCCAAGCGGT(SEQ ID NO:25)
接头2:5’ACATGGCTACGATCCGACTddT(SEQ ID NO:26)
接头1序列中的C均进行了甲基化修饰保护,接头2中的序列可进行或不进行甲基化修饰保护,接头2中的3端的最后一个碱基进行阻断修饰防止和模板进行连接,即进行了双脱氧修饰。
(2)将上述反应体系置于20℃的Thermomixer(Eppendorf)上,进行反应15min,获得连接产物。反应完后用AMPure磁珠进行纯化,最后将纯化产物溶于22μl洗脱缓冲液。
3、亚硫酸盐处理
采用试剂盒EZ DNA Methylation-Gold Kit TM(ZYMO公司)将上述连接好的DNA进行重亚硫酸盐共处理。
(1)准备试剂:
制备CT转换试剂(CT Conversion Reagent)溶液:从试剂盒中取出CT转换试剂(固体混合物),分别加入900μL的水、50μL的M-溶解缓冲液(M-Dissolving Buffer)和300μL的M-稀释缓冲液(M-Dilution Buffer),室温下溶解并且震荡10分钟或在摇床上摇动10分钟。
M-洗涤缓冲液的制备:向M-洗涤缓冲液中添加24mL 100%的乙醇,备用。
(2)在PCR管中加入130μL的CT转换试剂溶液和上述连接好的DNA,轻弹或移液器吹悬混合样品。
然后将样品管放到PCR仪上按以下步骤操作:
98℃下持续5分钟,64℃下持续2.5小时。
完成上述操作后,立刻进行下一步操作或者在4℃下存储(最多20小时)备用。
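(3) Place a Zymo-Spin™ IC Column into a Collection Tube and add 600 μL of M-Binding Buffer.
Then load the bisulfite-treated sample into the Zymo-Spin™ IC Column containing the M-Binding Buffer, close the cap, and mix by inverting.
Centrifuge at full speed (>10,000 × g) for 30 seconds and discard the flow-through in the collection tube.
Add 100 μL of M-Wash Buffer to the column, centrifuge at full speed (>10,000 × g) for 30 seconds, and discard the liquid in the collection tube.
Add 200 μL of M-Desulphonation Buffer to the column, let it stand at room temperature for 15 min, centrifuge at full speed (>10,000 × g) for 30 s, and discard the liquid in the collection tube.
Add 200 μL of M-Wash Buffer to the column, centrifuge at full speed (>10,000 × g) for 30 s, and discard the liquid in the collection tube; repeat this step once more.
Place the Zymo-Spin™ IC Column into a new 1.5 mL EP tube, add 18 μL of M-Elution Buffer to the column matrix, let it stand at room temperature for 2 min, and centrifuge at full speed (>10,000 × g) to elute the target DNA fragments.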
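4. First round of PCR
(1) Set up the PCR in a PCR tube according to the reaction system below. The primers in the upstream specific primer pool are listed in Table 3, and the first universal primer is listed in Table 5.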
Figure PCTCN2019087824-appb-000010
(2) PCR conditions
Figure PCTCN2019087824-appb-000011
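After the reaction, purify with 1.5× AMPure magnetic beads and dissolve the purified product in 22 μl of elution buffer.
5. Second round of PCR
(1) Set up the PCR in a PCR tube according to the reaction system below. The primers in the nested primer pool are listed in Table 4, and the second universal primer and the tag primer are listed in Table 5.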
Figure PCTCN2019087824-appb-000012
(2) PCR conditions
Figure PCTCN2019087824-appb-000013
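After the reaction, purify with 1.0× AMPure magnetic beads and dissolve the purified product in 22 μl of elution buffer.
6. Library quality control:
Measure the size and content of the library fragments with a Bioanalyzer analysis system (Agilent, Santa Clara, USA); the results are shown in Figure 7.
7. Sequencing
Subject the resulting library to high-throughput sequencing on the MGI MGISEQ-2000 platform with PE100 reads. After alignment, calculate the basic parameters, including raw data, usable data, mapping rate, specificity, and uniformity; the results are shown in Table 2. The sequencing depth of each amplicon is shown in Figure 8.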
Table 2. Sequencing results
Sample     Raw data   Adapter-filtered fraction   Mapping rate   Specificity   Uniformity
Sample 1   112792     0.8%                        84.3%          89.3%         100%
Sample 2   131590     1.1%                        85.6%          90.8%         100%
Sample 3   120311     0.9%                        86.1%          90.7%         100%
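In Table 2, Samples 1 to 3 are three replicates of the same sample. The mapping rate is the proportion of reads aligned to the genome; specificity is the proportion of all sequenced reads that fall within the target regions; and uniformity is the proportion of target regions whose depth is greater than 0.1 times the mean depth of the target regions.
The results in Table 2, Figure 7, and Figure 8 show that, with the amplification method provided by the present invention, the adapter-filtered fraction is about 1% and few primer dimers are formed; the mapping rate is between 84% and 86% and the specificity is between 89% and 90%, indicating good performance; and the coverage depth is uniform across amplicons.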
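As a small worked check of the replicate statistics, the ranges and means of the Table 2 percentages can be recomputed as follows. The values are copied from Table 2; the dictionary keys and the helper name are illustrative choices, not part of the patent.

```python
from statistics import mean

# Percentages copied from Table 2 (three replicates of the same sample).
table2 = {
    "adapter_filtered": [0.8, 1.1, 0.9],
    "mapping_rate": [84.3, 85.6, 86.1],
    "specificity": [89.3, 90.8, 90.7],
}


def summarize(values):
    """Return (minimum, maximum, mean rounded to one decimal place)."""
    return (min(values), max(values), round(mean(values), 1))


summary = {name: summarize(values) for name, values in table2.items()}
```

Here `summary["mapping_rate"]` spans 84.3 to 86.1 with a mean of about 85.3, and `summary["specificity"]` spans 89.3 to 90.8 with a mean of about 90.3.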
Table 3: First specific primer pool
Figure PCTCN2019087824-appb-000014
Figure PCTCN2019087824-appb-000015
The first specific primer pool is an equimolar mixture of the above primers; Y is a degenerate base representing C/T.
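A degenerate Y position means the synthesized primer is a mixture of sequences carrying C or T at that position. After bisulfite conversion a template cytosine reads as C if it was methylated and as T if it was not, so a Y presumably lets the primer pair with either state. The sketch below enumerates the concrete sequences behind a primer containing Y; the example primer is hypothetical and is not one of SEQ ID NO:1-20.

```python
from itertools import product

# Only the symbols needed here; Y is the IUPAC code for C or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT"}


def expand_degenerate(primer: str):
    """List every concrete sequence encoded by a primer that may contain Y."""
    try:
        choices = [IUPAC[base] for base in primer.upper()]
    except KeyError as exc:
        raise ValueError(f"unsupported base: {exc.args[0]}") from None
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer with two degenerate positions (assumed for illustration).
variants = expand_degenerate("GGTTYGTAYGA")
```

A primer with n degenerate Y positions expands to 2^n sequences, so the two Y bases in the example give four variants.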
Table 4: Nested primer pool
Figure PCTCN2019087824-appb-000016
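The second specific primer pool is an equimolar mixture of the above primers; Y is a degenerate base representing C/T.
Table 5: Universal primers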
Figure PCTCN2019087824-appb-000017
Figure PCTCN2019087824-appb-000018
Here, the N bases represent the barcode sequence used on the MGI sequencing platform.
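The role of the N placeholder can be sketched as a simple substitution: each sample gets a copy of the tag primer in which the run of N is replaced by that sample's barcode. The template and barcode below are hypothetical stand-ins (the real tag primer is given in Table 5), and the 10-base barcode length is an assumption made only for this example.

```python
import re


def fill_barcode(template: str, barcode: str) -> str:
    """Replace the single run of N in `template` with `barcode`."""
    runs = re.findall(r"N+", template)
    if len(runs) != 1:
        raise ValueError("template must contain exactly one run of N")
    if len(runs[0]) != len(barcode):
        raise ValueError("barcode length must match the length of the N run")
    if not re.fullmatch(r"[ACGT]+", barcode):
        raise ValueError("barcode must contain only A, C, G, T")
    return template.replace(runs[0], barcode, 1)


# Hypothetical template and barcode, for illustration only.
TEMPLATE = "TGTGAGCC" + "N" * 10 + "TTGTCTTC"
primer = fill_barcode(TEMPLATE, "ACGTTGCAAG")
```

The length check is there so that a barcode of the wrong size cannot silently shift the flanking primer sequence.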
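In the description of the present invention, the terms "first", "second", and the like are used for descriptive purposes only and shall not be construed as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means at least two, for example two or three, unless otherwise expressly and specifically defined.
In the present invention, unless otherwise expressly specified and defined, terms such as "linked", "connected", and "fixed" shall be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical, electrical, or communicative; it may be direct or indirect through an intermediate medium; and it may be an internal communication between two elements or an interaction between two elements, unless otherwise expressly defined. Those of ordinary skill in the art can understand the specific meanings of the above terms in the present invention according to the specific circumstances.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict each other, those skilled in the art may combine different embodiments or examples described in this specification and the features of different embodiments or examples.
Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and shall not be construed as limiting the present invention; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present invention.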

Claims (32)

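  1. A method for constructing a sequencing library based on a target region of methylated DNA, characterized by comprising:
    (1) based on the methylated DNA sample, attaching a universal sequence to at least one end of the methylated DNA sample and treating the DNA sample with bisulfite, so as to obtain a converted DNA sample carrying the universal sequence;
    (2) performing a first amplification on the converted DNA sample carrying the universal sequence using a first specific primer and a first universal primer, so as to obtain a first amplification product;
    wherein the first specific primer is located upstream of the target region, and the first universal primer at least partially matches or overlaps the universal sequence;
    (3) performing a second amplification on the first amplification product using a second specific primer, a second universal primer, and a tag primer to obtain a second amplification product, thereby obtaining the sequencing library;
    wherein the second specific primer, the second universal primer, and the tag primer are as defined in (i) or (ii):
    (i) the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least part of the sequence of the second specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the first universal primer;
    (ii) the second specific primer is located downstream of the target region, the second universal primer overlaps at least part of the sequence of the first specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the second specific primer.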
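  2. The method according to claim 1, characterized in that, in step (3), the 5’ end of the second specific primer overlaps at least part of the sequence at the 3’ end of the second universal primer, and the 3’ end of the tag primer overlaps part of the sequence at the 5’ end of the first universal primer.
  3. The method according to claim 1, characterized in that, in step (3), the 5’ end of the second specific primer overlaps at least part of the sequence at the 3’ end of the tag primer, and the 3’ end of the second universal primer overlaps part of the sequence at the 5’ end of the first specific primer.
  4. The method according to claim 1, characterized in that the tag sequence is 8 to 12 bp in length.
  5. The method according to claim 1, characterized in that step (1) further comprises:
    (1-a) treating the methylated DNA sample with bisulfite, so as to obtain a converted DNA sample;
    (1-b) copying the converted DNA sample using a DNA polymerase and a first sequencing primer, so as to obtain the converted DNA sample carrying the universal sequence, wherein the 3’ end of the first sequencing primer consists of random bases and the 5’ end of the first sequencing primer is the universal sequence.
  6. The method according to claim 5, characterized in that there are 6 to 12 random bases, and each random base is A, T, C, or G;
    optionally, there are 6 to 12 random bases, and each random base is A, T, or C.
  7. The method according to claim 5, characterized in that the universal sequence is a sequencing adapter sequence or a fixed sequence;
    optionally, the cytosines in the sequencing adapter sequence or the fixed sequence are methylated cytosines.
  8. The method according to claim 1, characterized in that step (1) further comprises:
    (1-1) performing end repair and A-addition on the methylated DNA sample, so as to obtain a repaired DNA sample;
    (1-2) ligating a universal sequence to at least one end of the repaired DNA sample, so as to obtain a DNA sample carrying the universal sequence;
    (1-3) treating the DNA sample carrying the universal sequence with bisulfite, so as to obtain the converted DNA sample carrying the universal sequence.
  9. The method according to claim 8, characterized in that the universal sequence is at least one selected from the following:
    a sequencing adapter sequence or a modified sequencing adapter sequence;
    optionally, the modified sequencing adapter sequence is a sequencing adapter sequence in which the cytosines of one strand are methylated and the cytosines of the other strand are not methylated, a sequencing adapter sequence in which the 3’ terminal base of one strand carries a non-hydroxyl modification, a sequencing adapter sequence carrying a fixed sequence and a random sequence, or a sequencing adapter sequence carrying a fixed sequence and a random sequence in which the 3’ terminal base of one strand carries a non-hydroxyl modification;
    optionally, the random sequence is a molecular tag sequence.
  10. The method according to claim 1, characterized in that step (1) further comprises:
    ① fragmenting and transposing the DNA sample with a transposase, so as to obtain a DNA sample carrying the universal sequence, wherein the universal sequence is embedded in the transposase;
    ② treating the DNA sample carrying the universal sequence with bisulfite, so as to obtain the converted DNA sample carrying the universal sequence.
  11. The method according to claim 10, characterized in that the universal sequence is a transposase effector sequence or a Tn5 transposase effector sequence carrying a sequencing adapter, preferably a transposase effector sequence;
    preferably, the cytosines in the transposase effector sequence are methylated cytosines.
  12. The method according to claim 1, characterized in that the methylated DNA sample is genomic DNA, fragmented genomic DNA, or cell-free DNA.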
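  13. A system for constructing a sequencing library based on a target region of methylated DNA, characterized by comprising:
    a universal conversion module, which, based on the methylated DNA sample, attaches a universal sequence to at least one end of the methylated DNA sample and treats the DNA sample with bisulfite, so as to obtain a converted DNA sample carrying the universal sequence;
    a first amplification module, connected to the universal conversion module, which performs a first amplification on the converted DNA sample carrying the universal sequence using a first specific primer and a first universal primer, so as to obtain a first amplification product, wherein the first specific primer is located upstream of the target region, and the first universal primer at least partially matches or overlaps the universal sequence;
    a second amplification module, connected to the first amplification module, which performs a second amplification on the first amplification product using a second specific primer, a second universal primer, and a tag primer to obtain a second amplification product, thereby obtaining the sequencing library;
    wherein the second specific primer, the second universal primer, and the tag primer are as defined in (i) or (ii):
    (i) the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least part of the sequence of the second specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the first universal primer;
    (ii) the second specific primer is located downstream of the target region, the second universal primer overlaps at least part of the sequence of the first specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the second specific primer.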
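  14. The system according to claim 13, characterized in that, in the second amplification module, the 5’ end of the second specific primer overlaps at least part of the sequence at the 3’ end of the second universal primer, and the 3’ end of the tag primer overlaps part of the sequence at the 5’ end of the first universal primer.
  15. The system according to claim 13, characterized in that, in the second amplification module, the 5’ end of the second specific primer overlaps at least part of the sequence at the 3’ end of the tag primer, and the 3’ end of the second universal primer overlaps part of the sequence at the 5’ end of the first specific primer.
  16. The system according to claim 13, characterized in that the tag sequence is 8 to 12 bp in length.
  17. The system according to claim 13, characterized in that the universal conversion module further comprises:
    a conversion unit, which treats the methylated DNA sample with bisulfite, so as to obtain a converted DNA sample;
    an amplification unit, connected to the conversion unit, which copies the converted DNA sample using a DNA polymerase and a first sequencing primer, so as to obtain the converted DNA sample carrying the universal sequence, wherein the 3’ end of the first sequencing primer consists of random bases and the 5’ end of the first sequencing primer is the universal sequence.
  18. The system according to claim 17, characterized in that there are 6 to 12 random bases, and each random base is A, T, C, or G;
    optionally, there are 6 to 12 random bases, and each random base is A, T, or C.
  19. The system according to claim 17, characterized in that the universal sequence is a sequencing adapter sequence or a fixed sequence;
    optionally, the cytosines in the sequencing adapter sequence or the fixed sequence are methylated cytosines.
  20. The system according to claim 13, characterized in that the universal conversion module further comprises:
    a repair unit, which performs end repair and A-addition on the methylated DNA sample, so as to obtain a repaired DNA sample;
    a ligation unit, connected to the repair unit, which ligates a universal sequence to at least one end of the repaired DNA sample, so as to obtain a DNA sample carrying the universal sequence;
    a conversion unit, connected to the ligation unit, which treats the DNA sample carrying the universal sequence with bisulfite, so as to obtain the converted DNA sample carrying the universal sequence.
  21. The system according to claim 20, characterized in that the universal sequence is at least one selected from the following:
    a sequencing adapter sequence or a modified sequencing adapter sequence;
    optionally, the modified sequencing adapter sequence is a sequencing adapter sequence in which the cytosines of one strand are methylated and the cytosines of the other strand are not methylated, a sequencing adapter sequence in which the 3’ terminal base of one strand carries a non-hydroxyl modification, a sequencing adapter sequence carrying a fixed sequence and a random sequence, or a sequencing adapter sequence carrying a fixed sequence and a random sequence in which the 3’ terminal base of one strand carries a non-hydroxyl modification;
    optionally, the random sequence is a molecular tag sequence.
  22. The system according to claim 13, characterized in that the universal conversion module further comprises:
    a transposition unit, which fragments and transposes the DNA sample with a transposase, so as to obtain a DNA sample carrying the universal sequence, wherein the universal sequence is embedded in the transposase;
    a conversion unit, connected to the transposition unit, which treats the DNA sample carrying the universal sequence with bisulfite, so as to obtain the converted DNA sample carrying the universal sequence.
  23. The system according to claim 22, characterized in that the universal sequence is a transposase effector sequence or a transposase effector sequence carrying a sequencing adapter, preferably a transposase effector sequence;
    preferably, the cytosines in the transposase effector sequence are methylated cytosines.
  24. The system according to claim 13, characterized in that the methylated DNA sample is genomic DNA, fragmented genomic DNA, or cell-free DNA.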
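  25. A method for sequencing a methylated DNA sample, characterized by comprising:
    based on the methylated DNA sample, constructing a sequencing library by the method according to any one of claims 1 to 12 or with the system according to any one of claims 13 to 24;
    performing high-throughput sequencing on the sequencing library, so as to obtain a sequencing result.
  26. The method according to claim 25, characterized in that the high-throughput sequencing of the sequencing library is performed on a sequencing platform, the sequencing platform being at least one selected from MGISEQ, Illumina, and Proton.
  27. A method for determining the methylation status of a methylated DNA sample, characterized by comprising:
    based on the methylated DNA sample, constructing a sequencing library by the method according to any one of claims 1 to 12 or with the system according to any one of claims 13 to 24;
    performing high-throughput sequencing on the sequencing library, so as to obtain a sequencing result;
    aligning the sequencing result with a reference genome, so as to determine the methylation status of the methylated DNA sample.
  28. The method according to claim 27, characterized in that the reference genome is the human genome hg19 or the YanHuang genome.
  29. A kit, characterized by comprising: a universal sequence, a tag primer, a first universal primer, a second universal primer, and a methylation detection reagent;
    wherein the tag primer contains a tag sequence, the first universal primer at least partially matches or overlaps the universal sequence, the first universal primer is SEQ ID NO:21, and the second universal primer is SEQ ID NO:22.
  30. The kit according to claim 29, characterized in that the tag primer is as shown in SEQ ID NO:23.
  31. The kit according to claim 29 or 30, characterized by further comprising: a first specific primer and a second specific primer, wherein the first specific primer comprises the sequences shown in SEQ ID NO:1 to SEQ ID NO:10, and the second specific primer comprises the sequences shown in SEQ ID NO:11 to SEQ ID NO:20.
  32. The kit according to claim 29, characterized in that the kit constructs a sequencing library based on a target region of methylated DNA using the method according to claims 1 to 12.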
PCT/CN2019/087824 2019-05-21 2019-05-21 Method and system for constructing sequencing library on the basis of methylated DNA target region, and use thereof WO2020232635A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP19929647.6A EP3950956A4 (en) 2019-05-21 2019-05-21 METHOD AND SYSTEM FOR CREATING A SEQUENCE LIBRARY BASED ON METHYLATED DNA TARGET REGION AND THEIR USE
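CN201980092935.8A CN113811618B (zh) 2019-05-21 2019-05-21 Method and system for constructing sequencing library on the basis of methylated DNA target region, and use thereof
JP2022502317A JP7203276B2 (ja) 2019-05-21 2019-05-21 Method and kit for constructing a sequencing library on the basis of a methylated DNA target region
PCT/CN2019/087824 WO2020232635A1 (zh) 2019-05-21 2019-05-21 Method and system for constructing sequencing library on the basis of methylated DNA target region, and use thereof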
US17/493,991 US20220056519A1 (en) 2019-05-21 2021-10-05 Method and system for constructing sequencing library on the basis of methylated dna target region, and use thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/087824 WO2020232635A1 (zh) 2019-05-21 2019-05-21 Method and system for constructing sequencing library on the basis of methylated DNA target region, and use thereof

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/493,991 Continuation US20220056519A1 (en) 2019-05-21 2021-10-05 Method and system for constructing sequencing library on the basis of methylated dna target region, and use thereof

Publications (1)

Publication Number Publication Date
WO2020232635A1 (zh) 2020-11-26

Family

ID=73459291

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/087824 WO2020232635A1 (zh) 2019-05-21 2019-05-21 Method and system for constructing sequencing library on the basis of methylated DNA target region, and use thereof

Country Status (5)

Country Link
US (1) US20220056519A1 (zh)
EP (1) EP3950956A4 (zh)
JP (1) JP7203276B2 (zh)
CN (1) CN113811618B (zh)
WO (1) WO2020232635A1 (zh)


Also Published As

Publication number Publication date
JP2022525373A (ja) 2022-05-12
CN113811618A (zh) 2021-12-17
CN113811618B (zh) 2024-02-09
US20220056519A1 (en) 2022-02-24
EP3950956A4 (en) 2022-05-04
JP7203276B2 (ja) 2023-01-12
EP3950956A1 (en) 2022-02-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 19929647; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase. Ref document number: 2022502317; Country of ref document: JP; Kind code of ref document: A
ENP Entry into the national phase. Ref document number: 2019929647; Country of ref document: EP; Effective date: 20211029
NENP Non-entry into the national phase. Ref country code: DE