WO2014089797A1 - Locked nucleic acid-modified DNA fragment for high-throughput sequencing - Google Patents

Info

Publication number
WO2014089797A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequencing
primer
nucleic acid
locked nucleic
chain
Prior art date
Application number
PCT/CN2012/086521
Other languages
English (en)
Chinese (zh)
Inventor
龚梅花
章文蔚
李计广
朱鹏远
Original Assignee
深圳华大基因科技服务有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大基因科技服务有限公司
Priority to PCT/CN2012/086521
Publication of WO2014089797A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/30: Chemical structure
    • C12N2310/32: Chemical structure of the sugar
    • C12N2310/323: Chemical structure of the sugar modified ring structure
    • C12N2310/3231: Chemical structure of the sugar modified ring structure having an additional ring, e.g. LNA, ENA

Definitions

  • the present invention relates to high throughput sequencing related techniques, and in particular to LNA modified DNA fragments for high throughput sequencing, including DNA linkers, PCR primers and/or sequencing primers.
  • the present invention also relates to a high throughput sequencing method comprising a DNA library preparation and sequencing method for high throughput sequencing, which comprises the step of using an LNA modified DNA fragment.
  • Background art
  • Locked nucleic acid (LNA) is a synthetic antisense oligonucleotide, a special bicyclic nucleotide derivative in which the 2'-oxygen and 4'-carbon of the ribose ring (β-D-ribofuranose) of the nucleotide residue form a methylene linkage by condensation. The structure contains one or more 2'-O,4'-C-methylene-β-D-ribofuranosyl monomers; the 2'-O position and the 4'-C position of the ribose form an oxymethylene bridge, a thiomethylene bridge or an aminomethylene bridge through different condensation reactions and are joined into a ring. This ring bridge locks the furanose in the C3'-endo N-type conformation, which reduces the flexibility of the ribose structure and increases the stability of the local structure of the phosphate backbone. Because LNA and DNA/RNA have the same phosphate backbone, LNA has good recognition ability and strong affinity for DNA and RNA (Li Shengmao, Xu Xiang, Liang Huaping, Research progress in locked nucleic acid, Progress in Physiological Sciences, 2003, 34(4), 319-323).
  • the present invention includes the following aspects:
  • a first aspect of the invention relates to a locked nucleic acid modified DNA fragment for high throughput sequencing, wherein the DNA fragment is selected from one, two or three of a linker, a PCR primer and a sequencing primer, characterized in that the linker, PCR primer and/or sequencing primer in the DNA fragment contains a locked nucleic acid.
  • the high throughput sequencing refers to SOLEXA sequencing.
  • the DNA fragment is a PCR primer.
  • the DNA fragment is a sequencing primer.
  • the DNA fragment is a linker and a PCR primer.
  • the DNA fragment is a PCR primer and a sequencing primer. In another embodiment of the invention, the DNA fragment is a linker, a PCR primer, and a sequencing primer.
  • the locked nucleic acid contained in the linker is located at the 5th nucleotide from the 5' end of the linker F chain and at the 3rd nucleotide from the 3' end of the linker R chain; the number of locked nucleic acids in each of the F chain and the R chain of the linker is one.
  • the sequence of the linker F chain is the sequence shown in SEQ ID NO: 3; in a specific embodiment of the invention, the sequence of the linker R chain is the sequence shown in SEQ ID NO: 4.
  • the sense primer contains a locked nucleic acid located near the 5' end of the PCR primer; preferably, near the 5' end refers to the 2nd to 5th (for example, 2nd, 3rd, 4th, 5th) nucleotides from the 5' end of the primer; and/or the antisense primer contains a locked nucleic acid located near the 3' end of the PCR primer; preferably, near the 3' end refers to the 2nd to 5th (for example, 2nd, 3rd, 4th, 5th) nucleotides from the 3' end of the primer; further preferably, the number of locked nucleic acids in the PCR primer is 1 to 3 (for example, 1, 2, 3).
  • the sense primer refers to a PCR primer for amplifying the coding strand in genomic DNA; in one embodiment of the present invention, it is a PCR primer whose partial sequence is identical to the fixed sequence P5 on the chip.
  • the PCR primer whose partial sequence is identical to the fixed sequence P5 on the chip is the PCR Primer PE 1.0 primer.
  • the locked nucleic acid contained in the PCR primer is located at the 3rd nucleotide from the 5' end of the PCR Primer PE 1.0 primer, and the number of locked nucleic acids is 1.
  • the antisense primer refers to a PCR primer for amplifying the template strand in genomic DNA; in one embodiment of the invention, it is a PCR primer whose partial sequence is identical to the fixed sequence P7 on the chip.
  • the PCR primer corresponding to the fixed sequence P7 on the chip comprises a tag sequence for distinguishing different sequencing results, so the number of such primers is related to the number of libraries that are mixed and sequenced on the chip. In a specific embodiment of the present invention, four different libraries are mixed and sequenced, so the number of tag sequences is four, and the number of PCR primers corresponding to the fixed sequence P7 on the chip is also four.
  • the tag consists of A/T/C/G and serves to identify different libraries, allowing different libraries to be mixed and sequenced together to take full advantage of sequencing throughput.
  • the tag has a length of 8 nt, such as AGAGACTT, GCGAGGCC, AGATCTCT or TAGAGAGC.
  • tags can be introduced by ligation or by PCR.
  • here, PCR primers are used to introduce the tags by PCR, so that different libraries carry different tags for sequencing.
  • the PCR primer whose partial sequence is identical to the fixed sequence P7 on the chip is a PCR Primer PE 2.0 primer, for example, PCR Primer PE 2.0 primers A, B, C and D.
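The role of the 8 nt tags can be illustrated with a minimal demultiplexing sketch. It uses the four tag sequences quoted above; the library names, the function names, and the one-mismatch tolerance are assumptions made for illustration and are not taken from the patent.

```python
# The four 8 nt tags quoted in the text. Which tag belongs to which
# PE 2.0 primer (A-D) is not stated in the source, so the library
# names below are hypothetical labels.
TAGS = {
    "AGAGACTT": "library_1",
    "GCGAGGCC": "library_2",
    "AGATCTCT": "library_3",
    "TAGAGAGC": "library_4",
}


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_library(index_read: str, max_mismatches: int = 1):
    """Return the library whose tag uniquely matches the index read, else None."""
    hits = [name for tag, name in TAGS.items()
            if hamming(index_read, tag) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None
```

Reads whose index matches no tag, or more than one tag, are left unassigned rather than guessed.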
  • the locked nucleic acid contained in the PCR primer is located at the 3rd nucleotide from the 3' end of the PCR Primer PE 2.0 primer, and the number of locked nucleic acids is 1.
  • the sequence of the PCR primer is the sequence shown in SEQ ID NO: 6.
  • the sequence of the PCR primer is the sequence shown in SEQ ID NOs: 11-14.
  • the locked nucleic acid contained in the sequencing primer is located near the 3' end of the sequencing primer; preferably, near the 3' end of the sequencing primer refers to the 2nd to 5th (for example, 2nd, 3rd, 4th, 5th) nucleotides from the 3' end of the sequencing primer; preferably, the number of locked nucleic acids in the sequencing primer is 1 to 3 (for example, 1, 2, 3).
  • the sequencing primer is a Read2 sequencing primer.
  • the locked nucleic acid contained in the sequencing primer is located at the 2nd and 4th nucleotides from the 3' end of the sequencing primer, and the number of locked nucleic acids in the sequencing primer is 2.
  • the sequence of the sequencing primer is the sequence shown in SEQ ID NO: 16.
  • a second aspect of the invention relates to a composition comprising the DNA fragment of any of the first aspects of the invention.
  • a third aspect of the invention relates to a method for constructing a DNA library, the method comprising the step of performing a locked nucleic acid modification on a DNA fragment, the DNA fragment being a linker and/or a PCR primer, and the locked nucleic acid modification meaning that the linker and/or PCR primer in the DNA fragment contains a locked nucleic acid; preferably,
  • the locked nucleic acid contained in the linker is located near the 5' end of the linker F chain and/or near the 3' end of the linker R chain; preferably, near the 5' end of the F chain refers to the 2nd to 5th (for example, 2nd, 3rd, 4th, 5th) nucleotides from the 5' end of the F chain, and near the 3' end of the R chain refers to the 2nd to 5th (for example, 2nd, 3rd, 4th, 5th) nucleotides from the 3' end of the R chain; preferably, the number of locked nucleic acids in the F chain or R chain of the linker is 1 to 3 (for example, 1, 2, 3);
  • the sense primer contains a locked nucleic acid located near the 5' end of the PCR primer; preferably, near the 5' end refers to the 2nd to 5th (for example, 2nd, 3rd, 4th, 5th) nucleotides from the 5' end of the primer; and/or the antisense primer contains a locked nucleic acid located near the 3' end of the PCR primer; preferably, near the 3' end refers to the 2nd to 5th (for example, 2nd, 3rd, 4th, 5th) nucleotides from the 3' end of the primer; further preferably, the number of locked nucleic acids in the PCR primer is 1 to 3 (for example, 1, 2, 3).
  • the locked nucleic acid contained in the linker is located at the 5th nucleotide from the 5' end of the F chain and at the 3rd nucleotide from the 3' end of the R chain, and the number of locked nucleic acids in each of the F chain and the R chain is one.
  • the sequence of the linker F chain is the sequence shown in SEQ ID NO: 3; in a specific embodiment of the invention, the sequence of the linker R chain is the sequence shown in SEQ ID NO: 4.
  • the sense primer refers to a PCR primer for amplifying the coding strand in genomic DNA; in one embodiment of the present invention, it is a PCR primer whose partial sequence is identical to the fixed sequence P5 on the chip.
  • the PCR primer whose partial sequence is identical to the fixed sequence P5 on the chip is the PCR Primer PE 1.0 primer.
  • the locked nucleic acid contained in the PCR primer is located at the 3rd nucleotide from the 5' end of the PCR Primer PE 1.0 primer, and the number of locked nucleic acids is 1.
  • the antisense primer refers to a PCR primer for amplifying the template strand in genomic DNA; in one embodiment of the invention, it is a PCR primer whose partial sequence is identical to the fixed sequence P7 on the chip.
  • the PCR primers corresponding to the fixed sequence P7 on the chip comprise a tag sequence for distinguishing between different sequencing results, so the number of these PCR primers is the same as the number of tag sequences. In a specific embodiment of the present invention, the number of tag sequences is four, and therefore the number of PCR primers corresponding to the fixed sequence P7 on the chip is four.
  • the PCR primer whose partial sequence is identical to the fixed sequence P7 on the chip is a PCR Primer PE 2.0 primer, for example, PCR Primer PE 2.0 primers A, B, C and D.
  • the locked nucleic acid contained in the PCR primer is located at the 3rd nucleotide from the 3' end of the PCR Primer PE 2.0 primer, and the number of locked nucleic acids is 1.
  • the sequence of the PCR primer is the sequence shown in SEQ ID NO: 6.
  • the sequence of the PCR primer is the sequence shown in SEQ ID NOs: 11-14.
  • a fourth aspect of the invention relates to a method of sequencing a DNA library, the method comprising using a sequencing primer modified with a locked nucleic acid;
  • the locked nucleic acid contained in the sequencing primer is located near the 3' end of the sequencing primer; preferably, near the 3' end of the sequencing primer refers to the 2nd to 5th (for example, 2nd, 3rd, 4th, 5th) nucleotides from the 3' end of the sequencing primer; preferably, the number of locked nucleic acids in the sequencing primer is 1 to 3 (for example, 1, 2, 3).
  • the sequencing primer is a Read2 sequencing primer.
  • the locked nucleic acid contained in the sequencing primer is located at the 2nd and 4th nucleotides from the 3' end of the sequencing primer, and the number of locked nucleic acids in the sequencing primer is 2.
  • the sequence of the sequencing primer is the sequence shown in SEQ ID NO: 16.
  • a fifth aspect of the present invention relates to a high-throughput sequencing method comprising a method for constructing a DNA library and a method for sequencing a DNA library, wherein the method for constructing the DNA library is the method according to any one of the third aspect of the present invention, and the method for sequencing the DNA library is the method according to any one of the fourth aspect of the invention.
  • a sixth aspect of the invention relates to the use of a DNA fragment according to any of the first aspects of the invention in high throughput sequencing, construction of a DNA library or sequencing of a DNA library.
  • the high throughput sequencing refers to SOLEXA sequencing.
  • The invention is further described below.
  • high-throughput sequencing, also called "next-generation" sequencing, is characterized by the ability to determine the sequences of hundreds of thousands to millions of DNA molecules in parallel at a time. It makes detailed analysis of the transcriptome and genome of a species possible, so it is also known as deep sequencing. It includes but is not limited to: Massively Parallel Signature Sequencing (MPSS), polony sequencing, 454 pyrosequencing, Illumina (Solexa) sequencing, ABI SOLiD sequencing, ion semiconductor sequencing, DNA nanoball sequencing, Helicos single-molecule DNA sequencing, etc.
  • sequencing by synthesis, such as SOLEXA sequencing and the various sequencing technologies developed from Solexa sequencing.
  • SOLEXA sequencing is a next-generation sequencing technology developed by Illumina whose core idea is sequencing by synthesis. That is, as a new complementary DNA strand is generated, either the added dNTP triggers an enzymatic cascade reaction that catalyzes a substrate to fluoresce, or a fluorescently labeled dNTP or semi-degenerate primer is added directly, so that a fluorescent signal is released when the complementary strand is synthesized or ligated. Complementary strand sequence information is obtained by capturing the optical signal and converting it into a sequencing peak (Mardis ER, 2008).
  • SOLEXA sequencing includes sequencing of DNA samples and sequencing of RNA samples. Depending on the sequencing method, SOLEXA sequencing can be divided into single-end sequencing (Single-read Sequencing) and double-ended sequencing (Paired-end Sequencing and Mate-pair Sequencing). In an embodiment of the invention, the SOLEXA sequencing method is a Paired-end sequencing method.
  • the locked nucleic acid refers to a synthetic antisense oligonucleotide, which is a special bicyclic nucleotide derivative and is itself a kind of nucleotide.
  • the 2'-oxygen and 4'-carbon of the ribose ring (β-D-ribofuranose) of the nucleotide residue form a methylene linkage by condensation (see Formula I, where B is a base), and the structure contains one or more 2'-O,4'-C-methylene-β-D-ribofuranosyl monomers; the 2'-O position and the 4'-C position of the ribose form an oxymethylene bridge, a thiomethylene bridge or an aminomethylene bridge through different condensation reactions and are joined into a ring.
  • This ring bridge locks the furanose in the C3'-endo N-type conformation, reduces the flexibility of the ribose structure, and increases the stability of the local structure of the phosphate backbone.
  • the locked nucleic acid modification means that the nucleotide in the DNA fragment is replaced by a locked nucleic acid having the same base.
  • the DNA library refers to a library prepared by extracting genomic DNA from cells, fragmenting it to a size of about 100-1000 bp, ligating the linker to the fragments, and amplifying them by PCR.
  • the library is used for high throughput sequencing, such as SOLEXA sequencing.
  • the construction of the DNA library refers to a process from the extraction of DNA in a cell to the obtaining of a DNA library.
  • the sequencing of the DNA library refers to a process of sequencing the obtained DNA library to obtain a nucleotide sequence of each fragment in the library.
  • the DNA fragment refers to a small DNA fragment required for high-throughput sequencing such as SOLEXA sequencing, including a linker, a PCR primer, and a sequencing primer.
  • the genomic DNA fragment refers to a fragment obtained by disrupting the extracted genomic DNA.
  • the linker (adaptor) is used in high-throughput sequencing, particularly in SOLEXA sequencing, and specifically refers to a "Y"-type double-stranded DNA fragment added at the ends of the fragmented genomic DNA; the "Y"-type double-stranded DNA fragment is formed by annealing the F chain and the R chain.
  • the function of the linker is to add a known sequence to design the corresponding primer for PCR.
  • the PCR primer is used in high-throughput sequencing, particularly in SOLEXA sequencing.
  • the sequencing primers are used in high-throughput sequencing, particularly SOLEXA sequencing, and specifically refer to primers for sequencing a constructed DNA library.
  • the F chain of the linker refers to the forward oligonucleotide chain of the double-stranded linker, whose 5'-end sequence is complementary to the 3'-end sequence of the R chain, forming a Y-type double-stranded fragment; after the 5' end is phosphorylated, it can be ligated to the 3'-end A of the genomic DNA fragment after "A" addition.
  • the 3'-end sequence of the R chain of the linker is complementary to the 5'-end sequence of the F chain, forming a Y-type double-stranded fragment, and the 3' end may be ligated to the 5' end of the genomic DNA fragment.
  • near the 5' or 3' end of a DNA fragment means located within one third of the fragment length from the 5' or 3' end of the fragment.
  • the n-th nucleotide from the 5' or 3' end of a DNA fragment refers to the position of the n-th nucleotide counted from the first nucleotide at the 5' or 3' end of the fragment.
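The two positional definitions above can be sketched as a small helper: one function converts "the n-th nucleotide from the 5' or 3' end" into a 0-based string index, and one checks the "within one third of the length" criterion. The function names are hypothetical, and the choice of a 0-based index is only a programming convenience.

```python
def index_from_end(length: int, n: int, end: str) -> int:
    """0-based index of the n-th nucleotide counted from the 5' or 3' end."""
    if not 1 <= n <= length:
        raise ValueError("n must be between 1 and the fragment length")
    if end == "5'":
        return n - 1
    if end == "3'":
        return length - n
    raise ValueError("end must be \"5'\" or \"3'\"")


def is_near_end(length: int, n: int) -> bool:
    """True if the n-th nucleotide from an end lies within one third of the length."""
    return n <= length / 3
```

For a 20 nt primer, the 3rd nucleotide from the 5' end is index 2 and the 3rd nucleotide from the 3' end is index 17.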
  • the flowcell refers to a sequencing chip to which a single-stranded oligonucleotide sequence is attached.
  • the fixed sequence P7 refers to a binding sequence on the flowcell, corresponding to the 5'-end sequence of the template in the SBS (sequencing by synthesis) process.
  • the fixed sequence P5 refers to a binding sequence on a flowcell.
  • the sense primer is also referred to as an upstream primer, and refers to a primer which is identical to the 5' end sequence of the coding strand in the DNA fragment to be amplified.
  • the antisense primer, also referred to as a downstream primer, refers to a primer complementary to the 3'-end sequence of the coding strand in the DNA fragment to be amplified.
  • the Read1 sequencing primer refers to the sequencing primer used to read the 5'-end sequence of the library during paired-end sequencing.
  • the Read2 sequencing primer refers to the sequencing primer used to read the 3'-end sequence of the library during paired-end sequencing.
  • the nucleotide includes deoxyribonucleotides, ribonucleotides, and also includes a locked nucleic acid.
  • the position of the locked nucleic acid means that the deoxyribonucleotide at that place is replaced by a corresponding locked nucleic acid, that is, by a locked nucleic acid containing the same base.
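Because an LNA modification replaces a nucleotide with a locked nucleotide carrying the same base, a modified oligonucleotide can be written as the unmodified sequence plus a set of marked positions. The sketch below marks LNA positions with a "+" prefix; that notation, the function names, and the example sequence are assumptions for illustration, not the patent's SEQ ID listings.

```python
from typing import Iterable, Set


def positions_from_3p(length: int, ns: Iterable[int]) -> Set[int]:
    """Convert positions counted from the 3' end into 1-based positions from the 5' end."""
    return {length - n + 1 for n in ns}


def mark_lna(seq: str, positions_from_5p: Iterable[int]) -> str:
    """Prefix each base at a listed 1-based position with '+' to denote an LNA residue."""
    marked = set(positions_from_5p)
    return "".join("+" + base if i in marked else base
                   for i, base in enumerate(seq, start=1))


# Hypothetical 10 nt oligonucleotide with LNA at the 2nd and 4th
# nucleotides from the 3' end, mirroring the Read2 primer design rule.
example = mark_lna("ACGTACGTAC", positions_from_3p(10, [2, 4]))
```

Stripping the "+" markers recovers the original base sequence, which reflects that only the sugar is changed, not the base.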
  • libraries were constructed separately with LNA-modified linkers and PCR primers and with the commonly used small DNA fragments without LNA modification; after successful library preparation, Solexa high-throughput sequencing was performed, in which the LNA-modified library was sequenced with LNA-modified sequencing primers and the library without LNA modification was sequenced with the sequencing primers provided by Illumina, and the results were compared to verify the stability, reproducibility and reliability of the present invention.
  • the invention combines LNA modification technology with high-throughput sequencing technology. By performing LNA modification on the linker, PCR primer and/or sequencing primer involved in SOLEXA sequencing, it improves the thermal stability of the DNA fragments and their stability against enzymatic degradation, and activates RNase H, which reduces DNA dimer production and improves ligation and PCR efficiency.
  • the present invention optimizes the LNA modification site, and adopts different strategies for modification of different DNA fragments, thereby further improving the effect of LNA modification.
  • for the linker, the modification sites are located near the 5' end of the F chain and near the 3' end of the R chain, which can improve the annealing efficiency of the linker and improve its ligation to the target fragment after the "A" addition reaction.
  • the LNA modification site of the common PCR Primer PE 1.0 primer is close to the 5' end, which can improve the binding efficiency with the P5 sequence.
  • the PCR Primer PE 2.0 primers (A, B, C, D) have LNA modification sites close to the 3' end, which improves the binding efficiency with the P7 sequence and ultimately makes the binding between the sequence of interest and the fixed sequence more stable.
  • for the sequencing primer, double LNA modification near the 3' end can improve the stability, specificity and sensitivity of the sequencing primer, thereby improving the quality of the entire sequencing run (a single on-machine sequencing reaction).
  • the present invention analyzes the sequencing data of the LNA modified library and the non-LNA modified library in parallel, and evaluates the sequencing quality by the base quality value (Q30%), the sequencing error rate, the GC content, linker contamination, and the genome alignment ratio.
  • Library quality was evaluated by GC distribution, gene coverage, and the like. The comparison showed that the library modified with LNA was better than the library without LNA modification, both in library quality and in sequencing quality.
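As a rough illustration of the Q30% metric, the sketch below computes the percentage of bases whose Phred quality is at least 30 from FASTQ-style quality strings. A Phred score Q corresponds to an error probability of 10^(-Q/10), so Q30 means one expected error in 1000 bases. The ASCII offset of 33 is an assumption (some older Illumina pipelines used 64), and the function names are hypothetical.

```python
def phred_to_error_prob(q: float) -> float:
    """Error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def q30_percent(quality_strings, offset: int = 33) -> float:
    """Percentage of bases with Phred quality >= 30 across all quality strings."""
    total = 0
    passed = 0
    for qs in quality_strings:
        for ch in qs:
            total += 1
            if ord(ch) - offset >= 30:
                passed += 1
    return 100.0 * passed / total if total else 0.0
```

With offset 33, the character "?" encodes exactly Q30 and "I" encodes Q40.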
  • Figure 1 LNA modification scheme for SOLEXA sequencing
  • Figure 2 Schematic diagram of the preparation process of the DNA PE index library
  • Figure 4A Sequencing quality distribution map of the non-LNA modified library; the abscissa is the cycle number of PE90, the ordinate is the base quality value corresponding to each cycle, and the color represents the percentage, white: 0, green: 10%, yellow: 30%, red: 50%, dark red: 70%, black: 100%;
  • Figure 4B Sequencing quality distribution map of the LNA modified library; the abscissa is the cycle number of PE90, the ordinate is the base quality value corresponding to each cycle, and the color represents the percentage, white: 0, green: 10%, yellow: 30%, red: 50%, dark red: 70%, black: 100%;
  • Figure 5 Sequencing error rate; the abscissa is the number of cycles, and the ordinate is the error rate (number of erroneous bases per cycle / number of bases in all cycles);
  • GC content distribution map of the non-LNA modified library sequencing results; the abscissa is the GC content of each window on the Acinetobacter reference sequence, and the ordinate is the number of reads aligned to each window;
  • Figure 6 GC content distribution map of the LNA modified library sequencing results; the abscissa is the GC content of each window on the Acinetobacter reference sequence, and the ordinate is the number of reads aligned to each window;
  • Figure 7 Gene coverage map of the library sequencing results; the abscissa is the number of times each base is covered, and the ordinate is the number of bases; the light curve (lower peak) indicates the non-LNA modified library and the dark curve (higher peak) represents the LNA modified library;
  • Figure 8A Agilent 2100 results for annealing of the unmodified linker
  • Figure 8B Agilent 2100 results for annealing of the LNA-modified linker
  • the per-cycle box plot is drawn by sorting the Q30% values of all tiles in each cycle and plotting 5 points: highest, lowest, median, first quartile and third quartile;
  • the overall box plot is drawn by sorting the Q30% values of all tiles and plotting 5 points: highest, lowest, median, first quartile and third quartile.
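The five points used for these box plots can be computed as in the following sketch. It assumes a list of per-tile Q30% values is already available; the function name and the sample values in the usage note are hypothetical, and the "inclusive" quartile method is one of several common conventions, since the source does not say which was used.

```python
import statistics


def five_number_summary(values):
    """Return lowest, first quartile, median, third quartile and highest of the values."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "lowest": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "highest": data[-1],
    }
```

For example, `five_number_summary([90, 92, 94, 96, 98])` gives a median of 94 with quartiles 92 and 96.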
  • Acinetobacter genomic DNA (genome about 3.6 M, GC content 40.4%) was extracted as a template, about 30 μg in total; the starting amount for each library was 3 μg. Covaris S2 was used to fragment the DNA to a main band of 350 bp, and 8 DNA PE index libraries with 350 bp inserts were constructed in parallel, 4 of which used LNA-modified linkers and PCR primers, while the other 4 used linkers and PCR primers without LNA modification.
  • After library construction, Solexa high-throughput sequencing (SBS, sequencing by synthesis) was carried out, in which the LNA-modified libraries were sequenced with the LNA-modified Read2 sequencing primer, and the libraries without LNA modification used the sequencing primer provided by Illumina.
  • For the preparation of the 350 bp DNA PE index library, see Figure 1 and Figure 2. The specific steps are as follows:
  • the end-repair product was purified with the QIAquick PCR Purification Kit (QIAGEN) and dissolved in 34 μl of EB buffer.
  • the product was purified with the MinElute PCR Purification Kit (QIAGEN) and dissolved in 12 μl of EB buffer.
  • the underlined parts are the LNA modification sites of the linker, which represent the LNA modified G and T, respectively.
  • the LNA modification sites are close to the 5' end of the F chain and the 3' end of the R chain, which can improve the annealing efficiency of the linker, improve the ligation of the linker to the target fragment after the "A" addition reaction, and make the binding of the linker and the PCR primer more efficient, specific and sensitive.
  • the ligated product was purified with the QIAquick PCR Purification Kit (QIAGEN) and dissolved in 32 μl of EB buffer.
  • the ligated product prepared in step 4 was taken, a 2% agarose gel was prepared, a 50 bp DNA Ladder (NEB) was selected, and electrophoresis was run at 120 V for 60 minutes. The gel was then cut, with the range to excise determined from the size of the linker and the desired fragment size. The excised gel pieces were recovered with the QIAquick Gel Extraction Kit (QIAGEN) and finally dissolved in 23 μl of EB buffer.
  • the PCR Primer PE 1.0 primer (5, end primer, upstream primer) sequence is:
  • the LNA-modified PCR Primer PE 1.0 primer sequence is:
  • the underlined part is the LNA modification site of the primer, indicating the LNA modified T.
  • the four unmodified PCR Primer PE 2.0 primer (3'-end primer, downstream primer) sequences are:
  • ACGTGTGCTCTTCCGATCT 3' (SEQ ID NO: 8)
  • the four LNA-modified PCR Primer PE 2.0 primer sequences are:
  • the underlined LNA modification sites of the primers each represent an LNA modified T.
  • the partial sequence of the PCR Primer PE 2.0 primer that is identical to the fixed sequence P7 on the flowcell is underlined.
  • regarding the LNA modification sites of the PCR primers: the LNA modification site of the common PCR Primer PE 1.0 primer is close to the 5' end, which can improve the binding efficiency with the P5 complementary sequence (DNA library template strand).
  • the LNA modification sites of the PCR Primer PE 2.0 primers (A, B, C, D) are close to the 3' end, which can improve the binding efficiency with the P7 complementary sequence (DNA library coding strand), so that the target sequence finally binds more stably to the fixed sequence.
  • the binding sequence in the fixed sequence P5 is: AATGATACGGCGACCACCGA (SEQ ID NO: 19);
  • the binding sequence in the fixed sequence P7 is: CAAGCAGAAGACGGCATACGA (SEQ ID NO: 20).
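Since the text refers to the "P5 complementary sequence" and the "P7 complementary sequence", a short sketch of how the reverse complement of each binding sequence is obtained may help. The two sequences are the ones quoted above (SEQ ID NO: 19 and 20); the function and variable names are hypothetical.

```python
# Translation table for Watson-Crick complementation of DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


P5_BINDING = "AATGATACGGCGACCACCGA"   # SEQ ID NO: 19
P7_BINDING = "CAAGCAGAAGACGGCATACGA"  # SEQ ID NO: 20

P5_COMPLEMENT = reverse_complement(P5_BINDING)
P7_COMPLEMENT = reverse_complement(P7_BINDING)
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check on the table.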
  • Take the PCR amplification product obtained in step 6, prepare a 2% agarose gel, select a 50 bp DNA Ladder, run electrophoresis at 120 V for 60 min, and cut the gel. The range to excise is determined by the size of the linker and the desired target fragment. The excised gel pieces are recovered with the QIAquick Gel Extraction Kit and finally dissolved in 25 μl
  • The results of the sequencing information analysis are shown in Tables 1 and 2, which give the sequencing analysis results for one set of libraries constructed in parallel. Table 1 Sequencing information analysis results of the unmodified library
  • [Table fragment; original layout lost] ratio 99.85; paired reads 96.6; insert size 320; insert size error -12/+15; alignment ratio (%Align) 97.96
  • the sequencing result with LNA modification is better than the unmodified sequencing result.
  • the Q30 box plot is more concentrated and the median value is also significantly higher, indicating that the base quality of the library modified with LNA is better than that of the unmodified library.
  • Figure 4A shows the sequencing results without LNA modification.
  • Figure 4B shows the sequencing results with LNA modification.
  • the abscissa is the cycle number of PE90, and the ordinate is the base quality value corresponding to each cycle.
  • the color represents the percentage, white: 0, green: 10%, yellow: 30%, red: 50%, dark red: 70%, black: 100%. For example, a color corresponding to 10% at a given position means that
  • bases with that quality value (ordinate) account for 10% of all bases at that position (abscissa). The higher the quality values at which the dark colors appear, the better the quality. It can be seen from Figures 4A and 4B that the LNA-modified library has a significantly better quality value distribution than the library without LNA modification.
  • Figure 5A shows the sequencing result without LNA modification.
  • Figure 5B shows the sequencing result of LNA modification.
  • the off-machine data are aligned to the reference genome with the eland software, using the first 32 bp of each read and allowing 2 mismatches. For the reads that can be aligned, the error rate of each cycle is calculated as the number of erroneous bases in that cycle / the number of bases in all cycles. The abscissa indicates the cycle number of PE90, and the ordinate is the number of erroneous bases per cycle / the number of bases in all cycles. The results show that the LNA-modified library has a much lower error rate than the unmodified library.
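A per-cycle error rate of this kind can be sketched as follows. The sketch assumes each read has already been aligned and paired with the reference segment it maps to (the alignment itself, done with eland in the text, is not reproduced). The denominator used here is the number of aligned bases observed at that cycle, which is one reading of the text's "number of bases" and should be treated as an assumption; the function name is hypothetical.

```python
def per_cycle_error_rate(aligned_pairs):
    """Compute the mismatch rate at each cycle.

    aligned_pairs: iterable of (read, reference_segment) string pairs of
    equal length, where position i of the read is the base called in cycle i+1.
    """
    errors = []
    totals = []
    for read, ref in aligned_pairs:
        if len(read) != len(ref):
            raise ValueError("read and reference segment must have equal length")
        for cycle, (called, expected) in enumerate(zip(read, ref)):
            if cycle == len(errors):
                # First time this cycle is seen: extend the counters.
                errors.append(0)
                totals.append(0)
            totals[cycle] += 1
            if called != expected:
                errors[cycle] += 1
    return [e / t for e, t in zip(errors, totals)]
```

Plotting the returned list against the cycle number gives a curve of the same shape as the error rate figures described above.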
  • the gene coverage distribution is shown in Figure 7.
  • the abscissa is the number of times of coverage of each base, and the ordinate is the number of bases.
  • the curve conforms to the Poisson distribution. The more concentrated the graph is on the central axis, the more random the coverage is. The figure shows that the coverage randomness is improved after the LNA modification.
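To illustrate what a Poisson-shaped coverage curve looks like, the sketch below computes the expected number of bases covered exactly k times when reads land uniformly at random. The genome size of about 3.6 Mb comes from the text; the mean depth is a free parameter and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of observing exactly k events for a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def expected_depth_histogram(genome_size: int, mean_depth: float, max_depth: int):
    """Expected number of bases covered exactly k times, for k = 0..max_depth."""
    return [genome_size * poisson_pmf(k, mean_depth)
            for k in range(max_depth + 1)]
```

An observed depth histogram that is wider than this expectation at the same mean depth indicates less uniform coverage, for example because of GC bias or duplicated fragments.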
  • Table 3. Sequencing information for the two sets of parallel libraries
  • Groups a and b are the unmodified libraries, and groups A and B are the LNA-modified libraries.
  • Compared with the libraries without LNA modification, the LNA-modified libraries had a higher unique rate and map-to-genome rate, a lower duplication rate, and better coverage and depth.
  • Equal volumes of the F chain and the R chain (SEQ ID NO: 1 and SEQ ID NO: 2, respectively) of an unmodified hydrazine linker, each at a concentration of 100 μM, were subjected to gradient annealing. After annealing, the linker was diluted to 5 ⁇ and tested on an Agilent 2100; the results are shown in Figure 8A.
  • Equal volumes of the F chain and the R chain (SEQ ID NO: 3 and SEQ ID NO: 4, respectively) of an LNA-modified ⁇ linker, each at a concentration of 100 μM, were subjected to gradient annealing. After annealing, the linker was diluted to 1 ⁇ and tested on an Agilent 2100; the results are shown in Figure 8B.
  • After annealing, the synthesized double strands are 80 bp and 82 bp, respectively. With LNA modification, the double-stranded linker accounts for 60% and there is only one single-strand peak; without modification, the double-stranded linker accounts for 57% and there are two single-strand peaks.
  • A set of parallel libraries was prepared according to the method of Example 1 and subjected to SOLEXA high-throughput sequencing, except that the linker added was a linker without LNA modification (SEQ ID NO: 3 and SEQ ID NO: 4).
  • The PCR primers used were PCR primers without LNA modification (SEQ ID NOS: 7-10). Only the sequencing primers differed: sequencing was performed with either LNA-modified or unmodified sequencing primers (SEQ ID NO: 15-16). The sequencing results are shown in Tables 4 and 5.
  • RawClusters/Tile: the number of DNA clusters per tile; the median over all tiles is reported.
  • PFClusters/Tile: the number of DNA clusters per tile after PF filtering.
  • PF: Illumina's default filtering rule. At most one base among the first 25 bases may be of poor quality; reads that do not meet this condition are filtered out.
  • %PF: PFClusters/RawClusters.
  • FirstCycleInt: the light intensity of the first cycle.
  • %Phasing: the probability of reaction lag (the proportion of reads that, at the current cycle, are still at the previous cycle).
  • %Prephasing: the probability of reaction advancement (the proportion of reads that, at the current cycle, have already advanced to the next cycle).
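The PF rule and the %PF ratio defined above can be sketched in a few lines of Python. This is an illustrative reading of the definitions in the text, not Illumina's pipeline code: the per-base "poor quality" flags are taken as given inputs (how a base is judged poor is not specified here), and the function names are hypothetical.

```python
from statistics import median


def passes_pf(poor_quality_flags, window=25, max_poor=1):
    """Apply the rule from the text: among the first `window` bases,
    at most `max_poor` may be flagged as poor quality."""
    return sum(1 for flag in poor_quality_flags[:window] if flag) <= max_poor


def percent_pf(raw_clusters, pf_clusters):
    """%PF = PFClusters / RawClusters, expressed as a percentage."""
    if raw_clusters <= 0:
        raise ValueError("raw_clusters must be positive")
    return 100.0 * pf_clusters / raw_clusters


def median_clusters_per_tile(cluster_counts):
    """Median cluster count over all tiles, as reported for RawClusters/Tile."""
    return median(cluster_counts)


if __name__ == "__main__":
    # Three made-up reads: flags mark poor-quality bases by position.
    reads = [
        [False] * 90,
        [True] + [False] * 89,
        [True, True] + [False] * 88,
    ]
    kept = [flags for flags in reads if passes_pf(flags)]
    print(len(kept), percent_pf(len(reads), len(kept)))
```

In the usage example, the third read has two poor-quality bases in its first 25 positions and is therefore filtered, so two of three reads pass.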
  • Figures 9A and 9B are box plots of the fq quality value Q30 for sequencing with the LNA-modified or unmodified read2 sequencing primer. The more concentrated the box plot, the closer the Q30% values are across cycles; the higher the median in the box plot, the better the base quality.
  • The Q30 box plot obtained with the LNA-modified read2 sequencing primer is more concentrated than that obtained with the unmodified read2 sequencing primer, and its median is also higher, indicating that the LNA-modified read2 sequencing primer gives better base quality than the unmodified read2 sequencing primer.
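The two properties read off the Q30 box plots, concentration and median, can be computed directly. The Python sketch below assumes Phred-scaled quality scores (Q30 corresponds to an error probability of 1 in 1000) and uses the interquartile range as the measure of how concentrated the box is. The function names and input layout are hypothetical and are not taken from the source.

```python
from statistics import median, quantiles


def q30_percent_per_cycle(quality_rows, threshold=30):
    """quality_rows[i][c] is the Phred quality of read i at cycle c.
    Return, for each cycle, the percentage of reads with quality >= threshold."""
    if not quality_rows:
        return []
    n_reads = len(quality_rows)
    n_cycles = len(quality_rows[0])
    result = []
    for cycle in range(n_cycles):
        passing = sum(1 for row in quality_rows if row[cycle] >= threshold)
        result.append(100.0 * passing / n_reads)
    return result


def box_summary(values):
    """Return (median, interquartile range) of a list of Q30% values.
    A smaller interquartile range means a more concentrated box."""
    q1, _, q3 = quantiles(values, n=4)
    return median(values), q3 - q1


if __name__ == "__main__":
    # Made-up qualities for two reads over two cycles.
    rows = [[30, 20], [35, 40]]
    per_cycle = q30_percent_per_cycle(rows)
    print(per_cycle, box_summary(per_cycle))
```

Comparing the two summaries for the modified and unmodified primers would reproduce the kind of comparison described for Figures 9A and 9B.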

Abstract

The invention relates to a locked nucleic acid-modified DNA fragment for high-throughput sequencing. The DNA fragment is selected from one, two, or three of the following: a linker sequence, a PCR primer, and a sequencing primer, where those present in the DNA fragment contain locked nucleic acids. The invention also relates to a high-throughput sequencing method, a DNA library construction method, and a DNA library sequencing method, each comprising a step of locked nucleic acid modification of a linker sequence, a PCR primer, and/or a sequencing primer. The invention further relates to the use of the DNA fragment in high-throughput sequencing, DNA library construction, or DNA library sequencing.
PCT/CN2012/086521 2012-12-13 2012-12-13 Locked nucleic acid-modified DNA fragment for high-throughput sequencing WO2014089797A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/086521 WO2014089797A1 (fr) 2012-12-13 2012-12-13 Locked nucleic acid-modified DNA fragment for high-throughput sequencing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/086521 WO2014089797A1 (fr) 2012-12-13 2012-12-13 Locked nucleic acid-modified DNA fragment for high-throughput sequencing

Publications (1)

Publication Number Publication Date
WO2014089797A1 true WO2014089797A1 (fr) 2014-06-19

Family

ID=50933710

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/086521 WO2014089797A1 (fr) 2012-12-13 2012-12-13 Locked nucleic acid-modified DNA fragment for high-throughput sequencing

Country Status (1)

Country Link
WO (1) WO2014089797A1 (fr)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005073409A2 (fr) * 2004-01-26 2005-08-11 Applera Corporation Methods, compositions, and kits for amplification and sequencing of polynucleotides
CN101413034A (zh) * 2008-11-21 2009-04-22 东南大学 Method for preparing a molecular cloning chip by high-throughput nucleic acid molecular cloning
CN102301011A (zh) * 2009-02-02 2011-12-28 埃克西库恩公司 Method for quantifying small RNA
CN102712955A (zh) * 2009-11-03 2012-10-03 Htg分子诊断有限公司 Quantitative nuclease protection sequencing (qNPS)
CN101831500A (zh) * 2010-05-19 2010-09-15 广州市锐博生物科技有限公司 Method and kit for quantitative detection of small RNA
EP2405000A1 (fr) * 2010-07-06 2012-01-11 Alacris Theranostics GmbH Synthesis of chemical libraries
EP2405017A1 (fr) * 2010-07-06 2012-01-11 Alacris Theranostics GmbH Nucleic acid sequencing method
WO2012004203A1 (fr) * 2010-07-06 2012-01-12 Alacris Theranostics Gmbh Method for sequencing nucleic acids
WO2012118802A1 (fr) * 2011-02-28 2012-09-07 Transgenomic, Inc. Kit and method for sequencing a target DNA in a mixed population

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HUMMELSHOJ, L. ET AL.: "Locked nucleic acid inhibits amplification of contaminating DNA in real-time PCR", RESEARCH REPORT, vol. 38, no. 4, 31 December 2005 (2005-12-31), pages 605 - 610, XP001247310, DOI: doi:10.2144/05384RR01 *
RAYMOND, C.K. ET AL.: "Simple, quantitative primer-extension PCR assay for direct monitoring of microRNAs and short-interfering RNAs", RNA, vol. 11, no. 11, 31 December 2002 (2002-12-31), pages 1737 - 1744 *
SHAO, NINGSHENG ET AL.: "Advances in The SELEX Technique and Aptamers", PROGRESS IN BIOCHEMISTRY AND BIOPHYSICS, vol. 33, no. 4, 31 December 2006 (2006-12-31), pages 329 - 335 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12890130

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12890130

Country of ref document: EP

Kind code of ref document: A1