WO2019085546A1

WO2019085546A1 - Method for constructing microbial 16S rDNA single-molecule level sequencing library

Info

Publication number: WO2019085546A1
Application number: PCT/CN2018/095582
Authority: WO
Inventors: 林东旭; 刘小龙; 杨明燕; 魏国鹏
Original assignee: 南京格致基因生物科技有限公司
Priority date: 2017-10-31
Filing date: 2018-07-13
Publication date: 2019-05-09
Also published as: CN108070643A

Abstract

Provided is a method for constructing a microbial 16S rDNA single-molecule level sequencing library, relating to the technical field of biochemistry. The library construction method comprises steps of: sample collection and DNA extraction; amplification; purification; quantification; and sequencing and bioinformatics analysis. The method for constructing a microbial 16S rDNA single-molecule level sequencing library is low in cost and simple in operation, and can simplify experimental processes and reduce sequencing cost.

Description

Method for constructing a microbial 16S rDNA single-molecule-level sequencing library

Technical field

The invention relates to the field of microbial gene sequencing, and in particular to amplification primers and a method for constructing a sequencing library for single-molecule-level sequencing of the 16S rDNA of bacteria and archaea.

Background

16S rDNA is the DNA sequence that encodes 16S rRNA and is present in all bacterial genomes. The 16S rDNA sequence contains nine hypervariable regions (V1-V9) interspersed with conserved regions. The conserved regions show no significant differences between bacteria and can be used to construct a unified evolutionary tree of all life. The hypervariable regions differ between bacteria, so sequencing them allows the microbiota to be identified down to the genus, or even species, level. Detecting the sequence variation and abundance of 16S rDNA therefore reveals the classification and abundance of bacteria in environmental samples, which is important for studying the microbial composition of environments such as seawater, soil and intestinal feces.

At present there are three main approaches to 16S rDNA sequencing: sequencing a single V region (V3, V4 or V6), two V regions (V3-V4 or V4-V5) or three V regions (V1-V3, V5-V7 or V7-V9); alternatively, the full-length 16S rDNA sequence can be sequenced. The V4 region currently gives the highest accuracy of species identification at every taxonomic level, but a two-region amplicon gives a longer read than a single region. The V3-V4 region is the longest of the variable-region targets, with a combined length of 459 bp, and carries the most information. Primers for the V3-V4 region bind specifically and have high coverage in both bacteria and archaea, so a single sequencing run can detect the diversity distribution of bacteria and archaea at the same time. Because two V regions already classify bacteria to the genus or even species level, there is no need to sequence the entire 16S rDNA gene. Overall, sequencing the 16S rDNA V3-V4 region is therefore the best choice for microbial species identification.

16S rDNA sequencing on the Illumina MiSeq, or metagenomic sequencing on the Illumina HiSeq platform, can sequence many samples in parallel in a single run and provides information on the species classification, species abundance, population structure, phylogeny and community comparison of environmental samples. Library construction currently relies mainly on two-step and one-step methods. The one-step method is increasingly used: it is simpler to perform, reduces operator-introduced error and costs less. However, existing one-step amplification does not account for the effect of the amplification itself on the results: amplification bias leads to poor uniformity, so the measured abundances do not accurately reflect the true microbiota. The advantage of single-molecule 16S rDNA sequencing is that it resolves individual metagenomic templates, allowing absolute quantification of microbial abundance.

Existing single-molecule 16S rDNA sequencing mainly uses the Ion Torrent sequencing platform, which amplifies single templates encapsulated in micro oil droplets and can therefore reach the single-molecule level. The more widely used Illumina sequencing platform, however, cannot yet perform single-molecule-level 16S rDNA sequencing analysis.

There is therefore a need for an Illumina-based library construction and sequencing method that retains the advantages of the one-step method while resolving 16S rDNA at the single-molecule level.

Summary of the invention

The technical problem to be solved by the present invention is to provide a low-cost, easy-to-operate method for constructing a microbial 16S rDNA single-molecule-level sequencing library.

In a first aspect, the invention provides two primer pairs for library construction. The first primer pair consists of long primers used in the first amplification step: an upstream primer and a downstream primer. From the 5' end to the 3' end, the upstream primer comprises an adapter sequence (complementary to the sequence on the sequencing chip), an index sequence (used to distinguish samples), a UID random tag (used to determine whether DNA molecules arose by amplification; 10 bases long, with a random base composition) and a V3-V4 region amplification primer sequence. From the 5' end to the 3' end, the downstream primer comprises an adapter sequence (complementary to the sequence on the sequencing chip), an optional index sequence, a UID random tag (likewise 10 random bases, used to determine whether DNA molecules arose by amplification) and a V3-V4 region amplification primer sequence. For single-index sequencing the downstream primer may omit the index sequence; if dual-index sequencing is required, the downstream primer may include one.
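The primer layout described above can be sketched in a few lines of Python. This is only an illustration of the 5'-to-3' segment order and of the tag diversity that 10 random bases provide; the adapter, index and target-specific sequences below are made-up placeholders, not the sequences of the patent's sequence listing, and the function names are hypothetical.

```python
import random

# Placeholder segments (assumptions for illustration only); the actual
# sequences belong to the patent's sequence listing and are not shown here.
ADAPTER = "ACGTACGTACGTACGTACGT"
INDEX = "TGCATGCA"
TARGET_PRIMER = "GATTACAGATTACAGAT"
UID_LENGTH = 10  # from the text: the UID tag is 10 random bases


def random_uid(length=UID_LENGTH, rng=random):
    """Return a random tag in which every position is A, C, G or T."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def build_long_primer(adapter, index, uid, target_primer):
    """Join the segments in 5'->3' order: adapter, index, UID, primer."""
    return adapter + index + uid + target_primer


def uid_diversity(length=UID_LENGTH):
    """Number of distinct tags of this length over a 4-letter alphabet."""
    return 4 ** length


example_uid = random_uid(rng=random.Random(0))
example_primer = build_long_primer(ADAPTER, INDEX, example_uid, TARGET_PRIMER)
print(example_primer)
print(uid_diversity())
```

With 10 random positions there are 4^10 = 1,048,576 possible tags per primer, so a pair of tags (one at each end) gives roughly 10^12 combinations.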

In a specific embodiment, the upstream primer of the first primer pair is any one of REP1/3-9 in the table below, and the downstream primer is any one of REP2/10-17 in the table below.

The second primer pair for library construction consists of the short primers P1 and P2. In a specific embodiment, the sequences of the upstream primer P1 and the downstream primer P2 are as shown in SEQ ID NO.3 and SEQ ID NO.4, respectively.

In a second aspect, the invention provides a method for constructing a sequencing library of the microbial 16S rDNA hypervariable region, the method comprising amplifying the 16S rDNA of the microbial (bacterial) genome with the primer pairs of the invention, thereby constructing the sequencing library.

Further, the method for constructing the microbial 16S rDNA sequencing library comprises the following steps:

a) collecting a biological sample and extracting the microbial metagenomic DNA from it;

b) amplifying the 16S rDNA V3-V4 hypervariable region by PCR using the primer pairs of the invention;

c) purifying the amplification product to obtain a purified sequencing library;

d) accurately quantifying the purified sequencing library, analyzing its fragment size, and sequencing it; and

e) performing bioinformatics analysis on the sequencing results, thereby completing the construction of the microbial 16S rDNA single-molecule-level sequencing library.

In a specific embodiment, the sample in step a) is a stool sample. Preferably, after the metagenomic DNA has been extracted, its concentration and purity are measured with a micro-volume UV spectrophotometer.

Further, the amplification in step b) is a two-step amplification. In the first step, the first primer pair of the invention is used to amplify the sample metagenomic DNA for two cycles, which ensures that single, complete library fragments carrying adapters at both ends are produced. In the second step, the product of the first step is used as the template and amplified for 30 cycles with the second primer pair of the invention to amplify the signal. Preferably, the product of the two-step amplification is checked by agarose gel electrophoresis.
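The point of the second step, that copy number grows while the set of tags stays fixed, can be illustrated with a toy model. The sketch below assumes idealized PCR in which every molecule is copied exactly once per cycle (real efficiency is lower) and represents each first-step product only by its pair of end tags; the function name and the example tags are hypothetical.

```python
from collections import Counter


def amplify_with_short_primers(tagged_molecules, cycles):
    """Idealized second step: each molecule doubles per cycle, tags unchanged.

    `tagged_molecules` is an iterable of (uid_forward, uid_reverse) pairs
    produced by the first step. The short primers carry no UID, so no new
    tag pairs can appear here.
    """
    pool = Counter(tagged_molecules)
    for _ in range(cycles):
        for tag_pair in pool:
            pool[tag_pair] *= 2
    return pool


first_step_products = [
    ("AAAAAAAAAA", "CCCCCCCCCC"),
    ("GGGGGGGGGG", "TTTTTTTTTT"),
]
pool = amplify_with_short_primers(first_step_products, cycles=30)
print(len(pool), sum(pool.values()))
```

In this model the number of distinct tag pairs after 30 cycles equals the number going in, while each pair is represented by 2^30 copies.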

In a specific embodiment, the purification in step c) uses AMPure XP magnetic beads from Beckman Coulter. In a specific embodiment, the purified sequencing library is accurately quantified and its fragment size analyzed with a Qubit 3.0 and an Agilent 2100; the library is diluted as required by the sequencing platform, samples are pooled according to the Qubit 3.0 quantification and the amount of data required, and sequencing is finally performed on an Illumina HiSeq X or MiSeq.
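Diluting and pooling by Qubit reading usually means converting a mass concentration into a molar one. A minimal sketch, assuming the common approximation of 660 g/mol per base pair of double-stranded DNA; the target concentration and volumes in the usage lines are hypothetical, and the 641 bp fragment length is the library size reported in Example 1.

```python
def library_molarity_nm(conc_ng_per_ul, fragment_bp, bp_mass=660.0):
    """Convert ng/uL to nM for double-stranded DNA of a given mean length."""
    return conc_ng_per_ul * 1e6 / (bp_mass * fragment_bp)


def dilution_volumes(stock_nm, target_nm, final_ul):
    """Return (library_ul, diluent_ul) needed to reach target_nm in final_ul."""
    if target_nm > stock_nm:
        raise ValueError("target concentration exceeds stock concentration")
    library_ul = final_ul * target_nm / stock_nm
    return library_ul, final_ul - library_ul


# Hypothetical Qubit reading of 5 ng/uL for a 641 bp library.
stock = library_molarity_nm(5.0, 641)
print(round(stock, 2), dilution_volumes(stock, 4.0, 20.0))
```

Pooling libraries at equal molarity, weighted by the amount of data wanted per sample, then follows from the same conversion.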

The method of the invention for constructing a microbial 16S rDNA single-molecule-level sequencing library differs from the prior art in the following respects:

1. The primers designed in the invention are long primers containing a UID random tag. Two-step amplification ensures that each tag corresponds to a unique amplified 16S rDNA template, so that analysis of the sequencing results is accurate to the level of single-molecule templates and the abundance and composition of the microbiota can be calculated more accurately from the 16S rDNA.

2. In the uniquely designed two-step amplification of the invention, the first step adds adapters to the 16S rDNA templates of the sample and labels both ends with unique tags, while the second step effectively amplifies the signal without increasing the number of tags, which facilitates subsequent sequencing and analysis.

3. The invention enables single-molecule 16S rDNA sequencing and analysis on the Illumina sequencing platform, ending the situation in which single-molecule 16S rDNA could only be analyzed on the Ion Torrent sequencing platform, while simplifying the experimental workflow and reducing sequencing cost.

4. The target fragment amplified by the long primers of the invention is the V3-V4 region of 16S rDNA. Compared with amplifying a single region of 16S rDNA, the amplified fragment is longer, carries more information and offers higher resolution, making it easier to distinguish between species.

Brief description of the drawings

Figure 1 is a schematic diagram of the primer sequences REP1 and REP2 used in the first amplification of the method of the invention for constructing a microbial 16S rDNA single-molecule-level sequencing library, in which N is an undetermined base selected from any one of A, C, G and T; index: sample-demultiplexing tag sequence; UID: random tag; 16S rDNA primer: 16S rDNA primer sequence;

Figure 2 is an agarose gel electrophoresis image after the two-step amplification in the method of the invention, in which the left lane shows the product of the two-step amplification and the right lane shows the DNA marker;

Figure 3 is the Agilent 2100 quality-control trace of the 16S rDNA single-molecule-level sequencing library after magnetic bead purification;

Figure 4 shows the genus-level composition and proportions of the gut microbial 16S rDNA of the verification test sample.

Figure 5 shows the genus-level composition and proportions of the microbial 16S rDNA in the female reproductive tract secretion sample and the male stool sample.

Detailed description of the invention

Embodiments of the invention, together with their various features and advantageous details, are explained more fully with reference to the non-limiting embodiments and examples described and/or illustrated in the accompanying drawings and detailed in the following description. It should be noted that the features shown in the drawings are not necessarily drawn to scale, and that, as those skilled in the art will recognize, features of one embodiment may be used with other embodiments even if this is not explicitly stated herein.

Definitions

Unless otherwise stated, the terms used in the claims and the specification are as defined below.

The term "nucleic acid" refers to a polymer of nucleotides and includes any form of DNA or RNA, including, for example, genomic DNA; complementary DNA (cDNA), which is a DNA representation of mRNA, usually obtained by reverse transcription of messenger RNA (mRNA) or by amplification; DNA molecules produced synthetically or by amplification; and mRNA.

The term "oligonucleotide" refers to a nucleic acid that is relatively short, generally shorter than 200 nucleotides, more particularly shorter than 100 nucleotides, and most particularly shorter than 50 nucleotides. Typically, an oligonucleotide is a single-stranded DNA molecule.

The term "primer" refers to an oligonucleotide that, under suitable conditions (i.e., in the presence of four different nucleoside triphosphates and a polymerization agent such as a DNA or RNA polymerase or reverse transcriptase), in a suitable buffer and at a suitable temperature, is capable of hybridizing (also called "annealing") to a nucleic acid and serving as the starting point for nucleotide (RNA or DNA) polymerization. The appropriate length of a primer depends on its intended use, but a primer is typically at least 7 nucleotides long, more typically 10 to 30 nucleotides, or even more typically 15 to 30 nucleotides. Other primers may be somewhat longer, for example 30 to 50 nucleotides. In this context, "primer length" refers to the portion of the oligonucleotide or nucleic acid that hybridizes to the complementary "target" sequence and primes nucleotide synthesis. Short primer molecules generally require lower temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template but must be sufficiently complementary to hybridize to it.

The term "primer pair" refers to a set of primers comprising a 5' "upstream primer" or "forward primer", which hybridizes to the complement of the 5' end of the DNA sequence to be amplified, and a 3' "downstream primer" or "reverse primer", which hybridizes to the 3' end of the sequence to be amplified. As those of ordinary skill in the art will appreciate, the terms "upstream" and "downstream" or "forward" and "reverse" are not intended to be limiting in particular embodiments, but rather provide an illustrative orientation.

"Reagent" refers broadly to any agent used in a reaction other than the analyte (for example, the nucleic acid being analyzed). Illustrative reagents for nucleic acid amplification reactions include, but are not limited to, buffers, metal ions, polymerases, reverse transcriptases, primers, template nucleic acids, nucleotides, labels, dyes and nucleases. Reagents for enzymatic reactions include, for example, substrates, cofactors, buffers, metal ions, inhibitors and activators.

The invention is described below on the basis of examples, but it is not limited to these examples.

Example 1

The method of this example for constructing a microbial 16S rDNA single-molecule-level sequencing library is carried out as follows:

(1) A stool sample from a healthy person is collected and mixed into 1 mL of stool preservation solution (it can be stored at room temperature for 15 days). The stool metagenomic DNA is extracted with a fecal microbial DNA extraction kit (ZYMO Quick-DNA™ Fecal Microbe Microprep Kit) according to the manufacturer's instructions;

(2) The concentration and purity of the extracted metagenomic DNA are measured with a micro-volume UV spectrophotometer (Nanodrop 2000);

(3) The first amplification step uses a multiplex amplification reagent (Invitrogen Multiplex PCR master mix). The 23 μL PCR reaction system for the first step is shown in Table 1 and the reaction program in Table 2; the primer sequences are as shown in SEQ ID NO.1 and SEQ ID NO.2 and are illustrated in Figure 1;

In the first step, the fecal microbial DNA is used as the template and amplified for two cycles, which ensures that single, complete library fragments carrying adapters at both ends are produced;

(4) In the second step, the product of the first amplification is used as the template, and the short primers P1 and P2, which bind the two ends, are added to amplify the signal for 30 cycles. The 50 μL PCR reaction system for the second step is shown in Table 3 and the reaction program in Table 4; the sequences of the short primers P1 and P2 are as shown in SEQ ID NO.3 and SEQ ID NO.4;

(5) 10 μL of the product of the two-step amplification is checked by agarose gel electrophoresis; the result is shown in Figure 2. The V3-V4 fragment is 455 bp; with the adapters and tag sequences at both ends, the total is 641 bp. The gel shows that the amplified band is of the expected size and is bright, with no non-specific amplification bands.

(6) The product of the two-step amplification is purified with AMPure XP magnetic beads from Beckman Coulter:

75 μL of magnetic bead suspension is added to the reaction, mixed thoroughly by pipetting and left at room temperature for 5 min; the supernatant is removed and discarded, the beads are washed twice with 80% ethanol and air-dried at room temperature for 5 min, and the purified sequencing library is eluted from the beads with 20 μL of low TE;

(7) The purified sequencing library is accurately quantified and its fragment size analyzed with a Qubit 3.0 and an Agilent 2100; the results are shown in Figure 3: the main peak is sharp and of the correct fragment size, with no contaminating fragments;

(8) The sequencing library is diluted as required by the sequencing platform, samples are pooled according to the Qubit 3.0 quantification and the amount of data required, and sequencing is finally performed on an Illumina HiSeq X;

(9) Bioinformatics analysis is performed on the sequencing results, which completes the construction of the microbial 16S rDNA single-molecule-level sequencing library: all reads whose tags at both ends and whose sequences are duplicated are merged, so that after merging no single-end tag is repeated.
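The merging in step (9) amounts to collapsing reads that share both end tags and the same sequence, then counting each collapsed group once. The following is a minimal sketch under stated assumptions: reads are already parsed into (forward UID, reverse UID, sequence) tuples, matching is exact, and the sequence labels and taxon lookup are hypothetical stand-ins for a real classification step. It is not the authors' pipeline.

```python
from collections import Counter


def collapse_duplicates(reads):
    """Group reads that share both end tags and the insert sequence.

    `reads` is an iterable of (uid_forward, uid_reverse, sequence) tuples.
    Returns a Counter mapping each distinct tuple to its read count.
    """
    return Counter(reads)


def molecule_fractions(families, taxon_of):
    """Relative abundance per taxon, counting each tagged molecule once."""
    per_taxon = Counter(taxon_of[seq] for (_uf, _ur, seq) in families)
    total = sum(per_taxon.values())
    return {taxon: n / total for taxon, n in per_taxon.items()}


# Toy input: three reads of one molecule, one read of a second molecule of
# the same taxon, and six reads of a single molecule of another taxon.
reads = (
    [("AAAAAAAAAA", "CCCCCCCCCC", "seq_x")] * 3
    + [("AAAAAAAAAG", "CCCCCCCCCT", "seq_x")]
    + [("TTTTTTTTTT", "GGGGGGGGGG", "seq_y")] * 6
)
taxon_of = {"seq_x": "genus_X", "seq_y": "genus_Y"}

families = collapse_duplicates(reads)
print(len(families), molecule_fractions(families, taxon_of))
```

In the toy data, raw read counts would put genus_Y at 6 of 10 reads, whereas counting tagged molecules gives it 1 of 3, which is the kind of correction the tags are meant to enable.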

Table 1 23 μL PCR reaction system for the first amplification step

Component	Volume / DNA amount
Multiplex PCR master mix	11.5 μL
Upstream primer REP1 (10 μM)	1 μL
Downstream primer REP2 (10 μM)	1 μL
Microbial genomic DNA	3 ng
Water	to 23 μL
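Because Table 1 specifies the template by mass and the water by difference, the pipetting volumes depend on the measured DNA concentration. A small sketch of that arithmetic, with the table values as defaults; the DNA concentration passed in the usage line is a hypothetical Nanodrop reading, and the function names are illustrative.

```python
def first_step_mix(dna_conc_ng_per_ul, total_ul=23.0, master_mix_ul=11.5,
                   primer_ul=1.0, dna_ng=3.0):
    """Return the template and water volumes (uL) for the Table 1 reaction."""
    dna_ul = dna_ng / dna_conc_ng_per_ul
    water_ul = total_ul - master_mix_ul - 2 * primer_ul - dna_ul
    if water_ul < 0:
        raise ValueError("DNA is too dilute to fit 3 ng into the reaction")
    return {"dna_ul": dna_ul, "water_ul": water_ul}


def final_primer_um(stock_um=10.0, primer_ul=1.0, total_ul=23.0):
    """Final concentration (uM) of each primer after dilution into the mix."""
    return stock_um * primer_ul / total_ul


print(first_step_mix(dna_conc_ng_per_ul=3.0))
print(round(final_primer_um(), 3))
```

Each 10 μM primer added at 1 μL ends up at roughly 0.43 μM in the 23 μL reaction.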

Table 2 Reaction program for the first amplification step

Table 3 50 μL PCR reaction system for the second amplification step

Component	Volume / DNA amount
First-step amplification reaction	23 μL
Multiplex PCR master mix	25 μL
Upstream primer P1 (10 μM)	1 μL
Downstream primer P2 (10 μM)	1 μL
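As a quick consistency check, the Table 3 components can be summed and related to the 75 μL of bead suspension used in the purification step of Example 1. The helper names below are hypothetical; the numbers come from the table and from step (6).

```python
SECOND_STEP_UL = {
    "first-step reaction": 23.0,
    "Multiplex PCR master mix": 25.0,
    "primer P1 (10 uM)": 1.0,
    "primer P2 (10 uM)": 1.0,
}


def total_volume_ul(components):
    """Sum of all component volumes in microlitres."""
    return sum(components.values())


def bead_ratio(bead_ul, reaction_ul):
    """Volume ratio of bead suspension to reaction."""
    return bead_ul / reaction_ul


def final_conc_um(stock_um, added_ul, total_ul):
    """Concentration after diluting added_ul of stock into total_ul."""
    return stock_um * added_ul / total_ul


total = total_volume_ul(SECOND_STEP_UL)
print(total, bead_ratio(75.0, total), final_conc_um(10.0, 1.0, total))
```

The components add up to the stated 50 μL, so 75 μL of beads corresponds to a 1.5:1 bead-to-sample volume ratio if beads are added to the whole reaction (note that step (5) draws 10 μL for the gel, which the text does not reconcile with this volume).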

Table 4 Reaction program for the second amplification step

Example 2

The method of this example for constructing a microbial 16S rDNA single-molecule-level sequencing library differs from Example 1 in that sequencing in step (8) is performed on a MiSeq; step (9) uses the same bioinformatics analysis as Example 1, which completes the construction of the microbial 16S rDNA single-molecule-level sequencing library.

Verification test

A sample from one healthy person was collected, processed according to the workflow described in Example 1 and sequenced on the HiSeq X Ten platform. In the resulting data the GC content was 51-53%, the paired-end reads R1 and R2 each numbered about 3 million, and after tag deduplication about 600,000 and 900,000 remained, respectively; the tag duplication rate of R1 was slightly higher than that of R2. The alignment rate against the 16S sequence database was above 90% in both directions. Because sequencing was done on the HiSeq platform, the clean reads were short, about 120 bp; on the MiSeq platform the amplicon could be read through with 2 × 300 bp paired-end reads, which are then merged. Clustering at the genus level yielded more than 600 clusters. Detailed data are shown in Table 5.
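The tag duplication rates implied by these figures follow from simple arithmetic. A minimal sketch, using the rounded numbers quoted above (about 3 million reads per direction, about 600,000 and 900,000 unique tags), so the results are approximate; the function names are illustrative.

```python
def duplication_rate(total_reads, unique_tags):
    """Fraction of reads that are copies of an already-seen tag."""
    return 1.0 - unique_tags / total_reads


def mean_family_size(total_reads, unique_tags):
    """Average number of reads per unique tag."""
    return total_reads / unique_tags


for name, unique in (("R1", 600_000), ("R2", 900_000)):
    print(name,
          round(duplication_rate(3_000_000, unique), 2),
          round(mean_family_size(3_000_000, unique), 2))
```

On these rounded inputs, R1 comes out at about 80% duplicates and R2 at about 70%, consistent with the statement that the R1 tag duplication rate is the higher of the two.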

Table 5 Analysis results of the verification test

As the data in Figure 4 show, the most abundant genus was Prevotella, at 37% of the total, followed by Blautia (6.7%), Eubacterium (6.5%), Faecalibacterium (4.0%), Clostridium (3.2%), Coprococcus (2.6%), Bacteroides (2.4%) and Ruminococcus (2.1%). The remaining genera each accounted for less than 2% and are not listed individually, but together they made up 34.6%.

Compared with the literature and examination reports, the proportion of Prevotella in the gut found in this study was lower, whereas the proportions of some low-abundance bacteria (e.g. Bifidobacterium) were higher. A likely explanation is that, in conventional amplification, the bias of many amplification cycles further inflates the proportion of the dominant bacteria, while rare bacteria, at a competitive disadvantage because of their low proportion, are amplified inefficiently and their proportion falls further. The single-molecule amplification method can therefore reflect the actual microbial composition of the gut more objectively.
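The proposed mechanism, a small per-cycle efficiency difference compounding over many cycles, can be made concrete with a toy exponential model. The starting fractions and efficiencies below are invented for illustration and are not measurements from this document; the model ignores plateau effects and primer competition.

```python
def amplified_fractions(initial, efficiency, cycles):
    """Composition after PCR if taxon i grows by (1 + efficiency_i) per cycle."""
    weights = {
        taxon: initial[taxon] * (1.0 + efficiency[taxon]) ** cycles
        for taxon in initial
    }
    total = sum(weights.values())
    return {taxon: w / total for taxon, w in weights.items()}


# Hypothetical community: 90% dominant taxon, 10% rare taxon, with the rare
# taxon amplifying slightly less efficiently.
initial = {"dominant": 0.9, "rare": 0.1}
efficiency = {"dominant": 0.95, "rare": 0.85}

after_2 = amplified_fractions(initial, efficiency, cycles=2)
after_30 = amplified_fractions(initial, efficiency, cycles=30)
print(round(after_2["rare"], 3), round(after_30["rare"], 3))
```

Under these assumed numbers the rare taxon stays close to its true share after 2 cycles but drops to a few percent after 30, which is why tagging molecules in the first two cycles and counting tags afterwards is expected to be less sensitive to bias.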

Example 3

In this example, one stool sample from a healthy male (collected at a different time from the stool sample in the verification test) and one reproductive tract secretion sample from a healthy female were collected. Libraries were built both by the conventional one-step 16S rDNA amplification method and by the single-molecule-level two-step amplification method described herein, and the two were compared as follows:

(1) The stool sample was mixed into 1 mL of stool preservation solution (it can be stored at room temperature for 15 days), and the stool metagenomic DNA was extracted with a fecal microbial DNA extraction kit (ZYMO Quick-DNA™ Fecal Microbe Microprep Kit) according to the manufacturer's instructions. The reproductive tract secretion sample was mixed into 1 mL of preservation solution (it can be stored at room temperature for 15 days), and total DNA was extracted with a DNA extraction kit (Qiagen QIAamp DNA Mini Kit) according to the manufacturer's instructions.

(2) The concentration and purity of the extracted DNA were measured with a micro-volume UV spectrophotometer (Nanodrop 2000);

(3) Both the one-step and the single-molecule two-step amplification used a multiplex amplification reagent (Invitrogen Multiplex PCR master mix); for the two-step amplification procedure, see Example 2.

(4) The one-step amplification used REP1 and REP2 directly for 30 cycles.

(5) The amplification products were purified with AMPure XP magnetic beads from Beckman Coulter.

(6) The purified sequencing libraries were accurately quantified and their fragment sizes analyzed with a Qubit 3.0 and an Agilent 2100;

(7) The sequencing libraries were diluted as required by the sequencing platform, samples were pooled according to the Qubit 3.0 quantification and the amount of data required, and sequencing was finally performed on an Illumina HiSeq X;

(8) Bioinformatics analysis was performed on the sequencing results. For the two-step libraries, all reads with duplicated tags at both ends and duplicated sequences were merged, so that no single-end tag was repeated after merging; the one-step sequencing results were not subjected to tag merging.

As Figure 5 shows, the microbial composition of the female reproductive tract is relatively simple, with lactic acid bacteria accounting for 99%. The gut microbiota, by contrast, is diverse in composition and proportions. The figure shows that the proportions of some genera, such as Faecalibacterium and Phascolarctobacterium, are exaggerated by the one-step method: the latter actually accounts for only 1% but is inflated to 17% by the bias of conventional one-step amplification. Conversely, the one-step method underestimates Blautia at 4%, when its actual proportion is 11%. It can therefore be concluded that the single-molecule two-step method estimates the proportions of microbial populations more accurately than the conventional one-step method.
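The size of the distortion quoted above can be expressed as a fold change between the one-step and two-step estimates. A minimal sketch using only the two pairs of percentages given in the text, and treating the two-step value as the reference, as the text does.

```python
import math


def fold_change(one_step_pct, two_step_pct):
    """Ratio of the one-step estimate to the two-step (tag-collapsed) one."""
    return one_step_pct / two_step_pct


# (one-step %, two-step %) as quoted in the text.
reported = {
    "Phascolarctobacterium": (17.0, 1.0),
    "Blautia": (4.0, 11.0),
}

for genus, (one_step, two_step) in reported.items():
    fc = fold_change(one_step, two_step)
    print(genus, round(fc, 2), round(math.log2(fc), 2))
```

A positive log2 value marks a genus that the one-step method over-represents and a negative value one that it under-represents.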

Although specific embodiments of the present invention have been described above, those skilled in the art should understand that these are merely illustrative, and that the scope of protection of the invention is defined by the appended claims. Those skilled in the art may make various changes or modifications to these embodiments without departing from the principle and essence of the invention, and all such changes and modifications fall within the scope of protection of the invention.

Claims

A primer pair for constructing a microbial 16S rDNA single-molecule level sequencing library, comprising a first primer pair consisting of an upstream primer and a downstream primer, wherein the upstream primer comprises, in order from the 5' end to the 3' end: an adapter sequence complementary to a sequence on the sequencing chip, an index sequence for distinguishing different samples, a UID random tag, and a V3-V4 region amplification primer sequence; and the downstream primer comprises, in order from the 5' end to the 3' end: an adapter sequence complementary to a sequence on the sequencing chip, an index sequence for distinguishing different samples, a UID random tag, and a V3-V4 region amplification primer sequence.
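
The 5'-to-3' layout recited above (adapter, index, UID random tag, V3-V4 primer) can be illustrated with a small data structure. This is a sketch only: the class `TaggedPrimer`, the UID length of 8, and every nucleotide string below are hypothetical placeholders, not the sequences of SEQ ID NO. 1 or SEQ ID NO. 2.

```python
from typing import Dict, NamedTuple


class TaggedPrimer(NamedTuple):
    """Segments of one tagged primer, listed 5' to 3'."""
    adapter: str     # complementary to a sequence on the sequencing chip
    index: str       # distinguishes different samples
    uid_length: int  # number of random bases in the UID tag (assumed value)
    target: str      # V3-V4 region amplification primer

    def template(self) -> str:
        """Primer as designed, with N at each random UID position."""
        return self.adapter + self.index + "N" * self.uid_length + self.target

    def split(self, molecule: str) -> Dict[str, str]:
        """Cut one concrete primer molecule back into its four segments."""
        if len(molecule) != len(self.template()):
            raise ValueError("molecule length does not match the primer layout")
        a = len(self.adapter)
        i = a + len(self.index)
        u = i + self.uid_length
        return {
            "adapter": molecule[:a],
            "index": molecule[a:i],
            "uid": molecule[i:u],
            "target": molecule[u:],
        }


# Placeholder sequences, invented for illustration only.
primer = TaggedPrimer(
    adapter="GATTACAGATTACA",
    index="ACGTAC",
    uid_length=8,
    target="TTTTGGGGCCCC",
)
molecule = primer.adapter + primer.index + "ACGTTGCA" + primer.target
parts = primer.split(molecule)
print(primer.template())
print(parts["index"], parts["uid"])
```

Because the segments have fixed lengths and a fixed order, the sample index and the UID tag of any molecule built from such a primer can be recovered by position alone.
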
The primer pair according to claim 1, wherein the upstream primer is any one of REP1/3-9, and the downstream primer is any one of REP2/10-17.
The primer pair according to claim 1 or 2, further comprising a second primer pair consisting of an upstream primer P1 and a downstream primer P2, whose sequences are shown in SEQ ID NO. 3 and SEQ ID NO. 4, respectively.
A method for constructing a microbial 16S rDNA single-molecule level sequencing library, characterized in that the primer pair according to claim 3 is used to amplify the 16S rDNA of the microbial bacterial genome, thereby constructing the sequencing library.
The method according to claim 4, comprising the following steps:

a) collecting a biological sample and extracting the microbial metagenomic DNA from the sample;

b) amplifying the 16S rDNA V3-V4 hypervariable region by PCR using the primer pair according to claim 3;

c) purifying the product obtained after amplification to obtain a purified sequencing library;

d) accurately quantifying the purified sequencing library, analyzing its fragment size, and sequencing it; and

e) performing bioinformatics analysis on the sequencing results, thereby completing the construction of the microbial 16S rDNA single-molecule level sequencing library.
The method according to claim 5, wherein the sample in step a) is a stool sample; preferably, after the metagenomic DNA of the sample is extracted, its concentration and purity are measured with a micro-volume UV spectrophotometer.
The method according to claim 5, wherein the amplification in step b) is a two-step amplification: in the first step, the first primer pair according to claim 1 or 2 is used to amplify the sample metagenomic DNA as template for two cycles, ensuring the generation of intact, unique library fragments carrying adapters at both ends; in the second step, the product of the first amplification is used as template and amplified with the second primer pair according to claim 3 for 30 cycles to amplify the signal; preferably, the product obtained after the two-step amplification is examined by agarose gel electrophoresis.
The method according to claim 5, wherein the purification in step c) is carried out with AMPure XP magnetic beads from Beckman Coulter; preferably, the purified sequencing library is accurately quantified and analyzed for fragment size using Qubit 3.0 and Agilent 2100, the sequencing library is diluted according to the requirements of the sequencing platform, the samples are pooled on the basis of the Qubit 3.0 quantification results and the required data volume, and sequencing is finally performed on Illumina HiSeq X or MiSeq.
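
Pooling by Qubit concentration and required data volume amounts to a simple proportional calculation, sketched below. The function `pooling_volumes`, the sample names, and all numbers are hypothetical; the sketch also assumes that all libraries have a similar fragment length, so that mass is proportional to the number of molecules.

```python
from typing import Dict


def pooling_volumes(conc_ng_per_ul: Dict[str, float],
                    data_share: Dict[str, float],
                    total_ng: float) -> Dict[str, float]:
    """Volume (in microliters) of each library to add to the pool.

    Each library contributes a mass proportional to its requested share of
    the sequencing data; volume = mass / measured concentration.
    """
    total_share = sum(data_share[name] for name in conc_ng_per_ul)
    volumes = {}
    for name, conc in conc_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"library {name!r} has no measurable DNA")
        mass_ng = total_ng * data_share[name] / total_share
        volumes[name] = mass_ng / conc
    return volumes


# Invented example: three libraries, the third needing twice the data.
volumes = pooling_volumes(
    conc_ng_per_ul={"sample_A": 20.0, "sample_B": 10.0, "sample_C": 5.0},
    data_share={"sample_A": 1.0, "sample_B": 1.0, "sample_C": 2.0},
    total_ng=100.0,
)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} uL")
```

A more dilute library, or one that needs more data, receives a proportionally larger volume in the pool.
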
The method according to claim 5, comprising the following steps:

(1) collecting a normal human stool sample, placing it in 1 mL of stool preservation solution and mixing well, and extracting the stool metagenomic DNA with the stool microbial DNA extraction kit from Tianmo (天漠) according to the steps in the kit instructions;

(2) measuring the concentration and purity of the extracted metagenomic DNA with a micro-volume UV spectrophotometer;

(3) performing the first amplification step with the multiplex amplification reagent from Invitrogen (英杰);

in the first amplification step, the stool microbial DNA is used as template and amplified for two cycles, ensuring the generation of intact, unique library fragments carrying adapters at both ends; the sequences of the upstream primer REP1 and the downstream primer REP2 are shown in SEQ ID NO. 1 and SEQ ID NO. 2, respectively;

(4) in the second amplification step, the product of the first amplification is used as template, and the short primers P1 and P2 for the two ends are added for 30 cycles of amplification to amplify the signal; the sequences of the upstream primer P1 and the downstream primer P2 are shown in SEQ ID NO. 3 and SEQ ID NO. 4, respectively;

(5) taking 10 µL of the product obtained after the two-step amplification for agarose gel electrophoresis;

(6) purifying the product obtained after the two-step amplification with AMPure XP magnetic beads from Beckman Coulter:

adding 75 µL of magnetic bead suspension to the reaction system, mixing thoroughly by pipetting, and leaving at room temperature for 5 min; aspirating and discarding the supernatant, washing twice with 80% ethanol, leaving at room temperature for 5 min to air-dry the beads, and adding 20 µL of low TE to elute the beads, giving the purified sequencing library;

(7) accurately quantifying the purified sequencing library and analyzing its fragment size using Qubit 3.0 and Agilent 2100;

(8) diluting the sequencing library according to the requirements of the sequencing platform, pooling the samples on the basis of the Qubit 3.0 quantification results and the required data volume, and finally sequencing on Illumina HiSeq X or MiSeq; and

(9) performing bioinformatics analysis on the sequencing results, thereby completing the construction of the microbial 16S rDNA single-molecule level sequencing library, wherein all reads with identical tags at both ends and identical sequences are merged, so that no single-end tag is duplicated after merging.