CN112695100A - STR and SNP genetic marker combined detection system and detection method based on NGS - Google Patents

Info

Publication number
CN112695100A
Authority
CN
China
Prior art keywords
dna
library
locus
pcr
sequencing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110039036.0A
Other languages
Chinese (zh)
Inventor
周亮
董晓静
夏昊强
张振兴
全旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou High Tech Biotechnology Co ltd
Original Assignee
Zhengzhou High Tech Biotechnology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou High Tech Biotechnology Co ltd
Priority to CN202110039036.0A
Publication of CN112695100A
Legal status: Withdrawn

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention belongs to the technical field of biology, and particularly relates to an NGS-based STR and SNP genetic marker combined detection system and method. The genetic markers selected by the invention comprise autosomal STRs, male-specific Y-chromosome STRs and Y-chromosome SNPs. The system detects more genetic information in a single assay, provides a reliable guarantee for the accuracy of sequencing data, and offers good repeatability and high detection performance.

Description

STR and SNP genetic marker combined detection system and detection method based on NGS
Technical Field
The invention belongs to the technical field of biology, and particularly relates to an STR and SNP genetic marker combined detection system and method based on NGS.
Background
The human Y-chromosome short tandem repeat (Y-STR) is paternally inherited, does not recombine, is simple to type, is highly informative and is highly polymorphic, making it a powerful tool for studying human origins and evolution, ethnic differences, and population and regional distribution. Family screening based on Y-STRs has become an important approach in criminal investigation and can effectively reduce the investigative workload in criminal cases. However, current commercial kits contain too few Y-STR loci, so too many individuals are returned as candidate matches and the investigative workload is not effectively reduced.
Currently, fluorescence-labeled multiplex amplification combined with capillary electrophoresis (PCR-CE) is the mainstream technology for STR typing. However, PCR-CE cannot distinguish alleles that have the same fragment length but different repeat-region sequence structures, and when typing multi-copy Y-STRs it can only detect length polymorphism. Next-generation sequencing (NGS) offers high sequencing throughput and high sequencing speed. NGS can not only analyze STR loci by length polymorphism but also detect sequence information at multi-copy Y-STR loci, giving higher gene diversity and individual identification capability. At present, commercial Y-STR typing kits for next-generation sequencing are at an early stage of development; the main kits are PowerPlex® Y23 System (Promega), ForenSeq™ DNA Signature Prep Kit (Illumina), Yfiler™ Plus PCR Amplification Kit (Thermo Fisher) and GlobalFiler PCR Amplification Kit (Thermo Fisher). However, these kits cover relatively few Y-STRs: the four kits comprise 23, 24, 27 and 2 Y-STR loci, respectively. The kits are also expensive, and the accompanying data analysis software is developed separately for each kit, has relatively fixed parameter settings, and can only analyze sequencing data for the kit's own loci, which limits practical forensic application.
Disclosure of Invention
Aiming at the defects generally existing in the prior art, the invention creatively provides an STR and SNP genetic marker combined detection system and a detection method based on NGS. The detection method provided by the invention can realize one-time detection of more genetic information, provides reliable guarantee for the accuracy of sequencing data, and has the advantages of good repeatability, high detection performance and the like.
In order to achieve the purpose, the invention adopts the technical scheme that:
an NGS-based STR and SNP genetic marker combined detection system comprising 123 loci: 49 Y-STR loci, 50 Y-SNP loci, 3 Y-indel loci with near-zero mutation rates, and 21 autosomal STR loci.
Preferably, the Y-STR loci are DYS19, DYS385a/b, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, GATA_H4, GATA-A10, DYS481, DYS533, DYS576, DYS643, DYS460, DYS549, DYF387S1a/b, DYS449, DYS627, DYS570, DYS527a/b, DYS447, DYS444, DYS557, DYS388, DYF404S1a/b, DYS593, DYS645, DYF399S1, DYS 36526, DYS526, DYS547 626, DYS 576; the Y-SNP loci are M7, M89, M95, M111, M117, M119, M122, M134, M175, M214, M216, P123, P128, P131, P132, P136, P145, P148, P149, P151, P157, P164, P186, P191, P196, P197, P198, P199, P200, P201, M173, M174, rs11096433, M145, rs9306845, rs9786479, rs17276358, rs2075640, F449, M88, M188, rs 169826, rs 17323323322, N7, rs13447354, F1478, rs9786707, M15, rs 16981 and M9; the Y-indel loci are rs771783753, rs199815934 and rs759551978; the autosomal STR loci are Amelogenin, D3S1358, D1S1656, D6S1043, D13S317, Penta E, D16S539, D18S51, D2S1338, CSF1PO, Penta D, TH01, vWA, D21S11, D7S820, D5S818, TPOX, D8S1179, D12S391, D19S433 and FGA.
The invention also provides a method for detecting population genetic polymorphism by using the detection system, which comprises the following steps:
S1. Screening the loci and designing primers according to the locus sequences;
S2. Constructing the gene library by multiplex PCR amplification with the primers designed in S1;
S3. Sequencing the gene library prepared in step S2 and analyzing the data.
Preferably, the sequence information of the primers in step S1 is given in the sequence listing, and each primer is diluted to 20 μM;
preferably, the construction process of the gene library in step S2 is:
(1) Prepare the multiplex PCR reaction mixture as follows: 10 μL of primer mix, 4 μL of multiplex PCR amplification buffer, 1-100 ng of template DNA, and sterile ultrapure water to a total reaction volume of 20 μL. The primer mix comprises oligonucleotides and 1× TE Buffer; the multiplex PCR amplification buffer comprises DNA polymerase with an enzyme activity of 5-20 U, dNTPs at a final concentration of 0.1-0.5 mmol/L and MgCl2 at a final concentration of 1.5-6 mmol/L.
Perform PCR amplification with the following program: 98 °C for 2 min; 28 cycles of (98 °C for 15 s, 60 °C for 4 min); 72 °C for 10 min; hold at 4 °C. This yields the PCR product;
(2) After amplification, add digestion buffer to each PCR product prepared in step (1): 20 μL of multiplex PCR product and 2 μL of digestion buffer. Digest with the following program: 50 °C for 10 min, 55 °C for 10 min, 60 °C for 20 min, hold at 10 °C, to obtain the digestion product. The digestion buffer comprises Tris-HCl at a final concentration of 50-100 mmol/L, MgCl2 at a final concentration of 1.5-6 mmol/L, ATP at a final concentration of 0.1-0.5 mmol/L and dNTPs at a final concentration of 0.1-0.5 mmol/L;
(3) Prepare the ligation reaction as follows: 22 μL of the digestion product prepared in step (2), 6 μL of ligation buffer, 1 μL of adapter and 1 μL of ligase, 30 μL in total. Ligate with the following program: 22 °C for 30 min, 72 °C for 10 min, hold at 10 °C, to obtain the PCR library. The ligation buffer comprises Tris-HCl at a final concentration of 50-100 mmol/L, MgCl2 at a final concentration of 1.5-6 mmol/L, ATP at a final concentration of 0.2-1 mmol/L and DTT at a final concentration of 0.1-0.5 mmol/L, and the ligase is T4 DNA ligase;
(4) Purify the PCR library prepared in step (3), then amplify again using the purified library as template. The amplification system comprises 20 μL of purified library, 25 μL of HiFi library amplification buffer and 5 μL of PCR primer mix. The amplification program is: 95 °C for 3 min; 5 cycles of (98 °C for 20 s, 60 °C for 15 s, 72 °C for 30 s); 72 °C for 10 min; hold at 4 °C. This yields the amplified library. The HiFi library amplification buffer comprises DNA polymerase with an enzyme activity of 3-6 U, dNTPs at a final concentration of 0.15-0.35 mmol/L and MgCl2 at a final concentration of 2-5 mmol/L; the PCR primer mix comprises oligonucleotides and 1× TE Buffer;
(5) Purify the amplified library prepared in step (4), perform library quality inspection and quality evaluation, and finally quantify each library separately with Qubit so that the final library volume is 20 μL and the final concentration is in the range of 30-60 pM, giving the target product.
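The volumes and thermal programs in steps (1) to (4) can be cross-checked with a little arithmetic. The sketch below is illustrative only: the names `Step`, `Program`, `REACTION_VOLUMES_UL` and `water_to_volume` are hypothetical, the computed time covers programmed holds only (ramping between temperatures is not modeled), and the 2 μL template volume in the usage example is an assumed value, since the source fixes only the 20 μL total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Program:
    initial: Step
    cycle: tuple      # steps repeated in every cycle
    cycles: int
    final: Step
    hold_c: float

    def programmed_minutes(self) -> float:
        """Programmed hold time only; ramp times are not modeled."""
        per_cycle = sum(step.seconds for step in self.cycle)
        total = self.initial.seconds + self.cycles * per_cycle + self.final.seconds
        return total / 60


# Step (1): multiplex PCR program.
MULTIPLEX_PCR = Program(
    Step(98, 120), (Step(98, 15), Step(60, 240)), 28, Step(72, 600), 4
)
# Step (4): library re-amplification program.
LIBRARY_PCR = Program(
    Step(95, 180), (Step(98, 20), Step(60, 15), Step(72, 30)), 5, Step(72, 600), 4
)

# Fixed component volumes in microliters, as listed in steps (2) to (4).
REACTION_VOLUMES_UL = {
    "digestion": {"multiplex PCR product": 20, "digestion buffer": 2},
    "ligation": {"digestion product": 22, "ligation buffer": 6,
                 "adapter": 1, "ligase": 1},
    "library PCR": {"purified library": 20, "HiFi amplification buffer": 25,
                    "PCR primer mix": 5},
}


def water_to_volume(total_ul: float, *component_ul: float) -> float:
    """Water needed to bring the listed components up to the total volume."""
    water = total_ul - sum(component_ul)
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return water


if __name__ == "__main__":
    print("multiplex PCR, programmed minutes:", MULTIPLEX_PCR.programmed_minutes())
    print("library PCR, programmed minutes:", round(LIBRARY_PCR.programmed_minutes(), 1))
    for name, parts in REACTION_VOLUMES_UL.items():
        print(name, "total uL:", sum(parts.values()))
    # Step (1): 10 uL primer mix + 4 uL buffer + an assumed 2 uL of template.
    print("water for step (1), uL:", water_to_volume(20, 10, 4, 2))
```

The totals line up across steps: the 22 μL leaving digestion is the 22 μL entering ligation, and the ligation components sum to the stated 30 μL.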
Preferably, the PCR library purification process of step (4) specifically includes the following steps:
a. Preparation before purification: mix the VAHTS™ DNA Clean Beads by shaking and equilibrate them to room temperature; prepare enough fresh 80% (volume fraction) ethanol, about 400 μL per sample; before starting the purification, bring each sample to 60 μL with sterilized water;
b. Vortex the magnetic beads to mix them thoroughly, add 60 μL of 1× magnetic beads to the PCR library, pipette gently 10 times to make the whole system homogeneous, and incubate at 25 °C for 8 min so that the library binds to the beads;
c. Centrifuge the reaction tube at 500 rpm for 1 min, then place it on a magnetic rack to separate the beads from the liquid;
d. Keep the PCR tube on the magnetic rack; after 5 min, once the solution has cleared, carefully discard the supernatant; add 200 μL of 80% ethanol without disturbing the beads, incubate for 30 s, and carefully remove the supernatant;
e. Repeat step d for a total of two rinses; centrifuge briefly at 500 rpm for 1 min to collect the sample at the bottom of the PCR tube; place the tube on the magnetic rack for 30 s, remove all residual ethanol with a pipette, and air-dry with the lid open for 3-5 min;
f. Once the beads are dry, remove the PCR tube from the magnetic rack, add 22 μL of nuclease-free water to cover the beads, and mix by pipetting; then incubate at 25 °C for 2 min;
g. Centrifuge the PCR tube at 500 rpm for 1 min, then place it on the magnetic rack and separate the beads from the liquid until the solution is clear;
h. Carefully aspirate 20 μL of the supernatant and transfer it to a new EP tube;
The purification in step (5) is similar to that in step (4), except that in step b, 120 μL of 1.2× magnetic beads are added.
Preferably, library quality inspection and quality evaluation are carried out as follows: fragment analysis is performed on a LabChip® GX Touch™ fragment analyzer, the constructed libraries are checked for fragment quality, and libraries free of small-fragment adapter peaks and large-fragment tailing peaks are selected.
Preferably, the data analysis in step S3 is as follows: after sequencing is complete, the instrument automatically generates FASTQ files; the FASTQ files are analyzed with STRait Razor 2.6; BAM files are inspected with IGV_2.3.72, the sequencing data at each locus being visualized in Integrative Genomics Viewer v2.3.72; data are processed with SAMtools and Picard, which are also used to create the BAM and BAI files; data extraction and variant calling are performed with GATK; and Microsoft Excel and RStudio v1.2.1335 are used for data processing and statistical analysis.
Preferably, said data processing and statistical analysis involve the following parameters:
(1) Analysis threshold: the limit, expressed as a percentage, set for filtering out Noise sequences;
(2) Locus sequence composition ratio: the share of the total coverage at a locus contributed by Allele, Stutter and Noise sequences, respectively;
(3) Locus coverage depth: the sequencing depth of a locus, i.e. the total number of reads at that locus;
(4) Sample coverage depth: the average sequencing depth of each sample in one sequencing run;
(5) Allele coverage ratio (ACR): the heterozygote balance, i.e. the ratio of the coverage of the two different alleles of a heterozygote, expressed as ACR = φlower/φhigher, where φlower is the coverage of the lower-coverage allele and φhigher is the coverage of the higher-coverage allele;
(6) Mixing ratio: the measured mixing proportion, calculated from the number of allele reads in the NGS mixture profile. For a heterozygote: M = (A1 reads + A2 reads)/Total reads; for a homozygote: M = A reads/Total reads.
Preferably, Stutter in parameter (2) is defined as a sequence 4 bp longer or shorter than the corresponding Allele; Noise is defined as a sequence that is neither Allele nor Stutter; and the proportion of each sequence type is calculated as: Allele% = Allele reads/Total reads × 100%; Stutter% = Stutter reads/Total reads × 100%; Noise% = 1 − Allele% − Stutter%.
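The formulas above translate directly into code. The following is a minimal sketch, not the patent's software: the function names are hypothetical, the read counts in the usage example are invented for illustration, and percentages are expressed on a 0 to 100 scale (so the "1" in the Noise% formula becomes 100).

```python
def composition_ratios(allele_reads: int, stutter_reads: int, total_reads: int) -> dict:
    """Allele%, Stutter% and Noise% of the total coverage at one locus."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    allele_pct = 100 * allele_reads / total_reads
    stutter_pct = 100 * stutter_reads / total_reads
    # Noise is whatever is neither Allele nor Stutter.
    noise_pct = 100 - allele_pct - stutter_pct
    return {"Allele%": allele_pct, "Stutter%": stutter_pct, "Noise%": noise_pct}


def allele_coverage_ratio(reads_1: int, reads_2: int) -> float:
    """ACR = lower-coverage allele / higher-coverage allele of a heterozygote."""
    lower, higher = sorted((reads_1, reads_2))
    return lower / higher


def mixing_ratio(contributor_allele_reads, total_reads: int) -> float:
    """M = contributor allele reads / total reads.

    Pass two counts for a heterozygote (A1, A2) or one for a homozygote (A).
    """
    return sum(contributor_allele_reads) / total_reads


if __name__ == "__main__":
    # Hypothetical locus: 1000 reads in total, 960 allele reads, 30 stutter reads.
    print(composition_ratios(960, 30, 1000))
    # Hypothetical heterozygote with 450 and 500 reads on its two alleles.
    print(allele_coverage_ratio(450, 500))
    # Hypothetical minor contributor, heterozygous, 60 + 40 reads out of 1000.
    print(mixing_ratio([60, 40], 1000))
```

An ACR of 1 means the two alleles of a heterozygote are perfectly balanced; values well below 1 indicate imbalance.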
The invention aims to establish an STR and SNP detection system suitable for population genetic polymorphism analysis in China: to construct, based on NGS technology, a multiplex amplification system that is compatible with the current NGS detection platform, the MiSeq FGx™ system, and can accommodate more STR and SNP genetic markers; to analyze mixed DNA samples, one of the most common problems in forensic DNA samples; and to provide more accurate data for database construction, which is of great significance for building STR databases nationwide. It also provides a reference scheme for the research and development of domestic multi-locus STR kits. The kit can be used for (h1), (h2), (h3) or (h4): (h1) STR typing; (h2) autosomal STR typing; (h3) Y-STR typing; (h4) Y-SNP typing.
To address the insufficient number of loci in the Y-STR kits on the market, a large amount of information was collected and collated: (1) The 20 core loci and 15 preferred loci commonly used by the Ministry of Public Security were combined with statistically derived mutation information. (2) The locus information of commercial Y-STR kits commonly used in China was consulted (YHRD Core Loci, Promega PowerPlex® Y, Applied Biosystems AmpFlSTR® Yfiler®, Applied Biosystems Yfiler® Plus). (3) Y-STR loci with population genetic polymorphism in the Chinese population, as studied in the existing literature, were consulted (Forensic validation and polymorphism investigation of a 24 Y-STR detection system [D]. 2017.). (4) Starting from a core of low-mutation, relatively conserved Y-STR loci, multi-copy Y-STRs were added as appropriate. Finally, 49 Y-chromosome STR genetic markers were selected.
To address the still-insufficient research on the genetic polymorphism of Y-SNPs, a large amount of information was likewise collected and collated: (1) 20 Y-chromosome SNP genetic markers screened in an earlier domestic patent (CN108823294A), which belong to sub-haplogroups of haplogroup D and are closely associated with the Chinese population. (2) Existing Y-chromosome SNP genetic marker kits (ForenSeq™ DNA Signature Prep Kit). (3) The very high-resolution phylogenetic tree constructed from 38,818 Y-chromosome SNP loci and provided by the International Society of Genetic Genealogy (ISOGG, www.isogg.org). (4) SNP loci reported in the literature to show good polymorphism in the Chinese population (Typing of Y-chromosome SNP markers and its forensic applications [D]. 2017; Construction of a Y-SNP multiplex detection system and its application in population region inference and family investigation [D]. 2019.). (5) The Chinese forensic SNP typing and application specification SF/ZJD 0105003-2015. Finally, 93 Y-chromosome SNP genetic markers were selected.
The invention further adds 3 Y-indels with near-zero mutation rates to enhance the performance of the kit. To strengthen individual identification, the kit relies primarily on Y-chromosome genetic markers to narrow the scope of investigation and uses autosomal genetic markers as an auxiliary to identify individuals more accurately; 21 autosomal STR genetic markers were selected for this purpose. Mixed DNA samples, one of the most common problems in forensic DNA samples, were also analyzed.
When the mixing ratio is as low as 1:19, 78.6% of the genotypes of the minor component can still be detected, which shows that the NGS-based human STR and SNP genetic marker combined detection system and method have a strong detection capability for mixed samples.
The invention combines a large number of genetic markers, and the primer design in particular followed these principles and steps:
(1) Primers must not bind to each other or to regions of the template DNA other than the target fragment.
(2) The primer combination must resolve competition between different amplicons, so that high-abundance templates do not dominate so completely that low-abundance templates fall into the background.
(3) Locus sequence information was collected from authoritative databases, an extensive literature review and mature commercial kits (databases: GenBank database of NCBI; YHRD, the Y chromosome haplotype reference database, https://yhrd.org/; GRCh38 human genome SNP database, https://www.snpedia.com/index.php/GRCh38; STRBase database, https://STRBase.nist.gov/; SNPedia database, https://www.snpedia.com). Primers were designed with Primer5 software; amplification specificity was checked with the BLAST function in GenBank, and the species specificity of the designed locus primers was checked against the genome data of other species in the database.
(4) Because a multiplex PCR amplification system is used, the primer sets in the mixed system inevitably interfere with one another. As required by the detection method, a large number of primers were designed for experimental screening, and professional primer design software was used to adjust them with regard to interference and balance among the multiple primer sets.
(5) PCR amplification was then carried out to obtain specific amplification products for each locus. Based on the amplification conditions of the individual loci, a suitable amplification program was selected for multiplex amplification. Because the multiplex contains many loci and the mutual inhibition among primers is complex, problem loci had to be identified one by one by elimination, then redesigned and resynthesized.
(6) In addition, primer dimers formed between primers of different loci reduce amplification efficiency, and the primers involved had to be redesigned and resynthesized.
(7) In the initial multiplex amplification, the concentration of each primer in the reaction system was 0.5 μM; the primer concentrations were then adjusted to obtain the optimal concentration of each primer for PCR amplification.
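Cross-primer interactions of the kind described in steps (1) and (6) are commonly pre-screened in silico before any bench work. The sketch below shows one simple heuristic, flagging primer pairs in which the 3' end of one primer is exactly complementary to a stretch of another. It is not the method used in the patent (which relied on Primer5, BLAST and experimental screening); the function names, the 5-base window and the three primer sequences are all hypothetical.

```python
from itertools import permutations

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tail_can_anneal(primer_a: str, primer_b: str, window: int = 5) -> bool:
    """True if the last `window` bases (3' end) of primer_a are exactly
    complementary to some stretch of primer_b."""
    tail = primer_a.upper()[-window:]
    return reverse_complement(tail) in primer_b.upper()


def flag_dimer_pairs(primers: dict, window: int = 5) -> list:
    """Ordered pairs (a, b) where the 3' end of primer a could anneal to primer b."""
    return [
        (a, b)
        for a, b in permutations(primers, 2)
        if tail_can_anneal(primers[a], primers[b], window)
    ]


if __name__ == "__main__":
    # Made-up sequences for illustration; not taken from the sequence listing.
    example = {
        "p1": "AGCTTGACCGTACGGA",
        "p2": "GGTCCGTAAGCTTCAC",
        "p3": "CATCATCATCATCATC",
    }
    print(flag_dimer_pairs(example))
```

A real screen would also score partial complementarity, self-dimers and hairpins, typically with thermodynamic models rather than exact string matches.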
Through these steps, the optimal primer combination for detecting the STR and SNP genetic markers was finally obtained. After the primer sequences were designed and screened, the primer mixes were synthesized by Thermo Fisher and supplied as dry powder. Primer mix 1 contains the 70 designed STR primers and primer mix 2 contains the 53 designed SNP primers. The amplicons of all 123 loci in the multiplex amplification system provided by the invention are shorter than 300 bp.
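The pool sizes follow from the panel composition. As a quick consistency check, the sketch below assumes that the 3 Y-indel loci are amplified in the SNP pool, which is an inference from the counts (50 + 3 = 53) and is not stated explicitly in the source; the names `PANEL`, `POOLS` and `pool_sizes` are hypothetical.

```python
# Marker counts as stated for the 123-locus panel.
PANEL = {"Y-STR": 49, "Y-SNP": 50, "Y-indel": 3, "autosomal STR": 21}

# Assumed assignment of marker types to the two primer mixes.
POOLS = {
    "primer mix 1 (STR)": ("Y-STR", "autosomal STR"),
    "primer mix 2 (SNP)": ("Y-SNP", "Y-indel"),
}


def pool_sizes(panel: dict, pools: dict) -> dict:
    """Number of loci targeted by each primer mix."""
    return {name: sum(panel[m] for m in markers) for name, markers in pools.items()}


if __name__ == "__main__":
    sizes = pool_sizes(PANEL, POOLS)
    print(sizes)
    print("total loci:", sum(sizes.values()))
```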
The primer mixes used in the invention are of two types, an STR primer mix and an SNP primer mix, which are amplified separately. The mixes are obtained by combining the 53 SNP primer pairs and the 70 STR primer pairs, respectively, in equal proportions; these are the oligonucleotides in each mix used during PCR library construction. The adapters are VAHTS™ AmpSeq Adapters for Illumina, cat. no. NA121; the magnetic beads are VAHTS™ DNA Clean Beads, cat. no. N411. Sequencing was carried out according to the "MiSeq FGx Sequencing System Reference Guide # VD 2018006". The same library was sequenced using PE300.
Compared with the prior art, the detection reagent and the detection method have the following advantages:
(1) the detection system provided by the invention can detect more genetic information at one time.
(2) The detection system and method analyze mixed DNA samples, one of the most common problems in forensic DNA samples, and assess data quality through statistical analysis of parameters such as the analysis threshold, sample coverage depth, locus average coverage depth, locus sequence composition ratio and allele coverage ratio. The average sequencing depth of all samples is 5675 ± 546 reads and the average locus sequencing depth is 5570 ± 945 reads. With the threshold set to 5%, allele, stutter and noise sequences account on average for 97.96%, 1.73% and 0.31% of locus coverage, respectively. Further analysis of locus balance shows that most loci (95.14%) are well balanced, with values close to 1. To a certain extent this provides a reliable guarantee for the accuracy of the sequencing data;
(3) Libraries were prepared and sequenced from gradient-dilution samples with starting DNA amounts from 1 ng down to 50 pg. With an analysis threshold of 5%, 96.68% of genotypes could still be detected when the starting DNA amount was reduced to 50 pg. With this library construction and sequencing method, the sample input can be reduced to 1-0.5 ng while still yielding accurate sequencing data. The NGS-based human STR and SNP genetic marker combined detection system and method achieve a sensitivity equivalent to that of current DNA typing technology;
(4) The coverage depth and the average normalized read depth do not differ among the gradient-dilution samples, and the DNA samples do not differ significantly from one another, which shows that the detection system and method have good repeatability;
(5) The average locus sequencing depths at the different mixing ratios were 1:1 (6174 ± 668 reads), 1:4 (5963 ± 984 reads), 1:9 (5359 ± 522 reads) and 1:19 (3963 ± 284 reads); the high locus coverage depth of the mixed samples supports sequencing accuracy to a certain extent. Library construction and sequencing were carried out on the different mixed samples with 1 ng of starting DNA. Complete genotypes of the major-component alleles were obtained at every mixing ratio, with good balance. As the template amount of the minor component decreased, 78.6% of the minor-component genotypes could still be detected at a mixing ratio as low as 1:19, which shows that the NGS-based human STR and SNP genetic marker combined detection system and method have a strong detection capability for mixed samples.
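To put the mixing ratios in perspective, the expected share of reads from the minor contributor can be worked out from the ratio alone. The sketch below assumes, as a simplification, that reads are proportional to template input; real mixtures deviate from this. The depths are the mean locus depths quoted above, and the function names are hypothetical.

```python
from fractions import Fraction

ANALYSIS_THRESHOLD = Fraction(5, 100)  # the 5% analysis threshold

# Mean locus sequencing depth (reads) quoted for each mixing ratio.
MEAN_LOCUS_DEPTH = {"1:1": 6174, "1:4": 5963, "1:9": 5359, "1:19": 3963}


def minor_fraction(ratio: str) -> Fraction:
    """Expected share of template from the minor contributor, e.g. '1:19' -> 1/20."""
    minor, major = (int(part) for part in ratio.split(":"))
    return Fraction(minor, minor + major)


def expected_minor_reads(ratio: str, depth: int) -> float:
    """Expected minor-contributor reads at a locus, assuming proportionality."""
    return float(minor_fraction(ratio) * depth)


if __name__ == "__main__":
    for ratio, depth in MEAN_LOCUS_DEPTH.items():
        share = minor_fraction(ratio)
        print(
            ratio,
            f"minor share = {float(share):.1%}",
            f"expected minor reads = {expected_minor_reads(ratio, depth):.1f}",
            f"above threshold: {share > ANALYSIS_THRESHOLD}",
        )
```

Under this assumption the minor contributor at 1:19 supplies 1/20 of the template, which is exactly the 5% analysis threshold, and a heterozygous minor allele would sit at roughly half that. This is consistent with only part of the minor-component genotypes being recovered at 1:19.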
Drawings
FIG. 1 is a design roadmap for the technical solution of the present invention;
FIG. 2 shows the results of the depth of sequencing coverage of the DNA in a dilution series;
FIG. 3 is a graph of genotyping data for different amounts of starting DNA;
FIG. 4 shows the average reading of each marker in three replicate samples at different inputs of template DNA;
FIG. 5 is a graph of readings between markers;
FIGS. 6 and 7 show the results of the equilibrium distribution of the loci;
FIG. 8 is a graph of sequencing depth for each locus at a mixing ratio of 1: 1;
FIG. 9 is a graph of sequencing depth for each locus at a mixing ratio of 1: 4;
FIG. 10 is a graph of sequencing depth for each locus at a mixing ratio of 1: 9;
FIG. 11 is a graph of sequencing depth for each locus at a mixing ratio of 1: 19;
FIG. 12 is a diagram showing the detection of a low target component locus in each mixed sample.
Detailed Description
The present invention is further explained with reference to the following specific examples, but it should be noted that the following examples are only illustrative of the present invention and should not be construed as limiting the present invention, and all technical solutions similar or equivalent to the present invention are within the scope of the present invention. Unless otherwise specified, the technical means used in the examples are conventional means well known to those skilled in the art, and the raw materials used are commercially available products.
The adapters are VAHTS™ AmpSeq Adapters for Illumina, cat. no. NA111; the magnetic beads are VAHTS™ DNA Clean Beads, cat. no. N411; the Qubit dsDNA HS Assay Kit is a Thermo Fisher product, cat. no. Q32854; the sequencing reagent is the MiSeq Reagent Kit v3 from Illumina, cat. no. MS-102-3003; the NGS sequencer is an Illumina product, model MiSeq FGx™.
Example 1. Performance evaluation of the NGS-based forensic STR and SNP genetic marker combined detection system and method on gradient-dilution samples
1. Sample preparation:
(1) FTA blood card samples were taken from 8 unrelated healthy male individuals of the Han population, and DNA was extracted by the Chelex method according to the standard GA/T383-2014.
(2) Repeatability and sensitivity
For 1 sample, a gradient dilution series of DNA templates (1 ng, 500 pg, 200 pg, 100 pg, 50 pg) was prepared for sequencing on the basis of Qubit absolute quantitation, and each dilution was run in 3 parallel replicates for the repeatability and sensitivity studies. In addition, sequencing data from one 2800M Control DNA standard were analyzed to study the threshold setting.
2. Library construction:
The gene library was constructed using the library construction method of step S2 of the invention.
3. Sequencing and data analysis:
Using the sequencing method of step S3 of the invention, the sequencing performance of the 8 constructed DNA samples, the 1 gradient-diluted DNA sample and the positive standard DNA sample was analyzed on the NGS platform; for the specific steps, see the sequencing procedure described in the technical scheme above.
4. Analysis of results:
(1) Sequencing data from one 2800M Control DNA sample were analyzed to study the threshold setting. 83 alleles were obtained from the 2800M Control DNA, with no allele loss at any of the thresholds tested. The threshold was set to 2.5%, 5% and 10%; the number of sequences of each type detected at each analysis threshold is shown in Table 1 below, with 48, 12 and 0 Stutter sequences and 2, 0 and 0 Noise sequences, respectively. A 5% analysis threshold was finally selected, to minimize non-allelic sequences while maximizing the detection rate of true alleles.
TABLE 1 number of samples 1 detected for each sequence at different thresholds
Figure BDA0002894937960000101
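The thresholding step described above can be sketched as follows. This is a minimal illustration, assuming the analysis threshold is applied as a fraction of the total reads at a locus; the text does not spell out the exact rule, and the function name `apply_threshold`, the sequence labels and the read counts are hypothetical.

```python
from typing import Dict


def apply_threshold(seq_reads: Dict[str, int], threshold: float) -> Dict[str, int]:
    """Keep sequences whose share of the locus reads meets the threshold.

    Assumption: the threshold is a fraction of the total reads at the locus.
    """
    total = sum(seq_reads.values())
    if total == 0:
        return {}
    return {seq: n for seq, n in seq_reads.items() if n / total >= threshold}


# Hypothetical read counts at one heterozygous locus (not data from the patent).
locus = {"allele_14": 2900, "allele_16": 2750, "stutter_13": 210, "noise": 40}

for t in (0.025, 0.05, 0.10):
    kept = apply_threshold(locus, t)
    print(f"threshold {t:.1%}: {sorted(kept)}")
```

In this made-up example the stutter sequence passes the 2.5% threshold but is removed at 5%, while both alleles are retained at every threshold, which mirrors the trade-off that motivates the 5% choice.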
(2) The sequencing results of the 8 samples were analyzed statistically, covering parameters such as sample sequencing coverage depth, locus average coverage depth, locus sequence composition ratio and allele coverage ratio. All sequencing depth parameters are presented as mean sequencing depth ± standard deviation (mean ± SD).
a. First, the sequencing depths (reads) of the 8 samples, the locus average sequencing depths and the locus composition ratios were calculated and analyzed statistically to evaluate data quality and to comprehensively assess the sequencing efficiency of the detection system and detection method. The average sequencing coverage depth of each of the 8 samples over 3 replicate assays is shown in Table 2.
TABLE 2 Average sequencing coverage depth of the 8 samples over 3 replicate assays
Sample name Sequencing coverage depth (number of reads)
1 5979
2 5169
3 5221
4 5340
5 5782
6 5227
7 6897
8 5782
Mean±SD (5676±1538)reads
Statistical analysis of the data quality parameters shows that the average sequencing depth of the 8 samples is 5676 ± 1538 reads (mean ± SD) and the average sequencing depth of the loci is 5570 ± 945 reads. With the threshold set to 5%, allele sequences, stutter sequences and noise sequences account on average for 97.96%, 1.73% and 0.31% of the coverage depth in the locus composition ratio, respectively.
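As a cross-check on the summary statistics, the mean and standard deviation can be recomputed from the eight per-sample values listed in Table 2. The sketch below uses Python's standard `statistics` module. It assumes the tabulated values are per-sample averages over the 3 replicates, and it reports both the population and the sample standard deviation because the text does not say which was used. Statistics computed over the individual replicate runs, which are not tabulated, could differ.

```python
from statistics import mean, pstdev, stdev

# Per-sample average sequencing depths from Table 2 (reads).
depths = [5979, 5169, 5221, 5340, 5782, 5227, 6897, 5782]

avg = mean(depths)
sd_population = pstdev(depths)  # divides by n
sd_sample = stdev(depths)       # divides by n - 1

print(f"mean = {avg:.0f} reads")
print(f"population SD = {sd_population:.0f}, sample SD = {sd_sample:.0f}")
```

From the tabulated values alone, the mean rounds to 5675 reads and the population standard deviation to 546 reads.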
b. The sensitivity and reproducibility of sequencing were analyzed statistically for the serially diluted DNA templates (1 ng, 500 pg, 200 pg, 100 pg, 50 pg). The locus sequencing coverage depth was 5260 ± 2678 reads at 1 ng input, 5037 ± 2326 reads at 0.5 ng, 4873 ± 2074 reads at 0.2 ng, 4275 ± 1968 reads at 0.1 ng and 3817 ± 913 reads at 50 pg. The sequencing coverage depth of the serially diluted DNA is shown in FIG. 2.
c. Libraries were prepared and sequenced from samples with starting DNA amounts from 1 ng down to 50 pg. At an analysis threshold of 5%, 96.68% of genotypes could still be detected when the starting DNA amount was reduced to 50 pg, as shown in FIG. 3.
d. Three replicates each of template DNA at 1 ng and 0.5 ng input were analyzed; there was no significant difference in the mean read count of each marker between 1 ng and 0.5 ng. See FIG. 4.
e. The mean normalized read depth showed no significant difference between the 1 ng and 0.5 ng DNA inputs. The normalized read depth was 0.00864, with only 2 STRs (1.6%) below 0.004 at both DNA inputs (FIG. 5). This indicates that the read distribution among the markers remains stable even though the DNA input affects the total read count. With the library construction and sequencing method of the invention, accurate sequencing data can still be obtained at sample inputs as low as 1 ng to 0.5 ng.
f. The locus balance was analyzed further: the performance of locus amplification and sequencing was assessed by comparing the sequencing depth obtained for each locus. The locus balance, calculated as the ratio of the reads obtained for each locus to the average reads per locus, is shown in FIGS. 6 and 7. Most loci (95.14%) are well balanced, with values close to 1. Only 5 STR loci (4.06%) had read counts more than twice the average reads per locus, and 1 STR locus (0.8%) had a read count less than half the average reads per locus. All SNP sites showed good balance.
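The locus balance ratio described in item f can be sketched in a few lines. This is an illustration only: the function names `locus_balance` and `classify` are hypothetical, the read counts are invented, and the cut-offs of 2 and 0.5 are taken from the "more than twice" and "less than half" wording of the text.

```python
from typing import Dict


def locus_balance(reads: Dict[str, int]) -> Dict[str, float]:
    """Ratio of each locus's reads to the mean reads per locus."""
    if not reads:
        return {}
    mean_reads = sum(reads.values()) / len(reads)
    return {locus: n / mean_reads for locus, n in reads.items()}


def classify(ratio: float) -> str:
    """Flag loci above twice or below half the mean reads per locus."""
    if ratio > 2.0:
        return "high"
    if ratio < 0.5:
        return "low"
    return "balanced"


# Hypothetical read counts for four loci (not data from the patent).
reads = {"DYS19": 5200, "DYS390": 6100, "D3S1358": 14500, "FGA": 1400}

for locus, ratio in locus_balance(reads).items():
    print(f"{locus}: {ratio:.2f} ({classify(ratio)})")
```

Because every ratio is taken against the same mean, the ratios average to 1 by construction, so values close to 1 indicate a locus that receives its proportional share of reads.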
Example 2 Performance evaluation of the NGS-based forensic STR and SNP genetic marker combined detection system and detection method on mixed samples
1. Sample preparation (mixed DNA): Based on the DNA quantification results of the 8 male samples from Example 1, single-source DNA samples were selected in pairs to construct 4 sets of DNA samples, each mixed from two males; each set was prepared at 4 mixing ratios (1:1, 1:4, 1:9, 1:19), and each mixed sample was tested in 3 parallel replicates.
2. Library construction: Using the library construction method of step S2 of the invention, libraries were constructed and sequenced for the different mixed samples with a starting DNA amount of 1 ng; for the specific steps, see the implementation of the library construction step in the specific technical scheme.
3. Sequencing and data analysis: Using the sequencing method of step S3 of the invention, the constructed two-male mixed samples were sequenced and analyzed on the NGS platform; for the specific steps, see the implementation of the sequencing step in the specific technical scheme.
4. Results analysis: The performance of locus amplification and sequencing was assessed by comparing the sequencing depth obtained for each locus in the mixed samples.
(1) Average sequencing coverage depth of the 4 sets of mixed samples:
TABLE 3 Average coverage depth of the mixed samples over 3 replicates
Mixed sample Sequencing depth/reads
Mixed sample 1:1 6174±668
Mixed sample 1:4 5963±984
Mixed sample 1:9 5359±522
Mixed sample 1:19 3963±283
As can be seen from Table 3, the mean locus sequencing depth was 6174 ± 668 reads at a mixing ratio of 1:1 (the sequencing depth obtained for each locus is shown in FIG. 8); 5963 ± 984 reads at 1:4 (FIG. 9); 5359 ± 522 reads at 1:9 (FIG. 10); and 3963 ± 284 reads at 1:19 (FIG. 11).
(2) Detection rate of the low target component genotypes in the 4 sets of mixed samples:
The detection of low target component loci in the mixed samples was analyzed, with normalization over the 3 replicates. At a mixing ratio of 1:1 the detection rate was 100%; at 1:4, 96.9%; at 1:9, 88.9%, with 123 alleles detected in total; and at 1:19, 78.6%. The low target component loci detected in each mixed sample are shown in FIG. 12 and Table 4.
TABLE 4 Number of low target component genes detected in the different mixed samples
Figure BDA0002894937960000121
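To put the mixing ratios in terms of template amount, the arithmetic below converts each ratio into the share and approximate mass of the low target component at the stated 1 ng total input. It is a sketch under two assumptions that the text does not state explicitly: the ratios are by DNA mass, and the first number of each ratio is the low target component. The helper name `minor_input_pg` is hypothetical; the detection rates are the ones reported above.

```python
from fractions import Fraction

TOTAL_INPUT_PG = 1000  # 1 ng total starting DNA, as stated for the mixed samples


def minor_input_pg(minor: int, major: int, total_pg: int = TOTAL_INPUT_PG) -> Fraction:
    """Mass of the low target component, assuming a mass-based mixing ratio."""
    return Fraction(minor, minor + major) * total_pg


# Mixing ratios (low target : major) with the detection rates reported in the text.
reported = {(1, 1): 100.0, (1, 4): 96.9, (1, 9): 88.9, (1, 19): 78.6}

for (minor, major), rate in reported.items():
    share = Fraction(minor, minor + major)
    mass = minor_input_pg(minor, major)
    print(f"{minor}:{major}  share {float(share):.0%}  "
          f"~{float(mass):.0f} pg  detection {rate}%")
```

Under these assumptions the low target component at 1:19 contributes about 50 pg, the same amount as the lowest point of the dilution series in Example 1.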
In summary, data quality was assessed by statistical analysis of parameters such as the analysis threshold, sample coverage depth, locus average coverage depth, locus sequence composition ratio and allele coverage ratio. The average sequencing depth of all samples is 5675 ± 546 reads and the average sequencing depth of the loci is 5570 ± 945 reads; with the threshold set to 5%, allele sequences account on average for 97.96% of the coverage depth in the locus composition ratio, stutter sequences for 1.73% and noise sequences for 0.31%. Further analysis of the locus balance of the system shows that most loci (95.14%) are well balanced, with values close to 1; to some extent this provides a reliable guarantee of the accuracy of the sequencing data.
Libraries were prepared and sequenced from gradient-diluted samples with starting DNA amounts from 1 ng down to 50 pg. At an analysis threshold of 5%, 96.68% of genotypes could still be detected when the starting DNA amount was reduced to 50 pg, and with the library construction and sequencing method of the invention accurate sequencing data can still be obtained at sample inputs as low as 1 ng to 0.5 ng. This shows that the NGS-based human STR and SNP genetic marker combined detection system and detection method can achieve a sensitivity equivalent to that of current DNA typing technology.
No significant difference was found in the coverage depth of the gradient-diluted samples, in the average normalized read depth, or among the DNA samples, which shows that the detection system and method have good repeatability.
Analysis of the average locus sequencing depth at the different mixing ratios, namely 1:1 (6174 ± 668 reads), 1:4 (5963 ± 984 reads), 1:9 (5359 ± 522 reads) and 1:19 (3963 ± 284 reads), shows that the mixed samples retain a high locus coverage depth at every mixing ratio, which to some extent guarantees sequencing accuracy. Library construction and sequencing were performed on the different mixed samples with a starting DNA amount of 1 ng. The major-component alleles were completely typed at every mixing ratio and showed good balance. As the template amount of the low target component decreased, genotyping of the low target component still reached 78.6% at a mixing ratio as low as 1:19, which shows that the NGS-based human STR and SNP genetic marker combined detection system and detection method have a high detection capability for mixed samples.
Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present invention and not for limiting the protection scope of the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
Sequence listing
<110> Zhengzhou high and New Biotechnology Co., Ltd
<120> STR and SNP genetic marker combined detection system and detection method based on NGS
<130> 2020.12.29
<160> 234
<170> SIPOSequenceListing 1.0
<210> 1
<211> 22
<212> DNA
<213> DYS19-F
<400> 1
ggttaaggag agtgtcacta ta 22
<210> 2
<211> 28
<212> DNA
<213> DYS19-R
<400> 2
gactgtgatt attttttgat ttcactat 28
<210> 3
<211> 27
<212> DNA
<213> DYS385a/b-F
<400> 3
tattttaaaa aataatctat ctattcc 27
<210> 4
<211> 27
<212> DNA
<213> DYS385a/b-R
<400> 4
gggtgacaga gctagacacc atgccaa 27
<210> 5
<211> 25
<212> DNA
<213> DYS389I-F
<400> 5
gaagaatgtc atagatagat gatgg 25
<210> 6
<211> 23
<212> DNA
<213> DYS389I-R
<400> 6
ctctcatctg tattatctat gtg 23
<210> 7
<211> 25
<212> DNA
<213> DYS389II-F
<400> 7
gaagaatgtc atagatagat gatgg 25
<210> 8
<211> 23
<212> DNA
<213> DYS389II-R
<400> 8
ctctcatctg tattatctat gtg 23
<210> 9
<211> 25
<212> DNA
<213> DYS390-F
<400> 9
gtatactcag aaacaaggaa agata 25
<210> 10
<211> 23
<212> DNA
<213> DYS390-R
<400> 10
gccctgcatt ttggtacccc ata 23
<210> 11
<211> 24
<212> DNA
<213> DYS391-F
<400> 11
tcattcaatc atacacccat atct 24
<210> 12
<211> 21
<212> DNA
<213> DYS391-R
<400> 12
ctccctggtt gcaagcaatt g 21
<210> 13
<211> 23
<212> DNA
<213> DYS392-F
<400> 13
agaagtcaaa acagagggat cat 23
<210> 14
<211> 25
<212> DNA
<213> DYS392-R
<400> 14
caactaattt gatttcaagt gtttg 25
<210> 15
<211> 28
<212> DNA
<213> DYS393-F
<400> 15
tgtcattcct aatgtggtct tctacttg 28
<210> 16
<211> 28
<212> DNA
<213> DYS393-R
<400> 16
ctcaagtcca aaaaatgagg tatgtctc 28
<210> 17
<211> 26
<212> DNA
<213> DYS437-F
<400> 17
cagtctcctg agtagctggg actatg 26
<210> 18
<211> 28
<212> DNA
<213> DYS437-R
<400> 18
atagataagt agatagacat cattcaca 28
<210> 19
<211> 24
<212> DNA
<213> DYS438-F
<400> 19
attagtgggg aatagttgaa cggt 24
<210> 20
<211> 28
<212> DNA
<213> DYS438-R
<400> 20
gagatcacac cattgcattt cagcctgg 28
<210> 21
<211> 26
<212> DNA
<213> DYS439-F
<400> 21
tcaaggtgat agatatacag atagat 26
<210> 22
<211> 28
<212> DNA
<213> DYS439-R
<400> 22
acaggcataa tccaccatgc ctggcttg 28
<210> 23
<211> 22
<212> DNA
<213> DYS448-F
<400> 23
agattagaaa tagagatcgc ga 22
<210> 24
<211> 26
<212> DNA
<213> DYS448-R
<400> 24
cctcatattt ctggccggtc tggaaa 26
<210> 25
<211> 27
<212> DNA
<213> DYS456-F
<400> 25
ctcggactgg ctcatcttgc tcctcag 27
<210> 26
<211> 29
<212> DNA
<213> DYS456-R
<400> 26
ccaaaacttc ttaaactgat gtattaggg 29
<210> 27
<211> 21
<212> DNA
<213> DYS458-F
<400> 27
ctgagcaaca ggaatgaaac t 21
<210> 28
<211> 28
<212> DNA
<213> DYS458-R
<400> 28
cagccacctc ggcctcccaa agttctgg 28
<210> 29
<211> 27
<212> DNA
<213> DYS635-F
<400> 29
ggaaccagcc caaatatcca tcaatca 27
<210> 30
<211> 28
<212> DNA
<213> DYS635-R
<400> 30
ctgctgaatg ggagcagaaa tgcccaat 28
<210> 31
<211> 27
<212> DNA
<213> GATA_H4-F
<400> 31
caggataaat cacctatcta tgtatct 27
<210> 32
<211> 28
<212> DNA
<213> GATA_H4-R
<400> 32
tcctaggaat catcattaaa atgttatg 28
<210> 33
<211> 26
<212> DNA
<213> DYS481-F
<400> 33
gaatgtggct aacgctgttc agcatg 26
<210> 34
<211> 30
<212> DNA
<213> DYS481-R
<400> 34
cagcatgtct tggcatactt aacaattcat 30
<210> 35
<211> 28
<212> DNA
<213> DYS533-F
<400> 35
atgtattatc tatcaatctt ctacctat 28
<210> 36
<211> 28
<212> DNA
<213> DYS533-R
<400> 36
aaatgtattt attcatgatc agttctta 28
<210> 37
<211> 25
<212> DNA
<213> DYS576-F
<400> 37
aggagttcaa tctcagccaa gcaac 25
<210> 38
<211> 24
<212> DNA
<213> DYS576-R
<400> 38
tggagatgaa ggaggagatg ggag 24
<210> 39
<211> 26
<212> DNA
<213> DYS643-F
<400> 39
aggtgttcac tgcaagccat gcctgg 26
<210> 40
<211> 29
<212> DNA
<213> DYS643-R
<400> 40
agaacttgtt catgtaacca aacaccacc 29
<210> 41
<211> 31
<212> DNA
<213> DYS460-F
<400> 41
tccagtagtg atgctgtgtc actatatttc t 31
<210> 42
<211> 28
<212> DNA
<213> DYS460-R
<400> 42
cacaagaata ccagaggaat ctgacacc 28
<210> 43
<211> 25
<212> DNA
<213> DYS549-F
<400> 43
aataaggtag acatagcaat taggt 25
<210> 44
<211> 25
<212> DNA
<213> DYS549-R
<400> 44
gtaatgtccc cttttccatt tgtga 25
<210> 45
<211> 28
<212> DNA
<213> DYF387S1a/b-F
<400> 45
aaagcagaac atctgtgtat cagtgctg 28
<210> 46
<211> 30
<212> DNA
<213> DYF387S1a/b-R
<400> 46
tgactgcact ccagcctggg tgacagagct 30
<210> 47
<211> 28
<212> DNA
<213> DYS449-F
<400> 47
gtgtgatttt ttgtttaaaa agttcccc 28
<210> 48
<211> 32
<212> DNA
<213> DYS449-R
<400> 48
gcactctagg ttggacaaca agagtaagac ag 32
<210> 49
<211> 31
<212> DNA
<213> DYS518-F
<400> 49
actccagcct gggcaacaca agtgaaactg c 31
<210> 50
<211> 25
<212> DNA
<213> DYS518-R
<400> 50
ggacttagtt ttctaatcac atctt 25
<210> 51
<211> 24
<212> DNA
<213> DYS627-F
<400> 51
ctccacccta ggtgacagcg cagg 24
<210> 52
<211> 22
<212> DNA
<213> DYS627-R
<400> 52
ttctttcctt ccttacttcc at 22
<210> 53
<211> 24
<212> DNA
<213> DYS570-F
<400> 53
tgaatgatga ctaggtagaa atcc 24
<210> 54
<211> 31
<212> DNA
<213> DYS570-R
<400> 54
gacaactggt ggcaacctaa gctgaaatgc a 31
<210> 55
<211> 27
<212> DNA
<213> DYS527a/b-F
<400> 55
gcccagacaa cagagcaaaa ctctatc 27
<210> 56
<211> 29
<212> DNA
<213> DYS527a/b-R
<400> 56
acaacataag taaggtagtt ttcttttca 29
<210> 57
<211> 25
<212> DNA
<213> DYS447-F
<400> 57
ggttttatac attttaggga gacat 25
<210> 58
<211> 23
<212> DNA
<213> DYS447-R
<400> 58
ctttgcgtta tctctgcctt tct 23
<210> 59
<211> 26
<212> DNA
<213> DYS444-F
<400> 59
gtgcaataga tatataggta ggtaag 26
<210> 60
<211> 30
<212> DNA
<213> DYS444-R
<400> 60
gaaggaaatc tatatataag tgagcccatg 30
<210> 61
<211> 28
<212> DNA
<213> DYS557-F
<400> 61
gtgccaagcc tacatataat attttgac 28
<210> 62
<211> 28
<212> DNA
<213> DYS557-R
<400> 62
ggtcctgtag gcagggttaa gacagaag 28
<210> 63
<211> 28
<212> DNA
<213> DYS596-F
<400> 63
ccgtgccctt tactgcataa atgacatg 28
<210> 64
<211> 30
<212> DNA
<213> DYS596-R
<400> 64
ctattactac tgagtttctg atatagtgtt 30
<210> 65
<211> 29
<212> DNA
<213> DYS388-F
<400> 65
tagccgttta gcgatatata catattatg 29
<210> 66
<211> 20
<212> DNA
<213> DYS388-R
<400> 66
cgcaaccact gcgctccagc 20
<210> 67
<211> 24
<212> DNA
<213> DYF404S1a/b-F
<400> 67
gactgcagtg agccatgatg gaac 24
<210> 68
<211> 31
<212> DNA
<213> DYF404S1a/b-R
<400> 68
ttaaacaata ctgaagttta tcaaagggct t 31
<210> 69
<211> 27
<212> DNA
<213> DYS593-F
<400> 69
atagaagatc tcaccagtgg actccag 27
<210> 70
<211> 26
<212> DNA
<213> DYS593-R
<400> 70
tttatgccca agtgacactg ctgatt 26
<210> 71
<211> 28
<212> DNA
<213> DYS645-F
<400> 71
ttactacctt ccacacacgt ccactcaa 28
<210> 72
<211> 28
<212> DNA
<213> DYS645-R
<400> 72
taattacagc taagttaatt atatggac 28
<210> 73
<211> 19
<212> DNA
<213> GATA-A10-F
<400> 73
ctgtgtctca catcggact 19
<210> 74
<211> 19
<212> DNA
<213> GATA-A10-R
<400> 74
cttaacctgc ttcagataa 19
<210> 75
<211> 21
<212> DNA
<213> DYS626-F
<400> 75
tctggtgaac tgatccaatc c 21
<210> 76
<211> 20
<212> DNA
<213> DYS626-R
<400> 76
gtttgggtta cttcgccaga 20
<210> 77
<211> 19
<212> DNA
<213> DYS612-F
<400> 77
ggctttcacc agtttgcat 19
<210> 78
<211> 20
<212> DNA
<213> DYS612-R
<400> 78
ccatgtttag ggacattcct 20
<210> 79
<211> 19
<212> DNA
<213> DYS526-F
<400> 79
cctcccttct ccctgtctt 19
<210> 80
<211> 18
<212> DNA
<213> DYS526-R
<400> 80
cagagcagga taccatct 18
<210> 81
<211> 24
<212> DNA
<213> DYF399S1-F
<400> 81
agtctctcaa gcctgttcta tgaa 24
<210> 82
<211> 24
<212> DNA
<213> DYF399S1-R
<400> 82
attagctgga agtggagttt gctg 24
<210> 83
<211> 21
<212> DNA
<213> DYF403S1-F
<400> 83
ccatgttact gcaaaataca c 21
<210> 84
<211> 19
<212> DNA
<213> DYF403S1-R
<400> 84
tgacagagca taaacgtgt 19
<210> 85
<211> 20
<212> DNA
<213> DYS630-F
<400> 85
ttgggctgag gagttcaatc 20
<210> 86
<211> 19
<212> DNA
<213> DYS630-R
<400> 86
gcagtctcat ttcctggag 19
<210> 87
<211> 21
<212> DNA
<213> DYS547-F
<400> 87
tctggtgaac tgatccaatc c 21
<210> 88
<211> 20
<212> DNA
<213> DYS547-R
<400> 88
gtttgggtta cttcgccaga 20
<210> 89
<211> 21
<212> DNA
<213> Amelogenin-F
<400> 89
tgccctgggc tctgtaaaga a 21
<210> 90
<211> 24
<212> DNA
<213> Amelogenin-R
<400> 90
gaggccaacc atcagagctt aaac 24
<210> 91
<211> 19
<212> DNA
<213> D3S1358-F
<400> 91
gcaatcaaca gaggcttgc 19
<210> 92
<211> 18
<212> DNA
<213> D3S1358-R
<400> 92
cgtgacagag caagaccc 18
<210> 93
<211> 23
<212> DNA
<213> D1S1656-F
<400> 93
ctacatacaa ttaaacacac aca 23
<210> 94
<211> 18
<212> DNA
<213> D1S1656-R
<400> 94
cagggtcaac tgtgtgat 18
<210> 95
<211> 20
<212> DNA
<213> D6S1043-F
<400> 95
cgacttccca taataaatcc 20
<210> 96
<211> 16
<212> DNA
<213> D6S1043-R
<400> 96
cgcaaggatg ggtgga 16
<210> 97
<211> 20
<212> DNA
<213> D13S317-F
<400> 97
acagaagtct gggatgtgga 20
<210> 98
<211> 20
<212> DNA
<213> D13S317-R
<400> 98
gcccaaaaag acagacagaa 20
<210> 99
<211> 20
<212> DNA
<213> Penta E-F
<400> 99
tgctgaaaca ggagaatcac 20
<210> 100
<211> 21
<212> DNA
<213> Penta E-R
<400> 100
ctgtaaagtg cttagtatca t 21
<210> 101
<211> 20
<212> DNA
<213> D16S539-F
<400> 101
gatcccaagc tcttcctctt 20
<210> 102
<211> 20
<212> DNA
<213> D16S539-R
<400> 102
acgtttgtgt gtgcatctgt 20
<210> 103
<211> 20
<212> DNA
<213> D18S51-F
<400> 103
gagccatgtt catgccactg 20
<210> 104
<211> 20
<212> DNA
<213> D18S51-R
<400> 104
caaacccgac taccagcaac 20
<210> 105
<211> 20
<212> DNA
<213> D2S1338-F
<400> 105
ccagtggatt tggaaacaga 20
<210> 106
<211> 20
<212> DNA
<213> D2S1338-R
<400> 106
acctagcatg gtacctgcag 20
<210> 107
<211> 20
<212> DNA
<213> CSF1PO-F
<400> 107
cgcacccaat caccatagcc 20
<210> 108
<211> 22
<212> DNA
<213> CSF1PO-R
<400> 108
cttccaacct gagtctgcca ag 22
<210> 109
<211> 20
<212> DNA
<213> Penta D-F
<400> 109
ccttgagcct ggaaggtcga 20
<210> 110
<211> 27
<212> DNA
<213> Penta D-R
<400> 110
ctgcctaacc tatggtcata acgattt 27
<210> 111
<211> 24
<212> DNA
<213> TH01-F
<400> 111
gtgggctgaa aagctcccga ttat 24
<210> 112
<211> 24
<212> DNA
<213> TH01-R
<400> 112
gtgattccca ttggcctgtt cctc 24
<210> 113
<211> 33
<212> DNA
<213> vWA-F
<400> 113
gccctagtgg atgataagaa taatcagtat gtg 33
<210> 114
<211> 30
<212> DNA
<213> vWA-R
<400> 114
ggacagatga taaatacata ggatggatgg 30
<210> 115
<211> 22
<212> DNA
<213> D21S11-F
<400> 115
atatgtgagt caattcccca ag 22
<210> 116
<211> 22
<212> DNA
<213> D21S11-R
<400> 116
tgtattagtc aatgttctcc ag 22
<210> 117
<211> 19
<212> DNA
<213> D7S820-F
<400> 117
ccaatatttg gtgcaattc 19
<210> 118
<211> 19
<212> DNA
<213> D7S820-R
<400> 118
ccttaaaatc tgaggtatc 19
<210> 119
<211> 20
<212> DNA
<213> D5S818-F
<400> 119
gggtgatttt cctctttggt 20
<210> 120
<211> 20
<212> DNA
<213> D5S818-R
<400> 120
tgattccaat catagccaca 20
<210> 121
<211> 24
<212> DNA
<213> TPOX-F
<400> 121
actggcacag aacaggcact tagg 24
<210> 122
<211> 24
<212> DNA
<213> TPOX-R
<400> 122
ggaggaactg ggaaccacac aggt 24
<210> 123
<211> 23
<212> DNA
<213> D8S1179-F
<400> 123
ctgtatttca tgtgtacatt cgt 23
<210> 124
<211> 21
<212> DNA
<213> D8S1179-R
<400> 124
agattatttt cactgtgggg a 21
<210> 125
<211> 20
<212> DNA
<213> D12S391-F
<400> 125
aacaggatca atggatgcat 20
<210> 126
<211> 20
<212> DNA
<213> D12S391-R
<400> 126
tggcttttag acctggactg 20
<210> 127
<211> 20
<212> DNA
<213> D19S433-F
<400> 127
cctgggcaac agaataagat 20
<210> 128
<211> 22
<212> DNA
<213> D19S433-R
<400> 128
taggttttta aggaacaggt gg 22
<210> 129
<211> 20
<212> DNA
<213> FGA-F
<400> 129
cgcaaaaaag aaaggaagaa 20
<210> 130
<211> 23
<212> DNA
<213> FGA-R
<400> 130
acttaggcat atttacaagc tag 23
<210> 131
<211> 23
<212> DNA
<213> rs199815934-F
<400> 131
ccagtgattt aaactctctg aat 23
<210> 132
<211> 22
<212> DNA
<213> rs199815934-R
<400> 132
gagtaaagaa tattaatgat ac 22
<210> 133
<211> 21
<212> DNA
<213> rs759551978-F
<400> 133
aattgacagt tatcagtttg a 21
<210> 134
<211> 26
<212> DNA
<213> rs759551978-R
<400> 134
aaacttttcc tatagaagca aagata 26
<210> 135
<211> 24
<212> DNA
<213> rs11096433-F
<400> 135
aacagctgcc atcaactggc catg 24
<210> 136
<211> 25
<212> DNA
<213> rs11096433-R
<400> 136
aggaattgtt agaagggcca agggt 25
<210> 137
<211> 27
<212> DNA
<213> M145-F
<400> 137
ttgcatcatc ttcagctagt aacacag 27
<210> 138
<211> 24
<212> DNA
<213> M145-R
<400> 138
aggttcctcc cactcctttt tgga 24
<210> 139
<211> 25
<212> DNA
<213> rs9306845-F
<400> 139
ataaaatgac agtcttgacc tctaa 25
<210> 140
<211> 23
<212> DNA
<213> rs9306845-R
<400> 140
tttacttaaa gaaacaaatg agg 23
<210> 141
<211> 25
<212> DNA
<213> rs9786479-F
<400> 141
ctatgatcca ggcaatgtat ttaag 25
<210> 142
<211> 23
<212> DNA
<213> rs9786479-R
<400> 142
gaaatgaact aatacaaaga tat 23
<210> 143
<211> 23
<212> DNA
<213> rs17276358-F
<400> 143
agcgaaatga tgactaattg gtt 23
<210> 144
<211> 24
<212> DNA
<213> rs17276358-R
<400> 144
tgaaaactaa tcatgtttct gctg 24
<210> 145
<211> 23
<212> DNA
<213> rs2075640-F
<400> 145
atgtctaaat tactaaatca gta 23
<210> 146
<211> 23
<212> DNA
<213> rs2075640-R
<400> 146
cgatttccag catttcctcg gtc 23
<210> 147
<211> 27
<212> DNA
<213> F449-F
<400> 147
tgagaattct catttaccac tgtggag 27
<210> 148
<211> 26
<212> DNA
<213> F449-R
<400> 148
caatcagaat catcaaaccc agaagg 26
<210> 149
<211> 25
<212> DNA
<213> M88-F
<400> 149
ataggctatg gcctaggtgc ttttc 25
<210> 150
<211> 28
<212> DNA
<213> M88-R
<400> 150
cttagagagg tagtcactat atgctaca 28
<210> 151
<211> 23
<212> DNA
<213> M95-F
<400> 151
ttgggatcaa atggagttcc tga 23
<210> 152
<211> 23
<212> DNA
<213> M95-R
<400> 152
tacatcccta gtaagtctgg act 23
<210> 153
<211> 24
<212> DNA
<213> rs16980426-F
<400> 153
ttcaaataga caaatgccag aaga 24
<210> 154
<211> 27
<212> DNA
<213> rs16980426-R
<400> 154
agctttgtgt tttcagtagc acctaca 27
<210> 155
<211> 23
<212> DNA
<213> rs17323322-F
<400> 155
tgtcctcatc catttaaagg cca 23
<210> 156
<211> 28
<212> DNA
<213> rs17323322-R
<400> 156
acttcagagc actgaaaatt cagcctgc 28
<210> 157
<211> 27
<212> DNA
<213> N795-F
<400> 157
actgtgactt tgagagtcac ttgctct 27
<210> 158
<211> 27
<212> DNA
<213> N7-R
<400> 158
gccttttgga aatgaataaa tcaaggt 27
<210> 159
<211> 23
<212> DNA
<213> rs13447354-F
<400> 159
tgttaacctg tgctcaaatc ctt 23
<210> 160
<211> 24
<212> DNA
<213> rs13447354-R
<400> 160
aggtattttt aattctcact tagc 24
<210> 161
<211> 28
<212> DNA
<213> F1478-F
<400> 161
ccacagaagg atgctgctca gcttcctg 28
<210> 162
<211> 29
<212> DNA
<213> F1478-R
<400> 162
aatagctgct caggtacaca cagagtatc 29
<210> 163
<211> 27
<212> DNA
<213> rs9786707-F
<400> 163
tgtgcctatt tttctatgta ggggagt 27
<210> 164
<211> 28
<212> DNA
<213> rs9786707-R
<400> 164
agctgcaaaa ttgagtgagc tactttgt 28
<210> 165
<211> 28
<212> DNA
<213> M15-F
<400> 165
ctctgatctg gtcgcatgtc cagagggt 28
<210> 166
<211> 28
<212> DNA
<213> M15-R
<400> 166
ctcatgcgca tatacaatca aatgtgtt 28
<210> 167
<211> 31
<212> DNA
<213> rs16980711-F
<400> 167
tacatctaca aaaacgtaga tatatgccaa t 31
<210> 168
<211> 28
<212> DNA
<213> rs16980711-R
<400> 168
gatgcatagc ttcatggctg ttttgtct 28
<210> 169
<211> 32
<212> DNA
<213> M9-F
<400> 169
gaattcgctg cagcatataa aactttcagg ac 32
<210> 170
<211> 27
<212> DNA
<213> M9-R
<400> 170
gctaccttac ttacataact aagtatg 27
<210> 171
<211> 24
<212> DNA
<213> M7-F
<400> 171
acctgcatca ccaaagggca tgta 24
<210> 172
<211> 24
<212> DNA
<213> M7-R
<400> 172
ccttgtgaac caattatttc catt 24
<210> 173
<211> 24
<212> DNA
<213> M89-F
<400> 173
ccacagaagg atgctgctca gctt 24
<210> 174
<211> 24
<212> DNA
<213> M89-R
<400> 174
cactttgggt ccaggatcac cagc 24
<210> 175
<211> 27
<212> DNA
<213> M188-F
<400> 175
ttgggatcaa atggagttcc tgaggat 27
<210> 176
<211> 27
<212> DNA
<213> M188-R
<400> 176
tccctagtaa gtctggactc tcctaag 27
<210> 177
<211> 23
<212> DNA
<213> M111-F
<400> 177
cgttggatgg ccaaaacaac aga 23
<210> 178
<211> 23
<212> DNA
<213> M111-R
<400> 178
cgttggatgg ctgtgacact tct 23
<210> 179
<211> 20
<212> DNA
<213> M117-F
<400> 179
ttaaaattga cagttatcag 20
<210> 180
<211> 18
<212> DNA
<213> M117-R
<400> 180
ataactcacc aaaggaat 18
<210> 181
<211> 22
<212> DNA
<213> M119-F
<400> 181
ttaatacaaa atttaaaccg tt 22
<210> 182
<211> 26
<212> DNA
<213> M119-R
<400> 182
tatgctccaa accgcagtgc tatgtg 26
<210> 183
<211> 29
<212> DNA
<213> M122-F
<400> 183
tgagagtcac ttgctctgtg ttagaaaag 29
<210> 184
<211> 28
<212> DNA
<213> M122-R
<400> 184
agctatattt acagcaaact tggtaaac 28
<210> 185
<211> 28
<212> DNA
<213> M134-F
<400> 185
taattttact ctttttgaga attctcat 28
<210> 186
<211> 27
<212> DNA
<213> M134-R
<400> 186
gacaatcaga atcatcaaac ccagaag 27
<210> 187
<211> 32
<212> DNA
<213> M175-F
<400> 187
caagaaaaat agtacccaaa tcaactcaac tc 32
<210> 188
<211> 28
<212> DNA
<213> M175-R
<400> 188
ccatgtactt tgtccaatgc tgaaagta 28
<210> 189
<211> 28
<212> DNA
<213> M214-F
<400> 189
tttgctgctg atacaacaca ctggaaag 28
<210> 190
<211> 25
<212> DNA
<213> M214-R
<400> 190
catggaaatg ccacttcact ccagc 25
<210> 191
<211> 23
<212> DNA
<213> M216-F
<400> 191
ctcaaccagt ttttatgaag cta 23
<210> 192
<211> 26
<212> DNA
<213> M216-R
<400> 192
atatgagagt agcaaaagat aattgt 26
<210> 193
<211> 24
<212> DNA
<213> P123-F
<400> 193
gaatattcca aatatccacc ccaa 24
<210> 194
<211> 28
<212> DNA
<213> P123-R
<400> 194
aaagtcagga tccttacaga tttatctt 28
<210> 195
<211> 27
<212> DNA
<213> P128-F
<400> 195
agctcctctt tggacatcgg gagctgc 27
<210> 196
<211> 28
<212> DNA
<213> P128-R
<400> 196
gaagaaatat gtgatcacta tgatgaga 28
<210> 197
<211> 25
<212> DNA
<213> P131-F
<400> 197
attaggattt aatagtcccc ttcca 25
<210> 198
<211> 27
<212> DNA
<213> P131-R
<400> 198
aaagttaaaa actttatatg tagagaa 27
<210> 199
<211> 25
<212> DNA
<213> P132-F
<400> 199
aagctgcagt atcaaattaa ctaaa 25
<210> 200
<211> 27
<212> DNA
<213> P132-R
<400> 200
ataggcaata tgaataaata ctaaagg 27
<210> 201
<211> 28
<212> DNA
<213> P136-F
<400> 201
aatgcaagga atgcagagtc ctcagtgc 28
<210> 202
<211> 29
<212> DNA
<213> P136-R
<400> 202
tctcttgtaa ctgagcctcc tcttctctc 29
<210> 203
<211> 30
<212> DNA
<213> P145-F
<400> 203
ttgggaggtg gcctgatact tgtcagcgtc 30
<210> 204
<211> 27
<212> DNA
<213> P145-R
<400> 204
ctgtgaagag gactgctagc atctctt 27
<210> 205
<211> 27
<212> DNA
<213> P148-F
<400> 205
acatgcaatt cttgttaacc ctgtgag 27
<210> 206
<211> 27
<212> DNA
<213> P148-R
<400> 206
tcaggaaatc tctgatcatt gcatagt 27
<210> 207
<211> 29
<212> DNA
<213> P149-F
<400> 207
aatatgacca tatgtataca atcacatgt 29
<210> 208
<211> 29
<212> DNA
<213> P149-R
<400> 208
ttcatggttt ctctaatcta gactggctc 29
<210> 209
<211> 25
<212> DNA
<213> P151-F
<400> 209
tgtgcctatt tttctatgta gggga 25
<210> 210
<211> 27
<212> DNA
<213> P151-R
<400> 210
gtcattataa tcagaagtag aaacaga 27
<210> 211
<211> 27
<212> DNA
<213> P157-F
<400> 211
cgatgtgtaa ttcatttgat atgtttc 27
<210> 212
<211> 27
<212> DNA
<213> P157-R
<400> 212
tgcatattct aatacatggc agaatag 27
<210> 213
<211> 32
<212> DNA
<213> P164-F
<400> 213
gttcagcttc cagatattac cagcatgcag at 32
<210> 214
<211> 28
<212> DNA
<213> P164-R
<400> 214
attattcatg ctcccctctt tttcctcc 28
<210> 215
<211> 25
<212> DNA
<213> P186-F
<400> 215
cttgggcaga atctagaaga tgatt 25
<210> 216
<211> 28
<212> DNA
<213> P186-R
<400> 216
ccaggtttta ttcaaattga aatgtgga 28
<210> 217
<211> 26
<212> DNA
<213> P191-F
<400> 217
tcaattctcg caaatagtag cagaaa 26
<210> 218
<211> 28
<212> DNA
<213> P191-R
<400> 218
ttctttacat tatatgctaa catttgac 28
<210> 219
<211> 26
<212> DNA
<213> P196-F
<400> 219
agcatttaag ccagaattgc agatcc 26
<210> 220
<211> 27
<212> DNA
<213> P196-R
<400> 220
tgctgggatt gtaagcgaga gccactg 27
<210> 221
<211> 30
<212> DNA
<213> P197-F
<400> 221
tcacctctga ggccagcgaa atgatgacta 30
<210> 222
<211> 27
<212> DNA
<213> P197-R
<400> 222
tgattgagtt aaataaacaa gggtttg 27
<210> 223
<211> 30
<212> DNA
<213> P198-F
<400> 223
ctgttatagc tgaagagtga attggcctta 30
<210> 224
<211> 27
<212> DNA
<213> P198-R
<400> 224
aatgcagttt gctcttcaat gagggag 27
<210> 225
<211> 27
<212> DNA
<213> P199-F
<400> 225
tgtttaagat aggtaagtgt tcctagc 27
<210> 226
<211> 26
<212> DNA
<213> P199-R
<400> 226
caataaagga tcatacagct aaatga 26
<210> 227
<211> 29
<212> DNA
<213> P200-F
<400> 227
gtgtttgtcc agaggtatct tttcagatc 29
<210> 228
<211> 27
<212> DNA
<213> P200-R
<400> 228
gggagagttg ccatctgtcc cctctta 27
<210> 229
<211> 28
<212> DNA
<213> P201-F
<400> 229
tactgagcat gatgtgctgt gcaagttg 28
<210> 230
<211> 26
<212> DNA
<213> P201-R
<400> 230
ccccaaatcc caaggtagtt cctcag 26
<210> 231
<211> 24
<212> DNA
<213> M174-F
<400> 231
ataatgtcct ttttaatgta tcaa 24
<210> 232
<211> 25
<212> DNA
<213> M174-R
<400> 232
gcaaaaggag aaggacaaga cccat 25
<210> 233
<211> 28
<212> DNA
<213> M173-F
<400> 233
catataaatt tactgtaact tcctagaa 28
<210> 234
<211> 32
<212> DNA
<213> M173-R
<400> 234
tgactcagta tgggtaaaag aaatgctgca gt 32
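The records in the sequence listing above follow a regular pattern: `<210>` gives the sequence number, `<211>` the length, `<213>` (here) the primer name, and the line after `<400>` the bases in groups of ten followed by the length. The sketch below shows one way such records could be read and the stated lengths checked. It is not part of the invention; the function name `parse_listing` and the dictionary keys are hypothetical, and only the first two records are used as sample input.

```python
import re
from typing import Dict, List

FIELD = re.compile(r"^<(\d{3})>\s*(.*)$")


def parse_listing(text: str) -> List[Dict[str, str]]:
    """Parse <210>/<211>/<213>/<400> records into dictionaries."""
    records: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    expect_sequence = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD.match(line)
        if match:
            code, value = match.groups()
            if code == "210":      # a new record starts here
                current = {"id": value}
                records.append(current)
            elif code == "211":
                current["length"] = value
            elif code == "213":
                current["name"] = value
            expect_sequence = code == "400"
        elif expect_sequence and current:
            # Sequence line: groups of bases followed by a numeric length.
            bases = "".join(tok for tok in line.split() if tok.isalpha())
            current["sequence"] = current.get("sequence", "") + bases
    return records


sample = """
<210> 1
<211> 22
<212> DNA
<213> DYS19-F
<400> 1
ggttaaggag agtgtcacta ta 22
<210> 2
<211> 28
<212> DNA
<213> DYS19-R
<400> 2
gactgtgatt attttttgat ttcactat 28
"""

for rec in parse_listing(sample):
    ok = len(rec["sequence"]) == int(rec["length"])
    print(rec["name"], rec["length"], "length matches" if ok else "length mismatch")
```

A check of this kind is a cheap way to catch truncated or garbled primer sequences before they are used for primer synthesis or alignment.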

Claims (10)

1. An NGS-based STR and SNP genetic marker combined detection system, characterized by comprising 123 loci: 49 Y-STR loci, 50 Y-SNP loci, 3 Y-indel loci with mutation rates close to zero, and 21 autosomal STR loci.
2. The detection system of claim 1, wherein the Y-STR loci are DYS19, DYS385a/b, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, GATA_H4, GATA-A10, DYS481, DYS533, DYS576, DYS643, DYS460, DYS549, DYF387S1a/b, DYS449, DYS518, DYS627, DYS570, DYS527a/b, DYS447, DYS444, DYS557, DYS596, DYS388, DYF404S1a/b, DYS593, DYS645, DYS1, DYF403S1, DYS626, DYS576 and DYS612; the Y-SNP loci are M7, M89, M95, M111, M117, M119, M122, M134, M175, M214, M216, P123, P128, P131, P132, P136, P145, P148, P149, P151, P157, P164, P186, P191, P196, P197, P198, P199, P200, P201, M173, M174, rs11096433, M145, rs9306845, rs9786479, rs17276358, rs2075640, F449, M88, M188, rs16980426, rs17323322, F1478, rs13447354, N7, rs9786707, M15, rs16980711 and M9; the Y-indel loci are rs771783753, rs199815934 and rs759551978; and the autosomal STR loci are Amelogenin, D3S1358, D1S1656, D6S1043, D13S317, Penta E, D16S539, D18S51, D2S1338, CSF1PO, Penta D, TH01, vWA, D21S11, D7S820, D5S818, TPOX, D8S1179, D12S391, D19S433 and FGA.
3. A method for detecting a population genetic polymorphism using the detection system of claim 1, comprising the steps of:
S1, screening loci and designing primers according to the locus sequences;
S2, constructing the gene library by multiplex PCR amplification with the primers designed in S1;
S3, sequencing the gene library prepared in step S2 and analyzing the data.
4. The detection method according to claim 3, wherein the primer sequence information and the positions thereof in the sequence table of step S1 are as shown in Table 1:
Table 1. Repeat and primer information corresponding to different loci
[Table 1 is reproduced as images in the original publication: Figures FDA0002894937950000011 to FDA0002894937950000141.]
5. The method according to claim 3, wherein the gene library of step S2 is constructed by:
(1) preparing a multiplex PCR reaction mixture as follows: 10 μL of primer mixture, 4 μL of multiplex PCR amplification buffer, 1-100 ng of template DNA, and sterile ultrapure water to a total reaction volume of 20 μL; the primer mixture comprises oligonucleotides and 1× TE Buffer, and the multiplex PCR amplification buffer comprises DNA polymerase with an enzyme activity of 5-20 U, dNTPs at a final concentration of 0.1-0.5 mmol/L and MgCl2 at a final concentration of 1.5-6 mmol/L;
performing PCR amplification under the following program: 98 °C for 2 min; 28 cycles of (98 °C for 15 s, 60 °C for 4 min); 72 °C for 10 min; hold at 4 °C; thereby obtaining a PCR product;
(2) after amplification is completed, adding digestion buffer to each PCR product prepared in step (1), the reaction consisting of 20 μL of multiplex PCR product and 2 μL of digestion buffer, and digesting according to the following program: 50 °C for 10 min, 55 °C for 10 min, 60 °C for 20 min, hold at 10 °C, to obtain a digestion product; the digestion buffer comprises Tris-HCl at a final concentration of 50-100 mmol/L, MgCl2 at a final concentration of 1.5-6 mmol/L, ATP at a final concentration of 0.1-0.5 mmol/L and dNTPs at a final concentration of 0.1-0.5 mmol/L;
(3) preparing a ligation reaction mixture as follows: 22 μL of the digestion product prepared in step (2), 6 μL of ligation buffer, 1 μL of adapter and 1 μL of ligase, for a total volume of 30 μL; and ligating according to the following program: 22 °C for 30 min, 72 °C for 10 min, hold at 10 °C, to obtain a PCR library; the ligation buffer comprises Tris-HCl at a final concentration of 50-100 mmol/L, MgCl2 at a final concentration of 1.5-6 mmol/L, ATP at a final concentration of 0.2-1 mmol/L and DTT at a final concentration of 0.1-0.5 mmol/L, and the ligase is T4 DNA ligase;
(4) purifying the PCR library prepared in step (3), and then amplifying again using the purified library as a template, wherein the amplification system comprises: 20 μL of purified library, 25 μL of HiFi library amplification buffer and 5 μL of PCR primer mixture, and the amplification program is: 95 °C for 3 min; 5 cycles of (98 °C for 20 s, 60 °C for 15 s, 72 °C for 30 s); 72 °C for 10 min; hold at 4 °C; thereby obtaining an amplification library; the HiFi library amplification buffer comprises DNA polymerase with an enzyme activity of 3-6 U, dNTPs at a final concentration of 0.15-0.35 mmol/L and MgCl2 at a final concentration of 2-5 mmol/L, and the PCR primer mixture comprises oligonucleotides and 1× TE Buffer;
(5) purifying the amplification library prepared in step (4), performing library quality inspection and quality evaluation, and finally quantifying each library separately with a Qubit so that the final library volume is 20 μL and the final concentration is in the range of 30-60 pM, to obtain the target product.
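The volumes and cycling programs of claim 5 lend themselves to a small arithmetic check. The sketch below is illustrative only: the function names are introduced here, the 2 μL template volume is an assumed example (the claim specifies template mass, 1-100 ng, not volume), and the run times ignore ramp times and the final hold.

```python
def water_volume_ul(template_ul, primer_mix_ul=10.0, buffer_ul=4.0, total_ul=20.0):
    """Sterile water needed to bring the step (1) reaction to its total volume."""
    water = total_ul - primer_mix_ul - buffer_ul - template_ul
    if water < 0:
        raise ValueError("template volume exceeds the space left in the reaction")
    return water


def program_minutes(initial_s, cycles, cycle_steps_s, final_s):
    """Nominal run time in minutes, ignoring ramp times and the final hold."""
    return (initial_s + cycles * sum(cycle_steps_s) + final_s) / 60.0


# Assumed example: 2 uL of template DNA leaves room for 4 uL of water.
water = water_volume_ul(2.0)

# Step (1): 98 C for 2 min; 28 x (98 C 15 s, 60 C 4 min); 72 C for 10 min.
multiplex_min = program_minutes(120, 28, [15, 240], 600)

# Step (4): 95 C for 3 min; 5 x (98 C 20 s, 60 C 15 s, 72 C 30 s); 72 C for 10 min.
library_amp_min = program_minutes(180, 5, [20, 15, 30], 600)

# Component volumes stated in steps (2), (3) and (4).
digestion_total_ul = 20 + 2          # PCR product + digestion buffer
ligation_total_ul = 22 + 6 + 1 + 1   # digestion product + buffer + adapter + ligase
amplification_total_ul = 20 + 25 + 5 # purified library + HiFi buffer + primer mixture
```

Under these assumptions the digestion volume (22 μL) matches the input to the ligation step, and the ligation components sum to the stated 30 μL total.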
6. The detection method according to claim 5, wherein the PCR library purification process of step (4) comprises the following steps:
a. preparation before purification: mixing the VAHTS™ DNA Clean Beads by shaking and equilibrating them to room temperature; preparing sufficient fresh ethanol with a volume fraction of 80%, about 400 μL per sample; and, before the purification step, bringing the sample to 60 μL with sterile water;
b. vortexing the magnetic beads to mix them thoroughly, adding 60 μL (1×) of magnetic beads to the PCR library, gently pipetting up and down 10 times to make the whole system uniform, and incubating at 25 °C for 8 min so that the library binds to the magnetic beads;
c. centrifuging the reaction tube at 500 rpm for 1 min, and placing it on a magnetic rack to separate the magnetic beads from the liquid;
d. keeping the PCR tube on the magnetic rack; after 5 min, once the solution is clear, carefully discarding the supernatant; adding 200 μL of ethanol with a volume fraction of 80% without disturbing the magnetic beads; incubating for 30 s; and carefully removing the supernatant;
e. repeating step d, for a total of two rinses; briefly centrifuging at 500 rpm for 1 min to collect the sample at the bottom of the PCR tube; placing the PCR tube on the magnetic rack for 30 s; removing all residual ethanol with a pipette; and air-drying with the lid open for 3-5 min;
f. after the magnetic beads have air-dried, removing the PCR tube from the magnetic rack, adding 22 μL of nuclease-free water to cover the magnetic beads, and mixing by pipetting; then incubating at 25 °C for 2 min;
g. centrifuging the PCR tube at 500 rpm for 1 min, then placing it on the magnetic rack to separate the magnetic beads from the liquid until the solution is clear;
k. carefully aspirating 20 μL of the supernatant and transferring it to a new EP tube;
the purification process of step (5) is the same as that of step (4), except that in step b 120 μL (1.2×) of magnetic beads is added for purification.
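The bead and ethanol quantities in claim 6 follow simple proportional arithmetic, sketched below. The function names are introduced here for illustration, and the sketch assumes the usual convention that an "N×" bead ratio means bead volume divided by sample volume; the claims do not define the ratio explicitly.

```python
def bead_volume_ul(sample_ul, ratio):
    """Bead volume for a given sample volume, assuming ratio = beads / sample."""
    return sample_ul * ratio


def implied_sample_volume_ul(bead_ul, ratio):
    """Sample volume implied by a bead volume under the same convention."""
    return bead_ul / ratio


def ethanol_needed_ul(n_samples, rinses=2, per_rinse_ul=200.0):
    """Fresh 80% ethanol required for the rinses in steps d and e."""
    return n_samples * rinses * per_rinse_ul


# Step b of the step (4) purification: a 60 uL sample at 1x needs 60 uL of beads.
beads_step4 = bead_volume_ul(60.0, 1.0)

# Two 200 uL rinses per sample agree with "about 400 uL" in step a.
ethanol_one_sample = ethanol_needed_ul(1)

# For the step (5) purification, 120 uL at 1.2x would imply a 100 uL sample
# under this convention; the claims do not state that sample volume.
implied_step5_sample = round(implied_sample_volume_ul(120.0, 1.2), 6)
```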
7. The detection method according to claim 5, wherein the library quality inspection and quality evaluation process comprises: using a fragment analysis instrument ([instrument name rendered as image FDA0002894937950000161 in the original publication] GX Touch™) to perform fragment quality inspection on the constructed library, and selecting libraries that show neither small-fragment adapter peaks nor large-fragment tailing peaks.
8. The detection method according to claim 3, wherein the data analysis process of step S3 is: after sequencing is completed, the instrument automatically generates a FASTQ file; data analysis is performed on the FASTQ file using STRait Razor 2.6 software; the BAM files are analyzed with IGV_2.3.72, and the sequencing data for each locus are visualized with Integrative Genomics Viewer v2.3.72; SAMtools and Picard are used for data processing and for creating BAM and BAI files; data extraction and variant calling are performed with GATK; and Microsoft Excel and RStudio v1.2.1335 are used for data processing and statistical analysis.
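To illustrate what the first stage of such an analysis produces, the sketch below counts the sequences found between a pair of flanking sequences in FASTQ reads. It is a deliberately simplified stand-in, not the implementation used by STRait Razor or any other tool named in claim 8; the locus name, flanks and reads are made up for illustration.

```python
import io
from collections import Counter


def read_fastq(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def count_locus_sequences(handle, loci):
    """Count the sequences found between each locus's flanks.

    loci maps a locus name to a (left_flank, right_flank) pair.
    """
    counts = {name: Counter() for name in loci}
    for seq in read_fastq(handle):
        for name, (left, right) in loci.items():
            start = seq.find(left)
            if start < 0:
                continue
            start += len(left)
            end = seq.find(right, start)
            if end < 0:
                continue
            counts[name][seq[start:end]] += 1
    return counts


# Hypothetical flanks and reads, for illustration only.
LOCI = {"LOCUS_X": ("ACGTAC", "TTGCAA")}
FASTQ = (
    "@r1\nGGACGTACGATAGATAGATATTGCAACC\n+\n" + "I" * 28 + "\n"
    "@r2\nGGACGTACGATAGATAGATATTGCAACC\n+\n" + "I" * 28 + "\n"
    "@r3\nGGACGTACGATAGATATTGCAACC\n+\n" + "I" * 24 + "\n"
    "@r4\nCCCCCCCCCCCC\n+\n" + "I" * 12 + "\n"
)

counts = count_locus_sequences(io.StringIO(FASTQ), LOCI)
locus_depth = sum(counts["LOCUS_X"].values())  # reads assigned to the locus
```

Reads that lack either flank (r4 here) are not assigned to any locus, so the per-locus total corresponds to the locus coverage depth used in the later claims.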
9. The detection method according to claim 8, characterized in that said data processing and statistical analysis involve the following parameters:
(1) analysis threshold: the limit set for filtering Noise sequences, expressed as a percentage;
(2) locus sequence composition ratio: the proportion of the total coverage at a locus accounted for by Allele, Stutter and Noise sequences, respectively;
(3) locus coverage depth: the sequencing depth of a locus, i.e., the total number of reads for that locus;
(4) sample coverage depth: the average sequencing depth of each sample in one sequencing run;
(5) allele coverage ratio (ACR): the heterozygote balance ratio, i.e., the ratio of the coverages of the two different alleles of a heterozygote, expressed as ACR = φlower/φhigher, where φlower denotes the coverage of the lower-coverage allele and φhigher denotes the coverage of the higher-coverage allele;
(6) mixing ratio: the measured mixing proportion, calculated from the numbers of allele reads in the NGS mixture profile; heterozygote allele mixing ratio: M = (A1 reads + A2 reads)/Total reads;
homozygote allele mixing ratio: M = A reads/Total reads.
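The ratios defined in items (5) and (6) can be computed directly from read counts, as in the sketch below. The function names and the example counts are illustrative assumptions. The threshold helper in particular encodes one possible reading of item (1), namely that a sequence is discarded when its share of locus coverage falls below the threshold percentage; the claim does not spell this out.

```python
def allele_coverage_ratio(reads_a1, reads_a2):
    """ACR = coverage of the lower allele / coverage of the higher allele."""
    lower, higher = sorted((reads_a1, reads_a2))
    if higher == 0:
        raise ValueError("no reads for either allele")
    return lower / higher


def mixing_ratio(allele_reads, total_reads):
    """M = sum of allele reads / total reads.

    Pass one count for a homozygote and two counts for a heterozygote.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return sum(allele_reads) / total_reads


def apply_analysis_threshold(seq_counts, threshold_pct):
    """Keep sequences whose share of locus coverage reaches threshold_pct.

    Assumed interpretation of the analysis threshold in item (1).
    """
    total = sum(seq_counts.values())
    if total == 0:
        return {}
    return {s: n for s, n in seq_counts.items() if 100.0 * n / total >= threshold_pct}


# Made-up counts for one heterozygous locus with 1000 total reads.
acr = allele_coverage_ratio(450, 500)          # 0.9
m_het = mixing_ratio([450, 500], 1000)         # 0.95
m_hom = mixing_ratio([800], 1000)              # 0.8
kept = apply_analysis_threshold({"A": 450, "B": 500, "C": 50}, 10.0)
```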
10. The detection method according to claim 9, wherein Stutter in item (2) is defined as a sequence whose length differs by ±4 bp from that of the corresponding Allele; Noise is defined as a sequence that is neither Allele nor Stutter;
and the proportion of each sequence type is calculated as follows:
allele% = allele reads/Total reads × 100%;
stutter% = stutter reads/Total reads × 100%;
noise% = 100% - allele% - stutter%.
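The composition ratios of claim 10 can be sketched as follows. The function name and example counts are assumptions made for illustration, and stutter is identified here by length alone (exactly 4 bp longer or shorter than a called allele), which is a simplification of the definition in the claim.

```python
def composition_percentages(seq_counts, alleles, stutter_offset=4):
    """Return (allele%, stutter%, noise%) for one locus.

    seq_counts maps each observed sequence to its read count;
    alleles is the set of sequences called as true alleles.
    """
    total = sum(seq_counts.values())
    if total == 0:
        raise ValueError("locus has no reads")
    allele_lengths = {len(a) for a in alleles}
    allele_reads = 0
    stutter_reads = 0
    for seq, n in seq_counts.items():
        if seq in alleles:
            allele_reads += n
        elif any(abs(len(seq) - length) == stutter_offset for length in allele_lengths):
            stutter_reads += n
    allele_pct = 100 * allele_reads / total
    stutter_pct = 100 * stutter_reads / total
    noise_pct = 100 - allele_pct - stutter_pct
    return allele_pct, stutter_pct, noise_pct


# Made-up homozygous locus: a 10-repeat allele, its 9-repeat stutter,
# and a sequence that is one base shorter than the allele (noise).
observed = {
    "GATA" * 10: 900,
    "GATA" * 9: 60,
    "GATA" * 9 + "GAT": 40,
}
allele_pct, stutter_pct, noise_pct = composition_percentages(observed, {"GATA" * 10})
```

With these example counts the three percentages are 90, 6 and 4, which sum to 100 as the noise formula requires.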
CN202110039036.0A 2021-01-12 2021-01-12 STR and SNP genetic marker combined detection system and detection method based on NGS Withdrawn CN112695100A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110039036.0A CN112695100A (en) 2021-01-12 2021-01-12 STR and SNP genetic marker combined detection system and detection method based on NGS

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110039036.0A CN112695100A (en) 2021-01-12 2021-01-12 STR and SNP genetic marker combined detection system and detection method based on NGS

Publications (1)

Publication Number Publication Date
CN112695100A true CN112695100A (en) 2021-04-23

Family

ID=75514228

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110039036.0A Withdrawn CN112695100A (en) 2021-01-12 2021-01-12 STR and SNP genetic marker combined detection system and detection method based on NGS

Country Status (1)

Country Link
CN (1) CN112695100A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7463564B2 (en) 2021-08-30 2024-04-08 司法鑑定科学研究院 Primer compositions, kits, methods and uses thereof for detecting Y-SNP haplogroups by next generation sequencing technology

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102725422A (en) * 2009-09-11 2012-10-10 生命科技公司 Analysis of Y-chromosome STR markers
CN106399543A (en) * 2016-10-26 2017-02-15 四川大学 Forensic medicine II sequence testing kit based on 74 Y-chromosome SNP genetic markers
CN108300790A (en) * 2018-01-12 2018-07-20 四川大学 Medical jurisprudence two generations sequencing kit based on 165 Y-SNP
CN108517363A (en) * 2018-03-08 2018-09-11 深圳华大法医科技有限公司 A kind of individual identification system, kit and application thereof based on the sequencing of two generations
CN110777211A (en) * 2019-11-19 2020-02-11 公安部物证鉴定中心 Composite amplification system based on Y-STR locus and Y-indel locus and primer combination used by same
CN110863056A (en) * 2018-08-27 2020-03-06 深圳华大法医科技有限公司 Method, reagent and application for accurately typing human DNA
CN111286548A (en) * 2020-04-13 2020-06-16 公安部物证鉴定中心 Kit for detecting 68 loci based on next-generation sequencing technology and primer combination used by kit
CN111808933A (en) * 2020-06-23 2020-10-23 安徽微分基因科技有限公司 Standard substance for Illumina second-generation sequencing platform

Similar Documents

Publication Publication Date Title
CA2786357C (en) Simultaneous determination of aneuploidy and fetal fraction
US20220205023A1 (en) Se33 mutations impacting genotype concordance
CN107012225B (en) STR locus detection kit and detection method based on high-throughput sequencing
Morin et al. Rapid screening and comparison of human microsatellite markers in baboons: allele size is conserved, but allele number is not
CN110331193B (en) PCR-SBT method and reagent for genotyping of human killer cell immunoglobulin receptor KIR3DL2
CN109112217B (en) Genetic marker obviously related to pig body length and nipple number and application
CN111286548B (en) Kit for detecting 68 loci based on next-generation sequencing technology and primer combination used by kit
CN112695100A (en) STR and SNP genetic marker combined detection system and detection method based on NGS
CN108823294B (en) Forensic medicine composite detection kit based on Y-SNP genetic markers of 20 haplotype groups D
Almeida et al. Authentication of human and mouse cell lines by short tandem repeat (STR) DNA genotype analysis
CN113637776A (en) Primer group special for Diego blood group gene sequencing, kit and method
CN112342303A (en) NGS-based human Y chromosome STR and SNP genetic marker combined detection system and detection method
CN110669833B (en) Primer and kit for detecting human motor neuron genes by using single tube
Banlaki et al. Intraspecific evolution of human RCCX copy number variation traced by haplotypes of the CYP21A2 gene
CN108441547B (en) Primer group, kit and method for HLA gene amplification and genotyping
CN116287319A (en) Primer composition, kit and method for detecting STR and SNP based on second-generation sequencing technology and application of primer composition
CN115820879B (en) Molecular marker related to intramuscular fat traits of pigs in pig AOPEP gene and application thereof
CN104726604B (en) Decayed-sample degradation DNA (deoxyribonucleic acid) detection method and application thereof
CN114574595B (en) Application of human chromosome InDel gene locus, primer group, product thereof and individual identification method of test material
CN111394477B (en) Reagent kit for detecting 120 gene loci based on second-generation sequencing technology and primer combination used by reagent kit
CN108841929B (en) PCR amplification system and detection kit for human ATP7b gene exon and application method thereof
CN111560420A (en) ABO gene haploid typing method and reagent
CN112410411A (en) HLA single SNP detection kit and detection method
Wu et al. Forensic application of a novel MPS-based panel (90 STRs and 100 SNPs) in a non-exclusion parentage case with three autosomal STRs incompatibilities
CN112746096A (en) Human Y-STR detection method based on next-generation sequencing and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210423