CN109971852A - Detecting mutations and ploidy in chromosome segments - Google Patents
- Publication number
- CN109971852A (application CN201910135027.4A)
- Authority
- CN
- China
- Prior art keywords
- allele
- sample
- dna
- gene
- hypothesis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/10—Ploidy or copy number detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/40—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Z—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
- G16Z99/00—Subject matter not provided for in other main groups of this subclass
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2539/00—Reactions characterised by analysis of gene expression or genome comparison
- C12Q2539/10—The purpose being sequence identification by analysis of gene expression or genome comparison characterised by
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/16—Primer sets for multiplex assays
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/172—Haplotypes
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/20—Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
Abstract
The present invention provides methods, systems, and computer-readable media for detecting the ploidy of chromosome segments or whole chromosomes, for detecting single nucleotide variants, and for detecting both chromosome segment ploidy and single nucleotide variants. In some aspects, the invention provides methods, systems, and computer-readable media for detecting cancer or a chromosomal abnormality in a fetus.
Description
This application is a divisional of invention patent application No. 201580033190.X, entitled "Detecting mutations and ploidy in chromosome segments", which has a filing date of April 21, 2015 and entered the Chinese national phase on December 20, 2016.
Cross reference to related applications
This application claims priority to U.S. Provisional Patent Application No. 61/982,245, filed April 21, 2014; U.S. Provisional Patent Application No. 61/987,407, filed May 1, 2014; U.S. Provisional Patent Application No. 62/066,514, filed October 21, 2014; U.S. Provisional Patent Application No. 62/146,188, filed April 10, 2015; U.S. Provisional Patent Application No. 62/147,377, filed April 14, 2015; and U.S. Provisional Patent Application No. 62/148,173, filed April 15, 2015, each of which is incorporated herein by reference in its entirety.
Technical field
The present invention relates generally to methods and systems for detecting the ploidy of chromosome segments, and to methods and systems for detecting single nucleotide variants.
Technical background
Copy number variation (CNV) is considered a major cause of structural variation in the genome. It comprises duplications and deletions of sequences that typically range in length from 1,000 base pairs (1 kb) to 20 megabases (Mb). Deletions and duplications of chromosome segments or of whole chromosomes are associated with a variety of conditions, such as susceptibility or resistance to disease.
CNVs are typically assigned to one of two main categories based on the length of the affected sequence. The first category includes copy number polymorphisms (CNPs), which are common in the general population, occurring with an overall frequency greater than 1%. CNPs are usually small (most are less than 10 kilobases in length), and they are often enriched for genes encoding proteins important for detoxification and immunity. A subset of these CNPs is highly variable with respect to copy number. As a result, different human chromosomes can carry a wide range of copy numbers (e.g., 2, 3, 4, 5, etc.) for a particular set of genes. CNPs associated with immune response genes have recently been linked to susceptibility to complex genetic diseases, including psoriasis, Crohn's disease (regional ileitis), and glomerulonephritis.
The second category of CNVs includes relatively rare variants that are much longer than CNPs, ranging in size from hundreds of thousands of base pairs to more than 1 million base pairs in length. In some cases, these CNVs may have arisen during the formation of the sperm or egg that gave rise to a particular individual, or they may have been passed down for several generations within a family. These large and rare structural variants have been observed disproportionately in subjects with mental retardation, developmental delay, schizophrenia, and autism. Their appearance in such subjects has led to speculation that large and rare CNVs may be more important in neurocognitive diseases than other forms of inherited mutation, including single nucleotide substitutions.
Gene copy number can be altered in tumor cells. For example, duplication of Chr1p is common in breast cancer, and the copy number of EGFR is higher than normal in non-small cell lung cancer. Cancer is one of the leading causes of death; early diagnosis and treatment of cancer are therefore important, because they can improve the patient's prognosis (for example, by increasing the rate of remission and the duration of remission). Early diagnosis can also allow the patient to undergo fewer or less drastic treatment options. Many existing therapies that destroy cancer cells also affect normal cells, leading to a variety of possible side effects such as nausea, vomiting, low blood counts, increased risk of infection, hair loss, and ulcers in mucous membranes. Early detection of cancer is therefore desirable, because it can reduce the amount and/or number of treatments (such as chemotherapeutic agents and radiation) needed to eliminate the cancer.
Copy number variation has also been associated with severe mental and physical handicaps and with idiopathic learning disability. Non-invasive prenatal testing (NIPT) using cell-free DNA (cfDNA) can be used to detect abnormalities such as fetal trisomies 13, 18, and 21, triploidy, and sex chromosome aneuploidies. Subchromosomal microdeletions, which can also result in severe mental and physical handicaps, are more difficult to detect because of their smaller size. Eight microdeletion syndromes have an aggregate incidence of more than 1 in 1,000, making them nearly as common as fetal autosomal trisomies.
In addition, a higher copy number of CCL3L1 has been associated with lower susceptibility to HIV infection, and a low copy number of FCGR3B (the CD16 cell surface immunoglobulin receptor) can increase susceptibility to systemic lupus erythematosus and similar inflammatory autoimmune diseases.
Improved methods are therefore needed for detecting deletions and duplications of chromosome segments or whole chromosomes. Preferably, these methods can be used to more accurately diagnose a disease or an increased risk of a disease, such as cancer, or to detect copy number variation in a gestating fetus.
Summary of the invention
In an illustrative embodiment, the present invention provides a method for determining the ploidy of a chromosome segment in a sample from an individual. The method includes the following steps:
a. receiving allele frequency data comprising the amount of each allele present in the sample at each locus in a set of polymorphic loci on the chromosome segment;
b. generating phased allelic information for the set of polymorphic loci by estimating the phase of the allele frequency data;
c. generating, using the allele frequency data, individual probabilities of the allele frequencies at the polymorphic loci for different ploidy states;
d. generating joint probabilities for the set of polymorphic loci using the individual probabilities and the phased allelic information; and
e. selecting, based on the joint probabilities, a best-fit model indicative of chromosomal ploidy, thereby determining the ploidy of the chromosome segment.
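Steps a through e can be illustrated with a minimal Python sketch. It is not the patented implementation: the binomial read-count model, the three hypotheses in `HYPOTHESES`, and all function names are assumptions introduced here for illustration. Phase enters through the allele assigned to each homolog, and the joint probability is taken as a sum of per-locus log probabilities.

```python
import math

def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    p = min(max(p, 1e-12), 1 - 1e-12)  # clamp to avoid log(0) at homozygous loci
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def expected_a_fraction(h1_allele, h2_allele, copies_h1, copies_h2):
    """Expected fraction of 'A' reads given phased alleles and homolog copy numbers."""
    a_copies = copies_h1 * (h1_allele == "A") + copies_h2 * (h2_allele == "A")
    return a_copies / (copies_h1 + copies_h2)

# Hypothetical ploidy hypotheses: (copies of homolog 1, copies of homolog 2).
HYPOTHESES = {"disomy": (1, 1), "h1_duplicated": (2, 1), "h2_duplicated": (1, 2)}

def select_ploidy(loci):
    """loci: list of (a_reads, b_reads, h1_allele, h2_allele), one per polymorphic locus.

    Returns the best-fit hypothesis name and the joint log probability of each hypothesis.
    """
    joint = {}
    for name, (c1, c2) in HYPOTHESES.items():
        total = 0.0
        for a_reads, b_reads, h1, h2 in loci:
            p = expected_a_fraction(h1, h2, c1, c2)                   # model for this locus
            total += log_binom_pmf(a_reads, a_reads + b_reads, p)   # individual probability
        joint[name] = total                                          # joint over the locus set
    best = max(joint, key=joint.get)
    return best, joint
```

For example, `select_ploidy([(667, 333, "A", "B"), (330, 670, "B", "A")])` favors `"h1_duplicated"`, because the excess allele switches from A to B exactly where the phased haplotype says homolog 1 carries B.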
In an illustrative embodiment of the method for determining ploidy, the data are generated from nucleic acid sequence data, especially high-throughput nucleic acid sequence data. In certain illustrative embodiments of the method, the allele frequency data are corrected for errors before being used to generate the individual probabilities. In specific illustrative embodiments, the errors corrected include allele amplification efficiency bias. In other illustrative embodiments, the errors corrected include ambient contamination and genotype contamination. In some embodiments, the errors corrected include allele amplification efficiency bias, ambient contamination, and genotype contamination.
In certain embodiments of the method for determining ploidy, the individual probabilities are generated using a set of models of different ploidy states and allelic imbalance fractions for the set of polymorphic loci. In these and other embodiments, the joint probabilities are generated by considering the linkage between polymorphic loci on the chromosome segment.
Accordingly, in an illustrative embodiment that combines these embodiments, provided herein is a method for detecting chromosomal ploidy in a sample from an individual, comprising the following steps:
a. receiving nucleic acid sequence data for alleles at a set of polymorphic loci on a chromosome segment in the individual;
b. detecting allele frequencies at the set of loci using the nucleic acid sequence data;
c. correcting for allele amplification efficiency bias in the detected allele frequencies to generate corrected allele frequencies for the set of polymorphic loci;
d. generating phased allelic information for the set of polymorphic loci by estimating the phase of the nucleic acid sequence data;
e. generating individual probabilities of the allele frequencies at the polymorphic loci for different ploidy states by comparing the corrected allele frequencies with a set of models of different ploidy states and allelic imbalance fractions for the set of polymorphic loci;
f. generating joint probabilities for the set of polymorphic loci by combining the individual probabilities, considering the linkage between polymorphic loci on the chromosome segment; and
g. selecting, based on the joint probabilities, the best-fit model indicative of chromosomal aneuploidy.
In another aspect, provided herein is a system for detecting chromosomal ploidy in a sample from an individual, the system comprising:
a. an input processor configured to receive allele frequency data for a set of polymorphic loci on a chromosome segment, the data comprising the amount of each allele present in the sample at each locus;
b. a modeler configured to:
i. generate phased allelic information for the set of polymorphic loci by estimating the phase of the allele frequency data;
ii. generate, using the allele frequency data, individual probabilities of the allele frequencies at the polymorphic loci for different ploidy states; and
iii. generate joint probabilities for the set of polymorphic loci using the individual probabilities and the phased allelic information; and
c. a hypothesis manager configured to select, based on the joint probabilities, a best-fit model indicative of chromosomal ploidy, thereby determining the ploidy of the chromosome segment.
In certain embodiments of this system, the allele frequency data are data generated by a nucleic acid sequencing system. In certain embodiments, the system further comprises an error correction unit configured to correct errors in the allele frequency data, wherein the corrected allele frequency data are then used by the modeler to generate the individual probabilities. In some embodiments, the error correction unit corrects for allele amplification efficiency bias. In certain embodiments, the modeler generates the individual probabilities using a set of models of different ploidy states and allelic imbalance fractions for the set of polymorphic loci. In some exemplary embodiments, the modeler generates the joint probabilities by considering the linkage between polymorphic loci on the chromosome segment.
In an illustrative embodiment, provided herein is a system for detecting chromosomal ploidy in a sample from an individual, comprising:
a. an input processor configured to receive nucleic acid sequence data for alleles at a set of polymorphic loci on a chromosome segment in the individual, and to detect allele frequencies at the set of loci using the nucleic acid sequence data;
b. an error correction unit configured to correct errors in the detected allele frequencies and to generate corrected allele frequencies for the set of polymorphic loci;
c. a modeler configured to:
i. generate phased allelic information for the set of polymorphic loci by estimating the phase of the nucleic acid sequence data;
ii. generate individual probabilities of the allele frequencies at the polymorphic loci for different ploidy states by comparing the phased allelic information with a set of models of different ploidy states and allelic imbalance fractions for the set of polymorphic loci; and
iii. generate joint probabilities for the set of polymorphic loci by combining the individual probabilities, considering the relative distance between polymorphic loci on the chromosome segment; and
d. a hypothesis manager configured to select, based on the joint probabilities, the best-fit model indicative of chromosomal aneuploidy.
In certain aspects, the present invention provides a method for determining whether circulating tumor nucleic acids are present in a sample from an individual, comprising:
a. analyzing the sample to determine the ploidy at a set of polymorphic loci on a chromosome segment in the individual; and
b. determining the level of allelic imbalance present at the polymorphic loci based on the ploidy determination, wherein an allelic imbalance equal to or greater than 0.4%, 0.45%, or 0.5% indicates the presence of circulating tumor nucleic acids in the sample.
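The threshold test in step b can be sketched in a few lines of Python. This passage does not define how allelic imbalance is computed, so the formula below (the absolute difference between the estimated amounts of the two homologs, divided by their sum) is an assumption, as are the function names. The default threshold of 0.45% is one of the three values listed above.

```python
def allelic_imbalance(h1_amount, h2_amount):
    """Assumed definition: |h1 - h2| / (h1 + h2), as a fraction between 0 and 1."""
    return abs(h1_amount - h2_amount) / (h1_amount + h2_amount)

def indicates_circulating_tumor_nucleic_acid(h1_amount, h2_amount, threshold=0.0045):
    """True when the imbalance meets or exceeds the threshold (0.45% by default)."""
    return allelic_imbalance(h1_amount, h2_amount) >= threshold
```

With homolog amounts of 505 and 495 the imbalance is 1%, which exceeds every listed threshold; with 5001 and 4999 it is 0.02%, which exceeds none of them.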
In certain embodiments, the method for determining whether circulating tumor nucleic acids are present further comprises detecting a single nucleotide variant at a single nucleotide variant site in a set of single nucleotide variant sites, wherein detecting an allelic imbalance equal to or greater than 45%, detecting the single nucleotide variant, or both, indicates the presence of circulating tumor nucleic acids in the sample.
In certain embodiments, the analyzing step of the method for determining whether circulating tumor nucleic acids are present includes analyzing a set of chromosome segments known to exhibit aneuploidy in cancer. In certain embodiments, the analyzing step includes analyzing between 1,000 and 5,000, or between 100 and 1,000, polymorphic loci for ploidy.
In certain aspects, provided herein is a method for detecting single nucleotide variants in a sample. Accordingly, provided herein is a method for determining whether a single nucleotide variant is present at a set of genomic positions in a sample from an individual, the method comprising:
a. for each genomic position, generating an estimate of the amplification efficiency and of the per-cycle error rate for an amplicon spanning that genomic position, using a training data set;
b. receiving observed nucleotide identity information for each genomic position in the sample;
c. determining, for each genomic position, a set of probabilities of single nucleotide variant percentages resulting from one or more real mutations, by comparing the observed nucleotide identity information at that genomic position with a model of different variant percentages, using the estimated amplification efficiency and the per-cycle error rate independently for each genomic position; and
d. determining the most likely real variant percentage and its confidence from the set of probabilities for each genomic position.
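Steps c and d amount to scoring a range of candidate variant percentages against the observed reads and keeping the most likely one. The Python sketch below is a deliberately simplified illustration, not the method described: it collapses the amplification efficiency and per-cycle error rate into a single assumed `error_rate` per position, uses a binomial read model, and reports a log-likelihood ratio against the no-variant case as a stand-in for confidence. All names and defaults are hypothetical.

```python
import math

def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    p = min(max(p, 1e-12), 1 - 1e-12)  # clamp to avoid log(0)
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def estimate_variant_fraction(variant_reads, total_reads, error_rate,
                              max_fraction=0.05, grid_size=1001):
    """Grid search over candidate variant fractions for one genomic position.

    Returns (most likely fraction, log-likelihood ratio versus fraction 0).
    """
    fractions = [max_fraction * i / (grid_size - 1) for i in range(grid_size)]
    log_liks = []
    for f in fractions:
        # Variant reads come from real mutant molecules plus errors on reference molecules.
        p_variant_read = f + (1 - f) * error_rate
        log_liks.append(log_binom_pmf(variant_reads, total_reads, p_variant_read))
    best = max(range(grid_size), key=lambda i: log_liks[i])
    support = log_liks[best] - log_liks[0]  # 0.0 when "no real mutation" fits best
    return fractions[best], support
```

With 100 variant reads out of 10,000 and an assumed background error of 0.1%, the sketch returns a fraction close to 0.9% with strong support; with 10 variant reads out of 10,000 it returns a fraction of 0, since the background alone explains the data.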
In illustrative embodiments of the method for determining whether a single nucleotide variant is present, the efficiency estimate and the per-cycle error rate are generated for a set of amplicons spanning the genomic position. For example, 2, 3, 4, 5, 10, 15, 20, 25, 50, 100, or more amplicons spanning the genomic position can be included. In certain embodiments of this method, the limit of detection for the one or more single nucleotide variants (SNVs) is 0.015%, 0.017%, or 0.02%.
In illustrative embodiments of the method for determining whether a single nucleotide variant is present, the observed nucleotide identity information includes the observed number of total reads for each genomic position and the observed number of variant allele reads for each genomic position.
In illustrative embodiments of the method for determining whether a single nucleotide variant is present, the sample is a plasma sample, and the single nucleotide variant is present in circulating tumor DNA of the sample.
In another embodiment, provided herein is a method for detecting one or more single nucleotide variants in a test sample from an individual. The method according to this embodiment includes the following steps:
a. determining, based on the results of a sequencing run, an average variant allele frequency across a set of control samples, each from a normal individual, for each single nucleotide variant site in a set of single nucleotide variant sites, in order to identify selected single nucleotide variant sites whose average variant allele frequency in normal samples is below a threshold, and to determine the background error at each single nucleotide variant site after removing outlier samples for that site;
b. determining an observed depth-of-read weighted mean and variance for the selected single nucleotide variant sites in the test sample, based on data generated in the sequencing run for the test sample; and
c. identifying, using a computer, one or more single nucleotide variant sites whose depth-of-read weighted mean is statistically significant compared with the background error at that site, thereby detecting the one or more single nucleotide variants.
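The three steps above can be sketched in Python as follows. The specifics are assumptions made for illustration and are not stated in the text: outliers are removed with a median-absolute-deviation rule, the test sample is represented by several (variant reads, depth) observations such as replicates, and significance is judged with a one-sided normal test at a hypothetical `alpha`.

```python
import math
import statistics

def background_error(control_vafs, max_dev=5.0):
    """Mean and sample SD of control variant allele frequencies after dropping outliers.

    Assumed outlier rule: discard values more than max_dev median absolute deviations
    from the median (all values are kept if the deviation is zero).
    """
    med = statistics.median(control_vafs)
    mad = statistics.median([abs(v - med) for v in control_vafs])
    kept = [v for v in control_vafs if mad == 0 or abs(v - med) <= max_dev * mad]
    return statistics.mean(kept), statistics.stdev(kept)

def depth_weighted_stats(observations):
    """Depth-of-read weighted mean and variance of the variant allele frequency.

    observations: list of (variant_reads, depth) pairs for one site in the test sample.
    """
    total_depth = sum(depth for _, depth in observations)
    mean = sum(variant for variant, _ in observations) / total_depth
    var = sum(d * (v / d - mean) ** 2 for v, d in observations) / total_depth
    return mean, var

def detect_snv(observations, control_vafs, alpha=0.001):
    """One-sided z-test of the test-sample weighted mean against the background error."""
    bg_mean, bg_sd = background_error(control_vafs)
    mean, var = depth_weighted_stats(observations)
    se = math.sqrt(bg_sd ** 2 + var / len(observations)) or 1e-12  # guard against zero
    z = (mean - bg_mean) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return p_value < alpha, p_value
```

For a site whose controls sit near 0.1% with one contaminated control at 2%, the outlier is dropped before the background is estimated, so a test sample at roughly 0.5% is called while one at roughly 0.1% is not.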
In certain embodiments of this method for detecting one or more SNVs, the test sample is a plasma sample, the control samples are plasma samples, and the one or more detected single nucleotide variants are present in circulating tumor DNA of the sample. In certain embodiments of this method, the set of control samples includes at least 25 samples. In certain embodiments of this method, outliers are removed from the data generated in the high-throughput sequencing run before the observed depth-of-read weighted mean is calculated and the observed variance is determined. In certain embodiments of this method, the depth of read at each single nucleotide variant site in the test sample is at least 100 reads.
In certain embodiments of this method for detecting one or more SNVs, the sequencing run includes a multiplex amplification reaction performed under limited-primer reaction conditions. In certain embodiments of this method, the limit of detection is 0.015%, 0.017%, or 0.02%.
In one aspect, the present invention describes a method for determining whether there is an overrepresentation of the number of copies of a first homologous chromosome segment as compared with a second homologous chromosome segment in the genome of one or more cells from an individual. In some embodiments, the method comprises: obtaining phased genetic data for the first homologous chromosome segment, comprising, for each locus in a set of polymorphic loci on the first homologous chromosome segment, the identity of the allele present at that locus on the first homologous chromosome segment; obtaining phased genetic data for the second homologous chromosome segment, comprising, for each locus in the set of polymorphic loci on the second homologous chromosome segment, the identity of the allele present at that locus on the second homologous chromosome segment; and obtaining measured genetic allelic data comprising, for each allele at each locus in the set of polymorphic loci, the amount of that allele present in a sample of DNA or RNA from one or more cells from the individual. In some embodiments, the method comprises enumerating a set of one or more hypotheses specifying the degree of overrepresentation of the first homologous chromosome segment in the genome of one or more cells from the individual; calculating (such as calculating on a computer) the likelihood of one or more of the hypotheses based on the obtained genetic data of the sample and the obtained phased genetic data; and selecting the hypothesis with the greatest likelihood, thereby determining the degree of overrepresentation of the number of copies of the first homologous chromosome segment in the genome of one or more cells from the individual. In some embodiments, the phased data include inferred phased data obtained using population-based haplotype frequencies, and measured phased data (for example, phased data obtained by measuring a sample containing DNA or RNA from the individual or a relative of the individual).
In one aspect, the invention features a method for determining whether the number of copies of a first homologous chromosome segment is overrepresented as compared to a second homologous chromosome segment in the genome of one or more cells from an individual. In some embodiments, the method includes obtaining phased genetic data for the first homologous chromosome segment, comprising the identity of the allele present at that locus on the first homologous chromosome segment for each locus in a set of polymorphic loci on the first homologous chromosome segment; obtaining phased genetic data for the second homologous chromosome segment, comprising the identity of the allele present at that locus on the second homologous chromosome segment for each locus in the set of polymorphic loci on the second homologous chromosome segment; and obtaining measured genetic allelic data comprising, for each allele at each locus in the set of polymorphic loci, the amount of that allele present in a sample of DNA or RNA from one or more cells from the individual. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the degree of overrepresentation of the first homologous chromosome segment; calculating, for each hypothesis, expected genetic data for the plurality of loci in the sample from the obtained phased genetic data; calculating (such as calculating on a computer) the data fit between the obtained genetic data of the sample and the expected genetic data for the sample; ranking one or more of the hypotheses according to the data fit; and selecting the highest-ranked hypothesis, thereby determining the degree of overrepresentation of the number of copies of the first homologous chromosome segment in the genome of one or more cells from the individual.
In one aspect, the invention features a method for determining whether the number of copies of a first homologous chromosome segment is overrepresented as compared to a second homologous chromosome segment in the genome of one or more cells from an individual. In some embodiments, the method includes obtaining phased genetic data for the first homologous chromosome segment, comprising the identity of the allele present at that locus on the first homologous chromosome segment for each locus in a set of polymorphic loci on the first homologous chromosome segment; obtaining phased genetic data for the second homologous chromosome segment, comprising the identity of the allele present at that locus on the second homologous chromosome segment for each locus in the set of polymorphic loci on the second homologous chromosome segment; and obtaining measured genetic allelic data comprising, for each allele at each locus in the set of polymorphic loci, the amount of that allele present in a sample of DNA or RNA from one or more target cells and one or more non-target cells from the individual. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the degree of overrepresentation of the first homologous chromosome segment; calculating (such as calculating on a computer), for each hypothesis, expected genetic data for the plurality of loci in the sample from the obtained phased genetic data, for one or more possible ratios of DNA or RNA from the one or more target cells to the total DNA or RNA in the sample; calculating (such as calculating on a computer), for each possible ratio of DNA or RNA and for each hypothesis, the data fit between the obtained genetic data of the sample and the expected genetic data of the sample for that possible ratio and that hypothesis; ranking one or more of the hypotheses according to the data fit; and selecting the highest-ranked hypothesis, thereby determining the degree of overrepresentation of the number of copies of the first homologous chromosome segment in the genome of one or more cells from the individual.
In one aspect, the invention features a method for determining whether the number of copies of a first homologous chromosome segment is overrepresented as compared to a second homologous chromosome segment in the genome of one or more cells from an individual. In some embodiments, the method includes obtaining phased genetic data for the first homologous chromosome segment, comprising the identity of the allele present at that locus on the first homologous chromosome segment for each locus in a set of polymorphic loci on the first homologous chromosome segment; obtaining phased genetic data for the second homologous chromosome segment, comprising the identity of the allele present at that locus on the second homologous chromosome segment for each locus in the set of polymorphic loci on the second homologous chromosome segment; and obtaining measured genetic allelic data comprising, for each allele at each locus in the set of polymorphic loci, the amount of that allele present in a sample of DNA or RNA from one or more target cells and one or more non-target cells from the individual. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the degree of overrepresentation of the first homologous chromosome segment; calculating (such as calculating on a computer), for each hypothesis, expected genetic data for the plurality of loci in the sample from the obtained phased genetic data, for one or more possible ratios of DNA or RNA from the one or more target cells to the total DNA or RNA in the sample; calculating (such as calculating on a computer), for each locus in the plurality of loci, each possible ratio of DNA or RNA, and each hypothesis, the likelihood that the hypothesis is correct by comparing the obtained genetic data of the sample for that locus with the expected genetic data for that locus for that possible ratio of DNA or RNA and that hypothesis; determining the combined probability for each hypothesis by combining the likelihoods of that hypothesis for each locus and each possible ratio; and selecting the hypothesis with the greatest combined probability, thereby determining the degree of overrepresentation of the number of copies of the first homologous chromosome segment. In some embodiments, all of the loci are considered at once to calculate the probability of a particular hypothesis, and the hypothesis with the greatest probability is selected.
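The combination over loci and over possible ratios can be illustrated with the following Python sketch, offered under stated assumptions rather than as the method's actual implementation. It assumes biallelic "A"/"B" loci, read counts as allele amounts, a binomial read model with a fixed error rate, non-target cells that carry one copy of each segment, and a uniform prior over a small grid of candidate target-cell fractions. All names are hypothetical.

```python
import math

def p_allele_a(h1, h2, n1, n2, f, err=0.01):
    """Expected 'A' fraction when a fraction f of the DNA comes from target
    cells with (n1, n2) copies and the rest from normal (1, 1) cells."""
    a_target = n1 * (h1 == "A") + n2 * (h2 == "A")
    a_normal = (h1 == "A") + (h2 == "A")
    p = (f * a_target + (1 - f) * a_normal) / (f * (n1 + n2) + (1 - f) * 2)
    return p * (1 - err) + (1 - p) * err

def joint_log_prob(hap1, hap2, counts, hyp, fractions):
    """Log of the combined probability of a hypothesis: per-locus likelihoods
    are multiplied, then averaged over the candidate fractions."""
    n1, n2 = hyp
    per_fraction = []
    for f in fractions:
        ll = 0.0
        for h1, h2, (a, b) in zip(hap1, hap2, counts):
            p = p_allele_a(h1, h2, n1, n2, f)
            ll += a * math.log(p) + b * math.log(1 - p)
        per_fraction.append(ll - math.log(len(fractions)))  # uniform prior
    m = max(per_fraction)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(x - m) for x in per_fraction))

def select_by_joint_prob(hap1, hap2, counts, hypotheses, fractions):
    """Return the hypothesis with the greatest combined probability."""
    return max(hypotheses,
               key=lambda h: joint_log_prob(hap1, hap2, counts, h, fractions))

counts = [(545, 455), (455, 545), (545, 455), (455, 545)]
chosen = select_by_joint_prob("ABAB", "BABA", counts,
                              [(1, 1), (2, 1), (1, 2)], [0.1, 0.2, 0.3])
```

Summing over the candidate fractions, rather than fixing one, reflects the case in which the proportion of target-cell DNA in the sample is not known in advance.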
In one aspect, the invention features a method for determining the number of copies of a chromosome segment of interest in the genome of a fetus. In some embodiments, the method includes obtaining phased genetic data for at least one biological parent of the fetus, wherein the phased genetic data comprises the identity of the allele present at each locus in a set of polymorphic loci on a first homologous chromosome segment and a second homologous chromosome segment of a pair of homologous chromosome segments that comprises the chromosome segment of interest. In some embodiments, the method includes obtaining genetic data at a set of polymorphic loci on the chromosome segment of interest in a mixed sample of DNA or RNA comprising fetal DNA or RNA and maternal DNA or RNA from the mother of the fetus, by measuring the quantity of each allele at each locus. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the number of copies of the chromosome segment of interest present in the genome of the fetus. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying, for one or both parents of the fetus, the number of copies of the first homologous chromosome segment or portion thereof from the parent in the genome of the fetus, the number of copies of the second homologous chromosome segment or portion thereof from the parent in the genome of the fetus, and the total number of copies of the chromosome segment of interest present in the genome of the fetus. In some embodiments, the method includes calculating (such as calculating on a computer), for each hypothesis, expected genetic data for the plurality of loci in the mixed sample from the obtained phased genetic data from the parent(s); calculating (such as calculating on a computer) the data fit between the obtained genetic data of the mixed sample and the expected genetic data for the mixed sample; ranking one or more of the hypotheses according to the data fit; and selecting the highest-ranked hypothesis, thereby determining the number of copies of the chromosome segment of interest in the genome of the fetus.
In one aspect, the invention features a method for determining the number of copies of a chromosome segment of interest in the genome of a fetus. In some embodiments, the method includes obtaining phased genetic data for at least one biological parent of the fetus, wherein the phased genetic data comprises the identity of the allele present at each locus in a set of polymorphic loci on a first homologous chromosome segment and a second homologous chromosome segment in the parent. In some embodiments, the method includes obtaining genetic data at a set of polymorphic loci on the chromosome or chromosome segment in a mixed sample of DNA or RNA comprising fetal DNA or RNA and maternal DNA or RNA from the mother of the fetus, by measuring the quantity of each allele at each locus. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the number of copies of the chromosome or chromosome segment of interest present in the genome of the fetus. In some embodiments, the method includes creating (such as creating on a computer), for each hypothesis, a probability distribution of the expected quantity of each allele at each of the plurality of loci in the mixed sample from (i) the obtained phased genetic data from the parent(s) and (ii) optionally, the probability of one or more crossovers that may have occurred during the formation of a gamete that contributed a copy of the chromosome or chromosome segment of interest to the fetus; calculating (such as calculating on a computer), for each hypothesis, a fit between (1) the obtained genetic data of the mixed sample and (2) the probability distribution of the expected quantity of each allele at each of the plurality of loci in the mixed sample; ranking one or more of the hypotheses according to the data fit; and selecting the highest-ranked hypothesis, thereby determining the number of copies of the chromosome segment of interest in the genome of the fetus.
In some embodiments, the method includes obtaining phased genetic data for the mother of the fetus. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the number of copies of the first homologous chromosome segment or portion thereof from the mother in the genome of the fetus, the number of copies of the second homologous chromosome segment or portion thereof from the mother in the genome of the fetus, and the total number of copies of the chromosome segment of interest present in the genome of the fetus. In some embodiments, the method includes calculating, for each hypothesis, expected genetic data for the plurality of loci in the mixed sample from the obtained phased genetic data from the mother.
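One way to enumerate such a hypothesis set is sketched below in Python. The upper bound of three total copies, the dictionary keys, and the function name are illustrative assumptions; the method does not limit which copy numbers may be hypothesized. Each hypothesis records the copies of the first and second maternal homologous segments and the total copy number, with any remaining copies implicitly of paternal origin.

```python
from itertools import product

def enumerate_hypotheses(max_total=3):
    """List hypotheses as dicts with the number of copies of each maternal
    homologous segment and the total copy number in the fetal genome."""
    hypotheses = []
    for k1, k2 in product(range(max_total + 1), repeat=2):
        # The total must cover the maternal copies and be at least one.
        for total in range(max(k1 + k2, 1), max_total + 1):
            hypotheses.append(
                {"maternal_hom1": k1, "maternal_hom2": k2, "total": total})
    return hypotheses

hyps = enumerate_hypotheses()
```

A normal disomy in which the fetus inherited the mother's first segment corresponds to `{"maternal_hom1": 1, "maternal_hom2": 0, "total": 2}`, while `{"maternal_hom1": 1, "maternal_hom2": 1, "total": 3}` describes a trisomy with both maternal segments present.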
In some embodiments, the expected genetic data for each hypothesis comprises the identity and amount of one or more alleles at each locus in the plurality of loci in the maternal DNA or RNA and the fetal DNA or RNA in the mixed sample. In some embodiments, the method includes calculating (such as calculating on a computer) the expected genetic data by determining the fraction of fetal DNA or RNA and the fraction of maternal DNA or RNA in the mixed sample. In some embodiments, the method includes calculating, for each locus in the plurality of loci, the expected amount of one or more alleles at that locus in the maternal DNA or RNA in the mixed sample, using the identity of the allele(s) present at that locus in the obtained phased genetic data of the mother and the fraction of maternal DNA or RNA in the mixed sample. In some embodiments, the method includes calculating (such as calculating on a computer), for each locus in the plurality of loci and for each hypothesis, the expected amount of one or more alleles at that locus in the fetal DNA or RNA in the mixed sample that was inherited from the mother, using the identity of the allele(s) present at that locus on the first or second homologous chromosome segment from the mother that the hypothesis specifies as inherited by the fetus, the number of copies of the first or second homologous chromosome segment from the mother that the hypothesis specifies as inherited by the fetus, and the fraction of fetal DNA or RNA in the mixed sample.
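The two expected-amount calculations described above can be sketched in Python as follows. The sketch assumes biallelic "A"/"B" loci and expresses amounts in copies per genome equivalent of the mixed sample; the units, the function names, and the parameter names are assumptions made for illustration. Here `k1` and `k2` are the copy numbers of the mother's first and second homologous segments that a hypothesis specifies as inherited by the fetus.

```python
def expected_maternal_amounts(m1, m2, fetal_fraction):
    """Per-locus expected (A, B) amounts contributed by maternal DNA.

    m1, m2: phased maternal alleles on the first and second segment.
    """
    w = 1.0 - fetal_fraction  # fraction of maternal DNA in the mixture
    return [(w * ((a == "A") + (b == "A")), w * ((a == "B") + (b == "B")))
            for a, b in zip(m1, m2)]

def expected_fetal_maternal_amounts(m1, m2, k1, k2, fetal_fraction):
    """Per-locus expected (A, B) amounts in fetal DNA inherited from the mother."""
    return [(fetal_fraction * (k1 * (a == "A") + k2 * (b == "A")),
             fetal_fraction * (k1 * (a == "B") + k2 * (b == "B")))
            for a, b in zip(m1, m2)]

# Mother is A|B at the first locus and B|B at the second; the hypothesis
# says the fetus inherited one copy of her first segment and none of her second.
maternal = expected_maternal_amounts("AB", "BB", fetal_fraction=0.1)
fetal = expected_fetal_maternal_amounts("AB", "BB", 1, 0, fetal_fraction=0.1)
```

The paternally inherited contribution is not modelled here; it would be added in the same way when phased or population-based paternal data are available.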
In some embodiments, the expected genetic data for each hypothesis comprises the identity and amount of one or more alleles at each locus in the plurality of loci in the maternal DNA or RNA and the fetal DNA or RNA in the mixed sample. In some embodiments, the method includes calculating the expected genetic data by determining the fraction of fetal DNA or RNA and the fraction of maternal DNA or RNA in the mixed sample. In some embodiments, the method includes calculating (such as calculating on a computer), for each locus in the plurality of loci, the expected amount of one or more alleles at that locus in the maternal DNA or RNA in the mixed sample, using the identity of the allele(s) present at that locus in the obtained phased genetic data of the mother and the fraction of maternal DNA or RNA in the mixed sample. In some embodiments, the method includes calculating (such as calculating on a computer), for each locus in the plurality of loci and for each hypothesis, the expected amount of one or more alleles at that locus in the fetal DNA or RNA in the mixed sample that was inherited from the mother and the father, using the identity of the allele(s) present at that locus on the first or second homologous chromosome segment from the mother that the hypothesis specifies as inherited by the fetus, the number of copies of the first or second homologous chromosome segment from the mother that the hypothesis specifies as inherited by the fetus, the identity of the allele(s) present at that locus on the first or second homologous chromosome segment from the father that the hypothesis specifies as inherited by the fetus, the number of copies of the first or second homologous chromosome segment from the father that the hypothesis specifies as inherited by the fetus, and the fraction of fetal DNA or RNA in the mixed sample. In some embodiments, population frequencies are used to predict the identity of the alleles on the first or second homologous chromosome segment from the father. In some embodiments, each possible allele at each locus on the first or second homologous chromosome segment from the father is considered equally probable.
In some embodiments, the method includes obtaining phased genetic data for both the mother and the father of the fetus. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the number of copies of the first homologous chromosome segment or portion thereof from the mother in the genome of the fetus, the number of copies of the second homologous chromosome segment from the mother in the genome of the fetus, the number of copies of the first homologous chromosome segment or portion thereof from the father in the genome of the fetus, the number of copies of the second homologous chromosome segment or portion thereof from the father in the genome of the fetus, and the total number of copies of the chromosome segment of interest present in the genome of the fetus. In some embodiments, the method includes calculating (such as calculating on a computer), for each hypothesis, expected genetic data for the plurality of loci in the mixed sample from the obtained phased genetic data from the mother and the obtained phased genetic data from the father.
In some embodiments, the expected genetic data for each hypothesis comprises the identity and amount of one or more alleles at each locus in the plurality of loci in the maternal DNA or RNA and the fetal DNA or RNA in the mixed sample. In some embodiments, the method includes calculating the expected genetic data by determining the fraction of fetal DNA or RNA and the fraction of maternal DNA or RNA in the mixed sample. In some embodiments, the method includes calculating (such as calculating on a computer), for each locus in the plurality of loci, the expected amount of one or more alleles at that locus in the maternal DNA or RNA in the mixed sample, using the identity of the allele(s) present at that locus in the obtained phased genetic data of the mother and the fraction of maternal DNA or RNA in the mixed sample. In some embodiments, the method includes calculating (such as calculating on a computer), for each locus in the plurality of loci and for each hypothesis, the expected amount of one or more alleles at that locus in the fetal DNA or RNA in the mixed sample, using the identity of the allele(s) present at that locus on the first or second homologous chromosome segment from the mother that the hypothesis specifies as inherited by the fetus, the number of copies of the first or second homologous chromosome segment from the mother that the hypothesis specifies as inherited by the fetus, the identity of the allele(s) present at that locus on the first or second homologous chromosome segment from the father that the hypothesis specifies as inherited by the fetus, the number of copies of the first or second homologous chromosome segment from the father that the hypothesis specifies as inherited by the fetus, and the fraction of fetal DNA or RNA in the mixed sample.
In some embodiments, the method includes calculating (such as calculating on a computer), for each hypothesis, a probability distribution of the expected genetic data for the plurality of loci in the mixed sample from the obtained phased genetic data from the parent. In some embodiments, the method includes increasing the probability in the probability distribution of a particular allele being present at a first locus in the mixed sample if that particular allele is present on the first homologous chromosome segment of the parent and an allele at a nearby locus on the first homologous chromosome segment of the parent is observed in the obtained genetic data of the mixed sample; or decreasing the probability in the probability distribution of a particular allele being present at the first locus in the mixed sample if that particular allele is present on the first homologous chromosome segment of the parent and an allele at a nearby locus on the first homologous chromosome segment of the parent is not observed in the obtained genetic data of the mixed sample. In some embodiments, the method includes increasing the probability in the probability distribution of a particular allele being present at a second locus in the mixed sample if that particular allele is present on the second homologous chromosome segment of the parent and an allele at a nearby locus on the second homologous chromosome segment of the parent is observed in the obtained genetic data of the mixed sample; or decreasing the probability in the probability distribution of a particular allele being present at the second locus in the mixed sample if that particular allele is present on the second homologous chromosome segment of the parent and an allele at a nearby locus on the second homologous chromosome segment of the parent is not observed in the obtained genetic data of the mixed sample.
In some embodiments, the method includes obtaining phased genetic data for both the mother and the father of the fetus. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the number of copies of the first homologous chromosome segment or portion thereof from the mother in the genome of the fetus, the number of copies of the second homologous chromosome segment or portion thereof from the mother in the genome of the fetus, the number of copies of the first homologous chromosome segment or portion thereof from the father in the genome of the fetus, the number of copies of the second homologous chromosome segment or portion thereof from the father in the genome of the fetus, and the total number of copies of the chromosome segment of interest in the genome of the fetus. In some embodiments, the method includes calculating (such as calculating on a computer), for each hypothesis, a probability distribution of the expected genetic data for the plurality of loci in the mixed sample from the obtained phased genetic data from the mother and the father. In some embodiments, the method includes increasing the probability in the probability distribution of a particular allele being present at a first locus in the mixed sample if that particular allele is present on the first homologous segment of the mother or father and an allele at a nearby locus on the first homologous chromosome segment of that parent is observed in the obtained genetic data of the mixed sample; or decreasing the probability in the probability distribution of a particular allele being present at the first locus in the mixed sample if that particular allele is present on the first homologous segment of the mother or father and an allele at a nearby locus on the first homologous chromosome segment of that parent is not observed in the obtained genetic data of the mixed sample. In some embodiments, the method includes increasing the probability in the probability distribution of a particular allele being present at a second locus in the mixed sample if that particular allele is present on the second homologous segment of the mother or father and an allele at a nearby locus on the second homologous chromosome segment of that parent is observed in the obtained genetic data of the mixed sample; or decreasing the probability in the probability distribution of a particular allele being present at the second locus in the mixed sample if that particular allele is present on the second homologous segment of the mother or father and an allele at a nearby locus on the second homologous chromosome segment of that parent is not observed in the obtained genetic data of the mixed sample.
In some embodiments, the first locus and the locus near the first locus cosegregate. In some embodiments, the second locus and the locus near the second locus cosegregate. In some embodiments, no crossover is expected to occur between the first locus and the locus near the first locus. In some embodiments, no crossover is expected to occur between the second locus and the locus near the second locus. In some embodiments, the distance between the first locus and the locus near the first locus is less than 5 Mb, 1 Mb, 100 kb, 10 kb, 1 kb, 0.1 kb, or 0.01 kb. In some embodiments, the distance between the second locus and the locus near the second locus is less than 5 Mb, 1 Mb, 100 kb, 10 kb, 1 kb, 0.1 kb, or 0.01 kb.
In some embodiments, one or more crossovers occur during the formation of a gamete that contributes a copy of the chromosome segment of interest to the fetus, and the crossover produces a chromosome segment of interest in the genome of the fetus that comprises a portion of the first homologous segment and a portion of the second homologous segment from the parent. In some embodiments, the set of hypotheses includes one or more hypotheses specifying the number of copies, in the genome of the fetus, of a chromosome segment of interest that comprises a portion of the first homologous segment and a portion of the second homologous segment from the parent.
In some embodiments, the expected genetic data of the mixed sample comprises, for each hypothesis, the expected amount of one or more alleles at each locus in the plurality of loci in the mixed sample.
In one aspect, the invention features a method that uses phased genetic data to determine whether the number of copies of a first homologous chromosome segment is overrepresented as compared to a second homologous chromosome segment in the genome of an individual (such as in the genome of one or more cells, or in cfDNA or cfRNA, from an individual suspected of having cancer, a fetus, or an embryo). In some embodiments, the method includes, simultaneously or sequentially in any order, (i) obtaining phased genetic data for the first homologous chromosome segment, comprising the identity of the allele present at that locus on the first homologous chromosome segment for each locus in a set of polymorphic loci on the first homologous chromosome segment; (ii) obtaining phased genetic data for the second homologous chromosome segment, comprising the identity of the allele present at that locus on the second homologous chromosome segment for each locus in the set of polymorphic loci on the second homologous chromosome segment; and (iii) obtaining measured genetic allelic data comprising the amount of each allele at each locus in the set of polymorphic loci in a sample of DNA or RNA from one or more cells from the individual, or in a mixed sample of cell-free DNA or RNA from two or more genetically different cells from the individual. In some embodiments, the method includes calculating allele ratios for one or more loci in the set of polymorphic loci that are heterozygous in at least one cell from which the sample was derived. In some embodiments, the calculated allele ratio for a particular locus is the measured quantity of one of the alleles divided by the total measured quantity of all the alleles for the locus. In some embodiments, the method includes determining whether there is an overrepresentation of the number of copies of the first homologous chromosome segment by comparing one or more calculated allele ratios for a locus to an expected allele ratio, such as the ratio that is expected for that locus if the first and second homologous chromosome segments are present in equal proportions. In some embodiments, the expected ratio is 0.5 for a biallelic locus.
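The allele ratio defined above, and its comparison with the expected ratio, can be written as a short Python sketch. Read counts stand in for measured quantities, and the function names are hypothetical. The sketch only reports the signed deviation from the expected ratio; deciding how large a deviation is significant is outside its scope.

```python
def allele_ratio(allele_amount, all_amounts):
    """Measured amount of one allele divided by the total for the locus."""
    total = sum(all_amounts)
    if total == 0:
        raise ValueError("no measurements at this locus")
    return allele_amount / total

def ratio_deviation(allele_amount, all_amounts, expected=0.5):
    """Signed difference between the calculated and the expected allele ratio.

    expected=0.5 corresponds to a biallelic locus at which both homologous
    segments are present in equal proportions.
    """
    return allele_ratio(allele_amount, all_amounts) - expected

# 60 reads of the allele on the first segment, 40 of the other allele.
ratio = allele_ratio(60, [60, 40])
deviation = ratio_deviation(60, [60, 40])
```

A positive deviation for the allele carried on the first homologous segment is the pattern that would be consistent with overrepresentation of that segment.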
In some embodiments for prenatal testing, the method includes, simultaneously or sequentially in any order, (i) obtaining phased genetic data for the first homologous chromosome segment in the genome of a fetus (such as a fetus gestating in a pregnant mother), comprising the identity of the allele present at that locus on the first homologous chromosome segment for each locus in a set of polymorphic loci on the first homologous chromosome segment; (ii) obtaining phased genetic data for the second homologous chromosome segment in the genome of the fetus, comprising the identity of the allele present at that locus on the second homologous chromosome segment for each locus in the set of polymorphic loci on the second homologous chromosome segment; and (iii) obtaining measured genetic allelic data comprising the amount of each allele at each locus in the set of polymorphic loci in a mixed sample of DNA or RNA from the mother of the fetus that includes fetal DNA or RNA and maternal DNA or RNA (such as a mixed sample of cell-free DNA or RNA originating from a blood sample from the mother that includes fetal cell-free DNA or RNA and maternal cell-free DNA or RNA). In some embodiments, the method includes calculating allele ratios for one or more loci that are heterozygous in the fetus and/or heterozygous in the mother. In some embodiments, the calculated allele ratio for a particular locus is the measured quantity of one of the alleles divided by the total measured quantity of all the alleles for the locus. In some embodiments, the method includes determining whether there is an overrepresentation of the number of copies of the first homologous chromosome segment by comparing one or more calculated allele ratios for a locus to an expected allele ratio, such as the ratio that is expected for that locus if the first and second homologous chromosome segments are present in equal proportions.
In some embodiments, a calculated allele ratio is indicative of an overrepresentation of the number of copies of the first homologous chromosome segment if either (i) the allele ratio of the measured quantity of the allele present at that locus on the first homologous chromosome, divided by the total measured quantity of all the alleles for the locus, is greater than the expected allele ratio for that locus, or (ii) the allele ratio of the measured quantity of the allele present at that locus on the second homologous chromosome, divided by the total measured quantity of all the alleles for the locus, is less than the expected allele ratio for that locus. In some embodiments, a calculated allele ratio is indicative of no overrepresentation of the number of copies of the first homologous chromosome segment if either (i) the allele ratio of the measured quantity of the allele present at that locus on the first homologous chromosome, divided by the total measured quantity of all the alleles for the locus, is less than or equal to the expected allele ratio for that locus, or (ii) the allele ratio of the measured quantity of the allele present at that locus on the second homologous chromosome, divided by the total measured quantity of all the alleles for the locus, is greater than or equal to the expected allele ratio for that locus.
In some embodiments, determining whether there is an overrepresentation of the number of copies of the first homologous chromosome segment includes enumerating a set of one or more hypotheses specifying the degree of overrepresentation of the first homologous chromosome segment. In some embodiments, predicted allele ratios for the loci that are heterozygous in at least one cell (such as loci that are heterozygous in the fetus and/or heterozygous in the mother) are estimated for each hypothesis, given the degree of overrepresentation specified by that hypothesis. In some embodiments, the likelihood that the hypothesis is correct is calculated by comparing the calculated allele ratios to the predicted allele ratios, and the hypothesis with the greatest likelihood is selected. In some embodiments, an expected distribution of a test statistic is calculated using the predicted allele ratios for each hypothesis. In some embodiments, the likelihood that the hypothesis is correct is calculated by comparing a test statistic that is calculated using the calculated allele ratios to the expected distribution of the test statistic that is calculated using the predicted allele ratios, and the hypothesis with the greatest likelihood is selected. In some embodiments, predicted allele ratios for the loci that are heterozygous in at least one cell (such as loci that are heterozygous in the fetus and/or heterozygous in a parent) are estimated given the phased genetic data for the first homologous chromosome segment, the phased genetic data for the second homologous chromosome segment, and the degree of overrepresentation specified by that hypothesis. In some embodiments, the likelihood that the hypothesis is correct is calculated by comparing the calculated allele ratios to the predicted allele ratios, and the hypothesis with the greatest likelihood is selected.
In some embodiments, from the DNA (or RNA) of one or more target cells until total DNA (or RNA) in sample
Ratio calculated.One exemplary ratios is the ratio of the foetal DNA (or RNA) and total DNA (or RNA) in sample.One
In a little embodiments, the ratio of foetal DNA and total DNA is by measuring an equipotential on one or more gene locis in sample
The amount of gene determines that wherein fetus has the allele and female parent does not have.In some embodiments, tire in sample
The ratio of youngster DNA and total DNA is determined by the methylation differential between the one or more maternal and foetal alleles of measurement.
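The allele-based estimate of the fetal fraction can be illustrated with the following Python sketch. It assumes that, at each informative locus, the mother is homozygous for one allele and the fetus is heterozygous and carries exactly one copy of an allele the mother lacks, with two fetal copies of the locus in total, so that the fetal-specific allele makes up half of the fetal fraction. These assumptions, the pooling of counts across loci, and the function name are illustrative and are not prescribed by the method.

```python
def estimate_fetal_fraction(counts):
    """Estimate the ratio of fetal DNA to total DNA.

    counts: list of (shared_allele_reads, fetal_specific_allele_reads),
            one pair per informative locus (mother lacks the second allele).
    """
    fetal_specific = sum(b for _, b in counts)
    total = sum(a + b for a, b in counts)
    if total == 0:
        raise ValueError("no reads at informative loci")
    # Each fetal-specific read represents one of two fetal copies.
    return 2.0 * fetal_specific / total

fraction = estimate_fetal_fraction([(950, 50), (940, 60)])
```

If the fetus carried an abnormal number of copies of the segment containing these loci, the factor of two would no longer hold, so informative loci would typically be chosen on segments presumed to be disomic.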
In some embodiments, one group of one or more hypothesis is listed to describe the overexpression journey of the first homologous chromosomal segments
Degree.In some embodiments, at least one cell heterozygosis site (such as in fetus heterozygosis and/or it is maternal
The site of middle heterozygosis) prediction allele ratio, when being calculating ratio according to DNA or RNA and assessing each hypothesis
Hypothesis specified overexpression degree is assessed.In some embodiments, a possibility that hypothesis is correct is calculated, and is passed through
Compare the allele ratio of calculating and the allele ratio of prediction, and selects the hypothesis with maximum likelihood.Some
In embodiment, tested to obtain a statistic using the allele ratio of prediction and DNA the or RNA ratio of calculating
It is expected that distribution, estimates each hypothesis.In some embodiments, a possibility that hypothesis is correct is determined, and passes through ratio
Compared with the test statistics that the ratio using the allele ratio of calculating and the DNA of calculating or RNA is calculated, and use
The expected distribution for the test statistics that the allele ratio and DNA of prediction or the calculating ratio of RNA are calculated, and
Select the hypothesis with maximum likelihood.
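As a hedged illustration of the two steps above, the sketch below estimates the fetal fraction from loci where the fetus carries an allele the mother lacks, then predicts the allele ratio in the mixed sample under a copy number hypothesis. The factor of two assumes the mother is homozygous and the fetus is heterozygous and disomic at the informative loci, and the fetal fraction is assumed to be defined on a disomic reference region; the function names are hypothetical.

```python
def estimate_fetal_fraction(informative_loci):
    """informative_loci: list of (maternal_allele_reads, fetal_specific_reads)
    at loci where the mother is homozygous and the fetus is heterozygous.
    The fetal-specific allele then makes up half of the fetal DNA."""
    fetal_specific = sum(b for _, b in informative_loci)
    total = sum(a + b for a, b in informative_loci)
    return 2.0 * fetal_specific / total

def predicted_a_fraction(f, maternal_a, fetal_a, fetal_copies, maternal_copies=2):
    """Predicted fraction of allele 'A' in the mixture.
    f: fetal fraction; maternal_a / fetal_a: number of 'A' copies in each genome;
    fetal_copies: total fetal copies of the segment under the hypothesis."""
    numerator = (1.0 - f) * maternal_a + f * fetal_a
    denominator = (1.0 - f) * maternal_copies + f * fetal_copies
    return numerator / denominator

f = estimate_fetal_fraction([(95, 5), (190, 10)])
disomy = predicted_a_fraction(f, maternal_a=2, fetal_a=1, fetal_copies=2)
trisomy = predicted_a_fraction(f, maternal_a=2, fetal_a=1, fetal_copies=3)
```

Here the mother is AA and the fetus is AB under the disomy hypothesis or ABB under the trisomy hypothesis; the two predictions differ by only a few percent, which is why the fetal fraction enters the comparison.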
In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the degree of overrepresentation of the first homologous chromosome segment. In some embodiments, the method includes estimating, for each hypothesis, either (i) the predicted allele ratios for loci that are heterozygous in at least one cell (such as loci that are heterozygous in the fetus and/or in the mother), given the degree of overrepresentation specified by the hypothesis, or (ii) for one or more possible ratios of DNA or RNA (such as the ratio of fetal DNA or RNA to total DNA or RNA in the sample), the expected distribution of a test statistic calculated using the predicted allele ratios and the possible ratio of DNA or RNA from one or more target cells (such as fetal cells) to total DNA or RNA in the sample. In some embodiments, a data fit is calculated by comparing either (i) the calculated allele ratios with the predicted allele ratios, or (ii) a test statistic calculated using the calculated allele ratios and the possible ratio of DNA or RNA with the expected distribution of the test statistic calculated using the predicted allele ratios and the possible ratio of DNA or RNA. In some embodiments, the one or more hypotheses are ranked according to the data fit, and the highest-ranked hypothesis is selected. In some embodiments, a technique or algorithm, such as a search algorithm, is used for one or more of the following steps: calculating the data fit, ranking the hypotheses, or selecting the highest-ranked hypothesis. In some embodiments, the data fit is a fit to a beta-binomial distribution or a fit to a binomial distribution. In some embodiments, the technique or algorithm is selected from the group consisting of maximum likelihood estimation, maximum a posteriori estimation, Bayesian estimation, dynamic estimation (such as dynamic Bayesian estimation), and expectation-maximization estimation. In some embodiments, the method includes applying the technique or algorithm to the obtained genetic data and the expected genetic data.
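One way to picture a beta-binomial data fit and the resulting ranking is sketched below. The parameterization by a mean and a concentration, the default concentration value, and the function names are assumptions made for illustration; the text does not specify them.

```python
from math import lgamma, exp

def _log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def betabinom_logpmf(k, n, mean, concentration):
    """Log-probability of k reads of one allele out of n under a beta-binomial
    with the given mean (strictly between 0 and 1) and concentration."""
    alpha = mean * concentration
    beta = (1.0 - mean) * concentration
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + _log_beta(k + alpha, n - k + beta) - _log_beta(alpha, beta)

def rank_hypotheses(counts, predicted, concentration=200.0):
    """counts: list of (reads_A, reads_B) per locus.
    predicted: hypothesis name -> list of predicted 'A' fractions per locus.
    Returns hypothesis names ordered from best to worst data fit."""
    fits = {
        name: sum(
            betabinom_logpmf(a, a + b, p, concentration)
            for (a, b), p in zip(counts, ratios)
        )
        for name, ratios in predicted.items()
    }
    return sorted(fits, key=fits.get, reverse=True)

counts = [(60, 40), (58, 42)]
predicted = {"ratio_0.5": [0.5, 0.5], "ratio_0.6": [0.6, 0.6]}
ranking = rank_hypotheses(counts, predicted)
```

The concentration controls how much extra dispersion beyond the binomial is tolerated; as it grows, the fit approaches a plain binomial fit.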
In some embodiments, the method includes creating a partition of possible ratios (such as ratios of fetal DNA or RNA to total DNA or RNA in the sample) that ranges from a lower limit to an upper limit for the ratio of DNA or RNA from one or more target cells to total DNA or RNA in the sample. In some embodiments, a set of one or more hypotheses specifying the degree of overrepresentation of the first homologous chromosome segment is enumerated. In some embodiments, the method includes estimating, for each possible ratio of DNA or RNA in the partition and for each hypothesis, either (i) the predicted allele ratios for loci that are heterozygous in at least one cell (such as loci that are heterozygous in the fetus and/or in the mother), given the possible ratio of DNA or RNA and the degree of overrepresentation specified by the hypothesis, or (ii) the expected distribution of a test statistic calculated using the predicted allele ratios and the possible ratio of DNA or RNA. In some embodiments, the method includes calculating, for each possible ratio of DNA or RNA in the partition and for each hypothesis, the likelihood that the hypothesis is correct by comparing either (i) the calculated allele ratios with the predicted allele ratios, or (ii) a test statistic calculated using the calculated allele ratios and the possible ratio of DNA or RNA with the expected distribution of the test statistic calculated using the predicted allele ratios and the possible ratio of DNA or RNA. In some embodiments, the combined probability of each hypothesis is determined by combining the probabilities of that hypothesis over the possible ratios in the partition, and the hypothesis with the greatest combined probability is selected. In some embodiments, the combined probability of each hypothesis is determined by weighting the probability of the hypothesis at a particular possible ratio by the likelihood that the possible ratio is the correct ratio.
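The partition and the weighted combination described above can be sketched as follows. The evenly spaced grid, the uniform default weights, and the `likelihood` callback are assumptions for illustration; the text only requires that each hypothesis probability be weighted by how likely each possible ratio is to be correct.

```python
def make_partition(lower, upper, n_points):
    """Evenly spaced possible ratios from lower to upper (inclusive)."""
    step = (upper - lower) / (n_points - 1)
    return [lower + i * step for i in range(n_points)]

def combined_probabilities(likelihood, hypotheses, ratios, weights=None):
    """likelihood(h, r): P(data | hypothesis h, possible ratio r), user supplied.
    weights[i]: likelihood that ratios[i] is the correct ratio (uniform if None).
    Returns the normalized combined probability of each hypothesis."""
    if weights is None:
        weights = [1.0 / len(ratios)] * len(ratios)
    combined = {
        h: sum(w * likelihood(h, r) for r, w in zip(ratios, weights))
        for h in hypotheses
    }
    total = sum(combined.values())
    return {h: value / total for h, value in combined.items()}

# Toy likelihood, used only to exercise the functions.
toy = lambda h, r: {"disomy": 0.2, "trisomy": 0.6}[h]
ratios = make_partition(0.02, 0.10, 5)
posterior = combined_probabilities(toy, ["disomy", "trisomy"], ratios)
selected = max(posterior, key=posterior.get)
```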
In one aspect, the invention features a method for determining the number of copies of a chromosome or chromosome segment in the genome of one or more cells from an individual using phased or unphased genetic data. In some embodiments, the method includes obtaining genetic data at a set of polymorphic loci on the chromosome or chromosome segment in a sample by measuring the quantity of each allele at each locus. In some embodiments, the sample is a sample of DNA or RNA from one or more cells from the individual, or a mixed sample of cell-free DNA from the individual that includes cell-free DNA from two or more genetically different cells. In some embodiments, allele ratios are calculated for loci that are heterozygous in at least one cell from which the sample is derived. In some embodiments, the calculated allele ratio for a particular locus is the measured quantity of one of the alleles divided by the total measured quantity of all alleles at the locus. In some embodiments, the calculated allele ratio for a particular locus is the measured quantity of one allele at the locus (such as the allele on the first homologous chromosome segment) divided by the measured quantity of one or more other alleles (such as the allele on the second homologous chromosome segment). In some embodiments, a set of one or more hypotheses specifying the number of copies of the chromosome or chromosome segment in the genome of the one or more cells is enumerated. In some embodiments, the most likely hypothesis based on a test statistic is selected, thereby determining the number of copies of the chromosome or chromosome segment in the genome of the one or more cells.
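The two definitions of the calculated allele ratio given above differ only in the denominator. A minimal sketch, with hypothetical function names and read counts held in a dictionary:

```python
def ratio_to_total(counts, allele):
    """Measured quantity of one allele divided by the total for all alleles."""
    return counts[allele] / sum(counts.values())

def ratio_to_others(counts, allele):
    """Measured quantity of one allele divided by that of the other allele(s)."""
    others = sum(n for name, n in counts.items() if name != allele)
    return counts[allele] / others

locus = {"A": 60, "B": 40}  # illustrative read counts at one locus
fraction = ratio_to_total(locus, "A")
odds = ratio_to_others(locus, "A")
```

The first form is bounded between 0 and 1; the second is unbounded and undefined when the other alleles are not observed, which is one reason to restrict it to heterozygous loci.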
In one aspect, the invention features a method for determining the number of copies of a chromosome or chromosome segment in the genome of a fetus (such as a fetus gestating in a pregnant mother) using phased or unphased genetic data. In some embodiments, the method includes obtaining genetic data at a set of polymorphic loci on the chromosome or chromosome segment in a sample by measuring the quantity of each allele at each locus. In some embodiments, the sample is a mixed sample containing fetal DNA or RNA and maternal DNA or RNA from the mother of the fetus (such as a mixed sample of cell-free DNA or RNA from a maternal serum sample that contains fetal cell-free DNA or RNA and maternal cell-free DNA or RNA). In some embodiments, allele ratios are calculated for loci that are heterozygous in the fetus and/or heterozygous in the mother. In some embodiments, the calculated allele ratio for a particular locus is the measured quantity of one of the alleles at the locus divided by the total measured quantity of all alleles. In some embodiments, the calculated allele ratio for a particular locus is the measured quantity of one allele at the locus (such as the allele on the first homologous chromosome segment) divided by the measured quantity of one or more other alleles (such as the allele on the second homologous chromosome segment). In some embodiments, a set of one or more hypotheses specifying the number of copies of the chromosome or chromosome segment in the genome of the fetus is enumerated. In some embodiments, the most likely hypothesis based on a test statistic is selected, thereby determining the number of copies of the chromosome or chromosome segment in the genome of the fetus.
In some embodiments, a hypothesis is selected if the probability that the test statistic belongs to the test statistic distribution of that hypothesis is above an upper threshold; one or more hypotheses are rejected if the probability that the test statistic belongs to the test statistic distribution of that hypothesis is below a lower threshold; or a hypothesis is neither selected nor rejected if the probability that the test statistic belongs to the test statistic distribution of that hypothesis lies between the lower and upper thresholds, or if the probability is not determined with sufficiently high confidence. In some embodiments, the overrepresentation of the number of copies of the first homologous chromosome segment is due to a duplication of the first homologous chromosome segment or a deletion of the second homologous chromosome segment. In some embodiments, the total measured quantity of all alleles at one or more loci is compared with a reference amount to determine whether the overrepresentation of the number of copies of the first homologous chromosome segment is due to a duplication of the first homologous chromosome segment or a deletion of the second homologous chromosome segment. In some embodiments, the magnitude of the difference between the calculated allele ratios and the expected allele ratios at one or more loci is used to determine whether the overrepresentation of the number of copies of the first homologous chromosome segment is due to a duplication of the first homologous chromosome segment or a deletion of the second homologous chromosome segment. In some embodiments, the first and second homologous chromosome segments are determined to be present in equal proportions if there is no overrepresentation of the number of copies of the first homologous chromosome segment and no overrepresentation of the second homologous chromosome segment (such as in the genome of a cell, cfDNA, cfRNA, an individual, a fetus, or an embryo).
In some embodiments, the ratio of DNA from one or more target cells to the total DNA in the sample is determined from the total or relative quantity of one or more alleles at one or more loci at which the genotype of the target cells differs from the genotype of the non-target cells and at which both the target cells and the non-target cells are expected to be disomic. In some embodiments, this ratio is used to determine whether the overrepresentation of the number of copies of the first homologous chromosome segment is due to a duplication of the first homologous chromosome segment or a deletion of the second homologous chromosome segment. In some embodiments, the ratio is used to determine the number of extra copies of the duplicated chromosome segment or chromosome. In some embodiments, the phased genetic data includes probabilistic data. In some embodiments, obtaining phased genetic data for the first homologous chromosome segment and/or the second homologous chromosome segment in the genome of a fetus includes obtaining phased genetic data for the first homologous chromosome segment and/or the second homologous chromosome segment in the genome of one or both biological parents of the fetus, and inferring which homologous chromosome segments the fetus inherited from one or both biological parents. In some embodiments, the probability of one or more crossovers (such as 1, 2, 3, or 4 crossovers) that may have occurred during the formation of a gamete that contributed a copy of the first homologous chromosome segment or the second homologous chromosome segment to the fetus is used to infer which homologous chromosome segments the fetus inherited from one or both biological parents. In some embodiments, the phased genetic data of the mother and/or father of the fetus is obtained using a technique selected from the group consisting of digital PCR, inferring haplotypes using population-based haplotype frequencies, haplotyping using a haploid cell such as a sperm or egg, haplotyping using genetic data from one or more first-degree relatives, and combinations thereof. In some embodiments, the phased genetic data of an individual is obtained by phasing all or part of the region corresponding to the deletion or duplication in a sample from the individual. In some embodiments, the phased genetic data of a fetus is obtained by phasing all or part of the region corresponding to the deletion or duplication in a sample from the fetus or from the mother of the fetus. In some embodiments, obtaining phased genetic data for the first and second homologous chromosome segments includes determining the identity of the alleles present on one chromosome segment and determining by inference the identity of the alleles present on the other chromosome segment. In some embodiments, alleles in the unphased genetic data that are not present on the first homologous chromosome segment are assigned to the second homologous chromosome segment. For example, if the genotype of an individual is (AB, AB) and the phased data of the individual indicates that the first haplotype is (A, A), then the other haplotype can be inferred to be (B, B). In some embodiments, if only one allele is measured at a locus, that allele is determined to be part of both the first and second homologous chromosome segments (for example, if the genotype at a locus is AA, both haplotypes have the A allele). In some embodiments, obtaining the phased genetic data of an individual includes determining whether one or more possible chromosome crossovers have occurred, such as by sequencing a recombination hotspot and any of its flanking regions. In some embodiments, any of the primer libraries of the invention is used to detect recombination events in order to determine which haplotype blocks are present in the genome of the individual.
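The inference in the (AB, AB) example above can be written out directly. This is a minimal sketch with a hypothetical function name; it assumes biallelic genotypes given as two-character strings and a first haplotype that is consistent with them.

```python
def infer_second_haplotype(genotypes, first_haplotype):
    """Assign to the second haplotype whatever allele of each unphased genotype
    is left after removing the allele carried by the first haplotype."""
    second = []
    for genotype, allele in zip(genotypes, first_haplotype):
        remaining = list(genotype)
        remaining.remove(allele)  # raises ValueError if the phase is inconsistent
        second.append(remaining[0])
    return second

second = infer_second_haplotype(["AB", "AB"], ["A", "A"])
homozygous_case = infer_second_haplotype(["AA", "AB"], ["A", "B"])
```

At a homozygous locus such as AA, the same allele ends up on both haplotypes, matching the rule for loci at which only one allele is measured.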
In some embodiments, the method includes using a joint distribution model (such as a joint distribution model that takes into account linkage between loci), performing a linkage analysis, using a binomial distribution model, using a beta-binomial distribution model, and/or using the probability of chromosome crossovers having occurred during meiosis (the meiosis that produced the gametes that formed the embryo that grew into the fetus), such as using the probability of a chromosome crossing over at different locations to model the dependence between polymorphic alleles on the chromosome or chromosome segment of interest.
In some embodiments, one or more calculated allele ratios for the cfDNA or cfRNA are indicative of the corresponding allele ratios of the DNA or RNA in the cells from which the cfDNA or cfRNA is derived. In some embodiments, one or more calculated allele ratios for the cfDNA or cfRNA are indicative of the corresponding allele ratios in the genome of the individual. In some embodiments, an allele ratio is calculated, or compared with an expected allele ratio, only if the measured genetic data indicates that more than one different allele is present at that locus in the sample (such as in a cfDNA or cfRNA sample). In some embodiments, an allele ratio is calculated, or compared with an expected allele ratio, only if the locus is heterozygous in at least one cell from which the sample is derived (such as a locus that is heterozygous in the fetus and/or heterozygous in the mother). In some embodiments, an allele ratio is calculated, or compared with an expected allele ratio, only if the locus is heterozygous in the fetus. In some embodiments, an allele ratio is calculated, or compared with an expected allele ratio, for a homozygous locus. For example, allele frequencies at loci predicted to be homozygous for the particular individual being tested (or for both the fetus and the pregnant mother) can be analyzed to determine the noise or error level of the system.
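As one hedged reading of the last sentence, the noise level can be approximated by the share of reads at predicted-homozygous loci that show an unexpected allele. The pooled estimator and the function name below are assumptions for illustration.

```python
def estimate_error_rate(homozygous_loci):
    """homozygous_loci: list of (expected_allele_reads, unexpected_allele_reads)
    at loci predicted to be homozygous. Any unexpected reads are attributed to
    measurement noise, and the pooled fraction is returned."""
    unexpected = sum(u for _, u in homozygous_loci)
    total = sum(e + u for e, u in homozygous_loci)
    return unexpected / total if total else 0.0

error_rate = estimate_error_rate([(998, 2), (499, 1)])
```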
In some embodiments, at least 10; 50; 100; 200; 300; 500; 750; 1,000; 2,000; 3,000; 4,000; or more loci (such as SNPs) are analyzed for a chromosome or chromosome segment of interest. In some embodiments, the average number of loci (such as SNPs) per Mb in a chromosome or chromosome segment of interest is at least 1; 10; 25; 50; 100; 150; 200; 300; 500; 750; 1,000; or more loci per Mb. In some embodiments, the average number of loci (such as SNPs) per Mb in a chromosome or chromosome segment of interest is between 1 and 500 loci per Mb, such as 1 to 50, 50 to 100, 100 to 200, 200 to 400, 200 to 300, or 300 to 400 loci per Mb. In some embodiments, loci in multiple portions of a potential deletion or duplication are analyzed to increase the sensitivity and/or specificity of the CNV determination compared with analyzing only 1 locus or only a few loci that are close to each other. In some embodiments, only the two most common alleles at each locus are measured or used to determine the calculated allele ratios. In some embodiments, the loci are amplified using a polymerase (for example, a DNA polymerase, RNA polymerase, or reverse transcriptase) with low 5'→3' exonuclease and/or low strand displacement activity. In some embodiments, the measured genetic allelic data is obtained by (i) sequencing the DNA or RNA in the sample, (ii) amplifying the DNA or RNA in the sample and then sequencing the amplified DNA or RNA, or (iii) amplifying the DNA or RNA in the sample, ligating the PCR products, and then sequencing the ligated products. In some embodiments, the measured genetic allelic data is obtained by dividing the DNA or RNA of the sample into multiple portions, adding a different barcode to the DNA or RNA in each portion (for example, such that all the DNA or RNA in a particular portion has the same barcode), optionally amplifying the barcoded DNA or RNA, combining the portions, and then sequencing the barcoded DNA or RNA in the combined portions. In some embodiments, the alleles of polymorphic loci (such as SNPs) are identified using one or more of the following methods: sequencing (such as nanopore sequencing or Halcyon Molecular sequencing), SNP arrays, real-time PCR, TaqMan, the NanoString nCounter Analysis System, the Illumina GoldenGate Genotyping assay that uses a discriminatory DNA polymerase and ligase, ligation-mediated PCR, or linked inverted probes (LIPs; also called pre-circularized probes, pre-circularizing probes, circularizing probes, padlock probes, or molecular inversion probes (MIPs)). In some embodiments, two or more (such as 3 or 4) target amplicons are ligated together, and the ligated products are then sequenced. In some embodiments, the measurements of different alleles at the same locus are adjusted for differences between the alleles in metabolism, apoptosis, histones, inactivation, and/or amplification (such as differences in amplification efficiency between different alleles of the same locus). In some embodiments, this adjustment is performed before the allele ratios are calculated from the obtained genetic data, or before the measured genetic data is compared with the expected genetic data.
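The adjustment for allele-specific amplification efficiency, applied before the allele ratio is calculated, might look like the sketch below. Dividing each count by a relative efficiency is an assumed correction chosen for illustration; the text does not state how the adjustment is made, and the parameter names are hypothetical.

```python
def adjusted_allele_ratio(count_a, count_b, efficiency_a=1.0, efficiency_b=1.0):
    """Correct each allele's measured quantity by its relative amplification
    efficiency, then compute the allele ratio from the corrected quantities."""
    corrected_a = count_a / efficiency_a
    corrected_b = count_b / efficiency_b
    return corrected_a / (corrected_a + corrected_b)

raw = adjusted_allele_ratio(120, 100)
corrected = adjusted_allele_ratio(120, 100, efficiency_a=1.2, efficiency_b=1.0)
```

In this toy case an apparent excess of allele A disappears once its 20% higher amplification efficiency is taken into account.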
In some embodiments, the method also includes determining the presence or absence of one or more risk factors for a disease or disorder. In some embodiments, the method also includes determining the presence or absence of one or more polymorphisms or mutations associated with the disease or disorder or with an increased risk for the disease or disorder. In some embodiments, the method also includes determining the total level of cfDNA, cf mDNA, cf nDNA, cfRNA, miRNA, or other components. In some embodiments, the method includes determining the level of one or more cfDNA, cf mDNA, cf nDNA, cfRNA, and/or miRNA molecules of interest, such as molecules with a polymorphism or mutation associated with a disease or disorder or with an increased risk for a disease or disorder. In some embodiments, the fraction of tumor DNA out of total DNA (such as the fraction of tumor cfDNA out of total cfDNA, or the fraction of tumor cfDNA with a particular mutation out of total cfDNA) is determined. In some embodiments, the tumor fraction is used to determine the stage of a cancer (since higher tumor fractions may be associated with more advanced stages of cancer). In some embodiments, the method also includes determining the total level of DNA or RNA. In some embodiments, the method includes determining the methylation level of one or more DNA or RNA molecules of interest, such as molecules with a polymorphism or mutation associated with a disease or disorder or with an increased risk for a disease or disorder. In some embodiments, the method includes determining the presence or absence of a change in DNA integrity. In some embodiments, the method also includes determining the total level of mRNA splicing. In some embodiments, the method includes determining the level of mRNA splicing, or detecting alternative mRNA splicing, for one or more RNA molecules of interest, such as molecules with a polymorphism or mutation associated with a disease or disorder or with an increased risk for a disease or disorder.
In some embodiments, the invention features a method for detecting a cancer phenotype in an individual, in which the cancer phenotype is defined by the presence of at least one of a set of mutations. In some embodiments, the method includes obtaining DNA or RNA measurements for a sample of DNA or RNA from one or more cells from the individual, in which one or more of the cells is suspected of having the cancer phenotype, and analyzing the DNA or RNA measurements to determine, for each mutation in the set of mutations, the likelihood that at least one of the cells has that mutation. In some embodiments, the method includes determining that the individual has the cancer phenotype if either (i) for at least one of the mutations, the likelihood that at least one of the cells contains that mutation is greater than a threshold, or (ii) for at least one of the mutations, the likelihood that at least one of the cells contains that mutation is less than the threshold, and for a plurality of the mutations, the combined likelihood that at least one of the cells has at least one of the mutations is greater than the threshold. In some embodiments, the one or more cells have some or all of the mutations in the set of mutations. In some embodiments, the sample includes cell-free DNA or RNA. In some embodiments, the DNA or RNA measurements include measurements (such as the quantity of each allele at each locus) at a set of polymorphic loci on one or more chromosomes or chromosome segments of interest.
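The two-part criterion above can be sketched as follows. Combining the per-mutation likelihoods as one minus the product of their complements assumes the mutations are independent, and the sketch pools all mutations rather than a chosen plurality; both are simplifying assumptions, and the function name is hypothetical.

```python
import math

def has_cancer_phenotype(mutation_likelihoods, threshold):
    """mutation_likelihoods: for each mutation in the set, the likelihood that
    at least one cell carries it. Returns True under criterion (i) or (ii)."""
    # (i) a single mutation is, on its own, likely enough to be present
    if any(p > threshold for p in mutation_likelihoods):
        return True
    # (ii) no single mutation passes, but jointly at least one is likely present
    combined = 1.0 - math.prod(1.0 - p for p in mutation_likelihoods)
    return combined > threshold

single_hit = has_cancer_phenotype([0.9, 0.0], threshold=0.7)
joint_hit = has_cancer_phenotype([0.3, 0.4, 0.5], threshold=0.7)
no_hit = has_cancer_phenotype([0.1, 0.1], threshold=0.7)
```

In the second case no single likelihood exceeds 0.7, but the combined likelihood under independence is 1 - 0.7 × 0.6 × 0.5 = 0.79.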
In one aspect, the invention features methods for selecting a therapy for treating, stabilizing, or preventing a disease or disorder in a mammal. In some embodiments, the method includes determining whether there is an overrepresentation of the number of copies of a first homologous chromosome segment compared with a second homologous chromosome segment, using any of the methods described herein. In some embodiments, a therapy is selected for the mammal (such as a therapy for a disease or disorder associated with the overrepresentation of the first homologous chromosome segment).
In one aspect, the invention features methods for preventing, delaying, stabilizing, or treating a disease or disorder in a mammal. In some embodiments, the method includes determining whether there is an overrepresentation of the number of copies of a first homologous chromosome segment compared with a second homologous chromosome segment, using any of the methods described herein. In some embodiments, a therapy is selected for the mammal (such as a therapy for a disease or disorder associated with the overrepresentation of the first homologous chromosome segment), and the therapy is then administered to the mammal.
In some embodiments, treating, stabilizing, or preventing a disease or disorder includes preventing or delaying an initial or subsequent occurrence of the disease or disorder, increasing the disease-free survival time between the disappearance of the condition and its recurrence, stabilizing or reducing an adverse symptom associated with the condition, or inhibiting or stabilizing the progression of the condition. In some embodiments, at least 20, 40, 60, 80, 90, or 95% of the treated subjects have a complete remission in which all evidence of the disease disappears. In some embodiments, the length of time a subject survives after being diagnosed with the disease and treated is at least 20, 40, 60, 80, 100, 200, or even 500% greater than (i) the average time an untreated subject survives, or (ii) the average time a subject treated with another therapy survives.
In some embodiments, treating, stabilizing, or preventing cancer includes reducing or stabilizing the size of a tumor (for example, a benign or malignant tumor), slowing or preventing an increase in the size of a tumor, reducing or stabilizing the number of tumor cells, increasing the disease-free survival time between the disappearance of a tumor and its recurrence, preventing an initial or subsequent occurrence of a tumor, or reducing or stabilizing an adverse symptom associated with a tumor. In one embodiment, the number of cancer cells surviving the treatment is at least 10, 20, 40, 60, 80, or 100% lower than the initial number of cancer cells, as measured using any standard assay. In some embodiments, the decrease in the number of cancer cells induced by administration of a therapy of the invention is at least 2, 5, 10, 20, or 50-fold greater than the decrease in the number of non-cancerous cells. In some embodiments, the number of cancer cells present after administration of a therapy is at least 2, 5, 10, 20, or 50-fold lower than the number of cancer cells present after administration of a control (such as saline or a buffer). In some embodiments, the methods of the invention result in a decrease of 10, 20, 40, 60, 80, or 100% in the size of a tumor, as determined using standard methods. In some embodiments, at least 10, 20, 40, 60, 80, 90, or 95% of the treated subjects have a complete remission in which no cancer cells are detectable. In some embodiments, the cancer does not reappear, or reappears only after at least 2, 5, 10, 15, or 20 years. In some embodiments, the length of time a subject survives after being diagnosed with cancer and treated with a therapy of the invention is at least 10, 20, 40, 60, 80, 100, 200, or even 500% greater than (i) the average time an untreated subject survives, or (ii) the average time a subject treated with another therapy survives.
In one aspect, the invention features methods for stratifying subjects involved in a clinical trial for treating, stabilizing, or preventing a disease or disorder in a mammal. In some embodiments, the method includes determining whether there is an overrepresentation of the number of copies of a first homologous chromosome segment compared with a second homologous chromosome segment, using any of the methods described herein, before, during, or after the clinical trial. In some embodiments, the presence or absence of the overrepresentation of the first homologous chromosome segment in the genome of the subject places the subject into a subgroup for the clinical trial.
In some embodiments, the disease or disorder is selected from the group consisting of cancer, intellectual disability, learning disability (such as congenital learning disability), mental retardation, developmental delay, autism, neurodegenerative disease or disorder, schizophrenia, physical defect, autoimmune disease or disorder, systemic lupus erythematosus, psoriasis, Crohn's disease, glomerulonephritis, HIV infection, AIDS, and combinations thereof. In some embodiments, the disease or disorder is selected from the group consisting of DiGeorge syndrome, DiGeorge 2 syndrome, DiGeorge/VCFS syndrome, Prader-Willi syndrome, Angelman syndrome, Beckwith-Wiedemann syndrome, 1p36 deletion syndrome, 2q37 deletion syndrome, 3q29 deletion syndrome, 9q34 deletion syndrome, 17q21.31 deletion syndrome, Cri-du-chat syndrome, Jacobsen syndrome, Miller-Dieker syndrome, Phelan-McDermid syndrome, Smith-Magenis syndrome, WAGR syndrome, Wolf-Hirschhorn syndrome, Williams syndrome, Williams-Beuren syndrome, Down syndrome, Edwards syndrome, Patau syndrome, Klinefelter syndrome, Turner syndrome, 47,XXX syndrome, 47,XYY syndrome, Sotos syndrome, and combinations thereof. In some embodiments, the method determines the presence or absence of one or more of the following chromosome abnormalities: nullisomy, monosomy, uniparental disomy, trisomy, matched trisomy, unmatched trisomy, maternal trisomy, paternal trisomy, triploidy, mosaic tetrasomy, matched tetrasomy, unmatched tetrasomy, other aneuploidies, unbalanced translocation, balanced translocation, insertion, deletion, recombination, and combinations thereof. In some embodiments, a chromosome abnormality is any deviation of the copy number of a specific chromosome or chromosome segment from the most common copy number of that chromosome or segment; for example, in a human somatic cell, any deviation from 2 copies can be regarded as a chromosome abnormality. In some embodiments, the method determines the presence or absence of euploidy. In some embodiments, the copy number hypotheses include one or more copy number hypotheses for a singleton pregnancy. In some embodiments, the copy number hypotheses include one or more copy number hypotheses for a multiple pregnancy, such as a twin pregnancy (for example, identical or fraternal twins, or a vanishing twin). In some embodiments, the copy number hypotheses include all fetuses in a multiple pregnancy being euploid, all fetuses in a multiple pregnancy being aneuploid (such as any of the aneuploidies disclosed herein), and/or one or more fetuses in a multiple pregnancy being euploid and one or more fetuses being aneuploid. In some embodiments, the copy number hypotheses include identical twins (also called monozygotic twins) or fraternal twins (also called dizygotic twins). In some embodiments, the copy number hypotheses include a molar pregnancy, such as a complete or partial molar pregnancy. In some embodiments, the chromosome segment of interest is an entire chromosome. In some embodiments, the chromosome or chromosome segment is selected from the group consisting of chromosome 13, chromosome 18, chromosome 21, the X chromosome, the Y chromosome, segments thereof, and combinations thereof. In some embodiments, the first homologous chromosome segment and the second homologous chromosome segment are a pair of homologous chromosome segments that comprise the chromosome segment of interest. In some embodiments, the first homologous chromosome segment and the second homologous chromosome segment are a pair of homologous chromosomes of interest. In some embodiments, a confidence is calculated for the CNV determination or for the diagnosis of the disease or disorder.
In some embodiments, the deletion is a deletion of at least 0.01 kb, 0.1 kb, 1 kb, 10 kb, 100 kb, 1 Mb, 2 Mb, 3 Mb, 5 Mb, 10 Mb, 15 Mb, 20 Mb, 30 Mb, or 40 Mb. In some embodiments, the deletion is a deletion of between 1 kb and 40 Mb, such as 1 kb to 100 kb, 100 kb to 1 Mb, 1 to 5 Mb, 5 to 10 Mb, 10 to 15 Mb, 15 to 20 Mb, 20 to 25 Mb, 25 to 30 Mb, or 30 to 40 Mb. In some embodiments, one copy of the chromosome segment is deleted and one copy is present. In some embodiments, two copies of the chromosome segment are deleted. In some embodiments, an entire chromosome is deleted.
In some embodiments, the duplication is a duplication of at least 0.01 kb, 0.1 kb, 1 kb, 10 kb, 100 kb, 1 Mb, 2 Mb, 3 Mb, 5 Mb, 10 Mb, 15 Mb, 20 Mb, 30 Mb, or 40 Mb. In some embodiments, the duplication is a duplication of between 1 kb and 40 Mb, such as 1 kb to 100 kb, 100 kb to 1 Mb, 1 to 5 Mb, 5 to 10 Mb, 10 to 15 Mb, 15 to 20 Mb, 20 to 25 Mb, 25 to 30 Mb, or 30 to 40 Mb. In some embodiments, the chromosome segment is duplicated one time. In some embodiments, the chromosome segment is duplicated more than one time, such as 2, 3, 4, or 5 times. In some embodiments, an entire chromosome is duplicated. In some embodiments, a region in the first homologous segment is deleted, and the same region or another region in the second homologous segment is duplicated. In some embodiments, at least 50, 60, 70, 80, 90, 95, 96, 98, 99, or 100% of the SNVs tested are transversion mutations rather than transition mutations.
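The transversion criterion in the preceding paragraph can be checked mechanically. The following is a minimal sketch, not part of the disclosed method: it assumes each tested SNV is supplied as a (reference base, alternate base) pair, and the function names `is_transversion` and `transversion_fraction` are hypothetical names chosen for illustration.

```python
# Illustrative sketch only; names and input format are assumptions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transversion(ref: str, alt: str) -> bool:
    """Return True for a purine<->pyrimidine change, False for a transition."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or {ref, alt} - (PURINES | PYRIMIDINES):
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # A transversion swaps base class; a transition stays within one class.
    return (ref in PURINES) != (alt in PURINES)


def transversion_fraction(snvs) -> float:
    """Fraction of the (ref, alt) pairs in snvs that are transversions."""
    snvs = list(snvs)
    if not snvs:
        raise ValueError("no SNVs supplied")
    return sum(is_transversion(r, a) for r, a in snvs) / len(snvs)


if __name__ == "__main__":
    panel = [("A", "C"), ("G", "T"), ("C", "T"), ("A", "T")]
    print(transversion_fraction(panel))
```

A panel would satisfy, for example, the "at least 50%" embodiment when the returned fraction is 0.5 or greater.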
In some embodiments, the sample includes DNA and/or RNA from (i) one or more target cells or (ii) one or more non-target cells. In some embodiments, the sample is a mixed sample of DNA and/or RNA from one or more target cells and one or more non-target cells. In some embodiments, the target cells are cells that have a CNV, such as a deletion or duplication of interest, and the non-target cells are cells that do not have the copy number variation of interest. In some embodiments in which the one or more target cells are cancer cells and the one or more non-target cells are non-cancerous cells, the method includes determining whether there is an overrepresentation of the number of copies of the first homologous chromosome segment in the genome of one or more of the cancer cells. In some embodiments in which the one or more target cells are genetically identical cancer cells and the one or more non-target cells are non-cancerous cells, the method includes determining whether there is an overrepresentation of the number of copies of the first homologous chromosome segment in the genome of the cancer cells. In some embodiments in which the one or more target cells are genetically non-identical cancer cells and the one or more non-target cells are non-cancerous cells, the method includes determining whether there is an overrepresentation of the number of copies of the first homologous chromosome segment in the genome of one or more of the genetically non-identical cancer cells. In some embodiments in which the sample includes cell-free DNA from a mixture of one or more cancer cells and one or more non-cancerous cells, the method includes determining whether there is an overrepresentation of the number of copies of the first homologous chromosome segment in the genome of one or more of the cancer cells. In some embodiments in which the one or more target cells are genetically identical fetal cells and the one or more non-target cells are maternal cells, the method includes determining whether there is an overrepresentation of the number of copies of the first homologous chromosome segment in the genome of the fetal cells. In some embodiments in which the one or more target cells are genetically non-identical fetal cells and the one or more non-target cells are maternal cells, the method includes determining whether there is an overrepresentation of the number of copies of the first homologous chromosome segment in the genome of one or more of the genetically non-identical fetal cells. Because the cells of most individuals contain a nearly identical set of nuclear DNA, the term "target cell" may, in some embodiments, be used interchangeably with the term "target individual". Cancer cells have a genotype that differs from that of the host individual. In this case, the cancer itself may be considered an individual. In addition, many cancers are heterogeneous, meaning that different cells in a tumor are genetically distinct from other cells in the same tumor. In this case, the different genetically identical regions can be considered different individuals. Alternatively, the cancer may be considered a single individual with a mixture of cells that have different genomes. Typically, the non-target cells are euploid, although this is not necessarily the case.
In some embodiments, the sample is obtained from a maternal whole blood sample or a fraction thereof, cells isolated from a maternal blood sample, an amniocentesis sample, a fetal sample, a placental tissue sample, a chorionic villus sample, a placental membrane sample, a cervical mucus sample, or a sample from a fetus. In some embodiments, the sample includes cell-free DNA obtained from a blood sample, or a fraction thereof, from the mother. In some embodiments, the sample includes nuclear DNA obtained from a mixture of fetal cells and maternal cells. In some embodiments, the sample is obtained from a fraction of maternal blood that contains nucleated cells and has been enriched for fetal cells. In some embodiments, the sample is divided into multiple fractions (such as 2, 3, 4, 5, or more fractions), and each fraction is analyzed using a method of the invention. If each fraction produces the same result (such as the presence or absence of one or more CNVs of interest), the confidence in the result increases. If different fractions produce different results, the sample can be re-analyzed, or another sample can be collected from the same subject and analyzed.
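The split-sample check described above can be sketched as a simple concordance test. This is an illustrative sketch under stated assumptions: each fraction is assumed to yield a boolean call (CNV present or absent), the function name `combine_fraction_calls` is hypothetical, and the text does not quantify how much the confidence increases, so the sketch only reports whether the fractions agree.

```python
# Illustrative sketch only; the function name and return labels are assumptions.
def combine_fraction_calls(calls):
    """Compare per-fraction CNV calls (True = CNV present, False = absent).

    Returns ("concordant", call) when every fraction agrees, which the text
    treats as increasing confidence in the result, or ("discordant", None),
    in which case the sample would be re-analyzed or a new sample collected.
    """
    calls = list(calls)
    if len(calls) < 2:
        raise ValueError("at least two fractions are needed for a comparison")
    if all(c == calls[0] for c in calls):
        return ("concordant", calls[0])
    return ("discordant", None)


if __name__ == "__main__":
    print(combine_fraction_calls([True, True, True]))
    print(combine_fraction_calls([True, False, True]))
```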
Exemplary subjects include mammals, such as humans and mammals of veterinary interest. In some embodiments, the mammal is a primate (such as a human, monkey, gorilla, ape, lemur, etc.), a bovine, a horse, a pig, a dog, or a cat.
In some embodiments, any of the methods includes generating a report (such as a written or electronic report) that discloses the result of the method of the invention (such as the presence or absence of a deletion or duplication).
In some embodiments, any of the methods includes taking a clinical action based on the result of a method of the invention (such as the presence or absence of a deletion or duplication). In some embodiments in which an embryo or fetus has one or more polymorphisms or mutations of interest (such as a CNV) based on the result of a method of the invention, the clinical action includes performing additional testing (such as testing to confirm the presence of the polymorphism or mutation), not implanting the embryo for in vitro fertilization (IVF), implanting a different embryo for IVF, terminating the pregnancy, preparing for a child with special needs, or undergoing an intervention designed to reduce the severity of the phenotypic presentation of a genetic disorder. In some embodiments, the clinical action is selected from the group consisting of performing an ultrasound, amniocentesis on the fetus, amniocentesis on a subsequent fetus that inherits genetic material from the mother and/or father, chorionic villus biopsy on the fetus, chorionic villus biopsy on a subsequent fetus that inherits genetic material from the mother and/or father, in vitro fertilization, preimplantation genetic diagnosis on one or more embryos that inherited genetic material from the mother and/or father, karyotyping of the mother, karyotyping of the father, fetal echocardiography (such as an echocardiogram of a fetus with trisomy 21, 18, or 13, monosomy X, or a microdeletion), and combinations thereof. In some embodiments, the clinical action is selected from the group consisting of administering growth hormone to a born child with monosomy X (such as administration beginning at about 9 months), administering calcium to a born child with a 22q deletion (such as DiGeorge syndrome), administering an androgen such as testosterone to a born child with 47,XXY (such as one injection per month for 3 months of 25 mg testosterone enanthate to an infant or toddler), performing a test for cancer on a woman with a complete or partial molar pregnancy (such as a triploid fetus), administering a cancer therapy such as a chemotherapeutic agent to a woman with a complete or partial molar pregnancy (such as a triploid fetus), screening a fetus determined to be male (such as a fetus determined to be male using a method of the invention) for one or more X-linked genetic disorders such as Duchenne muscular dystrophy (DMD), adrenoleukodystrophy, or hemophilia, performing amniocentesis on a male fetus at risk for an X-linked disorder, administering dexamethasone to a woman carrying a female fetus (such as a fetus determined to be female using a method of the invention) at risk for congenital adrenal hyperplasia, performing amniocentesis on a female fetus at risk for congenital adrenal hyperplasia, administering killed vaccines (rather than live vaccines) or not administering certain vaccines to a born child who is immune deficient due to a 22q11.2 deletion, performing occupational and/or physical therapy, performing early intervention in education, delivering the baby at a tertiary care center with a NICU and/or with pediatric specialists available at delivery, performing behavioral intervention for a born child (such as a child with XXX, XXY, or XYY), and combinations thereof.
In some embodiments, ultrasound or another screening test is performed on a woman determined to have a multiple pregnancy (such as twins) to determine whether two or more of the fetuses are monochorionic. Monozygotic twins result from the ovulation and fertilization of a single oocyte, followed by division of the fertilized egg; the placentation may be dichorionic or monochorionic. Dizygotic twins result from the ovulation and fertilization of two oocytes, which usually results in dichorionic placentation. Monochorionic twins are at risk for twin-to-twin transfusion syndrome, which may cause an unequal distribution of blood between the fetuses that results in differences in their growth and development and sometimes in stillbirth. Therefore, twins determined to be monozygotic using a method of the invention are desirably tested (such as by ultrasound) to determine whether they are monochorionic, and if so, these twins can be monitored (such as by biweekly ultrasounds from 16 weeks) for signs of twin-to-twin transfusion syndrome.
In some embodiments in which an embryo or fetus does not have one or more polymorphisms or mutations of interest (such as a CNV) based on the result of a method of the invention, the clinical action includes implanting the embryo for IVF or continuing the pregnancy. In some embodiments, the clinical action is additional testing to confirm the absence of the polymorphism or mutation, selected from the group consisting of ultrasound, amniocentesis, chorionic villus biopsy, and combinations thereof.
In some embodiments in which an individual has one or more polymorphisms or mutations (such as a polymorphism or mutation associated with a disease or disorder such as cancer, or with an increased risk of a disease or disorder such as cancer) based on the result of a method of the invention, the clinical action includes performing additional testing for the disease or disorder or administering one or more therapies (such as a cancer therapy, a therapy for the specific type of cancer or type of mutation diagnosed in the individual, or any of the therapies disclosed herein). In some embodiments, the clinical action is additional testing to confirm the presence or absence of the polymorphism or mutation, selected from the group consisting of biopsy, surgery, medical imaging (such as a mammogram or an ultrasound), and combinations thereof.
In some embodiments, the additional testing includes performing the same or a different method (such as any of the methods described herein) to confirm the presence or absence of the polymorphism or mutation (such as a CNV), for example by testing a second fraction of the same sample or a different sample from the same individual (such as the same pregnant woman, fetus, embryo, or individual at increased risk for cancer). In some embodiments, the additional testing is performed for an individual for whom the probability of a polymorphism or mutation (such as a CNV) is above a threshold value (such as additional testing to confirm the presence of a likely polymorphism or mutation). In some embodiments, the additional testing is performed for an individual for whom the confidence or z-score for a polymorphism or mutation (such as a CNV) is above a threshold value (such as additional testing to confirm the presence of a likely polymorphism or mutation). In some embodiments, the additional testing is performed for an individual for whom the confidence or z-score for a polymorphism or mutation (such as a CNV) is between minimum and maximum threshold values (such as additional testing to increase the confidence that the initial result is correct). In some embodiments, the additional testing is performed for an individual for whom the confidence for the determination of the presence or absence of a polymorphism or mutation (such as a CNV) is below a threshold value (such as a "no call" result, because the presence or absence of the CNV cannot be determined with sufficient confidence). An exemplary Z-score calculation is reported in Chiu et al., BMJ 2011;342:c7401 (which is hereby incorporated by reference in its entirety), in which chromosome 21 is used as an example and can be replaced with any other chromosome or chromosome segment in the test sample.
Z-score for the percentage of chromosome 21 in the test case = ((percentage of chromosome 21 in the test case) - (mean percentage of chromosome 21 in the reference controls)) / (standard deviation of the percentage of chromosome 21 in the reference controls).
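The Z-score formula above, together with the threshold-based selection for additional testing, can be sketched as follows. This is a minimal sketch, not the published implementation: the function names are hypothetical, the use of the sample standard deviation (`statistics.stdev`) is an assumption because the text does not say which estimator applies, and the example percentages and thresholds are made-up values for illustration.

```python
# Illustrative sketch only; names, estimator choice and values are assumptions.
from statistics import mean, stdev


def chromosome_z_score(test_pct: float, reference_pcts) -> float:
    """Z-score of a chromosome's read percentage against reference controls."""
    reference_pcts = list(reference_pcts)
    mu = mean(reference_pcts)
    sd = stdev(reference_pcts)  # assumption: sample standard deviation
    if sd == 0:
        raise ValueError("reference controls have zero variance")
    return (test_pct - mu) / sd


def in_retest_band(z: float, min_threshold: float, max_threshold: float) -> bool:
    """True when z lies between the minimum and maximum thresholds,
    the case in which the text performs additional testing to increase
    confidence in the initial result."""
    return min_threshold <= z <= max_threshold


if __name__ == "__main__":
    controls = [1.30, 1.32, 1.34]   # hypothetical chromosome 21 percentages
    z = chromosome_z_score(1.38, controls)
    print(round(z, 3), in_retest_band(z, 2.5, 4.0))
```

Any other chromosome or chromosome segment can be substituted for chromosome 21 by passing its percentages instead.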
In some embodiments, the additional testing is performed for an individual whose initial sample did not meet quality control guidelines or had a fetal fraction or tumor fraction below a threshold value. In some embodiments, the method includes selecting an individual for additional testing based on the result of a method of the invention, the probability of the result, the confidence of the result, or the z-score; and performing the additional testing on the individual (such as on the same or a different sample). In some embodiments, a subject diagnosed with a disease or disorder (such as cancer) undergoes repeat testing at multiple time points, using a method of the invention or a known test for the disease or disorder, to monitor the progression of the disease or disorder or the remission or recurrence of the disease or disorder.
In one aspect, the invention features a report (such as a written or electronic report) with a result from a method of the invention (such as the presence or absence of a deletion or duplication).
In various embodiments, the primer extension reaction or the polymerase chain reaction includes the addition of one or more nucleotides by a polymerase. In some embodiments, the primers are in solution. In some embodiments, the primers are in solution and are not immobilized on a solid support. In some embodiments, the primers are not part of a microarray. In various embodiments, the primer extension reaction or the polymerase chain reaction does not include ligation-mediated PCR. In various embodiments, the primer extension reaction or the polymerase chain reaction does not include the joining of two primers by a ligase. In various embodiments, the primers do not include ligation inversion probes (LIPs), which are also called pre-circularized probes, pre-circularizing probes, circularizing probes, padlock probes, or molecular inversion probes (MIPs).
It is understood that the aspects and embodiments of the invention described herein include combinations of any two or more of the aspects or embodiments of the invention.
Definitions
Single nucleotide polymorphism (SNP) refers to a single nucleotide that may differ between the genomes of two members of the same species. The use of the term should not be taken to imply any limit on the frequency with which each variant occurs.
Sequence refers to a DNA sequence or a genetic sequence. It may refer to the primary, physical structure of the DNA molecule or strand in an individual. It may refer to the sequence of nucleotides found in that DNA molecule, or in the complementary strand of the DNA molecule. It may also refer to the information contained in the DNA molecule as its representation in silico.
Locus refers to a particular region of interest on the DNA of an individual, which may be a single nucleotide polymorphism (SNP), the site of a possible insertion or deletion, or the site of some other relevant genetic variation. Disease-linked SNPs may also be referred to as disease-linked loci.
Polymorphic allele, also "polymorphic locus", refers to an allele or locus at which the genotype varies between individuals of the same species. Some examples of polymorphic alleles include single nucleotide polymorphisms, short tandem repeats, deletions, duplications, and inversions.
Polymorphic site refers to the specific nucleotides, found in a polymorphic region, that vary between individuals.
Mutation refers to an alteration in a naturally occurring or reference nucleic acid sequence, such as an insertion, deletion, duplication, translocation, substitution, frameshift mutation, silent mutation, nonsense mutation, missense mutation, point mutation, transition mutation, transversion mutation, reverse mutation, or microsatellite alteration. In some embodiments, the amino acid sequence encoded by the nucleic acid sequence differs from a naturally occurring sequence by at least one amino acid.
Allele refers to the genes that occupy a particular locus.
Genetic data, also "genotypic data", refers to data describing aspects of the genome of one or more individuals. It may refer to one locus or a set of loci, partial or entire sequences, partial or entire chromosomes, or the entire genome. It may refer to the identity of one or more nucleotides; it may refer to a set of sequential nucleotides, to nucleotides from different locations in the genome, or to a combination thereof. Genetic data is typically in silico; however, physical nucleotides in a sequence can also be considered chemically encoded genetic data. Genetic data may be said to be "on", "of", "at", "from", or "on" the individual. Genotypic data may refer to the output measurements from a genotyping platform, where those measurements are made on genetic material.
Genetic material, also "genetic sample", refers to physical matter, such as tissue or blood, from one or more individuals that includes DNA or RNA.
Confidence refers to the statistical likelihood that the called SNP, allele, set of alleles, determined number of copies of a chromosome or chromosome segment, or diagnosis of the presence or absence of a disease correctly represents the real genetic state of the individual.
Ploidy calling, also "chromosome copy number calling" or "copy number calling" (CNC), may refer to the act of determining the quantity and/or chromosomal identity of one or more chromosomes or chromosome segments present in a cell.
Aneuploidy refers to the state in which the wrong number of chromosomes (for example, the wrong number of full chromosomes or the wrong number of chromosome segments, such as the presence of a deletion or duplication of a chromosome segment) is present in a cell. In the case of a human somatic cell, it may refer to the case in which a cell does not contain 22 pairs of autosomes and one pair of sex chromosomes. In the case of a human gamete, it may refer to the case in which a cell does not contain one of each of the 23 chromosomes. In the case of a single chromosome type, it may refer to the case in which more or fewer than two homologous but non-identical chromosome copies are present, or in which two chromosome copies that originate from the same parent are present. In some embodiments, the deletion of a chromosome segment is a microdeletion.
Ploidy state refers to the quantity and/or chromosomal identity of one or more chromosomes or chromosome segments in a cell.
Chromosome may refer to a single chromosome copy, meaning a single molecule of DNA, of which there are 46 in a normal somatic cell; an example is "the maternally derived chromosome 18". Chromosome may also refer to a chromosome type, of which there are 23 in a normal human somatic cell; an example is "chromosome 18".
Chromosomal identity may refer to the referent chromosome number, i.e., the chromosome type. Normal humans have 22 types of numbered autosomes and two types of sex chromosomes. It may also refer to the parental origin of the chromosome. It may also refer to a specific chromosome inherited from the parent. It may also refer to other identifying features of a chromosome.
Allelic data refers to a set of genotypic data concerning a set of one or more alleles. It may refer to phased, haplotypic data. It may refer to SNP identities, and it may refer to the sequence data of the DNA, including insertions, deletions, repeats, and mutations. It may include the parental origin of each allele.
Allelic state refers to the actual state of the genes in a set of one or more alleles. It may refer to the actual state of the genes described by the allelic data.
Allele count refers to the number of sequences that map to a particular locus, and if that locus is polymorphic, it refers to the number of sequences that map to each of the alleles. If each allele is counted in a binary fashion, the allele count will be a whole number. If the alleles are counted probabilistically, the allele count can be a fractional number.
Allele count probability refers to the number of sequences that are likely to map to a particular locus, or to a set of alleles at a polymorphic locus, combined with the probability of the mapping. Note that allele counts are equivalent to allele count probabilities when the mapping probability of each counted sequence is binary (zero or one). In some embodiments, the allele count probabilities may be binary. In some embodiments, the allele count probabilities may be set equal to the DNA measurements.
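The relationship between allele counts and allele count probabilities can be illustrated with a short sketch. It assumes, for illustration only, that each read is given as an (allele, mapping probability) pair; the function names and the 0.5 cutoff used for binary counting are hypothetical and are not specified by the text.

```python
# Illustrative sketch only; names, input format and cutoff are assumptions.
from collections import defaultdict


def allele_count_probabilities(reads):
    """Sum the mapping probabilities per allele; the result may be fractional."""
    totals = defaultdict(float)
    for allele, prob in reads:
        if not 0.0 <= prob <= 1.0:
            raise ValueError("mapping probability must lie in [0, 1]")
        totals[allele] += prob
    return dict(totals)


def allele_counts(reads, min_prob=0.5):
    """Binary counting: each read contributes 0 or 1, so counts are whole numbers.

    min_prob is an assumed cutoff for treating a read as mapped.
    """
    totals = defaultdict(int)
    for allele, prob in reads:
        totals[allele] += 1 if prob >= min_prob else 0
    return dict(totals)


if __name__ == "__main__":
    reads = [("A", 1.0), ("A", 0.9), ("B", 0.6), ("B", 0.2)]
    print(allele_count_probabilities(reads))
    print(allele_counts(reads))
```

When every mapping probability is exactly zero or one, the two functions return the same values, which is the equivalence noted in the definition.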
Allelic distribution, or "allele count distribution", refers to the relative amount of each allele that is present at each locus in a set of loci. An allelic distribution can refer to an individual, to a sample, or to a set of measurements made on a sample. In the context of digital allele measurements such as sequencing, the allelic distribution refers to the number, or probable number, of reads that map to a particular allele for each allele in a set of polymorphic loci. In the context of analog allele measurements such as SNP arrays, the allelic distribution refers to allele intensities and/or allele ratios. The allele measurements may be treated probabilistically, that is, the likelihood that a given allele is present in a given sequence read is a fraction between 0 and 1, or they may be treated in a binary fashion, that is, any given read is considered to be exactly zero or one copy of a particular allele.
Allelic distribution pattern refers to a set of different allele distributions for different contexts (for example, different parental contexts). Certain allelic distribution patterns may be indicative of certain ploidy states.
Allelic bias refers to the degree to which the measured ratio of alleles at a heterozygous locus differs from the ratio that was present in the original sample of DNA or RNA. The degree of allelic bias at a particular locus is equal to the allelic ratio observed at that locus (as measured) divided by the ratio of alleles in the original DNA or RNA sample at that locus. Allelic bias may be due to amplification bias, purification bias, or some other phenomenon that affects different alleles differently.
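The allelic bias definition is a ratio of ratios, which the following minimal sketch makes explicit. The function name `allelic_bias` is hypothetical, the ratio is assumed to be expressed as allele A to allele B, and the default original ratio of 1.0 is an assumption that fits a heterozygous locus in a sample from a single euploid genome.

```python
# Illustrative sketch only; the name, ratio convention and default are assumptions.
def allelic_bias(observed_a: float, observed_b: float,
                 original_ratio: float = 1.0) -> float:
    """Observed A:B ratio at a locus divided by the A:B ratio in the original sample.

    A value of 1.0 means no bias; values above 1.0 mean allele A is
    over-represented in the measurement relative to the starting material.
    """
    if observed_b <= 0 or original_ratio <= 0:
        raise ValueError("allele B count and original ratio must be positive")
    return (observed_a / observed_b) / original_ratio


if __name__ == "__main__":
    # 60 reads of allele A and 40 of allele B at a locus that started at 1:1.
    print(allelic_bias(60, 40))
```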
Allelic imbalance refers, for SNVs, to the proportion of abnormal DNA, which is typically measured using the mutant allele frequency (the number of mutant alleles at a locus / the total number of alleles at that locus). Since the difference between the amounts of the two homologs in tumors is analogous, we measure the proportion of abnormal DNA for a CNV by the average allelic imbalance (AAI), defined as |(H1 - H2)|/(H1 + H2), where Hi is the average number of copies of homolog i in the sample and Hi/(H1 + H2) is the fractional abundance, or homolog ratio, of homolog i. The maximum homolog ratio is the homolog ratio of the more abundant homolog.
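As a worked illustration of the average allelic imbalance, the sketch below reads the definition as |H1 - H2|/(H1 + H2), consistent with the homolog ratio Hi/(H1 + H2) given in the same passage. The function names are hypothetical, and the example mixture (10% of cells carrying one extra copy of homolog 1) is a made-up value chosen for illustration.

```python
# Illustrative sketch only; names and the example mixture are assumptions.
def homolog_ratios(h1: float, h2: float):
    """Fractional abundance Hi / (H1 + H2) of each homolog."""
    total = h1 + h2
    if h1 < 0 or h2 < 0 or total <= 0:
        raise ValueError("copy numbers must be non-negative with a positive sum")
    return h1 / total, h2 / total


def average_allelic_imbalance(h1: float, h2: float) -> float:
    """AAI = |H1 - H2| / (H1 + H2) for average homolog copy numbers H1, H2."""
    r1, r2 = homolog_ratios(h1, h2)
    return abs(r1 - r2)


if __name__ == "__main__":
    # Hypothetical mixture: 90% normal cells (1 copy of each homolog) and
    # 10% cells with a duplication of homolog 1 (2 copies of homolog 1).
    h1 = 0.9 * 1 + 0.1 * 2
    h2 = 1.0
    print(average_allelic_imbalance(h1, h2))
    print(max(homolog_ratios(h1, h2)))  # maximum homolog ratio
```

In this example the AAI is 0.1/2.1, so even a modest abnormal fraction produces a measurable departure from the balanced value of zero.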
Assay drop-out rate refers to the percentage of single nucleotide polymorphisms (SNPs) with no reads, estimated using all SNPs.
Single allele drop-out (ADO) rate refers to the percentage of single nucleotide polymorphisms (SNPs) with only one allele present, estimated using only heterozygous SNPs.
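The two drop-out rates defined above can be computed from per-SNP read counts. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the inputs are assumed to be total read depths per SNP and (allele 1 reads, allele 2 reads) pairs for SNPs known to be heterozygous, and heterozygous SNPs with no reads at all are assumed to stay in the ADO denominator without being counted as single allele drop-outs.

```python
# Illustrative sketch only; names, input formats and edge-case handling are assumptions.
def assay_dropout_rate(read_depths) -> float:
    """Percentage of SNPs with no reads, estimated over all SNPs."""
    depths = list(read_depths)
    if not depths:
        raise ValueError("no SNPs supplied")
    return 100.0 * sum(d == 0 for d in depths) / len(depths)


def single_allele_dropout_rate(het_allele_reads) -> float:
    """Percentage of heterozygous SNPs at which exactly one allele was observed."""
    pairs = list(het_allele_reads)
    if not pairs:
        raise ValueError("no heterozygous SNPs supplied")
    # (a > 0) != (b > 0) is True only when one allele has reads and the other has none.
    only_one = sum((a > 0) != (b > 0) for a, b in pairs)
    return 100.0 * only_one / len(pairs)


if __name__ == "__main__":
    print(assay_dropout_rate([10, 0, 5, 0]))
    print(single_allele_dropout_rate([(5, 4), (7, 0), (0, 3), (2, 2)]))
```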
Primer, also "PCR probe", refers to a single nucleic acid molecule (such as a DNA molecule or a DNA oligomer) or a collection of nucleic acid molecules (such as DNA molecules or DNA oligomers) in which the molecules are identical or nearly identical, and in which the primer contains a region that is designed to hybridize to a target locus (for example, a targeted polymorphic locus or a nonpolymorphic locus) or to a universal priming sequence, and includes a priming sequence designed to allow PCR amplification. A primer may also contain a molecular barcode. A primer may contain a random region that differs for each individual molecule.
Library of primers refers to a population of two or more primers. In various embodiments, the library includes at least 100, 200, 500, 750, 1,000, 2,000, 5,000, 7,500, 10,000, 20,000, 25,000, 30,000, 40,000, 50,000, 75,000, or 100,000 different primers. In various embodiments, the library includes at least 100, 200, 500, 750, 1,000, 2,000, 5,000, 7,500, 10,000, 20,000, 25,000, 30,000, 40,000, 50,000, 75,000, or 100,000 different primer pairs, where each pair of primers includes a forward test primer and a reverse test primer, and where each pair of test primers hybridizes to a target locus. In some embodiments, the library of primers includes at least 100, 200, 500, 750, 1,000, 2,000, 5,000, 7,500, 10,000, 20,000, 25,000, 30,000, 40,000, 50,000, 75,000, or 100,000 different individual primers that each hybridize to a different target locus, where the individual primers are not part of primer pairs. In some embodiments, the library has both (i) primer pairs and (ii) individual primers (such as universal primers) that are not part of primer pairs.
Different primers refers to non-identical primers.
Different libraries refers to non-identical libraries.
Different target loci refers to non-identical target loci.
Different amplicons refers to non-identical amplicons.
Hybrid capture probe refers to any nucleic acid sequence, possibly modified, that is generated by various methods such as PCR or direct synthesis and is intended to be complementary to one strand of a specific target DNA sequence in a sample. Exogenous hybrid capture probes may be added to a prepared sample and hybridized through a denaturation-reannealing process to form duplexes of exogenous-endogenous fragments. These duplexes may then be physically separated from the sample by various means.
Sequence read refers to data representing a sequence of nucleotide bases measured using, for example, a clonal sequencing method. Clonal sequencing may produce sequence data representing a single molecule, or clones or clusters, of one original DNA molecule. A sequence read may also have an associated quality score at each base position of the sequence, indicating the probability that the nucleotide has been called correctly.
Mapping a sequence read is the process of determining the location of origin of a sequence read in the genome sequence of a particular organism. The location of origin of a sequence read is based on the similarity between the nucleotide sequence of the read and the genome sequence.
Matched copy error, also "matching chromosome aneuploidy" (MCA), refers to a state of aneuploidy in which one cell contains two identical or nearly identical chromosomes. This type of aneuploidy may arise during the formation of the gametes in meiosis, and may be referred to as a meiotic non-disjunction error. This type of error may also arise in mitosis. Matching trisomy may refer to the case in which three copies of a given chromosome are present in an individual and two of the copies are identical.
Unmatched copy error, also "unique chromosome aneuploidy" (UCA), refers to a state of aneuploidy in which one cell contains two chromosomes that are from the same parent and that may be homologous but not identical. This type of aneuploidy may arise during meiosis, and may be referred to as a meiotic error. Unmatched trisomy may refer to the case in which three copies of a given chromosome are present in an individual and two of the copies are from the same parent and are homologous but not identical. Note that unmatched trisomy may refer to the case in which two homologous chromosomes from one parent are present, and in which some segments of the chromosomes are identical while other segments are merely homologous.
Homologous chromosomes refers to chromosome copies that contain the same set of genes and that normally pair up during meiosis.
Identical chromosomes refers to chromosome copies that contain the same set of genes and that have, for each gene, the same set of alleles, which are identical or nearly identical.
Allele drop-out (ADO) refers to the situation in which at least one of the base pairs in a set of base pairs from homologous chromosomes at a given allele is not detected.
Locus drop-out (LDO) refers to the situation in which both base pairs in a set of base pairs from homologous chromosomes at a given allele are not detected.
Homozygous refers to having similar alleles at corresponding chromosomal loci.
Heterozygous refers to having dissimilar alleles at corresponding chromosomal loci.
Heterozygosity rate refers to the proportion of individuals in a population that have heterozygous alleles at a given locus. The heterozygosity rate may also refer to the expected or measured ratio of alleles at a given locus in an individual or in a sample of DNA or RNA.
Chromosomal region refers to a segment of a chromosome or a full chromosome.
Segment of a chromosome refers to a section of a chromosome that can range in size from one base pair to the entire chromosome.
Chromosome refers to either a full chromosome or a segment or section of a chromosome.
Copies refers to the number of copies of a chromosome segment. It may refer to identical copies, or to non-identical, homologous copies of a chromosome segment, in which the different copies of the chromosome segment contain a substantially similar set of loci and one or more of the alleles are different. Note that in some cases of aneuploidy, such as the M2 copy error, it is possible for some copies of a given chromosome segment to be identical and for some copies of the same chromosome segment to be non-identical.
Haplotype refers to a combination of alleles at multiple loci that are typically inherited together on the same chromosome. Haplotype may refer to as few as two loci or to an entire chromosome, depending on the number of recombination events that have occurred between a given set of loci. Haplotype can also refer to a set of single nucleotide polymorphisms (SNPs) on a single chromatid that are statistically associated.
Haplotypic data, also "phased data" or "ordered genetic data", refers to data from a single chromosome in a diploid or polyploid genome, that is, either the segregated maternal or the segregated paternal copy of a chromosome in a diploid genome.
Phasing refers to the act of determining the haplotypic genetic data of an individual given unordered, diploid (or polyploid) genetic data. It may refer to the act of determining which of the two genes at an allele, for a set of alleles found on one chromosome, is associated with each of the two homologous chromosomes in an individual.
Phased data refers to genetic data in which one or more haplotypes have been determined.
Hypothesis refers to a possible state, such as a possible degree of overrepresentation of the number of copies of a first homologous chromosome or chromosome segment as compared to a second homologous chromosome or chromosome segment, a possible deletion, a possible duplication, a possible ploidy state at a given set of one or more chromosomes or chromosome segments, a possible allelic state at a given set of one or more loci, a possible paternity relationship, or a possible DNA, RNA, or fetal fraction at a given set of one or more chromosomes or chromosome segments, or a set of quantities of genetic material from a set of loci. The genetic states can optionally be linked with probabilities indicating the relative likelihood of each element in the hypothesis being true in relation to the other elements in the hypothesis, or the relative likelihood of the hypothesis as a whole being true. The set of possibilities may comprise one or more elements.
Copy number hypothesis, also called "ploidy state hypothesis", refers to a hypothesis concerning the number of copies of a chromosome or chromosome segment in an individual. It may also refer to a hypothesis concerning the identity of each of the chromosomes, including the parent of origin of each chromosome and which of the parent's two chromosomes is present in the individual. It may also refer to a hypothesis concerning which chromosomes or chromosome segments, if any, from a related individual correspond genetically to a given chromosome of the individual.
Related individual refers to any individual who is genetically related to, and thus shares haplotype blocks with, the target individual. In one context, the related individual may be a genetic parent of the target individual, or any genetic material derived from a parent, such as a sperm, a polar body, an embryo, a fetus, or a child. It may also refer to a sibling, a parent, or a grandparent.
Sibling refers to any individual whose genetic parents are the same as those of the individual in question. In some embodiments, it may refer to a born child, an embryo, or a fetus, or to one or more cells originating from a born child, an embryo, or a fetus. A sibling may also refer to a haploid individual that originates from one of the parents, such as a sperm, a polar body, or any other set of haplotypic genetic matter. An individual may be considered to be a sibling of itself.
Child may refer to an embryo, a blastomere, or a fetus. Note that in the presently disclosed embodiments, the concepts described apply equally well to individuals who are a born child, a fetus, an embryo, or a set of cells therefrom. The use of the term child may simply be meant to connote that the individual referred to as the child is the genetic offspring of the parents.
Fetal refers to "of the fetus" or "of the region of the placenta that is genetically similar to the fetus". In a pregnant woman, some portion of the placenta is genetically similar to the fetus, and the free-floating fetal DNA found in maternal blood may have originated from the portion of the placenta with a genotype that matches the fetus. Note that the genetic information in half of the chromosomes in a fetus is inherited from the mother of the fetus. In some embodiments, the DNA from these maternally inherited chromosomes that came from a fetal cell is considered to be "of fetal origin" rather than "of maternal origin".
DNA of fetal origin refers to DNA that was originally part of a cell whose genotype was essentially equivalent to that of the fetus.
DNA of maternal origin refers to DNA that was originally part of a cell whose genotype was essentially equivalent to that of the mother.
Parent refers to the genetic mother or father of an individual. An individual typically has two parents (a mother and a father), though this may not necessarily be the case, for example in genetic or chromosomal chimerism. A parent may be considered to be an individual.
Parental context refers to the genetic state of a given single nucleotide polymorphism (SNP) on each of the two relevant chromosomes for one or both of the two parents of the target.
Maternal plasma refers to the plasma portion of the blood from a pregnant female.
Clinical decision refers to any decision to take or not take an action that has an outcome that affects the health or survival of an individual. A clinical decision may refer to a decision to conduct further testing, a decision to terminate or maintain a pregnancy, a decision to take actions to mitigate an undesirable phenotype, or a decision to take actions to prepare for such a phenotype.
Diagnostic box refers to one machine or a combination of machines designed to perform one or more aspects of the methods disclosed herein. In one embodiment, the diagnostic box may be placed at a point of patient care. In one embodiment, the diagnostic box may perform targeted amplification followed by sequencing. In one embodiment, the diagnostic box may function alone or with the help of a technician.
Informatics-based method refers to a method that relies heavily on statistics to make sense of a large amount of data. In the context of prenatal diagnosis, it refers to a method designed to determine the ploidy state at one or more chromosomes or chromosome segments, the allelic state at one or more alleles, or paternity by statistically inferring the most likely state, rather than by directly physically measuring the state, given a large amount of genetic data (for example, genetic data from a molecular array or sequencing). In one embodiment of the invention, the informatics-based technique may be one disclosed in this patent. In one embodiment of the invention, it may be PARENTAL SUPPORT™.
Primary genetic data refers to the analog intensity signals that are output by a genotyping platform. In the context of SNP arrays, primary genetic data refers to the intensity signals before any genotype calling has been done. In the context of sequencing, primary genetic data refers to the analog measurements, analogous to a chromatogram, that come off the sequencer before the identity of any base pairs has been determined and before the sequence has been mapped to the genome.
Secondary genetic data refers to processed genetic data that are output by a genotyping platform. In the context of a SNP array, secondary genetic data refers to the allele calls made by software associated with the SNP array reader, wherein the software has made a call as to whether a given allele is present or not present in the sample. In the context of sequencing, secondary genetic data refers to the base pair identities of the sequences that have been determined, and possibly also to where in the genome the sequences have been mapped.
Preferential enrichment of DNA that corresponds to a locus, or preferential enrichment of DNA at a locus, refers to any method that results in the percentage of DNA molecules in a post-enrichment DNA mixture that correspond to the locus being higher than the percentage of DNA molecules in the pre-enrichment DNA mixture that correspond to the locus. The method may involve selective amplification of DNA molecules that correspond to the locus. The method may involve removing DNA molecules that do not correspond to the locus. The method may involve a combination of methods. The degree of enrichment is defined as the percentage of DNA molecules in the post-enrichment mixture that correspond to the locus divided by the percentage of DNA molecules in the pre-enrichment mixture that correspond to the locus. Preferential enrichment may be carried out at a plurality of loci. In some embodiments of the invention, the degree of enrichment is greater than 20. In some embodiments of the invention, the degree of enrichment is greater than 200. In some embodiments of the invention, the degree of enrichment is greater than 2,000. When preferential enrichment is carried out at a plurality of loci, the degree of enrichment may refer to the average degree of enrichment of all of the loci in the set of loci.
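The degree of enrichment defined above is a ratio of two percentages, which the following minimal sketch makes concrete. The function names and the molecule counts are hypothetical and chosen only to illustrate the arithmetic; they do not come from the disclosure.

```python
def degree_of_enrichment(pre_on_locus, pre_total, post_on_locus, post_total):
    """Post-enrichment fraction of molecules at the locus divided by the
    pre-enrichment fraction (the ratio is the same for percentages)."""
    pre_fraction = pre_on_locus / pre_total
    post_fraction = post_on_locus / post_total
    return post_fraction / pre_fraction


def average_degree_of_enrichment(per_locus_degrees):
    """Average over a set of loci, as used when several loci are enriched."""
    return sum(per_locus_degrees) / len(per_locus_degrees)


# Hypothetical counts: 10 of 1,000,000 molecules map to the locus before
# enrichment (0.001%), and 5,000 of 1,000,000 after enrichment (0.5%).
single_locus = degree_of_enrichment(10, 1_000_000, 5_000, 1_000_000)

# Hypothetical per-locus degrees of enrichment for a three-locus panel.
panel_average = average_degree_of_enrichment([400.0, 500.0, 600.0])
```

With these illustrative counts the degree of enrichment is about 500, which would satisfy the "greater than 200" embodiment but not the "greater than 2,000" embodiment.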
Amplification refers to the method for increasing the copy number of DNA or RNA molecule.
Selective amplification may refer to a method that increases the number of copies of a particular DNA (or RNA) molecule, or of DNA (or RNA) molecules that correspond to a particular region of DNA (or RNA). It may also refer to a method that increases the number of copies of a particular targeted DNA (or RNA) molecule, or targeted region of DNA (or RNA), more than it increases non-targeted molecules or regions of DNA (or RNA). Selective amplification may be a method of preferential enrichment.
Universal priming sequence refers to a DNA (or RNA) sequence that may be appended to a population of target DNA (or RNA) molecules, for example by ligation, PCR, or ligation-mediated PCR. Once added to the population of target molecules, primers specific to the universal priming sequences can be used to amplify the target population using a single pair of amplification primers. Universal priming sequences are typically unrelated to the target sequences.
Universal adapters, also called "ligation adapters" or "library tags", are DNA molecules containing a universal priming sequence that can be covalently linked to the 5' and 3' ends of a population of target double-stranded DNA molecules. Adding the adapters to the 5' and 3' ends of the target population provides universal priming sequences from which PCR amplification can take place, amplifying all molecules from the target population using a single pair of amplification primers.
Targeting refers to a method used to selectively amplify or otherwise preferentially enrich those DNA (or RNA) molecules that correspond to a set of loci in a mixture of DNA (or RNA).
Joint distribution model refers to a model that defines the probability of events defined in terms of multiple random variables, given a plurality of random variables defined on the same probability space, where the probabilities of the variables are linked. In some embodiments, the degenerate case in which the probabilities of the variables are not linked may be used.
Cancer-related gene refers to a gene associated with an altered risk for a cancer or an altered prognosis for a cancer. Exemplary cancer-related genes that promote cancer include oncogenes; genes that enhance cell proliferation, invasion, or metastasis; genes that inhibit apoptosis; and pro-angiogenesis genes. Cancer-related genes that inhibit cancer include, but are not limited to, tumor suppressor genes; genes that inhibit cell proliferation, invasion, or metastasis; genes that promote apoptosis; and anti-angiogenesis genes.
Estrogen-related cancer refers to a cancer that is modulated by estrogen. Examples of estrogen-related cancers include, without limitation, breast cancer and ovarian cancer. HER2 is overexpressed in many estrogen-related cancers (U.S. Patent No. 6,165,464, which is hereby incorporated by reference in its entirety).
Androgen-related cancer refers to a cancer that is modulated by androgen. An example of an androgen-related cancer is prostate cancer.
Higher than normal expression level refers to expression of an mRNA or protein at a level that is higher than the average expression level of the corresponding molecule in control subjects (such as subjects without a disease or disorder, such as cancer). In various embodiments, the expression level is at least 50, 40, 75, 90, 100, 200, 500, or even 1000% higher than the level in control subjects.
Lower than normal expression level refers to expression of an mRNA or protein at a level that is lower than the average expression level of the corresponding molecule in control subjects (such as subjects without a disease or disorder, such as cancer). In various embodiments, the expression level is at least 20, 40, 50, 75, 90, 95, or 100% lower than the level in control subjects. In some embodiments, expression of the mRNA or protein is not detectable.
Modulate expression or activity refers to either increasing or decreasing the expression or activity of, for example, a protein or nucleic acid sequence relative to control conditions. In some embodiments, the modulation in expression or activity is an increase or decrease of at least 10, 20, 40, 50, 75, 90, 100, 200, 500, or even 1000%. In various embodiments, transcription, translation, mRNA or protein stability, or the binding of the mRNA or protein to other molecules in vivo is modulated by the therapy. In some embodiments, the level of mRNA is determined by standard Northern blot analysis, and the level of protein is determined by standard Western blot analysis, such as the analyses described herein or those described by, for example, Ausubel et al. (Current Protocols in Molecular Biology, John Wiley & Sons, New York, July 11, 2013, which is hereby incorporated by reference). In one embodiment, the level of a protein is determined by measuring the level of enzymatic activity using standard methods. In another preferred embodiment, the level of mRNA, protein, or enzymatic activity is equal to or less than 20, 10, 5, or 2-fold above the corresponding level in control cells that do not express a functional form of the protein, such as cells homozygous for a nonsense mutation. In yet another embodiment, the level of mRNA, protein, or enzymatic activity is equal to or less than 20, 10, 5, or 2-fold above the corresponding basal level in control cells, such as non-cancerous cells, cells that have not been exposed to conditions that induce abnormal cell proliferation or that inhibit apoptosis, or cells from a subject without the disease or disorder of interest.
Dosage sufficient to modulate mRNA or protein expression or activity refers to an amount of a therapy that increases or decreases mRNA or protein expression or activity when administered to a subject. In some embodiments, for a compound that decreases expression or activity, the modulation is a decrease in expression or activity of at least 10%, 30%, 40%, 50%, 75%, or 90% in a treated subject compared with the same subject prior to administration of the inhibitor, or compared with an untreated control subject. In addition, in one embodiment, for a compound that increases expression or activity, the amount of expression or activity of the mRNA or protein in a treated subject is at least 1.5-, 2-, 3-, 5-, 10-, or 20-fold greater than in the same subject prior to administration of the inhibitor, or than in an untreated control subject.
In some embodiments, compounds may directly or indirectly modulate the expression or activity of the mRNA or protein. For example, a compound may indirectly modulate the expression or activity of an mRNA or protein of interest by modulating the expression or activity of a molecule (such as a nucleic acid, protein, signaling molecule, growth factor, cytokine, or chemokine) that directly or indirectly affects the expression or activity of the mRNA or protein of interest. In certain embodiments, the compounds inhibit cell division or induce apoptosis. The compounds in the therapy may include, for example, unpurified or purified proteins, antibodies, synthetic organic molecules, naturally occurring organic molecules, nucleic acid molecules, and components thereof. The compounds in a combination therapy may be administered simultaneously or sequentially. Exemplary compounds include signal transduction inhibitors.
Purified refers to being separated from other components that naturally accompany it. Typically, a factor is substantially pure when it is at least 50%, by weight, free from proteins, antibodies, and naturally occurring organic molecules with which it is naturally associated. In some embodiments, the factor is at least 75%, 90%, or 99%, by weight, pure. A substantially pure factor may be obtained by chemical synthesis, by separation of the factor from natural sources, or by production of the factor in a recombinant host cell that does not naturally produce the factor. Proteins and small molecules may be purified by one of ordinary skill in the art using standard techniques, such as those described by Ausubel et al. (Current Protocols in Molecular Biology, John Wiley & Sons, New York, July 11, 2013, which is hereby incorporated by reference in its entirety). In some embodiments, the factor is at least 2, 5, or 10 times as pure as the starting material, as measured using polyacrylamide gel electrophoresis, column chromatography, optical density, HPLC analysis, or Western blot analysis (Ausubel et al., supra). Exemplary methods of purification include immunoprecipitation, column chromatography (for example, immunoaffinity chromatography), magnetic bead immunoaffinity purification, and panning with a plate-bound antibody.
Other features and advantages of the invention will be apparent from the following detailed description and from the claims.
Brief description of the drawings
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.
The presently disclosed embodiments will be further explained with reference to the attached drawings, wherein like structures are referred to by like numerals throughout the several views. The drawings shown are not necessarily to scale, with emphasis instead generally being placed upon illustrating the principles of the presently disclosed embodiments.
Figures 1A-1D show the distribution of the test statistic S divided by T (the number of SNPs) ("S/T") for various copy number hypotheses, for a depth of read (DOR) of 500 and a tumor fraction of 1%, for an increasing number of single nucleotide polymorphisms (SNPs).
Figures 2A-2D show the distribution of S/T for various copy number hypotheses, for a DOR of 500 and a tumor fraction of 2%, for an increasing number of SNPs.
Figures 3A-3D show the distribution of S/T for various copy number hypotheses, for a DOR of 500 and a tumor fraction of 3%, for an increasing number of SNPs.
Figures 4A-4D show the distribution of S/T for various copy number hypotheses, for a DOR of 500 and a tumor fraction of 4%, for an increasing number of SNPs.
Figures 5A-5D show the distribution of S/T for various copy number hypotheses, for a DOR of 500 and a tumor fraction of 5%, for an increasing number of SNPs.
Figures 6A-6D show the distribution of S/T for various copy number hypotheses, for a DOR of 500 and a tumor fraction of 6%, for an increasing number of SNPs.
Figures 7A-7D show the distribution of S/T for various copy number hypotheses, for a DOR of 1000 and a tumor fraction of 0.5%, for an increasing number of SNPs.
Figures 8A-8D show the distribution of S/T for various copy number hypotheses, for a DOR of 1000 and a tumor fraction of 1%, for an increasing number of SNPs.
Figures 9A-9D show the distribution of S/T for various copy number hypotheses, for a DOR of 1000 and a tumor fraction of 2%, for an increasing number of SNPs.
Figures 10A-10D show the distribution of S/T for various copy number hypotheses, for a DOR of 1000 and a tumor fraction of 3%, for an increasing number of SNPs.
Figures 11A-11D show the distribution of S/T for various copy number hypotheses, for a DOR of 1000 and a tumor fraction of 4%, for an increasing number of SNPs.
Figures 12A-12D show the distribution of S/T for various copy number hypotheses, for a DOR of 3000 and a tumor fraction of 0.5%, for an increasing number of SNPs.
Figures 13A-13D show the distribution of S/T for various copy number hypotheses, for a DOR of 3000 and a tumor fraction of 1%, for an increasing number of SNPs.
Figure 14 is a table indicating the sensitivity and specificity for the detection of six microdeletion syndromes.
Figures 15A-15C are graphical representations of euploidy. The x-axis represents the linear position of the individual polymorphic loci along the chromosome, and the y-axis represents the number of A allele reads as a fraction of the total (A+B) allele reads. Maternal and fetal genotypes are indicated to the right of the plots. The plots are color-coded according to maternal genotype, such that red indicates a maternal genotype of AA, blue indicates a maternal genotype of BB, and green indicates a maternal genotype of AB. Figure 15A is a plot generated when two chromosomes are present and the fetal cfDNA fraction is 0%. This plot is from a non-pregnant woman and therefore represents the pattern when the genotype is entirely maternal. Allele clusters are thus centered around 1 (AA alleles), 0.5 (AB alleles), and 0 (BB alleles). Figure 15B is a plot generated when two chromosomes are present and the fetal fraction is 12%. The contribution of fetal alleles to the fraction of A allele reads shifts the position of some allele spots up or down along the y-axis. Figure 15C is a plot generated when two chromosomes are present and the fetal fraction is 26%. The pattern, including two red and two blue peripheral bands and three central green bands, is readily apparent.
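The band pattern described for Figures 15A-15C can be reproduced with a simple mixture calculation. The sketch below assumes a disomic locus at which the A allele fraction is the weighted average of the maternal and fetal genotypes; this linear mixing model and the function names are assumptions made for illustration, not the algorithm of the disclosure.

```python
def expected_a_fraction(maternal_a, fetal_a, fetal_fraction):
    """Expected fraction of A reads at a disomic locus.

    maternal_a and fetal_a are the number of A alleles (0, 1, or 2) in the
    maternal and fetal genotypes; fetal_fraction is the share of cfDNA that
    is of fetal origin.
    """
    return ((1 - fetal_fraction) * maternal_a + fetal_fraction * fetal_a) / 2


def band_positions(fetal_fraction):
    """Expected y-axis positions of the bands for each maternal genotype."""
    bands = {}
    for label, maternal_a in (("AA", 2), ("AB", 1), ("BB", 0)):
        # The fetus inherits one maternal allele, so its A allele count can
        # differ from the maternal count by at most one.
        possible_fetal = [a for a in (0, 1, 2) if abs(a - maternal_a) <= 1]
        bands[label] = sorted(
            expected_a_fraction(maternal_a, a, fetal_fraction)
            for a in possible_fetal
        )
    return bands


bands_12 = band_positions(0.12)  # fetal fraction as in Figure 15B
bands_0 = band_positions(0.0)    # non-pregnant case as in Figure 15A
```

Under this model a 12% fetal fraction gives two bands for maternal AA (0.94 and 1.0), three for maternal AB (0.44, 0.5, and 0.56), and two for maternal BB (0 and 0.06), while a fetal fraction of 0% collapses the bands onto 1, 0.5, and 0.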
Figures 16A and 16B are graphical representations of 22q11.2 deletion syndrome. Figure 16A shows a maternal 22q11.2 deletion carrier (as indicated by the absence of the green AB SNPs). Figure 16B shows a paternally inherited 22q11.2 deletion in a fetus (as indicated by the presence of one red and one blue peripheral band). The x-axis represents the linear position of the SNPs, and the y-axis indicates the fraction of A allele reads out of the total reads. Each spot represents a single SNP locus.
Figure 17 is a graphical representation of a maternally inherited Cri-du-Chat deletion syndrome (as indicated by the presence of two central green bands instead of three green bands). The x-axis represents the linear position of the SNPs, and the y-axis indicates the fraction of A allele reads out of the total reads. Each spot represents a single SNP locus.
Figure 18 is a graphical representation of a paternally inherited Wolf-Hirschhorn deletion syndrome (as indicated by the presence of one red and one blue peripheral band). The x-axis represents the linear position of the SNPs, and the y-axis indicates the fraction of A allele reads out of the total reads. Each spot represents a single SNP locus.
Figures 19A-19D are graphical representations of X chromosome spike-in experiments used to represent an extra copy of a chromosome or chromosome segment. The plots show different amounts of DNA from a father mixed with DNA from a daughter: 16% father DNA (Figure 19A), 10% father DNA (Figure 19B), 1% father DNA (Figure 19C), and 0.1% father DNA (Figure 19D). The x-axis represents the linear position of the SNPs on the X chromosome, and the y-axis indicates the fraction of M allele reads out of the total reads (M+R). Each spot represents a single SNP locus with allele M or R.
Figures 20A and 20B are graphs of the false negative rate using haplotype data (Figure 20A) and without haplotype data (Figure 20B).
Figures 21A and 21B are graphs of the false positive rate for p=1% using haplotype data (Figure 21A) and without haplotype data (Figure 21B).
Figures 22A and 22B are graphs of the false positive rate for p=15% using haplotype data (Figure 22A) and without haplotype data (Figure 22B).
Figures 23A and 23B are graphs of the false negative rate for p=2% using haplotype data (Figure 23A) and without haplotype data (Figure 23B).
Figures 24A and 24B are graphs of the false positive rate for p=2.5% using haplotype data (Figure 24A) and without haplotype data (Figure 24B).
Figures 25A and 25B are graphs of the false positive rate for p=3% using haplotype data (Figure 25A) and without haplotype data (Figure 25B).
Figure 26 is a table of false positive rates for the first simulation.
Figure 27 is a table of false negative rates for the first simulation.
Figure 28A is a graph of reference counts (counts of one allele, such as the "A" allele) divided by total counts for loci in a normal (noncancerous) cell line.
Figure 28B is a graph of reference counts divided by total counts for a cancer cell line with a deletion. Figure 28C is a graph of reference counts divided by total counts for a mixture of DNA from the normal cell line and the cancer cell line.
Figure 29 is a graph of reference counts divided by total counts for a plasma sample from a patient with stage IIa breast cancer, with a tumor fraction estimated to be 4.33% (that is, 4.33% of the DNA is from tumor cells). The green portion of the graph represents regions in which no CNV is present. The blue and red portions of the graph represent a region in which a CNV is present and in which there is a visible separation of the measured allele ratios from the expected allele ratio of 0.5. The blue coloring indicates one haplotype, and the red coloring indicates the other haplotype. Approximately 636 heterozygous SNPs were analyzed in the region of the CNV.
Figure 30 is a graph of reference counts divided by total counts for a plasma sample from a patient with stage IIb breast cancer, with a tumor fraction estimated to be 0.58%. The green portion of the graph represents regions in which no CNV is present. The blue and red portions of the graph represent a region in which a CNV is present but in which there is no clearly visible separation of the measured allele ratios from the expected allele ratio of 0.5. For this analysis, 86 heterozygous SNPs were analyzed in the region of the CNV.
Figures 31A and 31B show the maximum likelihood estimate of the tumor fraction. The maximum likelihood estimate is indicated by the peak of each graph: 4.33% for Figure 31A and 0.58% for Figure 31B.
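To illustrate how a likelihood curve of this kind can have a peak at the estimated tumor fraction, the sketch below performs a grid search over candidate tumor fractions. It assumes a simplified model that is not taken from the disclosure: a single-copy deletion of one haplotype in the tumor, phased heterozygous SNPs, and binomially distributed read counts for the retained allele. All names and the example counts are hypothetical.

```python
import math


def retained_allele_fraction(tumor_fraction):
    """Expected read fraction of the retained haplotype when the tumor has
    lost one copy of the other haplotype (assumed model)."""
    return 1.0 / (2.0 - tumor_fraction)


def log_likelihood(counts, tumor_fraction):
    """Binomial log-likelihood of (retained_reads, total_reads) pairs.
    The binomial coefficient is omitted because it does not depend on the
    tumor fraction."""
    p = retained_allele_fraction(tumor_fraction)
    return sum(
        k * math.log(p) + (n - k) * math.log(1.0 - p) for k, n in counts
    )


def mle_tumor_fraction(counts, grid_step=0.001, max_fraction=0.5):
    """Return the grid value that maximizes the log-likelihood."""
    steps = int(round(max_fraction / grid_step))
    grid = [i * grid_step for i in range(steps + 1)]
    return max(grid, key=lambda f: log_likelihood(counts, f))


# Hypothetical data: 50 heterozygous SNPs, each with 5,102 reads of the
# retained allele out of 10,000, which matches a tumor fraction near 4%.
example_counts = [(5102, 10000)] * 50
estimate = mle_tumor_fraction(example_counts)
```

A grid search is used here only for transparency; any one-dimensional optimizer would serve, and a real analysis would also need to handle sequencing error and unphased data.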
Figure 32A is a graph comparing the log of the odds ratio for various possible tumor fractions for the high tumor fraction sample (4.33%) and the low tumor fraction sample (0.58%). If the log odds ratio is less than 0, the euploid hypothesis is more likely. If the log odds ratio is greater than 0, the presence of a CNV is more likely.
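The sign convention for the log odds ratio can be illustrated with a small calculation that compares a CNV hypothesis with the euploid hypothesis. As a stated assumption, the sketch uses a single-copy deletion model with binomial read counts on phased heterozygous SNPs; the function names and counts are hypothetical and are not the statistic used in the disclosure.

```python
import math


def binomial_log_likelihood(k, n, p):
    """Log-likelihood of k reads out of n at success probability p,
    without the binomial coefficient (it cancels in the ratio)."""
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def log_odds_cnv(counts, tumor_fraction):
    """log P(data | deletion at tumor_fraction) - log P(data | euploid).

    Under the assumed deletion model the retained allele is expected at a
    read fraction of 1 / (2 - tumor_fraction); under euploidy it is 0.5.
    """
    p_cnv = 1.0 / (2.0 - tumor_fraction)
    ll_cnv = sum(binomial_log_likelihood(k, n, p_cnv) for k, n in counts)
    ll_euploid = sum(binomial_log_likelihood(k, n, 0.5) for k, n in counts)
    return ll_cnv - ll_euploid


# Hypothetical counts with a visible allelic imbalance (51.02% of reads).
imbalanced = [(5102, 10000)] * 50
# Hypothetical counts with perfectly balanced alleles.
balanced = [(5000, 10000)] * 50

score_imbalanced = log_odds_cnv(imbalanced, 0.04)
score_balanced = log_odds_cnv(balanced, 0.04)
```

With these counts the imbalanced data give a log odds ratio above 0 (a CNV is more likely) and the balanced data give a value below 0 (the euploid hypothesis is more likely).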
Figure 32B is a graph of the probability of a deletion divided by the probability of no deletion for various possible tumor fractions for the low tumor fraction sample (0.58%).
Figure 33 is a graph of the log of the odds ratio for various possible tumor fractions for the low tumor fraction sample (0.58%). Figure 33 is an enlarged version of the graph in Figure 32 for the low tumor fraction sample.
Figure 34 shows the limit of detection for single nucleotide variants in a tumor biopsy using the three different methods described in Example 6.
Figure 35 shows the limit of detection for single nucleotide variants in a plasma sample using the three different methods described in Example 6.
Figures 36A and 36B are graphs of the analysis of genomic DNA (Figure 36A) or DNA from a single cell (Figure 36B) using a library of approximately 28,000 primers designed to detect CNVs. The presence of two central bands instead of one central band indicates the presence of a CNV. The x-axis represents the linear position of the SNPs, and the y-axis indicates the fraction of A allele reads out of the total reads.
Figures 37A and 37B are graphs of the analysis of genomic DNA (Figure 37A) or DNA from a single cell (Figure 37B) using a library of approximately 3,000 primers designed to detect CNVs. The presence of two central bands instead of one central band indicates the presence of a CNV. The x-axis represents the linear position of the SNPs, and the y-axis indicates the fraction of A allele reads out of the total reads.
Figure 38 shows the uniformity of the depth of read (DOR) for these approximately 3,000 loci.
Figure 39 is a table comparing error call metrics for genomic DNA and DNA from a single cell.
Figure 40 is a graph of the error rates for transition mutations and transversion mutations.
Figures 41a-d are graphs of the sensitivity of CoNVERGe determined with PlasmArts. (a) Correlation between the AAI calculated by CoNVERGe and the actual input fraction in PlasmArt samples with DNA from a 22q11.2 deletion and matched normal cell lines. (b) Correlation between the calculated AAI and the actual tumor DNA input in PlasmArt samples with DNA from HCC2218 breast cancer cells with chromosome 2p and 2q CNVs and matched normal HCC2218BL cells (0-9.09% tumor DNA fractions). (c) Correlation between the calculated AAI and the actual tumor DNA input in PlasmArt samples with DNA from HCC1954 breast cancer cells with chromosome 1p and 1q CNVs and matched normal HCC1954BL cells (0-5.66% tumor DNA fractions). (d) Allele frequency plot for the HCC1954 cells used in (c). In (a), (b), and (c), data points and error bars indicate the mean and standard deviation (SD), respectively, of 3-8 replicates.
Figure 42 provides details of an exemplary PlasmArt standard, including a graph of the fragment size distribution in the lower portion.
Figure 43 provides dilution curve results for PlasmArt synthetic ctDNA standards, used to validate microdeletion and cancer panels. In Figure 43A, the right panel shows the maximum likelihood of tumor DNA fraction estimation results as an odds ratio plot. Figure 43B is a plot for the detection of transversion events. Figure 43C is a plot for the detection of transition events.
Figure 44 is a plot showing CNVs for various chromosomal regions, as indicated, for different samples at different % ctDNA.
Figure 45 is a plot showing CNVs for various chromosomal regions for various ovarian cancer samples at different % ctDNA levels.
Figure 46 is a table showing the percentage of breast cancer or lung cancer patients with an SNV, or with a combined SNV and/or CNV, in ctDNA.
Figure 47 is a graph of the % of samples at different stages of breast cancer that have a tumor-specific SNV and/or CNV in plasma, with the associated data table on the right.
Figure 48 is a graph of the % of samples at different substages of breast cancer that have a tumor-specific SNV and/or CNV in plasma, with the associated data table on the right.
Figure 49 is a graph of the % of samples at different stages of lung cancer that have a tumor-specific SNV and/or CNV in plasma, with the associated data table on the right.
Figure 50 is a graph of the % of samples at different substages of breast cancer that have a tumor-specific SNV and/or CNV in plasma, with the associated data table on the right.
Figure 51A is a figure indicating the histological findings/history of primary lung tumors analyzed for clonal and subclonal tumor heterogeneity. Figure 51B is a table of the VAF identities of the biopsied lung tumors, determined by whole genome sequencing and AmpliSeq.
Figure 52 illustrates the use of ctDNA from plasma to identify clonal and subclonal SNV mutations in order to overcome tumor heterogeneity.
Figure 53 is a table comparing the VAF calls of AmpliSeq and mmPCR-NGS for the detection of SNVs in primary tumors, including SNV mutations that were missed by AmpliSeq and identified in plasma ctDNA.
Figure 54A is a graph of % VAF in primary lung tumors. Figure 54B is a linear regression plot of AmpliSeq VAF versus Natera VAF.
Figure 55 is a plot of library 1 of 4 of an 84-plex SNV PCR primer reaction in which the primer concentration is limiting.
Figure 56 is a plot of library 2 of 4 of an 84-plex SNV PCR primer reaction in which the primer concentration is limiting.
Figure 57 is a plot of library 3 of 4 of an 84-plex SNV PCR primer reaction in which the primer concentration is limiting.
Figure 58 is a plot of library 4 of 4 of an 84-plex SNV PCR primer reaction in which the primer concentration is limiting.
Figure 59 is a plot of limit of detection (LOD) versus depth of read (DOR) for detecting SNV transition and transversion mutations in an 84-plex PCR reaction with 15 PCR cycles.
Figure 60 is a plot of limit of detection (LOD) versus depth of read (DOR) for detecting SNV transition and transversion mutations in an 84-plex PCR reaction with 20 PCR cycles.
Figure 61 is a plot of limit of detection (LOD) versus depth of read (DOR) for detecting SNV transition and transversion mutations in an 84-plex PCR reaction with 25 PCR cycles.
Figure 62 is a plot illustrating comparable sensitivity between tumor and single-cell genomic DNA. The upper panel shows the results obtained with tumor cell genomic DNA. The lower panel shows the results obtained with single-cell genomic DNA.
Figure 63 illustrates the workflow (Figure 63a) for analyzing CNVs in a variety of cancer sample types using a massively multiplexed PCR (mmPCR) assay targeting SNPs. Figures 63b-f compare the CoNVERGe assay with microarray assays in breast cancer cell lines and matched normal cell lines.
Figure 64 provides a comparison of fresh frozen (FF) and FFPE (formalin-fixed, paraffin-embedded) breast cancer samples with matched controls. Panels a-h compare the CoNVERGe assay with microarray assays in breast cancer cell lines and matched buffy coat gDNA control samples.
Figure 65 illustrates allele frequency plots that reflect chromosome copy number, using the CoNVERGe assay to detect CNVs in single cells. Figures 65a-c are analyses of three single-cell replicates from breast cancer. Figure 65d is an analysis of a B-lymphocyte cell line that lacks CNVs in the target regions.
Figure 66 illustrates allele frequency plots that reflect chromosome copy number, using the CoNVERGe assay to detect CNVs in real plasma samples. Figure 66a is a stage II breast cancer plasma cfDNA sample and its matched tumor biopsy gDNA. Figure 66b is a late-stage ovarian cancer plasma cfDNA sample and its matched tumor biopsy gDNA. Figure 66c is a chart illustrating tumor heterogeneity, as determined by CNV detection, in five late-stage ovarian cancer plasma samples and matched tissue samples.
Figure 67 illustrates chromosome positions and mutational changes in breast cancer.
Figure 68 illustrates the major allele (Figure 68A) and minor allele (Figure 68B) frequencies of SNPs for a 3,168-plex mmPCR reaction.
Figure 69 shows an example system X00 for carrying out embodiments of the invention.
Figure 70 illustrates an example computer system for carrying out embodiments of the invention. While the above drawings set forth presently disclosed embodiments, other embodiments are also contemplated, as noted in the discussion. This disclosure presents illustrative embodiments by way of representation and not limitation. Numerous other modifications and embodiments can be devised by those skilled in the art that fall within the scope and spirit of the principles of the presently disclosed embodiments.
Detailed description of the invention
In one aspect, the present invention generally relates, at least in part, to improved methods for determining the presence or absence of copy number variations, such as deletions or duplications of chromosome segments or entire chromosomes. The methods are particularly useful for detecting small deletions or duplications, which are difficult to detect with high specificity and sensitivity using existing methods because little data is available from the relevant chromosome segment. The methods include improved analytical methods, improved bioassay methods, and combinations of improved analytical and bioassay methods. The methods of the invention can also be used to detect deletions or duplications that are present in only a small percentage of the cells or nucleic acid molecules tested. This allows deletions or duplications to be detected before disease occurs (such as at a precancerous stage) or in the early stages of disease, such as before a large number of diseased cells (such as cancer cells) carrying the deletion or duplication have accumulated. More accurate detection of deletions or duplications associated with a disease or disorder improves methods for diagnosing, prognosticating, preventing, delaying, stabilizing, or treating the disease or disorder. Several deletions or duplications are known to be associated with cancer or with severe mental or physical handicaps.
In another aspect, the present invention generally relates, at least in part, to improved methods for detecting single nucleotide variations (SNVs). These improved methods include improved analytical methods, improved bioassay methods, and improved methods that combine improved analytical and bioassay methods. In certain illustrative embodiments, the methods are used to detect, diagnose, monitor, or stage cancer, for example in samples in which the SNV is present at a very low concentration, such as less than 10%, 5%, 4%, 3%, 2.5%, 2%, 1%, 0.5%, 0.25%, or 0.1% relative to the total number of normal copies of the SNV locus, as in circulating free DNA samples. That is, in certain illustrative embodiments these methods are particularly well suited to samples in which a mutation or variant is present at a relatively low percentage compared with the normal polymorphic alleles at that locus. Finally, the methods provided herein include methods that combine the improved methods for detecting copy number variations with the improved methods for detecting single nucleotide variations.
Successful treatment of a disease such as cancer often relies on early diagnosis, correct staging of the disease, selection of an effective therapeutic regimen, and close monitoring to prevent or detect relapse. For cancer diagnosis, histological evaluation of tumor material obtained from a tissue biopsy is generally considered the most reliable method. However, the invasive nature of biopsy-based sampling makes it unsuitable for large-scale screening and regular follow-up. The present methods therefore have the advantage that they can be performed non-invasively, at relatively low cost and with a fast turnaround time if desired. The methods of the invention can use targeted sequencing, which requires fewer reads than shotgun sequencing, such as a few million reads instead of 40 million reads, thereby reducing cost. Multiplex PCR and next-generation sequencing can be used to increase throughput and reduce cost.
In some embodiments, the methods are used to detect a deletion, duplication, or single nucleotide variant in an individual. A sample from the individual that contains cells or nucleic acids suspected of having a deletion, duplication, or single nucleotide variant may be analyzed. In some embodiments, the sample is from a tissue or organ suspected of having a deletion, duplication, or single nucleotide variant, such as cells or a mass suspected of being cancerous. The methods of the invention can be used to detect a deletion, duplication, or single nucleotide variant that is present in only one cell or a small number of cells in a mixture containing cells that have the deletion, duplication, or single nucleotide variant and cells that do not. In some embodiments, cfDNA or cfRNA from a blood sample from the individual is analyzed. In some embodiments, the cfDNA or cfRNA is secreted by cells, such as cancer cells. In some embodiments, the cfDNA or cfRNA is released by cells undergoing necrosis or apoptosis, such as cancer cells. The methods of the invention can be used to detect a deletion, duplication, or single nucleotide variant that is present in only a small percentage of the cfDNA or cfRNA. In some embodiments, one or more cells from an embryo are tested.
In some embodiments, the methods are used for non-invasive or invasive prenatal testing of a fetus. These methods can be used to determine the presence or absence of deletions or duplications of a chromosome segment or whole chromosome, such as deletions or duplications known to be associated with severe mental or physical handicaps, learning disabilities, or cancer. In some embodiments for non-invasive prenatal testing (NIPT), cells, cfDNA, or cfRNA from a blood sample from the pregnant mother are tested. The methods allow a deletion or duplication in the cells, cfDNA, or cfRNA from the fetus to be detected even though a large amount of cells, cfDNA, or cfRNA from the mother is also present. In some embodiments for invasive prenatal testing, DNA or RNA in a sample from the fetus (such as a CVS or amniocentesis sample) is tested. Even if the sample is contaminated with DNA or RNA from the pregnant mother, the methods can be used to detect a deletion or duplication in the fetal DNA or RNA.
In addition to determining the presence or absence of copy number variation, one or more other factors can be analyzed if desired. These factors can be used to increase the accuracy of the diagnosis (such as determining the presence or absence of cancer or an increased risk of cancer, classifying the cancer, or staging the cancer) or of the prognosis. These factors can also be used to select a particular therapy or treatment regimen that is likely to be effective in the subject. Exemplary factors include the presence or absence of polymorphisms or mutations; altered (increased or decreased) levels of total or particular cfDNA, cfRNA, or microRNA (miRNA); altered (increased or decreased) tumor fraction; altered (increased or decreased) methylation levels; altered (increased or decreased) DNA integrity; and altered or alternative mRNA splicing.
The following sections describe methods for detecting deletions or duplications using phased data (such as inferred or measured phased data) or unphased data; samples that can be tested; methods for sample preparation, amplification, and quantification; methods for phasing genetic data; polymorphisms, mutations, nucleic acid alterations, mRNA splicing alterations, and changes in nucleic acid levels that can be detected; databases with results from the methods; other risk factors and screening methods; cancers that can be diagnosed or treated; cancer treatments; cancer models for testing treatments; and methods for formulating and administering treatments.
Exemplary methods for determining ploidy using phased data
Some of the methods of the invention are based in part on the discovery that using phased data to detect CNVs decreases the false negative and false positive rates compared with using unphased data (Figures 20A-27). This improvement is greatest for samples in which the CNV is present at low levels. Phased data therefore increases the accuracy of CNV detection compared with unphased data (such as methods that calculate allele ratios at one or more loci, or aggregate allele ratios to give an aggregated value (such as an average value) over a chromosome or chromosome segment, without considering whether the allele ratios at different loci indicate that the same or different haplotypes appear to be present in an abnormal amount). Using phased data allows a more accurate determination of whether differences between measured and expected allele ratios are due to noise or to the presence of a CNV. For example, if the differences between measured and expected allele ratios at most or all of the loci in a region indicate that the same haplotype is overrepresented, then a CNV is more likely to be present. Using the linkage between alleles in a haplotype makes it possible to determine whether the measured genetic data is consistent with the same haplotype being overrepresented (rather than with random noise). In contrast, if the differences between measured and expected allele ratios are due only to noise (such as experimental error), then in some embodiments the first haplotype appears to be overrepresented about half of the time and the second haplotype appears to be overrepresented about the other half of the time.
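The contrast between a consistently overrepresented haplotype and random noise can be illustrated with a simple sign test. The following is only a minimal sketch under stated assumptions, not the method of the invention: it assumes each heterozygous locus has already been phased, that `observed` holds the measured fraction of reads carrying the first-haplotype allele, and that deviations at different loci are independent. The function name `haplotype_sign_test` is hypothetical.

```python
from math import comb


def haplotype_sign_test(observed, expected):
    """Count loci at which the first haplotype looks overrepresented.

    observed / expected: per-locus fraction of reads carrying the allele
    that lies on the first haplotype (phased data is assumed).
    Returns (k, n, p): k loci above expectation out of n informative loci,
    and a two-sided binomial p-value under the noise-only model in which
    each haplotype looks overrepresented half of the time.
    """
    diffs = [o - e for o, e in zip(observed, expected) if o != e]
    n = len(diffs)
    k = sum(1 for d in diffs if d > 0)
    k_extreme = max(k, n - k)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k_extreme, n + 1)) / 2 ** n
    return k, n, min(1.0, 2 * tail)


# Ten loci all deviating toward the first haplotype: unlikely to be noise.
print(haplotype_sign_test([0.55] * 10, [0.5] * 10))
# Deviations alternating between haplotypes: consistent with noise.
print(haplotype_sign_test([0.55, 0.45] * 5, [0.5] * 10))
```

Without phasing, the direction of each deviation cannot be assigned to a haplotype, so only the magnitudes would be available and this consistency check could not be made.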
Accuracy can be increased by taking into account the linkage between SNPs and the likelihood of crossovers having occurred during the meiosis that gave rise to the gametes that formed the embryo that grew into the fetus. When the expected distribution of allele measurements is created for one or more hypotheses, an expected distribution created using linkage corresponds to reality better than one created without linkage. For example, suppose that there are two nearby SNPs, 1 and 2, and that the mother is A at SNP 1 and A at SNP 2 on one homolog, and B at SNP 1 and B at SNP 2 on the second homolog. If the father is A at both SNPs on both homologs, and a B is measured for the fetus at SNP 1, this indicates that the second homolog has been inherited by the fetus, and therefore that there is a much higher likelihood of a B being present in the fetus at SNP 2. A model that takes linkage into account can predict this, whereas a model that does not take linkage into account cannot. Alternatively, if the mother is AB at SNP 1 and AB at the nearby SNP 2, then two hypotheses corresponding to maternal trisomy at that location can be used: one involving a matching copy error (nondisjunction in meiosis II or in mitosis during early fetal development), and one involving an unmatching copy error (nondisjunction in meiosis I). In the case of a matching copy error trisomy, if the fetus inherited AA from the mother at SNP 1, then the fetus is much more likely to inherit either AA or BB from the mother at SNP 2 than AB. In the case of an unmatching copy error, the fetus inherits AB from the mother at both SNPs. The allele distribution hypotheses made by a CNV calling method that takes linkage into account can make these predictions, and therefore correspond to the actual allele measurements to a considerably greater extent than those of a CNV calling method that does not take linkage into account.
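The two-SNP example above can be worked through numerically. This is a minimal sketch under stated assumptions: the function name `maternal_allele_probs_at_snp2` and the parameter `recomb_prob` (the probability of a crossover between the two SNPs) are hypothetical, each maternal homolog is assumed to be transmitted with prior probability 0.5, and the maternally inherited allele at SNP 1 is assumed to be known (as it is when the father is AA and a B is seen in the fetus).

```python
def maternal_allele_probs_at_snp2(hap1, hap2, maternal_allele_snp1, recomb_prob):
    """Probability of each maternally inherited allele at SNP 2.

    hap1, hap2: (allele at SNP 1, allele at SNP 2) on each maternal homolog.
    maternal_allele_snp1: allele the fetus inherited from the mother at SNP 1.
    recomb_prob: assumed probability of a crossover between the two SNPs
                 (0.5 corresponds to ignoring linkage).
    """
    # Posterior over which homolog was transmitted at SNP 1.
    w1 = 0.5 if hap1[0] == maternal_allele_snp1 else 0.0
    w2 = 0.5 if hap2[0] == maternal_allele_snp1 else 0.0
    total = w1 + w2
    if total == 0:
        raise ValueError("allele at SNP 1 is not carried by either homolog")
    p1, p2 = w1 / total, w2 / total
    # Probability that each homolog is the one transmitted at SNP 2.
    q1 = p1 * (1 - recomb_prob) + p2 * recomb_prob
    q2 = p2 * (1 - recomb_prob) + p1 * recomb_prob
    probs = {}
    probs[hap1[1]] = probs.get(hap1[1], 0.0) + q1
    probs[hap2[1]] = probs.get(hap2[1], 0.0) + q2
    return probs


# Mother: homolog 1 = A,A and homolog 2 = B,B; fetus carries a maternal B at SNP 1.
linked = maternal_allele_probs_at_snp2(("A", "A"), ("B", "B"), "B", 0.01)
unlinked = maternal_allele_probs_at_snp2(("A", "A"), ("B", "B"), "B", 0.5)
print(linked, unlinked)
```

With an assumed 1% crossover probability the model strongly expects a B at SNP 2, whereas the model that ignores linkage assigns A and B equal probability.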
In some embodiments, phased genetic data is used to determine whether there is an overrepresentation of the number of copies of a first homologous chromosome segment as compared to a second homologous chromosome segment in the genome of an individual (such as in the genome of one or more cells, or in cfDNA or cfRNA). Exemplary overrepresentations include a duplication of the first homologous chromosome segment or a deletion of the second homologous chromosome segment. In some embodiments, there is no overrepresentation because the first and second homologous chromosome segments are present in equal proportions (such as one copy of each segment in a diploid sample). In some embodiments, allele ratios calculated for a nucleic acid sample are compared to expected allele ratios to determine whether there is an overrepresentation, as described further below. In this specification, the phrase "a first homologous chromosome segment as compared to a second homologous chromosome segment" refers to a first homolog of a chromosome segment and a second homolog of the chromosome segment.
In some embodiments, the method includes obtaining phased genetic data for the first homologous chromosome segment, comprising the identity of the allele present at that locus on the first homologous chromosome segment for each locus in a set of polymorphic loci on the first homologous chromosome segment; obtaining phased genetic data for the second homologous chromosome segment, comprising the identity of the allele present at that locus on the second homologous chromosome segment for each locus in the set of polymorphic loci on the second homologous chromosome segment; and obtaining measured genetic allelic data comprising, for each of the alleles at each of the loci in the set of polymorphic loci, the amount of each allele present in a sample of DNA or RNA from one or more target cells and one or more non-target cells from the individual. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the degree of overrepresentation of the first homologous chromosome segment; calculating, for each of the hypotheses, expected genetic data for the plurality of loci in the sample from the obtained phased genetic data, for one or more possible ratios of DNA or RNA from the one or more target cells to the total DNA or RNA in the sample; calculating (such as on a computer), for each possible ratio of DNA or RNA and for each hypothesis, the data fit between the obtained genetic data of the sample and the expected genetic data for the sample for that possible ratio of DNA or RNA and that hypothesis; ranking one or more of the hypotheses according to the data fit; and selecting the hypothesis that is ranked the highest, thereby determining the degree of overrepresentation of the number of copies of the first homologous chromosome segment in the genome of one or more cells from the individual.
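The enumerate, fit, rank, and select steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the claimed implementation: the three hypotheses in `HYPOTHESES`, the grid of candidate target-DNA fractions, and the use of an independent binomial likelihood per locus as the "data fit" are all choices made here for the example, and all function names are hypothetical. Read counts are assumed to be for the allele lying on the first homolog at heterozygous loci.

```python
from math import lgamma, log

# Hypothetical hypothesis set: copies of (first, second) homolog in target cells.
HYPOTHESES = {
    "no overrepresentation": (1, 1),
    "H1 duplication": (2, 1),
    "H2 duplication": (1, 2),
}


def expected_h1_fraction(n1, n2, target_fraction):
    """Expected fraction of reads from homolog 1 in a target/non-target mixture."""
    f = target_fraction
    return ((1 - f) + f * n1) / (2 * (1 - f) + f * (n1 + n2))


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    p = min(max(p, 1e-9), 1 - 1e-9)
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))


def rank_hypotheses(h1_reads, total_reads, fractions=(0.01, 0.02, 0.05, 0.1, 0.2)):
    """Return (log-likelihood, hypothesis, best fraction) tuples, best fit first."""
    ranked = []
    for name, (n1, n2) in HYPOTHESES.items():
        best_fit, best_fraction = max(
            (sum(log_binom_pmf(k, n, expected_h1_fraction(n1, n2, f))
                 for k, n in zip(h1_reads, total_reads)), f)
            for f in fractions
        )
        ranked.append((best_fit, name, best_fraction))
    ranked.sort(reverse=True)
    return ranked


# Twenty loci, each with 5,238 of 10,000 reads from the first homolog.
ranking = rank_hypotheses([5238] * 20, [10000] * 20)
print(ranking[0][1], ranking[0][2])
```

In this toy input the read fraction matches a duplication of the first homolog at a 10% target-DNA fraction, so that hypothesis and fraction rank first.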
In one aspect, the invention features a method for determining the number of copies of a chromosome or chromosome segment in a fetus. In some embodiments, the method includes obtaining phased genetic data for at least one biological parent of the fetus, wherein the phased genetic data comprises the identity of the allele present at each locus in a set of polymorphic loci on a first homologous chromosome segment and a second homologous chromosome segment of the parent. In some embodiments, the method includes obtaining genetic data at the set of polymorphic loci on the chromosome or chromosome segment in a mixed sample of DNA or RNA, the mixed sample comprising fetal DNA or RNA and maternal DNA or RNA from the mother of the fetus, by measuring the quantity of each allele at each locus. In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the number of copies of the chromosome or chromosome segment of interest present in the genome of the fetus. In some embodiments, the method includes creating (such as on a computer), for each of the hypotheses, a probability distribution of the expected quantity of each allele at each of the plurality of loci in the mixed sample from (i) the phased genetic data obtained from the parent(s), or (ii) the probability of one or more crossovers that may have occurred during the formation of a gamete that contributed a copy of the chromosome or chromosome segment of interest to the fetus; calculating (such as on a computer), for each of the hypotheses, the data fit between (1) the obtained genetic data of the mixed sample and (2) the probability distribution of the expected quantity of each allele at each of the plurality of loci in the mixed sample for that hypothesis; ranking one or more of the hypotheses according to the data fit; and selecting the hypothesis that is ranked the highest, thereby determining the number of copies of the chromosome segment of interest in the genome of the fetus.
In some embodiments, the method includes obtaining phased genetic data using any of the methods described herein or any known method. In some embodiments, the method includes simultaneously or sequentially, in any order, (i) obtaining phased genetic data for the first homologous chromosome segment, comprising the identity of the allele present at that locus on the first homologous chromosome segment for each locus in a set of polymorphic loci on the first homologous chromosome segment, (ii) obtaining phased genetic data for the second homologous chromosome segment, comprising the identity of the allele present at that locus on the second homologous chromosome segment for each locus in the set of polymorphic loci on the second homologous chromosome segment, and (iii) obtaining measured genetic allelic data comprising the amount of each allele at each of the loci in the set of polymorphic loci in a sample of DNA from one or more cells from the individual.
In some embodiments, the method includes calculating allele ratios for one or more loci in the set of polymorphic loci that are heterozygous in at least one cell from which the sample was derived (such as loci that are heterozygous in the fetus and/or heterozygous in the mother). In some embodiments, the calculated allele ratio for a particular locus is the measured quantity of one of the alleles divided by the total measured quantity of all the alleles for the locus. In some embodiments, the calculated allele ratio for a particular locus is the measured quantity of one of the alleles (such as the allele on the first homologous chromosome segment) divided by the measured quantity of one or more other alleles (such as the allele on the second homologous chromosome segment) for the locus. The calculated allele ratios may be calculated using any of the methods described herein or any standard method (such as any mathematical transformation of the calculated allele ratios described herein).
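The two definitions of the calculated allele ratio given above differ only in the denominator. A minimal sketch follows, assuming that the measured quantities are read counts held in a dictionary keyed by allele; the function names are hypothetical.

```python
def allele_ratio_of_total(counts, allele):
    """Measured quantity of one allele divided by the total for the locus."""
    return counts[allele] / sum(counts.values())


def allele_ratio_vs_others(counts, allele):
    """Measured quantity of one allele divided by that of the other allele(s)."""
    others = sum(v for a, v in counts.items() if a != allele)
    return counts[allele] / others


locus = {"A": 60, "B": 40}  # illustrative read counts at one biallelic locus
print(allele_ratio_of_total(locus, "A"))   # 0.6
print(allele_ratio_vs_others(locus, "A"))  # 1.5
```

The first form is bounded between 0 and 1 and has an expected value of 0.5 at a balanced biallelic locus, whereas the second form has an expected value of 1 and is undefined when the other alleles have no reads.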
In some embodiments, the method includes determining whether there is an overrepresentation of the number of copies of the first homologous chromosome segment by comparing one or more calculated allele ratios for a locus to the allele ratio that is expected for that locus if the first and second homologous chromosome segments are present in equal proportions. In some embodiments, the expected allele ratio assumes that the possible alleles at a locus have an equal likelihood of being present. In some embodiments in which the calculated allele ratio for a particular locus is the measured quantity of one of the alleles divided by the total measured quantity of all the alleles for the locus, the corresponding expected allele ratio is 0.5 for a biallelic locus, or 1/3 for a triallelic locus. In some embodiments, the expected allele ratio assumes that the possible alleles at a locus can have different likelihoods of being present, such as likelihoods based on the frequency of each allele in a particular population to which the subject belongs, such as a population based on the ancestry of the subject. Such allele frequencies are publicly available (see, e.g., the HapMap Project; the Perlegen Human Haplotype Project; the website ncbi.nlm.nih.gov/projects/SNP/; Sherry ST, Ward MH, Kholodov M, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001 Jan 1;29(1):308-11, each of which is incorporated by reference in its entirety). In some embodiments, the expected allele ratio is the allele ratio that is expected for the particular individual being tested under a particular hypothesis specifying the degree of overrepresentation of the first homologous chromosome segment. For example, the expected allele ratio for a particular individual may be determined based on phased or unphased genetic data from the individual (such as from a sample from the individual that is unlikely to have a deletion or duplication, such as a noncancerous sample) or on data from one or more relatives of the individual. In some embodiments for prenatal testing, the expected allele ratio is the allele ratio that is expected for a mixed sample that includes DNA or RNA from the pregnant mother and the fetus (such as a maternal plasma or serum sample that includes cfDNA from the mother and cfDNA from the fetus) under a particular hypothesis specifying the degree of overrepresentation of the first homologous chromosome segment. For example, the expected allele ratio for the mixed sample may be determined based on genetic data from the mother and predicted genetic data for the fetus (such as predictions of the alleles that the fetus may have inherited from the mother and/or father). In some embodiments, phased or unphased genetic data from a sample of DNA or RNA from only the mother (such as the buffy coat from a maternal blood sample) is used to determine the alleles from the maternal DNA or RNA in the mixed sample as well as the alleles that the fetus may have inherited from the mother (and that may therefore be present in the fetal DNA or RNA in the mixed sample). In some embodiments, phased or unphased genetic data from a sample of DNA or RNA from only the father is used to determine the alleles that the fetus may have inherited from the father (and that may therefore be present in the fetal DNA or RNA in the mixed sample). The expected allele ratios may be calculated using any of the methods described herein or any standard method (such as any mathematical transformation of the expected allele ratios described herein) (U.S. Publication No. 2012/0270212, filed November 18, 2011, which is incorporated by reference in its entirety).
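For a mixed maternal and fetal sample, the expected allele ratio at a locus under a given hypothesis can be written as a weighted combination of the two genotypes. The sketch below is an illustration under stated assumptions: `fetal_fraction` is taken to be the fraction of genome equivalents in the sample that come from the fetus, the mother is assumed to be disomic, and the function and parameter names are hypothetical.

```python
def expected_mixture_ratio(maternal_a, fetal_a, fetal_copies, fetal_fraction):
    """Expected fraction of allele A at one locus in a maternal/fetal mixture.

    maternal_a:     number of A alleles in the (disomic) maternal genotype, 0-2.
    fetal_a:        number of A alleles in the hypothesized fetal genotype.
    fetal_copies:   total fetal copies of the segment under the hypothesis.
    fetal_fraction: assumed fraction of genome equivalents that are fetal.
    """
    f = fetal_fraction
    a_amount = (1 - f) * maternal_a + f * fetal_a
    total_amount = (1 - f) * 2 + f * fetal_copies
    return a_amount / total_amount


# Mother AB, fetus AA, disomy, 10% fetal fraction.
print(expected_mixture_ratio(1, 2, 2, 0.10))
# Mother AB, fetus AAB, trisomy, 10% fetal fraction.
print(expected_mixture_ratio(1, 2, 3, 0.10))
```

Because the shift away from the maternal-only ratio scales with the fetal fraction, the expected ratios for different hypotheses move closer together as the fetal fraction falls.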
In some embodiments, a calculated allele ratio is indicative of an overrepresentation of the number of copies of the first homologous chromosome segment if either (i) the allele ratio (the measured quantity of the allele present at that locus on the first homologous chromosome segment divided by the total measured quantity of all the alleles for the locus) is greater than the expected allele ratio for that locus, or (ii) the allele ratio (the measured quantity of the allele present at that locus on the second homologous chromosome segment divided by the total measured quantity of all the alleles for the locus) is less than the expected allele ratio for that locus. In some embodiments, a calculated allele ratio is considered indicative of overrepresentation only if it is significantly greater or lower than the expected ratio for that locus. In some embodiments, a calculated allele ratio is indicative of no overrepresentation of the number of copies of the first homologous chromosome segment if either (i) the allele ratio (the measured quantity of the allele present at that locus on the first homologous chromosome segment divided by the total measured quantity of all the alleles for the locus) is less than or equal to the expected allele ratio for that locus, or (ii) the allele ratio (the measured quantity of the allele present at that locus on the second homologous chromosome segment divided by the total measured quantity of all the alleles for the locus) is greater than or equal to the expected allele ratio for that locus. In some embodiments, calculated ratios equal to the corresponding expected ratio are ignored (since they are indicative of no overrepresentation).
In various embodiments, one or more of the following methods is used to compare one or more of the calculated allele ratios to the corresponding expected allele ratio(s). In some embodiments, the method determines whether the calculated allele ratio is above or below the expected allele ratio for a particular locus, irrespective of the magnitude of the difference. In some embodiments, the method determines the magnitude of the difference between the calculated allele ratio and the expected allele ratio for a particular locus, irrespective of whether the calculated allele ratio is above or below the expected allele ratio. In some embodiments, the method determines both whether the calculated allele ratio is above or below the expected allele ratio and the magnitude of the difference for a particular locus. In some embodiments, the method determines whether the average or weighted average value of the calculated allele ratios is above or below the average or weighted average value of the expected allele ratios, irrespective of the magnitude of the difference. In some embodiments, the method determines the magnitude of the difference between the average or weighted average value of the calculated allele ratios and the average or weighted average value of the expected allele ratios, irrespective of whether the average or weighted average value of the calculated allele ratios is above or below that of the expected allele ratios. In some embodiments, the method determines both whether the average or weighted average value of the calculated allele ratios is above or below the average or weighted average value of the expected allele ratios and the magnitude of the difference. In some embodiments, the method determines an average or weighted average value of the magnitude of the difference between the calculated allele ratios and the expected allele ratios.
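Several of the comparisons listed above reduce to the sign and the magnitude of a (weighted) average difference. The following minimal sketch assumes per-locus weights such as read depth, which is a choice made for this example; the function names are hypothetical.

```python
def weighted_mean_difference(calculated, expected, weights=None):
    """Weighted average of (calculated - expected) allele ratios across loci."""
    if weights is None:
        weights = [1.0] * len(calculated)  # plain average
    total = sum(weights)
    return sum(w * (c - e) for c, e, w in zip(calculated, expected, weights)) / total


def compare_to_expected(calculated, expected, weights=None):
    """Return the direction ('above', 'below', 'equal') and the magnitude."""
    d = weighted_mean_difference(calculated, expected, weights)
    direction = "above" if d > 0 else "below" if d < 0 else "equal"
    return direction, abs(d)


# Two loci weighted by an assumed read depth of 100 and 300 reads.
print(compare_to_expected([0.55, 0.53], [0.50, 0.50], weights=[100, 300]))
```

Using only the first element of the result corresponds to a direction-only comparison, using only the second corresponds to a magnitude-only comparison, and using both corresponds to the combined comparison.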
In some embodiments, the magnitude of the difference between the calculated allele ratio and the expected allele ratio for one or more loci is used to determine whether the overrepresentation of the number of copies of the first homologous chromosome segment is due to a duplication of the first homologous chromosome segment or to a deletion of the second homologous chromosome segment in the genome of one or more of the cells.
In some embodiments, an overrepresentation of the number of copies of the first homologous chromosome segment is determined to be present if one or more of the following conditions is met. In some embodiments, the number of calculated allele ratios that are indicative of an overrepresentation of the number of copies of the first homologous chromosome segment is above a threshold value. In some embodiments, the number of calculated allele ratios that are indicative of no overrepresentation of the number of copies of the first homologous chromosome segment is below a threshold value. In some embodiments, the magnitude of the difference between the calculated allele ratios that are indicative of an overrepresentation of the number of copies of the first homologous chromosome segment and the corresponding expected allele ratios is above a threshold value. In some embodiments, for all calculated allele ratios that are indicative of overrepresentation, the sum of the magnitudes of the differences between each calculated allele ratio and the corresponding expected allele ratio is above a threshold value. In some embodiments, the magnitude of the difference between the calculated allele ratios that are indicative of no overrepresentation of the number of copies of the first homologous chromosome segment and the corresponding expected allele ratios is below a threshold value. In some embodiments, the average or weighted average value of the calculated allele ratios (the measured quantity of the allele present on the first homologous chromosome divided by the total measured quantity of all the alleles for the locus) is greater than the average or weighted average value of the expected allele ratios by at least a threshold value. In some embodiments, the average or weighted average value of the calculated allele ratios (the measured quantity of the allele present on the second homologous chromosome divided by the total measured quantity of all the alleles for the locus) is less than the average or weighted average value of the expected allele ratios by at least a threshold value. In some embodiments, the data fit between the calculated allele ratios and the allele ratios that are predicted for an overrepresentation of the number of copies of the first homologous chromosome segment is below a threshold value (indicative of a good data fit). In some embodiments, the data fit between the calculated allele ratios and the allele ratios that are predicted for no overrepresentation of the number of copies of the first homologous chromosome segment is above a threshold value (indicative of a poor data fit).
In some embodiments, an overrepresentation of the number of copies of the first homologous chromosome segment is determined to be absent if one or more of the following conditions is met. In some embodiments, the number of calculated allele ratios that are indicative of an overrepresentation of the number of copies of the first homologous chromosome segment is below a threshold value. In some embodiments, the number of calculated allele ratios that are indicative of no overrepresentation of the number of copies of the first homologous chromosome segment is above a threshold value. In some embodiments, the magnitude of the difference between the calculated allele ratios that are indicative of an overrepresentation of the number of copies of the first homologous chromosome segment and the corresponding expected allele ratios is below a threshold value. In some embodiments, the magnitude of the difference between the calculated allele ratios that are indicative of no overrepresentation of the number of copies of the first homologous chromosome segment and the corresponding expected allele ratios is above a threshold value. In some embodiments, the average or weighted average value of the calculated allele ratios (the measured quantity of the allele present on the first homologous chromosome divided by the total measured quantity of all the alleles for the locus) minus the average or weighted average value of the expected allele ratios is less than a threshold value. In some embodiments, the average or weighted average value of the expected allele ratios minus the average or weighted average value of the calculated allele ratios (the measured quantity of the allele present on the second homologous chromosome divided by the total measured quantity of all the alleles for the locus) is less than a threshold value. In some embodiments, the data fit between the calculated allele ratios and the allele ratios that are predicted for an overrepresentation of the number of copies of the first homologous chromosome segment is above a threshold value. In some embodiments, the data fit between the calculated allele ratios and the allele ratios that are predicted for no overrepresentation of the number of copies of the first homologous chromosome segment is below a threshold value. In some embodiments, the threshold value is determined from empirical testing of samples known to have a CNV of interest and/or samples known to lack the CNV.
In some embodiments, determining whether there is an overrepresentation of the number of copies of the
first homologous chromosome segment includes enumerating a set of one or more hypotheses specifying
the degree of overrepresentation of the first homologous chromosome segment. One exemplary hypothesis
is the absence of an overrepresentation, since the first and second homologous chromosome segments are
present in equal proportions (such as one copy of each segment in a diploid sample). Other exemplary
hypotheses include the first homologous chromosome segment being duplicated one or more times (such as
1, 2, 3, 4, 5, or more extra copies of the first homolog compared with the number of copies of the
second homologous chromosome segment). Another exemplary hypothesis includes the deletion of the
second homologous chromosome segment. Yet another exemplary hypothesis is the deletion of both the
first and the second homologous chromosome segments. In some embodiments, predicted allele ratios for
the loci that are heterozygous in at least one cell (such as loci that are heterozygous in the fetus
and/or heterozygous in the mother) are estimated for each hypothesis, given the degree of
overrepresentation specified by that hypothesis. In some embodiments, the likelihood that a hypothesis
is correct is calculated by comparing the calculated allele ratios to the predicted allele ratios, and
the hypothesis with the greatest likelihood is selected.
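By way of illustration only, the enumeration and selection just described might be sketched as follows. The sketch assumes a pure (unmixed) sample, phased heterozygous loci, and a simple binomial read model; the hypothesis names, the `eps` error floor, and the read counts are hypothetical and are not taken from this disclosure.

```python
import math

# Hypothetical hypotheses: (copies of homolog 1, copies of homolog 2).
HYPOTHESES = {
    "equal proportions": (1, 1),
    "one extra copy of homolog 1": (2, 1),
    "deletion of homolog 2": (1, 0),
}

def predicted_ratio(h1, h2, a_on_h1, eps=0.01):
    """Predicted fraction of reads carrying allele A at a heterozygous locus.

    eps is an assumed error floor so that a deleted homolog does not give
    a probability of exactly 0 or 1.
    """
    p = (h1 if a_on_h1 else h2) / (h1 + h2)
    return min(max(p, eps), 1.0 - eps)

def log_binomial(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def select_hypothesis(loci, hypotheses=HYPOTHESES):
    """loci: (reads of allele A, reads of allele B, True if A is on homolog 1)."""
    scores = {}
    for name, (h1, h2) in hypotheses.items():
        scores[name] = sum(
            log_binomial(a, a + b, predicted_ratio(h1, h2, on_h1))
            for a, b, on_h1 in loci)
    # The hypothesis with the greatest (log-)likelihood is selected.
    return max(scores, key=scores.get), scores

# Made-up read counts roughly consistent with a 2:1 ratio of homolog 1 to 2.
loci = [(66, 34, True), (35, 65, False), (68, 32, True), (30, 70, False)]
best, scores = select_hypothesis(loci)
print(best)
```

A binomial model is used here only for brevity; other data-fit measures mentioned in this disclosure could be substituted in `select_hypothesis` without changing its structure.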
In some embodiments, the expected distribution of a test statistic is calculated for each hypothesis
using the predicted allele ratios. In some embodiments, the likelihood that a hypothesis is correct is
calculated by comparing the test statistic calculated using the calculated allele ratios to the
expected distribution of the test statistic calculated using the predicted allele ratios, and the
hypothesis with the greatest likelihood is selected.
In some embodiments, predicted allele ratios for the loci that are heterozygous in at least one cell
(such as loci that are heterozygous in the fetus and/or heterozygous in the mother) are estimated
given the phased genetic data for the first homologous chromosome segment, the phased genetic data for
the second homologous chromosome segment, and the degree of overrepresentation specified by that
hypothesis. In some embodiments, the likelihood that a hypothesis is correct is calculated by
comparing the calculated allele ratios to the predicted allele ratios, and the hypothesis with the
greatest likelihood is selected.
Use of Mixed Samples
It will be understood that for many embodiments, the sample is a mixed sample with DNA or RNA from one
or more target cells and one or more non-target cells. In some embodiments, the target cells are cells
that have a CNV, such as a deletion or duplication of interest, and the non-target cells are cells
that do not have the copy number variation of interest (such as a mixture of cells with the deletion
or duplication of interest and cells without any of the deletions or duplications being tested). In
some embodiments, the target cells are cells that are associated with a disease or disorder or an
increased risk of a disease or disorder (such as cancer cells), and the non-target cells are cells
that are not associated with a disease or disorder or an increased risk of a disease or disorder (such
as noncancerous cells). In some embodiments, the target cells all have the same CNV. In some
embodiments, two or more target cells have different CNVs. In some embodiments, one or more of the
target cells has a CNV, polymorphism, or mutation associated with the disease or disorder, or with an
increased risk of the disease or disorder, that is not found in at least one other target cell. In
some such embodiments, the fraction of the cells that are associated with the disease or disorder (or
an increased risk of it) out of the total cells in the sample is assumed to be greater than or equal
to the fraction of the most frequent of these CNVs, polymorphisms, or mutations in the sample. For
example, if 6% of the cells have a K-ras mutation and 8% of the cells have a BRAF mutation, at least
8% of the cells are assumed to be cancerous.
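The lower bound in the preceding example is simply the largest of the observed per-variant cell fractions. A minimal sketch, with a hypothetical function name and input format, is:

```python
def min_abnormal_cell_fraction(variant_fractions):
    """Lower bound on the fraction of disease-associated cells.

    variant_fractions maps each CNV, polymorphism, or mutation to the
    fraction of cells carrying it; the most frequent variant sets the bound.
    """
    return max(variant_fractions.values())

# The worked example from the text: 6% K-ras and 8% BRAF gives at least 8%.
bound = min_abnormal_cell_fraction({"K-ras": 0.06, "BRAF": 0.08})
print(bound)
```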
In some embodiments, the ratio of DNA (or RNA) from the one or more target cells to the total DNA (or
RNA) in the sample is calculated. In some embodiments, a set of one or more hypotheses specifying the
degree of overrepresentation of the first homologous chromosome segment is enumerated. In some
embodiments, predicted allele ratios for the loci that are heterozygous in at least one cell (such as
loci that are heterozygous in the fetus and/or heterozygous in the mother) are estimated for each
hypothesis, given the calculated ratio of DNA or RNA and the degree of overrepresentation specified by
that hypothesis. In some embodiments, the likelihood that a hypothesis is correct is calculated by
comparing the calculated allele ratios to the predicted allele ratios, and the hypothesis with the
greatest likelihood is selected.
In some embodiments, the expected distribution of a test statistic, calculated using the predicted
allele ratios and the calculated ratio of DNA or RNA, is estimated for each hypothesis. In some
embodiments, the likelihood that a hypothesis is correct is determined by comparing the test statistic
(calculated using the calculated allele ratios and the calculated ratio of DNA or RNA) to the expected
distribution of the test statistic (calculated using the predicted allele ratios and the calculated
ratio of DNA or RNA), and the hypothesis with the greatest likelihood is selected.
In some embodiments, the method includes enumerating a set of one or more hypotheses specifying the
degree of overrepresentation of the first homologous chromosome segment. In some embodiments, the
method includes estimating, for each hypothesis, either (i) predicted allele ratios for the loci that
are heterozygous in at least one cell (such as loci that are heterozygous in the fetus and/or
heterozygous in the mother), given the degree of overrepresentation specified by that hypothesis, or
(ii) for one or more possible ratios of DNA or RNA, an expected distribution of a test statistic
calculated using the predicted allele ratios and the possible ratio of DNA or RNA from the one or more
target cells to the total DNA or RNA in the sample. In some embodiments, a data fit is calculated by
comparing either (i) the calculated allele ratios to the predicted allele ratios, or (ii) the test
statistic (calculated using the calculated allele ratios and the possible ratio of DNA or RNA) to the
expected distribution of the test statistic (calculated using the predicted allele ratios and the
possible ratio of DNA or RNA). In some embodiments, one or more of the hypotheses are ranked according
to the data fit, and the highest-ranked hypothesis is selected. In some embodiments, a technique or
algorithm, such as a search algorithm, is used for one or more of the following steps: calculating the
data fit, ranking the hypotheses, or selecting the highest-ranked hypothesis. In some embodiments, the
data fit is a fit to a beta-binomial distribution or a fit to a binomial distribution. In some
embodiments, the technique or algorithm is selected from the group consisting of maximum likelihood
estimation, maximum a posteriori estimation, Bayesian estimation, dynamic estimation (such as dynamic
Bayesian estimation), and expectation-maximization estimation. In some embodiments, the method
includes applying the technique or algorithm to the obtained genetic data and the expected genetic
data.
In some embodiments, the method includes creating a partition of possible ratios, ranging from a lower
limit to an upper limit, for the ratio of DNA or RNA from the one or more target cells to the total
DNA or RNA in the sample. In some embodiments, a set of one or more hypotheses specifying the degree
of overrepresentation of the first homologous chromosome segment is enumerated. In some embodiments,
the method includes estimating, for each possible ratio of DNA or RNA in the partition and for each
hypothesis, either (i) predicted allele ratios for the loci that are heterozygous in at least one cell
(such as loci that are heterozygous in the fetus and/or heterozygous in the mother), given the
possible ratio of DNA or RNA and the degree of overrepresentation specified by that hypothesis, or
(ii) an expected distribution of a test statistic calculated using the predicted allele ratios and the
possible ratio of DNA or RNA. In some embodiments, the method includes calculating, for each possible
ratio of DNA or RNA in the partition and for each hypothesis, the likelihood that the hypothesis is
correct by comparing either (i) the calculated allele ratios to the predicted allele ratios, or (ii)
the test statistic (calculated using the calculated allele ratios and the possible ratio of DNA or
RNA) to the expected distribution of the test statistic (calculated using the predicted allele ratios
and the possible ratio of DNA or RNA). In some embodiments, the combined probability for each
hypothesis is determined by combining the probabilities of that hypothesis for each of the possible
ratios in the partition, and the hypothesis with the greatest combined probability is selected. In
some embodiments, the combined probability for each hypothesis is determined by weighting the
probability of the hypothesis for a particular possible ratio by the likelihood that the possible
ratio is the correct ratio.
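The partition-based combination described above might be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of this disclosure: non-target cells are assumed to carry one copy of each homolog, reads are modeled as binomial, each possible ratio in the partition is weighted equally (a uniform prior), and the hypothesis labels and read counts are hypothetical.

```python
import math

# Hypothetical hypotheses: copies of (homolog 1, homolog 2) in target cells.
HYPOTHESES = {
    "balanced": (1, 1),
    "homolog 1 gain": (2, 1),
    "homolog 2 gain": (1, 2),
}

def expected_ratio(f, h1, h2):
    """Expected fraction of reads from homolog 1 when a fraction f of the
    DNA comes from target cells and the rest from cells with one copy each."""
    return ((1 - f) + f * h1) / (2 * (1 - f) + f * (h1 + h2))

def log_binomial(k, n, p):
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def combined_probabilities(loci, hypotheses=HYPOTHESES,
                           lower=0.01, upper=0.50, steps=50):
    """loci: (reads from the homolog-1 allele, total reads) per heterozygous locus."""
    grid = [lower + i * (upper - lower) / (steps - 1) for i in range(steps)]
    weight = 1.0 / len(grid)  # assumed uniform weight for each possible ratio
    combined = {}
    for name, (h1, h2) in hypotheses.items():
        combined[name] = sum(
            weight * math.exp(sum(
                log_binomial(k, n, expected_ratio(f, h1, h2)) for k, n in loci))
            for f in grid)
    total = sum(combined.values())
    return {name: value / total for name, value in combined.items()}

# Made-up data: about 54.5% of reads from homolog 1 at five loci.
loci = [(545, 1000)] * 5
posterior = combined_probabilities(loci)
print(max(posterior, key=posterior.get))
```

Note that a gain of one homolog and a loss of the other can produce the same allele ratio at different target fractions, which is one reason the sketch groups hypotheses by which homolog is more abundant.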
In some embodiments, a technique selected from the group consisting of maximum likelihood estimation,
maximum a posteriori estimation, Bayesian estimation, dynamic estimation (such as dynamic Bayesian
estimation), and expectation-maximization estimation is used to estimate the ratio of DNA or RNA from
the one or more target cells to the total DNA or RNA in the sample. In some embodiments, the ratio of
DNA or RNA from the one or more target cells to the total DNA or RNA in the sample is assumed to be
the same for two or more (or all) of the CNVs of interest. In some embodiments, the ratio of DNA or
RNA from the one or more target cells to the total DNA or RNA in the sample is calculated for each CNV
of interest.
Exemplary Methods Using Imperfectly Phased Data
It will be understood that for many embodiments, imperfectly phased data is used. For example, it may
not be known with 100% certainty which allele is present at one or more of the loci on the first
and/or second homologous chromosome segment. In some embodiments, priors for the possible haplotypes
of the individual (such as haplotypes based on population-based haplotype frequencies) are used in
calculating the probability of each hypothesis. In some embodiments, the priors for the possible
haplotypes are adjusted, either by using another method to phase the genetic data or by using phased
data from other subjects (such as prior subjects) to refine the population data used for
informatics-based phasing of the individual.
In some embodiments, the phased genetic data includes probabilistic data for two or more possible sets
of phased genetic data, wherein each possible set of phased data includes a possible identity of the
allele present at each locus in the set of polymorphic loci on the first homologous chromosome segment
and a possible identity of the allele present at each locus in the set of polymorphic loci on the
second homologous chromosome segment. In some embodiments, the probability of at least one of the
hypotheses is determined for each of the possible sets of phased genetic data. In some embodiments,
the combined probability for a hypothesis is determined by combining the probabilities of the
hypothesis for each of the possible sets of phased genetic data, and the hypothesis with the greatest
combined probability is selected.
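One way to read the combination over possible phased sets is as a weighted sum, in which the probability of a hypothesis under each candidate phasing is weighted by the probability of that phasing. The sketch below assumes this reading; the function name, the likelihood table, and the phasing probabilities are hypothetical placeholders.

```python
def combine_over_phasings(phasings, likelihood, hypotheses):
    """phasings: list of (probability of phasing, phasing identifier).
    likelihood(hypothesis, phasing): probability of the data given both.
    Returns the hypothesis with the greatest combined probability and the
    combined (unnormalized) value for every hypothesis."""
    combined = {
        h: sum(p * likelihood(h, phasing) for p, phasing in phasings)
        for h in hypotheses
    }
    return max(combined, key=combined.get), combined

# Toy numbers for two candidate phasings and two hypotheses.
table = {
    ("disomy", "phase1"): 0.02, ("disomy", "phase2"): 0.02,
    ("gain", "phase1"): 0.10, ("gain", "phase2"): 0.001,
}
phasings = [(0.7, "phase1"), (0.3, "phase2")]
best, combined = combine_over_phasings(
    phasings, lambda h, ph: table[(h, ph)], ["disomy", "gain"])
print(best, combined)
```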
Any of the methods disclosed herein, or any known method, can be used to generate imperfectly phased
data (such as by using population-based haplotype frequencies to infer the most likely phase) for use
in the claimed methods. In some embodiments, phased data is obtained by probabilistically combining
haplotypes of smaller segments. For example, possible haplotypes can be determined based on possible
combinations of one haplotype from a first region with another haplotype from another region of the
same chromosome. The probability that particular haplotypes from different regions are part of the
same, larger haplotype block on the same chromosome can be determined using, for example,
population-based haplotype frequencies and/or known recombination rates between the different regions.
In some embodiments, a single-hypothesis rejection test is used for the null hypothesis of disomy. In
some embodiments, the probability of the disomy hypothesis is calculated, and the hypothesis of disomy
is rejected if the probability is below a given threshold (for example, less than 1 in 1,000). If the
null hypothesis is rejected, this could be due to errors in the imperfectly phased data or due to the
presence of a CNV. In some embodiments, more accurate phased data is obtained (such as phased data
from any of the molecular phasing methods disclosed herein, which yield actual phased data rather than
bioinformatics-based inferred phased data). In some embodiments, the probability of the disomy
hypothesis is recalculated using this more accurate phased data to determine whether the disomy
hypothesis should still be rejected. Rejection of this hypothesis indicates that a duplication or
deletion of the chromosome segment is present. If desired, the false positive rate can be altered by
adjusting the threshold.
Further Exemplary Embodiments for Determining Ploidy Using Phased Data
In illustrative embodiments, provided herein is a method for determining the ploidy of a chromosome
segment in a sample of an individual. The method includes the following steps:
a. receiving allele frequency data comprising the amount of each allele present in the sample at each
locus in a set of polymorphic loci on the chromosome segment;
b. generating phased allelic information for the set of polymorphic loci by estimating the phase of
the allele frequency data;
c. generating individual probabilities of allele frequencies for the polymorphic loci for different
ploidy states using the allele frequency data;
d. generating joint probabilities for the set of polymorphic loci using the individual probabilities
and the phased allelic information; and
e. selecting, based on the joint probabilities, a best-fit model indicative of chromosomal ploidy,
thereby determining the ploidy of the chromosome segment.
As disclosed herein, the allele frequency data (also referred to herein as measured genetic allelic
data) can be generated by methods known in the art. For example, the data can be generated using qPCR
or microarrays. In one illustrative embodiment, the data is generated using nucleic acid sequence
data, especially high-throughput nucleic acid sequence data.
In certain illustrative examples, the allele frequency data is corrected for errors before it is used
to generate the individual probabilities. In specific illustrative embodiments, the errors that are
corrected include allele amplification efficiency bias. In other embodiments, the errors that are
corrected include ambient contamination and genotype contamination. In some embodiments, the errors
that are corrected include allele amplification bias, ambient contamination, and genotype
contamination.
In certain embodiments, the individual probabilities are generated using a set of models of both
different ploidy states and allelic imbalance fractions for the set of polymorphic loci. In these
embodiments, and in other embodiments, the joint probabilities are generated by considering the
linkage between polymorphic loci on the chromosome segment.
Accordingly, in one illustrative embodiment that combines some of these embodiments, provided herein
is a method for detecting chromosomal ploidy in a sample of an individual, which includes the
following steps:
a. receiving nucleic acid sequence data for alleles at a set of polymorphic loci on a chromosome
segment in the individual;
b. detecting allele frequencies at the set of loci using the nucleic acid sequence data;
c. correcting for allele amplification efficiency bias in the detected allele frequencies to generate
corrected allele frequencies for the set of polymorphic loci;
d. generating phased allelic information for the set of polymorphic loci by estimating the phase of
the nucleic acid sequence data;
e. generating individual probabilities of allele frequencies for the polymorphic loci for different
ploidy states by comparing the corrected allele frequencies to a set of models of different ploidy
states and allelic imbalance fractions of the set of polymorphic loci;
f. generating joint probabilities for the set of polymorphic loci by combining the individual
probabilities, considering the linkage between polymorphic loci on the chromosome segment; and
g. selecting, based on the joint probabilities, the best-fit model indicative of chromosomal
aneuploidy.
As disclosed herein, the individual probabilities can be generated using a set of models or hypotheses
of both different ploidy states and average allelic imbalance fractions for the set of polymorphic
loci. For example, in a particularly illustrative example, the individual probabilities are generated
by modeling ploidy states of a first homolog of the chromosome segment and a second homolog of the
chromosome segment. The ploidy states that are modeled include the following:
(1) all cells have no deletion or amplification of the first homolog or the second homolog of the
chromosome segment;
(2) at least some cells have a deletion of the first homolog or an amplification of the second homolog
of the chromosome segment; and
(3) at least some cells have a deletion of the second homolog or an amplification of the first homolog
of the chromosome segment.
It will be understood that the above models can also be referred to as hypotheses that are used to
constrain a model. Accordingly, demonstrated above are three hypotheses that can be used.
The average allelic imbalance fractions that are modeled can include any range of average allelic
imbalance that includes the actual average allelic imbalance of the chromosome segment. For example,
in certain illustrative embodiments, the range of average allelic imbalance that is modeled can be
between 0, 0.1, 0.2, 0.25, 0.3, 0.4, 0.5, 0.6, 0.75, 1, 2, 2.5, 3, 4, and 5% on the low end and 1, 2,
2.5, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 95, and 99% on the high end. The intervals
for the modeling within the range can be any interval, depending on the computing power used and the
time allowed for the analysis. For example, 0.01, 0.05, 0.02, or 0.1 intervals can be modeled.
In certain illustrative embodiments, the sample has an average allelic imbalance for the chromosome
segment of between 0.4% and 5%. In certain embodiments, the average allelic imbalance is low. In these
embodiments, the average allelic imbalance is typically less than 10%. In certain illustrative
embodiments, the allelic imbalance is between 0.25, 0.3, 0.4, 0.5, 0.6, 0.75, 1, 2, 2.5, 3, 4, and 5%
on the low end and 1, 2, 2.5, 3, 4, and 5% on the high end. In other illustrative embodiments, the
average allelic imbalance is between 0.4, 0.45, 0.5, 0.6, 0.7, 0.8, 0.9, or 1.0% on the low end and
0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.5, 2.0, 3.0, 4.0, or 5.0% on the high end. For example, the average
allelic imbalance of the sample in an illustrative example is between 0.45 and 2.5%. In another
example, the average allelic imbalance is detected with a sensitivity of 0.45, 0.5, 0.6, 0.8, 0.8,
0.9, or 1.0. Exemplary samples having low allelic imbalance in methods of the present invention
include plasma samples from individuals with cancer having circulating tumor DNA, or plasma samples
from pregnant females having circulating fetal DNA.
It will be understood that for SNVs, the proportion of abnormal DNA is measured using the mutant
allele frequency (the number of mutant alleles at a locus divided by the total number of alleles at
that locus). Since the difference between the amounts of the two homologs in tumors is analogous, the
proportion of abnormal DNA for a CNV is measured by the average allelic imbalance (AAI), defined as
|(H1-H2)|/(H1+H2), where Hi is the average number of copies of homolog i in the sample and Hi/(H1+H2)
is the fractional abundance, or homolog ratio, of homolog i. The maximum homolog ratio is the homolog
ratio of the more abundant homolog.
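For concreteness, the AAI definition and the homolog ratios given above can be computed directly from the average copy numbers H1 and H2. The function names and the example copy numbers below are illustrative only.

```python
def average_allelic_imbalance(h1, h2):
    """AAI = |H1 - H2| / (H1 + H2) for average homolog copy numbers H1, H2."""
    return abs(h1 - h2) / (h1 + h2)

def homolog_ratios(h1, h2):
    """Fractional abundance of each homolog, Hi / (H1 + H2)."""
    total = h1 + h2
    return h1 / total, h2 / total

# Example: homolog 1 present at 1.1 copies on average, homolog 2 at 1.0.
aai = average_allelic_imbalance(1.1, 1.0)
max_ratio = max(homolog_ratios(1.1, 1.0))
print(aai, max_ratio)
```

With these definitions the maximum homolog ratio equals (1 + AAI)/2, so either quantity determines the other.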
The assay drop-out rate is the percentage of single nucleotide polymorphisms (SNPs) with no reads,
estimated using all SNPs. The single allele drop-out (ADO) rate is the percentage of SNPs with only
one allele present, estimated using only heterozygous SNPs. Genotype confidence can be determined by
fitting a binomial distribution to the number of reads at each SNP that were B-allele reads, and using
the ploidy status of the focal region of the SNP to estimate the probability of each genotype.
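The two drop-out rates defined above can be illustrated with a short sketch. The input format (pairs of A-allele and B-allele read counts) is hypothetical, and the choice to compute the ADO rate only over heterozygous SNPs that have at least one read is an assumption made for this example; the binomial genotype-confidence step is not shown.

```python
def assay_dropout_rate(read_counts):
    """Fraction of all SNPs with no reads at all.
    read_counts: list of (A-allele reads, B-allele reads)."""
    return sum(1 for a, b in read_counts if a + b == 0) / len(read_counts)

def allele_dropout_rate(het_read_counts):
    """Fraction of known-heterozygous SNPs where only one allele was seen.
    SNPs with no reads are excluded here (an assumption of this sketch)."""
    observed = [(a, b) for a, b in het_read_counts if a + b > 0]
    dropped = sum(1 for a, b in observed if a == 0 or b == 0)
    return dropped / len(observed)

all_snps = [(10, 12), (0, 0), (5, 0), (7, 9)]
het_snps = [(10, 12), (5, 0), (7, 9), (0, 8)]
print(assay_dropout_rate(all_snps), allele_dropout_rate(het_snps))
```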
For tumor tissue samples, chromosomal aneuploidy (exemplified in this paragraph by CNVs) can be
delineated by transitions between allele frequency distributions. In plasma samples, CNVs can be
identified by a maximum likelihood algorithm that searches for plasma CNVs in regions where the tumor
sample from the same individual also has CNVs. The algorithm can model expected allele frequencies
across all allelic imbalance ratios at 0.025% intervals for three sets of hypotheses: (1) all cells
are normal (no allelic imbalance), (2) some or all cells have a homolog 1 deletion or a homolog 2
amplification, or (3) some or all cells have a homolog 2 deletion or a homolog 1 amplification. The
likelihood of each hypothesis can be determined at each SNP using a Bayesian classifier based on a
beta-binomial model of expected and observed allele frequencies at all heterozygous SNPs, and the
joint likelihood across multiple SNPs can then be calculated, in certain illustrative embodiments
taking the linkage of the SNP loci into consideration, as exemplified herein. The maximum likelihood
hypothesis can then be selected.
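A simplified sketch of such a search is given below. It is not the algorithm of this disclosure: it treats SNPs as independent (linkage is ignored), assumes every SNP is heterozygous and perfectly phased, and uses a beta-binomial model with an arbitrary concentration parameter `conc`. The labels, read counts, and parameter values are hypothetical.

```python
import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_betabinom(k, n, mu, conc=500.0):
    """Log beta-binomial probability with mean mu and concentration conc."""
    a, b = mu * conc, (1.0 - mu) * conc
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + log_beta(k + a, n - k + b) - log_beta(a, b))

def max_likelihood_call(snps, step=0.00025, max_aai=0.05):
    """snps: (B-allele reads, total reads, True if B is on homolog 1).
    Scans allelic imbalance in 0.025% steps under three hypotheses and
    returns (label, allelic imbalance, joint log-likelihood) of the best."""
    best = ("normal", 0.0,
            sum(log_betabinom(k, n, 0.5) for k, n, _ in snps))
    for i in range(1, round(max_aai / step) + 1):
        aai = i * step
        for label, sign in (("homolog 1 more abundant", 1),
                            ("homolog 2 more abundant", -1)):
            h1_fraction = (1 + sign * aai) / 2
            ll = sum(
                log_betabinom(k, n, h1_fraction if on_h1 else 1 - h1_fraction)
                for k, n, on_h1 in snps)
            if ll > best[2]:
                best = (label, aai, ll)
    return best

# Made-up counts consistent with homolog 1 at a 51% homolog ratio (AAI of 2%).
snps = [(10200, 20000, True), (9800, 20000, False)] * 5
label, aai, ll = max_likelihood_call(snps)
print(label, aai)
```

Accounting for linkage between neighboring SNPs, as contemplated in certain embodiments, would replace the simple sum of per-SNP log-likelihoods with a joint model over the phased haplotypes.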
Consider a chromosomal region with an average of N copies in the tumor, and let c denote the fraction
of DNA in plasma derived from the mixture of normal and tumor cells in a disomic region. AAI is
calculated as follows:
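The expression itself does not survive in this text. A reconstruction consistent with the definition AAI = |(H1-H2)|/(H1+H2) can be derived under assumptions that are not stated in this passage: c is taken to be the tumor-derived fraction of the plasma DNA, normal cells carry one copy of each homolog, and in the tumor one homolog stays at one copy while the other accounts for the remaining N-1 copies (so N is at least 1). Under those assumptions:

```latex
\[
H_1 = (1-c) + c = 1, \qquad
H_2 = (1-c) + c\,(N-1) = 1 + c\,(N-2),
\]
\[
\mathrm{AAI} = \frac{|H_1 - H_2|}{H_1 + H_2}
             = \frac{c\,|N-2|}{2 + c\,(N-2)} .
\]
```

For example, a single-copy deletion (N = 1) gives AAI = c/(2-c), and a single-copy gain (N = 3) gives AAI = c/(2+c).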
In certain illustrative examples, the allele frequency data is corrected for errors before it is used
to generate the individual probabilities. Different types of error and/or bias correction are
disclosed herein. In specific illustrative embodiments, the error that is corrected is allele
amplification efficiency bias. In other embodiments, the errors that are corrected include ambient
contamination and genotype contamination. In some embodiments, the errors that are corrected include
allele amplification bias, ambient contamination, and genotype contamination.
It will be understood that allele amplification efficiency bias can be determined for an allele as
part of an experiment or laboratory determination that includes the test sample, or it can be
determined at a different time using a set of samples that include the allele whose efficiency is
being calculated. Ambient contamination and genotype contamination are typically determined in the
same run in which the test sample is analyzed.
In certain embodiments, ambient contamination and genotype contamination are determined for homozygous
alleles in the sample. It will be understood that for any given sample from an individual, some loci
in the sample will be heterozygous and others will be homozygous, even if a locus is selected for
analysis because it has a relatively high heterozygosity in the population. It is advantageous in some
embodiments that, although the ploidy of a chromosome segment may be determined using the heterozygous
loci of an individual, the homozygous loci can be used to calculate ambient and genotype contamination.
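The text does not specify how contamination is computed from homozygous loci. One simple possibility, offered purely as an assumption for illustration, is to pool the reads at loci where the individual is homozygous and take the fraction of reads that carry the unexpected allele; the function name and read counts are hypothetical, and this crude estimate does not separate ambient from genotype contamination or from sequencing error.

```python
def unexpected_allele_fraction(homozygous_loci):
    """homozygous_loci: (reads of the expected allele, reads of the other allele).
    Returns the pooled fraction of reads showing the unexpected allele."""
    other = sum(o for _, o in homozygous_loci)
    total = sum(e + o for e, o in homozygous_loci)
    return other / total

# Made-up counts at four homozygous loci.
loci = [(998, 2), (1000, 0), (995, 5), (999, 1)]
rate = unexpected_allele_fraction(loci)
print(rate)
```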
In certain illustrative examples, the selecting is performed by analyzing the magnitude of the
difference between the phased allelic information and the estimated allele frequencies generated for
the models.
In illustrative examples, the individual probabilities of allele frequencies are generated based on a
beta-binomial model of expected and observed allele frequencies at the set of polymorphic loci. In
illustrative examples, the individual probabilities are generated using a Bayesian classifier.
In certain illustrative embodiments, the nucleic acid sequence data is generated by performing
high-throughput DNA sequencing of a plurality of copies of a series of amplicons generated using a
multiplex amplification reaction, wherein each amplicon of the series spans at least one polymorphic
locus of the set of polymorphic loci, and wherein each polymorphic locus of the set is amplified. In
certain embodiments, the multiplex amplification reaction is performed under limiting primer
conditions for at least 12 reactions. In some embodiments, limiting primer concentrations are used in
1/10, 1/5, 1/4, 1/3, 1/2, or all of the reactions of the multiplex reaction.
Provided herein are factors to consider in achieving limiting primer conditions in an amplification
reaction such as PCR.
In certain embodiments, the methods provided herein detect ploidy for multiple chromosome segments
across multiple chromosomes. Accordingly, ploidy in these embodiments is determined for a set of
chromosome segments in the sample. For these embodiments, higher multiplex amplification reactions are
needed. Accordingly, for these embodiments the multiplex amplification reaction can include, for
example, between 2,500 and 50,000 multiplex reactions. In certain embodiments, multiplex reactions in
the following ranges are performed: between 100, 200, 250, 500, 1000, 2500, 5000, 10,000, 20,000,
25000, and 50000 on the low end of the range and between 200, 250, 500, 1000, 2500, 5000, 10,000,
20,000, 25000, 50000, and 100,000 on the high end of the range.
In illustrative embodiments, the set of polymorphic loci is a set of loci that are known to exhibit
high heterozygosity. However, it is expected that for any given individual, some of those loci will be
homozygous. In certain illustrative embodiments, methods of the invention utilize nucleic acid
sequence information for both the homozygous and the heterozygous loci of an individual. The
homozygous loci of an individual are used, for example, for error correction, whereas the heterozygous
loci are used to determine the allelic imbalance of the sample. In certain embodiments, at least 10%
of the polymorphic loci are heterozygous loci for the individual.
As disclosed herein, preference is given to analyzing target SNP loci that are known to be
heterozygous in the population. Accordingly, in certain embodiments, polymorphic loci are chosen
wherein at least 10, 20, 25, 50, 75, 80, 90, 95, 99, or 100% of the polymorphic loci are known to be
heterozygous in the population.
As disclosed herein, in certain embodiments the sample is a plasma sample from a pregnant female.
In some examples, the method further comprises performing the method on a control sample with a known
average allelic imbalance ratio. The control can have an average allelic imbalance ratio, for a
particular allelic state indicative of aneuploidy of the chromosome segment, of between 0.4 and 10%,
to mimic the average allelic imbalance of an allele that is present in a sample at low concentration,
such as would be expected for circulating free DNA from a fetus or from a tumor.
In some embodiments, PlasmArt controls, as disclosed herein, are used as the controls. Accordingly, in
certain aspects the control is a sample generated by a method that includes fragmenting a nucleic acid
sample known to exhibit a chromosomal aneuploidy into fragments that mimic the size of the DNA
fragments circulating in the plasma of the individual. In certain aspects, a control is used that has
no aneuploidy for the chromosome segment.
In illustrative embodiments, data from one or more controls can be analyzed in the method along with
the test sample. The controls can include, for example, a different sample from the individual that is
not suspected of containing a chromosomal aneuploidy, or a sample that is suspected of containing a
CNV or a chromosomal aneuploidy. For example, where the test sample is a plasma sample suspected of
containing circulating free tumor DNA, the method can also be performed on a control sample from a
tumor of the subject along with the plasma sample. As disclosed herein, the control sample can be
prepared by fragmenting a DNA sample known to exhibit a chromosomal aneuploidy. Such fragmenting can
result in a DNA sample that mimics the DNA composition of an apoptotic cell, especially when the
sample is from an individual afflicted with cancer. Data from the control sample will increase the
confidence of the detection of chromosomal aneuploidy.
In certain embodiments of the methods for determining ploidy, the sample is a plasma sample from an
individual suspected of having cancer. In these embodiments, the method further comprises determining,
based on the selecting, whether a copy number variation is present in the cells of a tumor of the
individual. For these embodiments, the sample can be a plasma sample from the individual. For these
embodiments, the method can further include determining, based on the selecting, whether cancer is
present in the individual.
These embodiments for determining the ploidy of a chromosome segment can further comprise detecting a
single nucleotide variant at a single nucleotide variant site in a set of single nucleotide variant
sites, wherein detecting a chromosomal aneuploidy, the single nucleotide variant, or both indicates the
presence of circulating tumor nucleic acids in the sample.
These embodiments can further comprise receiving haplotype information for the chromosome segment from a
tumor of the individual, and using the haplotype information to generate the set of models of different
ploidy states and allelic imbalance fractions at the set of polymorphic loci.
As disclosed herein, certain embodiments of the methods for determining ploidy can further comprise
removing outliers from the initial or corrected allele frequency data before comparing the initial or
corrected allele frequencies to the set of models. For example, in certain embodiments, loci whose allele
frequencies are at least 2 or 3 standard deviations above or below the mean value for the other loci on
the chromosome segment are removed from the data before it is used for modeling.
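The outlier-removal step described above can be sketched as follows. This is only a minimal
illustration: the function name `remove_outlier_loci`, the list-based input and the leave-one-out
comparison (each locus is compared against the mean and standard deviation of the other loci) are
assumptions made for the example and are not taken from the disclosure.

```python
from statistics import mean, stdev

def remove_outlier_loci(freqs, n_sd=3.0):
    """Return (index, frequency) pairs for the loci that are kept.

    A locus is dropped when its allele frequency lies at least `n_sd`
    standard deviations above or below the mean of the OTHER loci on the
    segment (hypothetical leave-one-out reading of the text).
    """
    kept = []
    for i, f in enumerate(freqs):
        others = freqs[:i] + freqs[i + 1:]
        if len(others) < 2:
            kept.append((i, f))  # too few loci to estimate a spread
            continue
        mu, sd = mean(others), stdev(others)
        is_outlier = f != mu and abs(f - mu) >= n_sd * sd
        if not is_outlier:
            kept.append((i, f))
    return kept

# Hypothetical allele frequencies for seven loci; the last one is far off.
freqs = [0.50, 0.51, 0.49, 0.50, 0.52, 0.48, 0.95]
kept = remove_outlier_loci(freqs, n_sd=3.0)
kept_indices = [i for i, _ in kept]
```

Passing `n_sd=2.0` gives the stricter 2-standard-deviation cut mentioned in the text.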
As noted herein, it will be understood that for many of the embodiments provided herein, including those
for determining the ploidy of a chromosome segment, imperfectly or perfectly phased data is preferably
used. It will also be understood that a number of features are provided herein that improve on prior
methods for detecting ploidy, and that many different combinations of these features can be used.
In certain embodiments, as illustrated in Figures 69-70, provided herein are computer systems and
computer-readable media for performing any of the methods of the present invention. These include systems
and computer-readable media for performing the methods of determining ploidy. Accordingly, as a
non-limiting example of a system embodiment, and to demonstrate that any of the methods provided herein
can be performed using the systems and computer-readable media disclosed herein, in another aspect
provided herein is a system for detecting chromosomal ploidy in a sample of an individual, the system
comprising:
A. an input processor configured to receive allele frequency data comprising the amount of each allele
present in the sample at each locus in a set of polymorphic loci on the chromosome segment;
B. a modeler configured to:
i. generate phased allelic information for the set of polymorphic loci by estimating the phase of the
allele frequency data;
ii. generate individual probabilities of allele frequencies for the polymorphic loci for different ploidy
states using the allele frequency data; and
iii. generate joint probabilities for the set of polymorphic loci using the individual probabilities and
the phased allelic information; and
C. a hypothesis manager configured to select, based on the joint probabilities, a best fit model
indicative of chromosomal ploidy, thereby determining the ploidy of the chromosome segment.
In certain embodiments of this system embodiment, the allele frequency data is data generated by a
nucleic acid sequencing system. In certain embodiments, the system further comprises an error correction
unit configured to correct for errors in the allele frequency data, wherein the corrected allele
frequency data is used by the modeler to generate the individual probabilities. In certain embodiments,
the error correction unit corrects for allele amplification efficiency bias. In certain embodiments, the
modeler generates the individual probabilities using a set of models of both different ploidy states and
allelic imbalance fractions for the set of polymorphic loci. In certain illustrative embodiments, the
modeler generates the joint probabilities by considering the linkage between polymorphic loci on the
chromosome segment.
In one illustrative embodiment, provided herein is a system for detecting chromosomal ploidy in a sample
of an individual, the system comprising the following:
A. an input processor configured to receive nucleic acid sequence data for alleles at a set of
polymorphic loci on a chromosome segment in the individual, and to detect allele frequencies at the set
of loci using the nucleic acid sequence data;
B. an error correction unit configured to correct for errors in the detected allele frequencies and to
generate corrected allele frequencies for the set of polymorphic loci;
C. a modeler configured to:
i. generate phased allelic information for the set of polymorphic loci by estimating the phase of the
nucleic acid sequence data;
ii. generate individual probabilities of allele frequencies for the polymorphic loci for different ploidy
states by comparing the phased allelic information to a set of models of different ploidy states and
allelic imbalance fractions at the set of polymorphic loci; and
iii. generate joint probabilities for the set of polymorphic loci by combining the individual
probabilities, taking into account the relative distance between polymorphic loci on the chromosome
segment; and
D. a hypothesis manager configured to select, based on the joint probabilities, a best fit model
indicative of chromosomal aneuploidy.
In certain exemplary system embodiments provided herein, the set of polymorphic loci comprises between
1000 and 50,000 polymorphic loci. In certain exemplary system embodiments provided herein, the set of
polymorphic loci comprises 100 known heterozygosity hot spot loci. In certain exemplary system
embodiments provided herein, the set of polymorphic loci comprises 100 loci that are at or within 0.5 kb
of a recombination hot spot.
In certain exemplary system embodiments provided herein, the best fit model analyzes the following ploidy
states of a first homolog of the chromosome segment and a second homolog of the chromosome segment:
(1) all cells have no deletion or amplification of the first homolog or the second homolog of the
chromosome segment;
(2) some or all cells have a deletion of the first homolog or an amplification of the second homolog of
the chromosome segment; and
(3) some or all cells have a deletion of the second homolog or an amplification of the first homolog of
the chromosome segment.
In certain exemplary system embodiments provided herein, the errors that are corrected comprise allele
amplification efficiency bias, contamination, and/or sequencing errors. In certain exemplary system
embodiments provided herein, the contamination comprises ambient contamination and genotype
contamination. In certain exemplary system embodiments provided herein, the ambient contamination and
genotype contamination are determined for homozygous alleles.
In certain exemplary system embodiments provided herein, the hypothesis manager is configured to analyze
the magnitude of the difference between the phased allelic information and the estimated allele
frequencies generated for the models. In certain exemplary system embodiments provided herein, the
modeler generates the individual probabilities of allele frequencies based on a beta binomial model of
expected and observed allele frequencies at the set of polymorphic loci. In certain exemplary system
embodiments provided herein, the modeler generates the individual probabilities using a Bayesian
classifier.
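The modeler and hypothesis manager described above can be illustrated with a short sketch. It is not the
patented implementation: it assumes phased read counts for the first homolog at each locus, expresses
each hypothesis simply as an expected read fraction for that homolog, uses a beta binomial with a
hypothetical fixed concentration parameter, and treats the loci as independent (linkage between loci is
ignored). All names and numbers are illustrative.

```python
from math import lgamma

def log_beta(x, y):
    """Natural log of the Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def beta_binom_logpmf(k, n, p, concentration=200.0):
    """Log-probability of k first-homolog reads out of n under a beta
    binomial with mean p (alpha = p * s, beta = (1 - p) * s)."""
    a, b = p * concentration, (1.0 - p) * concentration
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)

def best_fit_hypothesis(h1_reads, depths, hypotheses):
    """Sum per-locus log-probabilities into a joint log-probability for
    each hypothesis and return the best-scoring hypothesis name."""
    joint = {
        name: sum(beta_binom_logpmf(k, n, p) for k, n in zip(h1_reads, depths))
        for name, p in hypotheses.items()
    }
    return max(joint, key=joint.get), joint

# Hypothetical hypotheses: expected fraction of reads from homolog 1.
hypotheses = {
    "no CNV": 0.50,
    "homolog 1 over-represented": 0.55,
    "homolog 2 over-represented": 0.45,
}
h1_reads = [560, 545, 552, 538, 561]  # made-up phased read counts
depths = [1000] * 5
best, joint = best_fit_hypothesis(h1_reads, depths, hypotheses)
```

Working in log space keeps the joint probability numerically stable when many loci are combined.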
In certain exemplary system embodiments provided herein, the nucleic acid sequence data is generated by
performing high throughput DNA sequencing of a plurality of copies of a series of amplicons generated
using a multiplex amplification reaction, wherein each amplicon of the series of amplicons spans at least
one polymorphic locus of the set of polymorphic loci, and wherein each of the polymorphic loci of the set
is amplified. In certain exemplary system embodiments provided herein, the multiplex amplification
reaction is performed under limiting primer conditions for at least 1/2 of the reactions. In certain
exemplary system embodiments provided herein, the sample has an average allelic imbalance of between 0.4%
and 5%.
In certain exemplary system embodiments provided herein, the sample is a plasma sample from an individual
suspected of having cancer, and the hypothesis manager is further configured to determine, based on the
best fit model, whether a copy number variation is present in cells of a tumor of the individual.
In certain exemplary system embodiments provided herein, the sample is a plasma sample from an
individual, and the hypothesis manager is further configured to determine, based on the best fit model,
whether cancer is present in the individual. In these embodiments, the hypothesis manager can be further
configured to detect a single nucleotide variant at a single nucleotide variant site in a set of single
nucleotide variant sites, wherein detecting a chromosomal aneuploidy, the single nucleotide variant, or
both indicates the presence of circulating tumor nucleic acids in the sample.
In certain exemplary system embodiments provided herein, the input processor is further configured to
receive haplotype information for the chromosome segment from a tumor of the individual, and the modeler
is configured to use the haplotype information to generate the set of models of different ploidy states
and allelic imbalance fractions at the set of polymorphic loci.
In certain exemplary system embodiments provided herein, the modeler generates the models over allelic
imbalance fractions ranging from 0% to 25%.
It will be understood that any of the methods provided herein can be executed by computer-readable code
stored on a non-transitory computer-readable medium. Accordingly, provided herein in one embodiment is a
non-transitory computer-readable medium for detecting chromosomal ploidy in a sample of an individual,
comprising computer-readable code that, when executed by a processing device, causes the processing
device to:
A. receive allele frequency data comprising the amount of each allele present in the sample at each locus
in a set of polymorphic loci on the chromosome segment;
B. generate phased allelic information for the set of polymorphic loci by estimating the phase of the
allele frequency data;
C. generate individual probabilities of allele frequencies for the polymorphic loci for different ploidy
states using the allele frequency data;
D. generate joint probabilities for the set of polymorphic loci using the individual probabilities and
the phased allelic information; and
E. select, based on the joint probabilities, a best fit model indicative of chromosomal ploidy, thereby
determining the ploidy of the chromosome segment.
In certain computer-readable medium embodiments, the allele frequency data is generated from nucleic acid
sequence data. Certain computer-readable medium embodiments further comprise correcting for errors in the
allele frequency data and using the corrected allele frequency data for the step of generating individual
probabilities. In certain computer-readable medium embodiments, the error that is corrected is allele
amplification efficiency bias. In certain computer-readable medium embodiments, the individual
probabilities are generated using a set of models of both different ploidy states and allelic imbalance
fractions for the set of polymorphic loci. In certain computer-readable medium embodiments, the joint
probabilities are generated by considering the linkage between polymorphic loci on the chromosome
segment.
In one particular embodiment, provided herein is a non-transitory computer-readable medium for detecting
chromosomal ploidy in a sample of an individual, comprising computer-readable code that, when executed by
a processing device, causes the processing device to:
A. receive nucleic acid sequence data for alleles at a set of polymorphic loci on a chromosome segment in
the individual;
B. detect allele frequencies at the set of loci using the nucleic acid sequence data;
C. correct for allele amplification efficiency bias in the detected allele frequencies to generate
corrected allele frequencies for the set of polymorphic loci;
D. generate phased allelic information for the set of polymorphic loci by estimating the phase of the
nucleic acid sequence data;
E. generate individual probabilities of allele frequencies for the polymorphic loci for different ploidy
states by comparing the corrected allele frequencies to a set of models of different ploidy states and
allelic imbalance fractions at the set of polymorphic loci;
F. generate joint probabilities for the set of polymorphic loci by combining the individual
probabilities, considering the linkage between polymorphic loci on the chromosome segment; and
G. select, based on the joint probabilities, the best fit model indicative of chromosomal aneuploidy.
In certain illustrative computer-readable medium embodiments, the selecting is performed by analyzing the
magnitude of the difference between the phased allelic information and the estimated allele frequencies
generated for the models.
In certain illustrative computer-readable medium embodiments, the individual probabilities of allele
frequencies are generated based on a beta binomial model of expected and observed allele frequencies at
the set of polymorphic loci.
It will be understood that any of the method embodiments provided herein can be performed by executing
code stored on a non-transitory computer-readable medium.
Illustrative embodiments for detecting cancer
In certain aspects, the present invention provides a method for detecting cancer. The sample, it will be
understood, can be a tumor sample or a liquid sample, such as plasma, from an individual suspected of
having cancer. The methods are especially effective at detecting genetic mutations such as single
nucleotide alterations (e.g. SNVs) or copy number alterations (e.g. CNVs) in samples in which these
genetic alterations are present at low levels as a fraction of the total DNA in the sample. Thus the
sensitivity for detecting DNA or RNA from a cancer in a sample is exceptional. The methods can combine
any or all of the improvements provided herein for detecting CNVs and SNVs to achieve this exceptional
sensitivity.
Accordingly, provided herein in certain embodiments is a method for determining whether circulating tumor
nucleic acids are present in a sample of an individual, and a non-transitory computer-readable medium
comprising computer-readable code that, when executed by a processing device, causes the processing
device to carry out the method. The method includes the following steps:
C. analyzing the sample to determine a ploidy at a set of polymorphic loci on a chromosome segment in the
individual; and
D. determining the level of average allelic imbalance present at the polymorphic loci based on the ploidy
determination, wherein an average allelic imbalance equal to or greater than 0.4%, 0.45%, 0.5%, 0.6%,
0.7%, 0.75%, 0.8%, 0.9% or 1% is indicative of the presence of circulating tumor nucleic acids, such as
ctDNA, in the sample.
In certain illustrative examples, an average allelic imbalance greater than 0.4, 0.45 or 0.5% is
indicative of the presence of ctDNA. In certain embodiments, the method for determining whether
circulating tumor nucleic acids are present further comprises detecting a single nucleotide variant at a
single nucleotide variant site in a set of single nucleotide variant sites, wherein detecting an allelic
imbalance equal to or greater than 0.5%, detecting the single nucleotide variant, or both, is indicative
of the presence of circulating tumor nucleic acids in the sample. It will be understood that any of the
methods provided herein for detecting chromosomal ploidy or CNV can be used to determine the level of
allelic imbalance, typically expressed as average allelic imbalance. It will be understood that any of
the methods provided herein for detecting an SNV can be used to detect the single nucleotide variant for
this aspect of the present invention.
In certain embodiments, the method for determining whether circulating tumor nucleic acids are present
further comprises performing the method on a control sample with a known average allelic imbalance ratio.
The control, for example, can be a sample from the tumor of the individual. In some embodiments, the
control has an average allelic imbalance expected for the sample under analysis, for example an AAI
between 0.5% and 5%, or an average allelic imbalance ratio of 0.5%.
In certain embodiments, the analyzing step in the method for determining whether circulating tumor
nucleic acids are present includes analyzing a set of chromosome segments known to exhibit aneuploidy in
cancer. In certain embodiments, the analyzing step includes analyzing between 1,000 and 50,000, or
between 100 and 1000, polymorphic loci for ploidy. In certain embodiments, the analyzing step includes
analyzing between 100 and 1000 single nucleotide variant sites. For example, in these embodiments the
analyzing step can include performing a multiplex PCR to amplify amplicons across the 1000 to 50,000
polymorphic loci and the 100 to 1000 single nucleotide variant sites. The multiplex reaction can be set
up as a single reaction or as pools of different subset multiplex reactions. The multiplex reaction
methods provided herein, such as the massively multiplexed PCR disclosed herein, provide an exemplary
process for carrying out the amplification reaction that helps attain improved multiplexing and
therefore improved sensitivity levels.
In certain embodiments, the multiplex PCR reaction is carried out under limiting primer conditions for at
least 10%, 20%, 25%, 50%, 75%, 90%, 95%, 98%, 99% or 100% of the reactions. The improved conditions for
performing the massively multiplexed reactions provided herein can be used.
In certain aspects, the above method for determining whether circulating tumor nucleic acids are present
in a sample of an individual, and all of its embodiments, can be carried out with a system. The
disclosure provides teachings regarding specific functional and structural features for carrying out the
method. As a non-limiting example, the system includes the following:
A. an input processor configured to analyze data from the sample to determine a ploidy at a set of
polymorphic loci on a chromosome segment in the individual; and
B. a modeler configured to determine the level of allelic imbalance present at the polymorphic loci based
on the ploidy determination, wherein an allelic imbalance equal to or greater than 0.5% is indicative of
the presence of circulating tumor nucleic acids.
Illustrative embodiments for detecting single nucleotide variants
In certain aspects, provided herein are methods for detecting single nucleotide variants in a sample. The
improved methods provided herein can achieve limits of detection of 0.015, 0.017, 0.02, 0.05, 0.1, 0.2,
0.3, 0.4 or 0.5% SNV in a sample. All of the embodiments for detecting SNVs can be carried out with a
system. The disclosure provides teachings regarding specific functional and structural features for
carrying out the methods. Furthermore, provided herein are embodiments comprising a non-transitory
computer-readable medium comprising computer-readable code that, when executed by a processing device,
causes the processing device to carry out the methods for detecting SNVs provided herein.
Accordingly, provided herein in one embodiment is a method for determining whether a single nucleotide
variant is present at a set of genomic positions in a sample from an individual, the method comprising:
A. for each genomic position, generating an estimate of efficiency and a per-cycle error rate for an
amplicon spanning that genomic position, using a training data set;
B. receiving observed nucleotide identity information for each genomic position in the sample;
C. determining a set of probabilities of single nucleotide variant percentage resulting from one or more
real mutations at each of the genomic positions, by comparing the observed nucleotide identity
information at each genomic position to a model of different variant percentages, using the estimated
amplification efficiency and the per-cycle error rate for each genomic position independently; and
D. determining the most likely real variant percentage and confidence from the set of probabilities for
each of the genomic positions.
In illustrative embodiments of the method for determining whether a single nucleotide variant is present,
the estimates of efficiency and per-cycle error rate are generated for a set of amplicons that span the
genomic positions. For example, 2, 3, 4, 5, 10, 15, 20, 25, 50, 100 or more amplicons that span the
genomic positions can be included.
In illustrative embodiments of the method for determining whether a single nucleotide variant is present,
the observed nucleotide identity information comprises an observed number of total reads for each
genomic position and an observed number of variant allele reads for each genomic position.
In illustrative embodiments of the method for determining whether a single nucleotide variant is present,
the sample is a plasma sample and the single nucleotide variant is present in circulating tumor DNA of
the sample.
In another embodiment, provided herein is a method for estimating the percentage of single nucleotide
variants present in a sample from an individual. The method includes the following steps:
A. at a set of genomic positions, generating an estimate of efficiency and a per-cycle error rate for one
or more amplicons spanning those genomic positions, using a training data set;
B. receiving observed nucleotide identity information for each genomic position in the sample;
C. generating an estimated mean and variance for the total number of molecules, background error
molecules and real mutation molecules for a search space comprising an initial percentage of real
mutation molecules, using the amplification efficiency and the per-cycle error rate of the amplicons;
and
D. determining the percentage of single nucleotide variants present in the sample that result from real
mutations, by determining a most likely real single nucleotide variant percentage by fitting a
distribution, using the estimated means and variances, to the observed nucleotide identity information
in the sample.
In illustrative examples of the method for estimating the percentage of single nucleotide variants
present in a sample, the sample is a plasma sample and the single nucleotide variant is present in
circulating tumor DNA of the sample.
The training data set for this embodiment of the invention typically includes samples from one or,
preferably, a group of healthy individuals. In certain illustrative embodiments, the training data set is
analyzed on the same day, or even in the same run, as one or more test samples. For example, samples from
2, 3, 4, 5, 10, 15, 20, 25, 30, 36, 48, 96, 100, 192, 200, 250, 500, 1000 or more healthy individuals can
be used to generate the training data set. Where data is available for a larger number of healthy
individuals, for example 96 or more, confidence in the amplification efficiency estimates increases, even
if the runs are performed in advance of performing the method on the test samples. The PCR error rate can
use nucleic acid sequence information generated not only for the SNV base location but for the entire
amplified region around the SNV, since the error rate is per amplicon. For example, using samples from 50
individuals and sequencing a 20 base pair amplicon around the SNV, error frequency data from 1000 base
reads can be used to determine the error frequency rate.
Typically, the amplification efficiency is estimated by estimating a mean and standard deviation of the
amplification efficiency for an amplified segment and then fitting that to a distribution model, such as
a binomial distribution or a beta binomial distribution. Error rates are determined for a PCR reaction
with a known number of cycles, and then a per-cycle error rate is estimated.
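One way to picture the per-cycle error estimate is sketched below. The text does not give a formula, so
the model here is an assumption made for illustration: in each cycle a fraction `efficiency` of the
molecules is copied, each new copy acquires an error with probability `per_cycle_error`, and erroneous
molecules stay erroneous. Under that model the error-free fraction after n cycles is
((1 + p(1 - e)) / (1 + p))^n, which can be inverted to recover e. The distribution-fitting step for the
efficiency itself is not shown; only the mean and standard deviation are computed.

```python
from statistics import mean, stdev

def summarize_efficiency(per_sample_efficiencies):
    """Mean and standard deviation of amplification efficiency for one
    amplicon across training samples (inputs are hypothetical)."""
    return mean(per_sample_efficiencies), stdev(per_sample_efficiencies)

def error_fraction_after_cycles(per_cycle_error, efficiency, cycles):
    """Fraction of molecules carrying an error after `cycles` PCR cycles
    under the simple copy-with-error model described in the lead-in."""
    keep = (1.0 + efficiency * (1.0 - per_cycle_error)) / (1.0 + efficiency)
    return 1.0 - keep ** cycles

def per_cycle_error_rate(observed_error_fraction, efficiency, cycles):
    """Invert the model: estimate the per-cycle error rate from the error
    fraction observed after a known number of cycles."""
    per_cycle_loss = 1.0 - (1.0 - observed_error_fraction) ** (1.0 / cycles)
    return per_cycle_loss * (1.0 + efficiency) / efficiency

# Made-up training values for a single amplicon.
eff_mean, eff_sd = summarize_efficiency([0.88, 0.91, 0.90, 0.92, 0.89])
observed = error_fraction_after_cycles(1e-5, eff_mean, cycles=25)
estimated = per_cycle_error_rate(observed, eff_mean, cycles=25)
```

The round trip at the end only checks that the two functions are consistent with each other; it says
nothing about how well the assumed model matches a real PCR.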
In certain illustrative embodiments, estimating the starting molecules for the test data set further
comprises updating the efficiency estimate for the test data set using the number of starting molecules
estimated in step (b), if the observed number of reads is significantly different from the estimated
number of reads. The estimate can then be updated for a new efficiency and/or number of starting
molecules.
The search space used for estimating the total number of molecules, background error molecules and real
mutation molecules can range from a lower limit of 0.1%, 0.2%, 0.25%, 0.5%, 1%, 2.5%, 5%, 10%, 15%, 20%
or 25% to an upper limit of 1%, 2%, 2.5%, 5%, 10%, 12.5%, 15%, 20%, 25%, 50%, 75%, 90% or 95% of the
copies of the base at an SNV position being the SNV base. Lower ranges, with a lower limit of 0.1%, 0.2%,
0.25%, 0.5% or 1% and an upper limit of 1%, 2%, 2.5%, 5%, 10%, 12.5% or 15%, can be used in illustrative
examples for plasma samples in which the method is detecting circulating tumor DNA. Higher ranges are
used for tumor samples.
A distribution is fit to the number of total error molecules (background error and real mutation) among
the total molecules, to calculate the likelihood or probability for each possible real mutation in the
search space. This distribution can be a binomial distribution or a beta binomial distribution.
The most likely real mutation is determined by determining the most likely real mutation percentage and
calculating the confidence using the data from the fitted distribution. As an illustrative example, and
not intended to limit the clinical interpretation of the methods provided herein, if the mean mutation
rate is high, then the percent confidence needed to make a positive determination of an SNV is lower. For
example, if the mean mutation rate for an SNV in a sample under the most likely hypothesis is 5% and the
percent confidence is 99%, then a positive SNV call would be made. On the other hand, for this
illustrative example, if the mean mutation rate for an SNV in a sample under the most likely hypothesis
is 1% and the percent confidence is 50%, then in certain situations a positive SNV call would not be
made. It will be understood that clinical interpretation of the data would be a function of sensitivity,
specificity, prevalence rate, and alternative product availability.
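The search over candidate mutation percentages can be sketched as a simple grid search. This is a
simplified stand-in, not the method of the disclosure: it takes a single background error fraction as
given (rather than deriving it from amplification efficiency and per-cycle error rate), uses a plain
binomial likelihood, and reports as "confidence" the normalized likelihood of the best grid point under a
flat prior. The function names, grid and read counts are all hypothetical.

```python
from math import exp, lgamma, log

def binom_logpmf(k, n, p):
    """Log-probability of k variant reads out of n at per-read rate p."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log(1.0 - p)

def most_likely_mutation_fraction(variant_reads, depth, background_error, grid):
    """Score each candidate real-mutation fraction in `grid` and return
    (best fraction, normalized likelihood of that fraction)."""
    log_liks = [
        # Variant reads come from real mutations plus background errors
        # on the remaining non-mutant molecules.
        binom_logpmf(variant_reads, depth,
                     theta + (1.0 - theta) * background_error)
        for theta in grid
    ]
    peak = max(log_liks)
    weights = [exp(ll - peak) for ll in log_liks]  # stable normalization
    best = max(range(len(grid)), key=log_liks.__getitem__)
    return grid[best], weights[best] / sum(weights)

# Search space of 0% to 15% in 0.05% steps, a plasma-style range.
grid = [i / 2000 for i in range(0, 301)]
fraction, confidence = most_likely_mutation_fraction(
    variant_reads=50, depth=10_000, background_error=0.0005, grid=grid)
```

Because the reported confidence is spread over the grid points, its value depends on the grid spacing;
a real caller would likely summarize the likelihood surface differently.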
In one illustrative embodiment, the sample is a circulating DNA sample, such as a circulating tumor DNA
sample.
In another embodiment, provided herein is a method for detecting one or more single nucleotide variants
in a test sample from an individual. The method according to this embodiment includes the following
steps:
D. determining a median variant allele frequency for a plurality of control samples from each of a
plurality of normal individuals, for each single nucleotide variant position in a set of single
nucleotide variant positions, based on results generated in a sequencing run, to identify selected
single nucleotide variant positions having median variant allele frequencies in the normal samples below
a threshold value, and to determine a background error for each of the single nucleotide variant
positions after removing outlier samples for each of the single nucleotide variant positions;
E. determining an observed depth-of-read weighted mean and variance for the selected single nucleotide
variant positions for the test sample, based on data generated in the sequencing run for the test
sample; and
F. identifying, using a computer, one or more single nucleotide variant positions with a statistically
significant depth-of-read weighted mean compared to the background error for that position, thereby
detecting the one or more single nucleotide variants.
In certain embodiments of this method for detecting one or more SNVs, the sample is a plasma sample, the
control samples are plasma samples, and the one or more single nucleotide variants detected are present
in circulating tumor DNA of the sample. In certain embodiments of this method for detecting one or more
SNVs, the plurality of control samples comprises at least 25 samples. In certain illustrative
embodiments, the plurality of control samples ranges from a lower limit of at least 5, 10, 15, 20, 25,
50, 75, 100, 200 or 250 samples to an upper limit of 10, 15, 20, 25, 50, 75, 100, 200, 250, 500 or 1000
samples.
In certain embodiments of this method for detecting one or more SNVs, outliers are removed from the data
generated in the high throughput sequencing run in order to calculate the observed depth-of-read weighted
mean, and the observed variance is determined. In certain embodiments of this method for detecting one or
more SNVs, the depth of read for each single nucleotide variant position for the test sample is at least
100 reads.
In certain embodiments of this method for detecting one or more SNVs, the sequencing run comprises a
multiplex amplification reaction performed under limiting primer reaction conditions. In illustrative
examples, the improved methods for performing multiplex amplification reactions provided herein are used
to perform these embodiments.
Without being limited by theory, the methods of this embodiment use a background error model built from
normal plasma samples that are sequenced in the same sequencing run as the test sample, to account for
run-specific artifacts. Noisy positions whose normal median variant allele frequency is above a
threshold, for example > 0.1%, 0.2%, 0.25%, 0.5%, 0.75% or 1.0%, are removed.
Outlier samples are iteratively removed from the model to account for noise and contamination. For each
base substitution at each genomic locus, the depth-of-read weighted mean and the standard deviation of
the error are calculated. In certain illustrative embodiments, a position in a sample, such as a tumor or
cell-free plasma sample, that has at least a threshold number of reads at a single nucleotide variant
position, for example at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 250, 500 or 1000 variant
reads, and a Z score greater than 2.5, 5, 7.5 or 10 against the background error model in certain
embodiments, is counted as a candidate mutation.
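A minimal sketch of this background error model is given below. It assumes the normal samples for one
position and base substitution are available as (variant reads, depth) pairs, picks 0.1% as the noise
threshold, 5 as the minimum variant reads and 5 as the Z-score cutoff from the ranges listed above, and
leaves out the iterative outlier removal. The function names and the example counts are hypothetical.

```python
from math import sqrt
from statistics import median

def background_model(normal_counts, max_median_vaf=0.001):
    """Build a (mean, sd) background error model for one position from
    normal samples, or return None if the position is too noisy.

    normal_counts: list of (variant_reads, depth), one per normal sample.
    """
    vafs = [v / d for v, d in normal_counts]
    if median(vafs) > max_median_vaf:
        return None  # noisy position: removed from analysis
    total_depth = sum(d for _, d in normal_counts)
    # Depth-weighted mean of the per-sample variant allele frequencies.
    w_mean = sum(v for v, _ in normal_counts) / total_depth
    w_var = sum(d * (v / d - w_mean) ** 2 for v, d in normal_counts) / total_depth
    return w_mean, sqrt(w_var)

def is_candidate_mutation(variant_reads, depth, model,
                          min_variant_reads=5, min_z=5.0):
    """Call a candidate when the test sample has enough variant reads and
    its VAF stands out from the background model by more than min_z."""
    if model is None:
        return False
    mu, sd = model
    if variant_reads < min_variant_reads or sd == 0:
        return False  # conservative choice when no spread can be estimated
    return (variant_reads / depth - mu) / sd > min_z

# Made-up counts for five normal plasma samples at one position.
normals = [(2, 10_000), (3, 12_000), (1, 9_000), (4, 11_000), (2, 10_000)]
model = background_model(normals)
call_high = is_candidate_mutation(40, 10_000, model)
call_low = is_candidate_mutation(3, 10_000, model)
```

Any hit from such a filter would still be treated as a candidate, in line with the text, rather than as
a confirmed mutation.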
In certain embodiments, a depth of read ranging from a lower limit of greater than 100, 250, 500, 1,000,
2,000, 2,500, 5,000, 10,000, 20,000, 25,000, 50,000 or 100,000 reads to an upper limit of 2,000, 2,500,
5,000, 7,500, 10,000, 25,000, 50,000, 100,000, 250,000 or 500,000 reads is attained in the sequencing run
for each single nucleotide variant position in the set of single nucleotide variant positions.
Typically, the sequencing run is a high throughput sequencing run. In illustrative embodiments, the mean
or median values generated for the test sample are weighted by depth of read. Therefore, the likelihood
that a variant allele determination is real is higher in a sample with 1 variant allele detected in 1000
reads than in a sample with 1 variant allele detected in 10,000 reads. Since a determination of a variant
allele (i.e. a mutation) is not made with 100% confidence, the identified single nucleotide variant can
be considered a candidate variant or candidate mutation.
Exemplary test statistics for the analysis of phased data
An exemplary test statistic is described below for analyzing phased data from a sample that is known or
suspected to be a mixed sample containing DNA or RNA originating from two or more genetically non-identical
cells. Let f denote the fraction of DNA or RNA of interest, for example the fraction of DNA or RNA with a
CNV of interest, or the fraction of DNA or RNA from cells of interest (such as cancer cells). In some
embodiments for prenatal testing, f denotes the fraction of fetal DNA, RNA or cells in a mixture of fetal
and maternal DNA, RNA or cells. Note that this refers to the fraction of DNA from the cells of interest,
assuming that each cell of interest contributes two copies of DNA. This differs from the fraction of DNA
from the cells of interest at a segment that is deleted or duplicated.
The possible allele values of each SNP are denoted A and B. AA, AB, BA and BB denote all possible ordered
allele pairs. In some embodiments, SNPs with ordered alleles AB or BA are analyzed. Let Ni denote the
number of sequence reads of the i-th SNP, and let Ai and Bi denote the number of reads of the i-th SNP that
indicate allele A and allele B, respectively. It is assumed that:
Ni = Ai + Bi
The allele ratio Ri is defined as:
Ri = Ai / Ni
Let T denote the number of target SNPs.
Without loss of generality, some embodiments focus on a single chromosome segment. For further clarity,
the phrase "a first homologous chromosome segment as compared to a second homologous chromosome segment"
in this specification means a first homolog of a chromosome segment and a second homolog of the chromosome
segment. In some such embodiments, all of the target SNPs are contained in the chromosome segment of
interest. In other embodiments, multiple chromosome segments are analyzed for possible copy number
variations.
MAP (maximum a posteriori) estimation
This method uses knowledge of the phase, via the ordered alleles, to detect a deletion or duplication of
the target segment. For each SNP i, define
Then define
The distributions of Xi and S under the various copy number hypotheses (such as the disomy hypothesis, the
hypothesis of a deletion of the first or second homolog, or the hypothesis of a duplication of the first or
second homolog) are described below.
Disomy hypothesis
Under the hypothesis that the target segment is neither deleted nor duplicated, Xi is a binary random
variable whose distribution does not depend on f. If a constant read depth N is assumed, S has a binomial
distribution over the T SNPs.
Deletion hypothesis
Under the hypothesis that the first homolog is deleted (i.e. an AB SNP becomes B and a BA SNP becomes A),
Ri has a binomial distribution whose parameters depend on f and differ between AB SNPs and BA SNPs. If a
constant read depth N is assumed, this gives a binomial distribution for S over the T SNPs.
Under the hypothesis that the second homolog is deleted (i.e. an AB SNP becomes A and a BA SNP becomes B),
Ri likewise has a binomial distribution, with one set of parameters for AB SNPs and another for BA SNPs.
If a constant read depth N is assumed, this again gives a binomial distribution for S over the T SNPs.
Duplication hypothesis
Under the hypothesis that the first homolog is duplicated (i.e. an AB SNP becomes AAB and a BA SNP becomes
BBA), Ri has a binomial distribution whose parameters depend on f and differ between AB SNPs and BA SNPs.
If a constant read depth N is assumed, this gives a binomial distribution for S over the T SNPs.
Under the hypothesis that the second homolog is duplicated (i.e. an AB SNP becomes ABB and a BA SNP becomes
BAA), Ri likewise has a binomial distribution, with one set of parameters for AB SNPs and another for BA
SNPs. If a constant read depth N is assumed, this again gives a binomial distribution for S over the T SNPs.
Classification
As shown in the sections above, Xi is a binary random variable whose distribution is known under each
hypothesis. This allows the probability of the test statistic S to be calculated under each hypothesis.
The probability of each hypothesis given the measured data can then be calculated. In some embodiments,
the hypothesis with the greatest probability is selected. If desired, the distribution of S can be
simplified either by approximating each Ni with a constant read depth N or by truncating the read depths
to a constant N. Under this simplification, S is binomially distributed under each hypothesis.
The value of f can be estimated by selecting the most likely value of f given the measured data, such as
the value of f that gives the best data fit, using an algorithm (e.g. a search algorithm) such as maximum
likelihood estimation, maximum a posteriori estimation or Bayesian estimation. In some embodiments,
multiple chromosome segments are analyzed and a value of f is estimated from the data for each segment.
If all target cells have these duplications or deletions, the estimates of f based on the data from these
different segments are similar. In some embodiments, f is measured experimentally, for example by
determining the fraction of DNA or RNA from cancer cells based on methylation differences (hypomethylation
or hypermethylation) between cancer and non-cancerous DNA or RNA.
In some embodiments for mixed samples of fetal and maternal nucleic acids, the value of f is the fetal
fraction, that is, the fraction of fetal DNA (or RNA) out of the total amount of DNA (or RNA) in the sample.
In some embodiments, the fetal fraction is determined by obtaining genotypic data from a maternal blood
sample (or a fraction thereof) for a set of polymorphic loci on at least one chromosome that is expected to
be disomic in both the mother and the fetus; creating a plurality of hypotheses, each corresponding to a
different possible fetal fraction at the chromosome; building a model for the expected allele measurements
in the blood sample at the set of polymorphic loci on the chromosome for the possible fetal fractions;
calculating a relative probability of each fetal fraction hypothesis using the model and the allele
measurements from the blood sample or fraction thereof; and determining the fetal fraction in the blood
sample by selecting the fetal fraction that corresponds to the hypothesis with the greatest probability.
In some embodiments, the fetal fraction is determined by identifying those polymorphic loci where the
mother is homozygous for a first allele at the polymorphic locus and the father is (i) heterozygous for the
first allele and a second allele or (ii) homozygous for the second allele at the polymorphic locus; and
using the amount of the second allele detected in the blood sample for each of the identified polymorphic
loci to determine the fetal fraction in the blood sample (see, e.g., U.S. Publication No. 2012/0185176,
filed March 29, 2012, and U.S. Publication No. 2004/0065621, filed March 13, 2013, which are each
incorporated herein by reference in their entirety).
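The second approach above can be illustrated for case (ii), where the mother is homozygous AA and the father is homozygous BB, so the fetus is necessarily AB. Each fetal genome copy then carries the B allele half of the time, the expected B-read fraction is f/2, and f can be estimated as twice the pooled B-read fraction. The function name and the read counts are hypothetical, and the sketch ignores sequencing error and contamination.

```python
def fetal_fraction(informative_snps):
    """Estimate f from (A_reads, B_reads) at SNPs where the mother is AA and the father is BB."""
    b_total = sum(b for _, b in informative_snps)
    n_total = sum(a + b for a, b in informative_snps)
    return 2.0 * b_total / n_total

# Hypothetical counts at three informative SNPs.
counts = [(940, 60), (1890, 110), (475, 25)]
f_hat = fetal_fraction(counts)
print(round(f_hat, 4))
```

Pooling the reads before taking the ratio weights each SNP by its read depth, so deeply covered loci contribute more to the estimate.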
Another method for determining the fetal fraction includes using a high-throughput DNA sequencer to count
alleles at a large number of polymorphic (such as SNP) genetic loci and modeling the likely fetal fraction
(see, e.g., U.S. Publication No. 2012/0264121, which is incorporated herein by reference in its entirety).
Another method for calculating the fetal fraction can be found in Sparks et al., "Noninvasive prenatal
detection and selective analysis of cell-free DNA obtained from maternal blood: evaluation for trisomy 21
and trisomy 18," Am J Obstet Gynecol 2012;206:319.e1-9, which is incorporated herein by reference in its
entirety. In some embodiments, the fetal fraction is determined using a methylation assay (see, e.g., U.S.
Patent Nos. 7,754,428; 7,901,884; and 8,166,382, which are each incorporated herein by reference in their
entirety) that assumes certain loci are methylated or preferentially methylated in the fetus, and that
those same loci are unmethylated or preferentially unmethylated in the mother.
Figures 1A-13D are graphs showing the distribution of the test statistic S divided by T (the number of
SNPs) ("S/T") for various copy number hypotheses, for various read depths and tumor fractions (where f is
the fraction of tumor DNA out of total DNA), for an increasing number of SNPs.
Single hypothesis rejection
The distribution of S under the disomy hypothesis does not depend on f. Thus, the probability of the
measured data can be calculated for the disomy hypothesis without calculating f. A single hypothesis
rejection test can be used for the null hypothesis of disomy. In some embodiments, the probability of S
under the disomy hypothesis is calculated, and the disomy hypothesis is rejected if the probability is
below a given threshold value (such as less than 1 in 1,000). This indicates that a duplication or deletion
of the chromosome segment is present. If desired, the false positive rate can be altered by adjusting the
threshold value.
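A single hypothesis rejection test of this kind can be sketched with an exact binomial tail probability. The sketch assumes that, under disomy, S follows a binomial distribution with T trials and success probability 0.5; that success probability, the two-sided form of the test and the function names are assumptions made for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def disomy_p_value(s, t, p0=0.5):
    """Two-sided probability of outcomes no more likely than S = s under the disomy null."""
    observed = binom_pmf(s, t, p0)
    return sum(binom_pmf(k, t, p0) for k in range(t + 1)
               if binom_pmf(k, t, p0) <= observed * (1 + 1e-12))

def reject_disomy(s, t, threshold=1e-3):
    """Reject disomy when the probability of the data falls below the threshold."""
    return disomy_p_value(s, t) < threshold

print(reject_disomy(50, 100), reject_disomy(75, 100))
```

Lowering `threshold` reduces the false positive rate at the cost of sensitivity, which is the trade-off described above.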
Exemplary methods for the analysis of phased data
Exemplary methods are described below for analyzing data from a sample known or suspected to be a mixed
sample containing DNA or RNA that originated from two or more genetically non-identical cells. In some
embodiments, phased data is used. In some embodiments, the method involves determining, for each calculated
allele ratio, whether the calculated allele ratio is above or below the expected allele ratio, and the
magnitude of the difference, for a particular locus. In some embodiments, a likelihood distribution is
determined for the allele ratio at a locus under a particular hypothesis, and the closer the calculated
allele ratio is to the center of the likelihood distribution, the more likely the hypothesis is correct.
In some embodiments, the method involves determining the likelihood that a hypothesis is correct for each
locus. In some embodiments, the method involves determining the likelihood that a hypothesis is correct
for each locus and combining the probabilities of that hypothesis across the loci, and the hypothesis with
the greatest combined probability is selected. In some embodiments, the method involves determining the
likelihood that a hypothesis is correct for each locus and for each possible ratio of DNA or RNA from the
one or more target cells to the total DNA or RNA in the sample. In some embodiments, a combined probability
for each hypothesis is determined by combining the probabilities of that hypothesis for each locus and each
possible ratio, and the hypothesis with the greatest combined probability is selected.
In one embodiment, the following hypotheses are considered: H11 (all cells are normal), H10 (presence of
cells with only homolog 1, hence homolog 2 is deleted), H01 (presence of cells with only homolog 2, hence
homolog 1 is deleted), H21 (presence of cells with a duplication of homolog 1), and H12 (presence of cells
with a duplication of homolog 2). For a fraction f of target cells such as cancer cells or mosaic cells (or
the fraction of DNA or RNA from the target cells), the expected allele ratio for heterozygous (AB or BA)
SNPs can be found as follows:
Equation (1):
r(AB, H11) = r(BA, H11) = 0.5,
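Only the disomy line of Equation (1) is legible here. The remaining lines can be derived by counting allele copies, under the assumptions that r is the A-allele fraction, that homolog 1 carries the first allele of the ordered pair, that a fraction f of the DNA comes from target cells carrying the change, and that every other cell contributes one copy of each homolog. For an AB SNP under H10, for example, homolog 1 contributes 1 copy of A and homolog 2 contributes 1 - f copies of B. The following is a derivation under those assumptions rather than a transcription of the original equation:

```latex
\begin{aligned}
r(AB,H_{10}) &= r(BA,H_{01}) = \frac{1}{2-f}, &
r(AB,H_{01}) &= r(BA,H_{10}) = \frac{1-f}{2-f},\\[4pt]
r(AB,H_{21}) &= r(BA,H_{12}) = \frac{1+f}{2+f}, &
r(AB,H_{12}) &= r(BA,H_{21}) = \frac{1}{2+f}.
\end{aligned}
```

All four expressions reduce to 0.5 at f = 0, in agreement with the H11 line.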
Bias, contamination and sequencing error correction:
The observation Ds at a SNP consists of the number of original mapped reads with each allele present,
nA^o and nB^o. The corrected reads nA and nB can then be found using the expected bias in the
amplification of the A and B alleles.
Let ca denote the ambient contamination (such as contamination from DNA in the air or environment) and let
r(ca) denote the allele ratio of the ambient contaminant (which is initially taken to be 0.5). Moreover,
cg denotes the genotyped contamination rate (such as contamination from another sample), and r(cg) is the
allele ratio of that contaminant. Let se(A, B) and se(B, A) denote the sequencing errors for calling one
allele as a different allele (such as erroneously detecting an A allele when a B allele is present).
The observed allele ratio q(r, ca, r(ca), cg, r(cg), se(A, B), se(B, A)) can be found for a given expected
allele ratio r by correcting for ambient contamination, genotyped contamination and sequencing error.
Since the genotype of the contaminant is unknown, population frequencies can be used to find P(r(cg)).
More specifically, let p be the population frequency of one of the alleles (which may be referred to as a
reference allele). Then P(r(cg) = 0) = (1 - p)^2, P(r(cg) = 0.5) = 2p(1 - p) and P(r(cg) = 1) = p^2. The
conditional expectation over r(cg) can be used to determine E[q(r, ca, r(ca), cg, r(cg), se(A, B),
se(B, A))]. Note that the ambient and genotyped contamination are determined using homozygous SNPs, and
hence they are not affected by the absence or presence of deletions or duplications. Moreover, the ambient
and genotyped contamination can be measured using a reference chromosome, if desired.
Likelihood at each SNP:
The equation below gives the probability of observing nA and nB given an allele ratio r:
Equation (2):
Let Ds denote the data for SNP s. For each hypothesis h ∈ {H11, H01, H10, H21, H12}, one can let
r = r(AB, h) or r = r(BA, h) in equation (1) and find the conditional expectation over r(cg) to determine
the observed allele ratio E[q(r, ca, r(ca), cg, r(cg), se(A, B), se(B, A))]. Then, letting
r = E[q(r, ca, r(ca), cg, r(cg), se(A, B), se(B, A))] in equation (2), one can determine P(Ds | h, f).
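Neither the expression for q nor Equation (2) is legible above, so the sketch below rests on two assumptions that should be read as illustrative: (a) q is a linear mixture in which contaminant DNA replaces a share ca + cg of the sample and sequencing error then flips a small share of the reads, and (b) Equation (2) is a binomial probability, which is consistent with the later remark that a beta-binomial may be used instead. All function and parameter names are hypothetical.

```python
from math import comb

def observed_ratio(r, c_a, r_ca, c_g, r_cg, se_ab, se_ba):
    """Assumed mixing model for q: contamination first, then sequencing error.
    se_ab: probability an A read is called B; se_ba: probability a B read is called A."""
    mixed = (1 - c_a - c_g) * r + c_a * r_ca + c_g * r_cg
    return mixed * (1 - se_ab) + (1 - mixed) * se_ba

def expected_observed_ratio(r, p, c_a=0.0, r_ca=0.5, c_g=0.0, se_ab=0.0, se_ba=0.0):
    """E[q] over the unknown contaminant genotype, with population frequency p of allele A."""
    prior = {0.0: (1 - p) ** 2, 0.5: 2 * p * (1 - p), 1.0: p ** 2}
    return sum(w * observed_ratio(r, c_a, r_ca, c_g, r_cg, se_ab, se_ba)
               for r_cg, w in prior.items())

def snp_likelihood(n_a, n_b, q):
    """Assumed Equation (2): binomial probability of the corrected counts given ratio q."""
    return comb(n_a + n_b, n_a) * q ** n_a * (1 - q) ** n_b

# Hypothetical SNP: expected ratio 0.4, 10% genotyped contamination, p = 0.3.
q = expected_observed_ratio(0.4, p=0.3, c_g=0.1)
print(q, snp_likelihood(40, 60, q))
```

Because the assumed mixture is linear in r(cg), the expectation over the three contaminant genotypes equals the mixture evaluated at r(cg) = p.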
Search algorithm:
In some embodiments, SNPs with allele ratios that appear to be outliers are ignored (such as by ignoring or
eliminating SNPs with allele ratios that are at least 2 or 3 standard deviations above or below the mean
value). Note that an advantage identified for this approach is that, in the presence of a higher mosaicism
percentage, the variability of the allele ratios may be high, which ensures that SNPs will not be trimmed
because of mosaicism.
Let F = {f1, ..., fN} denote the search space for the mosaicism percentage (such as the tumor fraction).
One can determine P(Ds | h, f) at each SNP s and each f ∈ F, and combine the likelihoods over all SNPs.
The algorithm goes over each f for each hypothesis. Using a search method, one concludes that mosaicism
exists if there is a range F* of f where the confidence of the deletion or duplication hypothesis is higher
than the confidence of the no-deletion and no-duplication hypotheses. In some embodiments, the maximum
likelihood estimate of P(Ds | h, f) in F* is determined. If desired, the conditional expectation over
f ∈ F* may be determined. If desired, the confidence of each hypothesis can be determined.
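The search over hypotheses and mosaicism fractions can be sketched as a plain grid search that sums per-SNP log-likelihoods. The sketch assumes a binomial likelihood without contamination or error correction, expected ratios obtained by counting allele copies on each homolog, and a grid that stays strictly between 0 and 1; the function names and the example counts are hypothetical.

```python
from math import lgamma, log

def expected_ratio(genotype, hypothesis, f):
    """Expected A-allele fraction at an AB or BA SNP, from copies on (homolog 1, homolog 2)."""
    copies = {"H11": (1, 1), "H10": (1, 1 - f), "H01": (1 - f, 1),
              "H21": (1 + f, 1), "H12": (1, 1 + f)}[hypothesis]
    a, b = copies if genotype == "AB" else copies[::-1]
    return a / (a + b)

def log_binom(n_a, n_b, r):
    """Log of the binomial probability of n_a A reads and n_b B reads given ratio r."""
    n = n_a + n_b
    return (lgamma(n + 1) - lgamma(n_a + 1) - lgamma(n_b + 1)
            + n_a * log(r) + n_b * log(1 - r))

def search(snps, grid):
    """Return (log-likelihood, hypothesis, f) with the greatest combined likelihood."""
    best = None
    for h in ("H11", "H10", "H01", "H21", "H12"):
        for f in ([0.0] if h == "H11" else grid):   # H11 does not depend on f
            ll = sum(log_binom(a, b, expected_ratio(g, h, f)) for a, b, g in snps)
            if best is None or ll > best[0]:
                best = (ll, h, f)
    return best

# Hypothetical phased counts consistent with a deletion of homolog 1 at f = 0.2.
snps = [(444, 556, "AB"), (556, 444, "BA")] * 5
print(search(snps, grid=[0.05, 0.1, 0.2, 0.3, 0.4]))
```

Working in log space keeps the product over many SNPs from underflowing, and comparing the best deletion or duplication score against the H11 score gives a simple measure of confidence.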
Additional embodiments:
In some embodiments, a beta-binomial distribution is used instead of a binomial distribution. In some
embodiments, a reference chromosome or chromosome segment is used to determine the sample-specific
parameters of the beta-binomial distribution.
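A beta-binomial probability can be computed with log-gamma functions as sketched below. The parameterization by a mean allele ratio and a dispersion value (the sum of the two beta parameters) is an assumption chosen for illustration; in practice the dispersion would be estimated from a reference chromosome or segment as described above.

```python
from math import lgamma, exp

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(k, n, mean, dispersion):
    """P(k of n reads show allele A); the ratio has a Beta prior with the given mean
    and alpha + beta = dispersion (hypothetical parameterization)."""
    alpha, beta = mean * dispersion, (1 - mean) * dispersion
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))

# A larger dispersion value approaches the binomial; a smaller one widens the tails.
print(beta_binomial_pmf(50, 100, 0.5, 200.0), beta_binomial_pmf(50, 100, 0.5, 20.0))
```

The extra parameter lets the model absorb over-dispersion from amplification that a plain binomial would misread as evidence for a copy number change.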
Theoretical performance using simulations:
If desired, the theoretical performance of the algorithm can be evaluated by randomly assigning the number
of reference reads to a SNP with a given depth of read (DOR). For the normal case, p = 0.5 is used as the
binomial probability parameter, and for deletions or duplications p is adjusted accordingly. Exemplary
input parameters for each simulation are as follows: (1) the number of SNPs S, (2) the constant DOR D per
SNP, (3) p, and (4) the number of experiments.
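A simulation with these four inputs can be sketched as follows. The example models a deletion of the homolog that carries allele A, so the binomial probability is taken as (1 - m)/(2 - m) for a mosaicism fraction m, which equals 0.5 when m is 0. That mapping, the function names and the small example sizes are assumptions chosen so the sketch runs quickly.

```python
import random

def simulate_reference_counts(num_snps, depth, prob, rng):
    """Draw the A-allele read count of each SNP from Binomial(depth, prob)."""
    return [sum(rng.random() < prob for _ in range(depth)) for _ in range(num_snps)]

def run_experiment(num_snps, depth, mosaicism, trials, seed=0):
    """Mean A-allele fraction per trial for a deletion of the homolog carrying A."""
    rng = random.Random(seed)
    prob = (1 - mosaicism) / (2 - mosaicism)   # 0.5 when mosaicism is 0
    means = []
    for _ in range(trials):
        counts = simulate_reference_counts(num_snps, depth, prob, rng)
        means.append(sum(counts) / (num_snps * depth))
    return means

# Small hypothetical setting: 20 SNPs, depth 200, 5 trials, with and without a deletion.
print(run_experiment(20, 200, 0.0, 5))
print(run_experiment(20, 200, 0.5, 5))
```

Feeding each simulated trial to a caller and counting wrong calls at m = 0 and m > 0 gives the false positive and false negative rates.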
First simulation experiment:
This experiment focused on S ∈ {500, 1000}, D ∈ {500, 1000} and p ∈ {0%, 1%, 2%, 3%, 4%, 5%}. 1,000
simulation experiments were performed in each setting (hence 24,000 experiments with phase and 24,000
without phase). The reads were simulated from a binomial distribution (other distributions can be used if
desired). The false positive rate (for p = 0%) and the false negative rate (for p > 0%) were determined
both with and without phase information. The false positive rates are listed in Figure 26. Note that phase
information is very helpful, especially for S = 500, D = 1000. For S = 500, D = 500, however, the algorithm
has the highest false positive rate of the conditions tested, both with and without phase. The false
negative rates are listed in Figure 27.
Phase information is particularly useful for low mosaicism percentages (≤ 3%). Without phase information,
a high level of false negatives was observed for p = 1%, because the confidence of a deletion is determined
by assigning equal probability to H10 and H01, and a small deviation in favor of one hypothesis is not
sufficient to compensate for the low likelihood from the other hypothesis. This applies to duplications as
well. Note also that the algorithm appears to be more sensitive to read depth than to the number of SNPs.
For the results with phase information, it is assumed that perfect phase information is available for a
large number of consecutive heterozygous SNPs. If desired, haplotype information can be obtained by
probabilistically combining haplotypes on smaller segments.
Second simulation experiment:
This experiment focused on S ∈ {100, 200, 300, 400, 500}, D ∈ {1000, 2000, 3000, 4000, 5000} and
p ∈ {0%, 1%, 1.5%, 2%, 2.5%, 3%}, with 10,000 random experiments in each setting. The false positive rate
(for p = 0%) and the false negative rate (for p > 0%) were determined both with and without phase
information. With haplotype information, the false negative rate is below 10% for D ≥ 3000 and N ≥ 200
(N here being the number of SNPs), whereas without haplotype information the same performance is reached
for D = 5000 and N ≥ 400 (Figures 20A and 20B). The difference between the false negative rates is
particularly pronounced for small mosaicism percentages (Figures 21A-25B). For example, when p = 1%, a
false negative rate of less than 20% is never reached without haplotype data, whereas with haplotype data
it is close to 0% for N > 300 and D ≥ 3000. For p = 3%, a 0% false negative rate is observed with
haplotype data, whereas N ≥ 300 and D ≥ 3000 are needed to reach the same performance without haplotype
data.
Exemplary methods for detecting deletions and duplications without phased data
In some embodiments, unphased genetic data is used to determine whether there is an overrepresentation of
the number of copies of a first homologous chromosome segment as compared to a second homologous chromosome
segment in the genome of an individual (such as in the genome of one or more cells, or in cfDNA or cfRNA).
In some embodiments, phased genetic data is used but the phasing is ignored. In some embodiments, the
sample of DNA or RNA is a mixed sample of cfDNA or cfRNA from the individual that includes cfDNA or cfRNA
from two or more genetically different cells. In some embodiments, the method uses the magnitude of the
difference between the calculated allele ratio and the expected allele ratio for each of the loci.
In some embodiments, the method involves obtaining genetic data at a set of polymorphic loci on the
chromosome or chromosome segment, in a sample of DNA or RNA from one or more cells of the individual, by
measuring the quantity of each allele at each locus. In some embodiments, allele ratios are calculated for
the loci that are heterozygous in at least one cell from which the sample was derived (such as loci that
are heterozygous in the fetus and/or heterozygous in the mother). In some embodiments, the calculated
allele ratio for a particular locus is the measured quantity of one of the alleles divided by the total
measured quantity of all the alleles for the locus. In some embodiments, the calculated allele ratio for a
particular locus is the measured quantity of one of the alleles (such as the allele on the first homologous
chromosome segment) divided by the measured quantity of one or more other alleles (such as the allele on
the second homologous chromosome segment) for the locus. The calculated allele ratios and expected allele
ratios may be calculated using any of the methods described herein or any standard method (such as any
mathematical transformation of the calculated allele ratios or expected allele ratios described herein).
In some embodiments, a test statistic is calculated based on the magnitude of the difference between the
calculated allele ratio and the expected allele ratio for each of the loci. In some embodiments, the test
statistic Δ is calculated using the following formula
where δi is the magnitude of the difference between the calculated allele ratio and the expected allele
ratio for the i-th locus;
where μi is the mean value of δi; and
where σi^2 is the variance of δi (σi being its standard deviation).
For example, when the expected allele ratio is 0.5, δi can be defined as follows:
δi = |Ri - 0.5|
The values of μi and σi can be computed using the fact that Ri is a binomial random variable. In some
embodiments, the standard deviation is assumed to be the same for all the loci. In some embodiments, the
average or weighted average value of the standard deviation, or an estimate of the standard deviation, is
used for the value of σi. In some embodiments, the test statistic is assumed to have a normal distribution.
For example, the central limit theorem implies that the distribution of Δ converges to a standard normal
distribution as the number of loci (such as the number of SNPs T) grows large.
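The formula for Δ is not legible above. One form that matches the definitions of δi, μi and σi and converges to a standard normal distribution is Δ = Σ(δi - μi) / sqrt(Σσi^2), and the sketch below assumes that form. It computes μi and σi^2 exactly from the binomial distribution of the A-read count under the expected ratio. The function names are hypothetical, and the exact enumeration is only practical for read depths up to roughly 1,000.

```python
from math import comb, sqrt

def null_moments(n, r0=0.5):
    """Exact mean and variance of |A/n - r0| when A ~ Binomial(n, r0)."""
    pmf = [comb(n, k) * r0**k * (1 - r0)**(n - k) for k in range(n + 1)]
    mean = sum(p * abs(k / n - r0) for k, p in enumerate(pmf))
    second = sum(p * (k / n - r0) ** 2 for k, p in enumerate(pmf))
    return mean, second - mean ** 2

def delta_statistic(counts, r0=0.5):
    """Assumed Delta for (A_i, B_i) counts at heterozygous loci, ignoring phase."""
    num, var = 0.0, 0.0
    for a, b in counts:
        mu, sigma2 = null_moments(a + b, r0)
        num += abs(a / (a + b) - r0) - mu
        var += sigma2
    return num / sqrt(var)

# Hypothetical loci: balanced counts versus a consistent 60/40 imbalance.
print(delta_statistic([(50, 50)] * 20), delta_statistic([(60, 40)] * 20))
```

Because δi uses only the size of the deviation and not its direction, the statistic needs no phase information; a large positive value indicates more allelic imbalance than sampling noise explains.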
In some embodiments, a set of one or more hypotheses specifying the number of copies of the chromosome or
chromosome segment in the genome of one or more of the cells is enumerated. In some embodiments, the
hypothesis that is most likely based on the test statistic is selected, thereby determining the number of
copies of the chromosome or chromosome segment in the genome of one or more of the cells. In some
embodiments, a hypothesis is selected if the probability that the test statistic belongs to the
distribution of the test statistic for that hypothesis is above an upper threshold; one or more of the
hypotheses are rejected if the probability that the test statistic belongs to the distribution of the test
statistic for that hypothesis is below a lower threshold; or a hypothesis is neither selected nor rejected
if the probability that the test statistic belongs to the distribution of the test statistic for that
hypothesis is between the lower threshold and the upper threshold, or if the probability is not determined
with sufficiently high confidence. In some embodiments, an upper and/or lower threshold is determined from
an empirical distribution, such as a distribution from training data (such as samples with a known copy
number, for example diploid samples or samples known to have a particular deletion or duplication). Such
an empirical distribution can be used to select a threshold for a single hypothesis rejection test.
Note that the test statistic Δ is independent of S, and therefore the two can be used independently, if
desired.
Exemplary methods for detecting deletions and duplications using allele distributions or patterns
This section includes methods for determining whether there is an overrepresentation of the number of
copies of a first homologous chromosome segment as compared to a second homologous chromosome segment. In
some embodiments, the method involves enumerating (i) a plurality of hypotheses specifying the number of
copies of the chromosome or chromosome segment that are present in the genome of one or more cells (such as
cancer cells) of the individual, or (ii) a plurality of hypotheses specifying the degree of
overrepresentation of the number of copies of a first homologous chromosome segment as compared to a second
homologous chromosome segment in the genome of one or more cells of the individual. In some embodiments,
the method involves obtaining genetic data from the individual at a plurality of polymorphic loci (such as
SNP loci) on the chromosome or chromosome segment. In some embodiments, a probability distribution of the
expected genotypes of the individual is created for each of the hypotheses. In some embodiments, a data
fit between the obtained genetic data of the individual and the probability distribution of the expected
genotypes of the individual is calculated.
In some embodiments, one or more hypotheses are ranked according to the data fit, and the hypothesis that
is ranked the highest is selected. In some embodiments, a technique or algorithm, such as a search
algorithm, is used for one or more of the following steps: calculating the data fit, ranking the
hypotheses, or selecting the hypothesis that is ranked the highest. In some embodiments, the data fit is a
fit to a beta-binomial distribution or a fit to a binomial distribution. In some embodiments, the technique
or algorithm is selected from the group consisting of maximum likelihood estimation, maximum a posteriori
estimation, Bayesian estimation, dynamic estimation (such as dynamic Bayesian estimation), and
expectation-maximization estimation. In some embodiments, the method includes applying the technique or
algorithm to the obtained genetic data and the expected genetic data.
In some embodiments, the method involves enumerating (i) a plurality of hypotheses specifying the number
of copies of the chromosome or chromosome segment that are present in the genome of one or more cells (such
as cancer cells) of the individual, or (ii) a plurality of hypotheses specifying the degree of
overrepresentation of the number of copies of a first homologous chromosome segment as compared to a second
homologous chromosome segment in the genome of one or more cells of the individual. In some embodiments,
the method involves obtaining genetic data from the individual at a plurality of polymorphic loci (such as
SNP loci) on the chromosome or chromosome segment. In some embodiments, the genetic data includes allele
counts for the plurality of polymorphic loci. In some embodiments, a joint distribution model is created
for the expected allele counts at the plurality of polymorphic loci on the chromosome or chromosome segment
for each hypothesis. In some embodiments, a relative probability for one or more of the hypotheses is
determined using the joint distribution model and the allele counts measured on the sample, and the
hypothesis with the greatest probability is selected.
In some embodiments, the distribution or pattern of alleles (such as the pattern of calculated allele
ratios) is used to determine the presence or absence of a CNV, such as a deletion or duplication. If
desired, the parental origin of the CNV can be determined based on this pattern. A maternally inherited
duplication is an extra copy of a chromosome segment from the mother, and a maternally inherited deletion
is the absence of the copy of a chromosome segment from the mother, such that the only copy of the
chromosome segment that is present is from the father. Exemplary patterns are illustrated in Figures
15A-19D and are further described below.
To determine the presence or absence of a deletion of a chromosome segment of interest, the algorithm
considers the distribution of sequence counts from each of the two possible alleles at a large number of
SNPs per chromosome. It is important to note that some embodiments of the algorithm use an approach that
does not lend itself to visualization. Thus, for the purposes of illustration, the data is shown in
Figures 15A-18 in a simplified fashion as ratios of the two most likely alleles, labeled A and B, so that
the relevant trends can be more readily visualized. This simplified illustration does not take into account
some of the possible features of the algorithm. For example, two embodiments of the algorithm that cannot
be illustrated with a visualization that shows allele ratios are: 1) the ability to use linkage
disequilibrium, i.e., the influence that a measurement at one SNP has on the likely identity of a
neighboring SNP, and 2) the use of non-Gaussian data models that describe the expected distribution of
allele measurements at a SNP given platform characteristics and amplification biases. Also note that a
simplified version of the algorithm only considers the two most common alleles at each SNP, ignoring other
possible alleles.
Deletions of interest were detected in genomic and maternal blood samples. In some embodiments, genomic
and maternal plasma samples are analyzed using the multiplex PCR and sequencing method of Example 1.
Genomic DNA syndrome samples, in which a lack of heterozygous SNPs was detected in the target region,
confirmed the ability of the assay to distinguish monosomy (affected) from disomy (unaffected). Analysis of
cfDNA from a maternal blood sample was able to detect 22q11.2 deletion syndrome, Cri-du-Chat deletion
syndrome and Wolf-Hirschhorn deletion syndrome, as well as the other deletion syndromes in Figure 14, in
the fetus.
Figures 15A-15C depict data that indicate the presence of two chromosomes when the sample is entirely
maternal (no fetal cfDNA present, Figure 15A), contains a moderate fetal cfDNA fraction of 12% (Figure
15B), or contains a high fetal cfDNA fraction of 26% (Figure 15C). The x-axis represents the linear
position of the individual polymorphic loci along the chromosome, and the y-axis represents the number of
A allele reads as a fraction of the total (A+B) allele reads. Maternal and fetal genotypes are indicated
to the right of the plots. The plots are color-coded according to maternal genotype, such that red
indicates a maternal genotype of AA, blue indicates a maternal genotype of BB, and green indicates a
maternal genotype of AB. Note that the measurements are made on total cfDNA isolated from maternal blood,
which includes both maternal and fetal cfDNA; thus, each spot represents the combination of the fetal and
maternal DNA contributing to that SNP. Therefore, increasing the proportion of maternal cfDNA from 0% to
100% will gradually shift some spots up or down within the plots, depending on the maternal and fetal
genotype.
In all cases, SNPs that are homozygous for the A allele (AA) in both the mother and the fetus are tightly associated with the upper limit of the plots, because the fraction of A allele reads is high when no B allele should be present. Conversely, SNPs that are homozygous for the B allele in both the mother and the fetus are tightly associated with the lower limit of the plots, because the fraction of A allele reads is low when only the B allele should be present. The points that are not tightly associated with the upper and lower limits of the plots represent SNPs for which the mother, the fetus, or both are heterozygous; these points are useful for identifying fetal deletions or duplications, and can also be informative for determining paternal versus maternal inheritance. These points segregate according to the maternal and fetal genotypes and the fetal fraction, so the exact position of each point along the y-axis depends on both stoichiometry and fetal fraction. For example, loci where the mother is AA and the fetus is AB are expected to have a different fraction of A allele reads, and thus a different position along the y-axis, depending on the fetal fraction.
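The dependence of the y-axis position on genotype and fetal fraction can be illustrated with a minimal sketch. It assumes an idealized mixture with no measurement noise or amplification bias, in which every chromosome copy contributes reads in proportion to its share of the cfDNA; the function name `expected_a_fraction` is hypothetical and not taken from the source. For a mother who is AA and a fetus who is AB, the sketch gives 0.94 at a 12% fetal fraction and 0.87 at 26%.

```python
def expected_a_fraction(maternal: str, fetal: str, fetal_fraction: float) -> float:
    """Expected fraction of A reads at one SNP in a maternal/fetal cfDNA mixture.

    Genotypes are strings such as "AA", "AB", "BB" (or "A"/"B" when only a
    single copy is present); each letter stands for one chromosome copy.
    Idealized model: no noise, no amplification bias (assumption).
    """
    maternal_weight = 1.0 - fetal_fraction
    a_reads = maternal_weight * maternal.count("A") + fetal_fraction * fetal.count("A")
    total = maternal_weight * len(maternal) + fetal_fraction * len(fetal)
    return a_reads / total


# Mother AA, fetus AB, at the three fetal fractions shown in Figures 15A-15C.
for f in (0.0, 0.12, 0.26):
    print(f, round(expected_a_fraction("AA", "AB", f), 3))
```

Passing a single-letter fetal genotype, such as "A", models a locus inside a segment for which the fetus carries only one copy.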
Figure 15A shows data for a non-pregnant woman, and thus represents the pattern when the genotype is entirely maternal. This pattern includes "clusters" of points: a red cluster tightly associated with the top of the plot (SNPs where the maternal genotype is AA), a blue cluster tightly associated with the bottom of the plot (SNPs where the maternal genotype is BB), and a single, centered green cluster (SNPs where the maternal genotype is AB). For Figure 15B, the contribution of fetal alleles to the fraction of A allele reads shifts the position of some points up or down along the y-axis. For Figure 15C, the pattern, which includes two red and two blue peripheral bands and a trio of central green bands, is readily apparent. The three central green bands correspond to SNPs that are heterozygous in the mother, and the two "peripheral" bands at each of the top (red) and bottom (blue) of the plot correspond to SNPs that are homozygous in the mother.
The analysis of a 22q11.2 deletion carrier (a mother with the deletion) is shown in Figure 16A. The deletion carrier has no heterozygous SNPs in this region, because the carrier has only one copy of this region. The deletion is therefore indicated by the absence of green AB SNPs. The analysis of a paternally inherited 22q11 deletion in a fetus is shown in Figure 16B. When the fetus inherits only a single copy of a chromosome segment (in the case of a paternally inherited deletion, the copy present in the fetus comes from the mother), and thus only a single allele at each locus in the segment, heterozygosity of the fetus is not possible. Therefore, the only possible fetal SNP identities are A or B. Note the absence of the internal peripheral bands. For a paternally inherited deletion, the characteristic pattern includes two central green bands, representing SNPs for which the mother is heterozygous, and only a single peripheral red band and a single peripheral blue band, representing SNPs for which the mother is homozygous, which remain tightly associated with the upper and lower limits of the plot (1 and 0), respectively.
The analysis of a maternally inherited Cri-du-Chat deletion syndrome is shown in Figure 17. There are two central green bands instead of three, and there are two red and two blue peripheral bands. A maternally inherited deletion (such as in a maternal carrier of Duchenne muscular dystrophy) can also be detected based on the small amount of signal detected in the region in a mixed sample of maternal and fetal DNA (such as a plasma sample), because both the mother and the fetus have the deletion.
Figure 18 is a plot of a paternally inherited Wolf-Hirschhorn deletion syndrome, as indicated by the presence of one red and one blue peripheral band.
If desired, similar plots can be generated for a sample from an individual suspected of having a deletion or duplication (such as a CNV associated with cancer). In such plots, the following color coding can be used based on the genotype of the cells without the CNV: red indicates a genotype of AA, blue indicates a genotype of BB, and green indicates a genotype of AB. In some embodiments, for a deletion, the pattern includes two central green bands, representing SNPs for which the individual is heterozygous (the top green band represents AB from cells without the deletion and A from cells with the deletion, and the bottom green band represents AB from cells without the deletion and B from cells with the deletion), and only a single peripheral red band and a single peripheral blue band, representing SNPs for which the individual is homozygous, which remain tightly associated with the upper and lower limits of the plot (1 and 0), respectively. In some embodiments, the separation of the two green bands increases as the fraction of cells, DNA, or RNA with the deletion increases.
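The widening of the two green bands can be sketched as follows. This is an illustration under simplifying assumptions, not the method of the source: a fraction of the cells has lost one homolog at a heterozygous locus, the remaining cells are AB, and reads are proportional to copy number. The function name `het_band_positions` is hypothetical.

```python
def het_band_positions(deleted_fraction: float) -> tuple[float, float]:
    """A-allele fractions of the two green (AB) bands when a fraction of the
    cells has lost one homolog at the locus and the other cells are AB."""
    c = deleted_fraction
    total_copies = 2.0 * (1.0 - c) + 1.0 * c   # average copies per cell
    top = 1.0 / total_copies                   # deleted cells kept the A homolog
    bottom = (1.0 - c) / total_copies          # deleted cells kept the B homolog
    return top, bottom


for c in (0.0, 0.1, 0.2, 0.4):
    top, bottom = het_band_positions(c)
    print(c, round(top, 3), round(bottom, 3), round(top - bottom, 3))
```

With no deleted cells both bands sit at 0.5; in this model the separation equals c / (2 - c), which grows with the fraction of cells carrying the deletion.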
Exemplary Methods for Identifying and Analyzing Multiple Pregnancies
In some embodiments, any of the methods of the invention is used to detect the presence of a multiple pregnancy, such as a twin pregnancy, in which at least one fetus is genetically different from at least one other fetus. In some embodiments, fraternal twins are identified based on the presence of two fetuses with different alleles, different allele ratios, or different allele distributions at some (or all) of the tested loci. In some embodiments, fraternal twins are identified by determining the expected allele ratio at each locus (such as a SNP locus) for two fetuses that may have the same or different fetal fractions in the sample (such as a plasma sample). In some embodiments, the likelihood of a particular pair of fetal fractions (where f1 is the fetal fraction of fetus 1 and f2 is the fetal fraction of fetus 2) is calculated by considering some or all of the possible genotypes of the two fetuses, conditioned on the genotype of the mother and the genotype population frequencies. The mixture of the two fetal genotypes and the one maternal genotype, combined with the fetal fractions, determines the expected allele ratio at a SNP. For example, if the mother is AA, fetus 1 is AA, and fetus 2 is AB, then the overall fraction of the B allele at the SNP is one half of f2. The likelihood calculation asks how well all of the SNPs together match the expected allele ratios, based on all of the possible combinations of fetal genotypes. The fetal fraction pair (f1, f2) that best matches the data is selected. It is not necessary to calculate the specific genotypes of the fetuses; instead, one can, for example, consider all of the possible genotypes in a statistical combination. In some embodiments, if the method does not distinguish between a singleton pregnancy and identical twins, an ultrasound can be performed to determine whether the pregnancy is a singleton or an identical twin pregnancy. If the ultrasound detects a twin pregnancy, the pregnancy can be assumed to be an identical twin pregnancy, because a fraternal twin pregnancy would have been detected by the SNP analysis described above.
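The worked example in the preceding paragraph (mother AA, fetus 1 AA, fetus 2 AB) can be checked with a short sketch. It assumes an idealized, noise-free mixture in which the mother contributes the fraction 1 - f1 - f2 of the cfDNA; the function name `expected_b_fraction` is hypothetical.

```python
def expected_b_fraction(mother: str, fetus1: str, fetus2: str,
                        f1: float, f2: float) -> float:
    """Expected overall fraction of B reads at one SNP for a mixture of one
    maternal and two fetal genotypes (idealized, noise-free model)."""
    parts = ((1.0 - f1 - f2, mother), (f1, fetus1), (f2, fetus2))
    b_reads = sum(weight * genotype.count("B") for weight, genotype in parts)
    total = sum(weight * len(genotype) for weight, genotype in parts)
    return b_reads / total


# Mother AA, fetus 1 AA, fetus 2 AB: the B fraction should be f2 / 2.
print(expected_b_fraction("AA", "AA", "AB", f1=0.10, f2=0.06))
```

For identical twins the two fetal genotypes are the same at every SNP, so the result depends only on the combined fetal fraction f1 + f2, which is why the pattern matches that of a singleton pregnancy.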
In some embodiments, a pregnant mother is known to have a multiple pregnancy (such as a twin pregnancy) based on prior testing, such as an ultrasound. Any of the methods of the invention can be used to determine whether the multiple pregnancy includes identical or fraternal twins. For example, the measured allele ratios can be compared with the values expected for identical twins (the same allele ratios as for a singleton pregnancy) or for fraternal twins (such as the allele ratios calculated as described above). Some identical twins are monochorionic twins, which are at risk of twin-to-twin transfusion syndrome. Therefore, twins determined to be identical twins using a method of the invention are desirably tested (such as by ultrasound) to determine whether they are monochorionic twins and, if so, these twins can be monitored (such as by biweekly ultrasounds beginning at 16 weeks) for signs of twin-to-twin transfusion syndrome.
In some embodiments, any of the methods of the invention is used to determine whether any of the fetuses in a multiple pregnancy, such as a twin pregnancy, are aneuploid. Aneuploidy testing for twins begins with the estimation of the fetal fractions. In some embodiments, the fetal fraction pair (f1, f2) that best matches the data is selected as described above. In some embodiments, a maximum likelihood estimate is performed for the parameter pair (f1, f2) over the range of possible fetal fractions. In some embodiments, the range of f2 is from 0 to f1, because f2 is defined as the smaller fetal fraction. Given a pair (f1, f2), the data likelihood is calculated from the allele ratios observed at a set of loci (such as SNP loci). In some embodiments, the data likelihood reflects the genotype of the mother, the genotype of the father (if available), the population frequencies, and the resulting probabilities of the fetal genotypes. In some embodiments, the SNPs are assumed to be independent. The estimated fetal fraction pair is the one that produces the highest data likelihood. If f2 is 0, the data are best explained by only one set of fetal genotypes, indicating identical twins, where f1 is the combined fetal fraction. Otherwise, f1 and f2 are the estimates of the fetal fractions of the individual twins. Having established the best estimate of (f1, f2), one can predict the overall fraction of the B allele in the plasma for any combination of maternal and fetal genotypes, if desired. It is not necessary to assign individual sequence reads to the individual fetuses. Ploidy testing is performed using another maximum likelihood estimate, which compares the data likelihoods of two hypotheses. In some embodiments for identical twins, the hypotheses considered are (i) both twins are euploid and (ii) both twins are trisomic. In some embodiments for fraternal twins, the hypotheses considered are (i) both twins are euploid and (ii) at least one twin is trisomic. The trisomy hypothesis for fraternal twins is based on the lower fetal fraction, because a trisomy in the twin with the higher fetal fraction would then also be detected. The ploidy likelihoods are calculated using a method that predicts the expected number of reads at each targeted locus, conditioned on either the disomy or the trisomy hypothesis. A disomic reference chromosome is not required. The variance model for the expected number of reads takes into account the performance of the individual target loci as well as the correlation between loci (see, e.g., U.S. Serial No. 62/008,235, filed June 5, 2014, and U.S. Serial No. 62/032,785, filed August 4, 2014, each of which is incorporated herein by reference in its entirety). If the smaller twin has fetal fraction f1, the ability to detect a trisomy in that twin is equivalent to the ability to detect a trisomy in a singleton pregnancy at the same fetal fraction. This is because the part of the method that detects trisomy (in some embodiments) does not depend on genotypes and does not distinguish between multiple and singleton pregnancies. It simply looks for an increased number of reads, in accordance with the determined fetal fraction.
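The maximum likelihood estimate of (f1, f2) described above can be sketched as a grid search. The sketch below is illustrative only and rests on several assumptions that are not stated in the source: it uses only SNPs at which the mother is AA, models B-allele read counts as binomial with a small fixed error rate `err`, treats the father's genotype as unknown with Hardy-Weinberg proportions at an assumed B-allele frequency `q` (strictly between 0 and 1), and treats SNPs as independent. All function names are hypothetical.

```python
import math
from itertools import product


def log_binom_pmf(k, n, p):
    """Log binomial probability of k B-reads out of n reads."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1.0 - p)


def genotype_prior(q):
    """P(b1, b2): number of B alleles (0 or 1) in each fraternal twin at a SNP
    where the mother is AA, marginalized over an unknown father (assumption:
    Hardy-Weinberg proportions with B-allele frequency q)."""
    prior = {}
    fathers = (((1 - q) ** 2, 0.0), (2 * q * (1 - q), 0.5), (q ** 2, 1.0))
    for p_father, t in fathers:                 # t = P(father transmits B)
        for b1, b2 in product((0, 1), repeat=2):
            p_pair = (t if b1 else 1 - t) * (t if b2 else 1 - t)
            prior[(b1, b2)] = prior.get((b1, b2), 0.0) + p_father * p_pair
    return prior


def log_likelihood(snps, f1, f2, q=0.3, err=0.001):
    """Sum of per-SNP log-likelihoods; snps is a list of (b_reads, total_reads)."""
    prior = genotype_prior(q)
    total = 0.0
    for b_reads, n_reads in snps:
        terms = []
        for (b1, b2), weight in prior.items():
            b_frac = (f1 * b1 + f2 * b2) / 2.0      # an AA mother contributes no B
            p = err + (1.0 - 2.0 * err) * b_frac    # simple read-error model
            terms.append(math.log(weight) + log_binom_pmf(b_reads, n_reads, p))
        top = max(terms)                            # log-sum-exp over genotypes
        total += top + math.log(sum(math.exp(x - top) for x in terms))
    return total


def estimate_fetal_fractions(snps, step=0.01, f_max=0.3):
    """Grid search for the pair (f1, f2), with f2 <= f1, of highest likelihood."""
    best = None
    for i in range(int(round(f_max / step)) + 1):
        for j in range(i + 1):
            f1, f2 = i * step, j * step
            ll = log_likelihood(snps, f1, f2)
            if best is None or ll > best[0]:
                best = (ll, f1, f2)
    return best[1], best[2]
```

Calling `estimate_fetal_fractions` on a list of (B reads, total reads) pairs returns the grid point with the highest likelihood. A result with f2 equal to 0 corresponds to the case in which a single set of fetal genotypes explains the data. A practical implementation would also use SNPs with other maternal genotypes and a finer or adaptive search.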
In some embodiments, the method includes detecting the presence of twins based on SNP loci (such as described above). If twins are detected, SNPs are used to determine the fetal fraction of each fetus (f1, f2), as described above. In some embodiments, samples with high-confidence disomy calls are used to determine the amplification bias on a per-SNP basis. In some embodiments, these samples with high-confidence disomy calls are analyzed in the same run as one or more samples of interest. In some embodiments, the per-SNP amplification bias is used to model the distribution of reads, for one or more chromosomes or chromosome segments of interest such as chromosome 21, that is expected under the disomy hypothesis and under the trisomy hypothesis, given the lower of the two twin fetal fractions. The likelihood or probability of disomy or trisomy is calculated given the two models and the measured quantity of the chromosome or chromosome segment of interest.
In some embodiments, the threshold for a positive aneuploidy call (such as a trisomy call) is set based on the twin with the lower fetal fraction. In this way, if the other twin is positive, or if both twins are positive, the total chromosome representation is certain to be above the threshold.
Exemplary Counting Methods/Quantitative Methods
In some embodiments, one or more counting methods (also referred to as quantitative methods) are used to detect one or more CNVs, such as deletions or duplications of chromosome segments or entire chromosomes. In some embodiments, one or more counting methods are used to determine whether the overrepresentation of the number of copies of the first homologous chromosome segment is due to a duplication of the first homologous chromosome segment or a deletion of the second homologous chromosome segment. In some embodiments, one or more counting methods are used to determine the number of extra copies of a chromosome segment or chromosome that is duplicated (such as whether there are 1, 2, 3, 4, or more extra copies). In some embodiments, one or more counting methods are used to distinguish a sample with many duplications and a smaller tumor fraction from a sample with fewer duplications and a larger tumor fraction. For example, one or more counting methods can be used to distinguish a sample with four extra chromosome copies and a tumor fraction of 10% from a sample with two extra chromosome copies and a tumor fraction of 20%. Exemplary methods are disclosed, for example, in U.S. Publication Nos. 2007/0184467; 2013/0172211; and 2012/0003637; U.S. Patent Nos. 8,467,976; 7,888,017; 8,008,018; 8,296,076; and 8,915,415; U.S. Serial No. 62/008,235, filed June 5, 2014; and U.S. Serial No. 62/032,785, filed August 4, 2014, each of which is incorporated herein by reference in its entirety.
In some embodiments, the counting method includes counting the number of DNA sequence-based reads that map to one or more given chromosomes or chromosome segments. Some such methods involve generating a reference value (cut-off value) for the number of DNA sequence reads that map to a specific chromosome or chromosome segment, wherein a number of reads in excess of that value indicates a specific genetic abnormality.
In some embodiments, the total measured quantity of all the alleles for one or more loci (such as the total amount of a polymorphic or non-polymorphic locus) is compared with a reference amount. In some embodiments, the reference amount is (i) a threshold value or (ii) the amount expected for a particular copy number hypothesis. In some embodiments, the reference amount (for the absence of a CNV) is the total measured quantity of all the alleles for one or more loci of one or more chromosomes or chromosome segments known or expected not to have a deletion or duplication. In some embodiments, the reference amount (for the presence of a CNV) is the total measured quantity of all the alleles for one or more loci of one or more chromosomes or chromosome segments known or expected to have a deletion or duplication. In some embodiments, the reference amount is the total measured quantity of all the alleles for one or more loci of one or more reference chromosomes or chromosome segments. In some embodiments, the reference amount is the mean or median of the values determined for two or more different chromosomes, chromosome segments, or samples. In some embodiments, random sequencing (e.g., massively parallel shotgun sequencing) or targeted sequencing is used to determine the amount of one or more polymorphic or non-polymorphic loci.
In some embodiments that use a reference amount, the method includes (a) measuring the amount of genetic material on a chromosome or chromosome segment of interest; (b) comparing the amount from step (a) with the reference amount; and (c) identifying the presence or absence of a deletion or duplication based on the comparison.
In some embodiments that use a reference chromosome or chromosome segment, the method includes sequencing DNA or RNA from a sample to obtain a plurality of sequence tags that align to target loci. In some embodiments, the sequence tags are of sufficient length to be assigned to a specific target locus (e.g., 15-100 nucleotides in length); the target loci are from a plurality of different chromosomes or chromosome segments, which include at least one first chromosome or chromosome segment suspected of having an abnormal distribution in the sample and at least one second chromosome or chromosome segment presumed to be normally distributed in the sample. In some embodiments, the plurality of sequence tags is assigned to the corresponding target loci. In some embodiments, the number of sequence tags aligning to the target loci of the first chromosome or chromosome segment and the number of sequence tags aligning to the target loci of the second chromosome or chromosome segment are determined. In some embodiments, these numbers are compared to determine the presence or absence of an abnormal distribution (such as a deletion or duplication) of the first chromosome or chromosome segment.
In some embodiments, the value of f (such as the fetal fraction or tumor fraction) is used in the CNV determination, for example to compare the observed difference between the amounts of two chromosomes or chromosome segments with the difference that would be expected for a particular type of CNV given the value of f (see, e.g., U.S. Publication No. 2012/0190020; U.S. Publication No. 2012/0190021; U.S. Publication No. 2012/0190557; and U.S. Publication No. 2012/0191358, each of which is incorporated herein by reference in its entirety). For example, in a blood sample from a mother carrying a fetus, the difference between the amount of a chromosome segment that is duplicated in the fetus and the amount of a disomic reference chromosome segment increases as the fetal fraction increases. Likewise, the difference between the amount of a chromosome segment that is duplicated in a tumor and the amount of a disomic reference chromosome segment increases as the tumor fraction increases. In some embodiments, the method includes comparing the relative frequency of a chromosome or chromosome segment of interest with respect to a reference chromosome or chromosome segment (such as a chromosome or chromosome segment expected or known to be disomic) with the value of f, to determine the likelihood of the CNV. For example, the difference in amount between the first chromosome or chromosome segment and the reference chromosome or chromosome segment can be compared with the value expected given f for various possible CNVs (such as one or two extra copies of the chromosome segment of interest).
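The comparison of an observed amount with the amounts expected for several candidate CNVs at a given f can be sketched as below. This is a simplified illustration, not the procedure of the cited publications: it treats f as the fraction of genome equivalents that carry the CNV, ignores noise, and selects the hypothesis whose expected ratio is closest to the observation rather than computing a formal likelihood. The names `expected_ratio` and `closest_hypothesis` are hypothetical.

```python
def expected_ratio(f: float, copies_in_affected: int) -> float:
    """Expected amount of the segment of interest relative to a disomic
    reference when a fraction f of the genomes carries `copies_in_affected`
    copies and the remaining genomes carry two copies."""
    return ((1.0 - f) * 2 + f * copies_in_affected) / 2.0


def closest_hypothesis(observed_ratio: float, f: float) -> str:
    """Return the candidate CNV whose expected ratio is nearest the observation."""
    hypotheses = {
        "deletion (1 copy)": 1,
        "no CNV (2 copies)": 2,
        "one extra copy (3 copies)": 3,
        "two extra copies (4 copies)": 4,
    }
    return min(
        hypotheses,
        key=lambda name: abs(observed_ratio - expected_ratio(f, hypotheses[name])),
    )


print(closest_hypothesis(1.06, f=0.12))
```

Because the expected excess for one extra copy is f / 2, the same duplication produces a larger difference from the reference at a 26% fraction than at a 12% fraction.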
The following prophetic examples illustrate the use of a counting method/quantitative method to distinguish a duplication of the first homologous chromosome segment from a deletion of the second homologous chromosome segment. If the normal disomic genome of the host is taken as the baseline, then analysis of a mixture of normal and cancer cells yields the average difference between the baseline and the cancer DNA in the mixture. For example, imagine a case in which 10% of the DNA in the sample originates from cells with a deletion over a region of a chromosome that is targeted by the assay. In some embodiments, a quantitative approach shows that the quantity of reads corresponding to that region is expected to be 95% of that expected for a normal sample. This is because one of the two target chromosomal regions is missing in each tumor cell with the deletion of the targeted region, and thus the total amount of DNA mapping to that region is 90% (for the normal cells) + 1/2 × 10% (for the tumor cells) = 95%. Alternatively, in some embodiments, an allelic approach shows that the ratio of alleles at heterozygous loci averages 19:20. Now imagine a case in which 10% of the DNA in the sample originates from cells with a five-fold focal amplification over a region of a chromosome that is targeted by the assay. In some embodiments, a quantitative approach shows that the quantity of reads corresponding to that region is expected to be 125% of that expected for a normal sample. This is because, in each tumor cell with the five-fold focal amplification, one of the two target chromosomal regions has been copied an extra five times over the targeted region, and thus the total amount of DNA mapping to that region is 90% (for the normal cells) + (2+5) × 10%/2 (for the tumor cells) = 125%. Alternatively, in some embodiments, an allelic approach shows that the ratio of alleles at heterozygous loci averages 25:20. Note that when an allelic approach is used alone, a five-fold focal amplification over a chromosomal region (in a sample with 10% cfDNA) may appear the same as a deletion over the same region (in a sample with 10% cfDNA); in these two cases, the haplotype that is under-represented in the case of the deletion appears to be the haplotype without a CNV in the case of the focal amplification, and the haplotype without a CNV in the case of the deletion appears to be the over-represented haplotype in the case of the focal amplification. Combining the likelihoods produced by this allelic approach with the likelihoods produced by a quantitative approach distinguishes between the two possibilities.
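The two quantitative figures in this example, 95% and 125%, follow from a simple weighted average, which the sketch below reproduces. It assumes that read depth is proportional to copy number and that normal cells carry two copies of the region; the function name `relative_read_depth` is hypothetical.

```python
def relative_read_depth(tumor_fraction: float, tumor_copies: int) -> float:
    """Read depth over a region relative to a normal (two-copy) sample, for a
    mixture in which `tumor_fraction` of the DNA comes from tumor cells that
    carry `tumor_copies` copies of the region."""
    normal_part = (1.0 - tumor_fraction) * 2
    tumor_part = tumor_fraction * tumor_copies
    return (normal_part + tumor_part) / 2.0


deletion = relative_read_depth(0.10, 1)            # one of the two copies lost
amplification = relative_read_depth(0.10, 2 + 5)   # five extra copies
print(f"{deletion:.2f} {amplification:.2f}")       # prints "0.95 1.25"
```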
Exemplary Counting Methods/Quantitative Methods Using Reference Samples
An exemplary quantitative method that uses one or more reference samples is described in U.S. Serial No. 62/008,235 (filed June 5, 2014) and U.S. Serial No. 62/032,785 (filed August 4, 2014), each of which is incorporated herein by reference in its entirety. In some embodiments, one or more reference samples that most likely do not have any CNV on one or more chromosomes or chromosomes of interest (e.g., normal samples) are identified by selecting the samples with the highest tumor DNA fraction, selecting the samples with a z-score closest to zero, selecting the samples for which the data fit the hypothesis corresponding to no CNV with the highest confidence or likelihood, selecting samples known to be normal, selecting the samples from individuals with the lowest likelihood of having cancer (e.g., having a low age, being male when screening for breast cancer, having no family history, etc.), selecting the samples with the highest input amount of DNA, selecting the samples with the highest signal-to-noise ratio, selecting samples based on other criteria believed to be correlated with the likelihood of having cancer, or selecting samples using some combination of criteria. Once the reference set is chosen, these cases can be assumed to be disomic, and the per-SNP bias can then be estimated, that is, the experiment-specific amplification bias and other processing bias for each locus. This experiment-specific bias estimate can then be used to correct the bias in the measurements of the chromosome of interest (such as the chromosome 21 loci), and of the other chromosome loci as appropriate, for the samples that are not part of the subset for which disomy of chromosome 21 is assumed. Once the bias has been corrected in these samples of unknown ploidy, the data for these samples can be analyzed a second time, using the same or a different method, to determine whether the individuals (such as fetuses) have trisomy 21. For example, a quantitative method can be used on the remaining samples of unknown ploidy, and a z-score can be calculated using the corrected measured genetic data on chromosome 21. Alternatively, as part of the preliminary estimate of the ploidy state of chromosome 21, a fetal fraction (or a tumor fraction, for a sample from an individual suspected of having cancer) can be calculated. The proportion of corrected reads expected in the case of a disomy (the disomy hypothesis) and the proportion of corrected reads expected in the case of a trisomy (the trisomy hypothesis) can be calculated for a case with that fetal fraction. Alternatively, if the fetal fraction was not measured previously, a set of disomy and trisomy hypotheses can be generated for different fetal fractions. For each case, an expected distribution of the proportion of corrected reads can be calculated, taking into account the expected statistical variation in the selection and measurement of the various DNA loci. The observed proportion of corrected reads can be compared with the distribution of the expected proportion of corrected reads, and a likelihood ratio can be calculated for the disomy and trisomy hypotheses, for each of the samples of unknown ploidy. The ploidy state associated with the hypothesis with the highest calculated likelihood can be selected as the correct ploidy state.
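The final comparison of the disomy and trisomy hypotheses can be sketched as a log-likelihood ratio. The sketch makes assumptions that the source does not specify: the expected proportion of corrected reads is modeled with a normal distribution of known standard deviation `sigma`, and `baseline` is a hypothetical value for the share of corrected reads on the chromosome of interest when every genome is disomic. It is an illustration, not the variance model of the referenced applications.

```python
import math


def expected_proportion(baseline: float, fetal_fraction: float, trisomy: bool) -> float:
    """Expected share of corrected reads on the chromosome of interest.
    `baseline` is the share expected when every genome is disomic (assumption)."""
    if not trisomy:
        return baseline
    gain = 1.0 + fetal_fraction / 2.0      # fetal genomes carry 3 copies, not 2
    return baseline * gain / (1.0 - baseline + baseline * gain)


def log_likelihood_ratio(observed: float, baseline: float,
                         fetal_fraction: float, sigma: float) -> float:
    """log[L(trisomy) / L(disomy)] under a normal model with std dev sigma."""
    def log_normal(x: float, mu: float) -> float:
        return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

    mu_disomy = expected_proportion(baseline, fetal_fraction, trisomy=False)
    mu_trisomy = expected_proportion(baseline, fetal_fraction, trisomy=True)
    return log_normal(observed, mu_trisomy) - log_normal(observed, mu_disomy)
```

A positive value favors the trisomy hypothesis and a negative value favors the disomy hypothesis. When the fetal fraction is unknown, the same calculation can be repeated over a set of candidate fetal fractions.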
In some embodiments, a subset of the samples with a sufficiently low likelihood of having cancer can be selected to serve as a control set of samples. The subset can be a fixed number, or it can be a variable number based on choosing only those samples that fall below a threshold. The quantitative data from the subset of samples can be combined, averaged, or combined using a weighted average in which the weighting is based on the likelihood that the sample is normal. The quantitative data can be used to determine the per-locus bias for the amplification and sequencing of the samples in the instant batch of control samples. The per-locus bias can also include data from samples of other batches. The per-locus bias can indicate the relative over- or under-amplification observed for that locus compared with other loci, on the assumption that the subset of samples does not contain any CNV and that any observed over- or under-amplification is due to amplification and/or sequencing bias or other bias. The per-locus bias can take into account the GC content of the amplicon. The loci can be grouped into groups of loci for the purpose of calculating a per-locus bias.
Once the per-locus bias has been calculated for each locus in the plurality of loci, the sequencing data for one or more of the samples that are not in the subset of samples (and optionally for one or more of the samples that are in the subset) can be corrected by adjusting the quantitative measurement for each locus to remove the effect of the bias at that locus. For example, if SNP 1 was observed, in the subset of patients, to have a depth of read twice as great as the average, the adjustment can involve replacing the number of reads corresponding to SNP 1 with a number half as great. Once the sequencing data for each of the loci in one or more samples have been adjusted, the data can be analyzed using a method for detecting the presence of a CNV at one or more chromosomal regions.
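The per-locus correction described above can be sketched as follows. The sketch assumes the simplest possible estimator, an unweighted mean of each locus's depth relative to the sample average across the control samples; the source also allows weighted averages, grouping of loci, and GC-content terms, none of which are modeled here. The function names are hypothetical.

```python
def per_locus_bias(control_samples):
    """Mean normalized read depth at each locus across the control samples.

    Each sample is a list of read counts, one per locus, in the same order.
    The control samples are assumed to be free of CNVs (assumption).
    """
    n_loci = len(control_samples[0])
    bias = [0.0] * n_loci
    for counts in control_samples:
        sample_mean = sum(counts) / n_loci
        for i, count in enumerate(counts):
            bias[i] += (count / sample_mean) / len(control_samples)
    return bias


def correct_counts(counts, bias):
    """Divide each locus's read count by its estimated bias."""
    return [count / b for count, b in zip(counts, bias)]


controls = [[200, 50, 75, 75], [400, 100, 150, 150]]
bias = per_locus_bias(controls)          # locus 0 reads at twice the average
print(correct_counts([300, 70, 110, 120], bias))
```

In this toy data the first locus has twice the average depth in both controls, so its count in the test sample is halved, which mirrors the SNP 1 example in the text.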
In one example, sample A is a mixture of amplified DNA originating from a mixture of normal and cancer cells, and is analyzed using a quantitative method. The following illustrates exemplary possible data. A region of the q arm of chromosome 22 is found to have only 90% as much DNA mapping to it as expected; a focal region corresponding to the HER2 gene is found to have 150% as much DNA mapping to it as expected; and the p arm of chromosome 5 is found to have 105% as much DNA mapping to it as expected. A clinician may infer that the sample has a deletion of a region of the q arm of chromosome 22 and a duplication of the HER2 gene. The clinician may infer that, because 22q deletions are common in breast cancer and because cells with a deletion of the 22q region on both chromosomes usually do not survive, approximately 20% of the DNA in the sample comes from cells with a 22q deletion on one of the two chromosomes. The clinician may also infer that, if the DNA in the mixed sample that originated from tumor cells came from a set of tumor cells that is genetically homogeneous with respect to the HER2 and 22q regions, then those cells contain a five-fold duplication of the HER2 region.
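The clinician's two inferences can be retraced numerically. The sketch below inverts the weighted-average relationship between read depth and copy number, under the assumptions that read depth is proportional to copy number, that the 22q deletion removes one of two copies, and that the same tumor fraction applies to both regions; the function names are hypothetical.

```python
def tumor_fraction_from_deletion(relative_depth: float) -> float:
    """Tumor fraction t implied by a single-copy deletion.
    relative_depth = (1 - t) + t / 2   =>   t = 2 * (1 - relative_depth)"""
    return 2.0 * (1.0 - relative_depth)


def extra_copies(relative_depth: float, tumor_fraction: float) -> float:
    """Extra copies per tumor cell implied by an over-represented region.
    relative_depth = 1 + t * extra / 2   =>   extra = 2 * (relative_depth - 1) / t"""
    return 2.0 * (relative_depth - 1.0) / tumor_fraction


t = tumor_fraction_from_deletion(0.90)          # 22q region at 90% of expected
print(round(t, 3), round(extra_copies(1.50, t), 3))   # HER2 region at 150%
```

With the figures from the example this gives a tumor fraction of about 20% and five extra copies of the HER2 region.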
In one example, sample A is also analyzed using an allelic method. The following illustrates exemplary possible data. The two haplotypes of the same region on the q arm of chromosome 22 are present in a ratio of 4:5; the two haplotypes in a focal region corresponding to the HER2 gene are present in a ratio of 1:2; and the two haplotypes on the p arm of chromosome 5 are present in a ratio of 20:21. None of the other assayed regions of the genome shows a statistically significant excess of either haplotype. A clinician may infer that the sample contains DNA from a tumor with a CNV in the 22q region, the HER2 region, and the 5p arm. Based on the knowledge that 22q deletions are very common in breast cancer, and/or on the quantitative analysis showing an under-representation of the amount of DNA mapping to the 22q region of the genome, the clinician may infer the presence of a tumor with a 22q deletion. Based on the knowledge that HER2 amplifications are very common in breast cancer, and/or on the quantitative analysis showing an over-representation of the amount of DNA mapping to the HER2 region of the genome, the clinician may infer the presence of a tumor with a HER2 amplification.
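The 4:5 and 1:2 haplotype ratios in this example are consistent with a tumor fraction of 20%, a single-copy 22q deletion, and five extra copies of one HER2 haplotype, as the sketch below shows. It assumes that normal cells carry one copy of each haplotype and that reads are proportional to copy number; the function name `haplotype_ratio` is hypothetical. Exact rational arithmetic is used so the ratios print without rounding.

```python
from fractions import Fraction


def haplotype_ratio(tumor_fraction: Fraction, tumor_hap1: int, tumor_hap2: int) -> Fraction:
    """Ratio hap1:hap2 in the mixture; normal cells carry one copy of each
    haplotype, tumor cells carry `tumor_hap1` and `tumor_hap2` copies."""
    normal = 1 - tumor_fraction
    hap1 = normal + tumor_fraction * tumor_hap1
    hap2 = normal + tumor_fraction * tumor_hap2
    return hap1 / hap2


t = Fraction(1, 5)                       # 20% tumor fraction
print(haplotype_ratio(t, 0, 1))          # 22q: one haplotype deleted
print(haplotype_ratio(t, 1, 6))          # HER2: five extra copies of one haplotype
```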
Exemplary Reference Chromosomes or Chromosome Segments
In some embodiments, any of the methods described herein is also performed on one or more reference chromosomes or chromosome segments, and the results are compared with those for one or more chromosomes or chromosome segments of interest.
In some embodiments, the reference chromosome or chromosome segment is used as a control for what would be expected in the absence of a CNV. In some embodiments, the reference is the same chromosome or chromosome segment from one or more different samples that are known or expected not to have a deletion or duplication in that chromosome or chromosome segment. In some embodiments, the reference is a different chromosome or chromosome segment, from the sample being tested, that is expected to be disomic. In some embodiments, the reference is a different segment of one of the chromosomes of interest in the same sample that is being tested. For example, the reference can be one or more segments outside the region of a potential deletion or duplication. Having a reference on the same chromosome that is being tested avoids variability between different chromosomes, such as differences in metabolism, apoptosis, histones, inactivation, and/or amplification between chromosomes. Analyzing segments without a CNV on the same chromosome as the one being tested can also be used to determine differences in metabolism, apoptosis, histones, inactivation, and/or amplification between homologs, allowing the level of variability between homologs in the absence of a CNV to be determined for comparison with the results from a potential CNV. In some embodiments, the magnitude of the difference between the calculated and expected allele ratios for a potential CNV is greater than the corresponding magnitude for the reference, thereby confirming the presence of a CNV.
In some embodiments, a reference chromosome or chromosome segment that is expected to carry a CNV, such as a particular deletion or duplication of interest, is used as a control. In some embodiments, the reference is the same chromosome or chromosome segment from one or more different samples that are known or expected to have a deletion or duplication on that chromosome or chromosome segment. In some embodiments, the reference is a different chromosome or chromosome segment from the sample being tested that is known or expected to have a CNV. In some embodiments, the magnitude of the difference between the calculated and expected allele ratios for a potential CNV is similar to (for example, not significantly different from) the corresponding magnitude for the reference with the CNV, thereby confirming the presence of a CNV. In some embodiments, the magnitude of the difference between the calculated and expected allele ratios for a potential CNV is less than (for example, significantly less than) the corresponding magnitude for the reference with the CNV, thereby confirming the absence of a CNV. In some embodiments, one or more loci at which the genotype of a cancer cell (or DNA or RNA from a cancer cell, such as cfDNA or cfRNA) differs from the genotype of a non-cancerous cell (or DNA or RNA from a non-cancerous cell, such as cfDNA or cfRNA) are used to determine the tumor fraction. The tumor fraction can be used to determine whether the overrepresentation of the number of copies of a first homologous chromosome segment is due to a duplication of the first homologous chromosome segment or a deletion of the second homologous chromosome segment. The tumor fraction can also be used to determine the number of extra copies of a duplicated chromosome segment or chromosome (for example, whether there are 1, 2, 3, 4, or more extra copies), such as to distinguish a sample with four extra chromosome copies and a tumor fraction of 10% from a sample with two extra chromosome copies and a tumor fraction of 20%. The tumor fraction can also be used to determine how well the observed data fit the expected data for a possible CNV. In some embodiments, the degree of overrepresentation of a CNV is used to select a particular therapy or treatment regimen for the individual. For example, some therapeutic agents are effective only when at least four, six, or more copies of a chromosome segment are present.
In some embodiments, the one or more loci used to determine the tumor fraction are on a reference chromosome or chromosome segment, such as a chromosome or chromosome segment known or expected to be disomic, a chromosome or chromosome segment that is rarely duplicated or deleted (in cancer cells in general, or in the particular type of cancer that the individual is known to have or is at increased risk of having), or a chromosome or chromosome segment that is unlikely to be aneuploid (such as a segment whose deletion or duplication is expected to cause cell death). In some embodiments, any of the methods of the invention is used to confirm that the reference chromosome or chromosome segment is disomic in both the cancer cells and the non-cancerous cells. In some embodiments, one or more chromosomes or chromosome segments for which the confidence of a disomy call is high are used.
Exemplary loci that can be used to determine the tumor fraction include polymorphisms or mutations (such as SNPs) that are present in a cancer cell (or DNA or RNA, such as cfDNA or cfRNA, from a cancer cell) but absent from a non-cancerous cell (or DNA or RNA from a non-cancerous cell) of the individual. In some embodiments, the tumor fraction is determined by identifying those polymorphic loci at which a cancer cell (or DNA or RNA from a cancer cell) has an allele that is absent from non-cancerous cells (or DNA or RNA from non-cancerous cells) in a sample from an individual (such as a plasma sample or tumor biopsy), and using the amount of the cancer-specific allele at one or more of the identified polymorphic loci to determine the tumor fraction in the sample. In some embodiments, a non-cancerous cell is homozygous for a first allele at a polymorphic locus, and a cancer cell is (i) heterozygous for the first allele and a second allele or (ii) homozygous for the second allele at the polymorphic locus. In some embodiments, a non-cancerous cell is heterozygous for a first allele and a second allele at a polymorphic locus, and a cancer cell has one or two copies of a third allele at the polymorphic locus. In some embodiments, the cancer cells are assumed or known to have only one copy of an allele that is absent from the non-cancerous cells. For example, if the genotype of the non-cancerous cells is AA, the genotype of the cancer cells is AB, and 5% of the signal at that locus in the sample comes from the B allele and 95% from the A allele, then the tumor fraction of the sample is 10%. In some embodiments, the cancer cells are assumed or known to have two copies of an allele that is absent from the non-cancerous cells. For example, if the genotype of the non-cancerous cells is AA, the genotype of the cancer cells is BB, and 5% of the signal at that locus in the sample comes from the B allele and 95% from the A allele, then the tumor fraction of the sample is 5%. In some embodiments, multiple loci at which the cancer cells have an allele that is absent from the non-cancerous cells are analyzed to determine which loci are heterozygous in the cancer cells and which are homozygous. For example, for loci that are AA in the non-cancerous cells, if the signal from the B allele is about 5% at some loci and about 10% at other loci, then the cancer cells are taken to be heterozygous at the loci with about 5% B allele and homozygous at the loci with about 10% B allele (indicating that the tumor fraction is about 10%).
Exemplary loci that can be used to determine the tumor fraction include loci at which a cancer cell and a non-cancerous cell have one allele in common (for example, loci at which the cancer cell is AB and the non-cancerous cell is BB, or the cancer cell is BB and the non-cancerous cell is AB). The amount of A signal, the amount of B signal, or the ratio of A to B signal in a mixed sample (containing DNA or RNA from a cancer cell and a non-cancerous cell) is compared with the corresponding value for (i) a sample containing DNA or RNA only from cancer cells or (ii) a sample containing DNA or RNA only from non-cancerous cells. The difference in values is used to determine the tumor fraction of the mixed sample.
In some embodiments, the loci used to determine the tumor fraction are selected based on the genotype of (i) a sample containing DNA or RNA only from cancer cells and/or (ii) a sample containing DNA or RNA only from non-cancerous cells. In some embodiments, the loci are selected based on analysis of the mixed sample, such as loci at which the absolute or relative amount of each allele differs from what would be expected if the cancer cells and non-cancerous cells had the same genotype at that locus. For example, if the cancer cells and non-cancerous cells have the same genotype, a locus is expected to produce 0% B signal if all the cells are AA, 50% B signal if all the cells are AB, or 100% B signal if all the cells are BB. Other values for the B signal indicate that the genotypes of the cancer cells and non-cancerous cells differ at that locus, so that locus can be used to determine the tumor fraction.
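The locus-selection rule described above can be sketched as a simple filter on the B-allele fraction in the mixed sample. The tolerance used to absorb measurement noise is a hypothetical parameter, as are the function names and locus labels; a real analysis would model noise explicitly.

```python
# B-signal fractions expected when all cells share one genotype: AA, AB, BB.
EXPECTED_B_FRACTIONS = (0.0, 0.5, 1.0)


def is_informative(b_fraction, tolerance=0.02):
    """True if the B fraction is not explained by a single shared genotype."""
    return all(abs(b_fraction - e) > tolerance for e in EXPECTED_B_FRACTIONS)


def select_informative_loci(b_by_locus, tolerance=0.02):
    """Return the loci whose B fraction suggests differing genotypes."""
    return [locus for locus, b in b_by_locus.items() if is_informative(b, tolerance)]


# Hypothetical B-allele fractions measured in a mixed sample.
measured = {
    "locus1": 0.00,  # consistent with all cells AA
    "locus2": 0.05,  # informative
    "locus3": 0.50,  # consistent with all cells AB
    "locus4": 0.45,  # informative
    "locus5": 1.00,  # consistent with all cells BB
}
print(select_informative_loci(measured))
```

Only loci whose B signal departs from 0%, 50%, and 100% are retained for the tumor fraction estimate.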
In some embodiments, the tumor fraction calculated from the alleles at one or more loci is compared with the tumor fraction calculated using one or more of the counting methods disclosed herein.
Exemplary methods for detecting a phenotype or analyzing multiple mutations
In some embodiments, the method includes analyzing a sample for a set of mutations associated with a disease or disorder (such as cancer) or with an increased risk of a disease or disorder. Events within a class (such as the M or C cancer classes) are strongly correlated, which can be used to improve the signal-to-noise ratio of a method and to classify tumors into distinct clinical subsets. For example, borderline results for several mutations (such as several CNVs) on one or more chromosomes or chromosome segments, considered jointly, may constitute a very strong signal. In some embodiments, determining the presence or absence of multiple polymorphisms or mutations of interest (such as 2, 3, 4, 5, 8, 10, 12, 15, or more) increases the sensitivity and/or specificity of determining the presence or absence of a disease or disorder, such as cancer, or of an increased risk of a disease or disorder, such as cancer. In some embodiments, the correlation between events across multiple chromosomes is used to detect a signal more effectively than by examining each event separately. The design of the method itself can be optimized to best classify tumors. This may be very useful for early detection and screening, in contrast to recurrence monitoring, where sensitivity to one particular mutation/CNV may be most important. In some embodiments, the events are not always correlated but have a probability of being correlated. In some embodiments, a matrix estimation formulation with a noise covariance matrix that has off-diagonal terms is used.
In some embodiments, the invention features a method for detecting a phenotype (such as a cancer phenotype) in an individual, wherein the phenotype is defined by the presence of at least one of a set of mutations. In some embodiments, the method includes obtaining DNA or RNA measurements for a sample of DNA or RNA from one or more cells of the individual, wherein one or more of the cells are suspected of having the phenotype, and analyzing the DNA or RNA measurements to determine, for each mutation in the set of mutations, the likelihood that at least one of the cells has that mutation. In some embodiments, the method includes determining that the individual has the phenotype if either (i) for at least one of the mutations, the likelihood that at least one of the cells contains that mutation is greater than a threshold, or (ii) for at least one of the mutations, the likelihood that at least one of the cells has that mutation is less than the threshold, and for a plurality of the mutations, the combined likelihood that at least one of the cells has at least one of the mutations is greater than the threshold. In some embodiments, one or more cells have a subset or all of the mutations in the set of mutations. In some embodiments, the subset of mutations is associated with cancer or an increased risk of cancer. In some embodiments, the set of mutations includes a subset or all of the mutations in the M class of cancer mutations (Ciriello et al., Nat Genet. 45(10):1127-1133, 2013, doi:10.1038/ng.2762, which is incorporated herein by reference in its entirety). In some embodiments, the set of mutations includes a subset or all of the mutations in the C class of cancer mutations (Ciriello, supra). In some embodiments, the sample includes cell-free DNA or RNA. In some embodiments, the DNA or RNA measurements include measurements at a set of polymorphic loci on one or more chromosomes or chromosome segments of interest.
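The two-branch decision rule above can be sketched as follows. The source does not say how the combined likelihood is computed; this example assumes, purely for illustration, that the mutations are independent, so the combined likelihood of at least one mutation is one minus the product of the per-mutation complements. The function names and numeric values are hypothetical.

```python
import math


def combined_likelihood(likelihoods):
    """Likelihood that at least one mutation is present.

    Assumes independence between mutations, which is an illustrative
    simplification; correlated events would need a joint model.
    """
    return 1.0 - math.prod(1.0 - p for p in likelihoods)


def has_phenotype(likelihoods, threshold):
    """Apply the two-branch rule: any single mutation, else the combination."""
    if any(p > threshold for p in likelihoods):
        return True  # branch (i): one mutation alone exceeds the threshold
    # branch (ii): no single mutation exceeds it, but the set may do so jointly
    return combined_likelihood(likelihoods) > threshold


# Three borderline mutations, none above 0.9 alone, jointly about 0.94.
print(has_phenotype([0.6, 0.7, 0.5], threshold=0.9))
# Two weak mutations, jointly about 0.28.
print(has_phenotype([0.1, 0.2], threshold=0.9))
```

This mirrors the point made earlier in this section that several borderline results, considered jointly, can amount to a strong signal.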
Exemplary methods for paternity testing or genetic relatedness testing
The methods of the invention can be used to improve the accuracy of paternity testing or other genetic relatedness testing (see, for example, U.S. Publication No. 2012/0122701, filed December 22, 2011, which is incorporated herein by reference in its entirety). For example, the multiplex PCR method allows thousands of polymorphic loci (such as SNPs) to be analyzed for use in the Parental Support algorithm described herein to determine whether an alleged father is the biological father of a fetus. In some embodiments, the invention features a method for determining whether an alleged father is the biological father of a fetus that is gestating in a pregnant mother. In some embodiments, the method involves obtaining phased genetic data for the alleged father (such as by using another method described herein for phasing genetic data), wherein the phased genetic data include the identity of the allele present at each locus in a set of polymorphic loci on a first homologous chromosome segment and a second homologous chromosome segment in the alleged father. In some embodiments, the method includes obtaining genetic data at the set of polymorphic loci on the chromosome or chromosome segment in a mixed sample of DNA containing fetal DNA and maternal DNA from the mother of the fetus by measuring the amount of each allele at each locus. In some embodiments, the method includes calculating, on a computer, expected genetic data for the mixed sample of DNA from the phased genetic data for the alleged father; determining, on a computer, the probability that the alleged father is the biological father of the fetus by comparing the genetic data obtained on the mixed sample of DNA with the expected genetic data for the mixed sample of DNA; and establishing whether the alleged father is the biological father of the fetus using the determined probability. In some embodiments, the method includes obtaining phased genetic data for the biological mother of the fetus (such as by using another method described herein for phasing genetic data), wherein the phased genetic data include the identity of the allele present at each locus in a set of polymorphic loci on a first homologous chromosome segment and a second homologous chromosome segment in the mother. In some embodiments, the method includes obtaining phased genetic data for the fetus (such as by using another method described herein for phasing genetic data), wherein the phased genetic data include the identity of the allele present at each locus in a set of polymorphic loci on a first homologous chromosome segment and a second homologous chromosome segment in the fetus. In some embodiments, the method includes calculating, on a computer, the expected genetic data for the mixed sample of DNA using the phased genetic data for the alleged father together with the phased genetic data for the mother and/or the phased genetic data for the fetus.
In some embodiments, the invention features a method for determining whether an alleged father is the biological father of a fetus that is gestating in a pregnant mother. In some embodiments, the method includes obtaining phased genetic data for the alleged father (such as by using another method described herein for phasing genetic data), wherein the phased genetic data include the identity of the allele present at each locus in a set of polymorphic loci on a first homologous chromosome segment and a second homologous chromosome segment in the alleged father. In some embodiments, the method includes obtaining genetic data at the set of polymorphic loci on the chromosome or chromosome segment in a mixed sample of DNA containing fetal DNA and maternal DNA from the mother of the fetus by measuring the amount of each allele at each locus. In some embodiments, the method includes identifying (i) alleles that are present in the fetal DNA but absent from the maternal DNA at polymorphic loci, and/or (ii) alleles that are absent from both the fetal DNA and the maternal DNA at polymorphic loci. In some embodiments, the method includes determining, on a computer, the probability that the alleged father is the biological father of the fetus, wherein the determination includes (1) comparing (i) the alleles that are present in the fetal DNA but absent from the maternal DNA at polymorphic loci with (ii) the alleles at the corresponding polymorphic loci in genetic material from the alleged father, and/or (2) comparing (i) the alleles that are absent from both the fetal DNA and the maternal DNA at polymorphic loci with (ii) the alleles at the corresponding polymorphic loci in genetic material from the alleged father; and establishing whether the alleged father is the biological father of the fetus using the determined probability.
In some embodiments, the method described above for determining whether an alleged father is the biological father of a fetus is used to determine whether an alleged relative of the fetus (such as a grandparent, sibling, aunt, or uncle) is an actual biological relative of the fetus (such as by using genetic data from the alleged relative instead of genetic data from the alleged father).
Exemplary combinations of methods
To increase the accuracy of the results, in some embodiments two or more methods for detecting the presence or absence of a CNV are performed (such as any of the methods of the invention or any known method). In some embodiments, one or more methods for analyzing a factor indicative of the presence or absence of a disease or disorder, or of an increased risk of a disease or disorder, are performed (such as any of the methods described herein or any known method).
In some embodiments, standard mathematical techniques are used to calculate the covariance and/or correlation between two or more methods. Standard mathematical techniques can also be used to determine the combined probability of a particular hypothesis based on two or more tests. Exemplary techniques include meta-analysis, Fisher's combined probability test for independent tests, Brown's method for combining dependent p-values with known covariance, and Kost's method for combining dependent p-values with unknown covariance. In cases where the likelihood is determined by a first method in a way that is orthogonal to, or uncorrelated with, the way the likelihood is determined by a second method, combining the likelihoods is straightforward and can be done by multiplication and normalization, or by using the following formula:
$$R_{comb} = \frac{R_1 R_2}{R_1 R_2 + (1 - R_1)(1 - R_2)}$$
Here Rcomb is the combined likelihood, and R1 and R2 are the individual likelihoods. For example, if the likelihood of trisomy from method 1 is 90% and the likelihood of trisomy from method 2 is 95%, then combining the outputs of the two methods allows the clinician to conclude that the fetus is trisomic with a likelihood of (0.90)(0.95)/[(0.90)(0.95) + (1 - 0.90)(1 - 0.95)] = 99.42%. In cases where the first and second methods are not orthogonal, that is, where there is a correlation between the two methods, the likelihoods can still be combined.
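The combination formula for two orthogonal methods can be checked numerically with a short sketch. The function name is hypothetical, and the code covers only the uncorrelated case; correlated methods would need one of the techniques for dependent tests mentioned above.

```python
def combine_likelihoods(r1, r2):
    """Combine two independent likelihoods for the same hypothesis.

    Multiplies the likelihoods for the hypothesis, multiplies the likelihoods
    against it, and normalizes so that the two outcomes sum to one.
    """
    for r in (r1, r2):
        if not 0.0 <= r <= 1.0:
            raise ValueError("likelihoods must be between 0 and 1")
    support = r1 * r2
    against = (1.0 - r1) * (1.0 - r2)
    if support + against == 0.0:
        raise ValueError("the two methods are certain and contradict each other")
    return support / (support + against)


# Worked example from the text: 90% from method 1 and 95% from method 2.
combined = combine_likelihoods(0.90, 0.95)
print(f"{combined:.2%}")
```

Two methods that individually fall short of a reporting threshold can exceed it together, which is the motivation for running more than one method on the same sample.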
Exemplary methods of analyzing multiple factors or variables are disclosed in U.S. Patent No. 8,024,128, issued September 20, 2011; U.S. Publication No. 2007/0027636, filed July 31, 2006; and U.S. Publication No. 2007/0178501, filed December 6, 2006 (each of which is incorporated herein by reference in its entirety).
In various embodiments, the combined probability of a particular hypothesis or diagnosis is greater than 80, 85, 90, 92, 94, 96, 98, 99, or 99.9%, or is greater than some other threshold.
Exemplary detection limits
In some embodiments, the limit of detection of a mutation (such as an SNV or CNV) by a method of the invention is less than or equal to 10, 5, 2, 1, 0.5, 0.1, 0.05, 0.01, or 0.005%. In some embodiments, the limit of detection of a mutation (such as an SNV or CNV) by a method of the invention is between 15 and 0.005%, such as between 10 and 0.005%, 10 and 0.01%, 10 and 0.1%, 5 and 0.005%, 5 and 0.01%, 5 and 0.1%, 1 and 0.005%, 1 and 0.01%, 1 and 0.1%, 0.5 and 0.005%, 0.5 and 0.01%, 0.5 and 0.1%, or 0.1 and 0.01%, inclusive. In some embodiments, the limit of detection is such that a mutation (such as an SNV or CNV) that is present in less than or equal to 10, 5, 2, 1, 0.5, 0.1, 0.05, 0.01, or 0.005% of the DNA or RNA molecules with that locus in a sample (such as a sample of cfDNA or cfRNA) is detected (or is capable of being detected). For example, the mutation can be detected even if less than or equal to 10, 5, 2, 1, 0.5, 0.1, 0.05, 0.01, or 0.005% of the DNA or RNA molecules that have that locus carry the mutation at the locus (instead of, for example, a wild-type or non-mutated version of the locus or a different mutation at that locus). In some embodiments, the limit of detection is such that a mutation (such as an SNV or CNV) that is present in less than or equal to 10, 5, 2, 1, 0.5, 0.1, 0.05, 0.01, or 0.005% of the DNA or RNA molecules in a sample (such as a sample of cfDNA or cfRNA) is detected (or is capable of being detected). In some embodiments in which the CNV is a deletion, the deletion can be detected even if it is present in only less than or equal to 10, 5, 2, 1, 0.5, 0.1, 0.05, 0.01, or 0.005% of the DNA or RNA molecules in the sample that have a region of interest that may or may not contain the deletion. In some embodiments in which the CNV is a deletion, the deletion can be detected even if it is present in only less than or equal to 10, 5, 2, 1, 0.5, 0.1, 0.05, 0.01, or 0.005% of the DNA or RNA molecules in the sample. In some embodiments in which the CNV is a duplication, the duplication can be detected even if the extra duplicated DNA or RNA that is present is less than or equal to 10, 5, 2, 1, 0.5, 0.1, 0.05, 0.01, or 0.005% of the DNA or RNA molecules in the sample that have a region of interest that may or may not be duplicated. In some embodiments in which the CNV is a duplication, the duplication can be detected even if the extra duplicated DNA or RNA that is present is less than or equal to 10, 5, 2, 1, 0.5, 0.1, 0.05, 0.01, or 0.005% of the DNA or RNA molecules in the sample. Example 6 provides exemplary methods for calculating the limit of detection. In some embodiments, the "LOD-zs5.0-mr5" method of Example 6 is used.
Exemplary sample
In some embodiments of any of the aspects of the invention, the sample includes cellular and/or extracellular genetic material from cells suspected of having a deletion or duplication, such as cells suspected of being cancerous. In some embodiments, the sample comprises any tissue or bodily fluid suspected of containing cells, DNA, or RNA having a deletion or duplication (such as cancer cells, DNA, or RNA). The genetic measurements used as part of these methods can be made on any sample comprising DNA or RNA, such as, but not limited to, tissue, blood, serum, plasma, urine, hair, tears, saliva, skin, fingernails, lymph, cervical mucus, semen, or other cells or materials comprising nucleic acids. Samples may include any cell type, or DNA or RNA from any cell type may be used (such as cells from any organ or tissue suspected of being cancerous, or neurons). In some embodiments, the sample includes nuclear and/or mitochondrial DNA. In some embodiments, the sample is from any of the target individuals disclosed herein. In some embodiments, the target individual is a born individual, a gestating fetus, a non-gestating fetus such as a products of conception sample, an embryo, or any other individual.
Exemplary samples include those containing cfDNA or cfRNA. In some embodiments, cfDNA is available for analysis without requiring the step of lysing cells. Cell-free DNA can be obtained from a variety of tissues, such as tissues that are in liquid form, for example blood, plasma, lymph, ascites fluid, or cerebrospinal fluid. In some cases, cfDNA is composed of DNA derived from fetal cells. In some cases, cfDNA is composed of DNA derived from both fetal and maternal cells. In some cases, the cfDNA is isolated from plasma obtained from whole blood that has been centrifuged to remove cellular material. The cfDNA can be a mixture of DNA derived from target cells (such as cancer cells) and non-target cells (such as non-cancerous cells).
In some embodiments, the sample contains or is suspected of containing a mixture of DNA (or RNA), such as a mixture of cancer DNA (or RNA) and non-cancerous DNA (or RNA). In some embodiments, at least 0.5, 1, 3, 5, 7, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 92, 94, 95, 96, 98, 99, or 100% of the cells in the sample are cancer cells. In some embodiments, at least 0.5, 1, 3, 5, 7, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 92, 94, 95, 96, 98, 99, or 100% of the DNA (such as cfDNA) or RNA (such as cfRNA) in the sample is from cancer cells. In various embodiments, the percentage of cells in the sample that are cancer cells is between 0.5 and 99%, such as between 1 and 95%, 5 and 95%, 10 and 90%, 5 and 70%, 10 and 70%, 20 and 90%, or 20 and 70%, inclusive. In some embodiments, the sample is enriched for cancer cells or for DNA or RNA from cancer cells. In some embodiments in which the sample is enriched for cancer cells, at least 0.5, 1, 3, 5, 7, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 92, 94, 95, 96, 98, 99, or 100% of the cells in the enriched sample are cancer cells. In some embodiments in which the sample is enriched for DNA or RNA from cancer cells, at least 0.5, 1, 3, 5, 7, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 92, 94, 95, 96, 98, 99, or 100% of the DNA or RNA in the enriched sample is from cancer cells. In some embodiments, cell sorting (such as fluorescence-activated cell sorting) is used to enrich for cancer cells (Barteneva et al., Biochim Biophys Acta 1836(1):105-22, August 2013, doi:10.1016/j.bbcan.2013.02.004, Epub February 24, 2013; and Ibrahim et al., Adv Biochem Eng Biotechnol 106:19-39, 2007, each of which is incorporated herein by reference in its entirety).
In some embodiments of any of the aspects of the invention, the sample comprises any tissue suspected of being at least partially of fetal origin. In some embodiments, the sample includes cellular and/or extracellular genetic material from the fetus, contaminating cellular and/or extracellular genetic material (such as genetic material from the mother of the fetus), or combinations thereof. In some embodiments, the sample comprises cellular genetic material from the fetus, contaminating cellular genetic material, or combinations thereof.
In some embodiments, the sample is from a gestating fetus. In some embodiments, the sample is from a non-gestating fetus, such as a products of conception sample or a sample from any fetal tissue after fetal demise. In some embodiments, the sample is a maternal whole blood sample, a sample derived from maternal blood, a maternal plasma sample, a maternal serum sample, an amniocentesis sample, a placental tissue sample (for example, chorionic villus, decidua, or placental membrane), a cervical mucus sample, or another sample from a fetus. In some embodiments, at least 3, 5, 7, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 92, 94, 95, 96, 98, 99, or 100% of the cells in the sample are maternal cells. In various embodiments, the percentage of cells in the sample that are maternal cells is between 5 and 99%, such as between 10 and 95%, 20 and 95%, 30 and 90%, 30 and 70%, 40 and 90%, 40 and 70%, 50 and 90%, or 50 and 80%.
In some embodiments, the sample is enriched for fetal cells. In some embodiments in which the sample is enriched for fetal cells, at least 0.5, 1, 2, 3, 4, 5, 6, 7% or more of the cells in the enriched sample are fetal cells. In some embodiments, the percentage of cells in the sample that are fetal cells is between 0.5 and 100%, such as between 1 and 99%, 5 and 95%, 10 and 95%, 20 and 90%, or 30 and 70%, inclusive. In some embodiments, the sample is enriched for fetal DNA. In some embodiments in which the sample is enriched for fetal DNA, at least 0.5, 1, 2, 3, 4, 5, 6, 7% or more of the DNA in the enriched sample is fetal DNA. In some embodiments, the percentage of DNA in the sample that is fetal DNA is between 0.5 and 100%, such as between 1 and 99%, 5 and 95%, 10 and 95%, 20 and 90%, or 30 and 70%, inclusive.
In some embodiments, the sample includes a single cell or includes DNA and/or RNA from a single cell. In some embodiments, multiple individual cells (for example, at least 5, 10, 20, 30, 40, or 50 cells from the same subject or from different subjects) are analyzed in parallel. In some embodiments, cells from multiple samples from the same individual are combined, which reduces the amount of work compared with analyzing the samples separately. Combining multiple samples can also allow multiple tissues to be tested for cancer simultaneously (which can be used to provide a more thorough screening for cancer or to determine whether cancer may have metastasized to other tissues).
In some embodiments, the sample contains a single cell or a small number of cells, such as 2, 3, 5, 6, 7, 8, 9, or 10 cells. In some embodiments, the sample has between 1 and 100, 100 and 500, or 500 and 1,000 cells, inclusive. In some embodiments, the sample contains 1 to 10 picograms, 10 to 100 picograms, 100 picograms to 1 nanogram, 1 to 10 nanograms, 10 to 100 nanograms, or 100 nanograms to 1 microgram of RNA and/or DNA.
In some embodiments, the sample is embedded in parafilm. In some embodiments, the sample is preserved with a preservative such as formaldehyde and optionally encased in paraffin, which may cause cross-linking of the DNA such that less of it is available for PCR. In some embodiments, the sample is a formaldehyde-fixed, paraffin-embedded (FFPE) sample. In some embodiments, the sample is a fresh sample (such as a sample obtained within 1 or 2 days of analysis). In some embodiments, the sample is frozen prior to analysis. In some embodiments, the sample is a historical sample.
Any of these samples can be used in any of the methods of the invention.
Exemplary sample preparation method
In some embodiments, the method includes isolating or purifying the DNA and/or RNA. A number of standard procedures known in the art can accomplish this. In some embodiments, the sample may be centrifuged to separate various layers. In some embodiments, the DNA or RNA may be isolated using filtration. In some embodiments, the preparation of the DNA or RNA may involve amplification, separation, purification by chromatography, liquid-liquid separation, isolation, preferential enrichment, preferential amplification, targeted amplification, or any of a number of other techniques known in the art. In some embodiments for the isolation of DNA, RNase is used to degrade RNA. In some embodiments for the isolation of RNA, DNase (such as DNase I from Invitrogen, Carlsbad, CA, USA) is used to degrade DNA. In some embodiments, an RNeasy mini kit (Qiagen) is used to isolate RNA according to the manufacturer's protocol. In some embodiments, small RNA molecules are isolated using the mirVana PARIS kit (Ambion, Austin, TX, USA) according to the manufacturer's protocol (Gu et al., J. Neurochem. 122:641-649, 2012, which is incorporated herein by reference in its entirety). The concentration and purity of RNA may optionally be determined using Nanovue (GE Healthcare, Piscataway, NJ, USA), and RNA integrity may optionally be measured using the 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) (Gu et al., J. Neurochem. 122:641-649, 2012, which is incorporated herein by reference in its entirety). In some embodiments, TRIZOL or RNAlater (Ambion) is used to stabilize RNA during storage.
In some embodiments, universal tagged adaptors are added to make a library. Prior to ligation, the sample DNA may be blunt-ended, and a single adenosine base is then added to the 3' end. Prior to ligation, the DNA may be cleaved using a restriction enzyme or some other cleavage method. During ligation, the 3' adenosine of the sample fragments and the complementary 3' thymine overhang of the adaptor can enhance ligation efficiency. In some embodiments, adaptor ligation is performed using the ligation kit found in the Agilent SureSelect kit. In some embodiments, the library is amplified using universal primers. In an embodiment, the amplified library is fractionated by size separation or by using products such as AGENCOURT AMPURE beads or other similar methods. In some embodiments, PCR amplification is used to amplify target loci. In some embodiments, the amplified DNA is sequenced (such as by using an ILLUMINA IIGAX or HiSeq sequencer). In some embodiments, the amplified DNA is sequenced from each end of the amplified DNA to reduce sequencing errors. If there is a sequence error at a particular base when sequencing from one end of the amplified DNA, there is less likely to be a sequence error at the complementary base when sequencing from the other end of the amplified DNA (compared with sequencing multiple times from the same end of the amplified DNA).
In some embodiments, whole genome amplification (WGA) is used to amplify a nucleic acid sample. A number of methods are available for WGA: ligation-mediated PCR (LM-PCR), degenerate oligonucleotide primed PCR (DOP-PCR), and multiple displacement amplification (MDA). In LM-PCR, short DNA sequences called adapters are ligated to the ends of the DNA. These adapters contain universal amplification sequences, which are used to amplify the DNA by PCR. In DOP-PCR, random primers that also contain universal amplification sequences are used in a first round of annealing and PCR. A second round of PCR is then used to amplify the sequences further with the universal primer sequences. MDA uses the phi-29 polymerase, a highly processive and non-specific enzyme that replicates DNA and has been used for single-cell analysis. In some embodiments, WGA is not performed.
In some embodiments, selective amplification or enrichment is used to amplify or enrich target loci. In some embodiments, the amplification and/or selective enrichment technique may involve PCR, such as ligation-mediated PCR, fragment capture by hybridization, molecular inversion probes, or other circularizing probes. In some embodiments, real-time quantitative PCR (RT-qPCR), digital PCR, emulsion PCR, or a single-allele base extension reaction followed by mass spectrometry is used (Hung et al., J Clin Pathol 62:308-313, 2009, which is incorporated herein by reference in its entirety). In some embodiments, capture by hybridization with hybrid capture probes is used to preferentially enrich the DNA. In some embodiments, methods for amplification or selective enrichment may involve using probes in which, upon correct hybridization to the target sequence, the 3' end or 5' end of the nucleotide probe is separated from the polymorphic site of a polymorphic allele by a small number of nucleotides. This separation reduces the preferential amplification of one allele, termed allele bias. This is an improvement over methods that use probes in which the 3' end or 5' end of a correctly hybridized probe is directly adjacent to or very near the polymorphic site of an allele. In one embodiment, probes in which the hybridizing region may or certainly does contain a polymorphic site are excluded. Polymorphic sites within the hybridization site can cause unequal hybridization, or inhibit hybridization altogether, for some alleles, resulting in preferential amplification of certain alleles. These embodiments are improvements over other methods that involve targeted amplification and/or selective enrichment in that they better preserve the original allele frequencies of the sample at each polymorphic locus, whether the sample is a pure genomic sample from a single individual or from a mixture of individuals.
In some embodiments, PCR (referred to as mini-PCR) is used to generate very short amplicons (U.S. Application No. 13/683,604, filed November 21, 2012, U.S. Publication No. 2013/0123120; U.S. Application No. 13/300,235, filed November 18, 2011, U.S. Publication No. 2012/0270212; and U.S. Application No. 61/994,791, filed May 16, 2014, each of which is incorporated herein by reference in its entirety). cfDNA (such as fetal cfDNA in maternal serum, or cancer cfDNA released by necrosis or apoptosis) is highly fragmented. For fetal cfDNA, the fragment sizes are distributed in an approximately Gaussian fashion with a mean of 160 bp, a standard deviation of 15 bp, a minimum size of about 100 bp, and a maximum size of about 220 bp. The polymorphic site of a particular target locus may occupy any position from start to end among the various fragments originating from that locus. Because cfDNA fragments are short, the likelihood that both primer sites are present, that is, the likelihood that a fragment of length L comprises both the forward and reverse primer sites, depends on the ratio of the amplicon length to the fragment length. Under ideal conditions, assays in which the amplicon is 45, 50, 55, 60, 65, or 70 bp will successfully amplify from 72%, 69%, 66%, 63%, 59%, or 56%, respectively, of the available template fragment molecules. In certain embodiments that relate most preferably to cfDNA from samples of individuals suspected of having cancer, the cfDNA is amplified using primers that yield a maximum amplicon length of 85, 80, 75, or 70 bp (in certain preferred embodiments, 75 bp) and that have a melting temperature of 50 to 65 °C (in certain preferred embodiments, 54 to 60.5 °C). The amplicon length is the distance between the 5' ends of the forward and reverse priming sites. Amplicon lengths shorter than those typically used in the art may allow the desired polymorphic loci to be measured more efficiently, because only short sequence reads are required. In one embodiment, most of the amplicons are less than 100 bp, less than 90 bp, less than 80 bp, less than 70 bp, less than 65 bp, less than 60 bp, less than 55 bp, less than 50 bp, or less than 45 bp.
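The percentages quoted for 45 to 70 bp amplicons can be approximated with a short calculation. The sketch below is illustrative only and rests on an assumption that is not spelled out in the text: for a fragment of length L and an amplicon of length A, the fraction of fragments containing both priming sites is taken to be (L - A)/L, evaluated at the stated mean fetal cfDNA fragment length of 160 bp. The function name `amplifiable_fraction` is hypothetical.

```python
def amplifiable_fraction(amplicon_bp: int, fragment_bp: int = 160) -> float:
    """Approximate fraction of template fragments that contain both priming sites.

    Assumption (not stated in the source): the amplicon position is uniformly
    distributed along the fragment, so the fraction is (L - A) / L.
    """
    if fragment_bp <= 0:
        raise ValueError("fragment_bp must be positive")
    if amplicon_bp >= fragment_bp:
        return 0.0  # the amplicon cannot fit inside the fragment
    return (fragment_bp - amplicon_bp) / fragment_bp


# Amplicon lengths listed in the text, evaluated at the 160 bp mean fragment length.
for amplicon in (45, 50, 55, 60, 65, 70):
    print(amplicon, "bp:", f"{amplifiable_fraction(amplicon):.3f}")
```

Under this assumption the computed fractions fall within about half a percentage point of the listed values, which suggests the listed percentages were derived from the mean fragment length rather than from the full size distribution.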
In some embodiments, amplification is performed using direct multiplexed PCR, sequential PCR, nested PCR, doubly nested PCR, one-and-a-half-sided nested PCR, fully nested PCR, one-sided fully nested PCR, one-sided nested PCR, hemi-nested PCR, triply hemi-nested PCR, semi-nested PCR, one-sided semi-nested PCR, reverse semi-nested PCR, or one-sided PCR, which are described in U.S. Application No. 13/683,604, filed November 21, 2012, U.S. Application No. 13/300,235, filed November 18, 2011, U.S. Publication No. 2012/0270212, and U.S. Application No. 61/994,791, filed May 16, 2014, each of which is incorporated herein by reference in its entirety. If desired, any of these methods can be used for mini-PCR.
If desired, the extension step of the PCR amplification may be limited in time to reduce amplification from fragments longer than 200 nucleotides, 300 nucleotides, 400 nucleotides, 500 nucleotides, or 1,000 nucleotides. This may result in the enrichment of fragmented or shorter DNA (such as fetal DNA or DNA from cancer cells that have undergone apoptosis or necrosis) and in improved test performance.
In some embodiments, multiplex PCR is used. In some embodiments, the method of amplifying target loci in a nucleic acid sample includes (i) contacting the nucleic acid sample with a library of primers that simultaneously hybridize to at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci to produce a reaction mixture; and (ii) subjecting the reaction mixture to primer extension reaction conditions (such as PCR conditions) to produce amplified products that include target amplicons. In some embodiments, at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the target loci are amplified. In various embodiments, less than 60%, 50%, 40%, 30%, 20%, 10%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.25%, 0.1%, or 0.05% of the amplified products are primer dimers. In some embodiments, the primers are in solution (such as being dissolved in the liquid phase rather than in a solid phase). In some embodiments, the primers are in solution and are not immobilized on a solid support. In some embodiments, the primers are not part of a microarray. In some embodiments, the primers do not include molecular inversion probes (MIPs).
In some embodiments, two or more (such as 3 or 4) target amplicons (such as amplicons from the miniPCR method disclosed herein) are ligated together, and the ligated products are then sequenced. Combining multiple amplicons into a single ligation product increases the efficiency of the subsequent sequencing step. In some embodiments, the target amplicons are less than 150, 100, 90, 75, or 50 base pairs in length before they are ligated. The selective enrichment and/or amplification may involve tagging each individual molecule with different tags, molecular barcodes, tags for amplification, and/or tags for sequencing. In some embodiments, the amplified products are analyzed by sequencing (such as by high-throughput sequencing) or by hybridization to an array, such as a SNP array, the ILLUMINA INFINIUM array, or the AFFYMETRIX gene chip. In some embodiments, nanopore sequencing is used, such as the nanopore sequencing technology developed by Genia (see, for example, the world wide web at geniachip.com/technology, which is incorporated herein by reference in its entirety). In some embodiments, duplex sequencing is used (Schmitt et al., "Detection of ultra-rare mutations by next-generation sequencing," Proc Natl Acad Sci USA 109(36):14508-14513, 2012, which is incorporated herein by reference in its entirety). This approach greatly reduces errors by independently tagging and sequencing each of the two strands of a DNA duplex. Because the two strands are complementary, true mutations are found at the same position in both strands. In contrast, PCR or sequencing errors produce mutations in only one strand and can therefore be discounted as technical error. In some embodiments, the method entails tagging both strands of duplex DNA with a random yet complementary double-stranded nucleotide sequence, referred to as a duplex tag. Double-stranded tag sequences are incorporated into standard sequencing adapters by first introducing a single-stranded randomized nucleotide sequence into one adapter strand and then extending the opposite strand with a DNA polymerase to yield a complementary, double-stranded tag. Following ligation of the tagged adapters to sheared DNA, the individually labeled strands are PCR amplified from asymmetric primer sites on the adapter tails and subjected to paired-end sequencing. In some embodiments, a sample (such as a DNA or RNA sample) is divided into multiple fractions, such as different wells (e.g., wells of a WaferGen SmartChip). Dividing the sample into different fractions (such as at least 5, 10, 20, 50, 75, 100, 150, 200, or 300 fractions) can increase the sensitivity of the analysis, because the percentage of molecules with a mutation is higher in some of the wells than in the bulk sample. In some embodiments, each fraction has fewer than 500, 400, 200, 100, 50, 20, 10, 5, 2, or 1 DNA or RNA molecules. In some embodiments, the molecules in each fraction are sequenced separately. In some embodiments, the same barcode (such as a random or non-human sequence) is added to all the molecules in the same fraction (such as by amplification with a primer containing the barcode or by ligation of a barcode), and different barcodes are added to the molecules in different fractions. The barcoded molecules can be pooled and sequenced together. In some embodiments, the molecules are amplified before they are pooled and sequenced, such as by using nested PCR. In some embodiments, one forward and two reverse primers, or two forward and one reverse primer, are used.
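The error-filtering logic of duplex sequencing described above can be sketched as follows. This is a minimal illustration, not the published pipeline: the input format (tuples of duplex tag, strand, position, and alternate base already expressed relative to the reference strand) and the function name `duplex_calls` are assumptions made for this example.

```python
from collections import defaultdict


def duplex_calls(observations):
    """Keep only variants seen on both strands of the same tagged duplex.

    observations: iterable of (duplex_tag, strand, position, alt_base), where
    strand is "+" or "-" and alt_base is expressed relative to the reference
    strand (hypothetical input format chosen for this sketch).
    """
    strands_seen = defaultdict(set)
    for tag, strand, position, alt_base in observations:
        strands_seen[(tag, position, alt_base)].add(strand)
    # A true mutation appears at the same position on both complementary strands;
    # a PCR or sequencing error typically appears on one strand only.
    return {key for key, strands in strands_seen.items() if strands == {"+", "-"}}


example = [
    ("TAG1", "+", 101, "T"),
    ("TAG1", "-", 101, "T"),  # confirmed on the opposite strand
    ("TAG2", "+", 250, "G"),  # single-strand observation, discounted
]
print(duplex_calls(example))
```

In this example only the variant at position 101 of the duplex tagged `TAG1` is retained, because the position-250 observation lacks support from the complementary strand.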
In some embodiments, a mutation (such as an SNV or CNV) that is present in less than 10%, 5%, 2%, 1%, 0.5%, 0.1%, 0.05%, 0.01%, or 0.005% of the DNA or RNA molecules in a sample (such as a sample of cfDNA or cfRNA) is detected (or is capable of being detected). In some embodiments, a mutation (such as an SNV or CNV) that is present in fewer than 1,000, 500, 100, 50, 20, 10, 5, 4, 3, or 2 original DNA or RNA molecules (before amplification) in a sample (such as a sample of cfDNA or cfRNA from, e.g., a blood sample) is detected (or is capable of being detected). In some embodiments, a mutation (such as an SNV or CNV) that is present in only 1 original DNA or RNA molecule (before amplification) in a sample (such as a sample of cfDNA or cfRNA from, e.g., a blood sample) is detected (or is capable of being detected).
For example, if the limit of detection for a mutation (such as a single nucleotide variant (SNV)) is 0.1%, a mutation present at 0.01% can be detected by dividing the sample into multiple fractions (such as 100 wells). Most of the wells contain no copies of the mutation. In the few wells that do contain the mutation, it accounts for a much higher percentage of the reads. In one example, there are 20,000 initial copies of DNA from a target locus, and two of those copies include an SNV of interest. If the sample is divided into 100 wells, each well receives about 200 copies; 98 wells are expected to contain no copies of the SNV, and 2 wells contain the SNV at 0.5%. The DNA in each well can be barcoded, amplified, pooled with the DNA from the other wells, and sequenced. Wells without the SNV can be used to measure the background amplification/sequencing error rate, to determine whether the signal from the outlier wells is above the background level of noise.
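The arithmetic of this partitioning example can be checked with a few lines of code. The sketch below is illustrative; the function name `partition_summary` is hypothetical, and it assumes that copies are split evenly across wells and that each mutant copy lands in a different well.

```python
def partition_summary(total_copies: int, mutant_copies: int, n_wells: int):
    """Return (bulk mutant fraction, mutant fraction in a positive well, wells without mutant).

    Assumptions for this sketch: copies are divided evenly across wells, and
    each mutant copy falls into a different well.
    """
    copies_per_well = total_copies / n_wells
    bulk_fraction = mutant_copies / total_copies
    positive_well_fraction = 1 / copies_per_well
    wells_without_mutant = n_wells - min(mutant_copies, n_wells)
    return bulk_fraction, positive_well_fraction, wells_without_mutant


bulk, per_well, empty = partition_summary(total_copies=20_000, mutant_copies=2, n_wells=100)
print(f"bulk: {bulk:.2%}, positive well: {per_well:.1%}, wells without SNV: {empty}")
```

With the numbers from the text, the bulk fraction is 0.01% (below the 0.1% limit of detection), while each positive well carries the SNV at 0.5% (above it), and 98 wells remain available for estimating the background error rate.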
In some embodiments, the amplified products are detected using an array, in particular a microarray with probes for one or more chromosomes of interest (e.g., chromosome 13, 18, 21, X, Y, or any combination thereof). It will be understood, for example, that a commercially available SNP detection microarray can be used, such as the Illumina (San Diego, CA) GoldenGate, DASL, Infinium, or CytoSNP-12 genotyping assay, or a SNP detection microarray product from Affymetrix, such as the OncoScan microarray. In some embodiments, phased genetic data for one or both biological parents of an embryo or fetus are used to increase the accuracy of the analysis of array data from a single cell.
In some embodiments involving sequencing, the depth of read is the number of sequencing reads that map to a given locus. The depth of read may be normalized over the total number of reads. In some embodiments, for the depth of read of a sample, the depth of read is the average depth of read over the targeted loci. In some embodiments, for the depth of read of a locus, the depth of read is the number of reads measured by the sequencer that map to that locus. In general, the greater the depth of read at a locus, the closer the ratio of alleles at the locus tends to be to the ratio of alleles in the original DNA sample. Depth of read can be expressed in a variety of ways, including but not limited to a percentage or a proportion. Thus, for example, in a highly parallel DNA sequencer such as an Illumina HISEQ that produces, for example, sequences of 1 million clones, sequencing one locus 3,000 times results in a depth of read of 3,000 reads at that locus. The proportion of reads at that locus is 3,000 divided by 1 million total reads, or 0.3% of the total reads.
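The depth-of-read definitions above translate directly into code. The following sketch is illustrative only; the function names and the per-locus counts in the usage lines are hypothetical.

```python
from statistics import mean


def read_proportion(locus_reads: int, total_reads: int) -> float:
    """Depth of read at one locus, normalized over the total number of reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return locus_reads / total_reads


def sample_depth(locus_depths: dict) -> float:
    """Depth of read of a sample: the average depth over the targeted loci."""
    return mean(locus_depths.values())


# Example from the text: 3,000 reads at one locus out of 1 million total reads.
print(f"{read_proportion(3_000, 1_000_000):.1%}")  # prints 0.3%

# Hypothetical per-locus depths, for illustration only.
print(sample_depth({"locus_1": 2_800, "locus_2": 3_200}))
```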
In some embodiments, allelic data are obtained, wherein the allelic data include quantitative measurements indicative of the number of copies of a specific allele at a polymorphic locus. In some embodiments, the allelic data include quantitative measurements indicative of the number of copies of each allele observed at a polymorphic locus. Typically, quantitative measurements are obtained for all possible alleles of the polymorphic locus of interest. For example, any of the methods discussed in the preceding paragraphs for determining the allele at a SNP or SNV locus (such as microarrays, qPCR, or DNA sequencing, such as high-throughput DNA sequencing) can be used to generate quantitative measurements of the number of copies of a specific allele at a polymorphic locus. These quantitative measurements are referred to herein as allele frequency data or measured genetic allelic data. Methods that use allelic data are sometimes referred to as quantitative allelic methods; this is in contrast to quantitative methods, which exclusively use quantitative data from non-polymorphic loci, or from polymorphic loci without regard to allelic identity. When the allelic data are measured using high-throughput sequencing, the allelic data typically include the number of reads of each allele mapping to the locus of interest.
In some embodiments, non-allelic data are obtained, wherein the non-allelic data include quantitative measurements indicative of the number of copies of a specific locus. The locus may be polymorphic or non-polymorphic. In some embodiments, when the locus is non-polymorphic, the non-allelic data do not contain information about the relative or absolute quantity of the individual alleles that may be present at that locus. Methods that use only non-allelic data (that is, quantitative data from non-polymorphic loci, or quantitative data from polymorphic loci without regard to the allelic identity of each fragment) are referred to as quantitative methods. Typically, quantitative measurements are obtained for all possible alleles of the polymorphic locus of interest, with one value associated with the total measured quantity of all alleles at that locus. Non-allelic data for a polymorphic locus may be obtained by summing the quantitative allelic data for each allele at that locus. When the allelic data are measured using high-throughput sequencing, the non-allelic data typically include the number of reads mapping to the locus of interest. The sequencing measurements can indicate the relative and/or absolute number of each of the alleles present at the locus, and the non-allelic data include the sum of the reads mapping to the locus, regardless of allelic identity. In some embodiments, the same set of sequencing measurements can be used to yield both allelic data and non-allelic data. In some embodiments, the allelic data are used as part of one method for determining the copy number of a chromosome of interest, and the resulting non-allelic data can be used as part of a different method for determining the copy number of the chromosome of interest. In some embodiments, the two methods are statistically orthogonal and are combined to give a more accurate determination of the copy number of the chromosome of interest.
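The relationship between allelic and non-allelic data can be illustrated with a small example. In the sketch below, the data layout (a mapping from locus name to per-allele read counts), the function names, and all counts are hypothetical and chosen only for illustration.

```python
def non_allelic_depth(allelic_counts: dict) -> dict:
    """Non-allelic data: total reads per locus, ignoring allelic identity."""
    return {locus: sum(counts.values()) for locus, counts in allelic_counts.items()}


def allele_fractions(counts: dict) -> dict:
    """Allelic data expressed as the fraction of reads carrying each allele."""
    total = sum(counts.values())
    if total == 0:
        return {allele: 0.0 for allele in counts}
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical read counts per allele from one sequencing run.
allelic = {
    "locus_1": {"A": 520, "B": 480},
    "locus_2": {"A": 1000, "B": 0},
}

print(non_allelic_depth(allelic))            # summed reads per locus
print(allele_fractions(allelic["locus_1"]))  # per-allele fractions at one locus
```

The same set of read counts yields both views, which mirrors the statement that one set of sequencing measurements can produce allelic and non-allelic data.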
In some embodiments, obtaining genetic data includes (i) acquiring DNA sequence information by laboratory techniques, e.g., by using an automated high-throughput DNA sequencer, or (ii) acquiring information that was previously obtained by laboratory techniques, wherein the information is electronically transmitted, e.g., by a computer over the Internet or by electronic transfer from the sequencing device.
Additional exemplary sample preparation, amplification, and quantification methods are described in U.S. Application No. 13/683,604, filed November 21, 2012 (U.S. Publication No. 2013/0123120), and U.S. Application No. 61/994,791, filed May 16, 2014, each of which is incorporated herein by reference in its entirety. These methods can be used to analyze any of the samples disclosed herein.
Exemplary quantification methods for cell-free DNA
If desired, the amount or concentration of cfDNA or cfRNA can be measured using standard methods. In some embodiments, the amount or concentration of cell-free mitochondrial DNA (cf mDNA) is determined. In some embodiments, the amount or concentration of cell-free DNA that originated from nuclear DNA (cf nDNA) is determined. In some embodiments, the amounts or concentrations of cf mDNA and cf nDNA are determined simultaneously.
In some embodiments, qPCR is used to measure cf nDNA and/or cf mDNA (Kohler et al., "Levels of plasma circulating cell free nuclear and mitochondrial DNA as potential biomarkers for breast tumors," Mol Cancer 8:105, 2009, doi:10.1186/1476-4598-8-105, which is incorporated herein by reference in its entirety). For example, one or more loci from cf nDNA (such as glyceraldehyde-3-phosphate dehydrogenase, GAPDH) and one or more loci from cf mDNA (such as ATPase 8, MTATP 8) can be measured using multiplex qPCR. In some embodiments, fluorescence-labelled PCR is used to measure cf nDNA and/or cf mDNA (Schwarzenbach et al., "Evaluation of cell-free tumour DNA and RNA in patients with breast cancer and benign breast disease," Mol Biosyst 7:2848-2854, 2011, which is incorporated herein by reference in its entirety). If desired, the normality of the data distribution can be assessed using standard methods, such as the Shapiro-Wilk test. If desired, cf nDNA and cf mDNA levels can be compared using standard methods, such as the Mann-Whitney U test. In some embodiments, cf nDNA and/or cf mDNA levels are compared with other established prognostic factors using standard methods, such as the Mann-Whitney U test or the Kruskal-Wallis test.
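As an illustration of the kind of rank-based comparison mentioned above, the following sketch computes the Mann-Whitney U statistic by direct pairwise comparison. It is a teaching sketch under stated assumptions: it returns only the U statistic (no p-value), the function name is hypothetical, and the input values are made-up numbers rather than measured cfDNA levels. In practice a statistics package would normally be used; SciPy, for example, provides `scipy.stats.shapiro`, `scipy.stats.mannwhitneyu`, and `scipy.stats.kruskal`.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x versus sample y.

    Counts the (xi, yj) pairs in which xi exceeds yj; ties contribute 0.5.
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical values for two groups (illustration only, not real data).
group_a = [1.2, 3.4, 2.2]
group_b = [0.8, 1.0, 2.0]

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
print(u_a, u_b)  # the two statistics always sum to len(group_a) * len(group_b)
```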
Exemplary RNA amplification, quantification, and analysis methods
Any of the following exemplary methods may be used to amplify and optionally quantify RNA, such as cfRNA, cellular RNA, cytoplasmic RNA, coding cytoplasmic RNA, non-coding cytoplasmic RNA, mRNA, miRNA, mitochondrial RNA, rRNA, or tRNA. In some embodiments, the miRNA is any of the miRNA molecules listed in the miRBase database, available on the world wide web at mirbase.org, which is incorporated herein by reference in its entirety. Exemplary miRNA molecules include miR-509, miR-21, and miR-146a.
In some embodiments, reverse-transcriptase multiplex ligation-dependent probe amplification (RT-MLPA) is used to amplify RNA. In some embodiments, each set of hybridizing probes consists of two short synthetic oligonucleotides spanning the SNP and one long oligonucleotide (Li et al., "Development of noninvasive prenatal diagnosis of trisomy 21 by RT-MLPA with a new set of SNP markers," Arch Gynecol Obstet, July 5, 2013, DOI 10.1007/s00404-013-2926-5; Schouten et al., "Relative quantification of 40 nucleic acid sequences by multiplex ligation-dependent probe amplification," Nucleic Acids Research 30:e57, 2002; Deng et al., "Non-invasive prenatal diagnosis of trisomy 21 by reverse transcriptase multiplex ligation-dependent probe amplification," Clin Chem Lab Med 49:641-646, 2011, each of which is incorporated herein by reference in its entirety).
In some embodiments, RNA is amplified with reverse-transcriptase PCR. In some embodiments, RNA is amplified with real-time reverse-transcriptase PCR, such as one-step real-time reverse-transcriptase PCR with SYBR GREEN I, as previously described (Li et al., "Development of noninvasive prenatal diagnosis of trisomy 21 by RT-MLPA with a new set of SNP markers," Arch Gynecol Obstet, July 5, 2013, DOI 10.1007/s00404-013-2926-5; Lo et al., "Plasma placental RNA allelic ratio permits noninvasive prenatal chromosomal aneuploidy detection," Nat Med 13:218-223, 2007; Tsui et al., "Systematic micro-array based identification of placental mRNA in maternal plasma: towards non-invasive prenatal gene expression profiling," J Med Genet 41:461-467, 2004; Gu et al., J. Neurochem. 122:641-649, 2012, each of which is incorporated herein by reference in its entirety).
In some embodiments, a microarray is used to detect RNA. For example, a human miRNA microarray from Agilent Technologies can be used according to the manufacturer's protocol. Briefly, isolated RNA is dephosphorylated and ligated with pCp-Cy3. The labeled RNA is purified and hybridized to miRNA arrays containing probes for human mature miRNAs based on Sanger miRBase release 14.0. The arrays are washed and scanned using a microarray scanner (G2565BA, Agilent Technologies). The intensity of each hybridization signal is evaluated with Agilent extraction software v9.5.3. The labeling, hybridization, and scanning may be performed according to the protocols of the Agilent miRNA microarray system (Gu et al., J. Neurochem. 122:641-649, 2012, which is incorporated herein by reference in its entirety).
In some embodiments, a TaqMan assay is used to detect RNA. An exemplary assay is the TaqMan Array Human MicroRNA Panel v1.0 (Early Access) (Applied Biosystems), which contains 157 TaqMan MicroRNA Assays, including the respective reverse-transcription primers, PCR primers, and TaqMan probes (Chim et al., "Detection and characterization of placental microRNAs in maternal plasma," Clin Chem 54(3):482-90, 2008, which is incorporated herein by reference in its entirety).
If desired, the mRNA splicing pattern of one or more mRNAs can be determined using standard methods (Fackenthal and Godley, Disease Models & Mechanisms 1:37-42, 2008, doi:10.1242/dmm.000331, which is incorporated herein by reference in its entirety). For example, high-density microarrays and/or high-throughput DNA sequencing can be used to detect mRNA splice variants.
In some embodiments, whole transcriptome shotgun sequencing or an array is used to measure the transcriptome.
Exemplary amplification methods
Improved PCR amplification methods have also been developed that minimize or prevent interference caused by the amplification of nearby or adjacent target loci in the same reaction volume (such as part of the same multiplex PCR reaction that simultaneously amplifies all the target loci). These methods can be used to amplify nearby or adjacent target loci simultaneously, which is faster and cheaper than having to separate nearby target loci into different reaction volumes so that they can be amplified separately to avoid interference.
In some embodiments, the amplification of target loci is performed using a polymerase (e.g., a DNA polymerase, RNA polymerase, or reverse transcriptase) with low 5'→3' exonuclease activity and/or low strand displacement activity. In some embodiments, the low level of 5'→3' exonuclease activity reduces or prevents the degradation of a nearby primer (e.g., an unextended primer or a primer that has had one or more nucleotides added during primer extension). In some embodiments, the low level of strand displacement activity reduces or prevents the displacement of a nearby primer (e.g., an unextended primer or a primer that has had one or more nucleotides added during primer extension). In some embodiments, target loci that are adjacent to each other (e.g., with no bases between the target loci) or nearby (e.g., loci within 50, 40, 30, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 base of each other) are amplified. In some embodiments, the 3' end of one locus is within 50, 40, 30, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 base of the 5' end of the next downstream locus.
In some embodiments, at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci are amplified, such as by simultaneous amplification in one reaction volume. In some embodiments, at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the amplified products are target amplicons. In various embodiments, the amount of amplified products that are target amplicons is 50 to 99.5%, such as 60 to 99%, 70 to 98%, 80 to 98%, 90 to 99.5%, or 95 to 99.5%. In some embodiments, at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the target loci are amplified (e.g., amplified at least 5, 10, 20, 30, 50, or 100-fold), such as by simultaneous amplification in one reaction volume. In various embodiments, the amount of target loci that are amplified (e.g., amplified at least 5, 10, 20, 30, 50, or 100-fold compared to the amount before amplification) is 50 to 99.5%, such as 60 to 99%, 70 to 98%, 80 to 99%, 90 to 99.5%, 95 to 99.9%, or 98 to 99.99%. In some embodiments, fewer non-target amplicons are produced, such as fewer amplicons formed from a forward primer from a first primer pair and a reverse primer from a second primer pair. Such undesired non-target amplicons can be produced by prior amplification methods if, for example, the reverse primer from the first primer pair and/or the forward primer from the second primer pair is degraded and/or displaced.
In some embodiments, these methods allow longer extension times to be used, because the polymerase bound to a primer being extended is less likely to degrade and/or displace a nearby primer (such as the next downstream primer), given the low 5'→3' exonuclease activity and/or low strand displacement activity of the polymerase. In various embodiments, reaction conditions (such as the extension time and temperature) are used such that the extension rate of the polymerase allows the number of nucleotides added to a primer being extended to be equal to or greater than 80, 90, 95, 100, 110, 120, 130, 140, 150, 175, or 200% of the number of nucleotides between the 3' end of the primer binding site and the 5' end of the next downstream primer binding site on the same strand.
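The percentage described in this paragraph can be computed from the extension rate, the extension time, and the gap between neighboring primer binding sites. The sketch below is illustrative only: it assumes a constant extension rate during the extension step, the function names are hypothetical, and the numbers in the usage lines are placeholders rather than measured polymerase properties.

```python
def extension_percent(rate_nt_per_s: float, extension_s: float, gap_nt: int) -> float:
    """Nucleotides added in one extension step, as a percentage of the gap.

    gap_nt is the number of nucleotides between the 3' end of the primer
    binding site and the 5' end of the next downstream primer binding site
    on the same strand. Assumes a constant extension rate (a simplification).
    """
    if gap_nt <= 0:
        raise ValueError("gap_nt must be positive")
    return 100.0 * rate_nt_per_s * extension_s / gap_nt


def required_extension_time(rate_nt_per_s: float, gap_nt: int, target_percent: float) -> float:
    """Extension time needed to reach a target percentage of the gap."""
    if rate_nt_per_s <= 0:
        raise ValueError("rate_nt_per_s must be positive")
    return target_percent / 100.0 * gap_nt / rate_nt_per_s


# Placeholder values: 10 nt/s for 6 s across a 50-nucleotide gap.
print(extension_percent(10, 6, 50))
print(required_extension_time(10, 50, 200))
```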
In some embodiments, a DNA polymerase is used to produce DNA amplicons using DNA as a template. In some embodiments, an RNA polymerase is used to produce RNA amplicons using DNA as a template. In some embodiments, a reverse transcriptase is used to produce cDNA amplicons using RNA as a template.
In some embodiments, the low level of 5'→3' exonuclease activity of the polymerase is less than 80%, 70%, 60%, 50%, 40%, 30%, 20%, 10%, 5%, 1%, or 0.1% of the activity of the same amount of Thermus aquaticus polymerase ("Taq" polymerase, a commonly used DNA polymerase from a thermophilic bacterium, PDB 1BGX, EC 2.7.7.7; Murali et al., "Crystal structure of Taq DNA polymerase in complex with an inhibitory Fab: the Fab is directed against an intermediate in the helix-coil dynamics of the enzyme," Proc. Natl. Acad. Sci. USA 95:12562-12567, 1998, which is incorporated herein by reference in its entirety) under the same conditions. In some embodiments, the low level of strand displacement activity of the polymerase is less than 80%, 70%, 60%, 50%, 40%, 30%, 20%, 10%, 5%, 1%, or 0.1% of the activity of the same amount of Taq polymerase under the same conditions.
In some embodiments, the polymerase is a PHUSION DNA polymerase, such as PHUSION High-Fidelity DNA polymerase (M0530S, New England BioLabs, Inc.) or PHUSION Hot Start Flex DNA polymerase (M0535S, New England BioLabs, Inc.; Frey and Suppmann, Biochemica 2:34-35, 1995; Chester and Marshak, Analytical Biochemistry 209:284-290, 1993, each of which is incorporated herein by reference in its entirety). PHUSION DNA polymerase is a Pyrococcus-like enzyme fused with a processivity-enhancing domain. PHUSION DNA polymerase possesses 5'→3' polymerase activity and 3'→5' exonuclease activity, and generates blunt-ended products. PHUSION DNA polymerase lacks 5'→3' exonuclease activity and strand displacement activity.
In some embodiments, the polymerase is a Q5 DNA polymerase, such as Q5 High-Fidelity DNA polymerase (M0491S, New England BioLabs, Inc.) or Q5 Hot Start High-Fidelity DNA polymerase (M0493S, New England BioLabs, Inc.). Q5 High-Fidelity DNA polymerase is a high-fidelity, thermostable DNA polymerase with 3'→5' exonuclease activity, fused to a processivity-enhancing Sso7d domain. Q5 High-Fidelity DNA polymerase lacks 5'→3' exonuclease activity and strand displacement activity.
In some embodiments, the polymerase is T4 DNA polymerase (M0203S, New England BioLabs, Inc.; Tabor and Struhl (1989), "DNA-Dependent DNA Polymerases," in Ausubel et al. (Eds.), Current Protocols in Molecular Biology, 3.5.10-3.5.12, New York: John Wiley & Sons, Inc., 1989; Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd ed.), 5.44-5.47, Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 1989, each of which is incorporated herein by reference in its entirety). T4 DNA polymerase catalyzes the synthesis of DNA in the 5'→3' direction and requires the presence of a template and a primer. The enzyme has a 3'→5' exonuclease activity that is much more active than that found in DNA polymerase I. T4 DNA polymerase lacks 5'→3' exonuclease activity and strand displacement activity.
In some embodiments, the polymerase is a Sulfolobus DNA polymerase IV (M0327S, New England BioLabs, Inc.; Boudsocq et al. (2001), Nucleic Acids Research, 29:4607-4616; McDonald et al. (2006), Nucleic Acids Research, 34:1102-1111, each of which is incorporated herein by reference in its entirety). Sulfolobus DNA polymerase IV is a thermostable Y-family lesion-bypass DNA polymerase that synthesizes DNA across a variety of DNA template lesions (McDonald et al. (2006), Nucleic Acids Research, 34:1102-1111, incorporated herein by reference in its entirety). Sulfolobus DNA polymerase IV lacks 5'→3' exonuclease activity and strand displacement activity.
In some embodiments, if a primer binds a region containing a SNP, the primer may bind and amplify the different alleles with different efficiencies, or may bind and amplify only one allele. For a heterozygous subject, one of the alleles may not be amplified by the primer. In some embodiments, a primer is designed for each allele. For example, if there are two alleles (e.g., a biallelic SNP), two primers can be used to bind the same position of the target locus (e.g., one forward primer that binds the "A" allele and one forward primer that binds the "B" allele). Standard resources, such as the dbSNP database, can be used to determine the positions of known SNPs, such as SNP hotspots with a high heterozygosity rate.
In some embodiments, the amplicons are similar in size. In some embodiments, the range of target amplicon lengths is less than 100, 75, 50, 25, 15, 10, or 5 nucleotides. In some embodiments (such as the amplification of target loci in fragmented DNA or RNA), the length of the target amplicons is 50 to 100 nucleotides, such as 60 to 80 nucleotides or 60 to 75 nucleotides. In some embodiments (such as the amplification of multiple target loci throughout an exon or gene), the length of the target amplicons is 100 to 500 nucleotides, such as 150 to 450 nucleotides, 200 to 400 nucleotides, 200 to 300 nucleotides, or 300 to 400 nucleotides.
In some embodiments, multiple target loci are amplified simultaneously using primer pairs that include a forward and a reverse primer for each target locus to be amplified in the reaction volume. In some embodiments, one round of PCR is performed with a single primer per target locus, and a second round of PCR is then performed with a primer pair per target locus. For example, the first round of PCR can be performed with a single primer per target locus such that all the primers bind the same strand (for example, using the forward primer for each target locus). This allows the PCR to amplify in a linear manner and reduces or eliminates amplification bias between amplicons caused by differences in sequence or length. In some embodiments, the amplicons are then amplified using a forward and a reverse primer for each target locus.
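The effect of a linear first round on amplification bias can be illustrated with a small calculation. The following is a minimal sketch under an idealized model (an assumption for illustration, not part of the disclosure): a single primer copies only the original template each cycle, whereas a primer pair also copies the products. The function names and the per-cycle efficiency parameter are hypothetical.

```python
def copies_linear(initial: float, cycles: int, efficiency: float = 1.0) -> float:
    """Single-primer round: only the original template is copied each cycle."""
    return initial * (1 + efficiency * cycles)


def copies_exponential(initial: float, cycles: int, efficiency: float = 1.0) -> float:
    """Primer-pair round: products also serve as templates in later cycles."""
    return initial * (1 + efficiency) ** cycles


def bias(model, cycles: int, eff_a: float, eff_b: float) -> float:
    """Ratio of final copy numbers for two amplicons with different efficiencies."""
    return model(1.0, cycles, eff_a) / model(1.0, cycles, eff_b)


if __name__ == "__main__":
    # Two amplicons whose per-cycle efficiencies differ slightly (1.0 vs 0.9).
    print("linear bias after 15 cycles:", bias(copies_linear, 15, 1.0, 0.9))
    print("exponential bias after 15 cycles:", bias(copies_exponential, 15, 1.0, 0.9))
```

Under this model a small efficiency difference compounds every cycle in the exponential case but only accumulates additively in the linear case, which is consistent with the reduced bias described above.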
Exemplary Primer Design Methods
If desired, multiplex PCR can be performed using primers with a reduced likelihood of forming primer dimers. In particular, highly multiplexed PCR often results in a very high proportion of product DNA that arises from unproductive side reactions such as primer dimer formation. In one embodiment, the specific primers most likely to cause unproductive side reactions are removed from the primer library, yielding a primer library that produces a greater proportion of amplified DNA that maps to the genome. Removing the problematic primers, that is, those primers that are particularly likely to form dimers, has unexpectedly enabled very high levels of PCR multiplexing for subsequent analysis by sequencing.
There are a number of ways to select primers for a library such that the amount of non-mapping primer dimer or other primer mischief products is minimized. Empirical data indicate that a small number of "bad" primers are responsible for a large share of non-mapping primer dimer side reactions. Removing these "bad" primers can increase the percentage of sequence reads that map to the target loci. One way to identify "bad" primers is to examine the sequencing data of DNA amplified by targeted amplification; the primer dimers seen with the greatest frequency can be removed to give a primer library that is significantly less likely to produce side-product DNA that does not map to the genome. There are also publicly available programs that can calculate the binding energy of various primer combinations, and removing the combinations with the highest binding energy will likewise give a primer library that is significantly less likely to produce side-product DNA that does not map to the genome.
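The sequencing-based identification of "bad" primers can be sketched as a simple tally. This minimal example assumes that reads have already been classified as primer dimers and annotated with the two primers involved; that input format and the function name are assumptions for illustration only.

```python
from collections import Counter


def rank_primers_by_dimer_reads(dimer_reads):
    """Count how many dimer reads each primer takes part in, most frequent first.

    dimer_reads: iterable of (primer_a, primer_b) tuples, one per read that was
    classified as a primer dimer (hypothetical upstream classification step).
    """
    counts = Counter()
    for primer_a, primer_b in dimer_reads:
        counts[primer_a] += 1
        if primer_b != primer_a:  # a homodimer read counts once for its primer
            counts[primer_b] += 1
    return counts.most_common()


if __name__ == "__main__":
    reads = [("P1", "P2"), ("P1", "P3"), ("P1", "P2"), ("P4", "P4")]
    for primer, n in rank_primers_by_dimer_reads(reads):
        print(primer, n)
```

Primers at the top of the ranking are candidates for removal from the library.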
In some embodiments for selecting primers, an initial library of candidate primers is created by designing one or more primers or primer pairs for candidate target loci. A set of candidate target loci (such as SNPs) can be selected based on publicly available information about desired parameters of the target loci, such as the frequency of the SNPs within a target population or the heterozygosity rate of the SNPs. In one embodiment, the PCR primers can be designed using the Primer3 program (on the World Wide Web at primer3.sourceforge.net; libprimer3 release 2.2.3, incorporated herein by reference in its entirety). If desired, the primers can be designed to anneal within a particular annealing temperature range, to have a particular range of GC content, to have a particular size range, to produce target amplicons within a particular size range, and/or to have other parameter characteristics. Starting with multiple primers or primer pairs per candidate target locus increases the likelihood that a primer or primer pair will remain in the library for most or all of the target loci. In one embodiment, the selection criteria may require that at least one primer pair per target locus remains in the library. In this way, most or all of the target loci will be amplified when the final primer library is used. This is desirable for applications such as screening for deletions or duplications at a large number of positions in the genome, or screening for a large number of sequences (such as polymorphisms or other mutations) associated with a disease or an increased risk of a disease. If a primer pair from the library would produce a target amplicon that overlaps with a target amplicon produced by another primer pair, one of the primer pairs can be removed from the library to prevent interference.
In some embodiments, an "undesirability score" (a higher score indicating lower desirability) is calculated (for example, on a computer) for most or all of the possible combinations of two primers from the library of candidate primers. In various embodiments, an undesirability score is calculated for at least 80%, 90%, 95%, 98%, 99%, or 99.5% of the possible combinations of candidate primers in the library. Each undesirability score is based at least in part on the likelihood of dimer formation between the two candidate primers. If desired, the undesirability score can also be based on one or more other parameters selected from the following: the heterozygosity rate of the target locus, the disease prevalence associated with a sequence (e.g., a polymorphism) at the target locus, the disease penetrance associated with a sequence (e.g., a polymorphism) at the target locus, the specificity of the candidate primer for the target locus, the size of the candidate primer, the melting temperature of the target amplicon, the GC content of the target amplicon, the amplification efficiency of the target amplicon, the size of the target amplicon, and the distance from the center of a recombination hotspot. In some embodiments, the specificity of the candidate primer for the target locus includes the likelihood that the candidate primer will mis-prime by binding and amplifying a locus other than the target locus it was designed to amplify. In some embodiments, one or more or all of the candidate primers that mis-prime are removed from the library. In some embodiments, to increase the number of candidate primers to choose from, candidate primers that may mis-prime are not removed from the library.
If it is considered that Multiple factors, then can calculate undesirable property score based on the weighted average of various parameters.Based on they for
The importance of the specific application of primer will be used, weight that can be different to parametric distribution.In some embodiments, from library
It is middle to remove the primer with the undesirable score of highest.If the primer of removal be with the primer pair of a target position dot blot at
Member, then another member of primer pair can remove from library.It can according to need the process of repeated removal primer.Some
In embodiment, carry out selection method, until be retained in library candidate drugs combination undesirable property score all equal to
Or it is lower than minimum threshold.In some embodiments, selection method is carried out, until the number of candidate drugs remaining in library is reduced
To required number.
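The weighted-average score and the repeated removal of the worst primer can be sketched as follows. This is a minimal illustration under stated assumptions: parameter values are pre-scaled so that higher means less desirable, the pair score is symmetric, and of the two primers in the worst pair the one with the higher summed score is dropped. All function names and the tie-breaking rule are hypothetical.

```python
from itertools import combinations


def weighted_undesirability(values, weights):
    """Weighted average of parameter values, each scaled so higher = less desirable."""
    total_weight = sum(weights[name] for name in values)
    return sum(weights[name] * value for name, value in values.items()) / total_weight


def prune_worst_pairs(primers, pair_score, partner, max_score):
    """Remove primers until every remaining pair scores at or below max_score.

    pair_score(a, b): symmetric undesirability score for a primer combination.
    partner: dict mapping each primer to the other member of its primer pair.
    """
    kept = set(primers)
    while len(kept) > 1:
        a, b = max(combinations(sorted(kept), 2), key=lambda pair: pair_score(*pair))
        if pair_score(a, b) <= max_score:
            break

        def total(p):
            # Summed score of p against every other remaining primer.
            return sum(pair_score(p, q) for q in sorted(kept) if q != p)

        drop = a if total(a) >= total(b) else b  # assumption: drop the worse offender
        kept.discard(drop)
        kept.discard(partner.get(drop))  # its pair partner is no longer useful
    return kept


if __name__ == "__main__":
    scores = {frozenset(("F1", "R2")): 0.9, frozenset(("F1", "F2")): 0.6}
    partner = {"F1": "R1", "R1": "F1", "F2": "R2", "R2": "F2"}
    score = lambda a, b: scores.get(frozenset((a, b)), 0.1)
    print(sorted(prune_worst_pairs(partner, score, partner, max_score=0.5)))
```

Raising or lowering `max_score` trades library size against the residual risk of dimer formation.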
In various embodiments, after the undesirability scores are calculated, the candidate primer that is part of the greatest number of combinations of two candidate primers having an undesirability score above a first minimum threshold is removed from the library. This step ignores interactions equal to or below the first minimum threshold, because those interactions are less significant. If the removed primer is a member of a primer pair that hybridizes to one target locus, the other member of the primer pair can also be removed from the library. The process of removing primers can be repeated as desired. In some embodiments, the selection method is performed until the undesirability scores of the candidate primer combinations remaining in the library are all equal to or below the first minimum threshold. If the number of candidate primers remaining in the library is higher than desired, the number of primers can be reduced by lowering the first minimum threshold to a lower second minimum threshold and repeating the process of removing primers. If the number of candidate primers remaining in the library is lower than desired, the method can be continued by raising the first minimum threshold to a higher second minimum threshold and repeating the process of removing primers starting from the original candidate primer library, thereby allowing more candidate primers to remain in the library. In some embodiments, the selection method is performed until the undesirability scores of the candidate primer combinations remaining in the library are all equal to or below the second minimum threshold, or until the number of candidate primers remaining in the library is reduced to the desired number.
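The threshold-based variant described above can be sketched as follows. This is a minimal illustration, assuming a symmetric pair score; the function names and the way the second thresholds are passed in are hypothetical.

```python
from itertools import combinations


def prune_by_interaction_count(primers, pair_score, partner, threshold):
    """Repeatedly remove the primer involved in the most above-threshold pairs."""
    kept = set(primers)
    while True:
        degree = {p: 0 for p in kept}
        for a, b in combinations(sorted(kept), 2):
            if pair_score(a, b) > threshold:  # interactions at or below are ignored
                degree[a] += 1
                degree[b] += 1
        worst = max(sorted(kept), key=lambda p: degree[p], default=None)
        if worst is None or degree[worst] == 0:
            return kept
        kept.discard(worst)
        kept.discard(partner.get(worst))  # drop the other member of the pair


def select_library(primers, pair_score, partner, first, second_lower, second_higher, desired):
    """Apply the first threshold, then adjust once toward the desired library size."""
    kept = prune_by_interaction_count(primers, pair_score, partner, first)
    if len(kept) > desired:
        # Too many primers remain: continue with a lower (stricter) threshold.
        return prune_by_interaction_count(kept, pair_score, partner, second_lower)
    if len(kept) < desired:
        # Too few remain: restart from the original library with a higher threshold.
        return prune_by_interaction_count(primers, pair_score, partner, second_higher)
    return kept


if __name__ == "__main__":
    scores = {frozenset("AB"): 0.9, frozenset("AC"): 0.8, frozenset("AD"): 0.7}
    score = lambda a, b: scores.get(frozenset((a, b)), 0.1)
    print(sorted(prune_by_interaction_count("ABCD", score, {}, threshold=0.5)))
```

Counting interactions rather than chasing the single worst pair tends to remove "hub" primers first, which is why fewer primers need to be discarded.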
If desired, primer pairs that produce a target amplicon overlapping with a target amplicon produced by another primer pair can be divided into separate amplification reactions. Multiple PCR amplification reactions may be preferable for applications in which it is desirable to analyze all of the candidate target loci (rather than omitting candidate target loci from the analysis because of overlapping target amplicons).
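Dividing overlapping amplicons into separate reactions can be sketched as a greedy pool assignment. This minimal example assumes that each amplicon is described by a name, a chromosome, and closed start and end coordinates; the tuple layout and function name are hypothetical.

```python
def assign_pools(amplicons):
    """Assign each amplicon to the first pool in which it overlaps nothing.

    amplicons: iterable of (name, chrom, start, end) with closed coordinates.
    Returns a dict mapping amplicon name to a zero-based pool index.
    """
    pools = []  # one dict per pool: chromosome -> end of the last amplicon placed
    assignment = {}
    for name, chrom, start, end in sorted(amplicons, key=lambda a: (a[1], a[2])):
        for index, last_end in enumerate(pools):
            if last_end.get(chrom, -1) < start:  # no overlap within this pool
                last_end[chrom] = end
                assignment[name] = index
                break
        else:
            pools.append({chrom: end})  # open a new reaction pool
            assignment[name] = len(pools) - 1
    return assignment


if __name__ == "__main__":
    amplicons = [
        ("A", "chr1", 100, 170),
        ("B", "chr1", 150, 220),  # overlaps A, so it needs a second pool
        ("C", "chr1", 230, 300),
        ("D", "chr2", 100, 170),
    ]
    print(assign_pools(amplicons))
```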
These selection methods minimize the number of candidate primers that must be removed from the library to achieve the desired reduction in primer dimers. Because fewer candidate primers are removed from the library, more (or all) of the target loci can be amplified using the resulting primer library.
Multiplexing a large number of primers imposes considerable constraints on the assays that can be included. Assays that interact unintentionally result in spurious amplification products. The size constraints of mini-PCR may result in further constraints. In one embodiment, it is possible to begin with a very large number of potential SNP targets (from about 500 to more than 1,000,000) and attempt to design primers to amplify each SNP. Where primers can be designed, it is possible to identify primer pairs likely to form spurious products by evaluating the likelihood of spurious primer duplex formation between all possible pairs of primers using published thermodynamic parameters for DNA duplex formation. Primer interactions can be ranked by a scoring function related to the interaction, and the primers with the worst interaction scores are eliminated until the desired number of primers is reached. In cases where SNPs likely to be heterozygous are the most useful, the list of assays can also be ranked and the most heterozygous compatible assays selected. Experiments have confirmed that primers with high interaction scores are the most likely to form primer dimers. At high multiplexing levels it is not possible to eliminate all spurious interactions, but it is essential to remove the primers or primer pairs with the highest interaction scores, because they can dominate an entire reaction and greatly limit amplification from the intended targets. We have performed this procedure to create multiplex primer sets of up to, and in some cases more than, 10,000 primers. The improvement due to this procedure is substantial: it enables more than 80%, more than 90%, more than 95%, more than 98%, and even more than 99% on-target amplification products, as determined by sequencing all PCR products, compared with 10% for a reaction in which the worst primers were not removed. When combined with the partial semi-nested approach described previously, more than 90%, and even more than 95%, of amplicons can map to the target sequences.
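An interaction score for a pair of primers can be sketched with a crude stand-in for the thermodynamic evaluation. The example below does not use the published nearest-neighbor parameters; it assumes, purely for illustration, that the length of the longest perfectly complementary stretch anchored at the two 3' ends is a usable proxy. All names are hypothetical.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_overlap(p1: str, p2: str, max_len: int = 8) -> int:
    """Longest run (up to max_len) in which the 3' ends of p1 and p2 pair perfectly."""
    rc2 = revcomp(p2)
    best = 0
    for n in range(1, min(max_len, len(p1), len(p2)) + 1):
        # The last n bases of p1 anneal antiparallel to the last n bases of p2
        # exactly when they equal the first n bases of revcomp(p2).
        if p1[-n:] == rc2[:n]:
            best = n
    return best


def rank_interactions(primers):
    """Return primer-name pairs sorted from the worst (highest) proxy score down."""
    scored = [
        ((a, b), three_prime_overlap(primers[a], primers[b]))
        for a, b in combinations(sorted(primers), 2)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    library = {"P1": "TTTTACGGAC", "P2": "AAAAGTCCG", "P3": "AAAAAAAA"}
    for pair, score in rank_interactions(library):
        print(pair, score)
```

A production scoring function would replace `three_prime_overlap` with a duplex free-energy estimate; only the ranking step would stay the same.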
Note that there are other methods for determining which PCR primers are likely to form dimers. In one embodiment, analysis of a pool of DNA that has been amplified using a non-optimized set of primers may be sufficient to identify problematic primers. For example, the analysis can be performed by sequencing; the dimers present in the greatest numbers are taken to be those most likely to form, and the corresponding primers can be removed. In one embodiment, the method of primer design can be used in combination with the mini-PCR method described herein.
The use of tags on the primers can reduce the amplification and sequencing of primer dimer products. In some embodiments, the primer contains an internal region that forms a loop structure with a tag. In particular embodiments, the primer includes a 5' region that is specific for the target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus. In some embodiments, the loop region lies between two binding regions, where the two binding regions are designed to bind contiguous or neighboring regions of the template DNA. In various embodiments, the length of the 3' region is at least 7 nucleotides. In some embodiments, the length of the 3' region is 7 to 20 nucleotides, such as 7 to 15 nucleotides or 7 to 10 nucleotides. In various embodiments, the primer includes a 5' region that is not specific for the target locus (such as a tag or a universal primer binding site), followed by a region that is specific for the target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus. Tagged primers can be used to shorten the necessary target-specific sequence to below 20, below 15, below 12, or even below 10 base pairs. This can occur serendipitously with standard primer design when the target sequence is fragmented within the primer binding site, or it can be built into the primer design. Advantages of this method include that it increases the number of assays that can be designed for a given maximum amplicon length, and that it shortens the "non-informative" sequencing of primer sequence. It can also be used in combination with internal tagging.
In one embodiment, the relative amount of nonproductive products in multiplexed targeted PCR amplification can be reduced by raising the annealing temperature. In cases where one is amplifying libraries that carry the same tag as the target-specific primers, the annealing temperature can be increased relative to that used for genomic DNA, because the tags contribute to primer binding. In some embodiments, the annealing time can be longer than 3 minutes, longer than 5 minutes, longer than 8 minutes, longer than 10 minutes, longer than 15 minutes, longer than 20 minutes, longer than 30 minutes, longer than 60 minutes, longer than 120 minutes, longer than 240 minutes, longer than 480 minutes, or even longer than 960 minutes. In certain illustrative embodiments, longer annealing times are used together with reduced primer concentrations. In various embodiments, longer-than-normal extension times are used, such as more than 3, 5, 8, 10, or 15 minutes. In some embodiments, the primer concentrations are as low as 50 nM, 20 nM, 10 nM, 5 nM, 1 nM, or lower than 1 nM. This surprisingly results in robust performance for highly multiplexed reactions, such as 1,000-plex, 2,000-plex, 5,000-plex, 10,000-plex, 20,000-plex, 50,000-plex, and even 100,000-plex reactions. In one embodiment, the amplification uses one, two, three, four, or five cycles with long annealing times, followed by PCR cycles with more usual annealing times using tagged primers.
To select target loci, one can start with a pool of candidate primer pair designs, create a thermodynamic model of potentially adverse interactions between primer pairs, and then use the model to eliminate designs that are incompatible with the other designs in the pool.
In one embodiment, the invention features a method of decreasing the number of target loci (such as loci that may contain a polymorphism or mutation associated with a disease or disorder, or with an increased risk of a disease or disorder, such as cancer) and/or increasing the disease load that is detected (for example, increasing the number of polymorphisms or mutations detected). In some embodiments, the method includes ranking the loci (for example, from highest to lowest) by the frequency or recurrence of a polymorphism or mutation (such as a single nucleotide variation, an insertion, a deletion, or any other variation described herein) at each locus among subjects with the disease or disorder, such as cancer. In some embodiments, PCR primers are designed for some or all of the loci. During the selection of PCR primers for a primer library, primers for loci with a higher frequency or recurrence (higher-ranking loci) are favored over primers for loci with a lower frequency or recurrence (lower-ranking loci). In some embodiments, this parameter is included as one of the parameters in the calculation of the undesirability scores described herein. If desired, primers (such as primers for high-ranking loci) that are incompatible with other designs in a library can be included in a different PCR library/pool. In some embodiments, multiple libraries/pools (such as 2, 3, 4, 5, or more) are used in separate PCR reactions to enable amplification of all (or most) of the loci represented by all the libraries/pools. In some embodiments, the method is continued until enough primers are included in one or more libraries/pools that the primers, in aggregate, capture the desired disease load for the disease or disorder (for example, by detecting at least 80%, 85%, 90%, 95%, or 99% of the disease load).
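The ranking of loci by recurrence and the stopping rule based on disease load can be sketched as a cumulative selection. This minimal example assumes that disease load is measured as the share of all observed mutation occurrences that fall in the chosen loci; that definition, the input format, and the function name are assumptions for illustration.

```python
def select_loci_for_load(recurrence, target_fraction):
    """Pick the highest-recurrence loci until the target share of load is captured.

    recurrence: dict mapping locus -> number of subjects with the disease who
    carry a mutation at that locus (hypothetical cohort summary).
    Returns (chosen loci in rank order, captured fraction).
    """
    total = sum(recurrence.values())
    if total == 0:
        return [], 0.0
    chosen, captured = [], 0
    ranked = sorted(recurrence.items(), key=lambda item: item[1], reverse=True)
    for locus, count in ranked:
        if captured / total >= target_fraction:
            break
        chosen.append(locus)
        captured += count
    return chosen, captured / total


if __name__ == "__main__":
    cohort = {"L1": 50, "L2": 30, "L3": 15, "L4": 5}
    print(select_loci_for_load(cohort, 0.80))
```

In a fuller workflow, loci whose primers prove incompatible with the current pool would be deferred to another pool rather than skipped.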
Exemplary Primer Libraries
In one aspect, the invention features primer libraries, such as primers selected from a library of candidate primers using any of the methods of the invention. In some embodiments, the library includes primers that simultaneously hybridize (or are capable of simultaneously hybridizing) to, or simultaneously amplify (or are capable of simultaneously amplifying), at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci. In various embodiments, the library includes primers that simultaneously amplify (or are capable of simultaneously amplifying) 100 to 500; 500 to 1,000; 1,000 to 2,000; 2,000 to 5,000; 5,000 to 7,500; 7,500 to 10,000; 10,000 to 20,000; 20,000 to 25,000; 25,000 to 30,000; 30,000 to 40,000; 40,000 to 50,000; 50,000 to 75,000; or 75,000 to 100,000 different target loci in one reaction volume. In various embodiments, the library includes primers that simultaneously amplify (or are capable of simultaneously amplifying) 1,000 to 100,000 different target loci in one reaction volume, such as 1,000 to 50,000; 1,000 to 30,000; 1,000 to 20,000; 1,000 to 10,000; 2,000 to 30,000; 2,000 to 20,000; 2,000 to 10,000; 5,000 to 30,000; 5,000 to 20,000; or 5,000 to 10,000 different target loci. In some embodiments, the library includes primers that simultaneously amplify (or are capable of simultaneously amplifying) the target loci in one reaction volume such that less than 60%, 40%, 30%, 20%, 10%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.25%, 0.1%, or 0.5% of the amplified products are primer dimers. In various embodiments, the amount of amplified products that are primer dimers is 0.5% to 60%, such as 0.1% to 40%, 0.1% to 20%, 0.25% to 20%, 0.25% to 10%, 0.5% to 20%, 0.5% to 10%, 1% to 20%, or 1% to 10%. In some embodiments, the primers simultaneously amplify (or are capable of simultaneously amplifying) the target loci in one reaction volume such that at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the amplified products are target amplicons. In various embodiments, the amount of amplified products that are target amplicons is 50% to 99.5%, such as 60% to 99%, 70% to 98%, 80% to 98%, 90% to 99.5%, or 95% to 99.5%. In some embodiments, the primers simultaneously amplify (or are capable of simultaneously amplifying) the target loci in one reaction volume such that at least 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.5% of the target loci are amplified (for example, amplified at least 5-, 10-, 20-, 30-, 50-, or 100-fold compared with the amount before amplification). In various embodiments, the amount of target loci that are amplified (for example, amplified at least 5-, 10-, 20-, 30-, 50-, or 100-fold compared with the amount before amplification) is 50% to 99.5%, such as 60% to 99%, 70% to 98%, 80% to 99%, 90% to 99.5%, 95% to 99.9%, or 98% to 99.99%. In some embodiments, the primer library includes at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 primer pairs, wherein each primer pair includes a forward test primer and a reverse test primer, and each pair of test primers hybridizes to a target locus. In some embodiments, the primer library includes at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 individual primers that each hybridize to a different target locus, wherein the individual primers are not part of primer pairs.
In various embodiments, the concentration of each primer is less than 100, 75, 50, 25, 20, 10, 5, 2, or 1 nM, or less than 500, 100, 10, or 1 µM. In various embodiments, the concentration of each primer is between 1 µM and 100 nM, such as 1 µM to 1 nM, 1 to 75 nM, 2 to 50 nM, or 5 to 50 nM, inclusive. In various embodiments, the GC content of the primers is between 30% and 80%, such as between 40% and 70% or between 50% and 60%, inclusive. In some embodiments, the range of GC content of the primers is less than 30%, 20%, 10%, or 5%. In some embodiments, the range of GC content of the primers is 5% to 30%, such as 5% to 20% or 5% to 10%, inclusive. In some embodiments, the melting temperature (Tm) of the test primers is 40 °C to 80 °C, such as 50 °C to 70 °C, 55 °C to 65 °C, or 57 °C to 60.5 °C, inclusive. In some embodiments, the Tm is calculated using the Primer3 program (libprimer3 release 2.2.3) with the built-in SantaLucia parameters (on the World Wide Web at primer3.sourceforge.net). In some embodiments, the range of melting temperatures of the primers is less than 15 °C, 10 °C, 5 °C, 3 °C, or 1 °C. In some embodiments, the range of melting temperatures of the primers is 1 °C to 15 °C, such as 1 °C to 10 °C, 1 °C to 5 °C, or 1 °C to 3 °C, inclusive. In some embodiments, the length of the primers is 15 to 100 nucleotides, such as 15 to 75 nucleotides, 15 to 40 nucleotides, 17 to 35 nucleotides, 18 to 30 nucleotides, or 20 to 65 nucleotides. In some embodiments, the range of primer lengths is less than 50, 40, 30, 20, 10, or 5 nucleotides. In some embodiments, the range of primer lengths is 5 to 50 nucleotides, such as 5 to 40 nucleotides, 5 to 20 nucleotides, or 5 to 10 nucleotides. In some embodiments, the length of the target amplicons is 50 to 100 nucleotides, such as 60 to 80 nucleotides or 60 to 75 nucleotides. In some embodiments, the range of target amplicon lengths is less than 50, 25, 15, 10, or 5 nucleotides. In some embodiments, the range of target amplicon lengths is 5 to 50 nucleotides, such as 5 to 25 nucleotides, 5 to 15 nucleotides, or 5 to 10 nucleotides. In some embodiments, the library does not comprise a microarray. In some embodiments, the library comprises a microarray.
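Checking candidate primers against GC-content and length limits, and measuring how tightly a library clusters, can be sketched as follows. This is a minimal illustration; the default limits are taken from the broadest ranges stated above, the function names are hypothetical, and melting temperature is deliberately left out because the text computes it with Primer3.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def passes_filters(seq: str, gc_range=(30.0, 80.0), length_range=(15, 100)) -> bool:
    """True if the primer's GC content and length both fall within the limits."""
    return (
        gc_range[0] <= gc_content(seq) <= gc_range[1]
        and length_range[0] <= len(seq) <= length_range[1]
    )


def spread(values) -> float:
    """Range (max minus min) of a property across the library."""
    values = list(values)
    return max(values) - min(values)


if __name__ == "__main__":
    candidates = ["ACGTACGTACGTACGTAC", "ATATATATATATATATAT", "GGCCGGCCAATTGGCCAA"]
    kept = [p for p in candidates if passes_filters(p)]
    print(kept)
    print("GC range (%):", spread(gc_content(p) for p in kept))
    print("length range (nt):", spread(len(p) for p in kept))
```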
In some embodiments, some (for example, at least 80%, 90%, or 95%) or all of the adapters or primers include one or more linkages between adjacent nucleotides other than a naturally occurring phosphodiester linkage. Examples of such linkages include phosphoramide, phosphorothioate, and phosphorodithioate linkages. In some embodiments, some (for example, at least 80%, 90%, or 95%) or all of the adapters or primers include a thiophosphate (such as a monothiophosphate) between the last 3' nucleotide and the second-to-last 3' nucleotide. In some embodiments, some (for example, at least 80%, 90%, or 95%) or all of the adapters or primers include a thiophosphate (such as a monothiophosphate) between the last 2, 3, 4, or 5 nucleotides at the 3' end. In some embodiments, some (for example, at least 80%, 90%, or 95%) or all of the adapters or primers include a thiophosphate (such as a monothiophosphate) between at least 1, 2, 3, 4, or 5 of the last 10 nucleotides at the 3' end. In some embodiments, such primers are less likely to be cleaved or degraded. In some embodiments, the primers do not contain an enzyme cleavage site (such as a protease cleavage site).
Additional exemplary multiplex PCR methods and libraries are described in U.S. Application No. 13/683,604, filed November 21, 2012 (U.S. Publication No. 2013/0123120), and U.S. Application No. 61/994,791, filed May 16, 2014, each of which is incorporated herein by reference in its entirety. These methods and libraries can be used for the analysis of any of the samples disclosed herein and in any of the methods of the invention.
Exemplary Primer Libraries for Detection of Recombination
In some embodiments, the primers in the primer library are designed to determine whether recombination has occurred at one or more known recombination hotspots (such as crossovers between homologous human chromosomes). Knowing which crossovers occurred between chromosomes allows more accurate phased genetic data to be determined for an individual. Recombination hotspots are local regions of chromosomes in which recombination events tend to be concentrated. They are often flanked by "coldspots," regions with a lower-than-average frequency of recombination. Recombination hotspots tend to share a similar morphology and are approximately 1 to 2 kb in length. The distribution of hotspots is positively correlated with GC content and with the distribution of repetitive elements. A partially degenerate 13-mer motif, CCNCCNTNNCCNC, plays a role in the activity of some hotspots. It has been shown that the zinc finger protein PRDM9 binds this motif and initiates recombination at its location. The average distance between the centers of recombination hotspots is reported to be about 80 kb. In some embodiments, the distance between the centers of recombination hotspots is between about 3 kb and about 100 kb. Public databases include a large number of known human recombination hotspots, such as the HUMHOT and International HapMap Project databases (see, for example, Nishant et al., "HUMHOT: a database of human meiotic recombination hot spots," Nucleic Acids Research, 34:D25-D28, 2006, Database issue; Mackiewicz et al., "Distribution of Recombination Hotspots in the Human Genome: A Comparison of Computer Simulations with Real Data," PLoS ONE 8(6): e65272, doi:10.1371/journal.pone.0065272; and on the World Wide Web at hapmap.ncbi.nlm.nih.gov/downloads/index.html.en, each of which is incorporated herein by reference in its entirety).
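Locating the degenerate 13-mer motif in a sequence can be sketched with a regular expression in which each N matches any base. This minimal example scans the forward strand only, which is a simplifying assumption; the function names are hypothetical.

```python
import re

HOTSPOT_MOTIF = "CCNCCNTNNCCNC"


def motif_pattern(motif: str = HOTSPOT_MOTIF) -> "re.Pattern[str]":
    """Compile the motif, treating N as any base; lookahead keeps overlapping hits."""
    return re.compile("(?=(" + motif.replace("N", "[ACGT]") + "))")


def find_motif(seq: str, motif: str = HOTSPOT_MOTIF) -> list:
    """Zero-based start positions of the motif on the forward strand of seq."""
    return [match.start() for match in motif_pattern(motif).finditer(seq.upper())]


if __name__ == "__main__":
    sequence = "AACCTCCCTAACCACGG"
    print(find_motif(sequence))
```

To cover both strands, the same search can be repeated on the reverse complement of the sequence.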
In some embodiments, the primers in the primer library are clustered at or near recombination hotspots (such as known human recombination hotspots). In some embodiments, the corresponding amplicons are used to determine the sequence within or near a recombination hotspot in order to determine whether recombination occurred at that particular hotspot (for example, whether the sequence of the amplicon is the sequence expected if recombination had occurred or the sequence expected if recombination had not occurred). In some embodiments, the primers are designed to amplify part or all of a recombination hotspot (and optionally sequence flanking the recombination hotspot). In some embodiments, long-read sequencing (such as sequencing of up to about 10 kb using the Moleculo technology developed by Illumina) or paired-end sequencing is used to sequence part or all of a recombination hotspot. Knowledge of whether a recombination event occurred can be used to determine which haplotype blocks flank the hotspot. If desired, the presence of particular haplotype blocks can be confirmed using primers specific to regions within the haplotype blocks. In some embodiments, it is assumed that there are no crossovers between known recombination hotspots. In some embodiments, the primers in the primer library are clustered at or near the ends of chromosomes. For example, such primers can be used to determine whether a particular arm or section at the end of a chromosome is present. In some embodiments, the primers in the primer library are clustered at or near recombination hotspots and at or near the ends of chromosomes.
In some embodiments, the primer library includes one or more primers (for example, at least 5; 10; 50; 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; or 50,000 different primers or different primer pairs) that are specific for a recombination hotspot (such as a known human recombination hotspot) and/or specific for a region near a recombination hotspot (such as within 10, 8, 5, 3, 2, 1, or 0.5 kb of the 5' or 3' end of the recombination hotspot). In some embodiments, at least 1, 5, 10, 20, 40, 60, 80, 100, or 150 different primers (or primer pairs) are specific for the same recombination hotspot, or for the same recombination hotspot or a region near that recombination hotspot. In some embodiments, at least 1, 5, 10, 20, 40, 60, 80, 100, or 150 different primers (or primer pairs) are specific for a region between recombination hotspots (such as a region unlikely to have undergone recombination); these primers can be used to confirm the presence of haplotype blocks (such as those expected depending on whether recombination has occurred). In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the primers in the primer library are specific for a recombination hotspot and/or specific for a region near a recombination hotspot (such as within 10, 8, 5, 3, 2, 1, or 0.5 kb of the 5' or 3' end of the recombination hotspot). In some embodiments, the primer library is used to determine whether recombination has occurred at greater than or equal to 5; 10; 50; 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; or 50,000 different recombination hotspots (such as known human recombination hotspots). In some embodiments, the regions targeted by primers for a recombination hotspot or nearby region are spread approximately evenly along that portion of the genome. In some embodiments, at least 1, 5, 10, 20, 40, 60, 80, 100, or 150 different primers (or primer pairs) are specific for a region at or near the end of a chromosome (such as a region within 20, 10, 5, 1, 0.5, 0.1, 0.01, or 0.001 Mb of the end of the chromosome). In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the primers in the primer library are specific for a region at or near the end of a chromosome (such as a region within 20, 10, 5, 1, 0.5, 0.1, 0.01, or 0.001 Mb of the end of the chromosome). In some embodiments, 10, 20, 40, 60, 80, 100, or 150 different primers (or primer pairs) are specific for a region within a potential microdeletion in a chromosome. In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the primers in the primer library are specific for a region within a potential microdeletion in a chromosome. In some embodiments, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the primers in the primer library are specific for a recombination hotspot, a region near a recombination hotspot, a region at or near the end of a chromosome, or a region within a potential microdeletion in a chromosome.
Exemplary Kits
In one aspect, the invention features a kit, such as a kit for amplifying target loci in a nucleic acid sample, for detecting deletions and/or duplications of chromosome segments or entire chromosomes using any of the methods described herein. In some embodiments, the kit can include any of the primer libraries of the invention. In one embodiment, the kit comprises a plurality of inner forward primers and optionally a plurality of inner reverse primers, and optionally outer forward primers and outer reverse primers, where each of the primers is designed to hybridize to the region of DNA immediately upstream and/or downstream of one of the target sites (e.g., polymorphic sites) on the target chromosome(s) or chromosome segment(s), and optionally on additional chromosomes or chromosome segments. In some embodiments, the kit includes instructions for using the primer library to amplify the target loci, such as for detecting one or more deletions and/or duplications of one or more chromosome segments or entire chromosomes using any of the methods described herein.
In certain embodiments, kits of the invention provide primer pairs for detecting chromosomal aneuploidy and determining CNV, such as primer pairs for massively multiplexed reactions for detecting chromosomal aneuploidy such as CNV (CoNVERGe, Copy Number Variant Events Revealed Genotypically) and/or SNVs. In these embodiments, the kits can include at least 100, 200, 250, 300, 500, 1000, 2000, 2500, 3000, 5000, 10,000, 20,000, 25,000, 28,000, 50,000, or 75,000 and at most 200, 250, 300, 500, 1000, 2000, 2500, 3000, 5000, 10,000, 20,000, 25,000, 28,000, 50,000, 75,000, or 100,000 primer pairs that are shipped together. The primer pairs can be contained in a single vessel, such as a single tube or box, or in multiple tubes or boxes. In certain embodiments, the primer pairs are predefined by a commercial provider and sold together; in other embodiments, a customer selects custom gene targets and/or primers, and a commercial provider supplies and ships the primer pool to the customer in one tube or in multiple tubes. In certain exemplary embodiments, the kits include primers for detecting both CNVs and SNVs, especially CNVs and SNVs known to be associated with at least one type of cancer.
Kits for circulating DNA detection according to some embodiments of the present invention include standards and/or controls for circulating DNA detection. For example, in certain embodiments, the standards and/or controls are sold, and optionally shipped and packaged, together with primers used to perform the amplification reactions provided herein (such as primers for performing CoNVERGe). In certain embodiments, the controls include polynucleotides such as DNA, including isolated genomic DNA that exhibits one or more chromosomal aneuploidies such as CNV and/or that includes one or more SNVs. In certain embodiments, the standards and/or controls are called PlasmArt standards and include polynucleotides having sequence identity to regions of the genome known to exhibit CNV, especially in certain inherited diseases and in certain disease states such as cancer, with a size distribution that reflects that of cfDNA fragments naturally found in plasma. Exemplary methods for making PlasmArt standards are provided in the examples herein. In general, genomic DNA from a source known to include a chromosomal aneuploidy is isolated, fragmented, purified, and size selected.
Accordingly, artificial cfDNA polynucleotide standards and/or controls can be made by spiking isolated polynucleotide samples prepared as summarized above into DNA samples known not to exhibit a chromosomal aneuploidy and/or SNVs, at concentrations similar to those observed for cfDNA in vivo, for example between 0.01% and 20%, 0.1% and 15%, or 0.4% and 10% of the DNA in that fluid. These standards/controls can be used as controls for assay design, characterization, development, and/or validation, and as quality control standards during testing, such as cancer testing performed in a CLIA laboratory, and/or as standards included in research-use-only or diagnostic test kits.
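The spike-in proportions mentioned above can be illustrated with a short calculation. The following is only a sketch: the helper name `spike_in_amount` is hypothetical, and it assumes that the stated percentages are fractions of total DNA by mass, which the text does not specify.

```python
def spike_in_amount(background_ng: float, target_fraction: float) -> float:
    """Return the mass (ng) of aneuploid-source DNA to add to background_ng of
    DNA without the aneuploidy so that it makes up target_fraction of the mix.

    Hypothetical helper; assumes fractions are expressed by mass.
    """
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target_fraction must be in [0, 1)")
    # Solve spike / (spike + background) = target_fraction for spike.
    return target_fraction * background_ng / (1.0 - target_fraction)


if __name__ == "__main__":
    # Endpoints of the ranges quoted in the text (0.01% to 20%, 0.4% to 10%).
    for percent in (0.01, 0.4, 10.0, 20.0):
        spike = spike_in_amount(100.0, percent / 100.0)
        print(f"{percent:>5}% target: add {spike:.4f} ng to 100 ng of background DNA")
```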
Exemplary Normalization/Correction Methods
In some embodiments, measurements for different loci, chromosome segments, or chromosomes are adjusted for bias, such as bias due to differences in GC content or other differences in amplification efficiency, or are adjusted for sequencing errors. In some embodiments, measurements for different alleles of the same locus are adjusted for differences in metabolism, apoptosis, histones, inactivation, and/or amplification between the alleles. In some embodiments, measurements for different alleles of the same locus in RNA are adjusted for differences in transcription rate or stability between the different RNA alleles.
Exemplary Methods for Phasing Genetic Data
In some embodiments, genetic data is phased using the methods described herein or any known method for phasing genetic data (see, e.g., PCT Publication No. WO 2009/1053531, filed February 9, 2009, and PCT Publication No. WO 2010/017214, filed August 4, 2009; U.S. Publication No. 2013/0123120, November 21, 2012; U.S. Publication No. 2011/0033862, filed October 7, 2010; U.S. Publication No. 2011/0033862, 2010; U.S. Publication No. 2011/0178719, filed February 3, 2011; U.S. Patent No. 8,515,679, filed March 17, 2008; U.S. Publication No. 2007/0184467, filed November 22, 2006; U.S. Serial No. 2008/0243398, filed March 17, 2008; and U.S. Serial No. 61/994,791, filed May 16, 2014, each of which is hereby incorporated by reference in its entirety). In some embodiments, the phase is determined for one or more regions that are known or suspected to contain a CNV of interest. In some embodiments, the phase is also determined for one or more regions flanking the CNV region(s) and/or for one or more reference regions. In one embodiment, genetic data of an individual (e.g., an individual being tested using the methods of the invention, or a relative of a gestating fetus or embryo, such as a parent of the fetus or embryo) is phased by inference by measuring tissue from the individual that is haploid, for example by measuring one or more sperm or eggs. In one embodiment, an individual's genetic data is phased by inference using the measured genotypic data of one or more first-degree relatives, such as the individual's parents (e.g., sperm from the individual's father) or siblings.
In one embodiment, an individual's genetic data is phased by dilution, in which the DNA or RNA is diluted into one or more wells, such as by using digital PCR. In some embodiments, the DNA or RNA is diluted to the point where no more than approximately one copy of each haplotype is expected in each well, and the DNA or RNA in one or more wells is then measured. In some embodiments, cells are arrested in M phase (mitosis), when the chromosomes are tightly condensed, and microfluidics is used to place separate chromosomes in separate wells. Because the DNA or RNA is diluted, it is unlikely that more than one haplotype is in the same fraction (or tube). Thus, there may effectively be a single DNA molecule in the tube, which allows the haplotype on a single DNA or RNA molecule to be determined. In some embodiments, the method includes dividing a DNA or RNA sample into a plurality of fractions such that at least one of the fractions includes one chromosome or one chromosome segment from a pair of chromosomes, and genotyping (e.g., determining the presence of two or more polymorphic loci) the DNA or RNA sample in at least one of the fractions, thereby determining a haplotype. In some embodiments, the genotyping involves sequencing (such as shotgun sequencing or single-molecule sequencing), a SNP array to detect polymorphic loci, or multiplex PCR. In some embodiments, the genotyping involves use of a SNP array to detect polymorphic loci, such as at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci. In some embodiments, the genotyping involves the use of multiplex PCR. In some embodiments, the method involves contacting the sample in a fraction with a library of primers that simultaneously hybridize to at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci (such as SNPs) to produce a reaction mixture; and subjecting the reaction mixture to primer extension reaction conditions to produce amplified products that are measured with a high-throughput sequencer to produce sequencing data. In some embodiments, RNA (such as mRNA) is sequenced. Because mRNA contains only exons, sequencing mRNA allows alleles to be determined for polymorphic loci (such as SNPs) over a large distance in the genome, such as a few megabases. In some embodiments, a haplotype of an individual is determined by chromosome sorting. An exemplary chromosome sorting method includes arresting cells at the stage of mitosis when the chromosomes are tightly condensed and using microfluidics to place separate chromosomes in separate wells. Another method involves isolating and collecting single chromosomes using FACS-mediated single-chromosome sorting. Standard methods (such as sequencing or an array) can be used to identify the alleles on a single chromosome to determine a haplotype of the individual.
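The dilution criterion described above can be made concrete with a simple occupancy model. The following sketch is an assumption-laden illustration, not the method of the invention: it assumes that copies of each of the two haplotypes at a locus are loaded into wells independently and follow a Poisson distribution with the same mean, and the function name `well_occupancy` is hypothetical.

```python
import math


def well_occupancy(mean_copies_per_haplotype: float) -> dict:
    """Probability that a well holds neither, exactly one, or both haplotypes.

    Assumption (not from the source): copies of each haplotype are Poisson
    distributed with the given mean, independently of the other haplotype.
    """
    if mean_copies_per_haplotype < 0:
        raise ValueError("mean must be non-negative")
    # Probability that at least one copy of a given haplotype is in the well.
    p_present = 1.0 - math.exp(-mean_copies_per_haplotype)
    return {
        "empty": (1.0 - p_present) ** 2,
        "single_haplotype": 2.0 * p_present * (1.0 - p_present),
        "both_haplotypes": p_present ** 2,
    }


if __name__ == "__main__":
    for mean in (0.05, 0.1, 0.5, 1.0):
        p = well_occupancy(mean)
        print(mean, {k: round(v, 4) for k, v in p.items()})
```

Under these assumptions, lowering the mean loading reduces the share of wells in which both haplotypes are present, at the cost of more empty wells.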
In some embodiments, a haplotype of an individual is determined by long-read sequencing, such as by using the Moleculo technology developed by Illumina. In some embodiments, the library preparation step involves shearing DNA into fragments, such as fragments of about 10 kb in size, diluting the fragments and placing them into wells (such that about 3,000 fragments are in a single well), amplifying the fragments in each well by long-range PCR, cutting them into short fragments and barcoding those fragments, and pooling the barcoded fragments from each well so that they can be sequenced together. After sequencing, the computational steps involve separating the reads from each well based on the attached barcodes and grouping them into fragments, assembling the fragments that overlap at heterozygous SNVs into haplotype blocks, and statistically phasing the blocks, based on a phased reference panel, into haplotype contigs.
In some embodiments, a haplotype of the individual is determined using data from a relative of the individual. In some embodiments, a SNP array is used to determine the presence of at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci in a DNA or RNA sample from the individual and from a relative of the individual. In some embodiments, the method involves contacting a DNA sample from the individual and/or a relative of the individual with a library of primers that simultaneously hybridize to at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci (such as SNPs) to produce a reaction mixture; and subjecting the reaction mixture to primer extension reaction conditions to produce amplified products that are measured with a high-throughput sequencer to produce sequencing data.
In one embodiment, an individual's genetic data is phased using a computer program that infers the most likely phase from population-based haplotype frequencies, such as HapMap-based phasing. For example, haploid data sets can be deduced directly from diploid data using statistical methods that utilize known haplotype blocks in the general population (such as those created for the public HapMap Project and for the Perlegen Human Haplotype Project). A haplotype block is essentially a series of correlated alleles that occur repeatedly in a variety of populations. Because these haplotype blocks are often ancient and common, they can be used to predict haplotypes from diploid genotypes. Publicly available algorithms that accomplish this task include an imperfect phylogeny approach, Bayesian approaches based on conjugate priors, and priors from population genetics. Some of these algorithms use a hidden Markov model.
In one embodiment, an individual's genetic data is phased using an algorithm that estimates haplotypes from genotype data, such as an algorithm that uses localized haplotype clustering (see, e.g., Browning and Browning, "Rapid and Accurate Haplotype Phasing and Missing-Data Inference for Whole-Genome Association Studies By Use of Localized Haplotype Clustering," American Journal of Human Genetics, November 2007; 81(5): 1084-1097, which is hereby incorporated by reference in its entirety). An exemplary program is Beagle version 3.3.2 or version 4 (available on the world wide web at hfaculty.washington.edu/browning/beagle/beagle.html, which is hereby incorporated by reference in its entirety).
In one embodiment, an individual's genetic data is phased using an algorithm that estimates haplotypes from genotype data, such as an algorithm that uses the decay of linkage disequilibrium with distance, the order and spacing of genotyped markers, missing-data imputation, recombination rate estimates, or a combination thereof (see, e.g., Stephens and Scheet, "Accounting for Decay of Linkage Disequilibrium in Haplotype Inference and Missing-Data Imputation," American Journal of Human Genetics 76:449-462, 2005, which is hereby incorporated by reference in its entirety). An exemplary program is PHASE v.2.1 or v2.1.1 (available on the world wide web at stephenslab.uchicago.edu/software.html, which is hereby incorporated by reference in its entirety).
In one embodiment, an individual's genetic data is phased using an algorithm that estimates haplotypes from population genotype data, such as an algorithm that allows cluster memberships to change continuously along the chromosome according to a hidden Markov model. This approach is flexible, allowing for both "block-like" patterns of linkage disequilibrium and gradual decline in linkage disequilibrium with distance (see, e.g., Scheet and Stephens, "A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase," American Journal of Human Genetics 78:629-644, 2006, which is hereby incorporated by reference in its entirety). An exemplary program is fastPHASE (available on the world wide web at stephenslab.uchicago.edu/software.html, which is hereby incorporated by reference in its entirety).
In one embodiment, an individual's genetic data is phased using a genotype imputation method, such as a method that uses one or more of the following reference datasets: the HapMap dataset, datasets of controls genotyped on multiple SNP chips, and data from the 1,000 Genomes Project. An exemplary approach is a flexible modelling framework that increases accuracy and combines information across multiple reference panels (see, e.g., Howie, Donnelly, and Marchini (2009), "A flexible and accurate genotype imputation method for the next generation of genome-wide association studies," PLoS Genetics 5(6): e1000529, 2009, which is hereby incorporated by reference in its entirety). Exemplary programs are IMPUTE or IMPUTE version 2 (also known as IMPUTE2) (available on the world wide web at mathgen.stats.ox.ac.uk/impute/imputev2.html, which is hereby incorporated by reference in its entirety).
In one embodiment, an individual's genetic data is phased using an algorithm that infers haplotypes, such as an algorithm that infers haplotypes under the genetic model of coalescence with recombination, such as the model developed by Stephens in PHASE v2.1. The major algorithmic improvements rely on the use of binary trees to represent the sets of candidate haplotypes for each individual. These binary tree representations: (1) speed up the computation of the posterior probabilities of the haplotypes by avoiding the redundant operations made in PHASE v2.1, and (2) overcome the exponential aspect of the haplotype inference problem by smart exploration of the most plausible pathways (i.e., haplotypes) in the binary trees (see, e.g., Delaneau, Coulonges, and Zagury, "Shape-IT: new rapid and accurate algorithm for haplotype inference," BMC Bioinformatics 9:540, 2008, doi:10.1186/1471-2105-9-540, which is hereby incorporated by reference in its entirety). An exemplary program is SHAPEIT (available on the world wide web at mathgen.stats.ox.ac.uk/genetics_software/shapeit/shapeit.html, which is hereby incorporated by reference in its entirety).
In one embodiment, an individual's genetic data is phased using an algorithm that estimates haplotypes from population genotype data, such as an algorithm that uses haplotype-fragment frequencies to obtain empirically based probabilities for longer haplotypes. In some embodiments, the algorithm reconstructs haplotypes so that they have maximal local coherence (see, e.g., Eronen, Geerts, and Toivonen, "HaploRec: Efficient and accurate large-scale reconstruction of haplotypes," BMC Bioinformatics 7:542, 2006, which is hereby incorporated by reference in its entirety). An exemplary program is HaploRec, such as HaploRec version 2.3 (available on the world wide web at cs.helsinki.fi/group/genetics/haplotyping.html, which is hereby incorporated by reference in its entirety).
In one embodiment, an individual's genetic data is phased using an algorithm that estimates haplotypes from population genotype data, such as an algorithm that uses a partition-ligation strategy and an expectation-maximization-based algorithm (see, e.g., Qin, Niu, and Liu, "Partition-Ligation-Expectation-Maximization Algorithm for Haplotype Inference with Single-Nucleotide Polymorphisms," American Journal of Human Genetics 71(5): 1242-1247, 2002, which is hereby incorporated by reference in its entirety). An exemplary program is PL-EM (available on the world wide web at people.fas.harvard.edu/junliu/plem/click.html, which is hereby incorporated by reference in its entirety).
In one embodiment, an individual's genetic data is phased using an algorithm that estimates haplotypes from population genotype data, such as an algorithm that simultaneously phases genotypes into haplotypes and partitions them into blocks. In some embodiments, an expectation-maximization algorithm is used (see, e.g., Kimmel and Shamir, "GERBIL: Genotype Resolution and Block Identification Using Likelihood," Proceedings of the National Academy of Sciences of the United States of America (PNAS) 102:158-162, 2005, which is hereby incorporated by reference in its entirety). An exemplary program is GERBIL, which is available as part of the GEVALT version 2 program (available on the world wide web at acgt.cs.tau.ac.il/gevalt/, which is hereby incorporated by reference in its entirety).
In one embodiment, an individual's genetic data is phased using an algorithm that estimates haplotypes from population genotype data, such as an algorithm that uses an EM algorithm to calculate maximum-likelihood (ML) estimates of haplotype frequencies given genotype measurements that do not specify phase. The algorithm also allows some genotype measurements to be missing (due, for example, to PCR failure). It also allows multiple imputation of individual haplotypes (see, e.g., Clayton, D. (2002), "SNPHAP: A Program for Estimating Frequencies of Large Haplotypes of SNPs," which is hereby incorporated by reference in its entirety). An exemplary program is SNPHAP (available on the world wide web at gene.cimr.cam.ac.uk/clayton/software/snphap.txt, which is hereby incorporated by reference in its entirety).
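To illustrate the general idea of EM estimation of haplotype frequencies from unphased genotypes, the following is a textbook two-SNP sketch. It is not the SNPHAP implementation; the function name `em_haplotype_frequencies`, the genotype encoding (count of allele "1" at each SNP), and the fixed iteration count are assumptions made for illustration, and missing genotypes are not handled.

```python
HAPLOTYPES = ["00", "01", "10", "11"]


def _alleles(genotype: int) -> list:
    """Alleles carried at one SNP, given the count of allele 1 (0, 1, or 2)."""
    return {0: [0, 0], 1: [0, 1], 2: [1, 1]}[genotype]


def em_haplotype_frequencies(genotypes, n_iter: int = 100) -> dict:
    """EM estimate of two-SNP haplotype frequencies from unphased genotypes.

    genotypes: list of (g1, g2) pairs, each the count of allele 1 at a SNP.
    Only double heterozygotes (1, 1) are phase-ambiguous: 00/11 or 01/10.
    """
    freqs = {h: 0.25 for h in HAPLOTYPES}
    n_chromosomes = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = {h: 0.0 for h in HAPLOTYPES}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: split the individual between the two phasings.
                cis = freqs["00"] * freqs["11"]
                trans = freqs["01"] * freqs["10"]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                counts["00"] += w
                counts["11"] += w
                counts["01"] += 1.0 - w
                counts["10"] += 1.0 - w
            else:
                # Phase is unambiguous when at most one SNP is heterozygous.
                for x, y in zip(_alleles(g1), _alleles(g2)):
                    counts[f"{x}{y}"] += 1.0
        # M-step: renormalize expected haplotype counts.
        freqs = {h: c / n_chromosomes for h, c in counts.items()}
    return freqs


if __name__ == "__main__":
    sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
    print(em_haplotype_frequencies(sample))
```

In this toy sample the homozygous individuals carry only haplotypes 00 and 11, so the EM iterations assign the double heterozygotes almost entirely to the 00/11 phasing.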
In one embodiment, an individual's genetic data is phased using an algorithm that estimates haplotypes from population genotype data, such as an algorithm for haplotype inference based on genotype statistics collected for pairs of SNPs. This software can be used for comparatively accurate phasing of a large number of long genome sequences, e.g., obtained from DNA arrays. An exemplary program takes a genotype matrix as input and outputs the corresponding haplotype matrix (see, e.g., Brinza and Zelikovsky, "2SNP: scalable phasing based on 2-SNP haplotypes," Bioinformatics 22(3): 371-3, 2006, which is hereby incorporated by reference in its entirety). An exemplary program is 2SNP (available on the world wide web at alla.cs.gsu.edu/sofltware/2SNP, which is hereby incorporated by reference in its entirety).
In various embodiments, any of the methods for phasing the genetic data of an individual uses data regarding the probability of chromosomes crossing over at different locations in a chromosome or chromosome segment (such as recombination data, for example the recombination data found in the HapMap database, used to create a recombination risk score for any interval) to model the dependence between polymorphic alleles on the chromosome or chromosome segment. In some embodiments, allele counts at the polymorphic loci are calculated on a computer based on sequencing data or SNP array data. In some embodiments, a plurality of hypotheses, each pertaining to a different possible state of the chromosome or chromosome segment (such as an overrepresentation of the number of copies of a first homologous chromosome segment as compared to a second homologous chromosome segment in the genome of one or more cells, a duplication of the first homologous chromosome segment, a deletion of the second homologous chromosome segment, or an equal representation of the first and second homologous chromosome segments), is created (such as on a computer); a model (such as a joint distribution model) for the expected allele counts at the polymorphic loci on the chromosome is built (such as on a computer) for each hypothesis; a relative probability of each of the hypotheses is determined (such as on a computer) using the joint distribution model and the allele counts; and the hypothesis with the greatest probability is selected. In some embodiments, the steps of building the joint distribution model for the allele counts and determining the relative probability of each hypothesis are performed using a method that does not require the use of a reference chromosome.
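The hypothesis-comparison steps above can be sketched in simplified form. This illustration makes several assumptions that are not taken from the text: the phase is already known at each heterozygous locus, the fraction of abnormal cells is given, allele counts at different loci are treated as independent binomial draws (so crossover-dependent linkage is ignored), and the hypotheses have equal priors. All names are hypothetical.

```python
import math

# Copies of (first homolog, second homolog) in the abnormal cells.
HYPOTHESES = {"equal": (1, 1), "dup_hap1": (2, 1), "del_hap2": (1, 0)}


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the binomial probability of k successes in n trials."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # avoid log(0)
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


def hap1_fraction(n1: int, n2: int, abnormal_fraction: float) -> float:
    """Expected share of reads from homolog 1 in a normal/abnormal mixture."""
    f = abnormal_fraction
    return ((1.0 - f) + f * n1) / (2.0 * (1.0 - f) + f * (n1 + n2))


def relative_probabilities(loci, abnormal_fraction: float) -> dict:
    """loci: list of (a_reads, b_reads, allele_on_hap1) at heterozygous SNPs."""
    log_like = {}
    for name, (n1, n2) in HYPOTHESES.items():
        h1 = hap1_fraction(n1, n2, abnormal_fraction)
        total = 0.0
        for a_reads, b_reads, hap1_allele in loci:
            p_a = h1 if hap1_allele == "A" else 1.0 - h1
            total += log_binom_pmf(a_reads, a_reads + b_reads, p_a)
        log_like[name] = total
    # Normalize in log space to obtain relative probabilities.
    best = max(log_like.values())
    weights = {k: math.exp(v - best) for k, v in log_like.items()}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}


if __name__ == "__main__":
    loci = [(667, 333, "A"), (333, 667, "B")]
    probs = relative_probabilities(loci, abnormal_fraction=0.5)
    print(max(probs, key=probs.get), probs)
```

A fuller model would also sum over the abnormal-cell fraction and over possible crossover positions rather than fixing them.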
In one embodiment, genetic data of an individual is phased using genetic data of one or more relatives of the individual (such as one or more parents, siblings, children, fetuses, embryos, grandparents, uncles, aunts, or cousins). In one embodiment, genetic data of an individual is phased using genetic data of one or more genetic offspring (e.g., 1, 2, 3, or more offspring) of the individual, such as embryos, fetuses, born children, or samples from a miscarriage. In one embodiment, genetic data of a parent (such as a parent of a gestating fetus or embryo) is phased using phased haplotypic data for the other parent together with unphased genetic data of one or more genetic offspring of the parents.
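As a minimal illustration of phasing with relatives' data, the following sketch phases a single locus of a child from the unphased genotypes of both parents using Mendelian rules only. The function name `phase_child_site` and the two-letter genotype encoding are hypothetical, and the example covers only one of the relative configurations listed above.

```python
def phase_child_site(child: str, mother: str, father: str):
    """Return (maternal_allele, paternal_allele) for the child at one locus,
    or None if the locus is ambiguous or inconsistent with Mendelian rules.

    Genotypes are two-letter strings such as "AA", "AB", or "BB".
    """
    child_alleles = tuple(sorted(child))
    # Enumerate every (maternal, paternal) transmission that yields the child.
    options = {
        (m, p)
        for m in set(mother)
        for p in set(father)
        if tuple(sorted((m, p))) == child_alleles
    }
    if len(options) == 1:
        return options.pop()
    return None


if __name__ == "__main__":
    print(phase_child_site("AB", "AA", "AB"))  # mother can only transmit A
    print(phase_child_site("AB", "AB", "AB"))  # all heterozygous: ambiguous
```

Loci at which all three individuals are heterozygous remain unresolved by this rule and would need one of the statistical or molecular phasing methods described herein.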
In some embodiments, a sample (e.g., a biopsy such as a tumor biopsy, a blood sample, a plasma sample, a serum sample, or another sample likely to contain mostly or only DNA or RNA with a CNV of interest) from an individual (such as an individual suspected of having cancer, a fetus, or an embryo) is analyzed to determine the phase of one or more regions that are known or suspected to contain a CNV of interest (such as a deletion or duplication). In some embodiments, the sample has a high tumor fraction (such as 30, 40, 50, 60, 70, 80, 90, 95, 98, 99, or 100%). In some embodiments, a sample (e.g., a maternal whole blood sample, a sample obtained from a maternal blood sample, a maternal plasma sample, a maternal serum sample, an amniocentesis sample, a placental tissue sample (e.g., chorionic villus, decidua, or placental membrane), a cervical mucus sample, fetal tissue after fetal demise, another sample from a fetus, or another sample likely to contain mostly or only cells, DNA, or RNA with a CNV of interest) from a fetus or from the pregnant mother of a fetus is analyzed to determine the phase of one or more regions that are known or suspected to contain a CNV of interest (such as a deletion or duplication). In some embodiments, the sample has a high fetal fraction (such as 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 100%).
In some embodiments, the sample has a haplotypic imbalance or any aneuploidy. In some embodiments, the sample includes any mixture of two types of DNA in which the two types have different ratios of the two haplotypes and share at least one haplotype. For example, in the case of a fetus and mother, the mother is 1:1 and the fetus is 1:0 (plus a paternal haplotype). For example, in the case of a tumor, the normal tissue is 1:1 and the tumor tissue is 1:0 or 1:2, 1:3, 1:4, etc. In some embodiments, at least 10; 100; 500; 1,000; 2,000; 3,000; 5,000; 8,000; or 10,000 polymorphic loci are analyzed to determine the phase of alleles at some or all of the loci. In some embodiments, the sample is from a cell or tissue that was treated to become aneuploid, such as aneuploidy induced by prolonged cell culture.
In some embodiments, a large percentage or all of the DNA or RNA in the sample has the CNV of interest. In some embodiments, the ratio of DNA or RNA from the one or more target cells that contain the CNV of interest to the total DNA or RNA in the sample is at least 80%, 85%, 90%, 95%, or 100%. For samples with a deletion, only one haplotype is present in the cells (or DNA or RNA) with the deletion. This first haplotype can be determined using standard methods to determine the identity of the alleles present in the region of the deletion. In samples that contain only cells (or DNA or RNA) with the deletion, there is signal only from the first haplotype that is present in those cells. In samples that also contain a small amount of cells (or DNA or RNA) without the deletion (such as a small amount of noncancerous cells), the weak signal from the second haplotype in these cells (or DNA or RNA) can be ignored. The second haplotype, which is present in other cells, DNA, or RNA from the individual that lack the deletion, can be determined by inference. For example, if the genotype of cells from the individual without the deletion is (AB, AB) and the phased data for the individual indicates that the first haplotype is (A, A), then the other haplotype can be inferred to be (B, B).
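The inference in the (AB, AB) example above amounts to removing the known allele from each genotype. A minimal sketch follows; the function name `infer_other_haplotype` and the string encoding of genotypes are hypothetical.

```python
def infer_other_haplotype(genotypes, known_haplotype):
    """Given unphased genotypes (e.g. "AB") and one known haplotype,
    return the complementary haplotype locus by locus."""
    other = []
    for genotype, known in zip(genotypes, known_haplotype):
        alleles = list(genotype)
        if known not in alleles:
            raise ValueError(f"allele {known!r} is not in genotype {genotype!r}")
        alleles.remove(known)  # drop one copy of the known allele
        other.append(alleles[0])
    return tuple(other)


if __name__ == "__main__":
    print(infer_other_haplotype(("AB", "AB"), ("A", "A")))
```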
For samples in which both cells (or DNA or RNA) with a deletion and cells (or DNA or RNA) without the deletion are present, the phase can still be determined. For example, a plot similar to Figure 18 or 29 can be generated in which the x-axis represents the linear position of the individual loci along the chromosome and the y-axis represents the number of A allele reads as a fraction of the total (A+B) allele reads. In some embodiments, for a deletion, the pattern includes two central bands that represent SNPs for which the individual is heterozygous (the top band represents AB from cells without the deletion and A from cells with the deletion, and the bottom band represents AB from cells without the deletion and B from cells with the deletion). In some embodiments, the separation of these two bands increases as the fraction of cells, DNA, or RNA with the deletion increases. Thus, the identity of the A alleles can be used to determine the first haplotype, and the identity of the B alleles can be used to determine the second haplotype.
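The expected positions of the two central bands for a deletion follow directly from the mixture described above. The sketch below assumes unbiased measurement and that the deleted fraction is expressed in genome equivalents; the function name `deletion_band_positions` is hypothetical.

```python
def deletion_band_positions(deleted_fraction: float):
    """Expected A-allele read fraction for the two heterozygous bands.

    deleted_fraction: share of cells (genome equivalents) carrying the deletion.
    Top band: SNPs where the retained homolog carries A.
    Bottom band: SNPs where the retained homolog carries B.
    """
    f = deleted_fraction
    if not 0.0 <= f <= 1.0:
        raise ValueError("deleted_fraction must be in [0, 1]")
    total_copies = 2.0 * (1.0 - f) + 1.0 * f  # two copies normally, one if deleted
    top = 1.0 / total_copies           # A is present in every cell
    bottom = (1.0 - f) / total_copies  # A is present only in cells without the deletion
    return top, bottom


if __name__ == "__main__":
    for f in (0.0, 0.1, 0.5, 1.0):
        top, bottom = deletion_band_positions(f)
        print(f"f={f}: top={top:.3f}, bottom={bottom:.3f}, separation={top - bottom:.3f}")
```

Under these assumptions the separation between the bands is f/(2 - f), which grows with the deleted fraction, consistent with the description above.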
For samples with a duplication, an extra copy of one haplotype is present in the cells (or DNA or RNA) with the duplication. The haplotype of the duplicated region can be determined using standard methods to determine the identity of the alleles that are present in an increased amount in the duplicated region, or the haplotype of the region that is not duplicated can be determined using standard methods to determine the identity of the alleles that are present in a decreased amount. Once one haplotype is determined, the other haplotype can be determined by inference.
For samples in which both cells (or DNA or RNA) with a duplication and cells (or DNA or RNA) without the duplication are present, the phase can still be determined using a method similar to that described above for deletions. For example, a plot similar to Figure 18 or 29 can be generated in which the x-axis represents the linear position of the individual loci along the chromosome and the y-axis represents the number of A allele reads as a fraction of the total (A+B) allele reads. In some embodiments, for a duplication, the pattern includes two central bands that represent SNPs for which the individual is heterozygous (the top band represents AB from cells without the duplication and AAB from cells with the duplication, and the bottom band represents AB from cells without the duplication and ABB from cells with the duplication). In some embodiments, the separation of these two bands increases as the fraction of cells, DNA, or RNA with the duplication increases. Thus, the identity of the A alleles can be used to determine the first haplotype, and the identity of the B alleles can be used to determine the second haplotype. In some embodiments, the phase of one or more CNV regions (such as the phase of at least 50%, 60%, 70%, 80%, 90%, 95%, or 100% of the polymorphic loci measured in the region) is determined for a sample (such as a tumor biopsy or plasma sample) from an individual known to have cancer and is used to analyze subsequent samples from the same individual to monitor the progression of the cancer (such as monitoring for remission or recurrence of the cancer). In some embodiments, a sample with a high tumor fraction (such as a tumor biopsy or a plasma sample from an individual with a high tumor burden) is used to obtain phased data that is used to analyze subsequent samples with a lower tumor fraction (such as a plasma sample from an individual undergoing treatment for cancer or in remission).
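For the duplication pattern described above, the band separation can in principle be inverted to estimate the fraction of cells carrying the duplication, which is relevant when later samples have a lower tumor fraction. The sketch below assumes unbiased allele measurements and a single extra copy of one homolog; both function names are hypothetical.

```python
def duplication_band_positions(duplicated_fraction: float):
    """Expected A-allele fraction of the top (AAB) and bottom (ABB) bands."""
    f = duplicated_fraction
    if not 0.0 <= f <= 1.0:
        raise ValueError("duplicated_fraction must be in [0, 1]")
    total_copies = 2.0 + f  # one extra copy in the duplicated cells
    return (1.0 + f) / total_copies, 1.0 / total_copies


def duplicated_fraction_from_separation(separation: float) -> float:
    """Invert separation = f / (2 + f) to recover the duplicated fraction f."""
    if not 0.0 <= separation <= 1.0 / 3.0 + 1e-12:
        raise ValueError("separation must be between 0 and 1/3 for one extra copy")
    return 2.0 * separation / (1.0 - separation)


if __name__ == "__main__":
    top, bottom = duplication_band_positions(0.5)
    print(top, bottom, duplicated_fraction_from_separation(top - bottom))
```

Because the separation shrinks toward zero as the duplicated fraction falls, phase obtained from a high-fraction sample helps interpret the allele ratios in a low-fraction sample from the same individual.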
In another embodiment for prenatal diagnosis, phased parental haplotypic data is used to detect the presence of more than one homolog from the father, which implies that genetic material from more than one fetus is present in the maternal blood sample. By focusing on chromosomes that are expected to be euploid in a fetus, the possibility that the fetus has a trisomy can be ruled out. In addition, it is possible to determine whether the fetal DNA is not from the current father.
In some embodiments, two or more of the methods described herein are used to phase the genetic data of an individual. In some embodiments, both a bioinformatics method (such as using population-based haplotype frequencies to infer the most likely phase) and a molecular biology method (such as any of the molecular phasing methods disclosed herein, used to obtain actual phased data rather than bioinformatics-based inferred phased data) are used. In some embodiments, phased data from other subjects (such as prior subjects) is used to refine the population data. For example, phased data from other subjects can be added to the population data to calculate priors for the possible haplotypes of another subject. In some embodiments, phased data from other subjects (such as prior subjects) is used to calculate priors for the possible haplotypes of another subject.
In some embodiments, probabilistic data may be used. For example, because of the probabilistic
nature of the representation of DNA molecules in a sample, as well as various amplification and
measurement biases, the relative number of DNA molecules measured from two different loci, or from
different alleles at a given locus, does not always represent the relative number of molecules in
the mixture or in the individual. If one attempts to determine the genotype of a normal diploid
individual at a given locus on an autosome by sequencing DNA from the individual's plasma, one
would expect to observe either only one allele (homozygous) or approximately equal numbers of two
alleles (heterozygous). If ten molecules of the A allele and two molecules of the B allele are
observed at that locus, it is not clear whether the individual is homozygous at the locus and the
two B molecules are due to noise or contamination, or whether the individual is heterozygous and
the lower number of B molecules is due to random statistical variation in the number of DNA
molecules in the plasma, amplification bias, contamination, or any number of other causes. In
such a case, the probability that the individual is homozygous and the corresponding probability
that the individual is heterozygous can be calculated, and these probabilistic genotypes can be
used in further calculations.
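The calculation described above can be illustrated with a minimal sketch. It assumes a simple
binomial read model, a hypothetical 1% rate at which noise or contamination produces a B read
from a homozygous AA individual, and equal prior probabilities for the two genotypes. None of
these values come from this document; the function names are illustrative only.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def genotype_posteriors(n_a: int, n_b: int,
                        error_rate: float = 0.01,   # assumed noise/contamination rate
                        prior_hom: float = 0.5) -> dict:
    """Posterior probability of AA (homozygous) versus AB (heterozygous)."""
    n = n_a + n_b
    # Homozygous AA: B reads arise only from noise or contamination.
    joint_hom = binom_pmf(n_b, n, error_rate) * prior_hom
    # Heterozygous AB: each read is A or B with equal probability.
    joint_het = binom_pmf(n_b, n, 0.5) * (1 - prior_hom)
    total = joint_hom + joint_het
    return {"AA": joint_hom / total, "AB": joint_het / total}


# Ten A molecules and two B molecules, as in the example in the text.
posteriors = genotype_posteriors(10, 2)
print(posteriors)
```

Under these assumed parameters neither genotype is excluded, which is why the two probabilities,
rather than a single hard genotype call, would be carried into later calculations.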
Note that, for a given allele ratio, the more molecules are observed, the more likely it is that
the ratio closely represents the ratio of DNA molecules in the individual. For example, if 100 A
molecules and 100 B molecules are measured, the likelihood that the actual ratio is 50% is much
greater than if 10 A molecules and 10 B molecules are measured. In one embodiment, Bayesian theory
is combined with a detailed model of the data to determine the likelihood that a particular
hypothesis is correct given an observation. For example, if two hypotheses are considered, one
corresponding to a disomic individual and one corresponding to a trisomic individual, then the
probability that the disomy hypothesis is correct is considerably higher when 100 molecules of
each of the two alleles are observed than when 10 molecules of each are observed. As the data
becomes noisier because of bias, contamination, or some other noise source, or as the number of
observations at a given locus decreases, the probability that the maximum likelihood hypothesis is
true given the observed data drops. In practice, probabilities can be aggregated over many loci to
increase the confidence with which the maximum likelihood hypothesis can be determined to be the
correct hypothesis. In some embodiments, the probabilities are simply aggregated without regard to
recombination. In some embodiments, the calculation takes crossovers into account.
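The effect of read depth on the disomy-versus-trisomy comparison can be sketched as follows. The
model is an assumption made for illustration: a pure (unmixed) sample, binomial sampling with no
noise, an expected A fraction of 1/2 under disomy and of 1/3 or 2/3 (equally likely) under
trisomy, and equal priors on the two hypotheses.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior_disomy(n_a: int, n_b: int, prior_disomy: float = 0.5) -> float:
    """Posterior probability of disomy at one heterozygous locus (toy model)."""
    n = n_a + n_b
    lik_disomy = binom_pmf(n_a, n, 0.5)
    # Trisomy: genotype AAB or ABB, assumed equally likely a priori.
    lik_trisomy = 0.5 * binom_pmf(n_a, n, 2 / 3) + 0.5 * binom_pmf(n_a, n, 1 / 3)
    joint_disomy = lik_disomy * prior_disomy
    joint_trisomy = lik_trisomy * (1 - prior_disomy)
    return joint_disomy / (joint_disomy + joint_trisomy)


p_10 = posterior_disomy(10, 10)      # 10 molecules of each allele
p_100 = posterior_disomy(100, 100)   # 100 molecules of each allele
print(p_10, p_100)
```

The same 50% observed ratio supports disomy much more strongly at the higher depth. Aggregating
over loci would multiply per-locus likelihoods (or sum their logarithms) before normalizing.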
In one embodiment, probabilistically phased data is used to determine copy number variation. In
some embodiments, the probabilistically phased data is population-based haplotype block frequency
data from a data source such as the HapMap database. In some embodiments, the probabilistically
phased data is haplotype data obtained by a molecular method, such as phasing by dilution, in
which individual chromosome segments are diluted to a single molecule per reaction but, because
of stochastic noise, the identities of the haplotypes may not be known with absolute certainty.
In some embodiments, the probabilistically phased data is haplotype data obtained by a molecular
method in which the identities of the haplotypes can be known with a high degree of certainty.
Imagine a hypothetical case in which a doctor wants to determine, by measuring an individual's
plasma DNA, whether some cells in the individual's body have a deletion at a particular chromosome
segment. The doctor can make use of the following knowledge: if all of the cells from which the
plasma DNA originated are diploid and have the same genotype, then for heterozygous loci the
relative number of DNA molecules observed for each of the two alleles will fall into one
distribution centered at 50% A allele and 50% B allele. However, if a fraction of the cells from
which the plasma DNA originated have a deletion at a particular chromosome segment, then for
heterozygous loci the relative number of DNA molecules observed for each of the two alleles is
expected to fall into two distributions: one centered above 50% A allele, for loci where the
deleted chromosome segment contains the B allele, and one centered below 50% A allele, for loci
where the deleted chromosome segment contains the A allele. The greater the proportion of the
cells contributing to the plasma DNA that contain the deletion, the further from 50% these two
distributions will be.
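The shift of the two distributions away from 50% can be made concrete with a short sketch. It
assumes that every contributing cell sheds DNA equally and that a cell with the deletion has lost
exactly one copy of the segment; the function name and the 6% example value are illustrative.

```python
def expected_a_fraction(deletion_fraction: float, deleted_allele: str) -> float:
    """Expected A-allele fraction at a heterozygous locus.

    deletion_fraction: fraction of contributing cells carrying the deletion.
    deleted_allele: allele ("A" or "B") carried by the deleted homolog at this locus.
    """
    if not 0.0 <= deletion_fraction <= 1.0:
        raise ValueError("deletion_fraction must be between 0 and 1")
    if deleted_allele not in ("A", "B"):
        raise ValueError("deleted_allele must be 'A' or 'B'")
    # Average copies per cell: one per allele, minus the share lost to the deletion.
    a_copies = 1.0 - deletion_fraction if deleted_allele == "A" else 1.0
    b_copies = 1.0 - deletion_fraction if deleted_allele == "B" else 1.0
    return a_copies / (a_copies + b_copies)


above = expected_a_fraction(0.06, "B")  # B deleted, so A is over-represented
below = expected_a_fraction(0.06, "A")  # A deleted, so A is under-represented
print(above, below)
```

With no deletion both values equal 0.5, and the two centers move apart as the fraction of cells
with the deletion grows.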
In this hypothetical case, imagine that a clinician wants to determine whether an individual has
a proportion of cells in the body with a deletion at a chromosomal region. The clinician may draw
blood from the individual into a vacutainer or other type of blood tube, centrifuge the blood,
and isolate the plasma layer. The clinician may isolate DNA from the plasma and enrich the DNA at
the target loci, possibly through targeted or other amplification, locus capture techniques, size
enrichment, or other enrichment techniques. The clinician may then analyze the enriched and/or
amplified DNA, for example by measuring the number of alleles at a set of SNPs (in other words,
generating allele frequency data), using an assay such as qPCR, sequencing, a microarray, or
another technique that measures the quantity of DNA in a sample. We will consider data analysis
for the case in which the clinician amplifies the cell-free plasma DNA using a targeted
amplification technique and then sequences the amplified DNA, giving the following exemplary
possible data at six SNPs found on a chromosome segment that is indicative of cancer, where the
individual is heterozygous at those SNPs:
SNP 1: 460 reads of the A allele; 540 reads of the B allele (46% A)
SNP 2: 530 reads of the A allele; 470 reads of the B allele (53% A)
SNP 3: 40 reads of the A allele; 60 reads of the B allele (40% A)
SNP 4: 46 reads of the A allele; 54 reads of the B allele (46% A)
SNP 5: 520 reads of the A allele; 480 reads of the B allele (52% A)
SNP 6: 200 reads of the A allele; 200 reads of the B allele (50% A)
From this set of data, it may be difficult to distinguish between the case in which the
individual is normal, with all cells being diploid, and the case in which the individual may have
cancer, with a portion of the cells whose DNA contributes to the cell-free DNA found in the plasma
having a deletion or duplication at the chromosome segment. For example, the two hypotheses with
the maximum likelihood may be that the individual has a deletion at this chromosome segment, with
a tumor fraction of 6%, and that the deleted segment of the chromosome has the genotype over the
six SNPs of (A, B, A, A, B, B) or (A, B, A, A, B, A). In this representation of the individual's
genotype over a set of SNPs, the first letter in the parentheses corresponds to the genotype of
the haplotype at SNP 1, the second to SNP 2, and so on.
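A simple way to see where the two candidate haplotypes come from is to ask, at each SNP, which
allele is under-represented. The sketch below applies only that sign rule to the six read counts
listed above; it is not the maximum likelihood calculation itself and ignores noise, so it should
be read as an illustration under that assumption.

```python
from itertools import product

# (A reads, B reads) for SNP 1 to SNP 6, as listed in the text.
READS = [(460, 540), (530, 470), (40, 60), (46, 54), (520, 480), (200, 200)]


def candidate_deleted_haplotypes(reads):
    """List haplotypes of the deleted homolog consistent with the imbalance direction."""
    options = []
    for n_a, n_b in reads:
        if n_a < n_b:
            options.append("A")    # A under-represented: deleted homolog carries A
        elif n_a > n_b:
            options.append("B")    # B under-represented: deleted homolog carries B
        else:
            options.append("AB")   # no imbalance: either allele is possible
    return ["".join(h) for h in product(*options)]


candidates = candidate_deleted_haplotypes(READS)
print(candidates)
```

SNP 6 shows no imbalance, so the rule leaves two candidates that differ only at the last
position.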
If a method is used to determine the haplotype of the individual at the chromosome segment, and
the haplotype of one of the two chromosomes is found to be (A, B, A, A, B, B), then this agrees
with the maximum likelihood hypothesis, and the calculated likelihood that the individual has a
deletion at this segment, and therefore may have cancerous or precancerous cells, is increased.
On the other hand, if the individual is found to have the haplotype (A, A, A, A, A, A), then the
likelihood that the individual has a deletion at this chromosome segment is significantly
decreased, and the likelihood of the no-deletion hypothesis may be higher (the actual likelihood
values will depend on other parameters, such as the measurement noise in the system).
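The effect of the phased haplotype on the likelihoods can be sketched with the six-SNP read
counts given earlier. The sketch assumes noise-free binomial sampling and a fixed tumor fraction
of 6%, and it drops the binomial coefficient because that term is the same for every hypothesis.
As the text notes, actual likelihood values depend on noise and other parameters that are not
modeled here.

```python
from math import log

# (A reads, B reads) for SNP 1 to SNP 6, as listed in the text.
READS = [(460, 540), (530, 470), (40, 60), (46, 54), (520, 480), (200, 200)]


def expected_a_fraction(tumor_fraction: float, deleted_allele: str) -> float:
    """Expected A fraction when a share of cells has lost the homolog carrying deleted_allele."""
    a_copies = 1.0 - tumor_fraction if deleted_allele == "A" else 1.0
    b_copies = 1.0 - tumor_fraction if deleted_allele == "B" else 1.0
    return a_copies / (a_copies + b_copies)


def log_likelihood(reads, deleted_haplotype: str, tumor_fraction: float) -> float:
    """Binomial log-likelihood (up to a constant) of the reads under a deletion hypothesis."""
    total = 0.0
    for (n_a, n_b), allele in zip(reads, deleted_haplotype):
        p = expected_a_fraction(tumor_fraction, allele)
        total += n_a * log(p) + n_b * log(1.0 - p)
    return total


ll_no_deletion = log_likelihood(READS, "AAAAAA", 0.0)   # tumor fraction 0: every SNP at 50%
ll_match = log_likelihood(READS, "ABAABB", 0.06)        # phased haplotype (A, B, A, A, B, B)
ll_all_a = log_likelihood(READS, "AAAAAA", 0.06)        # phased haplotype (A, A, A, A, A, A)
ll_all_b = log_likelihood(READS, "BBBBBB", 0.06)        # its complementary homolog
print(ll_match, ll_no_deletion, ll_all_a, ll_all_b)
```

In this toy model a deletion of the (A, B, A, A, B, B) homolog fits the reads better than no
deletion, whereas a deletion of either homolog of an individual phased as (A, A, A, A, A, A)
fits worse than no deletion.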
There are many methods for determining the haplotype of an individual, many of which are
described elsewhere in this document. A partial list is given here and is not meant to be
exhaustive. One method is a biological method in which individual DNA molecules are diluted until
approximately one molecule from each chromosomal region is present in any given reaction volume,
and a method such as sequencing is then used to measure the genotype. Another method is
informatics-based, in which population data on various haplotypes, coupled with their
frequencies, is used in a probabilistic manner. Another method is to measure the diploid data of
the individual together with that of one or more related individuals who are expected to share
haplotype blocks with the individual, and to infer the haplotype blocks. Yet another method is to
take a tissue sample with a high concentration of the deleted or duplicated segment and determine
the haplotype based on allelic imbalance. For example, genotype measurements from a tumor tissue
sample with a deletion can be used to determine phased data for the deleted region, and this data
can then be used to determine whether the cancer has regrown after resection.
In practice, typically more than 20 SNPs, more than 50 SNPs, more than 100 SNPs, more than 500
SNPs, more than 1,000 SNPs, or more than 5,000 SNPs are measured on a given chromosome segment.
Exemplary Methods for Phasing, Predicting Allele Ratios, and Reconstructing Fetal Genetic Data
In one aspect, the invention features methods for determining one or more haplotypes of a fetus.
In various embodiments, the method allows one to determine which polymorphic loci (such as SNPs)
were inherited by the fetus and to reconstruct which homologs (including recombination events)
are present in the fetus (and thereby to interpolate the sequence between the polymorphic loci).
If desired, essentially the entire genome of the fetus can be reconstructed. If some ambiguity
remains in the genome of the fetus (such as in intervals containing a crossover), this ambiguity
can be minimized, if desired, by analyzing additional polymorphic loci. In various embodiments,
the polymorphic loci are chosen to cover one or more chromosomes at a density that reduces any
ambiguity to a desired level. This method has important applications for detecting polymorphisms
or other mutations of interest (such as deletions or duplications) in a fetus, because it allows
them to be detected based on linkage (such as the presence of linked polymorphic loci in the
fetal genome) rather than by directly detecting the polymorphism or other mutation of interest in
the fetal genome. For example, if a parent is a carrier of a mutation associated with cystic
fibrosis (CF), a nucleic acid sample that includes maternal DNA from the mother of the fetus and
fetal DNA from the fetus can be analyzed to determine whether the fetal DNA includes the
haplotype containing the CF mutation. In particular, polymorphic loci can be analyzed to determine
whether the fetal DNA includes the haplotype containing the CF mutation without having to detect
the CF mutation itself in the fetal DNA. This is useful for screening for one or more mutations,
such as disease-linked mutations, without having to detect the mutations directly.
In some embodiments, the method includes determining a parental haplotype (e.g., a haplotype of
the mother or father of the fetus), such as by using any of the methods described herein. In some
embodiments, this determination is made without using data from a relative of the mother or
father. In some embodiments, a parental haplotype is determined using a dilution approach as
described herein followed by SNP genotyping or sequencing. In some embodiments, the haplotype of
the mother (or father) is determined by any of the methods described herein using data from a
relative of the mother (or father). In some embodiments, a haplotype is determined for both the
father and the mother.
The parental haplotype data can be used to determine whether the fetus inherited the parental
haplotype. In some embodiments, a nucleic acid sample that includes maternal DNA and fetal DNA is
analyzed using a SNP array to detect at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500;
10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci. In
some embodiments, a nucleic acid sample that includes maternal DNA and fetal DNA is analyzed by
contacting the sample with a library of primers that simultaneously hybridize to at least 100;
200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000;
75,000; or 100,000 different polymorphic loci (such as SNPs) to produce a reaction mixture. In
some embodiments, the reaction mixture is subjected to primer extension reaction conditions to
produce amplified products. In some embodiments, the amplified products are measured with a
high-throughput sequencer to produce sequencing data.
In various embodiments, a fetal haplotype is determined using data about the probability of
chromosomes crossing over at different locations in a chromosome or chromosome segment (such as
by using recombination data, for example recombination data that can be found in the HapMap
database, to create a recombination risk score for any interval) to model the dependence between
polymorphic alleles on the chromosome or chromosome segment, as described above. In some
embodiments, the method takes into account the physical distance between the SNPs (such as SNPs
flanking a gene or mutation of interest), recombination data from location-specific recombination
likelihoods, and the data observed from the genetic measurements of the maternal plasma, to
obtain the most likely genotype of the fetus. PARENTAL SUPPORT™ can then be performed on the
targeted sequencing or SNP array data obtained from these SNPs to determine which homologs the
fetus inherited from both parents (see, e.g., U.S. Application No. 11/603,406 (US Publication No.
20070184467), U.S. Application No. 12/076,348 (US Publication No. 20080243398), U.S. Application
No. 13/110,685 (US Publication No. 2011/0288780), PCT Application No. PCT/US09/52730 (PCT
Publication No. WO/2010/017214), PCT Application No. PCT/US10/050824 (PCT Publication No.
WO/2011/041485), U.S. Application No. 13/300,235 (US Publication No. 2012/0270212), U.S.
Application No. 13/335,043 (US Publication No. 2012/0122701), U.S. Application No. 13/683,604,
and U.S. Application No. 13/780,022, each of which is incorporated herein by reference in its
entirety).
Assume the general case in which the possible alleles at a locus are A and B; the assignment of
the identity A or B to a particular allele is arbitrary. The parental genotypes for a particular
SNP, termed the genetic context, are expressed as maternal genotype|paternal genotype. Thus, if
the mother is homozygous and the father is heterozygous, this is represented as AA|AB. Similarly,
if both parents are homozygous for the same allele, the parental genotypes are represented as
AA|AA. In that case, the fetus will never have the AB or BB state, and the number of sequence
reads with the B allele will be low; these reads can therefore be used to determine the noise
responses of the assay and genotyping platform, including effects such as low-level DNA
contamination and sequencing errors. These noise responses can be used to model the expected
genetic data profiles. There are only five possible maternal|paternal genetic contexts: AA|AA,
AA|AB, AB|AA, AB|AB, and AA|BB; the other contexts are equivalent by symmetry. SNPs at which the
parents are homozygous for the same allele are informative only for determining noise and
contamination levels. SNPs at which the parents are not homozygous for the same allele are
informative for determining fetal fraction and copy number.
Let $N_{A,i}$ and $N_{B,i}$ denote the number of reads of each allele at SNP $i$, and let $C_i$
denote the parental genetic context at that locus. The data set for a particular chromosome is
represented by $N_{AB} = \{N_{A,i}, N_{B,i}\}$, $i = 1 \ldots N$, and $C = \{C_i\}$,
$i = 1 \ldots N$. To reconstruct part or all of the fetal genome, one can optionally determine
whether the fetus has an aneuploidy (such as a missing or extra copy of a chromosome or chromosome
segment). For each individual chromosome or chromosome segment under study, let $H$ denote the
set of one or more hypotheses about the total number of chromosomes, the parental origin of each
chromosome, and the positions on the parental chromosomes at which recombination occurred during
formation of the gametes that were fertilized to create the child. The probability of a
hypothesis, $P(H)$, can be calculated using data from the HapMap database and prior information
related to each ploidy state.
In addition, let $F$ denote the fraction of fetal cfDNA in the sample. Given a set of possible
$H$, $C$, and $F$, the probability of $N_{AB}$, $P(N_{AB} \mid H, F, C)$, can be calculated by
modeling the noise sources of the molecular assay and the sequencing platform. The goal is to
find the hypothesis $H$ and the fetal fraction $F$ that maximize $P(H, F \mid N_{AB})$. Using
standard Bayesian statistical techniques, and assuming a uniform probability distribution for $F$
from 0 to 1, this can be rewritten as maximizing $P(N_{AB} \mid H, F, C)\,P(H)$ with respect to
$H$ and $F$, both factors of which can now be calculated. The probabilities of all hypotheses
associated with a particular copy number and fetal fraction (e.g., trisomy and $F = 10\%$, but
covering all possible parental chromosome origins and crossover locations) are summed. The copy
number hypothesis with the highest probability is selected as the test result, the fetal fraction
associated with that hypothesis is reported as the fetal fraction, and the probability associated
with that hypothesis is the calculated accuracy of the result.
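The final summation and selection step can be sketched as follows. The sketch assumes that the
product P(N_AB | H, F, C) P(H) has already been computed for each detailed hypothesis by some
data model that is not shown. The function name, the tuple layout, and the example numbers are
hypothetical and serve only to make the bookkeeping visible; normalizing the winning sum to obtain
a reported probability is also an assumption.

```python
from collections import defaultdict


def select_copy_number(joint_terms):
    """Sum joint probabilities per (copy number, fetal fraction) and pick the largest.

    joint_terms: iterable of (copy_number_label, fetal_fraction, joint), where joint is
    P(N_AB | H, F, C) * P(H) for one detailed hypothesis (one parental origin and
    crossover pattern).
    Returns (copy_number_label, fetal_fraction, normalized probability).
    """
    summed = defaultdict(float)
    for label, fetal_fraction, joint in joint_terms:
        summed[(label, fetal_fraction)] += joint
    total = sum(summed.values())
    if total <= 0:
        raise ValueError("joint probabilities must sum to a positive value")
    (best_label, best_f), best_sum = max(summed.items(), key=lambda item: item[1])
    return best_label, best_f, best_sum / total


# Hypothetical joint probabilities, invented for illustration only.
terms = [
    ("disomy", 0.10, 2e-5),   # one parental-origin / crossover pattern
    ("disomy", 0.10, 1e-5),   # another pattern, same copy number and fetal fraction
    ("trisomy", 0.10, 4e-6),
    ("trisomy", 0.10, 2e-6),
    ("disomy", 0.05, 1e-6),
]
result = select_copy_number(terms)
print(result)
```

In a real analysis the joint terms would come from the noise model over all SNPs, and sums would
normally be accumulated in log space to avoid numerical underflow.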
In some embodiments, the algorithm uses in silico simulation to generate a very large number of
hypothetical sequencing data sets that could arise from the possible fetal genetic inheritance
patterns, sample parameters, and amplification and measurement artifacts of the method. More
specifically, the algorithm first uses the parental genotypes at a large number of SNPs and
crossover frequency data from the HapMap database to predict the possible fetal genotypes. It then
predicts the expected data profile of the sequencing data that would be measured for a mixed
sample from a mother carrying a fetus with each of the possible fetal genotypes, taking into
account a variety of parameters, including the fetal fraction, the expected read depth profile,
the fetal genome equivalents present in the sample, the expected amplification bias at each SNP,
and a number of noise parameters. A data model describes how the sequencing or SNP array data is
expected to appear for each of these hypotheses given a particular parameter set. The hypothesis
with the best fit between this modeled data and the measured data is selected.
If desired, the results indicating which haplotypes were inherited by the fetus can be used to
calculate expected allele ratios for DNA or RNA from the fetus. Expected allele ratios can also
be calculated for a mixed sample comprising nucleic acids from both the mother and the fetus
(these allele ratios indicate the expected value of a measurement of the total amount of each
allele, including the amount of the allele from both maternal nucleic acids and fetal nucleic
acids). Expected allele ratios can be calculated for different hypotheses specifying the degree
of overrepresentation of the first homologous chromosome segment.
In some embodiments, the method includes determining whether the fetus has one or more of the
following conditions: cystic fibrosis, Huntington's disease, Fragile X, thrombocytopenia,
muscular dystrophy (such as Duchenne muscular dystrophy), Alzheimer's disease, Fanconi anemia,
Gaucher disease, IV, Niemann-Pick disease, Tay-Sachs disease, sickle cell anemia, Parkinson's
disease, torsion dystonia, and cancer. In some embodiments, a fetal haplotype is determined for
one or more chromosomes selected from chromosomes 13, 18, 21, X, and Y. In some embodiments, a
fetal haplotype is determined for all of the fetal chromosomes. In various embodiments, the method
determines essentially the entire genome of the fetus. In some embodiments, the haplotype is
determined for at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the genome of the fetus. In
some embodiments, the haplotype determination for the fetus includes information about which
allele is present at at least 100; 200; 500; 750; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000;
25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different polymorphic loci. In some
embodiments, the method is used to determine a haplotype or allele ratio of an embryo.
Exemplary Methods for Predicting Allele Ratios
Exemplary methods for calculating expected allele ratios for a sample are described below. Table
1 shows expected allele ratios for a mixed sample (such as a maternal blood sample) containing
nucleic acids from both the mother and the fetus. These expected allele ratios indicate the
expected value of a measurement of the total amount of each allele, including the amount of the
allele from both the maternal nucleic acids and the fetal nucleic acids in the mixed sample. In
one example, the mother is heterozygous at two neighboring loci that are expected to cosegregate
(e.g., two loci between which no chromosome crossover is expected). Thus, the mother is (AB, AB).
Now imagine that the phased data for the mother indicates that for one haplotype she is (A, A);
for the other haplotype, one can therefore infer that she is (B, B). Table 1 gives the expected
allele ratios for different hypotheses with a fetal fraction of 20%. For this example, no
knowledge of paternal data is assumed, and the heterozygosity rate is assumed to be 50%. The
expected allele ratios are given as (expected number of A reads / total number of reads) for each
of the two SNPs. These ratios are calculated both using the maternal phased data (the knowledge
that one haplotype is (A, A) and the other is (B, B)) and without using the maternal phased data.
Table 1 includes different hypotheses for the number of copies of the fetal chromosome segment
from each parent.
Table 1. Expected genetic data for a mixed sample of maternal and fetal nucleic acids
Besides reducing the number of possible expected allele ratios, the use of phased data also
changes the prior likelihood of each expected allele ratio, so that the maximum likelihood result
is more likely to be correct. Eliminating expected allele ratios or hypotheses that are not
possible increases the likelihood that the correct hypothesis will be selected. As an example,
suppose the measured allele ratios are (0.41, 0.59). Without using phased data, one might
conclude that the hypothesis with the maximum likelihood is the disomy hypothesis (given the
similarity of the measured allele ratios to the expected allele ratios of (0.40, 0.60) for
disomy). However, using phased data, one can exclude (0.40, 0.60) as expected allele ratios for
the disomy hypothesis, and a trisomy hypothesis can be selected as more likely.
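The exclusion of (0.40, 0.60) under disomy can be reproduced with a small enumeration. The sketch
below covers only the disomy hypothesis and assumes that each genome copy contributes equally to
the reads, that the fetal fraction is 20%, that the paternal allele at each SNP may be A or B, and
that no crossover occurs between the two loci. The mixture formula is an assumption of this
sketch, since the body of Table 1 is not reproduced here.

```python
from itertools import product

FETAL_FRACTION = 0.20


def expected_a_ratio(maternal_allele: str, paternal_allele: str,
                     fetal_fraction: float = FETAL_FRACTION) -> float:
    """Expected A-read fraction at a SNP where the mother is AB and the fetus is disomic."""
    fetal_a_copies = (maternal_allele == "A") + (paternal_allele == "A")
    a_copies = (1 - fetal_fraction) * 1 + fetal_fraction * fetal_a_copies
    total_copies = (1 - fetal_fraction) * 2 + fetal_fraction * 2
    return round(a_copies / total_copies, 4)


def disomy_ratio_pairs(maternal_haplotypes):
    """All (SNP 1, SNP 2) expected ratio pairs for the allowed maternal haplotypes."""
    pairs = set()
    for haplotype in maternal_haplotypes:
        for paternal in product("AB", repeat=2):
            pairs.add(tuple(expected_a_ratio(m, p) for m, p in zip(haplotype, paternal)))
    return pairs


# With phasing, the fetus can only inherit (A, A) or (B, B) from the mother.
phased = disomy_ratio_pairs(["AA", "BB"])
# Without phasing, (A, B) and (B, A) must also be considered.
unphased = disomy_ratio_pairs(["AA", "AB", "BA", "BB"])
print(sorted(unphased - phased))
```

The pairs that appear only in the unphased set are the expected ratios that phasing rules out for
disomy, which is why a measurement near (0.40, 0.60) shifts support toward another hypothesis.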
Suppose the measured allele ratios are (0.4, 0.4). Without any haplotype information, the
probability of a maternal deletion at each SNP would be 0.5 × P(A deleted) + 0.5 × P(B deleted).
Thus, even though it appears that A is deleted (missing in the fetus), the likelihood of the
deletion would be the average of the two. For a sufficiently high fetal fraction, the most likely
hypothesis can still be determined. For a sufficiently low fetal fraction, the averaging may work
against the deletion hypothesis. With haplotype information, however, the probability that
homolog 1 is deleted, P(A deleted), is greater and fits the measured data better. If desired, the
crossover probability between the two loci can also be taken into account.
In another illustrative example of combining likelihoods using phased data, consider two
consecutive SNPs $s_1$ and $s_2$, and let $D_1$ and $D_2$ denote the allele data at these SNPs.
Here we give an example of how to combine the likelihoods at these two SNPs. Let $c$ denote the
probability that two consecutive heterozygous SNPs have the same allele on the same homolog (that
is, both SNPs are AB or both SNPs are BA). Hence, $1 - c$ is the probability that one SNP is AB
and the other is BA. For example, consider the hypothesis $H_{10}$ and the allelic imbalance value
$f$. First, all likelihoods are calculated assuming that each SNP is either AB or BA. The
likelihoods at two consecutive SNPs can then be combined as follows:
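$$\mathrm{Lik}(D_1, D_2 \mid H_{10}, f) = \mathrm{Lik}(D_1 \mid H_{10}, f) \times c \times \mathrm{Lik}(D_2 \mid H_{10}, f) + \mathrm{Lik}(D_1 \mid H_{10}, f) \times (1 - c) \times \mathrm{Lik}(D_2 \mid H_{01}, f)$$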
We can recursively determine the combined likelihood $\mathrm{Lik}(D_1, \ldots, D_N \mid H_{10}, f)$ for all SNPs.
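One way to read the recursion is as a forward pass over the SNPs that tracks two running totals,
one for each orientation of the current SNP. The sketch below is an interpretation under stated
assumptions rather than the method as specified: the first SNP is taken to be in the H10
orientation (as in the two-SNP formula), the same value of c is used for every pair of adjacent
SNPs, and the per-SNP likelihood values are hypothetical inputs computed elsewhere.

```python
def combined_likelihood(lik_h10, lik_h01, c):
    """Combine per-SNP likelihoods over N consecutive heterozygous SNPs.

    lik_h10[i] = Lik(D_i | H10, f) and lik_h01[i] = Lik(D_i | H01, f).
    c = probability that adjacent SNPs share the same orientation (both AB or both BA).
    """
    if not lik_h10 or len(lik_h10) != len(lik_h01):
        raise ValueError("likelihood lists must be non-empty and of equal length")
    # Running totals for the current SNP being in the H10 or H01 orientation.
    forward_10, forward_01 = lik_h10[0], 0.0
    for l10, l01 in zip(lik_h10[1:], lik_h01[1:]):
        next_10 = l10 * (c * forward_10 + (1 - c) * forward_01)
        next_01 = l01 * ((1 - c) * forward_10 + c * forward_01)
        forward_10, forward_01 = next_10, next_01
    return forward_10 + forward_01


# Two SNPs with hypothetical likelihood values: reduces to the two-SNP formula.
two_snp = combined_likelihood([0.2, 0.3], [0.1, 0.4], c=0.9)
print(two_snp)
```

For N = 2 this reproduces the two-term expression given above, and each additional SNP adds one
update step, so the cost grows linearly with the number of SNPs.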
Exemplary mutations
Exemplary mutations associated with a disease or disorder (such as cancer) or with an increased
risk (such as an above-normal level of risk) of a disease or disorder (such as cancer) include
single nucleotide variants (SNVs), multiple nucleotide mutations, deletions (such as deletion of a
region of 2 to 30 million base pairs), duplications, or tandem repeats. In some embodiments, the
mutation is in DNA, such as cfDNA, cell-free mitochondrial DNA (cf mDNA), cell-free DNA that
originated from nuclear DNA (cf nDNA), cellular DNA, or mitochondrial DNA. In some embodiments,
the mutation is in RNA, such as cfRNA, cellular RNA, cytoplasmic RNA, coding cytoplasmic RNA,
non-coding cytoplasmic RNA, mRNA, miRNA, mitochondrial RNA, rRNA, or tRNA. In some embodiments,
the mutation is present at a higher frequency in subjects with a disease or disorder (such as
cancer) than in subjects without the disease or disorder (such as cancer). In some embodiments,
the mutation is indicative of cancer, such as a causative mutation. In some embodiments, the
mutation is a driver mutation that has a causative role in the disease or disorder. In some
embodiments, the mutation is not a causative mutation. For example, in some cancers, multiple
mutations accumulate, but some of them are not causative mutations. Mutations that are not
causative (such as those that are present at a higher frequency in subjects with a disease or
disorder than in subjects without the disease or disorder) can still be useful for diagnosing the
disease or disorder. In some embodiments, the mutation is loss of heterozygosity (LOH) at one or
more microsatellites.
In some embodiments, a subject is screened for one or more polymorphisms or mutations that the
subject is known to have (for example, to test for their presence, for a change in the amount of
cells, DNA, or RNA with these polymorphisms or mutations, or for cancer remission or recurrence).
In some embodiments, a subject is screened for one or more polymorphisms or mutations for which
the subject is known to be at risk (such as a subject who has a relative with the polymorphism or
mutation). In some embodiments, a subject is screened for a panel of polymorphisms or mutations
associated with a disease or disorder such as cancer (such as at least 5, 10, 50, 100, 200, 300,
500, 750, 1,000, 1,500, 2,000, or 5,000 polymorphisms or mutations).
Many coding variants associated with cancer are described in Abaan et al., "The Exomes of the
NCI-60 Panel: A Genomic Resource for Cancer Biology and Systems Pharmacology", Cancer Research,
July 15, 2013, and on the World Wide Web at
dtp.nci.nih.gov/branches/btb/characterizationNCI60.html, each of which is incorporated herein by
reference in its entirety. The NCI-60 human cancer cell line panel consists of 60 different cell
lines representing cancers of the lung, colon, brain, ovary, breast, prostate, and kidney, as
well as leukemia and melanoma. The genetic variants identified in these cell lines are of two
types: type I variants, which are found in the normal population, and type II variants, which are
cancer-specific.
Exemplary polymorphisms or mutations (such as deletions or duplications) are in one or more of the following genes: TP53, PTEN,
PIK3CA, APC, EGFR, NRAS, NF2, FBXW7, ERBB, ATAD5, KRAS, BRAF, VEGF, EGFR, HER2, ALK, p53,
BRCA, BRCA1, BRCA2, SETD2, LRP1B, PBRM, SPTA1, DNMT3A, ARID1A, GRIN2A, TRRAP, STAG2,
EPHA3/5/7, POLE, SYNE1, C20orf80, CSMD1, CTNNB1, ERBB2, FBXW7, KIT, MUC4, ATM, CDH1,
DDX11, DDX12, DSPP, EPPK1, FAM186A, GNAS, HRNR, KRTAP4-11, MAP2K4, MLL3, NRAS, RB1,
SMAD4, TTN, ABCC9, ACVR1B, ADAM29, ADAMTS19, AGAP10, AKT2, CBWD1, CCDC30, CCDC93, CD5L,
CDC27, CDC42BPA, CDH9, CDKN2A, CHD8, CHEK2, CDK2, CHIN9, CIZ1, CLSPN, CNTN6, COL14A1,
CREBBP, CROCC, CTSF, CYP1A2, DCLK1, DHDDS, DHX32, DKK2, DLEC1, DNAH14, DNAH5, DNAH9,
DNASE1L3, DUSP16, DYNC2H1, ECT2, EFHB, RRN3P2, TRIM49B, TUBB8P5, EPHA7, ERBB3, ERCC6,
FAM21A, FAM21C, FCGBP, FGFR2, FLG2, FLT1, FOLR2, FRYL, FSCB, GAB1, GABRA4, GABRP, GH2,
GOLGA6L1, GPHB5, GPR32, GPX5, GTF3C3, HECW1, HIST1H3B, HLA-A, HRAS, HS3ST1, HS6ST1,
HSPD1, IDH1, JAK2, KDM5B, KIAA0528, KRT15, KRT38, KRTAP21-1, KRTAP4-5, KRTAP4-7,
KRTAP5-4, KRTAP5-5, LAMA4, LATS1, LMF1, LPAR4, LPPR4, LRRFIP1, LUM, LYST, MAP2K1,
MARCH1, MARCO, MB21D2, MEGF10, MMP16, MORC1, MRE11A, MTMR3, MUC12, MUC17, MUC2, MUC20,
NBPF10, NBPF20, NEK1, NFE2L2, NLRP4, NOTCH2, NRK, NUP93, OBSCN, OR11H1, OR2B11, OR2M4,
OR4Q3, OR5D13, OR8I2, OXSM, PIK3R1, PPP2R5C, PRAME, PRF1, PRG4, PRPF19, PTH2, PTPRC,
PTPRJ, RAC1, RAD50, RBM12, RGPD3, RGS22, ROR1, RP11-671M22.1, RP13-996F3.4, RP1L1,
RSBN1L, RYR3, SAMD3, SCN3A, SEC31A, SF1, SF3B1, SLC25A2, SLC44A1, SLC4A11, SMAD2, SPTA1,
ST6GAL2, STK11, SZT2, TAF1L, TAX1BP1, TBP, TGFBI, TIF1, TMEM14B, TMEM74, TPTE, TRAPPC8,
TRPS1, TXNDC6, USP32, UTP20, VASN, VPS72, WASH3P, WWTR1, XPO1, ZFHX4, ZMIZ1, ZNF167,
ZNF436, ZNF492, ZNF598, ZRSR2, ABL1, AKT2, AKT3, ARAF, ARFRP1, ARID2, ASXL1, ATR, ATRX,
AURKA, AURKB, AXL, BAP1, BARD1, BCL2, BCL2L2, BCL6, BCOR, BCORL1, BLM, BRIP1, BTK, CARD11,
CBFB, CBL, CCND1, CCND2, CCND3, CCNE1, CD79A, CD79B, CD73, CDK12, CDK4, CDK6, CDK8,
CDKN1B, CDKN2B, CDKN2C, CEBPA, CHEK1, CIC, CRKL, CRLF2, CSF1R, CTCF, CTNNA1, DAXX, DDR2,
DOT1L, EMSY (C11orf30), EP300, EPHA3, EPHB1, ERBB4, ERG, ESR1, EZH2, FAM123B (WTX),
FAM46C, FANCA, FANCC, FANCD2, FANCE, FANCF, FANCG, FANCL, FGF10, FGF14, FGF19, FGF23,
FGF3, FGF4, FGF6, FGFR1, FGFR2, FGFR3, FGFR4, FLT3, FLT4, FOXL2, GATA1, GATA2, GATA3, GID4
(C17orf39), GNA11, GNA13, GNAQ, GNAS, GPR124, GSK3B, HGF, IDH1, IDH2, IGF1R, IKBKE, IKZF1,
IL7R, IRF4, IRS2, JAK1, JAK3, JUN, KAT6A (MYST3), KDM5A, KDM5C, KDM6A, KDR, KEAP1, KLHL6,
MAP2K2, MAP2K4, MAP3K1, MCL1, MDM2, MDM4, MED12, MEF2B, MEN1, MET, MITF, MLH1, MLL, MLL2,
MPL, MSH2, MSH6, MTOR, MUTYH, MYC, MYCL1, MYCN, MYD88, NF1, NFKBIA, NKX2-1, NOTCH1, NPM1,
NRAS, NTRK1, NTRK2, NTRK3, PAK3, PALB2, PAX5, PBRM1, PDGFRA, PDGFRB, PDK1, PIK3CG,
PIK3R2, PPP2R1A, PRDM1, PRKAR1A, PRKDC, PTCH1, PTPN11, RAD51, RAF1, RARA, RET, RICTOR,
RNF43, RPTOR, RUNX1, SMARCA4, SMARCB1, SMO, SOCS1, SOX10, SOX2, SPEN, SPOP, SRC, STAT4,
SUFU, TET2, TGFBR2, TNFAIP3, TNFRSF14, TOP1, TP53, TSC1, TSC2, TSHR, VHL, WISP3, WT1,
ZNF217, ZNF703, and combinations thereof (Su et al., J Mol Diagn 2011, 13:74-84;
DOI:10.1016/j.jmoldx.2010.11.010; and Abaan et al., "The Exomes of the NCI-60 Panel: A Genomic
Resource for Cancer Biology and Systems Pharmacology", Cancer Research, July 15, 2013, each of
which is incorporated herein by reference in its entirety). In some embodiments, the duplication
is a chromosome 1p ("Chr1p") duplication associated with breast cancer. In some embodiments, one
or more polymorphisms or mutations are in BRAF, such as the V600E mutation. In some embodiments,
one or more polymorphisms or mutations are in K-ras. In some embodiments, there is a combination
of one or more polymorphisms or mutations in K-ras and APC. In some embodiments, there is a
combination of one or more polymorphisms or mutations in K-ras and p53. In some embodiments,
there is a combination of one or more polymorphisms or mutations in APC and p53. In some
embodiments, there is a combination of one or more polymorphisms or mutations in K-ras, APC, and
p53. In some embodiments, there is a combination of one or more polymorphisms or mutations in
K-ras and EGFR. Exemplary polymorphisms or mutations are in one or more of the following
microRNAs: miR-15a, miR-16-1, miR-23a, miR-23b, miR-24-1, miR-24-2, miR-27a, miR-27b, miR-29b-2,
miR-29c, miR-146, miR-155, miR-221, miR-222, and miR-223 (Calin et al., "A microRNA signature
associated with prognosis and progression in chronic lymphocytic leukemia", New England Journal
of Medicine 353:1793-801, 2005, which is incorporated herein by reference in its entirety).
In some embodiments, the deletion is a deletion of at least 0.01 kb, 0.1 kb, 1 kb, 10 kb, 100 kb,
1 mb, 2 mb, 3 mb, 5 mb, 10 mb, 15 mb, 20 mb, 30 mb, or 40 mb. In some embodiments, the deletion is
a deletion of between 1 kb and 40 mb, such as 1 kb to 100 kb, 100 kb to 1 mb, 1 to 5 mb, 5 to
10 mb, 10 to 15 mb, 15 to 20 mb, 20 to 25 mb, 25 to 30 mb, or 30 to 40 mb.
In some embodiments, the duplication is a duplication of at least 0.01 kb, 0.1 kb, 1 kb, 10 kb,
100 kb, 1 mb, 2 mb, 3 mb, 5 mb, 10 mb, 15 mb, 20 mb, 30 mb, or 40 mb. In some embodiments, the
duplication is a duplication of between 1 kb and 40 mb, such as 1 kb to 100 kb, 100 kb to 1 mb,
1 to 5 mb, 5 to 10 mb, 10 to 15 mb, 15 to 20 mb, 20 to 25 mb, 25 to 30 mb, or 30 to 40 mb.
In some embodiments, the tandem repeat is a repeat of 2 to 60 nucleotides, such as a repeat of
2 to 6, 7 to 10, 10 to 20, 20 to 30, 30 to 40, 40 to 50, or 50 to 60 nucleotides. In some
embodiments, the tandem repeat is a repeat of 2 nucleotides (a dinucleotide repeat). In some
embodiments, the tandem repeat is a repeat of 3 nucleotides (a trinucleotide repeat).
In some embodiments, the polymorphism or mutation is prognostic. Exemplary prognostic mutations
include K-ras mutations, such as K-ras mutations that are indicators of post-operative disease
recurrence in colorectal cancer (Ryan et al., "A prospective study of circulating mutant KRAS2 in
the serum of patients with colorectal neoplasia: strong prognostic indicator in postoperative
follow up," Gut 52:101-108, 2003; and Lecomte T et al., "Detection of free-circulating
tumor-associated DNA in plasma of colorectal cancer patients and its association with prognosis,"
Int J Cancer 100:542-548, 2002, each of which is incorporated herein by reference in its entirety).
In some embodiments, the polymorphism or mutation is associated with an altered response to a
particular treatment (such as increased or decreased efficacy or side effects). Examples include
K-ras mutations, which are associated with a decreased response to EGFR-based treatments in
non-small cell lung cancer (Wang et al., "Potential clinical significance of a plasma-based KRAS
mutation analysis in patients with advanced non-small cell lung cancer," Clin Cancer Res
16:1324-1330, 2010, which is incorporated herein by reference in its entirety).
K-ras is an oncogene that is activated in many cancers. K-ras cfDNA mutations have been identified
in pancreatic, lung, colorectal, bladder, and gastric cancers (Fleischhacker and Schmidt,
"Circulating nucleic acids (CNAs) and cancer - a survey," Biochim Biophys Acta 1775:181-232, 2007,
which is incorporated herein by reference in its entirety).
p53 is a tumor suppressor gene that is mutated in many cancers and contributes to tumor progression
(Levine and Oren, "The first 30 years of p53: growing ever more complex," Nature Rev Cancer
9:749-758, 2009, which is incorporated herein by reference in its entirety). Many different codons
can be mutated, such as Ser249. p53 cfDNA mutations have been identified in breast, lung, ovarian,
bladder, gastric, pancreatic, colorectal, bowel, and hepatocellular cancers (Fleischhacker and
Schmidt, "Circulating nucleic acids (CNAs) and cancer - a survey," Biochim Biophys Acta
1775:181-232, 2007, which is incorporated herein by reference in its entirety).
BRAF is an oncogene downstream of Ras. BRAF mutations have been identified in glioma, melanoma,
thyroid, and lung cancers (Dias-Santagata et al., "BRAF V600E mutations are common in pleomorphic
xanthoastrocytoma: diagnostic and therapeutic implications," PLoS ONE 2011;6:e17948, 2011;
Shinozaki et al., "Utility of circulating B-RAF DNA mutation in serum for monitoring melanoma
patients receiving biochemotherapy," Clin Canc Res 13:2068-2074, 2007; and Board et al.,
"Detection of BRAF mutations in the tumor and serum of patients enrolled in the AZD6244
(ARRY-142886) advanced melanoma phase II study," Brit J Canc 2009;101:1724-1730, each of which is
incorporated herein by reference in its entirety). The BRAF V600E mutation occurs, for example, in
melanoma tumors and is more common in advanced stages. The V600E mutation has been detected in
cfDNA.
EGFR contributes to cell proliferation and is misregulated in many cancers (Downward J.,
"Targeting RAS signalling pathways in cancer therapy," Nature Rev Cancer 3:11-22, 2003; and Levine
and Oren, "The first 30 years of p53: growing ever more complex," Nature Rev Cancer 9:749-758,
2009, each of which is incorporated herein by reference in its entirety). Exemplary EGFR mutations
include those in exons 18-21, which have been identified in lung cancer patients. EGFR cfDNA
mutations have been identified in lung cancer patients (Jia et al., "Prediction of epidermal
growth factor receptor mutations in the plasma/pleural effusion to efficacy of gefitinib treatment
in advanced non-small cell lung cancer," J Canc Res Clin Oncol 2010;136:1341-1347, 2010, which is
incorporated herein by reference in its entirety).
Exemplary polymorphisms or mutations associated with breast cancer include loss of heterozygosity
(LOH) at microsatellites (Kohler et al., "Levels of plasma circulating cell free nuclear and
mitochondrial DNA as potential biomarkers for breast tumors," Mol Cancer 8:doi:
10.1186/1476-4598-8-105, 2009, which is incorporated herein by reference in its entirety), p53
mutations (such as mutations in exons 5-8) (Garcia et al., "Extracellular tumor DNA in plasma and
overall survival in breast cancer patients," Genes, Chromosomes & Cancer 45:692-701, 2006, which
is incorporated herein by reference in its entirety), HER2 (Sorensen et al., "Circulating HER2 DNA
after trastuzumab treatment predicts survival and response in breast cancer," Anticancer Res
30:2463-2468, 2010, which is incorporated herein by reference in its entirety), and PIK3CA, MED1,
and GAS6 polymorphisms or mutations (Murtaza et al., "Non-invasive analysis of acquired resistance
to cancer therapy by sequencing of plasma DNA," Nature 2013; doi:10.1038/nature12065, 2013, which
is incorporated herein by reference in its entirety).
Increased cfDNA levels and LOH are associated with decreased overall and disease-free survival.
p53 mutations (exons 5-8) are associated with decreased overall survival. Decreased circulating
HER2 cfDNA levels are associated with a better response to HER2-targeted treatment in subjects
with HER2-positive breast tumors. An activating mutation in PIK3CA, a truncation of MED1, and a
splicing mutation in GAS6 result in resistance to treatment.
Exemplary polymorphisms or mutations associated with colorectal cancer include p53, APC, K-ras,
and thymidylate synthase mutations and p16 gene methylation (Wang et al., "Molecular detection of
APC, K-ras, and p53 mutations in the serum of colorectal cancer patients as circulating
biomarkers," World J Surg 28:721-726, 2004; Ryan et al., "A prospective study of circulating
mutant KRAS2 in the serum of patients with colorectal neoplasia: strong prognostic indicator in
postoperative follow up," Gut 52:101-108, 2003; Lecomte et al., "Detection of free-circulating
tumor-associated DNA in plasma of colorectal cancer patients and its association with prognosis,"
Int J Cancer 100:542-548, 2002; Schwarzenbach et al., "Molecular analysis of the polymorphisms of
thymidylate synthase on cell-free circulating DNA in blood of patients with advanced colorectal
carcinoma," Int J Cancer 127:881-888, 2009, each of which is incorporated herein by reference in
its entirety). Post-operative detection of K-ras mutations in serum is a strong predictor of
disease recurrence. Detection of K-ras mutations and p16 gene methylation is associated with
decreased survival and increased disease recurrence. Detection of K-ras, APC, and/or p53 mutations
is associated with recurrence and/or metastases. Polymorphisms (including LOH, SNPs, variable
number tandem repeats, and deletions) in thymidylate synthase (the target of fluoropyrimidine-based
chemotherapies), detected using cfDNA, may be associated with treatment response.
Exemplary polymorphisms or mutations associated with lung cancer (such as non-small cell lung
cancer) include K-ras mutations (such as mutations in codon 12) and EGFR mutations. Exemplary
prognostic mutations include EGFR mutations (exon 19 deletion or exon 21 mutation) associated with
increased overall and progression-free survival, and K-ras mutations (in codons 12 and 13)
associated with decreased progression-free survival (Jia et al., "Prediction of epidermal growth
factor receptor mutations in the plasma/pleural effusion to efficacy of gefitinib treatment in
advanced non-small cell lung cancer," J Canc Res Clin Oncol 136:1341-1347, 2010; Wang et al.,
"Potential clinical significance of a plasma-based KRAS mutation analysis in patients with advanced
non-small cell lung cancer," Clin Canc Res 16:1324-1330, 2010, each of which is incorporated herein
by reference in its entirety). Exemplary polymorphisms or mutations indicative of response to
treatment include EGFR mutations (exon 19 deletion or exon 21 mutation) that improve response to
treatment and K-ras mutations (codons 12 and 13) that decrease response to treatment. A
resistance-conferring mutation in EGFR has been identified (Murtaza et al., "Non-invasive analysis
of acquired resistance to cancer therapy by sequencing of plasma DNA," Nature
doi:10.1038/nature12065, 2013, which is incorporated herein by reference in its entirety).
Exemplary polymorphisms or mutations associated with melanoma (such as uveal melanoma) include
those in GNAQ, GNA11, BRAF, and p53. Exemplary GNAQ and GNA11 mutations include R183 and Q209
mutations. Q209 mutations in GNAQ or GNA11 are associated with metastases to bone. BRAF V600E
mutations can be detected in patients with metastatic/advanced-stage melanoma. BRAF V600E is an
indicator of invasive melanoma. The presence of the BRAF V600E mutation after chemotherapy is
associated with a non-response to the treatment.
Exemplary polymorphisms or mutations associated with pancreatic cancer include those in K-ras and
p53 (such as p53 Ser249). p53 Ser249 is also associated with hepatitis B infection and
hepatocellular carcinoma, as well as ovarian cancer and non-Hodgkin's lymphoma.
Even polymorphisms or mutations that are present at low frequency in a sample can be detected with
the methods of the invention. For example, by performing 10 million sequencing reads, a
polymorphism or mutation that is present at a frequency of one in a million can be observed 10
times. If desired, the number of sequencing reads can be altered depending on the level of
sensitivity desired. In some embodiments, a sample is re-analyzed, or another sample from the
subject is analyzed using a greater number of sequencing reads, to improve the sensitivity. For
example, if no polymorphisms or mutations, or only a small number (such as 1, 2, 3, 4, or 5) of
polymorphisms or mutations, associated with cancer or an increased risk for cancer are detected,
then the sample is re-analyzed or another sample is tested.
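The arithmetic behind choosing a read count for a desired sensitivity can be sketched as follows. This is a minimal illustration, not part of the described method: the function names are hypothetical, the variant frequency is written as "one in N" so the arithmetic stays in integers, every read is assumed to cover the locus of interest, and sequencing error is ignored. The detection probability uses a simple Poisson approximation, which is an added assumption.

```python
import math


def expected_observations(num_reads: int, one_in_n: int) -> float:
    """Expected number of reads carrying a variant present at a frequency of 1 in `one_in_n`."""
    return num_reads / one_in_n


def reads_needed(one_in_n: int, min_observations: int) -> int:
    """Reads required so that the variant is expected to be seen `min_observations` times."""
    return one_in_n * min_observations


def probability_of_detection(num_reads: int, one_in_n: int) -> float:
    """Poisson approximation of the chance of seeing the variant at least once."""
    return 1.0 - math.exp(-expected_observations(num_reads, one_in_n))


if __name__ == "__main__":
    # 10 million reads and a variant at one in a million
    print(expected_observations(10_000_000, 1_000_000))
    print(reads_needed(1_000_000, 10))
```

Raising `min_observations` corresponds to asking for a higher level of sensitivity, which increases the number of reads proportionally.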
In some embodiments, multiple polymorphisms or mutations are required for cancer or for metastatic
cancer. In such cases, screening for multiple polymorphisms or mutations improves the ability to
accurately diagnose cancer or metastatic cancer. In some embodiments, when a subject has a subset
of the multiple polymorphisms or mutations that are required for cancer or for metastatic cancer,
the subject may be re-screened later to see whether the subject has acquired additional mutations.
In some embodiments in which multiple polymorphisms or mutations are required for cancer or for
metastatic cancer, the frequency of each polymorphism or mutation can be compared to see whether
they occur at similar frequencies. For example, if two mutations (denoted "A" and "B") are required
for cancer, some cells will have neither, some cells will have A, some will have B, and some will
have both A and B. If A and B are observed at similar frequencies, the subject is more likely to
have some cells with both A and B. If A and B are observed at dissimilar frequencies, the subject
is more likely to have different cell populations.
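The frequency comparison for mutations A and B can be illustrated with a short sketch. The function name and the 20% relative tolerance are hypothetical choices made only for illustration; the text does not specify how "similar" is to be decided, and a real analysis would account for measurement noise.

```python
def frequencies_are_similar(freq_a: float, freq_b: float, rel_tolerance: float = 0.2) -> bool:
    """Return True when two observed mutation frequencies differ by at most
    `rel_tolerance` relative to the larger one (an illustrative cut-off)."""
    if freq_a <= 0 or freq_b <= 0:
        return False  # a mutation that was not observed cannot be co-occurring
    larger, smaller = max(freq_a, freq_b), min(freq_a, freq_b)
    return (larger - smaller) / larger <= rel_tolerance


def interpret(freq_a: float, freq_b: float) -> str:
    """Map the comparison onto the two interpretations described in the text."""
    if frequencies_are_similar(freq_a, freq_b):
        return "more likely some cells carry both A and B"
    return "more likely A and B are in different cell populations"


if __name__ == "__main__":
    print(interpret(0.050, 0.048))
    print(interpret(0.050, 0.010))
```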
In some embodiments in which multiple polymorphisms or mutations are required for cancer or for
metastatic cancer, the number or identity of such polymorphisms or mutations that are present in
the subject can be used to predict how likely or how soon the subject is to have the disease or
disorder. In some embodiments in which polymorphisms or mutations tend to occur in a certain order,
the subject may be tested periodically to see whether the subject has acquired the other
polymorphisms or mutations.
In some embodiments, determining the presence or absence of multiple polymorphisms or mutations
(such as 2, 3, 4, 5, 8, 10, 12, 15, or more) increases the sensitivity and/or specificity of the
determination of the presence or absence of a disease or disorder such as cancer, or of an
increased risk for a disease or disorder such as cancer.
In some embodiments, the polymorphism or mutation is detected directly. In some embodiments, the
polymorphism or mutation is detected indirectly by detecting one or more sequences (e.g., a
polymorphic locus such as a SNP) that are linked to the polymorphism or mutation.
Exemplary nucleic acid changes
In some embodiments, there is a change in the integrity of RNA or DNA (such as a change in the size
of fragmented cfRNA or cfDNA, or a change in nucleosome composition) that is associated with a
disease or disorder such as cancer, or with an increased risk for a disease or disorder such as
cancer. In some embodiments, there is a change in the methylation pattern of RNA or DNA that is
associated with a disease or disorder such as cancer, or with an increased risk for a disease or
disorder such as cancer (e.g., hypermethylation of tumor suppressor genes). For example,
methylation of the CpG islands in the promoter region of tumor suppressor genes has been suggested
to trigger local gene silencing. Aberrant methylation of the p16 tumor suppressor gene occurs in
subjects with liver, lung, and breast cancer. Other frequently methylated tumor suppressor genes,
including APC, Ras association domain family protein 1A (RASSF1A), glutathione S-transferase P1
(GSTP1), and DAPK, have been detected in various types of cancer, for example nasopharyngeal
carcinoma, colorectal cancer, lung cancer, esophageal cancer, prostate cancer, bladder cancer,
melanoma, and acute leukemia. Methylation of certain tumor suppressor genes, such as p16, has been
described as an early event in cancer formation, and thus is useful for early cancer screening.
In some embodiments, bisulfite conversion or a non-bisulfite-based strategy using
methylation-sensitive restriction enzyme digestion is used to determine the methylation pattern
(Hung et al., J Clin Pathol 62:308-313, 2009, which is incorporated herein by reference in its
entirety). On bisulfite conversion, methylated cytosines remain as cytosines, while unmethylated
cytosines are converted to uracils. Methylation-sensitive restriction enzymes (e.g., BstUI) cleave
unmethylated DNA sequences at specific recognition sites (e.g., 5'-CG^CG-3' for BstUI), while
methylated sequences remain intact. In some embodiments, the intact methylated sequences are
detected. In some embodiments, stem-loop primers are used to selectively amplify restriction
enzyme-digested unmethylated fragments without co-amplifying the non-enzyme-digested methylated
DNA.
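The two readouts described above can be mimicked in silico. The sketch below is an illustrative simplification, not a laboratory protocol: the function names are hypothetical, methylation is supplied as a set of 0-based cytosine positions, and the BstUI check treats a CGCG site as cleavable only when neither of its cytosines is methylated.

```python
from __future__ import annotations


def bisulfite_convert(sequence: str, methylated_positions: set[int]) -> str:
    """Convert unmethylated cytosines to uracil; methylated cytosines stay as C."""
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def bstui_cuts(sequence: str, methylated_positions: set[int]) -> bool:
    """Return True if at least one CGCG site is fully unmethylated (simplified model)."""
    seq = sequence.upper()
    start = seq.find("CGCG")
    while start != -1:
        # The two cytosines of the site sit at offsets 0 and 2.
        if start not in methylated_positions and start + 2 not in methylated_positions:
            return True
        start = seq.find("CGCG", start + 1)
    return False


if __name__ == "__main__":
    print(bisulfite_convert("ACGTCG", {1}))   # cytosine at position 1 is methylated
    print(bstui_cuts("AACGCGTT", set()))      # unmethylated site
    print(bstui_cuts("AACGCGTT", {2}))        # methylated site stays intact
```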
Exemplary changes in mRNA splicing
In some embodiments, a change in mRNA splicing is associated with a disease or disorder such as
cancer, or with an increased risk for a disease or disorder such as cancer. In some embodiments,
the change in mRNA splicing is in one or more of the following nucleic acids associated with cancer
or an increased risk for cancer: DNMT3B, BRCA1, KLF6, Ron, or Gemin5. In some embodiments, the
detected mRNA splice variant is associated with a disease or disorder, such as cancer. In some
embodiments, multiple mRNA splice variants are produced by healthy cells (such as non-cancerous
cells), but a change in the relative amounts of the mRNA splice variants is associated with a
disease or disorder, such as cancer. In some embodiments, the change in mRNA splicing is due to a
change in the mRNA sequence (such as a mutation in a splice site), a change in splicing factor
levels, a change in the amount of available splicing factor (such as a decrease in the amount of
available splicing factor due to the binding of a splicing factor to a repeat), altered splicing
regulation, or the tumor microenvironment.
The splicing reaction is carried out by a multi-protein/RNA complex called the spliceosome
(Fackenthal and Godley, Disease Models & Mechanisms 1:37-42, 2008, doi:10.1242/dmm.000331, which
is incorporated herein by reference in its entirety). The spliceosome recognizes intron-exon
boundaries and removes intervening introns via two transesterification reactions that result in
the ligation of two adjacent exons. The fidelity of this reaction must be very high, because if the
ligation occurs incorrectly, normal protein-coding potential may be compromised. For example, in
cases where exon skipping preserves the reading frame of the triplet codons that specify the
identity and order of amino acids during translation, the alternatively spliced mRNA may specify a
protein that lacks crucial amino acid residues. More commonly, exon skipping disrupts the
translational reading frame, resulting in premature stop codons. These mRNAs are typically degraded
by at least 90% through a process known as nonsense-mediated mRNA decay, which reduces the
likelihood that such defective messages will accumulate to generate truncated protein products. If
mis-spliced mRNAs escape this pathway, then truncated, mutated, or unstable proteins are produced.
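Whether skipping an exon preserves or disrupts the reading frame depends on whether the exon length is a multiple of three. A minimal sketch of that check follows; the function names are hypothetical, and the model ignores partial codons shared across exon junctions beyond their effect on length.

```python
def skipping_preserves_frame(exon_length: int) -> bool:
    """An exon whose length is a multiple of 3 can be skipped without shifting the frame."""
    if exon_length <= 0:
        raise ValueError("exon_length must be a positive number of nucleotides")
    return exon_length % 3 == 0


def describe_exon_skip(exon_length: int) -> str:
    """Summarize the expected consequence of skipping an exon of the given length."""
    if skipping_preserves_frame(exon_length):
        return f"in-frame loss of {exon_length // 3} codons"
    return "frameshift, likely premature stop codon"


if __name__ == "__main__":
    print(describe_exon_skip(120))
    print(describe_exon_skip(121))
```

An in-frame skip corresponds to the first case in the paragraph (a protein missing residues); a frameshift corresponds to the more common case that is usually cleared by nonsense-mediated decay.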
Alternative splicing is a means of expressing several or many different transcripts from the same
genomic DNA, and results from the inclusion of a subset of the available exons for a particular
protein. By excluding one or more exons, certain protein domains may be lost from the encoded
protein, which can result in a loss or gain of protein function. Several types of alternative
splicing have been described: exon skipping; alternative 5' or 3' splice sites; mutually exclusive
exons; and, much more rarely, intron retention. Others have compared the amount of alternative
splicing in cancer versus normal cells using a bioinformatics approach and determined that cancers
exhibit lower levels of alternative splicing than normal cells. Furthermore, the distribution of
the types of alternative splicing events differs between cancer and normal cells. Cancer cells
exhibit less exon skipping, but more alternative 5' and 3' splice site selection and intron
retention than normal cells. When the phenomenon of exonization (the use of sequences as exons that
are used predominantly by other tissues as introns) was examined, genes associated with exonization
in cancer cells were preferentially associated with mRNA processing, indicating a direct link
between cancer cells and the generation of aberrant mRNA splice forms.
Exemplary changes in DNA or RNA levels
In some embodiments, there is a change in the total amount or concentration of one or more types of
DNA (such as cfDNA, cf mDNA, cf nDNA, cellular DNA, or mitochondrial DNA) or RNA (cfRNA, cellular
RNA, cytoplasmic RNA, coding cytoplasmic RNA, non-coding cytoplasmic RNA, mRNA, miRNA,
mitochondrial RNA, rRNA, or tRNA). In some embodiments, there is a change in the amount or
concentration of one or more specific DNA (such as cfDNA, cf mDNA, cf nDNA, cellular DNA, or
mitochondrial DNA) or RNA (cfRNA, cellular RNA, cytoplasmic RNA, coding cytoplasmic RNA, non-coding
cytoplasmic RNA, mRNA, miRNA, mitochondrial RNA, rRNA, or tRNA) molecules. In some embodiments, one
allele is expressed more than another allele of a locus of interest. Exemplary miRNAs are short
RNA molecules of 20-22 nucleotides that regulate gene expression. In some embodiments, there is a
change in the transcriptome, such as a change in the identity or amount of one or more RNA
molecules.
In some embodiments, an increase in the total amount or concentration of cfDNA or cfRNA is
associated with a disease or disorder such as cancer, or with an increased risk for a disease or
disorder such as cancer. In some embodiments, the total concentration of a type of DNA (such as
cfDNA, cf mDNA, cf nDNA, cellular DNA, or mitochondrial DNA) or RNA (cfRNA, cellular RNA,
cytoplasmic RNA, coding cytoplasmic RNA, non-coding cytoplasmic RNA, mRNA, miRNA, mitochondrial
RNA, rRNA, or tRNA) is increased by at least 2, 3, 4, 5, 6, 7, 8, 9, 10-fold, or more compared to
the total concentration of that type of DNA or RNA in healthy (such as non-cancerous) subjects. In
some embodiments, a total concentration of cfDNA of 75 to 100 ng/mL, 100 to 150 ng/mL, 150 to
200 ng/mL, 200 to 300 ng/mL, 300 to 400 ng/mL, 400 to 600 ng/mL, 600 to 800 ng/mL, or 800 to
1,000 ng/mL, inclusive, or a total concentration of cfDNA of more than 100 ng/mL, such as more
than 200, 300, 400, 500, 600, 700, 800, 900, or 1,000 ng/mL, is indicative of cancer, an increased
risk for cancer, an increased risk of a tumor being malignant rather than benign, a decreased
probability of the cancer going into remission, or a worse prognosis for the cancer. In some
embodiments, the amount of a type of DNA (such as cfDNA, cf mDNA, cf nDNA, cellular DNA, or
mitochondrial DNA) or RNA (cfRNA, cellular RNA, cytoplasmic RNA, coding cytoplasmic RNA, non-coding
cytoplasmic RNA, mRNA, miRNA, mitochondrial RNA, rRNA, or tRNA) having one or more
polymorphisms/mutations (such as deletions or duplications) associated with a disease or disorder
such as cancer, or with an increased risk for a disease or disorder such as cancer, is at least 2%,
3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 14%, 16%, 18%, 20%, or 25% of the total amount of that
type of DNA or RNA. In some embodiments, at least 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%,
14%, 16%, 18%, 20%, or 25% of the total amount of a type of DNA (such as cfDNA, cf mDNA, cf nDNA,
cellular DNA, or mitochondrial DNA) or RNA (cfRNA, cellular RNA, cytoplasmic RNA, coding
cytoplasmic RNA, non-coding cytoplasmic RNA, mRNA, miRNA, mitochondrial RNA, rRNA, or tRNA) has a
particular polymorphism or mutation (such as a deletion or duplication) associated with a disease
or disorder such as cancer, or with an increased risk for a disease or disorder such as cancer.
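The three quantities discussed in this paragraph (fold increase over a healthy reference, comparison against a concentration threshold, and the percentage of molecules carrying a mutation) are simple ratios. The sketch below is illustrative only: the function names are hypothetical, the healthy reference value in the usage example is an invented placeholder, and the default 100 ng/mL threshold is just one of the values listed in the text.

```python
def fold_increase(sample_ng_per_ml: float, healthy_ng_per_ml: float) -> float:
    """Ratio of the sample concentration to a healthy reference concentration."""
    if healthy_ng_per_ml <= 0:
        raise ValueError("healthy reference concentration must be positive")
    return sample_ng_per_ml / healthy_ng_per_ml


def exceeds_threshold(sample_ng_per_ml: float, threshold_ng_per_ml: float = 100.0) -> bool:
    """True when the total cfDNA concentration is above the chosen threshold."""
    return sample_ng_per_ml > threshold_ng_per_ml


def mutant_percent(mutant_amount: float, total_amount: float) -> float:
    """Percentage of a DNA or RNA type that carries a given polymorphism or mutation."""
    if total_amount <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * mutant_amount / total_amount


if __name__ == "__main__":
    # 30 ng/mL is a placeholder reference, not a value taken from the text.
    print(fold_increase(150.0, 30.0))
    print(exceeds_threshold(150.0))
    print(mutant_percent(3.0, 100.0))
```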
In some embodiments, cfDNA is packaged.In some embodiments, cfDNA is not packaged.
In some embodiments, the fraction of tumor DNA out of total DNA (such as the fraction of tumor
cfDNA out of total cfDNA, or the fraction of tumor cfDNA with a particular mutation out of total
cfDNA) is determined. In some embodiments, the fraction of tumor DNA may be determined for a
plurality of mutations, where the mutations can be single nucleotide variants, copy number
variants, differential methylation, or combinations thereof. In some embodiments, the average tumor
fraction calculated for the one mutation or set of mutations with the highest calculated tumor
fraction is taken as the actual tumor fraction in the sample. In some embodiments, the average
tumor fraction calculated for all of the mutations is taken as the actual tumor fraction in the
sample. In some embodiments, this tumor fraction is used to stage a cancer (since higher tumor
fractions can be associated with more advanced stages of cancer). In some embodiments, the tumor
fraction is used to size a cancer, since larger tumors may be correlated with the fraction of tumor
DNA in the plasma. In some embodiments, the tumor fraction is used to size the proportion of a
tumor that has a single mutation or a plurality of mutations, since there may be a correlation
between the tumor fraction measured in a plasma sample and the size of the tissue with a given
mutation genotype. For example, the size of the tissue with a given mutation genotype may be
correlated with the fraction of tumor DNA that is calculated by focusing on that particular
mutation.
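The two ways of summarizing per-mutation tumor fractions described above (averaging over all mutations, or averaging over the mutation or set of mutations with the highest calculated fraction) can be sketched as follows. The function name and the `top_k` parameter are hypothetical; how many top mutations to include is not specified in the text.

```python
from typing import Optional, Sequence


def mean_tumor_fraction(per_mutation_fractions: Sequence[float],
                        top_k: Optional[int] = None) -> float:
    """Average tumor fraction over all mutations, or over the `top_k` highest ones."""
    if not per_mutation_fractions:
        raise ValueError("at least one per-mutation tumor fraction is required")
    ranked = sorted(per_mutation_fractions, reverse=True)
    chosen = ranked if top_k is None else ranked[:max(1, top_k)]
    return sum(chosen) / len(chosen)


if __name__ == "__main__":
    fractions = [0.02, 0.04, 0.06]  # illustrative per-mutation estimates
    print(mean_tumor_fraction(fractions))           # average over all mutations
    print(mean_tumor_fraction(fractions, top_k=1))  # highest single mutation
```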
Exemplary database
The invention also features databases containing one or more results from a method of the
invention. For example, the database may include records with any of the following information for
one or more subjects: any polymorphisms/mutations (such as copy number variations) identified; any
known association of the polymorphisms/mutations with a disease or disorder or with an increased
risk for a disease or disorder; the effect of the polymorphisms/mutations on the expression or
activity level of the encoded mRNA or protein; the fraction of DNA, RNA, or cells associated with a
disease or disorder (such as DNA, RNA, or cells having a polymorphism/mutation associated with a
disease or disorder) out of the total DNA, RNA, or cells in the sample; the source of the sample
used to identify the polymorphisms/mutations (such as a blood sample or a sample from a particular
tissue); the number of diseased cells; results from later repeating the test (such as repeating
the test to monitor the progression or remission of the disease or disorder); results of other
tests for the disease or disorder; the type of disease or disorder diagnosed; treatments
administered; response to the treatments; side effects of the treatments; symptoms (such as
symptoms associated with the disease or disorder); the length and number of remissions; the length
of survival (such as the length of time from the initial test until death and/or from diagnosis
until death); the cause of death; and combinations thereof.
In some embodiments, the database includes records with any of the following information for one or
more subjects: any polymorphisms/mutations (such as copy number variations) identified; any known
association of the polymorphisms/mutations with a disease or disorder or with an increased risk for
a disease or disorder; the effect of the polymorphisms/mutations on the expression or activity
level of the encoded mRNA or protein; the fraction of DNA, RNA, or cells associated with a disease
or disorder (such as DNA, RNA, or cells having a polymorphism/mutation associated with a disease or
disorder) out of the total DNA, RNA, or cells in the sample; the source of the sample used to
identify the polymorphisms/mutations (such as a blood sample or a sample from a particular tissue);
the number of diseased cells; results from later repeating the test (such as repeating the test to
monitor the progression or remission of the disease or disorder); results of other tests for the
disease or disorder; the type of disease or disorder diagnosed; treatments administered; response
to the treatments; side effects of the treatments; symptoms (such as symptoms associated with the
disease or disorder); the length and number of remissions; the length of survival (such as the
length of time from the initial test until death and/or from diagnosis until death); the cause of
death; and combinations thereof. In some embodiments, the response to treatment includes any of the
following: reducing or stabilizing the size of a tumor (e.g., a benign or cancerous tumor), slowing
or preventing an increase in the size of a tumor, increasing the disease-free survival time between
the disappearance of a tumor and its reappearance, preventing an initial or subsequent occurrence
of a tumor, reducing or stabilizing an adverse symptom associated with a tumor, or combinations
thereof. In some embodiments, the results of one or more other tests for a disease or disorder such
as cancer are included, such as results from screening tests, medical imaging, or microscopic
examination of a tissue sample.
In one such aspect, the invention features an electronic database including at least 5, 10, 10^2,
10^3, 10^4, 10^5, 10^6, 10^7, 10^8, or more records. In some embodiments, the database has records
for at least 5, 10, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, or more different subjects.
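One possible shape for such a record is sketched below as an in-memory Python structure. This is only an illustration of the listed fields under assumed names; the class `SubjectRecord`, its field names, and the helper `count_distinct_subjects` are hypothetical, and the text does not prescribe any particular schema or storage technology.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SubjectRecord:
    """One database record for one test of one subject (illustrative field names)."""
    subject_id: str
    mutations: List[str] = field(default_factory=list)   # polymorphisms/mutations identified
    sample_source: Optional[str] = None                  # e.g. blood or a particular tissue
    diagnosis: Optional[str] = None                      # type of disease or disorder diagnosed
    treatments: List[str] = field(default_factory=list)  # treatments administered
    treatment_response: Optional[str] = None
    remission_count: int = 0
    survival_days: Optional[int] = None                  # from initial test or diagnosis
    cause_of_death: Optional[str] = None


def count_distinct_subjects(records: List[SubjectRecord]) -> int:
    """Number of different subjects represented in a collection of records."""
    return len({record.subject_id for record in records})


if __name__ == "__main__":
    database = [
        SubjectRecord("S1", mutations=["BRAF V600E"], sample_source="blood"),
        SubjectRecord("S1", mutations=["BRAF V600E"], sample_source="blood"),  # repeat test
        SubjectRecord("S2", mutations=["K-ras codon 12"], diagnosis="lung cancer"),
    ]
    print(len(database), count_distinct_subjects(database))
```

Keeping repeat tests as separate records for the same `subject_id` distinguishes the number of records from the number of different subjects, as the paragraph above does.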
In another aspect, the invention features a computer that includes a database of the invention and
a user interface. In some embodiments, the user interface is capable of displaying some or all of
the information contained in one or more records. In some embodiments, the user interface is
capable of displaying (i) one or more types of cancer that have been identified as containing a
polymorphism or mutation whose record is stored in the computer, (ii) one or more polymorphisms or
mutations, (iii) prognosis information for a particular type of cancer or a particular polymorphism
or mutation whose record is stored in the computer, (iv) one or more compounds or other treatments
useful for a cancer with a polymorphism or mutation whose record is stored in the computer, (v) one
or more compounds that modulate the expression or activity of an mRNA or protein whose record is
stored in the computer, and (vi) one or more mRNA molecules or proteins whose expression or
activity is modulated by a compound whose record is stored in the computer. The internal components
of the computer typically include a processor coupled to a memory. The external components usually
include a mass-storage device, such as a hard disk drive; user input devices, such as a keyboard
and a mouse; a display, such as a monitor; and, optionally, a network link capable of connecting
the computer system to other computers to allow sharing of data and processing tasks. Programs may
be loaded into the memory of the system during operation.
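By way of a non-limiting illustration, the record database and several of the display queries
described above might be sketched as follows. The class names, field names, and sample records are
hypothetical and are not taken from the specification; an in-memory list stands in for the
electronic database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One hypothetical database record (all field names are illustrative)."""
    subject_id: str
    cancer_type: str
    mutation: str
    prognosis: str = ""
    treatments: tuple = ()


class RecordDatabase:
    """In-memory stand-in for the electronic database of records."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def __len__(self):
        return len(self._records)

    def cancers_with_mutation(self, mutation):
        # Cancer types whose stored records contain the given mutation.
        return sorted({r.cancer_type for r in self._records if r.mutation == mutation})

    def mutations_in_cancer(self, cancer_type):
        # Mutations stored for the given cancer type.
        return sorted({r.mutation for r in self._records if r.cancer_type == cancer_type})

    def treatments_for(self, cancer_type, mutation):
        # Treatments stored for a cancer type carrying the given mutation.
        found = set()
        for r in self._records:
            if r.cancer_type == cancer_type and r.mutation == mutation:
                found.update(r.treatments)
        return sorted(found)


# Made-up sample records, for demonstration only.
db = RecordDatabase()
db.add(Record("S1", "breast cancer", "HER2", "", ("trastuzumab",)))
db.add(Record("S2", "breast cancer", "BRCA1"))
db.add(Record("S3", "lung cancer", "EGFR", "", ("erlotinib", "gefitinib")))
```

A user interface could call such query methods to populate its display; a production system would
more likely rely on a persistent database than on a Python list.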
In another aspect, the invention features a computer-implemented process that includes one or more
steps of any of the methods of the invention.
Exemplary risk factors
In some embodiments, the subject is also evaluated for one or more risk factors for a disease or
disorder, such as cancer. Exemplary risk factors include family history of the disease or disorder,
lifestyle (such as smoking and exposure to carcinogens), and the level of one or more hormones or
serum proteins (such as alpha-fetoprotein (AFP) in liver cancer, carcinoembryonic antigen (CEA), or
prostate-specific antigen (PSA) in prostate cancer). In some embodiments, the size and/or number of
tumors is measured and used in determining the subject's prognosis or selecting a treatment for the
subject.
Exemplary screening technique
If desired, the presence or absence of a disease or disorder, such as cancer, can be confirmed, or
the disease or disorder can be classified, using any standard method. For example, a disease or
disorder such as cancer can be detected in a number of ways, including the presence of certain
signs and symptoms, tumor biopsy, screening tests, or medical imaging (such as a mammogram or an
ultrasound). Once a possible cancer is detected, it may be diagnosed by microscopic examination of
a tissue sample. In some embodiments, a diagnosed subject undergoes repeat testing at multiple time
points, using a method of the invention or a known test for the disease or disorder, to monitor the
progression of the disease or disorder or its remission or recurrence.
Exemplary cancers
Exemplary cancers that can be diagnosed, prognosed, stabilized, treated, or prevented using any of
the methods of the invention include solid tumors, carcinomas, sarcomas, lymphomas, leukemias, germ
cell tumors, or blastomas. In various embodiments, the cancer is acute lymphoblastic leukemia,
acute myeloid leukemia, adrenocortical carcinoma, AIDS-related cancer, AIDS-related lymphoma, anal
cancer, appendix cancer, astrocytoma (such as childhood cerebellar or cerebral astrocytoma),
basal-cell carcinoma, bile duct cancer (such as extrahepatic bile duct cancer), bladder cancer,
bone tumor (such as osteosarcoma or malignant fibrous histiocytoma), brainstem glioma, brain cancer
(such as cerebellar astrocytoma, cerebral astrocytoma/glioblastoma, ependymoma, medulloblastoma,
neuroectodermal tumors, or visual pathway and hypothalamic glioma), glioblastoma, breast cancer,
bronchial adenoma or carcinoid, Burkitt lymphoma, carcinoid tumor (such as childhood or
gastrointestinal carcinoid tumor), carcinoma, central nervous system lymphoma, cerebellar
astrocytoma or malignant glioma (such as childhood cerebellar astrocytoma or malignant glioma),
cervical cancer, childhood cancer, chronic lymphocytic leukemia, chronic myelogenous leukemia,
chronic myeloproliferative disorder, colon cancer, cutaneous T-cell lymphoma, small round cell
tumor, endometrial cancer, ependymoma, esophageal cancer, Ewing sarcoma in the Ewing family of
tumors, extracranial germ cell tumor (such as childhood extracranial germ cell tumor), extragonadal
germ cell tumor, eye cancer (such as intraocular melanoma or retinoblastoma eye cancer),
gallbladder cancer, gastric cancer, gastrointestinal carcinoid tumor, gastrointestinal stromal
tumor, germ cell tumor (such as extracranial, extragonadal, or ovarian germ cell tumor),
gestational trophoblastic tumor, glioma (such as childhood cerebral astrocytoma or childhood visual
pathway and hypothalamic glioma), gastric cancer, hairy cell leukemia, head and neck cancer, heart
cancer, hepatocellular (liver) cancer, Hodgkin lymphoma, hypopharyngeal cancer, hypothalamic and
visual pathway glioma (such as childhood visual pathway glioma), islet cell carcinoma (such as
endocrine or pancreatic islet cell carcinoma), Kaposi sarcoma, kidney cancer, laryngeal cancer,
leukemia (such as acute lymphoblastic, acute myeloid, chronic lymphocytic, chronic myelogenous, or
hairy cell leukemia), oral cavity cancer, liposarcoma, liver cancer, lung cancer (such as non-small
cell or small cell lung cancer), lymphoma (such as AIDS-related, Burkitt, cutaneous T-cell,
Hodgkin, non-Hodgkin, or central nervous system lymphoma), macroglobulinemia (such as Waldenström
macroglobulinemia), malignant fibrous histiocytoma of bone or osteosarcoma, medulloblastoma (such
as childhood medulloblastoma), melanoma, Merkel cell carcinoma, mesothelioma (such as adult or
childhood mesothelioma), metastatic squamous neck cancer with occult primary, multiple endocrine
neoplasia syndrome (such as childhood multiple endocrine neoplasia syndrome), multiple myeloma or
plasma cell neoplasm, mycosis fungoides, myelodysplastic syndrome, myeloproliferative disease (such
as chronic myeloproliferative disease), nasal cavity or paranasal sinus cancer, nasopharyngeal
carcinoma, neuroblastoma, myeloid leukemia (such as adult acute myeloid leukemia),
myeloproliferative disease, oral cancer, oropharyngeal cancer, osteosarcoma or malignant fibrous
histiocytoma of bone, ovarian cancer, ovarian epithelial cancer, ovarian germ cell tumor, ovarian
low malignant potential tumor, pancreatic cancer (such as islet cell pancreatic cancer), paranasal
sinus or nasal cavity cancer, parathyroid cancer, penile cancer, pharyngeal cancer,
pheochromocytoma, pineal astrocytoma, pineal germinoma, neuroblastoma, primary central nervous
system lymphoma, carcinoma, rectal cancer, renal cell carcinoma, neuroblastoma or neuroectodermal
tumor, renal pelvis or ureter cancer (such as renal pelvis or ureter transitional cell carcinoma),
retinoblastoma, rhabdomyosarcoma (such as childhood rhabdomyosarcoma), salivary gland cancer,
sarcoma (such as sarcoma in the Ewing family of tumors, Kaposi, soft tissue, or uterine sarcoma),
Sézary syndrome, skin cancer (such as nonmelanoma, melanoma, or Merkel cell skin cancer), small
intestine cancer, squamous cell carcinoma, supratentorial primitive neuroectodermal tumor (such as
childhood primitive neuroectodermal tumor), T-cell lymphoma (such as cutaneous T-cell lymphoma),
testicular cancer, throat cancer, thymoma (such as childhood thymoma), thymoma or thymic carcinoma,
thyroid cancer (such as childhood thyroid cancer), trophoblastic tumor (such as gestational
trophoblastic tumor), carcinoma of unknown primary site, urethral cancer, uterine cancer (such as
endometrial uterine cancer), uterine sarcoma, vaginal cancer, visual pathway or hypothalamic glioma
(such as childhood visual pathway or hypothalamic glioma), vulvar cancer, Waldenström
macroglobulinemia, or Wilms tumor (such as childhood Wilms tumor). In various embodiments, the
cancer has metastasized or has not metastasized.
The cancer may or may not be hormone related or hormone dependent (for example, an estrogen- or
androgen-related cancer). Benign or malignant tumors may be diagnosed, prognosed, stabilized,
treated, or prevented using the methods and/or compositions of the invention.
In some embodiments, the subject has a cancer syndrome. A cancer syndrome is a genetic disorder in
which genetic mutations in one or more genes predispose the affected individual to the development
of cancers and may also cause the early onset of these cancers. Cancer syndromes often show not
only a high lifetime risk of developing cancer, but also the development of multiple independent
primary tumors. Many of these syndromes are caused by mutations in tumor suppressor genes, which
are genes involved in protecting the cell from turning cancerous. Other genes that may be affected
are DNA repair genes, oncogenes, and genes involved in the production of blood vessels
(angiogenesis). Common examples of inherited cancer syndromes are hereditary breast-ovarian cancer
syndrome and hereditary non-polyposis colon cancer (Lynch syndrome).
In some embodiments, a subject with one or more polymorphisms or mutations in K-ras, p53, BRA,
EGFR, or HER2 is administered a treatment that targets K-ras, p53, BRA, EGFR, or HER2,
respectively.
The methods of the invention can generally be applied to the treatment of malignant or benign
tumors of any cell, tissue, or organ type.
Exemplary treatment
If desired, any treatment for stabilizing, treating, or preventing a disease or disorder such as
cancer, or an increased risk of a disease or disorder such as cancer, can be administered to a
subject (for example, a subject identified as having cancer or an increased risk of cancer using
any of the methods of the invention). In various embodiments, the treatment is a known treatment or
combination of treatments for the disease or disorder, such as cytotoxic agents, targeted therapy,
immunotherapy, hormonal therapy, radiation therapy, surgical removal of cancerous cells or of cells
likely to become cancerous, stem cell transplantation, bone marrow transplantation, photodynamic
therapy, palliative treatment, or a combination thereof. In some embodiments, a treatment (such as
a preventive medication) is used to prevent, delay, or reduce the severity of a disease or disorder
such as cancer in a subject at increased risk of the disease or disorder.
In some embodiments, the targeted therapy is a treatment that targets the cancer's specific genes,
proteins, or the tissue environment that contributes to cancer growth and survival. This type of
treatment blocks the growth and spread of cancer cells while limiting damage to normal cells, and
usually causes fewer side effects than other cancer medications.
One of the more successful approaches has been to target angiogenesis, the growth of new blood
vessels around a tumor. Targeted therapies such as bevacizumab (Avastin), lenalidomide (Revlimid),
sorafenib (Nexavar), sunitinib (Sutent), and thalidomide (Thalomid) interfere with angiogenesis.
Another example is the use of a treatment that targets HER2, such as trastuzumab or lapatinib, for
cancers that overexpress HER2 (such as some breast cancers). In some embodiments, a monoclonal
antibody is used to block a specific target on the outside of cancer cells. Examples include
alemtuzumab (Campath-1H), bevacizumab, cetuximab (Erbitux), panitumumab (Vectibix), pertuzumab
(Omnitarg), rituximab (Rituxan), and trastuzumab. In some embodiments, the monoclonal antibody
tositumomab (Bexxar) is used to deliver radiation to the tumor. In some embodiments, an oral small
molecule inhibits a cancer process inside a cancer cell. Examples include dasatinib (Sprycel),
erlotinib (Tarceva), gefitinib (Iressa), imatinib (Gleevec), lapatinib (Tykerb), nilotinib
(Tasigna), sorafenib, sunitinib, and temsirolimus (Torisel). In some embodiments, a proteasome
inhibitor (such as the multiple myeloma drug bortezomib (Velcade)) interferes with specialized
proteins called enzymes that break down other proteins in the cell.
In some embodiments, immunotherapy is designed to boost the body's natural defenses to fight the
cancer. Exemplary types of immunotherapy use materials made either by the body or in a laboratory
to support, target, or restore immune system function.
In some embodiments, hormonal therapy treats cancer by lowering the amount of hormones in the
body. Several types of cancer, including some breast and prostate cancers, grow and spread only in
the presence of natural chemicals in the body called hormones. In various embodiments, hormonal
therapy is used to treat cancers of the prostate, breast, thyroid, and reproductive system.
In some embodiments, the treatment includes a stem cell transplant, in which diseased bone marrow
is replaced by specialized cells called hematopoietic stem cells. Hematopoietic stem cells are
found both in the bloodstream and in the bone marrow.
In some embodiments, the treatment includes photodynamic therapy, which uses special drugs called
photosensitizing agents together with light to kill cancer cells. The drugs work after they have
been activated by certain kinds of light.
In some embodiments, the treatment includes surgical removal of cancerous cells or of cells likely
to become cancerous (such as a lumpectomy or a mastectomy). For example, a woman with a breast
cancer susceptibility gene mutation (a BRCA1 or BRCA2 gene mutation) may reduce her risk of breast
and ovarian cancer with a risk-reducing salpingo-oophorectomy (removal of the fallopian tubes and
ovaries) and/or a risk-reducing bilateral mastectomy (removal of both breasts). Lasers, which are
very powerful, precise beams of light, can be used instead of blades (scalpels) for very careful
surgical work, including treating some cancers.
In addition to treatment to slow, stop, or eliminate the cancer (also called disease-directed
treatment), an important part of cancer care is relieving the subject's symptoms and side effects,
such as pain and nausea. It includes supporting the subject's physical, emotional, and social
needs, an approach called palliative or supportive care. People often receive disease-directed
therapy and treatment to ease symptoms at the same time.
Typical treatment includes actinomycin D, adcetris, adriamycin, Aldesleukin, alemtuzumab, Alimta, amine
Benzacridine, amsacrine, Anastrozole, Aredia, Arimidex, Arnold, L-Asparaginasum, Arastin, bevacizumab, than card Shandong
Amine, bleomycin, ibandronic acid injection, disodium clodronate capsule, bortezomib, busilvex, busulfan, Irinotecan,
Capecitabine, carboplatin, Carmustine, Carmustine, Cetuximab, chimax, Chlorambucil, Cimetidine, cis-platinum, gram
2-CdA, clodronate, clofarabine, Ke Lita enzyme, cyclophosphamide, cyproterone, prostaglandin before filling in, cytarabine, carefully
Born of the same parents' toxin, Dacarbazine, dactinomycin D, Dasatinib, daunorubicin, dexamethasone, adriamycin, Flutamide, estramustine,
Epi-ADM, eposin, Erbitux, Tarceva, estradiol phosphate, Estramustine, Etopophos, Etoposide,
Evoltra, Exemestane, fareston, Letrozole, Filgrastim, fludarabine, fludarabine, fluorouracil, Flutamide, easily
Auspicious sand, gemcitabine, gemcitabine, Gleevec, Gleevec.Gonapeptyl depot, Goserelin, methanesulfonic acid Ai Ruibu
Woods, Trastuzumab, Top is happy to agree, and hydroxycarbamide, ibandronic acid replaces emol, and idarubicin, ifosfamide, interferon, she replaces at horse
Buddhist nun, Gefitinib, Irinotecan, Cabazitaxel, Lan Kuaishu, Lapatinib, Letrozole, chlorambucil, Leuprorelin, leustat,
Lomustine, alemtuzumab, Mabthera, megestrol acetate, megestrol acetate, methotrexate, mitoxantrone, mitomycin,
Mutulane, busulfan, vinorelbine train Filgrastim, Filgrastim, Nexavar, Pentostatin, tamoxifen, vein point
Instillation is penetrated, vincristine, taxol, Pamidronate Disodium, PCV, pemetrexed, sprays Pentostatin, handkerchief trastuzumab, methyl benzyl
Hydrazine, Pu Luowenqi, prednisolone, prostrap, Raltitrexed, Rituximab and Dasatinib, Sorafenib, tamoxifen
Sweet smell, streptozotocin, diethylstilbestrol, stimuvax, Sutent, sotan, tabloid, safe stomach beauty, special jasmine is fragrant, tamoxifen, special sieve
Triumphant, taxol, taxotere, Tegafur and uracil, Temozolomide, Temozolomide, Thalidomide, Thioplex, plug replace
Group, Toremifene, Herceptin, vitamin A acid, Triamcinolone acetonide, trifluoroacetic acid porphines amide, Triptorelin, flavones, violet,
Bortezomib, Fan Bishi, flavonoids, vincristine, Crizotinib, capecitabine, her monoclonal antibody, Vande Thani, zanad, promise thunder
Moral, zoladronate select safe zoledronic acid and abiraterone.
For subjects that express both a mutant form (for example, a cancer-related form) and a wild-type
form (for example, a form not associated with cancer) of an mRNA or protein, the treatment
preferably inhibits the expression or activity of the mutant form at least 2, 5, 10, or 20-fold
more than it inhibits the expression or activity of the wild-type form. The simultaneous or
sequential use of multiple therapeutic agents may greatly reduce the incidence of cancer and reduce
the number of treated cancers that become resistant to therapy. In addition, therapeutic agents
used as part of a combination therapy may require a lower dose to treat cancer than the
corresponding dose required when the therapeutic agents are used individually. The low dose of each
compound in the combination therapy reduces the severity of potential adverse side effects of the
compounds.
In some embodiments, a subject identified as having an increased risk of cancer (using a method of
the invention or any standard method) may avoid specific risk factors or make lifestyle changes to
reduce any additional risk of cancer.
In some embodiments, the polymorphisms, mutations, risk factors, or any combination thereof are
used to select a treatment regimen for the subject. In some embodiments, a larger dose or a greater
number of treatments is selected for a subject at greater risk of cancer or with a worse prognosis.
Other compounds for inclusion in individual or combination therapies
If desired, additional compounds for stabilizing, treating, or preventing a disease or disorder
such as cancer, or an increased risk of a disease or disorder such as cancer, may be identified
from large libraries of natural product or synthetic (or semi-synthetic) extracts, or from chemical
libraries, according to methods known in the art. Those skilled in the field of drug discovery and
development will understand that the precise source of test extracts or compounds is not critical
to the methods of the invention. Accordingly, virtually any number of chemical extracts or
compounds can be screened for their effect on cells from a particular type of cancer or from a
particular subject, or screened for their effect on the activity or expression of cancer-related
molecules (such as cancer-related molecules known to have altered activity or expression in a
particular type of cancer). When a crude extract is found to modulate the activity or expression of
a cancer-related molecule, further fractionation of the positive lead extract may be performed to
isolate the chemical constituents responsible for the observed effect, using methods known in the
art.
Exemplary assays and animal models for testing therapies
If desired, one or more of the treatments disclosed herein can be tested for their effect on a
disease or disorder such as cancer using a cell line (such as a cell line with one or more of the
mutations identified in a subject who has been diagnosed with cancer or an increased risk of cancer
using the methods of the invention) or an animal model of the disease or disorder, such as a SCID
mouse model (Jain et al., Tumor Models in Cancer Research, ed. Teicher, Humana Press Inc., Totowa,
N.J., pp. 647-671, 2001, which is hereby incorporated by reference in its entirety). Additionally,
there are numerous standard assays and animal models that can be used to determine the efficacy of
particular therapies for stabilizing, treating, or preventing a disease or disorder such as cancer,
or an increased risk of a disease or disorder such as cancer. Therapies can also be tested in
standard human clinical trials.
To select a preferred therapy for a particular subject, compounds can be tested for their effect
on the expression or activity of one or more genes that are mutated in the subject. For example,
the ability of a compound to modulate the expression of particular mRNA molecules or proteins can
be detected using standard Northern, Western, or microarray analysis. In some embodiments, one or
more compounds are selected that (i) inhibit the expression or activity of mRNA molecules or
proteins that promote cancer and that are expressed at a higher than normal level, or have a higher
than normal level of activity, in the subject (such as in a sample from the subject), or (ii)
promote the expression or activity of mRNA molecules or proteins that inhibit cancer and that are
expressed at a lower than normal level, or have a lower than normal level of activity, in the
subject. An individual or combination therapy may be selected that modulates the greatest number of
mRNA molecules or proteins that have mutations associated with cancer in the subject. In some
embodiments, the selected individual or combination therapy has high drug efficacy and produces
few, if any, adverse side effects.
As an alternative to the subject-specific analysis described above, DNA chips can be used to
compare the expression of mRNA molecules in a particular type of early or late-stage cancer (such
as breast cancer cells) with their expression in normal tissue (Marrack et al., Current Opinion in
Immunology 12, 206-209, 2000; Harkin, Oncologist 5:501-507, 2000; Pelizzari et al., Nucleic Acids
Res. 28(22):4577-4581, 2000, each of which is hereby incorporated by reference in its entirety).
Based on this analysis, an individual or combination therapy for subjects with this type of cancer
can be selected to modulate the expression of the mRNA molecules or proteins that have altered
expression in this type of cancer.
In addition to being used to select a therapy for a particular subject or group of subjects,
expression profiling can be used to monitor the changes in mRNA and/or protein expression that
occur during treatment. For example, expression profiling can be used to determine whether the
expression of cancer-related genes has returned to normal levels. If not, the dose of one or more
compounds in the therapy can be altered to either increase or decrease the effect of the therapy on
the expression levels of the corresponding cancer-related genes. In addition, this analysis can be
used to determine whether a therapy affects the expression of other genes (for example, genes that
are associated with adverse side effects). If desired, the dose or composition of the therapy can
be altered to prevent or reduce undesired side effects.
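By way of a non-limiting illustration, the check of whether cancer-related gene expression has
returned to normal levels might be sketched as follows. The gene names, the expression units, and
the normal ranges are hypothetical placeholders, and a simple range comparison is assumed here in
place of any particular statistical method.

```python
def out_of_range_genes(expression, normal_ranges):
    """Return the genes whose measured expression lies outside the normal range.

    expression: dict mapping gene name -> measured level (arbitrary units).
    normal_ranges: dict mapping gene name -> (low, high); hypothetical reference bounds.
    """
    flagged = {}
    for gene, level in expression.items():
        low, high = normal_ranges[gene]
        if level < low:
            flagged[gene] = "below"
        elif level > high:
            flagged[gene] = "above"
    return flagged


# Made-up values for two placeholder genes, before and during treatment.
normal = {"GENE_A": (0.8, 1.2), "GENE_B": (0.5, 1.5)}
before_treatment = {"GENE_A": 3.0, "GENE_B": 1.0}
during_treatment = {"GENE_A": 1.1, "GENE_B": 1.0}

still_abnormal = out_of_range_genes(during_treatment, normal)
```

An empty result would suggest that expression has returned to the assumed normal range; a
non-empty result could prompt a change in dose or composition as described above.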
Exemplary formulation and method of administration
For stabilizing, treating, or preventing a disease or disorder such as cancer, or an increased risk
of a disease or disorder such as cancer, a composition may be formulated and administered using any
method known to those of skill in the art (see U.S. Pat. Nos. 8,389,578 and 8,389,557, each of
which is hereby incorporated by reference in its entirety). General techniques for formulation and
administration are found in "Remington: The Science and Practice of Pharmacy," 21st Edition, ed.
David Troy, 2006, Lippincott Williams & Wilkins, Philadelphia, which is hereby incorporated by
reference in its entirety. Liquids, slurries, tablets, capsules, pills, powders, granules, gels,
ointments, suppositories, injections, inhalants, and aerosols are examples of such formulations.
For example, modified or extended release oral formulations can be prepared using methods known in
the art. Suitable matrix forming materials include waxes (for example, carnauba wax, beeswax,
paraffin wax, ceresine, shellac wax, fatty acids, and fatty alcohols), oils, hardened oils or fats
(for example, hardened rapeseed oil, castor oil, beef tallow, palm oil, and soybean oil), and
polymers (for example, hydroxypropyl cellulose, polyvinylpyrrolidone, hydroxypropyl methyl
cellulose, and polyethylene glycol). Other suitable matrix tabletting materials are
microcrystalline cellulose, powdered cellulose, hydroxypropyl cellulose, and ethyl cellulose, with
other carriers and fillers. Tablets may also contain granulates, coated powders, or pellets.
Tablets may also be multi-layered. Optionally, the finished tablet may be coated or uncoated.
Typical routes of administering such compositions include, without limitation, oral, sublingual,
buccal, topical, transdermal, inhalation, parenteral (for example, subcutaneous, intravenous,
intramuscular, or intrasternal injection, or infusion techniques), rectal, vaginal, and intranasal.
In preferred embodiments, the therapy is administered using an extended release device.
Compositions of the invention are formulated so as to allow the active ingredients contained
therein to be bioavailable upon administration of the composition. Compositions may take the form
of one or more dosage units. Compositions may contain 1, 2, 3, 4, or more active ingredients and
may optionally contain 1, 2, 3, 4, or more inactive ingredients.
Alternate embodiment
Any of the methods described herein may include the output of data in a physical format, such as
on a computer screen or on a paper printout. Any of the methods of the invention may be combined
with the output of actionable data in a format that can be acted upon by a physician. Some of the
embodiments described in this document for determining genetic data pertaining to a target
individual may be combined with the notification of a medical professional of a potential
chromosomal abnormality (such as a deletion or duplication), or the lack thereof, optionally
combined with the decision to abort, or not to abort, a fetus in the context of prenatal diagnosis.
Some of the embodiments described herein may be combined with the output of actionable data and
the execution of a clinical decision that results in a clinical treatment, or the execution of a
clinical decision to take no action.
In some embodiments, a method is disclosed herein for generating a report disclosing a result of
any of the methods of the invention (such as the presence or absence of a deletion or duplication).
A report may be generated with a result from a method of the invention, and it may be sent to a
physician electronically, displayed on an output device (such as a digital report), or delivered to
the physician as a written report (such as a printed hard copy of the report). In addition, the
described methods may be combined with the actual execution of a clinical decision that results in
a clinical treatment, or the execution of a clinical decision to take no action.
In certain embodiments, the present invention provides reagents, kits, and methods, as well as
computer systems and computer media with encoded instructions, for detecting both CNVs and SNVs
from the same sample using the multiplex PCR methods disclosed herein. In certain preferred
embodiments, the sample is a single-cell sample or a plasma sample suspected of containing
circulating tumor DNA. These embodiments take advantage of the discovery that, by interrogating DNA
samples from single cells or plasma for both CNVs and SNVs using the highly sensitive multiplex PCR
methods disclosed herein, improved cancer detection can be achieved compared with interrogating for
CNVs or SNVs alone, especially for cancers that exhibit CNVs, such as breast, ovarian, and lung
cancer. In certain illustrative embodiments, the methods for analyzing CNVs interrogate 50 to
100,000, or 50 to 10,000, or 50 to 1,000 SNPs, and the methods for SNVs interrogate 50 to 1,000
SNVs, or 50 to 500 SNVs, or 50 to 250 SNVs. The methods provided herein for detecting CNVs and/or
SNVs in the plasma of subjects suspected of having cancer, including, for example, cancers known to
exhibit CNVs and SNVs, such as breast, lung, and ovarian cancer, provide for the detection of CNVs
and/or SNVs from tumors that are often composed of genetically heterogeneous cancer cell
populations. Thus, traditional methods, which focus on analyzing only certain regions of the tumor,
can often miss CNVs or SNVs that are present in cells in other regions of the tumor. A plasma
sample acts as a liquid biopsy that can be interrogated to detect any CNVs and/or SNVs that are
present in only a subpopulation of tumor cells.
Computer Architecture example
Figure 69 shows an example system architecture X00 for carrying out embodiments of the present
invention. System architecture X00 includes an analysis platform X08 connected to one or more
laboratory information systems ("LIS") X04. As shown in Figure 69, analysis platform X08 may be
connected to LIS X04 over a network X02. Network X02 may include one or more networks of one or
more network types, including any combination of LAN, WAN, the Internet, and the like. Network X02
may encompass connections between any or all components in system architecture X00. Analysis
platform X08 may alternatively or additionally be connected directly to LIS X06. In one embodiment,
analysis platform X08 analyzes genetic data provided by LIS X04 in a software-as-a-service model,
where LIS X04 is a third-party LIS, while analysis platform X08 analyzes genetic data provided by
LIS X06 in a full-service or in-house model, where LIS X06 and analysis platform X08 are controlled
by the same party. In an embodiment in which analysis platform X08 provides information over
network X02, analysis platform X08 may be a server.
In an example embodiment, laboratory information system X04 includes one or more public or private
institutions that collect, manage, and/or store genetic data. A person having skill in the relevant
art would understand that methods and standards for securing genetic data are known and can be
implemented using various information security techniques and policies, for example,
username/password, Transport Layer Security (TLS), Secure Sockets Layer (SSL), and/or other
cryptographic protocols providing communication security.
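By way of a non-limiting illustration, a client in an LIS might prepare a TLS configuration for
its connection to the analysis platform using the Python standard library as sketched below. The
helper name and the choice of TLS 1.2 as the minimum version are assumptions made for this example
and are not taken from the specification.

```python
import ssl


def make_client_context():
    """Build a TLS client context that verifies server certificates and hostnames."""
    # create_default_context() enables certificate and hostname verification by default.
    context = ssl.create_default_context()
    # Assumed policy for this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


ctx = make_client_context()
```

Such a context could then be passed to a socket or HTTP client when genetic data are transmitted
over network X02.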
In an example embodiment, system architecture X00 operates as a service-oriented architecture and
uses a client-server model, as would be understood by a person having skill in the relevant art, to
enable various forms of interaction and communication between LIS X04 and analysis platform X08.
System architecture X00 may be distributed over various types of networks X02 and/or may operate as
a cloud computing architecture. A cloud computing architecture may include any type of distributed
network architecture. By way of example and not of limitation, a cloud computing architecture is
useful for providing software as a service (SaaS), infrastructure as a service (IaaS), platform as
a service (PaaS), network as a service (NaaS), data as a service (DaaS), database as a service
(DBaaS), backend as a service (BaaS), test environment as a service (TEaaS), API as a service
(APIaaS), integration platform as a service (IPaaS), and the like.
In an example embodiment, LIS X04 and X06 each include a computer, device, interface, or the like,
or any subsystem thereof. LIS X04 and X06 may include an operating system (OS) and installed
applications that perform various functions, such as accessing and/or navigating data that is
accessible locally, in memory, and/or over network X02. In one embodiment, LIS X04 accesses
analysis platform X08 through an application programming interface ("API"). LIS X04 may also
include one or more native applications that operate independently of the API.
In an example embodiment, analysis platform X08 includes one or more of an input processor X12, a
hypothesis manager X14, a modeler X16, an error correction unit X18, a machine learning unit X20,
and an output processor X22. Input processor X12 receives and processes input from LIS X04 and/or
X06. Processing may include, but is not limited to, operations such as parsing, transcoding,
translating, adapting, or otherwise handling any input received from LIS X04 and/or X06. Input may
be received via one or more streams, feeds, databases, or other data sources, such as may be made
accessible by LIS X04 and X06. Data errors may be corrected by error correction unit X18 through
performance of the error correction mechanisms described above.
In the exemplary embodiment, hypothesis manager X14 is configured to receive the input passed from input
processor X12 in a form ready to be processed according to hypotheses for genetic analysis that are
expressed as models and/or algorithms. Modeler X16 can use such models and/or algorithms to generate
probabilities based on, for example, dynamic, real-time and/or historical statistics or other
indicators. The data used to derive and populate such models and/or algorithms are available to
hypothesis manager X14 via, for example, genetic data source X10. Genetic data source X10 may include,
for example, a nucleic acid sequencer. Hypothesis manager X14 can be configured to formulate hypotheses
based on, for example, the variables required to populate its models and/or algorithms. Once populated,
the models and/or algorithms can be used by modeler X16 to generate one or more hypotheses as described
above. Hypothesis manager X14 can select a particular value, a range of values, or an estimate based on
the most probable hypothesis as output, as described above. Modeler X16 can operate according to models
and/or algorithms trained by machine learning unit X20. For example, machine learning unit X20 can
develop such models and/or algorithms by applying a classification algorithm, as described above, to a
training set database (not shown). In certain embodiments, the machine learning unit analyzes one or
more control samples to generate training data sets useful in the SNV detection methods provided herein.
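The selection of the most probable hypothesis can be illustrated with a minimal sketch. It assumes that
the modeler has already produced one likelihood per hypothesis; the function name, the hypothesis labels
and the uniform prior are illustrative assumptions and are not part of the disclosed platform.

```python
def most_likely_hypothesis(likelihoods, priors=None):
    """Normalize likelihood * prior over all hypotheses and return the best one.

    likelihoods: dict mapping a hypothesis label to P(data | hypothesis).
    priors: optional dict of prior probabilities; uniform if omitted (assumption).
    """
    names = list(likelihoods)
    if priors is None:
        priors = {h: 1.0 / len(names) for h in names}
    unnormalized = {h: likelihoods[h] * priors[h] for h in names}
    total = sum(unnormalized.values())
    posterior = {h: v / total for h, v in unnormalized.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


# Hypothetical likelihoods for three ploidy hypotheses on one chromosome segment.
best, posterior = most_likely_hypothesis(
    {"disomy": 0.02, "deletion": 0.06, "duplication": 0.02}
)
```

In this sketch the output returned to the requesting LIS would be the label `best`, optionally with its
posterior probability as a confidence value.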
Once hypothesis manager X14 has identified a particular output, that output can be returned via output
processor X22 to the particular LIS X04 or X06 that requested the information.
Various aspects of the disclosure can be implemented on a computing device by software, firmware,
hardware, or a combination thereof. Figure 70 shows an example computer system Y00 in which the
contemplated embodiments, or portions thereof, can be implemented as computer-readable code. Various
embodiments are described in terms of this example computer system Y00.
Processing tasks in the embodiment of Figure 70 are carried out by one or more processors Y02. It should
be noted, however, that various types of processing technology may be used here, including programmable
logic arrays (PLAs), application-specific integrated circuits (ASICs), multi-core processors, multiple
processors, or distributed processors. Additional specialized processing resources, such as graphics,
multimedia, or mathematical processing capabilities, may also be used to aid in certain processing
tasks. These processing resources can be hardware, software, or an appropriate combination thereof. For
example, one or more of processors Y02 can be a graphics processing unit (GPU). In an embodiment, a GPU
is a processor that is a specialized electronic circuit designed to rapidly process mathematically
intensive applications on electronic devices. The GPU may have a highly parallel structure that is
efficient for parallel processing of large blocks of data, such as mathematically intensive data.
Alternatively or in addition, one or more of processors Y02 can be a special parallel processor without
the graphics optimization, such parallel processors performing the mathematically intensive functions
described herein. One or more of processors Y02 may include a processing accelerator (e.g., a DSP or
other special-purpose processor).
Computer system Y00 also includes main memory Y30, and may also include secondary memory Y40. Main
memory Y30 may be volatile or non-volatile memory, and may be divided into channels. Secondary memory
Y40 may include, for example, non-volatile memory such as hard disk drive Y50, removable storage drive
Y60, and/or a memory stick. Removable storage drive Y60 may comprise a floppy disk drive, a magnetic
tape drive, an optical disk drive, flash memory, or the like. Removable storage drive Y60 reads from
and/or writes to removable storage unit Y70 in a well-known manner. Removable storage unit Y70 may
comprise a floppy disk, magnetic tape, optical disk, etc., which is read by and written to by removable
storage drive Y60. As will be appreciated by persons skilled in the relevant art, removable storage unit
Y70 includes a computer-usable storage medium having stored therein computer software and/or data.
In alternative implementations, secondary memory Y40 may include other similar means for allowing
computer programs or other instructions to be loaded into computer system Y00. Such means may include,
for example, a removable storage unit Y70 and an interface (not shown). Examples of such means may
include a program cartridge and cartridge interface (such as those found in video game devices), a
removable memory chip (such as an EPROM or PROM) and associated socket, and other removable storage
units Y70 and interfaces that allow software and data to be transferred from removable storage unit Y70
to computer system Y00.
Computer system Y00 may also include memory controller Y75. Memory controller Y75 controls data access
to main memory Y30 and secondary memory Y40. In some embodiments, memory controller Y75 may be external
to processor Y10, as shown in Figure 1; in other embodiments, memory controller Y75 may be directly part
of processor Y10. For example, many AMD™ and Intel™ processors use integrated memory controllers that
are part of the same chip as processor Y10 (not shown in Figure 70).
Computer system Y00 may also include communications and network interface Y80. Communications and
network interface Y80 allows software and data to be transferred between computer system Y00 and
external devices. Communications and network interface Y80 may include a modem, a communications port, a
PCMCIA slot and card, or the like. Software and data transferred via communications and network
interface Y80 are in the form of signals, which may be electronic, electromagnetic, optical, or other
signals capable of being received by communications and network interface Y80. These signals are
provided to communications and network interface Y80 via communication path Y85. Communication path Y85
carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone
link, an RF link or other communication channels.
Communications and network interface Y80 allows computer system Y00 to communicate over communication
networks or mediums such as LANs, WANs, the Internet, etc. Communications and network interface Y80 may
interface with remote sites or networks via wired or wireless connections.
In this document, the terms "computer program medium," "computer-usable medium" and "non-transitory
medium" are used to generally refer to tangible media such as removable storage unit Y70, removable
storage drive Y60, and a hard disk installed in hard disk drive Y50. Signals carried over communication
path Y85 can also embody the logic described herein. Computer program medium and computer-usable medium
can also refer to memories, such as main memory Y30 and secondary memory Y40, which can be memory
semiconductors (e.g., DRAMs, etc.). These computer program products are means for providing software to
computer system Y00.
Computer programs (also called computer control logic) are stored in main memory Y30 and/or secondary
memory Y40. Computer programs may also be received via communications and network interface Y80. Such
computer programs, when executed, enable computer system Y00 to implement the embodiments discussed
herein. In particular, the computer programs, when executed, enable processor Y10 to implement the
disclosed processes. Accordingly, such computer programs represent controllers of computer system Y00.
Where the embodiments are implemented using software, the software may be stored in a computer program
product and loaded into computer system Y00 using, for example, removable storage drive Y60, an
interface, hard disk drive Y50, or communications and network interface Y80.
Computer system Y00 may also include input/output/display devices Y90, such as keyboards, monitors,
pointing devices, touchscreens, etc.
It should be noted that the simulation, synthesis and/or manufacture of various embodiments may be
accomplished, in part, through the use of computer-readable code, including general programming
languages (such as C or C++), hardware description languages (HDL) such as Verilog HDL, VHDL, or Altera
HDL (AHDL), or other available programming tools. This computer-readable code can be disposed in any
known computer-usable medium, including semiconductor, magnetic disk, and optical disk (such as CD-ROM
or DVD-ROM). As such, the code can be transmitted over communication networks including the Internet.
Embodiments are also directed to computer program products comprising software stored on any
computer-usable medium. Such software, when executed in one or more data processing devices, causes the
data processing devices to operate as described herein. Embodiments employ any computer-usable or
computer-readable medium, and any computer-usable or computer-readable storage medium known now or in
the future. Examples of computer-usable or computer-readable media include, but are not limited to,
primary storage devices (e.g., any type of random access memory), secondary storage devices (e.g., hard
drives, floppy disks, CD-ROMs, ZIP disks, magnetic storage devices, optical storage devices, MEMS,
nanotechnological storage devices, etc.), and communication media (e.g., wired and wireless
communication networks, local area networks, wide area networks, intranets, etc.). Computer-usable or
computer-readable media can include any form of transitory media (which include signals) or
non-transitory media (which exclude signals). Non-transitory media comprise, by way of non-limiting
example, the aforementioned physical storage devices (e.g., primary and secondary storage devices).
It is to be understood that any embodiment disclosed herein can be used in combination with any other
embodiment disclosed herein.
Experimental section
The presently disclosed embodiments are described in the following examples, which are set forth to aid
in the understanding of the disclosure and should not be construed to limit in any way the scope of the
disclosure as defined in the claims that follow. The following examples are put forth to provide those
of ordinary skill in the art with a complete disclosure and description of how to use the described
embodiments; they are not intended to limit the scope of the disclosure, nor are they intended to
represent that the experiments below are all or the only experiments performed. Efforts have been made
to ensure accuracy with respect to the numbers used (e.g., amounts, temperatures, etc.), but some
experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts
by volume and temperature is in degrees Celsius. It should be understood that variations in the methods
as described can be made without changing the fundamental aspects that the experiments are meant to
illustrate.
Embodiment 1
Exemplary sample preparation and amplification methods are described in U.S. Application No. 13/683,604,
filed November 21, 2012 (U.S. Publication No. 2013/0123120), and in U.S. Application No. 61/994,791,
filed May 16, 2014, each of which is incorporated herein by reference in its entirety. These methods can
be used to analyze any of the samples disclosed herein.
In one experiment, plasma samples were prepared and amplified using a semi-nested 19,488-plex protocol.
Samples were prepared as follows: up to 20 mL of blood was centrifuged to separate the buffy coat and
the plasma. Genomic DNA in the blood sample was prepared from the buffy coat. Genomic DNA can also be
prepared from a saliva sample. Cell-free DNA in the plasma was isolated using the QIAGEN CIRCULATING
NUCLEIC ACID kit and eluted in 50 μL of TE buffer according to the manufacturer's instructions.
Universal ligation adapters were appended to the ends of each molecule in 40 μL of purified plasma DNA,
and libraries were amplified for 9 cycles using adapter-specific primers. Libraries were purified with
AGENCOURT AMPURE beads and eluted in 50 μL of DNA suspension buffer.
Six μL of the DNA was amplified with 15 cycles of STAR 1 (95 °C for 10 minutes for initial polymerase
activation; then 15 cycles of 96 °C for 30 seconds, 65 °C for 1 minute, 58 °C for 6 minutes, a 60 °C
step, 65 °C for 4 minutes and 72 °C for 30 seconds; and a final extension at 72 °C for 2 minutes), using
19,488 target-specific tagged reverse primers at a primer concentration of 7.5 nM and one
library-adapter-specific forward primer.
The semi-nested PCR protocol involved a second amplification of a dilution of the STAR 1 product for 15
cycles (STAR 2) (95 °C for 10 minutes for initial polymerase activation; then 15 cycles of 95 °C for 30
seconds, 65 °C for 1 minute, 60 °C for 5 minutes, 65 °C for 5 minutes and 72 °C for 30 seconds; and a
final extension at 72 °C for 2 minutes), using a reverse tag primer at a concentration of 1000 nM and
each of the 19,488 target-specific forward primers at a concentration of 20 nM.
An aliquot of the STAR 2 product was then amplified by standard PCR for 12 cycles with 1 μM
tag-specific forward and barcoded reverse primers to generate barcoded sequencing libraries. An aliquot
of each library was mixed with libraries of different barcodes and purified using a spin column.
In this way, 19,488 primers were used in a single-well reaction; the primers were designed to target
SNPs found on chromosomes 1, 2, 3, 13, 18, 21, X and Y. The amplicons were then sequenced using an
ILLUMINA GAIIX sequencer. If desired, the number of sequencing reads can be increased to increase the
number of targeted SNPs that are amplified and sequenced.
Relevant genomic DNA samples were amplified in STAR 1 using the semi-nested 19,488 outer forward primers
and tagged reverse primers at 7.5 nM. The thermocycling conditions and composition of STAR 2 and the
barcoding PCR were the same as for the semi-nested protocol.
Embodiment 2
Exemplary primer selection methods are described in U.S. Application No. 13/683,604, filed November 21,
2012 (U.S. Publication No. 2013/0123120), and U.S. Serial No. 61/994,791, filed May 16, 2014, each of
which is incorporated by reference. These methods can be used to analyze any of the samples disclosed
herein.
The following experiment illustrates an exemplary method for designing and selecting a primer library
that can be used in any of the multiplex PCR methods of the invention. The goal is to select, from an
initial library of candidate primers, primers that can be used to simultaneously amplify a large number
of target loci (or a subset of target loci) in a single reaction volume. Primers need not be designed or
selected for every locus in the initial set of candidate target loci.
Step 1
A set of candidate target loci (such as SNPs) was selected based on publicly available information about
desired parameters for the target loci, such as the frequency of the SNPs within a target population or
the heterozygosity rate of the SNPs (World Wide Web at ncbi.nlm.nih.gov/projects/SNP/; Sherry ST, Ward
MH, Kholodov M, et al., dbSNP: the NCBI database of genetic variation, Nucleic Acids Research, 2001 Jan
1; 29(1):308-11, each of which is incorporated by reference in its entirety). For each candidate locus,
one or more PCR primer pairs were designed using the Primer3 program (World Wide Web at
primer3.sourceforge.net; libprimer3 release 2.2.3, which is incorporated herein by reference in its
entirety). If there was no feasible design for PCR primers for a particular target locus, that target
locus was eliminated from further consideration.
If desired, a "target locus score" (a higher score representing higher desirability) can be calculated
for most or all of the target loci, for example as a weighted average of various desired parameters for
the target loci. The parameters may be assigned different weights based on their importance for the
particular application in which the primers will be used. Exemplary parameters include the
heterozygosity rate of the target locus, the disease prevalence associated with a sequence (e.g., a
polymorphism) at the target locus, the disease penetrance associated with a sequence (e.g., a
polymorphism) at the target locus, the specificity of the candidate primer(s) used to amplify the target
locus, the size of the candidate primer(s) used to amplify the target locus, and the size of the target
amplicon. In some embodiments, the specificity of a candidate primer for the target locus includes the
likelihood that the candidate primer will mis-prime by binding to and amplifying a locus other than the
target locus it was designed to amplify. In some embodiments, one or more or all of the candidate
primers that mis-prime are removed from the library.
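The weighted-average target locus score can be sketched as follows. This is a minimal illustration that
assumes each parameter has already been normalized to the range 0 to 1, with higher values being more
desirable; the parameter names and weights are hypothetical and are not taken from the disclosure.

```python
def target_locus_score(params, weights):
    """Weighted average of normalized desirability parameters for one locus.

    params:  dict of parameter name -> value in [0, 1] (normalization is an assumption).
    weights: dict of parameter name -> non-negative weight for the application.
    """
    total_weight = sum(weights[name] for name in params)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(params[name] * weights[name] for name in params) / total_weight


# Hypothetical locus: heterozygosity is weighted most heavily in this example.
example_params = {"heterozygosity": 0.48, "primer_specificity": 0.90, "amplicon_size": 0.70}
example_weights = {"heterozygosity": 3.0, "primer_specificity": 2.0, "amplicon_size": 1.0}
locus_score = target_locus_score(example_params, example_weights)
```

Changing the weights changes which loci survive the later node-removal step, since that step removes
the node with the lowest target locus score first.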
Step 2
Thermodynamic interaction scores were calculated between each primer and all primers for all other
target loci from step 1 (see, e.g., Allawi, H.T. and SantaLucia, J., Jr. (1998), "Thermodynamics of
Internal C-T Mismatches in DNA", Nucleic Acids Res. 26, 2694-2701; Peyret, N., Seneviratne, P.A.,
Allawi, H.T. and SantaLucia, J., Jr. (1999), "Nearest-Neighbor Thermodynamics and NMR of DNA Sequences
with Internal A-A, C-C, G-G, and T-T Mismatches", Biochemistry 38, 3468-3477; Allawi, H.T. and
SantaLucia, J., Jr. (1998), "Nearest-Neighbor Thermodynamics of Internal A-C Mismatches in DNA: Sequence
Dependence and pH Effects", Biochemistry 37, 9435-9444; Allawi, H.T. and SantaLucia, J., Jr. (1998),
"Nearest Neighbor Thermodynamic Parameters for Internal G-A Mismatches in DNA", Biochemistry 37,
2170-2179; Allawi, H.T. and SantaLucia, J., Jr. (1997), "Thermodynamics and NMR of Internal G-T
Mismatches in DNA", Biochemistry 36, 10581-10594; and MultiPLX 2.1 (Kaplinski L, Andreson R, Puurand T,
Remm M. MultiPLX: automatic grouping and evaluation of PCR primers. Bioinformatics. 2005 Apr 15;
21(8):1701-2), each of which is hereby incorporated by reference in its entirety). This step produced a
2D matrix of interaction scores. The interaction score predicts the likelihood of primer dimers
involving the two interacting primers. The scores were calculated as follows:
interaction_score = max(-deltaG_2, 0.8 * (-deltaG_1))
where
deltaG_2 = the Gibbs energy (the energy required to break the dimer) for a dimer that is extensible by
PCR on both ends, i.e., the 3' end of each primer anneals to the other primer; and
deltaG_1 = the Gibbs energy for a dimer that is extensible by PCR on at least one end.
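The score formula and the resulting 2D matrix can be sketched as follows. The Gibbs energies themselves
would come from a thermodynamic model such as those cited above; here `dimer_energies` is a hypothetical
caller-supplied function returning (deltaG_2, deltaG_1) for a primer pair, and the matrix is assumed to
be symmetric with a zero diagonal. For brevity the sketch scores every pair of primers, whereas the
method above only needs pairs from different target loci.

```python
from itertools import combinations


def interaction_score(delta_g_2, delta_g_1):
    """interaction_score = max(-deltaG_2, 0.8 * (-deltaG_1)); higher means a likelier dimer."""
    return max(-delta_g_2, 0.8 * (-delta_g_1))


def score_matrix(primers, dimer_energies):
    """Build a symmetric matrix of interaction scores for a list of primers.

    dimer_energies(a, b) -> (deltaG_2, deltaG_1) is a hypothetical callable;
    no real thermodynamics library is assumed here.
    """
    n = len(primers)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dg2, dg1 = dimer_energies(primers[i], primers[j])
        matrix[i][j] = matrix[j][i] = interaction_score(dg2, dg1)
    return matrix


# Toy energy model for illustration only: energy grows with combined primer length.
toy_energies = lambda a, b: (-float(len(a) + len(b)), -1.0)
toy_matrix = score_matrix(["AC", "GT", "TTT"], toy_energies)
```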
Step 3
For each target locus, if there was more than one primer pair design, one design was selected using the
following method:
1. For each primer pair design for the locus, find the worst-case (highest) interaction score between
the two primers in that design and all primers from all designs for other target loci.
2. Select the design with the best (lowest) worst-case interaction score.
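The two-part selection rule above can be sketched as follows. The data layout (a dict of loci mapping
to lists of primer tuples) and the `score` callable are illustrative assumptions; only cross-locus
primer pairs are scored, which is one reading of the rule.

```python
def worst_case_score(design, other_primers, score):
    """Highest interaction score between a design's primers and primers of other loci."""
    return max((score(p, q) for p in design for q in other_primers), default=0.0)


def select_designs(designs_by_locus, score):
    """For each locus, keep the design whose worst-case interaction score is lowest."""
    selected = {}
    for locus, designs in designs_by_locus.items():
        others = [
            primer
            for other_locus, other_designs in designs_by_locus.items()
            if other_locus != locus
            for design in other_designs
            for primer in design
        ]
        selected[locus] = min(designs, key=lambda d: worst_case_score(d, others, score))
    return selected


# Toy example: design 1 for locus A interacts strongly with a primer for locus B.
toy_scores = {frozenset(("a1f", "bf")): 9.0}
toy_score = lambda p, q: toy_scores.get(frozenset((p, q)), 1.0)
toy_designs = {"A": [("a1f", "a1r"), ("a2f", "a2r")], "B": [("bf", "br")]}
chosen = select_designs(toy_designs, toy_score)
```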
Step 4
A graph was built such that each node represents one locus and its associated primer pair design (e.g.,
a maximal clique problem). One edge was created between every pair of nodes. The weight of each edge is
equal to the worst-case (highest) interaction score between the primers associated with the two nodes
connected by the edge.
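A minimal sketch of this graph construction is shown below. The edge dictionary keyed by `frozenset`
pairs and the `score` callable are assumptions chosen for illustration, not a prescribed data structure.

```python
from itertools import combinations


def build_graph(design_by_locus, score):
    """Return (nodes, edges) for a complete graph over loci.

    design_by_locus: dict of locus -> tuple of primers in the selected design.
    edges: dict of frozenset({locus_u, locus_v}) -> worst-case (highest) score
           between any primer of one design and any primer of the other.
    """
    nodes = list(design_by_locus)
    edges = {}
    for u, v in combinations(nodes, 2):
        edges[frozenset((u, v))] = max(
            score(p, q) for p in design_by_locus[u] for q in design_by_locus[v]
        )
    return nodes, edges


# Toy example with one strongly interacting primer pair between loci L1 and L2.
pair_scores = {frozenset(("p1f", "p2r")): 6.0}
pair_score = lambda p, q: pair_scores.get(frozenset((p, q)), 1.0)
graph_nodes, graph_edges = build_graph(
    {"L1": ("p1f", "p1r"), "L2": ("p2f", "p2r"), "L3": ("p3f", "p3r")}, pair_score
)
```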
Step 5
If desired, for every pair of designs for two different target loci in which one primer from one design
and one primer from the other design would anneal to overlapping target regions, an additional edge was
added between the nodes for the two designs. The weight of these edges is equal to the highest weight
assigned in step 4. Thus, step 5 prevents the library from containing primers that would anneal to
overlapping target regions.
Step 6
An initial interaction score threshold was calculated as follows:
weight_threshold = max(edge_weight) - 0.05 * (max(edge_weight) - min(edge_weight))
where max(edge_weight) is the maximum edge weight in the graph and min(edge_weight) is the minimum edge
weight in the graph. The initial bounds for the threshold were set as follows:
max_weight_threshold = max(edge_weight)
min_weight_threshold = min(edge_weight)
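As a short worked sketch of this initialization (the function name and return order are illustrative
choices):

```python
def initial_threshold(edge_weights):
    """Return (weight_threshold, min_weight_threshold, max_weight_threshold).

    The threshold starts 5% of the weight range below the maximum edge weight.
    """
    highest, lowest = max(edge_weights), min(edge_weights)
    threshold = highest - 0.05 * (highest - lowest)
    return threshold, lowest, highest


# With edge weights spanning 1.0 to 5.0, the range is 4.0, so the start is 5.0 - 0.2.
start_threshold, lower_bound, upper_bound = initial_threshold([1.0, 3.0, 5.0])
```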
Step 7
A new graph was constructed with the same set of nodes as the graph from step 5, containing only the
edges with weights exceeding the weight threshold. Thus, this step ignores interactions with scores
equal to or below the weight threshold.
Step 8
Nodes (and all of the edges connected to the removed nodes) were removed from the graph from step 7
until no edges remained. Nodes were removed by applying the following procedure repeatedly:
1. Find the node with the highest degree (the highest number of edges). If there is more than one,
choose one arbitrarily.
2. Define the set of nodes consisting of the node chosen above and all of the nodes connected to it,
excluding any nodes with a degree less than that of the node chosen above.
3. From this set, choose the node with the lowest target locus score from step 1 (a lower score
represents lower desirability). Remove that node from the graph.
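The thresholding of step 7 and the node-removal loop of step 8 can be sketched together as follows. The
adjacency-set representation and the tie-breaking by insertion order are implementation assumptions;
the procedure above only requires that ties be broken arbitrarily.

```python
def prune(nodes, edges, threshold, locus_score):
    """Remove nodes until no edge with weight above `threshold` remains.

    nodes: iterable of node ids; edges: dict of frozenset({u, v}) -> weight;
    locus_score: dict of node id -> target locus score (higher is more desirable).
    Returns the set of surviving nodes.
    """
    adjacency = {n: set() for n in nodes}
    for pair, weight in edges.items():
        if weight > threshold:  # step 7: ignore edges at or below the threshold
            u, v = tuple(pair)
            adjacency[u].add(v)
            adjacency[v].add(u)
    while any(adjacency.values()):
        # 1. Node with the highest degree (first one found on ties).
        top = max(adjacency, key=lambda n: len(adjacency[n]))
        degree = len(adjacency[top])
        # 2. That node plus its neighbours, excluding neighbours of lower degree.
        candidates = [top] + [n for n in adjacency[top] if len(adjacency[n]) >= degree]
        # 3. Remove the least desirable candidate and all of its edges.
        victim = min(candidates, key=lambda n: locus_score[n])
        for neighbour in adjacency.pop(victim):
            adjacency[neighbour].discard(victim)
    return set(adjacency)


toy_locus_scores = {"A": 0.9, "B": 0.5, "C": 0.4, "D": 0.8}
hub_edges = {frozenset("AB"): 5.0, frozenset("AC"): 5.0, frozenset("AD"): 1.0}
survivors_hub = prune("ABCD", hub_edges, 2.0, toy_locus_scores)
survivors_pair = prune("ABCD", {frozenset("AB"): 5.0}, 2.0, toy_locus_scores)
```

In the first toy graph the hub node A is removed even though its locus score is high, because its
neighbours have a lower degree and are therefore excluded from the candidate set; in the second, A and B
have equal degree, so the lower-scoring B is removed.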
Step 9
If the number of nodes remaining in the graph satisfied the required number of target loci for the
multiplex PCR pool (within an acceptable tolerance), the method continued at step 10.
If there were too many or too few nodes remaining in the graph, a binary search was performed to
determine which threshold value would result in the desired number of nodes remaining in the graph. If
there were too many nodes in the graph, the weight threshold bounds were adjusted as follows:
max_weight_threshold = weight_threshold
Otherwise (if there were too few nodes in the graph), the weight threshold bounds were adjusted as
follows:
min_weight_threshold = weight_threshold
The weight threshold was then adjusted as follows:
weight_threshold = (max_weight_threshold + min_weight_threshold) / 2
Steps 7 through 9 were then repeated.
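The binary search over the weight threshold can be sketched as follows. The callable `count_remaining`
stands in for steps 7 and 8 (thresholding the graph and removing nodes) and is a hypothetical
placeholder; the sketch also assumes that a higher threshold never leaves fewer nodes, and it adds an
iteration cap that is not part of the described method.

```python
def search_threshold(count_remaining, min_weight, max_weight, target, tolerance=0, max_iter=50):
    """Binary-search the weight threshold until about `target` nodes remain.

    count_remaining(threshold) -> number of nodes left after steps 7 and 8.
    """
    low, high = min_weight, max_weight
    threshold = max_weight - 0.05 * (max_weight - min_weight)  # step 6 starting value
    for _ in range(max_iter):
        remaining = count_remaining(threshold)
        if abs(remaining - target) <= tolerance:
            break
        if remaining > target:
            high = threshold  # too many nodes: lower the upper bound
        else:
            low = threshold   # too few nodes: raise the lower bound
        threshold = (high + low) / 2
    return threshold


# Stand-in for the real pruning: more nodes survive as the threshold rises.
toy_count = lambda t: int(10 * t)
found_threshold = search_threshold(toy_count, 0.0, 10.0, target=42)
```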
Step 10
The primer pair designs associated with the nodes remaining in the graph were selected for the primer
library. This primer library can be used in any of the methods of the invention.
If desired, this method of designing and selecting primers can be performed for primer libraries in
which only one primer (rather than a primer pair) is used to amplify each target locus. In this case,
each node represents one primer per target locus (rather than a primer pair).
Embodiment 3
If desired, the methods of the invention can be evaluated for their ability to detect deletions or
duplications of a chromosome or chromosome segment. The following experiment was performed to
demonstrate detection of an overrepresentation of the X chromosome, or of a segment of the X chromosome,
inherited from the father compared with the X chromosome or X chromosome segment from the mother. The
assay was designed to mimic a deletion or duplication of a chromosome or chromosome segment. Different
amounts of DNA from a father (with XY sex chromosomes) were mixed with DNA from the father's daughter
(with XX sex chromosomes) in order to analyze the additional amount of X chromosome from the father
(Figures 19A-19D).
DNA from father and daughter cell lines was extracted and quantified using Qubit. The father cell line
AG16782, cAG16782-2-F, and the daughter cell line AG16777, cAG16777-2-P, were used. To determine the
father's X chromosome haplotype, SNPs that are present on the X chromosome but not on the Y chromosome
were detected, so that there would be a signal from the father's X chromosome rather than from the Y
chromosome. The daughter inherited this haplotype from her father. The haplotype of the daughter's other
X chromosome was inherited from her mother. This maternal haplotype can be determined by assigning the
SNPs in the daughter cell line DNA that were not inherited from the father to the haplotype from the
mother.
To determine whether an overrepresentation of the X chromosome from the father could be detected,
different amounts of DNA from the father cell line were mixed with DNA from the daughter cell line. The
total DNA input was about 75 ng of genomic DNA (about 25k copies). About 3,456 SNPs on the X and Y
chromosomes were assayed using direct multiplex PCR amplification. The amplified products were sequenced
in Rapid/HT mode using 50 bp single-read sequencing with a 7 bp barcode. There were about 10K reads per
SNP.
As shown in Figures 19A-19D, chimerism from the father's DNA can be detected. These figures show that
overrepresentation of a chromosome segment or of an entire chromosome can be detected.
All patents, patent applications, and published references cited herein are hereby incorporated by
reference in their entirety. While the methods of the present disclosure have been described in
connection with specific embodiments thereof, it will be understood that they are capable of further
modification. Furthermore, this application is intended to cover any variations, uses, or adaptations of
the methods of the present disclosure, including such departures from the present disclosure as come
within known or customary practice in the art to which the methods of the present disclosure pertain,
and as fall within the scope of the appended claims. Any of the embodiments of the invention can be
performed by analyzing DNA and/or RNA in a sample. For example, any of the methods disclosed herein for
DNA can be readily adapted for RNA, such as by including a reverse transcription step to convert the RNA
into DNA.
Embodiment 4
This embodiment describes an exemplary method for non-invasive cell-free tumor DNA-based detection of
breast cancer-related copy number variations. Breast cancer screening involves mammography, which
results in a high false positive rate and misses some cancers. Analysis of tumor-derived circulating
cell-free DNA (ctDNA) for cancer-associated CNVs may allow for earlier, safer, and more accurate
screening. A SNP-based massively multiplex PCR (mmPCR) approach was used to screen for CNVs in ctDNA
isolated from the plasma of breast cancer patients. The mmPCR assay was designed to target 3,168 SNPs on
chromosomes 1, 2 and 22, which often have CNVs in cancer (e.g., 49% of breast cancer samples have a 22q
deletion). Six plasma samples from breast cancer patients were analyzed: one stage IIa, four stage IIb,
and one stage IIIb. Each sample had CNVs on one or more of the targeted chromosomes. The assay
identified CNVs in all six plasma samples, including one stage IIb sample that was correctly called at a
ctDNA fraction of 0.58% (Figures 30, 31B, 32A, 32B and 33); detection required only 86 heterozygous
SNPs. A stage IIa sample was also correctly called at a ctDNA fraction of 4.33% using about 636
heterozygous SNPs (Figures 29, 31A and 32A). This demonstrates that focal or whole chromosome arm CNVs,
which are common in cancer, can be readily detected.
To further evaluate sensitivity, 22 artificial mixtures were prepared in which DNA from a cancer cell
line containing a 3 Mb 22q CNV was mixed with DNA from a normal cell line (5:95) to simulate a ctDNA
fraction between 0.43% and 7.35% (Figures 28A-28C). The method correctly detected CNVs in 100% of these
samples. Thus, artificial cfDNA polynucleotide standards/controls can be prepared by spiking isolated
polynucleotide samples into other DNA samples, where the polynucleotide samples include a mixture of
fragmented polynucleotides generated from a non-cfDNA source known to exhibit a CNV (such as a tumor
cell line), at a concentration similar to that of cfDNA, for example 0.01% to 20%, 0.1% to 15%, or 0.4%
to 10% of the DNA in the fluid. These standards/controls can be used as controls for assay design,
characterization, development, and/or validation, and as quality control standards during testing, such
as cancer testing performed in a CLIA lab, and/or as standards included in research-use-only or
diagnostic test kits. In many cancers (including breast and ovarian cancer), CNVs are more common than
point mutations. This supports the view that this SNP-based mmPCR approach offers a cost-effective,
non-invasive method for detecting these cancers.
Embodiment 5
This embodiment describes an exemplary method for detecting copy number variations in breast cancer
samples using SNP-targeted massively multiplex PCR. Evaluation of CNVs in tumor tissue typically
involves SNP microarrays or aCGH. These methods have high whole-genome resolution, but require large
amounts of input material, have high fixed costs, and do not work well on formalin-fixed,
paraffin-embedded (FFPE) samples. For this embodiment, 28,000-plex SNP-targeted PCR with next-generation
sequencing (NGS) was used to target 1p, 1q, 2p, 2q, 4p16, 5p15, 7q11, 15q, 17p, 22q11, 22q13 and
chromosomes 13, 18, 21 and X for the detection of CNVs in breast cancer samples. Accuracy was validated
on 96 samples with aneuploidies or microdeletions. Single-molecule sensitivity was established by
analyzing single cells. Among 17 breast cancer samples (15 fresh-frozen and 2 FFPE tumor tissues, and 5
pairs of matched tumor and normal cell lines), 16 (including both FFPE samples) showed full or partial
CNVs in 1 to 15 targets (average: 7.8); evidence of tumor heterogeneity was observed. The three tissues
with a single CNV all had a 1q duplication, the most common cytogenetic abnormality in breast cancer.
The regions that most commonly had CNVs were 1q, 7p and 22q1. Only one tumor tissue (with 9 CNVs) had a
region with LOH; this LOH was also detected in the adjacent, presumptively normal tissue, which lacked
the other 8 CNVs. By comparison, five or more regions with LOH and a high overall CNV incidence
(average: 12.8) were detected in the cell lines. Thus, massively multiplex PCR offers an economical,
high-throughput method for studying CNVs in a targeted manner, and is suitable for samples that are
difficult to analyze, such as FFPE tissue.
Embodiment 6
This embodiment illustrates exemplary methods for calculating the limit of detection for any of the
methods of the invention. These methods were used to calculate the limit of detection for single
nucleotide variants (SNVs) in tumor biopsies (Figure 34) and plasma samples (Figure 35).
The first method (denoted "LOD-mr5" in Figures 34 and 35) calculates the limit of detection based on a
minimum of 5 reads, which was chosen as the minimum number of times a SNV must be observed in the
sequencing data to have sufficient confidence that the SNV is actually present. The limit of detection
is based on whether the observed depth of read (DOR) is above this minimum of 5. The grey lines in
Figures 34 and 35 indicate SNVs for which the limit of detection is limited by the DOR. In these cases,
not enough reads were measured to reach the error limit of the assay. If desired, the limit of detection
for these SNVs can be improved (resulting in a lower numerical value) by increasing the DOR.
The second method (denoted "LOD-zs5.0" in Figures 34 and 35) calculates the limit of detection based on
the z-score. The z-score is the number of standard deviations by which the observed percent error lies
from the background mean error. If desired, outliers can be removed and the z-score recalculated; this
process can be repeated. The final weighted mean and standard deviation of the error rate are used to
calculate the z-score. The mean is weighted by the DOR, because precision is higher when the DOR is
higher.
For the exemplary z-score calculation used in this embodiment, the background mean error and standard
deviation were calculated, weighted by depth of read, for each genomic locus and substitution type from
all other samples in the same sequencing run. A sample was not considered in the background distribution
if it was 5 standard deviations away from the background mean. The orange lines in Figures 34 and 35
indicate SNVs for which the limit of detection is limited by the error rate. For these SNVs, enough
reads were obtained to reach the 5-read minimum, and the limit of detection is limited by the error
rate. If desired, the limit of detection can be improved by optimizing the assay to reduce the error
rate.
The third method (denoted "LOD-zs5.0-mr5" in Figures 34 and 35) calculates the limit of detection as the maximum of the two metrics above.
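A short sketch of how the three limits relate is given below. The text does not state the formulas explicitly, so two assumptions are made and labeled in the code: the read-limited limit is taken as the minimum read count divided by the DOR, and the error-limited limit is taken as the background mean plus z standard deviations. All names are hypothetical.

```python
def lod_min_reads(depth, min_reads=5):
    """Read-limited LOD (assumption: min_reads / DOR)."""
    return min_reads / depth

def lod_z_score(bg_mean, bg_std, z=5.0):
    """Error-limited LOD (assumption: background mean + z standard deviations)."""
    return bg_mean + z * bg_std

def lod_combined(depth, bg_mean, bg_std, min_reads=5, z=5.0):
    """Maximum of the two limits, plus which factor is limiting."""
    mr = lod_min_reads(depth, min_reads)
    zs = lod_z_score(bg_mean, bg_std, z)
    limiting = "DOR" if mr >= zs else "error rate"
    return max(mr, zs), limiting

# Hypothetical SNVs: a shallow one and a deeply sequenced one.
shallow = lod_combined(depth=1000, bg_mean=0.001, bg_std=0.0002)
deep = lod_combined(depth=100000, bg_mean=0.001, bg_std=0.0002)
print(shallow, deep)
```

Under these assumptions, increasing the DOR lowers only the read-limited term, so a deeply sequenced SNV ends up limited by the error rate instead.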
In the analysis of the tumor samples shown in Figure 34, the mean limit of detection was 0.36% and the median limit of detection was 0.28%. The number of DOR-limited SNVs (gray lines) was 934. The number of error-rate-limited SNVs (orange lines) was 738.
In the analysis of the plasma samples shown in Figure 35, the mean limit of detection was 0.24% and the median limit of detection was 0.09%. The number of DOR-limited SNVs (gray lines) was 732. The number of error-rate-limited SNVs (orange lines) was 921.
Embodiment 7
This embodiment illustrates the detection of CNVs and SNVs from the same single cell. The following primer libraries were used: a library of ~28,000 primers for detecting CNVs, a library of ~3,000 primers for detecting CNVs, and a library of primers for detecting SNVs. For single-cell analysis, cells were serially diluted until there were 3 or 4 cells per droplet. An individual cell was pipetted and placed in a PCR tube. The cell was lysed using proteinase K, salt, and DTT under the following conditions: 56 °C for 20 minutes, 95 °C for 10 minutes, then a 4 °C hold. For analysis of genomic DNA, DNA from the same cell line as the analyzed single cell was either purchased or obtained by culturing the cells and extracting the DNA.
For amplification with the library of ~28,000 primers, the following PCR conditions were used: a 40 μL reaction volume, 7.5 nM of each primer, and 2× master mix (MM). In some embodiments, the QIAGEN Multiplex PCR Kit is used for the master mix (QIAGEN catalog no. 206143; see the World Wide Web at qiagen.com/products/catalog/assay-technologies/end-point-pcr-and-pr-pcr-reagents/qiagen-multiplex-pcr-kit, which is incorporated herein by reference in its entirety). The kit includes 2× QIAGEN Multiplex PCR Master Mix (providing a final concentration of 3 mM MgCl2; 3 × 0.85 ml), 5× Q-Solution (1 × 2.0 ml), and RNase-free water (2 × 1.7 ml). The QIAGEN Multiplex PCR Master Mix (MM) contains a combination of KCl and (NH4)2SO4 as well as the PCR additive Factor MP, which increases the local concentration of primers at the template. Factor MP stabilizes specifically bound primers, allowing efficient primer extension by, for example, HotStarTaq DNA Polymerase. HotStarTaq DNA Polymerase is a modified form of Taq DNA polymerase that has no polymerase activity at ambient temperatures. The following thermal cycling conditions were used for the first round of PCR: 95 °C for 10 minutes; 25 cycles of 96 °C for 30 seconds, 65 °C for 29 minutes, and 72 °C for 30 seconds; then 72 °C for 2 minutes and a 4 °C hold. For the second round of PCR, a 10 μl reaction volume, 1× MM, and 5 nM of each primer were used. The following thermal cycling conditions were used: 95 °C for 15 minutes; 25 cycles of 94 °C for 30 seconds, 65 °C for 1 minute, 60 °C for 5 minutes, 65 °C for 5 minutes, and 72 °C for 30 seconds; then 72 °C for 2 minutes and a 4 °C hold.
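To give a sense of the run time implied by the two thermal cycling programs above, the following sketch adds up the programmed step times. It is only arithmetic on the stated conditions; the function name is hypothetical, and ramp times and the final 4 °C hold are ignored.

```python
def program_seconds(initial_s, cycles, cycle_steps_s, final_s):
    """Total programmed time in seconds, ignoring ramp times and the 4 °C hold."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s

# First round: 95 °C 10 min; 25 x [96 °C 30 s, 65 °C 29 min, 72 °C 30 s]; 72 °C 2 min.
first_round = program_seconds(10 * 60, 25, [30, 29 * 60, 30], 2 * 60)

# Second round: 95 °C 15 min; 25 x [94 °C 30 s, 65 °C 1 min, 60 °C 5 min,
# 65 °C 5 min, 72 °C 30 s]; 72 °C 2 min.
second_round = program_seconds(15 * 60, 25, [30, 60, 5 * 60, 5 * 60, 30], 2 * 60)

print(first_round / 3600, "hours;", second_round / 3600, "hours")
```

The long 65 °C annealing step dominates the first round, which is consistent with the low per-primer concentration used in the highly multiplexed reaction.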
For the library of~3,000 primers, exemplary reaction condition includes the 10ul reaction volume of each primer, 2 ×
For MM, 70mM TMAC and 2nM primer for the primed libraries for detecting SNV, exemplary reaction condition includes every kind of primer
10ul reaction volume, 2 × MM, 4mM EDTA and 7.5nM primer.Illustrative thermal cycle conditions include 95 DEG C 15 minutes, 94 DEG C
30 seconds, 65 DEG C of 15 minutes and 72 DEG C of 20 of 30 seconds circulations;Then 72 DEG C 2 minutes, 4 DEG C of holdings.
The amplified products were barcoded. One run of sequencing was performed, with an approximately equal number of reads per sample.
Figures 36A and 36B show results from the analysis of genomic DNA (Figure 36A) or DNA from a single cell (Figure 36B) using the library of approximately 28,000 primers designed for detecting CNVs. Approximately 4 million reads were measured per sample. The presence of two central bands instead of one central band indicates the presence of a CNV. For the three samples of DNA from a single cell, the percentage of mapped reads was 89.9%, 94.0%, and 93.4%, respectively. For the two samples of genomic DNA, the percentage of mapped reads was 99.1% for each sample.
Figures 37A and 37B show results from the analysis of genomic DNA (Figure 37A) or DNA from a single cell (Figure 37B) using the library of approximately 3,000 primers designed for detecting CNVs. Approximately 1.2 million reads were measured per sample. The presence of two central bands instead of one central band indicates the presence of a CNV. For the three samples of DNA from a single cell, the percentage of mapped reads was 98.2%, 98.2%, and 97.9%, respectively. For the two samples of genomic DNA, the percentage of mapped reads was 98.8% for each sample. Figure 38 shows the uniformity of the DOR for these ~3,000 loci.
For calling SNVs, the call percentages for true positive mutations were similar for DNA from a single cell and for genomic DNA. A plot of the call percentage for true positive mutations in single cells (y-axis) against that in genomic DNA (x-axis) gave a fitted line of y = 1.0076x - 0.3088 with R2 = 0.9834. Figure 39 shows that the error call metrics were similar for genomic DNA and for DNA from a single cell. Figure 40 shows that the error rate for detecting transition mutations was greater than the error rate for detecting transversion mutations, indicating that it may be desirable to select transversion mutations rather than transition mutations for detection.
Embodiment 8
This embodiment further demonstrates a massively multiplexed PCR method for determining chromosomal aneuploidy and CNVs, termed CoNVERGe (Copy Number Variant Events Revealed Genotypically), and further illustrates the development and use of "PlasmArt" standards for PCR of ctDNA samples. PlasmArt standards include polynucleotides that have sequence identity to genomic regions known to exhibit CNVs and a size distribution that reflects that of cfDNA fragments naturally found in plasma.
Sample collection
Human breast cancer cell lines (HCC38, HCC1143, HCC1395, HCC1937, HCC1954, and HCC2218) and matched normal cell lines (HCC38BL, HCC1143BL, HCC1395BL, HCC1937BL, HCC1954BL, and HCC2218BL) were obtained from the American Type Culture Collection (ATCC). Trisomy 21 B-lymphocyte (AG16777) and paired father/child DiGeorge syndrome (DGS) cell lines (GM10383 and GM10382, respectively) were from the Coriell Cell Repository (Camden, New Jersey). GM10382 cells have only the paternal 22q11.2 region.
We acquired tumor tissue from 16 breast cancer patients, including 11 fresh frozen (FF) samples from Geneticist (Glendale, California) and 5 formalin-fixed paraffin-embedded (FFPE) samples from North Shore-LIJ (Manhasset, New York). We obtained matched buffy coat samples for 8 patients and matched plasma samples for 9 patients. FF tumor tissue and matched buffy coat and plasma samples from five ovarian cancer patients were from North Shore-LIJ. For 8 breast tumor FF samples, tissue subsections were resected for analysis. Institutional review board approvals from the North Shore-LIJ IRB and the Kharkiv National Medical University Ethics Committee were obtained for sample collection, and informed consent was obtained from all subjects.
Blood samples were collected into EDTA tubes. Circulating tumor DNA was isolated from 1 mL of plasma using the QIAamp Circulating Nucleic Acid Kit (Qiagen, Valencia, CA).
To prepare PlasmArt standards according to one exemplary method, 9 × 10^6 cells were first lysed with hypotonic lysis buffer (20 mM Tris-Cl (pH 7.5), 10 mM NaCl, and 3 mM MgCl2) for 15 minutes on ice. Then, 10% IGEPAL CA-630 (Sigma, St. Louis, MO) was added to a final concentration of 0.5%. After centrifugation at 3,000 g for 10 minutes at 4 °C, the pelleted nuclei were resuspended in 1× micrococcal nuclease (MNase) buffer (New England BioLabs, Ipswich, MA) before 1000 U of MNase (New England BioLabs) was added, and then incubated for 5 minutes at 37 °C. The reaction was stopped by adding EDTA to a final concentration of 15 mM. Undigested chromatin was removed by centrifugation at 2,000 g for 1 minute. Fragmented DNA was purified with the DNA Clean & Concentrator-500 kit (Zymo Research, Irvine, CA). Mononucleosomal DNA produced by MNase digestion was also purified and size-selected using AMPure XP magnetic beads (Beckman Coulter, Brea, CA). DNA fragments were sized and quantified with a Bioanalyzer DNA 1000 chip (Agilent, Santa Clara, CA).
To model ctDNA at different concentrations, different fractions of PlasmArts from HCC1954 and HCC2218 cancer cells were mixed with those from the corresponding matched normal cell lines (HCC1954BL and HCC2218BL, respectively). Three samples at each concentration were analyzed. Similarly, to model allelic imbalance in plasma DNA in a focal 3.5 Mb region, we generated PlasmArts from DNA mixtures containing different ratios of DNA from a child with a maternal 22q11.2 deletion and DNA from the father. Samples containing only the father's DNA were used as negative controls. Eight samples at each concentration were analyzed.
Thus, to evaluate the sensitivity and reproducibility of CoNVERGe, especially when the proportion of abnormal DNA for a CNV, or average allelic imbalance (AAI), is low, we used it to detect CNVs in DNA mixtures comprising a previously characterized abnormal sample titrated into a matched normal sample. The mixtures consisted of artificial cfDNA, termed "PlasmArt", with a fragment size distribution approximating that of natural cfDNA (see above). Figure 42 graphically displays the size distribution of an exemplary PlasmArt prepared from a cancer cell line, compared with the size distribution of cfDNA; CNVs were observed on chromosome arms 1p, 1q, 2p, and 2q. In the first pair, a son's tumor DNA sample having a 3 Mb focal CNV deletion of the 22q11.2 region was titrated into a matched normal sample from the father at between 0 and 1.5% of total cfDNA (Figure 41a). CoNVERGe reproducibly identified CNVs corresponding to the known abnormality, with estimated AAI >0.35%, in mixtures of >0.5% +/- 0.2% AAI; it failed to detect the CNV in 6/8 replicates at 0.25% abnormal DNA, and it reported values of <0.05% for all eight negative control samples. The AAI values estimated by CoNVERGe showed high linearity (R2 = 0.940) and reproducibility (error variance = 0.087). The assay is sensitive to different levels of amplification within the same sample. Based on these data, a conservative detection threshold of 0.45% AAI could be used for subsequent analyses. Using this cutoff, another experiment was performed in which PlasmArt synthetic ctDNA was mixed at known concentrations to create synthetic cancer plasma of about 0.5% to about 3.5%. Negative plasma was also included as a control. All of the synthetic cancer plasmas yielded estimates above 0.45%, and the readings for the negative plasma were well below 0.45% (Figures 43A-C). In Figure 43A, the right panel shows the maximum-likelihood estimate of the tumor DNA fraction as an odds ratio plot. Figure 43B is a plot for the detection of transversion events. Figure 43C is a plot for the detection of transition events.
Two additional PlasmArt titrations, prepared from paired tumor and normal cell line samples and having CNVs on chromosome 1 or chromosome 2, were also evaluated (Figures 41b, 41c). Among negative controls, all values were <0.45%, and high linearity (R2 = 0.952 for HCC1954 1p, R2 = 0.993 for HCC1954 1q, R2 = 0.977 for HCC2218 2p, R2 = 0.967 for HCC2218 2q) and reproducibility (error variance: value missing for HCC1954 1p, 0.029 for HCC1954 1q, 0.250 for HCC2218 2p, 0.350 for HCC2218 2q) were observed between the known input DNA amount and the amount calculated by CoNVERGe. The difference in the slopes of the regressions for the 1p and 1q regions of one sample pair correlated with the relative difference in copy number observed in the B-allele frequencies (BAFs) of the 1p and 1q regions of the same sample, demonstrating the relative accuracy of the AAI estimates calculated by CoNVERGe (Figures 41c, 41d).
The workflow for processing samples is shown in Figure 63. CoNVERGe is applicable to a variety of sample sources, including FFPE, fresh frozen, single cell, germline control, and cfDNA. We applied CoNVERGe to six human breast cancer cell lines and their matched normal cell lines to assess whether it could detect somatic CNVs. Arm-level and focal CNVs were present in all six tumor cell lines but absent from their matched normal cell lines, with the exception of chromosome 2 in HCC1143, in which the normal cell line showed a deviation from the 1:1 homolog ratio (Figure 63b). To validate these results on a different platform, we performed CytoSNP-12 microarray analysis, which produced consistent results for all samples (Figures 63d, 63e). Moreover, the maximum homolog ratios for CNVs identified by CoNVERGe and by CytoSNP-12 microarrays showed a strong linear correlation (R2 = 0.987, P < 0.001) (Figure 63f). We next applied CoNVERGe to fresh frozen (FF) (Figure 64a) and formalin-fixed, paraffin-embedded (FFPE) breast tumor tissue samples (Figures 64b, 64d). In both sample types, several arm-level and focal CNVs were present; however, no CNVs were detected in DNA from matched buffy coat samples. CoNVERGe results were highly correlated with those from microarray analysis of the same samples (Figures 64e-h; R2 = 0.909, P < 0.001 for CytoSNP-12 on FF; R2 = 0.992, P < 0.001 for OncoScan on FFPE). CoNVERGe also produced consistent results on small amounts of DNA extracted from laser capture microdissection (LCM) samples, for which microarray methods are not applicable.
Detection of CNVs in single cells with CoNVERGe
To test the limits of applicability of this mmPCR approach, we isolated single cells from the six cancer cell lines described above and from a B-lymphocyte cell line that had no CNVs in the target regions. The CNV profiles from these single-cell experiments were consistent across three replicates and with those from genomic DNA (gDNA) extracted from a bulk sample of about 20,000 cells (Figure 65). Based on the number of SNPs with no sequencing reads, the average assay drop-out rate for bulk samples was 0.48% (range: 0.41%-0.60%), which is attributable to either synthesis or assay design failure. For single cells, the additional average assay drop-out rate observed was 0.39% (range: 0.19%-0.67%). For single-cell assays that did not fail (that is, had no assay drop-out), the average single ADO rate calculated using heterozygous SNPs was only 0.05% (range: 0.00%-0.43%). In addition, the percentage of SNPs with high-confidence genotypes (that is, SNP genotypes determined with at least 98% confidence) was similar for single-cell and bulk samples, and the genotypes in the single-cell samples matched those in the bulk samples (average 99.52%; range: 92.63%-100.00%).
In single cells, allele frequencies are expected to directly reflect chromosome copy numbers, unlike in tumor samples, where this relationship may be confounded by tumor heterogeneity (TH) and non-tumor cell contamination. BAFs of 1/n and (n-1)/n indicate n chromosome copies in a region. Chromosome copy numbers are indicated on the allele frequency plots for the single cells and matched gDNA samples (Figure 65).
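The relationship between BAF bands and copy number stated above can be illustrated with a small sketch. The helper names and the nearest-band rule for inferring n are assumptions made for illustration only; the analysis behind Figure 65 is not reproduced here.

```python
from fractions import Fraction

def expected_baf_bands(n):
    """Heterozygous BAF bands 1/n and (n-1)/n for n chromosome copies."""
    return sorted({Fraction(1, n), Fraction(n - 1, n)})

def copies_from_baf(baf, max_copies=6):
    """Assumed rule: pick the n whose 1/n or (n-1)/n band lies closest to the observed BAF."""
    def distance(n):
        return min(abs(baf - float(band)) for band in expected_baf_bands(n))
    return min(range(1, max_copies + 1), key=distance)

for n in range(1, 5):
    print(n, [str(band) for band in expected_baf_bands(n)])
print(copies_from_baf(0.66))
```

For n = 2 the two bands coincide at 1/2, which matches the single central band expected for a normal disomic region, while n = 3 gives the two bands at 1/3 and 2/3.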
Application of CoNVERGe to plasma samples
To investigate the ability of CoNVERGe to detect CNVs in real plasma samples, we applied our method to cfDNA paired with a matched tumor biopsy from each of two stage II breast cancer patients and five late-stage ovarian cancer patients. In all seven patients, CNVs were detected in both the FF tumor tissue and the corresponding plasma sample (Figure 66). Figure 67 provides a list of SNV breast cancer mutations. Across the five regions assayed, which represent about 20% of the genome, a total of 32 CNVs at a level of >0.45% AAI were detected in the seven plasma samples (range: 0.48-12.99% AAI). Note that, owing to the lack of an alternative orthogonal method, the presence of CNVs in plasma could not be confirmed.
Although AAI estimates may appear correlated with the BAFs in the tumor, they are not necessarily directly proportional because of tumor heterogeneity. For example, the ellipse in the top-left region of the figure marks a region in sample BC5 (Figure 66a) with a BAF compatible with N=11; combining this with the AAI calculated from the plasma sample led to estimates of c of 2.33% and 2.67% for the two regions. Estimating c using other regions in the sample gave values between 4.46% and 9.53%, which clearly demonstrates the presence of tumor heterogeneity.
These data indicate that CNVs can be detected in plasma in a majority of samples, and suggest that the more prevalent a CNV is in the tumor, the more likely it is to be observed in cfDNA. Furthermore, CoNVERGe detected CNVs from liquid biopsies that might otherwise go unobserved in a traditional tumor biopsy.
Embodiment 9
This embodiment provides details of certain exemplary sample preparation methods for CoNVERGe analysis of different types of samples.
Single-cell CNV protocol for 28,000-plex PCR
Multiplex PCR allows simultaneous amplification of many targets in a single reaction. Target SNPs were identified in each genomic region with at least 10% population minor allele frequency (1000 Genomes Project data; April 30, 2012 release). For each SNP, multiple semi-nested primers were designed to have a maximum amplicon length of 75 bp and a melting temperature between 54 and 60.5 °C. Primer interaction scores were calculated for all possible combinations of primers; primers with high scores were eliminated to reduce the likelihood of primer-dimer product formation. Candidate PCR assays were ranked and selected on the basis of target SNP minor allele frequency, observed heterozygosity rate (from dbSNP), presence in HapMap, and amplicon length.
In some experiments, single-cell samples were prepared and amplified using the mmPCR 28,000-plex protocol. Samples were prepared in the following way: for single-cell analysis, cells were serially diluted until there were 3 or 4 cells per droplet. An individual cell was pipetted and placed in a PCR tube. The cell was lysed using proteinase K, salt, and DTT under the following conditions: 56 °C for 20 minutes, 95 °C for 10 minutes, then a 4 °C hold. For analysis of genomic DNA, DNA from the same cell line as the analyzed single cell was either purchased or obtained by culturing the cells and extracting the DNA. DNA was amplified in a 40 μL reaction volume containing Qiagen mp-PCR master mix (2× MM final concentration) and 7.5 nM primer concentration. For the 28K primer pairs with semi-nested Rev primers, the program was: 95 °C for 10 minutes; 25 × [96 °C for 30 seconds, 65 °C for 29 minutes, 72 °C for 30 seconds]; 72 °C for 2 minutes; 4 °C hold. The amplification product was diluted 1:200 in water, and 2 μl was added to STAR 2 (10 μl reaction volume) containing 1× MM and 5 nM primer concentration, and PCR was performed using semi-nested inner Fwd primers and tag-specific Rev primers: 95 °C for 15 minutes; 25 × [94 °C for 30 seconds, 65 °C for 1 minute, 60 °C for 5 minutes, 65 °C for 5 minutes, 72 °C for 30 seconds]; 72 °C for 2 minutes; 4 °C hold.
Full sequencing tags and barcodes were attached to the amplification products, which were amplified for 9 cycles using adaptor primers. Before sequencing, the barcoded library products were pooled, purified with the QIAquick PCR Purification Kit (Qiagen), and quantified using the Qubit dsDNA BR Assay Kit (Life Technologies, USA). Amplicons were sequenced using an Illumina HiSeq 2500 sequencer.
DNA extraction from blood/plasma samples
Blood samples were collected into EDTA tubes. The whole blood sample was centrifuged and separated into three layers: the upper layer, 55% of the blood sample, was plasma and contained cell-free DNA (cfDNA); the buffy coat middle layer, less than 1% of the total, contained white blood cells with DNA; and the bottom layer, 45% of the collected blood sample, contained red blood cells, with no DNA present in this fraction because red blood cells are enucleated. Circulating tumor DNA was isolated from at least 1 mL of plasma using the QIAamp Circulating Nucleic Acid Kit (Qiagen, Valencia, CA) according to the manufacturer's protocol.
Plasma CNV protocol of 3,168-plex for chromosomes 1p, 1q, 2p, 2q, and 22q11
Plasma DNA libraries were prepared and amplified using the mmPCR 3,168-plex protocol. Samples were prepared in the following way: up to 20 mL of blood was centrifuged to isolate the buffy coat and the plasma. cfDNA extraction from plasma and library preparation were performed. DNA was eluted in 50 μL TE buffer. The input for mmPCR was 6.7 μL of amplified and purified Natera plasma library, at an input amount of approximately 1200 ng. Plasma DNA was amplified in a 20 μL reaction volume containing Qiagen mp-PCR master mix (2× MM final concentration) and 2 nM tagged primer concentration (12.7 μM total), with the following PCR program: 95 °C for 10 minutes; 25 × [96 °C for 30 seconds, 65 °C for 20 minutes, 72 °C for 30 seconds]; 72 °C for 2 minutes; 4 °C hold. The amplification product was diluted 1:2000 in water, and 1 μl was added to the barcoding PCR in a 10 μL reaction volume. Barcodes were attached to the amplification products by PCR amplification for 12 cycles using tag-specific primers. The products of multiple samples were pooled, then purified with the QIAquick PCR Purification Kit (Qiagen) and eluted in 50 μl DNA suspension buffer. Samples were sequenced by NGS as described for the single-cell CNV protocol for 28,000-plex PCR.
Feasibility of a breast cancer SNV panel from plasma
cfDNA from breast cancer patient blood samples was prepared and amplified using 336 primer pairs distributed across four 84-plex pools. Natera plasma libraries were prepared as described for the plasma CNV protocol of 3,168-plex for chromosomes 1p, 1q, 2p, 2q, and 22q11. DNA was eluted in 50 μL TE buffer. The input for mPCR was 2.5 μL of amplified and purified Natera plasma library, at an input amount of approximately 600 ng. Figures 68A-B show the major and minor allele frequencies of the SNPs used in the 3,168-plex mmPCR reaction. The x-axis gives the SNP number, from left to right, for chromosomes 1q, 1p, 2q, 2p, and 22q. SNPs were selected from the 1000 Genomes map (human genome build 19) and dbSNP, but only SNPs from 1000 Genomes were used to screen for minor allele frequency. Plasma DNA was amplified in four parallel reactions, one per 84-plex primer pool, each in a 10 μL reaction volume containing Qiagen mp-PCR master mix (2× MM final concentration), 4 mM EDTA, and 7.5 nM primer concentration (1.26 μM total), with the following PCR program: 95 °C for 15 minutes; 25 × [94 °C for 30 seconds, 65 °C for 15 minutes, 72 °C for 30 seconds]; 72 °C for 2 minutes; 4 °C hold. The amplification products of the 4 subpools were each diluted 1:200 in water, and 1 μl was added to a barcoding PCR reaction in a 10 μL reaction volume containing Q5 HS HF master mix (1× final) and 1 μM of each barcoding primer; each pool was amplified with the following program: 98 °C for 1 minute; 25 × [98 °C for 10 seconds, 70 °C for 10 seconds, 60 °C for 30 seconds, 65 °C for 15 seconds, 72 °C for 15 seconds]; 72 °C for 2 minutes; 4 °C hold. Libraries were purified with the QIAquick PCR Purification Kit (Qiagen) and eluted in 50 μl DNA suspension buffer. Samples were sequenced by paired-end sequencing.
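The total primer concentrations quoted in the two plasma protocols above can be checked with a line of arithmetic. The sketch below assumes two primers (forward and reverse) per assay, which is an assumption introduced here because it reproduces the stated totals; the function name is hypothetical.

```python
def total_primer_conc_um(n_assays, per_primer_nm, primers_per_assay=2):
    """Total primer concentration in micromolar.

    Assumption: each assay contributes `primers_per_assay` primers,
    each present at `per_primer_nm` nanomolar.
    """
    return n_assays * primers_per_assay * per_primer_nm / 1000.0

cnv_total = total_primer_conc_um(3168, 2.0)   # 3,168-plex at 2 nM per primer
snv_total = total_primer_conc_um(84, 7.5)     # one 84-plex pool at 7.5 nM per primer
print(cnv_total, snv_total)
```

Under this assumption the 3,168-plex reaction comes to about 12.7 μM and each 84-plex pool to 1.26 μM, in line with the totals given in the text.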
Embodiment 10
This embodiment provides details of certain exemplary methods for analyzing sequencing data to identify SNVs.
SNV method 1: For this embodiment, a background error model was constructed using normal plasma samples, which were sequenced on the same sequencing run to account for run-specific artifacts. In certain embodiments, 5, 10, 15, 20, 25, 30, 40, 50, 100, 150, 200, 250, or more than 250 normal plasma samples are analyzed on the same sequencing run. In certain illustrative embodiments, 20, 25, 40, or 50 normal plasma samples are analyzed on the same sequencing run. Noisy positions with a normal median variant allele frequency greater than a cutoff are removed. For example, in certain embodiments this cutoff is >0.1%, 0.2%, 0.25%, 0.5%, 1%, 2%, 5%, or 10%. In certain illustrative embodiments, noisy positions with a normal median variant allele frequency greater than 0.5% are removed. Outlier samples are removed iteratively from the model to account for noise and contamination. In certain embodiments, samples with a Z-score greater than 5, 6, 7, 8, 9, or 10 are removed from the data analysis. For each base substitution at every genomic locus, the depth-of-read-weighted mean and standard deviation of the error are calculated. Tumor or cell-free plasma sample positions with at least 5 variant reads and a Z-score of 10 against the background error model are called as candidate mutations.
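The calling rule of SNV method 1 can be sketched as follows, using the illustrative thresholds from the paragraph above (0.5% median cutoff, at least 5 variant reads, Z-score of 10). The function names and example numbers are hypothetical, and the background mean and standard deviation are assumed to have been computed beforehand from the normal samples.

```python
from statistics import median

def is_noisy_position(normal_vafs, cutoff=0.005):
    """True if the median variant allele frequency across normal samples exceeds the cutoff."""
    return median(normal_vafs) > cutoff

def call_candidate(variant_reads, depth, bg_mean, bg_std, min_reads=5, min_z=10.0):
    """Call a candidate mutation if it has enough variant reads and a high enough Z-score."""
    if depth == 0 or bg_std == 0:
        return False
    vaf = variant_reads / depth
    z = (vaf - bg_mean) / bg_std
    return variant_reads >= min_reads and z >= min_z

# Hypothetical position: background error 0.05% with standard deviation 0.02%.
print(is_noisy_position([0.001, 0.002, 0.010]))
print(call_candidate(30, 10000, 0.0005, 0.0002))
```

Both conditions must hold: a position with a very high Z-score but only 4 supporting reads is not called, and neither is a deeply covered position whose variant frequency sits close to the background.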
SNV method 2: For this embodiment, we aim to determine single nucleotide variants (SNVs) using plasma ctDNA data. We model the PCR process as a stochastic process, estimate the parameters using a training set, and make the final SNV calls using a separate test set. The main idea is to determine the propagation of errors across multiple PCR cycles, calculate the mean and variance of the background error, and distinguish the background error from true mutations.
The following parameters are estimated for each base:
$p$ = efficiency (the probability that each read is replicated in each cycle)
$p_e$ = error rate per cycle for mutation type $e$ (the probability that an error of type $e$ occurs)
$X_0$ = initial number of molecules
As reads are replicated during the PCR process, more errors occur. The error profile of a read is therefore determined by its degree of separation from the original read. We refer to a read as kth generation if it has gone through k replications before it was generated.
Let us define the following variables for each base:
$X_{ij}$ = number of generation $i$ reads generated in PCR cycle $j$
$Y_{ij}$ = total number of generation $i$ reads at the end of cycle $j$
$X_{ij}^{e}$ = number of generation $i$ reads with mutation $e$ generated in PCR cycle $j$
Moreover, in addition to the $X_0$ normal molecules, suppose there are an additional $f_e X_0$ molecules with mutation $e$ at the start of the PCR process (hence $f_e/(1+f_e)$ is the fraction of mutated molecules in the initial mixture).
Given the total number of generation $i-1$ reads at cycle $j-1$, the number of generation $i$ reads generated at cycle $j$ has a binomial distribution with sample size $Y_{i-1,j-1}$ and probability parameter $p$. Hence, $E(X_{ij} \mid Y_{i-1,j-1}, p) = p\,Y_{i-1,j-1}$ and $\mathrm{Var}(X_{ij} \mid Y_{i-1,j-1}, p) = p(1-p)\,Y_{i-1,j-1}$.
We also have, from the definitions above, $Y_{ij} = \sum_{k \le j} X_{ik}$. Therefore, by recursion, simulation, or similar methods, we can determine $E(X_{ij})$. Similarly, we can use the distribution of $p$ to determine $\mathrm{Var}(X_{ij}) = E(\mathrm{Var}(X_{ij} \mid p)) + \mathrm{Var}(E(X_{ij} \mid p))$.
Finally, $E(X_{ij}^{e} \mid Y_{i-1,j-1}, p_e) = p_e\,Y_{i-1,j-1}$ and $\mathrm{Var}(X_{ij}^{e} \mid Y_{i-1,j-1}, p_e) = p_e(1-p_e)\,Y_{i-1,j-1}$, and we can use these to calculate $E(X_{ij}^{e})$ and $\mathrm{Var}(X_{ij}^{e})$.
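The binomial generation model above lends itself to both a recursion and a simulation. The sketch below is one possible reading, assuming reads are never lost so that the generation-i total only grows from cycle to cycle; the function names are hypothetical and error reads are not modeled here.

```python
import random

def expected_generation_counts(x0, p, n_cycles):
    """Expected number of generation-i reads after n_cycles, by recursion.

    Uses E(X_ij | Y_{i-1,j-1}, p) = p * Y_{i-1,j-1} and accumulates the new
    reads into the generation-i total.
    """
    ey = [float(x0)] + [0.0] * n_cycles
    for j in range(1, n_cycles + 1):
        # Go from high to low generation so ey[i - 1] is still the cycle j-1 value.
        for i in range(j, 0, -1):
            ey[i] += p * ey[i - 1]
    return ey

def simulate_generation_counts(x0, p, n_cycles, rng):
    """One stochastic realization: X_ij ~ Binomial(Y_{i-1,j-1}, p)."""
    y = [x0] + [0] * n_cycles
    for j in range(1, n_cycles + 1):
        for i in range(j, 0, -1):
            new_reads = sum(1 for _ in range(y[i - 1]) if rng.random() < p)
            y[i] += new_reads
    return y

expected = expected_generation_counts(100, 0.9, 3)
simulated = simulate_generation_counts(100, 0.9, 3, random.Random(0))
print(expected, simulated)
```

Summing the expected counts over all generations recovers $X_0 (1+p)^n$, which is the approximation used for the read count in the algorithm that follows.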
6+2 algorithm
The algorithm starts by estimating the efficiency and the error rate per cycle using the training set. Let $n$ denote the total number of PCR cycles.
The number of reads $R_b$ at each base $b$ can be approximated by $(1+p_b)^n X_0$, where $p_b$ is the efficiency at base $b$. Then $(R_b/X_0)^{1/n}$ can be used to approximate $1+p_b$. We can then determine the mean and standard deviation of $p_b$ across all training samples to estimate the parameters of its probability distribution (such as a normal, beta, or similar distribution).
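The efficiency estimate above amounts to inverting $R_b \approx (1+p_b)^n X_0$. A minimal sketch follows; the function names and the synthetic training read counts are assumptions for illustration.

```python
from statistics import mean, stdev

def estimate_efficiency(reads, x0, n_cycles):
    """Per-cycle efficiency p_b from R_b ~ (1 + p_b)^n * X_0."""
    return (reads / x0) ** (1.0 / n_cycles) - 1.0

def efficiency_summary(read_counts, x0, n_cycles):
    """Mean and sample standard deviation of p_b across training samples."""
    effs = [estimate_efficiency(r, x0, n_cycles) for r in read_counts]
    return mean(effs), stdev(effs)

# Synthetic training samples built from known efficiencies 0.8, 0.9, and 1.0.
n = 25
training_reads = [1000 * (1 + p) ** n for p in (0.8, 0.9, 1.0)]
eff_mean, eff_std = efficiency_summary(training_reads, 1000, n)
print(eff_mean, eff_std)
```

The resulting mean and standard deviation could then be used to parameterize whichever distribution is chosen for $p_b$, for example by the method of moments.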
Similarly, the number of error $e$ reads at each base $b$ can be used to estimate $p_e$. After determining the mean and standard deviation of the error rate across all training samples, we approximate its probability distribution (such as a normal, beta, or similar distribution), whose parameters are estimated using this mean and standard deviation.
Next, for the test data, we estimate the initial number of starting copies at each base by inverting the relation above, $X_0 \approx R_b/(1+p_b)^n$, averaged over $f(p_b)$, where $f(\cdot)$ is the distribution estimated from the training set.
We have thus estimated the parameters to be used in the stochastic process. Using these estimates, we can then estimate the mean and variance of the molecules created in each cycle (note that we do this separately for normal molecules, error molecules, and mutation molecules).
Finally, using a probabilistic method (such as maximum likelihood or a similar method), we can determine the best $f_e$ value, the one that best fits the distribution of error, mutation, and normal molecules. More specifically, we estimate the expected ratio of error molecules to total molecules in the final reads for various $f_e$ values, determine the likelihood of our data for each of these values, and then select the value with the highest likelihood.
In certain embodiments, method 2 above is carried out as follows:
a) Estimate the PCR efficiency and the per-cycle error rate using a training data set;
b) Estimate the number of starting molecules at each base for the test data set using the distribution of the efficiency estimated in step (a);
c) If desired, update the efficiency estimate for the test data set using the number of starting molecules estimated in step (b);
d) Estimate the mean and variance of the total molecules, background error molecules, and true mutation molecules (for a search space consisting of initial percentages of true mutation molecules) using the test set data and the parameters estimated in steps (a), (b), and (c);
e) Fit a distribution to the number of total error molecules (background error and true mutation) among the total molecules, and calculate the likelihood of each true mutation percentage in the search space; and
f) Determine the most likely true mutation percentage and calculate the confidence using the data from step (e).
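Steps (d) through (f) can be illustrated with a deliberately simplified grid search. This sketch swaps in a plain binomial likelihood with a fixed background error fraction in place of the fitted mean-and-variance model described above, so it is not the method of the text, only an instance of the same maximum-likelihood idea. All names and numbers are hypothetical.

```python
import math

def log_binom_pmf(k, n, q):
    """Log of the binomial probability of k successes in n trials."""
    q = min(max(q, 1e-12), 1 - 1e-12)
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(q) + (n - k) * math.log(1 - q))

def expected_error_fraction(f_e, bg_error):
    """Expected fraction of reads carrying the variant for a given f_e.

    f_e / (1 + f_e) is the starting mutant fraction; the remaining reads
    show the variant at the (assumed fixed) background error fraction.
    """
    mutant = f_e / (1.0 + f_e)
    return mutant + (1.0 - mutant) * bg_error

def best_mutation_fraction(error_reads, total_reads, bg_error, grid):
    """Return the f_e on the grid with the highest likelihood."""
    scored = [(log_binom_pmf(error_reads, total_reads,
                             expected_error_fraction(f, bg_error)), f)
              for f in grid]
    return max(scored)[1]

grid = [i / 10000 for i in range(201)]           # search space: f_e from 0 to 0.02
with_mutation = best_mutation_fraction(120, 10000, 0.002, grid)
background_only = best_mutation_fraction(20, 10000, 0.002, grid)
print(with_mutation, background_only)
```

A confidence value, as in step (f), could be derived from the same likelihood profile, for example by comparing the best likelihood with the likelihood at $f_e = 0$.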
Embodiment 11
This embodiment provides results from using the multiplex PCR CoNVERGe method provided herein to detect cancer by detecting CNVs in circulating DNA. The 3,168-plex plasma CNV protocol for chromosomes 1p, 1q, 2p, 2q, and 22q11 provided herein was used. Plasma from 21 breast cancer patients (stages I-IIIB) was analyzed. The results, shown in Figure 44, demonstrate that CNVs were detected in all samples using AAI >= 0.45% and requiring as few as 62 heterozygous SNPs. Plasma from ovarian cancer patients was analyzed using a similar procedure. With a 0.45% cutoff, a 100% ovarian cancer detection rate was achieved, as shown in Figure 25; each of the five samples also had a matched tumor sample.
Embodiment 12
This embodiment demonstrates that a significant improvement in the ability to detect cancer is achieved by determining the presence of both CNVs and SNVs in plasma. CNVs and SNVs were detected using the methods provided in the embodiments above. Samples were prepared according to the appropriate protocols in Embodiment 9. SNVs were identified using SNV method 1 above. As shown in Figure 46, analyzing plasma from stage I-III cancer patients for both CNVs and SNVs significantly improved the sensitivity of detection of breast and lung cancer compared with testing for SNVs alone. Analyzing SNVs only, 71% of cancers were detected in plasma samples. However, by analyzing for the presence of SNVs and/or CNVs, the detection rate in the analyzed patient population reached 83% for breast cancer and 92% for lung cancer. If all SNVs and CNVs identified in the TCGA and COSMIC data sets are considered, the diagnostic load is expected to be greater than 97% for breast cancer and >98% for lung cancer.
Samples from 41 patients at different cancer stages were further analyzed using the plasma sample preparation method provided in Embodiment 9 above and SNV Method 1. As shown in Figure 47, when CNVs and SNVs were measured in circulating tumor DNA from breast cancer patients, 60% of stage I, 88% of stage II and 100% of stage III primary breast cancers were detected, using a limit of quantitation of 0.2% ctDNA for SNVs and 0.45% ctDNA for CNVs. As shown in Figure 48, when CNVs and SNVs were detected in ctDNA and the 41 patient samples were broken down by breast cancer stage, 60% of stage I, 100% of stage II, 90% of stage IIA, 80% of stage IIB and 100% of stage III, IIIA and IIIB primary breast cancers were detected, using a limit of quantitation of 0.2% ctDNA for SNVs and 0.45% ctDNA for CNVs. As shown in Figure 49, when CNVs and SNVs were measured in circulating tumor DNA from 24 lung cancer patient samples, 88% of stage I, 100% of stage II and 100% of stage III lung cancers were detected, using a limit of quantitation of 0.2% ctDNA for SNVs and 0.45% ctDNA for CNVs. As shown in Figure 50, when CNVs and SNVs were detected in ctDNA and the 24 patient samples were broken down by lung cancer stage, a 100% detection rate was achieved for all stages except stage IB, for which an 82% detection rate was achieved, using a limit of quantitation of 0.2% ctDNA for SNVs and 0.45% ctDNA for CNVs.
Embodiment 13
This embodiment demonstrates that detecting SNVs in ctDNA overcomes the limitation that tumor heterogeneity places on identifying variant alleles in biopsy samples. TRACERx samples from three small cell lung carcinoma patients and one lung adenocarcinoma patient were used, for which tumor biopsies and corresponding pre-operative plasma samples had been collected for analysis of tumor heterogeneity. The samples were obtained from the Cancer Research UK Lung Cancer Centre of Excellence, University College London, London WC1E 6BT, UK. The samples were primary lung cancer samples used for the analysis of SNV mutations. Two to three biopsies, each from a different region of the whole cancerous lung, were taken from each patient (Figure 51A). Each biopsy sample was assayed by whole exome sequencing (Illumina HiSeq200; Illumina, San Diego, CA), followed by AmpliSeq sequencing on a PGM (Ion Torrent, South San Francisco, CA) to identify potential clonal heterogeneity. After sequencing and SNV analysis, the variant allele frequency (VAF) was determined for each biopsy sample (Figure 51B).
Plasma samples from each of the four patients were used to isolate ctDNA and to identify clonal and subclonal SNV mutations in plasma, in order to overcome tumor heterogeneity (Figure 52). Clonal populations had VAF allele calls in all assayed biopsy samples and in plasma, whereas subclonal populations had VAF allele calls in at least one, but not all, of the biopsy samples. Plasma was considered to represent the cumulative set of SNVs found in each patient's ctDNA. Not every SNV identified by sequencing could have a corresponding PCR assay designed for it.
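The clonal/subclonal distinction described above can be expressed as a small rule. The sketch below is illustrative only: it looks at biopsy calls alone (the text also requires a plasma call for clonal variants), it treats a VAF of zero as "not detected" and `None` as "no biopsy available", and the function name is hypothetical.

```python
def classify_variant(biopsy_vafs):
    """Classify one SNV from its per-biopsy VAF values (hypothetical helper).

    biopsy_vafs: one entry per biopsy region; None means no sample was
    available for that region, 0 means the variant was not detected there.
    """
    assayed = [vaf for vaf in biopsy_vafs if vaf is not None]
    if not assayed:
        return "unassayed"
    detected = [vaf for vaf in assayed if vaf > 0]
    if len(detected) == len(assayed):
        return "clonal"        # called in every assayed biopsy
    if detected:
        return "subclonal"     # called in at least one, but not all, biopsies
    return "not detected"


if __name__ == "__main__":
    print(classify_variant([0.12, 0.30, 0.08]))
    print(classify_variant([0.12, 0.0, None]))
```

With only one assayed biopsy this rule cannot tell clonal from subclonal variants, which is one reason sampling several regions matters.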
To compare AmpliSeq (Swanton) with the mmPCR/NGS assay for identifying tumor heterogeneity, Natera designed PCR assays for each SNV mutation to detect VAF in the biopsies and in the corresponding ctDNA from plasma (Figure 53). Blank cells indicate that no biopsy sample was available, and a zero indicates that no VAF was detected. The following 11 genes were initially identified by AmpliSeq as false positive (FP) or false negative (FN) calls (false VAF calls), but were correctly called as true positive (TP) or true negative (TN) by the Natera mmPCR/NGS assay: L12: CYFIP1, FAT1, MLLT4 and RASA1; L13: HERC4, JAK2, MSH2, MTOR and PLCG2; L15: GABRG1; L17: TRIM67. Surprisingly, these results were verified when the AmpliSeq raw sequencing data were re-examined. The raw AmpliSeq sequencing files showed that the data were below the detection threshold settings of the PGM or Illumina platform.
Of the 38 variants identified, 16 were detected in plasma. These were predominantly clonal SNV mutations present in several biopsy samples: patient L12: BRIP1, CARS, FAT1, MLLT4, NFE2L2, TP53, TP53; patient L13: EGFR, EGFR, TP53; and patient L15: KDM6A, ROS1. In addition, two patients were found to have a total of four subclonal variant mutations in plasma: L12: CIC, KDM6A; and L17: NF1, TRIM67. These results are summarized in Figure 54A, a whisker plot of the mean VAF for each sample listed in Figure 53. Figure 54B is a direct comparison, shown as a linear regression plot of the mean sample VAF for each assay.
Embodiment 14
This embodiment shows that using a low primer concentration, such that the amount of primer is the limiting reactant in multiplex PCR, improves the uniformity of read density across the amplification pool, and therefore the limit of detection, in a workflow that is followed by next-generation sequencing. Some experiments were performed using the 3,168-plex plasma CNV protocol according to Embodiment 9 above, except that the total reaction volume was 10 µL rather than 20 µL. In addition, PCR was run for 15, 20 or 25 cycles. Other experiments were performed on breast cancer samples using four 84-plex pools according to the protocol of Embodiment 9, except that the primer concentration was 2 nM and PCR amplification was run for 15, 20 or 25 cycles.
Without being bound by theory, it is believed that primer-limited multiplex PCR provides improved depth-of-read uniformity when performed before multi-read sequencing, such as sequencing on Illumina HiSeq or MiSeq systems or on Ion Torrent PGM or Proton systems, based on the following consideration: if some amplifications in a multiplex PCR have lower efficiency than others, normal multiplex PCR will yield a wide range of depth-of-read ("DOR") values. If, however, the primers are limiting and the multiplex PCR is run for more cycles than are needed to deplete the primers, the more efficient amplifications will stop doubling (because their primers have been used up), while the less efficient amplifications will continue to double. This leads to more similar amounts of amplification product for all amplicons, which translates into a more uniform DOR distribution.
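The reasoning above can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the procedure used in the experiments: each amplicon has a fixed per-cycle efficiency and its own primer budget, one primer molecule is consumed per new copy, and every name and number below is hypothetical.

```python
def primer_limited_pcr(efficiencies, start_copies, primer_molecules, cycles):
    """Simulate copy numbers per amplicon when primers are the limiting reagent."""
    copies = [float(start_copies)] * len(efficiencies)
    primers = [float(primer_molecules)] * len(efficiencies)
    for _ in range(cycles):
        for i, efficiency in enumerate(efficiencies):
            # New copies are capped by the primers still available for this amplicon.
            new_copies = min(copies[i] * efficiency, primers[i])
            copies[i] += new_copies
            primers[i] -= new_copies
    return copies


def spread(copies):
    """Ratio of the most to the least abundant amplicon (1.0 = perfectly uniform)."""
    return max(copies) / min(copies)


if __name__ == "__main__":
    efficiencies = [1.0, 0.8, 0.6]  # hypothetical per-cycle efficiencies
    for cycles in (15, 20, 25):
        result = primer_limited_pcr(efficiencies, 1e5, 1.2e10, cycles)
        print(cycles, round(spread(result), 2))
```

In this model the spread between amplicons shrinks as cycles are added and reaches 1 once every primer pool is exhausted, which mirrors the trend toward a more uniform DOR argued for in the text.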
The following calculation is used to determine the number of cycles for a given amount of primer and a given amount of starting nucleic acid template:
Assume a given starting DNA input level: 100k copies of each target (10^5; this can easily be achieved using an amplified library).
Assume that we use 2 nM of each primer as an exemplary concentration, although other concentrations such as 0.2, 0.5, 1, 1.5, 2, 2.5, 5 or 10 nM can also work.
Calculate the number of primer molecules for each primer: 2*10^-9 (molar concentration, 2 nM) × 10*10^-6 (reaction volume in liters, 10 µL) × 6*10^23 (molecules per mole, Avogadro's number) = 12*10^9.
Calculate the fold amplification needed to consume all primers: 12*10^9 (number of primer molecules) / 10^5 (copies of each target) = 12*10^4.
Calculate the number of cycles needed to reach that fold amplification, assuming 100% efficiency in each cycle: log2(12*10^4) ≈ 17 cycles (the logarithm is base 2 because the copy number doubles in each cycle).
Therefore, under these conditions (100k copies input, 2 nM primer, 10 µL reaction volume, and assuming 100% PCR efficiency in each cycle), the primers will be consumed after 17 PCR cycles.
However, the key assumption is that some products do not amplify with 100% efficiency. Without measuring their efficiencies (which is feasible only for a small number of them), consuming their primers will require more than 17 cycles.
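The arithmetic above can be reproduced in a few lines. This is a minimal sketch; the function name and the `efficiency` parameter are illustrative additions, and the code uses 6.022*10^23 for Avogadro's number where the text rounds to 6*10^23, which does not change the result.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def cycles_to_deplete_primers(primer_conc_molar, reaction_volume_liters,
                              start_copies, efficiency=1.0):
    """Cycles needed to consume one primer, given a per-cycle efficiency (0 to 1)."""
    primer_molecules = primer_conc_molar * reaction_volume_liters * AVOGADRO
    fold_amplification = primer_molecules / start_copies
    # Each cycle multiplies the copy number by (1 + efficiency).
    return math.ceil(math.log(fold_amplification, 1.0 + efficiency))


if __name__ == "__main__":
    # 2 nM primer, 10 µL reaction, 100k starting copies, 100% efficiency
    print(cycles_to_deplete_primers(2e-9, 10e-6, 1e5))       # 17
    # Same conditions with a less efficient amplicon (assumed 80% per cycle)
    print(cycles_to_deplete_primers(2e-9, 10e-6, 1e5, 0.8))
```

Lowering `efficiency` raises the cycle count, which is the point of the caveat above: amplicons that amplify at less than 100% efficiency need more than 17 cycles to use up their primers.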
Figures 55-58 show the results for the four 84-plex SNV PCR primer pools. For each pool, DOR efficiency was observed to improve as the number of cycles increased from 15 to 20 to 25. Similar results were obtained in the experiments using the 3,168-plex panel (Figures 59-61). As the depth of read increased, the limit of detection decreased (i.e., SNV sensitivity increased). In addition, sensitivity was consistently better for detecting transversion mutations than for transition mutations. When primer-limited multiplex PCR is used before multi-read sequencing, further gains in DOR efficiency may be obtained by using additional cycles.
Accordingly, in one aspect, provided herein is a method of amplifying a plurality of target loci in a nucleic acid sample, comprising (i) contacting the nucleic acid sample with a library of primers and other primer extension reaction components to provide a reaction mixture, wherein the relative amount of each primer in the reaction mixture, compared with the other primer extension reaction components, produces a reaction in which the primers are present at a limiting concentration, and wherein the primers hybridize to a plurality of different target loci; and (ii) subjecting the reaction mixture to primer extension reaction conditions for a sufficient number of cycles to consume or deplete the primers in the primer library, thereby producing amplification products comprising target amplicons. For example, the plurality of different target loci used to generate the reaction mixture can include at least 2; 3; 5; 10; 25; 50; 100; 200; 250; 500; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; or 100,000 different target loci, and at most 50; 100; 200; 250; 500; 1,000; 2,000; 5,000; 7,500; 10,000; 20,000; 25,000; 30,000; 40,000; 50,000; 75,000; 100,000; 200,000; 250,000; 500,000; or 1,000,000 different target loci.
Methods in illustrative embodiments include determining the amount of primer that will be a rate-limiting amount. This calculation typically includes estimating and/or determining the number of target molecules, and involves analyzing and/or determining the number of amplification cycles performed. For example, in illustrative embodiments, the concentration of each primer is less than 100, 75, 50, 25, 10, 5, 2, 1, 0.5, 0.25, 0.2 or 0.1 nM. In various embodiments, the GC content of the primers is between 30% and 80%, such as between 40% and 70% or between 50% and 60%, inclusive. In some embodiments, the range of GC content of the primers (e.g., the maximum GC content minus the minimum GC content, such as 80% - 60% = a range of 20%) is less than 30%, 20%, 10% or 5%. In some embodiments, the melting temperature (Tm) of the primers is 40°C to 80°C, such as 50°C to 70°C, 55°C to 65°C or 57°C to 60.5°C, inclusive. In some embodiments, the range of melting temperatures of the primers is less than 20°C, 15°C, 10°C, 5°C, 3°C or 1°C. In some embodiments, the length of the primers is 15 to 100 nucleotides, such as 15 to 75 nucleotides, 15 to 40 nucleotides, 17 to 35 nucleotides, 18 to 30 nucleotides, or 20 to 65 nucleotides. In some embodiments, the primers include a tag that is not target-specific, such as a tag that forms an internal loop structure. In some embodiments, the tag is between two DNA binding regions. In various embodiments, the primers include a 5' region that is specific for a target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus. In various embodiments, the length of the 3' region is at least 7 nucleotides. In some embodiments, the length of the 3' region is 7 to 20 nucleotides, such as 7 to 15 nucleotides or 7 to 10 nucleotides. In various embodiments, the test primers include a 5' region that is not specific for a target locus (such as a tag or a universal primer binding site), followed by a region that is specific for the target locus, an internal region that is not specific for the target locus and forms a loop structure, and a 3' region that is specific for the target locus. In some embodiments, the range of primer lengths is less than 50, 40, 30, 20, 10 or 5 nucleotides. In some embodiments, the length of the target amplicons is 50 to 100 nucleotides, such as 60 to 80 nucleotides or 60 to 75 nucleotides. In some embodiments, the range of target amplicon lengths is less than 100, 75, 50, 25, 15, 10 or 5 nucleotides.
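The GC content and length criteria above lend themselves to a simple check over a primer pool. The following sketch is illustrative only: the function names are hypothetical, the default thresholds are just one choice among the options listed in the text, and melting temperature is deliberately left out because the text does not say how Tm is computed.

```python
def gc_content(sequence):
    """Percentage of G and C bases in a primer sequence."""
    sequence = sequence.upper()
    return 100.0 * (sequence.count("G") + sequence.count("C")) / len(sequence)


def pool_meets_limits(primers, gc_bounds=(30.0, 80.0), max_gc_range=20.0,
                      length_bounds=(15, 100), max_length_range=50):
    """Check a primer pool against GC-content and length limits (assumed defaults)."""
    gcs = [gc_content(p) for p in primers]
    lengths = [len(p) for p in primers]
    return (
        all(gc_bounds[0] <= gc <= gc_bounds[1] for gc in gcs)      # inclusive bounds
        and (max(gcs) - min(gcs)) < max_gc_range                     # range "less than"
        and all(length_bounds[0] <= n <= length_bounds[1] for n in lengths)
        and (max(lengths) - min(lengths)) < max_length_range
    )


if __name__ == "__main__":
    # Made-up sequences, not primers from the document
    pool = ["ACGTACGTACGTACGTACGT", "GGCCATATGGCCATATGGCC"]
    print([gc_content(p) for p in pool], pool_meets_limits(pool))
```

The range checks use a strict comparison to match the "less than" wording, while the absolute bounds are inclusive, as the text states for GC content.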
In multiple embodiments of any aspect of the invention, the primer extension reaction conditions are polymerase chain reaction (PCR) conditions. In various embodiments, the length of the annealing step is greater than 3, 5, 8, 10 or 15 minutes but less than 240, 120, 60 or 30 minutes. In various embodiments, the length of the extension step is greater than 3, 5, 8, 10 or 15 minutes but less than 240, 120, 60 or 30 minutes.
Embodiment 15
This embodiment demonstrates the ability of the SNV detection methods of the invention to identify mosaicism in single cell analysis (also referred to as single molecule analysis). Figure 62 shows the results of multiplex PCR using the 28K-plex primer set according to the 28K single-cell method provided in Embodiment 9, with tumor cell genomic DNA and single cell/molecule inputs. Using this method, more than 85% of reads were mapped, corresponding to more than 4.7M reads (about 167 reads per target). The mosaicism observed in the cells is shown in the lower part of the figure.
Claims (20)
1. A method for detecting one or more mutations or genetic variations in a blood, serum or plasma sample from a subject having cancer or suspected of having cancer, the method comprising:
identifying a plurality of mutations or genetic variations in a tumor sample of the subject by whole exome sequencing;
collecting a blood, serum or plasma sample from the subject, and isolating cell-free DNA from the blood, serum, plasma or tumor sample;
amplifying, from the cell-free DNA, a plurality of loci corresponding to the mutations or genetic variations, to obtain amplicons;
sequencing the amplicons to obtain sequence reads; and
detecting, from the sequence reads, one or more of the mutations or genetic variations present in the cell-free DNA.
2. The method according to claim 1, wherein the cell-free DNA comprises circulating tumor DNA.
3. The method according to claim 1, wherein the mutations or genetic variations comprise one or more single nucleotide variant (SNV) mutations.
4. The method according to claim 1, wherein the mutations or genetic variations comprise one or more copy number variations (CNVs).
5. The method according to claim 3, wherein at least one of the SNV mutations is in a gene selected from the following:
TP53、PTEN、PIK3CA、APC、EGFR、NRAS、NF2、FBXW7、ERBBs、ATAD5、KRAS、BRAF、VEGF、EGFR、
HER2、ALK、p53、BRCA、BRCA1、BRCA2、SETD2、LRP1B、PBRM、SPTA1、DNMT3A、ARID1A、GRIN2A、
TRRAP、STAG2、EPHA3/5/7、POLE、SYNE1、C20orf80、CSMD1、CTNNB1、ERBB2、FBXW7、KIT、MUC4、
ATM、CDH1、DDX11、DDX12、DSPP、EPPK1、FAM186A、GNAS、HRNR、KRTAP4-11、MAP2K4、MLL3、NRAS、
RB1、SMAD4、TTN、ABCC9、ACVR1B、ADAM29、ADAMTS19、AGAP10、AKT1、AMBN、AMPD2、ANKRD30A、
ANKRD40、APOBR、AR、BIRC6、BMP2、BRAT1、BTNL8、C12orf4、C1QTNF7、C20orf186、CAPRIN2、
CBWD1、CCDC30、CCDC93、CD5L、CDC27、CDC42BPA、CDH9、CDKN2A、CHD8、CHEK2、CHRNA9、CIZ1、
CLSPN、CNTN6、COL14A1、CREBBP、CROCC、CTSF、CYP1A2、DCLK1、DHDDS、DHX32、DKK2、DLEC1、
DNAH14、DNAH5、DNAH9、DNASE1L3、DUSP16、DYNC2H1、ECT2、EFHB、RRN3P2、TRIM49B、TUBB8P5、
EPHA7、ERBB3、ERCC6、FAM21A、FAM21C、FCGBP、FGFR2、FLG2、FLT1、FOLR2、FRYL、FSCB、GAB1、
GABRA4、GABRP、GH2、GOLGA6L1、GPHB5、GPR32、GPX5、GTF3C3、HECW1、HIST1H3B、HLA-A、HRAS、
HS3ST1、HS6ST1、HSPD1、IDH1、JAK2、KDM5B、KIAA0528、KRT15、KRT38、KRTAP21-1、KRTAP4-5、
KRTAP4-7、KRTAP5-4、KRTAP5-5、LAMA4、LATS1、LMF1、LPAR4、LPPR4、LRRFIP1、LUM、LYST、
MAP2K1、MARCH1、MARCO、MB21D2、MEGF10、MMP16、MORC1、MRE11A、MTMR3、MUC12、MUC17、MUC2、
MUC20、NBPF10、NBPF20、NEK1、NFE2L2、NLRP4、NOTCH2、NRK、NUP93、OBSCN、OR11H1、OR2B11、
OR2M4、OR4Q3、OR5D13、OR8I2、OXSM、PIK3R1、PPP2R5C、PRAME、PRF1、PRG4、PRPF19、PTH2、
PTPRC、PTPRJ、RAC1、RAD50、RBM12、RGPD3、RGS22、ROR1、RP11-671M22.1、RP13-996F3.4、
RP1L1、RSBN1L、RYR3、SAMD3、SCN3A、SEC31A、SF1、SF3B1、SLC25A2、SLC44A1、SLC4A1、SMAD2、
SPTA1、ST6GAL2、STK11、SZT2、TAF1L、TAX1BP1、TBP、TGFBI、TIF1、TMEM14B、TMEM74、TPTE、
TRAPPC8、TRPS1、TXNDC6、USP32、UTP20、VASN、VPS72、WASH3P、WWTR1、XPO1、ZFHX4、ZMIZ1、
ZNF167、ZNF436、ZNF492、ZNF598、ZRSR2、ABL1、AKT2、AKT3、ARAF、ARFRP1、ARID2、ASXL1、ATR、
ATRX、AURKA、AURKB、AXL、BAP1、BARD1、BCL2、BCL2L2、BCL6、BCOR、BCORL1、BLM、BRIP1、BTK、
CARD11、CBFB、CBL、CCND1、CCND2、CCND3、CCNE1、CD79A、CD79B、CDC73、CDK12、CDK4、CDK6、
CDK8、CDKN1B、CDKN2B、CDKN2C、CEBPA、CHEK1、CIC、CRKL、CRLF2、CSF1R、CTCF、CTNNA1、DAXX、
DDR2、DOT1L、EMSY (C11orf30)、EP300、EPHA3、EPHA5、EPHB1、ERBB4、ERG、ESR1、EZH2、
FAM123B (WTX)、FAM46C、FANCA、FANCC、FANCD2、FANCE、FANCF、FANCG、FANCL、FGF10、FGF14、
FGF19、FGF23、FGF3、FGF4、FGF6、FGFR1、FGFR2、FGFR3、FGFR4、FLT3、FLT4、FOXL2、GATA1、
GATA2、GATA3、GID4 (C17orf39)、GNA11、GNA13、GNAQ、GNAS、GPR124、GSK3B、HGF、IDH1、IDH2、
IGF1R、IKBKE、IKZF1、IL7R、INHBA、IRF4、IRS2、JAK1、JAK3、JUN、KAT6A (MYST3)、KDM5A、
KDM5C、KDM6A、KDR、KEAP1、KLHL6、MAP2K2、MAP2K4、MAP3K1、MCL1、MDM2、MDM4、MED12、MEF2B、
MEN1、MET、MITF、MLH1、MLL、MLL2、MPL、MSH2、MSH6、MTOR、MUTYH、MYC、MYCL1、MYCN、MYD88、
NF1、NFKBIA、NKX2-1、NOTCH1、NPM1、NRAS、NTRK1、NTRK2、NTRK3、PAK3、PALB2、PAX5、PBRM1、
PDGFRA、PDGFRB、PDK1、PIK3CG、PIK3R2、PPP2R1A、PRDM1、PRKAR1A、PRKDC、PTCH1、PTPN11、
RAD51、RAF1、RARA、RET、RICTOR、RNF43、RPTOR、RUNX1、SMARCA4、SMARCB1、SMO、SOCS1、SOX10、
SOX2、SPEN、SPOP、SRC、STAT4、SUFU、TET2、TGFBR2、TNFAIP3、TNFRSF14、TOP1、TP53、TSC1、
TSC2, TSHR, VHL, WISP3, WT1, ZNF217 and ZNF703.
6. The method according to claim 3, wherein at least one of the SNV mutations is in a gene selected from the following: CYFIP1, FAT1, MLLT4, RASA1, HERC4, JAK2, MSH2, MTOR, PLCG2, GABRG1 and TRIM67.
7. The method according to claim 3, wherein the SNV mutations comprise one or more clonal SNV mutations.
8. The method according to claim 7, wherein at least one of the clonal SNV mutations is in a gene selected from the following: BRIP1, CARS, FAT1, MLLT4, NFE2L2, TP53, TP53, EGFR, EGFR, TP53, KDM6A and ROS1.
9. The method according to claim 3, wherein the SNV mutations comprise one or more subclonal SNV mutations.
10. The method according to claim 9, wherein at least one of the subclonal SNV mutations is in a gene selected from the following: CIC, KDM6A, NF1 and TRIM67.
11. The method according to claim 3, wherein the SNV mutations comprise one or more clonal SNV mutations and one or more subclonal SNV mutations.
12. The method according to claim 11, wherein the method further comprises determining the clonal heterogeneity of the tumor sample.
13. The method according to claim 1, wherein the amplifying step comprises amplifying 10 to 50 loci corresponding to the mutations or genetic variations.
14. The method according to claim 1, wherein the amplifying step comprises performing targeted amplification by multiplex PCR.
15. The method according to claim 14, wherein the method further comprises designing targeted PCR assays for the mutations or genetic variations identified in the tumor sample.
16. The method according to claim 1, wherein the sequencing step comprises high-throughput sequencing.
17. The method according to claim 1, wherein the sequencing step comprises next-generation sequencing.
18. The method according to claim 1, wherein the mutations or genetic variations are tumor-specific mutations or genetic variations.
19. The method according to claim 1, wherein the method further comprises detecting recurrence and/or metastasis of the cancer from the mutations or genetic variations detected in the cell-free DNA.
20. The method according to claim 19, wherein the cancer is colorectal cancer, lung cancer, bladder cancer or breast cancer.
Applications Claiming Priority (15)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201461982245P | 2014-04-21 | 2014-04-21 | |
US61/982,245 | 2014-04-21 | ||
US201461987407P | 2014-05-01 | 2014-05-01 | |
US61/987,407 | 2014-05-01 | ||
US201461994791P | 2014-05-16 | 2014-05-16 | |
US61/994,791 | 2014-05-16 | ||
US201462066514P | 2014-10-21 | 2014-10-21 | |
US62/066,514 | 2014-10-21 | ||
US201562146188P | 2015-04-10 | 2015-04-10 | |
US62/146,188 | 2015-04-10 | ||
US201562147377P | 2015-04-14 | 2015-04-14 | |
US62/147,377 | 2015-04-14 | ||
US201562148173P | 2015-04-15 | 2015-04-15 | |
US62/148,173 | 2015-04-15 | ||
CN201580033190.XA CN106460070B (en) | 2014-04-21 | 2015-04-21 | Detection of mutations and ploidy in chromosomal segments |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201580033190.XA Division CN106460070B (en) | 2014-04-21 | 2015-04-21 | Detection of mutations and ploidy in chromosomal segments |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109971852A true CN109971852A (en) | 2019-07-05 |
Family
ID=57221528
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201580033190.XA Active CN106460070B (en) | 2014-04-21 | 2015-04-21 | Detection of mutations and ploidy in chromosomal segments |
CN201910135027.4A Pending CN109971852A (en) | 2014-04-21 | 2015-04-21 | Detect the mutation and ploidy in chromosome segment |
CN202110905394.5A Pending CN113774132A (en) | 2014-04-21 | 2015-04-21 | Detection of mutations and ploidy in chromosomal segments |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201580033190.XA Active CN106460070B (en) | 2014-04-21 | 2015-04-21 | Detection of mutations and ploidy in chromosomal segments |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110905394.5A Pending CN113774132A (en) | 2014-04-21 | 2015-04-21 | Detection of mutations and ploidy in chromosomal segments |
Country Status (8)
Country | Link |
---|---|
US (13) | US10179937B2 (en) |
EP (3) | EP3561075A1 (en) |
JP (11) | JP6659575B2 (en) |
CN (3) | CN106460070B (en) |
AU (3) | AU2015249846B2 (en) |
CA (1) | CA2945962C (en) |
HK (1) | HK1232260A1 (en) |
RU (1) | RU2717641C2 (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110379465A (en) * | 2019-07-19 | 2019-10-25 | 元码基因科技(北京)股份有限公司 | Based on RNA target to sequencing and machine learning cancerous tissue source tracing method |
CN112094911A (en) * | 2020-10-10 | 2020-12-18 | 广西医科大学 | Medical application of NRK in lung cancer treatment and prognosis diagnosis |
CN112397144A (en) * | 2020-10-29 | 2021-02-23 | 无锡臻和生物科技有限公司 | Method and device for detecting gene mutation and expression quantity |
CN112592976A (en) * | 2020-12-30 | 2021-04-02 | 深圳市海普洛斯生物科技有限公司 | Method and device for detecting MET gene amplification |
CN113913525A (en) * | 2021-11-24 | 2022-01-11 | 广州医科大学 | Application of DNASE1L3 gene as target point for detecting and/or preventing liver cancer invasion and metastasis |
CN114093417A (en) * | 2021-11-23 | 2022-02-25 | 深圳基因家科技有限公司 | Method and device for identifying chromosomal arm heterozygosity loss |
CN114807377A (en) * | 2022-06-29 | 2022-07-29 | 南京世和基因生物技术股份有限公司 | Application of bladder cancer prognosis survival time marker, evaluation device and computer readable medium |
CN116004800A (en) * | 2022-12-09 | 2023-04-25 | 中国农业科学院北京畜牧兽医研究所 | Application of CNV marker in early screening of sheep fat tail |
Families Citing this family (109)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11111544B2 (en) | 2005-07-29 | 2021-09-07 | Natera, Inc. | System and method for cleaning noisy genetic data and determining chromosome copy number |
US11111543B2 (en) | 2005-07-29 | 2021-09-07 | Natera, Inc. | System and method for cleaning noisy genetic data and determining chromosome copy number |
US10081839B2 (en) | 2005-07-29 | 2018-09-25 | Natera, Inc | System and method for cleaning noisy genetic data and determining chromosome copy number |
US9424392B2 (en) | 2005-11-26 | 2016-08-23 | Natera, Inc. | System and method for cleaning noisy genetic data from target individuals using genetic data from genetically related individuals |
US10017812B2 (en) | 2010-05-18 | 2018-07-10 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US20120185176A1 (en) | 2009-09-30 | 2012-07-19 | Natera, Inc. | Methods for Non-Invasive Prenatal Ploidy Calling |
US20190010543A1 (en) * | 2010-05-18 | 2019-01-10 | Natera, Inc. | Methods for simultaneous amplification of target loci |
US11939634B2 (en) | 2010-05-18 | 2024-03-26 | Natera, Inc. | Methods for simultaneous amplification of target loci |
CA2798758C (en) | 2010-05-18 | 2019-05-07 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US10316362B2 (en) | 2010-05-18 | 2019-06-11 | Natera, Inc. | Methods for simultaneous amplification of target loci |
US11322224B2 (en) | 2010-05-18 | 2022-05-03 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US9677118B2 (en) | 2014-04-21 | 2017-06-13 | Natera, Inc. | Methods for simultaneous amplification of target loci |
US11332793B2 (en) | 2010-05-18 | 2022-05-17 | Natera, Inc. | Methods for simultaneous amplification of target loci |
US11339429B2 (en) | 2010-05-18 | 2022-05-24 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US11326208B2 (en) | 2010-05-18 | 2022-05-10 | Natera, Inc. | Methods for nested PCR amplification of cell-free DNA |
US11408031B2 (en) | 2010-05-18 | 2022-08-09 | Natera, Inc. | Methods for non-invasive prenatal paternity testing |
US11332785B2 (en) | 2010-05-18 | 2022-05-17 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
WO2012088456A2 (en) | 2010-12-22 | 2012-06-28 | Natera, Inc. | Methods for non-invasive prenatal paternity testing |
US9892230B2 (en) | 2012-03-08 | 2018-02-13 | The Chinese University Of Hong Kong | Size-based analysis of fetal or tumor DNA fraction in plasma |
US10577655B2 (en) | 2013-09-27 | 2020-03-03 | Natera, Inc. | Cell free DNA diagnostic testing standards |
US10262755B2 (en) | 2014-04-21 | 2019-04-16 | Natera, Inc. | Detecting cancer mutations and aneuploidy in chromosomal segments |
RU2717641C2 (en) | 2014-04-21 | 2020-03-24 | Натера, Инк. | Detection of mutations and ploidy in chromosomal segments |
WO2016008451A1 (en) | 2014-07-18 | 2016-01-21 | The Chinese University Of Hong Kong | Methylation pattern analysis of tissues in dna mixture |
KR20220127359A (en) * | 2014-07-25 | 2022-09-19 | 유니버시티 오브 워싱톤 | Methods of determining tissues and/or cell types giving rise to cell-free dna, and methods of identifying a disease or disorder using same |
EP2977912A1 (en) * | 2014-07-25 | 2016-01-27 | Franz-Peter Dr. Liebel | Automated diagnosis |
EP4026913A1 (en) | 2014-10-30 | 2022-07-13 | Personalis, Inc. | Methods for using mosaicism in nucleic acids sampled distal to their origin |
US10364467B2 (en) | 2015-01-13 | 2019-07-30 | The Chinese University Of Hong Kong | Using size and number aberrations in plasma DNA for detecting cancer |
US11479812B2 (en) | 2015-05-11 | 2022-10-25 | Natera, Inc. | Methods and compositions for determining ploidy |
US10395759B2 (en) | 2015-05-18 | 2019-08-27 | Regeneron Pharmaceuticals, Inc. | Methods and systems for copy number variant detection |
US11302416B2 (en) | 2015-09-02 | 2022-04-12 | Guardant Health | Machine learning for somatic single nucleotide variant detection in cell-free tumor nucleic acid sequencing applications |
WO2017181146A1 (en) * | 2016-04-14 | 2017-10-19 | Guardant Health, Inc. | Methods for early detection of cancer |
EP3443119B8 (en) * | 2016-04-15 | 2022-04-06 | Natera, Inc. | Methods for lung cancer detection |
US11299783B2 (en) | 2016-05-27 | 2022-04-12 | Personalis, Inc. | Methods and systems for genetic analysis |
AU2017336153B2 (en) | 2016-09-30 | 2023-07-13 | Guardant Health, Inc. | Methods for multi-resolution analysis of cell-free nucleic acids |
WO2018067517A1 (en) | 2016-10-04 | 2018-04-12 | Natera, Inc. | Methods for characterizing copy number variation using proximity-litigation sequencing |
GB201618485D0 (en) * | 2016-11-02 | 2016-12-14 | Ucl Business Plc | Method of detecting tumour recurrence |
JP2020503003A (en) | 2016-11-30 | 2020-01-30 | ザ チャイニーズ ユニバーシティ オブ ホンコン | Analysis of cell-free DNA in urine and other samples |
US10011870B2 (en) | 2016-12-07 | 2018-07-03 | Natera, Inc. | Compositions and methods for identifying nucleic acid molecules |
CN108277267B (en) * | 2016-12-29 | 2019-08-13 | 安诺优达基因科技(北京)有限公司 | It detects the device of gene mutation and carries out the kit of parting for the genotype to pregnant woman and fetus |
MX2019008257A (en) * | 2017-01-11 | 2019-10-07 | Koninklijke Philips Nv | Method and system for automated inclusion or exclusion criteria detection. |
AU2018225348A1 (en) | 2017-02-21 | 2019-07-18 | Natera, Inc. | Compositions, methods, and kits for isolating nucleic acids |
US11342047B2 (en) * | 2017-04-21 | 2022-05-24 | Illumina, Inc. | Using cell-free DNA fragment size to detect tumor-associated variant |
CN108804876B (en) * | 2017-05-05 | 2022-03-15 | 中国科学院上海药物研究所 | Method and apparatus for calculating purity and chromosome ploidy of cancer sample |
KR102145417B1 (en) * | 2017-05-24 | 2020-08-19 | 지니너스 주식회사 | Method for generating distribution of background allele frequency for sequencing data obtained from cell-free nucleic acid and method for detecting mutation from cell-free nucleic acid using the same |
CN109136355A (en) * | 2017-06-19 | 2019-01-04 | 北京大学第三医院 | A kind of SNP array chip that evaluating in vitro fetus the Twin Transfusion Syndrome occurs |
JP2020524519A (en) | 2017-06-20 | 2020-08-20 | ザ メディカル カレッジ オブ ウィスコンシン,インコーポレイテッドThe Medical College of Wisconsin, Inc. | Assessment of transplant complication risk by all cell-free DNA |
US11450121B2 (en) * | 2017-06-27 | 2022-09-20 | The Regents Of The University Of California | Label-free digital brightfield analysis of nucleic acid amplification |
CN107164535A (en) * | 2017-07-07 | 2017-09-15 | 沈阳宁沪科技有限公司 | A kind of noninvasive high flux methylates diagnosis of colon cancer, research and treatment method |
US10622095B2 (en) * | 2017-07-21 | 2020-04-14 | Helix OpCo, LLC | Genomic services platform supporting multiple application providers |
US11608533B1 (en) * | 2017-08-21 | 2023-03-21 | The General Hospital Corporation | Compositions and methods for classifying tumors with microsatellite instability |
JP7029719B2 (en) * | 2017-08-31 | 2022-03-04 | 静岡県 | Biomarker |
EP3460071A1 (en) * | 2017-09-22 | 2019-03-27 | Lexogen GmbH | Estimating pre-pcr fragment numbers from post-pcr frequencies of unique molecular identifiers |
CN110021355B (en) * | 2017-09-22 | 2021-05-04 | 深圳华大生命科学研究院 | Haploid typing and variation detection method and device for diploid genome sequencing segment |
CN107641648A (en) * | 2017-10-19 | 2018-01-30 | 深圳华大基因股份有限公司 | It is a kind of to be used to detect standard items of chromosome aberration and preparation method thereof |
WO2019109086A1 (en) * | 2017-12-01 | 2019-06-06 | Illumina, Inc. | Methods and systems for determining somatic mutation clonality |
CN112365927B (en) * | 2017-12-28 | 2023-08-25 | 安诺优达基因科技(北京)有限公司 | CNV detection device |
WO2019200398A1 (en) * | 2018-04-13 | 2019-10-17 | Dana-Farber Cancer Institute, Inc. | Ultra-sensitive detection of cancer by algorithmic analysis |
US11874220B2 (en) | 2018-04-26 | 2024-01-16 | Ppg Industries Ohio, Inc. | Formulation systems and methods employing target coating data results |
US10970879B2 (en) | 2018-04-26 | 2021-04-06 | Ppg Industries Ohio, Inc. | Formulation systems and methods employing target coating data results |
US10871888B2 (en) | 2018-04-26 | 2020-12-22 | Ppg Industries Ohio, Inc. | Systems, methods, and interfaces for rapid coating generation |
US11119035B2 (en) * | 2018-04-26 | 2021-09-14 | Ppg Industries Ohio, Inc. | Systems and methods for rapid coating composition determinations |
CN108588201B (en) * | 2018-05-11 | 2019-08-09 | Zhejiang Provincial People's Hospital | Method and device for detecting trace DNA mutations associated with cetuximab resistance in colorectal cancer |
US11814750B2 (en) | 2018-05-31 | 2023-11-14 | Personalis, Inc. | Compositions, methods and systems for processing or analyzing multi-species nucleic acid samples |
US10801064B2 (en) | 2018-05-31 | 2020-10-13 | Personalis, Inc. | Compositions, methods and systems for processing or analyzing multi-species nucleic acid samples |
CA3098321A1 (en) | 2018-06-01 | 2019-12-05 | Grail, Inc. | Convolutional neural network systems and methods for data classification |
US11525159B2 (en) | 2018-07-03 | 2022-12-13 | Natera, Inc. | Methods for detection of donor-derived cell-free DNA |
CN109461473B (en) * | 2018-09-30 | 2019-12-17 | 北京优迅医疗器械有限公司 | Method and device for acquiring concentration of free DNA of fetus |
CN109971846A (en) * | 2018-11-29 | 2019-07-05 | 时代基因检测中心有限公司 | Method for noninvasive prenatal detection of aneuploidy using biallelic SNPs with targeted next-generation sequencing |
US11581062B2 (en) | 2018-12-10 | 2023-02-14 | Grail, Llc | Systems and methods for classifying patients with respect to multiple cancer classes |
CN109686401B (en) * | 2018-12-19 | 2022-08-05 | 上海蓝沙生物科技有限公司 | Method for identifying uniqueness of heterologous low-frequency genome signal and application thereof |
CN109801677B (en) * | 2018-12-29 | 2023-05-23 | 浙江安诺优达生物科技有限公司 | Sequencing data automatic analysis method and device and electronic equipment |
CN113557572A (en) * | 2019-01-25 | 2021-10-26 | Pacific Biosciences of California, Inc. | System and method for map-based mapping of nucleic acid fragments |
CN109837344B (en) * | 2019-03-04 | 2022-12-27 | Tianjin Medical University | Methylated EphA7 nucleotide fragment, detection method and application thereof |
WO2020206290A1 (en) * | 2019-04-03 | 2020-10-08 | The Medical College Of Wisconsin, Inc. | Methods for assessing risk using total cell-free dna |
US11931674B2 (en) | 2019-04-04 | 2024-03-19 | Natera, Inc. | Materials and methods for processing blood samples |
CN110129439A (en) * | 2019-04-28 | 2019-08-16 | 安徽鼎晶生物科技有限公司 | Quality-control product for human BRCA1/2 gene mutation detection, and preparation method and application thereof |
CN110106063B (en) * | 2019-05-06 | 2022-07-08 | 臻和精准医学检验实验室无锡有限公司 | System for detecting 1p/19q combined deletion of glioma based on second-generation sequencing |
JP2022534634A (en) * | 2019-06-03 | 2022-08-03 | Illumina, Inc. | Detection limit-based quality control metrics |
CN110452988A (en) * | 2019-08-27 | 2019-11-15 | 武汉友芝友医疗科技股份有限公司 | Primer set, reagent, kit and method for detecting ESR1 gene mutations |
CN110648722B (en) * | 2019-09-19 | 2022-05-31 | Beijing Children's Hospital, Capital Medical University | Device for evaluating neonatal genetic disease risk |
CN112634986A (en) * | 2019-09-24 | 2021-04-09 | 厦门希吉亚生物科技有限公司 | Noninvasive method for identifying twin zygosity based on peripheral blood of pregnant women |
JP2023501376A (en) | 2019-11-06 | 2023-01-18 | The Board of Trustees of the Leland Stanford Junior University | Methods and systems for analyzing nucleic acid molecules |
US11211147B2 (en) | 2020-02-18 | 2021-12-28 | Tempus Labs, Inc. | Estimation of circulating tumor fraction using off-target reads of targeted-panel sequencing |
US11211144B2 (en) | 2020-02-18 | 2021-12-28 | Tempus Labs, Inc. | Methods and systems for refining copy number variation in a liquid biopsy assay |
US11475981B2 (en) | 2020-02-18 | 2022-10-18 | Tempus Labs, Inc. | Methods and systems for dynamic variant thresholding in a liquid biopsy assay |
CN113496769B (en) * | 2020-03-20 | 2023-11-07 | 梅傲科技(广州)有限公司 | Triple comprehensive tumor analysis system based on pathological tissue and application |
CN111370065B (en) * | 2020-03-26 | 2022-10-04 | 北京吉因加医学检验实验室有限公司 | Method and device for detecting cross-sample contamination rate of RNA |
CN112885406B (en) * | 2020-04-16 | 2023-01-31 | 深圳裕策生物科技有限公司 | Method and system for detecting HLA heterozygosity loss |
US20210326332A1 (en) * | 2020-04-17 | 2021-10-21 | International Business Machines Corporation | Temporal directed cycle detection and pruning in transaction graphs |
CN111724860B (en) * | 2020-06-18 | 2021-03-16 | 深圳吉因加医学检验实验室 | Method and device for identifying chromatin open area based on sequencing data |
US11675599B2 (en) * | 2020-08-04 | 2023-06-13 | Dell Products L.P. | Systems and methods for managing system rollup of accelerator health |
CN114277129A (en) * | 2020-09-28 | 2022-04-05 | Beijing Neurosurgical Institute | Application of SWI/SNF gene change in auxiliary diagnosis and prognosis of chordoma |
JP2023548113A (en) * | 2020-10-30 | 2023-11-15 | MyOme, Inc. | Using a combination of non-error propagation phase determination techniques and allelic balance to improve CNV detection |
US11354286B1 (en) * | 2020-11-02 | 2022-06-07 | Workday, Inc. | Outlier identification and removal |
CN112634991B (en) * | 2020-12-18 | 2022-07-19 | 长沙都正生物科技股份有限公司 | Genotyping method, genotyping device, electronic device, and storage medium |
JPWO2022190752A1 (en) | 2021-03-12 | 2022-09-15 | ||
CN113096736B (en) * | 2021-03-26 | 2023-10-31 | 北京源生康泰基因科技有限公司 | Virus real-time automatic analysis method and system based on nanopore sequencing |
WO2022212590A1 (en) * | 2021-03-31 | 2022-10-06 | Predicine, Inc. | Systems and methods for multi-analyte detection of cancer |
US11783912B2 (en) | 2021-05-05 | 2023-10-10 | The Board Of Trustees Of The Leland Stanford Junior University | Methods and systems for analyzing nucleic acid molecules |
CN113637751A (en) * | 2021-07-16 | 2021-11-12 | Jinan University | Application of TNFAIP3 non-coding sequence mutation detection reagent in preparation of T cell lymphoma prognosis prediction kit |
CN113808660B (en) * | 2021-09-13 | 2024-02-13 | Huashan Hospital, Fudan University | Bayesian model for calculating the prevalence of hereditary rare diseases based on natural selection and databases, and construction method and application thereof |
WO2023060236A1 (en) * | 2021-10-08 | 2023-04-13 | Foundation Medicine, Inc. | Methods and systems for automated calling of copy number alterations |
WO2023081639A1 (en) * | 2021-11-03 | 2023-05-11 | Foundation Medicine, Inc. | System and method for identifying copy number alterations |
EP4184514A1 (en) | 2021-11-23 | 2023-05-24 | Eone Reference Laboratory | Apparatus and method for diagnosing cancer using liquid biopsy data |
US11788152B2 (en) | 2022-01-28 | 2023-10-17 | Flagship Pioneering Innovations Vi, Llc | Multiple-tiered screening and second analysis |
WO2023164017A2 (en) * | 2022-02-22 | 2023-08-31 | Flagship Pioneering Innovations Vi, Llc | Intra-individual analysis for presence of health conditions |
WO2023177901A1 (en) * | 2022-03-17 | 2023-09-21 | Delfi Diagnostics, Inc. | Method of monitoring cancer using fragmentation profiles |
CN117334249A (en) * | 2023-05-30 | 2024-01-02 | 上海品峰医疗科技有限公司 | Method, apparatus and medium for detecting copy number variation based on amplicon sequencing data |
CN117551782A (en) * | 2023-11-24 | 2024-02-13 | Jiangsu Institute of Poultry Science | Application of molecular marker related to eggshell thickness at tip of egg in genetic breeding of chicken |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102985561A (en) * | 2011-04-14 | 2013-03-20 | Verinata Health, Inc. | Normalizing chromosomes for the determination and verification of common and rare chromosomal aneuploidies |
WO2013086464A1 (en) * | 2011-12-07 | 2013-06-13 | The Broad Institute, Inc. | Markers associated with chronic lymphocytic leukemia prognosis and progression |
WO2013130848A1 (en) * | 2012-02-29 | 2013-09-06 | Natera, Inc. | Informatics enhanced analysis of fetal samples subject to maternal contamination |
WO2013159035A2 (en) * | 2012-04-19 | 2013-10-24 | Medical College Of Wisconsin, Inc. | Highly sensitive surveillance using detection of cell free dna |
WO2014039556A1 (en) * | 2012-09-04 | 2014-03-13 | Guardant Health, Inc. | Systems and methods to detect rare mutations and copy number variation |
US20140100121A1 (en) * | 2012-06-21 | 2014-04-10 | The Chinese University Of Hong Kong | Mutational analysis of plasma dna for cancer detection |
Family Cites Families (543)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3957654A (en) | 1974-02-27 | 1976-05-18 | Becton, Dickinson And Company | Plasma separator with barrier to eject sealant |
US4040785A (en) | 1976-10-18 | 1977-08-09 | Technicon Instruments Corporation | Lysable blood preservative composition |
US5242794A (en) | 1984-12-13 | 1993-09-07 | Applied Biosystems, Inc. | Detection of specific sequences in nucleic acids |
US4683195A (en) | 1986-01-30 | 1987-07-28 | Cetus Corporation | Process for amplifying, detecting, and/or-cloning nucleic acid sequences |
US4935342A (en) | 1986-12-01 | 1990-06-19 | Syngene, Inc. | Method of isolating and purifying nucleic acids from biological samples |
US4942124A (en) | 1987-08-11 | 1990-07-17 | President And Fellows Of Harvard College | Multiplex sequencing |
WO1989006692A1 (en) | 1988-01-12 | 1989-07-27 | Genentech, Inc. | Method of treating tumor cells by inhibiting growth factor receptor function |
US5153117A (en) | 1990-03-27 | 1992-10-06 | Genetype A.G. | Fetal cell recovery method |
US6582908B2 (en) | 1990-12-06 | 2003-06-24 | Affymetrix, Inc. | Oligonucleotides |
US5262329A (en) | 1991-06-13 | 1993-11-16 | Carver Jr Edward L | Method for improved multiple species blood analysis |
IL103935A0 (en) | 1991-12-04 | 1993-05-13 | Du Pont | Method for the identification of microorganisms by the utilization of directed and arbitrary dna amplification |
US5856097A (en) | 1992-03-04 | 1999-01-05 | The Regents Of The University Of California | Comparative genomic hybridization (CGH) |
GB9305984D0 (en) | 1993-03-23 | 1993-05-12 | Royal Free Hosp School Med | Predictive assay |
ES2197168T3 (en) | 1993-07-05 | 2004-01-01 | Sheffield Teaching Hospitals National Health Service Trust | PREPARATION AND STABILIZATION OF CELLS. |
WO1995006137A1 (en) | 1993-08-27 | 1995-03-02 | Australian Red Cross Society | Detection of genes |
CH686982A5 (en) | 1993-12-16 | 1996-08-15 | Maurice Stroun | Method for diagnosis of cancers. |
SE9400522D0 (en) | 1994-02-16 | 1994-02-16 | Ulf Landegren | Method and reagent for detecting specific nucleotide sequences |
US5716776A (en) | 1994-03-04 | 1998-02-10 | Mark H. Bogart | Enrichment by preferential mitosis of fetal lymphocytes from a maternal blood sample |
US5891734A (en) | 1994-08-01 | 1999-04-06 | Abbott Laboratories | Method for performing automated analysis |
US6025128A (en) | 1994-09-29 | 2000-02-15 | The University Of Tulsa | Prediction of prostate cancer progression by analysis of selected predictive parameters |
US6479235B1 (en) | 1994-09-30 | 2002-11-12 | Promega Corporation | Multiplex amplification of short tandem repeat loci |
US5648220A (en) | 1995-02-14 | 1997-07-15 | New England Medical Center Hospitals, Inc. | Methods for labeling intracytoplasmic molecules |
CA2221454A1 (en) | 1995-05-19 | 1996-11-21 | Abbott Laboratories | Wide dynamic range nucleic acid detection using an aggregate primer series |
US6720140B1 (en) | 1995-06-07 | 2004-04-13 | Invitrogen Corporation | Recombinational cloning using engineered recombination sites |
US5733729A (en) | 1995-09-14 | 1998-03-31 | Affymetrix, Inc. | Computer-aided probability base calling for arrays of nucleic acid probes on chips |
US5854033A (en) | 1995-11-21 | 1998-12-29 | Yale University | Rolling circle replication reporter systems |
US5736033A (en) | 1995-12-13 | 1998-04-07 | Coleman; Charles M. | Separator float for blood collection tubes with water swellable material |
US6852487B1 (en) | 1996-02-09 | 2005-02-08 | Cornell Research Foundation, Inc. | Detection of nucleic acid sequence differences using the ligase detection reaction with addressable arrays |
CA2248981C (en) | 1996-03-15 | 2009-11-24 | The Penn State Research Foundation | Detection of extracellular tumor-associated nucleic acid in blood plasma or serum using nucleic acid amplification assays |
DK0938320T3 (en) | 1996-03-26 | 2010-10-18 | Michael S Kopreski | Method of extracting extracellular RNA from plasma or serum to detect, monitor or assess cancer |
US6108635A (en) | 1996-05-22 | 2000-08-22 | Interleukin Genetics, Inc. | Integrated disease information system |
US6300077B1 (en) | 1996-08-14 | 2001-10-09 | Exact Sciences Corporation | Methods for the detection of nucleic acids |
US6100029A (en) * | 1996-08-14 | 2000-08-08 | Exact Laboratories, Inc. | Methods for the detection of chromosomal aberrations |
US6221654B1 (en) | 1996-09-25 | 2001-04-24 | California Institute Of Technology | Method and apparatus for analysis and sorting of polynucleotides based on size |
US5860917A (en) | 1997-01-15 | 1999-01-19 | Chiron Corporation | Method and apparatus for predicting therapeutic outcomes |
US5824467A (en) | 1997-02-25 | 1998-10-20 | Celtrix Pharmaceuticals | Methods for predicting drug response |
US20010051341A1 (en) | 1997-03-04 | 2001-12-13 | Isis Innovation Limited | Non-invasive prenatal diagnosis |
GB9704444D0 (en) | 1997-03-04 | 1997-04-23 | Isis Innovation | Non-invasive prenatal diagnosis |
EP0866071B1 (en) | 1997-03-20 | 2004-10-20 | F. Hoffmann-La Roche Ag | Modified primers |
ATE364718T1 (en) | 1997-04-01 | 2007-07-15 | Solexa Ltd | METHOD FOR DUPLICATION OF NUCLEIC ACID |
US6143496A (en) | 1997-04-17 | 2000-11-07 | Cytonix Corporation | Method of sampling, amplifying and quantifying segment of nucleic acid, polymerase chain reaction assembly having nanoliter-sized sample chambers, and method of filling assembly |
US20020119478A1 (en) | 1997-05-30 | 2002-08-29 | Diagen Corporation | Methods for detection of nucleic acid sequences in urine |
US5994148A (en) | 1997-06-23 | 1999-11-30 | The Regents Of University Of California | Method of predicting and enhancing success of IVF/ET pregnancy |
US6833242B2 (en) | 1997-09-23 | 2004-12-21 | California Institute Of Technology | Methods for detecting and sorting polynucleotides based on size |
US6124120A (en) | 1997-10-08 | 2000-09-26 | Yale University | Multiple displacement amplification |
AR021833A1 (en) | 1998-09-30 | 2002-08-07 | Applied Research Systems | METHODS OF AMPLIFICATION AND SEQUENCING OF NUCLEIC ACID |
US6406671B1 (en) | 1998-12-05 | 2002-06-18 | Becton, Dickinson And Company | Device and method for separating components of a fluid sample |
WO2000066605A2 (en) | 1999-04-30 | 2000-11-09 | Cyclops Genome Sciences Limited | Polynucleotides |
US6180349B1 (en) | 1999-05-18 | 2001-01-30 | The Regents Of The University Of California | Quantitative PCR method to enumerate DNA copy number |
US7058517B1 (en) | 1999-06-25 | 2006-06-06 | Genaissance Pharmaceuticals, Inc. | Methods for obtaining and using haplotype data |
US6964847B1 (en) | 1999-07-14 | 2005-11-15 | Packard Biosciences Company | Derivative nucleic acids and uses thereof |
DE60035691T2 (en) | 1999-07-23 | 2008-04-30 | Forensic Science Service Ltd. | Method for recognition of single nucleotide polymorphisms |
GB9917307D0 (en) | 1999-07-23 | 1999-09-22 | Sec Dep Of The Home Department | Improvements in and relating to analysis of DNA |
US6440706B1 (en) | 1999-08-02 | 2002-08-27 | Johns Hopkins University | Digital amplification |
US6251604B1 (en) | 1999-08-13 | 2001-06-26 | Genopsys, Inc. | Random mutagenesis and amplification of nucleic acid |
WO2001034844A1 (en) | 1999-11-10 | 2001-05-17 | Ligochem, Inc. | Method for isolating dna from a proteinaceous medium and kit for performing method |
US6221603B1 (en) | 2000-02-04 | 2001-04-24 | Molecular Dynamics, Inc. | Rolling circle amplification assay for nucleic acid analysis |
WO2001057269A2 (en) | 2000-02-07 | 2001-08-09 | Illumina, Inc. | Nucleic acid detection methods using universal priming |
US7955794B2 (en) | 2000-09-21 | 2011-06-07 | Illumina, Inc. | Multiplex nucleic acid reactions |
US7510834B2 (en) | 2000-04-13 | 2009-03-31 | Hidetoshi Inoko | Gene mapping method using microsatellite genetic polymorphism markers |
GB0009179D0 (en) | 2000-04-13 | 2000-05-31 | Imp College Innovations Ltd | Non-invasive prenatal diagnosis |
AU2001274869A1 (en) | 2000-05-20 | 2001-12-03 | The Regents Of The University Of Michigan | Method of producing a dna library using positional amplification |
AU2001264811A1 (en) | 2000-05-23 | 2001-12-03 | Vincent P. Stanton Jr. | Methods for genetic analysis of dna to detect sequence variances |
US6605451B1 (en) | 2000-06-06 | 2003-08-12 | Xtrana, Inc. | Methods and devices for multiplexing amplification reactions |
US7087414B2 (en) | 2000-06-06 | 2006-08-08 | Applera Corporation | Methods and devices for multiplexing amplification reactions |
AU2001255518A1 (en) | 2000-06-07 | 2001-12-17 | Baylor College Of Medicine | Compositions and methods for array-based nucleic acid hybridization |
US7058616B1 (en) | 2000-06-08 | 2006-06-06 | Virco Bvba | Method and system for predicting resistance of a disease to a therapeutic agent using a neural network |
GB0016742D0 (en) | 2000-07-10 | 2000-08-30 | Simeg Limited | Diagnostic method |
WO2002057491A2 (en) | 2000-10-24 | 2002-07-25 | The Board Of Trustees Of The Leland Stanford Junior University | Direct multiplex characterization of genomic dna |
US20020107640A1 (en) | 2000-11-14 | 2002-08-08 | Ideker Trey E. | Methods for determining the true signal of an analyte |
AU2002243263A1 (en) | 2000-11-15 | 2002-07-24 | Roche Diagnostics Corporation | Methods and reagents for identifying rare fetal cells in the maternal circulation |
WO2002044411A1 (en) | 2000-12-01 | 2002-06-06 | Rosetta Inpharmatics, Inc. | Use of profiling for detecting aneuploidy |
US7218764B2 (en) | 2000-12-04 | 2007-05-15 | Cytokinetics, Inc. | Ploidy classification method |
AR031640A1 (en) | 2000-12-08 | 2003-09-24 | Applied Research Systems | ISOTHERMAL AMPLIFICATION OF NUCLEIC ACIDS IN A SOLID SUPPORT |
US20020182622A1 (en) | 2001-02-01 | 2002-12-05 | Yusuke Nakamura | Method for SNP (single nucleotide polymorphism) typing |
JP2002300894A (en) | 2001-02-01 | 2002-10-15 | Inst Of Physical & Chemical Res | Single nucleotide polymorphic typing method |
CA2439402A1 (en) | 2001-03-02 | 2002-09-12 | University Of Pittsburgh Of The Commonwealth System Of Higher Education | Pcr method |
WO2002073504A1 (en) | 2001-03-14 | 2002-09-19 | Gene Logic, Inc. | A system and method for retrieving and using gene expression data from multiple sources |
AU2002238994A1 (en) | 2001-03-23 | 2002-10-08 | Center For Advanced Science And Technology Incubation, Ltd. | Mononucleosome and process for producing the same, method of assaying antibody specific to nucleosome, method of diagnosing autoimmune disease, process for producing nucleosome dna, dna plate, process for producing dna plate and method of assaying anti-dna antibody |
US6489135B1 (en) | 2001-04-17 | 2002-12-03 | Atairgin Technologies, Inc. | Determination of biological characteristics of embryos fertilized in vitro by assaying for bioactive lipids in culture media |
FR2824144B1 (en) | 2001-04-30 | 2004-09-17 | Metagenex S A R L | METHOD OF PRENATAL DIAGNOSIS ON FETAL CELLS ISOLATED FROM MATERNAL BLOOD |
US7392199B2 (en) | 2001-05-01 | 2008-06-24 | Quest Diagnostics Investments Incorporated | Diagnosing inapparent diseases from common clinical tests using Bayesian analysis |
AU2002305436A1 (en) | 2001-05-09 | 2002-11-18 | Virginia Commonwealth University | Multiple sequencible and ligatible structures for genomic analysis |
US20040126760A1 (en) | 2001-05-17 | 2004-07-01 | Natalia Broude | Novel compositions and methods for carrying out multiple pcr reactions on a single sample |
US7026121B1 (en) | 2001-06-08 | 2006-04-11 | Expression Diagnostics, Inc. | Methods and compositions for diagnosing and monitoring transplant rejection |
EP1397512A2 (en) | 2001-06-22 | 2004-03-17 | University of Geneva | Method for detecting diseases caused by chromosomal imbalances |
WO2003010537A1 (en) | 2001-07-24 | 2003-02-06 | Curagen Corporation | Family based tests of association using pooled dna and snp markers |
ES2367280T3 (en) | 2001-07-25 | 2011-11-02 | Oncomedx Inc. | METHODS TO EVALUATE PATHOLOGICAL STATES USING EXTRACELLULAR RNA. |
US7297778B2 (en) | 2001-07-25 | 2007-11-20 | Affymetrix, Inc. | Complexity management of genomic DNA |
US6958211B2 (en) | 2001-08-08 | 2005-10-25 | Tibotech Bvba | Methods of assessing HIV integrase inhibitor therapy |
JP2005501236A (en) | 2001-08-23 | 2005-01-13 | Immunivest Corporation | Stabilization of cells and biological specimens for analysis |
US6807491B2 (en) | 2001-08-30 | 2004-10-19 | Hewlett-Packard Development Company, L.P. | Method and apparatus for combining gene predictions using bayesian networks |
US6927028B2 (en) | 2001-08-31 | 2005-08-09 | Chinese University Of Hong Kong | Non-invasive methods for detecting non-host DNA in a host using epigenetic differences between the host and non-host DNA |
AUPR749901A0 (en) | 2001-09-06 | 2001-09-27 | Monash University | Method of identifying chromosomal abnormalities and prenatal diagnosis |
US7153656B2 (en) | 2001-09-11 | 2006-12-26 | Los Alamos National Security, Llc | Nucleic acid sequence detection using multiplexed oligonucleotide PCR |
US8986944B2 (en) | 2001-10-11 | 2015-03-24 | Aviva Biosciences Corporation | Methods and compositions for separating rare cells from fluid samples |
US20040014067A1 (en) | 2001-10-12 | 2004-01-22 | Third Wave Technologies, Inc. | Amplification methods and compositions |
WO2003031646A1 (en) | 2001-10-12 | 2003-04-17 | The University Of Queensland | Multiple genetic marker selection and amplification |
US6617137B2 (en) | 2001-10-15 | 2003-09-09 | Molecular Staging Inc. | Method of amplifying whole genomes without subjecting the genome to denaturing conditions |
US7297485B2 (en) | 2001-10-15 | 2007-11-20 | Qiagen Gmbh | Method for nucleic acid amplification that results in low amplification bias |
US7511704B2 (en) | 2001-11-20 | 2009-03-31 | Illinois Tool Works Inc. | Acoustic wave touch bar system and method of use |
US20030119004A1 (en) | 2001-12-05 | 2003-06-26 | Wenz H. Michael | Methods for quantitating nucleic acids using coupled ligation and amplification |
CA2469878C (en) | 2001-12-11 | 2011-06-28 | Netech Inc. | Blood cell separating system |
US7198897B2 (en) | 2001-12-19 | 2007-04-03 | Brandeis University | Late-PCR |
EP1325963B1 (en) | 2001-12-24 | 2006-09-27 | Wolfgang Prof. Holzgreve | Method for non-invasive diagnosis of transplantations and transfusions |
US20040115629A1 (en) | 2002-01-09 | 2004-06-17 | Panzer Scott R | Molecules for diagnostics and therapeutics |
EP1468104A4 (en) | 2002-01-18 | 2006-02-01 | Genzyme Corp | Methods for fetal dna detection and allele quantitation |
JP2005516310A (en) | 2002-02-01 | 2005-06-02 | Rosetta Inpharmatics LLC | Computer system and method for identifying genes and revealing pathways associated with traits |
US6977162B2 (en) | 2002-03-01 | 2005-12-20 | Ravgen, Inc. | Rapid analysis of variations in a genome |
JP2006523082A (en) | 2002-03-01 | 2006-10-12 | Ravgen, Inc. | Rapid analysis of mutations in the genome |
US20060229823A1 (en) | 2002-03-28 | 2006-10-12 | Affymetrix, Inc. | Methods and computer software products for analyzing genotyping data |
US7241281B2 (en) | 2002-04-08 | 2007-07-10 | Thermogenesis Corporation | Blood component separation method and apparatus |
US7211191B2 (en) | 2004-09-30 | 2007-05-01 | Thermogenesis Corp. | Blood component separation method and apparatus |
US20030235848A1 (en) | 2002-04-11 | 2003-12-25 | Matt Neville | Characterization of CYP 2D6 alleles |
US20040096874A1 (en) | 2002-04-11 | 2004-05-20 | Third Wave Technologies, Inc. | Characterization of CYP 2D6 genotypes |
IL164687A0 (en) | 2002-05-02 | 2005-12-18 | Univ North Carolina | Cellular libraries and methods for the preparation thereof |
US20070178478A1 (en) | 2002-05-08 | 2007-08-02 | Dhallan Ravinder S | Methods for detection of genetic disorders |
US7727720B2 (en) | 2002-05-08 | 2010-06-01 | Ravgen, Inc. | Methods for detection of genetic disorders |
US7442506B2 (en) | 2002-05-08 | 2008-10-28 | Ravgen, Inc. | Methods for detection of genetic disorders |
US20040009518A1 (en) | 2002-05-14 | 2004-01-15 | The Chinese University Of Hong Kong | Methods for evaluating a disease condition by nucleic acid detection and fractionation |
US20040229231A1 (en) | 2002-05-28 | 2004-11-18 | Frudakis Tony N. | Compositions and methods for inferring ancestry |
EP1532453B1 (en) | 2002-05-31 | 2013-08-21 | Genetic Technologies Limited | Maternal antibodies as fetal cell markers to identify and enrich fetal cells from maternal blood |
WO2003106623A2 (en) | 2002-06-13 | 2003-12-24 | New York University | Early noninvasive prenatal test for aneuploidies and heritable conditions |
US7108976B2 (en) | 2002-06-17 | 2006-09-19 | Affymetrix, Inc. | Complexity management of genomic DNA by locus specific amplification |
US7097976B2 (en) | 2002-06-17 | 2006-08-29 | Affymetrix, Inc. | Methods of analysis of allelic imbalance |
US20050009069A1 (en) | 2002-06-25 | 2005-01-13 | Affymetrix, Inc. | Computer software products for analyzing genotyping |
CA2491117A1 (en) | 2002-06-28 | 2004-01-08 | Orchid Biosciences, Inc. | Methods and compositions for analyzing compromised samples using single nucleotide polymorphism panels |
EP1388812A1 (en) | 2002-07-04 | 2004-02-11 | Ronald E. Dr. Kates | Method for training a learning-capable system |
US20040117346A1 (en) | 2002-09-20 | 2004-06-17 | Kilian Stoffel | Computer-based method and apparatus for repurposing an ontology |
US7459273B2 (en) | 2002-10-04 | 2008-12-02 | Affymetrix, Inc. | Methods for genotyping selected polymorphism |
WO2004033649A2 (en) | 2002-10-07 | 2004-04-22 | University Of Medicine And Dentistry Of New Jersey | High throughput multiplex dna sequence amplifications |
JP2006519977A (en) | 2002-11-11 | 2006-08-31 | Affymetrix, Inc. | Method for identifying DNA copy number changes |
US10229244B2 (en) | 2002-11-11 | 2019-03-12 | Affymetrix, Inc. | Methods for identifying DNA copy number changes using hidden markov model based estimations |
EP1594975A4 (en) | 2002-12-04 | 2006-08-02 | Applera Corp | Multiplex amplification of polynucleotides |
PT2469077T (en) | 2003-01-02 | 2020-07-10 | Wobben Properties Gmbh | Rotor blade for a wind energy facility |
US7700325B2 (en) | 2003-01-17 | 2010-04-20 | Trustees Of Boston University | Haplotype analysis |
WO2004065628A1 (en) | 2003-01-21 | 2004-08-05 | Guoliang Fu | Quantitative multiplex detection of nucleic acids |
US7575865B2 (en) | 2003-01-29 | 2009-08-18 | 454 Life Sciences Corporation | Methods of amplifying and sequencing nucleic acids |
EP2261372B1 (en) | 2003-01-29 | 2012-08-22 | 454 Life Sciences Corporation | Methods of amplifying and sequencing nucleic acids |
US8394582B2 (en) | 2003-03-05 | 2013-03-12 | Genetic Technologies, Inc | Identification of fetal DNA and fetal cell markers in maternal plasma or serum |
US20040209299A1 (en) | 2003-03-07 | 2004-10-21 | Rubicon Genomics, Inc. | In vitro DNA immortalization and whole genome amplification using libraries generated from randomly fragmented DNA |
US20040197832A1 (en) | 2003-04-03 | 2004-10-07 | Mor Research Applications Ltd. | Non-invasive prenatal genetic diagnosis using transcervical cells |
US20070059700A1 (en) | 2003-05-09 | 2007-03-15 | Shengce Tao | Methods and compositions for optimizing multiplex pcr primers |
WO2005000006A2 (en) | 2003-05-28 | 2005-01-06 | Pioneer Hi-Bred International, Inc. | Plant breeding method |
US20040259100A1 (en) | 2003-06-20 | 2004-12-23 | Illumina, Inc. | Methods and compositions for whole genome amplification and genotyping |
DK1660680T3 (en) | 2003-07-31 | 2009-06-22 | Sequenom Inc | Process for High Level Multiplex Polymerase Chain Reactions and Homogeneous Mass Extension Reactions for Genotyping Polymorphisms |
US8346482B2 (en) | 2003-08-22 | 2013-01-01 | Fernandez Dennis S | Integrated biosensor and simulation system for diagnosis and therapy |
WO2005021793A1 (en) | 2003-08-29 | 2005-03-10 | Pantarhei Bioscience B.V. | Prenatal diagnosis of down syndrome by detection of fetal rna markers in maternal blood |
EP1664077B1 (en) | 2003-09-05 | 2016-04-13 | Trustees of Boston University | Method for non-invasive prenatal diagnosis |
US20050053950A1 (en) | 2003-09-08 | 2005-03-10 | Enrique Zudaire Ubani | Protocol and software for multiplex real-time PCR quantification based on the different melting temperatures of amplicons |
JPWO2005028645A1 (en) | 2003-09-24 | 2007-11-15 | Kyushu TLO Company, Limited | SNPs in the 5' regulatory region of the MDR1 gene |
WO2005030999A1 (en) | 2003-09-25 | 2005-04-07 | Dana-Farber Cancer Institute, Inc | Methods to detect lineage-specific cells |
CN101985619B (en) | 2003-10-08 | 2014-08-20 | Trustees of Boston University | Methods for prenatal diagnosis of chromosomal abnormalities |
CA2482097C (en) | 2003-10-13 | 2012-02-21 | F. Hoffmann-La Roche Ag | Methods for isolating nucleic acids |
DE60328193D1 (en) | 2003-10-16 | 2009-08-13 | Sequenom Inc | Non-invasive detection of fetal genetic traits |
WO2005039389A2 (en) | 2003-10-22 | 2005-05-06 | 454 Corporation | Sequence-based karyotyping |
CA2544178A1 (en) | 2003-10-30 | 2005-05-19 | Tufts-New England Medical Center | Prenatal diagnosis using cell-free fetal dna in amniotic fluid |
US20050164252A1 (en) | 2003-12-04 | 2005-07-28 | Yeung Wah Hin A. | Methods using non-genic sequences for the detection, modification and treatment of any disease or improvement of functions of a cell |
EP1706488B1 (en) | 2004-01-12 | 2013-07-24 | Roche NimbleGen, Inc. | Method of performing pcr amplification on a microarray |
US20100216151A1 (en) | 2004-02-27 | 2010-08-26 | Helicos Biosciences Corporation | Methods for detecting fetal nucleic acids and diagnosing fetal abnormalities |
US20060046258A1 (en) | 2004-02-27 | 2006-03-02 | Lapidus Stanley N | Applications of single molecule sequencing |
US20100216153A1 (en) | 2004-02-27 | 2010-08-26 | Helicos Biosciences Corporation | Methods for detecting fetal nucleic acids and diagnosing fetal abnormalities |
US7035740B2 (en) | 2004-03-24 | 2006-04-25 | Illumina, Inc. | Artificial intelligence and global normalization methods for genotyping |
JP4437050B2 (en) | 2004-03-26 | 2010-03-24 | Hitachi, Ltd. | Diagnosis support system, diagnosis support method, and diagnosis support service providing method |
US7805282B2 (en) | 2004-03-30 | 2010-09-28 | New York University | Process, software arrangement and computer-accessible medium for obtaining information associated with a haplotype |
AU2005233254B2 (en) | 2004-03-31 | 2011-11-24 | Kellbenx, Inc. | Monoclonal antibodies with specificity for fetal erythroid cells |
US7414118B1 (en) | 2004-04-14 | 2008-08-19 | Applied Biosystems Inc. | Modified oligonucleotides and applications thereof |
US7468249B2 (en) | 2004-05-05 | 2008-12-23 | Biocept, Inc. | Detection of chromosomal disorders |
US7709194B2 (en) | 2004-06-04 | 2010-05-04 | The Chinese University Of Hong Kong | Marker for prenatal diagnosis and monitoring |
WO2005123779A2 (en) | 2004-06-14 | 2005-12-29 | The Board Of Trustees Of The University Of Illinois | Antibodies binding to cd34+/cd36+ fetal but not to adult cells |
US20080102455A1 (en) | 2004-07-06 | 2008-05-01 | Genera Biosystems Pty Ltd | Method Of Detecting Aneuploidy |
DE102004036285A1 (en) | 2004-07-27 | 2006-02-16 | Advalytix Ag | Method for determining the frequency of sequences of a sample |
JP5249581B2 (en) | 2004-08-09 | 2013-07-31 | Generation Biotech, LLC | Nucleic acid isolation and amplification method |
EP1789786A4 (en) | 2004-08-18 | 2008-02-13 | Abbott Molecular Inc | Determining data quality and/or segmental aneusomy using a computer system |
US7634808B1 (en) | 2004-08-20 | 2009-12-15 | Symantec Corporation | Method and apparatus to block fast-spreading computer worms that use DNS MX record queries |
JP4810164B2 (en) | 2004-09-03 | 2011-11-09 | 富士フイルム株式会社 | Nucleic acid separation and purification method |
US8024128B2 (en) | 2004-09-07 | 2011-09-20 | Gene Security Network, Inc. | System and method for improving clinical decisions by aggregating, validating and analysing genetic and phenotypic data |
US20060088574A1 (en) | 2004-10-25 | 2006-04-27 | Manning Paul B | Nutritional supplements |
US20060134662A1 (en) | 2004-10-25 | 2006-06-22 | Pratt Mark R | Method and system for genotyping samples in a normalized allelic space |
US9109256B2 (en) * | 2004-10-27 | 2015-08-18 | Esoterix Genetic Laboratories, Llc | Method for monitoring disease progression or recurrence |
WO2006055761A1 (en) | 2004-11-17 | 2006-05-26 | Mosaic Reproductive Health And Genetics, Llc | Methods of determining human egg competency |
CA2588296A1 (en) | 2004-11-24 | 2006-06-01 | Neuromolecular Pharmaceuticals, Inc. | Composition comprising an nmda receptor antagonist and levodopa and use thereof for treating neurological disease |
US20070042384A1 (en) | 2004-12-01 | 2007-02-22 | Weiwei Li | Method for isolating and modifying DNA from blood and body fluids |
US20090098534A1 (en) | 2005-02-25 | 2009-04-16 | The Regents Of The University Of California | Full Karyotype Single Cell Chromosome Analysis |
US7618777B2 (en) | 2005-03-16 | 2009-11-17 | Agilent Technologies, Inc. | Composition and method for array hybridization |
CN102010902B (en) | 2005-03-18 | 2012-05-23 | 香港中文大学 | Markers for prenatal diagnosis and monitoring |
AU2006224971B2 (en) | 2005-03-18 | 2009-07-02 | Boston University | A method for the detection of chromosomal aneuploidies |
US20060228721A1 (en) | 2005-04-12 | 2006-10-12 | Leamon John H | Methods for determining sequence variants using ultra-deep sequencing |
EP1877576B1 (en) | 2005-04-12 | 2013-01-23 | 454 Life Sciences Corporation | Methods for determining sequence variants using ultra-deep sequencing |
WO2006128192A2 (en) | 2005-05-27 | 2006-11-30 | John Wayne Cancer Institute | Use of free circulating dna for diagnosis, prognosis, and treatment of cancer |
JP4879975B2 (en) | 2005-05-31 | 2012-02-22 | アプライド バイオシステムズ リミテッド ライアビリティー カンパニー | Multiplex amplification of short nucleic acids |
US7601499B2 (en) | 2005-06-06 | 2009-10-13 | 454 Life Sciences Corporation | Paired end sequencing |
CA2611743C (en) | 2005-06-15 | 2019-12-31 | Callida Genomics, Inc. | Nucleic acid analysis by forming and tracking aliquoted fragments of a target polynucleotide |
EP1910549B1 (en) | 2005-07-15 | 2010-11-24 | Life Technologies Corporation | Analyzing messenger rna and micro rna in the same reaction mixture |
US20070020640A1 (en) | 2005-07-21 | 2007-01-25 | Mccloskey Megan L | Molecular encoding of nucleic acid templates for PCR and other forms of sequence analysis |
RU2290078C1 (en) * | 2005-07-25 | 2006-12-27 | Евгений Владимирович Новичков | Method for predicting the relapse of serous ovarian cancer |
US8532930B2 (en) | 2005-11-26 | 2013-09-10 | Natera, Inc. | Method for determining the number of copies of a chromosome in the genome of a target individual using genetic data from genetically related individuals |
US11111543B2 (en) | 2005-07-29 | 2021-09-07 | Natera, Inc. | System and method for cleaning noisy genetic data and determining chromosome copy number |
US8515679B2 (en) | 2005-12-06 | 2013-08-20 | Natera, Inc. | System and method for cleaning noisy genetic data and determining chromosome copy number |
US9424392B2 (en) | 2005-11-26 | 2016-08-23 | Natera, Inc. | System and method for cleaning noisy genetic data from target individuals using genetic data from genetically related individuals |
US20070027636A1 (en) | 2005-07-29 | 2007-02-01 | Matthew Rabinowitz | System and method for using genetic, phenotypic and clinical data to make predictions for clinical or lifestyle decisions |
US11111544B2 (en) | 2005-07-29 | 2021-09-07 | Natera, Inc. | System and method for cleaning noisy genetic data and determining chromosome copy number |
US10081839B2 (en) | 2005-07-29 | 2018-09-25 | Natera, Inc. | System and method for cleaning noisy genetic data and determining chromosome copy number |
US20070178501A1 (en) | 2005-12-06 | 2007-08-02 | Matthew Rabinowitz | System and method for integrating and validating genotypic, phenotypic and medical information into a database according to a standardized ontology |
US10083273B2 (en) | 2005-07-29 | 2018-09-25 | Natera, Inc. | System and method for cleaning noisy genetic data and determining chromosome copy number |
ATE510930T1 (en) | 2005-08-02 | 2011-06-15 | Rubicon Genomics Inc | Compositions and methods for editing and amplification of DNA using multiple enzymes in a single reaction |
JP2009507080A (en) | 2005-09-07 | 2009-02-19 | リゲル ファーマシューティカルズ,インコーポレーテッド | Triazole derivatives useful as Axl inhibitors |
GB0522310D0 (en) | 2005-11-01 | 2005-12-07 | Solexa Ltd | Methods of preparing libraries of template polynucleotides |
WO2007056601A2 (en) | 2005-11-09 | 2007-05-18 | The Regents Of The University Of California | Methods and apparatus for context-sensitive telemedicine |
GB0523276D0 (en) | 2005-11-15 | 2005-12-21 | London Bridge Fertility | Chromosomal analysis by molecular karyotyping |
JP6121642B2 (en) | 2005-11-26 | 2017-04-26 | ナテラ, インコーポレイテッド | System and method for cleaning and using genetic data for making predictions |
US20070172853A1 (en) | 2005-12-02 | 2007-07-26 | The General Hospital Corporation | Use of deletion polymorphisms to predict, prevent, and manage histoincompatibility |
WO2007070482A2 (en) | 2005-12-14 | 2007-06-21 | Xueliang Xia | Microarray-based preimplantation genetic diagnosis of chromosomal abnormalities |
JP5198284B2 (en) | 2005-12-22 | 2013-05-15 | キージーン ナムローゼ フェンノートシャップ | An improved strategy for transcript characterization using high-throughput sequencing techniques |
PT2385143T (en) | 2006-02-02 | 2016-10-18 | Univ Leland Stanford Junior | Non-invasive fetal genetic screening by digital analysis |
WO2007092538A2 (en) | 2006-02-07 | 2007-08-16 | President And Fellows Of Harvard College | Methods for making nucleotide probes for sequencing and synthesis |
WO2007091064A1 (en) | 2006-02-08 | 2007-08-16 | Solexa Limited | End modification to prevent over-representation of fragments |
ATE508209T1 (en) | 2006-02-28 | 2011-05-15 | Univ Louisville Res Found | Detection of chromosome abnormalities in the fetus using tandem single nucleotide polymorphisms |
US20100184043A1 (en) | 2006-02-28 | 2010-07-22 | University Of Louisville Research Foundation | Detecting Genetic Abnormalities |
EP2007874A4 (en) | 2006-03-13 | 2009-06-10 | Veridex Llc | Propagation of primary cells |
WO2007112418A2 (en) | 2006-03-28 | 2007-10-04 | Baylor College Of Medicine | Screening for down syndrome |
JP2009072062A (en) | 2006-04-07 | 2009-04-09 | Institute Of Physical & Chemical Research | Method for isolating 5'-terminals of nucleic acid and its application |
US20070243549A1 (en) | 2006-04-12 | 2007-10-18 | Biocept, Inc. | Enrichment of circulating fetal dna |
US7702468B2 (en) | 2006-05-03 | 2010-04-20 | Population Diagnostics, Inc. | Evaluating genetic disorders |
US7901884B2 (en) | 2006-05-03 | 2011-03-08 | The Chinese University Of Hong Kong | Markers for prenatal diagnosis and monitoring |
US7754428B2 (en) | 2006-05-03 | 2010-07-13 | The Chinese University Of Hong Kong | Fetal methylation markers |
EP3260556B1 (en) | 2006-05-31 | 2019-07-31 | Sequenom, Inc. | Methods for the extraction of nucleic acid from a sample |
WO2007146819A2 (en) | 2006-06-09 | 2007-12-21 | Brigham And Women's Hospital, Inc. | Methods for identifying and using snp panels |
US20080050739A1 (en) | 2006-06-14 | 2008-02-28 | Roland Stoughton | Diagnosis of fetal abnormalities using polymorphisms including short tandem repeats |
WO2009035447A1 (en) | 2006-06-14 | 2009-03-19 | Living Microsystems, Inc. | Diagnosis of fetal abnormalities by comparative genomic hybridization analysis |
EP3406736B1 (en) | 2006-06-14 | 2022-09-07 | Verinata Health, Inc. | Methods for the diagnosis of fetal abnormalities |
US8137912B2 (en) | 2006-06-14 | 2012-03-20 | The General Hospital Corporation | Methods for the diagnosis of fetal abnormalities |
EP2589668A1 (en) | 2006-06-14 | 2013-05-08 | Verinata Health, Inc | Rare cell analysis using sample splitting and DNA tags |
EP4108780A1 (en) | 2006-06-14 | 2022-12-28 | Verinata Health, Inc. | Rare cell analysis using sample splitting and dna tags |
DK2029778T3 (en) | 2006-06-14 | 2018-08-20 | Verinata Health Inc | Diagnosis of fetal abnormalities |
EP2029779A4 (en) | 2006-06-14 | 2010-01-20 | Living Microsystems Inc | Use of highly parallel snp genotyping for fetal diagnosis |
WO2007147018A1 (en) | 2006-06-14 | 2007-12-21 | Cellpoint Diagnostics, Inc. | Analysis of rare cell-enriched samples |
WO2007149791A2 (en) | 2006-06-15 | 2007-12-27 | Stratagene | System for isolating biomolecules from a sample |
CA2655269A1 (en) | 2006-06-16 | 2007-12-21 | Sequenom, Inc. | Methods and compositions for the amplification, detection and quantification of nucleic acid from a sample |
US20080182244A1 (en) | 2006-08-04 | 2008-07-31 | Ikonisys, Inc. | Pre-Implantation Genetic Diagnosis Test |
WO2008024473A2 (en) | 2006-08-24 | 2008-02-28 | University Of Massachusetts Medical School | Mapping of genomic interactions |
EP2064332B1 (en) | 2006-09-14 | 2012-07-18 | Ibis Biosciences, Inc. | Targeted whole genome amplification method for identification of pathogens |
US20080085836A1 (en) | 2006-09-22 | 2008-04-10 | Kearns William G | Method for genetic testing of human embryos for chromosome abnormalities, segregating genetic disorders with or without a known mutation and mitochondrial disorders following in vitro fertilization (IVF), embryo culture and embryo biopsy |
AU2007311126A1 (en) | 2006-10-16 | 2008-04-24 | Celula Inc. | Methods and compositions for differential expansion of fetal cells in maternal blood and their use |
US20100184152A1 (en) | 2006-10-23 | 2010-07-22 | Vladislav Sandler | Target-oriented whole genome amplification of nucleic acids |
KR100825367B1 (en) | 2006-11-07 | 2008-04-28 | 전남대학교산학협력단 | Method and kit for determining chimerism after stem cell transplantation using mitochondrial dna microsatellites as markers |
US20110045462A1 (en) | 2006-11-14 | 2011-02-24 | The Regents Of The University Of California | Digital analysis of gene expression |
WO2008059578A1 (en) | 2006-11-16 | 2008-05-22 | Olympus Corporation | Multiplex pcr method |
EP2082064B1 (en) | 2006-11-16 | 2011-10-05 | Genentech, Inc. | Genetic variations associated with tumors |
WO2008079374A2 (en) | 2006-12-21 | 2008-07-03 | Wang Eric T | Methods and compositions for selecting and using single nucleotide polymorphisms |
WO2008081451A2 (en) | 2007-01-03 | 2008-07-10 | Monaliza Medical Ltd. | Methods and kits for analyzing genetic material of a fetus |
US20080164204A1 (en) | 2007-01-08 | 2008-07-10 | Mehdi Hatamian | Valve for facilitating and maintaining separation of fluids and materials |
JP5690068B2 (en) | 2007-01-11 | 2015-03-25 | エラスムス ユニバーシティ メディカル センター | Circular chromosome conformation capture (4C) |
US20100129792A1 (en) | 2007-02-06 | 2010-05-27 | Gerassimos Makrigiorgos | Direct monitoring and pcr amplification of the dosage and dosage difference between target genetic regions |
AU2008213634B2 (en) | 2007-02-08 | 2013-09-05 | Sequenom, Inc. | Nucleic acid-based tests for RhD typing, gender determination and nucleic acid quantification |
JP2008209985A (en) | 2007-02-23 | 2008-09-11 | Canon Inc | Data processor, electronic document registration method and computer program |
WO2008115497A2 (en) | 2007-03-16 | 2008-09-25 | Gene Security Network | System and method for cleaning noisy genetic data and determining chromosome copy number |
EP2121984A2 (en) | 2007-03-16 | 2009-11-25 | 454 Life Sciences Corporation | System and method for detection of hiv drug resistant variants |
EP2140031A4 (en) | 2007-03-26 | 2011-04-20 | Sequenom Inc | Restriction endonuclease enhanced polymorphic sequence detection |
JP5262230B2 (en) | 2007-03-28 | 2013-08-14 | 独立行政法人理化学研究所 | New polymorphism detection method |
ITTO20070307A1 (en) | 2007-05-04 | 2008-11-05 | Silicon Biosystems Spa | Method and device for non-invasive prenatal diagnosis |
KR20100038330A (en) | 2007-05-31 | 2010-04-14 | 더 리젠츠 오브 더 유니버시티 오브 캘리포니아 | High specificity and high sensitivity detection based on steric hindrance & enzyme-related signal amplification |
WO2008151110A2 (en) | 2007-06-01 | 2008-12-11 | The University Of North Carolina At Chapel Hill | Molecular diagnosis and typing of lung cancer variants |
WO2008157264A2 (en) | 2007-06-15 | 2008-12-24 | Sequenom, Inc. | Combined methods for the detection of chromosomal aneuploidy |
US20090023190A1 (en) | 2007-06-20 | 2009-01-22 | Kai Qin Lao | Sequence amplification with loopable primers |
WO2008155599A1 (en) | 2007-06-20 | 2008-12-24 | Taag-Genetics Sa | Nested multiplex amplification method for identification of multiple biological entities |
AU2008272421B2 (en) | 2007-07-03 | 2014-09-04 | Genaphora Ltd. | Chimeric primers for improved nucleic acid amplification reactions |
WO2009009769A2 (en) | 2007-07-11 | 2009-01-15 | Artemis Health, Inc. | Diagnosis of fetal abnormalities using nucleated red blood cells |
EP3067807A1 (en) | 2007-07-23 | 2016-09-14 | The Chinese University of Hong Kong | Diagnosing fetal chromosomal aneuploidy using genomic sequencing |
US20100112590A1 (en) | 2007-07-23 | 2010-05-06 | The Chinese University Of Hong Kong | Diagnosing Fetal Chromosomal Aneuploidy Using Genomic Sequencing With Enrichment |
US8455190B2 (en) | 2007-08-01 | 2013-06-04 | Dana-Farber Cancer Institute, Inc. | Enrichment of a target sequence |
US20090053719A1 (en) | 2007-08-03 | 2009-02-26 | The Chinese University Of Hong Kong | Analysis of nucleic acids by digital pcr |
WO2009019215A1 (en) | 2007-08-03 | 2009-02-12 | Dkfz Deutsches Krebsforschungszentrum | Method for prenatal diagnosis using exosomes and cd24 as a marker |
WO2009032781A2 (en) | 2007-08-29 | 2009-03-12 | Sequenom, Inc. | Methods and compositions for universal size-specific polymerase chain reaction |
WO2009032779A2 (en) | 2007-08-29 | 2009-03-12 | Sequenom, Inc. | Methods and compositions for the size-specific separation of nucleic acid from a sample |
US8748100B2 (en) | 2007-08-30 | 2014-06-10 | The Chinese University Of Hong Kong | Methods and kits for selectively amplifying, detecting or quantifying target DNA with specific end sequences |
AU2008295992B2 (en) | 2007-09-07 | 2014-04-17 | Fluidigm Corporation | Copy number variation determination, methods and systems |
CA2697640C (en) | 2007-09-21 | 2016-06-21 | Katholieke Universiteit Leuven | Tools and methods for genetic tests using next generation sequencing |
US20110300608A1 (en) | 2007-09-21 | 2011-12-08 | Streck, Inc. | Nucleic acid isolation in preserved whole blood |
US20100086914A1 (en) | 2008-10-03 | 2010-04-08 | Roche Molecular Systems, Inc. | High resolution, high throughput hla genotyping by clonal sequencing |
EP2203567B1 (en) | 2007-10-16 | 2011-05-11 | Roche Diagnostics GmbH | High resolution, high throughput hla genotyping by clonal sequencing |
WO2009064897A2 (en) | 2007-11-14 | 2009-05-22 | Chronix Biomedical | Detection of nucleic acid sequence variations in circulating nucleic acid in bovine spongiform encephalopathy |
EP2217703B1 (en) | 2007-11-30 | 2017-03-01 | GE Healthcare Bio-Sciences Corp. | Method for isolation of genomic dna, rna and proteins from a single sample |
FR2925480B1 (en) | 2007-12-21 | 2011-07-01 | Gervais Danone Sa | Process for the enrichment of oxygen water by electrolytic, oxygen-enriched water or drink and uses thereof |
EP2077337A1 (en) | 2007-12-26 | 2009-07-08 | Eppendorf Array Technologies SA | Amplification and detection composition, method and kit |
WO2009092035A2 (en) | 2008-01-17 | 2009-07-23 | Sequenom, Inc. | Methods and compositions for the analysis of biological molecules |
EP3360972B1 (en) | 2008-01-17 | 2019-12-11 | Sequenom, Inc. | Single molecule nucleic acid sequence analysis processes |
WO2009100029A1 (en) | 2008-02-01 | 2009-08-13 | The General Hospital Corporation | Use of microvesicles in diagnosis, prognosis and treatment of medical diseases and conditions |
WO2009099602A1 (en) | 2008-02-04 | 2009-08-13 | Massachusetts Institute Of Technology | Selection of nucleic acids by solution hybridization to oligonucleotide baits |
WO2009105531A1 (en) | 2008-02-19 | 2009-08-27 | Gene Security Network, Inc. | Methods for cell genotyping |
US20090221620A1 (en) | 2008-02-20 | 2009-09-03 | Celera Corporation | Genetic polymorphisms associated with stroke, methods of detection and uses thereof |
US8709726B2 (en) | 2008-03-11 | 2014-04-29 | Sequenom, Inc. | Nucleic acid-based tests for prenatal gender determination |
US20090307180A1 (en) | 2008-03-19 | 2009-12-10 | Brandon Colby | Genetic analysis |
WO2009120808A2 (en) | 2008-03-26 | 2009-10-01 | Sequenom, Inc. | Restriction endonuclease enhanced polymorphic sequence detection |
US8133672B2 (en) | 2008-03-31 | 2012-03-13 | Pacific Biosciences Of California, Inc. | Two slow-step polymerase enzyme systems and methods |
JP5811483B2 (en) | 2008-04-03 | 2015-11-11 | シービー バイオテクノロジーズ インコーポレイテッド | Amplicon Rescue Multiplex Polymerase Chain Reaction for Amplification of Multiple Targets |
WO2009146335A1 (en) | 2008-05-27 | 2009-12-03 | Gene Security Network, Inc. | Methods for embryo characterization and comparison |
EP2128169A1 (en) | 2008-05-30 | 2009-12-02 | Qiagen GmbH | Method for isolating short chain nucleic acids |
BRPI0915953B1 (en) | 2008-07-21 | 2021-04-13 | Becton, Dickinson And Company | Mechanical separator, separation assembly and separating method |
JP5555453B2 (en) | 2008-07-24 | 2014-07-23 | スリーエム イノベイティブ プロパティズ カンパニー | Abrasive product, method for producing and using the same |
US20100041048A1 (en) | 2008-07-31 | 2010-02-18 | The Johns Hopkins University | Circulating Mutant DNA to Assess Tumor Dynamics |
ES2620431T3 (en) | 2008-08-04 | 2017-06-28 | Natera, Inc. | Methods for the determination of alleles and ploidy |
CN102203287B (en) | 2008-08-26 | 2017-09-19 | 弗卢迪格姆公司 | Assay methods for increased throughput of samples and/or targets |
DE102008045705A1 (en) | 2008-09-04 | 2010-04-22 | Macherey, Nagel Gmbh & Co. Kg Handelsgesellschaft | Method for obtaining short RNA and kit therefor |
US8586310B2 (en) | 2008-09-05 | 2013-11-19 | Washington University | Method for multiplexed nucleic acid patch polymerase chain reaction |
US8476013B2 (en) | 2008-09-16 | 2013-07-02 | Sequenom, Inc. | Processes and compositions for methylation-based acid enrichment of fetal nucleic acid from a maternal sample useful for non-invasive prenatal diagnoses |
EP2329021B1 (en) | 2008-09-16 | 2016-08-10 | Sequenom, Inc. | Processes and compositions for methylation-based enrichment of fetal nucleic acid from a maternal sample useful for non invasive prenatal diagnoses |
WO2010033652A1 (en) | 2008-09-17 | 2010-03-25 | Ge Healthcare Bio-Sciences Corp. | Method for small rna isolation |
EP2952589B1 (en) | 2008-09-20 | 2018-02-14 | The Board of Trustees of The Leland Stanford Junior University | Noninvasive diagnosis of fetal aneuploidy by sequencing |
US20110159499A1 (en) | 2009-11-25 | 2011-06-30 | Quantalife, Inc. | Methods and compositions for detecting genetic material |
US9156010B2 (en) | 2008-09-23 | 2015-10-13 | Bio-Rad Laboratories, Inc. | Droplet-based assay system |
WO2010042831A2 (en) | 2008-10-10 | 2010-04-15 | Swedish Health Services | Diagnosis, prognosis and treatment of glioblastoma multiforme |
WO2010045617A2 (en) | 2008-10-17 | 2010-04-22 | University Of Louisville Research Foundation | Detecting genetic abnormalities |
US8236503B2 (en) | 2008-11-07 | 2012-08-07 | Sequenta, Inc. | Methods of monitoring conditions by sequence analysis |
US9506119B2 (en) | 2008-11-07 | 2016-11-29 | Adaptive Biotechnologies Corp. | Method of sequence determination using sequence tags |
US8748103B2 (en) | 2008-11-07 | 2014-06-10 | Sequenta, Inc. | Monitoring health and disease status using clonotype profiles |
US20100285478A1 (en) | 2009-03-27 | 2010-11-11 | Life Technologies Corporation | Methods, Compositions, and Kits for Detecting Allelic Variants |
AU2009329946B2 (en) | 2008-12-22 | 2016-01-07 | Celula, Inc. | Methods and genotyping panels for detecting alleles, genomes, and transcriptomes |
US11634747B2 (en) | 2009-01-21 | 2023-04-25 | Streck Llc | Preservation of fetal nucleic acids in maternal plasma |
US8450063B2 (en) | 2009-01-28 | 2013-05-28 | Fluidigm Corporation | Determination of copy number differences by amplification |
DK3290530T3 (en) | 2009-02-18 | 2020-12-07 | Streck Inc | Preservation of cell-free nucleic acids |
US20100285537A1 (en) | 2009-04-02 | 2010-11-11 | Fluidigm Corporation | Selective tagging of short nucleic acid fragments and selective protection of target sequences from degradation |
CN102439177B (en) | 2009-04-02 | 2014-10-01 | 弗卢伊蒂格姆公司 | Multi-primer amplification method for barcoding of target nucleic acids |
EP3964586A1 (en) | 2009-04-03 | 2022-03-09 | Sequenom, Inc. | Nucleic acid preparation compositions and methods |
EP3327139A1 (en) | 2009-04-06 | 2018-05-30 | The Johns Hopkins University | Digital quantification of dna methylation |
WO2010127186A1 (en) | 2009-04-30 | 2010-11-04 | Prognosys Biosciences, Inc. | Nucleic acid constructs and methods of use |
US8481699B2 (en) | 2009-07-14 | 2013-07-09 | Academia Sinica | Multiplex barcoded Paired-End ditag (mbPED) library construction for ultra high throughput sequencing |
US20130196862A1 (en) | 2009-07-17 | 2013-08-01 | Natera, Inc. | Informatics Enhanced Analysis of Fetal Samples Subject to Maternal Contamination |
US10017812B2 (en) | 2010-05-18 | 2018-07-10 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US8563242B2 (en) | 2009-08-11 | 2013-10-22 | The Chinese University Of Hong Kong | Method for detecting chromosomal aneuploidy |
CN101643904B (en) | 2009-08-27 | 2011-04-27 | 北京北方微电子基地设备工艺研究中心有限责任公司 | Deep silicon etching device and intake system thereof |
WO2011032078A1 (en) | 2009-09-11 | 2011-03-17 | Health Research Inc. | Detection of x4 strains of hiv-1 by heteroduplex tracking assay |
EP2480683B1 (en) | 2009-09-22 | 2017-11-29 | Roche Diagnostics GmbH | Determination of kir haplotypes by amplification of exons |
EP2480666B1 (en) | 2009-09-24 | 2017-03-08 | QIAGEN Gaithersburg, Inc. | Compositions, methods, and kits for isolating and analyzing nucleic acids using an anion exchange material |
US20120185176A1 (en) | 2009-09-30 | 2012-07-19 | Natera, Inc. | Methods for Non-Invasive Prenatal Ploidy Calling |
EP2824191A3 (en) | 2009-10-26 | 2015-02-18 | Lifecodexx AG | Means and methods for non-invasive diagnosis of chromosomal aneuploidy |
AU2010315037B9 (en) | 2009-11-05 | 2015-04-23 | Sequenom, Inc. | Fetal genomic analysis from a maternal biological sample |
EP2496720B1 (en) | 2009-11-06 | 2020-07-08 | The Board of Trustees of the Leland Stanford Junior University | Non-invasive diagnosis of graft rejection in organ transplant patients |
EP2499259B1 (en) | 2009-11-09 | 2016-04-06 | Streck Inc. | Stabilization of rna in and extracting from intact cells within a blood sample |
US20120251411A1 (en) | 2009-12-07 | 2012-10-04 | Min-Yong Jeon | Centrifuge tube |
WO2011072086A1 (en) | 2009-12-08 | 2011-06-16 | Hemaquest Pharmaceuticals, Inc. | Methods and low dose regimens for treating red blood cell disorders |
US8835358B2 (en) | 2009-12-15 | 2014-09-16 | Cellular Research, Inc. | Digital counting of individual molecules by stochastic attachment of diverse labels |
US9315857B2 (en) | 2009-12-15 | 2016-04-19 | Cellular Research, Inc. | Digital counting of individual molecules by stochastic attachment of diverse label-tags |
US8574842B2 (en) | 2009-12-22 | 2013-11-05 | The Board Of Trustees Of The Leland Stanford Junior University | Direct molecular diagnosis of fetal aneuploidy |
EP2516680B1 (en) | 2009-12-22 | 2016-04-06 | Sequenom, Inc. | Processes and kits for identifying aneuploidy |
WO2011085491A1 (en) | 2010-01-15 | 2011-07-21 | The University Of British Columbia | Multiplex amplification for the detection of nucleic acid variations |
US20120010085A1 (en) | 2010-01-19 | 2012-01-12 | Rava Richard P | Methods for determining fraction of fetal nucleic acids in maternal samples |
US10388403B2 (en) | 2010-01-19 | 2019-08-20 | Verinata Health, Inc. | Analyzing copy number variation in the detection of cancer |
US9323888B2 (en) | 2010-01-19 | 2016-04-26 | Verinata Health, Inc. | Detecting and classifying copy number variation |
US20120270739A1 (en) | 2010-01-19 | 2012-10-25 | Verinata Health, Inc. | Method for sample analysis of aneuploidies in maternal samples |
EP2883965B8 (en) | 2010-01-19 | 2018-06-27 | Verinata Health, Inc | Method for determining copy number variations |
US20110312503A1 (en) | 2010-01-23 | 2011-12-22 | Artemis Health, Inc. | Methods of fetal abnormality detection |
EP2534263B1 (en) | 2010-02-09 | 2020-08-05 | Unitaq Bio | Methods and compositions for universal detection of nucleic acids |
PL2539472T3 (en) | 2010-02-26 | 2016-03-31 | Life Technologies Corp | Fast pcr for str genotyping |
CN103237901B (en) | 2010-03-01 | 2016-08-03 | 卡里斯生命科学瑞士控股有限责任公司 | For treating the biomarker of diagnosis |
WO2011118603A1 (en) | 2010-03-24 | 2011-09-29 | 凸版印刷株式会社 | Method for detecting target base sequence using competitive primer |
US9255291B2 (en) | 2010-05-06 | 2016-02-09 | Bioo Scientific Corporation | Oligonucleotide ligation methods for improving data quality and throughput using massively parallel sequencing |
EP2566984B1 (en) | 2010-05-07 | 2019-04-03 | The Board of Trustees of the Leland Stanford Junior University | Measurement and comparison of immune diversity by high-throughput sequencing |
WO2011143659A2 (en) | 2010-05-14 | 2011-11-17 | Fluidigm Corporation | Nucleic acid isolation methods |
SG185543A1 (en) | 2010-05-14 | 2012-12-28 | Fluidigm Corp | Assays for the detection of genotype, mutations, and/or aneuploidy |
WO2013052557A2 (en) | 2011-10-03 | 2013-04-11 | Natera, Inc. | Methods for preimplantation genetic diagnosis by sequencing |
US11939634B2 (en) | 2010-05-18 | 2024-03-26 | Natera, Inc. | Methods for simultaneous amplification of target loci |
US20190010543A1 (en) | 2010-05-18 | 2019-01-10 | Natera, Inc. | Methods for simultaneous amplification of target loci |
US11332785B2 (en) | 2010-05-18 | 2022-05-17 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US11332793B2 (en) | 2010-05-18 | 2022-05-17 | Natera, Inc. | Methods for simultaneous amplification of target loci |
US9677118B2 (en) | 2014-04-21 | 2017-06-13 | Natera, Inc. | Methods for simultaneous amplification of target loci |
US20130123120A1 (en) | 2010-05-18 | 2013-05-16 | Natera, Inc. | Highly Multiplex PCR Methods and Compositions |
US20190284623A1 (en) | 2010-05-18 | 2019-09-19 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US10316362B2 (en) | 2010-05-18 | 2019-06-11 | Natera, Inc. | Methods for simultaneous amplification of target loci |
US20190309358A1 (en) | 2010-05-18 | 2019-10-10 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US20190323076A1 (en) | 2010-05-18 | 2019-10-24 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US11408031B2 (en) | 2010-05-18 | 2022-08-09 | Natera, Inc. | Methods for non-invasive prenatal paternity testing |
US11326208B2 (en) | 2010-05-18 | 2022-05-10 | Natera, Inc. | Methods for nested PCR amplification of cell-free DNA |
US11322224B2 (en) | 2010-05-18 | 2022-05-03 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
CA2798758C (en) | 2010-05-18 | 2019-05-07 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US20140206552A1 (en) | 2010-05-18 | 2014-07-24 | Natera, Inc. | Methods for preimplantation genetic diagnosis by sequencing |
US11339429B2 (en) | 2010-05-18 | 2022-05-24 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
US20110301854A1 (en) | 2010-06-08 | 2011-12-08 | Curry Bo U | Method of Determining Allele-Specific Copy Number of a SNP |
US8700338B2 (en) | 2011-01-25 | 2014-04-15 | Ariosa Diagnostics, Inc. | Risk calculation for evaluation of fetal aneuploidy |
US20130040375A1 (en) | 2011-08-08 | 2013-02-14 | Tandem Diagnostics, Inc. | Assay systems for genetic analysis |
US20120034603A1 (en) | 2010-08-06 | 2012-02-09 | Tandem Diagnostics, Inc. | Ligation-based detection of genetic variants |
US20120190557A1 (en) | 2011-01-25 | 2012-07-26 | Aria Diagnostics, Inc. | Risk calculation for evaluation of fetal aneuploidy |
EP2426217A1 (en) | 2010-09-03 | 2012-03-07 | Centre National de la Recherche Scientifique (CNRS) | Analytical methods for cell free nucleic acids and applications |
WO2012042374A2 (en) | 2010-10-01 | 2012-04-05 | Anssi Jussi Nikolai Taipale | Method of determining number or concentration of molecules |
EP2633311A4 (en) | 2010-10-26 | 2014-05-07 | Univ Stanford | Non-invasive fetal genetic screening by sequencing analysis |
US9284602B2 (en) | 2010-10-27 | 2016-03-15 | President And Fellows Of Harvard College | Compositions of toehold primer duplexes and methods of use |
MX349568B (en) | 2010-11-30 | 2017-08-03 | Univ Hong Kong Chinese | Detection of genetic or molecular aberrations associated with cancer. |
WO2012078792A2 (en) | 2010-12-07 | 2012-06-14 | Stanford University | Non-invasive determination of fetal inheritance of parental haplotypes at the genome-wide scale |
WO2012083250A2 (en) | 2010-12-17 | 2012-06-21 | Celula, Inc. | Methods for screening and diagnosing genetic conditions |
WO2012088456A2 (en) | 2010-12-22 | 2012-06-28 | Natera, Inc. | Methods for non-invasive prenatal paternity testing |
EP3225697A3 (en) | 2010-12-30 | 2017-11-22 | Foundation Medicine, Inc. | Optimization of multigene analysis of tumor samples |
US20120190021A1 (en) | 2011-01-25 | 2012-07-26 | Aria Diagnostics, Inc. | Detection of genetic abnormalities |
EP3187597B1 (en) | 2011-02-09 | 2020-06-03 | Natera, Inc. | Methods for non-invasive prenatal ploidy calling |
GB2488358A (en) | 2011-02-25 | 2012-08-29 | Univ Plymouth | Enrichment of foetal DNA in maternal plasma |
US9260753B2 (en) | 2011-03-24 | 2016-02-16 | President And Fellows Of Harvard College | Single cell nucleic acid detection and analysis |
PL2697392T3 (en) | 2011-04-12 | 2016-08-31 | Verinata Health Inc | Resolving genome fractions using polymorphism counts |
CN107368705B (en) | 2011-04-14 | 2021-07-13 | 完整基因有限公司 | Method and computer system for analyzing genomic DNA of organism |
US9411937B2 (en) * | 2011-04-15 | 2016-08-09 | Verinata Health, Inc. | Detecting and classifying copy number variation |
CN110016499B (en) | 2011-04-15 | 2023-11-14 | 约翰·霍普金斯大学 | Safe sequencing system |
CN106912197B (en) | 2011-04-28 | 2022-01-25 | 生命技术公司 | Methods and compositions for multiplex PCR |
CN103717750B (en) | 2011-04-29 | 2017-03-08 | 塞昆纳姆股份有限公司 | Quantification of a minority nucleic acid species |
CN102876660A (en) | 2011-07-11 | 2013-01-16 | 三星电子株式会社 | Method of amplifying target nucleic acid with reduced amplification bias and method for determining relative amount of target nucleic acid in sample |
US20130024127A1 (en) | 2011-07-19 | 2013-01-24 | John Stuelpnagel | Determination of source contributions using binomial probability calculations |
GB201115095D0 (en) | 2011-09-01 | 2011-10-19 | Singapore Volition Pte Ltd | Method for detecting nucleosomes containing nucleotides |
US8712697B2 (en) | 2011-09-07 | 2014-04-29 | Ariosa Diagnostics, Inc. | Determination of copy number variations using binomial probability calculations |
US9249460B2 (en) | 2011-09-09 | 2016-02-02 | The Board Of Trustees Of The Leland Stanford Junior University | Methods for obtaining a sequence |
JP5536729B2 (en) | 2011-09-20 | 2014-07-02 | Sony Computer Entertainment Inc. | Information processing apparatus, application providing system, application providing server, application providing method, and information processing method |
CN103930546A (en) | 2011-09-26 | 2014-07-16 | Qiagen GmbH | Rapid method for isolating extracellular nucleic acids |
WO2013052907A2 (en) | 2011-10-06 | 2013-04-11 | Sequenom, Inc. | Methods and processes for non-invasive assessment of genetic variations |
US9984198B2 (en) | 2011-10-06 | 2018-05-29 | Sequenom, Inc. | Reducing sequence read count error in assessment of complex genetic variations |
US9367663B2 (en) | 2011-10-06 | 2016-06-14 | Sequenom, Inc. | Methods and processes for non-invasive assessment of genetic variations |
AU2012321053B2 (en) | 2011-10-07 | 2018-04-19 | Murdoch Childrens Research Institute | Diagnostic assay for tissue transplantation status |
WO2013078470A2 (en) | 2011-11-22 | 2013-05-30 | Active Motif | Multiplex isolation of protein-associated nucleic acids |
WO2013086352A1 (en) | 2011-12-07 | 2013-06-13 | Chronix Biomedical | Prostate cancer associated circulating nucleic acid biomarkers |
US20130190653A1 (en) | 2012-01-25 | 2013-07-25 | Angel Gabriel Alvarez Ramos | Device for blood collection from the placenta and the umbilical cord |
CA2863257C (en) | 2012-02-14 | 2021-12-14 | Cornell University | Method for relative quantification of nucleic acid sequence, expression, or copy changes, using combined nuclease, ligation, and polymerase reactions |
EP3287531B1 (en) | 2012-02-28 | 2019-06-19 | Agilent Technologies, Inc. | Method for attaching a counter sequence to a nucleic acid sample |
US9892230B2 (en) | 2012-03-08 | 2018-02-13 | The Chinese University Of Hong Kong | Size-based analysis of fetal or tumor DNA fraction in plasma |
WO2013138510A1 (en) | 2012-03-13 | 2013-09-19 | Patel Abhijit Ajit | Measurement of nucleic acid variants using highly-multiplexed error-suppressed deep sequencing |
EP2653562A1 (en) | 2012-04-20 | 2013-10-23 | Institut Pasteur | Anellovirus genome quantification as a biomarker of immune suppression |
ES2905448T3 (en) | 2012-05-10 | 2022-04-08 | Massachusetts Gen Hospital | Methods for determining a nucleotide sequence |
WO2013177220A1 (en) | 2012-05-21 | 2013-11-28 | The Scripps Research Institute | Methods of sample preparation |
US9920361B2 (en) | 2012-05-21 | 2018-03-20 | Sequenom, Inc. | Methods and compositions for analyzing nucleic acid |
CN104350152B (en) | 2012-06-01 | 2017-08-11 | Omega Bio-tek, Inc. | Selective nucleic acid fragment recovery |
WO2014004726A1 (en) | 2012-06-26 | 2014-01-03 | Caifu Chen | Methods, compositions and kits for the diagnosis, prognosis and monitoring of cancer |
US20150011396A1 (en) | 2012-07-09 | 2015-01-08 | Benjamin G. Schroeder | Methods for creating directional bisulfite-converted nucleic acid libraries for next generation sequencing |
EP2877594B1 (en) | 2012-07-20 | 2019-12-04 | Verinata Health, Inc. | Detecting and classifying copy number variation in a fetal genome |
JP6392222B2 (en) | 2012-07-24 | 2018-09-19 | Natera, Inc. | Advanced multiplex PCR methods and compositions |
BR112015001690A2 (en) * | 2012-07-24 | 2017-11-07 | Pharmacyclics Inc | mutations associated with resistance to bruton tyrosine kinase inhibitors (btk) |
US10988803B2 (en) | 2014-12-29 | 2021-04-27 | Life Genetics Lab, Llc | Multiplexed assay for quantitating and assessing integrity of cell-free DNA in biological fluids for cancer diagnosis, prognosis and surveillance |
US10993418B2 (en) | 2012-08-13 | 2021-05-04 | Life Genetics Lab, Llc | Method for measuring tumor burden in patient derived xenograft (PDX) mice |
EP2867252B1 (en) * | 2012-08-15 | 2019-12-04 | Université de Montréal | Method for identifying novel minor histocompatibility antigens |
US20140051585A1 (en) | 2012-08-15 | 2014-02-20 | Natera, Inc. | Methods and compositions for reducing genetic library contamination |
US20140100126A1 (en) | 2012-08-17 | 2014-04-10 | Natera, Inc. | Method for Non-Invasive Prenatal Testing Using Parental Mosaicism Data |
CN108103057B (en) | 2012-08-28 | 2021-09-03 | Akonni Biosystems, Inc. | Method and kit for purifying nucleic acids |
US20140066317A1 (en) | 2012-09-04 | 2014-03-06 | Guardant Health, Inc. | Systems and methods to detect rare mutations and copy number variation |
US20140065621A1 (en) | 2012-09-04 | 2014-03-06 | Natera, Inc. | Methods for increasing fetal fraction in maternal blood |
US10876152B2 (en) | 2012-09-04 | 2020-12-29 | Guardant Health, Inc. | Systems and methods to detect rare mutations and copy number variation |
AU2013326901B2 (en) | 2012-10-04 | 2018-05-31 | Dana-Farber Cancer Institute, Inc. | Human monoclonal anti-PD-L1 antibodies and methods of use |
US9669405B2 (en) | 2012-10-22 | 2017-06-06 | The Regents Of The University Of California | Sterilizable photopolymer serum separator |
EP3026124A1 (en) | 2012-10-31 | 2016-06-01 | Genesupport SA | Non-invasive method for detecting a fetal chromosomal aneuploidy |
US20220307086A1 (en) | 2012-11-21 | 2022-09-29 | Natera, Inc. | Methods for simultaneous amplification of target loci |
US9523121B2 (en) | 2013-01-13 | 2016-12-20 | Uni Taq Bio | Methods and compositions for PCR using blocked and universal primers |
EP2954054B1 (en) | 2013-02-08 | 2018-12-05 | Qiagen GmbH | Method for separating dna by size |
US9982255B2 (en) | 2013-03-11 | 2018-05-29 | Kailos Genetics, Inc. | Capture methodologies for circulating cell free DNA |
GB2528205B (en) | 2013-03-15 | 2020-06-03 | Guardant Health Inc | Systems and methods to detect rare mutations and copy number variation |
CN113337604A (en) | 2013-03-15 | 2021-09-03 | The Board of Trustees of the Leland Stanford Junior University | Identification and use of circulating nucleic acid tumor markers |
US10385394B2 (en) | 2013-03-15 | 2019-08-20 | The Translational Genomics Research Institute | Processes of identifying and characterizing X-linked disorders |
CN105229175A (en) | 2013-03-15 | 2016-01-06 | Abbott Molecular Inc. | Methods for amplifying and detecting RNA fusion gene variants, methods for distinguishing them, and related primers, probes, and kits |
US20160024581A1 (en) | 2013-03-15 | 2016-01-28 | Immucor Gti Diagnostics, Inc. | Methods and compositions for assessing renal status using urine cell free dna |
CA3156663A1 (en) | 2013-03-15 | 2014-09-18 | Verinata Health, Inc. | Generating cell-free dna libraries directly from blood |
US10072298B2 (en) | 2013-04-17 | 2018-09-11 | Life Technologies Corporation | Gene fusions and gene variants associated with cancer |
WO2014194113A2 (en) | 2013-05-29 | 2014-12-04 | Chronix Biomedical | Detection and quantification of donor cell-free dna in the circulation of organ transplant recipients |
WO2015048595A1 (en) | 2013-09-27 | 2015-04-02 | Jay Shendure | Methods and systems for large scale scaffolding of genome assemblies |
US10577655B2 (en) | 2013-09-27 | 2020-03-03 | Natera, Inc. | Cell free DNA diagnostic testing standards |
US9499870B2 (en) | 2013-09-27 | 2016-11-22 | Natera, Inc. | Cell free DNA diagnostic testing standards |
AU2014346562B2 (en) | 2013-11-07 | 2018-11-29 | The Board Of Trustees Of The Leland Stanford Junior University | Cell-free nucleic acids for the analysis of the human microbiome and components thereof |
ES2693217T3 (en) | 2013-12-02 | 2018-12-10 | Personal Genome Diagnostics Inc. | Method to evaluate minority variants in a sample |
EP3524694B1 (en) | 2013-12-28 | 2020-07-15 | Guardant Health, Inc. | Methods and systems for detecting genetic variants |
JP6608368B2 (en) | 2013-12-30 | 2019-11-20 | Atreca, Inc. | Method for analyzing nucleic acids associated with single cells using nucleic acid barcodes |
EP4219744A3 (en) | 2014-01-27 | 2023-08-30 | The General Hospital Corporation | Methods of preparing nucleic acids for sequencing |
CN106029962A (en) | 2014-02-11 | 2016-10-12 | F. Hoffmann-La Roche AG | Targeted sequencing and UID filtering |
WO2015134552A1 (en) | 2014-03-03 | 2015-09-11 | Swift Biosciences, Inc. | Enhanced adaptor ligation |
CA2970916A1 (en) | 2014-03-14 | 2015-09-17 | Caredx, Inc. | Methods of monitoring immunosuppressive therapies in a transplant recipient |
EP3123380B1 (en) | 2014-03-25 | 2020-11-04 | Quest Diagnostics Investments Incorporated | Detection of gene fusions by intragenic differential expression (ide) using average cycle thresholds |
WO2015164432A1 (en) | 2014-04-21 | 2015-10-29 | Natera, Inc. | Detecting mutations and ploidy in chromosomal segments |
RU2717641C2 (en) | 2014-04-21 | 2020-03-24 | Натера, Инк. | Detection of mutations and ploidy in chromosomal segments |
US20180173846A1 (en) | 2014-06-05 | 2018-06-21 | Natera, Inc. | Systems and Methods for Detection of Aneuploidy |
EP3161149B1 (en) | 2014-06-27 | 2020-06-10 | INSERM (Institut National de la Santé et de la Recherche Médicale) | Methods employing circulating dna and mirna as biomarkers for female infertility |
DK3164489T3 (en) | 2014-07-03 | 2020-08-10 | Rhodx Inc | Marking and evaluation of a measurement sequence |
CN106574265B (en) | 2014-07-17 | 2020-07-28 | Qiagen GmbH | Method for isolating RNA with high yield |
GB201412834D0 (en) | 2014-07-18 | 2014-09-03 | Cancer Rec Tech Ltd | A method for detecting a genetic variant |
EP3656875B1 (en) | 2014-07-18 | 2021-09-22 | Illumina, Inc. | Non-invasive prenatal diagnosis |
ES2904909T3 (en) | 2014-10-08 | 2022-04-06 | Univ Cornell | Method for the identification and relative quantification of changes in nucleic acid sequence expression, splice variants, translocation, copy number, or methylation using combined nuclease, ligation, and polymerase reactions with carryover prevention |
US10704104B2 (en) | 2014-10-20 | 2020-07-07 | Inserm (Institut National De La Sante Et De La Recherche Medicale) | Methods for screening a subject for a cancer |
CN107405540A (en) | 2014-10-24 | 2017-11-28 | Abbott Molecular Inc. | Enrichment of small nucleic acids |
EP3218519B1 (en) | 2014-11-11 | 2020-12-02 | BGI Shenzhen | Multi-pass sequencing |
US11279974B2 (en) | 2014-12-01 | 2022-03-22 | The Broad Institute, Inc. | Method for in situ determination of nucleic acid proximity |
GB201501907D0 (en) | 2015-02-05 | 2015-03-25 | Technion Res & Dev Foundation | System and method for single cell genetic analysis |
WO2016123698A1 (en) | 2015-02-06 | 2016-08-11 | Uti Limited Partnership | Diagnostic assay for post-transplant assessment of potential rejection of donor organs |
US10557134B2 (en) | 2015-02-24 | 2020-02-11 | Trustees Of Boston University | Protection of barcodes during DNA amplification using molecular hairpins |
US20160257993A1 (en) | 2015-02-27 | 2016-09-08 | Cellular Research, Inc. | Methods and compositions for labeling targets |
EP4180535A1 (en) | 2015-03-30 | 2023-05-17 | Becton, Dickinson and Company | Methods and compositions for combinatorial barcoding |
KR101850437B1 (en) | 2015-04-14 | 2018-04-20 | Eone Diagnomics Co., Ltd. | Method for predicting transplantation rejection using next generation sequencing |
EP3286326A1 (en) | 2015-04-23 | 2018-02-28 | Cellular Research, Inc. | Methods and compositions for whole transcriptome amplification |
US10844428B2 (en) | 2015-04-28 | 2020-11-24 | Illumina, Inc. | Error suppression in sequenced DNA fragments using redundant reads with unique molecular indices (UMIS) |
WO2016176662A1 (en) | 2015-04-30 | 2016-11-03 | Medical College Of Wisconsin, Inc. | Multiplexed optimized mismatch amplification (moma)-real time pcr for assessing cell-free dna |
US11479812B2 (en) | 2015-05-11 | 2022-10-25 | Natera, Inc. | Methods and compositions for determining ploidy |
JP7007197B2 (en) | 2015-06-05 | 2022-01-24 | Qiagen GmbH | Method for separating DNA by size |
US20160371428A1 (en) | 2015-06-19 | 2016-12-22 | Natera, Inc. | Systems and methods for determining aneuploidy risk using sample fetal fraction |
CN107922959A (en) | 2015-07-02 | 2018-04-17 | Arima Genomics, Inc. | Accurate molecular deconvolution of mixed samples |
GB201516047D0 (en) | 2015-09-10 | 2015-10-28 | Cancer Rec Tech Ltd | Method |
WO2017058784A1 (en) | 2015-09-29 | 2017-04-06 | Ludwig Institute For Cancer Research Ltd | Typing and assembling discontinuous genomic elements |
CN108368542B (en) | 2015-10-19 | 2022-04-08 | Dovetail Genomics, LLC | Methods for genome assembly, haplotype phasing, and target-independent nucleic acid detection |
EP3371309B1 (en) | 2015-11-04 | 2023-07-05 | Atreca, Inc. | Combinatorial sets of nucleic acid barcodes for analysis of nucleic acids associated with single cells |
US20170145475A1 (en) | 2015-11-20 | 2017-05-25 | Streck, Inc. | Single spin process for blood plasma separation and plasma composition including preservative |
CN105986030A (en) | 2016-02-03 | 2016-10-05 | Guangzhou AnchorDx Medical Co., Ltd. | Methylated DNA detection method |
EP4071250A1 (en) | 2016-03-22 | 2022-10-12 | Myriad Women's Health, Inc. | Combinatorial dna screening |
US10781439B2 (en) | 2016-03-30 | 2020-09-22 | Covaris, Inc. | Extraction of cfDNA from biological samples |
WO2017165982A1 (en) | 2016-04-01 | 2017-10-05 | Uti Limited Partnership | Plasma derived cell-free mitochondrial deoxyribonucleic acid |
WO2017181146A1 (en) | 2016-04-14 | 2017-10-19 | Guardant Health, Inc. | Methods for early detection of cancer |
EP3443119B8 (en) | 2016-04-15 | 2022-04-06 | Natera, Inc. | Methods for lung cancer detection |
BR112018072196A2 (en) | 2016-04-29 | 2019-02-12 | Medical College Wisconsin Inc | multiplex optimized mismatch amplification (moma) - target number |
WO2017205540A1 (en) | 2016-05-24 | 2017-11-30 | The Translational Genomics Research Institute | Molecular tagging methods and sequencing libraries |
EP4043581A1 (en) | 2016-05-27 | 2022-08-17 | Sequenom, Inc. | Method for generating a paralog assay system |
JP2019526230A (en) | 2016-07-01 | 2019-09-19 | Natera, Inc. | Compositions and methods for nucleic acid mutation detection |
WO2018009723A1 (en) | 2016-07-06 | 2018-01-11 | Guardant Health, Inc. | Methods for fragmentome profiling of cell-free nucleic acids |
WO2018067517A1 (en) | 2016-10-04 | 2018-04-12 | Natera, Inc. | Methods for characterizing copy number variation using proximity-litigation sequencing |
CA3042696A1 (en) | 2016-11-02 | 2018-05-11 | The Medical College Of Wisconsin, Inc. | Methods for assessing risk using total and specific cell-free dna |
GB201618485D0 (en) | 2016-11-02 | 2016-12-14 | Ucl Business Plc | Method of detecting tumour recurrence |
US20180129994A1 (en) | 2016-11-06 | 2018-05-10 | Microsoft Technology Licensing, Llc | Efficiency enhancements in task management applications |
KR20190077061A (en) | 2016-11-08 | 2019-07-02 | Cellular Research, Inc. | Cell labeling method |
US10011870B2 (en) | 2016-12-07 | 2018-07-03 | Natera, Inc. | Compositions and methods for identifying nucleic acid molecules |
EP4050113A1 (en) | 2017-01-17 | 2022-08-31 | Life Technologies Corporation | Compositions and methods for immune repertoire sequencing |
AU2018225348A1 (en) | 2017-02-21 | 2019-07-18 | Natera, Inc. | Compositions, methods, and kits for isolating nucleic acids |
JP7323462B2 (en) | 2017-06-20 | 2023-08-08 | The Medical College of Wisconsin, Inc. | Monitoring transplant patients with cell-free DNA |
CN111094593A (en) | 2017-06-20 | 2020-05-01 | The Medical College of Wisconsin, Inc. | Use of donor-specific cell-free DNA for assessing conditions in transplant subjects |
JP2020524519A (en) | 2017-06-20 | 2020-08-20 | The Medical College of Wisconsin, Inc. | Assessment of transplant complication risk by total cell-free DNA |
US20200377956A1 (en) | 2017-08-07 | 2020-12-03 | The Johns Hopkins University | Methods and materials for assessing and treating cancer |
EP3676399B1 (en) | 2017-09-01 | 2022-05-11 | Life Technologies Corporation | Compositions and methods for immune repertoire sequencing |
US11091800B2 (en) | 2017-09-20 | 2021-08-17 | University Of Utah Research Foundation | Size-selection of cell-free DNA for increasing family size during next-generation sequencing |
CA3085933A1 (en) | 2017-12-14 | 2019-06-20 | Tai Diagnostics, Inc. | Assessing graft suitability for transplantation |
JP7322063B2 (en) | 2018-01-12 | 2023-08-07 | Natera, Inc. | Novel primers and uses thereof |
WO2019161244A1 (en) | 2018-02-15 | 2019-08-22 | Natera, Inc. | Methods for isolating nucleic acids with size selection |
GB201819134D0 (en) | 2018-11-23 | 2019-01-09 | Cancer Research Tech Ltd | Improvements in variant detection |
US20190316184A1 (en) | 2018-04-14 | 2019-10-17 | Natera, Inc. | Methods for cancer detection and monitoring |
EP3807884A1 (en) | 2018-06-12 | 2021-04-21 | Natera, Inc. | Methods and systems for calling mutations |
CN112752852A (en) | 2018-07-03 | 2021-05-04 | Natera, Inc. | Method for detecting donor-derived cell-free DNA |
US11525159B2 (en) | 2018-07-03 | 2022-12-13 | Natera, Inc. | Methods for detection of donor-derived cell-free DNA |
KR20210059694A (en) | 2018-07-12 | 2021-05-25 | TwinStrand Biosciences, Inc. | Methods and reagents for identifying genome editing, clonal expansion, and related fields |
EP3824470A1 (en) | 2018-07-17 | 2021-05-26 | Natera, Inc. | Methods and systems for calling ploidy states using a neural network |
WO2020041449A1 (en) | 2018-08-21 | 2020-02-27 | Zymo Research Corporation | Methods and compositions for tracking sample quality |
WO2020076957A1 (en) | 2018-10-09 | 2020-04-16 | Tai Diagnostics, Inc. | Cell lysis assay for cell-free dna analysis |
WO2020106897A1 (en) | 2018-11-20 | 2020-05-28 | Alex Brandt | Methods and materials for detecting salmonella in beef |
WO2020106987A1 (en) | 2018-11-21 | 2020-05-28 | Karius, Inc. | Detection and prediction of infectious disease |
JP2022519159A (en) | 2018-12-17 | 2022-03-22 | Natera, Inc. | Methods for analysis of circulating cells |
US11931674B2 (en) | 2019-04-04 | 2024-03-19 | Natera, Inc. | Materials and methods for processing blood samples |
US20220154249A1 (en) | 2019-04-15 | 2022-05-19 | Natera, Inc. | Improved liquid biopsy using size selection |
US20220251654A1 (en) | 2019-06-06 | 2022-08-11 | Natera, Inc. | Methods for detecting immune cell dna and monitoring immune system |
US20220340963A1 (en) | 2019-09-20 | 2022-10-27 | Natera, Inc. | Methods for assessing graft suitability for transplantation |
US20230203573A1 (en) | 2020-05-29 | 2023-06-29 | Natera, Inc. | Methods for detection of donor-derived cell-free dna |
US20230257816A1 (en) | 2020-07-13 | 2023-08-17 | Aoy Tomita Mitchell | Methods for making treatment management decisions in transplant subjects and assessing transplant risks with threshold values |
AU2022237538A1 (en) | 2021-03-18 | 2023-09-21 | Natera, Inc. | Methods for determination of transplant rejection |
-
2015
- 2015-04-21 RU RU2016141308A patent/RU2717641C2/en active
- 2015-04-21 US US14/692,703 patent/US10179937B2/en active Active
- 2015-04-21 EP EP19159999.2A patent/EP3561075A1/en active Pending
- 2015-04-21 CN CN201580033190.XA patent/CN106460070B/en active Active
- 2015-04-21 AU AU2015249846A patent/AU2015249846B2/en active Active
- 2015-04-21 CN CN201910135027.4A patent/CN109971852A/en active Pending
- 2015-04-21 EP EP21193128.2A patent/EP3957749A1/en not_active Withdrawn
- 2015-04-21 JP JP2016563812A patent/JP6659575B2/en active Active
- 2015-04-21 EP EP15718754.3A patent/EP3134541B1/en active Active
- 2015-04-21 CA CA2945962A patent/CA2945962C/en active Active
- 2015-04-21 CN CN202110905394.5A patent/CN113774132A/en active Pending
-
2017
- 2017-06-15 HK HK17105964.2A patent/HK1232260A1/en unknown
-
2018
- 2018-02-15 US US15/898,145 patent/US11319595B2/en active Active
- 2018-06-21 US US16/014,961 patent/US11486008B2/en active Active
-
2019
- 2019-02-28 US US16/288,416 patent/US11408037B2/en active Active
- 2019-02-28 US US16/288,351 patent/US11319596B2/en active Active
- 2019-02-28 US US16/288,462 patent/US11371100B2/en active Active
- 2019-03-20 US US16/359,917 patent/US11414709B2/en active Active
- 2019-04-30 US US16/399,941 patent/US20190256931A1/en not_active Abandoned
-
2020
- 2020-01-27 JP JP2020010542A patent/JP6994058B2/en active Active
- 2020-01-27 JP JP2020010659A patent/JP6994059B2/en active Active
- 2020-01-27 JP JP2020011048A patent/JP2020072739A/en active Pending
- 2020-01-27 JP JP2020010580A patent/JP7030860B2/en active Active
- 2020-01-27 JP JP2020010765A patent/JP2020072738A/en active Pending
- 2020-01-27 JP JP2020010558A patent/JP2020072735A/en active Pending
-
2021
- 2021-07-27 AU AU2021209221A patent/AU2021209221B2/en active Active
- 2021-12-08 JP JP2021199347A patent/JP2022028949A/en active Pending
- 2021-12-08 JP JP2021199357A patent/JP2022028950A/en active Pending
-
2022
- 2022-01-05 US US17/568,854 patent/US20220154290A1/en active Pending
- 2022-02-22 JP JP2022025293A patent/JP7362805B2/en active Active
- 2022-03-11 US US17/692,469 patent/US20220213561A1/en active Pending
- 2022-03-28 AU AU2022202083A patent/AU2022202083B2/en active Active
- 2022-05-06 US US17/738,354 patent/US11530454B2/en active Active
- 2022-10-04 US US17/959,543 patent/US20230042405A1/en active Pending
-
2023
- 2023-02-20 US US18/111,790 patent/US20230242998A1/en active Pending
- 2023-11-06 JP JP2023189080A patent/JP2024001359A/en active Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102985561A (en) * | 2011-04-14 | 2013-03-20 | Verinata Health, Inc. | Normalizing chromosomes for the determination and verification of common and rare chromosomal aneuploidies |
WO2013086464A1 (en) * | 2011-12-07 | 2013-06-13 | The Broad Institute, Inc. | Markers associated with chronic lymphocytic leukemia prognosis and progression |
WO2013130848A1 (en) * | 2012-02-29 | 2013-09-06 | Natera, Inc. | Informatics enhanced analysis of fetal samples subject to maternal contamination |
WO2013159035A2 (en) * | 2012-04-19 | 2013-10-24 | Medical College Of Wisconsin, Inc. | Highly sensitive surveillance using detection of cell free dna |
US20140100121A1 (en) * | 2012-06-21 | 2014-04-10 | The Chinese University Of Hong Kong | Mutational analysis of plasma dna for cancer detection |
WO2014039556A1 (en) * | 2012-09-04 | 2014-03-13 | Guardant Health, Inc. | Systems and methods to detect rare mutations and copy number variation |
Non-Patent Citations (2)
Title |
---|
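JACQUELINE A. SHAW et al.: "Genomic analysis of circulating cell-free DNA infers", Genome Research * |
HU Bin et al.: "Genome-wide detection of loss of heterozygosity and copy number variation in a human large cell lung cancer cell line using single nucleotide polymorphism arrays", Chinese Journal of Lung Cancer * |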
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110379465A (en) * | 2019-07-19 | 2019-10-25 | Yuanma Gene Technology (Beijing) Co., Ltd. | Method for tracing cancer tissue of origin based on targeted RNA sequencing and machine learning |
CN112094911A (en) * | 2020-10-10 | 2020-12-18 | Guangxi Medical University | Medical application of NRK in lung cancer treatment and prognosis diagnosis |
CN112397144A (en) * | 2020-10-29 | 2021-02-23 | Wuxi Zhenhe Biotechnology Co., Ltd. | Method and device for detecting gene mutation and expression level |
CN112397144B (en) * | 2020-10-29 | 2021-06-15 | Wuxi Zhenhe Biotechnology Co., Ltd. | Method and device for detecting gene mutation and expression level |
CN112592976A (en) * | 2020-12-30 | 2021-04-02 | Shenzhen HaploX Biotechnology Co., Ltd. | Method and device for detecting MET gene amplification |
CN114093417A (en) * | 2021-11-23 | 2022-02-25 | Shenzhen Jiyinjia Technology Co., Ltd. | Method and device for identifying chromosome-arm loss of heterozygosity |
CN114093417B (en) * | 2021-11-23 | 2022-10-04 | Shenzhen Jiyinjia Information Technology Co., Ltd. | Method and device for identifying chromosome-arm loss of heterozygosity |
CN113913525A (en) * | 2021-11-24 | 2022-01-11 | Guangzhou Medical University | Application of DNASE1L3 gene as target for detecting and/or preventing liver cancer invasion and metastasis |
CN114807377A (en) * | 2022-06-29 | 2022-07-29 | Nanjing Geneseeq Technology Inc. | Application of bladder cancer prognosis survival time marker, evaluation device and computer readable medium |
CN116004800A (en) * | 2022-12-09 | 2023-04-25 | Institute of Animal Science, Chinese Academy of Agricultural Sciences | Application of CNV marker in early screening of sheep fat tail |
CN116004800B (en) * | 2022-12-09 | 2023-09-05 | Institute of Animal Science, Chinese Academy of Agricultural Sciences | Application of CNV marker in early screening of sheep fat tail |
Also Published As
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7362805B2 (en) | Detection of mutations and ploidy of chromosome segments |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||