DK2511843T3 - Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence - Google Patents

Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence

Info

Publication number
DK2511843T3
Authority
DK
Denmark
Prior art keywords
base
polynucleotide sequence
sequence
mapped
reads
Prior art date
Application number
DK12165247.3T
Other languages
English (en)
Inventor
Paolo Carnevali
Jonathan M Baccash
Igor Nazarenko
Aaron L Halpern
Geoffrey Nilsen
Bruce Martin
Radoje Drmanac
Original Assignee
Complete Genomics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Complete Genomics Inc
Application granted
Publication of DK2511843T3

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/10 Ploidy or copy number detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50 Mutagenesis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/20 Sequence assembly

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Claims (15)

1. A computer-implemented method for calling variations in mapped mated reads obtained from a polynucleotide sequence of a sample compared to a polynucleotide sequence of a reference, the method comprising: receiving the reference polynucleotide sequence and mapped mated reads (200), wherein the mapped mated reads are obtained from the sample polynucleotide sequence and are mapped to locations in the reference polynucleotide sequence; for each of a plurality of positions on the reference polynucleotide sequence: computing one or more reference scores (510), wherein each reference score belongs to a hypothesis that differs from the reference polynucleotide sequence at the position and is computed on the basis of the mapped mated reads using a Bayesian formulation (500) for the hypothesis; identifying local areas (300) corresponding to positions having a reference score above a threshold; using de Bruijn graph based algorithms to determine graphs for identifying local de novo intervals (512), wherein each local de novo interval comprises one or more positions at which the graph diverges from the reference polynucleotide sequence; combining the local areas and the local de novo intervals to form optimization intervals (514); for each optimization interval: generating an initial sequence hypothesis (412) using the reference scores or one or more graphs of the optimization interval; and modifying the initial sequence hypothesis to obtain an optimized sequence hypothesis (414) having an increased probability of being correct based on the mapped mated reads that map to the optimization interval; identifying and calling variations (32) detected in the mapped mated reads relative to the reference polynucleotide sequence using the optimized sequence hypotheses; and outputting a list of the variations, each describing a manner in which the mapped mated reads are observed to differ from the reference polynucleotide sequence at or near a specific location.
2. The method of claim 1, wherein outputting the list of the variations further includes outputting a list of no-called regions for which variations cannot be called due to computational uncertainties.
3. The method of claim 1, wherein the variations include identified sequences of deletions, insertions, mutations, polymorphisms, and duplications or rearrangements of one or more bases.
4. The method of claim 3, further comprising using the optimized sequence hypotheses to assemble the sample polynucleotide sequence from the mapped mated reads, wherein an assembled polynucleotide sequence is based substantially on the reference polynucleotide sequence but includes the identified sequences.
5. The method of claim 1, wherein each of the mapped mated reads comprises reads having variable gaps.
6. The method of claim 1, wherein each of the mapped mated reads comprises reads having no gaps.
7. The method of claim 1, wherein computing reference scores using the Bayesian formulation comprises: for each base position in the reference polynucleotide sequence: generating a set of initial hypotheses for that base position in the reference polynucleotide sequence by modifying a base value at that base position in p alleles by all possible 1-base variations; determining a set of mapped mated reads that are near that base position of the reference polynucleotide sequence; and computing reference scores for that base position by computing, for each of the initial hypotheses in the set of initial hypotheses, a probability ratio Pv/PRef, where Pv is a probability of a 1-base variation hypothesis and PRef is a probability of the base value in the reference polynucleotide sequence, and wherein the set of mapped mated reads near that base position is used in computing the probability ratio at that base position; wherein the sample polynucleotide sequence comprises a genome G, and wherein each of the reference scores comprises a logarithmic likelihood ratio L(G) for each of the hypotheses, where L(G) = log(Pv/PRef).
8. The method of claim 6, wherein the mapped mated reads are generated independently of each other, and probability estimates that take into account all of the mapped mated reads are computed by
[formula not reproduced in the source]
where NG0 represents a number of bases in the reference genome, Ng represents a number of bases in the sample genome, and Nd represents a number of mated reads.
9. The method of claim 7, further comprising representing
[expression not reproduced in the source]
with an approximation for an insertion penalty, such that each extra base in one allele of G causes a decrease in P(G|MtdRds) by a factor exp(-c/nD), where nD represents a number of bases in each of the mapped mated reads, so that extra bases are not added to G unless the extra bases have sufficient support from the mapped mated reads, and where c is the average coverage per allele.
10. The method of claim 1, wherein computing local de novo intervals uses a partial de Bruijn graph to find variations beyond single-base changes, the method further comprising: initializing a partial de Bruijn graph with reference vertices formed from base sequences of the reference polynucleotide sequence; for each of the reference vertices, determining a set of mapped mated reads that map to the reference vertices and that include a base extension extending beyond either end of the reference vertex by any possible 1-base value; for each base extension, computing an extension strength representing an amount of support for extending the reference vertex by each 1-base value, based at least in part on a number of mapped mated reads that have the same extension and on the number of matches and mismatches of those mapped mated reads with the sequence of the vertex being processed; using the base extensions that have a highest extension strength and are incompatible with the reference vertices as branch vertices in the partial de Bruijn graph; computing the extension strength in the direction of the extension for each branch vertex in a depth-first manner in one direction, and creating a new edge and a new branch vertex after each computation from the base extensions having extension strengths above a threshold; if no base extensions in a path have an extension strength above the threshold, returning a failure for the path; and if a new branch vertex is created that equals the base sequence of one of the reference vertices and is consistent with a SNP or short indel, ending the computation and returning the path.
11. The method of claim 1, wherein combining the local areas and the local de novo intervals to form optimization intervals includes: considering as candidates for optimization intervals the local de novo intervals and the reference scores associated with a high probability ratio Pv/PRef that exceeds the threshold, where Pv is a probability of a 1-base variation hypothesis and PRef is a probability of the base value in the reference polynucleotide sequence; and combining the candidate optimization intervals that overlap each other or are less than a threshold base distance apart into the optimization intervals; wherein the sample polynucleotide sequence comprises a genome G, and wherein each of the reference scores comprises a logarithmic likelihood ratio L(G) for each of the hypotheses, where L(G) = log(Pv/PRef).
12. The method of claim 1, wherein modifying the initial sequence hypothesis to obtain an optimized sequence hypothesis comprises: traversing each base position in an initial hypothesis in the optimization interval and iteratively changing the base to each of the possible alternative base values, including inserted and deleted bases, and computing a probability ratio for each change; and applying to the initial hypotheses the changes that maximize the probability ratio.
13. A system comprising: a data repository (14) that stores a reference polynucleotide sequence and mapped mated reads that are obtained from a sample polynucleotide sequence and mapped to locations in the reference polynucleotide sequence; a computer cluster (10) comprising a plurality of computers (12) coupled to the data repository via a network; and a variation caller (18) executing in parallel on the plurality of computers, the variation caller being configured to perform the method of any of claims 1-12.
14. The system of claim 13, wherein the computer cluster is configured such that instances of the variation caller executing on different ones of the plurality of computers operate in parallel on different portions of the reference polynucleotide sequence and the mapped mated reads.
15. An executable software product stored on a computer-readable medium and containing program instructions for performing the method of any of claims 1-12.
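
As an illustration of the reference scores in claim 7, the following minimal sketch computes L(G) = log(Pv/PRef) for every 1-base variation at one position and then lists the positions whose best score exceeds a threshold. It is not the patented implementation: it assumes a single allele, independent reads, and a uniform per-base error rate, and the names `reference_scores`, `positions_above_threshold`, and `error_rate` are hypothetical.

```python
import math

BASES = "ACGT"


def reference_scores(ref_base, observed_bases, error_rate=0.01):
    """Return {alt_base: log(Pv / PRef)} for each 1-base variation at one position.

    Assumptions (not from the patent): one allele, independent reads,
    and the same error rate for every observed base.
    """
    def log_likelihood(hypothesis_base):
        total = 0.0
        for base in observed_bases:
            # A read base agrees with the hypothesis unless a sequencing error occurred.
            p = 1.0 - error_rate if base == hypothesis_base else error_rate / 3.0
            total += math.log(p)
        return total

    log_p_ref = log_likelihood(ref_base)
    return {alt: log_likelihood(alt) - log_p_ref for alt in BASES if alt != ref_base}


def positions_above_threshold(reference, pileups, threshold):
    """Return positions whose best 1-base hypothesis scores above the threshold."""
    hits = []
    for pos, ref_base in enumerate(reference):
        scores = reference_scores(ref_base, pileups[pos])
        if scores and max(scores.values()) > threshold:
            hits.append(pos)
    return hits
```

Here `pileups[pos]` stands for the read bases that map over position `pos`; a positive score means the reads support the variation better than the reference base.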
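
The extension strength in claim 10 can be pictured with a deliberately simplified sketch. Claim 10 bases the strength on the number of reads sharing an extension and on their matches and mismatches with the vertex; the version below counts only reads that contain the vertex sequence exactly, which is an assumption made for brevity. The function name `extension_strengths` is hypothetical.

```python
from collections import Counter


def extension_strengths(vertex, reads):
    """Count, per base, how many times a read extends `vertex` by that base on the right.

    Simplification (not from the patent): only exact occurrences of the
    vertex sequence are counted, and mismatches are ignored.
    """
    counts = Counter()
    for read in reads:
        start = read.find(vertex)
        while start != -1:
            next_index = start + len(vertex)
            if next_index < len(read):
                counts[read[next_index]] += 1
            start = read.find(vertex, start + 1)
    return counts
```

A caller could then keep only the extensions whose count exceeds a chosen threshold when growing branch vertices.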
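
The combining step of claim 11, in which candidate intervals that overlap or lie less than a threshold base distance apart are joined, can be sketched as follows. The half-open `(start, end)` representation and the names `merge_intervals` and `max_gap` are assumptions for illustration.

```python
def merge_intervals(candidates, max_gap):
    """Merge half-open (start, end) intervals that overlap or are fewer than max_gap bases apart."""
    merged = []
    for start, end in sorted(candidates):
        if merged and start - merged[-1][1] < max_gap:
            # Close enough to the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

Sorting first means each candidate only needs to be compared with the most recently merged interval.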
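
The optimization in claim 12 amounts to a greedy search over single-base changes. The sketch below tries every substitution, insertion, and deletion, applies the change with the best score, and repeats until nothing improves. The `score` callable stands in for the probability ratio of the claim; the `toy_score` used here (number of reads found verbatim in the hypothesis) and all function names are assumptions, not the patent's scoring.

```python
BASES = "ACGT"


def single_edits(seq):
    """Yield every sequence one substitution, deletion, or insertion away from seq."""
    for i in range(len(seq)):
        for base in BASES:
            if base != seq[i]:
                yield seq[:i] + base + seq[i + 1:]  # substitution
        yield seq[:i] + seq[i + 1:]                 # deletion
    for i in range(len(seq) + 1):
        for base in BASES:
            yield seq[:i] + base + seq[i:]          # insertion


def optimize_hypothesis(initial, score, max_rounds=100):
    """Greedily apply the single edit that most improves score until none helps."""
    current, current_score = initial, score(initial)
    for _ in range(max_rounds):
        best, best_score = current, current_score
        for candidate in single_edits(current):
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
        if best == current:
            break
        current, current_score = best, best_score
    return current


def toy_score(reads):
    """Hypothetical stand-in for the probability ratio: reads contained verbatim."""
    return lambda seq: sum(1 for read in reads if read in seq)
```

For example, `optimize_hypothesis("ACCTAC", toy_score(["ACGT", "CGTA", "GTAC"]))` changes the third base so that all three reads are explained.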
DK12165247.3T 2009-04-29 2010-04-28 Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence DK2511843T3 (da)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17396709P 2009-04-29 2009-04-29
EP10770290.4A EP2430441B1 (en) 2009-04-29 2010-04-28 Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence

Publications (1)

Publication Number Publication Date
DK2511843T3 (da) 2017-03-27

Family

ID=43032762

Family Applications (1)

Application Number Title Priority Date Filing Date
DK12165247.3T DK2511843T3 (da) 2009-04-29 2010-04-28 Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence

Country Status (5)

Country Link
US (1) US20110004413A1 (da)
EP (2) EP2430441B1 (da)
CN (1) CN102460155B (da)
DK (1) DK2511843T3 (da)
WO (1) WO2010127045A2 (da)

Families Citing this family (98)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105349647B (zh) 2007-10-30 2020-08-28 完整基因有限公司 Methods for high-throughput sequencing of nucleic acids
GB2467691A (en) * 2008-09-05 2010-08-11 Aueon Inc Methods for stratifying and annotating cancer drug treatment options
WO2010091021A2 (en) 2009-02-03 2010-08-12 Complete Genomics, Inc. Oligomer sequences mapping
WO2010091023A2 (en) 2009-02-03 2010-08-12 Complete Genomics, Inc. Indexing a reference sequence for oligomer sequence mapping
WO2010091024A1 (en) 2009-02-03 2010-08-12 Complete Genomics, Inc. Oligomer sequences mapping
AU2010242073C1 (en) 2009-04-30 2015-12-24 Good Start Genetics, Inc. Methods and compositions for evaluating genetic markers
WO2012040387A1 (en) 2010-09-24 2012-03-29 The Board Of Trustees Of The Leland Stanford Junior University Direct capture, amplification and sequencing of target dna using immobilized primers
US8725422B2 (en) 2010-10-13 2014-05-13 Complete Genomics, Inc. Methods for estimating genome-wide copy number variations
US9163281B2 (en) 2010-12-23 2015-10-20 Good Start Genetics, Inc. Methods for maintaining the integrity and identification of a nucleic acid template in a multiplex sequencing reaction
WO2013040583A2 (en) * 2011-09-16 2013-03-21 Complete Genomics, Inc Determining variants in a genome of a heterogeneous sample
CA2852665A1 (en) 2011-10-17 2013-04-25 Good Start Genetics, Inc. Analysis methods
US10837879B2 (en) 2011-11-02 2020-11-17 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays
US8209130B1 (en) 2012-04-04 2012-06-26 Good Start Genetics, Inc. Sequence assembly
US8812422B2 (en) 2012-04-09 2014-08-19 Good Start Genetics, Inc. Variant database
US10227635B2 (en) 2012-04-16 2019-03-12 Molecular Loop Biosolutions, Llc Capture reactions
US9600625B2 (en) 2012-04-23 2017-03-21 Bina Technologies, Inc. Systems and methods for processing nucleic acid sequence data
CN104871164B (zh) 2012-10-24 2019-02-05 南托米克斯有限责任公司 Genome browser system for processing and presenting nucleotide variations in genomic sequence data
US10691775B2 (en) 2013-01-17 2020-06-23 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
US9679104B2 (en) 2013-01-17 2017-06-13 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
GB2523495A (en) * 2013-01-17 2015-08-26 Edico Genome Corp Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
WO2014113204A1 (en) 2013-01-17 2014-07-24 Personalis, Inc. Methods and systems for genetic analysis
US10068054B2 (en) 2013-01-17 2018-09-04 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
US10847251B2 (en) 2013-01-17 2020-11-24 Illumina, Inc. Genomic infrastructure for on-site or cloud-based DNA and RNA processing and analysis
US9792405B2 (en) 2013-01-17 2017-10-17 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
EP2971159B1 (en) 2013-03-14 2019-05-08 Molecular Loop Biosolutions, LLC Methods for analyzing nucleic acids
US9328382B2 (en) 2013-03-15 2016-05-03 Complete Genomics, Inc. Multiple tagging of individual long DNA fragments
WO2014186604A1 (en) * 2013-05-15 2014-11-20 Edico Genome Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
WO2014197377A2 (en) 2013-06-03 2014-12-11 Good Start Genetics, Inc. Methods and systems for storing sequence read data
US20150073724A1 (en) 2013-07-29 2015-03-12 Agilent Technologies, Inc Method for finding variants from targeted sequencing panels
US9898575B2 (en) 2013-08-21 2018-02-20 Seven Bridges Genomics Inc. Methods and systems for aligning sequences
US9116866B2 (en) 2013-08-21 2015-08-25 Seven Bridges Genomics Inc. Methods and systems for detecting sequence variants
US10726942B2 (en) 2013-08-23 2020-07-28 Complete Genomics, Inc. Long fragment de novo assembly using short reads
EP3965111A1 (en) 2013-08-30 2022-03-09 Personalis, Inc. Methods and systems for genomic analysis
CN105793859B (zh) * 2013-09-30 2020-02-28 七桥基因公司 System for detecting sequence variants
GB2535066A (en) 2013-10-03 2016-08-10 Personalis Inc Methods for analyzing genotypes
US11041203B2 (en) 2013-10-18 2021-06-22 Molecular Loop Biosolutions, Inc. Methods for assessing a genomic region of a subject
EP3058093B1 (en) 2013-10-18 2019-07-17 Seven Bridges Genomics Inc. Methods and systems for identifying disease-induced mutations
US10851414B2 (en) 2013-10-18 2020-12-01 Good Start Genetics, Inc. Methods for determining carrier status
WO2015058120A1 (en) 2013-10-18 2015-04-23 Seven Bridges Genomics Inc. Methods and systems for aligning sequences in the presence of repeating elements
SG11201602903XA (en) 2013-10-18 2016-05-30 Seven Bridges Genomics Inc Methods and systems for genotyping genetic samples
US10832797B2 (en) 2013-10-18 2020-11-10 Seven Bridges Genomics Inc. Method and system for quantifying sequence alignment
US9092402B2 (en) 2013-10-21 2015-07-28 Seven Bridges Genomics Inc. Systems and methods for using paired-end data in directed acyclic structure
WO2015062184A1 (en) * 2013-11-01 2015-05-07 Accurascience, Llc Method and apparatus for calling single-nucleotide variations and other variations
US9817944B2 (en) 2014-02-11 2017-11-14 Seven Bridges Genomics Inc. Systems and methods for analyzing sequence data
US9697327B2 (en) 2014-02-24 2017-07-04 Edico Genome Corporation Dynamic genome reference generation for improved NGS accuracy and reproducibility
US11053548B2 (en) 2014-05-12 2021-07-06 Good Start Genetics, Inc. Methods for detecting aneuploidy
WO2016040446A1 (en) 2014-09-10 2016-03-17 Good Start Genetics, Inc. Methods for selectively suppressing non-target sequences
US10429399B2 (en) 2014-09-24 2019-10-01 Good Start Genetics, Inc. Process control for increased robustness of genetic assays
CA2963868A1 (en) * 2014-10-10 2016-04-14 Invitae Corporation Methods, systems and processes of de novo assembly of sequencing reads
CN107076729A (zh) * 2014-10-16 2017-08-18 康希尔公司 Variant caller
JP2017530720A (ja) 2014-10-17 2017-10-19 グッド スタート ジェネティクス, インコーポレイテッド Preimplantation genetic screening and aneuploidy detection
US10125399B2 (en) 2014-10-30 2018-11-13 Personalis, Inc. Methods for using mosaicism in nucleic acids sampled distal to their origin
JP6788587B2 (ja) * 2014-11-25 2020-11-25 コーニンクレッカ フィリップス エヌ ヴェ (Koninklijke Philips N.V.) Secure transfer of genomic data
US10429342B2 (en) 2014-12-18 2019-10-01 Edico Genome Corporation Chemically-sensitive field effect transistor
US9859394B2 (en) 2014-12-18 2018-01-02 Agilome, Inc. Graphene FET devices, systems, and methods of using the same for sequencing nucleic acids
US9618474B2 (en) 2014-12-18 2017-04-11 Edico Genome, Inc. Graphene FET devices, systems, and methods of using the same for sequencing nucleic acids
US9857328B2 (en) 2014-12-18 2018-01-02 Agilome, Inc. Chemically-sensitive field effect transistors, systems and methods for manufacturing and using the same
US10020300B2 (en) 2014-12-18 2018-07-10 Agilome, Inc. Graphene FET devices, systems, and methods of using the same for sequencing nucleic acids
US10006910B2 (en) 2014-12-18 2018-06-26 Agilome, Inc. Chemically-sensitive field effect transistors, systems, and methods for manufacturing and using the same
EP4095261A1 (en) 2015-01-06 2022-11-30 Molecular Loop Biosciences, Inc. Screening for structural variants
US10192026B2 (en) 2015-03-05 2019-01-29 Seven Bridges Genomics Inc. Systems and methods for genomic pattern analysis
US20160273049A1 (en) 2015-03-16 2016-09-22 Personal Genome Diagnostics, Inc. Systems and methods for analyzing nucleic acid
EP3329491A2 (en) 2015-03-23 2018-06-06 Edico Genome Corporation Method and system for genomic visualization
CN106021998A (zh) * 2015-03-27 2016-10-12 知源生信公司(美国硅谷) Single-pass multi-variant calling computational pipeline
US10275567B2 (en) 2015-05-22 2019-04-30 Seven Bridges Genomics Inc. Systems and methods for haplotyping
US10793895B2 (en) 2015-08-24 2020-10-06 Seven Bridges Genomics Inc. Systems and methods for epigenetic analysis
BR112018003631A2 (pt) * 2015-08-25 2018-09-25 Nantomics Llc Systems and methods for high-accuracy variant calling
US10724110B2 (en) 2015-09-01 2020-07-28 Seven Bridges Genomics Inc. Systems and methods for analyzing viral nucleic acids
US10584380B2 (en) 2015-09-01 2020-03-10 Seven Bridges Genomics Inc. Systems and methods for mitochondrial analysis
GB2543068A (en) * 2015-10-06 2017-04-12 Fonleap Ltd System for generating genomics data, and device, method and software product for use therein
US11347704B2 (en) 2015-10-16 2022-05-31 Seven Bridges Genomics Inc. Biological graph or sequence serialization
CN105483244B (zh) * 2015-12-28 2019-10-22 武汉菲沙基因信息有限公司 Variation detection method and detection system based on ultra-long genomes
US20170199960A1 (en) 2016-01-07 2017-07-13 Seven Bridges Genomics Inc. Systems and methods for adaptive local alignment for graph genomes
US20170270245A1 (en) 2016-01-11 2017-09-21 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods for performing secondary and/or tertiary processing
US10068183B1 (en) 2017-02-23 2018-09-04 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on a quantum processing platform
US10364468B2 (en) 2016-01-13 2019-07-30 Seven Bridges Genomics Inc. Systems and methods for analyzing circulating tumor DNA
US10460829B2 (en) 2016-01-26 2019-10-29 Seven Bridges Genomics Inc. Systems and methods for encoding genetic variation for a population
US10262102B2 (en) 2016-02-24 2019-04-16 Seven Bridges Genomics Inc. Systems and methods for genotyping with graph reference
WO2017201081A1 (en) 2016-05-16 2017-11-23 Agilome, Inc. Graphene fet devices, systems, and methods of using the same for sequencing nucleic acids
US10790044B2 (en) 2016-05-19 2020-09-29 Seven Bridges Genomics Inc. Systems and methods for sequence encoding, storage, and compression
US11299783B2 (en) 2016-05-27 2022-04-12 Personalis, Inc. Methods and systems for genetic analysis
US10600499B2 (en) 2016-07-13 2020-03-24 Seven Bridges Genomics Inc. Systems and methods for reconciling variants in sequence data relative to reference sequence data
US11289177B2 (en) 2016-08-08 2022-03-29 Seven Bridges Genomics, Inc. Computer method and system of identifying genomic mutations using graph-based local assembly
US11250931B2 (en) 2016-09-01 2022-02-15 Seven Bridges Genomics Inc. Systems and methods for detecting recombination
KR102217487B1 (ko) * 2016-09-21 2021-02-23 트위스트 바이오사이언스 코포레이션 Nucleic acid based data storage
US10319465B2 (en) 2016-11-16 2019-06-11 Seven Bridges Genomics Inc. Systems and methods for aligning sequences to graph references
US11347844B2 (en) 2017-03-01 2022-05-31 Seven Bridges Genomics, Inc. Data security in bioinformatic sequence analysis
US10726110B2 (en) 2017-03-01 2020-07-28 Seven Bridges Genomics, Inc. Watermarking for data security in bioinformatic sequence analysis
WO2019017806A1 (en) * 2017-07-20 2019-01-24 Huawei Technologies Co., Ltd APPARATUS AND METHOD FOR IDENTIFYING HAPLOTYPES
US11728007B2 (en) * 2017-11-30 2023-08-15 Grail, Llc Methods and systems for analyzing nucleic acid sequences using mappability analysis and de novo sequence assembly
CN108763872B (zh) * 2018-04-25 2019-12-06 华中科技大学 Method for analyzing and predicting the effect of cancer mutations on LIR motif function
WO2019222120A1 (en) * 2018-05-14 2019-11-21 Quantum-Si Incorporated Machine learning enabled biological polymer assembly
US11814750B2 (en) 2018-05-31 2023-11-14 Personalis, Inc. Compositions, methods and systems for processing or analyzing multi-species nucleic acid samples
US10801064B2 (en) 2018-05-31 2020-10-13 Personalis, Inc. Compositions, methods and systems for processing or analyzing multi-species nucleic acid samples
CN109741788A (zh) * 2018-12-24 2019-05-10 广州合众生物科技有限公司 SNP site analysis method and system
EP3918088B1 (en) 2019-01-29 2024-03-13 MGI Tech Co., Ltd. High coverage stlfr
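CN110299185B (zh) * 2019-05-08 2023-07-04 西安电子科技大学 Method and system for detecting *** variations based on next-generation sequencing data
CN113005188A (zh) * 2020-12-29 2021-06-22 阅尔基因技术(苏州)有限公司 Method for evaluating base damage, mismatches and variations in sample DNA using first-generation sequencing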

Family Cites Families (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5547839A (en) 1989-06-07 1996-08-20 Affymax Technologies N.V. Sequencing of surface immobilized polymers utilizing microflourescence detection
DE69333422T2 (de) * 1992-07-31 2004-12-16 International Business Machines Corp. Finding strings in a database of strings
US6401267B1 (en) 1993-09-27 2002-06-11 Radoje Drmanac Methods and compositions for efficient nucleic acid sequencing
US6335160B1 (en) * 1995-02-17 2002-01-01 Maxygen, Inc. Methods and compositions for polypeptide engineering
US5795782A (en) 1995-03-17 1998-08-18 President & Fellows Of Harvard College Characterization of individual polymer molecules based on monomer-interface interactions
US5750341A (en) 1995-04-17 1998-05-12 Lynx Therapeutics, Inc. DNA sequencing by parallel oligonucleotide extensions
GB9620209D0 (en) 1996-09-27 1996-11-13 Cemu Bioteknik Ab Method of sequencing DNA
US6309824B1 (en) 1997-01-16 2001-10-30 Hyseq, Inc. Methods for analyzing a target nucleic acid using immobilized heterogeneous mixtures of oligonucleotide probes
US6830748B1 (en) * 1997-09-26 2004-12-14 Medimmune Vaccines, Inc. Recombinant RSV virus expression systems and vaccines
US6055526A (en) * 1998-04-02 2000-04-25 Sun Microsystems, Inc. Data indexing technique
US7071324B2 (en) * 1998-10-13 2006-07-04 Brown University Research Foundation Systems and methods for sequencing by hybridization
US6403312B1 (en) * 1998-10-16 2002-06-11 Xencor Protein design automatic for protein libraries
ATE440148T1 (de) 1999-01-06 2009-09-15 Callida Genomics Inc Enhanced sequencing by hybridization using pools of probes
WO2000042559A1 (en) * 1999-01-18 2000-07-20 Maxygen, Inc. Methods of populating data structures for use in evolutionary simulations
US7024312B1 (en) * 1999-01-19 2006-04-04 Maxygen, Inc. Methods for making character strings, polynucleotides and polypeptides having desired characteristics
DE60044223D1 (de) * 1999-01-19 2010-06-02 Maxygen Inc Oligonucleotide-mediated nucleic acid recombination
GB9901475D0 (en) 1999-01-22 1999-03-17 Pyrosequencing Ab A method of DNA sequencing
ATE296310T1 (de) * 1999-03-08 2005-06-15 Metrigen Inc Synthesis method for the economical construction of long DNA sequences and compositions therefor
US6401043B1 (en) * 1999-04-26 2002-06-04 Variagenics, Inc. Variance scanning method for identifying gene sequence variances
US7258838B2 (en) 1999-06-22 2007-08-21 President And Fellows Of Harvard College Solid state molecular probe device
EP1192453B1 (en) 1999-06-22 2012-02-15 President and Fellows of Harvard College Molecular and atomic scale evaluation of biopolymers
US6818395B1 (en) 1999-06-28 2004-11-16 California Institute Of Technology Methods and apparatus for analyzing polynucleotide sequences
EP1218543A2 (en) 1999-09-29 2002-07-03 Solexa Ltd. Polynucleotide sequencing
US7430477B2 (en) * 1999-10-12 2008-09-30 Maxygen, Inc. Methods of populating data structures for use in evolutionary simulations
US6775622B1 (en) * 2000-01-31 2004-08-10 Zymogenetics, Inc. Method and system for detecting near identities in large DNA databases
JP2002071687A (ja) * 2000-08-31 2002-03-12 Canon Inc Method for screening mutant genes
JP2005537030A (ja) * 2002-05-09 2005-12-08 ユー.エス. ジェノミクス, インコーポレイテッド Methods for analyzing nucleic acids
US20040018525A1 (en) * 2002-05-21 2004-01-29 Bayer Aktiengesellschaft Methods and compositions for the prediction, diagnosis, prognosis, prevention and treatment of malignant neoplasma
CN1774511B (zh) * 2002-11-27 2013-08-21 斯昆诺有限公司 Fragmentation-based methods and systems for sequence variation detection and discovery
WO2004113505A2 (en) * 2003-06-19 2004-12-29 Board Of Regents Of University Of Nebraska System and method for sequence distance measure for phylogenetic tree construction
WO2005024562A2 (en) * 2003-08-11 2005-03-17 Eloret Corporation System and method for pattern recognition in sequential data
US20050149272A1 (en) * 2003-09-10 2005-07-07 Itshack Pe' Er Method for sequencing polynucleotides
US7238485B2 (en) 2004-03-23 2007-07-03 President And Fellows Of Harvard College Methods and apparatus for characterizing polynucleotides
JP4533015B2 (ja) 2004-06-15 2010-08-25 キヤノン株式会社 Compound and organic electroluminescent device using the same
JP2008506165A (ja) * 2004-06-18 2008-02-28 リール・トゥー・リミテッド Method and system for cataloging and searching data sets
WO2006073504A2 (en) 2004-08-04 2006-07-13 President And Fellows Of Harvard College Wobble sequencing
CN102925549A (zh) 2004-08-13 2013-02-13 哈佛学院院长等 Ultra-high-throughput opti-nanopore DNA readout platform
WO2006031745A2 (en) * 2004-09-10 2006-03-23 Sequenom, Inc. Methods for long-range sequence analysis of nucleic acids
US20070122817A1 (en) * 2005-02-28 2007-05-31 George Church Methods for assembly of high fidelity synthetic polynucleotides
US20060286566A1 (en) * 2005-02-03 2006-12-21 Helicos Biosciences Corporation Detecting apparent mutations in nucleic acid sequences
US20090264299A1 (en) * 2006-02-24 2009-10-22 Complete Genomics, Inc. High throughput genome sequencing on DNA arrays
EP3257949A1 (en) 2005-06-15 2017-12-20 Complete Genomics Inc. Nucleic acid analysis by random mixtures of non-overlapping fragments
US20060287833A1 (en) * 2005-06-17 2006-12-21 Zohar Yakhini Method and system for sequencing nucleic acid molecules using sequencing by hybridization and comparison with decoration patterns
WO2007120208A2 (en) 2005-11-14 2007-10-25 President And Fellows Of Harvard College Nanogrid rolling circle dna sequencing
CN1851704A (zh) * 2006-05-17 2006-10-25 杨仑 Method for searching, annotating and data-mining patented genes or gene patents
WO2008070375A2 (en) 2006-11-09 2008-06-12 Complete Genomics, Inc. Selection of dna adaptor orientation
WO2009052214A2 (en) 2007-10-15 2009-04-23 Complete Genomics, Inc. Sequence analysis using decorated nucleic acids
US20100049445A1 (en) * 2008-06-20 2010-02-25 Eureka Genomics Corporation Method and apparatus for sequencing data samples
US8504374B2 (en) * 2009-02-02 2013-08-06 Jerry Lee Potter Method for recognizing and interpreting patterns in noisy data sequences

Also Published As

Publication number Publication date
WO2010127045A2 (en) 2010-11-04
WO2010127045A3 (en) 2011-01-13
EP2430441A4 (en) 2014-02-19
CN102460155B (zh) 2015-03-25
EP2511843A2 (en) 2012-10-17
EP2430441B1 (en) 2018-06-13
EP2511843B1 (en) 2016-12-21
CN102460155A (zh) 2012-05-16
EP2511843A3 (en) 2014-02-19
US20110004413A1 (en) 2011-01-06
EP2430441A2 (en) 2012-03-21
