US20160040234A1 - Methods of sequencing the immune repertoire - Google Patents

Methods of sequencing the immune repertoire

Info

Publication number
US20160040234A1
Authority
US
United States
Prior art keywords
chain
sequencing
repertoire
sequence
cells
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/776,141
Inventor
Edward A. Hutchins
Hei-Mun Christina Fan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lineage Biosciences Inc USA
Original Assignee
Lineage Biosciences Inc USA
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=50686173&utm_source=***_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=US20160040234(A1) ("Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.)
Application filed by Lineage Biosciences Inc USA
Priority to US14/776,141
Assigned to IMMUMETRIX, INC.: assignment of assignors interest (see document for details). Assignors: FAN, HEI-MUN CHRISTINA; HUTCHINS, EDWARD A.
Publication of US20160040234A1
Assigned to LINEAGE BIOSCIENCES, INC.: assignment of assignors interest (see document for details). Assignors: IMMUMETRIX, INC.
Assigned to BANK OF MONTREAL: security interest (see document for details). Assignors: LINEAGE BIOSCIENCES, INC.
Legal status: Abandoned

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869Methods for sequencing
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J19/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J19/0046Sequential or parallel reactions, e.g. for the synthesis of polypeptides or polynucleotides; Apparatus and devices for combinatorial chemistry or for making molecular arrays
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/10Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1096Processes for the isolation, preparation or purification of DNA or RNA cDNA Synthesis; Subtracted cDNA library construction, e.g. RT, RT-PCR
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869Methods for sequencing
    • C12Q1/6874Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6881Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00583Features relative to the processes being carried out
    • B01J2219/00596Solid-phase processes
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B01PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01JCHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/00718Type of compounds synthesised
    • B01J2219/0072Organic compounds
    • B01J2219/00722Nucleotides
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2521/00Reaction characterised by the enzymatic activity
    • C12Q2521/10Nucleotidyl transfering
    • C12Q2521/101DNA polymerase
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2521/00Reaction characterised by the enzymatic activity
    • C12Q2521/10Nucleotidyl transfering
    • C12Q2521/107RNA dependent DNA polymerase,(i.e. reverse transcriptase)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2525/00Reactions involving modified oligonucleotides, nucleic acids, or nucleotides
    • C12Q2525/10Modifications characterised by
    • C12Q2525/155Modifications characterised by incorporating/generating a new priming site
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2525/00Reactions involving modified oligonucleotides, nucleic acids, or nucleotides
    • C12Q2525/10Modifications characterised by
    • C12Q2525/173Modifications characterised by incorporating a polynucleotide run, e.g. polyAs, polyTs
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2525/00Reactions involving modified oligonucleotides, nucleic acids, or nucleotides
    • C12Q2525/10Modifications characterised by
    • C12Q2525/179Modifications characterised by incorporating arbitrary or random nucleotide sequences
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2563/00Nucleic acid detection characterized by the use of physical, structural and functional properties
    • C12Q2563/179Nucleic acid detection characterized by the use of physical, structural and functional properties the label being a nucleic acid
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00Oligonucleotides characterized by their use
    • C12Q2600/16Primer sets for multiplex assays

Definitions

  • the present invention relates to the field of quantitative nucleic acid analysis. More specifically, the present invention provides methods of determining the immune repertoire using high throughput sequencing.
  • a feature of the adaptive immune response is the ability to generate a wide diversity of binding molecules, e.g. T-cell antigen receptors and antibodies.
  • Upon exposure to antigen there can be a positive selection process, where cells expressing immunological receptors having desired binding properties are expanded, and may undergo further sequence modification, for example somatic hypermutation, and additional recombination.
  • the repertoire of binding specificities in an individual sample can provide a history of past antigenic exposures, as well as being informative of inherent repertoire capabilities and limitations.
  • Adaptive immunological receptors of interest include immunoglobulins, or antibodies. This repertoire is highly plastic and can be directed to create antibodies with broad chemical diversity and high selectivity. There is also a good understanding of the potential diversity available and the mechanistic aspects of how this diversity is generated.
  • Antibodies are composed of two types of chains (heavy and light), each containing a highly diversified antigen-binding domain (variable). The V, D, and J gene segments of the antibody heavy-chain variable genes go through a series of recombination events to generate a new heavy-chain gene. Antibodies are formed by a mixture of recombination among gene segments, sequence diversification at the junctions of these segments, and point mutations throughout the gene. The mechanisms are reviewed, for example, in Maizels (2005) Annu. Rev. Genet. 39:23-46; Jones and Gellert (2004) Immunol. Rev. 200:233-248; Winter and Gearhart (1998) Immunol. Rev. 162:89-96.
  • TCR: T-cell antigen receptor
  • V: N-terminal variable
  • CDRs: hypervariable or complementarity determining regions
  • HV4: hypervariability
  • V and J for the alpha or gamma chain corresponds to the CDR3 region that is important for antigen-MHC recognition. It is the unique combination of the segments at this region, along with palindromic and random N- and P-nucleotide additions, which accounts for the TCR binding repertoire.
  • While reference is made to binding specificities, and indeed a good deal of serological analysis is based on the physical interactions between antigen and receptor, the underlying cause of the diversity lies in the genetic sequences expressed by lymphocytes, which sequences reflect the myriad processes of recombination, mutation and selection that have acted on the cell.
  • the present invention provides methods for monitoring the immune repertoire. More specifically, the invention provides a method of determining the immune repertoire in a subject by isolating a plurality of RNA from a biological sample comprising a plurality of cell types obtained from the subject; producing immunoglobulin chain or TCR chain cDNAs from the RNA; adding a homopolymeric tail, a random molecular tag and a universal sequence to the 3′ end of the cDNAs; amplifying the cDNAs using one or more immunoglobulin chain or TCR chain specific primers and a universal sequence specific primer comprising a flanking sequence specific to the sequencing platform to produce a plurality of molecular tagged immunoglobulin chain or TCR chain nucleic acids; and sequencing the amplified immunoglobulin chain or TCR chain nucleic acids to produce a plurality of sequencing reads, thereby determining the immune repertoire of the subject.
  • the method further includes a data analysis step such as grouping sequence reads with the same molecular tag and clustering sequences within the same group.
  • a consensus sequence for each cluster is determined to produce a collection of consensus sequences.
  • the collection of consensus sequences is used to determine the diversity of the immune repertoire.
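The data analysis steps above (grouping reads by molecular tag, clustering within each group, and taking a consensus per cluster) can be sketched as follows. This is a minimal illustration, not the method as implemented by the applicants: the greedy clustering rule, the mismatch threshold, the majority-vote consensus, and all function names are assumptions, and reads are assumed to be pre-trimmed to equal length.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def group_by_tag(reads):
    """Group (tag, sequence) pairs by molecular tag, keeping first-seen order."""
    groups = defaultdict(list)
    for tag, seq in reads:
        groups[tag].append(seq)
    return groups


def cluster(seqs, max_mismatches=1):
    """Greedy clustering (an assumed rule): a read joins the first cluster
    whose seed read is the same length and within max_mismatches of it."""
    clusters = []
    for seq in seqs:
        for members in clusters:
            seed = members[0]
            if len(seed) == len(seq) and hamming(seed, seq) <= max_mismatches:
                members.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


def consensus(seqs):
    """Per-position majority vote over equal-length sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def consensus_collection(reads, max_mismatches=1):
    """Return (tag, consensus sequence, supporting read count) per cluster."""
    out = []
    for tag, seqs in group_by_tag(reads).items():
        for members in cluster(seqs, max_mismatches):
            out.append((tag, consensus(members), len(members)))
    return out


# Toy input: 9-mer tags paired with short reads; one read carries an error.
reads = [
    ("AAACCCGGG", "ACGTACGT"),
    ("AAACCCGGG", "ACGTACGT"),
    ("AAACCCGGG", "ACGTACGA"),
    ("TTTGGGCCC", "TTGTACGT"),
    ("TTTGGGCCC", "TTGTACGT"),
]
collection = consensus_collection(reads)
```

In this toy input the single mismatched read is absorbed into its tag group, so the collection holds one consensus sequence per tagged molecule.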
  • the molecular tag is an oligomer.
  • the oligomer is at least a 9mer.
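For a sense of scale, a tag of n random nucleotides can take 4^n distinct values. The sketch below computes that count and, assuming tags are drawn uniformly and independently (an idealization not stated in the source), the chance that a given molecule shares its tag with at least one other molecule in a pool. The function names are illustrative.

```python
def tag_space(length, alphabet_size=4):
    """Number of distinct tags of the given length over A, C, G, T."""
    return alphabet_size ** length


def collision_probability(molecules, length):
    """Probability that one given molecule shares its tag with at least one
    of the other molecules, assuming uniform, independent tag assignment."""
    if molecules < 1:
        raise ValueError("molecules must be at least 1")
    n_tags = tag_space(length)
    return 1.0 - (1.0 - 1.0 / n_tags) ** (molecules - 1)


nine_mer_tags = tag_space(9)
```

A 9mer gives 262,144 possible tags; longer tags reduce the chance that two different molecules receive the same tag.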
  • the biological sample is blood or a fraction thereof.
  • the blood is peripheral whole blood.
  • the blood fraction comprises peripheral blood mononuclear cells.
  • the blood sample is sorted based upon extracellular or intracellular markers.
  • the immunoglobulin chain is the heavy chain or the light chain.
  • the immunoglobulin heavy chain contains the immunoglobulin VDJ and constant regions.
  • the TCR chain is the alpha chain, the beta chain, the gamma chain or the delta chain.
  • the TCR chain contains the VJ and constant regions.
  • FIG. 1 is a schematic illustrating the method of the invention.
  • FIG. 2 is an illustration highlighting the differences of the method of the invention and multiplex sequencing of the immune repertoire.
  • FIG. 3 illustrates the elimination of capture and PCR bias by the method of the invention.
  • FIG. 4 is an illustration detailing the sequencing of the immune repertoire according to the method of the invention.
  • FIG. 5 is an illustration detailing the sequencing of the immune repertoire according to the method of the invention.
  • FIG. 6 is an illustration detailing the molecular tagging step of the sequencing of the immune repertoire according to the method of the invention.
  • FIG. 7 is a schematic illustrating the data analysis steps of the method of the invention.
  • FIG. 8 is a graph showing the isotype distribution obtained using the multiplex PCR vs. the 5′RACE method of the invention from the same sample.
  • the multiplex PCR skews the repertoire towards a more naïve compartment.
  • FIG. 9 is a plot showing the top 100 lineages within each sample and the corresponding representation in the other sample. Ribbons connect the same lineage. The width of each segment represents the relative abundance of the lineage in the sample. A few large lineages seen with the method of the invention are absent in the same sample prepared by multiplex PCR.
  • FIG. 10 is an illustration showing that multiplex PCR does not prime lineages with mutations, with the result that highly mutated lineages are absent from the multiplex PCR data set.
  • the sequence identifiers of each of the sequences in FIG. 10 are SEQ ID NOs: 3-24 from top to bottom.
  • the present invention provides an improved method of sequencing the immune repertoire.
  • Previous methods for determining the immune repertoire such as those described in WO 2011/140433 and WO 2012/083069 are based upon multiplex PCR.
  • Multiplex PCR has a number of limitations that make it particularly unsuited for accurately determining the immune repertoire (see FIG. 2). These limitations include capture bias and amplification bias owing to PCR.
  • Multiplex PCR techniques for sequencing the immune repertoire use primers designed to prime all framework regions of known V gene segments. When a mutation arises at the priming site, capture bias occurs and the gene carrying the mutation is under-amplified. PCR bias results from unequal amplification of the genes due to either the relative amount of each primer or PCR replicates of the same sequence. Thus PCR bias can cause apparent clonality or a lack of diversity.
  • the observed repertoire is not a faithful, or linear representation of the actual underlying repertoire.
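As a rough illustration of why unequal amplification makes the observed repertoire a non-linear representation of the underlying one, the sketch below models exponential PCR with a per-template efficiency. The function names, the efficiency values, and the cycle count are assumptions chosen for illustration; none of them come from the source.

```python
def amplified_copies(start_copies, efficiency, cycles):
    """Copies after `cycles` rounds when each round multiplies by (1 + efficiency)."""
    return start_copies * (1.0 + efficiency) ** cycles


def observed_fractions(templates, cycles):
    """templates maps a name to (start_copies, efficiency).
    Returns each template's fraction of the amplified pool."""
    copies = {
        name: amplified_copies(start, eff, cycles)
        for name, (start, eff) in templates.items()
    }
    total = sum(copies.values())
    return {name: count / total for name, count in copies.items()}


# Two lineages that start at equal abundance. The second is assumed to have
# a mutated priming site and therefore a lower per-cycle efficiency.
templates = {"germline_like": (100, 1.0), "mutated": (100, 0.8)}
fractions = observed_fractions(templates, cycles=30)
```

Under these assumed values the mutated lineage, half of the starting material, falls to roughly 4% of the amplified pool, while equal efficiencies would preserve the starting proportions.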
  • the methods of the invention utilize 5′ RACE and universal PCR. PCR bias is eliminated by molecular tagging (FIG. 3).
  • the present method sequences the immune repertoire directly from a heterogeneous nucleic acid mixture derived from a heterogeneous population of cells.
  • the methods of the invention generally involve the steps of obtaining a peripheral whole blood sample from a subject, isolating RNA from the peripheral whole blood sample, or fraction thereof (e.g., peripheral blood mononuclear cells), reverse transcribing the isolated RNA using immunoglobulin heavy chain or TCR beta chain specific primers to generate immunoglobulin (e.g., heavy chain or light chain) or TCR (e.g., alpha, beta, delta or gamma chain) cDNA transcripts.
  • a short homopolymer is added to the end of the cDNA by the intrinsic property of reverse transcriptase.
  • Oligonucleotides with a 3′ sequence complementary to the homopolymer and a 5′ flanking sequence containing a universal sequence and a molecular tag composed of random nucleotides are used by the reverse transcriptase as a template. The result is that the end of each cDNA molecule is extended with a short homopolymer, a unique molecular tag, and a universal sequence. This allows amplification of unknown sequences between the gene specific sequence and the 5′-end of the mRNA (FIGS. 4-6).
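The oligonucleotide layout described above (universal sequence, then random molecular tag, then a stretch matching the homopolymer) can be sketched in code, together with the reverse step of splitting a read back into its tag and insert. Everything specific here is an assumption for illustration: the universal sequence is a placeholder, the homopolymer base and length are arbitrary choices, and the read is assumed to be oriented so that the universal sequence comes first.

```python
import random

# Hypothetical universal sequence, used only for illustration.
UNIVERSAL = "ACGTTGCAACGTTGCA"


def make_tagging_oligo(universal=UNIVERSAL, tag_length=9,
                       homopolymer_base="G", homopolymer_length=3, rng=None):
    """Return a 5'->3' oligo: universal sequence + random tag + homopolymer."""
    rng = rng or random.Random()
    tag = "".join(rng.choice("ACGT") for _ in range(tag_length))
    return universal + tag + homopolymer_base * homopolymer_length


def split_tagged_read(read, universal=UNIVERSAL, tag_length=9,
                      homopolymer_base="G"):
    """Return (tag, insert) for a read that starts with the universal
    sequence, or None if the expected layout is not found."""
    if not read.startswith(universal):
        return None
    rest = read[len(universal):]
    if len(rest) <= tag_length:
        return None
    tag, rest = rest[:tag_length], rest[tag_length:]
    insert = rest.lstrip(homopolymer_base)
    if len(insert) == len(rest):
        return None  # no homopolymer found after the tag
    # Caveat: lstrip also removes leading insert bases that happen to be
    # identical to the homopolymer base.
    return tag, insert


oligo = make_tagging_oligo(rng=random.Random(0))
read = oligo + "ATGGACTGGACC"
parsed = split_tagged_read(read)
```

Slicing the tag by fixed length rather than by pattern keeps the parser independent of the tag's random content.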
  • Because each cDNA molecule is labeled with a unique tag prior to amplification, the differential amplification of each cDNA molecule can be corrected for by counting each unique tag once, thereby providing a faithful measure of the abundance of each species in the repertoire. Sequence replicates of each cDNA molecule identified by the same molecular tag can be used to construct consensus sequences, thereby allowing correction for amplification and sequencing errors.
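Counting each unique tag once can be contrasted with raw read counting in a few lines. This is a simplified sketch: the sequences are stand-in labels rather than real consensus sequences, the tag names are invented, and the function names are illustrative.

```python
from collections import Counter, defaultdict


def abundance_by_reads(reads):
    """Raw read count per sequence (inflated by uneven amplification)."""
    return Counter(seq for _tag, seq in reads)


def abundance_by_tags(reads):
    """Number of distinct molecular tags per sequence, so each original
    cDNA molecule is counted once regardless of how often it was read."""
    tags = defaultdict(set)
    for tag, seq in reads:
        tags[seq].add(tag)
    return {seq: len(tag_set) for seq, tag_set in tags.items()}


# Two species, each represented by two original molecules. Species "A"
# amplified far more efficiently than species "B".
reads = (
    [("T1", "A")] * 8
    + [("T2", "A")] * 2
    + [("T3", "B")]
    + [("T4", "B")]
)
by_reads = abundance_by_reads(reads)
by_tags = abundance_by_tags(reads)
```

Read counts suggest a five-fold difference between the two species (10 versus 2), whereas tag counts recover the equal starting abundance (2 versus 2).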
  • the methods of the invention utilize biological samples from subjects or individuals.
  • the subject can be a patient, for example, a patient with an autoimmune disease, an infectious disease or cancer, or a transplant recipient.
  • the subject can be a human or a non-human mammal.
  • the subject can be a male or female subject of any age (e.g., a fetus, an infant, a child, or an adult).
  • Samples used in the methods of the provided invention can include, for example, a bodily fluid from a subject, including amniotic fluid surrounding a fetus, aqueous humor, bile, blood and blood plasma, cerumen (earwax), Cowper's fluid or pre-ejaculatory fluid, chyle, chyme, female ejaculate, interstitial fluid, lymph, menses, breast milk, mucus (including snot and phlegm), pleural fluid, pus, saliva, sebum (skin oil), semen, serum, sweat, tears, urine, vaginal lubrication, vomit, feces, internal body fluids including cerebrospinal fluid surrounding the brain and the spinal cord, synovial fluid surrounding bone joints, intracellular fluid (the fluid inside cells), and vitreous humour (the fluids in the eyeball).
  • the sample is a blood sample, such as a peripheral whole blood sample, or a fraction thereof.
  • the sample is whole, unfractionated blood.
  • the blood sample can be about 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, or 5.0 mL.
  • the sample can be obtained by a health care provider, for example, a physician, physician assistant, nurse, veterinarian, dermatologist, rheumatologist, dentist, paramedic, or surgeon.
  • the sample can be obtained by a research technician. More than one sample from a subject can be obtained.
  • the sample can include immune cells.
  • the immune cells can include T-cells and/or B-cells.
  • T-cells: T lymphocytes
  • T-cells include, for example, cells that express T-cell receptors.
  • T-cells include Helper T-cells (effector T-cells or Th cells), cytotoxic T-cells (CTLs), memory T-cells, and regulatory T-cells.
  • the sample can include a single cell in some applications (e.g., a calibration test to define relevant T-cells) or more generally at least 1,000, at least 10,000, at least 100,000, at least 250,000, at least 500,000, at least 750,000, or at least 1,000,000 T-cells.
  • B-cells include, for example, plasma B cells, memory B cells, B1 cells, B2 cells, marginal-zone B cells, and follicular B cells.
  • B-cells can express immunoglobulins (antibodies, B cell receptor).
  • the sample can include a single cell in some applications (e.g., a calibration test to define relevant B cells) or more generally at least 1,000, at least 10,000, at least 100,000, at least 250,000, at least 500,000, at least 750,000, or at least 1,000,000 B-cells.
  • the sample can include nucleic acid, for example, DNA (e.g., genomic DNA) or RNA (e.g., messenger RNA or microRNA).
  • the nucleic acid can be cell-free DNA or RNA.
  • the amount of RNA or DNA from a subject that can be analyzed includes, for example, as little as that from a single cell in some applications (e.g., a calibration test) and as much as that from 10 million cells or more, translating to a range of DNA of 6 pg-60 µg, and RNA of approximately 1 pg-10 µg.
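The mass ranges quoted above follow from simple per-cell arithmetic. The sketch below assumes about 6 pg of DNA and about 1 pg of RNA per cell, which are the per-cell values implied by the endpoints of the stated ranges; the constant and function names are illustrative.

```python
# Per-cell masses implied by the quoted ranges (assumed approximations).
DNA_PG_PER_CELL = 6.0
RNA_PG_PER_CELL = 1.0
PG_PER_UG = 1_000_000  # 1 microgram = 1,000,000 picograms


def mass_ug(cells, pg_per_cell):
    """Total nucleic acid mass in micrograms for a given number of cells."""
    return cells * pg_per_cell / PG_PER_UG


dna_single_cell_ug = mass_ug(1, DNA_PG_PER_CELL)           # 6 pg
dna_ten_million_ug = mass_ug(10_000_000, DNA_PG_PER_CELL)  # 60 ug
rna_ten_million_ug = mass_ug(10_000_000, RNA_PG_PER_CELL)  # 10 ug
```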
  • PCR: Polymerase chain reaction
  • the region to be amplified includes the full clonal sequence or a subset of the clonal sequence, including the V-D junction, D-J junction of an immunoglobulin or T-cell receptor gene, the full variable region of an immunoglobulin or T-cell receptor gene, the antigen recognition region, or a CDR, e.g., complementarity determining region 3 (CDR3).
  • the immunoglobulin sequence is amplified using a primary and a secondary amplification step.
  • Each of the different amplification steps can comprise different primers.
  • the different primers can introduce sequence not originally present in the immune gene sequence.
  • the amplification procedure can add one or more tags to the 5′ and/or 3′ end of the amplified immunoglobulin sequence.
  • the tag can be a sequence that facilitates subsequent sequencing of the amplified DNA.
  • the tag can be a sequence that facilitates binding the amplified sequence to a solid support.
  • the tag can be a bar-code or label to facilitate identification of the amplified immunoglobulin sequence.
  • a specific primer can be used from the C segment and a generic primer can be placed on the other side (5′).
  • the generic primer can be appended during cDNA synthesis through different methods, including the well described methods of strand switching.
  • the generic primer can be appended after cDNA synthesis through different methods, including ligation.
  • examples of amplification techniques include reverse transcription-PCR, real-time PCR, quantitative real-time PCR, digital PCR (dPCR), digital emulsion PCR (dePCR), clonal PCR, amplified fragment length polymorphism PCR (AFLP PCR), allele specific PCR, assembly PCR, asymmetric PCR (in which a great excess of primers for a chosen strand is used), colony PCR, helicase-dependent amplification (HDA), Hot Start PCR, inverse PCR (IPCR), in situ PCR, long PCR (extension of DNA greater than about 5 kilobases), multiplex PCR, nested PCR (uses more than one pair of primers), single-cell PCR, touchdown PCR, loop-mediated isothermal PCR (LAMP), and nucleic acid sequence based amplification (NASBA).
  • Other amplification schemes include: Ligase Chain Reaction, Branch DNA Amplification, Rolling Circle Amplification,
  • RNA in a sample can be converted to cDNA by using reverse transcription using techniques well known to those of ordinary skill in the art (see e.g., Sambrook, Fritsch and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition (1989)).
  • PolyA primers, random primers, and/or gene specific primers can be used in reverse transcription reactions.
  • Polymerases that can be used for amplification in the methods of the provided invention include, for example, Taq polymerase, AccuPrime polymerase, or Pfu.
  • the choice of polymerase to use can be based on whether fidelity or efficiency is preferred.
  • the amplicons are directly sequenced.
  • DNA sequencing techniques include classic dideoxy sequencing reactions (Sanger method) using labeled terminators or primers and gel separation in slab or capillary, sequencing-by-synthesis using reversibly terminated labeled nucleotides, pyrosequencing, 454 sequencing, allele specific hybridization to a library of labeled oligonucleotide probes, sequencing-by-synthesis using allele specific hybridization to a library of labeled clones that is followed by ligation, real time monitoring of the incorporation of labeled nucleotides during a polymerization step, and SOLiD sequencing.
  • the sequencing technique used in the methods of the provided invention generates at least 100 reads per run, at least 200 reads per run, at least 300 reads per run, at least 400 reads per run, at least 500 reads per run, at least 600 reads per run, at least 700 reads per run, at least 800 reads per run, at least 900 reads per run, at least 1,000 reads per run, at least 5,000 reads per run, at least 10,000 reads per run, at least 50,000 reads per run, at least 100,000 reads per run, at least 500,000 reads per run, at least 1,000,000 reads per run, at least 2,000,000 reads per run, at least 3,000,000 reads per run, at least 4,000,000 reads per run, at least 5,000,000 reads per run, at least 6,000,000 reads per run, at least 7,000,000 reads per run, at least 8,000,000 reads per run, at least 9,000,000 reads per run, or at least 10,000,000 reads per run.
  • the number of sequencing reads should be at least 2 times the number of B cells sampled, at least 3 times the number of B cells sampled, at least 5 times the number of B cells sampled, at least 6 times the number of B cells sampled, at least 7 times the number of B cells sampled, at least 8 times the number of B cells sampled, at least 9 times the number of B cells sampled, or at least 10 times the number of B cells sampled.
  • the read depth allows for accurate coverage of B cells sampled, facilitates error correction, and ensures that the sequencing of the library has been saturated.
  • the number of sequencing reads should be at least 2 times the number of T-cells sampled, at least 3 times the number of T-cells sampled, at least 5 times the number of T-cells sampled, at least 6 times the number of T-cells sampled, at least 7 times the number of T-cells sampled, at least 8 times the number of T-cells sampled, at least 9 times the number of T-cells sampled, or at least 10 times the number of T-cells sampled.
  • the read depth allows for accurate coverage of T-cells sampled, facilitates error correction, and ensures that the sequencing of the library has been saturated.
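The read-depth guidance above amounts to a simple multiple of the number of cells sampled. The sketch below reads the stated multiples as total reads relative to cells sampled, which is an interpretation of the text; the function names and the default multiple of 10 are illustrative choices.

```python
def required_reads(cells_sampled, fold=10):
    """Minimum number of reads for a chosen multiple of the cells sampled."""
    return cells_sampled * fold


def depth_is_sufficient(reads_per_run, cells_sampled, fold=10):
    """True if a run yields at least `fold` reads per cell sampled."""
    return reads_per_run >= required_reads(cells_sampled, fold)


# Example: 100,000 B cells sampled at a 10-fold target.
target = required_reads(100_000, fold=10)
```

For 100,000 cells at a 10-fold target, a run would need at least 1,000,000 reads.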
  • the sequencing technique used in the methods of the provided invention can generate about 30 bp, about 40 bp, about 50 bp, about 60 bp, about 70 bp, about 80 bp, about 90 bp, about 100 bp, about 110 bp, about 120 bp, about 150 bp, about 200 bp, about 250 bp, about 300 bp, about 350 bp, about 400 bp, about 450 bp, about 500 bp, about 550 bp, about 600 bp, about 700 bp, about 800 bp, about 900 bp, or about 1,000 bp per read.
  • the sequencing technique used in the methods of the provided invention can generate at least 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or 1,000 bp per read.
  • a sequencing technique that can be used in the methods of the provided invention includes, for example, Helicos True Single Molecule Sequencing (tSMS) (Harris T. D. et al. (2008) Science 320:106-109).
  • a DNA sample is cleaved into strands of approximately 100 to 200 nucleotides, and a polyA sequence is added to the 3′ end of each DNA strand.
  • Each strand is labeled by the addition of a fluorescently labeled adenosine nucleotide.
  • the DNA strands are then hybridized to a flow cell, which contains millions of oligo-T capture sites that are immobilized to the flow cell surface.
  • the templates can be at a density of about 100 million templates/cm².
  • the flow cell is then loaded into an instrument, e.g., a HeliScope™ sequencer, and a laser illuminates the surface of the flow cell, revealing the position of each template.
  • a CCD camera can map the position of the templates on the flow cell surface.
  • the template fluorescent label is then cleaved and washed away.
  • the sequencing reaction begins by introducing a DNA polymerase and a fluorescently labeled nucleotide.
  • the oligo-T nucleic acid serves as a primer.
  • the polymerase incorporates the labeled nucleotides to the primer in a template directed manner. The polymerase and unincorporated nucleotides are removed.
  • the templates that have directed incorporation of the fluorescently labeled nucleotide are detected by imaging the flow cell surface. After imaging, a cleavage step removes the fluorescent label, and the process is repeated with other fluorescently labeled nucleotides until the desired read length is achieved. Sequence information is collected with each nucleotide addition step.
  • 454 sequencing involves two steps. In the first step, DNA is sheared into fragments of approximately 300-800 base pairs, and the fragments are blunt ended. Oligonucleotide adaptors are then ligated to the ends of the fragments. The adaptors serve as primers for amplification and sequencing of the fragments.
  • the fragments can be attached to DNA capture beads, e.g., streptavidin-coated beads using, e.g., Adaptor B, which contains 5′-biotin tag.
  • the fragments attached to the beads are PCR amplified within droplets of an oil-water emulsion. The result is multiple copies of clonally amplified DNA fragments on each bead.
  • the beads are captured in wells (pico-liter sized). Pyrosequencing is performed on each DNA fragment in parallel. Addition of one or more nucleotides generates a light signal that is recorded by a CCD camera in a sequencing instrument. The signal strength is proportional to the number of nucleotides incorporated.
  • Pyrosequencing makes use of pyrophosphate (PPi) which is released upon nucleotide addition.
  • PPi is converted to ATP by ATP sulfurylase in the presence of adenosine 5′ phosphosulfate.
  • Luciferase uses ATP to convert luciferin to oxyluciferin, and this reaction generates light that is detected and analyzed.
  • Genome Sequencer FLX systems (e.g., GS FLX/FLX+, GS Junior)
  • These systems are ideally suited for de novo sequencing of whole genomes and transcriptomes of any size, metagenomic characterization of complex samples, or resequencing studies.
  • In SOLiD sequencing, genomic DNA is sheared into fragments, and adaptors are attached to the 5′ and 3′ ends of the fragments to generate a fragment library.
  • internal adaptors can be introduced by ligating adaptors to the 5′ and 3′ ends of the fragments, circularizing the fragments, digesting the circularized fragment to generate an internal adaptor, and attaching adaptors to the 5′ and 3′ ends of the resulting fragments to generate a mate-paired library.
  • clonal bead populations are prepared in microreactors containing beads, primers, template, and PCR components. Following PCR, the templates are denatured and beads are enriched to separate the beads with extended templates. Templates on the selected beads are subjected to a 3′ modification that permits bonding to a glass slide.
  • the sequence can be determined by sequential hybridization and ligation of partially random oligonucleotides with a central determined base (or pair of bases) that is identified by a specific fluorophore. After a color is recorded, the ligated oligonucleotide is cleaved and removed and the process is then repeated.
  • IonTorrent uses a high-density array of micro-machined wells to perform this biochemical process in a massively parallel way. Each well holds a different DNA template. Beneath the wells is an ion-sensitive layer and beneath that a proprietary Ion sensor. If a nucleotide, for example a C, is added to a DNA template and is then incorporated into a strand of DNA, a hydrogen ion will be released. The charge from that ion will change the pH of the solution, which can be detected by the proprietary ion sensor. The sequencer will call the base, going directly from chemical information to digital information.
  • the Ion Personal Genome Machine (PGM™) sequencer then sequentially floods the chip with one nucleotide after another. If the next nucleotide that floods the chip is not a match, no voltage change will be recorded and no base will be called. If there are two identical bases on the DNA strand, the voltage will be double, and the chip will record two identical bases called. Because this is direct detection (no scanning, no cameras, no light), each nucleotide incorporation is recorded in seconds.
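The flow-by-flow logic described above (no signal means no base, a doubled signal means two identical bases) can be sketched as follows. This is a minimal illustration and not the instrument's actual base caller: the function name `call_bases`, the flow order `"TACG"`, and the assumption that signals are already normalized so that one incorporation is about 1.0 are all hypothetical.

```python
def call_bases(flow_order, signals):
    """Translate normalized per-flow signals into a base string.

    Assumption: a signal near 1.0 means one incorporation, near 2.0 means
    a homopolymer of length two, and near 0.0 means no incorporation.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]  # nucleotide flooded in this flow
        count = max(0, int(signal + 0.5))             # round to the nearest whole number
        bases.append(nucleotide * count)              # zero copies adds nothing
    return "".join(bases)


# Eight flows cycling through T, A, C, G (hypothetical signal values).
read = call_bases("TACG", [1.02, 0.05, 1.96, 0.0, 0.1, 1.1, 0.0, 0.93])
```

Rounding to the nearest integer is the simplest possible decision rule; real signals are noisier for long homopolymers, which this sketch does not model.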
  • HiSEQ™ systems (e.g., HiSEQ2000™ and HiSEQ1000™) and the MiSEQ™ system from Illumina, Inc.
  • the HiSEQ™ system is based on massively parallel sequencing of millions of fragments using attachment of randomly fragmented genomic DNA to a planar, optically transparent surface and solid phase amplification to create a high density sequencing flow cell with millions of clusters, each containing about 1,000 copies of template per sq. cm. These templates are sequenced using four-color DNA sequencing-by-synthesis technology.
  • the MiSEQ™ system uses TruSeq, Illumina's reversible terminator-based sequencing-by-synthesis.
  • SOLEXA sequencing is based on the amplification of DNA on a solid surface using fold-back PCR and anchored primers. Genomic DNA is fragmented, and adapters are added to the 5′ and 3′ ends of the fragments. DNA fragments that are attached to the surface of flow cell channels are extended and bridge amplified. The fragments become double stranded, and the double stranded molecules are denatured. Multiple cycles of the solid-phase amplification followed by denaturation can create several million clusters of approximately 1,000 copies of single-stranded DNA molecules of the same template in each channel of the flow cell.
  • Primers, DNA polymerase, and four fluorophore-labeled, reversibly terminating nucleotides are used to perform sequential sequencing. After nucleotide incorporation, a laser is used to excite the fluorophores, an image is captured, and the identity of the first base is recorded. The 3′ terminators and fluorophores from each incorporated base are removed, and the incorporation, detection and identification steps are repeated.
  • SMRT™ (single molecule, real-time) sequencing
  • each of the four DNA bases is attached to one of four different fluorescent dyes. These dyes are phospholinked.
  • a single DNA polymerase is immobilized with a single molecule of template single stranded DNA at the bottom of a zero-mode waveguide (ZMW).
  • a ZMW is a confinement structure which enables observation of incorporation of a single nucleotide by DNA polymerase against the background of fluorescent nucleotides that rapidly diffuse in and out of the ZMW (in microseconds). It takes several milliseconds to incorporate a nucleotide into a growing strand.
  • the fluorescent label is excited and produces a fluorescent signal, and the fluorescent tag is cleaved off. Detection of the corresponding fluorescence of the dye indicates which base was incorporated. The process is repeated.
  • a nanopore is a small hole, of the order of 1 nanometer in diameter. Immersion of a nanopore in a conducting fluid and application of a potential across it results in a slight electrical current due to conduction of ions through the nanopore. The amount of current which flows is sensitive to the size of the nanopore. As a DNA molecule passes through a nanopore, each nucleotide on the DNA molecule obstructs the nanopore to a different degree. Thus, the change in the current passing through the nanopore as the DNA molecule passes through the nanopore represents a reading of the DNA sequence.
  • a sequencing technique that can be used in the methods of the provided invention involves using a chemical-sensitive field effect transistor (chemFET) array to sequence DNA (for example, as described in US Patent Application Publication No. 20090026082).
  • DNA molecules can be placed into reaction chambers, and the template molecules can be hybridized to a sequencing primer bound to a polymerase.
  • Incorporation of one or more triphosphates into a new nucleic acid strand at the 3′ end of the sequencing primer can be detected by a change in current by a chemFET.
  • An array can have multiple chemFET sensors.
  • single nucleic acids can be attached to beads, and the nucleic acids can be amplified on the bead, and the individual beads can be transferred to individual reaction chambers on a chemFET array, with each chamber having a chemFET sensor, and the nucleic acids can be sequenced.
  • Another example of a sequencing technique that can be used in the methods of the provided invention involves using an electron microscope (Moudrianakis E. N. and Beer M. Proc Natl Acad Sci USA. 1965 March; 53:564-71).
  • individual DNA molecules are labeled using metallic labels that are distinguishable using an electron microscope. These molecules are then stretched on a flat surface and imaged using an electron microscope to measure sequences.
  • Sequencing allows the presence of multiple immunoglobulin genes to be detected and quantified in a heterogeneous biological sample.
  • the high throughput sequencing provides a very large dataset, which is then analyzed in order to establish the repertoire.
  • High-throughput analysis can be achieved using one or more bioinformatics tools, such as ALLPATHS (a whole genome shotgun assembler that can generate high quality assemblies from short reads), Arachne (a tool for assembling genome sequences from whole genome shotgun reads, mostly in forward and reverse pairs obtained by sequencing cloned ends), BACCardI (a graphical tool for the validation of genomic assemblies, assisting genome finishing and intergenome comparison), CCRaVAT & QuTie (enables analysis of rare variants in large-scale case control and quantitative trait association studies), CNV-seq (a method to detect copy number variation using high throughput sequencing), Elvira (a set of tools/procedures for high throughput assembly of small genomes (e.g., viruses)), Glimmer (a system for finding genes in microbial DNA, especially the genomes of bacteria, archaea and viruses), gnumap (a program designed to accurately map sequence data obtained from next-generation sequencing machines), Goseq (an R library for performing Gene Ontology and
  • Grouping reads with the same molecular tag: Initially, sequences are matched based on identical molecular tags.
  • Consensus reads are used for mutation analysis and diversity measurement.
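Grouping by molecular tag and deriving a consensus read can be sketched as below. This is an illustrative assumption about how the step might be coded, not the method of the disclosure: the function name `consensus_by_tag` is hypothetical, reads are assumed to be already aligned to a common start, and the consensus is a simple per-position majority vote truncated to the shortest read in each group.

```python
from collections import Counter, defaultdict


def consensus_by_tag(tagged_reads):
    """Group (tag, sequence) pairs by tag and return one consensus per tag."""
    groups = defaultdict(list)
    for tag, sequence in tagged_reads:
        groups[tag].append(sequence)

    consensus = {}
    for tag, sequences in groups.items():
        length = min(len(s) for s in sequences)  # compare only shared positions
        consensus[tag] = "".join(
            # Majority base at position i; ties go to the base seen first.
            Counter(s[i] for s in sequences).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


# Hypothetical reads: the third read for tag "AAT" carries a sequencing error.
reads = [("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACTT"), ("GGC", "TTGA")]
result = consensus_by_tag(reads)
```

Because a sequencing error is unlikely to recur at the same position in most reads sharing a tag, the majority vote removes it, which is why the consensus reads are suitable for mutation analysis.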
  • VDJ lineage diversity: VDJ usage is enumerated by the number of observed lineages falling into each VJ, VDJ, VJC, or VDJC (e.g., VDJ) combination at a given read-depth.
  • VDJ and unique sequence abundance histograms are plotted by binning VDJ and unique sequence abundances (the latter of which are either clustered or have undergone lineage-analysis filtering and grouping) into log-spaced bins.
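A minimal sketch of log-spaced binning follows. It assumes that abundances are integer read counts and that one bin per decade (1-9, 10-99, 100-999, ...) is acceptable; both the bin width and the function name `log_binned_histogram` are illustrative choices, not taken from the disclosure.

```python
from collections import Counter


def log_binned_histogram(abundances):
    """Count abundances per decade bin, keyed by the bin's lower edge."""
    histogram = Counter()
    for count in abundances:
        if count < 1:
            continue  # zero or negative counts have no logarithm
        decade = len(str(int(count))) - 1  # number of digits minus one = floor(log10)
        histogram[10 ** decade] += 1
    return dict(sorted(histogram.items()))


# Hypothetical read counts for eight unique sequences.
bins = log_binned_histogram([1, 3, 9, 10, 42, 99, 100, 2500])
```

Counting digits avoids floating-point edge cases at exact powers of ten; finer bins would need a logarithm-based index instead.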
  • VJ, VDJ, VJC, or VDJC (e.g., VDJ) usage: Repertoires are represented by applying V-, D-, J-, and/or C-segments to different axes on a three-dimensional plot. Using either abundance (generally read number, which can be bias-normalized) or observed lineage diversity, bubbles of varying sizes are used at each V/D/J/C coordinate to represent the total usage of that combination.
  • Mutation vs. sequence abundance plots: After undergoing lineage analysis, unique sequences are binned by read-number (or bias-normalized abundance) into log-spaced bins. For a given abundance-bin, the number of mutations per unique sequence is averaged, giving a mutation vs. abundance curve.
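The averaging step for the mutation vs. abundance curve might look like the sketch below. The input format (pairs of read count and mutation count per unique sequence), the decade-wide bins, and the name `mutation_vs_abundance` are assumptions made for illustration.

```python
from collections import defaultdict


def mutation_vs_abundance(unique_sequences):
    """Average mutation count per decade-wide abundance bin.

    `unique_sequences` is an iterable of (read_count, mutation_count) pairs.
    """
    mutation_sums = defaultdict(int)
    sequence_counts = defaultdict(int)
    for read_count, mutations in unique_sequences:
        if read_count < 1:
            continue
        bin_floor = 10 ** (len(str(int(read_count))) - 1)  # lower edge of the decade bin
        mutation_sums[bin_floor] += mutations
        sequence_counts[bin_floor] += 1
    return {
        b: mutation_sums[b] / sequence_counts[b] for b in sorted(sequence_counts)
    }


# Hypothetical data: (read count, number of mutations) per unique sequence.
curve = mutation_vs_abundance([(1, 0), (5, 2), (12, 4), (80, 6), (150, 9)])
```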
  • V, D, J, C, VJ, VDJ, VJC, VDJC, antibody heavy chain, antibody light chain, CDR3, or T-cell receptor usage (Pearson, KL divergence): VJ, VDJ, VJC, or VDJC (e.g., VDJ) combinations are treated as vectors with indexed components v, weighted by either lineage-diversity or abundance for that VDJ combination. Pearson correlations and KL-divergences between each pair of individuals are then calculated over the indices.
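The two comparison measures can be sketched for a pair of individuals as follows. The helper names (`usage_vector`, `pearson`, `kl_divergence`), the example combination labels, and the small pseudocount used to avoid taking the logarithm of zero are all assumptions for illustration; the disclosure does not specify how empty combinations are handled.

```python
import math


def usage_vector(counts, combinations):
    """Normalize per-combination weights into a frequency vector."""
    total = sum(counts.get(c, 0) for c in combinations)
    return [counts.get(c, 0) / total for c in combinations]


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def kl_divergence(p, q, pseudocount=1e-9):
    """KL(p || q) in nats, with a pseudocount (an assumption) to avoid log(0)."""
    p = [a + pseudocount for a in p]
    q = [b + pseudocount for b in q]
    sum_p, sum_q = sum(p), sum(q)
    return sum((a / sum_p) * math.log((a / sum_p) / (b / sum_q)) for a, b in zip(p, q))


# Hypothetical abundance-weighted VDJ usage for two individuals.
person_a = {"V1-D1-J1": 50, "V1-D2-J1": 30, "V2-D1-J2": 20}
person_b = {"V1-D1-J1": 10, "V1-D2-J1": 30, "V2-D1-J2": 60}
combos = sorted(set(person_a) | set(person_b))
vec_a, vec_b = usage_vector(person_a, combos), usage_vector(person_b, combos)
r = pearson(vec_a, vec_b)
kl = kl_divergence(vec_a, vec_b)
```

Note that KL divergence is asymmetric, so `kl_divergence(vec_a, vec_b)` and `kl_divergence(vec_b, vec_a)` generally differ; a pairwise analysis has to fix a direction or report both.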
  • the results of the analysis may be referred to herein as an immune repertoire analysis result, which may be represented as a dataset that includes sequence information; representation of V, D, J, C, VJ, VDJ, VJC, VDJC, antibody heavy chain, antibody light chain, CDR3, or T-cell receptor usage; representation for abundance of V, D, J, C, VJ, VDJ, VJC, VDJC, antibody heavy chain, antibody light chain, CDR3, or T-cell receptor and unique sequences; representation of mutation frequency; correlative measures of V, D, J, C, VJ, VDJ, VJC, VDJC, antibody heavy chain, antibody light chain, CDR3, or T-cell receptor usage, etc.
  • Such results may then be output or stored, e.g. in a database of repertoire analyses, and may be used in comparisons with test results, reference results, and the like.
  • the repertoire can be compared with a reference or control repertoire to make a diagnosis, prognosis, analysis of drug effectiveness, or other desired analysis.
  • a reference or control repertoire may be obtained by the methods of the invention, and will be selected to be relevant for the sample of interest.
  • a test repertoire result can be compared to a single reference/control repertoire result to obtain information regarding the immune capability and/or history of the individual from which the sample was obtained.
  • the obtained repertoire result can be compared to two or more different reference/control repertoire results to obtain more in-depth information regarding the characteristics of the test sample.
  • the obtained repertoire result may be compared to a positive and negative reference repertoire result to obtain confirmed information regarding whether the phenotype of interest is present.
  • test repertoires can also be compared with each other.
  • a test repertoire is compared to a reference sample and the result is then compared with a result derived from a comparison between a second test repertoire and the same reference sample.
  • Determination or analysis of the difference values, i.e., the difference between two repertoires, can be performed using any conventional methodology, where a variety of methodologies are known to those of skill in the array art, e.g., by comparing digital images of the repertoire output, by comparing databases of usage data, etc.
  • a statistical analysis step can then be performed to obtain the weighted contribution of the sequence prevalence, e.g. V, D, J, C, VJ, VDJ, VJC, VDJC, antibody heavy chain, antibody light chain, CDR3, or T-cell receptor usage, mutation analysis, etc.
  • nearest shrunken centroids analysis may be applied as described in Tibshirani et al. (2002) P.N.A.S. 99:6567-6572 to compute the centroid for each class, then compute the average squared distance between a given repertoire and each centroid, normalized by the within-class standard deviation.
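The centroid and distance computation can be illustrated with the simplified sketch below. It deliberately omits the shrinkage (soft-thresholding) step of the cited method, so it is a plain nearest-centroid classifier with a pooled within-class standard deviation, not nearest shrunken centroids itself. The function names, the class labels, and the small constant `s0` guarding against zero variance are hypothetical.

```python
import math


def fit_centroids(samples):
    """Return per-class centroids and the pooled within-class SD per feature.

    `samples` maps a class label to a list of equal-length feature vectors.
    Requires more samples than classes.
    """
    centroids = {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in samples.items()
    }
    n_total = sum(len(vectors) for vectors in samples.values())
    n_features = len(next(iter(centroids.values())))
    pooled_sd = []
    for i in range(n_features):
        squared = sum(
            (v[i] - centroids[label][i]) ** 2
            for label, vectors in samples.items()
            for v in vectors
        )
        pooled_sd.append(math.sqrt(squared / (n_total - len(samples))))
    return centroids, pooled_sd


def classify(repertoire, centroids, pooled_sd, s0=1e-9):
    """Assign the class whose centroid has the smallest normalized distance."""
    def distance(centroid):
        return sum(
            ((x - c) / (sd + s0)) ** 2
            for x, c, sd in zip(repertoire, centroid, pooled_sd)
        ) / len(repertoire)

    return min(centroids, key=lambda label: distance(centroids[label]))


# Hypothetical two-feature usage profiles for two classes.
training = {
    "exposed": [[0.6, 0.1], [0.5, 0.2]],
    "naive": [[0.1, 0.5], [0.2, 0.6]],
}
centroids, pooled_sd = fit_centroids(training)
label = classify([0.55, 0.15], centroids, pooled_sd)
```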
  • a statistical analysis may comprise use of a statistical metric (e.g., an entropy metric, an ecology metric, a variation of abundance metric, a species richness metric, or a species heterogeneity metric) in order to characterize diversity of a set of immunological receptors.
  • Methods used to characterize ecological species diversity can also be used in the present invention. See, e.g., Peet, Annu Rev. Ecol. Syst. 5:285 (1974).
  • a statistical metric may also be used to characterize variation of abundance or heterogeneity.
  • An example of an approach to characterize heterogeneity is based on information theory, specifically the Shannon-Weaver entropy, which summarizes the frequency distribution in a single number. See, e.g., Pe
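As an illustration of the entropy-based heterogeneity measure mentioned above, the sketch below computes Shannon entropy from clonotype counts. The function name and the choice of natural logarithm (entropy in nats) are assumptions; any logarithm base can be used as long as it is applied consistently.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in nats) of a list of non-negative clonotype counts."""
    total = sum(counts)
    entropy = 0.0
    for count in counts:
        if count > 0:  # zero-count categories contribute nothing
            frequency = count / total
            entropy -= frequency * math.log(frequency)
    return entropy


# Hypothetical repertoires: evenly distributed vs. dominated by one clone.
even = shannon_entropy([25, 25, 25, 25])
skewed = shannon_entropy([97, 1, 1, 1])
```

An evenly distributed repertoire of n clonotypes reaches the maximum value log(n), while a repertoire dominated by a single expanded clone approaches zero.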
  • the classification can be probabilistically defined, where the cut-off may be empirically derived.
  • a probability of about 0.4 can be used to distinguish between individuals exposed and not-exposed to an antigen of interest, more usually a probability of about 0.5, and can utilize a probability of about 0.6 or higher.
  • a “high” probability can be at least about 0.75, at least about 0.7, at least about 0.6, or at least about 0.5.
  • a “low” probability may be not more than about 0.25, not more than 0.3, or not more than 0.4.
  • the above-obtained information is employed to predict whether a host, subject or patient should be treated with a therapy of interest and to optimize the dose therein.
  • the invention finds use in the prevention, treatment, detection, diagnosis, prognosis, or research into any condition or symptom of any condition, including cancer, inflammatory diseases, autoimmune diseases, allergies and infections of an organism.
  • the organism is preferably a human subject but can also be derived from non-human subjects, e.g., non-human mammals.
  • non-human mammals include, but are not limited to, non-human primates (e.g., apes, monkeys, gorillas), rodents (e.g., mice, rats), cows, pigs, sheep, horses, dogs, cats, or rabbits.
  • examples of cancer include prostate, pancreas, colon, brain, lung, breast, bone, and skin cancers.
  • inflammatory conditions include irritable bowel syndrome, ulcerative colitis, appendicitis, tonsillitis, and dermatitis.
  • atopic conditions include allergy, asthma, etc.
  • autoimmune diseases include IDDM, RA, MS, SLE, Crohn's disease, Graves' disease, etc.
  • Autoimmune diseases also include Celiac disease, and dermatitis herpetiformis. For example, determination of an immune response to cancer antigens, autoantigens, pathogenic antigens, vaccine antigens, and the like is of interest.
  • the nucleic acids are obtained from an organism before the organism has been challenged with an antigen (e.g., vaccinated). Comparing the diversity of the immunological receptors present before and after challenge may assist the analysis of the organism's response to the challenge.
  • Methods are also provided for optimizing therapy, by analyzing the immune repertoire in a sample, and based on that information, selecting the appropriate therapy, dose, treatment modality, etc. that is optimal for stimulating or suppressing a targeted immune response, while minimizing undesirable toxicity.
  • the treatment is optimized by selection for a treatment that minimizes undesirable toxicity, while providing for effective activity. For example, a patient may be assessed for the immune repertoire relevant to an autoimmune disease, and a systemic or targeted immunosuppressive regimen may be selected based on that information.
  • a signature repertoire for a condition can refer to an immune repertoire result that indicates the presence of a condition of interest. For example a history of cancer (or a specific type of allergy) may be reflected in the presence of immune receptor sequences that bind to one or more cancer antigens. The presence of autoimmune disease may be reflected in the presence of immune receptor sequences that bind to autoantigens.
  • a signature can be obtained from all or a part of a dataset; usually a signature will comprise repertoire information from at least about 100 different immune receptor sequences, at least about 10² different immune receptor sequences, at least about 10³ different immune receptor sequences, at least about 10⁴ different immune receptor sequences, at least about 10⁵ different immune receptor sequences, or more. Where a subset of the dataset is used, the subset may comprise, for example, alpha TCR, beta TCR, MHC, IgH, IgL, or combinations thereof.
  • classification methods described herein are of interest as a means of detecting the earliest changes along a disease pathway (e.g., a carcinogenesis pathway, inflammatory pathway, etc.), and/or to monitor the efficacy of various therapies and preventive interventions.
  • the methods disclosed herein can also be utilized to analyze the effects of agents on cells of the immune system. For example, analysis of changes in immune repertoire following exposure to one or more test compounds can be performed to analyze the effect(s) of the test compounds on an individual. Such analyses can be useful for multiple purposes, for example in the development of immunosuppressive or immune enhancing therapies.
  • Agents to be analyzed for potential therapeutic value can be any compound, small molecule, protein, lipid, carbohydrate, nucleic acid or other agent appropriate for therapeutic use.
  • tests are performed in vivo, e.g. using an animal model, to determine effects on the immune repertoire.
  • Agents of interest for screening include known and unknown compounds that encompass numerous chemical classes, primarily organic molecules, which may include organometallic molecules, genetic sequences, etc.
  • An important aspect of the invention is to evaluate candidate drugs, including toxicity testing; and the like.
  • candidate agents include organic molecules comprising functional groups necessary for structural interactions, particularly hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, frequently at least two of the functional chemical groups.
  • the candidate agents can comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups.
  • Candidate agents can also be found among biomolecules, including peptides, polynucleotides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs or combinations thereof.
  • test compounds may have known functions (e.g., relief of oxidative stress), but may act through an unknown mechanism or act on an unknown target.
  • pharmacologically active drugs include chemotherapeutic agents, hormones or hormone antagonists, etc.
  • Exemplary of pharmaceutical agents suitable for this invention are those described in, “The Pharmacological Basis of Therapeutics,” Goodman and Gilman, McGraw-Hill, New York, N.Y., (1996), Ninth edition, under the sections: Water, Salts and Ions; Drugs Affecting Renal Function and Electrolyte Metabolism; Drugs Affecting Gastrointestinal Function; Chemotherapy of Microbial Diseases; Chemotherapy of Neoplastic Diseases; Drugs Acting on Blood-Forming organs; Hormones and Hormone Antagonists; Vitamins, Dermatology; and Toxicology, all incorporated herein by reference. Also included are toxins, and biological and chemical warfare agents, for example see Somani, S. M. (Ed.), “Chemical Warfare Agents,” Academic Press, New
  • Test compounds include all of the classes of molecules described above, and can further comprise samples of unknown content. Of interest are complex mixtures of naturally occurring compounds derived from natural sources such as plants, fungi, bacteria, protists or animals. While many samples will comprise compounds in solution, solid samples that can be dissolved in a suitable solvent may also be assayed. Samples of interest include environmental samples, e.g., ground water, sea water, mining waste, etc., biological samples, e.g. lysates prepared from crops, tissue samples, etc.; manufacturing samples, e.g. time course during preparation of pharmaceuticals; as well as libraries of compounds prepared for analysis; and the like (e.g., compounds being assessed for potential therapeutic value, i.e., drug candidates).
  • Samples or compounds can also include additional components, for example components that affect the ionic strength, pH, total protein concentration, etc.
  • the samples may be treated to achieve at least partial fractionation or concentration.
  • Biological samples may be stored if care is taken to reduce degradation of the compound, e.g. under nitrogen, frozen, or a combination thereof.
  • the volume of sample used is sufficient to allow for measurable detection, for example from about 0.1 ml to 1 ml of a biological sample can be sufficient.
  • Compounds, including candidate agents, are obtained from a wide variety of sources including libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds, including biomolecules, including expression of randomized oligonucleotides and oligopeptides. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. Additionally, natural or synthetically produced libraries and compounds are readily modified through conventional chemical, physical and biochemical means, and may be used to produce combinatorial libraries. Known pharmacological agents may be subjected to directed or random chemical modifications, such as acylation, alkylation, esterification, amidification, etc. to produce structural analogs.
  • agent formulations do not include additional components, such as preservatives, that may have a significant effect on the overall formulation.
  • such formulations consist essentially of a biologically active compound and a physiologically acceptable carrier, e.g. water, ethanol, DMSO, etc.
  • the formulation may consist essentially of the compound itself.
  • databases of immune repertoires or of sets of immunological receptors can typically comprise repertoire results derived from various individual conditions, such as individuals having exposure to a vaccine, to a cancer, having an autoimmune disease of interest, infection with a pathogen, and the like.
  • Such databases can also include sequences of immunological receptors derived from synthetic libraries, or from other artificial methods.
  • the repertoire results and databases thereof may be provided in a variety of media to facilitate their use. “Media” refers to a manufacture that contains the expression repertoire information of the present invention.
  • the databases of the present invention can be recorded on computer readable media, e.g. any medium that can be read and accessed directly by a computer.
  • Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.
  • a computer-based system refers to the hardware means, software means, and data storage means used to analyze the information of the present invention.
  • the minimum hardware of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means.
  • data storage means may comprise any manufacture comprising a recording of the present information as described above, or a memory access means that can access such a manufacture.
  • a variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. Such presentation provides a skilled artisan with a ranking of similarities and identifies the degree of similarity contained in the test expression repertoire.
  • a scaled approach may also be taken to the data analysis. For example, Pearson correlation of the repertoire results can provide a quantitative score reflecting the signature for each sample. The higher the correlation value, the more the sample resembles a reference repertoire. A negative correlation value indicates the opposite behavior.
  • the threshold for the classification can be moved up or down from zero depending on the clinical goal.
  • the false discovery rate (FDR) may be determined.
  • a set of null distributions of dissimilarity values is generated.
  • the values of observed repertoires are permuted to create a sequence of distributions of correlation coefficients obtained by chance, thereby creating an appropriate set of null distributions of correlation coefficients (see Tusher et al. (2001) PNAS 98, 5116-21, herein incorporated by reference).
  • the set of null distributions is obtained by: permuting the values of each repertoire for all available repertoires; calculating the pairwise correlation coefficients for all repertoire results; calculating the probability density function of the correlation coefficients for this permutation; and repeating the procedure N times, where N is a large number, usually 300.
  • an appropriate measure (mean, median, etc.)
  • the FDR is the ratio of the number of the expected falsely significant correlations (estimated from the correlations greater than this selected Pearson correlation in the set of randomized data) to the number of correlations greater than this selected Pearson correlation in the empirical data (significant correlations). This cut-off correlation value may be applied to the correlations between experimental repertoires.
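The permutation procedure and the FDR ratio described above can be sketched as follows. This is a simplified reading, not the exact procedure of the cited reference: it skips the probability density estimate and uses the mean count of null correlations above the cut-off as the expected number of false positives. The function names, the fixed random seed, and the example data are hypothetical.

```python
import itertools
import math
import random


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def count_above(repertoires, cutoff):
    """Number of pairwise correlations exceeding the cut-off."""
    return sum(
        pearson(a, b) > cutoff for a, b in itertools.combinations(repertoires, 2)
    )


def estimate_fdr(repertoires, cutoff, n_permutations=300, seed=0):
    """Expected falsely significant correlations divided by observed ones.

    Returns None when no observed correlation exceeds the cut-off.
    """
    observed = count_above(repertoires, cutoff)
    if observed == 0:
        return None
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    null_counts = []
    for _ in range(n_permutations):
        # Permute the values within each repertoire independently.
        shuffled = [rng.sample(rep, len(rep)) for rep in repertoires]
        null_counts.append(count_above(shuffled, cutoff))
    expected_false = sum(null_counts) / n_permutations  # mean over permutations
    return expected_false / observed


# Hypothetical usage vectors for three strongly correlated repertoires.
repertoires = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [2, 4, 6, 8, 10, 12, 14, 17],
    [1, 2, 3, 5, 5, 6, 7, 9],
]
fdr = estimate_fdr(repertoires, cutoff=0.9)
```

Lowering the cut-off admits more chance correlations from the permuted data and so raises the estimated FDR, which is the trade-off used to choose the cut-off correlation value.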
  • a level of confidence is chosen for significance. This is used to determine the lowest value of the correlation coefficient that exceeds the result that would have been obtained by chance.
  • Using this method, one obtains thresholds for positive correlation, negative correlation or both. Using this threshold(s), the user can filter the observed values of the pairwise correlation coefficients and eliminate those that do not exceed the threshold(s). Furthermore, an estimate of the false positive rate can be obtained for a given threshold. For each of the individual “random correlation” distributions, one can find how many observations fall outside the threshold range. This procedure provides a sequence of counts. The mean and the standard deviation of the sequence provide the average number of potential false positives and its standard deviation.
  • the data can be subjected to non-supervised hierarchical clustering to reveal relationships among repertoires.
  • hierarchical clustering may be performed, where the Pearson correlation is employed as the clustering metric.
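A small, dependency-free sketch of hierarchical clustering with Pearson correlation as the metric is given below. It assumes average linkage and uses 1 - r as the distance, neither of which is specified in the text; the function names and sample labels are hypothetical, and a production analysis would more likely rely on an established statistics library.

```python
import itertools
import math


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def hierarchical_clusters(profiles, n_clusters):
    """Average-linkage agglomerative clustering using distance 1 - Pearson r.

    `profiles` maps a sample name to its usage vector.
    """
    names = list(profiles)
    distance = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in itertools.combinations(names, 2)
    }

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        return sum(distance[frozenset((a, b))] for a in c1 for b in c2) / (
            len(c1) * len(c2)
        )

    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        i, j = min(
            itertools.combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Hypothetical repertoires: s1/s2 rise together, s3/s4 fall together.
profiles = {
    "s1": [1, 2, 3, 4],
    "s2": [2, 4, 6, 9],
    "s3": [4, 3, 2, 1],
    "s4": [9, 6, 4, 2],
}
groups = hierarchical_clusters(profiles, n_clusters=2)
```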
  • Clustering of the correlation matrix, e.g., using multidimensional scaling, enhances the visualization of functional homology similarities and dissimilarities.
  • Multidimensional scaling (MDS) can be applied in one, two or three dimensions.
  • a machine-readable storage medium comprising a data storage material encoded with machine readable data which, when using a machine programmed with instructions for using said data, is capable of displaying any of the datasets and data comparisons of this invention.
  • Such data may be used for a variety of purposes, such as drug discovery, analysis of interactions between cellular components, and the like.
  • the invention is implemented in computer programs executing on programmable computers, comprising a processor, a data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device.
  • Program code is applied to input data to perform the functions described above and generate output information.
  • the output information is applied to one or more output devices, in known fashion.
  • the computer may be, for example, a personal computer, microcomputer, or workstation of conventional design.
  • Each program can be implemented in a high level procedural or object oriented programming language to communicate with a computer system.
  • the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language.
  • Each such computer program can be stored on a storage medium or device (e.g., ROM or magnetic diskette) readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described herein.
  • the system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.
  • a variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention.
  • One format for an output tests datasets possessing varying degrees of similarity to a trusted repertoire. Such presentation provides a skilled artisan with a ranking of similarities and identifies the degree of similarity contained in the test repertoire.
  • Sequence or other data can be input into a computer by a user either directly or indirectly.
  • any of the devices which can be used to sequence DNA or analyze DNA or analyze immune repertoire data can be linked to a computer, such that the data is transferred to a computer and/or computer-compatible storage device.
  • Data can be stored on a computer or suitable storage device (e.g., CD).
  • Data can also be sent from a computer to another computer or data collection point via methods well known in the art (e.g., the internet, ground mail, air mail).
  • data collected by the methods described herein can be collected at any point or geographical location and sent to any other geographical location.
  • reagents and kits thereof for practicing one or more of the above-described methods.
  • the subject reagents and kits thereof may vary greatly.
  • Reagents of interest include reagents specifically designed for use in production of the above described immune repertoire analysis.
  • reagents can include primer sets for cDNA synthesis, for PCR amplification and/or for high throughput sequencing of a class or subtype of immunological receptors.
  • Gene specific primers and methods for using the same are described in U.S. Pat. No. 5,994,076, the disclosure of which is herein incorporated by reference.
  • the gene specific primer collections can include only primers for immunological receptors, or they may include primers for additional genes, e.g., housekeeping genes, controls, etc.
  • kits of the subject invention can include the above described gene specific primer collections.
  • the kits can further include a software package for statistical analysis, and may include a reference database for calculating the probability of a match between two repertoires.
  • the kit may include reagents employed in the various methods, such as primers for generating target nucleic acids, dNTPs and/or rNTPs, which may be either premixed or separate, one or more uniquely labeled dNTPs and/or rNTPs, such as biotinylated or Cy3 or Cy5 tagged dNTPs, gold or silver particles with different scattering spectra, or other post synthesis labeling reagent, such as chemically active derivatives of fluorescent dyes, enzymes, such as reverse transcriptases, DNA polymerases, RNA polymerases, and the like, various buffer mediums, e.g., hybridization and washing buffers, prefabricated probe arrays, labeled probe purification reagents and components, like spin columns, etc., signal generation and detection reagents, e.g., streptavidin-alkaline phosphatase conjugate, chemifluorescent or chemiluminescent substrate, and the like.
  • the subject kits will further include instructions for practicing the subject methods. These instructions may be present in the subject kits in a variety of forms, one or more of which may be present in the kit.
  • One form in which these instructions may be present is as printed information on a suitable medium or substrate, e.g., a piece or pieces of paper on which the information is printed, in the packaging of the kit, in a package insert, etc.
  • Yet another means would be a computer readable medium, e.g., diskette, CD, etc., on which the information has been recorded.
  • Yet another means that may be present is a website address which may be used via the internet to access the information at a removed site. Any convenient means may be present in the kits.
  • the above-described analytical methods may be embodied as a program of instructions executable by computer to perform the different aspects of the invention. Any of the techniques described above may be performed by means of software components loaded into a computer or other information appliance or digital device. When so enabled, the computer, appliance or device may then perform the above-described techniques to assist the analysis of sets of values associated with a plurality of genes in the manner described above, or for comparing such associated values.
  • the software component may be loaded from a fixed media or accessed through a communication medium such as the internet or other type of computer network.
  • the above features are embodied in one or more computer programs and may be performed by one or more computers running such programs.
  • Software products may be tangibly embodied in a machine-readable medium, and comprise instructions operable to cause one or more data processing apparatus to perform operations comprising: a) clustering sequence data from a plurality of immunological receptors or fragments thereof; and b) providing a statistical analysis output on said sequence data.
  • a software product includes instructions for assigning the sequence data into V, D, J, C, VJ, VDJ, VJC, VDJC, or VJ/VDJ lineage usage classes or instructions for displaying an analysis output in a multi-dimensional plot.
  • a multidimensional plot enumerates all possible values for one of the following: V, D, J, or C (e.g., a three-dimensional plot that includes one axis that enumerates all possible V values, a second axis that enumerates all possible D values, and a third axis that enumerates all possible J values).
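  • One way the assignment of sequence data into VDJ usage classes and the enumeration of axis values for a three-dimensional plot might be organized is sketched below. The record format (dictionaries with "v", "d" and "j" keys), the function names and the example gene segment lists are assumptions for illustration only; the gene segment assignments themselves are presumed to come from an upstream alignment step that is not shown.

```python
from collections import Counter


def vdj_usage(assignments):
    """Count reads per (V, D, J) class; D is None for chains without a D segment."""
    return Counter((a["v"], a.get("d"), a["j"]) for a in assignments)


def usage_grid(counts, v_genes, d_genes, j_genes):
    """Dense grid indexed [v][d][j] that enumerates every possible combination.

    Each axis lists all possible values for one segment, so combinations that
    were never observed appear explicitly as zero.
    """
    return [
        [[counts.get((v, d, j), 0) for j in j_genes] for d in d_genes]
        for v in v_genes
    ]


# Illustrative input: three reads assigned to two VDJ classes.
reads = [
    {"v": "IGHV1-2", "d": "IGHD3-10", "j": "IGHJ4"},
    {"v": "IGHV1-2", "d": "IGHD3-10", "j": "IGHJ4"},
    {"v": "IGHV3-23", "d": "IGHD3-10", "j": "IGHJ6"},
]
counts = vdj_usage(reads)
grid = usage_grid(
    counts,
    v_genes=["IGHV1-2", "IGHV3-23"],
    d_genes=["IGHD3-10"],
    j_genes=["IGHJ4", "IGHJ6"],
)
```

  • The resulting grid can be passed to any plotting tool, with one axis per gene segment and the count shown as marker size or color.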
  • a software product includes instructions for identifying one or more unique patterns from a single sample correlated to a condition.
  • the software product may also include instructions for normalizing for amplification bias.
  • the software product may include instructions for using control data to normalize for sequencing errors or for using a clustering process to reduce sequencing errors.
  • a software product (or component) may also include instructions for using two separate primer sets or a PCR filter to reduce sequencing errors.

Abstract

The invention provides a non-invasive technique for the detection and quantification of the immune repertoire, in a biological sample containing a plurality of distinct cell populations. Methods are conducted using sequencing technology to detect and enumerate immune repertoire within a heterogeneous biological sample.

Description

    RELATED APPLICATIONS
  • This application claims the benefit of, and priority to, U.S. Provisional Application No. 61/806,143, filed Mar. 28, 2013, and U.S. Provisional Application No. 61/801,785, filed Mar. 15, 2013, the contents of which are herein incorporated by reference in their entirety.
  • FIELD OF THE INVENTION
  • The present invention relates to the field of quantitative nucleic acid analysis. More specifically, the present invention provides methods of determining the immune repertoire using high throughput sequencing.
  • BACKGROUND
  • A feature of the adaptive immune response is the ability to generate a wide diversity of binding molecules, e.g. T-cell antigen receptors and antibodies. A variety of molecular mechanisms exist to generate initial diversity, including genetic recombination at multiple sites. Armed with this initial repertoire of binding moieties, naive B and T-cells circulate where they can come in contact with antigen. Upon exposure to antigen there can be a positive selection process, where cells expressing immunological receptors having desired binding properties are expanded, and may undergo further sequence modification, for example somatic hypermutation, and additional recombination. There can also be a negative selection process, where cells expressing immunological receptors having undesirable binding properties, such as self-reactivity, are deleted. As a result of these selective processes, the repertoire of binding specificities in an individual sample can provide a history of past antigenic exposures, as well as being informative of inherent repertoire capabilities and limitations.
  • Adaptive immunological receptors of interest include immunoglobulins, or antibodies. This repertoire is highly plastic and can be directed to create antibodies with broad chemical diversity and high selectivity. There is also a good understanding of the potential diversity available and the mechanistic aspects of how this diversity is generated. Antibodies are composed of two types of chains (heavy and light), each containing a highly diversified antigen-binding domain (variable). The V, D, and J gene segments of the antibody heavy-chain variable genes go through a series of recombination events to generate a new heavy-chain gene. Antibodies are formed by a mixture of recombination among gene segments, sequence diversification at the junctions of these segments, and point mutations throughout the gene. The mechanisms are reviewed, for example in Maizels (2005) Annu. Rev. Genet. 39:23-46; Jones and Gellert (2004) Immunol. Rev. 200:233-248; Winter and Gearhart (1998) Immunol. Rev. 162:89-96.
  • Another adaptive immunological receptor of interest is the T-cell antigen receptor (TCR), which is a heterodimer of two chains, each of which is a member of the immunoglobulin superfamily, possessing an N-terminal variable (V) domain, and a C terminal constant domain. The variable domain of the TCR α-chain and β-chain has three hypervariable or complementarity determining regions (CDRs). The β-chain has an additional area of hypervariability (HV4) that does not normally contact antigen. Processes for generating diversity of the TCR are similar to those described for immunoglobulins. The TCR alpha chain is generated by VJ recombination, while the beta chain is generated by V(D)J recombination. Similarly, generation of the TCR gamma chain involves VJ recombination, while generation of the TCR delta chain occurs by V(D)J recombination. The intersection of these specific regions (V and J for the alpha or gamma chain, V D and J for the beta or delta chain) corresponds to the CDR3 region that is important for antigen-MHC recognition. It is the unique combination of the segments at this region, along with palindromic and random N- and P-nucleotide additions, which accounts for the TCR binding repertoire.
  • While reference is made to binding specificities, and indeed a good deal of serological analysis is based on the physical interactions between antigen and receptor, the underlying cause of the diversity lies in the genetic sequences expressed by lymphocytes, which sequences reflect the myriad processes of recombination, mutation and selection that have acted on the cell.
  • Methods of precisely determining the immune receptor repertoire of an individual, or a sample of interest from an individual, are of great interest for prognosis, diagnosis, and characterization.
  • SUMMARY
  • The present invention provides methods for monitoring the immune repertoire. More specifically, the invention provides a method of determining the immune repertoire in a subject by isolating a plurality of RNA from a biological sample comprising a plurality of cell types obtained from a subject; producing immunoglobulin chain or TCR chain cDNAs from the RNA; adding a homopolymeric tail, a random molecular tag and a universal sequence to the 3′ end of the cDNAs; amplifying the cDNAs by using one or more immunoglobulin chain or TCR chain specific primers and a universal sequence specific primer comprising a flanking sequence specific to the sequencing platform to produce a plurality of molecular tagged immunoglobulin chain or TCR chain nucleic acids; and sequencing the amplified immunoglobulin chain or TCR chain nucleic acids to produce a plurality of sequencing reads, thereby determining the immune repertoire of the subject. Optionally, the method further includes a data analysis step such as grouping sequence reads with the same molecular tag and clustering sequences within the same group. In some aspects a consensus sequence for each cluster is determined to produce a collection of consensus sequences. The collection of consensus sequences is used to determine the diversity of the immune repertoire.
  • The molecular tag is an oligomer. The oligomer is at least a 9mer. The biological sample is blood or a fraction thereof. The blood is peripheral whole blood. The blood fraction comprises peripheral blood mononuclear cells. In some aspects the blood sample is sorted based upon extracellular or intracellular markers.
  • The immunoglobulin chain is the heavy chain or the light chain. The immunoglobulin heavy chain contains the immunoglobulin VDJ and constant regions. The TCR chain is the alpha chain, the beta chain, the gamma chain or the delta chain. The TCR chain contains the VJ and constant regions.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1. is a schematic illustrating the method of the invention.
  • FIG. 2. is an illustration highlighting the differences of the method of the invention and multiplex sequencing of the immune repertoire.
  • FIG. 3 illustrates the elimination of capture and PCR bias by the method of the invention.
  • FIG. 4. is an illustration detailing the sequencing of the immune repertoire according to the method of the invention.
  • FIG. 5. is an illustration detailing the sequencing of the immune repertoire according to the method of the invention.
  • FIG. 6. is an illustration detailing the molecular tagging step of the sequencing of the immune repertoire according to the method of the invention.
  • FIG. 7. is a schematic illustrating the data analysis steps of the method of the invention.
  • FIG. 8. is a graph showing the isotype distribution obtained using the multiplex PCR vs. the 5′RACE method of the invention from the same sample. The multiplex PCR skews the repertoire towards a more naïve compartment.
  • FIG. 9. is a plot showing the top 100 lineages within each sample and the corresponding resonation of the other sample. Ribbons connect the same lineage. Width of segment represents the relative abundance of the lineage in the sample. A few large lineages seen with the method of the invention are absent in the same sample prepared by multiplex PCR.
  • FIG. 10 is an illustration showing that multiplex PCR does not prime lineages with mutations, with the result being that highly mutated lineages are absent from the multiplex PCR data set. The sequence identifiers of each of the sequences in FIG. 10 are SEQ ID NOs: 3-24 from top to bottom.
  • DETAILED DESCRIPTION
  • The present invention provides an improved method of sequencing the immune repertoire. Previous methods for determining the immune repertoire, such as those described in WO 2011/140433 and WO 2012/083069, are based upon multiplex PCR. Multiplex PCR has a number of limitations that make it particularly unsuited for accurately determining the immune repertoire (see FIG. 2). These limitations include capture bias and amplification bias owing to PCR. Multiplex PCR techniques for sequencing the immune repertoire use primers designed to prime all framework regions of known V gene segments. When a mutation arises at the priming site, capture bias occurs and the gene that had the mutation is under-amplified. PCR bias results from unequal amplification of the genes due to either the relative amount of each primer or PCR replicates of the same sequence. Thus PCR bias can cause apparent clonality or a lack of diversity. Generally speaking, the observed repertoire is not a faithful, or linear, representation of the actual underlying repertoire.
  • To eliminate capture bias the methods of the invention utilize 5′ RACE and universal PCR. PCR bias is eliminated by molecular tagging. (FIG. 3)
  • Additionally, unlike previous methods of sequencing the immune repertoire which require the isolation of specific populations of immune cells (e.g., T-cells or B-cells), and the spatial isolation of such cells into individual cells and/or individual molecules of nucleic acid derived from such cells to form colonies, the present method sequences the immune repertoire directly from a heterogeneous nucleic acid mixture derived from a heterogeneous population of cells.
  • The methods of the invention generally involve the steps of obtaining a peripheral whole blood sample from a subject, isolating RNA from the peripheral whole blood sample, or fraction thereof (e.g., peripheral blood mononuclear cells), reverse transcribing the isolated RNA using immunoglobulin heavy chain or TCR beta chain specific primers to generate immunoglobulin (e.g., heavy chain or light chain) or TCR (e.g., alpha, beta, delta or gamma chain) cDNA transcripts. A short homopolymer is added to the end of the cDNA by the intrinsic property of reverse transcriptase. Oligonucleotides with 3′ sequence complementary to the homopolymer and a 5′ flanking sequence containing a universal sequence and molecular tag that is composed of random nucleotides would be used by the reverse transcriptase as template. The result is that the end of each cDNA molecule is extended with a short homopolymer, a unique molecular tag, and a universal sequence. This allows amplification of unknown sequences between the gene specific sequence and the 5′-end of the mRNA. (FIGS. 4-6). Because each cDNA molecule is labeled with a unique tag prior to amplification, the differential amplification of each cDNA molecule can be corrected for by counting each unique tag once, thereby providing a faithful measure of the abundance of each species in the repertoire. Sequence replicates of each cDNA molecule identified by the same molecular tag can be used to construct consensus sequences, therefore allowing correction for amplification and sequencing errors.
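  • The tag-based correction described above can be illustrated with a short sketch. It assumes that each read has already been split into its molecular tag and its receptor sequence, and that reads sharing a tag are pre-aligned and of equal length, so that a per-position majority vote suffices; the additional step of clustering sequences within a tag group is omitted. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_tag(reads):
    """Group receptor sequences by the molecular tag they were read with."""
    groups = defaultdict(list)
    for tag, seq in reads:
        groups[tag].append(seq)
    return groups


def consensus(seqs):
    """Per-position majority vote across equal-length, pre-aligned reads."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def molecule_counts(reads):
    """Abundance of each consensus sequence, counting every tag exactly once."""
    return Counter(consensus(seqs) for seqs in group_by_tag(reads).values())


# Illustrative input: (molecular tag, sequence) pairs. The first tag was
# amplified three times, and one of its copies carries a sequencing error.
reads = [
    ("AAACCCGGG", "ACGTACGT"),
    ("AAACCCGGG", "ACGTACGT"),
    ("AAACCCGGG", "ACGAACGT"),
    ("TTTGGGCCC", "ACGTACGT"),
    ("GGGTTTAAA", "TTGTACGA"),
]
abundance = molecule_counts(reads)
```

  • In this example four reads carry the first species, yet it is counted as two molecules because only two distinct tags are present, and the erroneous base in the third read is removed by the consensus.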
  • Subjects
  • The methods of the invention utilize biological samples from subjects or individuals. The subject can be a patient, for example, a patient with an autoimmune disease, an infectious disease or cancer, or a transplant recipient. The subject can be a human or a non-human mammal. The subject can be a male or female subject of any age (e.g., a fetus, an infant, a child, or an adult).
  • Samples
  • Samples used in the methods of the provided invention can include, for example, a bodily fluid from a subject, including amniotic fluid surrounding a fetus, aqueous humor, bile, blood and blood plasma, cerumen (earwax), Cowper's fluid or pre-ejaculatory fluid, chyle, chyme, female ejaculate, interstitial fluid, lymph, menses, breast milk, mucus (including snot and phlegm), pleural fluid, pus, saliva, sebum (skin oil), semen, serum, sweat, tears, urine, vaginal lubrication, vomit, feces, internal body fluids including cerebrospinal fluid surrounding the brain and the spinal cord, synovial fluid surrounding bone joints, intracellular fluid (the fluid inside cells), and vitreous humour (the fluids in the eyeball).
  • In one embodiment, the sample is a blood sample, such as a peripheral whole blood sample, or a fraction thereof. Preferably, the sample is whole, unfractionated blood.
  • The blood sample can be about 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, or 5.0 mL.
  • The sample can be obtained by a health care provider, for example, a physician, physician assistant, nurse, veterinarian, dermatologist, rheumatologist, dentist, paramedic, or surgeon. The sample can be obtained by a research technician. More than one sample from a subject can be obtained.
  • The sample can include immune cells. The immune cells can include T-cells and/or B-cells. T-cells (T lymphocytes) include, for example, cells that express T-cell receptors. T-cells include Helper T-cells (effector T-cells or Th cells), cytotoxic T-cells (CTLs), memory T-cells, and regulatory T-cells. The sample can include a single cell in some applications (e.g., a calibration test to define relevant T-cells) or more generally at least 1,000, at least 10,000, at least 100,000, at least 250,000, at least 500,000, at least 750,000, or at least 1,000,000 T-cells.
  • B-cells include, for example, plasma B cells, memory B cells, B1 cells, B2 cells, marginal-zone B cells, and follicular B cells. B-cells can express immunoglobulins (antibodies, B cell receptor). The sample can include a single cell in some applications (e.g., a calibration test to define relevant B cells) or more generally at least 1,000, at least 10,000, at least 100,000, at least 250,000, at least 500,000, at least 750,000, or at least 1,000,000 B-cells.
  • The sample can include nucleic acid, for example, DNA (e.g., genomic DNA or mitochondrial DNA) or RNA (e.g., messenger RNA or microRNA). The nucleic acid can be cell-free DNA or RNA. In the methods of the provided invention, the amount of RNA or DNA from a subject that can be analyzed includes, for example, as low as a single cell in some applications (e.g., a calibration test) and as many as 10 million cells or more, translating to a range of DNA of 6 pg-60 µg, and RNA of approximately 1 pg-10 µg.
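  • The stated mass ranges follow from simple per-cell arithmetic, sketched below. The per-cell values of 6 pg of DNA and about 1 pg of RNA are inferred here from the endpoints of the ranges given above (one cell to 10 million cells) and are used only for illustration; the constant and function names are hypothetical.

```python
PG_DNA_PER_CELL = 6.0  # assumed: implied by 6 pg for a single cell
PG_RNA_PER_CELL = 1.0  # assumed: implied by approximately 1 pg for a single cell
PG_PER_MICROGRAM = 1_000_000


def mass_in_micrograms(n_cells, pg_per_cell):
    """Total nucleic acid mass in micrograms for a given number of cells."""
    return n_cells * pg_per_cell / PG_PER_MICROGRAM


dna_ug = mass_in_micrograms(10_000_000, PG_DNA_PER_CELL)  # 60 micrograms
rna_ug = mass_in_micrograms(10_000_000, PG_RNA_PER_CELL)  # 10 micrograms
```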
  • Amplification Reactions
  • Polymerase chain reaction (PCR) can be used to amplify the relevant regions from a collection of cells.
  • In some embodiments, the region to be amplified includes the full clonal sequence or a subset of the clonal sequence, including the V-D junction, D-J junction of an immunoglobulin or T-cell receptor gene, the full variable region of an immunoglobulin or T-cell receptor gene, the antigen recognition region, or a CDR, e.g., complementarity determining region 3 (CDR3).
  • In some embodiments, the immunoglobulin sequence is amplified using a primary and a secondary amplification step. Each of the different amplification steps can comprise different primers. The different primers can introduce sequence not originally present in the immune gene sequence. For example, the amplification procedure can add one or more tags to the 5′ and/or 3′ end of amplified immunoglobulin sequence. The tag can be a sequence that facilitates subsequent sequencing of the amplified DNA. The tag can be a sequence that facilitates binding the amplified sequence to a solid support. The tag can be a bar-code or label to facilitate identification of the amplified immunoglobulin sequence.
  • Other methods for amplification may not employ any primers in the V region. Instead, a specific primer can be used from the C segment and a generic primer can be put in the other side (5′). The generic primer can be appended in the cDNA synthesis through different methods including the well described methods of strand switching. Similarly, the generic primer can be appended after cDNA synthesis through different methods including ligation.
  • Other means of amplifying nucleic acid that can be used in the methods of the invention include, for example, reverse transcription-PCR, real-time PCR, quantitative real-time PCR, digital PCR (dPCR), digital emulsion PCR (dePCR), clonal PCR, amplified fragment length polymorphism PCR (AFLP PCR), allele specific PCR, assembly PCR, asymmetric PCR (in which a great excess of primers for a chosen strand is used), colony PCR, helicase-dependent amplification (HDA), Hot Start PCR, inverse PCR (IPCR), in situ PCR, long PCR (extension of DNA greater than about 5 kilobases), multiplex PCR, nested PCR (uses more than one pair of primers), single-cell PCR, touchdown PCR, loop-mediated isothermal amplification (LAMP), and nucleic acid sequence based amplification (NASBA). Other amplification schemes include: Ligase Chain Reaction, Branch DNA Amplification, Rolling Circle Amplification, Circle to Circle Amplification, SPIA amplification, Target Amplification by Capture and Ligation (TACL) amplification, and RACE amplification.
  • The information in RNA in a sample can be converted to cDNA by using reverse transcription using techniques well known to those of ordinary skill in the art (see e.g., Sambrook, Fritsch and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition (1989)). PolyA primers, random primers, and/or gene specific primers can be used in reverse transcription reactions.
  • Polymerases that can be used for amplification in the methods of the provided invention include, for example, Taq polymerase, AccuPrime polymerase, or Pfu. The choice of polymerase to use can be based on whether fidelity or efficiency is preferred.
  • After amplification of DNA from the genome (or amplification of nucleic acid in the form of cDNA by reverse transcribing RNA), the amplicons are directly sequenced.
  • Sequencing
  • Any technique for sequencing nucleic acid known to those skilled in the art can be used in the methods of the provided invention. DNA sequencing techniques include classic dideoxy sequencing reactions (Sanger method) using labeled terminators or primers and gel separation in slab or capillary, sequencing-by-synthesis using reversibly terminated labeled nucleotides, pyrosequencing, 454 sequencing, allele specific hybridization to a library of labeled oligonucleotide probes, sequencing-by-synthesis using allele specific hybridization to a library of labeled clones that is followed by ligation, real time monitoring of the incorporation of labeled nucleotides during a polymerization step, and SOLiD sequencing.
  • In certain embodiments, the sequencing technique used in the methods of the provided invention generates at least 100 reads per run, at least 200 reads per run, at least 300 reads per run, at least 400 reads per run, at least 500 reads per run, at least 600 reads per run, at least 700 reads per run, at least 800 reads per run, at least 900 reads per run, at least 1000 reads per run, at least 5,000 reads per run, at least 10,000 reads per run, at least 50,000 reads per run, at least 100,000 reads per run, at least 500,000 reads per run, at least 1,000,000 reads per run, at least 2,000,000 reads per run, at least 3,000,000 reads per run, at least 4,000,000 reads per run, at least 5,000,000 reads per run, at least 6,000,000 reads per run, at least 7,000,000 reads per run, at least 8,000,000 reads per run, at least 9,000,000 reads per run, or at least 10,000,000 reads per run.
  • In some embodiments the number of sequencing reads should be at least 2 times the number of B cells sampled, at least 3 times the number of B cells sampled, at least 5 times the number of B cells sampled, at least 6 times the number of B cells sampled, at least 7 times the number of B cells sampled, at least 8 times the number of B cells sampled, at least 9 times the number of B cells sampled, or at least 10 times the number of B cells sampled. The read depth allows for accurate coverage of B cells sampled, facilitates error correction, and ensures that the sequencing of the library has been saturated.
  • In some embodiments the number of sequencing reads should be at least 2 times the number of T-cells sampled, at least 3 times the number of T-cells sampled, at least 5 times the number of T-cells sampled, at least 6 times the number of T-cells sampled, at least 7 times the number of T-cells sampled, at least 8 times the number of T-cells sampled, at least 9 times the number of T-cells sampled, or at least 10 times the number of T-cells sampled. The read depth allows for accurate coverage of T-cells sampled, facilitates error correction, and ensures that the sequencing of the library has been saturated.
  • In certain embodiments, the sequencing technique used in the methods of the provided invention can generate about 30 bp, about 40 bp, about 50 bp, about 60 bp, about 70 bp, about 80 bp, about 90 bp, about 100 bp, about 110 bp, about 120 bp, about 150 bp, about 200 bp, about 250 bp, about 300 bp, about 350 bp, about 400 bp, about 450 bp, about 500 bp, about 550 bp, about 600 bp, about 700 bp, about 800 bp, about 900 bp, or about 1,000 bp per read. For example, the sequencing technique used in the methods of the provided invention can generate at least 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or 1,000 bp per read.
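  • The read-depth criterion for B cells and T-cells amounts to a fold-coverage check, which may be sketched as follows. The function names and the default of 10-fold are illustrative assumptions; any of the multiples listed above could be substituted.

```python
def required_reads(n_cells, fold=10):
    """Minimum number of reads for the chosen multiple of the cells sampled."""
    return n_cells * fold


def depth_is_sufficient(n_reads, n_cells, fold=10):
    """True if a run yields at least `fold` reads per cell sampled."""
    return n_reads >= required_reads(n_cells, fold)
```

  • For example, a sample of 100,000 cells sequenced at 10-fold depth calls for at least 1,000,000 reads.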
  • True Single Molecule Sequencing
  • A sequencing technique that can be used in the methods of the provided invention includes, for example, Helicos True Single Molecule Sequencing (tSMS) (Harris T. D. et al. (2008) Science 320:106-109). In the tSMS technique, a DNA sample is cleaved into strands of approximately 100 to 200 nucleotides, and a polyA sequence is added to the 3′ end of each DNA strand. Each strand is labeled by the addition of a fluorescently labeled adenosine nucleotide. The DNA strands are then hybridized to a flow cell, which contains millions of oligo-T capture sites that are immobilized to the flow cell surface. The templates can be at a density of about 100 million templates/cm². The flow cell is then loaded into an instrument, e.g., HeliScope™ sequencer, and a laser illuminates the surface of the flow cell, revealing the position of each template. A CCD camera can map the position of the templates on the flow cell surface. The template fluorescent label is then cleaved and washed away. The sequencing reaction begins by introducing a DNA polymerase and a fluorescently labeled nucleotide. The oligo-T nucleic acid serves as a primer. The polymerase incorporates the labeled nucleotides to the primer in a template directed manner. The polymerase and unincorporated nucleotides are removed. The templates that have directed incorporation of the fluorescently labeled nucleotide are detected by imaging the flow cell surface. After imaging, a cleavage step removes the fluorescent label, and the process is repeated with other fluorescently labeled nucleotides until the desired read length is achieved. Sequence information is collected with each nucleotide addition step.
  • 454 Sequencing
  • Another example of a DNA sequencing technique that can be used in the methods of the provided invention is 454 sequencing (Roche) (Margulies, M et al. 2005, Nature, 437, 376-380). 454 sequencing involves two steps. In the first step, DNA is sheared into fragments of approximately 300-800 base pairs, and the fragments are blunt ended. Oligonucleotide adaptors are then ligated to the ends of the fragments. The adaptors serve as primers for amplification and sequencing of the fragments. The fragments can be attached to DNA capture beads, e.g., streptavidin-coated beads using, e.g., Adaptor B, which contains 5′-biotin tag. The fragments attached to the beads are PCR amplified within droplets of an oil-water emulsion. The result is multiple copies of clonally amplified DNA fragments on each bead. In the second step, the beads are captured in wells (pico-liter sized). Pyrosequencing is performed on each DNA fragment in parallel. Addition of one or more nucleotides generates a light signal that is recorded by a CCD camera in a sequencing instrument. The signal strength is proportional to the number of nucleotides incorporated.
  • Pyrosequencing makes use of pyrophosphate (PPi) which is released upon nucleotide addition. PPi is converted to ATP by ATP sulfurylase in the presence of adenosine 5′ phosphosulfate. Luciferase uses ATP to convert luciferin to oxyluciferin, and this reaction generates light that is detected and analyzed.
  • Genome Sequencer FLX™
  • Another example of a DNA sequencing technique that can be used in the methods of the invention is the Genome Sequencer FLX systems (Roche/454). The Genome Sequencer FLX systems (e.g., GS FLX/FLX+, GS Junior) offer more than 1 million high-quality reads per run and read lengths of 400 bases. These systems are ideally suited for de novo sequencing of whole genomes and transcriptomes of any size, metagenomic characterization of complex samples, or resequencing studies.
  • SOLiD™ Sequencing
  • Another example of a DNA sequencing technique that can be used in the methods of the provided invention is SOLiD technology (Life Technologies, Inc.). In SOLiD sequencing, genomic DNA is sheared into fragments, and adaptors are attached to the 5′ and 3′ ends of the fragments to generate a fragment library. Alternatively, internal adaptors can be introduced by ligating adaptors to the 5′ and 3′ ends of the fragments, circularizing the fragments, digesting the circularized fragment to generate an internal adaptor, and attaching adaptors to the 5′ and 3′ ends of the resulting fragments to generate a mate-paired library. Next, clonal bead populations are prepared in microreactors containing beads, primers, template, and PCR components. Following PCR, the templates are denatured and beads are enriched to separate the beads with extended templates. Templates on the selected beads are subjected to a 3′ modification that permits bonding to a glass slide.
  • The sequence can be determined by sequential hybridization and ligation of partially random oligonucleotides with a central determined base (or pair of bases) that is identified by a specific fluorophore. After a color is recorded, the ligated oligonucleotide is cleaved and removed and the process is then repeated.
  • Ion Torrent™ Sequencing
  • Another example of a DNA sequencing technique that can be used in the methods of the provided invention is the IonTorrent system (Life Technologies, Inc.). Ion Torrent uses a high-density array of micro-machined wells to perform this biochemical process in a massively parallel way. Each well holds a different DNA template. Beneath the wells is an ion-sensitive layer and beneath that a proprietary Ion sensor. If a nucleotide, for example a C, is added to a DNA template and is then incorporated into a strand of DNA, a hydrogen ion will be released. The charge from that ion will change the pH of the solution, which can be detected by the proprietary ion sensor. The sequencer will call the base, going directly from chemical information to digital information. The Ion Personal Genome Machine (PGM™) sequencer then sequentially floods the chip with one nucleotide after another. If the next nucleotide that floods the chip is not a match, no voltage change will be recorded and no base will be called. If there are two identical bases on the DNA strand, the voltage will be double, and the chip will record two identical bases called. Because this is direct detection—no scanning, no cameras, no light—each nucleotide incorporation is recorded in seconds.
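  • The flow-by-flow logic described above can be illustrated with a minimal sketch. It assumes an idealized, noise-free signal in which the voltage change of each flow is a whole multiple of the signal from a single incorporation. The function name `call_bases_from_flows`, the flow order, and the example voltages are hypothetical and are not taken from any instrument software.

```python
def call_bases_from_flows(flow_order, signals, unit_signal=1.0):
    """Convert per-flow signals into a base sequence (idealized sketch).

    flow_order:  nucleotides flooded over the chip, in order.
    signals:     voltage change recorded for each flow, assumed proportional
                 to the number of identical bases incorporated in that flow.
    unit_signal: assumed signal produced by a single incorporation.
    """
    if len(flow_order) != len(signals):
        raise ValueError("one signal is required per flow")
    called = []
    for nucleotide, signal in zip(flow_order, signals):
        # Round to the nearest whole number of incorporations; zero means
        # the flooded nucleotide was not a match and no base is called.
        count = max(0, round(signal / unit_signal))
        called.append(nucleotide * count)
    return "".join(called)


# Hypothetical flows: a doubled voltage on the third flow calls two C bases.
flows = "TACGTACG"
voltages = [1.0, 0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 1.0]
sequence = call_bases_from_flows(flows, voltages)
```

  • In this sketch a flow with no voltage change contributes nothing to the read, and a doubled voltage contributes two identical bases, mirroring the behavior described for the chip.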
  • HiSeq™ and MiSeq™ Sequencing
  • Additional examples of sequencing technologies that can be used in the methods of the invention include the HiSEQ™ system (e.g., HiSEQ2000™ and HiSEQ1000™) and the MiSEQ™ system from Illumina, Inc. The HiSEQ™ system is based on massively parallel sequencing of millions of fragments using attachment of randomly fragmented genomic DNA to a planar, optically transparent surface and solid phase amplification to create a high density sequencing flow cell with millions of clusters, each containing about 1,000 copies of template per sq. cm. These templates are sequenced using four-color DNA sequencing-by-synthesis technology. The MiSEQ™ system uses TruSeq, Illumina's reversible terminator-based sequencing-by-synthesis.
  • SOLEXA™ Sequencing
  • Another example of a sequencing technology that can be used in the methods of the invention is SOLEXA sequencing (Illumina). SOLEXA sequencing is based on the amplification of DNA on a solid surface using fold-back PCR and anchored primers. Genomic DNA is fragmented, and adapters are added to the 5′ and 3′ ends of the fragments. DNA fragments that are attached to the surface of flow cell channels are extended and bridge amplified. The fragments become double stranded, and the double stranded molecules are denatured. Multiple cycles of the solid-phase amplification followed by denaturation can create several million clusters of approximately 1,000 copies of single-stranded DNA molecules of the same template in each channel of the flow cell. Primers, DNA polymerase and four fluorophore-labeled, reversibly terminating nucleotides are used to perform sequential sequencing. After nucleotide incorporation, a laser is used to excite the fluorophores, and an image is captured and the identity of the first base is recorded. The 3′ terminators and fluorophores from each incorporated base are removed and the incorporation, detection and identification steps are repeated.
  • SMRT™ Sequencing
  • Another example of a sequencing technology that can be used in the methods of the provided invention includes the single molecule, real-time (SMRT™) technology of Pacific Biosciences. In SMRT™, each of the four DNA bases is attached to one of four different fluorescent dyes. These dyes are phospholinked. A single DNA polymerase is immobilized with a single molecule of template single stranded DNA at the bottom of a zero-mode waveguide (ZMW). A ZMW is a confinement structure which enables observation of incorporation of a single nucleotide by DNA polymerase against the background of fluorescent nucleotides that rapidly diffuse in and out of the ZMW (in microseconds). It takes several milliseconds to incorporate a nucleotide into a growing strand. During this time, the fluorescent label is excited and produces a fluorescent signal, and the fluorescent tag is cleaved off. Detection of the corresponding fluorescence of the dye indicates which base was incorporated. The process is repeated.
  • Nanopore Sequencing
  • Another example of a sequencing technique that can be used in the methods of the provided invention is nanopore sequencing (Soni G V and Meller A. (2007) Clin Chem 53: 1996-2001). A nanopore is a small hole, of the order of 1 nanometer in diameter. Immersion of a nanopore in a conducting fluid and application of a potential across it results in a slight electrical current due to conduction of ions through the nanopore. The amount of current which flows is sensitive to the size of the nanopore. As a DNA molecule passes through a nanopore, each nucleotide on the DNA molecule obstructs the nanopore to a different degree. Thus, the change in the current passing through the nanopore as the DNA molecule passes through the nanopore represents a reading of the DNA sequence.
  • Chemical-Sensitive Field Effect Transistor Array Sequencing
  • Another example of a sequencing technique that can be used in the methods of the provided invention involves using a chemical-sensitive field effect transistor (chemFET) array to sequence DNA (for example, as described in US Patent Application Publication No. 20090026082). In one example of the technique, DNA molecules can be placed into reaction chambers, and the template molecules can be hybridized to a sequencing primer bound to a polymerase. Incorporation of one or more triphosphates into a new nucleic acid strand at the 3′ end of the sequencing primer can be detected by a change in current by a chemFET. An array can have multiple chemFET sensors. In another example, single nucleic acids can be attached to beads, and the nucleic acids can be amplified on the bead, and the individual beads can be transferred to individual reaction chambers on a chemFET array, with each chamber having a chemFET sensor, and the nucleic acids can be sequenced.
  • Sequencing with an Electron Microscope
  • Another example of a sequencing technique that can be used in the methods of the provided invention involves using an electron microscope (Moudrianakis E. N. and Beer M. Proc Natl Acad Sci USA. 1965 March; 53:564-71). In one example of the technique, individual DNA molecules are labeled using metallic labels that are distinguishable using an electron microscope. These molecules are then stretched on a flat surface and imaged using an electron microscope to measure sequences.
  • Any one of the sequencing techniques described herein can be used in the methods of the invention.
  • Digital Counting and Analysis
  • Sequencing allows for the presence of multiple immunoglobulin genes to be detected and quantified in a heterogeneous biological sample.
  • The high throughput sequencing provides a very large dataset, which is then analyzed in order to establish the repertoire.
  • High-throughput analysis can be achieved using one or more bioinformatics tools, such as ALLPATHS (a whole genome shotgun assembler that can generate high quality assemblies from short reads), Arachne (a tool for assembling genome sequences from whole genome shotgun reads, mostly in forward and reverse pairs obtained by sequencing cloned ends), BACCardI (a graphical tool for the validation of genomic assemblies, assisting genome finishing and intergenome comparison), CCRaVAT & QuTie (enables analysis of rare variants in large-scale case control and quantitative trait association studies), CNV-seq (a method to detect copy number variation using high throughput sequencing), Elvira (a set of tools/procedures for high throughput assembly of small genomes (e.g., viruses)), Glimmer (a system for finding genes in microbial DNA, especially the genomes of bacteria, archaea and viruses), gnumap (a program designed to accurately map sequence data obtained from next-generation sequencing machines), Goseq (an R library for performing Gene Ontology and other category based tests on RNA-seq data which corrects for selection bias), ICAtools (a set of programs useful for medium to large scale sequencing projects), LOCAS (a program for assembling short reads of second generation sequencing technology), Maq (builds assembly by mapping short reads to reference sequences), MEME (motif-based sequence analysis tools), NGSView (allows for visualization and manipulation of millions of sequences simultaneously on a desktop computer, through a graphical interface), OSLay (Optimal Syntenic Layout of Unfinished Assemblies), PerM (efficient mapping for short sequencing reads with periodic full sensitive spaced seeds), Projector (automatic contig mapping for gap closure purposes), Qpalma (an alignment tool targeted to align spliced reads produced by sequencing platforms such as Illumina, Solexa, or 454), RazerS (fast read mapping with sensitivity control), SHARCGS (SHort read Assembler based on Robust Contig extension for Genome Sequencing; a DNA assembly program designed for de novo assembly of 25-40mer input fragments and deep sequence coverage), Tablet (next generation sequence assembly visualization), and Velvet (sequence assembler for very short reads).
  • A non-limiting example of data analysis steps is summarized in the flow chart of FIG. 7.
  • Grouping reads with the same molecular tag: Initially sequences are matched based on identical molecular tag.
  • Build a minimum spanning forest for each group: Cluster into subgroups (trees) if Hamming distance is greater than 5%.
  • For each subgroup (or tree), create a vector of sums of correct probabilities for each called base in each read.
  • Construct a consensus read from the base with the maximum sum in each position: Consensus reads are used for mutation analysis and diversity measurement.
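  • The read-grouping and consensus steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of FIG. 7: reads sharing a tag are assumed to have equal length, base qualities are assumed to be Phred scores (so the probability that a call is correct is 1 − 10^(−Q/10)), and the minimum spanning forest is replaced by single-linkage grouping at the 5% threshold, which yields the same subgroups as cutting spanning-tree edges longer than the threshold. All function names are hypothetical.

```python
from collections import defaultdict


def hamming_fraction(a, b):
    """Fraction of mismatched positions between two equal-length reads."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def split_into_subgroups(reads, max_fraction=0.05):
    """Single-linkage grouping of (sequence, qualities) pairs: reads connected
    by a chain of links with Hamming distance <= max_fraction share a subgroup."""
    parent = list(range(len(reads)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(reads)):
        for j in range(i + 1, len(reads)):
            if hamming_fraction(reads[i][0], reads[j][0]) <= max_fraction:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, read in enumerate(reads):
        groups[find(i)].append(read)
    return list(groups.values())


def consensus(subgroup):
    """At each position, pick the base with the largest summed probability
    of being a correct call across all reads in the subgroup."""
    result = []
    for pos in range(len(subgroup[0][0])):
        sums = defaultdict(float)
        for seq, quals in subgroup:
            sums[seq[pos]] += 1.0 - 10 ** (-quals[pos] / 10)
        result.append(max(sums, key=sums.get))
    return "".join(result)


def consensus_by_tag(tagged_reads, max_fraction=0.05):
    """tagged_reads: iterable of (molecular_tag, sequence, phred_scores).
    Returns {tag: sorted list of consensus reads, one per subgroup}."""
    by_tag = defaultdict(list)
    for tag, seq, quals in tagged_reads:
        by_tag[tag].append((seq, quals))
    return {
        tag: sorted(consensus(g) for g in split_into_subgroups(reads, max_fraction))
        for tag, reads in by_tag.items()
    }


# Hypothetical reads: one low-quality mismatch is outvoted by two good reads.
example_reads = [
    ("AAT", "ACGT" * 10, [30] * 40),
    ("AAT", "ACGA" + "ACGT" * 9, [5] * 40),
    ("AAT", "ACGT" * 10, [30] * 40),
    ("AAT", "T" * 40, [30] * 40),
    ("CCG", "G" * 40, [20] * 40),
]
consensus_reads = consensus_by_tag(example_reads)
```

  • The pairwise comparison is quadratic in the number of reads per tag, which is acceptable for a sketch but would likely need a more efficient structure for deeply sequenced tags.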
  • VDJ lineage diversity: VDJ usage is enumerated by the number of observed lineages falling into each VJ, VDJ, VJC, or VDJC (e.g., VDJ) combination at a given read-depth.
  • VDJ and unique sequence abundance histograms: Histograms are plotted by binning VDJ and unique sequence abundances (the latter of which is either clustered or has undergone lineage-analysis filtering and grouping) into log-spaced bins.
  • 3D representation of VJ, VDJ, VJC, or VDJC (e.g., VDJ) usage: Repertoires are represented by applying V-, D-, J-, and/or C-segments to different axes on a three-dimensional plot. Using either abundance (generally read number, which can be bias-normalized) or observed lineage diversity, bubbles of varying sizes are used at each V/D/J/C coordinate to represent the total usage of that combination.
  • Mutation vs. sequence abundance plots: After undergoing lineage analysis, unique sequences are binned by read-number (or bias-normalized abundance) into log-spaced bins. For a given abundance-bin, the number of mutations per unique sequence is averaged, giving a mutation vs. abundance curve.
  • Correlative measures of V, D, J, C, VJ, VDJ, VJC, VDJC, antibody heavy chain, antibody light chain, CDR3, or T-cell receptor usage (Pearson, KL divergence): VJ, VDJ, VJC, or VDJC (e.g., VDJ) combinations are treated as vectors with indexed components v, weighted by either lineage-diversity or abundance for that VDJ combination. Pearson correlations and KL-divergences between each pair of individuals are then calculated over the indices.
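  • The correlative measures above can be sketched as follows, treating each individual's usage as a vector indexed by combination. The helper names, the example VDJ labels, and the counts are hypothetical. The pseudocount added before computing the KL divergence is an assumption introduced here so that combinations absent from one individual do not produce a division by zero.

```python
import math


def usage_vector(usage, combinations):
    """Order a {combination: weight} mapping into a vector over shared indices."""
    return [usage.get(c, 0) for c in combinations]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def kl_divergence(p_counts, q_counts, pseudocount=1.0):
    """KL(P || Q) in bits between two usage vectors, after adding a
    pseudocount (an assumption of this sketch) and normalizing to sum to 1."""
    p = [c + pseudocount for c in p_counts]
    q = [c + pseudocount for c in q_counts]
    p_total, q_total = sum(p), sum(q)
    return sum(
        (a / p_total) * math.log2((a / p_total) / (b / q_total))
        for a, b in zip(p, q)
    )


# Hypothetical abundance-weighted VDJ usage for two individuals.
individual_a = {"V1-D1-J1": 120, "V1-D2-J4": 30, "V3-D3-J6": 5}
individual_b = {"V1-D1-J1": 100, "V1-D2-J4": 45, "V4-D2-J5": 12}
combos = sorted(set(individual_a) | set(individual_b))
vec_a = usage_vector(individual_a, combos)
vec_b = usage_vector(individual_b, combos)
r_ab = pearson(vec_a, vec_b)
kl_ab = kl_divergence(vec_a, vec_b)
```

  • Note that the KL divergence is not symmetric, so KL(A || B) and KL(B || A) may both be reported for each pair of individuals.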
  • The results of the analysis may be referred to herein as an immune repertoire analysis result, which may be represented as a dataset that includes sequence information, representation of V, D, J, C, VJ, VDJ, VJC, VDJC, antibody heavy chain, antibody light chain, CDR3, or T-cell receptor usage, representation for abundance of V, D, J, C, VJ, VDJ, VJC, VDJC, antibody heavy chain, antibody light chain, CDR3, or T-cell receptor and unique sequences; representation of mutation frequency, correlative measures of V, D, J, C, VJ, VDJ, VJC, VDJC, antibody heavy chain, antibody light chain, CDR3, or T-cell receptor usage, etc. Such results may then be output or stored, e.g. in a database of repertoire analyses, and may be used in comparisons with test results, reference results, and the like.
  • After obtaining an immune repertoire analysis result from the sample being assayed, the repertoire can be compared with a reference or control repertoire to make a diagnosis, prognosis, analysis of drug effectiveness, or other desired analysis. A reference or control repertoire may be obtained by the methods of the invention, and will be selected to be relevant for the sample of interest. A test repertoire result can be compared to a single reference/control repertoire result to obtain information regarding the immune capability and/or history of the individual from which the sample was obtained. Alternately, the obtained repertoire result can be compared to two or more different reference/control repertoire results to obtain more in-depth information regarding the characteristics of the test sample. For example, the obtained repertoire result may be compared to a positive and negative reference repertoire result to obtain confirmed information regarding whether the sample has the phenotype of interest. In another example, two “test” repertoires can also be compared with each other. In some cases, a test repertoire is compared to a reference sample and the result is then compared with a result derived from a comparison between a second test repertoire and the same reference sample.
  • Determination or analysis of the difference values, i.e., the difference between two repertoires can be performed using any conventional methodology, where a variety of methodologies are known to those of skill in the array art, e.g., by comparing digital images of the repertoire output, by comparing databases of usage data, etc.
  • A statistical analysis step can then be performed to obtain the weighted contribution of the sequence prevalence, e.g. V, D, J, C, VJ, VDJ, VJC, VDJC, antibody heavy chain, antibody light chain, CDR3, or T-cell receptor usage, mutation analysis, etc. For example, nearest shrunken centroids analysis may be applied as described in Tibshirani et al. (2002) P.N.A.S. 99:6567-6572 to compute the centroid for each class, then compute the average squared distance between a given repertoire and each centroid, normalized by the within-class standard deviation.
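  • The centroid comparison described above can be sketched as follows. This is a simplified nearest-centroid illustration only: it omits the shrinkage step of the nearest shrunken centroids method, and it adds a small offset `s0` to the pooled within-class standard deviation (an assumption made here to avoid division by zero). The class labels and usage fractions are hypothetical.

```python
import math


def class_centroids(samples_by_class):
    """Mean feature vector (e.g., segment usage fractions) for each class."""
    return {
        label: [sum(col) / len(col) for col in zip(*samples)]
        for label, samples in samples_by_class.items()
    }


def pooled_within_class_sd(samples_by_class, centroids, s0=1e-6):
    """Per-feature standard deviation pooled across classes, plus offset s0."""
    n_total = sum(len(s) for s in samples_by_class.values())
    n_classes = len(samples_by_class)
    n_features = len(next(iter(centroids.values())))
    sums = [0.0] * n_features
    for label, samples in samples_by_class.items():
        for sample in samples:
            for i, value in enumerate(sample):
                sums[i] += (value - centroids[label][i]) ** 2
    return [math.sqrt(s / (n_total - n_classes)) + s0 for s in sums]


def classify(sample, centroids, sds):
    """Return the class whose centroid has the smallest average squared
    distance to the sample, with each feature scaled by its pooled SD."""
    scores = {
        label: sum(((x - c) / sd) ** 2 for x, c, sd in zip(sample, centroid, sds))
        / len(sample)
        for label, centroid in centroids.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical training repertoires: usage fractions of three segments.
training = {
    "exposed": [[0.50, 0.30, 0.20], [0.60, 0.25, 0.15]],
    "unexposed": [[0.20, 0.30, 0.50], [0.15, 0.35, 0.50]],
}
centroids = class_centroids(training)
sds = pooled_within_class_sd(training, centroids)
predicted, distances = classify([0.55, 0.30, 0.15], centroids, sds)
```

  • The full method additionally shrinks each class centroid toward the overall centroid, which removes uninformative features; that step is deliberately left out of this sketch.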
  • A statistical analysis may comprise use of a statistical metric (e.g., an entropy metric, an ecology metric, a variation of abundance metric, a species richness metric, or a species heterogeneity metric) in order to characterize diversity of a set of immunological receptors. Methods used to characterize ecological species diversity can also be used in the present invention. See, e.g., Peet, Annu Rev. Ecol. Syst. 5:285 (1974). A statistical metric may also be used to characterize variation of abundance or heterogeneity. An example of an approach to characterize heterogeneity is based on information theory, specifically the Shannon-Weaver entropy, which summarizes the frequency distribution in a single number. See, e.g., Peet, Annu Rev. Ecol. Syst. 5:285 (1974).
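  • As an illustration of the diversity metrics named above, the following sketch computes species richness, Shannon entropy, and an evenness score from a list of clone abundances. The function names and the example abundances are hypothetical. The evenness measure (entropy divided by the logarithm of richness) is one common normalization and is included here as an assumption rather than as a metric specified above.

```python
import math


def shannon_entropy(abundances):
    """Shannon entropy (in nats) of a clone abundance distribution."""
    total = sum(abundances)
    entropy = 0.0
    for count in abundances:
        if count > 0:
            p = count / total
            entropy -= p * math.log(p)
    return entropy


def richness(abundances):
    """Number of distinct receptor sequences observed at least once."""
    return sum(1 for count in abundances if count > 0)


def evenness(abundances):
    """Entropy divided by its maximum possible value, log(richness)."""
    s = richness(abundances)
    return shannon_entropy(abundances) / math.log(s) if s > 1 else 0.0


# Hypothetical repertoires with equal richness but different heterogeneity.
even_repertoire = [25, 25, 25, 25]
skewed_repertoire = [97, 1, 1, 1]
h_even = shannon_entropy(even_repertoire)
h_skewed = shannon_entropy(skewed_repertoire)
```

  • Both example repertoires contain four unique sequences, yet the repertoire dominated by a single clone has a much lower entropy, which is how a single number can summarize the frequency distribution.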
  • The classification can be probabilistically defined, where the cut-off may be empirically derived. In one embodiment of the invention, a probability of about 0.4 can be used to distinguish between individuals exposed and not-exposed to an antigen of interest, more usually a probability of about 0.5, and can utilize a probability of about 0.6 or higher. A “high” probability can be at least about 0.75, at least about 0.7, at least about 0.6, or at least about 0.5. A “low” probability may be not more than about 0.25, not more than 0.3, or not more than 0.4. In many embodiments, the above-obtained information is employed to predict whether a host, subject or patient should be treated with a therapy of interest and to optimize the dose therein.
  • Diagnostics and Prognostics
  • The invention finds use in the prevention, treatment, detection, diagnosis, prognosis, or research into any condition or symptom of any condition, including cancer, inflammatory diseases, autoimmune diseases, allergies and infections of an organism. The organism is preferably a human subject but can also be derived from non-human subjects, e.g., non-human mammals. Examples of non-human mammals include, but are not limited to, non-human primates (e.g., apes, monkeys, gorillas), rodents (e.g., mice, rats), cows, pigs, sheep, horses, dogs, cats, or rabbits.
  • Examples of cancer include prostate, pancreas, colon, brain, lung, breast, bone, and skin cancers. Examples of inflammatory conditions include irritable bowel syndrome, ulcerative colitis, appendicitis, tonsillitis, dermatitis. Examples of atopic conditions include allergy, asthma, etc. Examples of autoimmune diseases include IDDM, RA, MS, SLE, Crohn's disease, Graves' disease, etc. Autoimmune diseases also include Celiac disease, and dermatitis herpetiformis. For example, determination of an immune response to cancer antigens, autoantigens, pathogenic antigens, vaccine antigens, and the like is of interest.
  • In some cases, nucleic acids (e.g., genomic DNA, mRNA, etc.) are obtained from an organism after the organism has been challenged with an antigen (e.g., vaccinated). In other cases, the nucleic acids are obtained from an organism before the organism has been challenged with an antigen (e.g., vaccinated). Comparing the diversity of the immunological receptors present before and after challenge may assist the analysis of the organism's response to the challenge.
  • Methods are also provided for optimizing therapy, by analyzing the immune repertoire in a sample, and based on that information, selecting the appropriate therapy, dose, treatment modality, etc. that is optimal for stimulating or suppressing a targeted immune response, while minimizing undesirable toxicity. The treatment is optimized by selection for a treatment that minimizes undesirable toxicity, while providing for effective activity. For example, a patient may be assessed for the immune repertoire relevant to an autoimmune disease, and a systemic or targeted immunosuppressive regimen may be selected based on that information.
  • A signature repertoire for a condition can refer to an immune repertoire result that indicates the presence of a condition of interest. For example, a history of cancer (or a specific type of allergy) may be reflected in the presence of immune receptor sequences that bind to one or more cancer antigens. The presence of autoimmune disease may be reflected in the presence of immune receptor sequences that bind to autoantigens. A signature can be obtained from all or a part of a dataset; usually a signature will comprise repertoire information from at least about 100 different immune receptor sequences, at least about 10² different immune receptor sequences, at least about 10³ different immune receptor sequences, at least about 10⁴ different immune receptor sequences, at least about 10⁵ different immune receptor sequences, or more. Where a subset of the dataset is used, the subset may comprise, for example, alpha TCR, beta TCR, MHC, IgH, IgL, or combinations thereof.
  • The classification methods described herein are of interest as a means of detecting the earliest changes along a disease pathway (e.g., a carcinogenesis pathway, inflammatory pathway, etc.), and/or to monitor the efficacy of various therapies and preventive interventions.
  • The methods disclosed herein can also be utilized to analyze the effects of agents on cells of the immune system. For example, analysis of changes in immune repertoire following exposure to one or more test compounds can be performed to analyze the effect(s) of the test compounds on an individual. Such analyses can be useful for multiple purposes, for example in the development of immunosuppressive or immune enhancing therapies.
  • Agents to be analyzed for potential therapeutic value can be any compound, small molecule, protein, lipid, carbohydrate, nucleic acid or other agent appropriate for therapeutic use. Preferably tests are performed in vivo, e.g. using an animal model, to determine effects on the immune repertoire.
  • Agents of interest for screening include known and unknown compounds that encompass numerous chemical classes, primarily organic molecules, which may include organometallic molecules, genetic sequences, etc. An important aspect of the invention is to evaluate candidate drugs, including toxicity testing; and the like.
  • In addition to complex biological agents, candidate agents include organic molecules comprising functional groups necessary for structural interactions, particularly hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, frequently at least two of the functional chemical groups. The candidate agents can comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups. Candidate agents can also be found among biomolecules, including peptides, polynucleotides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs or combinations thereof. In some instances, test compounds may have known functions (e.g., relief of oxidative stress), but may act through an unknown mechanism or act on an unknown target. Included are pharmacologically active drugs, genetically active molecules, etc. Compounds of interest include chemotherapeutic agents, hormones or hormone antagonists, etc. Exemplary of pharmaceutical agents suitable for this invention are those described in “The Pharmacological Basis of Therapeutics,” Goodman and Gilman, McGraw-Hill, New York, N.Y., (1996), Ninth edition, under the sections: Water, Salts and Ions; Drugs Affecting Renal Function and Electrolyte Metabolism; Drugs Affecting Gastrointestinal Function; Chemotherapy of Microbial Diseases; Chemotherapy of Neoplastic Diseases; Drugs Acting on Blood-Forming organs; Hormones and Hormone Antagonists; Vitamins, Dermatology; and Toxicology, all incorporated herein by reference. Also included are toxins, and biological and chemical warfare agents, for example see Somani, S. M. (Ed.), “Chemical Warfare Agents,” Academic Press, New York, 1992.
  • Test compounds include all of the classes of molecules described above, and can further comprise samples of unknown content. Of interest are complex mixtures of naturally occurring compounds derived from natural sources such as plants, fungi, bacteria, protists or animals. While many samples will comprise compounds in solution, solid samples that can be dissolved in a suitable solvent may also be assayed. Samples of interest include environmental samples, e.g., ground water, sea water, mining waste, etc., biological samples, e.g. lysates prepared from crops, tissue samples, etc.; manufacturing samples, e.g. time course during preparation of pharmaceuticals; as well as libraries of compounds prepared for analysis; and the like (e.g., compounds being assessed for potential therapeutic value, i.e., drug candidates).
  • Samples or compounds can also include additional components, for example components that affect the ionic strength, pH, total protein concentration, etc. In addition, the samples may be treated to achieve at least partial fractionation or concentration. Biological samples may be stored if care is taken to reduce degradation of the compound, e.g. under nitrogen, frozen, or a combination thereof. The volume of sample used is sufficient to allow for measurable detection, for example from about 0.1 ml to 1 ml of a biological sample can be sufficient.
  • Compounds, including candidate agents, are obtained from a wide variety of sources including libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds, including biomolecules, including expression of randomized oligonucleotides and oligopeptides. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. Additionally, natural or synthetically produced libraries and compounds are readily modified through conventional chemical, physical and biochemical means, and may be used to produce combinatorial libraries. Known pharmacological agents may be subjected to directed or random chemical modifications, such as acylation, alkylation, esterification, amidification, etc. to produce structural analogs.
  • Some agent formulations do not include additional components, such as preservatives, that may have a significant effect on the overall formulation. Thus, such formulations consist essentially of a biologically active compound and a physiologically acceptable carrier, e.g. water, ethanol, DMSO, etc. However, if a compound is liquid without a solvent, the formulation may consist essentially of the compound itself.
  • Databases of Expression Repertoires and Data Analysis
  • Also provided are databases of immune repertoires or of sets of immunological receptors. Such databases can typically comprise repertoire results derived from various individual conditions, such as individuals having exposure to a vaccine, to a cancer, having an autoimmune disease of interest, infection with a pathogen, and the like. Such databases can also include sequences of immunological receptors derived from synthetic libraries, or from other artificial methods. The repertoire results and databases thereof may be provided in a variety of media to facilitate their use. “Media” refers to a manufacture that contains the expression repertoire information of the present invention. The databases of the present invention can be recorded on computer readable media, e.g. any medium that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. One of skill in the art can readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising a recording of the present database information. “Recorded” refers to a process for storing information on computer readable medium, using any such methods as known in the art. Any convenient data storage structure may be chosen, based on the means used to access the stored information. A variety of data processor programs and formats can be used for storage, e.g. word processing text file, database format, etc.
  • As used herein, “a computer-based system” refers to the hardware means, software means, and data storage means used to analyze the information of the present invention. The minimum hardware of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available computer-based systems is suitable for use in the present invention. The data storage means may comprise any manufacture comprising a recording of the present information as described above, or a memory access means that can access such a manufacture.
  • A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. Such presentation provides a skilled artisan with a ranking of similarities and identifies the degree of similarity contained in the test expression repertoire.
  • A scaled approach may also be taken to the data analysis. For example, Pearson correlation of the repertoire results can provide a quantitative score reflecting the signature for each sample. The higher the correlation value, the more the sample resembles a reference repertoire. A negative correlation value indicates the opposite behavior. The threshold for the classification can be moved up or down from zero depending on the clinical goal.
  • To provide significance ordering, the false discovery rate (FDR) may be determined.
  • First, a set of null distributions of dissimilarity values is generated. In one embodiment, the values of observed repertoires are permuted to create a sequence of distributions of correlation coefficients obtained out of chance, thereby creating an appropriate set of null distributions of correlation coefficients (see Tusher et al. (2001) PNAS 98, 5116-21, herein incorporated by reference). The set of null distributions is obtained by: permuting the values of each repertoire for all available repertoires; calculating the pairwise correlation coefficients for all repertoire results; calculating the probability density function of the correlation coefficients for this permutation; and repeating the procedure N times, where N is a large number, usually 300. Using the N distributions, one calculates an appropriate measure (mean, median, etc.) of the count of correlation coefficient values that exceed the value (of similarity) that is obtained from the distribution of experimentally observed similarity values at a given significance level.
  • The FDR is the ratio of the number of the expected falsely significant correlations (estimated from the correlations greater than this selected Pearson correlation in the set of randomized data) to the number of correlations greater than this selected Pearson correlation in the empirical data (significant correlations). This cut-off correlation value may be applied to the correlations between experimental repertoires.
  • Using the aforementioned distribution, a level of confidence is chosen for significance. This is used to determine the lowest value of the correlation coefficient that exceeds the result that would have been obtained by chance. Using this method, one obtains thresholds for positive correlation, negative correlation or both. Using the threshold(s), the user can filter the observed values of the pairwise correlation coefficients and eliminate those that do not exceed the threshold(s). Furthermore, an estimate of the false positive rate can be obtained for a given threshold. For each of the individual “random correlation” distributions, one can find how many observations fall outside the threshold range. This procedure provides a sequence of counts. The mean and the standard deviation of the sequence provide the average number of potential false positives and its standard deviation.
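  • The permutation procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the disclosure: repertoires are assumed to be equal-length numeric vectors, only the positive-correlation threshold is handled, and the function names, the seed parameter, and the quantile rule used to pick the cut-off are hypothetical choices made for the example.

```python
import random
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def pairwise_correlations(repertoires):
    """Correlation coefficient for every pair of repertoires."""
    return [pearson(a, b) for a, b in combinations(repertoires, 2)]


def null_rounds(repertoires, n_permutations=300, seed=0):
    """One list of chance correlations per permutation round.

    In each round the values of every repertoire are shuffled independently.
    """
    rng = random.Random(seed)
    rounds = []
    for _ in range(n_permutations):
        shuffled = [rng.sample(r, len(r)) for r in repertoires]
        rounds.append(pairwise_correlations(shuffled))
    return rounds


def cutoff_from_null(rounds, alpha=0.05):
    """Assumed rule: the (1 - alpha) quantile of the pooled null correlations."""
    pooled = sorted(r for rnd in rounds for r in rnd)
    index = min(len(pooled) - 1, int((1 - alpha) * len(pooled)))
    return pooled[index]


def estimate_fdr(repertoires, cutoff, n_permutations=300, seed=0):
    """Expected false positives (from the null) over observed significant pairs."""
    observed = pairwise_correlations(repertoires)
    n_significant = sum(r > cutoff for r in observed)
    counts = [sum(r > cutoff for r in rnd)
              for rnd in null_rounds(repertoires, n_permutations, seed)]
    expected_false = mean(counts)
    fdr = min(1.0, expected_false / n_significant) if n_significant else None
    return {
        "n_significant": n_significant,
        "expected_false": expected_false,
        "false_sd": stdev(counts),  # requires n_permutations >= 2
        "fdr": fdr,
    }


# Hypothetical repertoires measured over the same eight clonotypes.
data = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [2, 4, 6, 8, 10, 12, 14, 17],
    [1, 3, 2, 5, 4, 7, 6, 9],
]
print(estimate_fdr(data, cutoff=0.8))
```

  • The mean and standard deviation of the per-round counts correspond to the average number of potential false positives and its standard deviation. A threshold for negative correlation could be handled symmetrically by counting values below a lower cut-off.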
  • The data can be subjected to non-supervised hierarchical clustering to reveal relationships among repertoires. For example, hierarchical clustering may be performed, where the Pearson correlation is employed as the clustering metric. Clustering of the correlation matrix, e.g. using multidimensional scaling, enhances the visualization of functional homology similarities and dissimilarities. Multidimensional scaling (MDS) can be applied in one, two or three dimensions.
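  • As an illustration of correlation-based clustering, the sketch below performs a small agglomerative (average-linkage) clustering in which the distance between two repertoires is taken as one minus their Pearson correlation. The choice of average linkage, the 1 - r distance, the function names, and the example data are assumptions made for this sketch; the disclosure does not prescribe a particular linkage rule, and multidimensional scaling is not shown.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def cluster_repertoires(repertoires, n_clusters):
    """Average-linkage agglomerative clustering of named repertoires.

    `repertoires` maps a sample name to its vector of values.
    Returns a sorted list of clusters, each a sorted tuple of sample names.
    """
    names = list(repertoires)
    # Distance between two samples: 1 - Pearson correlation (0 = identical trend).
    dist = {
        frozenset((a, b)): 1.0 - pearson(repertoires[a], repertoires[b])
        for a, b in combinations(names, 2)
    }
    clusters = [(name,) for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Average distance over all cross-cluster sample pairs.
            d = mean(dist[frozenset((a, b))]
                     for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(tuple(sorted(c)) for c in clusters)


# Hypothetical samples: A and B trend downward, C and D trend upward.
samples = {
    "A": [5, 4, 3, 2, 1],
    "B": [10, 8, 7, 4, 1],
    "C": [1, 2, 3, 4, 5],
    "D": [1, 3, 4, 8, 10],
}
print(cluster_repertoires(samples, n_clusters=2))
```

  • For larger datasets, an established numerical library would normally replace this quadratic pure-Python loop; the sketch is intended only to make the metric and the merging step explicit.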
  • The analysis may be implemented in hardware or software, or a combination of both. In one embodiment of the invention, a machine-readable storage medium is provided, the medium comprising a data storage material encoded with machine readable data which, when using a machine programmed with instructions for using said data, is capable of displaying any of the datasets and data comparisons of this invention. Such data may be used for a variety of purposes, such as drug discovery, analysis of interactions between cellular components, and the like. In some embodiments, the invention is implemented in computer programs executing on programmable computers, comprising a processor, a data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Program code is applied to input data to perform the functions described above and generate output information. The output information is applied to one or more output devices, in known fashion. The computer may be, for example, a personal computer, microcomputer, or workstation of conventional design.
  • Each program can be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Each such computer program can be stored on a storage medium or device (e.g., ROM or magnetic diskette) readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described herein. The system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein. A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. One format for an output tests datasets possessing varying degrees of similarity to a trusted repertoire. Such presentation provides a skilled artisan with a ranking of similarities and identifies the degree of similarity contained in the test repertoire.
  • Storing and Transmission of Data
  • Further provided herein is a method of storing and/or transmitting, via computer, sequence and other data collected by the methods disclosed herein. Any computer or computer accessory including, but not limited to, software and storage devices, can be utilized to practice the present invention. Sequence or other data (e.g., immune repertoire analysis results) can be input into a computer by a user either directly or indirectly. Additionally, any of the devices which can be used to sequence DNA or analyze DNA or analyze immune repertoire data can be linked to a computer, such that the data is transferred to a computer and/or computer-compatible storage device. Data can be stored on a computer or suitable storage device (e.g., CD). Data can also be sent from a computer to another computer or data collection point via methods well known in the art (e.g., the internet, ground mail, air mail). Thus, data collected by the methods described herein can be collected at any point or geographical location and sent to any other geographical location.
  • Reagents and Kits
  • Also provided are reagents and kits thereof for practicing one or more of the above-described methods. The subject reagents and kits thereof may vary greatly. Reagents of interest include reagents specifically designed for use in production of the above described immune repertoire analysis. For example, reagents can include primer sets for cDNA synthesis, for PCR amplification and/or for high throughput sequencing of a class or subtype of immunological receptors. Gene specific primers and methods for using the same are described in U.S. Pat. No. 5,994,076, the disclosure of which is herein incorporated by reference. The gene specific primer collections can include only primers for immunological receptors, or they may include primers for additional genes, e.g., housekeeping genes, controls, etc.
  • The kits of the subject invention can include the above described gene specific primer collections. The kits can further include a software package for statistical analysis, and may include a reference database for calculating the probability of a match between two repertoires. The kit may include reagents employed in the various methods, such as primers for generating target nucleic acids, dNTPs and/or rNTPs, which may be either premixed or separate, one or more uniquely labeled dNTPs and/or rNTPs, such as biotinylated or Cy3 or Cy5 tagged dNTPs, gold or silver particles with different scattering spectra, or other post synthesis labeling reagent, such as chemically active derivatives of fluorescent dyes, enzymes, such as reverse transcriptases, DNA polymerases, RNA polymerases, and the like, various buffer mediums, e.g. hybridization and washing buffers, prefabricated probe arrays, labeled probe purification reagents and components, like spin columns, etc., signal generation and detection reagents, e.g. streptavidin-alkaline phosphatase conjugate, chemifluorescent or chemiluminescent substrate, and the like.
  • In addition to the above components, the subject kits will further include instructions for practicing the subject methods. These instructions may be present in the subject kits in a variety of forms, one or more of which may be present in the kit. One form in which these instructions may be present is as printed information on a suitable medium or substrate, e.g., a piece or pieces of paper on which the information is printed, in the packaging of the kit, in a package insert, etc. Yet another means would be a computer readable medium, e.g., diskette, CD, etc., on which the information has been recorded. Yet another means that may be present is a website address which may be used via the internet to access the information at a removed site. Any convenient means may be present in the kits.
  • The above-described analytical methods may be embodied as a program of instructions executable by computer to perform the different aspects of the invention. Any of the techniques described above may be performed by means of software components loaded into a computer or other information appliance or digital device. When so enabled, the computer, appliance or device may then perform the above-described techniques to assist the analysis of sets of values associated with a plurality of genes in the manner described above, or for comparing such associated values. The software component may be loaded from a fixed media or accessed through a communication medium such as the internet or other type of computer network. The above features, embodied in one or more computer programs, may be performed by one or more computers running such programs.
  • Software products (or components) may be tangibly embodied in a machine-readable medium, and comprise instructions operable to cause one or more data processing apparatus to perform operations comprising: a) clustering sequence data from a plurality of immunological receptors or fragments thereof; and b) providing a statistical analysis output on said sequence data. Also provided herein are software products (or components) tangibly embodied in a machine-readable medium, and that comprise instructions operable to cause one or more data processing apparatus to perform operations comprising: storing sequence data for more than 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11, or 10^12 immunological receptors or more than 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11, or 10^12 sequence reads.
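  • One way such a clustering operation might look in software is sketched below, following the steps recited in the claims: sequence reads are grouped by molecular tag, sequences within each group are clustered, and a consensus sequence is built for each cluster. The greedy seed-based clustering, the one-mismatch tolerance, the majority-vote consensus, the function names, and the example reads are all assumptions of this sketch rather than details given in the disclosure. Reads within a cluster are assumed to have equal length.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cluster_sequences(seqs, max_mismatches=1):
    """Greedy clustering: the most abundant sequences seed the clusters.

    A sequence joins the first cluster whose seed has the same length and
    differs at no more than `max_mismatches` positions; otherwise it seeds
    a new cluster.
    """
    clusters = []  # list of (seed, members)
    for seq, n in Counter(seqs).most_common():
        for seed, members in clusters:
            if len(seed) == len(seq) and hamming(seed, seq) <= max_mismatches:
                members.extend([seq] * n)
                break
        else:
            clusters.append((seq, [seq] * n))
    return [members for _, members in clusters]


def consensus(seqs):
    """Majority base at each position of equal-length sequences."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


def consensus_by_tag(reads, max_mismatches=1):
    """Map each molecular tag to the consensus sequences of its clusters.

    `reads` is an iterable of (molecular_tag, sequence) pairs.
    """
    groups = defaultdict(list)
    for tag, seq in reads:
        groups[tag].append(seq)
    return {
        tag: [consensus(c) for c in cluster_sequences(seqs, max_mismatches)]
        for tag, seqs in groups.items()
    }


# Hypothetical reads with 9-mer molecular tags; the third read carries
# a single-base error relative to the first two.
reads = [
    ("AAACCCGGG", "ACGTACGT"),
    ("AAACCCGGG", "ACGTACGT"),
    ("AAACCCGGG", "ACGTACGA"),
    ("TTTGGGCCC", "TTTTCCCC"),
]
print(consensus_by_tag(reads))
```

  • Counting the distinct consensus sequences across all tags then gives one simple input for estimating repertoire diversity.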
  • In some examples, a software product (or component) includes instructions for assigning the sequence data into V, D, J, C, VJ, VDJ, VJC, VDJC, or VJ/VDJ lineage usage classes or instructions for displaying an analysis output in a multi-dimensional plot. In some cases, a multidimensional plot enumerates all possible values for one of the following: V, D, J, or C (e.g., a three-dimensional plot that includes one axis that enumerates all possible V values, a second axis that enumerates all possible D values, and a third axis that enumerates all possible J values). In some cases, a software product (or component) includes instructions for identifying one or more unique patterns from a single sample correlated to a condition. The software product (or component) may also include instructions for normalizing for amplification bias. In some examples, the software product (or component) may include instructions for using control data to normalize for sequencing errors or for using a clustering process to reduce sequencing errors. A software product (or component) may also include instructions for using two separate primer sets or a PCR filter to reduce sequencing errors.
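  • The assignment of sequence data into usage classes can be illustrated with a short tally. The sketch assumes that each read has already been annotated with its V, D and J gene calls by an upstream alignment step that is not shown; the record layout, the function names, and the example records are hypothetical. The resulting counts, keyed by (V, D, J), are the kind of data that could be displayed in a three-dimensional plot with one axis per segment.

```python
from collections import Counter


def usage_counts(records, segments=("V", "D", "J")):
    """Count reads per combination of the requested gene segments.

    `records` is an iterable of dicts such as {"V": ..., "D": ..., "J": ...}.
    A missing segment (e.g. no D call for a VJ rearrangement) is counted as None.
    """
    return Counter(tuple(rec.get(s) for s in segments) for rec in records)


def axis_values(counts, position):
    """Sorted distinct gene names seen at one position of the count keys."""
    return sorted({key[position] for key in counts if key[position] is not None})


# Hypothetical annotated heavy-chain reads.
records = [
    {"V": "IGHV1-69", "D": "IGHD3-10", "J": "IGHJ4"},
    {"V": "IGHV1-69", "D": "IGHD3-10", "J": "IGHJ4"},
    {"V": "IGHV3-23", "D": "IGHD2-2", "J": "IGHJ6"},
]

vdj = usage_counts(records)                      # VDJ usage class
vj = usage_counts(records, segments=("V", "J"))  # VJ usage class
print(vdj.most_common())
print(axis_values(vdj, 0))  # values enumerated along the V axis
```

  • Changing the `segments` argument yields the other usage classes (for example V only, or VJ), so the same tally can feed plots of different dimensionality.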
  • INCORPORATION BY REFERENCE
  • References and citations to other documents, such as patents, patent applications, patent publications, journals, books, papers, web contents, have been made throughout this disclosure. All such documents are hereby incorporated herein by reference in their entirety for all purposes.
  • EQUIVALENTS
  • The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting on the invention described herein. Scope of the invention is thus indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (15)

What is claimed is:
1. A method of determining the immune repertoire in a subject, comprising:
a. isolating a plurality of RNA from a biological sample comprising a plurality of cell types obtained from a subject,
b. producing immunoglobulin chain or TCR chain cDNAs from the RNA;
c. adding a homopolymeric tail, a random molecular tag and a universal sequence to the 3′ end of the cDNAs;
d. amplifying the cDNAs by using one or more immunoglobulin chain or TCR chain specific primers and a universal sequence specific primer comprising a flanking sequence specific to the sequencing platform to produce a plurality of molecular tagged immunoglobulin chain or TCR chain nucleic acids;
e. sequencing the amplified immunoglobulin chain or TCR chain nucleic acids to produce a plurality of sequencing reads thereby determining the immune repertoire of the subject.
2. The method of claim 1, further comprising a data analysis step after step (e).
3. The method of claim 2, wherein the data analysis includes grouping sequence reads with the same molecular tag and clustering sequences within the same group.
4. The method of claim 3, further comprising building a consensus sequence for each cluster to produce a collection of consensus sequences.
5. The method of claim 4, wherein the collection of consensus sequences is used to determine the diversity of the immune repertoire.
6. The method of claim 1, wherein the molecular tag is an oligomer.
5. The method of claim 6, wherein the oligomer is at least a 9mer.
7. The method of claim 1, wherein the biological sample is blood or a fraction thereof.
8. The method of claim 7, wherein the blood is peripheral whole blood.
9. The method of claim 7, wherein the blood fraction comprises peripheral blood mononuclear cells.
10. The method of claim 7, wherein the blood sample is sorted based upon extracellular or intracellular markers.
11. The method of claim 1, wherein the immunoglobulin chain is the heavy chain or the light chain.
12. The method of claim 11, wherein the immunoglobulin heavy chain comprises the immunoglobulin VDJ and constant regions.
13. The method of claim 1, wherein the TCR chain is the alpha chain, the beta chain, the gamma chain or the delta chain.
14. The method of claim 13, wherein the TCR chain comprises the VJ and constant regions.
US14/776,141 2013-03-15 2014-03-14 Methods of sequencing the immune repertoire Abandoned US20160040234A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/776,141 US20160040234A1 (en) 2013-03-15 2014-03-14 Methods of sequencing the immune repertoire

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201361801785P 2013-03-15 2013-03-15
US201361806143P 2013-03-28 2013-03-28
PCT/US2014/029241 WO2014144713A2 (en) 2013-03-15 2014-03-14 Methods of sequencing the immune repertoire
US14/776,141 US20160040234A1 (en) 2013-03-15 2014-03-14 Methods of sequencing the immune repertoire

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2014/029241 A-371-Of-International WO2014144713A2 (en) 2013-03-15 2014-03-14 Methods of sequencing the immune repertoire

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/018,060 Continuation US20210001302A1 (en) 2013-03-15 2020-09-11 Methods of sequencing the immune repertoire

Publications (1)

Publication Number Publication Date
US20160040234A1 true US20160040234A1 (en) 2016-02-11

Family

ID=50686173

Family Applications (6)

Application Number Title Priority Date Filing Date
US14/776,177 Active 2034-11-06 US10058839B2 (en) 2013-03-15 2014-03-14 Methods and compositions for tagging and analyzing samples
US14/776,141 Abandoned US20160040234A1 (en) 2013-03-15 2014-03-14 Methods of sequencing the immune repertoire
US16/043,060 Active 2034-09-21 US10722858B2 (en) 2013-03-15 2018-07-23 Methods and compositions for tagging and analyzing samples
US16/884,518 Active US11161087B2 (en) 2013-03-15 2020-05-27 Methods and compositions for tagging and analyzing samples
US17/018,060 Abandoned US20210001302A1 (en) 2013-03-15 2020-09-11 Methods of sequencing the immune repertoire
US17/486,771 Pending US20220176335A1 (en) 2013-03-15 2021-09-27 Methods and compositions for tagging and analyzing samples

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US14/776,177 Active 2034-11-06 US10058839B2 (en) 2013-03-15 2014-03-14 Methods and compositions for tagging and analyzing samples

Family Applications After (4)

Application Number Title Priority Date Filing Date
US16/043,060 Active 2034-09-21 US10722858B2 (en) 2013-03-15 2018-07-23 Methods and compositions for tagging and analyzing samples
US16/884,518 Active US11161087B2 (en) 2013-03-15 2020-05-27 Methods and compositions for tagging and analyzing samples
US17/018,060 Abandoned US20210001302A1 (en) 2013-03-15 2020-09-11 Methods of sequencing the immune repertoire
US17/486,771 Pending US20220176335A1 (en) 2013-03-15 2021-09-27 Methods and compositions for tagging and analyzing samples

Country Status (7)

Country Link
US (6) US10058839B2 (en)
EP (5) EP3327123B1 (en)
CN (3) CN105189748B (en)
CA (2) CA2905517A1 (en)
DK (4) DK3611262T3 (en)
HK (1) HK1255869A1 (en)
WO (2) WO2014144822A2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10058839B2 (en) 2013-03-15 2018-08-28 Lineage Biosciences, Inc. Methods and compositions for tagging and analyzing samples
WO2019152108A1 (en) * 2018-02-05 2019-08-08 The Board Of Trustees Of The Leland Stanford Junior University Systems and methods for multiplexed measurements in single and ensemble cells
US11319590B2 (en) * 2017-01-31 2022-05-03 Ludwig Institute For Cancer Research Ltd. Enhanced immune cell receptor sequencing methods

Families Citing this family (77)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8835358B2 (en) 2009-12-15 2014-09-16 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
GB201106254D0 (en) 2011-04-13 2011-05-25 Frisen Jonas Method and product
SG11201405274WA (en) 2012-02-27 2014-10-30 Cellular Res Inc Compositions and kits for molecular counting
US9879313B2 (en) 2013-06-25 2018-01-30 Prognosys Biosciences, Inc. Methods and systems for determining spatial patterns of biological targets in a sample
KR20230074639A (en) 2013-08-28 2023-05-30 벡톤 디킨슨 앤드 컴퍼니 Massively parallel single cell analysis
WO2016081919A1 (en) * 2014-11-20 2016-05-26 Icahn School Of Medicine At Mount Sinai Methods for determining recombination diversity at a genomic locus
EP3763825B1 (en) * 2015-01-23 2023-10-04 Qiagen Sciences, LLC High multiplex pcr with molecular barcoding
ES2836802T3 (en) 2015-02-27 2021-06-28 Becton Dickinson Co Spatially addressable molecular barcodes
WO2016160844A2 (en) 2015-03-30 2016-10-06 Cellular Research, Inc. Methods and compositions for combinatorial barcoding
CN113186256A (en) 2015-04-10 2021-07-30 空间转录公司 Spatially differentiated, multiplexed nucleic acid analysis of biological samples
US10590461B2 (en) * 2015-04-21 2020-03-17 General Automation Lab Technologies Inc. High resolution systems, kits, apparatus, and methods using magnetic beads for high throughput microbiology applications
US11390914B2 (en) 2015-04-23 2022-07-19 Becton, Dickinson And Company Methods and compositions for whole transcriptome amplification
DK3822365T3 (en) * 2015-05-11 2023-02-06 Illumina Inc Platform for discovery and analysis of therapeutics
CN108026524A (en) 2015-09-11 2018-05-11 赛卢拉研究公司 Method and composition for nucleic acid library standardization
JP6934234B2 (en) 2015-11-25 2021-09-15 一般財団法人生産技術研究奨励会 Microchamber array device and analysis method of inspection object using it
JP6954899B2 (en) 2015-12-04 2021-10-27 10エックス ゲノミクス,インコーポレイテッド Methods and compositions for nucleic acid analysis
AU2017259794B2 (en) * 2016-05-02 2023-04-13 Encodia, Inc. Macromolecule analysis employing nucleic acid encoding
US10301677B2 (en) 2016-05-25 2019-05-28 Cellular Research, Inc. Normalization of nucleic acid libraries
US10640763B2 (en) 2016-05-31 2020-05-05 Cellular Research, Inc. Molecular indexing of internal sequences
US10202641B2 (en) 2016-05-31 2019-02-12 Cellular Research, Inc. Error correction in amplification of samples
CN106047857B (en) * 2016-06-01 2020-04-03 苏州金唯智生物科技有限公司 Method for discovering specific functional antibody
ES2903357T3 (en) 2016-09-15 2022-04-01 Archerdx Llc Nucleic Acid Sample Preparation Methods
CA3034924A1 (en) 2016-09-26 2018-03-29 Cellular Research, Inc. Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11667951B2 (en) 2016-10-24 2023-06-06 Geneinfosec, Inc. Concealing information present within nucleic acids
JP7161991B2 (en) * 2016-11-02 2022-10-27 アーチャーディーエックス, エルエルシー Methods of Nucleic Acid Sample Preparation for Immune Repertoire Sequencing
WO2018089860A1 (en) 2016-11-11 2018-05-17 2D Genomics Inc. Methods for processing nucleic acid samples
US11319583B2 (en) 2017-02-01 2022-05-03 Becton, Dickinson And Company Selective amplification using blocking oligonucleotides
WO2018209625A1 (en) * 2017-05-18 2018-11-22 北京吉因加科技有限公司 Analysis system for peripheral blood-based non-invasive detection of lesion immune repertoire diversity and uses of system
US10676779B2 (en) 2017-06-05 2020-06-09 Becton, Dickinson And Company Sample indexing for single cells
CN111344416A (en) * 2017-09-01 2020-06-26 生命技术公司 Compositions and methods for immunohistorian sequencing
CA3076367A1 (en) * 2017-09-22 2019-03-28 University Of Washington In situ combinatorial labeling of cellular molecules
CN111148836A (en) * 2017-09-28 2020-05-12 深圳华大生命科学研究院 PCR primer for detection and application thereof
KR102556494B1 (en) 2017-10-31 2023-07-18 엔코디아, 인코포레이티드 Kits for assays using nucleic acid encoding and/or labeling
CN107868837B (en) 2017-12-12 2019-03-01 苏州普瑞森基因科技有限公司 It is a kind of for analyzing the Primer composition and its application of enteric microorganism
CN109918688B (en) * 2017-12-12 2023-09-08 上海翼锐汽车科技有限公司 Vehicle body shape uniform matching method based on entropy principle
EP4234717A3 (en) 2018-05-03 2023-11-01 Becton, Dickinson and Company High throughput multiomics sample analysis
JP7358388B2 (en) 2018-05-03 2023-10-10 ベクトン・ディキンソン・アンド・カンパニー Molecular barcoding at opposite transcript ends
US11519033B2 (en) 2018-08-28 2022-12-06 10X Genomics, Inc. Method for transposase-mediated spatial tagging and analyzing genomic DNA in a biological sample
EP3861134A1 (en) 2018-10-01 2021-08-11 Becton, Dickinson and Company Determining 5' transcript sequences
CN112969789A (en) 2018-11-08 2021-06-15 贝克顿迪金森公司 Single cell whole transcriptome analysis using random priming
EP3894590A2 (en) 2018-12-10 2021-10-20 10X Genomics, Inc. Methods of using master / copy arrays for spatial detection
WO2020123384A1 (en) 2018-12-13 2020-06-18 Cellular Research, Inc. Selective extension in single cell whole transcriptome analysis
US11649485B2 (en) 2019-01-06 2023-05-16 10X Genomics, Inc. Generating capture probes for spatial analysis
US11926867B2 (en) 2019-01-06 2024-03-12 10X Genomics, Inc. Generating capture probes for spatial analysis
EP4242322A3 (en) 2019-01-23 2023-09-20 Becton, Dickinson and Company Oligonucleotides associated with antibodies
WO2020243579A1 (en) 2019-05-30 2020-12-03 10X Genomics, Inc. Methods of detecting spatial heterogeneity of a biological sample
EP4004231A1 (en) 2019-07-22 2022-06-01 Becton, Dickinson and Company Single cell chromatin immunoprecipitation sequencing assay
KR20220041211A (en) 2019-08-09 2022-03-31 넛크래커 테라퓨틱스 인코포레이티드 Manufacturing methods and apparatus for removing substances from therapeutic compositions
US11773436B2 (en) 2019-11-08 2023-10-03 Becton, Dickinson And Company Using random priming to obtain full-length V(D)J information for immune repertoire sequencing
FI3891300T3 (en) 2019-12-23 2023-05-10 10X Genomics Inc Methods for spatial analysis using rna-templated ligation
CN115244184A (en) 2020-01-13 2022-10-25 贝克顿迪金森公司 Methods and compositions for quantifying protein and RNA
CA3167722A1 (en) * 2020-01-13 2021-07-22 Fluent Biosciences Inc. Reverse transcription during template emulsification
US11702693B2 (en) 2020-01-21 2023-07-18 10X Genomics, Inc. Methods for printing cells and generating arrays of barcoded cells
US11732299B2 (en) 2020-01-21 2023-08-22 10X Genomics, Inc. Spatial assays with perturbed cells
US11898205B2 (en) 2020-02-03 2024-02-13 10X Genomics, Inc. Increasing capture efficiency of spatial assays
US11732300B2 (en) 2020-02-05 2023-08-22 10X Genomics, Inc. Increasing efficiency of spatial analysis in a biological sample
US11891654B2 (en) 2020-02-24 2024-02-06 10X Genomics, Inc. Methods of making gene expression libraries
ES2965354T3 (en) 2020-04-22 2024-04-12 10X Genomics Inc Methods for spatial analysis using targeted RNA deletion
US11661625B2 (en) 2020-05-14 2023-05-30 Becton, Dickinson And Company Primers for immune repertoire profiling
WO2021236929A1 (en) 2020-05-22 2021-11-25 10X Genomics, Inc. Simultaneous spatio-temporal measurement of gene expression and cellular activity
AU2021275906A1 (en) 2020-05-22 2022-12-22 10X Genomics, Inc. Spatial analysis to detect sequence variants
WO2021242834A1 (en) 2020-05-26 2021-12-02 10X Genomics, Inc. Method for resetting an array
CN116249785A (en) 2020-06-02 2023-06-09 10X基因组学有限公司 Space transcriptomics for antigen-receptor
EP4025692A2 (en) 2020-06-02 2022-07-13 10X Genomics, Inc. Nucleic acid library methods
WO2021252499A1 (en) 2020-06-08 2021-12-16 10X Genomics, Inc. Methods of determining a surgical margin and methods of use thereof
EP4165207A1 (en) 2020-06-10 2023-04-19 10X Genomics, Inc. Methods for determining a location of an analyte in a biological sample
AU2021294334A1 (en) 2020-06-25 2023-02-02 10X Genomics, Inc. Spatial analysis of DNA methylation
US11761038B1 (en) 2020-07-06 2023-09-19 10X Genomics, Inc. Methods for identifying a location of an RNA in a biological sample
US11932901B2 (en) 2020-07-13 2024-03-19 Becton, Dickinson And Company Target enrichment using nucleic acid probes for scRNAseq
US11926822B1 (en) 2020-09-23 2024-03-12 10X Genomics, Inc. Three-dimensional spatial analysis
US11827935B1 (en) 2020-11-19 2023-11-28 10X Genomics, Inc. Methods for spatial analysis using rolling circle amplification and detection probes
US11739443B2 (en) 2020-11-20 2023-08-29 Becton, Dickinson And Company Profiling of highly expressed and lowly expressed proteins
WO2022140028A1 (en) 2020-12-21 2022-06-30 10X Genomics, Inc. Methods, compositions, and systems for capturing probes and/or barcodes
EP4351788A1 (en) 2021-06-04 2024-04-17 Enumerix, Inc. Compositions, methods, and systems for single cell barcoding and sequencing
WO2023034489A1 (en) 2021-09-01 2023-03-09 10X Genomics, Inc. Methods, compositions, and kits for blocking a capture probe on a spatial array
CN114743593B (en) * 2022-06-13 2023-02-24 北京橡鑫生物科技有限公司 Construction method of prostate cancer early screening model based on urine, screening model and kit
CN116640855A (en) * 2023-04-13 2023-08-25 华南农业大学 Porcine TCR sequence primer for single cell V (D) J sequencing and application thereof

Family Cites Families (136)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4466467A (en) 1979-06-18 1984-08-21 Tolliver Wilbur E Stackable stirrup mat
US5018067A (en) 1987-01-12 1991-05-21 Iameter Incorporated Apparatus and method for improved estimation of health resource consumption through use of diagnostic and/or procedure grouping and severity of illness indicators
US5432054A (en) 1994-01-31 1995-07-11 Applied Imaging Method for separating rare cells from a population of cells
US6287850B1 (en) 1995-06-07 2001-09-11 Affymetrix, Inc. Bioarray chip reaction apparatus and its manufacture
IL139728A (en) 1995-01-09 2003-06-24 Penwest Pharmaceuticals Compan Aqueous slurry composition containing microcrystalline cellulose for preparing a pharmaceutical excipient
US5733729A (en) 1995-09-14 1998-03-31 Affymetrix, Inc. Computer-aided probability base calling for arrays of nucleic acid probes on chips
US5962271A (en) * 1996-01-03 1999-10-05 Cloutech Laboratories, Inc. Methods and compositions for generating full-length cDNA having arbitrary nucleotide sequence at the 3'-end
GB9704444D0 (en) 1997-03-04 1997-04-23 Isis Innovation Non-invasive prenatal diagnosis
US5994076A (en) 1997-05-21 1999-11-30 Clontech Laboratories, Inc. Methods of assaying differential expression
US6969488B2 (en) 1998-05-22 2005-11-29 Solexa, Inc. System and apparatus for sequential processing of analytes
USRE39920E1 (en) 1997-05-30 2007-11-13 Xenomics, Inc. Methods for detection of nucleic acid sequences in urine
US6492144B1 (en) 1997-05-30 2002-12-10 Diagen Corporation Methods for detection of nucleic acid sequences in urine
ATE487790T1 (en) * 1997-07-07 2010-11-15 Medical Res Council IN VITRO SORTING PROCESS
EP1025257A4 (en) 1997-10-03 2005-03-30 Univ Columbia Method for predicting transplant rejection
EP2045334A1 (en) 1998-06-24 2009-04-08 Illumina, Inc. Decoding of array sensors with microspheres
US20030022207A1 (en) 1998-10-16 2003-01-30 Solexa, Ltd. Arrayed polynucleotides and their use in genome analysis
US6787308B2 (en) 1998-07-30 2004-09-07 Solexa Ltd. Arrayed biomolecules and their use in sequencing
FR2782730B1 (en) 1998-08-25 2002-05-17 Biocom Sa CELL SEPARATION PROCESS FOR THE ISOLATION OF PATHOGENIC CELLS, PARTICULARLY RARE CANCERES, EQUIPMENT AND REAGENT FOR IMPLEMENTING THE PROCESS AND APPLICATION OF THE PROCESS
US7510841B2 (en) 1998-12-28 2009-03-31 Illumina, Inc. Methods of making and using composite arrays for the detection of a plurality of target analytes
US6429027B1 (en) 1998-12-28 2002-08-06 Illumina, Inc. Composite arrays utilizing microspheres
US6846460B1 (en) 1999-01-29 2005-01-25 Illumina, Inc. Apparatus and method for separation of liquid phases of different density and for fluorous phase organic syntheses
US20050191698A1 (en) 1999-04-20 2005-09-01 Illumina, Inc. Nucleic acid sequencing using microsphere arrays
US6355431B1 (en) 1999-04-20 2002-03-12 Illumina, Inc. Detection of nucleic acid amplification reactions using bead arrays
US7056661B2 (en) 1999-05-19 2006-06-06 Cornell Research Foundation, Inc. Method for sequencing nucleic acid molecules
US6544732B1 (en) 1999-05-20 2003-04-08 Illumina, Inc. Encoding and decoding of array sensors utilizing nanocrystals
WO2000071992A1 (en) 1999-05-20 2000-11-30 Illumina, Inc. Method and apparatus for retaining and presenting at least one microsphere array to solutions and/or to optical imaging systems
EP2360270B1 (en) 1999-05-20 2016-11-09 Illumina, Inc. Combinatorial decoding of random nucleic acid arrays
US6942968B1 (en) 1999-08-30 2005-09-13 Illumina, Inc. Array compositions for improved signal detection
US7244559B2 (en) 1999-09-16 2007-07-17 454 Life Sciences Corporation Method of sequencing a nucleic acid
US7211390B2 (en) 1999-09-16 2007-05-01 454 Life Sciences Corporation Method of sequencing a nucleic acid
US6505125B1 (en) 1999-09-28 2003-01-07 Affymetrix, Inc. Methods and computer software products for multiple probe gene expression analysis
EP1218543A2 (en) 1999-09-29 2002-07-03 Solexa Ltd. Polynucleotide sequencing
EP1239952B1 (en) 1999-12-13 2011-10-05 Illumina, Inc. Oligonucleotide synthesizer using centrifugal force
GB0002389D0 (en) 2000-02-02 2000-03-22 Solexa Ltd Molecular arrays
AU2001238068A1 (en) 2000-02-07 2001-08-14 Illumina, Inc. Nucleic acid detection methods using universal priming
WO2001057268A2 (en) 2000-02-07 2001-08-09 Illumina, Inc. Nucleic acid detection methods using universal priming
US6913884B2 (en) 2001-08-16 2005-07-05 Illumina, Inc. Compositions and methods for repetitive use of genomic DNA
US6770441B2 (en) 2000-02-10 2004-08-03 Illumina, Inc. Array compositions and methods of making same
US20020038227A1 (en) 2000-02-25 2002-03-28 Fey Christopher T. Method for centralized health data management
WO2002027029A2 (en) 2000-09-27 2002-04-04 Lynx Therapeutics, Inc. Method for determining relative abundance of nucleic acid sequences
US6753137B2 (en) 2001-01-31 2004-06-22 The Chinese University Of Hong Kong Circulating epstein-barr virus DNA in the serum of patients with gastric carcinoma
WO2002074951A1 (en) * 2001-03-15 2002-09-26 Kureha Chemical Industry Company, Limited METHOD OF CONSTRUCTING cDNA TAG FOR IDENTIFYING EXPRESSED GENE AND METHOD OF ANALYZING GENE EXPRESSION
DE10120797B4 (en) 2001-04-27 2005-12-22 Genovoxx Gmbh Method for analyzing nucleic acid chains
US7026121B1 (en) 2001-06-08 2006-04-11 Expression Diagnostics, Inc. Methods and compositions for diagnosing and monitoring transplant rejection
US7473767B2 (en) 2001-07-03 2009-01-06 The Institute For Systems Biology Methods for detection and quantification of analytes in complex mixtures
DE10239504A1 (en) 2001-08-29 2003-04-24 Genovoxx Gmbh Parallel sequencing of nucleic acid fragments, useful e.g. for detecting mutations, comprises sequential single-base extension of immobilized fragment-primer complex
US6927028B2 (en) 2001-08-31 2005-08-09 Chinese University Of Hong Kong Non-invasive methods for detecting non-host DNA in a host using epigenetic differences between the host and non-host DNA
JP2003101204A (en) 2001-09-25 2003-04-04 Nec Kansai Ltd Wiring substrate, method of manufacturing the same, and electronic component
DE10246005A1 (en) 2001-10-04 2003-04-30 Genovoxx Gmbh Automated nucleic acid sequencer, useful e.g. for analyzing gene expression, based on parallel incorporation of fluorescently labeled terminating nucleotides
DE10149786B4 (en) 2001-10-09 2013-04-25 Dmitry Cherkasov Surface for studies of populations of single molecules
US6902921B2 (en) 2001-10-30 2005-06-07 454 Corporation Sulfurylase-luciferase fusion proteins and thermostable sulfurylase
US20050124022A1 (en) 2001-10-30 2005-06-09 Maithreyan Srinivasan Novel sulfurylase-luciferase fusion proteins and thermostable sulfurylase
GB0128153D0 (en) 2001-11-23 2002-01-16 Bayer Ag Profiling of the immune gene repertoire
US20030198573A1 (en) 2002-01-25 2003-10-23 Illumina, Inc. Sensor arrays for detecting analytes in fluids
US20090149335A1 (en) 2002-02-22 2009-06-11 Biolife Solutions Inc. Method and use of microarray technology and proteogenomic analysis to predict efficacy of human and xenographic cell, tissue and organ transplant
US6977162B2 (en) 2002-03-01 2005-12-20 Ravgen, Inc. Rapid analysis of variations in a genome
US7468032B2 (en) 2002-12-18 2008-12-23 Cardiac Pacemakers, Inc. Advanced patient management for identifying, displaying and assisting with correlating health-related data
US20040122296A1 (en) 2002-12-18 2004-06-24 John Hatlestad Advanced patient management for triaging health-related data
DE10214395A1 (en) 2002-03-30 2003-10-23 Dmitri Tcherkassov Parallel sequencing of nucleic acid segments, useful for detecting single-nucleotide polymorphisms, by single-base extensions with labeled nucleotide
US20060259248A1 (en) 2002-07-01 2006-11-16 Institut Pasteur System, method, device, and computer program product for extraction, gathering, manipulation, and analysis of peak data from an automated sequencer
JP2004097158A (en) * 2002-09-12 2004-04-02 Kureha Chem Ind Co Ltd METHOD FOR PRODUCING cDNA TAG FOR IDENTIFICATION OF EXPRESSION GENE AND METHOD FOR ANALYZING GENE EXPRESSION BY USING THE cDNA TAG
CA2513292C (en) 2003-01-17 2016-04-05 The Chinese University Of Hong Kong Circulating mRNA as diagnostic markers for pregnancy-related disorders
DE602004024034D1 (en) 2003-01-29 2009-12-24 454 Corp NUCLEIC ACID AMPLIFICATION BASED ON BEAD EMULSION
US7025935B2 (en) 2003-04-11 2006-04-11 Illumina, Inc. Apparatus and methods for reformatting liquid samples
AU2003902299A0 (en) 2003-05-13 2003-05-29 Flinders Medical Centre A method of analysing a marker nucleic acid molecule
US8150626B2 (en) 2003-05-15 2012-04-03 Illumina, Inc. Methods and compositions for diagnosing lung cancer with specific DNA methylation patterns
US20040241764A1 (en) 2003-06-02 2004-12-02 Uri Galili Early detection of recipient's immune response against heart transplant
US20050181394A1 (en) 2003-06-20 2005-08-18 Illumina, Inc. Methods and compositions for whole genome amplification and genotyping
GB0319332D0 (en) * 2003-08-16 2003-09-17 Astrazeneca Ab Amplification
US8637650B2 (en) 2003-11-05 2014-01-28 Genovoxx Gmbh Macromolecular nucleotide compounds and methods for using the same
DE10356837A1 (en) 2003-12-05 2005-06-30 Dmitry Cherkasov New conjugates useful for modifying nucleic acid chains comprise nucleotide or nucleoside molecules coupled to a label through water-soluble polymer linkers
US7169560B2 (en) 2003-11-12 2007-01-30 Helicos Biosciences Corporation Short cycle methods for sequencing polynucleotides
DE60326052D1 (en) 2003-12-15 2009-03-19 Institut Pasteur Determination of the repertoire of B lymphocyte populations
EP1704251A1 (en) * 2003-12-15 2006-09-27 Institut Pasteur Repertoire determination of a lymphocyte b population
US7040959B1 (en) 2004-01-20 2006-05-09 Illumina, Inc. Variable rate dispensing system for abrasive material and method thereof
WO2005082110A2 (en) 2004-02-26 2005-09-09 Illumina Inc. Haplotype markers for diagnosing susceptibility to immunological conditions
US20060046258A1 (en) 2004-02-27 2006-03-02 Lapidus Stanley N Applications of single molecule sequencing
DE102004009704A1 (en) 2004-02-27 2005-09-15 Dmitry Cherkasov New conjugates useful for labeling nucleic acids comprise a label coupled to nucleotide or nucleoside molecules through polymer linkers
US20070161001A1 (en) 2004-03-04 2007-07-12 Dena Leshkowitz Quantifying and profiling antibody and t cell receptor gene expression
US7035740B2 (en) 2004-03-24 2006-04-25 Illumina, Inc. Artificial intelligence and global normalization methods for genotyping
DE102004025744A1 (en) 2004-05-26 2005-12-29 Dmitry Cherkasov Surface of a solid support, useful for multiple parallel analysis of nucleic acids by optical methods, having low non-specific binding of labeled components
DE102004025694A1 (en) 2004-05-26 2006-02-23 Dmitry Cherkasov Optical fluorescent ultra-high parallel process to analyse nucleic acid chains in which a sample solid is bound with a primer-matrix complex
DE102004025746A1 (en) 2004-05-26 2005-12-15 Dmitry Cherkasov Parallel sequencing of nucleic acids by optical methods, by cyclic primer-matrix extension, using a solid phase with reduced non-specific binding of labeled components
DE102004025695A1 (en) 2004-05-26 2006-02-23 Dmitry Cherkasov Optical fluorescent parallel process to analyse nucleic acid chains in which a sample solid is bound with a primer-matrix complex
DE102004025745A1 (en) 2004-05-26 2005-12-15 Cherkasov, Dmitry Surface of solid phase, useful for parallel, optical analysis of many nucleic acids, has reduced non-specific binding of labeled components
DE102004025696A1 (en) 2004-05-26 2006-02-23 Dmitry Cherkasov Ultra-high parallel analysis process to analyse nucleic acid chains in which a sample solid is bound and substrate material
US20060024711A1 (en) 2004-07-02 2006-02-02 Helicos Biosciences Corporation Methods for nucleic acid amplification and sequence determination
US7276720B2 (en) 2004-07-19 2007-10-02 Helicos Biosciences Corporation Apparatus and methods for analyzing samples
US20060012793A1 (en) 2004-07-19 2006-01-19 Helicos Biosciences Corporation Apparatus and methods for analyzing samples
US20060019258A1 (en) 2004-07-20 2006-01-26 Illumina, Inc. Methods and compositions for detection of small interfering RNA and micro-RNA
US20060024678A1 (en) 2004-07-28 2006-02-02 Helicos Biosciences Corporation Use of single-stranded nucleic acid binding proteins in sequencing
US7316363B2 (en) 2004-09-03 2008-01-08 Nitrocision Llc System and method for delivering cryogenic fluid
US8986926B2 (en) 2005-12-23 2015-03-24 Nanostring Technologies, Inc. Compositions comprising oriented, immobilized macromolecules and methods for their preparation
WO2007149432A2 (en) * 2006-06-19 2007-12-27 The Johns Hopkins University Single-molecule pcr on microparticles in water-in-oil emulsions
US7745132B1 (en) 2006-11-28 2010-06-29 The Ohio State University Prognostic indicators of canine lymphoid neoplasia using tumor-derived plasma DNA
US8262900B2 (en) 2006-12-14 2012-09-11 Life Technologies Corporation Methods and apparatus for measuring analytes using large scale FET arrays
ATE490319T1 (en) * 2007-02-23 2010-12-15 New England Biolabs Inc SELECTION AND ENRICHMENT OF PROTEINS THROUGH IN VITRO SEPARATION
WO2008147879A1 (en) 2007-05-22 2008-12-04 Ryan Golhar Automated method and device for DNA isolation, sequence determination, and identification
WO2009006438A2 (en) * 2007-06-29 2009-01-08 Epicentre Technologies Corporation Copy dna and sense rna
CN101861399B (en) 2007-08-22 2015-11-25 特罗瓦基因公司 Methods of using miRNA for detection of in vivo cell death
US20100233716A1 (en) 2007-11-08 2010-09-16 Pierre Saint-Mezard Transplant rejection markers
US8252911B2 (en) 2008-02-12 2012-08-28 Pacific Biosciences Of California, Inc. Compositions and methods for use in analytical reactions
US20090221620A1 (en) 2008-02-20 2009-09-03 Celera Corporation Genetic polymorphisms associated with stroke, methods of detection and uses thereof
ES2549184T3 (en) 2008-04-16 2015-10-23 Cb Biotechnologies, Inc. Method to evaluate and compare immunorepertories
DE102008025656B4 (en) * 2008-05-28 2016-07-28 Genxpro Gmbh Method for the quantitative analysis of nucleic acids, markers therefor and their use
US20100120097A1 (en) 2008-05-30 2010-05-13 Board Of Regents, The University Of Texas System Methods and compositions for nucleic acid sequencing
US20100041048A1 (en) 2008-07-31 2010-02-18 The Johns Hopkins University Circulating Mutant DNA to Assess Tumor Dynamics
US8583380B2 (en) 2008-09-05 2013-11-12 Aueon, Inc. Methods for stratifying and annotating cancer drug treatment options
US8748103B2 (en) 2008-11-07 2014-06-10 Sequenta, Inc. Monitoring health and disease status using clonotype profiles
US8691510B2 (en) 2008-11-07 2014-04-08 Sequenta, Inc. Sequence analysis of complex amplicons
US8628927B2 (en) 2008-11-07 2014-01-14 Sequenta, Inc. Monitoring health and disease status using clonotype profiles
US9506119B2 (en) 2008-11-07 2016-11-29 Adaptive Biotechnologies Corp. Method of sequence determination using sequence tags
US9528160B2 (en) 2008-11-07 2016-12-27 Adaptive Biotechnologies Corp. Rare clonotypes and uses thereof
EP2719774B8 (en) 2008-11-07 2020-04-22 Adaptive Biotechnologies Corporation Methods of monitoring conditions by sequence analysis
US9365901B2 (en) 2008-11-07 2016-06-14 Adaptive Biotechnologies Corp. Monitoring immunoglobulin heavy chain evolution in B-cell acute lymphoblastic leukemia
CN101864477A (en) * 2009-04-15 2010-10-20 上海聚类生物科技有限公司 Method for carrying out determination on microRNA expression profiles by utilizing high-throughput sequencing single-channel multiplexing technology
WO2010122538A1 (en) 2009-04-24 2010-10-28 Santaris Pharma A/S Pharmaceutical compositions for treatment of hcv patients that are non-responders to interferon
US20110166029A1 (en) 2009-09-08 2011-07-07 David Michael Margulies Compositions And Methods For Diagnosing Autism Spectrum Disorders
PT2496720T (en) 2009-11-06 2020-10-12 Univ Leland Stanford Junior Non-invasive diagnosis of graft rejection in organ transplant patients
WO2011140433A2 (en) * 2010-05-07 2011-11-10 The Board Of Trustees Of The Leland Stanford Junior University Measurement and comparison of immune diversity by high-throughput sequencing
US9650629B2 (en) * 2010-07-07 2017-05-16 Roche Molecular Systems, Inc. Clonal pre-amplification in emulsion
WO2012083069A2 (en) * 2010-12-15 2012-06-21 The Board Of Trustees Of The Leland Stanford Junior University Measurement and monitoring of cell clonality
JP6069224B2 (en) * 2011-01-31 2017-02-01 アプライズ バイオ, インコーポレイテッド Methods for identifying multiple epitopes in a cell
EP2675819B1 (en) * 2011-02-18 2020-04-08 Bio-Rad Laboratories, Inc. Compositions and methods for molecular labeling
CN102212888A (en) * 2011-03-17 2011-10-12 靳海峰 High throughput sequencing-based method for constructing immune group library
WO2012149042A2 (en) * 2011-04-25 2012-11-01 Bio-Rad Laboratories, Inc. Methods and compositions for nucleic acid analysis
PT2702146T (en) 2011-04-28 2019-04-22 Univ Leland Stanford Junior Identification of polynucleotides associated with a sample
US10385475B2 (en) 2011-09-12 2019-08-20 Adaptive Biotechnologies Corp. Random array sequencing of low-complexity libraries
CA2848516C (en) 2011-09-22 2021-06-15 Maria U. Hutchins Detection of isotype profiles as signatures for disease
US20150211070A1 (en) 2011-09-22 2015-07-30 Immu-Metrix, Llc Compositions and methods for analyzing heterogeneous samples
US9499865B2 (en) 2011-12-13 2016-11-22 Adaptive Biotechnologies Corp. Detection and measurement of tissue-infiltrating lymphocytes
US20130324422A1 (en) 2012-06-04 2013-12-05 Sequenta, Inc. Detecting disease-correlated clonotypes from fixed samples
WO2014108850A2 (en) * 2013-01-09 2014-07-17 Yeda Research And Development Co. Ltd. High throughput transcriptome analysis
US20140255944A1 (en) 2013-03-08 2014-09-11 Sequenta, Inc. Monitoring treatment-resistant clones in lymphoid and myeloid neoplasms by relative levels of evolved clonotypes
US20140255929A1 (en) 2013-03-11 2014-09-11 Sequenta, Inc. Mosaic tags for labeling templates in large-scale amplifications
WO2014144822A2 (en) 2013-03-15 2014-09-18 Immumetrix, Inc. Methods and compositions for tagging and analyzing samples

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Robinson US 2015/0133317, previously cited *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10058839B2 (en) 2013-03-15 2018-08-28 Lineage Biosciences, Inc. Methods and compositions for tagging and analyzing samples
US10722858B2 (en) 2013-03-15 2020-07-28 Lineage Biosciences, Inc. Methods and compositions for tagging and analyzing samples
US11161087B2 (en) 2013-03-15 2021-11-02 Lineage Biosciences, Inc. Methods and compositions for tagging and analyzing samples
US11319590B2 (en) * 2017-01-31 2022-05-03 Ludwig Institute For Cancer Research Ltd. Enhanced immune cell receptor sequencing methods
WO2019152108A1 (en) * 2018-02-05 2019-08-08 The Board Of Trustees Of The Leland Stanford Junior University Systems and methods for multiplexed measurements in single and ensemble cells

Also Published As

Publication number Publication date
EP3611262A1 (en) 2020-02-19
US20220176335A1 (en) 2022-06-09
US10722858B2 (en) 2020-07-28
CN105189748A (en) 2015-12-23
CA2905505C (en) 2023-03-14
US11161087B2 (en) 2021-11-02
WO2014144822A2 (en) 2014-09-18
CA2905517A1 (en) 2014-09-18
EP3327123A1 (en) 2018-05-30
EP3327123B1 (en) 2019-08-28
WO2014144822A3 (en) 2015-04-23
US20160001248A1 (en) 2016-01-07
HK1255869A1 (en) 2019-08-30
CN114107458A (en) 2022-03-01
US20210001302A1 (en) 2021-01-07
CN105189749B (en) 2020-08-11
US20200290008A1 (en) 2020-09-17
CA2905505A1 (en) 2014-09-18
DK3327123T3 (en) 2019-11-25
DK2970958T3 (en) 2018-02-19
WO2014144713A3 (en) 2014-11-27
EP2970959A2 (en) 2016-01-20
CN105189749A (en) 2015-12-23
EP3415626B1 (en) 2020-01-22
US20160228841A2 (en) 2016-08-11
EP3611262B1 (en) 2020-11-11
CN105189748B (en) 2021-06-08
EP2970958B1 (en) 2017-12-06
EP2970959B1 (en) 2018-05-30
EP2970958A2 (en) 2016-01-20
EP3415626A1 (en) 2018-12-19
US20180326389A1 (en) 2018-11-15
US10058839B2 (en) 2018-08-28
DK3611262T3 (en) 2021-01-25
DK2970959T3 (en) 2018-08-06
WO2014144713A2 (en) 2014-09-18

Similar Documents

Publication Publication Date Title
US20210001302A1 (en) Methods of sequencing the immune repertoire
US20230295690A1 (en) Haplotype resolved genome sequencing
US20220246234A1 (en) Using cell-free DNA fragment size to detect tumor-associated variant
EP2758550B1 (en) Detection of isotype profiles as signatures for disease
US20210108263A1 (en) Methods and Compositions for Preparing Sequencing Libraries
JP2018509928A (en) Method for detecting genomic mutations using circularized mate pair library and shotgun sequencing
Barbaro Overview of NGS platforms and technological advancements for forensic applications
Buadu Forensic DNA genotyping by means of next generation sequencing. Analysis of Autosomal STRs of a Norwegian population sample using the ForenSeq FGx system

Legal Events

Date Code Title Description
AS Assignment

Owner name: IMMUMETRIX, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HUTCHINS, EDWARD A.;FAN, HEI-MUN CHRISTINA;SIGNING DATES FROM 20140314 TO 20140315;REEL/FRAME:032863/0567

AS Assignment

Owner name: LINEAGE BIOSCIENCES, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:IMMUMETRIX, INC.;REEL/FRAME:041607/0926

Effective date: 20140610

Owner name: LINEAGE BIOSCIENCES, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:IMMUMETRIX, INC.;REEL/FRAME:041608/0029

Effective date: 20140610

AS Assignment

Owner name: BANK OF MONTREAL, CANADA

Free format text: SECURITY INTEREST;ASSIGNOR:LINEAGE BIOSCIENCES, INC.;REEL/FRAME:044784/0836

Effective date: 20180125

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCV Information on status: appeal procedure

Free format text: NOTICE OF APPEAL FILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION