WO2010039275A1 - Method, array and system for detecting intergenic fusions - Google Patents

Method, array and system for detecting intergenic fusions

Info

Publication number
WO2010039275A1
Authority
WO
WIPO (PCT)
Prior art keywords
exon
gene
probes
test
array
Prior art date
Application number
PCT/US2009/005460
Other languages
French (fr)
Inventor
Michael Griffiths
Original Assignee
Oligonix, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oligonix, Inc.
Publication of WO2010039275A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers

Definitions

  • the present invention relates to a method, array and system for detecting cancer associated intergenic fusions resulting from deletions and inversions within a chromosome, and from translocations and insertions between different chromosomes.
  • Oligonucleotide probe arrays are currently used or have been proposed for detecting a variety of genetic abnormalities, including point mutations, deletions and insertions, and splice variants, as well as for sequencing by hybridization (SBH).
  • the probes on an array may be (i) long sequences, such as bacterial artificial chromosomes (BACs), for detecting large regions of genetic variation, (ii) short oligonucleotide probes, e.g., in the 20-30 base range, for detecting single-nucleotide base changes, such as single nucleotide polymorphisms (SNPs) or single-base mutations, and for sequencing by hybridization, or (iii) intermediate-size oligonucleotide probes, e.g., in the 60-80 base range, for detecting sequence variations that may be distributed over a large region of the genome.
  • oligonucleotide probe arrays for detecting splice-variant effects have been disclosed (see U.S. Patent Nos. 6,251,590 and 6,881,571).
  • among the advantages of oligonucleotide probe arrays are ease of sample preparation, the availability of sophisticated two-channel array readers, and the ability to examine a very large number of genetic events on a single array.
  • the invention includes, in one aspect, an oligonucleotide probe array device for detecting intergenic fusions between a selected pair of genes in the same or different chromosomes.
  • the device includes (a) a substrate having a surface defining an array of probe positions, and (b) attached to the array positions, one oligonucleotide probe per position, a plurality of probes.
  • the probes include:
  • the plurality of exonic probes may include for each exon in a gene in the selected pair(s) of genes, at least three probes having the sequences of 5' upstream, 3' downstream, and intermediate exonic regions of that gene exon.
  • the plurality of intronic probes may include at least one probe for each intron in a gene in the selected pair(s) of genes, and may further include at least three probes having the sequences of 5' upstream, 3' downstream, and intermediate intronic regions of a selected intron.
  • the probes attached to the array position on the substrate may be in the size range 50-70 bases.
  • the exon-exon boundary spanning probes may be designed, for example, for detecting gene fusions associated with leukemia in a human subject, and may include exon-exon boundary spanning sequences for one or more gene pairs, including, for example: BCR-ABL1, E2A-PBX, E2A-HLF, SIL-TAL, TEL-AML1, AML1-EVI1, CHIC-TEL, AML1-ETO, PML-RARA, CBFB-MYH11, MLL-AF4, MLL-MLL and MLL-MSF.
  • the invention includes a method for detecting gene fusions in one or more selected pairs of human genes of a test subject.
  • the method includes (a) reacting with the oligonucleotide probe array device described above, under DNA hybridization conditions, test and reference cDNA prepared from mRNA samples obtained from the subject and from a normal individual(s), respectively, and labeled with first and second different detectable reporters, (b) detecting the test and reference reporter levels at each probe position in the array, and (c) identifying gene fusions and the exon-exon boundary at which the fusion has occurred by identifying the exon-exon boundary spanning probe associated with a given gene pair in which the test value and test/reference ratio is statistically elevated with respect to the other exon-exon boundary spanning probes for that gene pair.
  • Step (c) in the method may include identifying the exon-exon boundary spanning probe having (i) the highest test value within the exon-exon boundary spanning probes for a selected gene pair, and a test value that is at least 1.5, preferably at least 2-5, fold greater than the mean test values for the exon-exon boundary spanning probes in that gene pair; and (ii) the highest test/reference ratio within the exon-exon boundary spanning probes for a selected gene pair and a test/reference ratio that is at least 1.5, preferably at least 2-5 fold greater than the mean test/reference ratio for the exon-exon boundary spanning probes in that gene pair.
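The two criteria of step (c) can be written compactly. The notation below is an assumption introduced only for illustration: S is the set of exon-exon boundary spanning probes for one gene pair, T_p and R_p are the test and reference reporter values at probe p, and θ is the chosen fold threshold (at least 1.5, preferably 2-5, per the text).

```latex
\[
\bar{T} = \frac{1}{|S|}\sum_{q \in S} T_q, \qquad
\rho_p = \frac{T_p}{R_p}, \qquad
\bar{\rho} = \frac{1}{|S|}\sum_{q \in S} \rho_q
\]
\[
\text{(i)}\quad T_p = \max_{q \in S} T_q \ \text{ and } \ \frac{T_p}{\bar{T}} \ge \theta
\qquad\qquad
\text{(ii)}\quad \rho_p = \max_{q \in S} \rho_q \ \text{ and } \ \frac{\rho_p}{\bar{\rho}} \ge \theta
\]
```

A probe satisfying both (i) and (ii) is the one that step (c) would identify as marking the fusion boundary for that gene pair.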
  • the method may further include the step of confirming that an exon-exon boundary spanning probe identified in step (c) is not associated with an exon in a single gene that has an elevated ratio of test/reference to mean test/reference with respect to other exons in that gene.
  • the method may further include the step of confirming that an exon-exon boundary spanning probe identified in step (c) is not associated with a gene that has an elevated test or test/reference intron probe value.
  • a system for detecting intergenic fusions between a pair of genes in the same or different chromosomes is also disclosed.
  • the system includes an oligonucleotide probe array device of the type described above, an array detector for detecting the test and reference reporter levels at each probe position in the array, a processor operatively connected to the array detector for identifying gene fusions and the exon boundary at which the fusion has occurred by identifying the exon-exon boundary spanning probe associated with a given gene pair in which the test value and test/reference ratio is elevated with respect to other exon-exon boundary spanning probes in that gene pair, and a device operatively connected to the processor for storing and displaying the results from the processor.
  • the array device may have the specific embodiments noted above.
  • the array detector may include an optical/laser reader for determining the intensity and color at each array position, and from this information, determining a test intensity and a test/reference ratio for each array position.
  • the processor may operate to identify those exon-exon boundary spanning probes having:
  • the highest test/reference ratio within the exon-exon boundary spanning probes for a selected gene pair and a test/reference ratio that is at least 1.5, preferably at least 2-5 fold greater than the mean test/reference ratio for the exon-exon boundary spanning probes in that gene pair.
  • the processor may further operate to confirm that an exon-exon boundary spanning probe identified with a gene fusion is not associated with an exon in a single gene that has an elevated test/reference to mean test/reference ratio with respect to other exons in that gene.
  • the processor may further operate to confirm that an exon-spanning probe identified with a gene fusion is not associated with a gene that has an elevated test or test/reference intron probe value.
  • FIG. 1 illustrates an intergenic gene fusion between BCR exons 1-13 and ABL exons 2-11;
  • FIG. 2 shows expanded maps of chromosome 22
  • FIG. 3 is a planar view of a portion of an oligonucleotide probe array device constructed in accordance with the invention.
  • FIGs. 4 and 5 are flow diagrams of the logic employed in detecting and confirming intergenic fusions in accordance with the invention.
  • FIG. 6 illustrates one pattern of test and reference reporters in an exemplary application of the invention for detecting an intergenic fusion.
  • oligonucleotide probe refers to a defined-sequence DNA or RNA oligomer, or an oligonucleotide analog thereof that can hybridize to a complementary sequence DNA or RNA fragment in a test or reference sample.
  • the term generally refers to multiple copies of the same-sequence probe at each array position.
  • Oligonucleotide probes on spot-based arrays are typically 30-70 bases in length, preferably 40-60 bases. The design constraints imposed by the exon-exon boundary concept mean that the oligos will have a limited size range, likely 50-70 bases.
  • exon-exon boundary probe refers to an array probe whose sequence includes complementary portions of exons from two different genes involved in a gene fusion, including the 5' end sequence of one exon and the 3' end region of a second exon to which the first exon may be joined in the gene fusion. In one direction, the exon-exon boundary probes are joined 5' to 3' from gene 1 to gene
  • An exon-exon boundary probe contains 50-70 bases, where each exon sequence contributes at most 40 bases.
  • a 70mer probe could include 40 bases from one exon and 30 from another, or 35 bases from each exon.
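The size limits stated above (50-70 bases in total, at most 40 bases from either exon) can be checked mechanically when a junction sequence is assembled. The following is a minimal sketch only: the function name `build_boundary_probe` and the placeholder sequences are hypothetical, and the sketch works on the sense-strand junction sequence without addressing which strand is actually spotted on the array.

```python
def build_boundary_probe(upstream_exon, downstream_exon, n_up, n_down):
    """Join the last n_up bases of the upstream exon to the first n_down
    bases of the downstream exon, enforcing the stated size limits."""
    if n_up > 40 or n_down > 40:
        raise ValueError("each exon may contribute at most 40 bases")
    if not 50 <= n_up + n_down <= 70:
        raise ValueError("probe length must be 50-70 bases")
    if n_up > len(upstream_exon) or n_down > len(downstream_exon):
        raise ValueError("exon shorter than the requested contribution")
    # 3' end of the upstream exon followed by 5' start of the downstream exon
    return upstream_exon[-n_up:] + downstream_exon[:n_down]


# Placeholder sequences (not real exon sequences), used only to show the joins.
exon_up = "A" * 100
exon_down = "C" * 80

probe_40_30 = build_boundary_probe(exon_up, exon_down, 40, 30)  # 70-mer
probe_35_35 = build_boundary_probe(exon_up, exon_down, 35, 35)  # 70-mer
```

Both 70-mer splits mentioned in the text (40 + 30 and 35 + 35) pass the checks, while a 45 + 25 split would be rejected because one exon contributes more than 40 bases.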
  • the number of bases for one exon is within 10 bases of the number for the other exon.
  • As will be appreciated, the number of bases is chosen so that a cDNA fragment complementary to only one exon sequence is washed off under stringent washing conditions, i.e., conditions effective to remove cDNA bound to only one exon sequence in the exon-exon boundary probe.
  • the washing conditions would be selected to remove bound cDNAs with up to 40 bases of complementarity with the probes.
  • Such washing conditions are well known, and must take into consideration both the length and G:C content of each exon sequence.
  • a 50-mer exon-exon boundary oligo will be designed to recognize approximately 25 bases at the end of one exon and 25 bases at the start of the potential illegitimate partner exon from the other gene (give or take, say, 5 bases).
  • the 20-30 bases on each side of the boundary can be expected to hybridise to the normal exon targets irrespective of the presence of the target illegitimate exon-exon boundary.
  • the wash step has to be stringent enough to remove any hybridisations to normal exons (up to say 30 bases), but leave complete hybridizations (say 50 bases) intact.
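The dependence of wash stringency on duplex length and G:C content can be illustrated with a rough melting-temperature estimate. The sketch below is not taken from the source: it uses a commonly quoted salt-adjusted approximation, Tm ≈ 81.5 + 16.6·log10[Na+] + 0.41·(%GC) − 675/N, and the function names, the default sodium concentration, and the placeholder probe sequence are all assumptions chosen for illustration.

```python
import math


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def approx_tm_celsius(seq, sodium_molar=0.05):
    """Rough duplex melting temperature from length, GC content and salt.
    An approximation only; real wash conditions are set empirically."""
    n = len(seq)
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 41.0 * gc_fraction(seq)   # 0.41 * percent GC
            - 675.0 / n)


# Placeholder 50-mer (not a real junction sequence), 50% GC throughout.
probe = "ACGT" * 12 + "AC"

tm_full = approx_tm_celsius(probe)          # cDNA paired along all 50 bases
tm_partial = approx_tm_celsius(probe[:30])  # cDNA paired along only 30 bases
```

Under this approximation the full-length duplex melts at a higher temperature than the 30-base partial duplex, so a wash set between the two estimates would, in principle, strip cDNA bound to a single exon portion while leaving the complete boundary hybrid in place.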
  • An "exonic probe” refers to an array probe that is complementary to some region of an exon of a gene.
  • an "intronic probe” refers to an array probe that is complementary to some region of an intron of a gene. Preferably exonic and intronic probes also have sizes between 50-70 bases.
  • Test cDNA refers to cDNA prepared from a test-individual mRNA sample and labeled with a first reporter, typically a fluorescence reporter having a fluorescence emission peak at one selected visible wavelength, e.g., in the red wavelength portion of the visible spectrum.
  • Reference cDNA refers to cDNA prepared from mRNA sample(s) obtained from one or more normal subjects who do not have gene fusions, at least of the type intended to be detected, and which is labeled with a second reporter, typically a fluorescence reporter having a fluorescence emission peak at a second selected visible wavelength, e.g., in the green wavelength portion of the visible spectrum.
  • Test reporter values or “test reporter intensity values” refer to the intensity values, e.g., fluorescence intensity values, detected for a test reporter, e.g., test fluorescence reporter, at an array position.
  • Reference reporter values or “reference reporter intensity values” refer to the intensity values, e.g., fluorescence intensity values, detected for a reference reporter, e.g., reference fluorescence reporter, at an array position.
  • Gene fusion or "intergenic fusions” refer to fusions between two different genes. If the different genes are on the same chromosome, the fusion may result from an inversion or deletion. If the different genes that are fused are on different chromosomes, the fusion may result from a translocation or insertion.
  • II. Intergenic fusions
  • Fig. 1 shows an intergenic fusion between a BCR gene on chromosome 22 (chr 22), indicated at 10, having exons e1-e23 and the ABL gene on chromosome 9 (chr 9), indicated at 12, having exons a1-a11.
  • a gene fusion 14 is shown with breaks within intron 13 (between e13 and e14) of the BCR gene and intron 1 (between a1 and a2) of the ABL gene.
  • Transcribed RNA from such a fusion undergoes post-transcriptional splicing, which removes the introns and brings the exons together in the resulting mRNA.
  • the mRNA has an illegitimate exon-exon boundary between exon e13 of BCR and exon a2 of the ABL gene; that is, the fusion joins the 3' end of BCR exon e13 to the 5' end of ABL exon a2.
  • the translocation produces two fused genes, a BCR-ABL fusion 16 composed of BCR exons e1-e13 and ABL exons a2-a11, and an ABL-BCR fusion 18 composed of ABL exon a1 and BCR exons e14-e23.
  • the two mRNA transcripts of the two fused genes are shown at 20, 22 in Fig. 1.
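The exon composition of the two reciprocal fusion transcripts follows directly from the two intron breakpoints. As a minimal sketch (the helper names `exon_labels` and `reciprocal_fusions` are hypothetical), the BCR-ABL example can be reproduced as:

```python
def exon_labels(prefix, count):
    """Exon labels such as e1..e23 or a1..a11."""
    return [f"{prefix}{i}" for i in range(1, count + 1)]


def reciprocal_fusions(exons_a, break_intron_a, exons_b, break_intron_b):
    """Return the exon order of the A-B and B-A fusion transcripts.
    break_intron_x is the intron number containing the break, i.e. the
    number of exons of that gene lying upstream of the break."""
    fusion_ab = exons_a[:break_intron_a] + exons_b[break_intron_b:]
    fusion_ba = exons_b[:break_intron_b] + exons_a[break_intron_a:]
    return fusion_ab, fusion_ba


bcr = exon_labels("e", 23)   # BCR exons e1-e23
abl = exon_labels("a", 11)   # ABL exons a1-a11

# Breaks in BCR intron 13 and ABL intron 1, as in Fig. 1
bcr_abl, abl_bcr = reciprocal_fusions(bcr, 13, abl, 1)
junction = (bcr_abl[12], bcr_abl[13])   # exons on either side of the boundary
```

Here `bcr_abl` lists e1-e13 followed by a2-a11, `abl_bcr` lists a1 followed by e14-e23, and `junction` holds the pair of exons forming the illegitimate boundary that a boundary-spanning probe would target.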
  • This particular gene fusion illustrated above is also known as the "Philadelphia translocation" (Ph+) and is a characteristic acquired gene fusion in people who have chronic myelogenous leukemia (CML), occurring as a translocation in about 95%, and as a cytogenetically cryptic molecular insertion in about 5%. It is also found in 25%-30% of adults with acute lymphoblastic leukemia (ALL), and occasionally in acute myelogenous leukemia (AML).
  • the BCR-ABL protein is a protein p210 or sometimes p190 (where p stands for "protein” and the number represents the apparent molecular weight of the mutant proteins in kDa).
  • the fused "BCR-ABL” gene is located on the resulting, shorter chromosome 22. Because ABL carries a domain that can add phosphate groups to tyrosine residues (tyrosine kinase), the BCR-ABL fusion gene product is also a tyrosine kinase. Although the bcr protein product of BCR is also a serine/threonine kinase, the tyrosine kinase function is particularly relevant for therapy. The p210 occurs primarily in CML, and sometimes in Ph+ ALL, for which the p190 fusion protein is more common.
  • the fused BCR-ABL protein interacts with the interleukin-3 receptor beta(c) subunit.
  • the BCR-ABL transcript is constitutively active, i.e. it does not require activation by other cellular messaging proteins.
  • BCR-ABL activates a number of cell cycle-controlling proteins and enzymes, speeding up cell division. Moreover, it inhibits DNA repair, causing genomic instability and potentially causing the feared blast crisis in CML.
  • gene pairs at which fusions are known and used for diagnostic purposes, particularly for leukemia, include, for example, the gene pairs: E2A-PBX, E2A-HLF, SIL-TAL, TEL-AML1, AML1-EVI1, CHIC-TEL, AML1-ETO, PML-RARA, CBFB-MYH11, MLL-AF4, MLL-MLL, and MLL-MSF.
  • some genes can form fusions with alternative partner genes.
  • Fig. 2 shows expanded maps of chromosome 22 (chr22), indicated at 24, and a multi-gene region 25 on the chromosome.
  • a portion of gene 28 is shown at the map expansion 29, and includes intron 1, 30, exon 2, 32, and so on.
  • the three probes per exon and intron provide comprehensive coverage of each exon and intron. However, the three probes are not necessarily overlapping when the intron or exon is greater than about 150-200 bases.
  • FIG. 3 shows, in planar view, an oligonucleotide array device or gene chip 50, constructed in accordance with the invention.
  • the device includes a substrate 52 having an array of positions or regions, such as indicated at 54-70, where each array position has anchored thereto, a defined-sequence oligonucleotide probe that is attached covalently or non-covalently, e.g., by predominant electrostatic interactions, to the substrate surface.
  • Gene-chip substrates or supports, and methods for synthesizing and anchoring defined-sequence oligonucleotide probes to array positions on the substrate are well known to those skilled in the art, e.g., as detailed in USPN 6,994,972, 6,844,151, 6,927,032, 6,188,783, 6,852,487, 6,852,850, 6,924,094, and 6,372,432, all of which are incorporated herein by reference.
  • the number of array positions and distinct-sequence probes in the device may range from several hundred or fewer up to several hundred thousand or more, depending on the number of gene fusions the device is designed to detect, and the "coverage" of exonic and intronic probes that are included in the device, as will be discussed below.
  • array device 50 has three array regions or subarrays, indicated at 53, 59, and 67 in Fig. 3.
  • Subarray 53 includes exon-exon boundary spanning probes for each selected gene pair whose intergenic fusions are to be detected.
  • a gene fusion between any pair of selected genes e.g., BCR and ABL, can occur between any pair of exons of the two genes, and will usually produce transcripts in both directions, e.g., BCR-ABL and ABL-BCR.
  • the device will preferably include all relevant permutations of exon-exon boundary spanning probes and in both directions.
  • each exon-exon boundary probe could be included in duplicate or triplicate, for purposes of achieving average value test and reference readings.
  • the exon-exon boundary spanning probes for any selected pair of genes may be arrayed in a single row or column, or in multiple rows or columns.
  • the top row containing array positions 54 may contain all 2xNxM exon-exon boundary spanning probes for a selected gene pair A-B, the second row containing array positions 56, the exon-exon boundary spanning probes for a second selected gene pair C-D, the third containing array positions 58, for a third selected gene pair E-F, and so forth.
  • each of the exon-exon boundary spanning probes preferably contains approximately 25-35 bases complementary to the 5'- or 3'-end region of each of the fused exons.
  • the probe for a given fusion will have approximately 30 bases of the selected exon of one gene, at its 5' or 3' end, and approximately 30 bases of the selected exon of the second gene, at its 3' or 5' end, respectively.
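The 2xNxM figure for a gene pair comes from pairing every exon of one gene with every exon of the other, in both directions. A minimal sketch of that enumeration follows; the function name `boundary_pairs` is hypothetical, and the sketch lists every combination without filtering for which ones are biologically relevant.

```python
from itertools import product


def boundary_pairs(gene_a, n_exons_a, gene_b, n_exons_b):
    """List (upstream exon, downstream exon) pairs for both fusion directions."""
    pairs = []
    for i, j in product(range(1, n_exons_a + 1), range(1, n_exons_b + 1)):
        pairs.append(((gene_a, i), (gene_b, j)))  # A upstream, B downstream
        pairs.append(((gene_b, j), (gene_a, i)))  # B upstream, A downstream
    return pairs


# BCR has 23 exons and ABL has 11 exons in the Fig. 1 example.
bcr_abl_pairs = boundary_pairs("BCR", 23, "ABL", 11)
probe_count = len(bcr_abl_pairs)   # 2 x 23 x 11
```

For the BCR-ABL pair this gives 506 candidate boundary probes before any duplicates or triplicates are added for averaging.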
  • Subarray 59 includes exonic sequences for each of the genes in the selected gene pairs whose fusions are to be detected.
  • the array includes, for each gene in a selected gene pair to be tested, a plurality of exonic probes having the sequences of 5' upstream, 3' downstream, and/or intermediate exonic regions of a plurality of exons in the gene. That is, the exonic probes include, for each selected gene, a plurality of probes representing multiple exons, preferably at least one probe per exon, and/or a plurality of probes representing a single exon, preferably three probes having the sequences of 5' upstream, 3' downstream, and intermediate exonic regions of that gene exon.
  • the device preferably includes, for each exon in a selected gene, probes having the sequences of the 5' upstream, 3' downstream, and intermediate exonic regions of that gene exon.
  • for the BCR gene, with its 23 exons, the device would include 23 x 3 exonic probes.
  • these may be arrayed in a single row, single column, or combinations of different rows and columns.
  • the rows in subarray 59 represented by array positions 60, 62 may include all of the exonic probes for genes A and B, respectively, the rows represented by array positions 64, 66, all of the exonic probes for genes C and D, respectively, and so forth.
  • the exonic probes in subarray 59 provide positive controls for exon-boundary probe hits observed in subarray 53.
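The three-probes-per-exon layout (5' upstream, intermediate, 3' downstream) amounts to taking three fixed-length windows from each exon; the same approach would apply to introns. The sketch below is illustrative only: the function name `three_region_probes`, the 60-base default length, and the placeholder sequence are assumptions, and the intermediate window is simply centered on the region.

```python
def three_region_probes(region_seq, probe_len=60):
    """Return 5' upstream, intermediate and 3' downstream windows of a region."""
    if len(region_seq) < probe_len:
        raise ValueError("region is shorter than the probe length")
    mid_start = (len(region_seq) - probe_len) // 2
    return {
        "5_prime": region_seq[:probe_len],
        "intermediate": region_seq[mid_start:mid_start + probe_len],
        "3_prime": region_seq[-probe_len:],
    }


# Placeholder 200-base exon (not a real sequence).
exon_seq = "ACGT" * 50
exon_probes = three_region_probes(exon_seq)
```

With a 200-base region and 60-base probes the three windows do not overlap, whereas for regions shorter than three probe lengths they necessarily would.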
  • Subarray 67 includes intronic sequences for each of the genes in the selected gene pairs whose fusions are to be detected.
  • the array includes, for each gene in a selected gene pair to be tested, one or more intronic probes having the sequences of 5' upstream, 3' downstream, or intermediate intronic regions of at least one intron.
  • the array device preferably includes multiple probes for each gene intron, for example three probes having the sequences of 5' upstream, 3' downstream, and intermediate intronic regions of that gene intron.
  • all of the selected gene introns are represented by three probes. As above, these probes may be arrayed in a single row, single column, or combinations of different rows and columns.
  • the rows in subarray 67 represented by array positions 68, 70 may include all of the intronic probes for genes A and B, respectively, and so forth.
  • the device may contain probes from selected 5' and/or 3' untranslated regions of each gene's mRNA, as well as one or more "combined sequence" probes that include selected portions of several exons of the gene.
  • IV. Detecting Intergenic fusions
  • the array device of the invention is used for detecting gene fusions in one or more selected pairs of human genes of a test subject.
  • test and reference cDNA samples are prepared from mature mRNA samples obtained from the test subject and from a normal individual(s), respectively, and these samples are labeled with first and second different detectable reporters, e.g., fluorescence reporters having fluorescence emission peaks in the red and green regions of the visible spectrum, respectively.
  • Sample preparation can be carried out according to well known methods, for example, by isolating a polyA mRNA fraction from cells from the test and reference individuals, and using the mRNA as a template for a reverse transcriptase, to generate single-stranded cDNA in the presence of all four dNTPs, including dNTPs labeled with a selected fluorescent reporter.
  • the test and reference cDNA samples are prepared separately, and the reference sample may be obtained independently, e.g., from one or more normal individuals and supplied and/or stored separately, for example, as part of a test kit that includes the array device of the invention.
  • detectable reporters may be used for the nucleotide-labeled probes, for example, probes that can be differentiated on the basis of different absorption peaks in the visible or UV spectra, or antigenic probes that can be detected by reaction with reporter-labeled, antigen-specific antibodies, according to well-known methods.
  • the system could also operate in a non-competitive mode, in which the same reporter is used for both test and reference samples, e.g., when carried out in parallel arrays.
  • the reporter-labeled test and reference cDNA samples from above are mixed, e.g., in a 1:1 cDNA weight ratio, and reacted with the probes in the array device by adding the sample material to the surface of the array device under hybridization conditions, i.e., conditions that favor hybridization of the labeled cDNAs to complementary-sequence probes on the array device. After a suitable reaction time, the array is washed one or more times to remove non-specifically bound probes and incompletely bound probes (i.e., exon-exon boundary probes partially hybridized to normal exon targets rather than completely hybridized to their target complete exon-exon boundary).
  • the washed array is then read with an array detector for detecting the test and reference reporter levels at each probe position in the array.
  • Methods for washing to achieve selective removal of DNA hybrids below a given number of complementary bases, e.g., 30-40 bases, are well known.
  • Suitable array detectors are well known, and in the case of fluorescence-labeled cDNA, function to detect and quantify the fluorescence emission for each fluorescence reporter at each array position.
  • the detector may also include software for storing the detected emission values, and for calculating and storing a test/reference ratio at each array position.
  • the software may also find an average test/reference value for some group of "neutral" probes, i.e., probes corresponding to housekeeping genes, and normalize all of the probe test/reference values to that average.
  • Software for performing these operations is commercially available; one suitable array detector with built-in reader is commercially available. Detection using single reporter dyes and only test material (non-comparative approach without internal reference RNA/cDNA) is also possible.
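The ratio calculation and housekeeping normalization described above can be sketched in a few lines. This is an illustration under stated assumptions, not the commercial software referred to in the text: the function name `normalized_ratios`, the probe identifiers, and the intensity values are hypothetical, and reference intensities are assumed to be background-corrected and greater than zero.

```python
def normalized_ratios(test, reference, neutral_ids):
    """Compute test/reference per probe, then divide by the mean ratio
    of the 'neutral' (housekeeping) probes."""
    ratios = {pid: test[pid] / reference[pid] for pid in test}
    neutral_mean = sum(ratios[pid] for pid in neutral_ids) / len(neutral_ids)
    return {pid: ratio / neutral_mean for pid, ratio in ratios.items()}


# Hypothetical intensities: two housekeeping probes and one boundary probe.
test_values = {"hk1": 200.0, "hk2": 400.0, "es1": 900.0}
reference_values = {"hk1": 100.0, "hk2": 200.0, "es1": 150.0}

norm = normalized_ratios(test_values, reference_values, ["hk1", "hk2"])
```

After normalization the housekeeping probes sit at 1.0, so a boundary probe's normalized ratio can be read directly as a fold difference between test and reference labeling.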
  • Figs. 4 and 5 are flow diagrams illustrating steps for identifying intergenic fusions from test and reference values recorded from the array, in accordance with the invention, when using the comparative test/reference detection approach.
  • the steps of detecting test and reference reporter values at each array position (box 72) and calculating a test/reference value for each array position (box 74) have been discussed above.
  • for each selected gene pair, the program considers the test values for all of the exon-exon boundary spanning (ES) probes for that gene pair, calculates an average test value, and determines a test/mean test value for each ES probe, as at 76.
  • ES probes having a test/mean test value greater than a selected threshold value, e.g., 1.5, and preferably at least 2, 3, 4, 5, or 10, are identified and stored at 78 as "candidate" probes for a gene fusion event.
  • the program calculates the mean test/reference ratio over the ES probes in the gene pair of interest, and determines the ratio of test/reference value to mean test/reference value (T/R/mean T/R) for each of the ES probes for that gene pair.
  • Those probes that have an elevated T/R/mean T/R ratio, i.e., a ratio greater than a selected threshold value, e.g., 1.5, and preferably at least 2, 3, 4, 5, or 10, are similarly identified and stored at 80 as "candidate" probes for an intergenic fusion event. These steps are repeated for all ES probes in all gene pairs on the array, and the results are stored at 78.
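The candidate-selection steps of Fig. 4 can be sketched for a single gene pair as follows. The function name `candidate_probes`, the probe identifiers, and the numeric values are hypothetical; a probe is kept as a candidate when either fold value exceeds the threshold, since the later classification step distinguishes probes elevated in one value from those elevated in both.

```python
from statistics import mean


def candidate_probes(test, ratio, threshold=1.5):
    """For the ES probes of one gene pair, return the probes whose
    test/mean test or T/R/mean T/R value exceeds the threshold,
    mapped to their (test_fold, ratio_fold) values."""
    mean_test = mean(test.values())
    mean_ratio = mean(ratio.values())
    candidates = {}
    for pid in test:
        test_fold = test[pid] / mean_test
        ratio_fold = ratio[pid] / mean_ratio
        if test_fold > threshold or ratio_fold > threshold:
            candidates[pid] = (test_fold, ratio_fold)
    return candidates


# Hypothetical values for four ES probes of one gene pair.
es_test = {"p1": 100.0, "p2": 100.0, "p3": 100.0, "p4": 700.0}
es_ratio = {"p1": 1.0, "p2": 1.0, "p3": 1.0, "p4": 5.0}

found = candidate_probes(es_test, es_ratio)
```

In this made-up data set only `p4` stands out against the mean of its gene pair, so it alone would be carried forward to the confirmation logic.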
  • the program follows the logic in Fig. 5 to classify that ES probe as either a possible but inconclusive intergenic fusion or a confirmed intergenic fusion.
  • a candidate ES probe from 78 is selected, and its test/mean test and T/R/mean T/R values are retrieved. If only one of the values is elevated above its selected threshold, at 88, the probe is classified as an inconclusive intergenic fusion, at 94.
  • the program may simply report the results as "inconclusive" or attempt to assign a confidence value for the result, based on the extent to which the key test/mean test and T/R/mean T/R values are above or below the selected thresholds, and the extent to which the test or T/R values of corresponding exonic or intronic sequences might suggest a basis for ambiguity (see below).
  • An inconclusive result might also provide the basis for further analysis, either by operator review of the raw data produced by the device, or by alternative methods of analysis, e.g., by in situ analysis of stained chromosomes, to attempt to resolve the ambiguity.
  • the program checks exonic sequences for that exon-exon boundary spanning pair in subarray 59, to determine if either of the exons in the ES probe has elevated T/R/mean T/R values, indicating a non-specific interaction to one of the exons which may be the cause of the elevated ES value(s) (box 90). That is, the exonic sequence probes act as positive controls to rule out non-specific hybridization to the ES probe.
  • the intronic sequences for the two genes are checked, in this case to confirm that there is little or no significant test or reference binding to the intronic sequences.
  • the intronic sequences thus act as negative controls, to ensure that test and reference probe reporter values are not elevated because of false reporting from one or more intronic sequences, which should have near-zero binding.
  • the ES probe is again classified as an inconclusive intergenic fusion, at 94. If both exonic and intronic sequences (positive and negative controls) are normal, the result is classified as a confirmed intergenic fusion, at 92.
  • the program repeats the analysis, at 86, for each of the ES probes identified and stored at 78.
  • the results are displayed at 84. The displayed information will report each confirmed, and optionally, each inconclusive gene fusion, the genes and chromosomes making up the fusion, the direction of the fusion, and the exons at which the fusion has occurred.
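The decision logic of Fig. 5 reduces to a short sequence of checks on each candidate. The sketch below is one possible reading of that flow: the function name `classify_candidate`, the argument layout, and in particular the `intron_limit` cutoff for "near-zero" intron binding are assumptions, since the text does not give a numeric limit for the intronic controls.

```python
def classify_candidate(test_fold, ratio_fold, exon_ratio_folds,
                       intron_signals, threshold=1.5, intron_limit=0.2):
    """Classify one candidate ES probe as 'confirmed' or 'inconclusive'.

    exon_ratio_folds: T/R/mean T/R values of the exonic probes for the
                      two exons joined in the ES probe (positive controls).
    intron_signals:   normalized intron probe values for the two genes
                      (negative controls, expected to be near zero).
    """
    both_elevated = test_fold > threshold and ratio_fold > threshold
    if not both_elevated:
        return "inconclusive"      # only one key value above threshold
    if any(fold > threshold for fold in exon_ratio_folds):
        return "inconclusive"      # a single exon may explain the signal
    if any(signal > intron_limit for signal in intron_signals):
        return "inconclusive"      # unexpected intron binding
    return "confirmed"


result_clean = classify_candidate(2.8, 2.5, [1.0, 0.9], [0.05, 0.02])
result_one_value = classify_candidate(2.8, 1.1, [1.0, 0.9], [0.05, 0.02])
result_exon_high = classify_candidate(2.8, 2.5, [1.0, 2.0], [0.05, 0.02])
```

Only the first call passes all three checks; the other two illustrate the cases in which the text reports an inconclusive result, namely a single elevated value or an elevated exonic control.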
  • Fig. 6 shows an array device 50 with hypothetical probe reporter values shown in gray (average test/reference values) or black (elevated test/reference values).
  • the array shows two probes that represent potential gene fusions: probe 96 representing a possible gene fusion between exon i in gene A and exon j in gene B, and probe 98 representing a possible gene fusion between exon k in gene C and exon l in gene D. It is recognized that the binding patterns shown here may be in duplicate or triplicate.
  • the system will confirm that the test/mean test and T/R/mean T/R values for that probe are both above threshold values indicative of a gene fusion.
  • the program will then identify the exonic probes associated with exon i in gene A (one of probes 60 in subarray 59), and associated with exon j in gene B (one of probes 62 in subarray 59), and confirm that neither has an elevated T/R/mean T/R value. It will then look at test and test/reference values for introns from the same genes, e.g., from the top two rows in subarray 67, to confirm that these are also within normal ranges. Based on the pattern shown, this probe will be confirmed as a gene fusion between exons i and j from genes A and B.
  • probe 98 representing a possible gene fusion between exon k in gene C and exon l in gene D
  • probe 100 is elevated in either test/mean test or T/R/mean T/R value, as indicated.
  • the elevated test value in probe 100 indicates that the elevated values for exon-spanning probe 98 may be due to binding of other test sequences to that exonic sequence, producing an artificial elevation in both probes. Based on this result, the fusion associated with probe 98 would be classified as "inconclusive."
  • An array-device system in accordance with the invention includes the array device described above, an array detector or scanner for detecting the test and reference reporter levels at each probe position in the array, a processor operatively connected to the array detector for analyzing test and reference values measured by the detector, for identifying gene fusions, and a device operatively connected to the processor for storing and displaying the results from the processor.
  • the array scanner includes an optical reader or laser device for determining the intensity and color at each array position, and from this information, determining a test intensity and a test/reference ratio for each array position.
  • the processor in the system operates to identify those exon-exon boundary spanning probes having elevated test/mean test and T/R/mean T/R values from among all of the exon-exon boundary spanning probes on the device, as detailed above.
  • the processor may further operate to confirm that an exon-exon boundary spanning probe identified with a gene fusion is not associated with an exon in a single gene that has the highest test/reference value for an exon within the single gene or a test/reference ratio that is at least 1.5 fold greater than the average test/reference ratio for the exons within that gene.
  • the processor may further operate to confirm that an exon-spanning probe identified with a gene fusion is not associated with a gene that has a mean test/reference ratio of intronic probes that is statistically different from the mean test/reference ratio for the intronic probes of other genes in the gene pairs.

Abstract

A probe array device, method, and system for detecting intergenic fusions between a selected pair of genes in the same or different chromosomes are disclosed. The array device provides a plurality of exon-exon boundary spanning probes whose sequences span each of the possible combinations of illegitimate exon-exon boundaries, in both gene fusion transcript directions, for a selected gene pair. Exon-exon boundary spanning probes having elevated test/mean test and/or test/reference/mean test/reference values are identified as candidate exon-exon boundaries for a gene fusion, and these events are confirmed against exonic and intronic probes in the device that provide positive and negative controls, respectively.

Description

METHOD, ARRAY AND SYSTEM FOR DETECTING INTERGENIC FUSIONS
Field of the Invention
[0001] The present invention relates to a method, array and system for detecting cancer associated intergenic fusions resulting from deletions and inversions within a chromosome, and from translocations and insertions between different chromosomes.
Background of the invention
[0002] Oligonucleotide probe arrays are currently used or have been proposed for detecting a variety of genetic abnormalities, including point mutations, deletions and insertions, and splice variants, as well as for sequencing by hybridization (SBH). Depending on the intended application of the array, the probes on an array may be (i) long sequences, such as bacterial artificial chromosomes (BACs), for detecting large regions of genetic variation, (ii) short oligonucleotide probes, e.g., in the 20-30 base range, for detecting single-nucleotide base changes, such as single nucleotide polymorphisms (SNPs) or single-base mutations, and for sequencing by hybridization, or (iii) intermediate-size oligonucleotide probes, e.g., in the 60-80 base range, for detecting sequence variations that may be distributed over a large region of the genome. In the latter category, oligonucleotide probe arrays for detecting splice-variant effects have been disclosed (see U.S. Patent Nos. 6,251,590 and 6,881,571).
[0003] Among the advantages of oligonucleotide probe arrays are ease of sample preparation, the availability of sophisticated two-channel array readers, and the ability to examine a very large number of genetic events on a single array.
[0004] In addition to the genetic variations noted above, it would be desirable to apply oligonucleotide probe array technology to the detection of intergenic fusions, including those produced by deletions within a chromosome, inversions within a chromosome, or translocations between a pair of chromosomes. In particular, it would be desirable to provide a reliable array method and device for detecting gene fusion events at the resolution of the exons involved in such fusions, so that the exon composition of the fused gene can be determined. This capability would help pinpoint the nature of the gene fusion, both for purposes of classifying the fusion, which may have diagnostic and prognostic significance, and for developing the best therapeutic strategies.
Summary of the Invention
[0005] The invention includes, in one aspect, an oligonucleotide probe array device for detecting intergenic fusions between a selected pair of genes in the same or different chromosomes. The device includes (a) a substrate having a surface defining an array of probe positions, and (b) attached to the array positions, one oligonucleotide probe per position, a plurality of probes. The probes include:
(i) for each of the selected pair(s) of genes, a plurality of exon-exon boundary spanning probes whose oligonucleotide sequences span each of the possible combinations of illegitimate exon-exon boundaries, in both translocation directions, where the exon-exon boundary probes have a size range of 50-70 bases and no more than about 40 bases in each boundary exon,
(ii) for each gene in the selected pair(s) of genes, a plurality of exonic probes having the sequences of 5' upstream, 3' downstream, or intermediate exonic regions of a plurality of exons in the gene, and
(iii) for each gene in the selected pair(s) of genes, a plurality of intronic probes having the sequences of 5' upstream, 3' downstream, or intermediate intronic regions of at least one intron.
[0006] The plurality of exonic probes may include, for each exon in a gene in the selected pair(s) of genes, at least three probes having the sequences of 5' upstream, 3' downstream, and intermediate exonic regions of that gene exon.
[0007] The plurality of intronic probes may include at least one probe for each intron in a gene in the selected pair(s) of genes, and may further include at least three probes having the sequences of 5' upstream, 3' downstream, and intermediate intronic regions of a selected intron.
[0008] The probes attached to the array position on the substrate may be in the size range 50-70 bases.
[0009] The exon-exon boundary spanning probes may be designed, for example, for detecting gene fusions associated with leukemia in a human subject, and may include exon-exon boundary spanning sequences for one or more gene pairs, including, for example: BCR-ABL1, E2A-PBX, E2A-HLF, SIL-TAL, TEL-AML1, AML1-EVI1, CHIC-TEL, AML1-ETO, PML-RARA, CBFB-MYH11, MLL-AF4, MLL-MLL and MLL-MSF.
[00010] In another aspect, the invention includes a method for detecting gene fusions in one or more selected pairs of human genes of a test subject. The method includes (a) reacting with the oligonucleotide probe array device described above, under DNA hybridization conditions, test and reference cDNA prepared from mRNA samples obtained from the subject and from a normal individual(s), respectively, and labeled with first and second different detectable reporters, (b) detecting the test and reference reporter levels at each probe position in the array, and (c) identifying gene fusions and the exon-exon boundary at which the fusion has occurred by identifying the exon-exon boundary spanning probe associated with a given gene pair in which the test value and test/reference ratio are statistically elevated with respect to the other exon-exon boundary spanning probes for that gene pair.
[00011] Step (c) in the method may include identifying the exon-exon boundary spanning probe having (i) the highest test value within the exon-exon boundary spanning probes for a selected gene pair, and a test value that is at least 1.5, preferably at least 2-5 fold greater than the mean test values for the exon-exon boundary spanning probes in that gene pair; and (ii) the highest test/reference ratio within the exon-exon boundary spanning probes for a selected gene pair and a test/reference ratio that is at least 1.5, preferably at least 2-5 fold greater than the mean test/reference ratio for the exon-exon boundary spanning probes in that gene pair.
[00012] The method may further include the step of confirming that an exon-exon boundary spanning probe identified in step (c) is not associated with an exon in a single gene that has an elevated test/reference to mean test/reference ratio with respect to other exons in that gene.
[00013] The method may further include the step of confirming that an exon-exon boundary spanning probe identified in step (c) is not associated with a gene that has an elevated test or test/reference intron probe value.
[00014] Also disclosed is a system for detecting intergenic fusions between a pair of genes in the same or different chromosomes. The system includes an oligonucleotide probe array device of the type described above, an array detector for detecting the test and reference reporter levels at each probe position in the array, a processor operatively connected to the array detector for identifying gene fusions and the exon boundary at which the fusion has occurred by identifying the exon-exon boundary spanning probe associated with a given gene pair in which the test value and test/reference ratio are elevated with respect to other exon-exon boundary spanning probes in that gene pair, and a device operatively connected to the processor for storing and displaying the results from the processor.
[00015] The array device may have the specific embodiments noted above. The array detector may include an optical/laser reader for determining the intensity and color at each array position, and from this information, determining a test intensity and a test/reference ratio for each array position.
[00016] The processor may operate to identify those exon-exon boundary spanning probes having:
(i) the highest test value within the exon-exon boundary spanning probes for a selected gene pair, and a test value that is at least 1.5, preferably at least 2-5 fold greater than the mean test values for the exon-exon boundary spanning probes in that gene pair; and
(ii) the highest test/reference ratio within the exon-exon boundary spanning probes for a selected gene pair and a test/reference ratio that is at least 1.5, preferably at least 2-5 fold greater than the mean test/reference ratio for the exon-exon boundary spanning probes in that gene pair.
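Criteria (i) and (ii) can be illustrated with a short Python sketch. It is only one possible reading of the text, not the processor's actual implementation: the function names, the dictionary layout (probe identifier mapped to value) and the default 1.5-fold cutoff passed as `min_fold` are assumptions made for illustration. The mean is taken over all exon-exon boundary spanning probes of the gene pair, including the top probe itself.

```python
from statistics import mean

def top_candidate(values, min_fold=1.5):
    """Return the probe with the highest value if that value is at least
    min_fold times the mean over all probes of the gene pair, else None."""
    if not values:
        return None
    avg = mean(values.values())
    best = max(values, key=values.get)
    if avg > 0 and values[best] / avg >= min_fold:
        return best
    return None

def identify_fusion_probe(test, ratio, min_fold=1.5):
    """Apply criteria (i) and (ii): the same probe must lead in both the
    test value and the test/reference ratio, each by at least min_fold."""
    by_test = top_candidate(test, min_fold)
    by_ratio = top_candidate(ratio, min_fold)
    if by_test is not None and by_test == by_ratio:
        return by_test
    return None

# Hypothetical readings for four boundary probes of one gene pair.
test_values = {"e13-a2": 900.0, "e1-a2": 100.0, "e2-a3": 110.0, "e14-a2": 90.0}
tr_ratios = {"e13-a2": 6.0, "e1-a2": 1.0, "e2-a3": 1.1, "e14-a2": 0.9}
hit = identify_fusion_probe(test_values, tr_ratios)
```

Raising `min_fold` to 2 or 5 corresponds to the preferred, stricter cutoffs mentioned in the text.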
[00017] The processor may further operate to confirm that an exon-exon boundary spanning probe identified with a gene fusion is not associated with an exon in a single gene that has an elevated test/reference to mean test/reference ratio with respect to other exons in that gene.
[00018] The processor may further operate to confirm that an exon-spanning probe identified with a gene fusion is not associated with a gene that has an elevated test or test/reference intron probe value.
[00019] These and other objects and features of the present invention will become more fully apparent when the following detailed description of the invention is read in conjunction with the accompanying drawings.
Brief Description of the Drawings
[00020] Fig. 1 illustrates an intergenic gene fusion between BCR exons 1-13 and ABL exons 2-11;
[00021] Fig. 2 shows expanded maps of chromosome 22;
[00022] Fig. 3 is a planar view of a portion of an oligonucleotide probe array device constructed in accordance with the invention;
[00023] Figs. 4 and 5 are flow diagrams of the logic employed in detecting and confirming intergenic fusions in accordance with the invention; and
[00024] Fig. 6 illustrates one pattern of test and reference reporters in an exemplary application of the invention for detecting an intergenic fusion.
Detailed Description of the Invention
I. Definitions
[00025] Unless indicated otherwise, the terms below will be given the following definitions.
[00026] An "oligonucleotide probe" or "probe" refers to a defined-sequence DNA or RNA oligomer, or an oligonucleotide analog thereof that can hybridize to a complementary sequence DNA or RNA fragment in a test or reference sample. The term generally refers to multiple copies of the same-sequence probe at each array position. Oligonucleotide probes on spot-based arrays are typically 30-70 bases in length, preferably 40-60 bases. The design constraints imposed by the exon-exon boundary concept mean that the oligos will have a limited size range, which is likely to be in the 50-70 base range.
[00027] An "exon-exon boundary probe" refers to an array probe whose sequence includes complementary portions of exons from two different genes involved in a gene fusion, including the 5' end sequence of one exon and the 3' end region of a second exon to which the first exon may be joined in the gene fusion. In one direction, the exon-exon boundary probes are joined 5' to 3' from gene 1 to gene 2, and in the opposite direction, 5' to 3' from gene 2 to gene 1. An exon-exon boundary probe contains between 50-70 bases, where each exon sequence contributes at most 40 bases. For example, a 70mer probe could include 40 bases from one exon and 30 from another, or 35 bases from each exon. Typically, the number of bases for one exon is within 10 bases of the number for the other exon.
[00028] As will be appreciated, the number of bases contributed by each exon is chosen so that a cDNA fragment complementary to only one exon sequence can be washed off under stringent washing conditions, i.e., conditions effective to remove cDNA bound to only one exon sequence in the exon-exon boundary probe. Thus, if one or more of the exon-boundary probes contained up to 40 bases in one exon, the washing conditions would be selected to remove bound cDNAs with up to 40 bases of complementarity with the probes. Such washing conditions are well known, and must take into consideration both the length and G:C content of each exon sequence. For example, a 50-mer exon-exon boundary oligo will be designed to recognize approximately 25 bases at the end of one exon and 25 bases at the start of the potential illegitimate partner exon from the other gene (give or take, say, 5 bases). The 20-30 bases on each side of the boundary can be expected to hybridize to the normal exon targets irrespective of the presence of the target illegitimate exon-exon boundary. In this example, the wash step has to be stringent enough to remove any hybridizations to normal exons (up to, say, 30 bases), but leave complete hybridizations (say, 50 bases) intact.
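The size rules in the definition above (50-70 bases overall, at most 40 bases per exon, with an adjustable offset between the two exons) can be sketched as a small Python helper. This is a hedged illustration only: the function name `boundary_probe`, its parameters and the toy sequences are hypothetical, and the function returns the junction in transcript (sense) orientation. Whether that sequence or its reverse complement is placed on the array depends on the labeling chemistry, which the sketch does not address.

```python
def boundary_probe(upstream_exon, downstream_exon, length=60,
                   max_per_exon=40, offset=0):
    """Join the 3' end of upstream_exon to the 5' start of downstream_exon.

    offset shifts bases from the downstream exon to the upstream exon
    (positive) or the other way (negative)."""
    if not 50 <= length <= 70:
        raise ValueError("probe length should be 50-70 bases")
    left = length // 2 + offset      # bases taken from the upstream exon
    right = length - left            # bases taken from the downstream exon
    if left > max_per_exon or right > max_per_exon:
        raise ValueError("each exon may contribute at most max_per_exon bases")
    if left > len(upstream_exon) or right > len(downstream_exon):
        raise ValueError("exon shorter than the requested contribution")
    return upstream_exon[-left:] + downstream_exon[:right]

# Toy sequences standing in for real exon sequences.
exon_up = "A" * 50 + "C" * 30
exon_down = "G" * 30 + "T" * 50
probe_60 = boundary_probe(exon_up, exon_down, length=60)            # 30 + 30
probe_70 = boundary_probe(exon_up, exon_down, length=70, offset=5)  # 40 + 30
```

Because both contributions are capped at 40 bases and the total is at least 50, each exon always contributes at least 10 bases, so neither slice can be empty.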
[00029] As an aside, it is recognized that the ability to offset the two exons in the probe, i.e., adjust their relative sizes, is one of the design parameters in forming the probes. Another design parameter is the deliberate inclusion of mismatches to destabilize hybridization to the probe.
[00030] An "exonic probe" refers to an array probe that is complementary to some region of an exon of a gene.
[00031] An "intronic probe" refers to an array probe that is complementary to some region of an intron of a gene. Preferably exonic and intronic probes also have sizes between 50-70 bases.
[00032] "Test cDNA" refers to cDNA prepared from a test-individual mRNA sample and labeled with a first reporter, typically a fluorescence reporter having a fluorescence emission peak at one selected visible wavelength, e.g., in the red wavelength portion of the visible spectrum.
[00033] "Reference cDNA" refers to cDNA prepared from mRNA sample(s) obtained from one or more normal subjects who do not have gene fusions, at least of the type intended to be detected, and which is labeled with a second reporter, typically a fluorescence reporter having a fluorescence emission peak at a second selected visible wavelength, e.g., in the green wavelength portion of the visible spectrum.
[00034] "Test reporter values" or "test reporter intensity values" refer to the intensity values, e.g., fluorescence intensity values, detected for a test reporter, e.g., a test fluorescence reporter, at an array position.
[00035] "Reference reporter values" or "reference reporter intensity values" refer to the intensity values, e.g., fluorescence intensity values, detected for a reference reporter, e.g., a reference fluorescence reporter, at an array position.
[00036] "Gene fusion" or "intergenic fusion" refers to a fusion between two different genes. If the different genes are on the same chromosome, the fusion may result from an inversion or deletion. If the different genes that are fused are on different chromosomes, the fusion may result from a translocation or insertion.
II. Intergenic fusions
[00037] Fig. 1 shows an intergenic fusion between the BCR gene on chromosome 22 (chr 22), indicated at 10, having exons e1-e23, and the ABL gene on chromosome 9 (chr 9), indicated at 12, having exons a1-a11. A gene fusion 14 is shown with breaks within intron 13 (between e13 and e14) of the BCR gene and intron 1 (between a1 and a2) of the ABL gene. Transcribed RNA from such a fusion undergoes post-transcriptional splicing, which removes the introns and brings the exons together in the resulting mRNA. The mRNA has an illegitimate exon-exon boundary between exon e13 of BCR and exon a2 of the ABL gene; that is, the fusion joins the 3' end of BCR exon e13 to the 5' end of ABL exon a2. As seen in the bottom frame in Fig. 1, the translocation produces two fused genes, a BCR-ABL fusion 16 composed of BCR exons e1-e13 and ABL exons a2-a11, and an ABL-BCR fusion 18 composed of ABL exon a1 and BCR exons e14-e23. The two mRNA transcripts of the two fused genes are shown at 20, 22 in Fig. 1.
[00038] The particular gene fusion illustrated above is also known as the "Philadelphia translocation" (Ph+) and is a characteristic acquired gene fusion in people who have chronic myelogenous leukemia (CML), occurring as a translocation in about 95%, and as a cytogenetically cryptic molecular insertion in about 5%. It is also found in 25%-30% of adults with acute lymphoblastic leukemia (ALL), and occasionally in acute myelogenous leukemia (AML). The BCR-ABL fusion protein is p210 or sometimes p190 (where p stands for "protein" and the number represents the apparent molecular weight of the mutant protein in kDa). The fused "BCR-ABL" gene is located on the resulting, shorter chromosome 22. Because ABL carries a domain that can add phosphate groups to tyrosine residues (tyrosine kinase), the BCR-ABL fusion gene product is also a tyrosine kinase. Although the bcr protein product of BCR is also a serine/threonine kinase, the tyrosine kinase function is particularly relevant for therapy. The p210 protein occurs primarily in CML, and sometimes in Ph+ ALL, for which the p190 fusion protein is more common. For pediatric Ph+ ALL, the impact of the type of fusion on prognosis after therapy is unknown, since Ph+ ALL is rare and populating statistically relevant studies is difficult. In adults, however, some studies suggest that prognosis differs between the p210 and p190 fusion types.
[00039] The fused BCR-ABL protein interacts with the interleukin-3 receptor beta(c) subunit. The BCR-ABL transcript is constitutively active, i.e., it does not require activation by other cellular messaging proteins. In turn, BCR-ABL activates a number of cell cycle-controlling proteins and enzymes, speeding up cell division. Moreover, it inhibits DNA repair, causing genomic instability and potentially causing the feared blast crisis in CML.
[00040] Although the invention will be illustrated with respect to the BCR/ABL fusion, since this fusion is well characterized for diagnostic purposes, it will be appreciated that the invention is designed to detect intergenic fusions between any two selected genes. Other gene pairs at which fusions are known and used for diagnostic purposes, particularly for leukemia, include, for example, the gene pairs: E2A-PBX, E2A-HLF, SIL-TAL, TEL-AML1, AML1-EVI1, CHIC-TEL, AML1-ETO, PML-RARA, CBFB-MYH11, MLL-AF4, MLL-MLL, and MLL-MSF. As can be seen in the examples, some genes can form fusions with alternative partner genes.
[00041] Fig. 2 shows expanded maps of chromosome 22 (chr22), indicated at 24, and a multi-gene region 25 on the chromosome. Region 25, when expanded, yields the multi-gene map shown at 26 which, in turn, contains a gene 28 of interest. A portion of gene 28 is shown at the map expansion 29, and includes intron 1, 30, exon 2, 32, and so on. Below this map are shown 5'-end, 3'-end and intermediate-region probes, indicated at 34, 36, 38, respectively, for intron 1 and at 40, 42, and 44, respectively, for exon e2. As can be appreciated, the three probes per exon and intron provide comprehensive coverage of each exon and intron. However, the three probes are not necessarily overlapping, where the intron or exon is greater than about 150-200 bases.
III. Array Device
[00042] Fig. 3 shows, in planar view, an oligonucleotide array device or gene chip 50, constructed in accordance with the invention. The device includes a substrate 52 having an array of positions or regions, such as indicated at 54-70, where each array position has anchored thereto a defined-sequence oligonucleotide probe that is attached covalently or non-covalently, e.g., by predominantly electrostatic interactions, to the substrate surface. Gene-chip substrates or supports, and methods for synthesizing and anchoring defined-sequence oligonucleotide probes to array positions on the substrate are well known to those skilled in the art, e.g., as detailed in USPN 6,994,972, 6,844,151, 6,927,032, 6,188,783, 6,852,487, 6,852,850, 6,924,094, and 6,372,432, all of which are incorporated herein by reference. The number of array positions and distinct-sequence probes in the device may range from several hundred or fewer up to several hundred thousand or more, depending on the number of gene fusions the device is designed to detect, and the "coverage" of exonic and intronic probes that are included in the device, as will be discussed below.
[00043] In general, array device 50 has three array regions or subarrays, indicated at 53, 59, and 67 in Fig. 3. Subarray 53 includes exon-exon boundary spanning probes for each selected gene pair whose intergenic fusions are to be detected. As can be appreciated from above, with reference to Fig. 1, a gene fusion between any pair of selected genes, e.g., BCR and ABL, can occur between any pair of exons of the two genes, and will usually produce transcripts in both directions, e.g., BCR-ABL and ABL-BCR. Thus, in order to be able to pinpoint where the fusion has occurred, i.e., which two exons of the two genes are fused, and whether both transcript directions are present, the device will preferably include all relevant permutations of exon-exon boundary spanning probes, in both directions. For example, for the selected example BCR-ABL gene pair, probes spanning all possible exon-exon boundaries between each of the 23 BCR exons (N) and each of the 11 ABL exons (M) (N x M = 253 probes) and in both directions (2 x N x M = 506 probes) could be present in the array. Further, each exon-exon boundary probe could be included in duplicate or triplicate, for purposes of achieving average value test and reference readings. The exon-exon boundary spanning probes for any selected pair of genes, e.g., the 506 exon-exon boundary spanning probes for the BCR-ABL gene pair, may be arrayed in a single row or column, or in multiple rows or columns. For example, in the device shown in Fig. 3, the top row containing array positions 54 may contain all 2 x N x M exon-exon boundary spanning probes for a selected gene pair A-B, the second row containing array positions 56, the exon-exon boundary spanning probes for a second selected gene pair C-D, the third containing array positions 58, for a third selected gene pair E-F, and so forth.
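The probe count for a gene pair follows directly from the exon numbers: with N = 23 BCR exons and M = 11 ABL exons there are 23 x 11 = 253 exon pairings, or 506 boundary probes when both transcript directions are covered. A minimal Python sketch of that enumeration is given here; the function name and the tuple layout are illustrative assumptions, not part of the disclosed device.

```python
from itertools import product

def boundary_pairs(gene_a, n_exons_a, gene_b, n_exons_b):
    """List every (upstream exon, downstream exon) combination for a gene
    pair, in both fusion transcript directions."""
    pairs = []
    for i, j in product(range(1, n_exons_a + 1), range(1, n_exons_b + 1)):
        pairs.append(((gene_a, i), (gene_b, j)))  # gene A exon upstream
        pairs.append(((gene_b, j), (gene_a, i)))  # gene B exon upstream
    return pairs

pairs = boundary_pairs("BCR", 23, "ABL", 11)
print(len(pairs))  # 2 x 23 x 11 = 506
```

The BCR e13 to ABL a2 junction of Fig. 1 appears in this list as `(("BCR", 13), ("ABL", 2))`, and the reciprocal ABL a1 to BCR e14 junction as `(("ABL", 1), ("BCR", 14))`. Duplicate or triplicate spotting would multiply the count accordingly.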
[00044] As discussed above, each of the exon-exon boundary spanning probes preferably contains approximately 25-35 bases complementary to the 5'- or 3'-end region of each of the fused exons. Thus, the probe for a given fusion will have approximately 30 bases of the selected exon of one gene, at its 5' or 3' end, and approximately 30 bases of the selected exon of the second gene, at its 3' or 5' end, respectively.
[00045] Subarray 59 includes exonic sequences for each of the genes in the selected gene pairs whose fusions are to be detected. Specifically, the array includes, for each gene in a selected gene pair to be tested, a plurality of exonic probes having the sequences of 5' upstream, 3' downstream, and/or intermediate exonic regions of a plurality of exons in the gene. That is, the exonic probes include, for each selected gene, a plurality of probes representing multiple exons, preferably at least one probe per exon, and/or a plurality of probes representing a single exon, preferably three probes having the sequences of 5' upstream, 3' downstream, and intermediate exonic regions of that gene exon. In order to provide broad exon-sequence coverage, the device preferably includes, for each exon in a selected gene, probes having the sequences of the 5' upstream, 3' downstream, and intermediate exonic regions of that gene exon. Thus, for example, for the BCR gene having 23 exons, the device would include 23 x 3 exonic probes. As above, these may be arrayed in a single row, single column, or combinations of different rows and columns. For example, in the array shown in Fig. 3, the rows in subarray 59 represented by array positions 60, 62 may include all of the exonic probes for genes A and B, respectively, the rows represented by array positions 64, 66, all of the exonic probes for genes C and D, respectively, and so forth. As will be seen in the section below, the exonic probes in subarray 59 provide positive controls for exon-boundary probe hits observed in subarray 53.
[00046] Subarray 67 includes intronic sequences for each of the genes in the selected gene pairs whose fusions are to be detected. Specifically, the array includes, for each gene in a selected gene pair to be tested, one or more intronic probes having the sequences of 5' upstream, 3' downstream, or intermediate intronic regions of at least one intron. As for the exonic probes, the array device preferably includes multiple probes for each gene intron, for example three probes having the sequences of 5' upstream, 3' downstream, and intermediate intronic regions of that gene intron. In a preferred embodiment, all of the selected gene introns are represented by three probes. As above, these probes may be arrayed in a single row, single column, or combinations of different rows and columns. For example, in the array shown in Fig. 3, the rows in subarray 67 represented by array positions 68, 70 may include all of the intronic probes for genes A and B, respectively, and so forth.
[00047] In addition, although not shown here, the device may contain probes from selected 5' and/or 3' untranslated regions of each gene's mRNA, as well as one or more "combined sequence" probes that include selected portions of several exons of the gene.
IV. Detecting Intergenic fusions
[00048] In accordance with another aspect of the invention, the array device of the invention is used for detecting gene fusions in one or more selected pairs of human genes of a test subject. In practicing the method of the invention, test and reference cDNA samples are prepared from mature mRNA samples obtained from the test subject and from a normal individual(s), respectively, and these samples are labeled with first and second different detectable reporters, e.g., fluorescence reporters having fluorescence emission peaks in the red and green regions of the visible spectrum, respectively. Sample preparation can be carried out according to well known methods, for example, by isolating a polyA mRNA fraction from cells from the test and reference individuals, and using the mRNA as a template for a reverse transcriptase, to generate single-stranded cDNA in the presence of all four dNTPs, including dNTPs labeled with a selected fluorescent reporter. The test and reference cDNA samples are prepared separately, and the reference sample may be obtained independently, e.g., from one or more normal individuals and supplied and/or stored separately, for example, as part of a test kit that includes the array device of the invention.
[00049] It will be appreciated that other types of detectable reporters may be used for the nucleotide-labeled probes, for example, probes that can be differentiated on the basis of different absorption peaks in the visible or UV spectra, or antigenic probes that can be detected by reaction with reporter-labeled, antigen-specific antibodies, according to well-known methods. However, it is recognized that the system could also operate in a non-competitive mode, in which the same reporter is used for both test and reference samples, e.g., when carried out in parallel arrays.
[00050] The reporter-labeled test and reference cDNA-containing samples from above are mixed, e.g., in a 1:1 cDNA weight ratio, and reacted with the probes in the array device by adding the sample material to the surface of the array device under hybridization conditions, i.e., conditions that favor hybridization of the labeled cDNAs to complementary-sequence probes on the array device. After a suitable reaction time, the array is washed one or more times to remove non-specifically bound cDNA and incompletely bound cDNA (i.e., cDNA from normal exon targets that is only partially hybridized to an exon-exon boundary probe, rather than completely hybridized across the probe's full exon-exon boundary), and prepared for reading by an array detector, for detecting the test and reference reporter levels at each probe position in the array. Methods for washing to achieve selective removal of DNA hybrids below a given number of complementary bases, e.g., 30-40 bases, are well known.
[00051] Suitable array detectors are well known, and in the case of fluorescence-labeled cDNA, function to detect and quantify the fluorescence emission for each fluorescence reporter at each array position. The detector may also include software for storing the detected emission values, and for calculating and storing a test/reference ratio at each array position. From the test/reference ratios, the software may also find an average test/reference value for some group of "neutral" probes, i.e., probes corresponding to housekeeping genes, and normalize all of the probe test/reference values to that average. Software for performing these operations is commercially available; one suitable array detector with built-in reader is commercially available. Detection using single reporter dyes and only test material (non-comparative approach without internal reference RNA/cDNA) is also possible.
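The normalization step described above (dividing every test/reference ratio by the average ratio of a group of "neutral" housekeeping probes) might look like the following Python sketch. The function name, the dictionary layout and the probe identifiers are hypothetical, and positions with a zero reference reading are simply skipped here, which is a simplifying assumption rather than anything stated in the text.

```python
from statistics import mean

def normalized_ratios(test, reference, neutral_ids):
    """Compute test/reference ratios per array position and scale them so
    that the neutral (housekeeping) probes average to 1.0."""
    ratios = {pid: test[pid] / reference[pid]
              for pid in test
              if reference.get(pid, 0) > 0}   # skip positions with no reference signal
    baseline = mean(ratios[pid] for pid in neutral_ids if pid in ratios)
    return {pid: r / baseline for pid, r in ratios.items()}

# Hypothetical intensities: two housekeeping probes and one boundary probe.
test_intensity = {"hk1": 200.0, "hk2": 400.0, "BCR13-ABL2": 1200.0}
ref_intensity = {"hk1": 100.0, "hk2": 200.0, "BCR13-ABL2": 100.0}
norm = normalized_ratios(test_intensity, ref_intensity, ["hk1", "hk2"])
```

In this toy example the test channel is uniformly twice as bright as the reference channel on the housekeeping probes, so normalization halves every ratio and the boundary probe ends up at 6.0 rather than 12.0.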
[00052] Figs. 4 and 5 are flow diagrams illustrating steps for identifying intergenic fusions from test and reference values recorded from the array, in accordance with the invention, when using the comparative test/reference detection approach. The steps of detecting test and reference reporter values at each array position (box 72) and calculating a test/reference value for each array position (box 74) have been discussed above. For each gene pair under consideration, the program considers the test values for all of the exon-exon boundary spanning (ES) probes for that gene pair, calculates an average test value, and determines a test/mean test value for each exon-spanning probe, as at 76. Those ES probes having a test/mean test value greater than a selected threshold value, e.g., 1.5, and preferably at least 2, 3, 4, 5, or 10, are identified and stored at 78 as "candidate" probes for a gene fusion event.
[00053] In a similar operation, shown in box 80, the program calculates the mean test/reference ratio over the ES probes in the gene pair of interest, and determines the ratio of test/reference value to mean test/reference value (T/R/mean T/R) for each of the ES probes for that gene pair. Those probes that have an elevated T/R/mean T/R ratio, e.g., a ratio greater than a selected threshold value, e.g., 1.5, and preferably at least 2, 3, 4, 5, or 10, are similarly identified and stored at 80 as "candidate" probes for an intergenic fusion event. These steps are repeated for all ES probes in all gene pairs on the array, and the results are stored at 78.
[00054] For each of the ES probes identified above as having an elevated test/mean test ratio, or T/R/mean T/R ratio, the program follows the logic in Fig. 5 to classify that ES probe as either a possible, but inconclusive, intergenic fusion or a confirmed intergenic fusion. With reference to Fig. 5, a candidate ES probe from 78 is selected, and its test/mean test and T/R/mean T/R values are retrieved. If only one of the values is elevated above its selected threshold, at 88, the probe is classified as an inconclusive intergenic fusion, at 94. For inconclusive fusions, the program may simply report the results as "inconclusive" or attempt to assign a confidence value for the result, based on the extent to which the key test/mean test and T/R/mean T/R values are above or below the selected thresholds, and the extent to which the test or T/R values of corresponding exonic or intronic sequences might suggest a basis for ambiguity (see below). An inconclusive result might also provide the basis for further analysis, either by operator review of the raw data produced by the device, or by alternative methods of analysis, e.g., by in situ analysis of stained chromosomes, to attempt to resolve the ambiguity.
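The candidate screen of paragraphs [00052] and [00053] can be sketched in Python for a single gene pair. This is a minimal illustration under stated assumptions: the function name `find_candidates`, the `{probe_id: (test, reference)}` layout, and the probe labels are hypothetical, and only the 1.5 threshold comes from the text. Each mean is taken over all ES probes of the gene pair, including the probe being scored.

```python
from statistics import mean

def find_candidates(es_probes, threshold=1.5):
    """Return ES probes whose test/mean test or T/R/mean T/R exceeds threshold.

    es_probes: {probe_id: (test, reference)} for one gene pair (assumed layout).
    Result: {probe_id: (test_fold, tr_fold)} for the candidate probes.
    """
    mean_test = mean(t for t, _ in es_probes.values())
    mean_tr = mean(t / r for t, r in es_probes.values())
    candidates = {}
    for pid, (t, r) in es_probes.items():
        test_fold = t / mean_test      # test / mean test
        tr_fold = (t / r) / mean_tr    # (T/R) / mean (T/R)
        if test_fold > threshold or tr_fold > threshold:
            candidates[pid] = (test_fold, tr_fold)
    return candidates

# Toy gene pair with four exon-spanning probes; one is strongly elevated.
es_probes = {
    "A1>B1": (100.0, 100.0),
    "A1>B2": (100.0, 100.0),
    "A2>B1": (1000.0, 100.0),
    "A2>B2": (100.0, 100.0),
}
candidates = find_candidates(es_probes)
```

A probe enters the candidate set when either ratio is elevated; whether both are elevated is decided in the later classification step.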
[00055] If the test/mean test and T/R/mean T/R ratios are both elevated above the preselected thresholds, at 88, the program then checks the exonic sequences for that exon-exon boundary spanning pair in subarray 59, to determine if either of the exons in the ES probe has an elevated T/R/mean T/R value, indicating a non-specific interaction with one of the exons which may be the cause of the elevated ES value(s) (box 90). That is, the exonic sequence probes act as positive controls to rule out non-specific hybridization to the ES probe. Assuming the exonic sequence values are normal, the intronic sequences for the two genes are checked, in this case to confirm that there is little or no significant test or reference binding to the intronic sequences. The intronic sequences thus act as negative controls, to ensure that test and reference probe reporter values are not elevated because of false reporting from one or more intronic sequences, which should have near-zero binding.
[00056] If either the corresponding exonic or intronic sequences are elevated in test-reporter values, at 80, the ES probe is again classified as an inconclusive intergenic fusion, at 94. If both exonic and intronic sequences (positive and negative controls) are normal, the result is classified as a confirmed intergenic fusion, at 92.
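The decision logic of Fig. 5 as described in paragraphs [00054] to [00056] can be summarized in a small Python sketch. It is an illustration only: the function name `classify`, its arguments, and in particular the `intron_max` cutoff for "near-zero" intronic binding are assumptions, since the text does not give a numerical intronic limit. The sketch assumes the probe is already a candidate, i.e., at least one of its two ratios is elevated.

```python
def classify(test_fold, tr_fold, exon_folds, intron_signals,
             threshold=1.5, intron_max=0.1):
    """Classify a candidate ES probe as 'confirmed' or 'inconclusive'.

    test_fold, tr_fold: test/mean test and T/R/mean T/R for the ES probe.
    exon_folds: T/R/mean T/R values of the exonic control probes.
    intron_signals: normalized signals of the intronic control probes.
    intron_max: hypothetical cutoff for "near-zero" intronic binding.
    """
    # Both ratios must be elevated; otherwise the result is inconclusive.
    if not (test_fold > threshold and tr_fold > threshold):
        return "inconclusive"
    # Exonic controls: an elevated exon suggests non-specific binding.
    if any(f > threshold for f in exon_folds):
        return "inconclusive"
    # Intronic controls: binding should be close to zero.
    if any(s > intron_max for s in intron_signals):
        return "inconclusive"
    return "confirmed"

clean = classify(3.0, 3.0, exon_folds=[1.0, 0.9], intron_signals=[0.02, 0.01])
one_ratio = classify(3.0, 1.1, exon_folds=[1.0, 0.9], intron_signals=[0.02])
bad_exon = classify(3.0, 3.0, exon_folds=[2.4, 0.9], intron_signals=[0.02])
```

Ordering the checks this way mirrors the flow diagram: the control probes are consulted only once both ES ratios have passed the threshold.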
[00057] Returning to Fig. 4, after the above confirmation analysis has been carried out for a selected ES probe with elevated test/mean test and/or T/R/mean T/R values, the program repeats the analysis, at 86, for each of the ES probes identified and stored at 78. When all of the elevated ES probes have been considered and classified, the results are displayed at 84. The displayed information will report each confirmed, and optionally, each inconclusive gene fusion, the genes and chromosomes making up the fusion, the direction of the fusion, and the exons at which the fusion has occurred.
[00058] By way of illustration, Fig. 6 shows an array device 50 with hypothetical probe reporter values shown in gray (average test/reference values) or black (elevated test/reference values). The array shows two probes that represent potential gene fusions: probe 96 representing a possible gene fusion between exon i in gene A and exon j in gene B, and probe 98 representing a possible gene fusion between exon k in gene C and exon l in gene D. It is recognized that the binding patterns shown here may be in duplicate or triplicate. To confirm the first gene fusion, the system will confirm that the test/mean test and T/R/mean T/R values for that probe are both above threshold values indicative of a gene fusion. The program will then identify the exonic probes associated with exon i in gene A (one of probes 60 in subarray 59), and associated with exon j in gene B (one of probes 62 in subarray 59), and confirm that neither has an elevated T/R/mean T/R value. It will then look at test and test/reference values for introns from the same genes, e.g., from the top two rows in subarray 657, to confirm that these are also within normal ranges. Based on the pattern shown, this probe will be confirmed as a gene fusion between exons i and j from genes A and B.
[00059] The analysis with respect to probe 98, representing a possible gene fusion between exon k in gene C and exon l in gene D, will be handled in the same way. Assume in this case, however, that the probes for exon i in gene C, one of which is represented by probe 100 in the third row in subarray 59, are elevated in either test/mean test or T/R/mean T/R value, as indicated. The elevated test value in probe 100 indicates that the elevated values for exon-spanning probe 98 may be due to binding of other test sequences to that exonic sequence, producing an artificial elevation in both probes. Based on this result, the fusion associated with probe 98 would be classified as "inconclusive."
V. Gene array system
[00060] An array-device system in accordance with the invention includes the array device described above, an array detector or scanner for detecting the test and reference reporter levels at each probe position in the array, a processor operatively connected to the array detector for analyzing test and reference values measured by the detector and for identifying gene fusions, and a device operatively connected to the processor for storing and displaying the results from the processor.
[00061] The array scanner includes an optical reader or laser device for determining the intensity and color at each array position, and from this information, determining a test intensity and a test/reference ratio for each array position. The processor in the system operates to identify those exon-exon boundary spanning probes having elevated test/mean test and T/R/mean T/R values from among all of the exon-exon boundary spanning probes on the device, as detailed above. The processor may further operate to confirm that an exon-exon boundary spanning probe identified with a gene fusion is not associated with an exon in a single gene that has the highest test/reference value for an exon within the single gene or a test/reference ratio that is at least 1.5 fold greater than the average test/reference ratio for the exons within that gene. The processor may further operate to confirm that an exon-spanning probe identified with a gene fusion is not associated with a gene that has a mean test/reference ratio of intronic probes that is statistically different from the mean test/reference ratio for the intronic probes of other genes in the gene pairs. These operations are embodied in the flow diagrams described above with respect to Figs. 4 and 5. Machine-readable code designed to carry out these operations under the control of the processor forms another aspect of the invention.
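The exon check that the processor may perform can be sketched as follows. This hedged Python example covers only the fold-change part of the criterion (an exon whose test/reference ratio is at least 1.5 fold greater than the average for the exons of its gene); the "highest value" alternative and the statistical comparison of intronic probes are not implemented. The function names and the `{exon_id: T/R}` layout are assumptions.

```python
from statistics import mean

def exon_fold(exon_id, gene_exon_tr):
    """T/R of one exon divided by the mean T/R over all exons of its gene."""
    return gene_exon_tr[exon_id] / mean(gene_exon_tr.values())

def exon_is_suspect(exon_id, gene_exon_tr, fold=1.5):
    """True when the exon's T/R is at least `fold` times the gene average."""
    return exon_fold(exon_id, gene_exon_tr) >= fold

# Toy test/reference ratios for the exons of one gene (hypothetical values).
gene_c = {"exon1": 1.0, "exon2": 1.0, "exon3": 4.0}
suspect = [e for e in gene_c if exon_is_suspect(e, gene_c)]
```

An ES probe that spans a suspect exon would be routed to the "inconclusive" class rather than reported as a confirmed fusion.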
[00062] While the invention has been described with respect to particular features and embodiments, it will be appreciated how various changes and modifications may be made without departing from the spirit of the invention.

Claims

IT IS CLAIMED:
1. A probe array device for detecting intergenic fusions between a selected pair of genes in the same or different chromosomes, comprising
(a) a substrate having a surface defining an array of probe positions, and
(b) attached to the array positions, one probe per position, a plurality of probes including:
(i) for each of the selected pair(s) of genes, a plurality of exon-exon boundary probes whose sequences span each of the possible combinations of illegitimate exon-exon boundaries, in both gene fusion transcript directions, said exon-exon boundary probes having a size range of 50-70 bases and no more than about 40 bases in each boundary exon,
(ii) for each gene in the selected pair(s) of genes, a plurality of exonic probes having the sequences of 5' upstream, 3' downstream, or intermediate exonic regions of a plurality of exons in the gene, and
(iii) for each gene in the selected pair(s) of genes, a plurality of intronic probes having the sequences of 5' upstream, 3' downstream, or intermediate intronic regions of at least one intron.
2. The array of claim 1, wherein the plurality of exonic probes include for each exon in a gene in the selected pair(s) of genes, at least three probes having the sequences of 5' upstream, 3' downstream, and intermediate exonic regions of that gene exon.
3. The array of claim 1, wherein the plurality of intronic probes includes at least one probe for each intron in a gene in the selected pair(s) of genes.
4. The array of claim 3, wherein the plurality of intronic probes includes at least three probes having the sequences of 5' upstream, 3' downstream, and intermediate intronic regions of a selected intron.
5. The array of claim 1, wherein the plurality of probes attached to the array position on the substrate are in the size range 50-70 bases.
6. The array of claim 1, for detecting gene fusions associated with leukemia in a human subject, wherein the selected pairs of human genes may include one or more gene pairs, including, for example, BCR-ABL, E2A-PBX, E2A-HLF, SIL-TAL, TEL-AML1, AML1-EVI1, CHIC-TEL, AML1-ETO, PML-RARA, CBFB-MYH11, MLL-AF4, MLL-MLL, and MLL-MSF.
7. A method for detecting genomic gene fusions in one or more selected pairs of human genes of a test subject, comprising
(a) reacting with the array device of claim 1, under DNA hybridization conditions, test and reference cDNA prepared from mature mRNA samples obtained from the subject and from a normal individual(s), respectively, and labeled with first and second different detectable reporters,
(b) washing the array device under conditions effective to remove test and reference cDNAs that bind to one boundary exon only in an exon-exon boundary probe;
(c) detecting the test and reference reporter levels at each probe position in the array, and
(d) identifying gene fusions and the exon-exon boundaries at which the fusion has occurred by identifying those exon-exon boundary probes associated with a given gene pair in which the test value and test/reference ratio is statistically elevated with respect to other exon-spanning probes in that gene pair.
8. The method of claim 7, wherein step (c) includes identifying those exon-exon boundary spanning probes having:
(i) the highest test value within the exon-exon boundary spanning probes for a selected gene pair, and a test value that is at least 1.5 fold greater than the mean test values for the exon-exon boundary spanning probes in that gene pair; and
(ii) the highest test/reference ratio within the exon-exon boundary spanning probes for a selected gene pair and a test/reference ratio that is at least 1.5 fold greater than the mean test/reference ratio for the exon-exon boundary spanning probes in that gene pair.
9. The method of claim 8, which further includes the step of confirming that an exon-exon boundary spanning probe identified in step (c) is not associated with an exon in a single gene that has the highest test/reference value for an exon within the single gene or a test/reference ratio that is at least 1.5 fold greater than the average test/reference ratio for the exons in that gene.
10. The method of claim 8, which further includes the step of confirming that an exon-exon boundary spanning probe identified in step (c) is not associated with a gene that has a mean test/reference ratio of intronic probes that is statistically different from the mean test/reference ratio for the intronic probes of other genes in the gene pairs.
11. A system for detecting intergenic fusions between a pair of genes in the same or different chromosomes, comprising
A. a probe array device having
(a) a substrate having a surface defining an array of probe positions, and
(b) attached to the array positions, one probe per position, including:
(i) for each of the selected pair(s) of genes, a plurality of exon-exon boundary probes whose sequences span each of the possible combinations of illegitimate exon-exon boundaries, in both gene fusion transcript directions, said exon-exon boundary probes having a size range of 50-70 bases and no more than about 40 bases in each boundary exon,
(ii) for each gene in the selected pair(s) of genes, a plurality of exonic probes having the sequences of 5' upstream, 3' downstream, or intermediate exonic regions of a plurality of exons in the gene, and
(iii) for each gene in the selected pair(s) of genes, a plurality of intronic probes having the sequences of 5' upstream, 3' downstream, or intermediate intronic regions of at least one intron.
B. an array detector for detecting the test and reference reporter levels at each probe position in the array,
C. a processor operatively connected to the array detector for identifying gene fusions and the exon boundaries at which the fusion has occurred by identifying those exon-exon boundary spanning probes associated with a given gene pair in which the test value and test/reference ratio is statistically elevated with respect to other exon-exon boundary spanning probes in that gene pair, and
D. a device operatively connected to the processor for storing and displaying the results from the processor.
12. The system of claim 11, wherein the plurality of exonic probes in the array device include for each exon in a gene in the selected pair(s) of genes, at least three probes having the sequences of 5' upstream, 3' downstream, and intermediate exonic regions of that gene exon.
13. The system of claim 11, wherein the plurality of intronic probes in the array device includes at least one probe for each intron in a gene in the selected pair(s) of genes.
14. The array of claim 11, wherein the plurality of intronic probes in the array device includes at least three probes having the sequences of 5' upstream, 3' downstream, and intermediate intronic regions of a selected intron.
15. The array of claim 1, wherein the plurality of probes attached to the array position on the substrate of the array device are in the size range 50-70 bases.
16. The system of claim 11, for detecting gene fusions associated with leukemia in a human subject, wherein the selected pairs of human genes may include, for example, pairs from the group consisting of BCR-ABL, E2A-PBX, E2A-HLF, SIL-TAL, TEL-AML1, AML1-EVI1, CHIC-TEL, AML1-ETO, PML-RARA, CBFB-MYH11, MLL-AF4, MLL-MLL, and MLL-MSF.
17. The system of claim 11, wherein the array scanner includes an optical reader or laser device for determining the intensity and color at each array position, and from this information, determining a test intensity and a test/reference ratio for each array position.
18. The system of claim 17, wherein the processor operates to identify those exon-exon boundary spanning probes having:
(i) the highest test value within the exon-exon boundary spanning probes for a selected gene pair, and a test value that is at least 1.5 fold greater than the mean test values for the exon-exon boundary spanning probes in that gene pair; and
(ii) the highest test/reference ratio within the exon-exon boundary spanning probes for a selected gene pair and a test/reference ratio that is at least 1.5 fold greater than the mean test/reference ratio for the exon-spanning probes in that gene pair.
19. The system of claim 18, wherein the processor further operates to confirm that an exon-exon boundary spanning probe identified with a gene fusion is not associated with an exon in a single gene that has the highest test/reference value for an exon within the single gene or a test/reference ratio that is at least 1.5 fold greater than the average test/reference ratio for the exons within that gene.
20. The system of claim 18, wherein the processor further operates to confirm that an exon-exon boundary spanning probe identified with a gene fusion is not associated with a gene that has a mean test/reference ratio of intronic probes that is statistically different from the mean test/reference ratio for the intronic probes of other genes in the gene pairs.
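The exon-exon boundary probe set recited in claims 1 and 11 (every combination of exons from the two genes, in both fusion transcript directions, 50-70 bases long with no more than about 40 bases in each boundary exon) can be illustrated with a Python sketch. It is a simplified illustration, not the applicant's probe-design procedure: the function name `boundary_probes`, the toy sequences, and the symmetric `flank` parameter are assumptions, strand orientation is ignored, and each exon is assumed to be at least `flank` bases long.

```python
from itertools import product

def boundary_probes(exons_a, exons_b, flank=30):
    """Build one probe per exon pair and per fusion direction.

    exons_a, exons_b: {exon_name: sequence} for the two genes (assumed layout).
    flank: bases taken from each boundary exon, so probe length is 2 * flank.
    """
    # 25..35 per side gives 50-70 bases total and at most 40 bases per exon.
    if not 25 <= flank <= 35:
        raise ValueError("flank must give a 50-70 base probe")
    probes = {}
    for (name_a, seq_a), (name_b, seq_b) in product(exons_a.items(),
                                                    exons_b.items()):
        # Gene A upstream of gene B: 3' end of exon A joined to 5' end of exon B.
        probes[f"{name_a}>{name_b}"] = seq_a[-flank:] + seq_b[:flank]
        # Reverse direction: gene B upstream of gene A.
        probes[f"{name_b}>{name_a}"] = seq_b[-flank:] + seq_a[:flank]
    return probes

# Toy exons (40 bases each); real exon sequences would be used in practice.
gene_a = {"A1": "AC" * 20, "A2": "GT" * 20}
gene_b = {"B1": "AG" * 20, "B2": "CT" * 20, "B3": "CA" * 20}
probes = boundary_probes(gene_a, gene_b)
```

For genes with m and n exons this yields 2 × m × n boundary probes, which is why the number of array positions grows quickly with the number of gene pairs covered.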
PCT/US2009/005460 2008-10-03 2009-10-05 Method, array and system for detecting intergenic fusions WO2010039275A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10273308P 2008-10-03 2008-10-03
US61/102,733 2008-10-03

Publications (1)

Publication Number Publication Date
WO2010039275A1 true WO2010039275A1 (en) 2010-04-08

Family

ID=42073789

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/005460 WO2010039275A1 (en) 2008-10-03 2009-10-05 Method, array and system for detecting intergenic fusions

Country Status (1)

Country Link
WO (1) WO2010039275A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012135340A2 (en) * 2011-03-28 2012-10-04 Nanostring Technologies, Inc. Compositions and methods for diagnosing cancer
US9631239B2 (en) 2008-05-30 2017-04-25 University Of Utah Research Foundation Method of classifying a breast cancer instrinsic subtype
CN107345244A (en) * 2016-10-12 2017-11-14 深圳市儿童医院 Detect method, primer and the kit of leukaemia TEL AML1 fusions

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070117144A1 (en) * 2002-10-21 2007-05-24 Sakari Kauppinen Oligonucleotides useful for detecting and analyzing nucleic acids of interest
US20070148667A1 (en) * 2005-09-30 2007-06-28 Affymetrix, Inc. Methods and computer software for detecting splice variants
US20070299097A1 (en) * 2000-06-30 2007-12-27 The Regents Of The University Of California Strategy for Leukemia Therapy

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9631239B2 (en) 2008-05-30 2017-04-25 University Of Utah Research Foundation Method of classifying a breast cancer instrinsic subtype
WO2012135340A2 (en) * 2011-03-28 2012-10-04 Nanostring Technologies, Inc. Compositions and methods for diagnosing cancer
WO2012135340A3 (en) * 2011-03-28 2012-12-20 Nanostring Technologies, Inc. Compositions and methods for diagnosing cancer
US9758834B2 (en) 2011-03-28 2017-09-12 Nanostring Technologies, Inc. Compositions and methods for diagnosing cancer
CN107345244A (en) * 2016-10-12 2017-11-14 深圳市儿童医院 Detect method, primer and the kit of leukaemia TEL AML1 fusions

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09818123

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09818123

Country of ref document: EP

Kind code of ref document: A1