WO2008051374A2 - Rapid screening of oral squamous cell carcinoma - Google Patents

Rapid screening of oral squamous cell carcinoma

Info

Publication number
WO2008051374A2
Authority
WO
WIPO (PCT)
Prior art keywords
genes
alpha
collagen
oral
type
Prior art date
Application number
PCT/US2007/021687
Other languages
French (fr)
Other versions
WO2008051374A9 (en)
WO2008051374A3 (en)
Inventor
Amy F. Ziober
Barry L. Ziober
Original Assignee
The Trustees Of The University Of Pennsylvania
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Trustees Of The University Of Pennsylvania
Publication of WO2008051374A2
Publication of WO2008051374A9
Publication of WO2008051374A3

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers

Definitions

  • the present invention provides methods for determining the onset, progression, and treatment outcomes of certain head and neck cancers.
  • the methods include detecting a gene signature to classify normal and pathogenic specimens.
  • Head and neck cancers are the sixth most common cancer worldwide and are associated with low survival and high morbidity.
  • OSCC oral squamous cell carcinoma
  • Biopsies are invasive procedures typically involving surgical techniques. Furthermore, biopsies are limited when it comes to lesion size. For example, small lesions may not provide enough material for accurate diagnosis while biopsies taken from large lesions may not accurately reflect every histopathological aspect of the lesion.
  • the biopsy, as a diagnostic tool, has limited sensitivity.
  • additional methodologies would be helpful in screening for pre-malignant and malignant oral cancer lesions.
  • to non-invasively detect oral cancer cells will require easy access to the site where these cancers typically arise and a readily available source of cells. Oral cavity saliva meets both of these criteria.
  • Oral cancer is a significant health problem in the United States and worldwide.
  • the five-year survival rate for oral cancer has not improved significantly over the preceding 20 years and remains at approximately 50%.
  • although patients diagnosed at an early stage of the disease typically have an 80% chance for cure and functional outcomes, currently most oral cancer patients are identified when the pathology has already reached an advanced stage.
  • a convenient and accurate means for detecting oral cancer would decrease patient morbidity and mortality.
  • the present invention provides methods for monitoring the onset, progression, and treatment outcomes of oral squamous cell carcinoma (OSCC).
  • the methods include detecting an OSCC gene signature in order to distinguish normal from OSCC specimens.
  • systems and other materials for diagnosing OSCC in a tissue sample are also provided. Using a supervised-learning algorithm, an exemplary gene signature for OSCC was identified that can be applied to the accurate classification of normal and OSCC specimens.
  • the present invention fulfills two necessary, but heretofore unmet, prerequisites for non-invasively monitoring oral cancer onset, progression, and treatment: the identification of specific biomarkers for oral cancers, and non-invasive access plus monitoring of these biomarkers at the point of care by minimally-trained health care personnel. [0009] Additional information concerning this invention can be found at Amy F. Ziober, et al., Clin Cancer Res 2006;12:5960-5971, which is incorporated herein by reference in its entirety.
  • FIG. 1 depicts the data resulting from the analysis of microarray data for a total of 2207 genes, graphically represented in Treeview; gene signatures of interest are highlighted by dark lines, and tree headings for tumor and normal specimens are labeled.
  • FIG. 2 demonstrates how two major classes of samples were identified using the above-referenced 2207 genes, representing a distinct separation of tumor and normal specimens. Tree headings for each of the two groups are labeled, as are: anatomical sites in the oral cavity from which samples were surgically removed; stage; and, tissue number.
  • FIGS. 3A and 3B depict an unsupervised classification analysis of patient-matched normal oral mucosal and OSCC samples, respectively.
  • samples derived from tongue are represented by squares (■) while non-tongue samples are represented by triangles (▲). Clustering of similar samples is contained within circles.
  • FIG. 4 depicts site specific gene expression patterns, labeled to indicate patterns that are (independently) up-regulated and that are associated with tongue samples. Regions identified in FIG. 1 as site specific are enlarged. Genes are identified using Affymetrix numbers.
  • FIG. 5 depicts site specific gene expression patterns, labeled to indicate patterns that are (independently) up-regulated and that are associated with non-tongue samples. Regions identified in FIG. 1 as site specific are enlarged, and headings for tumor and normal specimens are labeled. Genes are identified using Affymetrix numbers.
  • FIG. 7 shows results of immunohistochemical staining analysis in normal and OSCC oral mucosa for MMP-1, Ln-5 γ2 chain, and MMP-3.
  • the present invention provides methods for monitoring the onset, progression, and treatment outcomes of oral squamous cell carcinoma (OSCC).
  • the methods include detecting a particular OSCC gene signature in order to distinguish normal from OSCC specimens.
  • methods of assessing the absence or presence of oral squamous cell carcinoma in a subject comprising determining, in an oral tissue sample from said subject, the expression levels of at least a subset of genes belonging to a class of oral squamous cell carcinoma-associated genes, followed by comparing the determined expression levels with expression levels of such genes in a patient in which oral squamous cell carcinoma (“OSCC”) is known to be present.
  • the gene expression profiles of identified OSCC patients may be used as a benchmark.
  • the subset of genes belonging to a class of oral squamous cell carcinoma-associated genes can comprise fewer than eight genes, or can comprise eight or more genes.
  • the class of oral squamous cell carcinoma-associated genes includes at least 20 genes from a group of 25 genes identified as constituting a highly accurate OSCC gene expression signature.
  • the 25-gene "predictor set" includes: TU3A protein; beta A inhibin; matrix metalloproteinase 1; matrix metalloproteinase 9; chemokine ligand 13; matrix metalloproteinase 11; type V, alpha 2 collagen (first region); type III, alpha 1 collagen; 601658812R1 NIH MGC 69 Homo sapiens cDNA clone IMAGE:3886131 3' mRNA; type I, alpha 1 collagen; Homo sapiens NKG5 gene; urokinase plasminogen activator; type V, alpha 1 collagen (first region); myosin X; lysyl oxidase-like 2; type V, alpha 1 collagen (second region); type V, alpha
  • the predictive strength of this 25-gene set has been confirmed by testing of OSCC and control specimens. See Example 5, infra.
  • the disclosed methods may further comprise comparing the determined gene expression levels to expression levels of such genes in a patient in which oral squamous cell carcinoma is known to be absent. Accordingly, in addition to a comparison to expression levels from identified OSCC patients, the determined test subject expression levels may also be compared to expression levels from identified control subjects.
  • Control and OSCC patient data sets may be obtained from published sources (see, e.g., O'Donnell RK et al. Oncogene. 2005;24:1244-51; see also National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO)), or may be independently obtained.
  • NCBI National Center for Biotechnology Information
  • Additional methods of assessing the absence or presence of oral squamous cell carcinoma in a subject comprising determining, in an oral tissue sample from the subject, the expression levels of at least a subset of genes belonging to a class of oral squamous cell carcinoma-associated genes, followed by comparing the determined expression levels with expression levels of such genes in a patient in which oral squamous cell carcinoma ("OSCC") is known to be absent.
  • the disclosed methods may involve a comparison of the subject's gene expression profile either to expression levels from known OSCC patients, or to expression levels from subjects in which OSCC is known to be absent, or the comparison may be with respect to both OSCC patient expression levels and control subject expression levels.
  • the subset of genes, the expression levels of which are compared to expression level data from OSCC patients, control subjects, or both, comprises at least eight genes, at least 10 genes, or at least 15 genes.
  • the class of oral squamous cell carcinoma-associated genes can include at least 20 genes from a group of 25 genes identified as constituting a highly accurate OSCC gene expression signature, and the subset of genes belonging to this class can comprise at least eight, at least 10, or at least 15 genes belonging to this class.
  • the instant methods of assessing the absence or presence of oral squamous cell carcinoma in a subject are highly discriminating and can result in the classification of tumor and normal samples with approximately 96% accuracy (see Example 5, infra).
  • the oral tissue sample for use in the disclosed methods may be derived from a variety of anatomical sites in the oral cavity of the test subject.
  • the oral tissue sample may be from one or more of the tongue, buccal mucosa, lips, mandible epithelia, gum, or mouth floor of the subject. Tissue samples from these sites are easily obtainable and represent a readily available source of cells from among which oral cancer cells can be detected using minimally invasive techniques according to the instant methods.
  • tissue samples from these anatomical sites in the oral cavity can be obtained using swabs, fine needle aspiration biopsy, shave biopsy, or other techniques that entail only a nominal degree of invasiveness and patient discomfort.
  • the instant methods are compatible with obtaining tissue samples in a manner that requires the excision of minimal amounts of tissue from readily-accessible anatomical sites.
  • Oral cavity saliva which typically contains a sufficient cellular complement, also meets the necessary criteria for the early, accurate, and non-invasive identification of the described biomarkers from among the oral cellular population.
  • Many techniques for the retrieval of oral tissue samples are known to those skilled in the art, and the present invention contemplates the use of any such procedure in connection with the disclosed methods.
  • the subset of genes can comprise at least eight genes belonging to the class of OSCC-associated genes. In other embodiments, the subset of genes can include at least 10 genes or at least 15 genes.
  • the class of oral squamous cell carcinoma-associated genes can include at least 20 genes from a group of 25 genes, identified supra, as constituting a highly accurate OSCC gene expression signature, and the subset of genes belonging to this class can comprise at least eight, at least 10, or at least 15 genes belonging to this class.
  • the provided systems represent systems for rapid, accurate, point-of-care screening, detection, and diagnosis of oral cancer.
  • Microarrays that contain probes representing genes of interest are known in the art and may be prepared by commercial vendors according to customer specifications (e.g., Affymetrix, Santa Clara, CA).
  • the probes traditionally constitute mRNA, cDNA, or cRNA
  • the solid-support surface typically comprises glass, silicon, nylon substrate, or other film material.
  • Target polynucleotide probes are deposited onto the solid-support surface by techniques known in the art such as photolithography, pipette, drop-touch, piezoelectric (including ink-jet), electric, and other suitable means.
  • the result may comprise a traditional DNA chip array, a microfluidics-based device, or lab-on-a-chip system. All such variations and systems are contemplated as being within the scope of the instant invention.
  • the systems for the diagnosis of OSCC in a tissue sample can additionally comprise a hybridization solution to assist the hybridization between the target polynucleotides and complementary material from a tissue sample.
  • although hybridization may take place without the assistance of one or more hybridization solutions, hybridization reagents, several types of which are known among those skilled in the art, may be included in order to enhance the efficacy of the present systems.
  • the instant systems can also include a stain solution.
  • stain solutions can permit the visualization of hybridized target polynucleotide/complementary material and thereby quantification of experimental results.
  • the stain solution can comprise a streptavidin phycoerythrin conjugate, which would permit scanning and detection of hybridized polynucleotides.
  • Other examples of staining solutions and staining techniques are widely recognized among skilled artisans.
  • the inventive systems for the diagnosis of oral squamous cell carcinoma in tissue sample can further include a wash solution. Appropriate wash buffers are known in the art and can be included in the instant systems for the purpose of, inter alia, permitting a post-hybridization wash prior to staining of hybridized polynucleotides, or as a post staining rinse.
  • determining oral squamous cell carcinoma in a subject comprising contacting a tissue sample from the subject with a solid-support surface to which there are bound target polynucleotides corresponding to at least a subset of genes belonging to the class of OSCC-associated genes; and, measuring the extent of adhesion between the tissue sample and the target polynucleotides.
  • the instant methods comprise contacting a tissue sample from the subject with a solid-support surface to which there are bound target polynucleotides corresponding to at least eight genes belonging to the class of OSCC-associated genes.
  • the subset of genes comprises at least 10 genes or at least 15 genes.
  • the class of oral squamous cell carcinoma-associated genes can include at least 20 genes from a group of 25 genes, identified supra, as constituting a highly accurate OSCC gene expression signature, and the subset of genes belonging to this class can comprise at least eight, at least 10, or at least 15 genes belonging to this class.
  • the tissue sample for use in the disclosed methods may be derived from a variety of anatomical sites in the oral cavity of the test subject, and may be from one or more of the tongue, buccal mucosa, lips, mandible epithelia, gum, or mouth floor of the subject.
  • the tissue sample may also comprise oral cavity saliva.
  • the details of the solid-support surface, the target polynucleotide, and the techniques used for adhering the target polynucleotides to the solid-support surface may be determined as previously described with respect to the disclosed systems.
  • Microarray data from patient-matched normal and OSCC tissue was generated in order to permit identification of a tissue-specific gene expression signature that is capable of predicting OSCC.
  • the patient group was representative of the general population of patients with OSCC, having a median age 59 and a greater percentage (54%) male. Of the patients reporting, greater than 90% smoked tobacco and/or drank alcohol (data not shown).
  • Patient-paired normal and tumor specimens were compared in order to provide the most statistically representative database for distinguishing gene expression differences between tumor and normal.
  • RNA isolation. For RNA isolation, each tissue specimen was placed in a liquid nitrogen-chilled mortar and the tissue ground to a fine powder. The liquid nitrogen was evaporated, and the tissue was homogenized in Trizol (Invitrogen Corp., Carlsbad, CA). Total RNA was isolated using the Trizol method and dissolved in RNAse-free water. To remove contaminants, the RNA was purified using RNeasy spin columns (Qiagen Inc., Valencia, CA). Each specimen typically yielded 50 μg of total RNA.
  • RNA samples were submitted to the University of Pennsylvania Microarray Facility for microarray analysis using Affymetrix U133A chips (Affymetrix, Santa Clara, CA). Samples were run on an Agilent Bioanalyzer (Agilent Techs. Inc., Palo Alto, CA) to confirm integrity and concentration. For target preparation and hybridization, all protocols were conducted as described in the Affymetrix GeneChip Expression Analysis Technical Manual. Briefly, 5-8 μg of total RNA were converted to first-strand cDNA using Superscript II reverse transcriptase (Invitrogen Corp., Carlsbad, CA) primed by a poly(T) oligomer that incorporates the T7 promoter.
  • Second-strand cDNA synthesis was followed by in vitro transcription for linear amplification of each transcript and incorporation of biotinylated CTP and UTP (Enzo RNA Labeling Kit, Affymetrix, Santa Clara, CA).
  • the cRNA products are fragmented to 200 nucleotides or less, heated at 99 °C for 5 minutes and hybridized for 16 hours at 45 °C to U133A GeneChip microarrays (Affymetrix).
  • the microarrays were washed at low (6X SSPE) and high (100 mM MES, 0.1 M NaCl) stringency and stained with streptavidin-phycoerythrin.
  • EXAMPLE 2 Analysis of microarray data
  • a weighted mean of probe fluorescence was calculated using the One-step Tukey's Biweight Estimate. This Signal value, a relative measure of the expression level, was computed for each assayed gene. Global scaling was applied to allow comparison of gene Signals across multiple microarrays: after exclusion of the highest and lowest 2%, the average total chip Signal was calculated and used to determine what scaling factor was required to adjust the chip average to an arbitrary target of 150. All Signal values from one microarray were then multiplied by the appropriate scaling factor.
  • PCA Principal components analysis
  • FIGS. 3A and 3B are the perspective image with the tongue and non-tongue samples as principal components and axes. Samples derived from tongue are represented by squares (■) while non-tongue samples are represented by triangles (▲). Clustering of similar samples is contained within circles. The separation of site-specific gene expression was readily apparent in the normal specimens but was slightly less distinct in the OSCC samples (FIGS. 3A, 3B).
  • Closer inspection of the site-specific gene expression patterns identified in FIG. 1 demonstrated distinct gene expression profiles for normal tongue tissue compared to normal non-tongue sites (FIG. 4 and FIG. 5).
  • enzymes associated with biochemical processes were up-regulated in the tongue including (as identified by Affymetrix number) cytochrome family members, aldehyde dehydrogenase, monoglyceride lipase, transglutaminase, sulfotransferase, arachidonate 12-lipoxygenase, glutathione S-transferase, and others.
  • signal transduction molecules, including RAB proteins and Rho-GTPase activating proteins, and the c-erb-b2 receptor were elevated in the tongue specimens (FIG. 4).
  • although non-tongue samples did express some biochemical enzymes, the number was far lower than that for the tongue samples (FIG. 5). Unlike the tongue specimens, the non-tongue samples had several types of receptors up-regulated including (as identified by Affymetrix number) growth factor receptors, prostaglandin receptors, and G-protein coupled receptors. Likewise, several growth factors were also elevated, for example the epidermal growth factor, fibroblast growth factor-2 and WNT inhibitory factor. Finally, non-tongue tissues showed expression for several transcription factors, including ets, zinc finger, AP-2 and p300/CBP and signaling molecules like H-Ras suppressor and Grb-2 like proteins. Together, these results indicate that there are unique gene expression patterns between tongue and non-tongue sites in the oral cavity.
  • microtubule-associated protein homolog (Xenopus laevis)
  • the list of 92 genes is comprised of a majority of genes (95%) that are up-regulated in OSCC as compared to normal (similar to that presented in FIG. 1). This list contains genes expressed from 2-fold to over 70-fold in the OSCC with p-values that ranged from 1 × 10⁻⁷ to 0.001 (Table 2). Likewise, the genes which were down-regulated in OSCC ranged from 2- to 33-fold with p-values of 4 × 10⁻⁸ to 7.5 × 10⁻⁴.
  • MMP-1 matrix metalloproteinase-1
  • Ln-5 γ2 laminin-5 gamma-2 chain
  • PCR reaction mixtures consisted of 2 μl of Faststart DNA Master SYBR Green I mixture (containing Taq DNA polymerase, reaction buffer, deoxynucleotide triphosphate mix (with dUTP instead of dTTP), SYBR Green I dye, and 10 mM MgCl2), 0.5 μM of each target primer stock, and 2 or 4 mM MgCl2 (Ln-5 γ2 chain, MMP-1, MMP-3) in a final reaction volume of 20 μl.
  • β-Gus was used as the internal control for normalization and Universal Human RNA (Stratagene, La Jolla, CA) as the standard reference. See Livak KJ & Schmittgen TD. Methods. 2001;25:402-8.
  • β-Gus was selected as the internal control since it was uniformly expressed across all samples by microarray.
  • SVM Support Vector Machines
  • SVM uses recognition and regression estimates to identify class prediction gene sets using a training set of microarray data (Furey TS et al. Bioinformatics. 2000 Oct;16(10):906-14).
  • the SVM algorithm attempts to find a hyperplane that provides separation between the different input data classes such that there is a maximal distance between the hyperplane and the nearest point on any one of the input classes.
  • SVM performs a 10-fold cross-validation to select a set of predictor genes that leads to the smallest error rate.
  • the patient-matched tumor and normal samples were used as the training set and cross-validated.
  • the first and second data sets identified supra, as well as the OSCC and various tumor data sets were used as the test sets.
  • SVM was set to use the Golub gene selection method.
  • each gene is tested for its ability to discriminate between the classes using a signal-to-noise score, which is given by: $S = \frac{\mu_1 - \mu_2}{\sigma_1 + \sigma_2}$
  • $\mu_i$ and $\sigma_i$ ($i = 1, 2$) are the mean and standard deviation of the expression values over the samples in class $i$. Genes with the highest scores are kept for subsequent calculations.
  • the Golub method of gene selection calculates the difference in means between the training and test sets divided by the sum of the standard deviations to identify the best set of predictors.
  • the number of genes used for predictors was set at 25, and the 2207 genes identified by filtering for only genes present, and analyzed by ANOVA with Benjamini-Hochberg multiple testing correction at p < 0.05, were used as the gene pool. Once this set of 25 predictor genes was identified, it was applied to each test set to test the classification accuracy. All SVM calculations were performed using the kernel function set to polynomial dot product (order 1) and the diagonal scaling factor set to 0.
  • b True Value shows the true value of the class of each sample, as either tumor (t) or normal (n). This value is compared with the value in the Prediction column to validate training set.
  • c Prediction shows the predicted class; incorrectly predicted samples are highlighted in gray.
  • d Margin shows the distance (in arbitrary units) to the hyperplane for each of the classes, tumor and normal. Positive scores are assigned to one class, and negative scores are assigned to the other class. The scores are then reported as the margins. This corresponds to the distance from the sample to the separating decision boundary. The larger the margin, the farther away a score is from the boundary, and the more confident is the classification. Predictions are considered unreliable when margins are < 0.5. Interestingly, a majority of the margins in Table 4, 25 of 26 (96%), were > 0.5 and are considered as confident classifications (Tothill RW et al. Cancer Res. 2005;65:4031-40; Mukherjee, S.
  • the Penn data set was comprised of two normal specimens and 13 OSCC specimens, the RO data set, which has previously been reported, consisted of 18 OSCC tumors, and the OSCC data set consisted of 5 tumor and 4 normal samples (from GSE1722; GEO DataSet, NCBI) (O'Donnell RK et al. Oncogene. 2005;24:1244-51).
  • using SVM and the Golub method of gene selection, all (100%) specimens of the Penn data set were correctly classified, as shown in Table 5, below.
  • a Sample number. b True Value shows the true value of the class of each sample, as either tumor (t) or normal (n). This value is compared with the value in the Prediction column to validate training set.
  • Prediction shows the predicted class. Prediction was performed using SVM, Golub method for selecting predictor genes (9), and a 25-gene predictor.
  • d Margin shows the distance (in arbitrary units) to the hyperplane for each of the classes, tumor and normal. Positive scores are assigned to one class, and negative scores are assigned to the other class. The scores are then reported as the margins. This corresponds to the distance from the sample to the separating decision boundary. The larger the margin, the farther away a score is from the boundary, and the more confident is the classification. Predictions are considered unreliable when margins are < 0.5. [0058] However, the margins of 2 predictions were considered unreliable, resulting in an accuracy of 87%. The 25-gene set predictor was able to accurately classify 86% of the RO OSCC data (see Table 6, below).
  • True Value shows the true value of the class of each sample, as either tumor (t) or normal (n). This value is compared with the value in the Prediction column to validate training set.
  • Prediction shows the predicted class; incorrectly predicted samples are highlighted in gray. Prediction was performed using SVM, Golub method for selecting predictor genes (9), and a 25-gene predictor.
  • d Margin shows the distance (in arbitrary units) to the hyperplane for each of the classes, tumor and normal. Positive scores are assigned to one class, and negative scores are assigned to the other class. The scores are then reported as the margins. This corresponds to the distance from the sample to the separating decision boundary. The larger the margin, the farther away a score is from the boundary, and the more confident is the classification. Predictions are considered unreliable when margins are < 0.5.
  • the OSCC predictor correctly classified all 9 samples as tumor or normal in the OSCC data set. However, one prediction was considered unreliable giving an accuracy of 89% (see Table 7, below).
  • True Value shows the true value of the class of each sample, as either tumor (t) or normal (n). This value is compared with the value in the Prediction column to validate training set.
  • Prediction shows the predicted class. Prediction was performed using SVM, Golub method for selecting predictor genes (9), and a 25-gene predictor.
  • d Margin shows the distance (in arbitrary units) to the single hyperplane for each of the classes, tumor and normal. Positive scores are assigned to one class, and negative scores are assigned to the other class. The scores are then reported as the margins. This corresponds to the distance from the sample to the separating decision boundary. The larger the margin, the farther away a score is from the boundary, and the more confident is the classification. Predictions are considered unreliable when margins are < 0.5.
  • NCBI GEO DataSets
  • AML acute myeloid leukemia
  • BE Barrett's associated adenocarcinomas
  • NE normal esophageal mucosa
  • Breast breast carcinoma
  • LE lymphoblastic leukemia
  • RC renal clear cell tumor.
  • True Value shows the true value of the class of each sample, as either tumor (t) or normal (n). This value is compared with the value in the Prediction column to validate training set.
  • Prediction shows the predicted class; incorrectly predicted samples are highlighted in gray. Prediction was performed using SVM, Golub method for selecting predictor genes (9), and a 25-gene predictor.
  • d Margin shows the distance (in arbitrary units) to the hyperplane for each of the classes, tumor and normal. Positive scores are assigned to one class, and negative scores are assigned to the other class. The scores are then reported as the margins. This corresponds to the distance from the sample to the separating decision boundary. The larger the margin, the farther away a score is from the boundary, and the more confident is the classification. Predictions are considered unreliable when margins are < 0.5.
  • beta A activin A, activin AB alpha polypeptide
  • the 25-gene predictor was determined using SVM and the Golub method for gene selection. For predictive strength, a higher value indicates a better predictor.
  • the predictor set of genes was comprised of several epithelial marker genes with categories of potential interest including genes encoding extracellular matrix components; genes involved in cell adhesion, including the fasciclin; genes involved in cell-cell integrity, for example lysyl oxidase-like 2 and snail-homolog 2; genes encoding hydrolyzing activities, including proteins involved in degradation of the extracellular matrix like MMP-1, MMP-9, MMP-11 and urokinase; and cytokines like inhibin beta A and parathyroid hormone-like.
  • the development of OSCC involves stromal and immune-regulatory components. Thus, many of the predictor genes belong to these categories.
  • SVM was chosen because it has been used in several microarray studies with success and appears to be superior to similar algorithms like k-nearest neighbor and PAM.
  • Cross-validation of the training set which consisted of the paired tumor/normal samples from two institutions, resulted in an accuracy rate of 96% using at least 25 genes per class.
  • a highly appropriate test of predictive accuracy is to validate the predictor on an independent set of samples.
  • the samples are split into a training set and a validation set.
  • accuracy rates on the 25 -gene predictor using three independent validation sets were validated and obtained.
  • the three validation sets comprised: an independent OSCC and normal sample set from the University of Pennsylvania, an OSCC sample set previously published and one obtained from the NCBI Gene Dataset (O 'Donnell RK et al. Oncogene. 2005;24:1244-51).
  • the 25-gene predictor had an overall accuracy ranging from 86-89% for these two validation sets. How these numbers compare with the early clinical diagnosis accuracy of OSCC is currently under investigation. It is difficult to determine why some samples were incorrectly classified. This may be the result of other tissue components within the sample, for example bone, or because some samples were mistakenly labeled.
  • the OSCC 25-gene predictor was tested for its ability to classify non-oral cavity tumor and normal samples.
  • the OSCC predictor displayed poor classifications, with accuracies of only 25% (75% of the samples were predicted incorrectly), when applied to microarray data sets (obtained from NCBI GEO DataSets) derived from non-OSCC human cancers. This indicated that the oral cancer gene predictor set was tissue specific.
  • Some of the predictor genes have previously been implicated in OSCC development and progression.
  • several of these genes present in Table 5, supra are those differentially expressed in OSCC tumor and normal mucosa components.
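The margin described in the list above can be illustrated with a minimal sketch for a linear classifier. It assumes the margin is the signed geometric distance from a sample to the hyperplane, that positive scores are assigned to the tumor class, and that a prediction is flagged unreliable when the absolute margin is below 0.5 (the comparison symbol in the text is not legible, so "below" is an assumption). The names `weights`, `bias`, `margin`, and `classify` are hypothetical and do not come from the SVM software used in the study.

```python
import math

def margin(weights, bias, expression):
    """Signed distance from a sample to the hyperplane w.x + b = 0."""
    score = sum(w * x for w, x in zip(weights, expression)) + bias
    norm = math.sqrt(sum(w * w for w in weights))
    return score / norm

def classify(weights, bias, expression, threshold=0.5):
    """Return (label, margin, reliable) for one sample.

    Assumption: positive margins mean "tumor", negative mean "normal",
    and |margin| below `threshold` marks the call as unreliable.
    """
    m = margin(weights, bias, expression)
    label = "tumor" if m > 0 else "normal"
    return label, m, abs(m) >= threshold

# Toy two-gene hyperplane (illustrative numbers only).
w, b = [3.0, 4.0], -5.0
far_sample = classify(w, b, [3.0, 4.0])    # far from the boundary
near_sample = classify(w, b, [1.0, 0.6])   # close to the boundary
```

A sample far from the boundary is classified with a large margin, while a sample close to it is still assigned a class but is flagged as unreliable.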

Abstract

Methods for assaying oral squamous cell carcinoma (OSCC) are disclosed. The methods include detecting an OSCC gene signature to distinguish normal from OSCC specimens. Also provided are systems and other materials for determining OSCC in a tissue sample.

Description

RAPID SCREENING OF ORAL SQUAMOUS CELL CARCINOMA
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of priority to U.S. Provisional Patent Application Serial Number 60/853,205, filed October 19, 2006, which is herein incorporated by reference in its entirety.
FIELD OF THE INVENTION
[0002] The present invention provides methods for determining the onset, progression, and treatment outcomes of certain head and neck cancers. The methods include detecting a gene signature to classify normal and pathogenic specimens. Also provided are systems and other materials for determining the presence of disease in a tissue sample.
BACKGROUND OF THE INVENTION
[0003] Head and neck cancers are the sixth most common cancer worldwide and are associated with low survival and high morbidity. Shingaki S et al. Am. J. Surg. 2003;185:278-284. Cancers of the oral cavity account for 40% of head and neck cancers and include squamous cell carcinomas of the tongue, floor of the mouth, buccal mucosa, lips, hard and soft palate and gingiva. Funk GF et al. Head Neck. 2002;24:165-180; Weinberg MA & Estefan DJ. Amer. Fam. Phys. 2002;65(7):1379-1384. Despite therapeutic and diagnostic advances, the five-year survival rate for oral squamous cell carcinoma (OSCC) remains at about 50%. Funk GF et al. (2002); Weinberg MA & Estefan DJ (2002); Okamoto M et al. J. Oral Pathol. Med. 2002;31:227-233. In addition, aggressive treatment of OSCC is controversial since it can lead to severe disfigurement and morbidity. Ensley JF, Gutkind JS, Jacobs JR, Lippman M, eds. Head and Neck Cancer: Emerging Perspectives. Academic Press, 2003. As a result, many patients with OSCC are either over- or under-treated, with significant personal and socio-economic impact.
[0004] One of the fundamental factors accounting for the poor outcome of patients with OSCC is that a great proportion of oral cancers are diagnosed at advanced stages and therefore treated late. Early detection of oral cancer lesions will greatly improve morbidity statistics associated with late disease treatment and overall patient survival. For example, early detection could lead to frequent patient monitoring, dietary changes, counseling on and cessation of smoking and drinking, preventative drug administration, and/or lesion removal. As such, early diagnosis and treatment of OSCC has been shown to lead to mean survival of over 80% and a good quality of life after treatment. Epstein JB et al. J. Can. Dent. Assoc. 2000, 68:617-621. However, no methodology exists to mass screen for oral cancer lesions early, accurately and easily.
[0005] Currently, clinical examination and histopathological study is the standard diagnostic method used to ascertain whether biopsied material is a precancerous or cancerous lesion. Sobin LH & Wittekind CH. Head and neck tumors. In: Sobin, Wittekind, eds. TNM Classification of Malignant Tumors, at 17-32 (5th ed. Berlin, Germany: Springer-Verlag; 1997). Biopsies are invasive procedures typically involving surgical techniques. Furthermore, biopsies are limited when it comes to lesion size. For example, small lesions may not provide enough material for accurate diagnosis while biopsies taken from large lesions may not accurately reflect every histopathological aspect of the lesion. Finally, the biopsy, as a diagnostic tool, has limited sensitivity. Thus, additional methodologies would be helpful in screening for pre-malignant and malignant oral cancer lesions. However, to non-invasively detect oral cancer cells will require easy access to the site where these cancers typically arise and a readily available source of cells. Oral cavity saliva meets both of these criteria.
[0006] Typically, genetic changes in cancer cells lead to altered gene expression patterns that can be identified long before the cancer phenotype has manifested. When compared to normal mucosa, those changes that occur in the cancer cell can be used as biomarkers. Attempts to find biomarkers that identify OSCC pre-malignant and cancerous lesions have resulted in several candidate genes associated with OSCC tumor progression including p53, cyclin D1, and EGFR. See Greenman J et al. Clin Otolaryngol Allied Sci. 2000;25(1):9-18. Review.; Vielba R et al. Laryngoscope. 2003;113:167-172. However, to date, nothing has shown diagnostic utility in OSCC. It has now been determined that clinical diagnosis requires considering the combined influence of many genes. Expression patterns of many genes have shown dramatic correlations with tumor behavior and patient outcome. Indeed, microarray analysis of several tumor types has demonstrated that global expression profiling can distinguish tumor from normal as well as the class and subtype of cancer in a manner that is far superior to current histopathological diagnostic systems. Golub TR et al. Science. 1999;286:531-537; Singh D et al. Cancer Cell. 2002;1:203-209; O'Donnell RK et al. Oncogene. 2005;24:1244-51; Ginos MA et al. Cancer Res. 2004;64(1):55-63; Somoza-Martin JM et al. J Oral Maxillofac Surg. 2005;63:786-92; Belbin TJ et al. Arch Otolaryngol Head Neck Surg. 2005;131:10-8. Recent independent studies carried out by various research groups indicate that OSCC cells have a unique gene transcription profile, which differs from that of normal cells. See Belbin TJ et al. (2005); Tusher VG et al. Proc. Natl. Acad. Sci. U.S.A. 2001;98:10515; Eisen MB et al. Proc Natl Acad Sci USA. 1998;95:14863-8; Livak KJ & Schmittgen TD. Methods. 2001;25:402-818; Takahashi M et al. Proc Natl Acad Sci U.S.A. 2001;98:9754-9. None of these studies has tested the gene profiles for their ability to identify or predict OSCC.
To date, no accurate, cost-efficient and reproducible method exists that enables mass screening of patients for OSCC.
SUMMARY
[0007] Oral cancer is a significant health problem in the United States and worldwide. The five-year survival rate for oral cancer has not improved significantly over the preceding 20 years and remains at approximately 50%. Although patients diagnosed at an early stage of the disease typically have an 80% chance for cure and functional outcomes, currently most oral cancer patients are identified when the pathology has already reached an advanced stage. Thus, a convenient and accurate means for detecting oral cancer would decrease patient morbidity and mortality.
[0008] The present invention provides methods for monitoring the onset, progression, and treatment outcomes of oral squamous cell carcinoma (OSCC). The methods include detecting an OSCC gene signature in order to distinguish normal from OSCC specimens. Also provided are systems and other materials for diagnosing OSCC in a tissue sample. Using a supervised-learning algorithm, an exemplary gene signature for OSCC was identified that can be applied to the accurate classification of normal and OSCC specimens. The present invention fulfills two necessary, but heretofore unmet, prerequisites for non-invasively monitoring oral cancer onset, progression, and treatment: the identification of specific biomarkers for oral cancers, and non-invasive access plus monitoring of these biomarkers at the point of care by minimally-trained health care personnel.
[0009] Additional information concerning this invention can be found at Amy F. Ziober, et al., Clin Cancer Res 2006;12:5960-5971, which is incorporated herein by reference in its entirety.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
[0011] FIG. 1 depicts the data resulting from the analysis of microarray data for a total of 2207 genes, graphically represented in Treeview; gene signatures of interest are highlighted by dark lines, and tree headings for tumor and normal specimens are labeled.
[0012] FIG. 2 demonstrates how two major classes of samples were identified using the above-referenced 2207 genes, representing a distinct separation of tumor and normal specimens. Tree headings for each of the two groups are labeled, as are: anatomical sites in the oral cavity from which samples were surgically removed; stage; and, tissue number.
[0013] FIGS. 3A and 3B depict an unsupervised classification analysis of patient-matched normal oral mucosal and OSCC samples, respectively. In the provided perspective image with tongue and non-tongue samples as principal components and axes, samples derived from tongue are represented by squares (■) while non-tongue samples are represented by triangles (▲). Clustering of similar samples is contained within circles.
[0014] FIG. 4 depicts site specific gene expression patterns, labeled to indicate patterns that are (independently) up-regulated and that are associated with tongue samples. Regions identified in FIG. 1 as site specific are enlarged. Genes are identified using Affymetrix numbers.
[0015] FIG. 5 depicts site specific gene expression patterns, labeled to indicate patterns that are (independently) up-regulated and that are associated with non-tongue samples. Regions identified in FIG. 1 as site specific are enlarged, and headings for tumor and normal specimens are labeled. Genes are identified using Affymetrix numbers.
[0016] FIG. 6A provides the results from quantitative real-time PCR of matched normal and tumor specimens (n = 5 pairs) for MMP-1, which was performed in triplicate; error bars represent standard deviation.
[0017] FIG. 6B provides the results from quantitative real-time PCR of matched normal and tumor specimens (n = 5 pairs) for Ln-5 γ2, which was performed in triplicate; error bars represent standard deviation.
[0018] FIG. 6C provides the results from quantitative real-time PCR of matched normal and tumor specimens (n = 5 pairs) for MMP-3, which was performed in triplicate; error bars represent standard deviation.
[0019] FIG. 7 shows results of immunohistochemical staining analysis in normal and OSCC oral mucosa for MMP-1, Ln-5 γ2 chain, and MMP-3.
DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
[0020] The present invention provides methods for monitoring the onset, progression, and treatment outcomes of oral squamous cell carcinoma (OSCC). The methods include detecting a particular OSCC gene signature in order to distinguish normal from OSCC specimens. In one embodiment there are provided methods of assessing the absence or presence of oral squamous cell carcinoma in a subject comprising determining, in an oral tissue sample from said subject, the expression levels of at least a subset of genes belonging to a class of oral squamous cell carcinoma-associated genes, followed by comparing the determined expression levels with expression levels of such genes in a patient in which oral squamous cell carcinoma ("OSCC") is known to be present. Therefore, in some of the disclosed methods for assessing a subject for the absence or presence of oral squamous cell carcinoma, the gene expression profiles of identified OSCC patients may be used as a benchmark. The subset of genes belonging to a class of oral squamous cell carcinoma-associated genes can comprise fewer than eight genes, or can comprise eight or more genes.
[0021] In another embodiment of the disclosed methods, the class of oral squamous cell carcinoma-associated genes includes at least 20 genes from a group of 25 genes identified as constituting a highly accurate OSCC gene expression signature. The 25-gene "predictor set" includes: TU3A protein; beta A inhibin; matrix metalloproteinase 1; matrix metalloproteinase 9; chemokine ligand 13; matrix metalloproteinase 11; type V, alpha 2 collagen (first region); type III, alpha 1 collagen; 601658812R1 NIH MGC 69 Homo sapiens cDNA clone IMAGE:3886131 3' mRNA; type I, alpha 1 collagen; Homo sapiens NKG5 gene; urokinase plasminogen activator; type V, alpha 1 collagen (first region); myosin X; lysyl oxidase-like 2; type V, alpha 1 collagen (second region); type V, alpha 2 collagen (second region); parathyroid hormone-like hormone; type IV, alpha 1 collagen; type XI, alpha 1 collagen; myosin IB; osteoblast specific factor 2; 602035015Fl NCI_CGAP_Brn64 Homo sapiens cDNA clone IMAGE:4183107 5' mRNA; snail homolog 2; and, GenBank Accession Number X60469. The predictive strength of this 25-gene set has been confirmed by testing of OSCC and control specimens. See Example 5, infra.
[0022] The disclosed methods may further comprise comparing the determined gene expression levels to expression levels of such genes in a patient in which oral squamous cell carcinoma is known to be absent. Accordingly, in addition to a comparison to expression levels from identified OSCC patients, the determined test subject expression levels may also be compared to expression levels from identified control subjects. Control and OSCC patient data sets may be obtained from published sources (see, e.g., O'Donnell RK et al. Oncogene. 2005;24:1244-51; see also National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO)), or may be independently obtained.
[0023] Additional methods of assessing the absence or presence of oral squamous cell carcinoma in a subject are disclosed, comprising determining, in an oral tissue sample from the subject, the expression levels of at least a subset of genes belonging to a class of oral squamous cell carcinoma-associated genes, followed by comparing the determined expression levels with expression levels of such genes in a patient in which oral squamous cell carcinoma ("OSCC") is known to be absent. Thus, the disclosed methods may involve a comparison of the subject's gene expression profile either to expression levels from known OSCC patients, or to expression levels from subjects in which OSCC is known to be absent, or the comparison may be with respect to both OSCC patient expression levels and control subject expression levels.
[0024] In other embodiments, the subset of genes the expression levels of which are compared to expression levels data from OSCC patients, control subjects, or both, comprise at least eight genes, at least 10 genes, or at least 15 genes. The class of oral squamous cell carcinoma-associated genes can include at least 20 genes from a group of 25 genes identified as constituting a highly accurate OSCC gene expression signature, and the subset of genes belonging to this class can comprise at least eight, at least 10, or at least 15 genes belonging to this class. When the expression levels of substantially ten of the 25-gene predictor class are used, the instant methods of assessing the absence or presence of oral squamous cell carcinoma in a subject are highly discriminating and can result in the classification of tumor and normal samples with approximately 96% accuracy (see Example 5, infra).
[0025] The oral tissue sample for use in the disclosed methods may be derived from a variety of anatomical sites in the oral cavity of the test subject. For example, the oral tissue sample may be from one or more of the tongue, buccal mucosa, lips, mandible epithelia, gum, or mouth floor of the subject. Tissue samples from these sites are easily obtainable and represent a readily available source of cells from among which oral cancer cells can be detected using minimally invasive techniques according to the instant methods. For example, small tissue samples from these anatomical sites in the oral cavity can be obtained using swabs, fine needle aspiration biopsy, shave biopsy, or other techniques that entail only a nominal degree of invasiveness and patient discomfort. Thus, although more invasive techniques can also be used, the instant methods are compatible with obtaining tissue samples in a manner that requires the excision of minimal amounts of tissue from readily-accessible anatomical sites. Oral cavity saliva, which typically contains a sufficient cellular complement, also meets the necessary criteria for the early, accurate, and non-invasive identification of the described biomarkers from among the oral cellular population. Many techniques for the retrieval of oral tissue samples are known to those skilled in the art, and the present invention contemplates the use of any such procedure in connection with the disclosed methods.
[0026] Also provided are systems for the diagnosis of oral squamous cell carcinoma in a tissue sample comprising a solid-support surface; and, bound to said solid-support surface, target polynucleotides corresponding to at least a subset of genes belonging to the class of oral squamous cell carcinoma-associated genes. The subset of genes can comprise at least eight genes belonging to the class of OSCC-associated genes. In other embodiments, the subset of genes can include at least 10 genes or at least 15 genes. The class of oral squamous cell carcinoma-associated genes can include at least 20 genes from a group of 25 genes, identified supra, as constituting a highly accurate OSCC gene expression signature, and the subset of genes belonging to this class can comprise at least eight, at least 10, or at least 15 genes belonging to this class.
[0027] The provided systems represent systems for rapid, accurate, point-of-care screening, detection, and diagnosis of oral cancer. Microarrays that contain probes representing genes of interest are known in the art and may be prepared by commercial vendors according to customer specifications (e.g., Affymetrix, Santa Clara, CA). The probes traditionally constitute mRNA, cDNA, or cRNA, and the solid-support surface typically comprises glass, silicon, nylon substrate, or other film material. Target polynucleotide probes are deposited onto the solid-support surface by techniques known in the art such as photolithography, pipette, drop-touch, piezoelectric (including ink-jet), electric, and other suitable means. The result may comprise a traditional DNA chip array, a microfluidics-based device, or lab-on-a-chip system. All such variations and systems are contemplated as being within the scope of the instant invention.
[0028] The systems for the diagnosis of OSCC in a tissue sample can additionally comprise a hybridization solution to assist the hybridization between the target polynucleotides and complementary material from a tissue sample. Although hybridization may take place without the assistance of one or more hybridization solutions, hybridization reagents, several types of which are known among those skilled in the art, may be included in order to enhance the efficacy of the present systems. The instant systems can also include a stain solution. Stain solutions can permit the visualization of hybridized target polynucleotide/complementary material and thereby quantification of experimental results. For example, if the target polynucleotide is biotin-labeled according to techniques well known among those skilled in the art, then the stain solution can comprise a streptavidin phycoerythrin conjugate, which would permit scanning and detection of hybridized polynucleotides. Other examples of staining solutions and staining techniques are widely recognized among skilled artisans. The inventive systems for the diagnosis of oral squamous cell carcinoma in a tissue sample can further include a wash solution. Appropriate wash buffers are known in the art and can be included in the instant systems for the purpose of, inter alia, permitting a post-hybridization wash prior to staining of hybridized polynucleotides, or as a post-staining rinse.
[0029] There are also provided methods of determining oral squamous cell carcinoma in a subject comprising contacting a tissue sample from the subject with a solid-support surface to which there are bound target polynucleotides corresponding to at least a subset of genes belonging to the class of OSCC-associated genes; and, measuring the extent of adhesion between the tissue sample and the target polynucleotides. In other embodiments, the instant methods comprise contacting a tissue sample from the subject with a solid-support surface to which there are bound target polynucleotides corresponding to at least eight genes belonging to the class of OSCC-associated genes. In other embodiments, the subset of genes comprises at least 10 genes or at least 15 genes. The class of oral squamous cell carcinoma-associated genes can include at least 20 genes from a group of 25 genes, identified supra, as constituting a highly accurate OSCC gene expression signature, and the subset of genes belonging to this class can comprise at least eight, at least 10, or at least 15 genes belonging to this class.
[0030] The tissue sample for use in the disclosed methods may be derived from a variety of anatomical sites in the oral cavity of the test subject, and may be from one or more of the tongue, buccal mucosa, lips, mandible epithelia, gum, or mouth floor of the subject. The tissue sample may also comprise oral cavity saliva. The details of the solid-support surface, the target polynucleotide, and the techniques used for adhering the target polynucleotides to the solid-support surface may be determined as previously described with respect to the disclosed systems.
[0031] The present invention is further defined in the following Examples. It should be understood that these examples, while indicating preferred embodiments of the invention, are given by way of illustration only, and should not be construed as limiting the appended claims. From the above discussion and these examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions.
EXAMPLE 1: Generation of microarray data
[0032] Microarray data from patient-matched normal and OSCC tissue was generated in order to permit identification of a tissue-specific gene expression signature that is capable of predicting OSCC.
Patient Samples and Characteristics
[0033] All matched patient normal and tumor samples and unmatched normal and tumor samples were obtained from surgical resection specimens from patients undergoing surgery for OSCC using standardized procedures. Clinical characteristics of the 13 paired tumor/normal patients are shown in Table 1, below.
TABLE 1 Noncancerous and OSCC matched specimens
Clinical Features        n     %
Sample Size              26
Noncancerous             13    50
OSCC                     13    50
Male Gender              7     54
Median age (yr)          59
Anatomical Site          n     % (of OSCC)
Tongue                   7     54
Floor of the mouth       2     15
Buccal                   1     8
Mandible                 2     15
Gum                      1     8
Clinical Stage
Stage I-II               9     69
Stage III-IV             4     31
Nodal Disease
Positive nodes           6     46
Negative nodes           7     54
[0034] The patient group was representative of the general population of patients with OSCC, having a median age of 59 and a greater percentage (54%) male. Of the patients reporting, greater than 90% smoked tobacco and/or drank alcohol (data not shown). Patient paired normal and tumor specimens were compared in order to provide the most statistically representative data base for distinguishing gene expression differences between tumor and normal.
[0035] Two patient data sets were produced: the first data set was isolated and microarrayed by the present inventors, while the second data set was provided by Dr. Ruth Muschel (Children's Hospital of Philadelphia, PA) and was previously reported. See O'Donnell RK et al. Oncogene. 2005;24:1244-51. The OSCC data set and other human data sets used in this study were downloaded from the GEO DataSets website (National Center for Biotechnology Information, U.S. National Library of Medicine, Bethesda, MD). Cancers of the oral cavity in this study included squamous cell carcinomas of the tongue, floor of the mouth, buccal mucosa, lips, hard and soft palate and gingiva. See Table 1; Funk GF et al. Head Neck. 2002;24:165-180; Weinberg MA & Estefan DJ. Amer. Fam. Phys. 2002;65(7):1379-1384. After resection, matched normal and tumor samples were fresh frozen in liquid nitrogen. Samples were banked at -80°C for storage until later use. To minimize potential field cancerization effects, all normal samples were obtained at the greatest distance from the tumor, typically 2-3 cm, where grossly no appearance of tumor, leukoplakia or erythroplakia could be detected. Touch preps were used to confirm that each normal-appearing mucosa did not contain tumor.
[0036] All sections were evaluated cytologically and diagnosis confirmed. All tissue sections were fixed, stained with hematoxylin and eosin ("H&E"), and evaluated; histological analyses were performed to ensure that each tumor specimen was pure for microarray analysis and contained >80% tumor tissue, and that each normal section did not contain dysplasia or carcinoma. Samples that did not meet these criteria were rejected for this study.
[0037] Archival material of normal sections was derived from patients who either had surgical tooth extractions or had non-epithelial-related pathology.
Extraction of Total RNA and Microarray Analysis
[0038] For RNA isolation, each tissue specimen was placed in a liquid nitrogen chilled mortar and the tissue ground to a fine powder. The liquid nitrogen was evaporated, and the tissue was homogenized in Trizol (Invitrogen Corp., Carlsbad, CA). Total RNA was isolated using the Trizol method and dissolved in RNAse-free water. To remove contaminants, the RNA was purified using RNeasy spin columns (Qiagen Inc., Valencia, CA). Each specimen typically yielded 50 μg of total RNA.
[0039] Total RNA samples were submitted to the University of Pennsylvania Microarray Facility for microarray analysis using Affymetrix U133A chips (Affymetrix, Santa Clara, CA). Samples were run on an Agilent Bioanalyzer (Agilent Techs. Inc., Palo Alto, CA) to confirm integrity and concentration. For target preparation and hybridization, all protocols were conducted as described in the Affymetrix GeneChip Expression Analysis Technical Manual. Briefly, 5-8 μg of total RNA were converted to first-strand cDNA using Superscript II reverse transcriptase (Invitrogen Corp., Carlsbad, CA) primed by a poly(T) oligomer that incorporates the T7 promoter. Second-strand cDNA synthesis was followed by in vitro transcription for linear amplification of each transcript and incorporation of biotinylated CTP and UTP (Enzo RNA Labeling Kit, Affymetrix, Santa Clara, CA). The cRNA products were fragmented to 200 nucleotides or less, heated at 99°C for 5 minutes and hybridized for 16 hours at 45°C to U133A GeneChip microarrays (Affymetrix). The microarrays were washed at low (6X SSPE) and high (100 mM MES, 0.1 M NaCl) stringency and stained with streptavidin-phycoerythrin. Fluorescence was amplified by adding biotinylated anti-streptavidin and an additional aliquot of streptavidin-phycoerythrin stain. A confocal scanner was used to collect fluorescence signal at 3 micron resolution after excitation at 570 nm. The average signal from two sequential scans was calculated for each microarray feature.
EXAMPLE 2: Analysis of microarray data
[0040] Initial data analysis was performed using Affymetrix Microarray Suite 5.0 to quantitate expression levels for targeted genes; default values provided by Affymetrix were applied to all analysis parameters. Border pixels were removed, and the average intensity of pixels within the 75th percentile was computed for each probe. The average of the lowest 2% of probe intensities occurring in each of 16 microarray sectors was set as background and subtracted from all features in that sector. Probe pairs were scored positive or negative for detection of the targeted sequence by comparing signals from the perfect match and mismatch probe features. The number of probe pairs meeting the default discrimination threshold (tau = 0.015) was used to assign a call of absent, present or marginal for each assayed gene, and a p-value was calculated to reflect confidence in the detection call. A weighted mean of probe fluorescence (corrected for nonspecific signal by subtracting the mismatch probe value) was calculated using the One-step Tukey's Biweight Estimate. This Signal value, a relative measure of the expression level, was computed for each assayed gene. Global scaling was applied to allow comparison of gene Signals across multiple microarrays: after exclusion of the highest and lowest 2%, the average total chip Signal was calculated and used to determine what scaling factor was required to adjust the chip average to an arbitrary target of 150. All Signal values from one microarray were then multiplied by the appropriate scaling factor.
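The global scaling step described above can be sketched as follows. This is an illustrative reading of the text, not the Microarray Suite 5.0 implementation: it assumes that "exclusion of the highest and lowest 2%" means dropping 2% of the Signal values at each end before averaging. The function names `scaling_factor` and `scale_chip` are hypothetical.

```python
def scaling_factor(signals, target=150.0, trim=0.02):
    """Factor that brings the trimmed mean Signal of one chip to `target`."""
    ordered = sorted(signals)
    k = int(len(ordered) * trim)            # values dropped at each end
    kept = ordered[k:len(ordered) - k]
    trimmed_mean = sum(kept) / len(kept)
    return target / trimmed_mean

def scale_chip(signals, target=150.0, trim=0.02):
    """Multiply every Signal on the chip by the chip's scaling factor."""
    factor = scaling_factor(signals, target, trim)
    return [s * factor for s in signals]

# Toy chip: 96 ordinary values plus two extreme values at each end.
chip = [1.0, 1.0] + [300.0] * 96 + [10000.0, 10000.0]
factor = scaling_factor(chip)
scaled = scale_chip(chip)
```

Because the extremes are excluded before averaging, a few very bright or very dim features do not distort the factor that is applied to the whole chip.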
[0041] For statistical analysis, all data was normalized for comparison across arrays using GeneSpring default normalization settings: data transformation set measurements less than 0.01 to 0.01, each chip was normalized to the 50th percentile, and each gene was normalized to its median. Genes differentially expressed between matched patient normal and tumor samples were obtained using GeneSpring (Agilent Technologies, Palo Alto, CA) and Significance Analysis of Microarrays ("SAM"; see Tusher VG et al. Proc. Natl. Acad. Sci. U.S.A. 2001;98:10515). Briefly, after normalization, all gene expression data was filtered for those genes that were present in the matched patient normal and tumor samples. The 15,311 genes satisfying this filter were further analyzed using ANOVA with the Benjamini-Hochberg multiple testing correction at p < 0.05 (Benjamini Y & Hochberg Y. J. Roy. Stat. Soc. B. 1995;57:289-300). This yielded a highly discriminating set of 2207 genes. To visualize the gene expression data, unsupervised hierarchical clustering was performed independently of normal and tumor samples (FIG. 1), using Cluster (Eisen MB et al. Proc Natl Acad Sci USA. 1998;95:14863-8) at the settings described, using Pearson's correlation distance metric and complete linkage clustering, followed by visualization in Treeview. Id.
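As an illustration of the multiple testing correction named above, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure applied to a list of per-gene p-values. It is not the GeneSpring implementation; the function name `benjamini_hochberg` and the toy p-values are assumptions made for the example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the gene passes at FDR `alpha`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Every gene ranked at or below the cutoff is declared significant.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# Toy p-values for four genes (illustrative only).
calls = benjamini_hochberg([0.03, 0.035, 0.036, 0.9])
```

In the toy input the two smallest p-values miss their own rank thresholds, but the third meets its threshold, so all three are retained; this step-up behavior is what distinguishes the procedure from a simple per-gene cutoff.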
[0042] Principal components analysis (PCA) was used to compare site specific expression in the oral cavity and was performed using GeneSpring (PCA on Conditions). The samples evenly partitioned into two major groups corresponding to normal and OSCC samples. More genes were differentially regulated in the tumor samples than the normal specimens in this 2207 gene set. Within each tumor and normal cluster were two major classes. This separation was most apparent in the clustering of the normal samples (FIG. 1 and FIG. 2). Six of the 7 (>85%) normal samples that were derived from tongue tissue were in one class. In contrast, >80% (5/6) of those samples obtained from sites other than the tongue, including the mandible, floor of the mouth (FOM), gum, and buccal mucosa, made up the other class (FIG. 2). A similar clustering of the OSCC tongue specimens and non-tongue OSCC specimens was also present in the tumor samples (FIG. 2). Principal components analysis (PCA) highlighted the separation of the tongue and non-tongue specimens in the normal and OSCC tumor groups; shown in FIGS. 3A and 3B is the perspective image with the tongue and non-tongue samples as principal components and axes. Samples derived from tongue are represented by squares (■) while non-tongue samples are represented by triangles (▲). Clustering of similar samples is contained within circles. The separation of site specific gene expression was readily apparent in the normal specimens but was slightly less distinct in the OSCC samples (FIGS. 3A, 3B).
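For readers unfamiliar with PCA, the idea of projecting samples onto a direction of greatest variance can be sketched as below. This is a generic illustration using power iteration on the covariance matrix, not the GeneSpring "PCA on Conditions" routine; the function names and the toy data are hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Leading eigenvector of the covariance matrix of `rows` (samples x features)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Uneven start vector; a start exactly orthogonal to the leading
    # eigenvector would make the iteration fail, which this sketch ignores.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v, means

def project(row, component, means):
    """Coordinate of one sample along the principal component."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))

# Toy data: four samples measured on two genes, lying on a line.
samples = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
component, means = first_principal_component(samples)
scores = [project(s, component, means) for s in samples]
```

Samples with similar expression profiles receive similar scores along the component, which is why related specimens appear close together when such scores are plotted.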
[0043] Additionally, within the OSCC tumor samples was a sub-cluster of those primary OSCC tumors that were associated with nodal disease (FIG. 1; Ziober, et al., manuscript in preparation). Although expression profiling allowed distinct separation of tongue versus non-tongue samples, several samples clustered into groups that were inconsistent with their histology. However, these results do indicate that there are distinct expression patterns within the oral cavity that can identify normal and tumor tissue sites of origin.
[0044] Closer inspection of the site-specific gene expression patterns identified in FIG. 1 demonstrated distinct gene expression profiles for normal tongue tissue compared to normal non-tongue sites (FIG. 4 and FIG. 5). In particular, several enzymes associated with biochemical processes were up-regulated in the tongue including (as identified by Affymetrix number) cytochrome family members, aldehyde dehydrogenase, monoglyceride lipase, transglutaminase, sulfotransferase, arachidonate 12-lipoxygenase, glutathione S-transferase, and others. In addition, several signal transduction molecules, including RAB proteins and Rho-GTPase activating proteins, and the c-erb-b2 receptor were elevated in the tongue specimens (FIG. 4).
[0045] While the non-tongue samples did express some biochemical enzymes, the number was far less than that for the tongue samples (FIG. 5). Unlike the tongue specimens, the non-tongue samples had several types of receptors up-regulated including (as identified by Affymetrix number) growth factor receptors, prostaglandin receptors, and G-protein coupled receptors. Likewise, several growth factors were also elevated, for example epidermal growth factor, fibroblast growth factor-2 and WNT inhibitory factor. Finally, non-tongue tissues showed expression of several transcription factors, including ets, zinc finger, AP-2 and p300/CBP, and signaling molecules like H-Ras suppressor and Grb-2 like proteins. Together, these results indicate that there are unique gene expression patterns between tongue and non-tongue sites in the oral cavity.
[0046] To more thoroughly characterize and identify the most significantly differentially expressed genes in OSCC and normal mucosa, we used a combination of ANOVA with the Benjamini Hochberg multiple testing correction factor with p < 0.001 and SAM to analyze the patient-matched tumor/normal samples. See Tusher VG et al. Proc. Natl. Acad. Sci. U.S.A. 2001;98:10515; Benjamini Y & Hochberg Y. J. Roy. Stat. Soc. B. 1995;57:289-300. This resulted in a list of 92 genes that are highly significantly differentially expressed in OSCC and normal tissue. See Table 2, below.
TABLE 2
Fold (a) p-value (b) Gene Name
Up Regulated
63.68217 1.21E-07 inhibin, beta A (activin A, activin AB alpha polypeptide)
71.4553 2.01E-06 matrix metalloproteinase 1 (interstitial collagenase)
25.8173 2.01E-06 chemokine (C-X-C motif) ligand 13 (B-cell chemoattractant)
9.258157 1.99E-05 matrix metalloproteinase 9 (gelatinase B, 92kDa gelatinase, 92kDa type IV collagenase)
5.970897 5.93E-05 601658812R1 NIH_MGC_69 Homo sapiens cDNA clone IMAGE:3886131 3', mRNA sequence.
8.185247 5.93E-05 collagen, type V, alpha 2 (first region)
9.032335 5.93E-05 Homo sapiens NKG5 gene, complete cds.
2.404988 6.64E-05 myosin X
10.58925 6.64E-05 collagen, type V, alpha 1 (first region)
9.743879 7.51E-05 collagen, type V, alpha 2 (second region)
4.964245 7.70E-05 lysyl oxidase-like 2
14.72698 7.88E-05 parathyroid hormone-like hormone
8.741428 7.88E-05 collagen, type III, alpha 1 (Ehlers-Danlos syndrome type IV, autosomal dominant)
15.03796 8.29E-05 matrix metalloproteinase 11 (stromelysin 3)
5.234371 0.000113 snail homolog 2 (Drosophila)
7.123956 0.000174 similar to rat FE65 protein, GenBank Accession Number X60469;
9.496229 0.000201 collagen, type V, alpha 1 (second region)
9.258569 0.000218 collagen, type I, alpha 2
1.992358 0.000218 epithelial protein lost in neoplasm beta
2.240645 0.000219 myosin IB (first region)
10.50398 0.000246 collagen, type I, alpha 1 (first region)
16.73496 0.000246 osteoblast specific factor 2 (fasciclin I-like)
5.122788 0.000256 collagen, type I, alpha 2
47.11779 0.000256 collagen, type XI, alpha 1
8.796429 0.000261 transforming growth factor, beta-induced, 68kDa
9.410731 0.000261 collagen, type I, alpha 1 (second region)
2.658922 0.000261 acid phosphatase 5, tartrate resistant
6.703311 0.000261 plasminogen activator, urokinase
4.470378 0.000261 granulysin
21.31044 0.000261 parathyroid hormone-like hormone
3.059548 0.000261 myosin IB (second region)
7.238651 0.000261 collagen, type V, alpha 1 (third region)
4.245012 0.000264 collagen, type IV, alpha 1
4.509005 0.000293 secreted protein, acidic, cysteine-rich (osteonectin)
5.859527 0.000303 KARP-1-binding protein
3.776092 0.000306 Melanoma associated gene
2.849625 0.000313 trophoblast glycoprotein
7.018807 0.000314 triple functional domain (PTPRF interacting)
10.51419 0.000334 fibroblast activation protein, alpha
7.29039 0.00035 laminin, gamma 2
2.339839 0.000372 ubiquitin-conjugating enzyme E2C
3.226771 0.000421 AL514445 Homo sapiens NEUROBLASTOMA Homo sapiens cDNA clone CLOBBO 10ZF08 3-PRIME, mRNA sequence.
2.778452 0.000431 RecQ protein-like (DNA helicase Q1-like)
11.6708 0.000431 interferon, alpha-inducible protein (clone IFI-15K)
3.029643 0.000431 likely ortholog of mouse Shc SH2-domain binding protein 1
8.878598 0.00051 fibronectin 1 (first region)
4.030122 0.00051 tumor necrosis factor receptor superfamily, member 12A
5.858879 0.000522 predicted protein of HQ3121; Homo sapiens clone FLC 1492 PRO3121 mRNA, complete cds.
6.161157 0.000525 laminin, alpha 3
3.422555 0.000525 TPX2, microtubule-associated protein homolog (Xenopus laevis)
4.454709 0.000525 Human lysyl oxidase (LOX) gene, exon 7.
39.05315 0.000534 matrix metalloproteinase 13 (collagenase 3)
7.076595 0.000534 fibronectin 1 (second region)
30.60455 0.000534 H. sapiens type X collagen gene.
3.374103 0.000537 exostoses (multiple) 1
6.493025 0.000555 fibronectin 1 (third region)
3.977667 0.000574 triple functional domain (PTPRF interacting)
1.946353 0.000604 glutathione S-transferase omega 1
3.090632 0.000604 caldesmon 1
1.743859 0.000604 U2-associated SR140 protein
2.595988 0.000604 thymosin, beta 10
3.296296 0.000659 collagen, type IV, alpha 2
16.18147 0.000715 matrix metalloproteinase 12 (macrophage elastase)
4.554788 0.00072 collagen, type VI, alpha 3
2.972363 0.00072 secernin 1
6.32457 0.00072 apolipoprotein L, 1
1.890579 0.00072 methyl-CpG binding domain protein 4
7.61961 0.00072 fibronectin 1 (fourth region)
6.494057 0.000785 chemokine (C-X-C motif) ligand 9
2.451763 0.00082 FAT tumor suppressor homolog 1 (Drosophila)
2.732824 0.00082 myristoylated alanine-rich protein kinase C substrate
9.185588 0.00082 serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 1
12.76825 0.00082 serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 1
9.70648 0.00082 microfibrillar-associated protein 2
8.731814 0.00082 parathyroid hormone-like hormone
5.08355 0.00082 602035015F1 NCI_CGAP_Brn64 Homo sapiens cDNA clone IMAGE:4183107 5', mRNA sequence.
2.451526 0.000861 proteasome (prosome, macropain) subunit, beta type, 2
3.163611 0.000861 chromosome 5 open reading frame 13
4.858088 0.000861 chromosome 10 open reading frame 3
15.55873 0.000914 a disintegrin and metalloproteinase domain 12 (meltrin alpha)
2.277927 0.000914 BUB1 budding uninhibited by benzimidazoles 1 homolog (yeast)
2.804374 0.000925 ribonucleotide reductase M2 polypeptide
2.798002 0.000942 chloride intracellular channel 4
2.252996 0.000942 high-mobility group box 3
3.431822 0.000942 interferon, alpha-inducible protein (clone IFI-6-16)
8.40618 0.000942 signal transducer and activator of transcription 1, 91kDa
6.538403 0.000948 matrix metalloproteinase 3 (stromelysin 1, progelatinase)
Down Regulated
0.039489 4.12E-08 TU3A protein
0.004179 0.00035 cysteine-rich secretory protein 3
0.111661 0.000751 monoamine oxidase B
0.539417 0.000751 elongation factor RNA polymerase II-like 3
0.247429 0.000862 [unnamed gene]
(a) Fold difference in expression between OSCC and normal tissue obtained using SAM.
(b) Adjusted p-value using ANOVA and Benjamini Hochberg false discovery value of p < 0.001.
As shown in Table 2, the large majority (95%) of the 92 genes are up-regulated in OSCC as compared to normal (similar to that presented in FIG. 1). This list contains genes expressed from 2-fold to over 70-fold higher in OSCC, with p-values that ranged from 1 × 10⁻⁷ to 0.001 (Table 2). Likewise, the genes which were down-regulated in OSCC ranged from 2- to 33-fold with p-values of 4 × 10⁻⁸ to 7.5 × 10⁻⁴.
[0047] Although several genes in Table 2 are relatively unknown, many have been implicated in OSCC development and progression. These include molecules associated with the extracellular matrix (ECM), matrix proteolysis, cell-cell adhesion, migration and other processes. For example, several collagen chains, 2 laminin-5 chains, 6 different matrix metalloproteinases (MMP-1, -3, -9, -11, -12, and -13), and urokinase plasminogen activator were all up-regulated in OSCC. In addition, genes regulating cell-cell adhesion and motility, including snail homolog 2, myosin, meltrin alpha, and lysyl oxidase-like 2, were identified as being up-regulated in OSCC. The functions of those genes down-regulated in OSCC are presently unknown. However, several of the up-regulated genes have previously been reported as markers of or being involved in OSCC tumor development and progression.
EXAMPLE 3: Immunohistochemical validation
[0048] This sampling of genes was next examined at the protein level in paraffin-embedded archival samples, both OSCC and normal, by immunohistochemistry. Immunohistochemistry was performed to validate the differential expression of selected genes in tissue sections and to localize the tissue expression of the genes. Sections were incubated at 70°C for twenty minutes, de-paraffinized in xylene (20 min, room temperature ("RT")), and then rehydrated through a series of graded ethanol solutions (20 min, RT) followed by water (10 min, RT). Endogenous peroxidase was quenched through treatment with hydrogen peroxide solution (15 min, RT). To enhance antigen exposure, specimens for matrix metalloproteinase-1 (MMP-1) and laminin-5 gamma-2 chain (Ln-5γ2) were incubated in 1% sodium citrate solution at 100°C for 10 min and then cooled to RT. Detection of matrix metalloproteinase-3 (MMP-3) required no antigen retrieval methods. Nonspecific binding sites were blocked with horse serum (Vector Laboratories, Burlingame, CA) for 30 min. The slides were then incubated overnight at 4°C with monoclonal mouse antibodies directed against the Ln-5γ2 chain (D4B5; Chemicon, Inc., Temecula, CA), MMP-1 (3665; R&D Systems, Inc., Minneapolis, MN) or MMP-3 (SL1 IED4; Chemicon, Inc.). Biotinylated secondary antibody (anti-mouse, 30 min, RT; Vector Laboratories), a biotin-avidin complex (30 min, RT; Vector Laboratories), and a chromogenic substrate (DAB, 10 min, RT; Vector Laboratories) were then applied. Sections were counterstained with hematoxylin for 5 min.
[0049] All OSCC sections displayed elevated expression of MMP-1, MMP-3 and Ln-5γ2 within the tumor islands (FIGS. 6A-6C). In contrast, with the exception of Ln-5γ2, these proteins could not be detected in the normal sections. Ln-5γ2 was detected, as expected, in the basement membrane of normal oral epithelium. See FIG. 7; Ziober AF et al. The extracellular matrix in oral squamous cell carcinoma: friend or foe? Head and Neck. 2005. In press. FIG. 7 depicts immunohistochemical staining analysis of MMP-1, Ln-5γ2 chain and MMP-3 in OSCC and normal oral mucosa. In OSCC (B, D, F) staining of MMP-1 (B), Ln-5γ2 chain (D), and MMP-3 (F) was detected around and within the OSCC tumor islands. Staining of normal oral mucosa (A, C, E) could not detect expression of MMP-1 (A) or MMP-3 (E). The Ln-5γ2 chain (C) was correctly detected in the basement membrane of the oral mucosa and is denoted by an arrow (↑). The magnification of the provided staining analysis is 400x. A total of 7 different OSCC tumor and normal sections were stained by immunohistochemistry. These results demonstrate validation of the OSCC gene signature at the RNA and protein level.
EXAMPLE 4: Quantitative real-time PCR analysis
[0050] To confirm the findings from the microarray analysis, we performed real-time PCR using primers specific for a sampling of genes that were at the top, middle, and bottom of the genes listed in Table 1. This included MMP-1, Ln-5γ2, and MMP-3. The amplification efficiencies and expression values of the primer/probe sets at various dilutions were compared with the amplification efficiencies of the internal control gene β-Gus.
[0051] Changes in mRNA levels were compared by quantitative real-time PCR analysis, using the BioRad MyiQ Single-Color Real-Time PCR Detection System (BioRad, Hercules, CA). All gene-specific primers used for real-time PCR were purchased from Qiagen, Inc. (Valencia, CA). Five μg of total RNA from normal and tumor specimens were converted to cDNA using Superscript II (Invitrogen Corp., Carlsbad, CA) according to the manufacturer's specifications. PCR reaction mixtures consisted of 2 μl of Faststart DNA Master SYBR Green I mixture (containing Taq DNA polymerase, reaction buffer, deoxynucleotide triphosphate mix - with dUTP instead of dTTP - SYBR Green I dye, and 10 mM MgCl2), 0.5 μM of each target primer stock, and 2 or 4 mM MgCl2 (Ln-5γ2 chain, MMP-1, MMP-3) in a final reaction volume of 20 μl. β-Gus was used as the internal control for normalization and Universal Human RNA (Stratagene, La Jolla, CA) as the standard reference. See Livak KJ & Schmittgen TD. Methods. 2001;25:402-8. β-Gus was selected as the internal control since it was uniformly expressed across all samples by microarray.
[0052] Cumulative fluorescence was measured at the end of the extension phase of each cycle. Product-specific amplification was confirmed by a melting curve analysis and agarose gel electrophoresis analysis. Quantification was performed at the log-linear phase of the reaction and cycle numbers obtained at this point were plotted against a standard curve prepared with serially diluted samples. Quantitation of real-time PCR results was performed using the 2^(-ΔΔCT) method. See Livak KJ & Schmittgen TD. Methods. 2001;25:402-8.
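As a rough illustration of the 2^(-ΔΔCT) calculation named above, the sketch below computes a tumor-versus-normal fold change from threshold-cycle (CT) values, with the internal control (β-Gus in the text) as the reference gene. The function name and the CT values are hypothetical, and the sketch assumes roughly equal amplification efficiencies for target and reference.

```python
def fold_change_ddct(ct_target_tumor, ct_ref_tumor,
                     ct_target_normal, ct_ref_normal):
    """Relative expression (tumor vs. normal) by the 2^(-ddCT) method."""
    # dCT: target gene CT normalized to the internal control, per sample
    dct_tumor = ct_target_tumor - ct_ref_tumor
    dct_normal = ct_target_normal - ct_ref_normal
    # ddCT: tumor dCT relative to normal dCT
    ddct = dct_tumor - dct_normal
    return 2.0 ** (-ddct)

# Hypothetical CT values: the target crosses threshold 4 cycles earlier
# in tumor than in normal, while the control gene is unchanged.
fold = fold_change_ddct(ct_target_tumor=20.0, ct_ref_tumor=22.0,
                        ct_target_normal=24.0, ct_ref_normal=22.0)
```

A lower CT means more starting template, so a negative ΔΔCT corresponds to a fold change greater than 1.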
[0053] As shown in FIGS. 6A-6C, expression levels of MMP-1, Ln-5γ2 chain, and MMP-3 were all elevated in the tumor specimens as compared to the adjacent normal tissue. The fold expression of MMP-1, Ln-5γ2 chain, and MMP-3 in tumor versus normal tissue was in agreement with that determined by microarray analysis, as shown in Table 3, below.
TABLE 3
Gene RT (a) Array (b) Fold Range (c)
MMP-1 122 71 6.5-205
MMP-3 11.9 6.5 2.4-36
Ln-5γ2 11.7 7.2 3-23
(a) Average fold expression of each gene normalized to expression values for internal control gene β-Gus in tumor versus normal tissue (n=5) specimens obtained by quantitative real-time PCR.
(b) Average fold expression value of each gene obtained from microarray analysis.
(c) High and low fold expression values of each gene in tumor versus normal tissue specimens obtained by quantitative real-time PCR.
Overall, these results confirm, at the RNA level, the differential gene expression signature that distinguishes OSCC tumors and normal mucosa.
EXAMPLE 5: Identification of the 25-gene predictor set for OSCC
[0054] To identify an OSCC tumor predictor the patient-matched OSCC tumor and normal microarray data was next analyzed by the class prediction algorithm Support Vector Machines ("SVM", a supervised machine learning technique; Furey TS et al. Bioinformatics. 2000 Oct;16(10):906-14). SVM uses recognition and regression estimates to identify class prediction gene sets using a training set of microarray data. The SVM algorithm attempts to find a hyperplane that provides separation between the different input data classes such that there is a maximal distance between the hyperplane and the nearest point on any one of the input classes.
[0055] Furthermore, using the training data set, SVM performs a 10-fold cross-validation to select a set of predictor genes that leads to the smallest error rate. The patient-matched tumor and normal samples were used as the training set and cross-validated. The first and second data sets identified supra, as well as the OSCC and various tumor data sets, were used as the test sets. SVM was set to use the Golub gene selection method. Vielba R et al. Laryngoscope. 2003;113:167-172. In the Golub method, each gene is tested for its ability to discriminate between the classes using a signal-to-noise score, which is given by:
$$\frac{\mu_1 - \mu_2}{\sigma_1 + \sigma_2}$$
wherein μi and σi (i = 1, 2) are the mean and standard deviation of the expression values over the samples in class i. Genes with the highest scores are kept for subsequent calculations. The Golub method of gene selection calculates the difference in means between the training and test sets divided by the sum of the standard deviations to identify the best set of predictors. The number of genes used for predictors was set at 25, and the 2207 genes identified by filtering for only genes present, and analyzed by ANOVA with the Benjamini Hochberg multiple testing correction factor at p < 0.05, were used as the gene pool. Once this set of 25 predictor genes was identified, it was applied to each test set to test the classification accuracy. All SVM calculations were performed using the kernel function set to polynomial dot product (order 1) and the diagonal scaling factor set to 0. Prediction Analysis of Microarrays ("PAM"; see Takahashi M et al. Proc Natl Acad Sci U.S.A. 2001;98:9754-9) and k-nearest neighbors confirmed SVM results (see Vielba R et al. Laryngoscope. 2003;113:167-172; data not shown).
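The signal-to-noise score and the selection of the highest-scoring genes can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the function names are hypothetical, the sample standard deviation (ddof=1) is an assumption, ranking by absolute score is an assumption so that both up- and down-regulated genes can be selected, and genes with zero variance in both classes are not handled.

```python
import numpy as np

def signal_to_noise(expr, labels):
    """Per-gene score (mu1 - mu2) / (sigma1 + sigma2).

    expr   : genes x samples array of expression values
    labels : length-samples sequence of class labels, 1 or 2
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    class1 = expr[:, labels == 1]
    class2 = expr[:, labels == 2]
    mu1, mu2 = class1.mean(axis=1), class2.mean(axis=1)
    sigma1 = class1.std(axis=1, ddof=1)
    sigma2 = class2.std(axis=1, ddof=1)
    return (mu1 - mu2) / (sigma1 + sigma2)

def top_genes(expr, labels, k=25):
    """Indices of the k genes with the largest absolute score."""
    scores = signal_to_noise(expr, labels)
    return np.argsort(-np.abs(scores))[:k]

# Hypothetical toy data: 2 genes, 3 samples per class.
toy_expr = np.array([[1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
                     [4.0, 5.0, 6.0, 5.0, 6.0, 7.0]])
toy_labels = [1, 1, 1, 2, 2, 2]
toy_scores = signal_to_noise(toy_expr, toy_labels)
best = top_genes(toy_expr, toy_labels, k=1)
```

In the toy data the first gene separates the classes far better than the second, so it is the one retained when k is 1.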
[0056] The cross-validation of the matched patient tumor and normal data demonstrates that this 25-gene set predictor could classify tumor and normal samples with 100% accuracy. See Table 4, below.
TABLE 4
Sample (a) True Value (b) Prediction (c) t margin (d) n margin
04-0074 t t 1.407 -1.407
04-0075 n n -1.043 1.043
04-0123 t t 0.0447 -0.0447
04-0124 n n -0.996 0.996
20241 n n -1.053 1.053
20241T t t 1.556 -1.556
22279 n n -1.053 1.053
22279T t t 1.474 -1.474
22478 n n -1.139 1.139
22478T t t 1.248 -1.248
24328 n n -0.943 0.943
24328T t t 1.324 -1.324
24418 n n -1.159 1.159
24418T t t 1.252 -1.252
25395 n n -0.985 0.985
25395T t t 0.957 -0.957
26153 n n -0.964 0.964
26153T t t 0.956 -0.956
27369 n n -1.047 1.047
27639T t t 1.561 -1.561
33635 n n 1.524 -1.524
33635T t t -1.027 1.027
62 t t 0.827 -0.827
61T n n -1.014 1.014
68N t t 1.061 -1.061
67T n n -1.053 1.053
(a) Sample number.
(b) True Value shows the true value of the class of each sample, as either tumor (t) or normal (n). This value is compared with the value in the Prediction column to validate the training set.
(c) Prediction shows the predicted class; incorrectly predicted samples are highlighted in gray.
(d) Margin shows the distance (in arbitrary units) to the hyperplane for each of the classes, tumor and normal. Positive scores are assigned to one class, and negative scores are assigned to the other class. The scores are then reported as the margins. This corresponds to the distance from the sample to the separating decision boundary. The larger the margin, the farther away a score is from the boundary, and the more confident is the classification. Predictions are considered unreliable when margins are <0.5.
Interestingly, a majority of the margins in Table 4, 25 of 26 (96%), were > 0.5 and are considered confident classifications (Tothill RW et al. Cancer Res. 2005;65:4031-40; Mukherjee S. Classifying Microarray Data Using Support Vector Machines. In: A Practical Approach to Microarray Data Analysis, D. P. Berrar, W. Dubitzky and M. Granzow (Eds.), Kluwer Academic Publishers, Boston, MA, Chapter 9, 2003;166-185). However, the margin for sample 04-0123 was below 0.05 and is considered an unreliable prediction. Thus, this reduced the accuracy to 96%. It was possible to retain this 96% accuracy prediction rate using as few as 10 genes; however, the 25 predictors yielded the best performance when applied to the other data sets below (data not shown).
[0057] To test the predictive strength of the 25-gene predictor identified by SVM and cross-validation, we tested it on three independent oral cancer data sets. The Penn data set was comprised of two normal specimens and 13 OSCC specimens; the RO data set, which has previously been reported, consisted of 18 OSCC tumors; and the OSCC data set consisted of 5 tumor and 4 normal samples (from GSE1722; GEO DataSet, NCBI) (O'Donnell RK et al. Oncogene. 2005;24:1244-51). Using SVM and the Golub method of gene selection, all (100%) specimens of the Penn data set were correctly classified, as shown in Table 5, below.
TABLE 5
Sample (a) True Value (b) Prediction (c) t margin (d) n margin
03-0116 n n -0.699 0.699
03-0117 t t 0.21 -0.21
19965T t t 1.285 -1.285
20063T t t 1.207 -1.207
20282T t t 0.9 -0.9
20402T t t 1.065 -1.065
21600T t t 1.435 -1.435
21732T t t 1.091 -1.091
21984T t t 1.331 -1.331
22341T t t 1.51 -1.51
23908T t t 0.11 -0.11
23943 n n -1.16 1.16
23943T t t 1.324 -1.324
27904T t t 1.591 -1.591
R1000T t t 1.56 -1.56
(a) Sample number.
(b) True Value shows the true value of the class of each sample, as either tumor (t) or normal (n). This value is compared with the value in the Prediction column to validate the training set.
(c) Prediction shows the predicted class. Prediction was performed using SVM, the Golub method for selecting predictor genes (9), and a 25-gene predictor.
(d) Margin shows the distance (in arbitrary units) to the hyperplane for each of the classes, tumor and normal. Positive scores are assigned to one class, and negative scores are assigned to the other class. The scores are then reported as the margins. This corresponds to the distance from the sample to the separating decision boundary. The larger the margin, the farther away a score is from the boundary, and the more confident is the classification. Predictions are considered unreliable when margins are <0.5.
[0058] However, the margins of 2 predictions were considered unreliable, resulting in an accuracy of 87%. The 25-gene set predictor was able to accurately classify 86% of the RO OSCC data (see Table 6, below).
TABLE 6
Sample (a) True Value (b) Prediction (c) t margin (d) n margin
10T t t 0.772 -0.772
11T t t 1.122 -1.122
12T t t 1.078 -1.078
13T t t 1.183 -1.183
14T t n -0.778 0.778
15T t t 1.027 -1.027
16T t t 0.0211 -0.0211
17T t t 1.13 -1.13
18T t t 1.076 -1.076
19T t t 1.223 -1.223
2T t t 1.166 -1.166
20T t t 0.907 -0.907
21T t t 1.252 -1.252
3T t t 0.55 -0.55
4T t t 1.309 -1.309
5T t t 1.2 -1.2
6T t t 1.248 -1.248
7T t t 1.258 -1.258
8T t t 0.953 -0.953
9T t t 1.209 -1.209
(a) Sample number.
(b) True Value shows the true value of the class of each sample, as either tumor (t) or normal (n). This value is compared with the value in the Prediction column to validate the training set.
(c) Prediction shows the predicted class; incorrectly predicted samples are highlighted in gray. Prediction was performed using SVM, the Golub method for selecting predictor genes (9), and a 25-gene predictor.
(d) Margin shows the distance (in arbitrary units) to the hyperplane for each of the classes, tumor and normal. Positive scores are assigned to one class, and negative scores are assigned to the other class. The scores are then reported as the margins. This corresponds to the distance from the sample to the separating decision boundary. The larger the margin, the farther away a score is from the boundary, and the more confident is the classification. Predictions are considered unreliable when margins are <0.5.
The OSCC predictor correctly classified all 9 samples as tumor or normal in the OSCC data set. However, one prediction was considered unreliable giving an accuracy of 89% (see Table 7, below).
TABLE 7
Sample (a) True Value (b) Prediction (c) t margin (d) n margin
OT1 t t 1.039 -1.039
OT2 t t 0.95 -0.95
OT3 t t 1.244 -1.244
OT4 t t 1.546 -1.546
OT5 t t 1.706 -1.706
OTN1 n n -0.31 0.31
OTN2 n n -0.593 0.593
OTN3 n n -0.528 0.528
OTN4 n n -0.911 0.911
(a) Sample number.
(b) True Value shows the true value of the class of each sample, as either tumor (t) or normal (n). This value is compared with the value in the Prediction column to validate the training set.
(c) Prediction shows the predicted class. Prediction was performed using SVM, the Golub method for selecting predictor genes (9), and a 25-gene predictor.
(d) Margin shows the distance (in arbitrary units) to the single hyperplane for each of the classes, tumor and normal. Positive scores are assigned to one class, and negative scores are assigned to the other class. The scores are then reported as the margins. This corresponds to the distance from the sample to the separating decision boundary. The larger the margin, the farther away a score is from the boundary, and the more confident is the classification. Predictions are considered unreliable when margins are <0.5.
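The way margins turn a raw classification rate into the reported accuracy can be sketched with the Table 7 values. This is a minimal illustration, assuming that a positive t margin means a tumor call and that a correct call with an absolute margin below 0.5 is counted against accuracy; the function name is hypothetical.

```python
def summarize_predictions(true_labels, t_margins, threshold=0.5):
    """Return (predicted labels, raw accuracy, confident accuracy)."""
    # Sign of the tumor margin gives the predicted class.
    predicted = ["t" if m > 0 else "n" for m in t_margins]
    correct = [p == t for p, t in zip(predicted, true_labels)]
    # A prediction is treated as reliable only if |margin| >= threshold.
    reliable = [abs(m) >= threshold for m in t_margins]
    n = len(true_labels)
    raw_accuracy = sum(correct) / n
    confident_accuracy = sum(c and r for c, r in zip(correct, reliable)) / n
    return predicted, raw_accuracy, confident_accuracy

# t margins and true classes as listed in Table 7 (OT1-OT5, OTN1-OTN4).
true_labels = ["t"] * 5 + ["n"] * 4
t_margins = [1.039, 0.95, 1.244, 1.546, 1.706,
             -0.31, -0.593, -0.528, -0.911]
predicted, raw_acc, confident_acc = summarize_predictions(true_labels, t_margins)
```

All nine calls match the true class, but OTN1 has an absolute margin of 0.31, so eight of nine predictions count as confident, which matches the 89% reported for this data set.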
Similar results for the above analyses were obtained using k-nearest neighbor and PAM (data not shown). Thus, in testing a total of 44 OSCC and normal specimens, the 25-gene predictor was able to correctly classify 42 samples, producing an average accuracy rate of > 87%. As a follow-up to this initial work we are beginning prospective studies to test the OSCC gene signature's ability to predict using a larger sample population and to improve the positive identification accuracy rate. Finally, applying the predictor to existing Affymetrix U133A chip derived data sets (NCBI GEO DataSets) from 20 other human tumors, including breast, renal clear cell tumor, acute myeloid leukemia, lymphoblastic leukemia, and Barrett's associated adenocarcinomas, resulted in accuracies of only 25%, thus illustrating the tissue specificity of the 25-gene predictor. See Table 8, below.
TABLE 8
Sample (a) True Value (b) Prediction (c) t margin (d) n margin
AML1 t n -0.724 0.724
AML2 t n -0.8 0.8
AML3 t n -0.65 0.65
AML4 t n -0.627 0.627
BE1 t n -1.016 1.016
BE2 t t 0.213 -0.213
BE3 t n -0.931 0.931
NE n n -0.874 0.874
Breast1 t t 1.041 -1.041
Breast2 t n -0.631 0.631
Breast3 t n -0.624 0.624
Breast4 t t 0.658 -0.658
Leu1 t t 0.251 -0.251
Leu2 t n -0.639 0.639
Leu3 t n -0.665 0.665
Leu4 t n -0.542 0.542
RC1 t n -0.872 0.872
RC2 t n -0.961 0.961
RC3 t n -0.902 0.902
RC4 t n -0.953 0.953
(a) Sample number. AML, acute myeloid leukemia; BE, Barrett's associated adenocarcinomas; NE, normal esophageal mucosa; Breast, breast carcinoma; Leu, lymphoblastic leukemia; and RC, renal clear cell tumor.
(b) True Value shows the true value of the class of each sample, as either tumor (t) or normal (n). This value is compared with the value in the Prediction column to validate the training set.
(c) Prediction shows the predicted class; incorrectly predicted samples are highlighted in gray. Prediction was performed using SVM, the Golub method for selecting predictor genes (9), and a 25-gene predictor.
(d) Margin shows the distance (in arbitrary units) to the hyperplane for each of the classes, tumor and normal. Positive scores are assigned to one class, and negative scores are assigned to the other class. The scores are then reported as the margins. This corresponds to the distance from the sample to the separating decision boundary. The larger the margin, the farther away a score is from the boundary, and the more confident is the classification. Predictions are considered unreliable when margins are <0.5.
[0059] Many of the genes that make up the 25-gene predictor have previously been implicated in and described for OSCC, as shown in Table 9, below.
TABLE 9
Affymetrix Number Predictive Strength (a) Gene Name
209074_s_at 2.845 TU3A protein
210511_s_at 2.375 inhibin, beta A (activin A, activin AB alpha polypeptide)
204475_at 2.334 matrix metalloproteinase 1 (interstitial collagenase)
203936_s_at 2.072 matrix metalloproteinase 9 (gelatinase B, 92kDa gelatinase, 92kDa type IV collagenase)
205242_at 2.026 chemokine (C-X-C motif) ligand 13 (B-cell chemoattractant)
203878_s_at 1.824 matrix metalloproteinase 11 (stromelysin 3)
221729_at 1.688 collagen, type V, alpha 2
215077_at 1.669 collagen, type III, alpha 1 (Ehlers-Danlos syndrome type IV, autosomal dominant)
212473_s_at 1.669 601658812R1 NIH_MGC_69 Homo sapiens cDNA clone IMAGE:3886131 3', mRNA sequence.
202311_s_at 1.662 collagen, type I, alpha 1
37145_at 1.635 Homo sapiens NKG5 gene, complete cds.
205479_s_at 1.633 plasminogen activator, urokinase
212488_at 1.627 collagen, type V, alpha 1
201976_s_at 1.584 myosin X
202998_s_at 1.575 lysyl oxidase-like 2
203325_s_at 1.575 collagen, type V, alpha 1
221730_at 1.571 collagen, type V, alpha 2
206300_s_at 1.57 parathyroid hormone-like hormone
211980_at 1.549 collagen, type IV, alpha 1
37892_at 1.527 collagen, type XI, alpha 1
212364_at 1.515 myosin IB
210809_s_at 1.512 osteoblast specific factor 2 (fasciclin I-like)
221898_at 1.503 602035015F1 NCI_CGAP_Brn64 Homo sapiens cDNA clone IMAGE:4183107 5', mRNA sequence.
213139_at 1.495 snail homolog 2 (Drosophila)
213419_at 1.485 similar to rat FE65 protein, GenBank Accession Number X60469
(a) The 25-gene predictor was determined using SVM and the Golub method for gene selection. Predictive strength: the higher the value, the better the predictor.
[0060] However, several predictor genes have not been directly associated with OSCC tumorigenesis and will therefore provide starting points for further investigations. Several of the genes identified in the predictor were also present in the set of highly significantly differentially expressed genes between normal and tumor (see Table 2, supra). The predictor set of genes was comprised of several epithelial marker genes with categories of potential interest, including genes encoding extracellular matrix components; genes involved in cell adhesion, including the fasciclin; genes involved in cell-cell integrity, for example lysyl oxidase-like 2 and snail homolog 2; genes encoding hydrolyzing activities, including proteins involved in degradation of the extracellular matrix like MMP-1, MMP-9, MMP-11 and urokinase; and cytokines like inhibin beta A and parathyroid hormone-like hormone. As previously proposed, the development of OSCC involves stromal and immune-regulatory components. Thus, many of the predictor genes belong to these categories.
[0061] To identify a gene expression signature that was capable of distinguishing OSCC tissue from normal, the present study involved expression profiling on patient-matched normal mucosa and OSCC tumors. The use of patient-paired normal and tumor specimens provided the most statistically representative database for distinguishing gene expression differences between tumor and normal. By microarray analysis, a highly significant set of differentially expressed genes between normal and OSCC tissue was identified. Furthermore, when the signature was used with the supervised machine learning algorithm SVM, three independent test data sets were classified with accuracies ranging from 87-100%. This is the first report to date that has used patient tumor/normal samples to identify a gene signature for prediction of OSCC. Finally, these studies satisfy the first requirement in developing a microfluidic "lab-on-a-chip" system for rapid (real-time), point-of-care screening, detection and diagnosis of oral cancer: identification of a gene signature that predicts OSCC.
[0062] Several approaches were utilized for the analysis of gene expression data with regard to clinicopathological variables. The initial approach, unsupervised hierarchical clustering, was used to examine similarities and differences among the paired tumor/normal samples in their patterns of gene expression. Unsupervised hierarchical clustering separated the samples into two primary clusters of normal samples and tumor samples. In agreement with these results, several previous studies using unmatched and matched OSCC tumor specimens and normal samples showed a similar distinct clustering of normal and tumor samples (Ginos MA et al. Cancer Res. 2004;64(1):55-63; Somoza-Martin JM et al. J Oral Maxillofac Surg. 2005;63:786-92; Belbin TJ et al. Arch Otolaryngol Head Neck Surg. 2005;131:10-8). As with these previous studies, no hierarchical clustering of the samples as related to stage of disease or tumor grade was found. The most differentially expressed genes were up-regulated in OSCC samples while those in the normal tissue were down-regulated. More interestingly, and in contrast to previous studies, both tumor samples and, to a greater extent, the normal samples were clustered according to their anatomical sites in the oral cavity. Tissue from the tongue comprised one major cluster while tissue from other sites, including buccal mucosa, mandible epithelium, gum tissue, and floor of the mouth, populated the other cluster. See FIGS. 1 and 2.
[0063] Using a stringent statistical and data filtering approach, 92 genes were identified as differentially expressed between OSCC and normal mucosa (p < 0.001). Furthermore, the strong correlation of real-time PCR with the array data for gene expression, together with the validation using immunohistochemistry, strongly indicates that the 92 genes in this list are representatively expressed genes in OSCC. Thus, these studies indicate that many of the genes identified by microarray analysis are highly relevant to OSCC development and/or progression.
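As a hedged illustration of how patient-matched samples can be used to rank genes, the sketch below computes a paired t-statistic per gene. The choice of a paired t-test is an assumption, as this passage does not name the statistical test applied. The gene names and values are hypothetical, and converting the statistic to a p-value (for a cutoff such as p < 0.001) would additionally require a t-distribution, which is omitted here.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(tumor, normal):
    """Paired t-statistic for one gene across patient-matched samples."""
    diffs = [t - n for t, n in zip(tumor, normal)]
    m, sd = mean(diffs), stdev(diffs)
    if sd == 0:
        # Identical differences in every pair: statistic is unbounded (or zero).
        return 0.0 if m == 0 else math.copysign(math.inf, m)
    return m / (sd / math.sqrt(len(diffs)))

def rank_genes(expression):
    """expression: {gene: (tumor_values, normal_values)}.

    Returns (gene, t) pairs sorted by |t|, largest first.
    """
    scores = {g: paired_t_statistic(t, n) for g, (t, n) in expression.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical log2 expression for four matched tumor/normal pairs.
toy = {
    "GENE_UP": ([9.1, 8.7, 9.4, 9.0], [6.0, 6.2, 5.9, 6.3]),
    "GENE_FLAT": ([7.0, 7.2, 6.9, 7.1], [7.1, 7.0, 7.0, 7.2]),
}
ranked = rank_genes(toy)
```

Because each tumor value is compared with the normal value from the same patient, between-patient variation cancels out of the differences, which is the practical benefit of the paired design.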
[0064] To date, numerous studies incorporating microarray analysis have reported on the genetic changes associated with OSCC. However, none of these previous studies have tested their respective OSCC gene signatures for the ability to predict OSCC using independent validation data sets as shown here. Nevertheless, there is a consensus that distinct gene expression patterns exist when normal and primary OSCC tumors are compared. For example, Ginos et al., who compared OSCC samples to unmatched normal subjects using Affymetrix U133A chips, identified several genes that overlapped with those found in this study (Ginos MA et al. Cancer Res. 2004;64(1):55-63). In addition, Mendez et al. used microarrays to analyze both OSCC and normal specimens (Mendez E et al. Cancer. 2002;95:1482-94). As with the work here, they concluded that oral carcinomas are distinguishable from normal oral tissue, using oligonucleotide arrays that contained probes representing only 7000 full-length human genes. Interestingly, these authors indicated that there was expression profile heterogeneity among tumors of a particular histopathologic grade and stage and that no statistically significant differences in gene expression were found between early-stage disease and late-stage disease (Id.).
[0065] The instant data agree with these authors' assessment that there was no correlation with grade and stage, but do not agree with the authors' findings on invasive disease (manuscript in preparation). These differences are likely due to the microarray chips used in the Mendez et al. study being dissimilar to those used here. However, these results do illustrate that distinct genes are expressed in OSCC compared to normal tissue.
[0066] To test for clinical applicability, it was herein assessed whether different data sets of genes and tissues could be predicted by the OSCC gene expression signature. Support vector machines (SVM), a supervised machine learning technique used in earlier prediction studies (Furey TS et al. Bioinformatics. 2000 Oct;16(10):906-14), were selected for this purpose. SVM was chosen because it has been used with success in several microarray studies and appears to be superior to similar algorithms such as k-nearest neighbor and PAM. Cross-validation of the training set, which consisted of the paired tumor/normal samples from two institutions, resulted in an accuracy rate of 96% using at least 25 genes per class.
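The idea of cross-validating an SVM on a training set can be sketched as follows. This is a simplified stand-in, not the implementation used in the study: it trains a linear SVM without an intercept by Pegasos-style subgradient descent and uses leave-one-out as one common cross-validation scheme (the text does not state which scheme was applied). The two-gene, mean-centred expression values are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, epochs=50):
    """Pegasos-style subgradient descent on the regularised hinge loss.

    X: list of feature vectors, y: labels in {-1, +1}. No intercept term,
    so features are assumed to be centred beforehand.
    """
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularisation shrink
            if margin < 1.0:  # sample violates the margin: hinge subgradient
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w

def predict(w, x):
    """+1 (tumor) or -1 (normal) according to the sign of the decision value."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

def leave_one_out_accuracy(X, y):
    """Train on all samples but one, test on the held-out sample, repeat."""
    correct = 0
    for i in range(len(X)):
        w = train_linear_svm(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict(w, X[i]) == y[i]
    return correct / len(X)

# Hypothetical mean-centred log2 expression for two up-regulated genes.
X = [[2.0, 1.5], [1.8, 2.2], [2.5, 1.0], [1.2, 1.9],          # tumor
     [-1.9, -1.6], [-2.2, -1.1], [-1.4, -2.0], [-1.0, -2.4]]  # normal
y = [1, 1, 1, 1, -1, -1, -1, -1]
loo_accuracy = leave_one_out_accuracy(X, y)
```

On real data the feature vectors would hold the expression values of the selected predictor genes, and a cross-validated accuracy below 100% would be expected.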
[0067] A highly appropriate test of predictive accuracy is to validate the predictor on an independent set of samples. Typically, in studies aimed at identifying a gene predictor, the samples are split into a training set and a validation set. Furthermore, when testing a predictor it is important to obtain some estimate of its accuracy. Therefore, to provide a superior means of testing the predictor, the present study did not split the data set and use leave-one-out validation. Instead, accuracy rates for the 25-gene predictor were obtained using three independent validation sets. The three validation sets comprised: an independent OSCC and normal sample set from the University of Pennsylvania, an OSCC sample set previously published and one obtained from the NCBI Gene Dataset (O'Donnell RK et al. Oncogene. 2005;24:1244-51). The 25-gene predictor had an overall accuracy ranging from 86-89% for these two validation sets. How these numbers compare with the early clinical diagnosis accuracy of OSCC is currently under investigation. It is difficult to determine why some samples were incorrectly classified. This may be the result of other tissue components within the sample, for example bone, or because some samples were mistakenly labeled.
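Scoring a fixed predictor against an independent validation set reduces to comparing its predictions with the known labels. The sketch below shows that bookkeeping under assumed, hypothetical labels; it does not reproduce any data from the study.

```python
from collections import Counter

def evaluate(true_labels, predicted_labels):
    """Return overall accuracy and a Counter of (true, predicted) label pairs."""
    counts = Counter(zip(true_labels, predicted_labels))
    correct = sum(n for (t, p), n in counts.items() if t == p)
    return correct / len(true_labels), counts

# Hypothetical independent validation set: 5 tumor and 4 normal samples.
true = ["tumor"] * 5 + ["normal"] * 4
pred = ["tumor", "tumor", "tumor", "tumor", "normal",
        "normal", "normal", "normal", "tumor"]
accuracy, counts = evaluate(true, pred)
```

Keeping the per-pair counts alongside the overall accuracy makes it possible to see which class the misclassified samples came from, which is a natural first step when asking why particular samples were incorrectly classified.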
[0068] Finally, the OSCC 25-gene predictor was tested for its ability to classify non-oral cavity tumor and normal samples. The OSCC predictor classified poorly, with accuracies of only 25% (75% of the samples were predicted incorrectly), when applied to microarray data sets (obtained from NCBI GEO DataSets) derived from non-OSCC human cancers. This indicated that the oral cancer gene predictor set was tissue specific. Some of the predictor genes have previously been implicated in OSCC development and progression. In addition, several of these genes present in Table 5, supra, are those differentially expressed in OSCC tumor and normal mucosa components.
[0069] Thus, a highly significant set of genes that are expressed in OSCC has been identified here, thereby providing an exemplary gene signature for use in the clinic for patient screening. To date, no accurate, cost-efficient, and reproducible methods exist that enable mass screening of patients for OSCC. The instant results demonstrate that such methods are available using an OSCC gene signature such as the example identified here.

Claims

What is Claimed:
1. A method of assessing the absence or presence of oral squamous cell carcinoma in a subject comprising:
determining, in an oral tissue sample from said subject, the expression levels of at least eight members of the set of genes comprising:
TU3A protein; beta A inhibin; matrix metalloproteinase 1; matrix metalloproteinase 9; chemokine ligand 13; matrix metalloproteinase 11; type V, alpha 2 collagen (first region); type III, alpha 1 collagen; 601658812R1 NIH_MGC_69 Homo sapiens cDNA clone IMAGE:3886131 3' mRNA; type I, alpha 1 collagen; Homo sapiens NKG5 gene; urokinase plasminogen activator; type V, alpha 1 collagen (first region); myosin X; lysyl oxidase-like 2; type V, alpha 1 collagen (second region); type V, alpha 2 collagen (second region); parathyroid hormone-like hormone; type IV, alpha 1 collagen; type XI, alpha 1 collagen; myosin IB; osteoblast specific factor 2; 602035015F1 NCI_CGAP_Brn64 Homo sapiens cDNA clone IMAGE:4183107 5' mRNA; snail homolog 2; and, GenBank Accession Number X60469;
and,
comparing said expression levels with expression levels of such genes in a patient in which oral squamous cell carcinoma is known to be present.
2. The method of claim 1 further comprising comparing said expression levels with expression levels of such genes in a patient in which oral squamous cell carcinoma is known to be absent.
3. The method according to claim 1 wherein said oral tissue sample is from a plurality of anatomical sites in the oral cavity of said subject.
4. The method according to claim 1 wherein said oral tissue sample is from at least one of the tongue, buccal mucosa, lips, mandible epithelia, gum, or mouth floor of said subject.
5. The method according to claim 4 wherein said oral tissue sample is from the tongue of said subject.
6. The method according to claim 1 wherein said oral tissue sample is derived from oral cavity saliva.
7. A method of assessing the absence or presence of oral squamous cell carcinoma in a subject comprising:
determining, in an oral tissue sample from said subject, the expression levels of at least eight members of the set of genes comprising:
TU3A protein; beta A inhibin; matrix metalloproteinase 1; matrix metalloproteinase 9; chemokine ligand 13; matrix metalloproteinase 11; type V, alpha 2 collagen (first region); type III, alpha 1 collagen; 601658812R1 NIH_MGC_69 Homo sapiens cDNA clone IMAGE:3886131 3' mRNA; type I, alpha 1 collagen; Homo sapiens NKG5 gene; urokinase plasminogen activator; type V, alpha 1 collagen (first region); myosin X; lysyl oxidase-like 2; type V, alpha 1 collagen (second region); type V, alpha 2 collagen (second region); parathyroid hormone-like hormone; type IV, alpha 1 collagen; type XI, alpha 1 collagen; myosin IB; osteoblast specific factor 2; 602035015F1 NCI_CGAP_Brn64 Homo sapiens cDNA clone IMAGE:4183107 5' mRNA; snail homolog 2; and, GenBank Accession Number X60469;
and,
comparing said expression levels with expression levels of such genes in a patient in which oral squamous cell carcinoma is known to be absent.
8. The method according to claim 9 wherein said oral tissue sample is from a plurality of anatomical sites in the oral cavity of said subject.
9. The method according to claim 9 wherein said oral tissue sample is from at least one of the tongue, buccal mucosa, lips, mandible epithelia, gum, or mouth floor of said subject.
10. The method according to claim 13 wherein said oral tissue sample is from the tongue of said subject.
11. The method according to claim 9 wherein said oral tissue sample is derived from oral cavity saliva.
12. A method of assessing the absence or presence of oral squamous cell carcinoma in a subject comprising:
determining, in an oral tissue sample from said subject, the expression levels of at least a subset of genes belonging to a class of oral squamous cell carcinoma-associated genes; and,
comparing said expression levels with expression levels of such genes in a patient in which oral squamous cell carcinoma is known to be present.
13. The method of claim 12 further comprising comparing said expression levels with expression levels of such genes in a patient in which oral squamous cell carcinoma is known to be absent.
14. A method of assessing the absence or presence of oral squamous cell carcinoma in a subject comprising:
determining, in an oral tissue sample from said subject, the expression levels of at least a subset of genes belonging to a class of oral squamous cell carcinoma-associated genes; and,
comparing said expression levels with expression levels of such genes in a patient in which oral squamous cell carcinoma is known to be absent.
15. A system for the diagnosis of oral squamous cell carcinoma in a tissue sample comprising:
a solid-support surface; and,
bound to said solid-support surface, target polynucleotides corresponding to at least a subset of genes belonging to the class of oral squamous cell carcinoma-associated genes.
16. The system according to claim 15 further comprising a hybridization solution.
17. The system according to claim 15 further comprising a wash solution.
18. The system according to claim 15 further comprising a stain solution.
19. A method of determining oral squamous cell carcinoma in a subject comprising:
contacting a tissue sample from said subject with a solid-support surface to which there are bound target polynucleotides corresponding to at least a subset of genes belonging to the class of oral squamous cell carcinoma-associated genes; and,
measuring the extent of adhesion between said tissue sample and said target polynucleotides.
20. The method according to claim 19 wherein the subset of genes comprises at least eight genes.
21. The method according to claim 19 wherein the class of oral squamous cell carcinoma-associated genes includes at least 20 genes from:
TU3A protein; beta A inhibin; matrix metalloproteinase 1; matrix metalloproteinase 9; chemokine ligand 13; matrix metalloproteinase 11; type V, alpha 2 collagen (first region); type III, alpha 1 collagen; 601658812R1 NIH_MGC_69 Homo sapiens cDNA clone IMAGE:3886131 3' mRNA; type I, alpha 1 collagen; Homo sapiens NKG5 gene; urokinase plasminogen activator; type V, alpha 1 collagen (first region); myosin X; lysyl oxidase-like 2; type V, alpha 1 collagen (second region); type V, alpha 2 collagen (second region); parathyroid hormone-like hormone; type IV, alpha 1 collagen; type XI, alpha 1 collagen; myosin IB; osteoblast specific factor 2; 602035015F1 NCI_CGAP_Brn64 Homo sapiens cDNA clone IMAGE:4183107 5' mRNA; snail homolog 2; and, GenBank Accession Number X60469.
PCT/US2007/021687 2006-10-19 2007-10-09 Rapid screening of oral squamous cell carcinoma WO2008051374A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US85320506P 2006-10-19 2006-10-19
US60/853,205 2006-10-19

Publications (3)

Publication Number Publication Date
WO2008051374A2 WO2008051374A2 (en) 2008-05-02
WO2008051374A9 WO2008051374A9 (en) 2008-06-26
WO2008051374A3 WO2008051374A3 (en) 2008-11-13

Family

ID=39325102

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/021687 WO2008051374A2 (en) 2006-10-19 2007-10-09 Rapid screening of oral squamous cell carcinoma

Country Status (1)

Country Link
WO (1) WO2008051374A2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8153370B2 (en) 2008-03-19 2012-04-10 The Board Of Trustees Of The University Of Illinois RNA from cytology samples to diagnose disease
WO2013021088A3 (en) * 2011-08-09 2013-08-01 Oncomatrix, S.L. Methods and products for in vitro diagnosis, in vitro prognosis and the development of drugs against invasive carcinomas

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040181344A1 (en) * 2002-01-29 2004-09-16 Massachusetts Institute Of Technology Systems and methods for providing diagnostic services
US7108969B1 (en) * 2000-09-08 2006-09-19 Affymetrix, Inc. Methods for detecting and diagnosing oral cancer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZIOBER ET AL.: 'Identification of a Gene Signature for Rapid Screening of Oral Squamous Cell Carcinoma' 15 October 2006, page 5962 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8153370B2 (en) 2008-03-19 2012-04-10 The Board Of Trustees Of The University Of Illinois RNA from cytology samples to diagnose disease
WO2013021088A3 (en) * 2011-08-09 2013-08-01 Oncomatrix, S.L. Methods and products for in vitro diagnosis, in vitro prognosis and the development of drugs against invasive carcinomas
US9702879B2 (en) 2011-08-09 2017-07-11 Oncomatryx Biopharma, S.L. Methods and products for in vitro diagnosis, in vitro prognosis and the development of drugs against invasive carcinomas

Also Published As

Publication number Publication date
WO2008051374A9 (en) 2008-06-26
WO2008051374A3 (en) 2008-11-13

Similar Documents

Publication Publication Date Title
Ziober et al. Identification of a gene signature for rapid screening of oral squamous cell carcinoma
JP5089993B2 (en) Prognosis of breast cancer
JP6140202B2 (en) Gene expression profiles to predict breast cancer prognosis
US7666595B2 (en) Biomarkers for predicting prostate cancer progression
JP4912894B2 (en) Identification of oncoprotein biomarkers using proteomic technology
US20080050726A1 (en) Methods for diagnosing pancreatic cancer
US20060252057A1 (en) Lung cancer prognostics
TWI609967B (en) Use of a device for prognosis prediction for melanoma cancer
CN111500718A (en) NANO46 gene and method for predicting breast cancer outcome
EP1668357A2 (en) Materials and methods relating to breast cancer classification
Chung et al. Gene expression profiling of papillary thyroid carcinomas in Korean patients by oligonucleotide microarrays
JP2021532735A (en) DNA Methylation Markers and Their Use for Non-Invasive Detection of Cancer
Dyrskjøt Classification of bladder cancer by microarray expression profiling: towards a general clinical use of microarrays in cancer diagnostics
US20090286240A1 (en) Biomarkers overexpressed in prostate cancer
Bueno et al. A diagnostic test for prostate cancer from gene expression profiling data
EP2278026A1 (en) A method for predicting clinical outcome of patients with breast carcinoma
WO2008051374A2 (en) Rapid screening of oral squamous cell carcinoma
US8008012B2 (en) Biomarkers downregulated in prostate cancer
US20140248637A1 (en) Composition for diagnosis of lung cancer and diagnosis kit of lung cancer
US8293469B2 (en) Biomarkers downregulated in prostate cancer
US20140287939A1 (en) Biomarker(s) for early detection / diagnosis/ prognosis of gastric cancer
CN113943802A (en) Application of GOLT1B in prognosis of renal cancer
EP2607494A1 (en) Biomarkers for lung cancer risk assessment
CN113322325A (en) Application of gene group as detection index in oral squamous cell carcinoma diagnosis
WO2023164595A2 (en) Methods for subtyping and treatment of head and neck squamous cell carcinoma

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07839451

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07839451

Country of ref document: EP

Kind code of ref document: A2