WO2009045115A1 - Proliferation signature and prognosis for gastrointestinal cancer - Google Patents

Publication number
WO2009045115A1
Authority
WO
WIPO (PCT)
Prior art keywords
expression
genes
cancer
patient
gastrointestinal cancer
Prior art date
Application number
PCT/NZ2008/000260
Other languages
French (fr)
Inventor
Ahmad Anjomshoaa
Anthony Edmund Reeve
Yu-Hsin Lin
Michael A Black
Original Assignee
Pacific Edge Biotechnology Ltd
Priority date
Filing date
Publication date
Priority to EP08835078A priority patent/EP2215254A4/en
Priority to KR1020107009975A priority patent/KR101727649B1/en
Priority to KR1020167011870A priority patent/KR101982763B1/en
Priority to KR1020187022213A priority patent/KR20180089565A/en
Application filed by Pacific Edge Biotechnology Ltd
Priority to JP2010527901A priority patent/JP5745848B2/en
Priority to CA2739004A priority patent/CA2739004C/en
Priority to KR1020207002358A priority patent/KR20200015788A/en
Priority to CN200880119316.5A priority patent/CN101932724B/en
Priority to KR1020227003193A priority patent/KR20220020404A/en
Priority to AU2008307830A priority patent/AU2008307830A1/en
Priority to KR1020207028269A priority patent/KR20200118226A/en
Publication of WO2009045115A1 publication patent/WO2009045115A1/en
Priority to US12/754,077 priority patent/US20110086349A1/en
Priority to US15/233,604 priority patent/US20170088900A1/en
Priority to US15/647,608 priority patent/US20180010198A1/en

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6834 Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C12Q1/6837 Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/57419 Specifically defined cancers of colon
    • G01N33/57446 Specifically defined cancers of stomach or intestine
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C12Q2600/158 Expression markers
    • C12Q2600/16 Primer sets for multiplex assays
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/60 Complex ways of combining multiple protein biomarkers for diagnosis

Definitions

  • This invention relates to methods and compositions for determining the prognosis of cancer, particularly gastrointestinal cancer, in a patient. Specifically, this invention relates to the use of genetic markers for determining the prognosis of cancer, such as gastrointestinal cancer, based on cell proliferation signatures.
  • Cellular proliferation is the most fundamental process in living organisms, and as such is precisely regulated by the expression level of proliferation-associated genes (1). Loss of proliferation control is a hallmark of cancer, and it is thus not surprising that growth-regulating genes are abnormally expressed in tumours relative to the neighbouring normal tissue (2). Proliferative changes may accompany other changes in cellular properties, such as invasion and ability to metastasize, and therefore could affect patient outcome. This association has attracted substantial interest and many studies have been devoted to the exploration of tumour cell proliferation as a potential indicator of outcome.
  • Ki-67 is a protein expressed in all cell cycle phases except for the resting phase G0 (4).
  • Using Ki-67, a clear association between the proportion of cycling cells and clinical outcome has been established in malignancies such as breast cancer, lung cancer, soft tissue tumours, and astrocytoma (5). In breast cancer, this association has also been confirmed by microarray analysis, leading to a proliferative gene expression profile that has been employed for identifying patients at increased risk of recurrence (6).
  • the proliferation index (PI) has produced conflicting results as a prognostic factor and therefore cannot be applied in a clinical context (see below). Studies vary with respect to patient selection, sampling methods, cut-off point levels, antibody choices, staining techniques and the way data have been collected and interpreted. The methodological differences and heterogeneity of these studies may partly explain the contradictory results (7), (8).
  • the use of Ki-67 as a proliferation marker also has limitations. The Ki-67 PI estimates the fraction of actively cycling cells, but gives no indication of cell cycle length (3), (9). Thus, tumours with a similar PI may grow at dissimilar rates due to different cycling speeds. In addition, while Ki-67 mRNA is not produced in resting cells, protein may still be detectable in a proportion of colorectal tumours, leading to an overestimated proliferation rate (10).
  • This invention provides further methods and compositions based on prognostic cancer markers, specifically gastrointestinal cancer prognostic markers, to aid in the prognosis and treatment of cancer.
  • microarray analysis is used to identify genes that provide a proliferation signature for cancer cells. These genes, and the proteins encoded by those genes, are herein termed gastrointestinal cancer proliferation markers (GCPMs).
  • GCPMs: gastrointestinal cancer proliferation markers
  • the cancer for prognosis is gastrointestinal cancer, particularly gastric or colorectal cancer.
  • the invention includes a method for determining the prognosis of a cancer by identifying the expression levels of at least one GCPM in a sample.
  • Selected GCPMs encode proteins that are associated with cell proliferation, e.g., cell cycle components. These GCPMs have the added utility in methods for determining the best treatment regime for a particular cancer based on the prognosis.
  • GCPM levels are higher in non-recurring tumour tissue as compared to recurring tumour tissue. These markers can be used either alone or in combination with each other, or other known cancer markers.
  • this invention includes a method for determining the prognosis of a cancer, comprising: (a) providing a sample of the cancer; (b) detecting the expression level of at least one GCPM family member in the sample; and (c) determining the prognosis of the cancer.
  • the invention includes a step of detecting the expression level of at least one GCPM RNA, for example, at least one mRNA. In a further aspect, the invention includes a step of detecting the expression level of at least one GCPM protein. In yet a further aspect, the invention includes a step of detecting the level of at least one GCPM peptide. In yet another aspect, the invention includes detecting the expression level of at least one GCPM family member in the sample. In an additional aspect, the GCPM is a gene associated with cell proliferation, such as a cell cycle component. In other aspects, the at least one GCPM is selected from Table A, Table B, Table C or Table D, herein.
  • the expression levels of all proliferation markers or their expression products are determined, for example, as listed in Table A, Table B, Table C or Table D; as listed for the group CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; or as listed for the group CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN
  • the invention includes a method of determining a treatment regime for a cancer comprising: (a) providing a sample of the cancer; (b) detecting the expression level of at least one GCPM family member in the sample; (c) determining the prognosis of the cancer based on the expression level of at least one GCPM family member; and (d) determining the treatment regime according to the prognosis.
  • An additional aspect of the invention includes a kit for detecting cancer, comprising: (a) a GCPM capture reagent; (b) a detector capable of detecting the captured GCPM, the capture reagent, or a complex thereof; and, optionally, (c) instructions for use.
  • the kit also includes a substrate for the GCPM as captured.
  • this invention includes a method for determining the prognosis of gastrointestinal cancer, especially colorectal or gastric cancer, comprising the steps of: (a) providing a sample, e.g., tumour sample, from a patient suspected of having gastrointestinal cancer; (b) measuring the presence of a GCPM protein using an ELISA method.
  • FIG. 4 Kaplan-Meier survival curves according to the expression level of GPS (gene proliferation signal) in gastric cancer patients. Overall survival is significantly shorter in patients with low GPS expression in this cohort of 38 gastric cancer patients of mixed stage. P values from Log rank test are indicated.
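The Kaplan-Meier curves referred to for FIG. 4 can be illustrated with a minimal sketch of the product-limit estimator. This is an illustration under stated assumptions, not the analysis used for the figure: the function name `kaplan_meier` and the toy follow-up data are hypothetical, and no log-rank test is included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each observed death time.

    times  -- follow-up duration per patient (any consistent unit)
    events -- True if death was observed at that time, False if censored
    Both inputs are hypothetical toy data, not values from the patent cohort.
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)      # patients still under observation
    survival = 1.0            # running product-limit estimate
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(pairs) and pairs[i][0] == t:
            if pairs[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            # S(t) = S(previous) * (1 - deaths / number at risk just before t)
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving    # deaths and censored patients both leave the risk set
    return curve


if __name__ == "__main__":
    # Four hypothetical patients; the second is censored at time 2.
    print(kaplan_meier([1, 2, 3, 4], [True, False, True, True]))
```

Computing one such curve for the high-GPS group and one for the low-GPS group gives the two curves that a figure of this kind compares.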
  • FIG. 5 A box-and-whisker plot showing differential expression between cycling cells in the exponential phase (EP) and growth-inhibited cells in the stationary phase (SP) of 11 QRT-PCR-validated genes.
  • the box range includes the 25th to the 75th percentiles of the data.
  • the horizontal line in the box represents the median value.
  • the “whiskers” are the largest and smallest values (excluding outliers). Any points more than 1.5 times the interquartile range from the end of a box are outliers and are presented as a dot.
  • the Y axis represents the log2 fold change of the ratio between cell line RNA and reference RNA. Analysis was performed using SPSS software.
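The box-and-whisker conventions described for FIG. 5 can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the source: the function names are hypothetical, and the quartile interpolation method (`method="inclusive"`) is an assumption that may differ from the method SPSS applies.

```python
import math
import statistics


def log2_ratio(cell_line_signal, reference_signal):
    """Log2 of the ratio between cell line RNA and reference RNA signals."""
    return math.log2(cell_line_signal / reference_signal)


def box_summary(values):
    """Quartiles, whisker ends and outliers using the 1.5 x IQR rule."""
    # Quartile method is an assumption; other software may interpolate differently.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inliers = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inliers),    # smallest value that is not an outlier
        "whisker_high": max(inliers),   # largest value that is not an outlier
        "outliers": outliers,           # drawn as individual dots
    }


if __name__ == "__main__":
    print(log2_ratio(8.0, 2.0))
    print(box_summary([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```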
  • the present disclosure has succeeded in (i) defining a CRC-specific gene proliferation signature (GPS) using a cell line model; and (ii) determining the prognostic significance of the GPS in the prediction of patient outcome and its association with clinico-pathologic variables in two independent cohorts of CRC patients.
  • GPS: CRC-specific gene proliferation signature
  • antibodies and like terms refer to immunoglobulin molecules and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain an antigen binding site that specifically binds (immunoreacts with) an antigen. These include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fc, Fab, Fab', and Fab2 fragments, and a Fab expression library. Antibody molecules relate to any of the classes IgG, IgM, IgA, IgE, and IgD, which differ from one another by the nature of heavy chain present in the molecule. These include subclasses as well, such as IgG1, IgG2, and others.
  • the light chain may be a kappa chain or a lambda chain.
  • Reference herein to antibodies includes a reference to all classes, subclasses, and types. Also included are chimeric antibodies, for example, monoclonal antibodies or fragments thereof that are specific to more than one source, e.g., a mouse or human sequence. Further included are camelid antibodies, shark antibodies or nanobodies.
  • “marker” refers to a molecule that is associated quantitatively or qualitatively with the presence of a biological phenomenon.
  • markers include a polynucleotide, such as a gene or gene fragment, RNA or RNA fragment; or a polypeptide such as a peptide, oligopeptide, protein, or protein fragment; or any related metabolites, by-products, or any other identifying molecules, such as antibodies or antibody fragments, whether related directly or indirectly to a mechanism underlying the phenomenon.
  • the markers of the invention include the nucleotide sequences (e.g., GenBank sequences) as disclosed herein, in particular, the full-length sequences, any coding sequences, any fragments, or any complements thereof.
  • GCPM: gastrointestinal cancer proliferation marker
  • GCPM family member refers to a marker with increased expression that is associated with a positive prognosis, e.g., a lower likelihood of cancer recurrence, as described herein, but can exclude molecules that are known in the prior art to be associated with prognosis of gastrointestinal cancer. It is to be understood that the term GCPM does not require that the marker be specific only for gastrointestinal tumours. Rather, expression of GCPM can be altered in other types of tumours, including malignant tumours.
  • Non-limiting examples of GCPMs are included in Table A, Table B, Table C or Table D, herein below, and include, but are not limited to, the specific group CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; and the specific group CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN1, MA
  • “cancer” and “cancerous” refer to or describe the physiological condition in mammals that is typically characterized by abnormal or unregulated cell growth. Cancer and cancer pathology can be associated, for example, with metastasis, interference with the normal functioning of neighbouring cells, release of cytokines or other secretory products at abnormal levels, suppression or aggravation of inflammatory or immunological response, neoplasia, premalignancy, malignancy, invasion of surrounding or distant tissues or organs, such as lymph nodes, etc.
  • gastrointestinal cancers such as esophageal, stomach, small bowel, large bowel, anal, and rectal cancers; gastric and colorectal cancers are particularly included.
  • colorectal cancer includes cancer of the colon, rectum, and/or anus, and especially, adenocarcinomas, and may also include carcinomas (e.g., squamous cloacogenic carcinomas), melanomas, lymphomas, and sarcomas. Epidermoid (nonkeratinizing squamous cell or basaloid) carcinomas are also included.
  • the cancer may be associated, for example, with chronic fistulas, irradiated anal skin, leukoplakia, lymphogranuloma venereum, Bowen's disease (intraepithelial carcinoma), condyloma acuminatum, or human papillomavirus.
  • the cancer may be associated with basal cell carcinoma, extramammary Paget's disease, cloacogenic carcinoma, or malignant melanoma.
  • differentially expressed gene refers to a gene whose expression is activated to a higher or lower level in a subject (e.g., test sample), specifically cancer, such as gastrointestinal cancer, relative to its expression in a control subject (e.g., control sample).
  • the terms also include genes whose expression is activated to a higher or lower level at different stages of the same disease; in recurrent or non-recurrent disease; or in cells with higher or lower levels of proliferation.
  • a differentially expressed gene may be either activated or inhibited at the polynucleotide level or polypeptide level, or may be subject to alternative splicing to result in a different polypeptide product. Such differences may be evidenced by a change in mRNA levels, surface expression, secretion or other partitioning of a polypeptide, for example.
  • microarray refers to an ordered arrangement of capture agents, preferably polynucleotides (e.g., probes) or polypeptides on a substrate. See, e.g., Microarray Analysis, M. Schena, John Wiley & Sons, 2002; Microarray Biochip Technology, M. Schena, ed., Eaton Publishing, 2000; Guide to Analysis of DNA Microarray Data, S. Knudsen, John Wiley & Sons, 2004; and Protein Microarray Technology, D. Kambhampati, ed., John Wiley & Sons, 2004.
  • mRNAs, RNAs, cDNAs, and genomic DNAs.
  • the term includes DNAs and RNAs that contain one or more modified bases, such as tritiated bases, or unusual bases, such as inosine.
  • the polynucleotides of the invention can encompass coding or non-coding sequences, or sense or antisense sequences.
  • prognosis refers to a prediction of medical outcome (e.g., likelihood of long-term survival); a negative prognosis, or bad outcome, includes a prediction of relapse, disease progression (e.g., tumour growth or metastasis, or drug resistance), or mortality; a positive prognosis, or good outcome, includes a prediction of disease remission (e.g., disease-free status), amelioration (e.g., tumour regression), or stabilization.
  • “prognostic signature,” “signature,” and the like refer to a set of two or more markers, for example GCPMs, that when analysed together as a set allow for the determination of or prediction of an event, for example the prognostic outcome of colorectal cancer.
  • “Stringent conditions” or “high stringency conditions”, as defined herein, typically: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 50°C; (2) employ a denaturing agent during hybridization, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate at 42°C; or (3) employ 50% formamide, 5X SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5X Denhardt's solution, sonicated salmon sperm DNA (50 µg/ml), 0.1% SDS, and 10% dextran sulfate at 42°
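The salt concentrations quoted in the stringency conditions above are multiples of the standard 1X SSC buffer. As a small worked sketch, assuming the conventional 1X SSC composition of 0.15 M sodium chloride and 0.015 M sodium citrate (consistent with the 5X SSC figures given in the text), the helper below scales that composition; the function and dictionary names are hypothetical.

```python
# Conventional 1X SSC composition (assumption, consistent with the quoted
# "5X SSC (0.75 M NaCl, 0.075 M sodium citrate)").
SSC_1X_MOLAR = {"sodium_chloride": 0.15, "sodium_citrate": 0.015}


def ssc_concentrations(fold):
    """Molar concentrations of each salt in `fold`-times SSC."""
    return {salt: fold * molar for salt, molar in SSC_1X_MOLAR.items()}


if __name__ == "__main__":
    # 5X SSC, as in the hybridization conditions (2) and (3).
    print(ssc_concentrations(5))
    # 0.1X SSC, matching the low ionic strength wash in condition (1).
    print(ssc_concentrations(0.1))
```

Under this assumption, the wash in condition (1) corresponds to 0.1X SSC and the hybridization buffers in conditions (2) and (3) to 5X SSC.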
  • Cell proliferation is an indicator of outcome in some malignancies. In colorectal cancer, however, discordant results have been reported. As these results are based on a single proliferation marker, the present invention discloses the use of microarrays to overcome this limitation, to reach a firmer conclusion, and to determine the prognostic role of cell proliferation in colorectal cancer.
  • the microarray-based proliferation studies shown herein indicate that reduced expression of the proliferation signature in colorectal cancer is associated with poor outcome. The invention can therefore be used to identify patients at high risk of early death from cancer.
  • a decrease in these markers is indicative of a negative prognosis.
  • This can include disease progression or the increased likelihood of cancer recurrence, especially for gastrointestinal cancer, such as gastric or colorectal cancer.
  • a decrease in expression can be determined, for example, by comparison of a test sample (e.g., tumour sample) to samples associated with a positive prognosis.
  • An increase in expression can be determined, for example, by comparison of a test sample (e.g., tumour samples) to samples associated with a negative prognosis.
  • a patient's sample (e.g., tumour sample) can be compared to samples with known patient outcome. If the patient's sample shows increased expression of GCPMs that is comparable to samples with good outcome, and/or higher than samples with poor outcome, then a positive prognosis is implicated. If the patient's sample shows decreased expression of GCPMs that is comparable to samples with poor outcome, and/or lower than samples with good outcome, then a negative prognosis is implicated.
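The comparison against samples with known outcome can be sketched for a single marker as below. This is a hedged illustration only: the source does not specify a decision rule, so the nearest-mean rule, the function name and the example expression values are all assumptions.

```python
from statistics import mean


def prognosis_from_references(patient_level, good_outcome_levels, poor_outcome_levels):
    """Classify one GCPM expression level against reference samples.

    Assumed rule (not from the source): the patient is assigned to whichever
    reference group has the closer mean expression level.
    """
    good_mean = mean(good_outcome_levels)
    poor_mean = mean(poor_outcome_levels)
    distance_to_good = abs(patient_level - good_mean)
    distance_to_poor = abs(patient_level - poor_mean)
    if distance_to_good < distance_to_poor:
        return "positive"       # comparable to good-outcome samples
    if distance_to_poor < distance_to_good:
        return "negative"       # comparable to poor-outcome samples
    return "indeterminate"      # exactly equidistant from both groups


if __name__ == "__main__":
    good = [8.1, 7.9, 8.0]   # hypothetical normalized expression, good outcome
    poor = [5.0, 5.2, 4.8]   # hypothetical normalized expression, poor outcome
    print(prognosis_from_references(7.5, good, poor))
    print(prognosis_from_references(5.5, good, poor))
```

A real classifier would need a validated cut-off and an estimate of uncertainty; the nearest-mean rule is used here only because it maps directly onto the wording "comparable to" in the text.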
  • a patient's sample can be compared to samples of actively proliferating/non-proliferating tumour cells.
  • the invention provides for a set of genes, identified from cancer patients with various stages of tumours, outlined in Table C that are shown to be prognostic for colorectal cancer. These genes are all associated with cell proliferation and establish a relationship between cell proliferation genes and their utility in cancer prognosis. It has also been found that the genes in the prognostic signature listed in Table C are also correlated with additional cell proliferation genes. Based on these findings, the invention also provides for a set of cell cycle genes, shown in Table D, that are differentially expressed between high and low proliferation groups, for use as prognostic markers.
  • the disclosed GCPMs therefore provide a useful tool for determining the prognosis of cancer, and establishing a treatment regime specific for that tumour.
  • a positive prognosis can be used by a patient to decide to pursue standard or less invasive treatment options.
  • a negative prognosis can be used by a patient to decide to terminate treatment or to pursue highly aggressive or experimental treatments.
  • a patient can choose treatments based on their impact on cell proliferation or the expression of cell proliferation markers (e.g., GCPMs).
  • treatments that specifically target cells with high proliferation or specifically decrease expression of cell proliferation markers would not be preferred for patients with gastrointestinal cancer, such as colorectal cancer or gastric cancer.
  • GCPMs can be detected in tumour tissue, tissue proximal to the tumour, lymph node samples, blood samples, serum samples, urine samples, or faecal samples, using any suitable technique, and can include, but is not limited to, oligonucleotide probes, quantitative PCR, or antibodies raised against the markers.
  • the expression level of one GCPM in the sample will be indicative of the likelihood of recurrence in that subject.
  • with a proliferation signature, the sensitivity and accuracy of prognosis will be increased. Therefore, multiple markers according to the present invention can be used to determine the prognosis of a cancer.
  • the present invention relates to a set of markers, in particular, GCPMs, the expression of which has prognostic value, specifically with respect to cancer-free survival.
  • the cancer is gastrointestinal cancer, particularly, gastric or colorectal cancer, and, in further aspects, the colorectal cancer is an adenocarcinoma.
  • the invention relates to a method of predicting the likelihood of long-term survival of a cancer patient without the recurrence of cancer, comprising determining the expression level of one or more proliferation markers or their expression products in a sample obtained from the patient, normalized against the expression level of all RNA transcripts or their products in the sample, or of a reference set of RNA transcripts or their expression products, wherein the proliferation marker is the transcript of one or more markers listed in Table A, Table B, Table C or Table D, herein.
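Normalization against a reference set of RNA transcripts can be sketched as follows. This is a minimal example under stated assumptions: the source does not fix a normalization formula, so subtracting the mean log2 signal of the reference set is an assumed choice, and the reference gene names `REF1` and `REF2` and all intensities are hypothetical.

```python
import math
from statistics import mean


def normalize_to_reference(intensities, marker_genes, reference_genes):
    """Return log2 marker expression relative to the mean log2 reference signal.

    intensities     -- dict of gene name -> raw signal (must be positive)
    marker_genes    -- proliferation markers to report (e.g., from Tables A to D)
    reference_genes -- reference transcripts used for normalization
    """
    reference_level = mean(math.log2(intensities[g]) for g in reference_genes)
    return {g: math.log2(intensities[g]) - reference_level for g in marker_genes}


if __name__ == "__main__":
    raw = {"PCNA": 800.0, "MCM6": 200.0, "REF1": 100.0, "REF2": 400.0}
    print(normalize_to_reference(raw, ["PCNA", "MCM6"], ["REF1", "REF2"]))
```

Working on the log2 scale makes a twofold change in either direction the same distance from zero, which keeps increases and decreases in expression comparable.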
  • a decrease in expression levels of one or more GCPM indicates a decreased likelihood of long-term survival without cancer recurrence, while an increase in expression levels of one or more GCPM indicates an increased likelihood of long-term survival without cancer recurrence.
  • the expression levels of one or more, for example at least two, or at least 3, or at least 4, or at least 5, or at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or at least 75 of the proliferation markers or their expression products are determined, e.g., as selected from Table A, Table B, Table C or Table D; as selected from CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and
  • the method comprises the determination of the expression levels of all proliferation markers or their expression products, e.g., as listed in Table A, Table B, Table C or Table D; as listed for the group CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; or as listed for the group CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM3, M
  • RNA is isolated from a fixed, wax-embedded cancer tissue specimen of the patient. Isolation may be performed by any technique known in the art, for example from core biopsy tissue or fine needle aspirate cells.
  • the invention relates to an array comprising polynucleotides hybridizing to two or more markers as selected from Table A, Table B, Table C or Table D; as selected from CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; or as selected from CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN
  • the array comprises polynucleotides hybridizing to the full set of markers listed in Table A, Table B, Table C or Table D; as listed for the group CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; or as listed for the group CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7),
  • the polynucleotides can be cDNAs, or oligonucleotides, and the solid surface on which they are displayed can be glass, for example.
  • the polynucleotides can hybridize to one or more of the markers as disclosed herein, for example, to the full-length sequences, any coding sequences, any fragments, or any complements thereof.
  • the invention relates to a method of predicting the likelihood of long-term survival of a patient diagnosed with cancer, without the recurrence of cancer, comprising the steps of: (1) determining the expression levels of the RNA transcripts or the expression products of the full set or a subset of the markers listed in Table A, Table B, Table C or Table D, herein, in a sample obtained from the patient, normalized against the expression levels of all RNA transcripts or their expression products in the sample, or of a reference set of RNA transcripts or their products; (2) subjecting the data obtained in step (1) to statistical analysis; and (3) determining whether the likelihood of the long-term survival has increased or decreased.
  • the invention concerns a method of preparing a personalized genomics profile for a patient, e.g., a cancer patient, comprising the steps of: (a) subjecting a sample obtained from the patient to expression analysis; (b) determining the expression level of one or more markers selected from the marker set listed in any one of Table A, Table B, Table C or Table D, wherein the expression level is normalized against a control gene or genes and optionally is compared to the amount found in a reference set; and (c) creating a report summarizing the data obtained by the expression analysis.
  • the report may, for example, include prediction of the likelihood of long term survival of the patient and/or recommendation for a treatment modality of the patient.
  • the relatively low expression of proliferation markers is associated with poor outcome. This can include disease progression or the increased likelihood of cancer recurrence, especially for gastrointestinal cancer, such as gastric or colorectal cancer.
  • the relatively high expression of proliferation markers is associated with a good outcome. This can include decreased likelihood of cancer recurrence after standard treatment, especially for gastrointestinal cancer, such as gastric or colorectal cancer.
  • Low expression can be determined, for example, by comparison of a test sample (e.g., tumour sample) to samples associated with a positive prognosis.
  • High expression can be determined, for example, by comparison of a test sample (e.g., tumour sample) to samples associated with a negative prognosis.
  • a patient's sample (e.g., tumour sample) can be compared to samples with known patient outcome. If the patient's sample shows high expression of GCPMs that is comparable to samples with good outcome, and/or higher than samples with poor outcome, then a positive prognosis is implicated. If the patient's sample shows low expression of GCPMs that is comparable to samples with poor outcome, and/or lower than samples with good outcome, then a negative prognosis is implicated.
  • a patient's sample can be compared to samples of actively proliferating/non-proliferating tumour cells. If the patient's sample shows high expression of GCPMs that is comparable to actively proliferating cells, and/or higher than non-proliferating cells, then a positive prognosis is implicated. If the patient's sample shows low expression of GCPMs that is comparable to non-proliferating cells, and/or lower than actively proliferating cells, then a negative prognosis is implicated.
  • the expression levels of a prognostic signature comprising two or more GCPMs from a patient's sample can be compared to samples of recurrent/non-recurrent cancer. If the patient's sample shows increased or decreased expression of GCPMs by comparison to samples of non-recurrent cancer, and/or comparable expression to samples of recurrent cancer, then a negative prognosis is implicated. If the patient's sample shows expression of GCPMs that is comparable to samples of non-recurrent cancer, and/or lower or higher expression than samples of recurrent cancer, then a positive prognosis is implicated.
  • a prediction method can be applied to a panel of markers, for example the panel of GCPMs outlined in Table A, Table B, Table C or Table D, in order to generate a predictive model. This involves the generation of a prognostic signature, comprising two or more GCPMs.
  • GCPMs in Table A, Table B, Table C or Table D therefore provide a useful set of markers to generate prediction signatures for determining the prognosis of cancer, and establishing a treatment regime, or treatment modality, specific for that tumour.
  • a positive prognosis can be used by a patient to decide to pursue standard or less invasive treatment options.
  • a negative prognosis can be used by a patient to decide to terminate treatment or to pursue highly aggressive or experimental treatments.
  • a patient can choose treatments based on their impact on the expression of prognostic markers (e.g., GCPMs).
  • GCPMs can be detected in tumour tissue, tissue proximal to the tumour, lymph node samples, blood samples, serum samples, urine samples, or faecal samples, using any suitable technique, and can include, but is not limited to, oligonucleotide probes, quantitative PCR, or antibodies raised against the markers. It will be appreciated that by analyzing the presence and amounts of expression of a plurality of GCPMs in the form of prediction signatures, and constructing a prognostic signature, the sensitivity and accuracy of prognosis will be increased. Therefore, multiple markers according to the present invention can be used to determine the prognosis of a cancer.
  • RNA is isolated from a fixed, wax-embedded cancer tissue specimen of the patient. Isolation may be performed by any technique known in the art, for example from core biopsy tissue or fine needle aspirate cells.
  • the invention relates to a method of predicting a prognosis, e.g., the likelihood of long-term survival of a cancer patient without the recurrence of cancer, comprising determining the expression level of one or more prognostic markers or their expression products in a sample obtained from the patient, normalized against the expression level of other RNA transcripts or their products in the sample, or of a reference set of RNA transcripts or their expression products.
  • the prognostic marker is one or more markers listed in Table A, Table B, Table C or Table D or is included as one or more of the prognostic signatures derived from the markers listed in Table A, Table B, Table C or Table D.
  • the expression levels of the prognostic markers or their expression products are determined, e.g., for the markers listed in Table A, Table B, Table C or Table D, a prognostic signature derived from the markers listed in Table A, Table B, Table C or Table D.
  • the method comprises the determination of the expression levels of a full set of prognosis markers or their expression products, e.g., for the markers listed in Table A, Table B, Table C or Table D, or, a prognostic signature derived from the markers listed in Table A, Table B, Table C or Table D.
  • the invention relates to an array (e.g., microarray) comprising polynucleotides hybridizing to two or more markers, e.g., for the markers listed in Table A, Table B, Table C or Table D, or a prognostic signature derived from the markers listed in Table A, Table B, Table C or Table D.
  • the array comprises polynucleotides hybridizing to a prognostic signature, e.g., a prognostic signature derived from the markers listed in Table A, Table B, Table C or Table D.
  • the array comprises polynucleotides hybridizing to the full set of markers, e.g., for the markers listed in Table A, Table B, Table C or Table D, or, e.g., for a prognostic signature.
  • the polynucleotides can be cDNAs, or oligonucleotides, and the solid surface on which they are displayed can be glass, for example.
  • the polynucleotides can hybridize to one or more of the markers as disclosed herein, for example, to the full-length sequences, any coding sequences, any fragments, or any complements thereof.
  • an increase or decrease in expression levels of one or more GCPM indicates a decreased likelihood of long-term survival, e.g., due to cancer recurrence, while a lack of an increase or decrease in expression levels of one or more GCPM indicates an increased likelihood of long-term survival without cancer recurrence.
  • the invention relates to a kit comprising one or more of: (1) extraction buffer/reagents and protocol; (2) reverse transcription buffer/reagents and protocol; and (3) quantitative PCR buffer/reagents and protocol suitable for performing any of the foregoing methods. Other aspects and advantages of the invention are illustrated in the description and examples included herein.
  • Table A: Proliferation-related genes differentially expressed between cell lines in high and low proliferative states. Genes that were differentially expressed between cell lines in confluent (low proliferation) and semi-confluent (high proliferation) states (see Figure 1) were identified by microarray analysis on 30K MWG Biotech arrays. Table A comprises the subset of these genes that were categorized by gene ontology analysis as cell proliferation-related. Table B: GCPMs for cell proliferation signature
  • Table B: Known cell proliferation-related genes. All genes categorized as cell proliferation-related by gene ontology analysis and present on the Affymetrix HG-U133 platform.
  • the following approaches are non-limiting methods that can be used to detect the proliferation markers, including GCPM family members: microarray approaches using oligonucleotide probes selective for a GCPM; real-time qPCR on tumour samples using GCPM specific primers and probes; real-time qPCR on lymph node, blood, serum, faecal, or urine samples using GCPM specific primers and probes; enzyme-linked immunological assays (ELISA); immunohistochemistry using anti-marker antibodies; and analysis of array or qPCR data using computers.
  • Primary data can be collected and fold change analysis can be performed, for example, by comparison of marker expression levels in tumour tissue and non-tumour tissue; by comparison of marker expression levels to levels determined in recurring tumours and non-recurring tumours; by comparison of marker expression levels to levels determined in tumours with or without metastasis; by comparison of marker expression levels to levels determined in differently staged tumours; or by comparison of marker expression levels to levels determined in cells with different levels of proliferation.
  • a negative or positive prognosis is determined based on this analysis. Further analysis of tumour marker expression includes matching those markers exhibiting increased or decreased expression with expression profiles of known gastrointestinal tumours to provide a prognosis.
  • a threshold for concluding that expression is increased is provided as, for example, at least a 1.5-fold or 2-fold increase, and in alternative embodiments, at least a 3-fold increase, 4-fold increase, or 5-fold increase.
  • a threshold for concluding that expression is decreased is provided as, for example, at least a 1.5-fold or 2-fold decrease, and in alternative embodiments, at least a 3-fold decrease, 4-fold decrease, or 5-fold decrease. It can be appreciated that other thresholds for concluding that increased or decreased expression has occurred can be selected without departing from the scope of this invention.
  • a threshold for concluding that expression is increased will be dependent on the particular marker and also the particular predictive model that is to be applied.
  • the threshold is generally set to achieve the highest sensitivity and selectivity with the lowest error rate, although variations may be desirable for a particular clinical situation.
  • the desired threshold is determined by analysing a population of sufficient size taking into account the statistical variability of any predictive model and is calculated from the size of the sample used to produce the predictive model. The same applies for the determination of a threshold for concluding that expression is decreased. It can be appreciated that other thresholds, or methods for establishing a threshold, for concluding that increased or decreased expression has occurred can be selected without departing from the scope of this invention.
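The fold-change thresholds described above can be sketched in a few lines. This is only an illustration, assuming linear-scale (not log-transformed) expression values; the function name `fold_change_call` and the default threshold of 2.0 are choices made for this example and, as the text notes, the appropriate threshold depends on the marker and the predictive model.

```python
def fold_change_call(test_level, reference_level, threshold=2.0):
    """Classify expression as 'increased', 'decreased', or 'unchanged'.

    Both levels are assumed to be positive, linear-scale expression values.
    The threshold is a fold change (e.g. 1.5, 2, 3, 4 or 5).
    """
    if test_level <= 0 or reference_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = test_level / reference_level
    if ratio >= threshold:
        return "increased"          # at least `threshold`-fold higher
    if ratio <= 1.0 / threshold:
        return "decreased"          # at least `threshold`-fold lower
    return "unchanged"


# Hypothetical values, for illustration only.
print(fold_change_call(300.0, 100.0))                 # 3-fold higher
print(fold_change_call(150.0, 100.0, threshold=1.5))  # exactly 1.5-fold
```

Treating an increase and a decrease symmetrically (a ratio of `threshold` versus `1/threshold`) is one possible convention; a model could equally use different thresholds in each direction.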
  • a prediction model may produce as its output a numerical value, for example a score, likelihood value or probability.
  • a negative prognosis is associated with decreased expression of at least one proliferation marker.
  • a positive prognosis is associated with increased expression of at least one proliferation marker.
  • an increase in expression is shown by at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or 75 of the markers disclosed herein.
  • a decrease in expression is shown by at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or 75 of the markers disclosed herein.
  • proliferation signatures comprising one or more GCPMs can be used to determine the prognosis of a cancer, by comparing the expression level of the one or more genes to the disclosed proliferation signature. By comparing the expression of one or more of the GCPMs in a tumour sample with the disclosed proliferation signature, the likelihood of the cancer recurring can be determined.
  • the comparison of expression levels of the prognostic signature to establish a prognosis can be done by applying a predictive model as described previously.
  • Determining the likelihood of the cancer recurring is of great value to the medical practitioner.
  • a high likelihood of recurrence means that a longer or higher dose treatment should be given, and the patient should be more closely monitored for signs of recurrence of the cancer.
  • An accurate prognosis is also of benefit to the patient. It allows the patient, along with their partners, family, and friends to also make decisions about treatment, as well as decisions about their future and lifestyle changes. Therefore, the invention also provides for a method establishing a treatment regime for a particular cancer based on the prognosis established by matching the expression of the markers in a tumour sample with the differential proliferation signature.
  • the marker selection, or construction of a proliferation signature does not have to be restricted to the GCPMs disclosed in Table A, Table B, Table C or Table D, herein, but could involve the use of one or more GCPMs from the disclosed signature, or a new signature may be established using GCPMs selected from the disclosed marker lists.
  • the requirement of any signature is that it predicts the likelihood of recurrence with enough accuracy to assist a medical practitioner to establish a treatment regime.
  • the present invention also provides for the use of a marker associated with cell proliferation, e.g., a cell cycle component, as a GCPM.
  • determination of the likelihood of a cancer recurring can be accomplished by measuring expression of one or more proliferation-specific markers.
  • the methods provided herein also include assays of high sensitivity.
  • qPCR is extremely sensitive, and can be used to detect markers in very low copy number (e.g., 1 - 100) in a sample. With such sensitivity, prognosis of gastrointestinal cancer is made reliable, accurate, and easily tested.
  • RT-PCR Reverse Transcription PCR
  • the starting material is typically total RNA isolated from human tumours or tumour cell lines, and corresponding normal tissues or cell lines, respectively.
  • RNA can be isolated from a variety of samples, such as tumour samples from breast, lung, colon (e.g., large bowel or small bowel), colorectal, gastric, esophageal, anal, rectal, prostate, brain, liver, kidney, pancreas, spleen, thymus, testis, ovary, uterus, etc., tissues, from primary tumours, or tumour cell lines, and from pooled samples from healthy donors.
  • RNA can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g., formalin-fixed) tissue samples.
  • the first step in gene expression profiling by RT-PCR is the reverse transcription of the RNA template into cDNA, followed by its exponential amplification in a PCR reaction.
  • the two most commonly used reverse transcriptases are avian myeloblastosis virus reverse transcriptase (AMV-RT) and Moloney murine leukaemia virus reverse transcriptase (MMLV-RT).
  • the reverse transcription step is typically primed using specific primers, random hexamers, or oligo-dT primers, depending on the circumstances and the goal of expression profiling.
  • extracted RNA can be reverse-transcribed using a GeneAmp RNA PCR kit (Perkin Elmer, CA, USA), following the manufacturer's instructions.
  • the derived cDNA can then be used as a template in the subsequent PCR reaction.
  • although the PCR step can use a variety of thermostable DNA-dependent DNA polymerases, it typically employs the Taq DNA polymerase, which has a 5'-3' nuclease activity but lacks a 3'-5' proofreading exonuclease activity.
  • TaqMan® PCR typically utilizes the 5' nuclease activity of Taq or Tth polymerase to hydrolyze a hybridization probe bound to its target amplicon, but any enzyme with equivalent 5' nuclease activity can be used.
  • a third oligonucleotide, or probe, is designed to detect a nucleotide sequence located between the two PCR primers.
  • the probe is non-extendible by Taq DNA polymerase enzyme, and is labeled with a reporter fluorescent dye and a quencher fluorescent dye. Any laser-induced emission from the reporter dye is quenched by the quenching dye when the two dyes are located close together as they are on the probe.
  • the Taq DNA polymerase enzyme cleaves the probe in a template-dependent manner.
  • the resultant probe fragments disassociate in solution, and signal from the released reporter dye is free from the quenching effect of the second fluorophore.
  • One molecule of reporter dye is liberated for each new molecule synthesized, and detection of the unquenched reporter dye provides the basis for quantitative interpretation of the data.
  • TaqMan RT-PCR can be performed using commercially available equipment, such as, for example, ABI PRISM 7700™ Sequence Detection System (Perkin-Elmer-Applied Biosystems, Foster City, CA, USA), or Lightcycler (Roche Molecular Biochemicals, Mannheim, Germany).
  • the 5' nuclease procedure is run on a real-time quantitative PCR device such as the ABI PRISM 7700™ Sequence Detection System.
  • the system consists of a thermocycler, laser, charge-coupled device (CCD), camera, and computer.
  • the system amplifies samples in a 96-well format on a thermocycler.
  • laser-induced fluorescent signal is collected in real-time through fibre optic cables for all 96 wells, and detected at the CCD.
  • the system includes software for running the instrument and for analyzing the data.
  • 5' nuclease assay data are initially expressed as Ct, or the threshold cycle.
  • fluorescence values are recorded during every cycle and represent the amount of product amplified to that point in the amplification reaction. The point when the fluorescent signal is first recorded as statistically significant is the threshold cycle.
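As a rough illustration of how a threshold cycle might be read from per-cycle fluorescence values, the sketch below sets the threshold at the baseline mean plus a multiple of the baseline standard deviation. That rule, the default of 10 standard deviations, the number of baseline cycles, and the function name `threshold_cycle` are all assumptions made for this example; the text only states that Ct is the point at which the signal is first recorded as statistically significant.

```python
from statistics import mean, stdev


def threshold_cycle(fluorescence, baseline_cycles=5, n_sd=10.0):
    """Return the 1-based cycle at which signal first exceeds the threshold.

    The threshold is the mean of the first `baseline_cycles` readings plus
    `n_sd` standard deviations of those readings (an assumed convention).
    Returns None if the signal never exceeds the threshold.
    """
    baseline = fluorescence[:baseline_cycles]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for cycle, value in enumerate(fluorescence, start=1):
        if cycle > baseline_cycles and value > threshold:
            return cycle
    return None


# Hypothetical fluorescence readings, one per amplification cycle.
readings = [1.0, 1.1, 0.9, 1.0, 1.0, 1.05, 1.2, 2.0, 4.0, 8.0]
print(threshold_cycle(readings))
```

Instrument software typically interpolates a fractional Ct between cycles; this sketch returns only the whole cycle number.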
  • RT-PCR is usually performed using an internal standard.
  • the ideal internal standard is expressed at a constant level among different tissues, and is unaffected by the experimental treatment.
  • RNAs most frequently used to normalize patterns of gene expression are mRNAs for the housekeeping genes glyceraldehyde-3-phosphate-dehydrogenase (GAPDH) and β-actin.
  • Real-time quantitative PCR: A more recent variation of the RT-PCR technique is real-time quantitative PCR, which measures PCR product accumulation through a dual-labeled fluorogenic probe (i.e., TaqMan® probe).
  • Real time PCR is compatible both with quantitative competitive PCR and with quantitative comparative PCR.
  • the former uses an internal competitor for each target sequence for normalization, while the latter uses a normalization gene contained within the sample, or a housekeeping gene for RT-PCR.
  • For further details see, e.g., Held et al., Genome Research 6: 986-994 (1996).
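Normalization of a target Ct against a housekeeping gene can be sketched as follows. The example uses the widely used comparative (2^-ΔΔCt) calculation, which is not named in the text and is shown here only as one possible way to implement comparative normalization; it assumes roughly 100% amplification efficiency (product doubles each cycle). The function names and Ct values are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """Ct of the target gene minus Ct of the reference (housekeeping) gene."""
    return ct_target - ct_reference


def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change of the target in a test sample relative to a calibrator.

    Assumes product doubles each cycle, so one Ct unit is a 2-fold change.
    """
    ddct = (delta_ct(ct_target, ct_reference)
            - delta_ct(ct_target_calibrator, ct_reference_calibrator))
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier in
# the test sample than in the calibrator, with the reference gene unchanged.
print(relative_expression(24.0, 18.0, 26.0, 18.0))
```

A lower Ct means more starting template, which is why the exponent is negated.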
  • PCR primers and probes are designed based upon intron sequences present in the gene to be amplified.
  • the first step in the primer/probe design is the delineation of intron sequences within the genes. This can be done by publicly available software, such as the DNA BLAT software developed by Kent, W. J., Genome Res. 12 (4): 656-64 (2002), or by the BLAST software including its variations. Subsequent steps follow well established methods of PCR primer and probe design.
  • PCR primer design: The most important factors considered in PCR primer design include primer length, melting temperature (Tm), G/C content, specificity, complementary primer sequences, and 3' end sequence.
  • optimal PCR primers are generally 17-30 bases in length, and contain about 20-80%, such as, for example, about 50-60% G+C bases.
  • Tm values between 50 and 80 °C, e.g., about 50 to 70 °C, are typically preferred.
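The length, G+C content and Tm ranges given above lend themselves to a simple screening check. The sketch below is illustrative only: the Tm estimate uses the simple 2 °C per A/T plus 4 °C per G/C rule of thumb, which is an assumption for this example (it is a rough approximation intended for short oligonucleotides, and dedicated primer design software uses more accurate models). The function names and the primer sequence are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def rough_tm(primer):
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C (assumed rule)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def check_primer(primer, min_len=17, max_len=30,
                 min_gc=0.20, max_gc=0.80, min_tm=50, max_tm=80):
    """Check a primer against the length, G+C and Tm ranges quoted in the text."""
    return {
        "length_ok": min_len <= len(primer) <= max_len,
        "gc_ok": min_gc <= gc_fraction(primer) <= max_gc,
        "tm_ok": min_tm <= rough_tm(primer) <= max_tm,
    }


# Hypothetical 20-base primer, for illustration only.
print(check_primer("ATGCATGCATGCATGCATGC"))
```

Specificity, primer complementarity and 3' end sequence, which the text also lists, are not covered by this check.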
  • Differential gene expression can also be identified, or confirmed, using the microarray technique.
  • the expression profile of GCPMs can be measured in either fresh or paraffin-embedded tumour tissue, using microarray technology.
  • polynucleotide sequences of interest including cDNAs and oligonucleotides
  • the arrayed sequences i.e., capture probes
  • The source of RNA typically is total RNA isolated from human tumours or tumour cell lines, and corresponding normal tissues or cell lines.
  • RNA can be isolated from a variety of primary tumours or tumour cell lines. If the source of RNA is a primary tumour, RNA can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g., formalin-fixed) tissue samples, which are routinely prepared and preserved in everyday clinical practice.
  • PCR amplified inserts of cDNA clones are applied to a substrate.
  • the substrate can include up to 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or 75 nucleotide sequences. In other aspects, the substrate can include at least 10,000 nucleotide sequences.
  • the microarrayed sequences, immobilized on the microchip, are suitable for hybridization under stringent conditions.
  • the targets for the microarrays can be at least 50, 100, 200, 400, 500, 1000, or 2000 bases in length; or 50-100, 100-200, 100-500, 100-1000, 100-2000, or 500-5000 bases in length.
  • the capture probes for the microarrays can be at least 10, 15, 20, 25, 50, 75, 80, or 100 bases in length; or 10-15, 10-20, 10-25, 10-50, 10-75, 10-80, or 20-80 bases in length.
  • Fluorescently labeled cDNA probes may be generated through incorporation of fluorescent nucleotides by reverse transcription of RNA extracted from tissues of interest. Labeled cDNA probes applied to the chip hybridize with specificity to each spot of DNA on the array. After stringent washing to remove non-specifically bound probes, the chip is scanned by confocal laser microscopy or by another detection method, such as a CCD camera. Quantitation of hybridization of each arrayed element allows for assessment of corresponding mRNA abundance. With dual colour fluorescence, separately labeled cDNA probes generated from two sources of RNA are hybridized pairwise to the array. The relative abundance of the transcripts from the two sources corresponding to each specified gene is thus determined simultaneously.
  • the miniaturized scale of the hybridization affords a convenient and rapid evaluation of the expression pattern for large numbers of genes.
  • Such methods have been shown to have the sensitivity required to detect rare transcripts, which are expressed at a few copies per cell, and to reproducibly detect at least approximately two-fold differences in the expression levels (Schena et al., Proc. Natl. Acad. Sci. USA 93 (20): 10614-10619 (1996)).
  • Microarray analysis can be performed by commercially available equipment, following manufacturer's protocols, such as by using the Affymetrix GeneChip technology, or Incyte's microarray technology.
  • the development of microarray methods for large-scale analysis of gene expression makes it possible to search systematically for molecular markers of cancer classification and outcome prediction in a variety of tumour types.
  • RNA isolation, purification, and amplification: General methods for mRNA extraction are well known in the art and are disclosed in standard textbooks of molecular biology, including Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons (1997). Methods for RNA extraction from paraffin embedded tissues are disclosed, for example, in Rupp and Locker, Lab Invest. 56: A67 (1987), and De Sandres et al., BioTechniques 18: 42-44 (1995).
  • RNA isolation can be performed using a purification kit, buffer set, and protease from commercial manufacturers, such as Qiagen, according to the manufacturer's instructions. For example, total RNA from cells in culture can be isolated using Qiagen RNeasy mini-columns.
  • RNA isolation kits include MasterPure Complete DNA and RNA Purification Kit (EPICENTRE, Madison, WI), and Paraffin Block RNA Isolation Kit (Ambion, Inc.). Total RNA from tissue samples can be isolated using RNA Stat-60 (Tel-Test). RNA prepared from tumour can be isolated, for example, by cesium chloride density gradient centrifugation.
  • Protocols for RNA isolation, purification, primer extension and amplification are given in various published journal articles (for example: T. E. Godfrey et al., J. Molec. Diagnostics 2: 84-91 (2000); K. Specht et al., Am. J. Pathol. 158: 419-29 (2001)).
  • a representative process starts with cutting about 10 μm thick sections of paraffin-embedded tumour tissue samples. The RNA is then extracted, and protein and DNA are removed.
  • RNA repair and/or amplification steps may be included, if necessary, and RNA is reverse transcribed using gene-specific primers followed by RT-PCR. Finally, the data are analyzed to identify the best treatment option(s) available to the patient on the basis of the characteristic gene expression pattern identified in the tumour sample examined.
  • antibodies or antisera preferably polyclonal antisera, and most preferably monoclonal antibodies specific for each marker, are used to detect expression.
  • the antibodies can be detected by direct labeling of the antibodies themselves, for example, with radioactive labels, fluorescent labels, hapten labels such as biotin, or an enzyme such as horseradish peroxidase or alkaline phosphatase.
  • unlabeled primary antibody is used in conjunction with a labeled secondary antibody, comprising antisera, polyclonal antisera or a monoclonal antibody specific for the primary antibody. Immunohistochemistry protocols and kits are well known in the art and are commercially available.
  • Proteomics can be used to analyze the polypeptides present in a sample (e.g., tissue, organism, or cell culture) at a certain point of time.
  • proteomic techniques can be used to assess the global changes of protein expression in a sample (also referred to as expression proteomics).
  • Proteomic analysis typically includes: (1) separation of individual proteins in a sample by 2-D gel electrophoresis (2-D PAGE); (2) identification of the individual proteins recovered from the gel, e.g., by mass spectrometry or N-terminal sequencing; and (3) analysis of the data using bioinformatics.
  • Proteomics methods are valuable supplements to other methods of gene expression profiling, and can be used, alone or in combination with other methods, to detect the products of the proliferation markers of the present invention.
  • Microarray experiments typically involve the simultaneous measurement of thousands of genes. If one is comparing the expression levels for a particular gene between two groups (for example recurrent and non-recurrent tumours), the typical tests for significance (such as the t-test) are not adequate. This is because, in an ensemble of thousands of experiments (in this context each gene constitutes an "experiment"), the probability of at least one experiment passing the usual criteria for significance by chance alone is essentially unity. In a test for significance, one typically calculates the probability that the "null hypothesis" is correct. In the case of comparing two groups, the null hypothesis is that there is no difference between the two groups.
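The statement that at least one gene will pass a conventional significance test by chance can be checked with a short calculation. The sketch below assumes the tests are independent, which is a simplification for real microarray data, and also shows a Bonferroni-adjusted per-test threshold; Bonferroni correction is not named in the text and is included only as one familiar way of compensating. The function names are hypothetical.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests passes by chance.

    Each test has false-positive probability `alpha` under the null hypothesis.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise rate at alpha."""
    return alpha / n_tests


for n in (1, 10, 100, 10000):
    print(n, prob_at_least_one_false_positive(n), bonferroni_threshold(n))
```

With thousands of genes the first quantity is indistinguishable from 1, which is the point made in the text.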
  • Data Mining is the term used to describe the extraction of "knowledge", in other words the "know-how", or predictive ability from (usually) large volumes of data (the dataset). This is the approach used in this study to generate prognostic signatures.
  • the "know-how" is the ability to accurately predict prognosis from a given set of gene expression measurements, or "signature" (as described generally in this section and in more detail in the examples section).
  • Data mining (49), and the related topic of machine learning (40), is a complex, repetitive mathematical task that involves the use of one or more appropriate computer software packages (see below).
  • the use of software is advantageous on the one hand, in that one does not need to be completely familiar with the intricacies of the theory behind each technique in order to successfully use data mining techniques, provided that one adheres to the correct methodology.
  • the disadvantage is that the application of data mining can often be viewed as a "black box": one inserts the data and receives the answer. How this is achieved is often masked from the end-user (this is the case for many of the techniques described, and can often influence the statistical method chosen for data mining).
  • neural networks and support vector machines have a particularly complex implementation that makes it very difficult for the end user to extract out the "rules" used to produce the decision.
  • k-nearest neighbours and linear discriminant analysis have a very transparent process for decision making that is not hidden from the user.
  • There are two types of approach used in data mining: supervised and unsupervised approaches.
  • In the supervised approach, the information that is being linked to the data is known, such as categorical data (e.g. recurrent vs. non-recurrent tumours). What is required is the ability to link the observed response (e.g. recurrence vs. non-recurrence) to the input variables.
  • In the unsupervised approach, the classes within the dataset are not known in advance, and data mining methodology is employed to attempt to find the classes or structure within the dataset.
  • the overall protocol involves the following steps:
  • Feature selection: typically the dataset contains many more data elements than would be practical to measure on a day-to-day basis, and additionally many elements that do not provide the information needed to produce a prediction model.
  • the actual ability of a prediction model to describe a dataset is derived from some subset of the full dimensionality of the dataset. These dimensions are the most important components (or features) of the dataset. Note that in the context of microarray data, the dimensions of the dataset are the individual genes.
  • Feature selection in the context described here, involves finding those genes which are most "differentially expressed". In a more general sense, it involves those groups which pass some statistical test for significance, i.e. is the level of a particular variable consistently higher or lower in one or other of the groups being investigated. Sometimes the features are those variables (or dimensions) which exhibit the greatest variance.
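Feature selection by differential expression can be sketched as ranking genes by a two-group test statistic and keeping the top few. The example below uses Welch's t statistic as the ranking criterion, which is an assumption for illustration since the text does not specify a particular test; the function names, gene names and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two groups of measurements."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        # No within-group spread: infinite separation unless the means match.
        return 0.0 if mean(a) == mean(b) else float("inf")
    return (mean(a) - mean(b)) / se


def top_genes(expression, labels, k):
    """Return the k genes with the largest absolute t statistic.

    expression: dict mapping gene name to a list of values, one per sample.
    labels: list of 0/1 class labels aligned with the samples.
    """
    scores = {}
    for gene, values in expression.items():
        group0 = [v for v, y in zip(values, labels) if y == 0]
        group1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(welch_t(group0, group1))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical data: three samples per class (0 = non-recurrent, 1 = recurrent).
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "GENE_A": [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],
    "GENE_B": [3.0, 3.5, 2.5, 3.1, 3.4, 2.6],
    "GENE_C": [2.0, 2.5, 1.5, 3.0, 3.5, 2.5],
}
print(top_genes(expression, labels, k=2))
```

Ranking by variance, which the text mentions as an alternative, would replace the scoring line with the variance of each gene across all samples.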
  • the reduced dataset (as described by the features) is applied to the prediction model of choice.
  • the input for this model is usually in the form of a multi-dimensional numerical input (known as a vector), with associated output information (a class label or a response).
  • selected data is input into the prediction model, either sequentially (in techniques such as neural networks) or as a whole (in techniques that apply some form of regression, such as linear models, linear discriminant analysis, support vector machines). In some instances (e.g. k-nearest neighbours) the dataset (or subset of the dataset obtained after feature selection) is itself the model.
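k-nearest neighbours, mentioned earlier as a transparent technique, is a familiar example of a method in which the stored dataset itself acts as the model: no parameters are fitted, and a new sample is classified by a vote among its closest training samples. The sketch below is a minimal illustration using Euclidean distance; the function name, the choice of k = 3 and the data are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(training, sample, k=3):
    """Classify `sample` by majority vote among its k nearest training samples.

    training: list of (vector, label) pairs; the list itself is the "model".
    sample: a vector of the same length as the training vectors.
    """
    nearest = sorted(training, key=lambda item: dist(item[0], sample))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-gene expression vectors with known outcome.
training = [
    ((1.0, 1.0), "non-recurrent"),
    ((1.2, 0.9), "non-recurrent"),
    ((0.8, 1.1), "non-recurrent"),
    ((4.0, 4.2), "recurrent"),
    ((4.1, 3.9), "recurrent"),
    ((3.8, 4.0), "recurrent"),
]
print(knn_predict(training, (1.1, 1.0)))
print(knn_predict(training, (4.0, 4.0)))
```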
  • effective models can be established with minimal understanding of the detailed mathematics, through the use of various software packages where the parameters of the model have been pre-determined by expert analysts as most likely to lead to successful results.
  • Free open-source software such as Script (a MatLab clone) - many and varied C++ libraries, which can be used to implement prediction models in a commercial, closed-source setting.
  • the methods can be carried out by first performing the data mining process (above), and then applying the appropriate known software packages. The process of data mining is described in detail in many well-written texts (49).
  • Linear models (49, 50): The data is treated as the input of a linear regression model, of which the class labels or response variables are the output. Class labels, or other categorical data, must be transformed into numerical values.
  • Linear Discriminant analysis (49, 51 , 52).
  • the data is linearly separable (i.e. the groups or classes of data can be separated by a hyperplane, which is an n-dimensional extension of a threshold), this technique can be applied.
  • a combination of variables is used to separate the classes, such that the between group variance is maximised, and the within-group variance is minimised.
  • the byproduct of this is the formation of a classification rule.
  • Application of this rule to samples of unknown class allows predictions or classification of class membership to be made for that sample.
  • linear discriminant analysis such as nearest shrunken centroids which are commonly used for microarray analysis.
  • Support vector machines: A collection of variables is used in conjunction with a collection of weights to determine a model that maximizes the separation between classes in terms of those weighted variables. Application of this model to a sample then produces a classification or prediction of class membership for that sample.
  • Neural networks: The data is treated as input into a network of nodes, which superficially resemble biological neurons, which apply the input from all the nodes to which they are connected, and transform the input into an output.
  • neural networks use the "multiply and sum" algorithm, to transform the inputs from multiple connected input nodes into a single output.
  • a node may not necessarily produce an output unless the inputs to that node exceed a certain threshold.
  • Each node has as its input the output from several other nodes, with the final output node usually being linked to a categorical variable.
  • the number of nodes, and the topology of the nodes can be varied in almost infinite ways, providing for the ability to classify extremely noisy data that may not be possible to categorize in other ways.
  • the most common implementation of neural networks is the multi-layer perceptron.
  • Classification and regression trees: In these, variables are used to define a hierarchy of rules that can be followed in a stepwise manner to determine the class of a sample. The typical process creates a set of rules which lead to a specific class output, or a specific statement of the inability to discriminate.
  • distance functions are the Euclidean distance (an extension of the Pythagorean distance, as in triangulation, to n-dimensions), various forms of correlation (including Pearson Correlation co-efficient).
  • transformation functions that convert data points that would not normally be interconnected by a meaningful distance metric into Euclidean space, so that Euclidean distance can then be applied (e.g. Mahalanobis distance).
  • Although the distance metric can be quite complex, the basic premise of k-nearest neighbours is quite simple, essentially being a restatement of "find the k data vectors that are most similar to the unknown input, find out which class they correspond to, and vote as to which class the unknown input is".
  • a directed acyclic graph is used to represent a collection of variables in conjunction with their joint probability distribution, which is then used to determine the probability of class membership for a sample.
  • independent components analysis in which independent signals (e.g., class membership) are isolated (into components) from a collection of variables. These components can then be used to produce a classification or prediction of class membership for a sample.
  • Training involves taking a subset of the dataset of interest (in this case gene expression measurements from colorectal tumours), such that it is stratified across the classes that are being tested for (in this case recurrent and non-recurrent tumours). This training set is used to generate a prediction model (defined above), which is tested on the remainder of the data (the testing set).
  • K-fold cross-validation: The dataset is divided into K subsamples, each subsample containing approximately the same proportions of the class groups as the original. In each round of validation, one of the K subsamples is set aside, and training is accomplished using the remainder of the dataset. The effectiveness of the training for that round is gauged by how correct the classification of the left-out group is. This procedure is repeated K times, and the overall effectiveness is ascertained by comparison of the predicted class with the known class.
  • Combinations of GCPMs, such as those described above in Tables 1 and 2, can be used to construct predictive models for prognosis.
  • Prognostic signatures comprising one or more of these markers can be used to determine the outcome of a patient, through application of one or more predictive models derived from the signature.
  • a clinician or researcher can determine the differential expression (e.g., increased or decreased expression) of the one or more markers in the signature, apply a predictive model, and thereby predict the negative prognosis, e.g., likelihood of disease relapse, of a patient, or alternatively the likelihood of a positive prognosis (continued remission).
  • the invention includes a method of determining a treatment regime for a cancer comprising: (a) providing a sample of the cancer; (b) detecting the expression level of a GCPM family member in said sample; (c) determining the prognosis of the cancer based on the expression level of a GCPM family member; and (d) determining the treatment regime according to the prognosis.
  • the invention includes a device for detecting a GCPM, comprising: a substrate having a GCPM capture reagent thereon; and a detector associated with said substrate, said detector capable of detecting a GCPM associated with said capture reagent.
  • kits for detecting cancer comprising: a substrate; a GCPM capture reagent; and instructions for use.
  • method for detecting a GCPM using qPCR comprising: a forward primer specific for said GCPM; a reverse primer specific for said GCPM; PCR reagents; a reaction vial; and instructions for use.
  • kits for detecting the presence of a GCPM polypeptide or peptide comprising: a substrate having a capture agent for said GCPM polypeptide or peptide; an antibody specific for said GCPM polypeptide or peptide; a reagent capable of labeling bound antibody for said GCPM polypeptide or peptide; and instructions for use.
  • this invention includes a method for determining the prognosis of colorectal cancer, comprising the steps of: providing a tumour sample from a patient suspected of having colorectal cancer; measuring the presence of a GCPM polypeptide using an ELISA method.
  • the GCPM of the invention is selected from the markers set forth in Table A, Table B, Table C or Table D.
  • the GCPM is included in a prognostic signature
  • the GCPMs of the invention also find use for the prognosis of other cancers, e.g., breast cancers, prostate cancers, ovarian cancers, lung cancers (such as adenocarcinoma and, particularly, small cell lung cancer), lymphomas, gliomas, blastomas (e.g., medulloblastomas), and mesothelioma, where decreased or low expression is associated with a positive prognosis, while increased or high expression is associated with a negative prognosis.
  • The experimental scheme is shown in FIG. 1.
  • Ten colorectal cell lines were cultured and harvested at semi- and full-confluence.
  • Gene expression profiles of the two growth stages were analyzed on 30,000 oligonucleotide arrays and a gene proliferation signature (GPS; Table C) was identified by gene ontology analysis of differentially expressed genes.
  • Unsupervised clustering was then used to independently dichotomize two cohorts of clinical colorectal samples (Cohort A: 73 stage I-IV on oligo arrays, Cohort B: 55 stage II on Affymetrix chips) based on the similarities of the GPS expression.
  • Ki-67 immunostaining was also performed on tissue sections from Cohort A tumours. Following this, the correlation between proliferation activity and clinico-pathologic parameters was investigated.
  • Cohort B included a group of 55 German colorectal patients who underwent surgery at the Technical University of Munich between 1995 and 2001 and had fresh frozen samples stored in a tissue bank. All 55 had stage II disease, 26 developed disease recurrence (median survival 47 months) and 29 remained recurrence-free (median survival 82 months). None of the patients received chemotherapy or radiotherapy. Clinico-pathologic variables of both cohorts are summarised as part of Table 2.
  • RNA samples and cell lines were homogenised and RNA was extracted using Tri-Reagent (Progenz, Auckland, NZ). The RNA was then purified using an RNeasy mini column (Qiagen, Victoria, Australia) according to the manufacturer's protocol. Ten micrograms of total RNA extracted from each culture or tumour sample was oligo-dT primed and cDNA synthesis was carried out in the presence of aa-dUTP and Superscript II RNase H-Reverse Transcriptase (Invitrogen). Cy dyes were incorporated into cDNA using the indirect amino-allyl cDNA labelling method. cDNA derived from a pool of 12 different cell lines was used as the reference for all hybridizations.
  • Cy5-dUTP-tagged cDNA from an individual colorectal cell line or tissue sample was combined with Cy3-dUTP-tagged cDNA from the reference sample.
  • the mixture was then purified using a QiaQuick PCR purification Kit (Qiagen, Victoria, Australia) and co-hybridized to a microarray spotted with the MWG 30K Oligo Set (MWG Biotech, NC).
  • cDNA samples from the second culturing experiment were additionally analysed on microarrays using reverse labelling.
  • Arrays were scanned with a GenePix 4000B Microarray Scanner and data were analysed using GenePix Pro 4.1 Microarray Acquisition and Analysis Software (Axon, CA). The foreground intensities from each channel were log2 transformed and normalised using the SNOMAD software (35). Normalised values were collated and filtered using BRB-Array Tools Version 3.2 (developed by Dr. Richard Simon and Amy Peng Lam, Biometric Research Branch, National Cancer Institute). Low intensity genes, and genes for which over 20% of measurements across tissue samples or cell lines were missing, were excluded from further analysis.
  • Affymetrix HGU133A GeneChips Affymetrix, Santa Clara, CA
  • streptavidin-phycoerythrin streptavidin-phycoerythrin.
  • the arrays were then scanned with an HP argon-ion laser confocal microscope and the digitized image data were processed using the Affymetrix® Microarray Suite 5.0 Software. All Affymetrix U133A GeneChips passed quality control to eliminate scans with abnormal characteristics. Background correction and normalization were performed in the R computing environment using the robust multi-array average function implemented in the Bioconductor package affy.
  • RNA was reverse transcribed using Superscript II RNase H-Reverse Transcriptase kit (Invitrogen) and oligo dT primer (Invitrogen).
  • QPCR was performed on an ABI Prism 7900HT Sequence Detection System (Applied Biosystems) using Taqman Gene Expression Assays (Applied Biosystems). Relative fold changes were calculated using the 2^-ΔΔCT method (36) with Topoisomerase 3A as the internal control. Reference RNA was used as the calibrator to enable comparison between different experiments.
  • EXAMPLE 5 Immunohistochemical analysis: Immunohistochemical expression of Ki-67 antigen (MIB-1; DakoCytomation, Denmark) was investigated on 4 μm sections of 73 paraffin-embedded primary colorectal tumours from Cohort A. Endogenous peroxidase activity was blocked with 0.3% hydrogen peroxide in methanol and antigens were retrieved in boiling citrate buffer (pH 6). Nonspecific binding sites were blocked with 5% normal goat serum containing 1% BSA. Primary antibody (dilution 1:50) was detected using the EnVision system (Dako EnVision, CA) and the DAB substrate kit (Vector laboratories, CA).
  • Ki-67 proliferation index was presented as the percentage of positively stained nuclei for each tumour.
  • Relative risk and associated confidence intervals were also estimated for each variable using the Cox univariate model, and a multivariate Cox proportional hazard model was developed using forward stepwise regression with predictive variables that were significant in the univariate analysis.
  • The K-means clustering method was used to classify clinical samples based on the expression level of the GPS.
  • EXAMPLE 7 Identification of a gene proliferation signature (GPS) using a colorectal cell line model
  • GPS gene proliferation signature
  • the GPS was identified as a subset of genes whose expression correlates with CRC cell proliferation rate.
  • SAM Statistical Analysis of Microarray
  • SAM was used to identify genes differentially expressed (DE) between exponentially growing (semi-confluent) and non-cycling (fully-confluent) CRC cell lines (FIG. 1, stage 1).
  • each culture set was analysed independently. Analyses were limited to 502 DE genes for which a significant expression difference was observed between two growth stages in both sets of cultures (false discovery rate < 1%).
  • Gene Ontology (GO) analysis was carried out using EASE (39) to identify the biological process categories that were significantly reflected in the DE genes.
  • the expression of eleven genes from the GPS was assessed by QPCR and correlated with corresponding values obtained from the array data. Therefore, QPCR confirmed that elevated expression of the proliferation signature genes correlates with the increased proliferation in CRC cell lines (FIG. 5).
  • EXAMPLE 8 Classification of CRC samples according to the expression level of gene proliferation signature
  • CRC tumours from two cohorts were stratified into two clusters based on the expression of GPS (FIG. 1, stage 2).
  • Analysis of DE genes between two defined clusters using all filtered genes revealed that the GPS was contained within the list of genes upregulated in cluster 1 (FIG. 2A, upper panel) relative to cluster 2 (lower panel) in both cohorts.
  • the tumours in cluster 1 are characterised by high GPS expression
  • the tumours in cluster 2 are characterised by low GPS expression.
  • Ki-67 is not associated with clinico-pathologic variables or survival: Ki-67 immunostaining was performed on tissue sections from Cohort A tumours only, as paraffin-embedded samples were unavailable for Cohort B (FIG. 1, stage 3). Nuclear staining was detected in all 73 CRC tumours. Ki-67 PI ranged from 25 to 96%, with a mean value of 76.3 ± 17.5. Using the mean Ki-67 value as a cut-off point, tumours were assigned into two groups with low or high PI. Ki-67 PI was associated with neither clinico-pathologic variables (Table 2) nor survival (FIG. 3). When the survival analysis was limited to the patients with the highest and lowest Ki-67 values, no statistical difference was observed (data not shown).
  • Cohort B (55 German CRC patients; Table 2) were first classified into low and high proliferation groups using the 38-gene cell proliferation signature (Table C) and the K-means clustering method (Pearson uncentered, 1000 permutations, threshold of occurrence in the same cluster set at 80%).
  • SAM Statistical Analysis of Microarrays
  • 754 genes were found to be over-expressed in the high proliferation group.
  • the GATHER gene ontology program was then used to identify the most over-represented gene ontology categories within the list of differentially expressed genes.
  • the cell cycle category was the most over-represented category within the list of differentially expressed genes.
  • 102 cell cycle genes which are differentially expressed between the low and high proliferation groups are shown in Table D.
  • Table D Cell Cycle Genes that are Differentially Expressed in Low and High Proliferation
  • the present invention is the first to report an association between a gene proliferation signature and major clinico-pathologic variables as well as outcome in colorectal cancer.
  • the disclosed study investigated the proliferation state of tumours using an in vitro-derived multi-gene proliferation signature and by Ki-67 immunostaining. According to the results herein, low expression of the GPS in tumours was associated with a higher risk of recurrence and shorter survival in two independent cohorts of patients. In contrast, Ki-67 proliferation index was not associated with any clinically relevant endpoints.
  • the colorectal GPS encompasses 38 mitotic cell cycle genes and includes a core set of genes (CDC2, RFC4, PCNA, CCNE1, CDK7, MCM genes, FEN1, MAD2L1, MYBL2, RRM2 and BUB3) that are part of proliferation signatures defined for tumours of the breast (40), (41), ovary (42), liver (43), acute lymphoblastic leukaemia (44), neuroblastoma (45), lung squamous cell carcinoma (46), head and neck (47), prostate (48), and stomach (49).
  • the sample size may also explain the lack of an association between clinico-pathologic variables and survival with Ki-67 PI in the present study.
  • Studies of Ki-67 and CRC outcome have reported inconsistent findings.
  • a low Ki-67 PI was associated with a worse prognosis (27), (29), (30).
  • the multi-gene expression analysis was therefore a more sensitive tool to assess the relationship between proliferation and prognosis than the Ki-67 PI.
  • the present invention has clarified the previous, conflicting results relating to the prognostic role of cell proliferation in colorectal cancer.
  • a GPS has been developed using CRC cell lines and has been applied to two independent patient cohorts. It was found that low expression of growth-related genes in CRC was associated with more advanced tumour stage (Cohort A) and poor clinical outcome within the same stage (Cohort B). Multi-gene expression analysis was shown to be a more powerful indicator than the long-established proliferation marker, Ki-67, for predicting outcome. For future studies, it will be useful to determine the reasons that CRC differs from other common epithelial cancers, such as breast and lung cancers (e.g., in reference to Ki-67). This will likely provide insights into important underlying biological mechanisms.
  • GPS expression can be used as an adjunct to conventional staging for identifying patients at high risk of recurrence and death from colorectal cancer.
  • Neoptolemos JP, Oates GD, Newbold KM, et al: Cyclin/proliferation cell nuclear antigen immunohistochemistry does not improve the prognostic power of Dukes' or Jass' classifications for colorectal cancer.
  • Li JQ, Miki H, Ohmori M, et al: Expression of cyclin E and cyclin-dependent kinase 2 correlates with metastasis and prognosis in colorectal carcinoma.
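The K-fold cross-validation procedure described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names `stratified_k_fold` and `cross_validate`, the round-robin assignment of samples to folds, and the caller-supplied `train` and `predict` callables are all hypothetical choices made for this sketch.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Split sample indices into k folds with roughly equal class proportions.

    Hypothetical helper: samples of each class are shuffled and dealt
    round-robin into the folds, so every fold mirrors the class mix
    of the full dataset as closely as possible.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    return folds


def cross_validate(samples, labels, k, train, predict, seed=0):
    """Return the fraction of left-out samples classified correctly.

    In each of the k rounds one fold is set aside, a model is trained on
    the remaining samples, and the left-out fold is classified. The
    predicted classes are then compared with the known classes.
    """
    folds = stratified_k_fold(labels, k, seed)
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held]
        model = train([samples[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct += sum(predict(model, samples[i]) == labels[i]
                       for i in held_out)
    return correct / len(samples)
```

Any of the prediction models listed above could in principle be supplied through `train` and `predict`; the sketch only fixes the validation loop, not the classifier.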

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Analytical Chemistry (AREA)
  • Organic Chemistry (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Urology & Nephrology (AREA)
  • Hematology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Oncology (AREA)
  • Hospice & Palliative Care (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Cell Biology (AREA)
  • Food Science & Technology (AREA)
  • Medicinal Chemistry (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Pharmaceuticals Containing Other Organic And Inorganic Compounds (AREA)

Abstract

This invention relates to methods and compositions for determining the prognosis of cancer in a patient, particularly for gastrointestinal cancer, such as gastric or colorectal cancer. Specifically, this invention relates to the use of genetic markers for the prediction of the prognosis of cancer, such as gastric or colorectal cancer, based on cell proliferation signatures. In various aspects, the invention relates to a method of predicting the likelihood of long-term survival of a cancer patient, a method of determining a treatment regime for a cancer patient, a method of preparing a personalized genomics profile for a cancer patient, among other methods as well as kits and devices for carrying out these methods.

Description

PROLIFERATION SIGNATURES AND PROGNOSIS FOR GASTROINTESTINAL CANCER
FIELD OF THE INVENTION
This invention relates to methods and compositions for determining the prognosis of cancer, particularly gastrointestinal cancer, in a patient. Specifically, this invention relates to the use of genetic markers for determining the prognosis of cancer, such as gastrointestinal cancer, based on cell proliferation signatures.
BACKGROUND OF THE INVENTION
Cellular proliferation is the most fundamental process in living organisms, and as such is precisely regulated by the expression level of proliferation-associated genes (1). Loss of proliferation control is a hallmark of cancer, and it is thus not surprising that growth-regulating genes are abnormally expressed in tumours relative to the neighbouring normal tissue (2). Proliferative changes may accompany other changes in cellular properties, such as invasion and ability to metastasize, and therefore could affect patient outcome. This association has attracted substantial interest and many studies have been devoted to the exploration of tumour cell proliferation as a potential indicator of outcome.
Cell proliferation is usually assessed by flow cytometry or, more commonly, in tissues, by immunohistochemical evaluation of proliferation markers (3). The most widely used proliferation marker is Ki-67, a protein expressed in all cell cycle phases except for the resting phase G0 (4). Using Ki-67, a clear association between the proportion of cycling cells and clinical outcome has been established in malignancies such as breast cancer, lung cancer, soft tissue tumours, and astrocytoma (5). In breast cancer, this association has also been confirmed by microarray analysis, leading to a proliferative gene expression profile that has been employed for identifying patients at increased risk of recurrence (6).
However, in colorectal cancer (CRC), the proliferation index (PI) has produced conflicting results as a prognostic factor and therefore cannot be applied in a clinical context (see below). Studies vary with respect to patient selection, sampling methods, cut-off point levels, antibody choices, staining techniques and the way data have been collected and interpreted. The methodological differences and heterogeneity of these studies may partly explain the contradictory results (7), (8). The use of Ki-67 as a proliferation marker also has limitations. The Ki-67 PI estimates the fraction of actively cycling cells, but gives no indication of cell cycle length (3), (9). Thus, tumours with a similar PI may grow at dissimilar rates due to different cycling speeds. In addition, while Ki-67 mRNA is not produced in resting cells, protein may still be detectable in a proportion of colorectal tumours leading to an overestimated proliferation rate (10).
Since the assessment of a prognosis using a single proliferation marker does not appear to be reliable in CRC (see below), there is a need for further tools to predict the prognosis of gastrointestinal cancer. This invention provides further methods and compositions based on prognostic cancer markers, specifically gastrointestinal cancer prognostic markers, to aid in the prognosis and treatment of cancer.
SUMMARY OF THE INVENTION
In certain aspects of the invention, microarray analysis is used to identify genes that provide a proliferation signature for cancer cells. These genes, and the proteins encoded by those genes, are herein termed gastrointestinal cancer proliferation markers (GCPMs). In one aspect of the invention, the cancer for prognosis is gastrointestinal cancer, particularly gastric or colorectal cancer.
In particular aspects, the invention includes a method for determining the prognosis of a cancer by identifying the expression levels of at least one GCPM in a sample. Selected GCPMs encode proteins that are associated with cell proliferation, e.g., cell cycle components. These GCPMs have the added utility in methods for determining the best treatment regime for a particular cancer based on the prognosis. In particular aspects, GCPM levels are higher in non-recurring tumour tissue as compared to recurring tumour tissue. These markers can be used either alone or in combination with each other, or other known cancer markers.
In an additional aspect, this invention includes a method for determining the prognosis of a cancer, comprising: (a) providing a sample of the cancer; (b) detecting the expression level of at least one GCPM family member in the sample; and (c) determining the prognosis of the cancer.
In another aspect, the invention includes a step of detecting the expression level of at least one GCPM RNA, for example, at least one mRNA. In a further aspect, the invention includes a step of detecting the expression level of at least one GCPM protein. In yet a further aspect, the invention includes a step of detecting the level of at least one GCPM peptide. In yet another aspect, the invention includes detecting the expression level of at least one GCPM family member in the sample. In an additional aspect, the GCPM is a gene associated with cell proliferation, such as a cell cycle component. In other aspects, the at least one GCPM is selected from Table A, Table B, Table C or Table D, herein.
In a still further aspect, the invention includes a method for detecting the expression level of at least one GCPM set forth in Table A, Table B, Table C or Table D, herein. In an even further aspect, the invention includes a method for detecting the expression level of at least one of CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37. In yet a further aspect, the invention comprises detecting the expression level of at least one of CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes, FEN1, MAD2L1, MYBL2, RRM2, and BUB3.
In additional aspects, the expression levels of at least two, or at least 5, or at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or at least 75 of the proliferation markers or their expression products are determined, for example, as selected from Table A, Table B, Table C or Table D; as selected from CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; or as selected from CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN1, MAD2L1, MYBL2, RRM2, and BUB3.
In other aspects, the expression levels of all proliferation markers or their expression products are determined, for example, as listed in Table A, Table B, Table C or Table D; as listed for the group CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; or as listed for the group CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN1, MAD2L1, MYBL2, RRM2, and BUB3.
In yet a further aspect, the invention includes a method of determining a treatment regime for a cancer comprising: (a) providing a sample of the cancer; (b) detecting the expression level of at least one GCPM family member in the sample; (c) determining the prognosis of the cancer based on the expression level of at least one GCPM family member; and (d) determining the treatment regime according to the prognosis.
In yet another aspect, the invention includes a device for detecting at least one GCPM, comprising: (a) a substrate having at least one GCPM capture reagent thereon; and (b) a detector capable of detecting the at least one captured GCPM, the capture reagent, or a complex thereof.
An additional aspect of the invention includes a kit for detecting cancer, comprising: (a) a GCPM capture reagent; (b) a detector capable of detecting the captured GCPM, the capture reagent, or a complex thereof; and, optionally, (c) instructions for use. In certain aspects, the kit also includes a substrate for the GCPM as captured.
Yet a further aspect of the invention includes a method for detecting at least one GCPM using quantitative PCR, comprising: (a) a forward primer specific for the at least one GCPM; (b) a reverse primer specific for the at least one GCPM; (c) PCR reagents; and, optionally, at least one of: (d) a reaction vial; and (e) instructions for use.
Additional aspects of this invention include a kit for detecting the presence of at least one GCPM protein or peptide, comprising: (a) an antibody or antibody fragment specific for the at least one GCPM protein or peptide; and, optionally, at least one of: (b) a label for the antibody or antibody fragment; and (c) instructions for use. In certain aspects, the kit also includes a substrate having a capture agent for the at least one GCPM protein or peptide.
In specific aspects, this invention includes a method for determining the prognosis of gastrointestinal cancer, especially colorectal or gastric cancer, comprising the steps of: (a) providing a sample, e.g., tumour sample, from a patient suspected of having gastrointestinal cancer; (b) measuring the presence of a GCPM protein using an ELISA method.
In additional aspects of this invention, one or more GCPMs of the invention are selected from the group outlined in Table A, Table B, Table C or Table D, herein. Other aspects and embodiments of the invention are described herein below.
BRIEF DESCRIPTION OF THE DRAWINGS
This invention is described with reference to specific embodiments thereof and with reference to the figures.
FIG. 1: An overview of the approach used to derive and apply the gene proliferation signature (GPS) disclosed herein.
FIG. 2A: K-means clustering of 73 Cohort A tumours into two groups according to the expression level of the gene proliferation signature. FIG. 2B: Bar graph of Ki-67 Pl (%); vertical line represents the mean Ki-67 Pl across all samples. Tumours with a proliferation index about and below the mean are shown in red and green, respectively. The results show that over-expression of the proliferation signature is not always associated with a higher Ki-67 Pl. FIG. 3: Kaplan-Meier survival curves according to the expression level of GPS (gene proliferation signal) and Ki-67 Pl. Both overall (OS) and recurrence-free survival (RFS) are significantly shorter in patients with low GPS expression in colorectal cancer Cohort A (a, b) and colorectal cancer Cohort B (c, d). No difference was observed in the survival rates of Cohort A patients according to Ki-67 Pl (e, f). P values from Log rank test are indicated.
FIG. 4: Kaplan-Meier survival curves according to the expression level of the GPS (gene proliferation signature) in gastric cancer patients. Overall survival is significantly shorter in patients with low GPS expression in this cohort of 38 gastric cancer patients of mixed stage. P values from the log-rank test are indicated.
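As a rough illustration of how Kaplan-Meier curves such as those referred to above are constructed, the following minimal Python sketch computes the product-limit survival estimate from follow-up times and event indicators. The function name and the five-patient data set are hypothetical and are not taken from any cohort described herein; the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient (any consistent unit)
    events -- 1 if the event (death or recurrence) was observed, 0 if censored
    """
    data = sorted(zip(times, events))  # order patients by follow-up time
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events observed at time t
        leaving = 0    # patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; 1 = event observed, 0 = censored.
example_times = [6, 12, 12, 20, 30]
example_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(example_times, example_events):
    print(f"t={t} months: estimated survival {s:.2f}")
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate only steps down at observed event times.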
FIG. 5: A box-and-whisker plot showing differential expression of 11 QRT-PCR-validated genes between cycling cells in the exponential phase (EP) and growth-inhibited cells in the stationary phase (SP). The box spans the 25th to the 75th percentiles of the data. The horizontal line in the box represents the median value. The "whiskers" are the largest and smallest values (excluding outliers). Any point more than 3/2 times the interquartile range from the end of a box is an outlier and is presented as a dot. The Y axis represents the log2 fold change of the ratio between cell line RNA and reference RNA. Analysis was performed using SPSS software.
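The box-and-whisker rule described for FIG. 5 can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and data are hypothetical, and the quartiles are computed with Python's "inclusive" method, which is an assumption and may differ slightly from the quartile definition used by SPSS.

```python
import statistics


def box_plot_summary(values):
    """Summarise values using the 1.5 x IQR outlier rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr    # points below this are outliers
    high_fence = q3 + 1.5 * iqr   # points above this are outliers
    inliers = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inliers),    # smallest non-outlier value
        "whisker_high": max(inliers),   # largest non-outlier value
        "outliers": outliers,
    }


# Hypothetical log2 fold changes for one gene across several cell lines.
example = [1.0, 1.2, 1.4, 1.5, 1.6, 1.8, 5.0]
print(box_plot_summary(example))
```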
DETAILED DESCRIPTION OF THE INVENTION
Because a single proliferation marker is insufficient for obtaining a reliable CRC prognosis, the simultaneous analysis of several growth-related genes by microarray was employed to provide a more quantitative and objective method to determine the proliferation state of a gastrointestinal tumour. Table 1 (below) illustrates the previously published and conflicting results shown for use of the proliferation index (PI) as a prognostic factor for colorectal cancer.
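Table 1: Summary of studies on the association of proliferation indices with CRC patients' survival
Study                         Number of patients  Dukes stage  Marker  Association with survival
Evans et al, 2006 [11]        40                  A-C          Ki-67   No association was found between proliferation index and survival
Rosati et al, 2004 [12]       103                 B-C          Ki-67   No association was found between proliferation index and survival
Ishida et al, 2004 [13]       51                  C            Ki-67   No association was found between proliferation index and survival
Buglioni et al, 1999 [14]     171                 A-D          Ki-67   No association was found between proliferation index and survival
Guerra et al, 1998 [15]       108                 A-C          PCNA    No association was found between proliferation index and survival
Kyzer and Gordon, 1997 [16]   30                  B-D          Ki-67   No association was found between proliferation index and survival
Jansson and Sun, 1997 [17]    255                 A-D          Ki-67   No association was found between proliferation index and survival
Baretton et al, 1996 [18]     95                  A-B          Ki-67   No association was found between proliferation index and survival
Sun et al, 1996 [19]          293                 A-C          PCNA    No association was found between proliferation index and survival
Kubota et al, 1992 [20]       100                 A-D          Ki-67   No association was found between proliferation index and survival
Valera et al, 2005 [21]       106                 A-D          Ki-67   High proliferation index was associated with shorter survival
Dziegiel et al, 2003 [22]     81                  NI           Ki-67   High proliferation index was associated with shorter survival
Scopa et al, 2003 [23]        117                 A-D          Ki-67   High proliferation index was associated with shorter survival
Bhatavdekar et al, 2001 [24]  98                  B-C          Ki-67   High proliferation index was associated with shorter survival
Chen et al, 1997 [25]         70                  B-C          Ki-67   High proliferation index was associated with shorter survival
Choi et al, 1997 [26]         86                  B-D          PCNA    High proliferation index was associated with shorter survival
Hilska et al, 2005 [27]       363                 A-D          Ki-67   Low proliferation index was associated with shorter survival
Salminen et al, 2005 [28]     146                 A-D          Ki-67   Low proliferation index was associated with shorter survival
Garrity et al, 2004 [29]      366                 B-C          Ki-67   Low proliferation index was associated with shorter survival
Allegra et al, 2003 [30]      706                 B-C          Ki-67   Low proliferation index was associated with shorter survival
Palmqvist et al, 1999 [31]    56                  B            Ki-67   Low proliferation index was associated with shorter survival
Paradiso et al, 1996 [32]     71                  NI           PCNA    Low proliferation index was associated with shorter survival
Neoptolemos et al, 1995 [33]  79                  A-C          PCNA    Low proliferation index was associated with shorter survival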
NI: No Information available
In contrast, the present disclosure has succeeded in (i) defining a CRC-specific gene proliferation signature (GPS) using a cell line model; and (ii) determining the prognostic significance of the GPS in the prediction of patient outcome and its association with clinico-pathologic variables in two independent cohorts of CRC patients.
Definitions
Before describing embodiments of the invention in detail, it will be useful to provide some definitions of terms used herein.
As used herein "antibodies" and like terms refer to immunoglobulin molecules and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain an antigen binding site that specifically binds (immunoreacts with) an antigen. These include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fc, Fab, Fab', and Fab2 fragments, and a Fab expression library. Antibody molecules relate to any of the classes IgG, IgM, IgA, IgE, and IgD, which differ from one another by the nature of the heavy chain present in the molecule. These include subclasses as well, such as IgG1, IgG2, and others. The light chain may be a kappa chain or a lambda chain. Reference herein to antibodies includes a reference to all classes, subclasses, and types. Also included are chimeric antibodies, for example, monoclonal antibodies or fragments thereof that are specific to more than one source, e.g., a mouse or human sequence. Further included are camelid antibodies, shark antibodies or nanobodies.
The term "marker" refers to a molecule that is associated quantitatively or qualitatively with the presence of a biological phenomenon. Examples of "markers" include a polynucleotide, such as a gene or gene fragment, RNA or RNA fragment; or a polypeptide such as a peptide, oligopeptide, protein, or protein fragment; or any related metabolites, by-products, or any other identifying molecules, such as antibodies or antibody fragments, whether related directly or indirectly to a mechanism underlying the phenomenon. The markers of the invention include the nucleotide sequences (e.g., GenBank sequences) as disclosed herein, in particular, the full-length sequences, any coding sequences, any fragments, or any complements thereof.
The terms "GCPM" or "gastrointestinal cancer proliferation marker" or "GCPM family member" refer to a marker with increased expression that is associated with a positive prognosis, e.g., a lower likelihood of cancer recurrence, as described herein, but can exclude molecules that are known in the prior art to be associated with prognosis of gastrointestinal cancer. It is to be understood that the term GCPM does not require that the marker be specific only for gastrointestinal tumours. Rather, expression of GCPM can be altered in other types of tumours, including malignant tumours.
Non-limiting examples of GCPMs are included in Table A, Table B, Table C or Table D, herein below, and include, but are not limited to, the specific group CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; and the specific group CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN1, MAD2L1, MYBL2, RRM2, and BUB3.
The terms "cancer" and "cancerous" refer to or describe the physiological condition in mammals that is typically characterized by abnormal or unregulated cell growth. Cancer and cancer pathology can be associated, for example, with metastasis, interference with the normal functioning of neighbouring cells, release of cytokines or other secretory products at abnormal levels, suppression or aggravation of inflammatory or immunological response, neoplasia, premalignancy, malignancy, invasion of surrounding or distant tissues or organs, such as lymph nodes, etc. Specifically included are gastrointestinal cancers, such as esophageal, stomach, small bowel, large bowel, anal, and rectal cancers; particularly included are gastric and colorectal cancers.
The term "colorectal cancer" includes cancer of the colon, rectum, and/or anus, and especially, adenocarcinomas, and may also include carcinomas (e.g., squamous cloacogenic carcinomas), melanomas, lymphomas, and sarcomas. Epidermoid (nonkeratinizing squamous cell or basaloid) carcinomas are also included. The cancer may be associated with particular types of polyps or other lesions, for example, tubular adenomas, tubulovillous adenomas (e.g., villoglandular polyps), villous (e.g., papillary) adenomas (with or without adenocarcinoma), hyperplastic polyps, hamartomas, juvenile polyps, polypoid carcinomas, pseudopolyps, lipomas, or leiomyomas. The cancer may be associated with familial polyposis and related conditions such as Gardner's syndrome or Peutz-Jeghers syndrome. The cancer may be associated, for example, with chronic fistulas, irradiated anal skin, leukoplakia, lymphogranuloma venereum, Bowen's disease (intraepithelial carcinoma), condyloma acuminatum, or human papillomavirus. In other aspects, the cancer may be associated with basal cell carcinoma, extramammary Paget's disease, cloacogenic carcinoma, or malignant melanoma.
The terms "differentially expressed gene," "differential gene expression," and like phrases, refer to a gene whose expression is activated to a higher or lower level in a subject (e.g., test sample), specifically cancer, such as gastrointestinal cancer, relative to its expression in a control subject (e.g., control sample). The terms also include genes whose expression is activated to a higher or lower level at different stages of the same disease; in recurrent or non-recurrent disease; or in cells with higher or lower levels of proliferation. A differentially expressed gene may be either activated or inhibited at the polynucleotide level or polypeptide level, or may be subject to alternative splicing to result in a different polypeptide product. Such differences may be evidenced by a change in mRNA levels, surface expression, secretion or other partitioning of a polypeptide, for example.
Differential gene expression may include a comparison of expression between two or more genes or their gene products; or a comparison of the ratios of the expression between two or more genes or their gene products; or a comparison of two differently processed products of the same gene, which differ between normal subjects and diseased subjects; or between various stages of the same disease; or between recurring and nonrecurring disease; or between cells with higher and lower levels of proliferation; or between normal tissue and diseased tissue, specifically cancer, or gastrointestinal cancer. Differential expression includes both quantitative, as well as qualitative, differences in the temporal or cellular expression pattern in a gene or its expression products among, for example, normal and diseased cells, or among cells which have undergone different disease events or disease stages, or cells with different levels of proliferation.
The term "expression" includes production of polynucleotides and polypeptides, in particular, the production of RNA (e.g., mRNA) from a gene or portion of a gene, and includes the production of a protein encoded by an RNA or gene or portion of a gene, and the appearance of a detectable material associated with expression. For example, the formation of a complex, for example, from a protein-protein interaction, protein-nucleotide interaction, or the like, is included within the scope of the term "expression". Another example is the binding of a binding ligand, such as a hybridization probe or antibody, to a gene or other oligonucleotide, a protein or a protein fragment and the visualization of the binding ligand. Thus, increased intensity of a spot on a microarray, on a hybridization blot such as a Northern blot, or on an immunoblot such as a Western blot, or on a bead array, or by PCR analysis, is included within the term "expression" of the underlying biological molecule.
The term "gastric cancer" includes cancer of the stomach and surrounding tissue, especially adenocarcinomas, and may also include lymphomas and leiomyosarcomas. The cancer may be associated with gastric ulcers or gastric polyps, and may be classified as protruding, penetrating, spreading, or any combination of these categories, or, alternatively, classified as superficial (elevated, flat, or depressed) or excavated.
The term "long-term survival" is used herein to refer to survival for at least 5 years, more preferably for at least 8 years, most preferably for at least 10 years following surgery or other treatment.
The term "microarray" refers to an ordered arrangement of capture agents, preferably polynucleotides (e.g., probes) or polypeptides on a substrate. See, e.g., Microarray Analysis, M. Schena, John Wiley & Sons, 2002; Microarray Biochip Technology, M. Schena, ed., Eaton Publishing, 2000; Guide to Analysis of DNA Microarray Data, S. Knudsen, John Wiley & Sons, 2004; and Protein Microarray Technology, D. Kambhampati, ed., John Wiley & Sons, 2004.
The term "oligonucleotide" refers to a polynucleotide, typically a probe or primer, including, without limitation, single-stranded deoxyribonucleotides, single- or double-stranded ribonucleotides, RNA:DNA hybrids, and double-stranded DNAs. Oligonucleotides, such as single-stranded DNA probe oligonucleotides, are often synthesized by chemical methods, for example using automated oligonucleotide synthesizers that are commercially available, or by a variety of other methods, including in vitro expression systems, recombinant techniques, and expression in cells and organisms.
The term "polynucleotide," when used in the singular or plural, generally refers to any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or DNA. This includes, without limitation, single- and double-stranded DNA, DNA including single- and double-stranded regions, single- and double-stranded RNA, and RNA including single- and double-stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or include single- and double-stranded regions. Also included are triple-stranded regions comprising RNA or DNA or both RNA and DNA. Specifically included are mRNAs, cDNAs, and genomic DNAs. The term includes DNAs and RNAs that contain one or more modified bases, such as tritiated bases, or unusual bases, such as inosine. The polynucleotides of the invention can encompass coding or non-coding sequences, or sense or antisense sequences.
"Polypeptide," as used herein, refers to an oligopeptide, peptide, or protein sequence, or fragment thereof, and to naturally occurring, recombinant, synthetic, or semi-synthetic molecules. Where "polypeptide" is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule, "polypeptide" and like terms, are not meant to limit the amino acid sequence to the complete, native amino acid sequence for the full-length molecule. It will be understood that each reference to a "polypeptide" or like term, herein, will include the full-length sequence, as well as any fragments, derivatives, or variants thereof.
The term "prognosis" refers to a prediction of medical outcome (e.g., likelihood of long-term survival); a negative prognosis, or bad outcome, includes a prediction of relapse, disease progression (e.g., tumour growth or metastasis, or drug resistance), or mortality; a positive prognosis, or good outcome, includes a prediction of disease remission, (e.g., disease-free status), amelioration (e.g., tumour regression), or stabilization.
The terms "prognostic signature," "signature," and the like refer to a set of two or more markers, for example GCPMs, that when analysed together as a set allow for the determination of or prediction of an event, for example the prognostic outcome of colorectal cancer. The use of a signature comprising two or more markers reduces the effect of individual variation and allows for a more robust prediction. Non-limiting examples of GCPMs are included in Table A, Table B, Table C or Table D, herein below, and include, but are not limited to, the specific group CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; and the specific group CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN1, MAD2L1, MYBL2, RRM2, and BUB3.
In the context of the present invention, reference to "at least one," "at least two," "at least five," etc., of the markers listed in any particular set (e.g., any signature) means any one or any and all combinations of the markers listed.
The term "prediction method" is defined to cover the broader genus of methods from the fields of statistics, machine learning, artificial intelligence, and data mining, which can be used to specify a prediction model. These are discussed further in the Detailed Description section.
The term "prediction model" refers to the specific mathematical model obtained by applying a prediction method to a collection of data. In the examples detailed herein, such data sets consist of measurements of gene activity in tissue samples taken from recurrent and non-recurrent colorectal cancer patients, for which the class (recurrent or non-recurrent) of each sample is known. Such models can be used to (1) classify a sample of unknown recurrence status as being one of recurrent or non-recurrent, or (2) make a probabilistic prediction (i.e., produce either a proportion or percentage to be interpreted as a probability) which represents the likelihood that the unknown sample is recurrent, based on the measurement of mRNA expression levels or expression products, of a specified collection of genes, in the unknown sample. The exact details of how these gene-specific measurements are combined to produce classifications and probabilistic predictions are dependent on the specific mechanisms of the prediction method used to construct the model.
The term "proliferation" refers to the processes leading to increased cell size or cell number, and can include one or more of: tumour or cell growth, angiogenesis, innervation, and metastasis.
The term "qPCR" or "QPCR" refers to quantitative polymerase chain reaction as described, for example, in PCR Technique: Quantitative PCR, J.W. Larrick, ed., Eaton Publishing, 1997, and A-Z of Quantitative PCR, S. Bustin, ed., IUL Press, 2004.
The term "tumour" refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.
"Sensitivity", "specificity" (or "selectivity"), and "classification rate", when applied to describing the effectiveness of prediction models, mean the following:
"Sensitivity" means the proportion of truly positive samples that are also predicted (by the model) to be positive. In a test for cancer recurrence, that would be the proportion of recurrent tumours predicted by the model to be recurrent. "Specificity" or "selectivity" means the proportion of truly negative samples that are also predicted (by the model) to be negative. In a test for CRC recurrence, this equates to the proportion of non-recurrent samples that are predicted to be non-recurrent by the model. "Classification Rate" is the proportion of all samples that are correctly classified by the prediction model (be that as positive or negative).
"Stringent conditions" or "high stringency conditions", as defined herein, typically: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 500C; (2) employ a denaturing agent during hybridization, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate at 42°C; or (3) employ 50% formamide, 5X SSC (0.75 M NaCI1 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5X, Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with washes at 42°C in 0.2X SSC (sodium chloride/sodium citrate) and 50% formamide at 55°C, followed by a high-stringency wash comprising 0.1X SSC containing EDTA at 55°C.
"Moderately stringent conditions" may be identified as described by Sambrook et al., Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Press, 1989, and include the use of washing solution and hybridization conditions (e. g., temperature, ionic strength, and % SDS) less stringent that those described above. An example of moderately stringent conditions is overnight incubation at 37°C in a solution comprising: 20% formamide, 5X SSC (150 mM NaCI1 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5X Denhardt's solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1X SSC at about 37-500C. The skilled artisan will recognize how to adjust the temperature, ionic strength, etc. as necessary to accommodate factors such as probe length and the like.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, and biochemistry, which are within the skill of the art. Such techniques are explained fully in the literature, such as, Molecular Cloning: A Laboratory Manual, 2nd edition, Sambrook et al., 1989; Oligonucleotide Synthesis, M.J. Gait, ed., 1984; Animal Cell Culture, R.I. Freshney, ed., 1987; Methods in Enzymology, Academic Press, Inc.; Handbook of Experimental Immunology, 4th edition, D.M. Weir & C.C. Blackwell, eds., Blackwell Science Inc., 1987; Gene Transfer Vectors for Mammalian Cells, J.M. Miller & M.P. Calos, eds., 1987; Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., 1987; and PCR: The Polymerase Chain Reaction, Mullis et al., eds., 1994.
Description of Embodiments of the Invention
Cell proliferation is an indicator of outcome in some malignancies. In colorectal cancer, however, discordant results have been reported. As these results are based on a single proliferation marker, the present invention discloses the use of microarrays to overcome this limitation, to reach a firmer conclusion, and to determine the prognostic role of cell proliferation in colorectal cancer. The microarray-based proliferation studies shown herein indicate that reduced expression of the proliferation signature in colorectal cancer is associated with poor outcome. The invention can therefore be used to identify patients at high risk of early death from cancer.
The present invention provides for markers for the determination of disease prognosis, for example, the likelihood of recurrence of tumours, including gastrointestinal tumours. Using the methods of the invention, it has been found that numerous markers are associated with the progression of gastrointestinal cancer, and can be used to determine the prognosis of cancer. Microarray analysis of samples taken from patients with various stages of colorectal tumours has led to the surprising discovery that specific patterns of marker expression are associated with prognosis of the cancer. An increase in certain GCPMs, for example, markers associated with cell proliferation, is indicative of positive prognosis. This can include decreased likelihood of cancer recurrence after standard treatment, especially for gastrointestinal cancer, such as gastric or colorectal cancer. Conversely, a decrease in these markers is indicative of a negative prognosis. This can include disease progression or the increased likelihood of cancer recurrence, especially for gastrointestinal cancer, such as gastric or colorectal cancer. A decrease in expression can be determined, for example, by comparison of a test sample (e.g., tumour sample) to samples associated with a positive prognosis. An increase in expression can be determined, for example, by comparison of a test sample (e.g., tumour samples) to samples associated with a negative prognosis.
For example, to obtain a prognosis, a patient's sample (e.g., tumour sample) can be compared to samples with known patient outcome. If the patient's sample shows increased expression of GCPMs that is comparable to samples with good outcome, and/or higher than samples with poor outcome, then a positive prognosis is implicated. If the patient's sample shows decreased expression of GCPMs that is comparable to samples with poor outcome, and/or lower than samples with good outcome, then a negative prognosis is implicated. Alternatively, a patient's sample can be compared to samples of actively proliferating/non-proliferating tumour cells. If the patient's sample shows increased expression of GCPMs that is comparable to actively proliferating cells, and/or higher than non-proliferating cells, then a positive prognosis is implicated. If the patient's sample shows decreased expression of GCPMs that is comparable to non-proliferating cells, and/or lower than actively proliferating cells, then a negative prognosis is implicated.
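One simple way to picture the comparison described above is to average the normalized expression of the signature genes in the patient's sample and see which reference group it lies closer to. The Python sketch below makes that assumption explicit: the nearest-mean rule, the function names and all expression values are hypothetical illustrations, not the method of the examples herein; only the gene symbols are taken from the text.

```python
from statistics import mean


def signature_score(sample, genes):
    """Average normalized (e.g., log2 ratio) expression over the chosen genes."""
    return mean(sample[g] for g in genes)


def compare_to_references(patient, good_outcome, poor_outcome, genes):
    """Return 'positive' if the patient's score is nearer the good-outcome mean."""
    patient_score = signature_score(patient, genes)
    good_mean = mean(signature_score(s, genes) for s in good_outcome)
    poor_mean = mean(signature_score(s, genes) for s in poor_outcome)
    nearer_good = abs(patient_score - good_mean) <= abs(patient_score - poor_mean)
    return "positive" if nearer_good else "negative"


genes = ["CDC2", "PCNA", "MCM3"]
# Hypothetical reference samples with known outcome.
good_outcome = [
    {"CDC2": 1.2, "PCNA": 0.8, "MCM3": 1.0},
    {"CDC2": 0.9, "PCNA": 1.1, "MCM3": 0.7},
]
poor_outcome = [
    {"CDC2": -0.8, "PCNA": -1.0, "MCM3": -0.6},
    {"CDC2": -1.1, "PCNA": -0.7, "MCM3": -0.9},
]
patient = {"CDC2": 1.0, "PCNA": 0.9, "MCM3": 0.8}
print(compare_to_references(patient, good_outcome, poor_outcome, genes))
```

The same function could be applied with actively proliferating and non-proliferating cell samples as the two reference groups.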
The invention provides for a set of genes, identified from cancer patients with various stages of tumours, outlined in Table C that are shown to be prognostic for colorectal cancer. These genes are all associated with cell proliferation and establish a relationship between cell proliferation genes and their utility in cancer prognosis. It has also been found that the genes in the prognostic signature listed in Table C are also correlated with additional cell proliferation genes. Based on these findings, the invention also provides for a set of cell cycle genes, shown in Table D, that are differentially expressed between high and low proliferation groups, for use as prognostic markers. Further, based on the surprising finding of the correlation between prognosis and cell proliferation-related genes, the invention also provides for a set of proliferation-related genes differentially expressed between cell lines in high and low proliferative states (Table A) and known proliferative-related genes (Table B). The genes outlined in Table A, Table B, Table C and Table D provide for a set of gastrointestinal cancer prognostic markers (GCPMs).
As one approach, the expression of a panel of markers (e.g., GCPMs) can be analysed by techniques including Linear Discriminant Analysis (LDA) to work out a prognostic score. The marker panel selected and prognostic score calculation can be derived through extensive laboratory testing and multiple independent clinical development studies.
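To make the idea of a discriminant-based prognostic score concrete, the sketch below implements diagonal linear discriminant analysis, a simplified variant of LDA that ignores correlations between genes and is named here as a substitute for full LDA, not as the procedure used in the examples herein. The function names, the two-gene panel and all expression values are hypothetical.

```python
from statistics import mean


def fit_diagonal_lda(samples, labels):
    """Fit per-gene weights from expression vectors labelled 0 (poor) or 1 (good).

    Assumes both classes are present and every gene has non-zero
    pooled within-class variance.
    """
    n_genes = len(samples[0])
    class_means = {}
    for cls in (0, 1):
        members = [s for s, y in zip(samples, labels) if y == cls]
        class_means[cls] = [mean(s[j] for s in members) for j in range(n_genes)]
    dof = len(samples) - 2  # degrees of freedom for the pooled variance
    weights, midpoints = [], []
    for j in range(n_genes):
        pooled_var = sum(
            (s[j] - class_means[y][j]) ** 2 for s, y in zip(samples, labels)
        ) / dof
        weights.append((class_means[1][j] - class_means[0][j]) / pooled_var)
        midpoints.append((class_means[1][j] + class_means[0][j]) / 2)
    return weights, midpoints


def prognostic_score(sample, weights, midpoints):
    """Positive score: closer to class 1 (good outcome); negative: class 0."""
    return sum(w * (x - m) for w, x, m in zip(weights, sample, midpoints))


# Hypothetical normalized expression of two markers in six training tumours.
training = [[2.0, 1.5], [1.8, 1.1], [2.2, 1.3], [0.5, 0.2], [0.9, 0.4], [0.7, 0.0]]
outcome = [1, 1, 1, 0, 0, 0]
w, mid = fit_diagonal_lda(training, outcome)
print(prognostic_score([1.9, 1.2], w, mid))
print(prognostic_score([0.6, 0.3], w, mid))
```

A zero threshold corresponds to equal prior probabilities for the two outcome classes; a clinically useful cut-off would have to be established in validation studies.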
The disclosed GCPMs therefore provide a useful tool for determining the prognosis of cancer, and establishing a treatment regime specific for that tumour. In particular, a positive prognosis can be used by a patient to decide to pursue standard or less invasive treatment options. A negative prognosis can be used by a patient to decide to terminate treatment or to pursue highly aggressive or experimental treatments. In addition, a patient can choose treatments based on their impact on cell proliferation or the expression of cell proliferation markers (e.g., GCPMs). In accordance with the present invention, treatments that specifically target cells with high proliferation or specifically decrease expression of cell proliferation markers (e.g., GCPMs) would not be preferred for patients with gastrointestinal cancer, such as colorectal cancer or gastric cancer.
Levels of GCPMs can be detected in tumour tissue, tissue proximal to the tumour, lymph node samples, blood samples, serum samples, urine samples, or faecal samples, using any suitable technique, and can include, but is not limited to, oligonucleotide probes, quantitative PCR, or antibodies raised against the markers. The expression level of one GCPM in the sample will be indicative of the likelihood of recurrence in that subject. However, it will be appreciated that by analyzing the presence and amounts of expression of a plurality of GCPMs, and constructing a proliferation signature, the sensitivity and accuracy of prognosis will be increased. Therefore, multiple markers according to the present invention can be used to determine the prognosis of a cancer.
The present invention relates to a set of markers, in particular, GCPMs, the expression of which has prognostic value, specifically with respect to cancer-free survival. In specific aspects, the cancer is gastrointestinal cancer, particularly, gastric or colorectal cancer, and, in further aspects, the colorectal cancer is an adenocarcinoma.
In one aspect, the invention relates to a method of predicting the likelihood of long-term survival of a cancer patient without the recurrence of cancer, comprising determining the expression level of one or more proliferation markers or their expression products in a sample obtained from the patient, normalized against the expression level of all RNA transcripts or their products in the sample, or of a reference set of RNA transcripts or their expression products, wherein the proliferation marker is the transcript of one or more markers listed in Table A, Table B, Table C or Table D, herein. In particular aspects, a decrease in expression levels of one or more GCPM indicates a decreased likelihood of long-term survival without cancer recurrence, while an increase in expression levels of one or more GCPM indicates an increased likelihood of long-term survival without cancer recurrence.
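The normalization step referred to above can be illustrated as follows. This Python sketch assumes that expression values are already on a log2 scale, so that subtracting the mean of a reference set is equivalent to taking a ratio; the function name, the reference gene labels REF1 and REF2 and all values are hypothetical.

```python
from statistics import mean


def normalize_to_reference(log2_expression, reference_genes):
    """Subtract the mean log2 level of the reference set from each marker.

    Returns normalized levels for the non-reference genes only.
    """
    reference_level = mean(log2_expression[g] for g in reference_genes)
    return {
        gene: level - reference_level
        for gene, level in log2_expression.items()
        if gene not in reference_genes
    }


# Hypothetical log2 intensities for two markers and two reference transcripts.
sample = {"PCNA": 10.0, "MCM3": 8.5, "REF1": 9.0, "REF2": 7.0}
print(normalize_to_reference(sample, ["REF1", "REF2"]))
```

Normalizing against all transcripts in the sample would follow the same pattern, with the reference level taken as the mean over every measured transcript.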
In a further aspect, the expression levels of one or more, for example at least two, or at least 3, or at least 4, or at least 5, or at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or at least 75 of the proliferation markers or their expression products are determined, e.g., as selected from Table A, Table B, Table C or Table D; as selected from CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; or as selected from CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN1, MAD2L1, MYBL2, RRM2, and BUB3.
In another aspect, the method comprises the determination of the expression levels of all proliferation markers or their expression products, e.g., as listed in Table A, Table B, Table C or Table D; as listed for the group CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; or as listed for the group CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN1, MAD2L1, MYBL2, RRM2, and BUB3.
The invention includes the use of archived paraffin-embedded biopsy material for assay of all markers in the set, and therefore is compatible with the most widely available type of biopsy material. It is also compatible with several different methods of tumour tissue harvest, for example, via core biopsy or fine needle aspiration. In a further aspect, RNA is isolated from a fixed, wax-embedded cancer tissue specimen of the patient. Isolation may be performed by any technique known in the art, for example from core biopsy tissue or fine needle aspirate cells. In another aspect, the invention relates to an array comprising polynucleotides hybridizing to two or more markers as selected from Table A, Table B, Table C or Table D; as selected from CDC2, MCM6, RPA3, MCM7, PCNA, G22P1 , KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1 , CDC45L, MAD2L1 , RAN, DUT, RRM2, CDK7, MLH3, SMC4L1 , CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1 , BUB3, FEN1 , DRF1 , PREI3, CCNE1, RPA1, P0LE3, RFC4, MCM3, CHEK1, CCND1 , and CDC37; or as selected from CDC2, RFC4, PCNA, CCNE1 , CCND1 , CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN1 , MAD2L1 , MYBL2, RRM2, and BUB3.
In particular aspects, the array comprises polynucleotides hybridizing to at least 3, or at least 5, or at least 10, or at least 15, or at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, or at least 75 or all of the markers listed in Table A, Table B, Table C or Table D; as listed in the group CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1 , CDC45L, MAD2L1 , RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP1 Pfs2, TREX1, BUB3, FEN1, DRF1 , PREI3, CCNE1 , RPA1 , POLE3, RFC4, MCM3, CHEK1 , CCND1 , and CDC37; or as listed in the group CDC2, RFC4, PCNA, CCNE1 , CCND1 , CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN1 , MAD2L1 , MYBL2, RRM2, and BUB3.
In another specific aspect, the array comprises polynucleotides hybridizing to the full set of markers listed in Table A, Table B, Table C or Table D; as listed for the group CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37; or as listed for the group CDC2, RFC4, PCNA, CCNE1, CCND1, CDK7, MCM genes (e.g., one or more of MCM3, MCM6, and MCM7), FEN1, MAD2L1, MYBL2, RRM2, and BUB3.
The polynucleotides can be cDNAs, or oligonucleotides, and the solid surface on which they are displayed can be glass, for example. The polynucleotides can hybridize to one or more of the markers as disclosed herein, for example, to the full-length sequences, any coding sequences, any fragments, or any complements thereof.
In still another aspect, the invention relates to a method of predicting the likelihood of long-term survival of a patient diagnosed with cancer, without the recurrence of cancer, comprising the steps of: (1) determining the expression levels of the RNA transcripts or the expression products of the full set or a subset of the markers listed in Table A, Table B, Table C or Table D, herein, in a sample obtained from the patient, normalized against the expression levels of all RNA transcripts or their expression products in the sample, or of a reference set of RNA transcripts or their products; (2) subjecting the data obtained in step (1) to statistical analysis; and (3) determining whether the likelihood of the long-term survival has increased or decreased.
In yet another aspect, the invention concerns a method of preparing a personalized genomics profile for a patient, e.g., a cancer patient, comprising the steps of: (a) subjecting a sample obtained from the patient to expression analysis; (b) determining the expression level of one or more markers selected from the marker set listed in any one of Table A, Table B, Table C or Table D, wherein the expression level is normalized against a control gene or genes and optionally is compared to the amount found in a reference set; and (c) creating a report summarizing the data obtained by the expression analysis. The report may, for example, include prediction of the likelihood of long term survival of the patient and/or recommendation for a treatment modality of the patient.
In additional aspects, the invention relates to a prognostic method comprising: (a) subjecting a sample obtained from a patient to quantitative analysis of the expression level of the RNA transcript of at least one marker selected from Table A, Table B, Table C or Table D, herein, or its product, and (b) identifying the patient as likely to have an increased likelihood of long-term survival without cancer recurrence if the normalized expression levels of the marker or markers, or their products, are above a defined expression threshold. In alternate aspects, step (b) comprises identifying the patient as likely to have a decreased likelihood of long-term survival without cancer recurrence if the normalized expression levels of the marker or markers, or their products, are below a defined expression threshold.
In particular, the relatively low expression of proliferation markers is associated with poor outcome. This can include disease progression or the increased likelihood of cancer recurrence, especially for gastrointestinal cancer, such as gastric or colorectal cancer. By contrast, the relatively high expression of proliferation markers is associated with a good outcome. This can include decreased likelihood of cancer recurrence after standard treatment, especially for gastrointestinal cancer, such as gastric or colorectal cancer. Low expression can be determined, for example, by comparison of a test sample (e.g., tumour sample) to samples associated with a positive prognosis. High expression can be determined, for example, by comparison of a test sample (e.g., tumour sample) to samples associated with a negative prognosis. For example, to obtain a prognosis, a patient's sample (e.g., tumour sample) can be compared to samples with known patient outcome. If the patient's sample shows high expression of GCPMs that is comparable to samples with good outcome, and/or higher than samples with poor outcome, then a positive prognosis is implicated. If the patient's sample shows low expression of GCPMs that is comparable to samples with poor outcome, and/or lower than samples with good outcome, then a negative prognosis is implicated. Alternatively, a patient's sample can be compared to samples of actively proliferating/non-proliferating tumour cells. If the patient's sample shows high expression of GCPMs that is comparable to actively proliferating cells, and/or higher than non- proliferating cells, then a positive prognosis is implicated. If the patient's sample shows low expression of GCPMs that is comparable to non-proliferating cells, and/or lower than actively proliferating cells, then a negative prognosis is implicated.
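The comparison of a patient's sample against samples of known outcome can be sketched in code. The example below is a minimal illustration only: it assumes a nearest-centroid rule with Euclidean distance, which is one of many possible ways to judge whether expression is "comparable" to a group and is not stated in this specification. The function names and the expression values are hypothetical.

```python
import math

def centroid(samples, markers):
    """Mean expression of each marker across a group of samples."""
    return [sum(s[m] for s in samples) / len(samples) for m in markers]

def euclidean(a, b):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def compare_to_outcome_groups(patient, good_outcome, poor_outcome):
    """Return 'positive' if the patient's GCPM profile is closer to the
    good-outcome group, otherwise 'negative' (ties fall to 'negative').

    Assumption: all samples are dicts keyed by the same marker names and
    hold normalized expression values on the same scale.
    """
    markers = sorted(patient)
    profile = [patient[m] for m in markers]
    d_good = euclidean(profile, centroid(good_outcome, markers))
    d_poor = euclidean(profile, centroid(poor_outcome, markers))
    return "positive" if d_good < d_poor else "negative"

# Hypothetical normalized expression values (illustration only).
good = [{"CDC2": 8.0, "PCNA": 9.0}, {"CDC2": 10.0, "PCNA": 11.0}]
poor = [{"CDC2": 3.0, "PCNA": 4.0}, {"CDC2": 5.0, "PCNA": 6.0}]
print(compare_to_outcome_groups({"CDC2": 8.5, "PCNA": 9.5}, good, poor))
```

In this sketch high GCPM expression resembling the good-outcome group yields a positive prognosis, in line with the association described above.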
As further examples, the expression levels of a prognostic signature comprising two or more GCPMs from a patient's sample (e.g., tumour sample) can be compared to samples of recurrent/non-recurrent cancer. If the patient's sample shows increased or decreased expression of GCPMs by comparison to samples of non-recurrent cancer, and/or comparable expression to samples of recurrent cancer, then a negative prognosis is implicated. If the patient's sample shows expression of GCPMs that is comparable to samples of non-recurrent cancer, and/or lower or higher expression than samples of recurrent cancer, then a positive prognosis is implicated.
As one approach, a prediction method can be applied to a panel of markers, for example the panel of GCPMs outlined in Table A, Table B, Table C or Table D, in order to generate a predictive model. This involves the generation of a prognostic signature, comprising two or more GCPMs.
The disclosed GCPMs in Table A, Table B, Table C or Table D therefore provide a useful set of markers to generate prediction signatures for determining the prognosis of cancer, and establishing a treatment regime, or treatment modality, specific for that tumour. In particular, a positive prognosis can be used by a patient to decide to pursue standard or less invasive treatment options. A negative prognosis can be used by a patient to decide to terminate treatment or to pursue highly aggressive or experimental treatments. In addition, a patient can choose treatments based on their impact on the expression of prognostic markers (e.g., GCPMs). Levels of GCPMs can be detected in tumour tissue, tissue proximal to the tumour, lymph node samples, blood samples, serum samples, urine samples, or faecal samples, using any suitable technique, including, but not limited to, oligonucleotide probes, quantitative PCR, or antibodies raised against the markers. It will be appreciated that by analyzing the presence and amounts of expression of a plurality of GCPMs in the form of prediction signatures, and constructing a prognostic signature, the sensitivity and accuracy of prognosis will be increased. Therefore, multiple markers according to the present invention can be used to determine the prognosis of a cancer.
The invention includes the use of archived paraffin-embedded biopsy material for assay of the markers in the set, and therefore is compatible with the most widely available type of biopsy material. It is also compatible with several different methods of tumour tissue harvest, for example, via core biopsy or fine needle aspiration. In certain aspects, RNA is isolated from a fixed, wax-embedded cancer tissue specimen of the patient. Isolation may be performed by any technique known in the art, for example from core biopsy tissue or fine needle aspirate cells.
In one aspect, the invention relates to a method of predicting a prognosis, e.g., the likelihood of long-term survival of a cancer patient without the recurrence of cancer, comprising determining the expression level of one or more prognostic markers or their expression products in a sample obtained from the patient, normalized against the expression level of other RNA transcripts or their products in the sample, or of a reference set of RNA transcripts or their expression products. In specific aspects, the prognostic marker is one or more markers listed in Table A, Table B, Table C or Table D or is included as one or more of the prognostic signatures derived from the markers listed in Table A, Table B, Table C or Table D.
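The normalization step described above can be illustrated with a short sketch. It assumes linear-scale expression values held in a dictionary, and it assumes two simple scaling choices (the mean of a reference gene set, or the median of all measured transcripts) that stand in for whatever normalization a given assay actually uses. The function name, the choice of GAPDH as the reference, and the values are hypothetical.

```python
from statistics import mean, median

def normalize_expression(levels, reference_genes=None):
    """Scale raw expression levels by a reference value.

    levels: dict mapping gene name -> raw (linear-scale) expression level.
    reference_genes: optional list of gene names forming the reference set.
        If omitted, the median over all transcripts in the sample is used.
    """
    if reference_genes:
        scale = mean(levels[g] for g in reference_genes)
    else:
        scale = median(levels.values())
    if scale <= 0:
        raise ValueError("reference level must be positive")
    return {gene: value / scale for gene, value in levels.items()}

# Hypothetical raw levels (illustration only).
raw = {"CDC2": 200.0, "PCNA": 400.0, "GAPDH": 100.0}
print(normalize_expression(raw, reference_genes=["GAPDH"]))
print(normalize_expression(raw))
```

Normalizing against a reference set makes levels comparable between samples that differ in RNA input or quality, which is the purpose of the normalization recited in the method.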
In further aspects, the expression levels of the prognostic markers or their expression products are determined, e.g., for the markers listed in Table A, Table B, Table C or Table D, or a prognostic signature derived from the markers listed in Table A, Table B, Table C or Table D. In another aspect, the method comprises the determination of the expression levels of a full set of prognosis markers or their expression products, e.g., for the markers listed in Table A, Table B, Table C or Table D, or a prognostic signature derived from the markers listed in Table A, Table B, Table C or Table D.
In an additional aspect, the invention relates to an array (e.g., microarray) comprising polynucleotides hybridizing to two or more markers, e.g., for the markers listed in Table A, Table B, Table C or Table D, or a prognostic signature derived from the markers listed in Table A, Table B, Table C or Table D. In particular aspects, the array comprises polynucleotides hybridizing to a prognostic signature derived from the markers listed in Table A, Table B, Table C or Table D. In another specific aspect, the array comprises polynucleotides hybridizing to the full set of markers, e.g., for the markers listed in Table A, Table B, Table C or Table D, or, e.g., for a prognostic signature.
For these arrays, the polynucleotides can be cDNAs, or oligonucleotides, and the solid surface on which they are displayed can be glass, for example. The polynucleotides can hybridize to one or more of the markers as disclosed herein, for example, to the full-length sequences, any coding sequences, any fragments, or any complements thereof. In particular aspects, an increase or decrease in expression levels of one or more GCPM indicates a decreased likelihood of long-term survival, e.g., due to cancer recurrence, while a lack of an increase or decrease in expression levels of one or more GCPM indicates an increased likelihood of long-term survival without cancer recurrence.
In further aspects, the invention relates to a kit comprising one or more of: (1) extraction buffer/reagents and protocol; (2) reverse transcription buffer/reagents and protocol; and (3) quantitative PCR buffer/reagents and protocol suitable for performing any of the foregoing methods. Other aspects and advantages of the invention are illustrated in the description and examples included herein.
[Table A: page images imgf000023_0001 to imgf000026_0001]
Table A: Proliferation-related genes differentially expressed between cell lines in high and low proliferative states. Genes that were differentially expressed between cell lines in confluent (low proliferation) and semi-confluent (high proliferation) states (see Figure 1) were identified by microarray analysis on 30K MWG Biotech arrays. Table A comprises the subset of these genes that were categorized by gene ontology analysis as cell proliferation-related.
Table B: GCPMs for cell proliferation signature
[Table B: page images imgf000027_0001 to imgf000072_0001]
Table B: Known cell proliferation-related genes. All genes categorized as cell proliferation-related by gene ontology analysis and present on the Affymetrix HG-U133 platform.
General Approaches to Prognostic Marker Detection
The following approaches are non-limiting methods that can be used to detect the proliferation markers, including GCPM family members: microarray approaches using oligonucleotide probes selective for a GCPM; real-time qPCR on tumour samples using GCPM specific primers and probes; real-time qPCR on lymph node, blood, serum, faecal, or urine samples using GCPM specific primers and probes; enzyme-linked immunological assays (ELISA); immunohistochemistry using anti-marker antibodies; and analysis of array or qPCR data using computers.
Other useful methods include northern blotting and in situ hybridization (Parker and Barnes, Methods in Molecular Biology 106: 247-283 (1999)); RNase protection assays (Hod, BioTechniques 13: 852-854 (1992)); reverse transcription polymerase chain reaction (RT-PCR; Weis et al., Trends in Genetics 8: 263-264 (1992)); serial analysis of gene expression (SAGE; Velculescu et al., Science 270: 484-487 (1995); and Velculescu et al., Cell 88: 243-51 (1997)), MassARRAY technology (Sequenom, San Diego, CA), and gene expression analysis by massively parallel signature sequencing (MPSS; Brenner et al., Nature Biotechnology 18: 630-634 (2000)). Alternatively, antibodies may be employed that can recognize specific complexes, including DNA duplexes, RNA duplexes, and DNA-RNA hybrid duplexes or DNA-protein duplexes.
Primary data can be collected and fold change analysis can be performed, for example, by comparison of marker expression levels in tumour tissue and non-tumour tissue; by comparison of marker expression levels to levels determined in recurring tumours and non-recurring tumours; by comparison of marker expression levels to levels determined in tumours with or without metastasis; by comparison of marker expression levels to levels determined in differently staged tumours; or by comparison of marker expression levels to levels determined in cells with different levels of proliferation. A negative or positive prognosis is determined based on this analysis. Further analysis of tumour marker expression includes matching those markers exhibiting increased or decreased expression with expression profiles of known gastrointestinal tumours to provide a prognosis.
A threshold for concluding that expression is increased is provided as, for example, at least a 1.5-fold or 2-fold increase, and in alternative embodiments, at least a 3-fold increase, 4-fold increase, or 5-fold increase. A threshold for concluding that expression is decreased is provided as, for example, at least a 1.5-fold or 2-fold decrease, and in alternative embodiments, at least a 3-fold decrease, 4-fold decrease, or 5-fold decrease. It can be appreciated that other thresholds for concluding that increased or decreased expression has occurred can be selected without departing from the scope of this invention.
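The fold-change thresholds described above can be expressed as a simple rule. The following is a minimal sketch, assuming linear-scale (not log-transformed) expression levels and a symmetric threshold for increases and decreases; the function name and default value are hypothetical choices for illustration.

```python
def fold_change_call(sample_level, reference_level, threshold=2.0):
    """Label expression as 'increased', 'decreased', or 'unchanged'.

    A ratio of at least `threshold` counts as increased; a ratio of at
    most 1/threshold counts as decreased. Levels must be positive and
    on a linear scale.
    """
    if sample_level <= 0 or reference_level <= 0:
        raise ValueError("expression levels must be positive")
    if threshold <= 1:
        raise ValueError("threshold must be greater than 1")
    ratio = sample_level / reference_level
    if ratio >= threshold:
        return "increased"
    if ratio <= 1.0 / threshold:
        return "decreased"
    return "unchanged"

print(fold_change_call(300.0, 100.0))        # 3-fold higher than reference
print(fold_change_call(40.0, 100.0))         # 2.5-fold lower than reference
print(fold_change_call(150.0, 100.0, 1.5))   # meets a 1.5-fold threshold
```

Passing `threshold=1.5`, `3.0`, `4.0`, or `5.0` corresponds to the alternative thresholds listed in the text.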
It will also be appreciated that a threshold for concluding that expression is increased will be dependent on the particular marker and also the particular predictive model that is to be applied. The threshold is generally set to achieve the highest sensitivity and selectivity with the lowest error rate, although variations may be desirable for a particular clinical situation. The desired threshold is determined by analysing a population of sufficient size taking into account the statistical variability of any predictive model and is calculated from the size of the sample used to produce the predictive model. The same applies for the determination of a threshold for concluding that expression is decreased. It can be appreciated that other thresholds, or methods for establishing a threshold, for concluding that increased or decreased expression has occurred can be selected without departing from the scope of this invention.
It is also possible that a prediction model may produce as its output a numerical value, for example a score, likelihood value or probability. In these instances, it is possible to apply thresholds to the results produced by prediction models, and in these cases similar principles apply as those used to set thresholds for expression values.
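As an illustration of setting a threshold on a model score, the sketch below scans candidate thresholds on a training cohort and keeps the one that maximizes sensitivity plus specificity (Youden's J statistic). This selection rule is an assumption made for the example, not a method stated in this specification. It also assumes that a higher score means a higher predicted risk of recurrence; the function names and data are hypothetical.

```python
def sensitivity_specificity(scores, recurred, threshold):
    """Sensitivity and specificity when scores >= threshold predict recurrence."""
    tp = fn = tn = fp = 0
    for score, event in zip(scores, recurred):
        predicted = score >= threshold
        if event and predicted:
            tp += 1
        elif event:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("cohort must contain both recurrent and non-recurrent cases")
    return tp / (tp + fn), tn / (tn + fp)

def best_threshold(scores, recurred):
    """Return the observed score that maximizes sensitivity + specificity - 1."""
    best_t, best_j = None, None
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, recurred, t)
        j = sens + spec - 1.0
        if best_j is None or j > best_j:
            best_t, best_j = t, j
    return best_t

# Hypothetical model scores and outcomes (illustration only).
scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
recurred = [False, False, False, True, True, True]
print(best_threshold(scores, recurred))
```

A clinical setting may weight sensitivity and specificity differently, in which case the quantity being maximized would change accordingly.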
Once the expression level of one or more proliferation markers in a tumour sample has been obtained, the likelihood of the cancer recurring can then be determined. In accordance with the invention, a negative prognosis is associated with decreased expression of at least one proliferation marker, while a positive prognosis is associated with increased expression of at least one proliferation marker. In various aspects, an increase in expression is shown by at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or 75 of the markers disclosed herein. In other aspects, a decrease in expression is shown by at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or 75 of the markers disclosed herein.
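Counting how many markers in a panel show increased or decreased expression can be sketched as follows. The example assumes linear-scale levels, a 2-fold cut-off, and a reference profile keyed by the same marker names; the function name and the expression values are hypothetical.

```python
def count_changed_markers(sample, reference, fold=2.0):
    """Count markers whose expression is increased or decreased.

    sample, reference: dicts mapping marker name -> positive expression level.
    Returns (number increased, number decreased) relative to the reference,
    using `fold` as the cut-off in both directions.
    """
    increased = decreased = 0
    for marker, ref_level in reference.items():
        ratio = sample[marker] / ref_level
        if ratio >= fold:
            increased += 1
        elif ratio <= 1.0 / fold:
            decreased += 1
    return increased, decreased

# Hypothetical levels for four of the disclosed markers (illustration only).
reference = {"CDC2": 100.0, "PCNA": 100.0, "MCM3": 100.0, "RRM2": 100.0}
sample = {"CDC2": 50.0, "PCNA": 30.0, "MCM3": 100.0, "RRM2": 400.0}
print(count_changed_markers(sample, reference))
```

The two counts can then be compared with whatever minimum number of markers (for example at least 1, 2, 3, or more) a given aspect requires.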
From the genes identified, proliferation signatures comprising one or more GCPMs can be used to determine the prognosis of a cancer, by comparing the expression level of the one or more genes to the disclosed proliferation signature. By comparing the expression of one or more of the GCPMs in a tumour sample with the disclosed proliferation signature, the likelihood of the cancer recurring can be determined. The comparison of expression levels of the prognostic signature to establish a prognosis can be done by applying a predictive model as described previously.
Determining the likelihood of the cancer recurring is of great value to the medical practitioner. A high likelihood of recurrence means that a longer or higher dose treatment should be given, and the patient should be more closely monitored for signs of recurrence of the cancer. An accurate prognosis is also of benefit to the patient. It allows the patient, along with their partners, family, and friends to also make decisions about treatment, as well as decisions about their future and lifestyle changes. Therefore, the invention also provides for a method of establishing a treatment regime for a particular cancer based on the prognosis established by matching the expression of the markers in a tumour sample with the differential proliferation signature.
It will be appreciated that the marker selection, or construction of a proliferation signature, does not have to be restricted to the GCPMs disclosed in Table A, Table B, Table C or Table D, herein, but could involve the use of one or more GCPMs from the disclosed signature, or a new signature may be established using GCPMs selected from the disclosed marker lists. The requirement of any signature is that it predicts the likelihood of recurrence with enough accuracy to assist a medical practitioner to establish a treatment regime.
Surprisingly, it was discovered that many of the GCPMs were associated with increased levels of cell proliferation, and were also associated with a positive prognosis. It has similarly been found that there is a close correlation between the decreased expression level of GCPMs and a negative prognosis, e.g., an increased likelihood of gastrointestinal cancer recurring. Therefore, the present invention also provides for the use of a marker associated with cell proliferation, e.g., a cell cycle component, as a GCPM.
As described herein, determination of the likelihood of a cancer recurring can be accomplished by measuring expression of one or more proliferation-specific markers. The methods provided herein also include assays of high sensitivity. In particular, qPCR is extremely sensitive, and can be used to detect markers in very low copy number (e.g., 1 - 100) in a sample. With such sensitivity, prognosis of gastrointestinal cancer is made reliable, accurate, and easily tested.
Reverse Transcription PCR (RT-PCR)
Of the techniques listed above, the most sensitive and most flexible quantitative method is RT-PCR, which can be used to compare RNA levels in different sample populations, in normal and tumour tissues, with or without drug treatment, to characterize patterns of expression, to discriminate between closely related RNAs, and to analyze RNA structure.
For RT-PCR, the first step is the isolation of RNA from a target sample. The starting material is typically total RNA isolated from human tumours or tumour cell lines, and corresponding normal tissues or cell lines, respectively. RNA can be isolated from a variety of samples, such as tumour samples from breast, lung, colon (e.g., large bowel or small bowel), colorectal, gastric, esophageal, anal, rectal, prostate, brain, liver, kidney, pancreas, spleen, thymus, testis, ovary, uterus, etc., tissues, from primary tumours, or tumour cell lines, and from pooled samples from healthy donors. If the source of RNA is a tumour, RNA can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g., formalin-fixed) tissue samples.
The first step in gene expression profiling by RT-PCR is the reverse transcription of the RNA template into cDNA, followed by its exponential amplification in a PCR reaction. The two most commonly used reverse transcriptases are avian myeloblastosis virus reverse transcriptase (AMV-RT) and Moloney murine leukaemia virus reverse transcriptase (MMLV-RT). The reverse transcription step is typically primed using specific primers, random hexamers, or oligo-dT primers, depending on the circumstances and the goal of expression profiling. For example, extracted RNA can be reverse-transcribed using a GeneAmp RNA PCR kit (Perkin Elmer, CA, USA), following the manufacturer's instructions. The derived cDNA can then be used as a template in the subsequent PCR reaction.
Although the PCR step can use a variety of thermostable DNA-dependent DNA polymerases, it typically employs the Taq DNA polymerase, which has a 5'-3' nuclease activity but lacks a 3'-5' proofreading exonuclease activity. Thus, TaqMan® PCR typically utilizes the 5' nuclease activity of Taq or Tth polymerase to hydrolyze a hybridization probe bound to its target amplicon, but any enzyme with equivalent 5' nuclease activity can be used.
Two oligonucleotide primers are used to generate an amplicon typical of a PCR reaction. A third oligonucleotide, or probe, is designed to detect nucleotide sequence located between the two PCR primers. The probe is non-extendible by Taq DNA polymerase enzyme, and is labeled with a reporter fluorescent dye and a quencher fluorescent dye. Any laser-induced emission from the reporter dye is quenched by the quenching dye when the two dyes are located close together as they are on the probe. During the amplification reaction, the Taq DNA polymerase enzyme cleaves the probe in a template-dependent manner. The resultant probe fragments disassociate in solution, and signal from the released reporter dye is free from the quenching effect of the second fluorophore. One molecule of reporter dye is liberated for each new molecule synthesized, and detection of the unquenched reporter dye provides the basis for quantitative interpretation of the data. TaqMan RT-PCR can be performed using commercially available equipment, such as, for example, the ABI PRISM 7700™ Sequence Detection System (Perkin-Elmer-Applied Biosystems, Foster City, CA, USA), or Lightcycler (Roche Molecular Biochemicals, Mannheim, Germany). In a preferred embodiment, the 5' nuclease procedure is run on a real-time quantitative PCR device such as the ABI PRISM 7700™ Sequence Detection System. The system consists of a thermocycler, laser, charge-coupled device (CCD), camera, and computer. The system amplifies samples in a 96-well format on a thermocycler. During amplification, laser-induced fluorescent signal is collected in real-time through fibre optics cables for all 96 wells, and detected at the CCD. The system includes software for running the instrument and for analyzing the data.
5' nuclease assay data are initially expressed as Ct, or the threshold cycle. As discussed above, fluorescence values are recorded during every cycle and represent the amount of product amplified to that point in the amplification reaction. The point when the fluorescent signal is first recorded as statistically significant is the threshold cycle.
To minimize errors and the effect of sample-to-sample variation, RT-PCR is usually performed using an internal standard. The ideal internal standard is expressed at a constant level among different tissues, and is unaffected by the experimental treatment. RNAs most frequently used to normalize patterns of gene expression are mRNAs for the housekeeping genes glyceraldehyde-3-phosphate-dehydrogenase (GAPDH) and β-actin.
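Normalizing a marker's Ct against an internal standard can be sketched with the widely used comparative Ct calculation. That calculation is an assumption introduced here for illustration and is not prescribed by this specification. The sketch also assumes that product roughly doubles each cycle (amplification efficiency of 2.0) for both marker and standard; the function names and Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Expression of a target relative to an internal standard.

    A lower Ct means more starting template, so the relative level is
    efficiency ** -(Ct_target - Ct_reference).
    """
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)

def fold_change_between_samples(test, control, efficiency=2.0):
    """Fold change of a target in a test sample versus a control sample.

    test, control: (Ct_target, Ct_reference) tuples for each sample.
    """
    delta_test = test[0] - test[1]
    delta_control = control[0] - control[1]
    return efficiency ** (-(delta_test - delta_control))

# Hypothetical Ct values: marker and GAPDH in tumour and in normal tissue.
print(relative_expression(25.0, 20.0))
print(fold_change_between_samples(test=(24.0, 20.0), control=(26.0, 20.0)))
```

If the measured amplification efficiency differs from 2.0, the `efficiency` argument can be changed, although distinct efficiencies for marker and standard would need a more detailed model than this sketch provides.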
Real-time quantitative PCR (qPCR)
A more recent variation of the RT-PCR technique is the real time quantitative PCR, which measures PCR product accumulation through a dual-labeled fluorogenic probe (i.e., TaqMan® probe). Real time PCR is compatible both with quantitative competitive PCR and with quantitative comparative PCR. The former uses an internal competitor for each target sequence for normalization, while the latter uses a normalization gene contained within the sample, or a housekeeping gene for RT-PCR. For further details see, e.g., Held et al., Genome Research 6: 986-994 (1996).
Expression levels can be determined using fixed, paraffin-embedded tissues as the RNA source. According to one aspect of the present invention, PCR primers and probes are designed based upon intron sequences present in the gene to be amplified. In this embodiment, the first step in the primer/probe design is the delineation of intron sequences within the genes. This can be done by publicly available software, such as the DNA BLAT software developed by Kent, W. J., Genome Res. 12 (4): 656-64 (2002), or by the BLAST software including its variations. Subsequent steps follow well established methods of PCR primer and probe design.
In order to avoid non-specific signals, it is useful to mask repetitive sequences within the introns when designing the primers and probes. This can be easily accomplished by using the Repeat Masker program available on-line through the Baylor College of Medicine, which screens DNA sequences against a library of repetitive elements and returns a query sequence in which the repetitive elements are masked. The masked sequences can then be used to design primer and probe sequences using any commercially or otherwise publicly available primer/probe design packages, such as Primer Express (Applied Biosystems); MGB assay-by-design (Applied Biosystems); Primer3 (Steve Rozen and Helen J. Skaletsky (2000) Primer3 on the WWW for general users and for biologist programmers in: Krawetz S, Misener S (eds) Bioinformatics Methods and Protocols: Methods in Molecular Biology. Humana Press, Totowa, NJ, pp 365-386).
The most important factors considered in PCR primer design include primer length, melting temperature (Tm), G/C content, specificity, complementary primer sequences, and 3' end sequence. In general, optimal PCR primers are 17-30 bases in length, and contain about 20-80%, such as, for example, about 50-60% G+C bases. Tms between 50 and 80°C, e.g., about 50 to 70°C, are typically preferred. For further guidelines for PCR primer and probe design see, e.g., Dieffenbach, C. W. et al., General Concepts for PCR Primer Design in: PCR Primer, A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 1995, pp. 133-155; Innis and Gelfand, Optimization of PCRs in: PCR Protocols, A Guide to Methods and Applications, CRC Press, London, 1994, pp. 5-11; and Plasterer, T. N. Primerselect: Primer and probe design. Methods Mol. Biol. 70: 520-527 (1997), the entire disclosures of which are hereby expressly incorporated by reference.
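The length, G+C content, and Tm ranges given above can be checked for a candidate primer with a few lines of code. This is a rough sketch only: the Tm estimate uses the simple rule of 2°C per A/T and 4°C per G/C, which is an assumption introduced here and is only a crude approximation for short oligonucleotides. Dedicated design tools such as those cited above use more detailed models. The function names and the primer sequence are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def rough_tm(primer):
    """Crude Tm estimate in degrees C: 2 per A/T plus 4 per G/C."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def check_primer(primer, min_len=17, max_len=30,
                 min_gc=0.20, max_gc=0.80, min_tm=50, max_tm=80):
    """Report whether a primer falls inside the ranges quoted in the text."""
    return {
        "length_ok": min_len <= len(primer) <= max_len,
        "gc_ok": min_gc <= gc_fraction(primer) <= max_gc,
        "tm_ok": min_tm <= rough_tm(primer) <= max_tm,
    }

# Hypothetical 20-base primer (illustration only).
candidate = "ATGCGTACGTTAGCCTAGGA"
print(gc_fraction(candidate), rough_tm(candidate), check_primer(candidate))
```

Specificity, primer-primer complementarity, and the 3' end sequence are not assessed by this sketch and would need separate checks.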
Microarray analysis
Differential gene expression can also be identified, or confirmed, using the microarray technique. Thus, the expression profile of GCPMs can be measured in either fresh or paraffin-embedded tumour tissue, using microarray technology. In this method, polynucleotide sequences of interest (including cDNAs and oligonucleotides) are plated, or arrayed, on a microchip substrate. The arrayed sequences (i.e., capture probes) are then hybridized with specific polynucleotides from cells or tissues of interest (i.e., targets). Just as in the RT-PCR method, the source of RNA typically is total RNA isolated from human tumours or tumour cell lines, and corresponding normal tissues or cell lines. Thus RNA can be isolated from a variety of primary tumours or tumour cell lines. If the source of RNA is a primary tumour, RNA can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g., formalin-fixed) tissue samples, which are routinely prepared and preserved in everyday clinical practice.
In a specific embodiment of the microarray technique, PCR amplified inserts of cDNA clones are applied to a substrate. The substrate can include up to 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or 75 nucleotide sequences. In other aspects, the substrate can include at least 10,000 nucleotide sequences. The microarrayed sequences, immobilized on the microchip, are suitable for hybridization under stringent conditions. As other embodiments, the targets for the microarrays can be at least 50, 100, 200, 400, 500, 1000, or 2000 bases in length; or 50-100, 100-200, 100-500, 100-1000, 100-2000, or 500-5000 bases in length. As further embodiments, the capture probes for the microarrays can be at least 10, 15, 20, 25, 50, 75, 80, or 100 bases in length; or 10-15, 10-20, 10-25, 10-50, 10-75, 10-80, or 20-80 bases in length.
Fluorescently labeled cDNA probes may be generated through incorporation of fluorescent nucleotides by reverse transcription of RNA extracted from tissues of interest. Labeled cDNA probes applied to the chip hybridize with specificity to each spot of DNA on the array. After stringent washing to remove non-specifically bound probes, the chip is scanned by confocal laser microscopy or by another detection method, such as a CCD camera. Quantitation of hybridization of each arrayed element allows for assessment of corresponding mRNA abundance. With dual colour fluorescence, separately labeled cDNA probes generated from two sources of RNA are hybridized pairwise to the array. The relative abundance of the transcripts from the two sources corresponding to each specified gene is thus determined simultaneously.
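The relative abundance obtained from a dual colour hybridization is commonly summarized as a log2 ratio of the two channel intensities. The sketch below assumes background-corrected, positive intensities for each arrayed gene; the function name, the channel labels, and the intensity values are hypothetical, and steps such as dye-bias correction are deliberately omitted.

```python
import math

def log2_ratios(channel_test, channel_reference):
    """Per-gene log2(test / reference) from two-channel intensities.

    channel_test, channel_reference: dicts mapping gene name -> positive,
    background-corrected fluorescence intensity for each labeled source.
    A value of 1.0 means twice as abundant in the test source; -1.0 means half.
    """
    ratios = {}
    for gene, reference_signal in channel_reference.items():
        test_signal = channel_test[gene]
        if test_signal <= 0 or reference_signal <= 0:
            raise ValueError(f"non-positive intensity for {gene}")
        ratios[gene] = math.log2(test_signal / reference_signal)
    return ratios

# Hypothetical spot intensities (illustration only).
tumour = {"CDC2": 800.0, "PCNA": 100.0, "MCM3": 250.0}
normal = {"CDC2": 200.0, "PCNA": 200.0, "MCM3": 250.0}
print(log2_ratios(tumour, normal))
```

On this scale the approximately two-fold differences mentioned in the text correspond to values of about +1 or -1.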
The miniaturized scale of the hybridization affords a convenient and rapid evaluation of the expression pattern for large numbers of genes. Such methods have been shown to have the sensitivity required to detect rare transcripts, which are expressed at a few copies per cell, and to reproducibly detect at least approximately two-fold differences in the expression levels (Schena et al., Proc. Natl. Acad. Sci. USA 93 (2): 106-149 (1996)).
Microarray analysis can be performed by commercially available equipment, following manufacturer's protocols, such as by using the Affymetrix GeneChip technology, or Incyte's microarray technology. The development of microarray methods for large-scale analysis of gene expression makes it possible to search systematically for molecular markers of cancer classification and outcome prediction in a variety of tumour types.
RNA isolation, purification, and amplification
General methods for mRNA extraction are well known in the art and are disclosed in standard textbooks of molecular biology, including Ausubel et al., Current Protocols of Molecular Biology, John Wiley and Sons (1997). Methods for RNA extraction from paraffin embedded tissues are disclosed, for example, in Rupp and Locker, Lab Invest. 56: A67 (1987), and De Sandres et al., BioTechniques 18: 42044 (1995). In particular, RNA isolation can be performed using a purification kit, buffer set, and protease from commercial manufacturers, such as Qiagen, according to the manufacturer's instructions. For example, total RNA from cells in culture can be isolated using Qiagen RNeasy mini-columns. Other commercially available RNA isolation kits include MasterPure Complete DNA and RNA Purification Kit (EPICENTRE®, Madison, WI), and Paraffin Block RNA Isolation Kit (Ambion, Inc.). Total RNA from tissue samples can be isolated using RNA Stat-60 (Tel-Test). RNA prepared from tumour can be isolated, for example, by cesium chloride density gradient centrifugation.
The steps of a representative protocol for profiling gene expression using fixed, paraffin-embedded tissues as the RNA source, including mRNA isolation, purification, primer extension and amplification, are given in various published journal articles (for example: T. E. Godfrey et al. J. Molec. Diagnostics 2: 84-91 (2000); K. Specht et al., Am. J. Pathol. 158: 419-29 (2001)). Briefly, a representative process starts with cutting about 10 μm thick sections of paraffin-embedded tumour tissue samples. The RNA is then extracted, and protein and DNA are removed. After analysis of the RNA concentration, RNA repair and/or amplification steps may be included, if necessary, and RNA is reverse transcribed using gene specific primers followed by RT-PCR. Finally, the data are analyzed to identify the best treatment option(s) available to the patient on the basis of the characteristic gene expression pattern identified in the tumour sample examined.
Immunohistochemistry and proteomics
Immunohistochemistry methods are also suitable for detecting the expression levels of the proliferation markers of the present invention. Thus, antibodies or antisera, preferably polyclonal antisera, and most preferably monoclonal antibodies specific for each marker, are used to detect expression. The antibodies can be detected by direct labeling of the antibodies themselves, for example, with radioactive labels, fluorescent labels, hapten labels such as biotin, or an enzyme such as horseradish peroxidase or alkaline phosphatase. Alternatively, unlabeled primary antibody is used in conjunction with a labeled secondary antibody, comprising antisera, polyclonal antisera or a monoclonal antibody specific for the primary antibody. Immunohistochemistry protocols and kits are well known in the art and are commercially available.
Proteomics can be used to analyze the polypeptides present in a sample (e.g., tissue, organism, or cell culture) at a certain point of time. In particular, proteomic techniques can be used to assess the global changes of protein expression in a sample (also referred to as expression proteomics). Proteomic analysis typically includes: (1) separation of individual proteins in a sample by 2-D gel electrophoresis (2-D PAGE); (2) identification of the individual proteins recovered from the gel, e.g., by mass spectrometry or N-terminal sequencing; and (3) analysis of the data using bioinformatics. Proteomics methods are valuable supplements to other methods of gene expression profiling, and can be used, alone or in combination with other methods, to detect the products of the proliferation markers of the present invention.
Selection of Differentially Expressed Genes.
An early approach to the selection of genes deemed significant involved simply looking at the "fold change" of a given gene between the two groups of interest. While this approach homes in on genes that seem to change the most spectacularly, consideration of basic statistics leads one to realize that if the variance (or noise level) is quite high (as is often seen in microarray experiments), then seemingly large fold changes can happen frequently by chance alone.
Microarray experiments, such as those described here, typically involve the simultaneous measurement of thousands of genes. If one is comparing the expression levels for a particular gene between two groups (for example recurrent and non-recurrent tumours), the typical tests for significance (such as the t-test) are not adequate. This is because, in an ensemble of thousands of experiments (in this context each gene constitutes an "experiment"), the probability of at least one experiment passing the usual criteria for significance by chance alone is essentially unity. In a test for significance, one typically calculates the probability of obtaining a result at least as extreme as the one observed if the "null hypothesis" is correct. In the case of comparing two groups, the null hypothesis is that there is no difference between the two groups. If a statistical test produces such a probability below some threshold (usually 0.05 or 0.01), it is stated that we can reject the null hypothesis, and accept the hypothesis that the two groups are significantly different. Clearly, in such a test, a rejection of the null hypothesis by chance alone could be expected 1 in 20 times (or 1 in 100). The use of t-tests, or other similar statistical tests for significance, fails in the context of microarrays, producing far too many false positives (or type I errors).
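The arithmetic behind this point can be made explicit. The following is a minimal sketch, assuming for illustration only that the tests are independent and that every null hypothesis is true; the function names are hypothetical and are not part of any method described herein.

```python
def prob_at_least_one_false_positive(alpha: float, n_tests: int) -> float:
    """P(at least one type I error) over n_tests independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected number of true null hypotheses rejected by chance alone."""
    return alpha * n_tests


if __name__ == "__main__":
    # One test, twenty tests, and array-scale numbers of genes.
    for n_genes in (1, 20, 1000, 30000):
        print(
            n_genes,
            prob_at_least_one_false_positive(0.05, n_genes),
            expected_false_positives(0.05, n_genes),
        )
```

Under these assumptions, the probability of at least one chance rejection approaches one well before the number of tests reaches the number of genes on a typical array.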
In this type of situation, where one is testing multiple hypotheses at the same time, one applies typical multiple comparison procedures, such as the Bonferroni Method (43). However, such tests are too conservative for most microarray experiments, resulting in too many false negative (type II) errors.
A more recent approach is to do away with attempting to apply a probability for a given test being significant, and establish a means for selecting a subset of experiments, such that the expected proportion of Type I errors (or false discovery rate; 47) is controlled for. It is this approach that has been used in this investigation, through various implementations, namely the methods provided with BRB Array Tools (48), and the limma (11, 42) package of Bioconductor (that uses the R statistical environment; 10, 39).
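The contrast between the two strategies can be sketched as follows. This is an illustrative sketch only: it uses the Benjamini-Hochberg step-up procedure as one widely used way of controlling the false discovery rate, and it is not asserted to be the exact procedure implemented in BRB Array Tools or limma. The function names and the p-values in the usage example are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is <= alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) whose sorted p-value satisfies p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


if __name__ == "__main__":
    # Hypothetical p-values for ten genes.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print("Bonferroni rejections:", sum(bonferroni_reject(p)))
    print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg_reject(p)))
```

Because the Benjamini-Hochberg threshold grows with the rank of the p-value, it never rejects fewer hypotheses than the Bonferroni method at the same nominal level.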
General methodology for Data Mining: Generation of Prognostic Signatures
Data Mining is the term used to describe the extraction of "knowledge", in other words the "know-how", or predictive ability from (usually) large volumes of data (the dataset). This is the approach used in this study to generate prognostic signatures. In the case of this study the "know-how" is the ability to accurately predict prognosis from a given set of gene expression measurements, or "signature" (as described generally in this section and in more detail in the examples section).
The specific details of the methods used in this study are described in Examples 17-20. However, application of any of the data mining methods (both those described in the Examples, and those described here) can follow this general protocol.
Data mining (49), and the related topic of machine learning (40), is a complex, repetitive mathematical task that involves the use of one or more appropriate computer software packages (see below). The use of software is advantageous on the one hand, in that one does not need to be completely familiar with the intricacies of the theory behind each technique in order to successfully use data mining techniques, provided that one adheres to the correct methodology. The disadvantage is that the application of data mining can often be viewed as a "black box": one inserts the data and receives the answer. How this is achieved is often masked from the end-user (this is the case for many of the techniques described), and this can often influence the statistical method chosen for data mining. For example, neural networks and support vector machines have a particularly complex implementation that makes it very difficult for the end user to extract the "rules" used to produce the decision. On the other hand, k-nearest neighbours and linear discriminant analysis have a very transparent process for decision making that is not hidden from the user.
There are two types of approach used in data mining: supervised and unsupervised approaches. In the supervised approach, the information that is being linked to the data is known, such as categorical data (e.g. recurrent vs. non recurrent tumours). What is required is the ability to link the observed response (e.g. recurrence vs. non-recurrence) to the input variables. In the unsupervised approach, the classes within the dataset are not known in advance, and data mining methodology is employed to attempt to find the classes or structure within the dataset.
In the present example the supervised approach was used and is discussed in detail here, although it will be appreciated that any of the other techniques could be used.
The overall protocol involves the following steps:
• Data representation. This involves transformation of the data into a form that is most likely to work successfully with the chosen data mining technique. Where the data is numerical, such as in this study where the data being investigated represents relative levels of gene expression, this is fairly simple. If the data covers a large dynamic range (i.e. many orders of magnitude), often the log of the data is taken. If the data covers many measurements of separate samples on separate days by separate investigators, particular care has to be taken to ensure systematic error is minimised. The minimisation of systematic error (i.e. errors resulting from protocol differences, machine differences, operator differences and other quantifiable factors) is the process referred to here as "normalisation".
• Feature Selection. Typically the dataset contains many more data elements than would be practical to measure on a day-to-day basis, and additionally many elements that do not provide the information needed to produce a prediction model. The actual ability of a prediction model to describe a dataset is derived from some subset of the full dimensionality of the dataset. These dimensions are the most important components (or features) of the dataset. Note that in the context of microarray data, the dimensions of the dataset are the individual genes. Feature selection, in the context described here, involves finding those genes which are most "differentially expressed". In a more general sense, it involves those variables which pass some statistical test for significance, i.e. is the level of a particular variable consistently higher or lower in one or other of the groups being investigated. Sometimes the features are those variables (or dimensions) which exhibit the greatest variance.
The application of feature selection is completely independent of the method used to create a prediction model, and involves a great deal of experimentation to achieve the desired results. Within this invention, the selection of significant genes, and those which correlated with the earlier successful model (the NZ classifier), entailed feature selection. In addition, methods of data reduction (such as principal component analysis) can be applied to the dataset.
• Training. Once the classes (e.g. recurrence/non-recurrence) and the features of the dataset have been established, and the data is represented in a form that is acceptable as input for data mining, the reduced dataset (as described by the features) is applied to the prediction model of choice. The input for this model is usually in the form of a multi-dimensional numerical input (known as a vector), with associated output information (a class label or a response). In the training process, selected data is input into the prediction model, either sequentially (in techniques such as neural networks) or as a whole (in techniques that apply some form of regression, such as linear models, linear discriminant analysis, support vector machines). In some instances (e.g. k-nearest neighbours) the dataset (or subset of the dataset obtained after feature selection) is itself the model. As discussed, effective models can be established with minimal understanding of the detailed mathematics, through the use of various software packages where the parameters of the model have been pre-determined by expert analysts as most likely to lead to successful results.
• Validation. This is a key component of the data-mining protocol, and the incorrect application of this frequently leads to errors. Portions of the dataset are to be set aside, apart from feature selection and training, to test the success of the prediction model. Furthermore, if the results of validation are used to affect feature selection and training of the model, then one obtains a further validation set to test the model before it is applied to real-life situations. If this process is not strictly adhered to, the model is likely to fail in real-world situations. The methods of validation are described in more detail below.
• Application. Once the model has been constructed and validated, it must be packaged in some way so that it is accessible to end users. This often involves implementation of some form of spreadsheet application, into which the model has been embedded, scripting of a statistical software package, or refactoring of the model into a hard-coded application by information technology staff.
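The ordering of the first steps of the protocol above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the helper names, the Welch t-statistic used for ranking, the 25% hold-out fraction and the toy data are all hypothetical choices made for illustration, and are not the specific methods of the Examples. The point illustrated is that the held-out samples are set aside before feature selection, so that they play no part in choosing the genes.

```python
import math
import random
from statistics import mean, variance


def log2_transform(values):
    """Data representation: compress a wide dynamic range (values must be > 0)."""
    return [math.log2(v) for v in values]


def stratified_holdout(labels, test_fraction=0.25, seed=0):
    """Validation: set aside a class-balanced portion before any other step."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def welch_t(a, b):
    """Welch t-statistic for two groups (each group needs at least two values)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_features(samples, labels, n_keep):
    """Feature selection: rank genes by |t| between classes 1 and 0, using only the samples given."""
    scores = []
    for g in range(len(samples[0])):
        pos = [s[g] for s, y in zip(samples, labels) if y == 1]
        neg = [s[g] for s, y in zip(samples, labels) if y == 0]
        scores.append((abs(welch_t(pos, neg)), g))
    return [g for _, g in sorted(scores, reverse=True)[:n_keep]]


if __name__ == "__main__":
    # Toy data: 8 samples x 2 genes; only gene 0 differs strongly between classes.
    raw = [[256.0 * 2 ** (0.1 * i), 32.0 * 2 ** (0.5 * (-1) ** i)] for i in range(4)]
    raw += [[16.0 * 2 ** (0.1 * i), 32.0 * 2 ** (0.5 * (-1) ** i)] for i in range(4)]
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    data = [log2_transform(row) for row in raw]
    train_idx, test_idx = stratified_holdout(labels, test_fraction=0.25)
    selected = select_features(
        [data[i] for i in train_idx], [labels[i] for i in train_idx], n_keep=1
    )
    print("training:", train_idx, "held out:", test_idx, "selected genes:", selected)
```

Training and validation of a prediction model would then use only the selected genes, with the held-out samples consulted once the model is fixed.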
Examples of software packages that are frequently used are:
- Spreadsheet plugins, obtained from multiple vendors.
- The R statistical environment.
- The commercial packages MatLab, S-plus, SAS, SPSS, STATA.
- Free open-source software such as Octave (a MatLab clone).
- Many and varied C++ libraries, which can be used to implement prediction models in a commercial, closed-source setting.
Examples of Data Mining Methods.
The methods can be applied by first performing the steps of the data mining process (above), and then applying the appropriate known software packages. The process of data mining is described in detail in many extremely well-written texts (49).
• Linear models (49, 50): The data is treated as the input of a linear regression model, of which the class labels or response variables are the output. Class labels, or other categorical data, must be transformed into numerical values (usually integer). In generalised linear models, the class labels or response variables are not themselves linearly related to the input data, but are transformed through the use of a "link function". Logistic regression is the most common form of generalized linear model.
• Linear Discriminant analysis (49, 51, 52). Provided the data is linearly separable (i.e. the groups or classes of data can be separated by a hyperplane, which is an n-dimensional extension of a threshold), this technique can be applied. A combination of variables is used to separate the classes, such that the between-group variance is maximised, and the within-group variance is minimised. The byproduct of this is the formation of a classification rule. Application of this rule to samples of unknown class allows predictions or classification of class membership to be made for that sample. There are variations of linear discriminant analysis, such as nearest shrunken centroids, which are commonly used for microarray analysis.
• Support vector machines (53): A collection of variables is used in conjunction with a collection of weights to determine a model that maximizes the separation between classes in terms of those weighted variables. Application of this model to a sample then produces a classification or prediction of class membership for that sample.
• Neural networks (52): The data is treated as input into a network of nodes, which superficially resemble biological neurons, which apply the input from all the nodes to which they are connected, and transform the input into an output. Commonly, neural networks use the "multiply and sum" algorithm, to transform the inputs from multiple connected input nodes into a single output. A node may not necessarily produce an output unless the inputs to that node exceed a certain threshold. Each node has as its input the output from several other nodes, with the final output node usually being linked to a categorical variable. The number of nodes, and the topology of the nodes can be varied in almost infinite ways, providing for the ability to classify extremely noisy data that may not be possible to categorize in other ways. The most common implementation of neural networks is the multi-layer perceptron.
• Classification and regression trees (54): In these, variables are used to define a hierarchy of rules that can be followed in a stepwise manner to determine the class of a sample. The typical process creates a set of rules which lead to a specific class output, or a specific statement of the inability to discriminate. An example classification tree is an implementation of an algorithm such as:
if gene A > x and gene Y > x and gene Z = z then class A
else if gene A = q then class B
• Nearest neighbour methods (51, 52). Predictions or classifications are made by comparing a sample (of unknown class) to those around it (of known class), with closeness defined by a distance function. It is possible to define many different distance functions. Commonly used distance functions are the Euclidean distance (an extension of the Pythagorean distance, as in triangulation, to n dimensions) and various forms of correlation (including the Pearson correlation coefficient). There are also transformation functions that convert data points that would not normally be interconnected by a meaningful distance metric into Euclidean space, so that Euclidean distance can then be applied (e.g. Mahalanobis distance). Although the distance metric can be quite complex, the basic premise of k-nearest neighbours is quite simple, essentially being a restatement of "find the k data vectors that are most similar to the unknown input, find out which class they correspond to, and vote as to which class the unknown input is".
• Other methods:
- Bayesian networks. A directed acyclic graph is used to represent a collection of variables in conjunction with their joint probability distribution, which is then used to determine the probability of class membership for a sample.
- Independent components analysis, in which independent signals (e.g., class membership) are isolated (into components) from a collection of variables. These components can then be used to produce a classification or prediction of class membership for a sample.
- Ensemble learning methods, in which a collection of prediction methods are combined to produce a joint classification or prediction of class membership for a sample.
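Of the methods listed, the nearest neighbour approach is the simplest to write down in full. The following is a minimal sketch of k-nearest neighbours with Euclidean distance and majority voting; the function names, the class labels and the two-gene toy vectors are hypothetical and serve only to illustrate the "find the k most similar vectors and vote" premise.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_vectors, train_labels, query, k=3, distance=euclidean):
    """Classify query by majority vote among its k nearest training vectors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical two-gene expression vectors of known class.
    vectors = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
    classes = ["non-recurrent"] * 3 + ["recurrent"] * 3
    print(knn_predict(vectors, classes, [0.2, 0.1], k=3))
    print(knn_predict(vectors, classes, [5.5, 5.2], k=3))
```

Here the training set itself is the model, and a different distance function (for example one based on correlation) can be supplied through the `distance` parameter. An odd value of k avoids tied votes when there are two classes.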
There are many variations of these methodologies that can be explored (49), and many new methodologies are constantly being defined and developed. It will be appreciated that any one of these methodologies can be applied in order to obtain an acceptable result. Particular care must be taken to avoid overfitting, by ensuring that all results are tested via a comprehensive validation scheme.
Validation
Application of any of the prediction methods described involves both training and cross-validation (43, 55) before the method can be applied to new datasets (such as data from a clinical trial). Training involves taking a subset of the dataset of interest (in this case gene expression measurements from colorectal tumours), such that it is stratified across the classes that are being tested for (in this case recurrent and non-recurrent tumours). This training set is used to generate a prediction model (defined above), which is tested on the remainder of the data (the testing set).
It is possible to alter the parameters of the prediction model so as to obtain better performance in the testing set; however, this can lead to the situation known as overfitting, where the prediction model works on the training dataset but not on any external dataset. In order to circumvent this, the process of validation is followed. There are two major types of validation typically applied. The first (hold-out validation) involves partitioning the dataset into three groups: testing, training, and validation. The validation set has no input into the training process whatsoever, so that any adjustment of parameters or other refinements must take place during application to the testing set (but not the validation set). The second major type is cross-validation, which can be applied in several different ways, described below.
There are two main sub-types of cross-validation: K-fold cross-validation, and leave-one-out cross-validation.
K-fold cross-validation: The dataset is divided into K subsamples, each subsample containing approximately the same proportions of the class groups as the original. In each round of validation, one of the K subsamples is set aside, and training is accomplished using the remainder of the dataset. The effectiveness of the training for that round is gauged by how correct the classification of the left-out group is. This procedure is repeated K times, and the overall effectiveness is ascertained by comparison of the predicted class with the known class.
Leave-one-out cross-validation: A commonly used variation of K-fold cross validation, in which K=n, where n is the number of samples.
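Both sub-types can be expressed with one routine, since leave-one-out is the case K = n. The sketch below is illustrative only: the fold-assignment scheme, the function names and the nearest-centroid classifier used as the prediction model are assumptions chosen for brevity, and are not the specific validation code used in the Examples.

```python
def stratified_folds(labels, k):
    """Assign sample indices to k folds so that each class is spread evenly across folds."""
    if not 2 <= k <= len(labels):
        raise ValueError("k must be between 2 and the number of samples")
    by_class = sorted(range(len(labels)), key=lambda i: str(labels[i]))
    folds = [[] for _ in range(k)]
    for position, index in enumerate(by_class):
        folds[position % k].append(index)
    return folds


def nearest_centroid(train_x, train_y, query):
    """Toy prediction model: assign the class whose mean vector is closest to the query."""
    centroids = {}
    for cls in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return min(
        centroids,
        key=lambda c: sum((q - m) ** 2 for q, m in zip(query, centroids[c])),
    )


def cross_validate(samples, labels, k, fit_predict):
    """Return the fraction of left-out samples classified correctly over k rounds."""
    correct = 0
    for held_out in stratified_folds(labels, k):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        train_x = [samples[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        for i in held_out:
            if fit_predict(train_x, train_y, samples[i]) == labels[i]:
                correct += 1
    return correct / len(labels)


if __name__ == "__main__":
    # Hypothetical one-gene dataset with two well-separated classes.
    x = [[0.0], [0.2], [0.1], [0.3], [1.0], [1.2], [1.1], [1.3]]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    print("4-fold accuracy:", cross_validate(x, y, 4, nearest_centroid))
    print("leave-one-out accuracy:", cross_validate(x, y, len(y), nearest_centroid))
```

Any feature selection would need to be repeated inside each round, on the training portion only, for the accuracy estimate to remain unbiased.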
Combinations of GCPMs, such as those described above in Tables 1 and 2, can be used to construct predictive models for prognosis.
Prognostic Signatures
Prognostic signatures, comprising one or more of these markers, can be used to determine the outcome of a patient, through application of one or more predictive models derived from the signature. In particular, a clinician or researcher can determine the differential expression (e.g., increased or decreased expression) of the one or more markers in the signature, apply a predictive model, and thereby predict the negative prognosis, e.g., likelihood of disease relapse, of a patient, or alternatively the likelihood of a positive prognosis (continued remission).
In still further aspects, the invention includes a method of determining a treatment regime for a cancer comprising: (a) providing a sample of the cancer; (b) detecting the expression level of a GCPM family member in said sample; (c) determining the prognosis of the cancer based on the expression level of a GCPM family member; and (d) determining the treatment regime according to the prognosis.
In still further aspects, the invention includes a device for detecting a GCPM, comprising: a substrate having a GCPM capture reagent thereon; and a detector associated with said substrate, said detector capable of detecting a GCPM associated with said capture reagent. Additional aspects include kits for detecting cancer, comprising: a substrate; a GCPM capture reagent; and instructions for use. Yet further aspects of the invention include a method for detecting a GCPM using qPCR, comprising: a forward primer specific for said GCPM; a reverse primer specific for said GCPM; PCR reagents; a reaction vial; and instructions for use.
Additional aspects of this invention comprise a kit for detecting the presence of a GCPM polypeptide or peptide, comprising: a substrate having a capture agent for said GCPM polypeptide or peptide; an antibody specific for said GCPM polypeptide or peptide; a reagent capable of labeling bound antibody for said GCPM polypeptide or peptide; and instructions for use.
In yet further aspects, this invention includes a method for determining the prognosis of colorectal cancer, comprising the steps of: providing a tumour sample from a patient suspected of having colorectal cancer; and measuring the presence of a GCPM polypeptide using an ELISA method. In specific aspects of this invention the GCPM of the invention is selected from the markers set forth in Table A, Table B, Table C or Table D. In still further aspects, the GCPM is included in a prognostic signature.
While exemplified herein for gastrointestinal cancer, e.g., gastric and colorectal cancer, the GCPMs of the invention also find use for the prognosis of other cancers, e.g., breast cancers, prostate cancers, ovarian cancers, lung cancers (such as adenocarcinoma and, particularly, small cell lung cancer), lymphomas, gliomas, blastomas (e.g., medulloblastomas), and mesothelioma, where decreased or low expression is associated with a positive prognosis, while increased or high expression is associated with a negative prognosis.
EXAMPLES
The examples described herein are for purposes of illustrating embodiments of the invention. Other embodiments, methods, and types of analyses are within the scope of persons of ordinary skill in the molecular diagnostic arts and need not be described in detail herein. Other embodiments within the scope of the art are considered to be part of this invention.
EXAMPLE 1 : Cell cultures
The experimental scheme is shown in FIG. 1. Ten colorectal cell lines were cultured and harvested at semi- and full-confluence. Gene expression profiles of the two growth stages were analyzed on 30,000 oligonucleotide arrays and a gene proliferation signature (GPS; Table C) was identified by gene ontology analysis of differentially expressed genes. Unsupervised clustering was then used to independently dichotomize two cohorts of clinical colorectal samples (Cohort A: 73 stage I-IV on oligo arrays, Cohort B: 55 stage II on Affymetrix chips) based on the similarities of the GPS expression. Ki-67 immunostaining was also performed on tissue sections from Cohort A tumours. Following this, the correlation between proliferation activity and clinico-pathologic parameters was investigated.
Ten colorectal cancer cell lines derived from different disease stages were included in this study: DLD-1, HCT-8, HCT-116, HT-29, LoVo, Ls174T, SK-CO-1, SW48, SW480, and SW620 (ATCC, Manassas, VA). Cells were cultivated in a 5% CO2 humidified atmosphere at 37°C in alpha minimum essential medium supplemented with 10% fetal bovine serum, 100 IU/ml penicillin and 100 μg/ml streptomycin (GIBCO-Invitrogen, CA). Two cell cultures were established for each cell line. The first culture was harvested upon reaching semi-confluence (50-60%). When cells in the second culture reached full-confluence (determined both microscopically and macroscopically), media was replaced, and cells were harvested twenty-four hours later to prepare RNA from the growth-inhibited cells. Array experiments were carried out on RNA extracted from each cell culture. In addition, a second culturing experiment was done following the same procedure and extracted RNA was used for dye-reversed hybridizations.
EXAMPLE 2: Patients
Two cohorts of patients were analysed. Cohort A included 73 New Zealand colorectal cancer patients who underwent surgery at Dunedin and Auckland hospitals between 1995 and 2000. These patients were part of a prospective cohort study and included all disease stages. Tumour samples were collected fresh from the operation theatre, snap frozen in liquid nitrogen and stored at -80°C. Specimens were reviewed by a single pathologist (H-S Y) and tumours were staged according to the TNM system (34). Of the 73 patients, 32 developed disease recurrence and 41 remained recurrence-free after a minimum of five years follow up. The median overall survival was 29.5 and 66 months for recurrent and recurrence-free patients, respectively. Twenty patients received 5-FU-based post-operative adjuvant chemotherapy and 12 patients received radiotherapy (7 pre- and 5 post-operative).
Cohort B included a group of 55 German colorectal patients who underwent surgery at the Technical University of Munich between 1995 and 2001 and had fresh frozen samples stored in a tissue bank. All 55 had stage II disease; 26 developed disease recurrence (median survival 47 months) and 29 remained recurrence-free (median survival 82 months). None of the patients received chemotherapy or radiotherapy. Clinico-pathologic variables of both cohorts are summarised as part of Table 2.
Table 2: Clinico-pathologic parameters and their association with the GPS expression and Ki-67 PI

| Parameter | Category | Patients, cohort A | Patients, cohort B | GPS, cohort A (p-value§) | GPS, cohort B (p-value§) | Ki-67 PI*, mean ± SD | Ki-67 PI* (p-value§) |
|---|---|---|---|---|---|---|---|
| Age¶ | < Mean | 34 | 31 | 1 | 0.79 | 74.4±17.9 | 0.6 |
| | > Mean | 39 | 24 | | | 77.9±17.3 | |
| Sex | Male | 35 | 33 | 0.16 | 1 | 77.3±15.3 | 1 |
| | Female | 38 | 22 | | | 75.3±19.5 | |
| Site£ | Right side | 30 | 12 | 1 | 0.2 | 80.4±13.3 | 0.2 |
| | Left side | 43 | 43 | | | 73.1±19.7 | |
| Grade | Well | 9 | 0 | 0.22 | 0.2 | 75.6±18.1 | 0.98 |
| | Moderate | 50 | 33 | | | 73.9±18.9 | |
| | Poor | 14 | 22 | | | 84.3±9.3 | |
| Dukes stage | A | 10 | 0 | 0.006 | NA | 78.8±17.3 | 0.73 |
| | B | 27 | 55 | | | 75.7±18.4 | |
| | C | 28 | 0 | | | 76±16.1 | |
| | D | 8 | 0 | | | 75.9±22 | |
| T stage | T1 | 5 | 0 | 0.16 | 0.62 | 71.3±22.4 | 0.16 |
| | T2 | 11 | 11 | | | 85.4±7.4 | |
| | T3 | 50 | 41 | | | 76±17 | |
| | T4 | 7 | 3 | | | 66.2±26.3 | |
| N stage | N0 | 38 | 55 | 0.03 | NA | 76.5±17.9 | 1 |
| | N1+N2 | 35 | 0 | | | 76±17.4 | |
| Vascular invasion | Yes | 5 | 1 | 0.67 | NA | 54.4±31.5 | 0.32 |
| | No | 68 | 54 | | | 78±15 | |
| Lymphatic invasion | Yes | 32 | 5 | 0.06 | 0.35 | 76.5±18.3 | 0.6 |
| | No | 41 | 50 | | | 75.1±17.3 | |
| Lymphocyte infiltration | Mild | 35 | 15 | 0.89 | 1 | 75±18.6 | 0.85 |
| | Moderate | 27 | 25 | | | 79.4±16.5 | |
| | Prominent | 11 | 15 | | | 73.5±18.3 | |
| Margin | Infiltrative | 45 | NA | 0.47 | NA | 75.8±18.9 | 1 |
| | Expansive | 28 | NA | | | 77.1±15.7 | |
| Recurrence | Yes | 32 | 26 | 0.03 | <0.001 | 75.6±19 | 0.79 |
| | No | 41 | 29 | | | 76.8±16.2 | |
| Total | | 73 | 55 | | | 76.3±17.5 | |

§ A Fisher's Exact Test or Kruskal-Wallis Test was used for testing association between clinico-pathologic parameters and GPS expression or Ki-67 PI, as appropriate.
* Ki-67 immunostaining was performed on tumour sections from cohort A patients.
£ Proximal and distal to splenic flexure, respectively.
¶ Average age 68 and 63 years for cohort A and B patients, respectively.
NA: not applicable.
EXAMPLE 3: Array preparation and gene expression analysis
Cohort A tumours and cell lines: Tissue samples and cell lines were homogenised and RNA was extracted using Tri-Reagent (Progenz, Auckland, NZ). The RNA was then purified using an RNeasy mini column (Qiagen, Victoria, Australia) according to the manufacturer's protocol. Ten micrograms of total RNA extracted from each culture or tumour sample was oligo-dT primed and cDNA synthesis was carried out in the presence of aa-dUTP and Superscript II RNase H-Reverse Transcriptase (Invitrogen). Cy dyes were incorporated into cDNA using the indirect amino-allyl cDNA labelling method. cDNA derived from a pool of 12 different cell lines was used as the reference for all hybridizations. The Cy5-dUTP-tagged cDNA from an individual colorectal cell line or tissue sample was combined with Cy3-dUTP-tagged cDNA from the reference sample. The mixture was then purified using a QiaQuick PCR purification Kit (Qiagen, Victoria, Australia) and co-hybridized to a microarray spotted with the MWG 30K Oligo Set (MWG Biotech, NC). cDNA samples from the second culturing experiment were additionally analysed on microarrays using reverse labelling.
Arrays were scanned with a GenePix 4000B Microarray Scanner and data were analysed using GenePix Pro 4.1 Microarray Acquisition and Analysis Software (Axon, CA). The foreground intensities from each channel were log2 transformed and normalised using the SNOMAD software (35). Normalised values were collated and filtered using BRB-Array Tools Version 3.2 (developed by Dr. Richard Simon and Amy Peng Lam, Biometric Research Branch, National Cancer Institute). Low intensity genes, and genes for which over 20% of measurements across tissue samples or cell lines were missing, were excluded from further analysis.
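Two of the steps just described, the log2 transformation of the two channels and the exclusion of genes with more than 20% missing measurements, can be sketched as follows. This is an illustrative sketch only: it does not reproduce the SNOMAD normalisation or the BRB-Array Tools filters, and the function names, gene names and intensity values are hypothetical.

```python
import math


def log2_ratio(cy5, cy3):
    """log2(sample / reference) for one spot; None if either intensity is unusable."""
    if cy5 is None or cy3 is None or cy5 <= 0 or cy3 <= 0:
        return None
    return math.log2(cy5) - math.log2(cy3)


def filter_genes(matrix, max_missing_fraction=0.20):
    """Keep genes whose fraction of missing values does not exceed the limit."""
    kept = {}
    for gene, values in matrix.items():
        missing = sum(v is None for v in values)
        if missing / len(values) <= max_missing_fraction:
            kept[gene] = values
    return kept


if __name__ == "__main__":
    # Hypothetical foreground intensities (Cy5 sample, Cy3 reference) across five arrays.
    spots = {
        "gene_1": [(800, 200), (640, 160), (900, 300), (None, 250), (500, 125)],
        "gene_2": [(None, 200), (0, 160), (400, 300), (350, 250), (500, 125)],
    }
    ratios = {g: [log2_ratio(s, r) for s, r in pairs] for g, pairs in spots.items()}
    print(sorted(filter_genes(ratios)))
```

In this toy input gene_1 has one missing value in five (20%) and is retained, whereas gene_2 has two (40%) and is excluded.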
Cohort B tumours: Total RNA was extracted from each tumour using RNeasy Mini Kit and purified on RNeasy Columns (Qiagen, Hilden, Germany). Ten micrograms of total RNA was used to synthesize double-stranded cDNA with Superscript II reverse transcriptase (GIBCO-Invitrogen, NY) and an oligo-dT-T7 primer (Eurogentec, Koeln, Germany). Biotinylated cRNA was synthesized from the double-stranded cDNA using the Promega RiboMax T7-kit (Promega, Madison, WI) and Biotin-NTP labelling mix (Loxo, Dossenheim, Germany). Then, the biotinylated cRNA was purified and fragmented. The fragmented cRNA was hybridized to Affymetrix HGU133A GeneChips (Affymetrix, Santa Clara, CA) and stained with streptavidin-phycoerythrin. The arrays were then scanned with an HP argon-ion laser confocal microscope and the digitized image data were processed using the Affymetrix® Microarray Suite 5.0 Software. All Affymetrix U133A GeneChips passed quality control to eliminate scans with abnormal characteristics. Background correction and normalization were performed in the R computing environment using the robust multi-array average function implemented in the Bioconductor package affy.
EXAMPLE 4: Quantitative real-time PCR (QPCR)
The expression of eleven genes (MAD2L1, POLE2, CDC2, MCM6, MCM7, RNASEH2A, TOPK, KPNA2, G22P1, PCNA, and GMNN) was validated using the cDNA from the cell cultures. Total RNA (2 μg) was reverse transcribed using SuperScript II RNase H-Reverse Transcriptase kit (Invitrogen) and oligo dT primer (Invitrogen). QPCR was performed on an ABI Prism 7900HT Sequence Detection System (Applied Biosystems) using Taqman Gene Expression Assays (Applied Biosystems). Relative fold changes were calculated using the 2^-ΔΔCT method (36) with Topoisomerase 3A as the internal control. Reference RNA was used as the calibrator to enable comparison between different experiments.
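The relative fold change calculation can be sketched in Python as follows. This is a minimal illustration of the 2^-ΔΔCT arithmetic, with the internal control playing the role of Topoisomerase 3A and the calibrator playing the role of the reference RNA; the function name, argument names and CT values are hypothetical.

```python
def relative_fold_change(ct_target_sample, ct_control_sample,
                         ct_target_calibrator, ct_control_calibrator):
    """Relative expression of a target gene by the 2^-ΔΔCT method."""
    # ΔCT: target gene CT minus internal control CT, within each RNA source.
    delta_ct_sample = ct_target_sample - ct_control_sample
    delta_ct_calibrator = ct_target_calibrator - ct_control_calibrator
    # ΔΔCT: sample ΔCT minus calibrator ΔCT.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)

# Hypothetical CT values: the target crosses threshold 3 cycles earlier
# in the sample than in the calibrator, after control adjustment.
fold = relative_fold_change(
    ct_target_sample=24.0, ct_control_sample=26.0,
    ct_target_calibrator=27.0, ct_control_calibrator=26.0,
)
```

With these values ΔΔCT is -3, so the target is eight-fold more abundant in the sample than in the calibrator.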
EXAMPLE 5: Immunohistochemical analysis
Immunohistochemical expression of Ki-67 antigen (MIB-1; DakoCytomation, Denmark) was investigated on 4 μm sections of 73 paraffin-embedded primary colorectal tumours from Cohort A. Endogenous peroxidase activity was blocked with 0.3% hydrogen peroxide in methanol and antigens were retrieved in boiling citrate buffer (pH 6). Nonspecific binding sites were blocked with 5% normal goat serum containing 1% BSA. Primary antibody (dilution 1:50) was detected using the EnVision system (Dako EnVision, CA) and the DAB substrate kit (Vector laboratories, CA). Five high-power fields were selected using a 10 x 10 microscope grid and cell counts were performed manually in a blind fashion without knowledge of the clinico-pathologic data. The Ki-67 proliferation index (PI) was presented as the percentage of positively stained nuclei for each tumour.
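As a small illustration of the index calculation, the Python sketch below computes the percentage of positively stained nuclei from per-field counts. The text does not state how counts from the five fields were combined, so pooling the counts across fields is an assumption, as are the function name and the counts themselves.

```python
def proliferation_index(fields):
    """Percentage of positively stained nuclei, pooled over counted fields.

    fields is a list of (positive_nuclei, total_nuclei) pairs, one per
    high-power field. Pooling across fields is an assumption of this sketch.
    """
    positive = sum(p for p, _ in fields)
    total = sum(t for _, t in fields)
    if total == 0:
        raise ValueError("no nuclei counted")
    return 100.0 * positive / total

# Hypothetical counts from five high-power fields of one tumour.
counts = [(80, 100), (70, 100), (90, 100), (60, 100), (75, 100)]
ki67_pi = proliferation_index(counts)
```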
EXAMPLE 6: Statistical analysis
Statistical analyses were performed using SPSS® version 14.0.0 (SPSS Inc., Chicago, IL). Ki-67 proliferation indices were presented as mean ± SD. A Fisher's Exact Test or Kruskal-Wallis Test was used to evaluate the differences between categorized groups based on the expression of the GPS or the Ki-67 PI versus the clinico-pathologic parameters. A P value ≤ 0.05 was considered significant. Overall survival (OS) and recurrence-free survival (RFS) were plotted using the method of Kaplan and Meier (37). A log-rank test was used to test for differences in survival time between the categorized groups. Relative risk and associated confidence intervals were also estimated for each variable using the Cox univariate model, and a multivariate Cox proportional hazard model was developed using forward stepwise regression with predictive variables that were significant in the univariate analysis. The K-means clustering method was used to classify clinical samples based on the expression level of the GPS.
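The Kaplan-Meier product-limit estimate used for the survival plots can be sketched in plain Python. This is an illustrative implementation, not the SPSS routine used in the study; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up time per patient (e.g. months).
    events: 1 if the event (death or recurrence) was observed, 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= leaving
    return curve

# Hypothetical follow-up for six patients (months, event indicator).
times = [6, 12, 12, 30, 60, 60]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
```

Censored patients contribute to the risk set up to their last follow-up time but never trigger a drop in the curve.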
EXAMPLE 7: Identification of a gene proliferation signature (GPS) using a colorectal cell line model
An overview of the approach used to derive and apply a gene proliferation signature (GPS) is summarised in FIG. 1. The GPS, including 38 mitotic cell cycle genes (Table C), was relatively over-expressed in cycling cells in semi-confluent cultures. Low proliferation, defined by low GPS expression, was associated with unfavourable clinico-pathologic variables and with shorter overall and recurrence-free survival (p<0.05). No association was found between Ki-67 proliferation index and clinico-pathologic variables or clinical outcome.
Table C: GCPMs for cell proliferation signature
Figure imgf000095_0001
Figure imgf000096_0001
Figure imgf000097_0001
The GPS was identified as a subset of genes whose expression correlates with CRC cell proliferation rate. Significance Analysis of Microarrays (SAM; 38) was used to identify genes differentially expressed (DE) between exponentially growing (semi-confluent) and non-cycling (fully-confluent) CRC cell lines (FIG. 1, stage 1). To adjust for gene-specific dye bias and other sources of variation, each culture set was analysed independently. Analyses were limited to 502 DE genes for which a significant expression difference was observed between the two growth stages in both sets of cultures (false discovery rate < 1%). Gene Ontology (GO) analysis was carried out using EASE (39) to identify the biological process categories that were significantly reflected in the DE genes. Cell-proliferation related categories were over-represented mainly due to genes upregulated in exponentially growing cells. The mitotic cell cycle category (GO:0000278) was defined as the GPS because (i) this biological process was the most over-represented GO term (EASE score=5.5211); and (ii) all 38 mitotic cell cycle genes (Table C) were expressed at higher levels in rapidly growing compared to growth-inhibited cells. The expression of eleven genes from the GPS was assessed by QPCR and correlated with corresponding values obtained from the array data. QPCR therefore confirmed that elevated expression of the proliferation signature genes correlates with increased proliferation in CRC cell lines (FIG. 5).
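The idea of testing a GO category for over-representation in a gene list can be illustrated with a hypergeometric upper-tail probability. The Python sketch below is a plain hypergeometric test and is not the EASE score itself, which is a more conservative variant; the function name and the toy numbers are assumptions.

```python
from math import comb

def hypergeom_upper_tail(list_hits, list_size, pop_hits, pop_size):
    """P(X >= list_hits) when list_size genes are drawn from pop_size genes,
    pop_hits of which belong to the category of interest."""
    total = comb(pop_size, list_size)
    upper = min(list_size, pop_hits)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / total

# Toy numbers: 20 genes on the array, 5 in the category, a DE list of
# 4 genes of which 3 fall in the category.
p_value = hypergeom_upper_tail(list_hits=3, list_size=4, pop_hits=5, pop_size=20)
```

A small probability indicates that the category appears in the DE list more often than expected by chance given its frequency on the array.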
EXAMPLE 8: Classification of CRC samples according to the expression level of gene proliferation signature
In order to examine the relative proliferation state of CRC tumours and the utility of the GPS for clinical application, CRC tumours from two cohorts were stratified into two clusters based on the expression of GPS (FIG. 1, stage 2). Expression values of the 38 genes defining the GPS were first obtained from the microarray-generated expression profiles of tumours. Tumours from each cohort were then separately classified into two clusters (K=2) based on their GPS expression level similarities using K-means unsupervised clustering. Analysis of DE genes between two defined clusters using all filtered genes revealed that the GPS was contained within the list of genes upregulated in cluster 1 (FIG. 2A, upper panel) relative to cluster 2 (lower panel) in both cohorts. Thus, the tumours in cluster 1 are characterised by high GPS expression, while the tumours in cluster 2 are characterised by low GPS expression.
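The two-cluster stratification can be sketched with a small K-means routine in Python. This illustration uses squared Euclidean distance and a deterministic initialisation, both of which are assumptions of the sketch rather than the settings used in the study; the expression vectors are hypothetical and use three genes instead of 38.

```python
def kmeans_two_clusters(samples, iterations=100):
    """Lloyd's algorithm with K=2 and squared Euclidean distance.

    samples: list of equal-length expression vectors, one per tumour.
    Centroids are seeded from the samples with the lowest and highest mean
    expression (an arbitrary, deterministic choice for this sketch).
    Returns one label per sample: 0 for the cluster seeded from the lowest
    sample, 1 for the cluster seeded from the highest.
    """
    def mean(v):
        return sum(v) / len(v)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    ordered = sorted(samples, key=mean)
    centroids = [list(ordered[0]), list(ordered[-1])]
    labels = None
    for _ in range(iterations):
        new_labels = [
            0 if sqdist(s, centroids[0]) <= sqdist(s, centroids[1]) else 1
            for s in samples
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in (0, 1):
            members = [s for s, lab in zip(samples, labels) if lab == k]
            if members:
                centroids[k] = [mean(col) for col in zip(*members)]
    return labels

# Hypothetical log expression of three signature genes in four tumours.
tumours = [[2.1, 1.9, 2.3], [0.2, 0.1, 0.4], [1.8, 2.2, 2.0], [0.3, 0.5, 0.1]]
labels = kmeans_two_clusters(tumours)
```

Here label 1 corresponds to the high GPS group (cluster 1 in the text) and label 0 to the low GPS group (cluster 2 in the text).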
EXAMPLE 9: Low gene proliferation signature is associated with unfavourable clinico-pathologic variables
Table 2 summarises the association between GPS expression levels and clinico-pathologic variables. An association was observed between low proliferation activity, defined by low GPS expression, and an increased risk of recurrence in both cohorts (P=0.03 and <0.001 for Cohort A and B, respectively). In Cohort A, low GPS expression was also associated with a higher disease stage and lymph node metastasis (P=0.006 and 0.03 respectively). In addition, tumours with lymphatic invasion from Cohort A tended to be less proliferative than tumours without lymphatic invasion, albeit without reaching statistical significance (P=0.06). No association was found between the GPS expression level and tumour site, age, sex, degree of differentiation, T-stage, vascular invasion, degree of lymphocyte infiltration and tumour margin.
EXAMPLE 10: Gene proliferation signature predicts clinical outcome
To examine the performance of the GPS in predicting patient outcome, Kaplan-Meier survival analysis was used to compare RFS and OS between low and high GPS tumours (FIG. 3). All patients were censored at 60 months post-operation. In colorectal cancer Cohort A, OS and RFS were shorter in patients with low GPS expression (log-rank test P=0.04 and 0.01, respectively). In colorectal cancer Cohort B, low GPS expression was also associated with decreased OS (P=0.0004) and RFS (P=0.0002). When the parameters predicting OS and RFS in univariate analysis were investigated in a multivariate model, disease stage was the only independent predictor of 5-year OS, while disease stage and T-stage were independent predictors of RFS in Cohort A. In Cohort B, low GPS expression and lymphatic invasion showed an independent contribution to both OS and RFS. When survival analysis was limited to Cohort B patients without lymphatic invasion, low GPS was still associated with shorter OS and RFS, confirming the independence of the GPS as a predictor. Analyses of single and multiple-variable associations with survival are summarized in Table 3.
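Censoring all patients at 60 months can be illustrated with the Python sketch below, which truncates follow-up at a fixed horizon before survival analysis. The function name and data are hypothetical, and treating an event that occurs exactly at 60 months as observed is an assumption of this sketch.

```python
def censor_at(times, events, horizon=60):
    """Truncate follow-up at `horizon` months.

    Patients followed beyond the horizon are recorded as censored at the
    horizon, so events occurring later do not count. Events at or before
    the horizon are kept unchanged.
    """
    censored_times, censored_events = [], []
    for t, e in zip(times, events):
        if t > horizon:
            censored_times.append(horizon)
            censored_events.append(0)
        else:
            censored_times.append(t)
            censored_events.append(e)
    return censored_times, censored_events

# Hypothetical follow-up (months) and event indicators for four patients.
times_60, events_60 = censor_at([14, 72, 60, 95], [1, 1, 1, 0])
```

The second patient had an event at 72 months, which is after the horizon, and is therefore treated as event-free at 60 months.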
Low GPS expression was also associated with decreased 5-year overall survival in patients with gastric cancer (p=0.008). A Kaplan-Meier survival plot comparing the overall survival of low and high GPS gastric tumours is shown in Fig. 4.
Table 3: Uni- and multivariate analysis of prognostic factors for OS and RFS in both cohorts
Figure imgf000100_0001
EXAMPLE 11: Ki-67 is not associated with clinico-pathologic variables or survival
Ki-67 immunostaining was performed on tissue sections from Cohort A tumours only, as paraffin-embedded samples were unavailable for Cohort B (FIG. 1, stage 3). Nuclear staining was detected in all 73 CRC tumours. Ki-67 PI ranged from 25 to 96%, with a mean value of 76.3±17.5. Using the mean Ki-67 value as a cut-off point, tumours were assigned to two groups with low or high PI. Ki-67 PI was associated with neither clinico-pathologic variables (Table 2) nor survival (FIG. 3). When the survival analysis was limited to the patients with the highest and lowest Ki-67 values, no statistical difference was observed (data not shown). Taken together, these results indicate that low expression of growth-related genes is associated with poor outcome in colorectal cancer, and that Ki-67 was not sensitive enough to detect an association. These findings can be used as additional criteria for identifying patients at high risk of early death from cancer.
EXAMPLE 12: Selection of correlated cell proliferation genes
Cohort B (55 German CRC patients; Table 2) were first classified into low and high proliferation groups using the 38 gene cell proliferation signature (Table C) and the K-means clustering method (Pearson uncentered, 1000 permutations, threshold of occurrence in the same cluster set at 80%). Significance Analysis of Microarrays (SAM) was then applied to identify differentially expressed genes between low and high proliferation groups (FDR=0) when all filtered genes (16041 genes) were included in the analysis. 754 genes were found to be over-expressed in the high proliferation group. The GATHER gene ontology program was then used to identify the most over-represented gene ontology categories within the list of differentially expressed genes. The cell cycle category was the most over-represented category within this list. 102 cell cycle genes which are differentially expressed between the low and high proliferation groups (in addition to the original 38 gene signature) are shown in Table D.
Table D: Cell Cycle Genes that are Differentially Expressed in Low and High Proliferation
Figure imgf000101_0001
Figure imgf000102_0001
Figure imgf000103_0001
Figure imgf000104_0001
Figure imgf000105_0001
Conclusions
The present invention is the first to report an association between a gene proliferation signature and major clinico-pathologic variables as well as outcome in colorectal cancer. The disclosed study investigated the proliferation state of tumours using an in vitro-derived multi-gene proliferation signature and by Ki-67 immunostaining. According to the results herein, low expression of the GPS in tumours was associated with a higher risk of recurrence and shorter survival in two independent cohorts of patients. In contrast, Ki-67 proliferation index was not associated with any clinically relevant endpoints.
The colorectal GPS encompasses 38 mitotic cell cycle genes and includes a core set of genes (CDC2, RFC4, PCNA, CCNE1, CDK7, MCM genes, FEN1, MAD2L1, MYBL2, RRM2 and BUB3) that are part of proliferation signatures defined for tumours of the breast (40), (41), ovary (42), liver (43), acute lymphoblastic leukaemia (44), neuroblastoma (45), lung squamous cell carcinoma (46), head and neck (47), prostate (48), and stomach (49). This represents a conserved pattern of expression, as most of these genes have been found to be highly overexpressed in fast-growing tumours and to reflect a high proportion of rapidly cycling cells (50). Therefore, the expression level of the colorectal GPS provides a measure for the proliferative state of a tumour.
In this study, several clinico-pathologic variables related to poor outcome (disease stage, lymph node metastasis and lymphatic invasion) were associated with low GPS expression in Cohort A patients. In Cohort B, consisting entirely of stage II tumours, the study assessed the association between the GPS and lymphatic invasion. The association failed to reach statistical significance due to the small number of tumours with lymphatic invasion in this cohort (5/55). Without being bound by theory, the low GPS expression in more advanced tumours may indicate that CRC progression is not driven by enhanced proliferation. While accelerated proliferation may still be an important driving force during the initial phases of tumourigenesis, it is possible that more advanced disease is more dependent on processes such as genetic instability to allow continuous selection. Consistent with our finding, two large-scale studies reported an association between decreased expression of CDK2, cyclin E and A, and advanced stage, deep infiltration and lymph node metastasis (51), (52).
The relationship between low GPS and unfavourable clinico-pathologic variables suggested that the GPS should also predict patient outcome. Indeed, in both Cohort A and B, low GPS expression was associated with a higher risk of recurrence and shorter overall and recurrence-free survival. In Cohort B, where all patients had stage II tumours, the association remained in multivariate analysis. However, in Cohort A, where patients had stage I-IV disease, the association was not independent of tumour stage. The number of patients with and without recurrence, within each stage of disease in Cohort A, was probably insufficient to demonstrate an independent association between the GPS and survival. In Cohort B, low GPS expression and lymphatic invasion remained independent predictors in multivariate analysis, suggesting that the GPS may improve the prediction of CRC patient outcome within the same disease stage. Not surprisingly, the presence of lymph node and distant organ involvement were the most powerful predictors of outcome as these are direct manifestations of tumour metastasis.
Treatment with radiotherapy or chemotherapy, used in 18% and 27% of Cohort A patients respectively, was a possible confounding factor in this study. Theoretically, the improved survival associated with elevated GPS expression might reflect the better response of fast proliferating tumours to cancer treatment (53), (54). However, no correlation was found between treatment and GPS expression. Furthermore, no patients in Cohort B received adjuvant therapy indicating that the association between GPS and survival is independent of treatment. It should be noted that this study was not designed to investigate the relationship between tumour proliferation and response to chemotherapy or radiotherapy.
The sample size may also explain the lack of an association between clinico-pathologic variables and survival with Ki-67 PI in the present study. As mentioned above, other studies on Ki-67 and CRC outcome have reported inconsistent findings. However, in the three other CRC studies with the largest sample size a low Ki-67 PI was associated with a worse prognosis (27), (29), (30). We came to the same conclusion applying the GPS, but based on a much smaller sample size. The multi-gene expression analysis was therefore a more sensitive tool to assess the relationship between proliferation and prognosis than the Ki-67 PI.
The biological reason behind an unfavourable prognosis in tumours with a low GPS will require further investigation. Mechanisms that could potentially contribute to worse clinical outcome in low GPS tumours include: (i) a more effective immune response to rapidly proliferating tumours; (ii) a higher level of genetic damage that may render cancer cells more resistant to apoptosis, and increase invasiveness, but also perturb smooth replication machinery; (iii) an increased number of cancer stem cells that divide slowly, similar to normal stem cells, but have a high metastatic potential; and (iv) a higher proportion of microsatellite unstable tumours which have a high proliferation rate but a relatively good prognosis.
In sum, the present invention has clarified the previous, conflicting results relating to the prognostic role of cell proliferation in colorectal cancer. A GPS has been developed using CRC cell lines and has been applied to two independent patient cohorts. It was found that low expression of growth-related genes in CRC was associated with more advanced tumour stage (Cohort A) and poor clinical outcome within the same stage (Cohort B). Multi-gene expression analysis was shown to be a more powerful indicator than the long-established proliferation marker, Ki-67, for predicting outcome. For future studies, it will be useful to determine the reasons that CRC differs from other common epithelial cancers, such as breast and lung cancers (e.g., in reference to Ki-67). This will likely provide insights into important underlying biological mechanisms. From a practical viewpoint, the ability to stratify recurrence risk within a given pathological stage could enable adjuvant therapy to be targeted more accurately. Thus, GPS expression can be used as an adjunct to conventional staging for identifying patients at high risk of recurrence and death from colorectal cancer.
All publications and patents mentioned in the above specification are herein incorporated by reference.
Where in the foregoing description reference has been made to integers or components having known equivalents, such equivalents are herein incorporated as if individually set forth.
Although the invention has been described by way of example and with reference to possible embodiments thereof, it is to be appreciated that improvements and/or modifications may be made without departing from the scope or the spirit thereof.
References:
1. Evan GI, Vousden KH: Proliferation, cell cycle and apoptosis in cancer. Nature 411:342-8, 2001
2. Whitfield ML, George LK, Grant GD, et al: Common markers of proliferation. Nat Rev Cancer 6:99-106, 2006
3. Rew DA, Wilson GD: Cell production rates in human tissues and tumours and their significance. Part 1: an introduction to the techniques of measurement and their limitations. Eur J Surg Oncol 26:227-38, 2000
4. Endl E, Gerdes J: The Ki-67 protein: fascinating forms and an unknown function. Exp Cell Res 257:231-7, 2000
5. Brown DC, Gatter KC: Ki67 protein: The immaculate deception. Histopathology 40:2-11, 2002
6. Paik S, Shak S, Tang G, et al: A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 351:2817-26, 2004
7. Ofner D, Grothaus A, Riedmann B, et al: MIB1 in colorectal carcinomas: its evaluation by three different methods reveals lack of prognostic significance. Anal Cell Pathol 12:61-70, 1996
8. Ihmann T, Liu J, Schwabe W, et al: High-level mRNA quantification of proliferation marker pKi-67 is correlated with favorable prognosis in colorectal carcinoma. J Cancer Res Clin Oncol 130:749-756, 2004
9. Van Oijen MG, Medema RH, Slootweg PJ, et al: Positivity of the proliferation marker pKi-67 in non-cycling cells. Am J Clin Pathol 110:24-31, 1998
10. Duchrow M, Ziemann T, Windhövel U, et al: Colorectal carcinomas with high MIB-1 labelling indices but low pKi67 mRNA levels correlate with better prognostic outcome. Histopathology 42:566-574, 2003
11. Evans C, Morrison I, Heriot AG, et al: The correlation between colorectal cancer rates of proliferation and apoptosis and systemic cytokine levels; plus their influence upon survival. Br J Cancer 94:1412-9, 2006
12. Rosati G, Chiacchio R, Reggiardo G, et al: Thymidylate synthase expression, p53, bcl-2, Ki-67 and p27 in colorectal cancer: relationships with tumour recurrence and survival. Tumour Biol 25:258-63, 2004
13. Ishida H, Miwa H, Tatsuta M, et al: Ki-67 and CEA expression as prognostic markers in Dukes' C colorectal cancer. Cancer Lett 207:109-115, 2004
14. Buglioni S, D'Agnano I, Cosimelli M, et al: Evaluation of multiple bio-pathological factors in colorectal adenocarcinomas: independent prognostic role of p53 and bcl-2. Int J Cancer 84:545-52, 1999
15. Guerra A, Borda F, Javier Jimenez F, et al: Multivariate analysis of prognostic factors in resected colorectal cancer: a new prognostic index. Eur J Gastroenterol Hepatol 10:51-8, 1998
16. Kyzer S, Gordon PH: Determination of proliferative activity in colorectal carcinoma using monoclonal antibody Ki67. Dis Colon Rectum 40:322-5, 1997
17. Jansson A, Sun XF: Ki-67 expression in relation to clinicopathological variables and prognosis in colorectal adenocarcinomas. APMIS 105:730-4, 1997
18. Baretton GB, Diebold J, Christoforis G, et al: Apoptosis and immunohistochemical bcl-2 expression in colorectal adenomas and carcinomas. Aspects of carcinogenesis and prognostic significance. Cancer 77:255-64, 1996
19. Sun XF, Carstensen JM, Stal O, et al: Proliferating cell nuclear antigen (PCNA) in relation to ras, c-erbB-2, p53, clinico-pathological variables and prognosis in colorectal adenocarcinoma. Int J Cancer 69:5-8, 1996
20. Kubota Y, Petras RE, Easley KA, et al: Ki-67-determined growth fraction versus standard staging and grading parameters in colorectal carcinoma. A multivariate analysis. Cancer 70:2602-9, 1992
21. Valera V, Yokoyama N, Walter B, et al: Clinical significance of Ki-67 proliferation index in disease progression and prognosis of patients with resected colorectal carcinoma. Br J Surg 92:1002-7, 2005
22. Dziegiel P, Forgacz J, Suder E, et al: Prognostic significance of metallothionein expression in correlation with Ki-67 expression in adenocarcinomas of large intestine. Histol Histopathol 18:401-7, 2003
23. Scopa CD, Tsamandas AC, Zolata V, et al: Potential role of bcl-2 and Ki-67 expression and apoptosis in colorectal carcinoma: a clinicopathologic study. Dig Dis Sci 48:1990-7, 2003
24. Bhatavdekar JM, Patel DD, Chikhlikar PR, et al: Molecular markers are predictors of recurrence and survival in patients with Dukes B and Dukes C colorectal adenocarcinoma. Dis Colon Rectum 44:523-33, 2001
25. Chen YT, Henk MJ, Carney KJ, et al: Prognostic Significance of Tumor Markers in Colorectal Cancer Patients: DNA Index, S-Phase Fraction, p53 Expression, and Ki-67 Index. J Gastrointest Surg 1:266-273, 1997
26. Choi HJ, Jung IK, Kim SS, et al: Proliferating cell nuclear antigen expression and its relationship to malignancy potential in invasive colorectal carcinomas. Dis Colon Rectum 40:51-9, 1997
27. Hilska M, Collan YU, O Laine VJ, et al: The significance of tumour markers for proliferation and apoptosis in predicting survival in colorectal cancer. Dis Colon Rectum 48:2197-208, 2005
28. Salminen E, Palmu S, Vahlberg T, et al: Increased proliferation activity measured by immunoreactive Ki67 is associated with survival improvement in rectal/recto sigmoid cancer. World J Gastroenterol 11:3245-9, 2005
29. Garrity MM, Burgart LJ, Mahoney MR, et al: Prognostic value of proliferation, apoptosis, defective DNA mismatch repair, and p53 overexpression in patients with resected Dukes' B2 or C colon cancer: a North Central Cancer Treatment Group Study. J Clin Oncol 22:1572-82, 2004
30. Allegra CJ, Paik S, Colangelo LH, et al: Prognostic value of thymidylate synthase, Ki-67, and p53 in patients with Dukes' B and C colon cancer: a National Cancer Institute-National Surgical Adjuvant Breast and Bowel Project collaborative study. J Clin Oncol 21:241-50, 2003
31. Palmqvist R, Sellberg P, Oberg A, et al: Low tumour cell proliferation at the invasive margin is associated with a poor prognosis in Dukes' stage B colorectal cancers. Br J Cancer 79:577-81, 1999
32. Paradiso A, Rabinovich M, Vallejo C, et al: p53 and PCNA expression in advanced colorectal cancer: response to chemotherapy and long-term prognosis. Int J Cancer 69:437-41, 1996
33. Neoptolemos JP, Oates GD, Newbold KM, et al: Cyclin/proliferation cell nuclear antigen immunohistochemistry does not improve the prognostic power of Dukes' or Jass' classifications for colorectal cancer. Br J Surg 82:184-7, 1995
34. Compton C, Fenoglio-Preiser CM, Pettigrew N, et al: American joint committee on cancer prognostic factors consensus conference. Colorectal working group. Cancer 88:1739-1757, 2000
35. Colantuoni C, Henry G, Zeger S, et al: SNOMAD (Standardization and Normalization of MicroArray Data): web-accessible gene expression data analysis. Bioinformatics 18:1540-1541, 2002
36. Livak KJ, Schmittgen TD: Analysis of Relative Gene Expression Data Using Real-Time Quantitative PCR and the 2-ΔΔCT Method. METHODS 25:402-408, 2001
37. Pocock SJ, Clayton TC, Altman DG: Survival plots of time-to-event outcomes in clinical trials: good practice and pitfalls. Lancet 359:1686-89, 2002
38. Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA 98:5116-21, 2001
39. Hosack DA, Dennis G, Sherman BT, et al: Identifying biological themes within lists of genes with EASE. Genome biology 4:R70, 2003
40. Perou CM, Jeffrey SS, DE Rijn MV: Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc. Natl. Acad. Sci. USA 96:9212-17, 1999
41. Perou CM: Molecular portraits of human breast tumours. Nature 406:747-752, 2000
42. Welsh JB, Zarrinkar PP, Sapinoso LM, et al: Analysis of gene expression profiles in normal and neoplastic ovarian tissue samples identifies candidate molecular markers of epithelial ovarian cancer. Proc. Natl. Acad. Sci. USA 98:1176-1181, 2001
43. Chen X, Cheung ST, So S, et al: Gene expression patterns in human liver cancers. Mol. Biol. Cell 13:1929-1939, 2002
44. Kirschner-Schwabe R, Lottaz C, Todling J, et al: Expression of late cell cycle genes and an increased proliferative capacity characterize very early relapse of childhood acute lymphoblastic leukemia. Clin Cancer Res 12:4553-61, 2006
45. Krasnoselsky AL, Whiteford CC, Wei JS, et al: Altered expression of cell cycle genes distinguishes aggressive neuroblastoma. Oncogene 24:1533-1541, 2005
46. Inamura K, Fujiwara T, Hoshida Y, et al: Two subclasses of lung squamous cell carcinoma with different gene expression profiles and prognosis identified by hierarchical clustering and non-negative matrix factorization. Oncogene 24:7105-13, 2005
47. Chung CH, Parker JS, Karaca G, et al: Molecular classification of head and neck squamous cell carcinomas using patterns of gene expression. Cancer Cell 5:489-500, 2004
48. LaTulippe E, Satagopan J, Smith A, et al: Comprehensive gene expression analysis of prostate cancer reveals distinct transcriptional programs associated with metastatic disease. Cancer Res 62:4499-4506, 2002
49. Hippo Y, Taniguchi H, Tsutumi S, et al: Global gene expression analysis of gastric cancer by oligonucleotide microarrays. Cancer Res 62:233-40, 2002
50. Whitfield ML, Sherlock G, Saldanha AJ, et al: Identification of genes periodically expressed in the human cell cycle and their expression in tumours. Mol Biol Cell 13:1977-2000, 2002
51. Li JQ, Miki H, Ohmori M, et al: Expression of cyclin E and cyclin-dependent kinase 2 correlates with metastasis and prognosis in colorectal carcinoma. Hum Pathol 32:945-53, 2001
52. Li JQ, Miki H, Wu F, et al: Cyclin A correlates with carcinogenesis and metastasis, and p27 (kip1) correlates with lymphatic invasion, in colorectal neoplasms. Hum Pathol 33:1006-15, 2002
53. Itamochi H, Kigawa J, Sugiyama T, et al: Low proliferation activity may be associated with chemoresistance in clear cell carcinoma of the ovary. Obstet Gynecol 100:281-287, 2002
54. Imdahl A, Jenkner J, Ihling C, et al: Is MIB-1 proliferation index a predictor for response to neoadjuvant therapy in patients with esophageal cancer? Am J Surg 179:514-520, 2000

Claims

1. A prognostic signature for determining progression of gastrointestinal cancer in a patient, comprising one or more genes selected from Table A, Table B, Table C or Table D.
2. The signature of claim 1, wherein the signature comprises one or more genes selected from any one of CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37.
3. A method of predicting the likelihood of long-term survival of a gastrointestinal cancer patient without the recurrence of gastrointestinal cancer, comprising determining the expression level of one or more prognostic RNA transcripts or their expression products in a gastrointestinal sample obtained from the patient, normalized against the expression level of all RNA transcripts or their products in the gastrointestinal cancer tissue sample, or of a reference set of RNA transcripts or their expression products; wherein the prognostic RNA transcript is the transcript of one or more genes selected from Table A, Table B, Table C or Table D; and establishing likelihood of long-term survival without gastrointestinal cancer recurrence.
4. The method of claim 3, wherein at least one prognostic RNA transcript or its expression product is selected from any one of CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37.
5. The method of claim 3 or claim 4 comprising determining the expression level of at least two, at least five, at least 10, or at least 15 of the prognostic RNA transcripts or their expression products.
6. The method according to any one of claims 3 to 5, wherein increased expression of the one or more prognostic RNA transcripts or their expression products indicates an increased likelihood of long-term survival without gastrointestinal cancer recurrence.
7. The method according to any one of claims 3 to 5, wherein a predictive model is applied, established by applying a predictive method to expression levels of the predictive signature in recurrent and non-recurrent tumour samples, to establish the likelihood of long-term survival without gastrointestinal cancer recurrence.
8. The method of claim 7, wherein said predictive method is selected from the group consisting of linear models, support vector machines, neural networks, classification and regression trees, ensemble learning methods, discriminant analysis, nearest neighbor method, Bayesian networks, and independent components analysis.
9. The method of any one of claims 3 to 8 wherein the gastrointestinal cancer is gastric cancer or colorectal cancer.
10. The method of any one of claims 3 to 9 wherein the expression level of one or more prognostic RNA transcripts is determined.
11. The method of any one of claims 3 to 10 wherein the RNA is isolated from a fixed, wax-embedded gastrointestinal cancer tissue specimen of the patient.
12. The method of any one of claims 3 to 10 wherein the RNA is isolated from core biopsy tissue or fine needle aspirate cells.
13. An array comprising polynucleotides hybridizing to two or more genes selected from Table A, Table B, Table C or Table D.
14. An array of claim 13 comprising polynucleotides hybridizing to two or more of the following genes: CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37.
15. The array of claim 13 or claim 14 comprising polynucleotides hybridizing to at least 3, at least five, at least 10 or at least 15 of the genes.
16. The array of claim 13 comprising polynucleotides hybridizing to the following genes: CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37.
17. The array of any one of claims 13 to 16 wherein the polynucleotides are cDNAs.
18. The array of claim 17 wherein the cDNAs are about 500 to 5000 bases long.
19. The array of any one of claims 13 to 16 wherein the polynucleotides are oligonucleotides.
20. The array of claim 19 wherein the oligonucleotides are about 20 to 80 bases long.
21. The array of any one of claims 13 to 20 wherein the solid surface is glass.
22. A method of predicting the likelihood of long-term survival of a patient diagnosed with gastrointestinal cancer, without the recurrence of gastrointestinal cancer, comprising the steps of:
(1) determining the expression levels of the RNA transcripts or the expression products of a gene or genes selected from Table A, Table B, Table C or Table D, in a gastrointestinal cancer tissue sample obtained from the patient, normalized against the expression levels of all RNA transcripts or their expression products in the gastrointestinal cancer tissue sample, or of a reference set of RNA transcripts or their products;
(2) subjecting the data obtained in step (1) to statistical analysis; and (3) determining whether the likelihood of the long-term survival has increased or decreased; and establishing the likelihood of long-term survival without gastrointestinal cancer recurrence.
23. The method of claim 22, wherein at least one prognostic RNA transcript or its expression product is selected from any one of CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37.
24. The method of claim 22 or claim 23 wherein the statistical analysis is performed by using the Cox Proportional Hazards model.
25. A method of preparing a personalized genomics profile for a cancer patient, comprising the steps of: (a) subjecting RNA extracted from a gastrointestinal tissue obtained from the patient to gene expression analysis; (b) determining the expression level of one or more genes selected from the gastrointestinal cancer gene set listed in any one of Table A, Table B, Table C or Table D, wherein the expression level is normalized against a control gene or genes and optionally is compared to the amount found in a gastrointestinal cancer reference tissue set; and (c) creating a report summarizing the data obtained by the gene expression analysis.
25. The method of claim 24, wherein the gastrointestinal tissue comprises gastrointestinal cancer cells.
26. The method of claim 24 wherein the gastrointestinal tissue is obtained from a fixed, paraffin-embedded biopsy sample.
27. The method of claim 26 wherein the RNA is fragmented.
28. The method of any one of claims 22 to 27 wherein the report includes a prediction of the likelihood of long-term survival of the patient.
29. The method of any one of claims 22 to 29 wherein the report includes recommendation for a treatment modality of the patient.
30. A prognostic method comprising: (a) subjecting a sample comprising gastrointestinal cancer cells obtained from a patient to quantitative analysis of the levels of RNA transcripts of at least one gene selected from any one of Table A, Table B, Table C or Table D, or its product, and (b) identifying the patient as likely to have an increased likelihood of long-term survival without gastrointestinal cancer recurrence if normalized expression levels of the gene or genes, or their products, are elevated above a defined expression threshold.
31. The method of claim 30, wherein at least one prognostic RNA transcript or its expression product is selected from any one of CDC2, MCM6, RPA3, MCM7, PCNA, G22P1, KPNA2, ANLN, APG7L, TOPK, GMNN, RRM1, CDC45L, MAD2L1, RAN, DUT, RRM2, CDK7, MLH3, SMC4L1, CSPG6, POLD2, POLE2, BCCIP, Pfs2, TREX1, BUB3, FEN1, DRF1, PREI3, CCNE1, RPA1, POLE3, RFC4, MCM3, CHEK1, CCND1, and CDC37.
32. The method of claim 30 or claim 31, wherein the levels of the RNA transcripts of the genes are normalized relative to the mean level of the RNA transcript or the product of two or more housekeeping genes.
33. The method of claim 32 wherein the housekeeping genes are selected from the group consisting of glyceraldehyde-3-phosphate dehydrogenase (GAPDH), Cyp1, albumin, actins, tubulins, cyclophilin, hypoxanthine phosphoribosyltransferase (HRPT), L32, 28S, and 18S.
34. The method of any one of claims 30 to 33 wherein the sample is subjected to global gene expression analysis of all genes present above the limit of detection.
35. The method of any one of claims 30 to 34 wherein the levels of RNA transcripts of the genes are normalized relative to the mean signal of the RNA transcripts or the products of all assayed genes or a subset thereof.
36. The method of any one of claims 30 to 35 wherein the levels of RNA transcripts are determined by quantitative RT-PCR, and the signal is a Ct value.
37. The method of claim 35 wherein the assayed genes include at least 50 or at least 100 cancer related genes.
38. The method of any one of claims 30 to 37 wherein the patient is human.
39. The method of any one of claims 30 to 38 wherein the sample is a fixed, paraffin-embedded tissue (FPET) sample, or fresh or frozen tissue sample.
40. The method of any one of claims 30 to 38 wherein the sample is a tissue sample from fine needle, core, or other types of biopsy.
41. The method of any one of claims 30 to 40 wherein the quantitative analysis is performed by quantitative RT-PCR.
42. The method of any one of claims 30 to 40 wherein the quantitative analysis is performed by quantifying the products of the genes.
43. The method of any one of claims 30 to 40 wherein the products are quantified by immunohistochemistry or by proteomics technology.
44. The method of any one of claims 30 to 43 further comprising the step of preparing a report indicating that the patient has an increased likelihood of long-term survival without gastrointestinal cancer recurrence.
45. A kit comprising one or more of (1) extraction buffer/reagents and protocol; (2) reverse transcription buffer/reagents and protocol; and (3) quantitative RT-PCR buffer/reagents and protocol suitable for performing the method of any one of claims 3, 25, and 30.
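As an informal illustration of the normalization and threshold steps recited in claims 30, 32 and 36, the following minimal Python sketch normalizes quantitative RT-PCR Ct values against the mean Ct of housekeeping genes and compares an averaged signature score with a threshold. It is a sketch under stated assumptions, not the applicant's procedure: the three signature genes are an arbitrary subset of those named in claim 31, the two housekeeping genes are picked from claim 33 for illustration, and the function names, the averaging of genes into a single score, the Ct values and the threshold are all hypothetical.

```python
from statistics import mean

# Hypothetical choices for illustration only: a small subset of the genes
# named in claim 31 and two of the housekeeping genes named in claim 33.
SIGNATURE_GENES = ["CDC2", "MCM6", "PCNA"]
HOUSEKEEPING_GENES = ["GAPDH", "L32"]


def normalized_expression(ct_values):
    """Return per-gene expression relative to the mean housekeeping Ct.

    A lower Ct means more transcript, so each value is computed as
    (mean housekeeping Ct - gene Ct); larger numbers mean higher expression.
    """
    reference_ct = mean(ct_values[g] for g in HOUSEKEEPING_GENES)
    return {g: reference_ct - ct_values[g] for g in SIGNATURE_GENES}


def signature_score(ct_values):
    """Average the normalized expression over the signature genes.

    Averaging is an assumption of this sketch; the claims do not prescribe
    how several genes are combined.
    """
    return mean(normalized_expression(ct_values).values())


def is_above_threshold(ct_values, threshold):
    """True when the signature score exceeds the assumed expression threshold."""
    return signature_score(ct_values) > threshold


# Made-up Ct values for a single sample.
sample_ct = {"CDC2": 24.0, "MCM6": 25.0, "PCNA": 23.0, "GAPDH": 20.0, "L32": 22.0}
score = signature_score(sample_ct)
print(score, is_above_threshold(sample_ct, threshold=-3.5))
```

In practice the threshold would have to be derived from a reference set of tumour samples with known outcome; the value used here is arbitrary.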
PCT/NZ2008/000260 2007-10-05 2008-10-06 Proliferation signature and prognosis for gastrointestinal cancer WO2009045115A1 (en)

Priority Applications (14)

Application Number Priority Date Filing Date Title
CA2739004A CA2739004C (en) 2007-10-05 2008-10-06 Proliferation signatures and prognosis for gastrointestinal cancer
KR1020167011870A KR101982763B1 (en) 2007-10-05 2008-10-06 Proliferation signature and prognosis for gastrointestinal cancer
KR1020187022213A KR20180089565A (en) 2007-10-05 2008-10-06 Proliferation signature and prognosis for gastrointestinal cancer
CN200880119316.5A CN101932724B (en) 2007-10-05 2008-10-06 The hyperplasia label of human primary gastrointestinal cancers and prognosis
JP2010527901A JP5745848B2 (en) 2007-10-05 2008-10-06 Signs of growth and prognosis in gastrointestinal cancer
KR1020107009975A KR101727649B1 (en) 2007-10-05 2008-10-06 Proliferation signature and prognosis for gastrointestinal cancer
KR1020207002358A KR20200015788A (en) 2007-10-05 2008-10-06 Proliferation signature and prognosis for gastrointestinal cancer
EP08835078A EP2215254A4 (en) 2007-10-05 2008-10-06 Proliferation signature and prognosis for gastrointestinal cancer
KR1020227003193A KR20220020404A (en) 2007-10-05 2008-10-06 Proliferation signature and prognosis for gastrointestinal cancer
AU2008307830A AU2008307830A1 (en) 2007-10-05 2008-10-06 Proliferation signature and prognosis for gastrointestinal cancer
KR1020207028269A KR20200118226A (en) 2007-10-05 2008-10-06 Proliferation signature and prognosis for gastrointestinal cancer
US12/754,077 US20110086349A1 (en) 2007-10-05 2010-04-05 Proliferation Signatures and Prognosis for Gastrointestinal Cancer
US15/233,604 US20170088900A1 (en) 2007-10-05 2016-08-10 Test Kits and Uses Thereof
US15/647,608 US20180010198A1 (en) 2007-10-05 2017-07-12 Methods of identifying proliferation signatures for colorectal cancer

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NZ562237A NZ562237A (en) 2007-10-05 2007-10-05 Proliferation signature and prognosis for gastrointestinal cancer
NZ562237 2007-10-05

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/754,077 Continuation US20110086349A1 (en) 2007-10-05 2010-04-05 Proliferation Signatures and Prognosis for Gastrointestinal Cancer

Publications (1)

Publication Number Publication Date
WO2009045115A1 true WO2009045115A1 (en) 2009-04-09

Family

ID=40526417

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NZ2008/000260 WO2009045115A1 (en) 2007-10-05 2008-10-06 Proliferation signature and prognosis for gastrointestinal cancer

Country Status (10)

Country Link
US (3) US20110086349A1 (en)
EP (2) EP2215254A4 (en)
JP (4) JP5745848B2 (en)
KR (6) KR20200118226A (en)
CN (2) CN108753975A (en)
AU (1) AU2008307830A1 (en)
CA (2) CA2739004C (en)
NZ (1) NZ562237A (en)
SG (3) SG185278A1 (en)
WO (1) WO2009045115A1 (en)

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011113819A3 (en) * 2010-03-19 2011-11-10 Immatics Biotechnologies Gmbh Diagnosis and treatment of cancer based on avl9
EP2417271A2 (en) * 2009-04-07 2012-02-15 Genomic Health, Inc. Methods of predicting cancer risk using gene expression in premalignant tissue
WO2012090479A1 (en) * 2010-12-28 2012-07-05 Oncotherapy Science, Inc. Mcm7 as a target gene for cancer therapy and diagnosis
JP2012533071A (en) * 2009-07-16 2012-12-20 エフ.ホフマン−ラ ロシュ アーゲー Flap endonuclease-1 as a cancer marker
US8338109B2 (en) 2006-11-02 2012-12-25 Mayo Foundation For Medical Education And Research Predicting cancer outcome
EP2591126A2 (en) * 2010-07-07 2013-05-15 Myriad Genetics, Inc. Gene signatures for cancer prognosis
US20140122382A1 (en) * 2008-10-15 2014-05-01 Eric A. Elster Bayesian modeling of pre-transplant variables accurately predicts kidney graft survival
US8871451B2 (en) 2006-09-25 2014-10-28 Mayo Foundation For Medical Education And Research Extracellular and membrane-associated prostate cancer markers
CN104395756A (en) * 2012-06-18 2015-03-04 北卡罗莱纳大学查佩尔山分校 Methods for head and neck cancer prognosis
EP2668296A4 (en) * 2011-01-25 2015-09-02 Almac Diagnostics Ltd Colon cancer gene expression signatures and methods of use
EP2715348A4 (en) * 2011-06-02 2015-10-07 Almac Diagnostics Ltd Molecular diagnostic test for cancer
CN105510586A (en) * 2015-12-22 2016-04-20 湖北鹊景生物医学有限公司 Kit for lung cancer diagnosis and use method of kit
EP2982985A4 (en) * 2013-04-05 2016-11-09 Univ Industry Foundation Yonsei University System for predicting prognosis of locally advanced gastric cancer
US9605319B2 (en) 2010-08-30 2017-03-28 Myriad Genetics, Inc. Gene signatures for cancer diagnosis and prognosis
US9976188B2 (en) 2009-01-07 2018-05-22 Myriad Genetics, Inc. Cancer biomarkers
CN108192972A (en) * 2010-10-06 2018-06-22 生物医学研究机构基金会 For the method for the diagnosis of Metastasis in Breast Cancer, prognosis and treatment
CN108841959A (en) * 2018-07-12 2018-11-20 吉林大学 A kind of oral cavity and head-neck malignant tumor neurological susceptibility prediction kit and system
US10308980B2 (en) 2011-11-04 2019-06-04 Oslo Universitetssykehus Hf Methods and biomarkers for analysis of colorectal cancer
US10407731B2 (en) 2008-05-30 2019-09-10 Mayo Foundation For Medical Education And Research Biomarker panels for predicting prostate cancer outcomes
US10513737B2 (en) 2011-12-13 2019-12-24 Decipher Biosciences, Inc. Cancer diagnostics using non-coding transcripts
WO2020223233A1 (en) * 2019-04-30 2020-11-05 Genentech, Inc. Prognostic and therapeutic methods for colorectal cancer
US10865452B2 (en) 2008-05-28 2020-12-15 Decipher Biosciences, Inc. Systems and methods for expression-based discrimination of distinct clinical disease states in prostate cancer
US10876164B2 (en) 2012-11-16 2020-12-29 Myriad Genetics, Inc. Gene signatures for cancer prognosis
US11035005B2 (en) 2012-08-16 2021-06-15 Decipher Biosciences, Inc. Cancer diagnostics using biomarkers
US11078542B2 (en) 2017-05-12 2021-08-03 Decipher Biosciences, Inc. Genetic signatures to predict prostate cancer metastasis and identify tumor aggressiveness
US11091809B2 (en) 2012-12-03 2021-08-17 Almac Diagnostic Services Limited Molecular diagnostic test for cancer
US11174517B2 (en) 2014-05-13 2021-11-16 Myriad Genetics, Inc. Gene signatures for cancer prognosis
US11208697B2 (en) 2017-01-20 2021-12-28 Decipher Biosciences, Inc. Molecular subtyping, prognosis, and treatment of bladder cancer
US11414708B2 (en) 2016-08-24 2022-08-16 Decipher Biosciences, Inc. Use of genomic signatures to predict responsiveness of patients with prostate cancer to post-operative radiation therapy
US11596642B2 (en) 2016-05-25 2023-03-07 Inbiomotion S.L. Therapeutic treatment of breast cancer based on c-MAF status
US11654153B2 (en) 2017-11-22 2023-05-23 Inbiomotion S.L. Therapeutic treatment of breast cancer based on c-MAF status
US11732304B2 (en) 2017-03-14 2023-08-22 Novomics Co., Ltd. System for predicting prognosis and benefit from adjuvant chemotherapy for patients with stage II and III gastric cancer
US11873532B2 (en) 2017-03-09 2024-01-16 Decipher Biosciences, Inc. Subtyping prostate cancer to predict response to hormone therapy

Families Citing this family (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120100999A1 (en) * 2009-04-20 2012-04-26 University Health Network Prognostic gene expression signature for squamous cell carcinoma of the lung
GB201009798D0 (en) 2010-06-11 2010-07-21 Immunovia Ab Method,array and use thereof
DE102010033575B4 (en) * 2010-08-02 2016-01-14 Eberhard-Karls-Universität Tübingen Universitätsklinikum ASPP2 splice variant
JP5976694B2 (en) * 2011-03-11 2016-08-24 エフ.ホフマン−ラ ロシュ アーゲーF. Hoffmann−La Roche Aktiengesellschaft NNMT as a marker for chronic obstructive pulmonary disease (COPD)
CA2854665A1 (en) 2011-11-10 2013-05-16 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Gene expression signatures of neoplasm responsiveness to therapy
GB201206323D0 (en) * 2012-04-10 2012-05-23 Immunovia Ab Methods and arrays for use in the same
WO2014001988A2 (en) * 2012-06-25 2014-01-03 Manuel Gidekel USE OF CTBP1 siRNA FOR THE TREATMENT OF GASTRIC CANCER
US9336302B1 (en) 2012-07-20 2016-05-10 Zuci Realty Llc Insight and algorithmic clustering for automated synthesis
CN105907859B (en) * 2012-09-25 2020-01-17 生物梅里埃股份公司 Colorectal cancer screening kit
US10860683B2 (en) 2012-10-25 2020-12-08 The Research Foundation For The State University Of New York Pattern change discovery between high dimensional data sets
PT2935222T (en) 2012-12-21 2018-12-10 Epizyme Inc Prmt5 inhibitors and uses thereof
US20170081723A1 (en) * 2014-03-21 2017-03-23 Agency For Science, Technology And Research Fusion Genes in Cancer
CN111579784B (en) * 2014-11-07 2023-12-22 藤仓化成株式会社 Arteriosclerosis and cancer detection method using DHPS gene as index
CN105462942B (en) * 2015-12-28 2018-09-07 国家***第三海洋研究所 Archaeal dna polymerase and its encoding gene and application
CN106244680B (en) * 2016-07-29 2019-05-17 北京泱深生物信息技术有限公司 The application of ORC1L gene and its expression product in diagnosis of disease
US11205103B2 (en) 2016-12-09 2021-12-21 The Research Foundation for the State University Semisupervised autoencoder for sentiment analysis
CN107022627B (en) * 2017-05-10 2020-11-24 哈尔滨医科大学 Application of KPNA2 gene and application of siRNA for inhibiting KPNA2 gene expression
AU2018282865A1 (en) * 2017-06-13 2019-12-19 Bostongene Corporation Systems and methods for generating, visualizing and classifying molecular functional profiles
WO2019126594A2 (en) * 2017-12-20 2019-06-27 Radimmune Therapeutics, Inc. Antibodies to centrin-1, methods of making, and uses thereof
CN108192866B (en) * 2018-03-07 2021-05-11 郑州大学第一附属医院 Method for preparing memory T cells by combining SFN (single domain frame) with IL-15 and IL-21 and application of method
CN108676881A (en) * 2018-05-24 2018-10-19 江苏大学附属医院 Purposes of the reagent of specific recognition CHAF1A in preparing gastric cancer prognosis evaluation reagent kit
CN108831556B (en) * 2018-06-24 2021-06-18 大连理工大学 Method for predicting heparin dosage in continuous renal replacement therapy process
CN108676891B (en) * 2018-07-12 2022-02-01 吉林大学 Rectal adenocarcinoma susceptibility prediction kit and system
CN108676890B (en) * 2018-07-12 2022-01-28 吉林大学 Female breast malignant tumor susceptibility prediction kit and system
JP7451499B2 (en) * 2018-08-24 2024-03-18 エフ. ホフマン-ラ ロシュ アーゲー Circulating FGFBP-1 (fibroblast growth factor binding protein 1) for determination of atrial fibrillation and prediction of stroke
US11216512B2 (en) * 2018-10-08 2022-01-04 Fujitsu Limited Accessible machine learning backends
CN111100189B (en) * 2018-10-29 2023-09-08 中国科学院分子细胞科学卓越创新中心 Polypeptide for treating cancer and pharmaceutical composition thereof
TW202018727A (en) 2018-11-09 2020-05-16 財團法人工業技術研究院 Ensemble learning predicting method and system
CN109371022A (en) * 2018-12-11 2019-02-22 宁夏医科大学总医院 A kind of circular rna hsa_circKPNA2_002 and its specificity amplification primer and application
CN109781985B (en) * 2019-02-27 2021-10-29 中山大学肿瘤防治中心 Kit for detecting cancer radiotherapy sensitivity and application thereof
EP3935581A4 (en) 2019-03-04 2022-11-30 Iocurrents, Inc. Data compression and communication using machine learning
CN109811064B (en) * 2019-04-02 2023-12-05 华南农业大学 Molecular marker related to avian leukosis resistance of chicken J subgroup and application thereof
CN110197701B (en) * 2019-04-22 2021-08-10 福建医科大学附属第一医院 Novel multiple myeloma nomogram construction method
JP7304030B2 (en) * 2019-04-26 2023-07-06 国立大学法人 東京大学 Method for predicting efficacy and prognosis of cancer treatment, and method for selecting therapeutic means
CN110714078B (en) * 2019-09-29 2021-11-30 浙江大学 Marker gene for colorectal cancer recurrence prediction in stage II and application thereof
CN111676290B (en) * 2020-07-02 2021-03-02 王伟佳 Drug resistance molecular marker for acute myelogenous leukemia induced differentiation treatment and application thereof
CN112485428A (en) * 2020-11-23 2021-03-12 浙江大学 Application of reagent for detecting HRP2 protein expression level in preparation of colon cancer metastasis screening reagent
WO2022125867A1 (en) * 2020-12-11 2022-06-16 Icahn School Of Medicine At Mount Sinai Methods of monitoring inflammatory bowel diseases
CN112921092A (en) * 2021-03-18 2021-06-08 上海交通大学医学院附属上海儿童医学中心 New pathogenic mutation site of fumarase gene leading to HLRCC
CN113624665A (en) * 2021-07-30 2021-11-09 中国药科大学 Application of anti-tumor candidate compound in medicine for treating colorectal cancer and determination method
CN113699242A (en) * 2021-10-18 2021-11-26 浙江省人民医院 Primer probe, kit and method for detecting KRAS gene mutation, ADAMTS1 and BNC1 methylation
CN114292920B (en) * 2021-12-10 2023-07-28 中国人民解放军军事科学院军事医学研究院 Group of gastric precancerous lesions and gastric early diagnosis plasma RNA marker combination and application
CN114425090B (en) * 2022-01-26 2024-01-26 四川大学华西医院 XRCC6 gene and application of protein encoded by same
CN114941031A (en) * 2022-01-28 2022-08-26 中国医学科学院北京协和医院 Early gastric cancer prognosis differential gene and recurrence prediction model
CN114377135B (en) * 2022-03-15 2023-11-24 河南大学 Application of PPM1G in diagnosis and treatment of lung cancer
CN115011689B (en) * 2022-05-12 2023-08-15 南方医科大学南方医院 Evaluation model for predicting prognosis of colon cancer and assisting chemotherapy to benefit and application
CN114748498A (en) * 2022-05-13 2022-07-15 南京大学 Application of shRNA aiming at CBFbeta in preparation of colorectal cancer treatment drug
CN115927306A (en) * 2022-07-18 2023-04-07 南通市肿瘤医院 PPM1G gene targeted shRNA and application thereof
CN117074679B (en) * 2023-09-20 2024-06-11 上海爱谱蒂康生物科技有限公司 Biomarker combination and application thereof in predicting effect of immunotherapy combined with chemotherapy in treating esophageal cancer

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050064455A1 (en) * 2003-05-28 2005-03-24 Baker Joffre B. Gene expression markers for predicting response to chemotherapy
US20060041387A1 (en) * 2004-08-17 2006-02-23 Xiumei Sun Smart microarray cancer detection system

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5856094A (en) * 1995-05-12 1999-01-05 The Johns Hopkins University School Of Medicine Method of detection of neoplastic cells
US5989885A (en) * 1997-01-10 1999-11-23 Myriad Genetics, Inc. Specific mutations of map kinase 4 (MKK4) in human tumor cell lines identify it as a tumor suppressor in various types of cancer
US6905827B2 (en) * 2001-06-08 2005-06-14 Expression Diagnostics, Inc. Methods and compositions for diagnosing or monitoring auto immune and chronic inflammatory diseases
WO2005007846A1 (en) * 2003-04-25 2005-01-27 Japanese Foundation For Cancer Research Method of judging senstivity of tumor cell to anticancer agent
WO2005054508A2 (en) * 2003-12-01 2005-06-16 Ipsogen Gene expression profiling of colon cancer by dna microarrays and correlation with survival and histoclinical parameters
DE102004042822A1 (en) * 2004-08-31 2006-03-16 Technische Universität Dresden Compounds and methods of treatment, diagnosis and prognosis in pancreatic diseases
WO2006074392A2 (en) * 2005-01-06 2006-07-13 Genentech, Inc. Cancer prognostic, diagnostic and treatment methods
US20090299640A1 (en) * 2005-11-23 2009-12-03 University Of Utah Research Foundation Methods and Compositions Involving Intrinsic Genes
NZ544432A (en) * 2005-12-23 2009-07-31 Pacific Edge Biotechnology Ltd Prognosis prediction for colorectal cancer using a prognositc signature comprising markers ME2 and FAS
US20100009905A1 (en) * 2006-03-24 2010-01-14 Macina Roberto A Compositions and Methods for Detection, Prognosis and Treatment of Colon Cancer
CN101135664A (en) * 2007-09-29 2008-03-05 上海交通大学医学院附属瑞金医院 Protein marker for distinguishing colorectal cancer at stages I, II, III and IV and application thereof

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
"AFFYMETRIX GENECHIP HUMAN GENOME U133", 11 March 2002 (2002-03-11), XP008136197, Retrieved from the Internet <URL:http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL96> [retrieved on 20090226] *
"BMC Gastroenterology", vol. 4, 23 September 2004, ELECTRONIC PUBLICATION, article BAHNASSY A. A. ET AL.: "Cyclin A and cyclin D1 as significant prognostic markers in colorectal cancer patients", pages: 22, XP021003773 *
ARBER N. ET AL.: "Increased expression of cyclin D and p53 in small bowel adenocarcinomas are closely related and are associated with a poorer prognosis", GASTROENTEROLOGY, vol. 112, no. 4 SUPP, 1997, pages A533, XP008134517 *
INOMATA M. ET AL.: "Amplification and overexpression of cyclin D1 in aggressive human esophageal cancer", ONCOLOGY REPORTS, vol. 5, no. 1, January 1998 (1998-01-01), pages 171 - 176, XP008134939 *
KOURAKLIS G. ET AL.: "Cyclin D1 and Rb protein expression and their correlation with prognosis in patients with colon cancer", WORLD JOURNAL OF SURGICAL ONCOLOGY, vol. 4, no. 5, 20 January 2006 (2006-01-20), pages 5, XP021009247 *
MAEDA K. ET AL.: "Cyclin D1 overexpression and prognosis in colorectal adenocarcinoma", ONCOLOGY, vol. 55, no. 2, March 1998 (1998-03-01), pages 145 - 151, XP008134937 *
MIAO L. ET AL.: "Expression of p16, cyclin D1 and RB protein in gastric carcinoma and premalignant lesions", CHINESE JOURNAL OF CANCER RESEARCH, vol. 15, no. 1, 2003, pages 58 - 62, XP008134940 *
ODA K. ET AL.: "Evaluation of cyclin D1 mRNA expression in gastric and colorectal cancers", RESEARCH COMMUNICATIONS IN MOLECULAR PATHOLOGY AND PHARMACOLOGY, vol. 105, no. 3, 1999, pages 237 - 252, XP008134938 *
See also references of EP2215254A4 *

Cited By (70)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8871451B2 (en) 2006-09-25 2014-10-28 Mayo Foundation For Medical Education And Research Extracellular and membrane-associated prostate cancer markers
US9534249B2 (en) 2006-11-02 2017-01-03 Mayo Foundation For Medical Education And Research Predicting cancer outcome
US8338109B2 (en) 2006-11-02 2012-12-25 Mayo Foundation For Medical Education And Research Predicting cancer outcome
US10494677B2 (en) 2006-11-02 2019-12-03 Mayo Foundation For Medical Education And Research Predicting cancer outcome
US10865452B2 (en) 2008-05-28 2020-12-15 Decipher Biosciences, Inc. Systems and methods for expression-based discrimination of distinct clinical disease states in prostate cancer
US10407731B2 (en) 2008-05-30 2019-09-10 Mayo Foundation For Medical Education And Research Biomarker panels for predicting prostate cancer outcomes
US9561006B2 (en) * 2008-10-15 2017-02-07 The United States Of America As Represented By The Secretary Of The Navy Bayesian modeling of pre-transplant variables accurately predicts kidney graft survival
US20140122382A1 (en) * 2008-10-15 2014-05-01 Eric A. Elster Bayesian modeling of pre-transplant variables accurately predicts kidney graft survival
US9976188B2 (en) 2009-01-07 2018-05-22 Myriad Genetics, Inc. Cancer biomarkers
US10519513B2 (en) 2009-01-07 2019-12-31 Myriad Genetics, Inc. Cancer Biomarkers
EP2417271A4 (en) * 2009-04-07 2012-08-29 Genomic Health Inc Methods of predicting cancer risk using gene expression in premalignant tissue
US8765383B2 (en) 2009-04-07 2014-07-01 Genomic Health, Inc. Methods of predicting cancer risk using gene expression in premalignant tissue
EP2417271A2 (en) * 2009-04-07 2012-02-15 Genomic Health, Inc. Methods of predicting cancer risk using gene expression in premalignant tissue
JP2012533071A (en) * 2009-07-16 2012-12-20 エフ.ホフマン−ラ ロシュ アーゲー Flap endonuclease-1 as a cancer marker
US10898546B2 (en) 2010-03-19 2021-01-26 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US10420816B1 (en) 2010-03-19 2019-09-24 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US11975042B2 (en) 2010-03-19 2024-05-07 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
EA024497B1 (en) * 2010-03-19 2016-09-30 Имматикс Байотекнолоджиз Гмбх Peptinde binding to a molecule of the human major histocompatibility complex (mhc) class-i and use thereof for treating cancer
US11969455B2 (en) 2010-03-19 2024-04-30 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
EP3058947A3 (en) * 2010-03-19 2016-11-30 immatics biotechnologies GmbH Novel immunotherapy against several tumors including gastrointestinal and gastric cancer
US11298404B2 (en) 2010-03-19 2022-04-12 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US9101585B2 (en) 2010-03-19 2015-08-11 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US11273200B2 (en) 2010-03-19 2022-03-15 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US9717774B2 (en) 2010-03-19 2017-08-01 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US9895415B2 (en) 2010-03-19 2018-02-20 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
EP2845604A3 (en) * 2010-03-19 2015-05-20 immatics biotechnologies GmbH Novel immunotherapy against several tumors including gastrointestinal and gastric cancer
US9993523B2 (en) 2010-03-19 2018-06-12 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US11648292B2 (en) 2010-03-19 2023-05-16 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US10064913B2 (en) 2010-03-19 2018-09-04 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US11957730B2 (en) 2010-03-19 2024-04-16 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US11839643B2 (en) 2010-03-19 2023-12-12 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
WO2011113819A3 (en) * 2010-03-19 2011-11-10 Immatics Biotechnologies Gmbh Diagnosis and treatment of cancer based on avl9
US11850274B2 (en) 2010-03-19 2023-12-26 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US10357540B2 (en) 2010-03-19 2019-07-23 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US11077171B2 (en) 2010-03-19 2021-08-03 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US10933118B2 (en) 2010-03-19 2021-03-02 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US10478471B2 (en) 2010-03-19 2019-11-19 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
CN112430255A (en) * 2010-03-19 2021-03-02 伊玛提克斯生物技术有限公司 Novel immunotherapy for several tumors including gastrointestinal and gastric cancers
US10905741B2 (en) 2010-03-19 2021-02-02 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
US11883462B2 (en) 2010-03-19 2024-01-30 Immatics Biotechnologies Gmbh Immunotherapy against several tumors including gastrointestinal and gastric cancer
EP2591126A2 (en) * 2010-07-07 2013-05-15 Myriad Genetics, Inc. Gene signatures for cancer prognosis
EP2591126A4 (en) * 2010-07-07 2014-01-01 Myriad Genetics Inc Gene signatures for cancer prognosis
US10954568B2 (en) 2010-07-07 2021-03-23 Myriad Genetics, Inc. Gene signatures for cancer prognosis
EP3812469A1 (en) * 2010-07-07 2021-04-28 Myriad Genetics, Inc. Gene signatures for cancer prognosis
US9605319B2 (en) 2010-08-30 2017-03-28 Myriad Genetics, Inc. Gene signatures for cancer diagnosis and prognosis
CN108192972B (en) * 2010-10-06 2022-09-09 生物医学研究机构基金会 Methods for diagnosis, prognosis and treatment of breast cancer metastasis
CN108192972A (en) * 2010-10-06 2018-06-22 生物医学研究机构基金会 For the method for the diagnosis of Metastasis in Breast Cancer, prognosis and treatment
WO2012090479A1 (en) * 2010-12-28 2012-07-05 Oncotherapy Science, Inc. Mcm7 as a target gene for cancer therapy and diagnosis
EP2668296A4 (en) * 2011-01-25 2015-09-02 Almac Diagnostics Ltd Colon cancer gene expression signatures and methods of use
US10196691B2 (en) 2011-01-25 2019-02-05 Almac Diagnostics Limited Colon cancer gene expression signatures and methods of use
US10260097B2 (en) 2011-06-02 2019-04-16 Almac Diagnostics Limited Method of using a gene expression profile to determine cancer responsiveness to an anti-angiogenic agent
EP2715348A4 (en) * 2011-06-02 2015-10-07 Almac Diagnostics Ltd Molecular diagnostic test for cancer
US10308980B2 (en) 2011-11-04 2019-06-04 Oslo Universitetssykehus Hf Methods and biomarkers for analysis of colorectal cancer
US10513737B2 (en) 2011-12-13 2019-12-24 Decipher Biosciences, Inc. Cancer diagnostics using non-coding transcripts
CN104395756A (en) * 2012-06-18 2015-03-04 北卡罗莱纳大学查佩尔山分校 Methods for head and neck cancer prognosis
US11035005B2 (en) 2012-08-16 2021-06-15 Decipher Biosciences, Inc. Cancer diagnostics using biomarkers
US10876164B2 (en) 2012-11-16 2020-12-29 Myriad Genetics, Inc. Gene signatures for cancer prognosis
US11091809B2 (en) 2012-12-03 2021-08-17 Almac Diagnostic Services Limited Molecular diagnostic test for cancer
EP2982985A4 (en) * 2013-04-05 2016-11-09 Univ Industry Foundation Yonsei University System for predicting prognosis of locally advanced gastric cancer
US11174517B2 (en) 2014-05-13 2021-11-16 Myriad Genetics, Inc. Gene signatures for cancer prognosis
CN105510586A (en) * 2015-12-22 2016-04-20 湖北鹊景生物医学有限公司 Kit for lung cancer diagnosis and use method of kit
US11596642B2 (en) 2016-05-25 2023-03-07 Inbiomotion S.L. Therapeutic treatment of breast cancer based on c-MAF status
US11414708B2 (en) 2016-08-24 2022-08-16 Decipher Biosciences, Inc. Use of genomic signatures to predict responsiveness of patients with prostate cancer to post-operative radiation therapy
US11208697B2 (en) 2017-01-20 2021-12-28 Decipher Biosciences, Inc. Molecular subtyping, prognosis, and treatment of bladder cancer
US11873532B2 (en) 2017-03-09 2024-01-16 Decipher Biosciences, Inc. Subtyping prostate cancer to predict response to hormone therapy
US11732304B2 (en) 2017-03-14 2023-08-22 Novomics Co., Ltd. System for predicting prognosis and benefit from adjuvant chemotherapy for patients with stage II and III gastric cancer
US11078542B2 (en) 2017-05-12 2021-08-03 Decipher Biosciences, Inc. Genetic signatures to predict prostate cancer metastasis and identify tumor aggressiveness
US11654153B2 (en) 2017-11-22 2023-05-23 Inbiomotion S.L. Therapeutic treatment of breast cancer based on c-MAF status
CN108841959A (en) * 2018-07-12 2018-11-20 吉林大学 Kit and system for predicting susceptibility to oral cavity and head and neck malignant tumors
WO2020223233A1 (en) * 2019-04-30 2020-11-05 Genentech, Inc. Prognostic and therapeutic methods for colorectal cancer

Also Published As

Publication number Publication date
JP2017060517A (en) 2017-03-30
JP5745848B2 (en) 2015-07-08
KR101982763B1 (en) 2019-05-27
NZ562237A (en) 2011-02-25
KR20200015788A (en) 2020-02-12
KR20200118226A (en) 2020-10-14
JP2015165811A (en) 2015-09-24
US20180010198A1 (en) 2018-01-11
JP2018126154A (en) 2018-08-16
SG10201602601QA (en) 2016-04-28
KR20180089565A (en) 2018-08-08
CA2739004C (en) 2020-10-27
KR20100084648A (en) 2010-07-27
CN108753975A (en) 2018-11-06
JP2010539973A (en) 2010-12-24
CN101932724A (en) 2010-12-29
CA3090677A1 (en) 2009-04-09
CA2739004A1 (en) 2009-04-09
KR101727649B1 (en) 2017-04-17
KR20220020404A (en) 2022-02-18
AU2008307830A1 (en) 2009-04-09
JP6824923B2 (en) 2021-02-03
KR20160058190A (en) 2016-05-24
EP2995690A1 (en) 2016-03-16
CN101932724B (en) 2018-07-24
EP2215254A4 (en) 2012-07-18
SG10201912106YA (en) 2020-02-27
SG185278A1 (en) 2012-11-29
EP2215254A1 (en) 2010-08-11
US20170088900A1 (en) 2017-03-30
US20110086349A1 (en) 2011-04-14

Similar Documents

Publication Publication Date Title
JP6824923B2 (en) Proliferation signature and prognosis for gastrointestinal cancer
KR101530689B1 (en) Prognosis prediction for colorectal cancer
US10266902B2 (en) Methods for prognosis prediction for melanoma cancer
NZ555353A (en) TNF antagonists

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase Ref document number: 200880119316.5; Country of ref document: CN
121 EP: the EPO has been informed by WIPO that EP was designated in this application Ref document number: 08835078; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase Ref document number: 2010527901; Country of ref document: JP
NENP Non-entry into the national phase Ref country code: DE
WWE WIPO information: entry into national phase Ref document number: 2008307830; Country of ref document: AU
WWE WIPO information: entry into national phase Ref document number: 2988/DELNP/2010; Country of ref document: IN
WWE WIPO information: entry into national phase Ref document number: 2008835078; Country of ref document: EP
ENP Entry into the national phase Ref document number: 20107009975; Country of ref document: KR; Kind code of ref document: A
ENP Entry into the national phase Ref document number: 2008307830; Country of ref document: AU; Date of ref document: 20081006; Kind code of ref document: A
WWE WIPO information: entry into national phase Ref document number: 2739004; Country of ref document: CA