WO2017047102A1 - Biomarker for cancer and use thereof - Google Patents

Biomarker for cancer and use thereof

Info

Publication number
WO2017047102A1
Authority
WO
WIPO (PCT)
Prior art keywords
cancer
biomarker
inventors
sequence
genes
Prior art date
Application number
PCT/JP2016/004260
Other languages
French (fr)
Inventor
Bogumil KACZKOWSKI
Alistair Forrest
Piero Carninci
Yuji Tanaka
Hideya KAWAJI
Yoshihide Hayashizaki
Original Assignee
Riken
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Riken
Publication of WO2017047102A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/106 Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Definitions

  • the present invention relates to biomarkers for cancer.
  • Cancers of the same tissue of origin can be very heterogeneous, often being derived from different cell types and having drastically different mutation profiles (e.g., Non-Patent Literature 3). At the same time, cancers of different origins share some common features, as shown in a series of pan-cancer studies recently published by The Cancer Genome Atlas (TCGA) consortium (Non-Patent Literature 4), which report genes and pathways affected by DNA copy number alterations, mutations, methylation and transcriptome changes across 12 primary tumor types.
  • TCGA: The Cancer Genome Atlas
  • PIK3R3 induces epithelial-to-mesenchymal transition and promotes metastasis in colorectal cancer.
  • Mcm2, Geminin, and KI67 define proliferative state and are prognostic markers in renal cell carcinoma.
  • Replicative Mcm2 protein as a novel proliferation marker in oligodendrogliomas and its relationship to Ki67 labelling index, histological grade and prognosis.
  • MCM-2 is a therapeutic target of Trichostatin A in colon cancer cells. Toxicol Lett. 2013;221:23-30.
  • a first object of the present invention is to provide a novel biomarker for cancer.
  • it is desirable that a biomarker capable of detecting a plurality of types of cancers be developed.
  • a second object of the present invention is to provide a novel biomarker capable of detecting a plurality of types of cancers.
  • the present invention includes at least one of the following aspects.
  • a biomarker for cancer being one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 3, or translation products derived therefrom.
  • a biomarker set comprising: a first biomarker for cancer which first biomarker is one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 3, or translation products derived therefrom; and a second biomarker for cancer which second biomarker is another one of: the transcription products having at least the part of the RNA sequence corresponding to the DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of the corresponding sequence in "sequence_DPI_only" in Table 3, or the translation products derived therefrom.
  • a cancer detection method including the step of measuring, in a sample collected from a living body, the amount of the biomarker mentioned in (1) or the first and second biomarkers included in the biomarker set mentioned in (2).
  • a cancer detection kit comprising the biomarker mentioned in (1) or the biomarker set mentioned in (2).
  • a biomarker for cancer being a full length or a fragment of one of: transcription products of genes listed in Table 1, and transcription products or translation products of genes listed in Table 2.
  • a biomarker set comprising: a first biomarker for cancer which first biomarker is a full length or a fragment of one of: transcription products of genes listed in Table 1, and transcription products or translation products of genes listed in Table 2; and a second biomarker for cancer which second biomarker is a full length or a fragment of another one of: the transcription products of the genes listed in Table 1, and the transcription products or the translation products of the genes listed in Table 2.
  • the present invention can provide a novel biomarker for cancer and use of the novel biomarker for cancer.
  • 8 pan-cancer marker candidates used in the experiment were each smaller in terms of p-value in a t-test than CK19.
  • the 8 pan-cancer marker candidates are each identified below with a corresponding number of organs whose cancer samples were higher in average expression level than corresponding normal samples.
  • PKMYT1: 8 organs
  • BLM: 7 organs
  • FOXP4AS1 (also referred to as RP11-328M4.2)
  • ENST00000448869 (also referred to as RP11-284F21.7): 8 organs
  • LOC643401 (also referred to as LINC01021): 5 organs
  • GABRD: 7 organs
  • MNX1_AS1: 6 organs
  • FIRRE (also referred to as RP11-453F18_B1)
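  • The per-marker organ count described above (the number of organs whose cancer samples show a higher average expression level than the corresponding normal samples) can be sketched as follows. This is a minimal illustration only: the function name, the data layout, and the toy values are assumptions and are not taken from the experiment.

```python
from statistics import mean


def count_upregulated_organs(expression_by_organ):
    """Count organs whose mean cancer expression exceeds the mean normal expression.

    expression_by_organ maps an organ name to a dict holding two lists of
    expression values under the keys "cancer" and "normal".
    """
    return sum(
        1
        for groups in expression_by_organ.values()
        if mean(groups["cancer"]) > mean(groups["normal"])
    )


# Toy values invented for illustration; they are not measurements from the experiment.
toy_expression = {
    "organ_A": {"cancer": [5.0, 6.0, 7.0], "normal": [1.0, 2.0, 1.5]},
    "organ_B": {"cancer": [2.0, 2.5], "normal": [3.0, 3.5]},
    "organ_C": {"cancer": [4.0, 4.0], "normal": [0.5, 1.0]},
}

print(count_upregulated_organs(toy_expression))  # prints 2
```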
  • a biomarker for cancer in accordance with one embodiment of the present invention is one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only” in Table 3 or translation products derived therefrom.
  • a biomarker for cancer in accordance with another embodiment of the present invention is a full length or a fragment of one of: transcription products of genes listed in Table 1, and transcription products or translation products of genes listed in Table 2.
  • the transcription products are RNAs
  • the translation products are proteins.
  • the transcription products include processed RNAs.
  • a biomarker that is a fragment of a transcription product has a length, which is not particularly limited, of, for example, 10 bases or more, 15 bases or more, 20 bases or more, 25 bases or more, 50 bases or more, 100 bases or more, or 150 bases or more, and, 1000 bases or less, 500 bases or less, 250 bases or less, or 200 bases or less.
  • a biomarker that is a fragment of a translation product has a length, which is not particularly limited, of, for example, 3 residues or more, 5 residues or more, 10 residues or more, 20 residues or more, or 30 residues or more, and, 300 residues or less, 200 residues or less, or 100 residues or less.
  • lncRNAs: long-non-coding RNAs
  • a biomarker of the present invention may be a full length or a fragment of a transcription product (mRNA) of a gene listed in Table 2, or a full length or a fragment of a translation product (protein) of a gene listed in Table 2.
  • mRNA transcription product
  • protein translation product
  • a biomarker of the present invention which biomarker relates to a gene listed in Table 2 is preferably a full length or a fragment of a transcription product.
  • Table 3 provides, from another point of view, a summary of information related to the long-non-coding RNAs (lncRNAs) and protein-coding genes each obtained in accordance with a method of Examples.
  • “dpi_id” in Table 3 indicates a location on a chromosome corresponding to DPI (a putative promoter of the long-non-coding RNAs or the protein-coding genes).
  • “range (DPI + 100 downstream)” in Table 3 indicates a location on a chromosome corresponding to both a sequence in the DPI and a sequence of 100 bases downstream of the DPI, where “+” means a location on the plus strand side and “-” means a location on the minus strand side.
  • “sequence_DPI_only” in Table 3 indicates a sequence in a putative promoter region (DPI) obtained by carrying out DPI analysis with respect to a profile which has been analyzed by a CAGE method and which is related to a location on a chromosome of a transcription product and an amount of transcription of the transcription product.
  • “sequence_100downstream” in Table 3 indicates a sequence of 100 bases downstream of the putative promoter region.
  • “sequence (DPI - UPPERCASE, 100 downstream - lowercase)” in Table 3 is a sequence obtained by combining the sequence in the putative promoter region and the sequence of 100 bases downstream of the putative promoter region (each sequence is indicated as a DNA sequence of a genome).
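  • As a rough illustration of the case convention described above, the combined sequence can be split back into its DPI part and its downstream part by letter case, and the corresponding RNA sequence can be written by replacing T with U. The function names are hypothetical, and the sketch assumes that the listed DNA sequence is already given in the sense orientation of the transcript, which the text does not state.

```python
def split_dpi_sequence(combined):
    """Split a combined sequence into (dpi, downstream) using letter case.

    Uppercase bases are taken as the putative promoter region (DPI) and the
    trailing lowercase bases as the downstream region.
    """
    boundary = next(
        (i for i, base in enumerate(combined) if base.islower()),
        len(combined),
    )
    dpi, downstream = combined[:boundary], combined[boundary:]
    if any(base.isupper() for base in downstream):
        raise ValueError("uppercase base found after the lowercase region")
    return dpi, downstream


def dna_to_rna(dna):
    """Write the RNA sequence corresponding to a sense-strand DNA sequence."""
    return dna.upper().replace("T", "U")


dpi, downstream = split_dpi_sequence("ACGTacgt")  # toy sequence, not from Table 3
print(dpi, downstream, dna_to_rna(dpi + downstream))
```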
  • “M.logFC” in Table 3 is obtained by logarithmically transforming the fold change (FC) of expression in cancer versus normal cells; positive values correspond to up-regulation in cancer and negative values to down-regulation in cancer. Thus, in one embodiment, a marker having a greater absolute logFC value is preferable. “summary” in Table 3 suggests a purpose of use as a marker. Specifically, “Pan” in Table 3 indicates the possibility of being a marker for a plurality of types of cancers, and “Solid” in Table 3 indicates the possibility of being a marker for at least a solid cancer.
  • “ON” in Table 3 indicates that transcription occurs in a cancer that can be a target of the marker, whereas substantially no transcription occurs in a control (non-cancer tissue), and “OFF” in Table 3 indicates that substantially no transcription occurs in a cancer that can be a target of the marker, whereas transcription occurs in a control (non-cancer tissue).
  • “UP” in Table 3 indicates that a cancer that can be a target of the marker shows a greater amount of transcription than a control (non-cancer tissue).
  • “DOWN” in Table 3 indicates that a cancer that can be a target of the marker shows a smaller amount of transcription than a control (non-cancer tissue).
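  • The logFC value and the ON/OFF/UP/DOWN labels described above can be sketched as below. Several details are assumptions made only for this illustration and are not stated in the text: the logarithm base (2), the pseudocount, the detection floor used to approximate "substantially no transcription", and the extra labels NOT_DETECTED and UNCHANGED.

```python
import math


def log_fold_change(cancer_expr, normal_expr, pseudocount=1.0):
    """Log2 fold change of cancer versus normal expression.

    Positive values mean up-regulation in cancer, negative values mean
    down-regulation. The pseudocount avoids division by zero.
    """
    return math.log2((cancer_expr + pseudocount) / (normal_expr + pseudocount))


def summarize(cancer_expr, normal_expr, detection_floor=1.0):
    """Assign an ON/OFF/UP/DOWN style label from two expression levels."""
    cancer_on = cancer_expr >= detection_floor
    normal_on = normal_expr >= detection_floor
    if not cancer_on and not normal_on:
        return "NOT_DETECTED"  # hypothetical label, not used in Table 3
    if cancer_on and not normal_on:
        return "ON"
    if normal_on and not cancer_on:
        return "OFF"
    if cancer_expr > normal_expr:
        return "UP"
    if cancer_expr < normal_expr:
        return "DOWN"
    return "UNCHANGED"  # hypothetical label, not used in Table 3


print(log_fold_change(7.0, 1.0), summarize(20.0, 5.0))
```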
  • “gene name” in Table 3 indicates a name of a gene having a putative promoter.
  • genes classified as “protein_coding” are protein-coding genes, and genes classified as a category different from “protein_coding” are genes whose transcription products are the long-non-coding RNAs.
  • the long-non-coding RNAs which serve as one of the biomarkers of the present invention, preferably include the sequence in the putative promoter region obtained by carrying out DPI analysis, and more preferably include the sequence in the putative promoter region and the sequence of 10 or more bases downstream of the putative promoter region.
  • Transcription products of the protein-coding genes, which transcription products serve as one of the biomarkers of the present invention, preferably include the sequence in the putative promoter region and the sequence of 10 or more bases downstream of the putative promoter region.
  • the length of a sequence corresponding to a DNA sequence in "sequence_100downstream" in Table 3 contained in the transcription product is, for example, at least 1 base, at least 2 bases, at least 5 bases, at least 10 bases, at least 15 bases, at least 20 bases, at least 30 bases, at least 50 bases, at least 70 bases or 100 bases.
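  • One possible way to check a transcript against the description above is sketched here: the transcript must start at some position inside the DPI sequence, follow the combined sequence from there, and cover a minimum number of downstream bases. The function names are hypothetical, the default of 10 bases reflects only the preferred value mentioned above, and the sketch assumes sense-orientation sequences and exact matching, neither of which is specified in the text.

```python
def downstream_bases_covered(transcript_rna, dpi, downstream):
    """Return how many downstream bases the transcript covers, or None.

    The transcript (RNA) is converted to DNA letters and compared with the
    combined DPI + downstream sequence, trying every start position inside
    the DPI. None means no start position inside the DPI matches.
    """
    transcript_dna = transcript_rna.upper().replace("U", "T")
    combined = (dpi + downstream).upper()
    best = None
    for start in range(len(dpi)):
        # Compare only the part of the transcript that lies inside the region.
        overlap = min(len(transcript_dna), len(combined) - start)
        if transcript_dna[:overlap] != combined[start:start + overlap]:
            continue
        covered = max(0, start + overlap - len(dpi))
        if best is None or covered > best:
            best = covered
    return best


def is_candidate_transcript(transcript_rna, dpi, downstream, min_downstream=10):
    """True if the transcript starts in the DPI and covers enough downstream bases."""
    covered = downstream_bases_covered(transcript_rna, dpi, downstream)
    return covered is not None and covered >= min_downstream


# Toy sequences, not taken from Table 3.
print(downstream_bases_covered("GUUUGC", "ACGT", "ttgcaa"))  # prints 4
```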
  • a biomarker may preferably be a full length or a fragment of one of the transcription products of the genes listed in Tables 1 and 2.
  • a subject having cancer contains a biomarker in an amount which is not particularly limited.
  • the amount may be upregulated in the subject having cancer, or may be downregulated in the subject having cancer.
  • expression may be ON in the subject having cancer, or expression may be OFF in the subject having cancer. Tables 1 and 2 show, for each gene, UP/DOWN/ON/OFF in the subject having cancer.
  • Examples of types of cancers targeted by a biomarker in accordance with the present invention include blood (e.g. lymphoid, myeloid), bone, brain, breast, kidney, liver, lung, melanocyte, mesothelium, ovary, and prostate cancers.
  • cancers targeted by the biomarker in accordance with the present invention include bladder urothelial carcinoma, breast invasive carcinoma, colon adenocarcinoma, colorectal carcinoma, head and neck squamous cell carcinoma, kidney chromophobe, kidney renal clear cell carcinoma, kidney renal papillary cell carcinoma, liver hepatocellular carcinoma, lung adenocarcinoma, lung squamous cell carcinoma, prostate adenocarcinoma, rectum adenocarcinoma, thyroid carcinoma, uterine corpus endometrial carcinoma, brain cancer, bone cancer, myeloma, lymphoma, and stomach cancer.
  • cancers targeted by the biomarker in accordance with the present invention include sarcoma, carcinoma, fibrosarcoma, myxosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, colon carcinoma, pancreatic cancer, breast cancer, ovarian cancer, prostate cancer, squamous cell carcinoma, basal cell carcinoma, adenocarcinoma, sweat gland carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinomas, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile
  • One type of cancer or a plurality of types (e.g., two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, or fourteen or more types) of cancers may be targeted by the biomarker of the present invention.
  • a plurality of types of cancers are preferably targeted by the biomarker of the present invention.
  • types of cancers in a subject can be comprehensively examined by use of a single biomarker.
  • Tables 1 and 2 each show an example of a preferable cancer(s) for each gene. Note, however, that the present invention is not limited to such a combination.
  • SEQ ID NOs: 61, 63, 67, 69, 71, 80, 82, and 90 are preferable.
  • CTC: Circulating Tumor Cells
  • FCM: Flow Cytometry
  • MCS: Magnetic-Activated Cell Sorting
  • GABRD, the versatility of which is indicated in the qPCR validation, is expected to be useful for the detection and the enrichment. Utilization of these cell surface biomarkers for enrichment of Circulating Tumor Cells (CTC) allows a reduction in missing CTC samplings, and therefore restricts the number of false positive cases.
  • the biomarker of the present invention can be particularly useful in a case where cancer has metastasized from one organ or tissue to another (the expression levels of many conventional biomarkers vary depending on the organ).
  • with a biomarker such as CK19, whose expression level may be high in normal samples depending on the organ, it is not possible to assess metastasis by staining CK19 RNA or protein if the expression of CK19 is confirmed in the organ to which the cancer has metastasized.
  • with the biomarker of the present invention, in contrast, expression levels are low in the majority of normal tissues. It is therefore highly probable that cells which have metastasized can be identified by use of staining.
  • the biomarkers of the present invention include those downregulated in cancers.
  • FLRT2, ZNF677, SRPX, NAALADL1 and TCEAL7 listed in Table 2 are shown to be downregulated in almost all of the cancer tissues tested.
  • Such a downregulated biomarker can be used to identify a cancer based on the absence of or decrease in the expression thereof.
  • a downregulated biomarker corresponding to a surface protein e.g., FLRT2 or NAALADL1
  • Biomarker set: The biomarkers of the present invention can be used in combination.
  • a biomarker set comprising: a first biomarker for cancer which first biomarker is one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only” in Table 3, or translation products derived therefrom; and a second biomarker for cancer which second biomarker is another one of: the transcription products having at least the part of the RNA sequence corresponding to the DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)” in Table 3 and starting from any position of the corresponding sequence in “sequence_DPI_only” in Table 3, or the translation products derived therefrom.
  • a biomarker set comprising: a first biomarker for cancer which first biomarker is a full length or a fragment of one of: transcription products of genes listed in Table 1, and transcription products or translation products of genes listed in Table 2; and a second biomarker for cancer which second biomarker is a full length or a fragment of another one of: the transcription products of the genes listed in Table 1, and the transcription products or the translation products of the genes listed in Table 2.
  • the first biomarker for cancer and the second biomarker for cancer, which are not particularly limited, may be any two of the biomarkers described in (1. Biomarker for cancer).
  • a preferable combination of biomarkers includes a combination of biomarkers targeting types of cancers which types are different from each other. Such a case has an advantage in that more types of cancers can be comprehensively detected by use of a biomarker set.
  • a combination of biomarkers may be a combination of biomarkers targeting types of cancers which types are identical to each other. Such a case makes it possible to detect one (or more) types of cancers with higher accuracy.
  • the biomarker set may further include a third biomarker, a fourth biomarker, or a fifth or higher-order biomarker.
  • a biomarker may be one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only” in Table 3 or translation products derived therefrom, or a biomarker different from the one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)” in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only” in Table 3 or translation products derived therefrom.
  • such a biomarker may be a full length or a fragment of still another one of the transcription products of the genes listed in Table 1, and the transcription products or the translation products of the genes listed in Table 2, or a biomarker different from the full length or the fragment of any one of the transcription products of the genes listed in Table 1, and the transcription products or the translation products of the genes listed in Table 2.
  • examples of the biomarker different from the one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 3, or translation products derived therefrom, include a publicly-known biomarker for cancer.
  • examples of the biomarker different from the full length or the fragment of any one of the transcription products of the genes listed in Table 1, and the transcription products or the translation products of the genes listed in Table 2 include a publicly-known biomarker for cancer.
  • SEQ ID NOs: 61, 63, 67, 69, 71, 80, 82, and 90 are preferable.
  • a cancer detection method in accordance with the present invention includes the step of measuring, in a sample collected from a living body, the amount of the biomarker described in (1. Biomarker for cancer) or the first and second biomarkers included in the biomarker set described in (2. Biomarker set).
  • the other specific steps of the method, and an appliance and an apparatus that are used in the method are not particularly limited.
  • sample for use in a cancer detection method in accordance with the present invention is a sample (biological sample) collected from a living body (subject).
  • examples of the sample collected from the living body mainly include samples derived from a cell, a tissue, or a body fluid (e.g., blood, urine, saliva, etc.).
  • examples of a blood-derived sample mainly include whole blood, serum, and blood plasma.
  • the “sample” is preferably a cell or a tissue, and more preferably a cell.
  • a tissue in which the biomarker of the present invention is to be detected is, for example, a lymph node removed during surgery, and preferably a lymph node located in a vicinity of a primary lesion of cancer.
  • the biomarker of the present invention is preferably detected during surgical removal of the primary lesion. This makes it possible to determine, for example, a risk of metastasis of the cancer (while performing the surgery).
  • An embodiment of the cancer detection method in accordance with the present invention further includes the step of collecting the sample from the living body (subject).
  • An embodiment of the cancer detection method in accordance with the present invention further includes the step of pretreating the sample collected from the living body (subject). Examples of the pretreatment of the sample mainly include obtainment of a cell extract by lysis of a cell.
  • One type of cancer or a plurality of types of cancers may be detected. Specific examples of types of cancers are given earlier in (1. Biomarker for cancer).
  • a method for detecting a biomarker is not particularly limited.
  • examples of the method for detecting the biomarker that is a protein include a method based on an antigen-antibody reaction, a method in which an interaction of a protein is used, mass spectrometry, and a method for identifying a protein such as electrophoresis, or a combination thereof, for example immunoelectrophoresis.
  • Examples of the method for detecting the biomarker that is an RNA include a method in which various nucleic acid amplification techniques are used and a method in which no nucleic acid amplification technique needs to be used.
  • a nucleic acid amplification technique is exemplified by, but not particularly limited to, a PCR method, an RT-PCR method, and an LCR method (Ligase Chain Reaction: Barany, F., Proc. Natl. Acad. Sci. USA, Vol.88, p.189-193, 1991).
  • the nucleic acid amplification technique is also preferably exemplified by isothermal amplification methods such as a SmartAmp (Registered Trademark: Smart Amplification Process) method (see also Japanese Patent No.
  • LAMP Loop-Mediated Isothermal Amplification
  • ICAN Isothermal and Chimeric primer-initiated Amplification of Nucleic acids
  • RCA rolling circle amplification
  • Examples of the method in which no nucleic acid amplification technique needs to be used mainly include a CAGE method (Nature Methods 3 (2006), 211-222), RNA-seq, and RNA-seq on RNA samples enriched or depleted in specific RNA targets, for example samples enriched in sequences from RNA biomarkers by a hybridization method similar to that of the SeqCap lncRNA Enrichment Kit by Roche.
  • the biomarker may also be detected by use of an artificial nucleic acid in which an exciton effect is used (see also Japanese Patent No. 4370385, for example).
  • Cancer-Eprobe-FISH Fluorescent in situ hybridization
  • biomarker can be applied to, not limited to the biomarkers of the present invention, all biomarkers (including publicly-known and unknown biomarkers).
  • a result obtained by carrying out a cancer detection method in accordance with the present invention can be utilized as one of materials used by a doctor for diagnosis. Therefore, the present invention also may provide "a method for diagnosing cancer by use of a result obtained by carrying out a cancer detection method in accordance with the present invention".
  • therapeutic strategy can be decided.
  • Examples of the decision on the therapeutic strategy encompass (i) selection from treatment cessation, chemotherapy, radiotherapy, a surgical operation, etc. and/or (ii) selection of a drug to be used.
  • a doctor diagnoses that "a subject has a cancer" based on a result obtained by carrying out the method according to the present invention
  • another inspection e.g., echography, endoscopy, radioscopy, CT scanning, MRI scanning and/or PET scanning
  • an age of the subject e.g., a definitive diagnosis as to whether or not the subject has a cancer is made based on the result of the biopsy, and thereafter a therapeutic strategy for the subject is decided.
  • a doctor can diagnose that a subject has a cancer in the following manner, for example. First, a certain reference value is predetermined. Then, if a value measured in the subject is equal to or higher than the reference value, the doctor can diagnose that "the subject has a cancer".
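  • The reference-value comparison described above can be written as a simple check. This is only a sketch: the function name and the `direction` parameter are assumptions, and the "down" branch (flagging a value at or below the reference for a downregulated biomarker) is an extrapolation that the text does not spell out. The output is intended as one input to a doctor's judgment, not as a diagnosis in itself.

```python
def flag_against_reference(measured, reference, direction="up"):
    """Return True if the measured biomarker amount meets the reference criterion.

    direction="up":   flag when measured >= reference (as described in the text).
    direction="down": flag when measured <= reference (assumed for markers that
                      are downregulated in cancer).
    """
    if direction == "up":
        return measured >= reference
    if direction == "down":
        return measured <= reference
    raise ValueError("direction must be 'up' or 'down'")


# Toy numbers for illustration only.
print(flag_against_reference(12.5, 10.0))           # prints True
print(flag_against_reference(12.5, 10.0, "down"))   # prints False
```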
  • the subject can receive a suitable therapy based on the therapeutic strategy.
  • a kit in accordance with the present invention is a cancer detection kit including the biomarker described in (1. Biomarker for cancer) or the biomarker set described in (2. Biomarker set).
  • the cancer detection kit in accordance with the present invention may further appropriately include at least one of, for example, various reagents, an appliance, an instruction for use of the detection kit, a sample for comparison to be used during detection, and data for comparison to be used to analyze a result of the detection.
  • the instruction for use of the detection kit records therein the details of the cancer detection method in accordance with the present invention, which cancer detection method is described earlier in (3. Cancer detection method).
  • the present invention also provides a method for diagnosis of cancer including the step of measuring, in a sample collected from a living body, the amount of the biomarker described in (1. Biomarker for cancer) or the first and second biomarkers included in the biomarker set described in (2. Biomarker set).
  • a subject for which a result indicating that the subject may have cancer has been obtained by carrying out the detection method described earlier in (3. Cancer detection method) can be provided with appropriate treatment following a diagnosis by a doctor.
  • examples of the treatment mainly include chemotherapy, radiotherapy, and surgery, each carried out by a doctor or by a specialist different from the doctor.
  • the present invention also provides a cancer treatment method including the step of measuring, in a sample collected from a living body, the amount of the biomarker described in (1. Biomarker for cancer) or the first and second biomarkers included in the biomarker set described in (2. Biomarker set).
  • the present invention is not limited to the embodiments, but can be altered by a person skilled in the art within the scope of the claims.
  • An embodiment derived from a proper combination of technical means each disclosed in a different embodiment is also encompassed in the technical scope of the present invention. Further, it is possible to form a new technical feature by combining the technical means disclosed in the respective embodiments.
  • CAGE Cap Analysis of Gene Expression
  • To confirm that these transcripts are relevant to cancer, inventors compared their expression in 4,055 primary tumors and 563 matching tissue sets profiled by the TCGA (https://tcga-data.nci.nih.gov/) and, in a smaller set of colorectal tumor samples (Non-Patent Literature 6), confirmed protein level changes. Inventors also compared inventors' lists of pan-cancer lncRNAs to the 229 ‘onco-lncRNAs’ identified by Cabanski et al. (Non-Patent Literature 7) and the 6485 ‘cancer-associated lncRNA genes’ from the miTranscriptome study (Non-Patent Literature 8).
  • inventors compare inventors' results from pure populations of cancer cell lines and normal primary cells of FANTOM5 with results generated from tumor samples that are made up of varying and complex mixtures of cell types. Inventors find an overlapping but distinct set of transcripts. Finally, for the most promising biomarker candidates inventors performed qRT-PCR validations in cancer cell lines and tumor cDNA panels. Taken together, inventors' analyses allowed for identification of a set of robust pan cancer biomarker candidates, which have the potential for development as blood biomarkers for early detection and for histological screening of biopsies.
  • For protein coding genes, inventors first screened for genes that are robustly up regulated in cancer cell lines vs normal cells (promoter up-regulation fold change > 4, gene-wide (all promoters combined) fold change > 2, FDR < 0.01). Second, inventors performed differential analysis using TCGA tumor data in 14 tumor types. Inventors selected genes that were differentially expressed (fold change > 2, FDR < 0.01) in at least 10 cancer origins (10 out of 14). The final selection includes genes that were differentially expressed in both cancer cell lines and at least 10 TCGA tumor types. Inventors excluded genes that were previously reported in other pan-cancer studies (ONCOMINE and Cabanski et al.
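  • The screening steps for protein coding genes can be sketched as a chain of filters. This is a minimal illustration under stated assumptions, not the inventors' pipeline: the `GeneStats` record and function names are hypothetical, fold changes are taken as cancer over normal (up-regulation only), the FDR cutoffs are read as upper bounds (FDR below 0.01), and "at least 10 of 14 tumor types" is used as the TCGA criterion.

```python
from dataclasses import dataclass


@dataclass
class GeneStats:
    name: str
    promoter_fc: float     # cell-line promoter-level fold change (cancer / normal)
    gene_fc: float         # cell-line gene-wide fold change, all promoters combined
    cell_line_fdr: float   # FDR of the cell-line comparison
    tcga: dict             # tumor type -> (fold_change, fdr)
    previously_reported: bool = False  # e.g. listed in earlier pan-cancer studies


def passes_cell_line_screen(gene):
    """Promoter fold change > 4, gene-wide fold change > 2, FDR < 0.01."""
    return gene.promoter_fc > 4 and gene.gene_fc > 2 and gene.cell_line_fdr < 0.01


def tcga_hit_count(gene):
    """Number of tumor types with fold change > 2 and FDR < 0.01."""
    return sum(1 for fc, fdr in gene.tcga.values() if fc > 2 and fdr < 0.01)


def select_protein_coding(genes, min_tumor_types=10):
    """Keep genes passing both screens and not previously reported."""
    return [
        g.name
        for g in genes
        if passes_cell_line_screen(g)
        and tcga_hit_count(g) >= min_tumor_types
        and not g.previously_reported
    ]


# Toy records with placeholder names; not data from the study.
def _toy_tcga(n_hits):
    return {f"type_{i}": (3.0, 0.001) if i < n_hits else (1.1, 0.5) for i in range(14)}


toy_genes = [
    GeneStats("GENE_A", 8.0, 3.0, 0.001, _toy_tcga(11)),
    GeneStats("GENE_B", 3.0, 3.0, 0.001, _toy_tcga(14)),        # fails promoter screen
    GeneStats("GENE_C", 8.0, 3.0, 0.001, _toy_tcga(9)),         # too few tumor types
    GeneStats("GENE_D", 8.0, 3.0, 0.001, _toy_tcga(12), True),  # previously reported
]
print(select_protein_coding(toy_genes))  # prints ['GENE_A']
```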
  • lncRNAs For lncRNAs, inventors first screened for genes that are robustly up regulated in cancer cell lines vs normal cells (promoter up-regulation fold change > 4, FDR ⁇ 0.01). Then inventors investigated miTranscriptome data base of non-coding RNAs that are associated with cancer type or cell lineage. Inventors selected lncRNAs that were up regulated in FANTOM5 cancer cell lines and in at least one TCGA cancer type (from miTranscriptome database).
  • miTranscriptome database contains ⁇ 8,000 long noncoding RNAs (lncRNAs) differentially expressed within a cancer type and/or tissue type, however it does not prioritize/select which of them are suitable for biomarkers (which lncRNAs out of 8,000 listed can be used as biomarkers). Additionally, miTranscriptome does not focus on lncRNAs that are expressed in multiple cancer types (no pan cancer analysis).
  • lncRNAs long noncoding RNAs
  • Inventors' approach provides a unique and novel way to select the best lncRNA biomarkers by selecting the top X candidates that were robustly upregulated in cancer cell lines and in one or more clinical tumor types from the miTranscriptome database.
  • select the best cancer biomarkers from ~8,000 candidates of the miTranscriptome database.
  • inventors claim that they are applicable for diagnosis of multiple cancer types. Even though some of them were associated with only one cancer type in the miTranscriptome database, in FANTOM5 they were up-regulated in cancer cell lines from multiple cancer types; thus inventors claim them as pan cancer biomarkers.
  • qPCR validations were performed in multiple cancer types.
  • the cancer cell line and primary cell datasets were divided into three subsets. Firstly, inventors attempted to match cell lines derived from solid tissues to primary cell counterparts in the FANTOM5 collection. Cell lines and primary cells that could be matched are referred to as matched-solid. The remaining cell lines and primary cells from solid tissue are referred to as unmatched-solid. Lastly, matched hematopoietic cells and their blood-cancer counterparts are referred to as matched-blood. In each data set, inventors identified promoters that were differentially expressed between cancer and normal (edgeR (Non-Patent Literature 10), >4 fold change, FDR < 0.01).
  • Inventors also performed an alternative binary analysis (referred to as an ON/OFF analysis) to identify transcripts that were consistently switched off or switched on in cancer.
  • Inventors' motivation was that transcripts that are switched on or off in cancer are more promising as candidates for biomarkers and therapeutic targets than up and down regulated transcripts.
  • Inventors used the criteria that they were 4 times more often expressed (switched ON) or not detected (switched OFF) in the cancer group compared to the normal group, using a significance level of FDR < 0.01 by Fisher’s exact test (examples in Figure 1B).
  • Inventors then merged the results of the ON/OFF and edgeR analyses to obtain a final selection of up and down regulated promoters.
  • the flowchart of the differential expression pipeline is presented on Figure 1A.
  • <Differentially expressed protein coding genes are enriched in cancer-associated genes>
  • the inventors identified 911 promoters of protein coding genes that represented 656 unique genes: 435 up regulated and 221 down regulated.
  • pan cancer genes such as TERT, PRAME or TOP2A (Non-Patent Literature 14, Non-Patent Literature 15) and other genes such as MYEOV and MNX1 involved in blood malignancies (Non-Patent Literature 16, Non-Patent Literature 17) and FAM111B in prostate cancer (Non-Patent Literature 18).
  • The spectral count data (Non-Patent Literature 6) were available for 239 out of 656 differentially expressed genes in cancer cell lines. 20 mRNAs/proteins were up regulated in both the cancer cell lines (CAGE) and colorectal tumors (mass spec data) versus 16 genes being up regulated in both TCGA comparisons (RNA-seq and mass spec). Notably, 4 genes were robustly up regulated in all 3 comparisons: MCM2, TOP2A, ASNS and MKI67.
  • PKMYT1 protein coding genes
  • BLM protein coding genes
  • GABRD protein coding genes
  • inventors performed qRTPCR validation in cancer cell lines vs. primary cells and also in a cDNA panel covering 8 tumor types and normal matching tissues.
  • the targets were highly significantly up regulated in both cancer cell lines and tumors ( Figure 2 and Figure 3) as compared with CK19, a conventionally known potential biomarker for a plurality of cancers which was used as a reference.
  • Based on the cancer cell line analysis, inventors observed 246 aberrantly expressed promoters that overlapped 181 long non-coding RNA gene models from GENCODE 19. Additionally, the incorporation of the recently published miTranscriptome study (Non-Patent Literature 8) allowed inventors to annotate a further 90 lncRNAs. Combined, inventors identified 271 differentially expressed lncRNAs. The majority (247 lncRNAs) were up-regulated, while 24 were down-regulated.
  • Inventors note that 39 and 5 of these were up and down regulated, respectively, in both the cancer cell line analysis and at least one tumor type in the miTranscriptome study (Non-Patent Literature 8).
  • To focus on robustly differentially expressed lncRNAs, inventors selected those that showed consistent expression change in the FANTOM5 cancer cell line analysis and were also cancer associated in at least 2 cancer types in the miTranscriptome study (Non-Patent Literature 8). This identified 21 consistently up regulated and 2 consistently down regulated lncRNAs.
  • Inventors' analysis of pre-processed TCGA RNA-seq data also allowed inventors to confirm de-regulation of 4 lncRNAs: 2 already confirmed by miTranscriptome and Cabanski (down-regulated MEG3 and up-regulated DGCR5) and 2 that were missed by other reports, namely down-regulation of the MT1L pseudogene and, most notably, the up-regulation of PVT1, which is a known lncRNA oncogene (Non-Patent Literature 20).
  • Inventors also observe RP11-1070N10.5 neighboring the TCL6 (lincRNA), TCL1A and TCL1B oncogenes (located in a breakpoint cluster region on chromosome 14q32 in T-cell leukemia (Non-Patent Literature 21)) and HOXA11-AS, neighboring HOXA13 and HOXA9 and overlapping the HOXA11 oncogene.
  • 5 out of 6 cancer related genes located within 1 kb of an up regulated lncRNA were also up regulated; these include the MCF2L, GATA2 and MNX1 oncogenes and the BSG and CSAG1 cancer antigens.
  • <Activation of repeat elements in cancer> Globally, about 20% of all FANTOM5 promoters initiate from within repetitive elements and low complexity DNA sequences annotated by RepeatMasker.
  • the up-regulated promoters tended to be enriched in repetitive elements, most noticeably retrotransposons (SINE/Alu, LINE/L1, LTR/ERV1, LTR/ERVL).
  • SINE/Alu and LINE/L1 overlapping promoters tended to be located in intronic regions of protein coding genes (49% and 32%, respectively) and not associated with known RNA transcripts.
  • the up-regulated promoters overlapping LTR/ERV1 often initiated the expression of lncRNAs (31 GENCODE lncRNAs and 48 miTranscriptome lncRNAs).
  • <Bidirectional transcription from REP522 satellite repeat is activated in cancer> Interestingly, a specific repeat family, REP522, was strongly enriched in the most up-regulated promoters.
  • Inventors observed that in most cases the transcription is initiated bidirectionally and 5 overlap regions previously annotated as enhancers.
  • REP522 element represents true biological signal and was not due to a mapping artifact
  • inventors performed qPCR validation for 11 up-regulated, REP522 associated transcripts from different genomic regions in 3 cancer cell lines and dermal fibroblast cells as a control. For 8 of these inventors confirmed up-regulation in the cancer cell lines compared to normal fibroblasts. In one case inventors confirmed the bidirectional transcription of CCD144NL and CCD144NL-AS1 from one REP522 element. To inventors' knowledge this is the first report implicating REP522 activation in cancer.
  • Taking advantage of the fact that CAGE data can be used to estimate the activity of enhancers from balanced bidirectional capped transcription (Non-Patent Literature 9), inventors performed differential expression analysis based on CAGE tag counts under 43,011 CAGE defined enhancers (Non-Patent Literature 9) using the same differential expression pipeline as for the promoter regions. Inventors found 28 pan-cancer enhancers up regulated in solid and blood cancers and a further 62 up regulated in solid cancers only.
  • Chromatin Interaction Analysis Paired-End Tags (ChIA-PET) data from the ENCODE project (GSE39495; datasets using antibodies against estrogen receptor alpha (ERa), RNA polymerase II (RNAPII), and CCCTC binding factor (CTCF) in five different cancer cell lines, K562, HCT116, HeLa-S3, MCF-7, and NB4) to associate these pan-cancer enhancers with target genes.
  • ERa estrogen receptor alpha
  • RNAPII RNA polymerase II
  • CTCF CCCTC binding factor
  • a further 16 of the enhancers were linked to cancer related genes, including 6 oncogenes (BCL9, CREB1, ZNF384, SALL4, TFRC and BTG1), 2 tumor suppressors (ING4, KCTD11) and 5 Mut-Drivers (PIK3CB, CLIP1, KIFC3, GPS2 and CARM1). Additionally, 8 of the up-regulated enhancers were linked to promoters found to be up regulated in cancer cell lines, including cancer linked genes such as TNFSF12 and PIK3R3 (reported to increase tumor migration and invasion when overexpressed (Non-Patent Literature 23)).
  • inventors avoid the complication of expression profiling of the complex mixtures of cells in both tumors (infiltrating lymphocytes, stroma and blood vessels) and normal tissues, as differentially expressed genes in such comparisons may reflect differences in cell composition as well as the transformed state.
  • this issue is acknowledged in the sample collection procedures of the TCGA (Non-Patent Literature 3). They require that the tumor samples contain at least 60% tumor cells and less than 20% necrosis to be included in the study.
  • the normal tissue counterparts to which they are compared are complex mixtures of cells.
  • inventors acknowledge that artifacts from the long-term culture of cell lines and their artificial in-vitro culture conditions can affect inventors' analysis; however, by focusing on matched primary cell counterparts, inventors avoid the cell composition issue.
  • the relevance of inventors' identified gene sets is confirmed by enrichment for oncogenes and tumor suppressors, 19 known tumor cancer antigens and overlap with those perturbed in primary tumor datasets.
  • inventors also report a collection of 271 lncRNAs and 90 eRNAs that are consistently perturbed in cancer cell lines and have the potential for use as biomarkers. Additionally, inventors report an interesting phenomenon that those overlapping retrotransposon elements are more likely to be up regulated in cancer.
  • Basil et al. (Non-Patent Literature 24) reported a common cancer signature of 332 genes from 373 archival samples profiled using microarrays
  • Rhodes et al. identified a multicancer-type meta-signature of 67 genes up regulated in cancer by performing a meta-analysis of 40 published microarray experiments
  • Xu et al. reported 46 genes consistently up-regulated across 21 microarray data sets (Non-Patent Literature 25).
  • TOP2A is targeted by etoposide (Non-Patent Literature 26)
  • ASNS is targeted in asparaginase therapy of acute lymphoblastic leukemia (Non-Patent Literature 27) and both MKI67 and MCM2 have been studied as biomarkers (Non-Patent Literature 28, Non-Patent Literature 29) and potential drug targets (Non-Patent Literature 30, Non-Patent Literature 31).
  • the advantage of developing drugs to target these molecules is that, because they are recurrently up regulated across many cancer types, they are likely to be of therapeutic value to many patients.
  • eRNAs that are up-regulated in a pan-cancer manner; similar to the repeats discussed below, their activation could suggest a reversion to a more stem like state, bystander amplification and co-activation (neighbouring genes are more likely to be co-expressed) or demethylation and opening of chromatin permissive for transcription initiation.
  • The finding that promoters that overlap repetitive elements are activated in cancer is quite interesting, and the validation of the REP522 derived transcripts is novel.
  • One instance of REP522 near the BAGE locus has been reported to be marked with H3K9me3 and actively transcribed (Non-Patent Literature 32), however relatively little is known about this element.
  • Non-Patent Literature 33 RNA based biomarkers have been identified and commercialized.
  • RNA interference mediated cancer treatments Non-Patent Literature 38
  • some first-in-humans trials already reported in metastatic cancers Non-Patent Literature 39.
  • Inventors split the data into 3 data sets: a) matched solid - data for 10 types of solid cancer, for which both cancer cell lines (total 72) and corresponding normal primary cells (total 65) were available, b) unmatched solid - data of 97 cancer cell lines and 254 primary cells that could not be unambiguously matched and c) matched blood - data for 2 types of blood cancer with cancer cell lines (total 51) matched to corresponding normal primary cells (total 74).
  • The CAGE tag counts under 184,827 robust DPI (decomposition-based peak identification) clusters (Non-Patent Literature 5) were used to represent promoter-level genome-wide expression. For the enhancer activity, inventors used the CAGE tag counts under 43,011 enhancer regions identified by the presence of balanced bidirectional capped transcription (Non-Patent Literature 9).
  • <FANTOM5 differential expression analysis> Inventors used edgeR to identify up and down regulated transcripts and ON/OFF analysis to identify genes that were switched on and off in cancer cell lines versus normal primary cells.
  • the raw count data were modeled by Genewise Negative Binomial Generalized Linear Models as implemented in edgeR (Non-Patent Literature 10).
  • the coefficients of the design matrix were set to group the samples into 20 groups according to the tissue origin and sample category: 10 cancer origins and 10 normal tissues.
  • the coefficients of the design matrices grouped the samples into 2 groups: cancer and normal.
  • the cancer vs normal comparison was done simply by setting the contrast's coefficients to 1 for cancer and -1 for normal.
  • the p-values were adjusted for multiple testing by Benjamini-Hochberg False Discovery Rate (FDR) method.
  • FDR Benjamini-Hochberg False Discovery Rate
  • Inventors obtained the RNA-Seq profiling data of 4055 cancer samples and 563 normal tissues from The Cancer Genome Atlas (TCGA) Data Portal (https://tcga-data.nci.nih.gov/tcga/). The profiles represented 14 solid cancer types for which both tumor and normal tissue samples were available. Blood malignancies were not included in the analysis because no RNA-Seq data was available for the normal tissues. Inventors downloaded level3 RNASeqV2, upper quartile normalized RSEM count estimates (data status as of Aug 5, 2013). After merging, inventors obtained a matrix with expression profiles of 20531 genes in 4618 samples.
  • For the list of genes frequently mutated in cancer (also called Mut Drivers), inventors used the list of High Confidence Drivers mutated across 12 cancer types as reported by Tamborero et al. (Non-Patent Literature 12). Inventors also tested for enrichment of cancer related genes listed in COSMIC: Cancer Gene census (Non-Patent Literature 13).
  • <Proteome data analysis> Inventors used the spectral count data table for 90 colorectal cancer and 30 normal samples generated by the Liebler lab (Non-Patent Literature 6) for all 7244 genes (no low count filtering was performed). The provided data were produced by liquid chromatography-tandem mass spectrometry (LC-MS/MS) and were quantile-normalized, followed by log base 2 transformation by the authors. For each gene, inventors performed a non-parametric two-sided Wilcoxon rank-sum test to identify differentially expressed genes in colorectal cancer (vs. normal tissue) at protein level. Inventors adjusted the p-values for multiple testing with the Benjamini-Hochberg FDR method.
  • Hct116-Pol2, Helas3-Pol2, K562-Ctcf, K562-Pol2 (2 replicates), Mcf7-Ctcf (2 replicates), Mcf7-ERalpha a (3 replicates), Mcf7-Pol2 (4 replicates), Nb4-Pol2.
  • Inventors downloaded the ChIA-PET Chromatin Interaction PET clusters in BED format, and merged the interaction from all experiments. The interactions linking sites on different chromosomes were removed. Inventors then extracted the ChIA-PET interaction clusters overlapping the genomic locations of enhancers and searched if the linked genomic locations overlap promoters of the annotated genes. The visualization of associations between the enhancers and promoters was performed in ZENBU (Non-Patent Literature 43).
  • <Quantitative PCR (qPCR) for cell line samples and human cancer/normal tissue cDNA> qPCR was carried out with ABI 7500 Fast Real-Time PCR System (Thermo Fisher Scientific, MA) according to the Primers for real-time PCR were designed by the Primer3 web tool and synthesized by Eurofins Genomics, Tokyo. The housekeeping gene ACTB was utilized as an internal control to normalize the expression levels (the primer was supplied by OriGene, MD).
  • 15 µl of Power SYBR Green PCR Master Mix (Thermo Fisher Scientific, MA), 1 µl of 10 µM forward primer, 1 µl of 10 µM reverse primer, and 13 µl of DNA/RNA free distilled water were applied to each reaction with cDNA in each well on the tissue panel.
  • Power SYBR Green PCR Master Mix Thermo Fisher Scientific, MA
  • the thermal cycle conditions for qPCR were 50°C for 2 min, 95°C for 10 min, followed by 45 cycles of 95°C for 15 sec (denaturation), 55°C or 53°C for 1 min (annealing), and 72°C for 1 min (extension). Each reaction was run in triplicate for cell line samples, and singlet for human cancer/normal tissue cDNA panel.
  • the present invention can be widely used in the field of diagnostic medicine and the field of health and medical science.
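
The ON/OFF analysis described in the excerpts above (Fisher's exact test, Benjamini-Hochberg FDR < 0.01, detection at least 4 times more frequent in the cancer group) can be illustrated with a minimal sketch. This is not the inventors' code: the function names, the one-sided form of the test, the reading of "4 times more often" as a ratio of detection fractions, and the example counts are all assumptions made for illustration only.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on a 2x2 table (assumed sidedness).

    a: cancer samples with the transcript detected, b: cancer samples without
    c: normal samples with the transcript detected, d: normal samples without
    Returns P(X >= a) under the hypergeometric null distribution.
    """
    n_cancer, n_normal, n_detected = a + b, c + d, a + c
    total = comb(n_cancer + n_normal, n_detected)
    upper = min(n_cancer, n_detected)
    favourable = sum(
        comb(n_cancer, k) * comb(n_normal, n_detected - k)
        for k in range(a, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def switched_on(counts, fdr_cutoff=0.01, min_ratio=4.0):
    """Select promoters that look switched ON in cancer.

    counts maps a promoter name to
    (cancer_detected, cancer_total, normal_detected, normal_total).
    """
    names = list(counts)
    pvals = []
    for name in names:
        on_c, n_c, on_n, n_n = counts[name]
        pvals.append(fisher_exact_greater(on_c, n_c - on_c, on_n, n_n - on_n))
    fdr = benjamini_hochberg(pvals)
    selected = []
    for name, q in zip(names, fdr):
        on_c, n_c, on_n, n_n = counts[name]
        frac_c, frac_n = on_c / n_c, on_n / n_n
        if q < fdr_cutoff and frac_c > 0 and frac_c >= min_ratio * frac_n:
            selected.append(name)
    return selected


# Hypothetical detection counts (72 cancer cell lines, 65 primary cell samples).
EXAMPLE_COUNTS = {
    "promoter_on": (60, 72, 2, 65),
    "promoter_flat": (30, 72, 27, 65),
    "promoter_weak": (10, 72, 5, 65),
}

if __name__ == "__main__":
    print(switched_on(EXAMPLE_COUNTS))
```

The switched OFF case would follow by exchanging the roles of "detected" and "not detected". The edgeR step is not reproduced here, since it relies on negative binomial generalized linear models rather than on a count table of this kind.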

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A biomarker for cancer in accordance with the present invention is one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 5 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 5 or translation products derived therefrom.

Description

BIOMARKER FOR CANCER AND USE THEREOF
The present invention relates to biomarkers for cancer.
A successful therapy of cancer patients often depends heavily on early detection and diagnosis; therefore there is a great need for reliable and clinically applicable cancer biomarkers. Despite decades of research, relatively few biomarkers are routinely used in clinics; for example, CA-125 and PSA are used as serum markers for detection of ovarian and prostate cancers, respectively (Non-Patent Literature 1, Non-Patent Literature 2).
Cancers of the same tissue of origin can be very heterogeneous, often being derived from different cell types and having drastically different mutation profiles, e.g. (Non-Patent Literature 3). At the same time, cancers of different origins share some common features, as shown in a series of pan cancer studies recently published by The Cancer Genome Atlas (TCGA) consortium (Non-Patent Literature 4), which report genes and pathways affected by DNA copy number alterations, mutations, methylation and transcriptome changes across 12 primary tumor types (Non-Patent Literature 4).
Non-Patent Literature 1: Felder M, Kapur A, Gonzalez-Bosquet J, Horibata S, Heintz J, Albrecht R, et al. MUC16 (CA125): tumor biomarker to cancer therapy, a work in progress. Mol Cancer. 2014;13:129.
Non-Patent Literature 2: Makarov DV, Loeb S, Getzenberg RH, Partin AW. Biomarkers for prostate cancer. Annu Rev Med. 2009;60:139-51.
Non-Patent Literature 3: Cancer Genome Atlas Research Network, Kandoth C, Schultz N, Cherniack AD, Akbani R, Liu Y, et al. Integrated genomic characterization of endometrial carcinoma. Nature. 2013;497:67-73.
Non-Patent Literature 4: Cancer Genome Atlas Research Network, Genome Characterization Center, Chang K, Creighton CJ, Davis C, Donehower L, et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. 2013;45:1113-20.
Non-Patent Literature 5: FANTOM Consortium and the RIKEN PMI and CLST (DGT). A promoter-level mammalian expression atlas. Nature. 2014;507:462-70.
Non-Patent Literature 6: Zhang B, Wang J, Wang X, Zhu J, Liu Q, Shi Z, et al. Proteogenomic characterization of human colon and rectal cancer. Nature. 2014;513:382-7.
Non-Patent Literature 7: Cabanski CR, White NM, Dang HX, Silva-Fisher JM, Rauck CE, Cicka D, et al. Pan-cancer transcriptome analysis reveals long noncoding RNAs with conserved function. RNA Biol. 2015;:0.
Non-Patent Literature 8: Iyer MK, Niknafs YS, Malik R, Singhal U, Sahu A, Hosono Y, et al. The landscape of long noncoding RNAs in the human transcriptome. Nat Genet. 2015.
Non-Patent Literature 9: Andersson R, Gebhard C, Miguel-Escalada I, Hoof I, Bornholdt J, Boyd M, et al. An atlas of active enhancers across human cell types and tissues. Nature. 2014;507:455-61.
Non-Patent Literature 10: Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139-40.
Non-Patent Literature 11: Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, Kokocinski F, et al. GENCODE: The reference human genome annotation for The ENCODE Project. Genome Res. 2012;22:1760-74.
Non-Patent Literature 12: Tamborero D, Gonzalez-Perez A, Perez-Llamas C, Deu-Pons J, Kandoth C, Reimand J, et al. Comprehensive identification of mutational cancer driver genes across 12 tumor types. Sci Rep. 2013;3:2650.
Non-Patent Literature 13: Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, et al. A census of human cancer genes. Nat Rev Cancer. 2004;4:177-83.
Non-Patent Literature 14: Fratta E, Coral S, Covre A, Parisi G, Colizzi F, Danielli R, et al. The biology of cancer testis antigens: putative function, regulation and therapeutic potential. Mol Oncol. 2011;5:164-82.
Non-Patent Literature 15: Rhodes DR, Yu J, Shanker K, Deshpande N, Varambally R, Ghosh D, et al. Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression. Proc Natl Acad Sci USA. 2004;101:9309-14.
Non-Patent Literature 16: Janssen JW, Vaandrager JW, Heuser T, Jauch A, Kluin PM, Geelen E, et al. Concurrent activation of a novel putative transforming gene, myeov, and cyclin D1 in a subset of multiple myeloma cell lines with t(11;14)(q13;q32). Blood. 2000;95:2691-8.
Non-Patent Literature 17: Taketani T, Taki T, Sako M, Ishii T, Yamaguchi S, Hayashi Y. MNX1-ETV6 fusion gene in an acute megakaryoblastic leukemia and expression of the MNX1 gene in leukemia and normal B cell lines. Cancer Genet Cytogenet. 2008;186:115-9.
Non-Patent Literature 18: Akamatsu S, Takata R, Haiman CA, Takahashi A, Inoue T, Kubo M, et al. Common variants at 11q12, 10q26 and 3p11.2 are associated with prostate cancer susceptibility in Japanese. Nat Genet. 2012;44:426-9-S1.
Non-Patent Literature 19: Barrett CW, Ning W, Chen X, Smith JJ, Washington MK, Hill KE, et al. Tumor suppressor function of the plasma glutathione peroxidase gpx3 in colitis-associated carcinoma. Cancer Res. 2013;73:1245-55.
Non-Patent Literature 20: Tseng Y-Y, Moriarity BS, Gong W, Akiyama R, Tiwari A, Kawakami H, et al. PVT1 dependence in cancer with MYC copy-number increase. Nature. 2014;512:82-6.
Non-Patent Literature 21: Saitou M, Sugimoto J, Hatakeyama T, Russo G, Isobe M. Identification of the TCL6 genes within the breakpoint cluster region on chromosome 14q32 in T-cell leukemia. Oncogene. 2000;19:2796-802.
Non-Patent Literature 22: Wheeler TJ, Clements J, Eddy SR, Hubley R, Jones TA, Jurka J, et al. Dfam: a database of repetitive DNA based on profile hidden Markov models. Nucleic Acids Res. 2013;41:D70-82.
Non-Patent Literature 23: Wang G, Yang X, Li C, Cao X, Luo X, Hu J. PIK3R3 induces epithelial-to-mesenchymal transition and promotes metastasis in colorectal cancer. Mol Cancer Ther. 2014;13:1837-47.
Non-Patent Literature 24: Basil CF, Zhao Y, Zavaglia K, Jin P, Panelli MC, Voiculescu S, et al. Common cancer biomarkers. Cancer Res. 2006;66:2953-61.
Non-Patent Literature 25: Xu L, Geman D, Winslow RL. Large-scale integration of cancer microarray data identifies a robust common cancer signature. BMC Bioinformatics. 2007;8:275.
Non-Patent Literature 26: Johnson CA, Padget K, Austin CA, Turner BM. Deacetylase activity associates with topoisomerase II and is necessary for etoposide-induced apoptosis. J Biol Chem. 2001;276:4539-42.
Non-Patent Literature 27: Hawkins DS, Park JR, Thomson BG, Felgenhauer JL, Holcenberg JS, Panosyan EH, et al. Asparaginase pharmacokinetics after intensive polyethylene glycol-conjugated L-asparaginase therapy for children with relapsed acute lymphoblastic leukemia. Clin Cancer Res. 2004;10:5335-41.
Non-Patent Literature 28: Dudderidge TJ, Stoeber K, Loddo M, Atkinson G, Fanshawe T, Griffiths DF, et al. Mcm2, Geminin, and KI67 define proliferative state and are prognostic markers in renal cell carcinoma. Clin Cancer Res. 2005;11:2510-7.
Non-Patent Literature 29: Wharton SB, Chan KK, Anderson JR, Stoeber K, Williams GH. Replicative Mcm2 protein as a novel proliferation marker in oligodendrogliomas and its relationship to Ki67 labelling index, histological grade and prognosis. Neuropathol Appl Neurobiol. 2001;27:305-13.
Non-Patent Literature 30: Liu Y, He G, Wang Y, Guan X, Pang X, Zhang B. MCM-2 is a therapeutic target of Trichostatin A in colon cancer cells. Toxicol Lett. 2013;221:23-30.
Non-Patent Literature 31: Rahmanzadeh R, Rai P, Celli JP, Rizvi I, Baron-Luhr B, Gerdes J, et al. Ki-67 as a molecular target for therapy in an in vitro three-dimensional model for ovarian cancer. Cancer Res. 2010;70:9234-42.
Non-Patent Literature 32: Ward MC, Wilson MD, Barbosa-Morais NL, Schmidt D, Stark R, Pan Q, et al. Latent regulatory potential of human-specific repetitive elements. Mol Cell. 2013;49:262-72.
Non-Patent Literature 33: Qi P, Du X. The long non-coding RNAs, a new cancer diagnostic and therapeutic gold mine. Mod Pathol. 2013;26:155-65.
Non-Patent Literature 34: Ren S, Wang F, Shen J, Sun Y, Xu W, Lu J, et al. Long non-coding RNA metastasis associated in
Non-Patent Literature 35: Garcia JM, Garcia V, Pena C, Dominguez G, Silva J, Diaz R, et al. Extracellular plasma RNA from colon cancer patients is confined in a vesicle-like structure and is mRNA-enriched. RNA. 2008;14:1424-32.
Non-Patent Literature 36: Casciano I, Vinci AD, Banelli B, Brigati C, Forlani A, Allemanni G, et al. Circulating Tumor Nucleic Acids: Perspective in Breast Cancer. Breast Care (Basel). 2010;5:75-80.
Non-Patent Literature 37: Miura N, Osaki Y, Nagashima M, Kohno M, Yorozu K, Shomori K, et al. A novel biomarker TERTmRNA is applicable for early detection of hepatoma. BMC Gastroenterol. 2010;10:46.
Non-Patent Literature 38: Fellmann C, Lowe SW. Stable RNA interference rules for silencing. Nat Cell Biol. 2014;16:10-8.
Non-Patent Literature 39: Tabernero J, Shapiro GI, LoRusso PM, Cervantes A, Schwartz GK, Weiss GJ, et al. First-in-humans trial of an RNA interference therapeutic targeting VEGF and KSP in cancer patients with liver involvement. Cancer Discov. 2013;3:406-17.
Non-Patent Literature 40: Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005;102:15545-50.
Non-Patent Literature 41: Magrane M, Consortium U. UniProt Knowledgebase: a hub of integrated protein data. Database (Oxford). 2011;2011:bar009.
Non-Patent Literature 42: Zhao M, Sun J, Zhao Z. TSGene: a web resource for tumor suppressor genes. Nucleic Acids Res. 2013;41:D970-6.
Non-Patent Literature 43: Severin J, Lizio M, Harshbarger J, Kawaji H, Daub CO, Hayashizaki Y, et al. Interactive visualization and analysis of large-scale sequencing datasets using ZENBU. Nat Biotechnol. 2014;32:217-9.
Non-Patent Literature 44: Zhang L, Huang J, Yang N, Greshock J, Liang S, Hasegawa K, et al. Integrative genomic analysis of phosphatidylinositol 3'-kinase family identifies PIK3R3 as a potential therapeutic target in epithelial ovarian cancer. Clin Cancer Res. 2007;13:5314-21.
It is desired that a novel biomarker for cancer be developed. Under the circumstances, a first object of the present invention is to provide a novel biomarker for cancer.
Further, it is also desired that a biomarker capable of detecting a plurality of types of cancers be developed. Under the circumstances, a second object of the present invention is to provide a novel biomarker capable of detecting a plurality of types of cancers.
In order to attain the objects, the present invention includes at least one of the following aspects.
(1) A biomarker for cancer being one of:
transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 3 or translation products derived therefrom.
(2) A biomarker set comprising:
a first biomarker for cancer which first biomarker is one of:
transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 3, or translation products derived therefrom; and
a second biomarker for cancer which second biomarker is another one of:
the transcription products having at least the part of the RNA sequence corresponding to the DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of the corresponding sequence in "sequence_DPI_only" in Table 3, or the translation products derived therefrom.
(3) A cancer detection method including the step of measuring, in a sample collected from a living body, the amount of the biomarker mentioned in (1) or the first and second biomarkers included in the biomarker set mentioned in (2).
(4) The cancer detection method mentioned in (3), wherein the cancer includes a plurality of types of cancers.
(5) A cancer detection kit comprising the biomarker mentioned in (1) or the biomarker set mentioned in (2).
(6) A biomarker for cancer being a full length or a fragment of one of:
transcription products of genes listed in Table 1, and transcription products or translation products of genes listed in Table 2.
(7) A biomarker set comprising:
a first biomarker for cancer which first biomarker is a full length or a fragment of one of:
transcription products of genes listed in Table 1, and transcription products or translation products of genes listed in Table 2; and
a second biomarker for cancer which second biomarker is a full length or a fragment of another one of:
the transcription products of the genes listed in Table 1, and the transcription products or the translation products of the genes listed in Table 2.
The present invention can provide a novel biomarker for cancer and use of the novel biomarker for cancer.
Summary of comparisons carried out to identify recurrently perturbed transcripts in the FANTOM5 cell line dataset. A) Differential expression pipeline applied to the FANTOM5 data. B) Examples of differentially expressed promoters showing expression switching (ON and OFF) and expression shift (UP and DOWN).
Validation of pan-cancer biomarkers by qRTPCR. For the most promising candidates, inventors performed qRTPCR validation across a cDNA panel of 65 tumors, 7 lesions and 24 normal tissues. It has been conventionally known that CK19 can serve as a biomarker for a plurality of cancers. In the present experiment, therefore, CK19 was used as a reference. 8 pan-cancer marker candidates used in the experiment were each smaller in terms of p-value in a t-test than CK19.
Organ specific gene expression concerning the best 8 pan-cancer marker candidates and CK19. The 8 pan-cancer marker candidates are each identified below with a corresponding number of organs whose cancer samples were higher in average expression level than corresponding normal samples. PKMYT1: 8 organs, BLM: 7 organs, FOXP4AS1 (also referred to as RP11-328M4.2): 7 organs, ENST00000448869 (also referred to as RP11-284F21.7): 8 organs, LOC643401 (also referred to as LINC01021): 5 organs, GABRD: 7 organs, MNX1_AS1: 6 organs, and FIRRE (also referred to as RP11-453F18_B1): 7 organs. In these pan-cancer marker candidates, inversions of expression levels between normal samples and corresponding cancer samples hardly occurred. In the case of CK19, 4 organs were found such that the expression levels were higher in cancer samples than in normal samples, and 2 organs were found such that the expression levels were lower in cancer samples than in normal samples. It is thus indicated that the best 8 pan-cancer marker candidates are more versatile than CK19.
The following specifically describes an embodiment of the present invention.
(1. Biomarker for cancer)
A biomarker for cancer in accordance with one embodiment of the present invention is one of:
transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 3 or translation products derived therefrom.
A biomarker for cancer in accordance with another embodiment of the present invention is a full length or a fragment of one of:
transcription products of genes listed in Table 1, and transcription products or translation products of genes listed in Table 2.
According to an embodiment of the present invention, the transcription products are RNAs, and the translation products are proteins. The transcription products include processed RNAs.
A biomarker that is a fragment of a transcription product has a length, which is not particularly limited, of, for example, 10 bases or more, 15 bases or more, 20 bases or more, 25 bases or more, 50 bases or more, 100 bases or more, or 150 bases or more, and, 1000 bases or less, 500 bases or less, 250 bases or less, or 200 bases or less. A biomarker that is a fragment of a translation product has a length, which is not particularly limited, of, for example, 3 residues or more, 5 residues or more, 10 residues or more, 20 residues or more, or 30 residues or more, and, 300 residues or less, 200 residues or less, or 100 residues or less.
The genes listed in Table 1 are long-non-coding RNAs (lncRNAs). Table 1 appropriately lists, for example, names, various IDs, and nucleotide sequences of the genes listed in Table 1.
The genes listed in Table 2 are protein coding genes. Thus, a biomarker of the present invention may be a full length or a fragment of a transcription product (mRNA) of a gene listed in Table 2, or a full length or a fragment of a translation product (protein) of a gene listed in Table 2. A biomarker of the present invention which biomarker relates to a gene listed in Table 2 is preferably a full length or a fragment of a transcription product.
Table 3 provides, from another point of view, a summary of information related to the long-non-coding RNAs (lncRNAs) and protein-coding genes each obtained in accordance with a method of Examples. “dpi_id” in Table 3 indicates a location on a chromosome corresponding to DPI (a putative promoter of the long-non-coding RNAs or the protein-coding genes). “range (DPI + 100 downstream)” in Table 3 indicates a location on a chromosome corresponding to both a sequence in the DPI and a sequence of 100 bases downstream of the DPI, where “+” means a location on the plus strand side and “-” means a location on the minus strand side. “sequence_DPI_only” in Table 3 indicates a sequence in a putative promoter region (DPI) obtained by carrying out DPI analysis with respect to a profile which has been analyzed by a CAGE method and which is related to a location on a chromosome of a transcription product and an amount of transcription of the transcription product. “sequence_100downstream” in Table 3 indicates a sequence of 100 bases downstream of the putative promoter region, and “sequence (DPI - UPPERCASE, 100 downstream - lowercase)” in Table 3 is a sequence obtained by combining the sequence in the putative promoter region and the sequence of 100 bases downstream of the putative promoter region (each sequence is indicated as a DNA sequence of a genome). “M.logFC” in Table 3 is obtained by logarithmically transforming the fold change (FC) of expression in cancer versus normal cells; positive values correspond to up-regulation in cancer and negative values to down-regulation in cancer. Thus, in one embodiment, a marker having a greater absolute logFC value is preferable. “summary” in Table 3 suggests a purpose of use as a marker. Specifically, “Pan” in Table 3 indicates the possibility of being a marker for a plurality of types of cancers, and “Solid” in Table 3 indicates the possibility of being a marker for at least a solid cancer.
“ON” in Table 3 indicates that transcription occurs in a cancer that can be a target of the marker, whereas substantially no transcription occurs in a control (non-cancer tissue), and “OFF” in Table 3 indicates that substantially no transcription occurs in a cancer that can be a target of the marker, whereas transcription occurs in a control (non-cancer tissue). “UP” in Table 3 indicates that a cancer that can be a target of the marker further increases in amount of transcription than a control, and “DOWN” in Table 3 indicates that a cancer that can be a target of the marker further decreases in amount of transcription than a control (non-cancer tissue). “gene name” in Table 3 indicates a name of a gene having a putative promoter. As to “gene type” in Table 3, genes classified as “protein_coding” are protein-coding genes, and genes classified as a category different from “protein_coding” are genes whose transcription products are the long-non-coding RNAs. Note that the long-non-coding RNAs, which serve as one of the biomarkers of the present invention, preferably include the sequence in the putative promoter region obtained by carrying out DPI analysis, and more preferably include the sequence in the putative promoter region and the sequence of 10 or more bases downstream of the putative promoter region. Transcription products of the protein-coding genes, which transcription products serve as one of the biomarkers of the present invention, preferably include the sequence in the putative promoter region and the sequence of 10 or more bases downstream of the putative promoter region.
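As a rough illustration of how the “M.logFC” value and the ON/OFF/UP/DOWN labels described above relate to one another, the following Python sketch computes a log fold change from mean expression values and assigns a label. The base-2 logarithm, the pseudocount, the detection threshold, and the function names are assumptions made for illustration only; they are not taken from Table 3 or from the Examples.

```python
import math

PSEUDOCOUNT = 1.0     # assumed; keeps the ratio defined when a mean is zero
DETECTION_MIN = 1.0   # assumed level below which "substantially no transcription" is called


def log_fc(cancer_mean, normal_mean, pseudocount=PSEUDOCOUNT):
    """Log2 fold change, cancer versus normal (positive = up-regulated in cancer)."""
    return math.log2((cancer_mean + pseudocount) / (normal_mean + pseudocount))


def classify(cancer_mean, normal_mean, detection_min=DETECTION_MIN):
    """Return "ON", "OFF", "UP", "DOWN", or None when there is no change."""
    cancer_on = cancer_mean >= detection_min
    normal_on = normal_mean >= detection_min
    if cancer_on and not normal_on:
        return "ON"       # transcribed in cancer, essentially silent in the control
    if normal_on and not cancer_on:
        return "OFF"      # transcribed in the control, essentially silent in cancer
    lfc = log_fc(cancer_mean, normal_mean)
    if lfc > 0:
        return "UP"
    if lfc < 0:
        return "DOWN"
    return None


if __name__ == "__main__":
    for cancer, normal in [(10.0, 0.0), (0.0, 10.0), (20.0, 5.0), (5.0, 20.0)]:
        print(cancer, normal, classify(cancer, normal), round(log_fc(cancer, normal), 2))
```

In this sketch a larger absolute value of `log_fc` corresponds to a stronger separation between cancer and normal samples, which is the property the text above describes as preferable for a marker.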
In one embodiment, the length of a sequence corresponding to a DNA sequence in "sequence_100downstream" in Table 3 contained in the transcription product is, for example, at least 1 base, at least 2, at least 5 bases, at least 10 bases, at least 15 bases, at least 20 bases, at least 30 bases, at least 50 bases, at least 70 bases or 100 bases.
According to an embodiment of the present invention, a biomarker may preferably be a full length or a fragment of one of the transcription products of the genes listed in Tables 1 and 2.
As compared with a subject having no cancer, a subject having cancer contains a biomarker in an amount which is not particularly limited. The amount may be upregulated in the subject having cancer, or may be downregulated in the subject having cancer. Alternatively, expression may be ON in the subject having cancer, or expression may be OFF in the subject having cancer. Tables 1 and 2 show, for each gene, UP/DOWN/ON/OFF in the subject having cancer.
Examples of types of cancers targeted by a biomarker in accordance with the present invention include blood (e.g. lymphoid, myeloid), bone, brain, breast, kidney, liver, lung, melanocyte, mesothelium, ovary, and prostate cancers. Further examples of the types of cancers targeted by the biomarker in accordance with the present invention include bladder urothelial carcinoma, breast invasive carcinoma, colon adenocarcinoma, colorectal carcinoma, head and neck squamous cell carcinoma, kidney chromophobe, kidney renal clear cell carcinoma, kidney renal papillary cell carcinoma, liver hepatocellular carcinoma, lung adenocarcinoma, lung squamous cell carcinoma, prostate adenocarcinoma, rectum adenocarcinoma, thyroid carcinoma, uterine corpus endometrial carcinoma, brain cancer, bone cancer, myeloma, lymphoma, and stomach cancer. Still further examples of the types of cancers targeted by the biomarker in accordance with the present invention include sarcoma, carcinoma, fibrosarcoma, myxosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, colon carcinoma, pancreatic cancer, breast cancer, ovarian cancer, prostate cancer, squamous cell carcinoma, basal cell carcinoma, adenocarcinoma, sweat gland carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinomas, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, testicular cancer, lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, Kaposi's sarcoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma, and the like.
One type of cancer or a plurality of types (e.g., two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, or fourteen or more types) of cancers may be targeted by the biomarker of the present invention. According to an embodiment, a plurality of types of cancers are preferably targeted by the biomarker of the present invention. In a case where a plurality of types of cancers are targeted by the biomarker of the present invention, types of cancers in a subject can be comprehensively examined by use of a single biomarker.
Tables 1 and 2 each show an example of a preferable cancer(s) for each gene. Note, however, that the present invention is not limited to such a combination.
In one embodiment, among the DNA sequences in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3, SEQ ID NOs: 61, 63, 67, 69, 71, 80, 82, and 90 are preferable.
"GABRD", "GPR19", "FLRT2", and "NAALADL1" are cell surface proteins. Thus, in one embodiment, these surface proteins are employed as the biomarker. This enables (i) detection of intact cells with the use of an antibody and (ii) enrichment of Circulating Tumor Cells (CTC) (using Flow Cytometry (FCM), Magnetic-Activated Cell Sorting (MACS), or the like). In particular, "GABRD", versatility of which is indicated in the qPCR validation, is expected to be useful for the detection and the enrichment. Utilization of these cell surface biomarkers for enrichment of Circulating Tumor Cells (CTC) allows a reduction in missing CTC samplings, and therefore restricts the number of false positive cases.
The biomarker of the present invention can be particularly useful in a case where cancer has metastasized from one organ or tissue to another (with many conventional biomarkers, the expression level varies depending on the organ). With a biomarker such as CK19, whose expression level may be high in normal samples depending on the organ, it is not possible to assess metastasis by staining CK19 RNA or protein if CK19 is also expressed in normal tissue of the organ to which the cancer metastasized. With the biomarker of the present invention, in contrast, expression levels are low in the majority of normal tissues. It is therefore highly probable that cells which metastasized can be identified by use of staining.
The biomarkers of the present invention include those downregulated in cancers. For example, FLRT2, ZNF677, SRPX, NAALADL1 and TCEAL7 listed in Table 2 are shown to be downregulated in almost all of the cancer tissues tested. Such a downregulated biomarker can be used to identify a cancer based on the absence of or decrease in the expression thereof. Further, a downregulated biomarker corresponding to a surface protein (e.g., FLRT2 or NAALADL1) can be used to eliminate normal cells from cell suspension containing cancer cells to enrich cancer (or tumor) cells in collected samples, for example, using magnetic beads with a binding substance such as an antibody against the surface protein.
(2. Biomarker set)
The biomarkers of the present invention can be used in combination.
According to one embodiment of the present invention, provided is a biomarker set comprising:
a first biomarker for cancer which first biomarker is one of:
transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 3, or translation products derived therefrom; and
a second biomarker for cancer which second biomarker is another one of:
the transcription products having at least the part of the RNA sequence corresponding to the DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of the corresponding sequence in "sequence_DPI_only" in Table 3, or the translation products derived therefrom.
According to another embodiment of the present invention, provided is a biomarker set comprising:
a first biomarker for cancer which first biomarker is a full length or a fragment of one of:
transcription products of genes listed in Table 1, and transcription products or translation products of genes listed in Table 2; and
a second biomarker for cancer which second biomarker is a full length or a fragment of another one of:
the transcription products of the genes listed in Table 1, and the transcription products or the translation products of the genes listed in Table 2.
It is possible to select, as the first biomarker for cancer and the second biomarker for cancer, which are not particularly limited, any two of the biomarkers described in (1. Biomarker for cancer). Examples of a preferable combination of biomarkers include a combination of biomarkers targeting types of cancers which types are different from each other. Such a case has an advantage in that more types of cancers can be comprehensively detected by use of a biomarker set. According to another embodiment, a combination of biomarkers may be a combination of biomarkers targeting types of cancers which types are identical to each other. Such a case makes it possible to detect one (or more) types of cancers with higher accuracy.
The biomarker set may further include a third biomarker, a fourth biomarker, or a fifth or higher-order biomarker. Such a biomarker may be one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 3 or translation products derived therefrom, or a biomarker different from the one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 3 or translation products derived therefrom. Alternately, such a biomarker may be a full length or a fragment of still another one of the transcription products of the genes listed in Table 1, and the transcription products or the translation products of the genes listed in Table 2, or a biomarker different from the full length or the fragment of any one of the transcription products of the genes listed in Table 1, and the transcription products or the translation products of the genes listed in Table 2.
Examples of the biomarker different from the one of: transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 3 or translation products derived therefrom include a publicly-known biomarker for cancer. Examples of the biomarker different from the full length or the fragment of any one of the transcription products of the genes listed in Table 1, and the transcription products or the translation products of the genes listed in Table 2 include a publicly-known biomarker for cancer.
In one embodiment, among the DNA sequences in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 3, SEQ ID NOs: 61, 63, 67, 69, 71, 80, 82, and 90 are preferable.
(3. Cancer detection method)
A cancer detection method in accordance with the present invention includes the step of measuring, in a sample collected from a living body, the amount of the biomarker described in (1. Biomarker for cancer) or the first and second biomarkers included in the biomarker set described in (2. Biomarker set). The other specific steps of the method, and an appliance and an apparatus that are used in the method are not particularly limited.
A “sample” for use in a cancer detection method in accordance with the present invention is a sample (biological sample) collected from a living body (subject). Examples of the sample collected from the living body mainly include a sample derived from a cell, a tissue, and a body fluid (e.g., blood, urine, saliva, etc.). Examples of a blood-derived sample mainly include complete blood, serum, and blood plasma. According to an embodiment, the “sample” is preferably a cell or a tissue, and more preferably a cell.
A tissue in which the biomarker of the present invention is to be detected is, for example, a lymph node removed during surgery, and preferably a lymph node located in a vicinity of a primary lesion of cancer. Note that the biomarker of the present invention is preferably detected during surgical removal of the primary lesion. This makes it possible to determine, for example, a risk of metastasis of the cancer (while performing the surgery).
The “living body (subject)”, which is exemplified by a human and non-human animals such as mammals such as a mouse, a rat, a rabbit, and a monkey, is preferably a human.
An embodiment of the cancer detection method in accordance with the present invention further includes the step of collecting the sample from the living body (subject). An embodiment of the cancer detection method in accordance with the present invention further includes the step of pretreating the sample collected from the living body (subject). Examples of the pretreatment of the sample mainly include obtainment of a cell extract by lysis of a cell.
One type of cancer or a plurality of types of cancers may be detected. Specific examples of types of cancers are given earlier in (1. Biomarker for cancer).
A method for detecting a biomarker is not particularly limited. Note, however, that examples of the method for detecting the biomarker that is a protein include a method based on an antigen-antibody reaction, a method in which an interaction of a protein is used, mass spectrometry, a method for identifying a protein such as electrophoresis, and a combination thereof, for example immunoelectrophoresis.
Examples of the method for detecting the biomarker that is an RNA include a method in which various nucleic acid amplification techniques are used and a method in which no nucleic acid amplification technique needs to be used.
A nucleic acid amplification technique is exemplified by but not particularly limited to a PCR method, a RT-PCR method, and an LCR method (Ligase Chain Reaction: Barany, F., Proc. Natl. Acad. Sci. USA, Vol.88, p.189-193, 1991). The nucleic acid amplification technique is also preferably exemplified by isothermal amplification methods such as a SmartAmp (Registered Trademark: Smart Amplification Process) method (see also Japanese Patent No. 3897805, for example), a LAMP (Loop-Mediated Isothermal Amplification) method, an ICAN(Isothermal and Chimeric primer-initiated Amplification of Nucleic acids) method, and an RCA (rolling circle amplification) method.
Examples of the method in which no nucleic acid amplification technique needs to be used mainly include a CAGE method (Nature Methods 3 (2006), 211-222), RNA-seq, and RNA-seq on RNA samples enriched or depleted in specific RNA targets, for example samples enriched in sequences corresponding to RNA biomarkers by a hybridization method similar to, for example, the SeqCap lncRNA Enrichment Kit by Roche.
The biomarker may also be detected by use of an artificial nucleic acid in which an exciton effect is used (see also Japanese Patent No. 4370385, for example).
As a technique for in situ hybridization of a target probe, it is possible to use, for example, Cancer-Eprobe-FISH (Fluorescent in situ hybridization) (see, for example, WO 2014/013954 A1 and Okamoto et al. RNA (2011)).
Note that the above-mentioned methods for detecting the biomarker are not limited to the biomarkers of the present invention and can be applied to all biomarkers (including publicly-known and unknown biomarkers).
In one embodiment, a result obtained by carrying out a cancer detection method in accordance with the present invention can be utilized as one of materials used by a doctor for diagnosis. Therefore, the present invention also may provide "a method for diagnosing cancer by use of a result obtained by carrying out a cancer detection method in accordance with the present invention".
Further, if it is determined that a subject has a possibility of a cancer as a result of carrying out the cancer detection method in accordance with the present invention, therapeutic strategy can be decided. Examples of the decision on the therapeutic strategy encompass (i) selection from treatment cessation, chemotherapy, radiotherapy, a surgical operation, etc. and/or (ii) selection of a drug to be used.
More specifically, for example, in a case where a doctor diagnoses that "a subject has a cancer" based on a result obtained by carrying out the method according to the present invention, it is possible to carry out a biopsy on the subject after consideration of an observation by means of another inspection (e.g., echography, endoscopy, radioscopy, CT scanning, MRI scanning and/or PET scanning), an age of the subject, a family history of the subject, and/or the like. Typically, a definitive diagnosis as to whether or not the subject has a cancer is made based on the result of the biopsy, and thereafter a therapeutic strategy for the subject is decided.
A doctor can diagnose that a subject has a cancer in the following manner, for example. First, a certain reference value is predetermined. Then, if a value measured in the subject is equal to or higher than the reference value, the doctor can diagnose that "the subject has a cancer".
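The comparison against a predetermined reference value described above can be sketched as follows. The specification does not state how the reference value is chosen; deriving it from control samples as the mean plus a multiple of the standard deviation is purely an assumption for illustration, as are the function names.

```python
from statistics import mean, stdev


def reference_from_controls(control_values, n_sd=2.0):
    """Hypothetical reference value: control mean plus n_sd standard deviations."""
    return mean(control_values) + n_sd * stdev(control_values)


def is_at_or_above_reference(measured, reference):
    """True when the measured biomarker amount is equal to or higher than the reference."""
    return measured >= reference


if __name__ == "__main__":
    controls = [1.0, 2.0, 3.0]          # made-up control measurements
    ref = reference_from_controls(controls)
    print(ref, is_at_or_above_reference(4.5, ref))
```

For a biomarker that is downregulated or switched OFF in cancer, the direction of the comparison would presumably be reversed; that case is not covered by this sketch.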
Further, if it is determined that a subject has a cancer, the subject can receive a suitable therapy based on the therapeutic strategy.
Further, one can collect, from liquid biopsy samples such as blood and urine, cancer cells expressing a surface protein as a biomarker of the present invention, and subject the cells to a further analysis. For example, one can predetermine the efficacy or side effect of a drug to be used by determining the presence of a molecule for a molecular target drug on the collected cells (companion diagnostics). Alternatively, one can determine the prognosis by measuring the number of cancer cells in liquid biopsy samples using such a surface-protein biomarker according to the present invention.
(4. Kit)
A kit in accordance with the present invention is a cancer detection kit including the biomarker described in (1. Biomarker for cancer) or the biomarker set described in (2. Biomarker set).
The cancer detection kit in accordance with the present invention may further appropriately include at least one of, for example, various reagents, an appliance, an instruction for use of the detection kit, a sample for comparison to be used during detection, and data for comparison to be used to analyze a result of the detection. Note that the instruction for use of the detection kit records therein the details of the cancer detection method in accordance with the present invention, which cancer detection method is described earlier in (3. Cancer detection method).
(5. Others)
A result obtained by carrying out the detection method described earlier in (3. Cancer detection method) can be used as one of diagnostic materials for diagnosis carried out by a doctor. Thus, the present invention also provides a method for diagnosis of cancer including the step of measuring, in a sample collected from a living body, the amount of the biomarker described in (1. Biomarker for cancer) or the first and second biomarkers included in the biomarker set described in (2. Biomarker set).
Further, a subject for whom the detection method described earlier in (3. Cancer detection method) has indicated a possibility of cancer can be provided with appropriate treatment following a diagnosis by a doctor. Note here that examples of the treatment mainly include chemotherapy, radiotherapy, and surgery, each carried out by the doctor or by a specialist different from the doctor. Thus, the present invention also provides a cancer treatment method including the step of measuring, in a sample collected from a living body, the amount of the biomarker described in (1. Biomarker for cancer) or the first and second biomarkers included in the biomarker set described in (2. Biomarker set).
The present invention is not limited to the embodiments, but can be altered by a skilled person in the art within the scope of the claims. An embodiment derived from a proper combination of technical means each disclosed in a different embodiment is also encompassed in the technical scope of the present invention. Further, it is possible to form a new technical feature by combining the technical means disclosed in the respective embodiments.
The present specification includes the contents of the specification and/or the drawings of the Japanese Patent Application No. 2015-183484, based on which the present application claims priority. Furthermore, all the documents described herein are incorporated herein by reference.
(Introduction)
Here inventors focus on identifying genes that are recurrently up-regulated or down-regulated across many different cancer types using Cap Analysis of Gene Expression (CAGE) data collected for the FANTOM5 (Functional ANnoTation Of Mammalian genomes) project (Non-Patent Literature 5). CAGE is a 5’ sequence tag technology that globally determines transcription start sites (TSS) in the genome and their expression levels (Non-Patent Literature 5). Inventors compare CAGE profiles of cancer cell lines to their matched primary cell counterparts to identify mRNAs, long-non-coding RNAs (lncRNAs) and enhancer RNAs (eRNAs) that are recurrently perturbed in cancer lines. To confirm that these transcripts are relevant to cancer, inventors compared their expression in 4,055 primary tumors and 563 matching tissue sets profiled by the TCGA (https://tcga-data.nci.nih.gov/) and, in a smaller set of colorectal tumor samples (Non-Patent Literature 6), confirmed protein level changes. Inventors also compare inventors' lists of pan-cancer lncRNAs to the 229 ‘onco-lncRNAs’ identified by Cabanski et al. (Non-Patent Literature 7) and the 6485 ‘cancer-associated lncRNA genes’ from the miTranscriptome study (Non-Patent Literature 8). Thus inventors compare inventors' results from pure populations of cancer cell lines and normal primary cells of FANTOM5 with results generated from tumor samples that are made up of varying and complex mixtures of cell types. Inventors find an overlapping but distinct set of transcripts. Finally, for the most promising biomarker candidates inventors performed qRTPCR validations in cancer cell lines and tumor cDNA panels. Taken together, inventors' analyses allowed for identification of a set of robust pan cancer biomarker candidates, which have the potential for development as blood biomarkers for early detection and for histological screening of biopsies.
This work is part of the FANTOM5 project. Data download, genomic tools and co-published manuscripts have been summarized at http://fantom.gsc.riken.jp/5/.
(Outline)
Many genes have been previously reported to be differentially expressed in cancer; however, few are used in the clinic as cancer biomarkers.
In this study inventors looked for genes that are up regulated in both cancer cell lines and clinical tumors, and thus inventors selected genes that are robustly up-regulated regardless of the context (cancer cell lines or tumor). Finally, inventors validated a subset of them using RT-qPCR in a cDNA panel from 8 tumor types.
For protein coding genes, inventors first screened for genes that are robustly up regulated in cancer cell lines vs normal cells (promoter up-regulation fold change > 4, gene-wide (all promoters combined) fold change > 2, FDR < 0.01). Second, inventors performed differential analysis using TCGA tumor data in 14 tumor types. Inventors selected genes that were differentially expressed (fold change > 2, FDR < 0.01) in at least 10 cancer origins (10 out of 14). The final selection includes genes that were differentially expressed in both cancer cell lines and at least 10 TCGA tumor types. Inventors excluded genes that were previously reported in other pan-cancer studies (ONCOMINE and Cabanski et al. 2015).
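The two-stage selection of protein coding genes described above can be expressed as a simple filter. The sketch below is illustrative only: the `GeneEvidence` container, its field names, and the example values are hypothetical, and the FDR cutoff for the TCGA step is taken here to be FDR < 0.01, matching the cell-line step, which is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class GeneEvidence:
    """Hypothetical container for the per-gene statistics used in the screen."""
    name: str
    promoter_fc: float        # promoter-level fold change, cancer cell lines vs normal cells
    gene_fc: float            # gene-wide fold change (all promoters combined)
    cell_line_fdr: float
    tcga: dict = field(default_factory=dict)   # tumor type -> (fold change, FDR)
    previously_reported: bool = False          # listed in an earlier pan-cancer study


def passes_cell_line_screen(g):
    """Step 1: robust up-regulation in cancer cell lines."""
    return g.promoter_fc > 4 and g.gene_fc > 2 and g.cell_line_fdr < 0.01


def count_tcga_types(g, min_fc=2.0, max_fdr=0.01):
    """Step 2: number of TCGA tumor types in which the gene is differentially expressed."""
    return sum(1 for fc, fdr in g.tcga.values() if fc > min_fc and fdr < max_fdr)


def is_selected(g, min_types=10):
    """Final selection: both steps passed and not previously reported."""
    return (passes_cell_line_screen(g)
            and count_tcga_types(g) >= min_types
            and not g.previously_reported)


if __name__ == "__main__":
    tumor_types = [f"type_{i}" for i in range(14)]   # placeholder names for 14 tumor types
    gene = GeneEvidence("GENE_A", 5.0, 3.0, 0.001,
                        {t: (2.5, 0.001) for t in tumor_types[:11]})
    print(gene.name, count_tcga_types(gene), is_selected(gene))
```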
For lncRNAs, inventors first screened for genes that are robustly up regulated in cancer cell lines vs normal cells (promoter up-regulation fold change > 4, FDR < 0.01). Then inventors investigated miTranscriptome data base of non-coding RNAs that are associated with cancer type or cell lineage. Inventors selected lncRNAs that were up regulated in FANTOM5 cancer cell lines and in at least one TCGA cancer type (from miTranscriptome database). miTranscriptome database contains ~8,000 long noncoding RNAs (lncRNAs) differentially expressed within a cancer type and/or tissue type, however it does not prioritize/select which of them are suitable for biomarkers (which lncRNAs out of 8,000 listed can be used as biomarkers). Additionally, miTranscriptome does not focus on lncRNAs that are expressed in multiple cancer types (no pan cancer analysis).
Inventors' approach provides a unique and novel way to select best lncRNA biomarkers by selecting top X candidates that were robustly upregulated in cancer cell lines and in one or more clinical tumor types from miTranscriptome database. Thus inventors select the best cancer biomarkers from ~8,000 candidates of miTranscriptome database. Additionally inventors claim that they are applicable for diagnosis of multiple cancer types. Even though some of them were associated with one cancer type only in miTranscriptome database, in FANTOM5 they were up-regulated with cancer cell lines from multiple cancer types, thus inventors claim them as pan cancer biomarkers. Additionally, qPCR validations were performed in multiple cancer types.
As a result, 19 lncRNAs (Table 1) and 15 protein coding genes (Table 2) were selected.
(Results)
<Identification of transcripts recurrently up- or down-regulated in cancer cell lines>
Using CAGE data collected for the FANTOM5 project inventors compared expression levels of transcripts from promoter and enhancer regions between a panel of 225 cancer cell lines and a panel of 339 primary cell samples. Promoter and enhancer expression levels were estimated from the counts of CAGE tags falling under 184,827 TSS regions (referred to hereafter as promoters) and 43,011 enhancer regions defined in FANTOM5 (Non-Patent Literature 5, Non-Patent Literature 9).
The cancer cell line and primary cell datasets were divided into three subsets. First, inventors attempted to match cell lines derived from solid tissues to primary cell counterparts in the FANTOM5 collection. Cell lines and primary cells that could be matched are referred to as matched-solid. The remaining cell lines and primary cells from solid tissue are referred to as unmatched-solid. Lastly, matched hematopoietic cells and their blood-cancer counterparts are referred to as matched-blood. In each data set, inventors identified promoters that were differentially expressed between cancer and normal (edgeR (Non-Patent Literature 10), > 4 fold change, FDR < 0.01). Inventors also performed an alternative binary analysis (referred to as an ON/OFF analysis) to identify transcripts that were consistently switched off or switched on in cancer. Inventors' motivation was that transcripts that are switched on or off in cancer are more promising as candidates for biomarkers and therapeutic targets than transcripts that are merely up or down regulated. Inventors required that a transcript be expressed (switched ON) or not detected (switched OFF) 4 times more often in the cancer group than in the normal group, at a significance level of FDR < 0.01 by Fisher’s exact test (examples in Figure 1B). Inventors then merged the results of the ON/OFF and edgeR analyses to obtain a final selection of up and down regulated promoters. The flowchart of the differential expression pipeline is presented in Figure 1A.
In total, inventors' analysis yielded 2,108 differentially expressed promoters in cancer. 781 promoters were consistently up regulated in all three datasets and a further 814 were up only in solid cancers. Conversely, 99 promoters were consistently down regulated in all three datasets and a further 414 promoters were down only in the solid cancer cell lines. 1,334 of the differentially expressed promoters (63%) overlapped protein-coding loci. The majority of the CAGE peaks were located at known promoter regions or 5’ UTRs; however, inventors note that 279 peaks overlapped introns, 94 overlapped internal exons and 51 overlapped 3’ UTRs, and thus may not represent coding transcripts. 12% of the differentially expressed peaks overlapped annotated long non-coding RNA genes (based on GENCODE 19) (Non-Patent Literature 11). 527 peaks, roughly 25%, were not associated with any known genes.
<Differentially expressed protein coding genes are enriched in cancer-associated genes>
For protein coding genes, the inventors identified 911 promoters of protein coding genes that represented 656 unique genes: 435 up regulated and 221 down regulated.
To compare inventors' results with primary tumor data inventors performed an analogous cancer vs. normal analysis on RNA-seq data from 14 tumor-normal pairs (4,055 primary cancer samples and 563 normal tissues samples) from The Cancer Genome Atlas (TCGA, http://cancergenome.nih.gov/). Inventors observed 490 up-regulated genes and 1,661 down regulated genes (DE thresholds: abs FC > 2, FDR < 0.01). Inventors applied a less stringent threshold of abs FC > 2 to obtain a similar number of DE genes as inventors observed for the FANTOM5 analysis (vs. abs FC > 4 in cancer cell lines), because the fold changes in TCGA tumors tended to be more moderate in scale. Inventors note that many more genes were called down-regulated in the tumor-normal comparison than inventors observed for the cancer cell line-primary cell comparison.
<Potential pan cancer biomarkers>
Inventors found that 76 (17%) of the up regulated protein coding genes identified in the cancer cell lines analysis were also up-regulated in primary tumors from TCGA. Among them, inventors found oncogenes (HOXC13, MYEOV, MNX1 and CASC5), cancer antigens (PRAME, CD70, CASC5, IGF2BP3) and, somewhat unexpectedly, tumor suppressors (TP73, BLM, BUB1B). The up-regulated genes were also enriched in genes involved in cell cycle, DNA metabolism, biopolymer metabolism and homeobox genes involved in development. This included well-known pan cancer genes such as TERT, PRAME or TOP2A (Non-Patent Literature 14, Non-Patent Literature 15) and other genes such as MYEOV and MNX1 involved in blood malignancies (Non-Patent Literature 16, Non-Patent Literature 17) and FAM111B in prostate cancer (Non-Patent Literature 18).
For the 52 down regulated genes in the FANTOM5 cancer cell lines analysis 19% were also down regulated in primary tumors. Interestingly the list was enriched for genes related to oxidoreductase activity (5 genes: AOX1, GPX3, PTGS1, ACOX2, COX7A1), and inventors note one, GPX3, is a reported tumor suppressor (Non-Patent Literature 19).
Inventors also used recently published proteome data from 90 colorectal cancers and 30 normal tissues published by Zhang et al. (Non-Patent Literature 6). The spectral count data were available for 239 out of 656 differentially expressed genes in cancer cell lines. 20 mRNAs/proteins were up regulated in both the cancer cell lines (CAGE) and colorectal tumors (mass spec data) versus 16 genes being up regulated in both TCGA comparisons (RNA-seq and mass spec). Notably 4 genes were robustly up regulated in all 3 comparisons: MCM2, TOP2A, ASNS and MKI67.
For three of the protein coding genes (PKMYT1, BLM, GABRD), inventors performed qRTPCR validation in cancer cell lines vs. primary cells and also in a cDNA panel covering 8 tumor types and normal matching tissues. In all cases, the targets were highly significantly up regulated in both cancer cell lines and tumors (Figure 2 and Figure 3) as compared with CK19, a conventionally known potential biomarker for a plurality of cancers which was used as a reference.
<Pan cancer long non coding RNAs>
From the cancer cell line analysis inventors observed 246 aberrantly expressed promoters that overlapped 181 long non-coding RNA gene models from GENCODE 19. Additionally, the incorporation of the recently published miTranscriptome study (Non-Patent Literature 8) allowed inventors to annotate a further 90 lncRNAs. Combined, inventors identified 271 differentially expressed lncRNAs. The majority (247 lncRNAs) were up-regulated, while 24 were down-regulated.
Inventors note that 39 and 5 of these were up and down regulated, respectively, in both the cancer cell line analysis and at least one tumor type in the miTranscriptome study (Non-Patent Literature 8). To focus on robustly differentially expressed lncRNAs, inventors selected those that showed a consistent expression change in the FANTOM5 cancer cell line analysis and were also cancer associated in at least 2 cancer types in the miTranscriptome study. This identified 21 consistently up regulated and 2 consistently down regulated lncRNAs.
For five of these lncRNAs (FOXP4AS1 (RP11-328M4.2), ENST00000448869 (RP11-284F21.7, CAT122), LOC643401 (LINC01021, ESAT107), MNX1_AS1, FIRRE (LOC286467, LSCAT182)), inventors performed qRTPCR validation in cancer cell lines vs. primary cells and also in a cDNA panel covering 8 tumor types and normal matching tissues. In all cases, the targets were highly significantly up regulated in both cancer cell lines and tumors (Figure 2 and Figure 3) as compared with CK19, a conventionally known potential biomarker for a plurality of cancers which was used as a reference.
Inventors also compared the lists of pan-cancer lncRNAs with the 229 ‘onco-lncRNAs’ identified by Cabanski et al. (Non-Patent Literature 7), which allowed inventors to confirm 3 additional up-regulated lncRNAs. Inventors' analysis of pre-processed TCGA RNA-seq data also allowed inventors to confirm de-regulation of 4 lncRNAs: 2 already confirmed by miTranscriptome and Cabanski (down-regulated MEG3 and up-regulated DGCR5) and 2 that were missed by other reports, namely down-regulation of the MT1L pseudogene and, most notably, up-regulation of PVT1, which is a known lncRNA oncogene (Non-Patent Literature 20).
<Deregulated long non-coding RNAs located near cancer related genes>
Inventors next looked at the genomic neighborhood of the differentially expressed lncRNAs. For 27 of the 181 (GENCODE 19) differentially expressed lncRNAs inventors found 33 cancer related genes within 100kb. For example, PVT1 neighbors MYC, and the two are consistently co-gained in cancer. Inventors also observed RP11-1070N10.5 neighboring the TCL6 (lincRNA), TCL1A and TCL1B oncogenes (located in a breakpoint cluster region on chromosome 14q32 in T-cell leukemia (Non-Patent Literature 21)) and HOXA11-AS, neighboring HOXA13 and HOXA9 and overlapping the HOXA11 oncogene. Inventors also note that 5 out of 6 cancer related genes located within 1kb of an up regulated lncRNA were also up regulated; these include the MCF2L, GATA2 and MNX1 oncogenes and the BSG and CSAG1 cancer antigens.
<Activation of repeat elements in cancer>
Globally about 20% of all FANTOM5 promoters initiate from within repetitive elements and low complexity DNA sequences annotated by RepeatMasker. The up-regulated promoters tended to be enriched in repetitive elements, most noticeably retrotransposons (SINE/Alu, LINE/L1, LTR/ERV1, LTR/ERVL). The SINE/Alu and LINE/L1 overlapping promoters tended to be located in intronic regions of protein coding genes (49% and 32%, respectively) and not associated to known RNA transcripts. The up-regulated promoters overlapping LTR/ERV1 often initiated the expression of lncRNAs (31 GENCODE lncRNAs and 48 miTranscriptome lncRNAs).
<Bidirectional transcription from REP522 satellite repeat is activated in cancer>
Interestingly, a specific repeat family, REP522, was strongly enriched in the most up-regulated promoters. REP522, originally called a telomeric satellite, is a largely palindromic, unclassified interspersed repeat of ~1.8kb in size (Non-Patent Literature 22). Inventors observed that out of 72 promoters overlapping REP522, 25 were up-regulated in cancer (odds ratio: 62.05). 20 out of these 25 promoters were associated with a known transcript (5 pseudogenes, 9 lncRNAs and 1 protein coding gene) including the pseudogene BAGE2 (B melanoma antigen family, member 2) and the lncRNAs PCAT7 (prostate cancer associated transcript 7) and BRCAT95, which were previously implicated in cancer (Non-Patent Literature 8). Inventors observed that in most cases the transcription is initiated bidirectionally and that 5 overlap regions previously annotated as enhancers. To show that the observed activation of the REP522 element represents a true biological signal and was not due to a mapping artifact, inventors performed qPCR validation for 11 up-regulated, REP522 associated transcripts from different genomic regions in 3 cancer cell lines and dermal fibroblast cells as a control. For 8 of these, inventors confirmed up-regulation in the cancer cell lines compared to normal fibroblasts. In one case inventors confirmed the bidirectional transcription of CCDC144NL and CCDC144NL-AS1 from one REP522 element. To inventors' knowledge this is the first report implicating REP522 activation in cancer.
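As a worked check of the enrichment figure, a sample odds ratio can be computed from counts given in this description. The 2x2 table below is an assumption: it takes all 184,827 FANTOM5 promoters as the background and the 781 + 814 up-regulated promoters as the up-regulated set. The estimator and background actually used by inventors are not stated, so the result is only expected to be close to the reported value.

```python
def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


# Counts taken from the description; their use as a 2x2 table is an assumption.
total_promoters = 184_827   # FANTOM5 promoters (TSS regions)
up_promoters = 781 + 814    # up in all three datasets + up in solid cancers only
rep522_total = 72           # promoters overlapping REP522
rep522_up = 25              # of which up-regulated in cancer

a = rep522_up                          # REP522, up-regulated
b = rep522_total - rep522_up           # REP522, not up-regulated
c = up_promoters - rep522_up           # other promoters, up-regulated
d = total_promoters - rep522_total - c # other promoters, not up-regulated

rep522_odds_ratio = odds_ratio(a, b, c, d)
print(round(rep522_odds_ratio, 2))  # 62.06 under these assumptions
```

Under these assumptions the sample odds ratio is about 62.06, close to the reported 62.05. A small difference of this kind would be expected if, for example, a conditional estimate from an exact test had been reported instead.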
<Enhancer activation in cancer>
Taking advantage of the fact that CAGE data can be used to estimate the activity of enhancers from balanced bidirectional capped transcription (Non-Patent Literature 9), inventors performed differential expression analysis based on CAGE tags counts under 43,011 CAGE defined enhancers (Non-Patent Literature 9) using the same differential expression pipeline as for the promoter regions. Inventors found 28 pan-cancer enhancers up regulated in solid and blood cancers and a further 62 up regulated in solid cancers only.
Inventors found that 23 of the 90 up-regulated enhancers could be associated with a miTranscriptome transcript (5’ end within 500bp of the enhancer) and that 4 of those transcripts were reported to be up regulated in at least one cancer type.
Inventors next used Chromatin Interaction Analysis Paired-End Tags (ChIA-PET) data from the ENCODE project (GSE39495; datasets using antibodies against estrogen receptor alpha (ERa), RNA polymerase II (RNAPII), and CCCTC binding factor (CTCF) in five different cancer cell lines, K562, HCT116, HeLa-S3, MCF-7, and NB4) to associate these pan-cancer enhancers with target genes. Inventors found that 55 out of the 90 up-regulated enhancers can be physically linked to promoters of known genes (228 unique enhancer - gene links). One of the enhancers was associated with the promoter for the oncogenic MIR21. A further 16 of the enhancers were linked to cancer related genes, including 6 oncogenes (BCL9, CREB1, ZNF384, SALL4, TFRC and BTG1), 2 tumor suppressors (ING4, KCTD11) and 5 Mut-Drivers (PIK3CB, CLIP1, KIFC3, GPS2 and CARM1). Additionally, 8 of the up-regulated enhancers were linked to promoters found to be up regulated in cancer cell lines, including cancer linked genes such as TNFSF12 and PIK3R3 (reported to increase tumor migration and invasion when overexpressed (Non-Patent Literature 23)).
(Discussion)
Comparing FANTOM5 CAGE based expression profiles of cancer cell lines to that of their normal counterparts, inventors identified 2,108 differentially expressed promoters associated with 656 protein-coding genes and 271 long non-coding RNAs. Additionally, inventors used the unique capability of CAGE to estimate the activity of enhancers based on the bi-directional transcription of enhancer RNAs and identified 90 enhancers that are recurrently activated in cancer cell lines. Lastly, inventors used the genomic location of the TSSs provided by CAGE to show that promoters that overlap repetitive elements are more likely to be up regulated in cancer. Thus, inventors' analysis provides unique new insights that complement previous cancer transcriptome studies, based on microarray and RNA-seq platforms.
In contrast to prior work, inventors used homogeneous, pure cell cultures: cancer cell lines to represent cancer and primary cells to represent the normal state. Thus inventors avoid the complication of expression profiling of the complex mixtures of cells in both tumors (infiltrating lymphocytes, stroma and blood vessels) and normal tissues, as differentially expressed genes in such comparisons may reflect differences in cell composition as well as the transformed state. Inventors note that this issue is acknowledged in the sample collection procedures of the TCGA (Non-Patent Literature 3), which require that the tumor samples contain at least 60% tumor cells and less than 20% necrosis to be included in the study. Similarly, the normal tissue counterparts to which they are compared are complex mixtures of cells. Inventors acknowledge that artifacts from the long-term culture of cell lines and their artificial in-vitro culture conditions can affect inventors' analysis; however, by focusing on matched primary cell counterparts inventors avoid the cell composition issue. The relevance of inventors' identified gene sets is confirmed by enrichment for oncogenes and tumor suppressors, 19 known cancer antigens and overlap with those perturbed in primary tumor datasets. In addition to 656 protein coding genes, inventors also report a collection of 271 lncRNAs and 90 eRNAs that are consistently perturbed in cancer cell lines and have the potential for use as biomarkers. Additionally, inventors report an interesting phenomenon: promoters overlapping retrotransposon elements are more likely to be up regulated in cancer.
In terms of other prior work on pan-cancer protein/mRNA biomarkers, Basil et al. (Non-Patent Literature 24) reported a common cancer signature of 332 genes from 373 archival samples profiled using microarrays, Rhodes et al. identified a multicancer-type meta-signature of 67 genes up regulated in cancer by performing a meta-analysis of 40 published microarray experiments (Non-Patent Literature 15) and Xu et al. reported 46 genes consistently up-regulated across 21 microarray data sets (Non-Patent Literature 25). Comparing these lists, inventors note that there were relatively few overlapping genes between the signatures, with only 6 genes present in at least 2 out of the 3 signatures (CKS2, NME1, SOX4, KIF14, RFC4, TOP2A). Despite this, by comparing the lists obtained from the FANTOM5 cell line analysis with those of the TCGA, the inventors identified a core set of 128 markers that are perturbed in both. 4 of them are also up regulated at the protein level (from high throughput mass spectrometry data), specifically TOP2A, MKI67, MCM2 and ASNS, which are amongst the most studied cancer biomarkers and drug targets. TOP2A is targeted by etoposide (Non-Patent Literature 26), ASNS is targeted in asparaginase therapy of acute lymphoblastic leukemia (Non-Patent Literature 27) and both MKI67 and MCM2 have been studied as biomarkers (Non-Patent Literature 28) and (Non-Patent Literature 29) and potential drug targets (Non-Patent Literature 30) and (Non-Patent Literature 31). The advantage of developing drugs to target these molecules is that, as they are recurrently up regulated across many cancer types, they are likely to be of therapeutic value to many patients.
Beyond protein level markers, recently ‘onco-lncRNAs’ (Non-Patent Literature 7) and ‘cancer-associated lncRNA’ (Non-Patent Literature 8) have been reported, suggesting that development of non-coding RNA biomarkers may be possible. Comparing inventors' 271 lncRNA pan-cancer candidates, there was relatively little overlap with these previous studies (Non-Patent Literature 3 and Non-Patent Literature 39 respectively). However, inventors confirmed by reanalysis of the TCGA data and qRTPCR of several candidates that these lncRNAs are indeed up-regulated. Additionally, inventors report on the existence of eRNAs that are up-regulated in a pan-cancer manner. Similar to the repeats discussed below, their activation could suggest a reversion to a more stem like state, bystander amplification and co-activation (neighbouring genes are more likely to be co-expressed) or demethylation and opening of chromatin permissive for transcription initiation.
The finding that promoters, which overlap repetitive elements, are activated in cancer is quite interesting, and the validation of the REP522 derived transcripts is novel. One instance of REP522 near the BAGE locus has been reported to be marked with H3K9me3 and actively transcribed (Non-Patent Literature 32), however relatively little is known about this element.
In conclusion, traditionally, the focus has been on protein biomarkers, however recently RNA based biomarkers have been identified and commercialized (Non-Patent Literature 33). There are multiple reports showing possible application of circulating plasma RNA (both protein coding and lncRNAs) for detection and diagnosis of lung, prostate (Non-Patent Literature 34), colon (Non-Patent Literature 35), breast (Non-Patent Literature 36) and liver (Non-Patent Literature 37) cancers. For therapy, there is great potential for RNA interference mediated cancer treatments (Non-Patent Literature 38), with some first-in-humans trials already reported in metastatic cancers (Non-Patent Literature 39). Inventors' results, which highlight the transcriptome changes in cancer and cover both protein coding genes, non-protein coding transcripts, unannotated promoters and enhancer RNAs, represent an important step towards discovery of potentially useful cancer biomarkers and therapeutic targets. Development of technologies to detect and target these molecules has the great potential to be applicable to a broad range of cancers. One last note is that inventors identify molecules that are consistently up or down in cancer normal comparisons, but are not necessarily always higher in all cancers relative to all normal tissues (a subset are). Some of inventors' biomarkers may not be suitable for plasma/serum based diagnostics but would be useful in screening biopsies in a histopathological setting.
(Methods)
<FANTOM5 data>
Inventors used the cap analysis of gene expression (CAGE) data from the FANTOM5 project, which consisted of data from CAGE libraries sequenced to a median depth of 4 million mapped tags per sample using single-molecule sequencing (Non-Patent Literature 5). Inventors used 564 CAGE profiles, 216 representing the cancer cell lines and 391 representing the primary cells. Inventors split the data into 3 data sets: a) matched solid - data for 10 types of solid cancer, for which both cancer cell lines (total 72) and corresponding normal primary cells (total 65) were available, b) unmatched solid - data of 97 cancer cell lines and 254 primary cells that could not be unambiguously matched and c) matched blood - data for 2 types of blood cancer with cancer cell lines (total 51) matched to corresponding normal primary cells (total 74).
The CAGE tag counts under 184,827 robust DPI clusters (decomposition-based peak identification) (Non-Patent Literature 5) were used to represent promoter-level genome-wide expression. For the enhancer activity, inventors used the CAGE tag counts under 43,011 enhancer regions identified by presence of balanced bidirectional capped transcription (Non-Patent Literature 9).
<FANTOM5 differential expression analysis>
Inventors used edgeR to identify up and down regulated transcripts and ON/OFF analysis to identify genes that were switched on and off in cancer cell lines versus normal primary cells. The raw count data were modeled by Genewise Negative Binomial Generalized Linear Models as implemented in edgeR (Non-Patent Literature 10). In matched solid dataset, the coefficients of the design matrix were set to group the samples into 20 groups according to the tissue origin and sample category: 10 cancer origins and 10 normal tissues.
Genewise negative binomial generalized linear models (glm) were fitted by glmFit function. The cancer vs. normal comparison in matched solid comparison was performed using glmLRT function and assigning contrast coefficients of 1/10 to each cancer type and -1/10 to each normal tissue origin. This way equal weight was placed on each solid cancer type.
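The equal-weight contrast described above can be illustrated with a minimal sketch. It does not call edgeR; it only shows how coefficients of 1/10 and -1/10 turn per-group model coefficients into a cancer vs. normal estimate in which every cancer type counts equally. The tissue labels and coefficient values are hypothetical.

```python
def equal_weight_contrast(groups):
    """Return one contrast weight per group: +1/n for cancer groups, -1/n for normal groups.

    groups: list of (tissue, state) pairs, with state being "cancer" or "normal".
    """
    n_cancer = sum(1 for _, state in groups if state == "cancer")
    n_normal = sum(1 for _, state in groups if state == "normal")
    return [
        1.0 / n_cancer if state == "cancer" else -1.0 / n_normal
        for _, state in groups
    ]


def contrast_estimate(coefficients, contrast):
    """Weighted sum of per-group coefficients (log scale) under the contrast."""
    return sum(c * w for c, w in zip(coefficients, contrast))


# 10 cancer origins and 10 normal tissues, as in the matched solid dataset.
tissues = [f"tissue_{i}" for i in range(1, 11)]  # hypothetical labels
groups = [(t, "cancer") for t in tissues] + [(t, "normal") for t in tissues]
contrast = equal_weight_contrast(groups)

# Hypothetical log-scale group coefficients for one promoter.
coefficients = [5.0] * 10 + [3.0] * 10
estimate = contrast_estimate(coefficients, contrast)
print(estimate)  # mean cancer coefficient minus mean normal coefficient
```

Because each group receives the same weight regardless of how many samples it contains, a cancer type with many cell lines does not dominate the comparison.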
In the unmatched solid and matched blood datasets, the coefficients of the design matrices grouped the samples into 2 groups: cancer and normal. The cancer vs normal comparison was done simply by setting the contrast's coefficients to 1 for cancer and -1 for normal.
The p-values were adjusted for multiple testing by Benjamini-Hochberg False Discovery Rate (FDR) method. The features showing fold change of expression > 4 with significance FDR < 0.01 were selected as differentially expressed.
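The Benjamini-Hochberg adjustment and the selection thresholds can be sketched as follows. This is a plain re-implementation for illustration, not the code used by inventors. The helper `is_differential` is hypothetical and assumes that "fold change > 4" applies in either direction (up or down).

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_differential(fold_change, fdr, min_fc=4.0, max_fdr=0.01):
    """Assumed rule: more than min_fc-fold up or down, with FDR below max_fdr."""
    return (fold_change > min_fc or fold_change < 1.0 / min_fc) and fdr < max_fdr


# Hypothetical p-values for four features.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adjusted)
```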
<ON/OFF analysis>
For each feature, inventors determined the expression status across all 564 samples in binary fashion: ON (expressed/detected, count > 0), OFF (not detected, count = 0). In each dataset (matched solid, unmatched solid and matched blood), inventors then tested for the significance of the association (contingency) between ON/OFF status and cancer/normal classification by 2-sided Fisher's exact test. The p-values were adjusted for multiple testing by Benjamini-Hochberg method.
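The test of association between ON/OFF status and cancer/normal class can be sketched with a small two-sided Fisher's exact test. This is a minimal stdlib implementation for illustration; it uses the common convention of summing all tables that are no more probable than the observed one, which is an assumption about the exact two-sided definition that was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows: cancer, normal. Columns: ON (count > 0), OFF (count = 0).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x ON samples in the cancer row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)  # tolerance for float ties
    )
    return min(1.0, total)


# Hypothetical feature: ON in 3 of 3 cancer samples, OFF in 3 of 3 normal samples.
p_value = fisher_exact_two_sided(3, 0, 0, 3)
print(p_value)
```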
In the ON/OFF analysis inventors were interested to test for two possible scenarios: a) feature is expressed in cancer and not in normal cells (ON in cancer) or b) feature is expressed in normal cells but the expression is lost in cancer (OFF in cancer).
To test this, inventors calculated the frequency of expression of a given feature in cancer and normal samples: frequency = number of samples with non-zero expression / number of samples in the group. Features expressed 4 times more frequently in cancer than in normal samples were selected as "ON"/expression gain in cancer (frequency in cancer > frequency in normal * 4). Features not expressed/lost 4 times more frequently in cancer than in normal samples were selected as "OFF"/expression loss in cancer ((1 - frequency in cancer) > (1 - frequency in normal) * 4). The procedure was applied to each dataset (matched solid, unmatched solid and matched blood). Besides passing the gain/loss threshold, the feature had to be significantly associated with cancer in the Fisher's exact test described above (FDR < 0.01). The pipeline of differential expression described above was applied separately to the DPI/promoter counts and enhancer counts. The features found differentially expressed in all 3 datasets were selected as "pan" cancer features, whereas features differentially expressed in the matched and unmatched solid datasets only were selected as "solid only" cancer features.
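The gain/loss thresholds can be sketched as follows, reading the loss criterion as a comparison of the non-detection frequency in cancer against the non-detection frequency in normal samples. Function names and example counts are hypothetical, and the additional Fisher's exact test requirement (FDR < 0.01) is not included here.

```python
def detection_frequency(counts):
    """Fraction of samples in which the feature is detected (count > 0)."""
    return sum(1 for c in counts if c > 0) / len(counts)


def classify_on_off(cancer_counts, normal_counts, ratio=4.0):
    """Return "ON", "OFF" or None according to the 4x frequency thresholds."""
    f_cancer = detection_frequency(cancer_counts)
    f_normal = detection_frequency(normal_counts)
    if f_cancer > ratio * f_normal:
        return "ON"    # expression gain in cancer
    if (1 - f_cancer) > ratio * (1 - f_normal):
        return "OFF"   # expression loss in cancer
    return None


# Hypothetical CAGE tag counts for three features.
gained = classify_on_off([5, 3, 0, 2], [0, 0, 0, 0, 0, 0, 0, 1])
lost = classify_on_off([0, 0, 0, 0], [1, 2, 3, 4, 5, 6, 7, 8, 9, 0])
unchanged = classify_on_off([1, 0], [1, 0])
print(gained, lost, unchanged)
```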
<TCGA RNA-seq data>
Inventors obtained the RNA-Seq profiling data of 4055 cancer samples and 563 normal tissue samples from The Cancer Genome Atlas (TCGA) Data Portal (https://tcga-data.nci.nih.gov/tcga/). The profiles represented 14 solid cancer types for which both tumor and normal tissue samples were available. Blood malignancies were not included in the analysis because no RNA-Seq data was available for the normal tissues. Inventors downloaded level3 RNASeqV2, upper quartile normalized RSEM count estimates (data status as of Aug 5, 2013). After merging, inventors obtained a matrix with expression profiles of 20531 genes in 4618 samples.
<TCGA Differential expression analysis>
The normalized counts were log2 transformed and used as an input expression data to LIMMA. The offset of 2 was used to remove zero values prior to log2 transformation; additionally it had an effect of stabilizing the variance of lowly expressed features.
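The effect of the offset can be seen in a short sketch. The function name and the example counts are hypothetical. The point is only that log2(count + 2) is defined for zero counts and compresses differences between very small counts.

```python
from math import log2


def log2_with_offset(counts, offset=2.0):
    """log2-transform normalized counts after adding a fixed offset."""
    return [log2(c + offset) for c in counts]


# Hypothetical normalized counts for one gene across samples.
transformed = log2_with_offset([0.0, 6.0, 62.0])
print(transformed)  # [1.0, 3.0, 6.0]

# With the offset, counts of 0 and 1 differ by about 0.58 log2 units
# instead of being undefined (log2 of zero) or far apart.
low_count_gap = log2(1 + 2.0) - log2(0 + 2.0)
print(round(low_count_gap, 2))
```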
In the design matrix the data were divided into 28 groups according to the cancer type (14 cancer types and 14 normal tissues) and the linear model was fitted for each gene (lmFit function). To perform the cancer versus normal comparison, inventors contrasted the 14 cancer types to the 14 corresponding normal tissues by assigning 1/14 and -1/14 coefficients to each cancer and normal group respectively (contrasts.fit function). This way inventors ensured that each cancer type had the same weight in the overall comparison and that the selected differentially expressed features showed general pan cancer de-regulation rather than being specific to a particular cancer type. The p-values from moderated t-statistics implemented in limma were adjusted for multiple testing by Benjamini-Hochberg False Discovery Rate (FDR) method. The features showing an absolute fold change of expression > 2 and significance FDR < 0.01 were selected as differentially expressed.
<Enrichment for cancer related genes>
Inventors tested for enrichment/over-representation by applying the hypergeometric test, using a significance threshold of p-value < 0.05. Inventors compiled the list of oncogenes as the union of oncogenes listed in the MSigDB (Non-Patent Literature 40) and UniProt (Non-Patent Literature 41) databases. For tumor suppressors, inventors considered genes listed in at least two of three sources: MSigDB (Non-Patent Literature 40), UniProt (Non-Patent Literature 41) and the TSGene (Non-Patent Literature 42) tumor suppressor list. For the list of genes frequently mutated in cancer (also called Mut Drivers) inventors used the list of High Confidence Drivers mutated across 12 cancer types as reported by Tamborero et al. (Non-Patent Literature 12). Inventors also tested for enrichment of cancer related genes listed in COSMIC: Cancer Gene Census (Non-Patent Literature 13).
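The over-representation test can be sketched as an upper-tail hypergeometric probability. This is a minimal stdlib illustration. The parameter names and example numbers are hypothetical, and the gene universe used by inventors is not stated here.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """P(X >= overlap) when drawing `selected` genes without replacement
    from `population` genes, of which `annotated` belong to the gene list."""
    upper = min(selected, annotated)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy example: 10 genes, 5 annotated as oncogenes,
# 5 selected as differentially expressed, all 5 of them oncogenes.
p_enrichment = hypergeom_enrichment_p(10, 5, 5, 5)
print(p_enrichment)             # 1/252
print(p_enrichment < 0.05)      # enriched at the 0.05 threshold
```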
<Proteome data analysis>
Inventors used the spectral count data table for 90 colorectal cancers and 30 normal tissues generated by the Liebler lab (Non-Patent Literature 6) for all 7244 genes (no low count filtering was performed). The provided data were produced by liquid chromatography-tandem mass spectrometry (LC-MS/MS) and were quantile-normalized, followed by log base 2 transformation by the authors. For each gene, inventors performed a non-parametric two-sided Wilcoxon rank-sum test to identify differentially expressed genes in colorectal cancer (vs. normal tissue) at the protein level. Inventors adjusted the p-values for multiple testing with the Benjamini-Hochberg FDR method. Inventors used fold change > 2 and FDR < 0.05 to select genes differentially expressed in colorectal cancers. Since the genes with low spectral count values were originally filtered out by the authors due to lower reliability, inventors report that only <10% of selected differentially expressed proteins were lowly expressed.
<Chia-PET Enhancer-promoter pairs>
Inventors obtained the Chromatin Interaction Analysis Paired-End Tags (ChIA-PET) data from ENCODE/GIS-Ruan project (GSE39495, http://genome.ucsc.edu/cgi-bin/hgFileUi?db=hg19&g=wgEncodeGisChiaPet, April 21, 2014). Data files from 15 experiments were downloaded, covering 5 cell lines (Hct-116, Helas3, K562, Mcf7 and Nb4) and 3 transcription factors (Pol2, Ctcf, ERalpha a). The detailed list: Hct116-Pol2, Helas3-Pol2, K562-Ctcf, K562-Pol2 (2 replicates), Mcf7-Ctcf (2 replicates), Mcf7-ERalpha a (3 replicates), Mcf7-Pol2 (4 replicates), Nb4-Pol2.
Inventors downloaded the ChIA-PET Chromatin Interaction PET clusters in BED format, and merged the interaction from all experiments. The interactions linking sites on different chromosomes were removed. Inventors then extracted the ChIA-PET interaction clusters overlapping the genomic locations of enhancers and searched if the linked genomic locations overlap promoters of the annotated genes. The visualization of associations between the enhancers and promoters was performed in ZENBU (Non-Patent Literature 43).
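The linking step can be sketched as an interval-overlap search. This is a simplified illustration, assuming BED-style half-open coordinates and that each interaction cluster is given as a pair of anchor regions. All identifiers and coordinates below are hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open regions overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def link_enhancers_to_genes(enhancers, promoters, interactions):
    """Return (enhancer_id, gene) pairs joined by an intra-chromosomal interaction.

    enhancers, promoters: dicts mapping an identifier to (chrom, start, end).
    interactions: list of (left_anchor, right_anchor) region pairs.
    """
    links = set()
    for left, right in interactions:
        if left[0] != right[0]:
            continue  # interactions between different chromosomes are removed
        # An enhancer may sit on either anchor; the promoter is on the other one.
        for anchor, partner in ((left, right), (right, left)):
            for enh_id, enh in enhancers.items():
                if not overlaps(enh, anchor):
                    continue
                for gene, prom in promoters.items():
                    if overlaps(prom, partner):
                        links.add((enh_id, gene))
    return links


# Hypothetical example data.
enhancers = {"enh_1": ("chr1", 1_000, 1_400)}
promoters = {
    "GENE_X": ("chr1", 50_000, 50_500),
    "GENE_Y": ("chr2", 50_000, 50_500),
}
interactions = [
    (("chr1", 900, 1_500), ("chr1", 49_800, 50_200)),
    (("chr1", 900, 1_500), ("chr2", 49_800, 50_200)),  # inter-chromosomal, dropped
]
links = link_enhancers_to_genes(enhancers, promoters, interactions)
print(links)
```

The nested loops are adequate for a sketch; for genome-scale data an interval index would be the more usual choice.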
<cDNA synthesis for cell line samples>
Total RNA was isolated from K562, HepG2, MCF7, and HDF with miRNA Mini Kit (QIAGEN). Five hundred nanograms of total RNA (or 0 ng for the non-template control) was reverse-transcribed with PrimeScript 1st strand cDNA Synthesis Kit (TaKaRa Japan, 6110A) using oligo dT primer. This reaction was repeated 3 times for each total RNA sample. The resulting cDNA solutions were diluted 12.5-fold with DNA/RNA free distilled water, and the 3 cDNA solutions from the same sample were mixed in equal proportions for further evaluation.
<Quantitative PCR (qPCR) for cell line samples and human cancer/normal tissue cDNA>
qPCR was carried out with ABI 7500 Fast Real-Time PCR System (Thermo Fisher Scientific, MA) according to the manufacturer's instructions. Primers for real-time PCR were designed with the Primer3 web tool, and synthesized by Eurofins Genomics, Tokyo. The housekeeping gene ACTB was utilized as an internal control to normalize the expression levels (the primer was supplied by OriGene, MD). For cell line samples, 15 μl of Power SYBR Green PCR Master Mix (Thermo Fisher Scientific, MA), 1 μl of 10 μM forward primer, 1 μl of 10 μM reverse primer, and 1 μl of the diluted reverse-transcription product were used and diluted up to 30 μl with DNA/RNA free distilled water.
For validation in tumor samples, inventors performed qPCR validation across the TissueScan Cancer Survey Panel 96 - I cDNA panel (CSRT501, OriGene, MD) of 65 tumors, 7 lesions and 24 normal tissues. The cDNA panel covers 8 tumor types and normal matching tissues. Inventors observed that the targets were highly significantly up regulated in tumors compared to normal tissues. For the tissue cDNA panel, 15 μl of Power SYBR Green PCR Master Mix (Thermo Fisher Scientific, MA), 1 μl of 10 μM forward primer, 1 μl of 10 μM reverse primer, and 13 μl of DNA/RNA free distilled water were applied to each reaction with cDNA in each well on the tissue panel. The thermal cycle conditions for qPCR were 50°C for 2 min, 95°C for 10 min, followed by 45 cycles of 95°C for 15 sec (denaturation), 55°C or 53°C for 1 min (annealing), and 72°C for 1 min (extension). Each reaction was run in triplicate for cell line samples, and in singlet for the human cancer/normal tissue cDNA panel.
Primers used for the qPCR validations are shown in Table 4.
Figure JPOXMLDOC01-appb-T000001
Figure JPOXMLDOC01-appb-I000001
Figure JPOXMLDOC01-appb-T000002
Figure JPOXMLDOC01-appb-I000002
Figure JPOXMLDOC01-appb-I000003
Figure JPOXMLDOC01-appb-T000003
Figure JPOXMLDOC01-appb-I000004
Figure JPOXMLDOC01-appb-I000005
Figure JPOXMLDOC01-appb-I000006
Figure JPOXMLDOC01-appb-T000004
The present invention can be widely used in the field of diagnostic medicine and the field of health and medical science.

Claims (5)

  1. A biomarker for cancer being one of:
    transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 5 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 5, or translation products derived therefrom.
  2. A biomarker set comprising:
    a first biomarker for cancer which first biomarker is one of:
    transcription products having at least a part of an RNA sequence corresponding to a DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 5 and starting from any position of a corresponding sequence in "sequence_DPI_only" in Table 5, or translation products derived therefrom; and
    a second biomarker for cancer which second biomarker is another one of:
    the transcription products having at least the part of the RNA sequence corresponding to the DNA sequence in "sequence (DPI - UPPERCASE, 100 downstream - lowercase)" in Table 5 and starting from any position of the corresponding sequence in "sequence_DPI_only" in Table 5, or the translation products derived therefrom.
  3. A cancer detection method comprising the step of measuring, in a sample collected from a living body, the amount of the biomarker recited in claim 1 or the first and second biomarkers included in the biomarker set recited in claim 2.
  4. The cancer detection method as set forth in claim 3, wherein the cancer includes a plurality of types of cancers.
  5. A cancer detection kit comprising the biomarker recited in claim 1 or the biomarker set recited in claim 2.
PCT/JP2016/004260 2015-09-16 2016-09-16 Biomarker for cancer and use thereof WO2017047102A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2015-183484 2015-09-16
JP2015183484 2015-09-16

Publications (1)

Publication Number Publication Date
WO2017047102A1 (en) 2017-03-23

Family

ID=58288588

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/004260 WO2017047102A1 (en) 2015-09-16 2016-09-16 Biomarker for cancer and use thereof

Country Status (1)

Country Link
WO (1) WO2017047102A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001061009A2 (en) * 2000-02-15 2001-08-23 Curagen Corporation Polypeptides and nucleic acids encoding same
WO2008094678A2 (en) * 2007-01-31 2008-08-07 Applera Corporation A molecular prognostic signature for predicting breast cancer distant metastasis, and uses thereof
WO2008153743A2 (en) * 2007-05-21 2008-12-18 Dana Farber Cancer Institute Compositions and methods for cancer gene discovery
WO2011068839A1 (en) * 2009-12-01 2011-06-09 Compendia Bioscience, Inc. Classification of cancers
WO2015115544A1 (en) * 2014-01-31 2015-08-06 学校法人順天堂 Evaluation method for risk of metastasis or recurrence of colon cancer

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AKBANI, R. ET AL.: "A pan-cancer proteomic perspective on The Cancer Genome Atlas", NATURE COMMUNICATIONS, vol. 5, 2014, pages 1 - 14, XP055368828 *
CABANSKI, C. R. ET AL.: "Pan-cancer transcriptome analysis reveals long noncoding RNAs with conserved function", RNA BIOLOGY, vol. 12, no. 6, 3 June 2015 (2015-06-03), pages 628 - 642, XP055368834 *
KACZKOWSKI, B. ET AL.: "Transcriptome Analysis of Recurrently Deregulated Genes across Multiple Cancers Identifies New Pan-Cancer Biomarkers", CANCER RESEARCH, vol. 76, no. 2, 15 January 2016 (2016-01-15), pages 216 - 226, XP055368841 *
WHITE, N. M. ET AL.: "Transcriptome sequencing reveals altered long intergenic non-coding RNAs in lung cancer", GENOME BIOLOGY, vol. 15, no. 8, 13 August 2014 (2014-08-13), pages 429, XP021196563 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107083433A (en) * 2017-06-01 2017-08-22 北京泱深生物信息技术有限公司 Applications of the lncRNA in liver cancer diagnosis and treatment
CN107083433B (en) * 2017-06-01 2021-03-09 青岛泱深生物医药有限公司 Application of lncRNA in diagnosis and treatment of liver cancer
WO2019165212A1 (en) * 2018-02-22 2019-08-29 University Of Pittsburgh - Of The Commonwealth System Of Higher Education TARGETING CANCER-ASSOCIATED LONG NON-CODING RNAs
US11667919B2 (en) 2018-02-22 2023-06-06 University of Pittsburgh—of the Commonwealth System of Higher Education Targeting cancer-associated long non-coding RNAs
CN111420058A (en) * 2020-04-23 2020-07-17 侯本国 Gene inhibitor for treating prostatic cancer
CN111420058B (en) * 2020-04-23 2021-10-15 侯本国 Gene inhibitor for treating prostatic cancer
CN113817831A (en) * 2021-10-12 2021-12-21 湖南聚点生物科技有限公司 Primer and kit for diagnosis and detection of lung adenocarcinoma and detection method of methylation region

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16845976

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

NENP Non-entry into the national phase

Ref country code: JP

122 Ep: pct application non-entry in european phase

Ref document number: 16845976

Country of ref document: EP

Kind code of ref document: A1