WO2012060760A1 - Molecular marker for cancer

Molecular marker for cancer

Info

Publication number
WO2012060760A1
Authority
WO
WIPO (PCT)
Prior art keywords
gene
whsc1l1
expression
cancer
assessed
Application number
PCT/SE2011/000201
Other languages
French (fr)
Inventor
Toshima Parris
Khalil Helou
Original Assignee
Fujirebio Diagnostics Ab
Application filed by Fujirebio Diagnostics Ab

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/57415 Specifically defined cancers of breast
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6893 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids related to diseases not provided for elsewhere
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/38 Pediatrics
    • G01N2800/385 Congenital anomalies

Definitions

  • the present invention relates to the assessment of a gene in the 8p11-p12 chromosomal region and gene products thereof, and their use as powerful molecular markers in decisions related to the diagnosis and/or prognosis of cancer, and in particular breast cancer.
  • Breast cancer is the most commonly diagnosed malignancy in Swedish females, accounting for around 30% of female cancers diagnosed in Sweden.
  • the selection of treatment is influenced by various prognostic factors such as axillary lymph node status, pathological tumor grade, S-phase fraction, and the status of molecular markers such as ERBB2/HER2 and the estrogen receptor.
  • the current prognostic factors are, however, inadequate when distinguishing between favorable and unfavorable prognosis patients, resulting in the selection of inadequate treatment for many breast cancer cases, while treating many others unsuccessfully, and thereby imposing unnecessary adverse effects.
  • many patients could benefit greatly from additional complementary molecular markers, which may help guide treatment decisions and be of value in the development of new therapeutic agents.
  • Biomarkers can be assessed at different biological levels.
  • the genome can be analyzed for mutations, copy number variations (allelic loss, gain, and amplification), and epigenetic modulations.
  • the transcriptome can be assessed by methods analyzing RNA, and the proteome by methods for protein/peptide analysis. More recently, regulatory elements such as microRNAs and other non-coding RNAs can also be measured as molecular markers.
  • markers are assayed individually, such as circulating MUC-1 using CA 15-3 and CA 27.29 assays, carcinoembryonic antigen (CEA), estrogen receptor (ER) and progesterone receptor (PgR), HER2, Ki67, cyclin D, cyclin E, p27, p21, thymidine kinase (TK), topoisomerase II, p53, urokinase plasminogen activator (uPA), plasminogen activator inhibitor (PAI-1), etc.
  • CEA Carcinoembryonic antigen
  • ER estrogen receptor
  • PgR progesterone receptor
  • TK thymidine kinase
  • uPA Urokinase plasminogen activator
  • PAI-1 plasminogen activator inhibitor
  • the 70-gene MammaPrint panel and the 76-gene Rotterdam panel share only three genes, and neither of them has any genes in common with the 21-gene Oncotype Dx panel.
  • the reason all three panels perform rather well despite negligible gene overlap is that many genes can reflect the same aberrations.
  • the 8p11-12 region spans approximately 10 Mb and encompasses some 61 known genes, any of which can potentially be used as markers for genomic aberrations, such as loss or amplification of this region.
  • although expression pathways are not as unequivocally defined as genomic aberrations, a malfunction in a pathway will be reflected by any of the genes involved.
  • the three tests show important overlap among the expression pathways and chromosomal hotspots reflected by the genes in their panels.
  • some genes will be more powerful markers for any single pathway or aberration than others. For example, while it is common that large segments of the genome are lost or amplified, which can be detected by any of the genes in the region, amplification or loss of a shorter region will only be reflected by the genes involved and will have an effect only if those genes are relevant to the disease. If the genes are assessed at the expression level, other factors also affect their suitability as markers. For example, markers may differ in their level of expression, their degradation rate, and, particularly for protein markers, their availability; it is of great advantage if the markers are secreted. Also, the degree of disease involvement of the marker is expected to be relevant.
  • if the marker directly affects the course of the disease, even small differences in its expression level may be significant, while the levels of markers that are only affected indirectly may be moderated by many factors, and changes in their levels will therefore be less significant. It is very challenging to identify optimum markers for the various aberrations and expression pathways of relevance for cancer. Approaches based on global screening of genes are likely to find candidates for the most relevant pathways and aberrations, but they will not find the optimal set because of the high false discovery rates in the statistical analysis. To identify optimal markers, the system must be reduced to study only the relevant genes. As already mentioned for the breast cancer hotspot regions 17q12, 8q24, and 11q13, consensus key marker genes have been identified, while for 8p11-p12 and 20q13 the most relevant markers have remained elusive until disclosed in the invention described herein.
  • Amplification of the 8p11-p12 region has been associated with increased cancer proliferation rate (high SBR tumor grade, Ki-67 expression, and cyclin E expression) and reduced survival. Although the amplification and over-expression of specific genes in this region have been reported, there is conflicting information on which genes can function as bona fide oncogenes and serve as key diagnostic, prognostic and theranostic markers (Bernard-Pierrot I, Gruel N, Stransky N, et al. Characterization of the recurrent 8p11-p12 amplicon identifies PPAPDC1B, a phosphatase protein, as a new therapeutic target in breast cancer.
  • CTC Circulating tumor cells
  • CTCs from breast cancer patients, for example, are analyzed for the markers EpCAM, MUC, and HER2. Besides giving an indication of the CTC load by the overall expression of the tumor markers, the test also indicates features of the spreading cells.
  • Recent publications disclose measurements of even larger numbers of markers from the CTCs (Sieuwerts et al., J. Natl. Cancer Inst. 2009, 101, 61-66). These studies are very promising, but suffer from drawbacks.
  • the enriched CTCs from the peripheral blood are not pure; the sample also contains other cells, most likely leukocytes but possibly others. This compromises attempts to normalize the data in order to separate the response of markers stemming from the CTCs from markers deriving from the other cells present.
  • markers exclusively expressed in cancer cells should come from the CTCs. But few markers, if any, are expressed exclusively in cancer cells due to illegitimate expression in normal cells. Still, these techniques provide no count of the cells for normalization to determine the load, which remains a problem. Therefore, markers that indicate disease state and progression without relying on traditional normalization to cell count or housekeeping genes should have important advantages.
  • WHSC1L1 Wolf-Hirschhorn syndrome candidate 1-like 1
  • one object of the present invention is to provide a method to diagnose and/or prognose cancer in a biological sample in vitro, said method comprising the steps of a) assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, and wherein said changes are indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof,
  • the method may include diagnosing and/or prognosing cancer when the cancer is breast cancer.
  • the method may include diagnosing and/or prognosing breast cancer when the breast cancer is invasive breast cancer.
  • the biological sample may be any kind of biopsy, blood, serum, plasma, urine, saliva, bone marrow, and/or cell populations.
  • the one or more molecular markers assessed in the method of invention may be selected from the group consisting of genomic sequences or parts thereof, transcription products, translation products, and polypeptides.
  • the genomic sequence may be selected from the group consisting of genes and gene fragments, non-coding gene sequences and epigenetic signatures.
  • the transcription product may be selected from the group consisting of primary transcripts, spliced primary transcripts, mRNA, microRNA, cDNA, and non-coding RNA.
  • the translation product may be selected from the group consisting of peptides, polypeptides, proteins or fragments thereof, and post-translationally modified proteins.
  • the translation product may be an isoform of the WHSC1L1 protein or a fragment thereof, and/or a post-translationally modified isoform of the WHSC1L1 protein.
  • the WHSC1L1 gene, or a gene product thereof may be the sole molecular marker assessed in the 8p11-p12 chromosomal region.
  • the one or more molecular markers can be assessed by Array comparative genomic hybridization (array-CGH), Polymerase Chain Reaction (PCR) based techniques, Illumina's Gene Expression BeadArray technology, Fluorescence in situ hybridization (FISH), Enzyme-linked immunosorbent assay (ELISA), Enzyme immunoassay (EIA) and/or Western Blot.
  • array-CGH array comparative genomic hybridization
  • PCR Polymerase Chain Reaction
  • FISH Fluorescence in situ hybridization
  • ELISA Enzyme-linked immunosorbent assay
  • EIA Enzyme immunoassay
  • the activity of the WHSC1L1 gene may be a change in the copy number of the WHSC1L1 gene.
  • the activity of the WHSC1L1 gene may be an amplification, gain, heterozygous loss or homozygous deletion of the WHSC1L1 gene.
  • the activity of the WHSC1L1 gene product may be enhanced or abnormally suppressed expression of said gene product.
  • the copy number of the WHSC1L1 gene can be measured by real-time quantitative PCR.
  • the copy number of the WHSC1L1 gene can be measured by real-time quantitative PCR using the primers: 5'-GCAACAAGCATGACTCATCAAGAT-3' (SEQ ID NO: 1) and 5'-ACACAAGATCGCCAACCTGAA-3' (SEQ ID NO: 2).
  • the copy number of the WHSC1L1 gene can be measured by real-time quantitative PCR using the primers: 5'-GGCTTATAATTTCTACACCAAA-3' (SEQ ID NO: 3) and 5'-GAACGAGTTCTAATTGATCTTC-3' (SEQ ID NO: 4), optionally in combination with a probe.
  • the probe may be a hydrolysis probe with the sequence: 5'-AAGCCAACGC
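  • As a rough illustration of how the primer pairs listed above might be inspected before use, the following Python sketch computes length, GC fraction, reverse complement and an approximate melting temperature for SEQ ID NO: 1-4. The function names are hypothetical, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is only a coarse approximation that is assumed here for illustration; it is not a method described in the source.

```python
# Primer sequences copied from SEQ ID NO: 1-4 in the text above.
PRIMERS = {
    "SEQ ID NO: 1": "GCAACAAGCATGACTCATCAAGAT",
    "SEQ ID NO: 2": "ACACAAGATCGCCAACCTGAA",
    "SEQ ID NO: 3": "GGCTTATAATTTCTACACCAAA",
    "SEQ ID NO: 4": "GAACGAGTTCTAATTGATCTTC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in deg C (Wallace rule, an assumption)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(
            f"{name}: length={len(seq)}, GC={gc_fraction(seq):.2f}, "
            f"Tm~{wallace_tm(seq)} C, revcomp={reverse_complement(seq)}"
        )
```

  • Real assay design would normally rely on nearest-neighbour thermodynamic models and specificity checks, which this sketch does not attempt.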
  • the expression level of the WHSC1L1 gene may be assessed relative to the expression of one, preferably two, and more preferably 5-10 reference genes.
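  • Assessing expression relative to one or more reference genes can be sketched as follows. This is a minimal illustration that assumes qPCR quantification cycle (Cq) values as input and the same amplification efficiency for all assays; the function name, the `efficiency` parameter and the example values are hypothetical and do not come from the source.

```python
from statistics import mean
from typing import Sequence


def relative_expression(
    cq_target: float,
    cq_references: Sequence[float],
    efficiency: float = 1.0,
) -> float:
    """Expression of the target relative to the reference genes.

    Averaging Cq values corresponds to taking the geometric mean of the
    reference quantities. efficiency=1.0 means perfect doubling per cycle
    (an assumption made for this sketch).
    """
    if not cq_references:
        raise ValueError("at least one reference gene Cq is required")
    delta_cq = cq_target - mean(cq_references)
    return (1.0 + efficiency) ** (-delta_cq)


if __name__ == "__main__":
    # Hypothetical Cq values for a target and two reference genes.
    print(relative_expression(24.0, [24.5, 25.5]))
```

  • Using several reference genes, as the text prefers, makes the averaged Cq less sensitive to variation in any single reference.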
  • the expression levels of two or more splice variants of the WHSC1L1 gene may be assessed relative to each other.
  • the activity of the WHSC1L1 gene, or gene products thereof, may be assessed by the number of DNA copies of the WHSC1L1 gene in combination with a relative expression level of the WHSC1L1 gene transcript or translation products thereof.
  • At least 2, 4, 6, 8, 10, 20, 30, 40, 50, 60, 70, 80 or 90 further markers for cancer may be assessed in combination with the one or more molecular markers indicating the activity of the WHSC1L1 gene or gene products thereof.
  • the further markers for cancer may be selected from the group consisting of v-myc myelocytomatosis viral oncogene homolog (avian; MYC), Human Epidermal growth factor Receptor 2 (HER2), and StAR-related lipid transfer (START) domain containing 3 (STARD3).
  • An amplification of a gene coding for the further marker for cancer may be assessed.
  • An expression of a gene coding for the further marker for cancer may be assessed.
  • An enhanced expression level of the WHSC1L1 gene transcription and/or translation products, but no amplification of the 8p11-p12 region may be an indication of a negative prognosis of cancer.
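  • The combination of copy-number status and expression status described above can be sketched as a small lookup. The enum and function names are hypothetical, and the only prognostic statement encoded is the one made in the text (over-expression without amplification may indicate a negative prognosis); this is an illustration, not a clinical decision rule.

```python
from enum import Enum


class Whsc1l1Pattern(Enum):
    AMPLIFIED_AND_OVEREXPRESSED = "amplification with over-expression"
    OVEREXPRESSED_ONLY = "over-expression without amplification"
    AMPLIFIED_ONLY = "amplification without over-expression"
    NEITHER = "no amplification and no over-expression"


def combine_status(amplified: bool, over_expressed: bool) -> Whsc1l1Pattern:
    """Combine 8p11-p12 amplification status with WHSC1L1 expression status."""
    if amplified and over_expressed:
        return Whsc1l1Pattern.AMPLIFIED_AND_OVEREXPRESSED
    if over_expressed:
        return Whsc1l1Pattern.OVEREXPRESSED_ONLY
    if amplified:
        return Whsc1l1Pattern.AMPLIFIED_ONLY
    return Whsc1l1Pattern.NEITHER


def text_flags_negative_prognosis(pattern: Whsc1l1Pattern) -> bool:
    """True only for the pattern the text names as a possible negative indicator."""
    return pattern is Whsc1l1Pattern.OVEREXPRESSED_ONLY


if __name__ == "__main__":
    pattern = combine_status(amplified=False, over_expressed=True)
    print(pattern.value, text_flags_negative_prognosis(pattern))
```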
  • a relative expression of the short (CCDS6105) and long (CCDS43729) isoform of WHSC1L1 may be assessed.
  • An assessment of isoforms of WHSC1L1 may be made by measuring differential expression of said isoforms.
  • An assessment of isoforms of WHSC1L1 may be made by measuring differential expression of a set of transcripts that includes the transcript producing the short WHSC1L1 isoform (CCDS6105) relative to a set of transcripts that does not include it, or a set of transcripts that includes the transcript producing the long WHSC1L1 isoform (CCDS43729) relative to a set of transcripts that does not include it.
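  • Because the isoforms are assessed relative to each other, no external reference gene is needed for this comparison. The sketch below assumes that one qPCR assay measures the short isoform and another measures total WHSC1L1, that both assays have the same efficiency, and that results are given as Cq values; the function name and example values are hypothetical.

```python
def short_to_total_ratio(
    cq_short: float,
    cq_total: float,
    efficiency: float = 1.0,
) -> float:
    """Ratio of short-isoform signal to total WHSC1L1 signal.

    A higher Cq means less template, so a short-isoform Cq one cycle above
    the total Cq corresponds to a ratio of 0.5 at perfect efficiency.
    Equal efficiency of both assays is an assumption of this sketch.
    """
    return (1.0 + efficiency) ** (cq_total - cq_short)


if __name__ == "__main__":
    # Hypothetical Cq values for one sample.
    print(short_to_total_ratio(cq_short=26.0, cq_total=25.0))
```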
  • Cancer cells enriched from circulation (CTC) may be analyzed.
  • a further object of the invention is to provide a kit for in vitro diagnosing and/or prognosing cancer in a biological sample, said kit comprising means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, and wherein said changes indicate an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof.
  • the kit may further comprise a positive and/or a negative control.
  • the kit may further comprise instructions to the method as disclosed above.
  • Figure 2 shows an identification of the 8p11-p12 region in a diploid breast tumor using array-CGH.
  • Figure 3 shows a verification of 8p11-p12 amplification in the cells of a breast tumor using FISH.
  • Figure 4 shows the expression of WHSC1L1 in the tumor samples harboring 8p11-p12 amplification and tumor samples with normal DNA dosage measured with quantitative real-time PCR.
  • Figure 5 shows effects of A) all forms of 8p11-p12 genetic aberrations, B) 8p11-p12 amplification, C) 8p11-p12 loss and D) 8p11-p12 gain on patient overall survival rates.
  • Figure 6 shows effects of A) WHSC1L1 gene amplification, B) WHSC1L1 over-expression and C) WHSC1L1 gene amplification and over-expression on patient overall survival rates using the Illumina gene expression system.
  • Figure 7 shows effects of either WHSC1L1 gene amplification and over-expression, or no WHSC1L1 gene amplification but over-expression.
  • Figures 8A-B show the total expression of WHSC1L1-S in the tumor sample harboring WHSC1L1 gene amplification compared to the tumor sample with normal DNA dosage levels.
  • Figure 9 shows effects of total WHSC1L1-S expression on patient overall survival rates using the WHSC1L1 antibody with immunohistochemistry.
  • Figures 10A-H show the sensitivity, specificity and dynamic range of the herein invented qPCR assays for WHSC1L1.
  • Figures 11A-C show classification using (A) Principal Component Analysis (PCA) or (B) hierarchical clustering of CTC samples collected from 21 patients with a panel of 31 markers; and (C) classifies the samples based on total expression of WHSC1L1 and expression of the short splice variant of WHSC1L1.
  • PCA Principal Component Analysis
  • Figures 12A and B compare the ratio between the expression of the short splice variant of WHSC1L1 and total WHSC1L1 expression among patients classified as survivors and non-survivors.
  • one aspect of the present invention concerns a method for in vitro diagnosing and/or prognosing cancer in a biological sample, said method comprising the steps of a) assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, and wherein said changes are indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof, b) comparing the amount of change assessed in a) above to a positive and/or negative control, thereby diagnosing and/or prognosing cancer.
  • the biological sample may be any kind of biopsy including fresh-frozen tissue samples, paraffin-embedded biopsy material, or archived biopsy material, blood, serum, urine, saliva, bone marrow, and cell populations that may comprise normal cells or cancer cells.
  • the results obtained from performing the in vitro method of the invention may be used in the assessment of a diagnosis and/or prognosis of cancer in a subject.
  • a "subject" as used herein is intended to mean any living animal or human.
  • the term subject includes, but is not limited to, humans, nonhuman primates such as chimpanzees and other apes and monkey species, farm animals such as cattle, sheep, pigs, goats and horses, domestic mammals such as dogs and cats, laboratory animals including rodents such as mice, rats and guinea pigs, and the like.
  • the term does not denote a particular age or sex. Thus, adult and newborn subjects, as well as fetuses, whether male or female, are intended to be covered.
  • the subject is a mammal, including humans and non-human mammals.
  • the subject is a human.
  • cancer refers to a physical condition in mammals that is typically characterized by a group of cells that display uncontrolled growth (division beyond the normal limits), invasion (intrusion on and destruction of adjacent tissues), and sometimes metastasis (spread to other locations in the body via lymph or blood).
  • cancers include but are not limited to breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, and brain cancer.
  • the cancer can be any of the above-mentioned types of cancer but preferably is a solid tumor of epithelial origin, such as breast cancer, and can, for example, be tumors in invasive breast cancer (locally advanced, locally recurrent or metastatic), or stage II, stage III and stage IV breast cancer.
  • breast cancer as used herein encompasses several different subtypes of breast cancer.
  • Invasive breast cancer may include tumors stratified by the number of lesions detected at the time of diagnosis, histological cell type, pathological tumor size, level of differentiation, DNA content (diploid (i.e. the normal (expected) number of chromosomes in a normal cell, that is, two copies of each autosome and two sex chromosomes), aneuploid (i.e. a chromosome number deviating from that detected in a normal cell as a result of chromosomal gains or losses or multiploid)), steroid hormone receptor content, HER2/neu expression, triple-negative status and the number of detected axillary lymph nodes (i.e.
  • the breast cancer may include single or multiple lesions detected at the same time of diagnosis (synchronous) or at least 3 months apart (metachronous), in the same breast (ipsilateral) or in both breasts (bilateral).
  • the breast cancer may include tumors classified by the type of tissue from which the cancer arises, such as ductal (tumors originating from the ducts of the breast), lobular (tumors originating from the lobules of the breast), or rare cases originating from connective tissue.
  • pathological tumor size is intended to mean the physical tumor volume.
  • pT1 is a tumor 2 cm (3/4 of an inch) or less across;
  • pT2 is a tumor more than 2 cm but not more than 5 cm (2 inches) across;
  • pT3 is a tumor more than 5 cm across;
  • pT4 is a tumor of any size growing into the chest wall or skin, including inflammatory breast cancer.
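  • The pT size categories listed above can be expressed as a small function. This is a sketch under the thresholds stated in the text only; the function and parameter names are hypothetical, and the single boolean for chest wall or skin involvement is a simplification.

```python
def pt_category(size_cm: float, grows_into_chest_wall_or_skin: bool = False) -> str:
    """Map pathological tumor size to the pT categories given in the text."""
    if size_cm <= 0:
        raise ValueError("tumor size must be positive")
    if grows_into_chest_wall_or_skin:
        return "pT4"  # any size growing into the chest wall or skin
    if size_cm <= 2.0:
        return "pT1"  # 2 cm or less across
    if size_cm <= 5.0:
        return "pT2"  # more than 2 cm but not more than 5 cm
    return "pT3"      # more than 5 cm across


if __name__ == "__main__":
    for size in (1.5, 2.0, 3.2, 5.0, 6.1):
        print(size, pt_category(size))
```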
  • the tumors included in the present invention may vary in size from smaller than 2 cm (3/4 of an inch) to over 5 cm (2 inches) across, and may also vary in level of differentiation.
  • the Scarff-Bloom-Richardson (SBR) grade system is used to classify tumors based on three morphological features.
  • GGI Genetic Grade Index
  • the terms "axillary lymph node negative" and "axillary lymph node positive" refer to subjects with the absence or presence, respectively, of cancer cells in the axillary lymph nodes.
  • a triple negative breast cancer status is assigned based on the presence, or lack thereof, of three important receptors: estrogen, progesterone, and the human epidermal growth factor receptor 2 (HER2/neu) oncogene. None of these receptors are present in triple negative breast cancers (estrogen receptor negative, progesterone receptor negative, HER2/neu negative), which therefore are not responsive to receptor-targeted treatments.
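  • The triple negative definition above reduces to a simple logical test, sketched here with hypothetical parameter names. How each receptor is scored as positive or negative is outside the scope of this sketch.

```python
def is_triple_negative(er_positive: bool, pgr_positive: bool, her2_positive: bool) -> bool:
    """True when none of the three receptors (ER, PgR, HER2/neu) is present."""
    return not (er_positive or pgr_positive or her2_positive)


if __name__ == "__main__":
    print(is_triple_negative(False, False, False))  # all three receptors absent
    print(is_triple_negative(True, False, False))   # ER present
```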
  • the method involves assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, and said changes are indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof.
  • the term "molecular marker" is here intended to mean any molecule that can be measured and whose abundance reflects the activity of targeted genes in the 8p11-p12 chromosomal region, which spans approximately 8 Mb and encompasses about 61 known genes. More specifically, the method of the invention involves assessing changes in expression levels of one or more molecular markers indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof.
  • the WHSC1L1 gene which may also be referred to in the literature as NSD3, spans over 90 kb of the 8p12 locus and consists of 24 exons.
  • the WHSC1L1 protein belongs to the histone-lysine methyltransferase family and SET2 (Su(var)3-9, Enhancer-of-zeste, Trithorax) subfamily.
  • the function of the WHSC1L1 protein is still largely unknown, but it may include histone methyltransferase activity, preferentially methylating Lys-4 and Lys-27 of histone H3. Methylation of Lys-4 initiates an epigenetic transcriptional activation, while methylation of Lys-27 initiates an epigenetic transcriptional repression.
  • the WHSC1L1 gene, or gene products thereof refer to a contiguous stretch of nucleotide bases within the genome that is transcribed into a transcription product, more specifically an mRNA.
  • transcription product can refer not only to the mRNA transcribed from the DNA, i.e. a transcript, but also the DNA copy of the mRNA, e.g. cDNA.
  • Such mRNA is subsequently translated into a translation product, i.e. a "gene product” such as e.g. a polypeptide or protein.
  • the gene may have multiple sections, parts or regions, e.g. coding and non-coding sections. The complete gene comprises all of the sections, but alternative splicing of the gene may give rise to one or more fragments of a gene consisting of fewer than all the sections.
  • the present invention encompasses all known isoforms representing different splice variants of the WHSC1L1 gene as published by the Ensembl project database,
  • WHSC1L1-001 Transcript ID: ENST00000317025 in the Ensembl project database and Consensus Coding Sequence (CCDS) 43729
  • CCDS Consensus Coding Sequence
  • PWWP proline-tryptophan-tryptophan-proline domains
  • Isoform WHSC1L1-002 (Transcript ID: ENST00000316985 and CCDS6105) contains 645 amino acids (Protein ID ENSP00000313410). The amino acids in positions 620-645 are substituted and the amino acids in positions 646-1437 are missing, so this isoform contains only one of the two PWWP domains.
  • Isoform WHSC1L1-003 (Transcript ID: ENST00000433384) is 1388 amino acids long (Protein ID ENSP00000393284). It lacks the PHD-type 3 domain as amino acid positions 871-919 are missing.
  • Isoform WHSC1L1-005 (Transcript ID: ENST00000527502) is 1426 amino acids long (Protein ID ENSP00000434730).
  • Isoform WHSC1L1-008 (Transcript ID: ENST00000534155) is 25 amino acids long (Protein ID ENSP00000432544).
  • Isoform WHSC1L1-009 (Transcript ID: ENST00000529223) is 282 amino acids long (Protein ID ENSP00000435422).
  • Isoform WHSC1L1-010 (Transcript ID: ENST00000534539) is 15 amino acids long (Protein ID ENSP00000431598).
  • Isoform WHSC1L1-011 (Transcript ID: ENST00000528627) is 77 amino acids long (Protein ID ENSP00000435073).
  • Isoform WHSC1L1-201 (Transcript ID: ENST00000446459) is 1374 amino acids long (Protein ID ENSP00000387883), and is missing residues in positions 67-129.
  • The short isoform (WHSC1L1-002) and the long isoform (WHSC1L1-001) have been reported to be co-expressed in many tissues, with the short being the prevalent form.
  • Examples of molecular markers that are assessed in the present invention may include genomic sequences or fragments thereof, transcription products, translation products and polypeptides.
  • the terms "genome(s)" or "genomic sequences" are intended to mean the hereditary information of an organism, typically encoded in nucleic acids, either DNA (deoxyribonucleic acid) or RNA (ribonucleic acid, a nucleic acid produced with DNA as template), and include genes and non-coding gene sequences.
  • the genome may refer to the nucleic acids making up one set of chromosomes of an organism (haploid genome) or both sets of chromosomes of an organism (diploid genome) depending on the context in which it is used.
  • the genome may be at least partially isolated, part of a nucleus, and/or in a cell, such as, but not limited to, a germ cell or a somatic cell, i.e. a cell of the body, not including the reproductive cells.
  • one or more genomes may include, but not be limited to, nuclear, organellar and/or mitochondrial genomes. Examples of genome molecular markers that are included in the present invention are genes and fragments thereof, non-coding gene sequences and epigenetic signatures.
  • transcription products generally refers to any equivalent RNA copies of a sequence of DNA which may be unmodified RNA, or modified RNA of any length. Included are primary transcripts (i.e. the nucleic acid sequences corresponding to the transcribed portions of a gene), spliced primary transcripts, messenger RNA (mRNA), microRNA, complementary DNA (cDNA) i.e. a DNA molecule synthesized from an mRNA template, genomic DNA (gDNA), non-coding RNA, epigenetic signatures.
  • triple-stranded regions comprising RNA or DNA or both RNA and DNA are included in the term transcription products.
  • the strands in such regions may be from the same molecule or from different molecules.
  • DNAs or RNAs comprising unusual bases such as inosine, or modified bases such as tritiated bases are also encompassed in the term transcription products.
  • translation product is intended to encompass products that are a result of the translation of messenger RNA (mRNA) by the ribosome to produce a specific amino acid chain, i.e. peptides such as oligopeptides, polypeptides or proteins.
  • peptide refers to an amino acid chain of any length greater than two amino acids in which the amino acid residues are linked by peptide bonds or modified peptide bonds into a polypeptide.
  • An oligopeptide is a polypeptide of between 2 and 20 amino acids.
  • a protein is one or more polypeptides more than about 50 amino acids long, often folded into a globular form.
  • oligo peptide also encompass various modified forms thereof.
  • modified forms may be naturally occurring modified forms or chemically modified forms, post translationally modified polypeptides, proteins or fragmented or degraded proteins.
  • modified forms include, but are not limited to, glycosylated forms, phosphorylated forms, myristoylated forms, palmitoylated forms, ribosylated forms, acetylated forms, ubiquitinated forms. Modifications also include intra-molecular crosslinking and covalent attachment to various moieties such as lipids, flavin, biotin, polyethylene glycol or derivatives thereof. In addition, modifications may also include cyclization, branching and cross-linking.
  • fragmented or degraded proteins may be truncated proteins, which are proteins that have not achieved their full length or proper form, and thus are missing some of the amino acid residues that are present in the normal protein.
  • a truncated protein generally cannot perform the function for which it was intended because its structure is incapable of doing so.
  • amino acids other than the conventional twenty amino acids encoded by genes may also be included in a polypeptide or protein.
  • the translation products of the invention include, but are not limited to oligo peptides, polypeptides, proteins, truncated proteins or isoforms thereof, post translationally modified proteins or isoforms thereof, or fragmented or degradation products of a protein, and secreted peptides that originate from a protein.
  • the molecular markers may be assessed by different techniques that are well known to the person skilled in the field of molecular genetics, proteomics, and chemistry. The following techniques are of particular relevance for the present invention:
  • Array comparative genomic hybridization (also CMA, Chromosomal Microarray Analysis, Microarray-based comparative genomic hybridization, array-CGH, a-CGH, aCGH, or Virtual Karyotype) is a technique to detect genomic copy number variations at a higher resolution level than chromosome-based comparative genomic hybridization (CGH).
  • DNA from a test sample and normal reference sample are labeled differentially, using different fluorophores, and hybridized to several thousand probes.
  • the probes are derived from most of the known genes and non-coding regions of the genome, and printed on a glass slide.
  • the ratio of the fluorescence intensity of the test to that of the reference DNA is then calculated, to measure the copy number changes for a particular location in the genome. Details for how this technique may be applied in the present invention are described in Example 1a below.
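The test-to-reference intensity ratio described above is conventionally reported on a log2 scale, which is also the scale used for the array-CGH thresholds in Example 2. The following is a minimal sketch of that calculation; the function name and the example intensities are hypothetical and are not taken from the disclosure.

```python
import math


def log2_ratio(test_intensity, reference_intensity):
    """Return log2(test / reference) for one array probe.

    Positive values suggest a copy number gain in the test sample,
    negative values suggest a loss. Both intensities must be positive.
    """
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("fluorescence intensities must be positive")
    return math.log2(test_intensity / reference_intensity)


# Hypothetical intensities: the test signal is twice the reference signal.
print(log2_ratio(2000.0, 1000.0))  # one doubling on the log2 scale
```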
  • PCR: Polymerase Chain Reaction
  • qPCR: quantitative real-time PCR
  • dPCR: digital PCR
  • the method relies on thermal cycling, consisting of cycles of repeated heating and cooling of the reaction for DNA melting and enzymatic replication of the DNA.
  • Primers (short DNA fragments containing sequences complementary to the target region) along with a DNA polymerase are key components to enable selective and repeated amplification.
  • the DNA generated is itself used as a template for replication, setting in motion a chain reaction in which the DNA template is exponentially amplified.
  • PCR can be extensively modified to perform a wide array of genetic manipulations.
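The chain reaction described above can be illustrated with the standard idealized amplification model, in which each cycle multiplies the template by (1 + E), where E is the per-cycle efficiency. This is a sketch under that textbook assumption; the function name and the parameter `efficiency` are illustrative and do not come from the disclosure.

```python
def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Idealized PCR yield: N = N0 * (1 + E) ** n.

    efficiency=1.0 means perfect doubling in every cycle; real reactions
    plateau in late cycles, which this simple model ignores.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule after 10 perfectly efficient cycles.
print(copies_after_cycles(1, 10))
```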
  • a commercially available qPCR assay was ordered from Life Technologies, which is a leading global provider of qPCR assays.
  • the assay detects the main isoforms, and two novel highly optimized assays based on specific primer pairs that detect all known protein coding isoforms were invented (for details see Example 4).
  • primer is intended to mean a synthetic oligonucleotide that may be modified or contain modified bases and is sufficiently complementary to a target DNA sequence to be extended by a polymerase in for example the polymerase chain reaction.
  • the Illumina Gene Expression BeadArray technology is based on 3 micron perfectly spherical beads that sit in micro-wells that are formed in either the end of a fiber optic bundle or on a planar surface such as a silica slide.
  • the beads have a very uniform spacing of 5.7 micron and are loaded at an even depth of 2 micron. These beads act as the functional elements of the array and are evenly covered with about 700 to 800 thousand copies of a specific oligonucleotide that acts as the capture sequence in a genotyping or gene expression assay.
  • the beads themselves are not colored and are coated with unlabelled oligonucleotides, but these in turn hybridize to fluorescently tagged reaction products from bioassays. Further details of how this technique may be used in the present invention can be found in Example 2 below.
  • Fluorescence in situ hybridization is a cytogenetic technique that is used to detect and localize the presence or absence of specific DNA sequences on chromosomes.
  • FISH uses fluorescent probes that bind to only those parts of the chromosome with which they show a high degree of sequence similarity.
  • the probe is tagged directly with fluorophores, with targets for antibodies or with biotin. Tagging can be done in various ways, such as nick translation or PCR using tagged nucleotides. Then, an interphase or metaphase chromosome preparation is produced.
  • the chromosomes are firmly attached to a substrate, usually glass. Repetitive DNA sequences must be blocked by adding short fragments of DNA to the sample.
  • the probe is then applied to the chromosome DNA and incubated for approximately 12 hours while hybridizing. Several wash steps remove all non-hybridized or partially-hybridized probes. The results are then visualized and quantified using a microscope that is capable of exciting the dye and recording images. Specific application of this technique as used in the present invention may be seen in detail in Example 1b below.
  • Enzyme-linked immunosorbent assay (ELISA), or enzyme immunoassay (EIA), is a biochemical technique used mainly in immunology to detect the presence of an antibody or an antigen in a sample.
  • the sample with an unknown amount of antigen is immobilized on a solid support (usually a polystyrene microtiter plate) either non-specifically (via adsorption to the surface) or specifically (via capture by another antibody specific to the same antigen, in a "sandwich" ELISA). After the antigen is immobilized, the detection antibody is added, forming a complex with the antigen.
  • the detection antibody can be covalently linked to an enzyme, or can itself be detected by a secondary antibody which is linked to an enzyme through bioconjugation.
  • the plate is typically washed with a mild detergent solution to remove any proteins or antibodies that are not specifically bound.
  • the plate is developed by adding an enzymatic substrate to produce a visible signal, which indicates the quantity of antigen in the sample.
  • Immunohistochemistry or IHC refers to the process of detecting antigens (e.g., proteins) in cells of a tissue section by exploiting the principle of antibodies binding specifically to antigens in biological tissues. Immunohistochemical staining is widely used in the diagnosis of abnormal cells such as those found in cancerous tumors. Specific molecular markers are characteristic of particular cellular events such as proliferation or cell death (apoptosis). Visualising an antibody-antigen interaction can be accomplished in a number of ways. In the most common instance, an antibody is conjugated to an enzyme, such as peroxidase, that can catalyse a colour-producing reaction. Alternatively, the antibody can also be tagged to a fluorophore, such as fluorescein or rhodamine.
  • the direct method is a one-step staining method and involves a labeled antibody (e.g. FITC-conjugated antiserum) reacting directly with the antigen in tissue sections. While this technique utilizes only one antibody and is therefore simple and rapid, its sensitivity is lower because it provides little signal amplification, unlike indirect methods, and it is less commonly used than indirect methods.
  • the indirect method involves an unlabeled primary antibody (first layer) that binds to the target antigen in the tissue and a labeled secondary antibody (second layer) that reacts with the primary antibody. After immunohistochemical staining of the target antigen, a second stain is often applied to provide contrast that helps the primary stain stand out.
  • Western Blot is an analytical technique used to detect specific proteins in a given sample of tissue homogenate or extract. It uses gel electrophoresis to separate native or denatured proteins by the length of the polypeptide (denaturing conditions) or by the 3-D structure of the protein (native/ non-denaturing conditions). The proteins are then transferred to a membrane (typically nitrocellulose or PVDF), where they are probed (detected) using antibodies specific to the target protein.
  • the term "activity of the WHSC1L1 (Wolf-Hirschhorn syndrome candidate 1-like 1) gene" includes but is not limited to activities such as changes in the gene copy number, i.e. changes in the number of copies of the gene of interest. Such a change may be classified as amplification, gain, loss and deletion on the genome level.
  • a "gain" in copy number has occurred when about 1-2 more copies than the normal gene copy number (but less than amplification levels) are produced, an "amplification" when there is more than 2.5-fold the normal gene copy number, a "heterozygous loss" when whole or small DNA segments of a chromosome have been lost, and a "homozygous deletion" when both copies of a gene have been lost.
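The copy number categories defined above can be sketched as a simple lookup. This is a minimal illustration that assumes a normal diploid copy number of 2, treats every count above normal up to 2.5-fold as a "gain", and treats every non-zero count below normal as a "heterozygous loss"; the function name, the "normal" label and these boundary choices are assumptions, not part of the disclosure.

```python
def classify_copy_number(copies, normal_copies=2):
    """Map an estimated gene copy number to a category.

    Assumptions (not from the source text): normal_copies defaults to 2,
    and counts between normal and 2.5-fold normal are all called 'gain'.
    """
    if copies < 0:
        raise ValueError("copy number cannot be negative")
    if copies == 0:
        return "homozygous deletion"    # both copies lost
    if copies < normal_copies:
        return "heterozygous loss"      # some, but not all, copies lost
    if copies > 2.5 * normal_copies:
        return "amplification"          # more than 2.5-fold normal
    if copies > normal_copies:
        return "gain"                   # above normal, below amplification
    return "normal"


for n in (0, 1, 2, 3, 6):
    print(n, classify_copy_number(n))
```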
  • the activity may include assessing the WHSC1L1 gene copy number, gene sequence variations that affect WHSC1L1 transcription, or the expression level of the WHSC1L1 gene.
  • gene expression is measured relatively, i.e., comparing the expression of a potential oncomarker to that of a reference gene, whose expression supposedly should not depend on the disease.
  • Such measurements usually provide more accurate quantitative information, since normalization with the reference gene removes much of the confounding variation introduced by the handling of the sample, including degradation of the sample material that may have occurred during the sample preservation, transport, storage and possibly thawing of the sample.
  • the reference gene should preferably not be located in the 8p11-p12 region, since its expression should be independent of the disease. There is an advantage in using more than one reference gene and normalizing to their geometric mean expression levels, since the average value will be a more robust normalizer. Preferably 2, 3, 4, 5, 6, 7, 8, 9, 10 or more reference genes are used for normalization purposes. Suitable reference genes are, for example, those available in the Human Endogenous Control Gene Panel from TATAA Biocenter (www.tataa.com).
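Normalization to the geometric mean of several reference genes, as described above, can be sketched as follows. The sketch assumes that all inputs are linear-scale relative quantities (not raw Cq values); the function name and the example numbers are hypothetical.

```python
from statistics import geometric_mean


def normalize_to_references(target_level, reference_levels):
    """Divide a target gene's expression by the geometric mean of the
    expression levels of the reference genes (all on a linear scale)."""
    if not reference_levels:
        raise ValueError("at least one reference gene is required")
    return target_level / geometric_mean(reference_levels)


# Hypothetical relative quantities for one target and two reference genes.
print(normalize_to_references(8.0, [2.0, 8.0]))
```

The geometric mean is less sensitive than the arithmetic mean to a single reference gene with an unusually high level, which is one reason it is preferred for this purpose.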
  • the activity of the WHSC1L1 gene may also be reflected by any changes in expression of products of this gene, i.e. changes in the expression levels of transcription or translation products (as defined above) produced by this gene. Such changes may e.g.
  • the expression level is considered to be enhanced when there is more than 1.5-fold expression compared to the normal gene expression level and the expression is considered abnormally suppressed when the expression is below 50 % of the normal level.
  • the term "normal level" is the gene expression level in a sample known to be devoid of any abnormal activity of the WHSC1L1 gene or gene product.
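The expression thresholds stated above (more than 1.5-fold of the normal level for enhanced expression, below 50% of the normal level for suppressed expression) can be sketched as a small classifier. The function name and the "normal" label for the intermediate range are illustrative assumptions.

```python
def classify_expression(sample_level, normal_level):
    """Classify expression relative to the normal level.

    'enhanced'   : more than 1.5-fold the normal level
    'suppressed' : below 50% of the normal level
    'normal'     : anything in between (label assumed for illustration)
    """
    if normal_level <= 0:
        raise ValueError("normal_level must be positive")
    fold = sample_level / normal_level
    if fold > 1.5:
        return "enhanced"
    if fold < 0.5:
        return "suppressed"
    return "normal"


print(classify_expression(4.0, 2.0))
print(classify_expression(0.9, 2.0))
```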
  • activity encompasses any physiological or biochemical activities displayed by, or associated with, a particular protein or protein complex including, but not limited to, activities exhibited in biological processes and cellular functions, the ability to interact with or bind another molecule or a moiety thereof, a binding affinity or specificity to certain molecules, the in vitro or in vivo stability (e.g., protein degradation rate, or in the case of protein complexes, the ability to maintain the form of a protein complex), antigenicity, immunogenicity, and enzymatic activities.
  • the activity of the WHSC1L1 gene or gene product can be assessed by measuring the WHSC1L1 protein in one or several of its isoforms, truncated variants or peptide fragments. Such activities may be detected or assayed by any of a variety of suitable methods as will be apparent to skilled artisans.
  • the change in activity is compared to a positive and/or negative control.
  • Positive controls confirm that the procedure is competent in observing the effect (therefore minimizing false negatives).
  • Negative controls confirm that the procedure is not observing an unrelated effect (therefore minimizing false positives).
  • a positive control is a procedure that is very similar to the actual experimental test, but which is known from previous experience to give a result that is hypothesized to occur in the treatment group (positive result).
  • a negative control is known to give a negative result.
  • the positive control confirms that the basic conditions of the experiment were able to produce a positive result, even if none of the actual experimental samples produce a positive result.
  • the negative control demonstrates the base-line result obtained when a test does not produce a measurable positive result; often the value of the negative control is treated as a "background” value to be subtracted from the test sample results, or be used as the "100%" value against which the test sample results are weighed.
  • the negative control may represent a "normal sample", i.e. a sample known to be devoid of any abnormal activity of the WHSC1L1 gene or gene product.
  • the activity of the WHSC1L1 gene or gene product is thus compared to a "normal sample” and any difference in activity of the tested sample compared to that of the normal sample may be used in the assessment of a diagnosis and/or prognosis of cancer.
  • biopsy material, blood, serum, urine, saliva, bone marrow, and cell population samples known to be diseased or healthy in the species to be analyzed may be used as positive and/or negative controls.
  • the molecular markers to be assessed should indicate the activity of the WHSC1L1 gene or gene product thereof. However, preferably more than one molecular marker indicating the activity of the WHSC1L1 gene or gene product thereof is assessed in the method of the present invention.
  • the molecular markers assessed are the gene coding for WHSC1L1, and the relative transcription level of the same WHSC1L1 gene. To the great surprise of the inventors, it is seen that an increase in the mRNA levels of the WHSC1L1 gene, i.e. more than 1.5-fold the normal gene expression level, is found very frequently among breast cancer patients (for details see example 2 below).
  • the WHSC1L1 gene was over-expressed in 75-93% of the tumors having an 8p11-p12 amplification; the spread reflects variation in hybridization among the probes used (Table 1, column 1), which target different regions of the gene, and the BAC clones (Table 1, column 2). Furthermore, for a large number of the breast tumors 8p11-p12 chromosomal aberrations are not found, but by measuring mRNA expression of a number of genes located in the 8p11-p12 region, it was found that the WHSC1L1 gene was over-expressed in 29-45% (depending on the probe and the BAC clone used, Table 1, columns 1-2) of these cases.
  • Some preferred further markers include but are not limited to v-myc myelocytomatosis viral oncogene homolog (avian; MYC), Human Epidermal growth factor Receptor 2 (HER2), and StAR-related lipid transfer (START) domain containing 3 (STARD3).
  • the amplification of the genes of the further markers of cancer or preferably the expression thereof is measured, but is not limited thereto.
  • diagnosis is defined to encompass the following processes either individually or cumulatively depending upon the clinical context: determining the presence of disease, determining the nature of a disease, and distinguishing one disease from another.
  • prognosis is defined to encompass forecasting as to the probable outcome of a disease state, determining the prospect as to recovery from a disease as indicated by the nature and symptoms of a case, monitoring the disease status of a patient, monitoring a patient for recurrence of disease, and/or determining the preferred therapeutic regimen for a patient.
  • the degree of WHSC1L1 over-expression reflects the severity of the disease, and that over-expression is a negative prognostic factor, showing substantially shorter survival rates for patients with WHSC1L1 over-expression compared to patients with normal WHSC1L1 gene transcript level. Furthermore, there is shorter survival of patients with extensive WHSC1L1 over-expression compared to patients with only modest over-expression.
  • the degree of WHSC1L1 over-expression can be used as prognostic factor, wherein the likelihood of survival is assessed based on the WHSC1L1 expression level. Of particular interest is to correlate the level of WHSC1L1 expression to survival time.
  • there are two main forms of WHSC1L1 that differ in length, and the present invention teaches how they can be exploited to further improve the diagnostic and theranostic relevance of the invention (see examples 5 and 6).
  • qPCR assays were invented that selectively quantify the short transcript variant of WHSC1L1, the long transcript variant of WHSC1L1 and both variants of WHSC1L1.
  • CTC circulating tumor cell
  • kits for in vitro diagnosing and/or prognosing cancer in a biological sample comprising
  • kits comprising means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, said changes being indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof.
  • said kit may include positive or negative controls and control samples, such as biopsy material, blood, serum, urine, saliva, bone marrow, and cell population samples known to be diseased or healthy of the species to be analysed.
  • said kit may include instructional materials disclosing, for example, use of the means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, or means of use for a particular reagent.
  • the instructional materials may be written, in an electronic form (e.g., computer diskette or compact disk) or may be visual (e.g., video files).
  • the kits may also include additional components to facilitate the particular application for which the kit is designed.
  • the kit can include buffers and other reagents routinely used for the practice of a particular disclosed method. Such kits and appropriate contents are well known to those of skill in the art.
  • the kit may further comprise, in an amount sufficient for at least one assay, the means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region described herein as a separately packaged reagent, as well as separate instructions for its use to selectively recognize oligomeric form compared to corresponding monomers.
  • Instructions for use of the packaged reagent are also typically included.
  • Such instructions typically include a tangible expression describing reagent concentrations and/or at least one assay method parameter such as the relative amounts of reagent and sample to be mixed, maintenance time periods for reagent/sample admixtures, temperature, buffer conditions, and the like.
  • Said kit may further include a carrier means, such as a box, a bag, a satchel, plastic carton (such as moulded plastic or other clear packaging), wrapper (such as, a sealed or sealable plastic, paper, or metallic wrapper), or other container.
  • kit components will be enclosed in a single packaging unit, such as a box or other container, which packaging unit may have compartments into which one or more components of the kit can be placed.
  • a kit includes one or more containers, for instance vials, tubes, and the like that can retain, for example, one or more biological samples to be tested.
  • kit embodiments include, for instance, syringes, cotton swabs, or latex gloves, which may be useful for handling, collecting and/or processing a biological sample.
  • Kits may also optionally contain implements useful for moving a biological sample from one location to another, including, for example, droppers, syringes, and the like.
  • Still said kit may include disposal means for discarding used or no longer needed items (such as subject samples, etc.).
  • disposal means can include, without limitation, containers that are capable of containing leakage from discarded materials, such as plastic, metal or other impermeable bags, boxes or containers.
  • Example 1a Molecular characterization of 8p11-p12 genetic aberrations in primary invasive breast tumors
  • Hybridization Kit (Corning). The hybridized arrays were incubated for 72 h at 37°C.
  • the x-axis corresponds to the genomic region from chromosome 1 to X and the y-axis to the percentage of gains and losses in the given chromosomal region, respectively.
  • the core of the amplified region ( Figure 2) displayed a heterogeneous pattern with a complex structure, varying in amplitude and size among the tumor specimens.
  • the amplicon mapped to a 12.1 Megabase (Mb) region from 31.9-43.9 Mb (from telomere to centromere) with 5 subregions of amplification and 9 minimal common amplification peaks (range, 41.2-377.4 kb) from 34.3-42.5 Mb, identifying loci for candidate genes involved in breast cancer development and progression.
  • One of the smallest peaks, and notably the most common, mapped to a 67.9 kb region spanning the WHSC1L1 (Wolf-Hirschhorn syndrome candidate 1-like 1) gene on chromosome band 8p12.
  • the WHSC1L1 gene was amplified in 32/47 samples, consisting of four Basal-like, three HER2/ER-, and twenty-five Luminal
  • Example 1b Verification of 8p11-p12 genetic aberrations in the cells of breast tumors using Fluorescence in situ hybridization (FISH)
  • a panel of 42 overlapping BAC (Bacterial Artificial Chromosome) clones building an assembled set of overlapping DNA sequences (a contig) over the altered region was isolated.
  • Genetic aberrations identified at the 8p11-p12 locus using array-CGH were verified with fluorescence in situ hybridization (FISH; Figure 3).
  • FISH analyses of the 8p11-p12 region were carried out on touch preparation imprints from fresh cuts of frozen tumor samples to characterize and narrow down the number of genes within the region that may be targeted for amplification and/or deletion.
  • the tumor imprints were air-dried overnight at room temperature and fixed and gradually dehydrated through an ethanol (EtOH) gradient (70%, 80%, and 100% EtOH for 3 min each).
  • the slides were stored at -20°C in 100% EtOH.
  • One slide was counterstained with 4',6-diamidino-2-phenylindole (DAPI) in an antifade solution (Vectashield™, Vector Laboratories, Inc., Burlingame, CA, USA) for analysis of the quality of the imprint and fixation process.
  • Hybridization probes for FISH were prepared from BAC clones selected by compiling array-CGH data from breast tumor samples revealing amplifications in the 8p11-p12 chromosome region. Forty-two BAC clones were chosen which appeared in the peaks of amplification of the array-CGH graphs.
  • the BAC clones (BacPac Resources, CHORI, Oakland, CA, USA) used in FISH experiments were from the RPCI-11 (RP11) and Caltech whole genome libraries (CTD) unless otherwise stated.
  • the BAC clones were inoculated from bacteria stabs with a sterile culture loop and grown on agar plates containing Luria-Bertani (LB) media supplemented with chloramphenicol (12.5 µg/ml). To prevent the isolation of degraded DNA, two single colonies were separately inoculated into two 10 ml snap-cap polypropylene tubes containing 1 ml LB media. The bacteria were cultured for 4 hours at 37°C in a shaking incubator.
  • the bacterial lysate was treated with 0.3 ml P2 solution (1 M NaOH, 10% SDS) and incubated for 5 min at room temperature, allowing the appearance of the suspension to alter from very turbid to almost transparent.
  • the suspensions were treated slowly with 0.3 ml P3 solution (3 M KOAc pH 5.5, stored at 4°C) and placed on ice for at least 5 min. A thick white precipitate of protein and E. coli was produced.
  • the suspensions were centrifuged at 10,000 rpm for 10 min at 4°C. The supernatant was carefully transferred to a new
  • each DNA sample and 1 µl 6x Blue Gel Loading Dye were analyzed by electrophoresis on a 1% agarose gel, stained with GelRed, and visualized with UV illumination using GelDoc 2000 (Bio Rad, Hercules, CA, USA).
  • Each BAC DNA probe was labeled by nick translation using
  • Equal amounts of two differentially labeled probes (2 µg) were pooled pairwise, together with unlabeled Cot-1 DNA (Invitrogen, Carlsbad, CA and Roche, Mannheim, Germany).
  • the Cot-1 DNA was included to suppress repetitive sequences.
  • the probe DNA mixtures were precipitated with 0.1 volume 3 M sodium acetate (NaOAc) pH 5.2 and 2.5 volumes ice cold 100% EtOH and incubated overnight at -20°C.
  • the probe DNA mixtures were centrifuged for 30 minutes at 13000-15000 rpm at 4°C, washed in 0.5 ml ice cold 70% EtOH, and centrifuged again for an additional 30 minutes.
  • the DNA samples were air-dried and resuspended in 200 µl hybridization mixture (50% formamide, 10% dextran sulfate, and 2xSSC). Dual-color FISH analysis was performed on interphase cells using BAC biotin- and digoxigenin-labelled probes. All BACs were checked for chimerism and chromosomal sub-localization at 8p11-p12 using FISH on metaphase chromosomes prepared from normal lymphocytes. Metaphase chromosomes and/or interphase cells were denatured with 70% formamide/2xSSC at 75°C for 2 min, followed by dehydration through a graded ice cold ethanol series (70%, 80%, and 100% EtOH) for 3 min each, and allowed to air-dry.
  • Hybridization probe mixture (100 ng) was denatured at 76°C for 5 min, put on ice for 1 min, pre-annealed at 37°C for 10 min, and hybridized to the denatured cells on the slides. Each sample was covered by a cover slip and sealed with rubber cement. Hybridization was performed at 37°C for 24-72 hours in a humidified chamber. Slides were washed with an SSC gradient (2xSSC for 5 min, 0.5xSSC in a 72°C water bath for 2 min, 0.1xSSC for 3 min while shaking), and dipped momentarily in 4xSSC/Tween 20.
  • the slides were treated with a blocking solution (3% bovine serum albumin (BSA)) at 37°C for 10 min and briefly washed in 4xSSC/Tween 20.
  • in the first staining stage, a fluorochrome mixture consisting of 1% BSA, fluorescein isothiocyanate (FITC) for biotin (Roche, Mannheim, Germany), and Rhodamine for digoxigenin (Roche, Mannheim, Germany) was applied to the slides, which were coverslipped and incubated in a humidified chamber at 37°C for 30-60 min. The slides were then washed in 4xSSC/Tween 20 for 3x5 min while shaking. Cells were counterstained with DAPI in an antifade solution.
  • the slides were analyzed using a Leica DMRA2 fluorescent microscope (Leica, Wetzlar, Germany) equipped with an ORCA Hamamatsu CCD (charge-coupled device) camera (Hamamatsu City, Japan) and filter cubes specific for FITC, Rhodamine, and UV for DAPI visualization. Digitized black and white images were captured using the Leica CW4000 software package (Cambridge, UK). At least 100 nuclei with intact morphology on the basis of DAPI counterstaining were scored from each clinical specimen. Clumped, damaged, and overlapping nuclei, as well as nuclei located outside the area of probe hybridization, were excluded from the analysis. If fewer than 100 cells were on the slide, then as many nuclei as possible were evaluated.
  • a gene amplification classification system was applied where specimens were classified as low level amplification when the number of specific hybridization signals varied from 5 to 10 signals, moderate amplification when the number of signals varied from 11 to 20, and as high level amplification when the number of signals was > 20.
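The signal-count classification above can be sketched as follows. The thresholds come from the text; the function name, the "not amplified" label for fewer than 5 signals, and the handling of non-integer averages (for example, 10.5 is placed in the low level class) are assumptions made for illustration.

```python
def classify_fish_amplification(signals):
    """Classify a specimen by its number of specific hybridization signals.

    5-10 signals  -> low level amplification
    11-20 signals -> moderate amplification
    > 20 signals  -> high level amplification
    Fewer than 5 signals is labelled 'not amplified' (assumed label).
    """
    if signals < 0:
        raise ValueError("signal count cannot be negative")
    if signals > 20:
        return "high level amplification"
    if signals >= 11:
        return "moderate amplification"
    if signals >= 5:
        return "low level amplification"
    return "not amplified"


for count in (2, 7, 15, 30):
    print(count, classify_fish_amplification(count))
```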
  • An average DNA copy number was calculated, taking into consideration the clonal heterogeneity seen in the analyzed specimens.
  • Hybridization signals were observed in a variety of patterns, from a clustered to a slightly scattered pattern. These patterns could vary between cells in the same tissue sample, as well as between tissue samples.
  • the tested BAC clones were highly amplified in a number of tumors, but not amplified with the same pattern in all analyzed tumors.
  • Three types of FISH methods were used in this study, namely, 1) metaphase-, 2) interphase-, and 3) Fiber-FISH.
  • the BAC clones were checked for sub-chromosomal localization and chimerism by hybridizing the BAC probes to metaphase chromosomes prepared from normal lymphocytes. Chimeric clones exist in genomic libraries, such as BAC libraries, which can give misleading results.
  • Interphase FISH was performed on imprints from breast tumor samples using a total of forty-two BAC clones.
  • Fiber-FISH was used on stretched free chromatin samples when signals in interphase FISH were difficult to distinguish.
  • the hybridization patterns also varied according to the tumor specimen and the BAC clone. There were two main types of observed patterns: either the hybridization signals were clustered in set positions in the interphase nuclei or the signals were scattered.
  • Recent 24-Color 3D FISH studies have shown that human chromosomes have fixed positions in interphase nuclei (Bolzer A, Kreth G, Solovei I, et al. Three-dimensional maps of all chromosomes in human male fibroblast nuclei and prometaphase rosettes. PLoS Biol 2005;3:e157).
  • the clustered hybridization signals may represent
  • HSRs homogenously staining regions
  • Analyzed tumor samples from the same patient were either surgically removed from multiple foci in the same breast or from different breasts.
  • the genetic profiles for these tumors were often drastically different. For example, one tumor from the same patient may appear normal at the 8p11-p12 region, while the other showed low-level
  • the tumors may either be genetically distinct or that the second tumor contained genetic material from the primary tumor as well as additional DNA mutations.
  • Example 2 Effects of 8p11-p12 amplification on gene expression patterns in primary invasive breast tumors
  • the expression microarrays contained approximately 49,000 probes representing more than 25,400 RefSeq (Build 36.2, Release 22) and Unigene (Build 199) annotated genes. Images and raw signal intensities were acquired using the Illumina BeadArray Reader scanner and BeadScan 3.5.31.17122 (Illumina) image analysis software, respectively. Data preprocessing and quantile normalization were applied to the raw signal intensities using BASE. Further data processing was performed in Nexus Expression 2.0 (BioDiscovery) using log2-transformed, normalized expression values and a variance filter. Normalized values from five normal breast samples profiled with Illumina HumanWG-6 Expression Beadchips (Gene Expression Omnibus, accession number GSE17072) were used as reference. Differentially expressed genes were determined using the Benjamini-Hochberg method to control for the false discovery rate (FDR) with FDR-corrected p-values < 0.01.
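The Benjamini-Hochberg correction mentioned above can be sketched in plain Python. This is the commonly used step-up adjustment and is offered only as an illustration; the analysis described here was performed in dedicated software, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), then
    monotonicity is enforced by taking a running minimum from the largest
    rank downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four probes.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [q < 0.01 for q in adjusted]
print(adjusted, significant)
```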
  • expression profiling was analyzed using two different approaches to delineate genes within the amplicon with oncogenic potential.
  • a correlation analysis between DNA and relative mRNA levels was also performed.
  • Illumina HumanHT-12 probe nucleotide sequences were mapped to genomic locations (NCBI Build 35) using sequences downloaded from the UCSC Genome Browser (Internet address: http://hgdownload.cse.ucsc.edu/goldenPath/hg17/chromosomes/).
  • a pair-wise comparison of the Illumina probe and BAC clone nucleotide sequences was then conducted to generate Illumina-BAC probe pairs with 100% sequence similarity.
  • 327 Illumina-BAC probe pairs spanning the 8p11-p12 genomic region were selected from smoothed array-CGH data. The statistical analysis was performed in three sequential steps.
  • CNA copy number alteration
  • probe pairs with Pearson correlation r > 0.7 were selected for further analysis to assess the number of tumors displaying gene over-expression (log2 > 0.58) when amplified (log2ratio > 0.5), as well as the number of tumors displaying over-expression in the absence of gene amplification (-0.2 < log2ratio < 0.5).
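The selection step above combines a Pearson correlation between DNA copy number and mRNA level with fixed log2 thresholds. The sketch below shows both pieces in plain Python; the function names, the "other" label for copy ratios outside the two stated ranges, and the example values are assumptions for illustration. Note that log2(1.5) is approximately 0.585, so the 0.58 cut-off corresponds to roughly a 1.5-fold change.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def tumor_status(log2_copy_ratio, log2_expression):
    """Return (copy number state, over-expressed?) for one tumor."""
    if log2_copy_ratio > 0.5:
        copy_state = "amplified"
    elif -0.2 < log2_copy_ratio < 0.5:
        copy_state = "not amplified"
    else:
        copy_state = "other"  # outside both ranges given in the text
    return copy_state, log2_expression > 0.58


# Hypothetical per-tumor values for one probe pair.
copy_ratios = [0.1, 0.8, 1.2, 0.0]
expression = [0.2, 0.9, 1.5, 0.7]
print(pearson_r(copy_ratios, expression))
print([tumor_status(c, e) for c, e in zip(copy_ratios, expression)])
```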
  • Statistical analyses were performed in R/Bioconductor. These analyses showed that gene copy number impacts gene expression levels.
  • 115 probe pairs showed a significant association using Pearson correlation (Benjamini-Hochberg adjusted p-value < 0.5). Using this approach, the 8 identified genes were narrowed down to 11 unique genes (R > 0.7).
  • the WHSC1L1 gene had the highest DNA/mRNA correlation in 116 breast tumors (45 with 8p11-p12 amplification and 71 with no 8p11-p12 genetic aberrations).
  • the WHSC1L1 gene was also over-expressed (log2 ≥ 0.58, at least 1.5-fold change) in 39 out of 87 tumors in the absence of 8p11-p12 amplification.
  • Table 1 shows the statistical analysis of the effect of the DNA gene dosage on relative mRNA levels in 116 primary invasive breast tumors. Eleven significantly regulated genes (R > 0.7; Benjamini-Hochberg adjusted p-value < 0.01) were identified in the 8p11-p12 amplification region using array-CGH and expression analysis.
  • Column 1 in Table 1 shows the Illumina HumanHT-12 probe
  • Column 2 shows the BAC clone
  • Column 3 shows the targeted gene
  • Column 4 shows the number of tumors with amplification in the particular BAC clone
  • Column 5 shows the number of those tumors with the targeted gene over-expressed
  • Column 6 shows the percentage of tumors with amplification that have the target gene over-expressed
  • Column 7 shows the number of tumors without amplification
  • Column 8 shows the number of these that have the target gene over-expressed
  • Column 9 shows the percentage of tumors without amplification that have the target gene over-expressed
  • Column 10 shows the total number of tumors analyzed using the particular BAC clone
  • Column 11 shows the number of these in which the target gene is over-expressed
  • Column 12 shows the percentage of all tumors analyzed using the particular BAC clone that have the target gene over-expressed
  • Each gene is represented by several BAC clones and Illumina transcripts in the array-CGH and expression platforms. Gene amplification was designated at an array-CGH log2 ratio > 0.5 and normal DNA copy number at a log2 ratio < 0.5; over-expression was designated relative to normal breast tissue at a log2 ratio > 0.58 (1.5-fold change). Abbreviation: ND, not determined.
  • WHSC1L1 was over-expressed in 75-93% of the tumors with 8p11-p12 amplification based on 38K array comparative genomic hybridization (array-CGH) with BAC clones (Table 1, column 6).
  • array-CGH array comparative genomic hybridization
  • Table 1 summarizes data for gene transcripts with the strongest DNA-mRNA correlation (r > 0.7) for tumors without genetic alterations; genes not listed are over-expressed in less than 8% of these cases. Since protein production depends on the presence of RNA and in general increases with the level of mRNA, WHSC1L1 protein is expected to be the most frequently over-expressed protein marker in the 8p11-p12 region. Hence, the present invention offers better means to detect tumors in breast cancer patients with normal DNA copy number in the 8p11-p12 chromosomal region than when using any other biomarker.
  • WHSC1L1 can according to this invention be measured also by other means than with the bead-based Illumina system.
  • Figure 4 shows the much higher expression of WHSC1L1 in the tumor samples harboring 8p11-p12 amplification compared to tumor samples with normal DNA dosage levels when measured by quantitative real-time PCR (qPCR).
  • the expression of WHSC1L1 was measured using pre-designed TaqMan® Gene Expression Assay from Life Technologies that detects the splice variants WHSC1L1-001 and WHSC1L1-002.
  • patients with extensive (> 2-fold) over-expression of WHSC1L1 have significantly lower survival rates than patients with only modest (1.5 - 2-fold) over-expression of WHSC1L1 (Figure 7).
  • This correlation between WHSC1L1 gene expression level and survival makes the WHSC1L1 gene transcript level a most suitable marker for prognostics either as a sole biological marker or as a critical, possibly dominant component in a multimarker based survival score calculation.
  • FFPE formalin-fixed, paraffin-embedded tissues
  • Optimal antibody dilutions and assay conditions were achieved for immunohistochemistry using breast carcinoma as positive controls.
  • Four micrometer FFPE sections were subsequently immunostained with an antibody specific for the short WHSC1L1 isoform (WHSC1L1-S).
  • the sections were pretreated using the Dako PTLink system (Dako, Carpinteria, CA, USA) for 60 min and processed on an automated Dako Autostainer platform using the Dako Envision™ FLEX High pH Link Kit (pH 9) for WHSC1L1-S (Sigma-Aldrich HPA018893, 1:200 dilution).
  • Peroxidase-catalyzed diaminobenzidine was used as the chromogen, followed by hematoxylin counterstain.
  • the slides were then rinsed with deionized water, dehydrated in absolute alcohol, followed by 95% alcohol, cleared in xylene, and mounted.
  • H&E hematoxylin and eosin
  • the staining index (SI) was determined to define low (SI ≤ 6) or high (SI > 6) expression by calculating the product of the percentage and intensity of positively stained cells. The mean staining intensity was used for replicate samples.
  • FFPE specimens did not contain tissue from invasive tumors, but frequently consisted of a mixture of normal, inflammatory, hyperplastic, and/or in situ tissues. Inflammatory infiltration in the specimens ranged from minimal to strong. Subsequently, all specimens were
  • WHSC1L1 was not expressed in inflammatory and normal cell structure, however, X cases displayed positive staining. In invasive tissue, ubiquitous cytoplasmic expression was observed, but nuclear staining was predominantly observed in tissues harboring amplification of the WHSC1L1 gene. Minimal nuclear staining was also observed in one ductal in situ case. The WHSC1L1-S antigen was detected in the cytoplasm for 53/90 (59%) cases, of which 37 showed low expression and 16 high expression. Two of the positive cases showed simultaneous expression in the cytoplasm and nucleus.
  • Figure 8 shows (A) weak cytoplasmic staining in the invasive and in situ components of a tumor lacking WHSC1L1 gene amplification; (B) displays moderate cytoplasmic and strong nuclear staining in invasive epithelial cells in a tumor harboring WHSC1L1 gene amplification using archived breast carcinoma tissue and the WHSC1L1 antibody (1:200; Sigma-Aldrich). (A) and (B) were taken at X100 magnification.
  • Example 5: Quantification of WHSC1L1 gene expression using novel, highly specific quantitative real-time PCR. Critical for any testing of molecular markers is a highly specific and sensitive assay that targets all the relevant variants of the transcript that may be present. At least five transcript variants of the WHSC1L1 gene have been described and in order to obtain the highest sensitivity it is essential that an assay targets all of the transcript variants with high specificity, high tolerance for interfering substances and high efficiency. No such assays were commercially available and therefore an assay was designed that detects all known splice variants of WHSC1L1 with high specificity, high sensitivity, high accuracy, wide dynamic range and high PCR efficiency, and that is not sensitive to the interfering agents typically present in complex sample matrices. After extensive in silico design, testing, and confirmations and then extensive screening of many candidates under challenging conditions we unexpectedly found these assays to perform exceedingly well:
  • Reverse (Rev) primer 5'-ACACAAGATCGCCAACCTGAA-3' (SEQ ID NO: 2)
  • Fwd primer 5'-GGCTTATAATTTCTACACCAAA-3' (SEQ ID NO: 3)
  • Rev primer 5'-GAACGAGTTCTAATTGATCTTC-3' (SEQ ID NO: 4)
  • Probe 5'-AAGCCAACGCAGAGTGTATCATCT-3' (SEQ ID NO: 5)
  • Dye-based (D) assays usually reach higher PCR efficiency and are therefore often more reproducible than probe-based (P) assays; they are also less expensive than P-assays, which in turn have higher specificity.
  • the D-Assay is designed to detect all splice variants of WHSC1L1 gene transcripts using a sequence non-specific dye such as SYBR Green I, Chromofy, SYTO9, EvaGreen and the like.
  • the assay spans between exons 2 and 4, which are present in all splice variants, and does not amplify genomic DNA under the qPCR conditions used.
  • the P-Assay is also able to detect all variants of WHSC1L1 gene transcripts, but with even higher specificity by means of a probe, preferably a hydrolysis probe also known by the brand name TaqMan®, but it can be used with dye as reporter as well.
  • the P-assay spans between exon 7 and does not amplify genomic DNA under the qPCR assay conditions used here.
  • Assays that only amplify certain splice variants of WHSC1L1, and that are expected to detect different combinations of WHSC1L1 isoforms, were also designed.
  • an assay that amplifies the transcript producing the short isoform but not the long isoform was designed:
  • WHSC1L1_S_Fwd ACAGTTCCTCAGGCTACAGTGAAGA (SEQ ID NO: 6)
  • WHSC1L1_S_Rev CTCAATCGCTGCGGAGACGG (SEQ ID NO: 7)
  • WHSC1L1_L_Fw ACAGTTCCTCAGGCTACAGTGAAGA (SEQ ID NO: 8)
  • WHSC1L1_L_Rev CTCAATCGCTGCGGAGACGG (SEQ ID NO: 9)
  • RNA quality was assessed using the RNA 6000 Nano LabChip Kit with Agilent 2100 Bioanalyzer (Agilent Technologies).
  • Complementary DNA was produced using SuperScript™ III First-Strand Synthesis SuperMix for qRT-PCR (Invitrogen) using 1 µg total RNA.
  • a qPCR master mix was prepared as: 2 µl iQ SYBR Green Supermix (BioRad, 170-8882), 0.3 µl of 10 µM of each primer and 2.7 µl water (8 µl in total, with 300 nM of each primer).
  • the serial dilution of the cDNA for the standard curve was done in TE/LPA buffer (4 ml 1x TE was mixed with 3.2 µl of 25 µg/µl LPA (5-6575 Sigma), which gave 20 ng/µl of LPA).
  • Two microliters of the cDNA was mixed with 8 µl of the master mix giving a total reaction volume of 10 µl.
  • reaction mix was PCR amplified using an ABI 7500 Fast qPCR instrument with the protocol: activation for 3 minutes at 95°C, followed by 40 cycles of 3 seconds at 95°C and 30 seconds at 60°C. After completed PCR a melt curve was recorded between 60 and 95°C.
  • H shows WHSC1L1 transcripts (total RNA) measured with the D-assay in selected representative cancer samples. Substantially more WHSC1L1 transcripts are found in samples wherein the region is amplified, and there is also strong correlation with the amount of WHSC1L1 transcripts measured with the novel qPCR assay and transcripts measured with the Illumina assays (example 2).
  • Blood samples (7.5 ml) were collected from 30 metastatic breast cancer patients before the start of a new line of therapy. The inclusion criteria were: age above 18 years; patients with measurable or evaluable metastatic breast cancer; Eastern Cooperative Oncology Group (ECOG) scores for performance status of 0-2; no severe uncontrolled comorbidities or medical conditions; no second malignancies. Patients had either a relapse or were diagnosed for BC earlier (D1 year) and were about to start chemotherapy or had documented progressive BC before receiving a new endocrine, chemo- or experimental therapy. Informed consent for participation in the study was obtained from all patients. The blood samples were enriched for CTCs using immunomagnetic cell capture with the CancerSelect Breast Cancer test from AdnaGen (Langenhagen, Germany).
  • CTC samples were analyzed for the expression of EpCAM, MUC-1, and HER2 by classical parallel PCR and capillary electrophoresis using the CancerDetect Breast Cancer test from AdnaGen (Langenhagen, Germany).
  • a second part was pre-amplified by PCR for a limited number of cycles with TaqMan® PreAmp Master Mix according to the manufacturer's instructions using in-house-designed assays at a final concentration of 25 nM.
  • Pre-amplified cDNA was used as template for qPCR analysis of 31 transcripts (TOP2A, ADAM17, PARP1, VEGF, VEGFR, PRG, ESR, MTOR, AKT2, STATB1, PTEN, KRT19, EPCAM, AURKA, MUC1, CD45, CXCR1, UPA, MCM, CXCR1, KI67, TWIST, ALDH1, p53, HER2, kRAS, C-myc, and the short splice variant of WHSC1L1 (WHSC1L1-S) and total expression of WHSC1L1 (WHSC1L1-L)) on the BioMark™ HD System using 48x48 Dynamic Array™ integrated fluidic circuits (IFCs) (Fluidigm, USA).
  • IFCs Dynamic ArrayTM integrated fluidic circuits
  • the IFC chip was primed in the NanoFlex™ IFC Controller (Fluidigm, USA) prior to analysis.
  • the assay and sample mixtures were prepared according to the manufacturer's protocol for EvaGreen (Fluidigm, USA) and loaded into each sample/detector inlet of the dynamic array chip according to the loading scheme.
  • The chip was inserted into the IFC Controller for the loading procedure, followed by thermal cycling on the BioMark qPCR System.
  • the cycling program for Ssofast EvaGreen Supermix was modified as follows: 3 minutes initial denaturation and enzyme activation at 95°C followed by 40 cycles of cycling at 95°C for 15 seconds, 60°C for 20 seconds, and 72°C for 20 seconds.
  • the chip was resubmitted into the BioMark system for melt curve analysis using recommended protocol.
  • BioMark Gene Expression Data Analysis and BioMark Melt Curve Analysis software were used to obtain Ct values and Tm data.
  • the data was analyzed for stably expressed genes that could serve as normalizers, but no such genes were found. Instead, the data was either normalized to the average expression of all genes (global normalization) or not normalized, which corresponds to normalization to the sample amount extracted. The lack of a suitable normalizer produces ambiguous results, although the main features are seen.
  • the data was further autoscaled (subtracting mean expression of every gene and dividing with its standard deviation) to give all markers the same weights.
  • Figure 11 shows (A) classification using Principal Component Analysis (PCA) of CTC samples collected from 21 patients with a panel of 31 markers including assays with differential sensitivity for splice variants of WHSC1L1. Stars indicate patients that are still alive and hexagons indicate deceased patients. There is clear separation of the live and deceased patients along PC1 as well as along PC2 evidencing the here invented markers and tests are of prognostic value.
  • PCA Principal Component Analysis
  • genes form three main clusters: a) AURKA, MCM, KRT19, HER2, MUC, EPCAM, TOP2, VEGFA, CXCR1 and WHSC1L1-S; b) TWIST, SCCB, PGR, UPA, VEGFR, KI67, KRAS, ALDH and ESR7; c) TP53, PARP, CTSD, CD24, ADAM, AKT, MYC, CD45, PTEN, MTOR, SATB and WHSC1L1-T.
  • the short variant and total expression of WHSC1L1 are found in different clusters, evidencing that they are important for the classification.
  • Expression of WHSC1L1 is found in all samples and there is correlation between high expression of WHSC1L1 and CTC positivity based on the AdnaTest (p < 0.05), evidencing that the invention is useful as a cancer marker also for analysis of samples collected from blood.
  • There are two main forms of WHSC1L1 that differ in length and the present invention teaches how they can be exploited to further improve the diagnostic and theranostic relevance of the invention.
  • qPCR assays were invented that selectively quantify the short transcript variant of WHSC1L1, the long transcript variant of WHSC1L1 and both variants of WHSC1L1.
  • Figure 11 C: the majority of survivors with CTC-positive samples, based on the AdnaGen test, express predominantly the short variant of WHSC1L1.
  • Figure 12 A compares separately the expression
  • the relative expression of the short and long splice variants can be measured using the invented assays specific for the short and long transcript variants, respectively, or it can be calculated from measurements of either an assay specific for the short splice variant or an assay specific for the long splice variant combined with measurement of total WHSC1L1 expression.
  • the relative expression of the short to long splice variant of WHSC1L1 among survivors and deceased patients is shown in Figure 12 B.
  • An advantage of the herein invented approach of comparing the relative expression of two (or more) splice variants of the same gene is that normalization to reference genes, which often is unreliable in cancer diagnostics, is redundant.
  • the measurement of an elevated relative expression of the short variant of WHSC1L1 compared to the long variant using the assays of the present invention provides a positive indicator for the survival of cancer patients.
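The two ways of obtaining the short-to-long ratio described above can be sketched from quantification cycle (Ct) values. This is a minimal illustration, not the applicants' implementation: the function names and Ct values are hypothetical, and it assumes that all assays amplify with the same efficiency (a factor of 2 per cycle), share a detection threshold, and that only the short and long forms contribute to the total signal.

```python
def ratio_from_specific_assays(ct_short, ct_long, efficiency=2.0):
    """Short/long ratio from two variant-specific assays.

    Assumption: both assays amplify with the same efficiency, so each
    cycle of Ct difference corresponds to one factor of `efficiency`.
    """
    return efficiency ** (ct_long - ct_short)


def ratio_from_total(ct_short, ct_total, efficiency=2.0):
    """Short/long ratio from a short-specific assay plus a total assay.

    Relative quantities are taken as efficiency ** -Ct; the long variant
    is estimated as total minus short (assumes only two forms exist).
    """
    q_short = efficiency ** -ct_short
    q_total = efficiency ** -ct_total
    q_long = q_total - q_short
    if q_long <= 0:
        raise ValueError("total signal must exceed the short-variant signal")
    return q_short / q_long


# Illustrative Ct values only (not data from the study).
print(ratio_from_specific_assays(ct_short=24.0, ct_long=26.0))
print(ratio_from_total(ct_short=25.0, ct_total=24.0))
```

Because both quantities come from the same sample, no reference gene enters the calculation, which is the point made in the text about normalization being redundant.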

Abstract

The present invention provides a method for in vitro diagnosing and/or prognosing cancer in a biological sample, wherein changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region are assessed, said changes being indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof. Said changes in activity are used during the assessment of a diagnosis and/or prognosis of cancer and in particular breast cancer. The invention also provides a kit for use in the assessment of cancer.

Description

TITLE
MOLECULAR MARKER FOR CANCER
TECHNICAL FIELD
The present invention relates to the assessment of a gene in the 8p11-p12 chromosomal region and gene products thereof, and their use as powerful molecular markers in decisions related to the diagnosis and/or prognosis of cancer, and in particular breast cancer.
BACKGROUND OF THE INVENTION
Breast cancer is the most commonly diagnosed malignancy in Swedish females, accounting for around 30% of female cancers diagnosed in Sweden. The selection of treatment is influenced by various prognostic factors such as axillary lymph node status, pathological tumor grade, S-phase fraction, and the status of molecular markers such as ERBB2/HER2 and the estrogen receptor. The current prognostic factors are, however, inadequate when distinguishing between favorable and unfavorable prognosis patients, resulting in the selection of inadequate treatment for many breast cancer cases, while treating many others unsuccessfully, and thereby imposing unnecessary adverse effects. Clearly, many patients could benefit greatly from additional complementary molecular markers, which may help guide treatment decisions and be of value in the development of new therapeutic agents.
During the last two decades, cancer research has shown that tumorigenesis results from the accumulation of critical mutations in a single somatic cell, which in turn, leads to the loss of normal cell growth control. Mutations vary from single nucleotide defects in the DNA sequence to epigenetic modulations and complex rearrangements of entire chromosomes. Most mutations are somatic, acquired as a result of environmental factors or simply by chance. However, genetic predisposition (heredity) plays a role in the development of cancer resulting in different cancer forms and progression of the disease. This is reflected in its polygenetic and heterogeneous, as well as, environmental predisposition.
In recent years, the analysis of the genetic composition of frozen or otherwise preserved archived breast tumors has identified several potential drug targets and key molecular markers for prognosis and treatment decisions. Among these are genes located within the most frequently amplified DNA regions, i.e. 17q12, 8q24, and 11q13. Oncogenes, such as ERBB2/HER2, MYC, and CCND1 located in these regions are attractive targets for drug development. Other amplified regions have also been identified in breast cancer, such as 8p11-p12 and 20q13; however, the driver genes in these genomic regions have remained elusive.
Management of cancer is complicated because of the highly heterogeneous nature of the disease. During the last decades, major efforts have been made to identify molecular markers that can guide decision making processes, including markers for (i) Disease predisposition - i.e. the preventive examination to predict which individuals are more or less likely to develop cancer; (ii) Primary diagnostics - i.e. the screening of individuals without symptoms to detect early onset of cancer; (iii) Therapy/disease monitoring wherein biomarkers are measured during ongoing therapy to monitor the patient's response and adjust therapy accordingly; (iv) Relapse - i.e. the monitoring of patients that have undergone therapy and initially have no symptoms, to detect any relapse early; (v) Prognostics to predict the severity of the disease and likelihood of long term survival of patients; and (vi) Theranostics where biomarkers are used to prescribe optimum therapy and to identify patients that are likely to respond to any given therapy. For the management of breast cancer, recommendations on which markers to use are given by the American Society of Clinical Oncology (ASCO). The most recent ASCO guidelines are from 2007 and summarize data from 321 references (J. of Clin. Oncology 25, 2007).
Biomarkers can be assessed at different biological levels. The genome can be analyzed for mutations, copy number variations (allelic loss, gain, and amplification), and epigenetic modulations. The transcriptome can be assessed by methods analyzing RNA, and the proteome by methods for protein/peptide analysis. Recently, controlling elements such as microRNAs and other non-coding RNAs can also be measured as molecular markers. Most markers are assayed individually, such as circulating MUC-1 using CA 15-3 and CA 27.29 assays, Carcinoembryonic antigen (CEA), estrogen (ER) and progesterone receptor (PgR), HER2, Ki67, cyclin D, cyclin E, p27, p21, thymidine kinase (TK), topoisomerase II, p53, Urokinase plasminogen activator (uPA), plasminogen activator inhibitor (PAI-1), etc. These markers are good for specific cases and may provide valuable indications, but they are generally insufficient and only a few of them are recommended by the guidelines of most countries. It is also expensive to test for many markers, particularly when testing them individually. An approach to reduce the cost per marker testing is to use arrays, chips or microfluidic platforms that allow tens to hundreds and potentially even larger numbers of markers to be measured in parallel. Indeed, in February 2007, the U.S. Food and Drug Administration (FDA) cleared the MammaPrint test from Agendia, which is based on a 70-gene breast cancer gene signature to be used in the U.S. for lymph node negative breast cancer patients under 61 years of age with tumors of less than 5 cm. Genomic Health has developed Oncotype Dx based on a panel of 21 genes to predict breast cancer recurrence within 10 years of the initial diagnosis. The Erasmus MC/Daniel den Hoed Cancer Center, Rotterdam, the Netherlands, has generated a 76-gene panel for prognosis of lymph-node-negative breast cancer patients.
Comparing the markers of these panels it is found that the 70-gene MammaPrint panel and the 76-gene Rotterdam panel share only three genes, and neither of them has any genes in common with the 21-gene Oncotype Dx panel. The reason all three panels perform rather well despite negligible gene overlap is that many genes can reflect the same aberrations. For example, the 8p11-12 region spans approximately 10 Mb and encompasses some 61 known genes, any of which can potentially be used as markers for genomic aberrations, such as loss or amplification of this region. Although expression pathways are not as unequivocally defined as genomic aberrations, any malfunction will be reflected by any of the genes involved. The three tests show important overlap among the expression pathways and chromosomal hotspots reflected by the genes in their panels.
Although there is redundancy among the genes that can be used for diagnostic and prognostic purposes, certain genes will be more powerful markers for any single pathway or aberration than others. For example, while it is common that large segments in the genome are lost or amplified, and can be detected by any of the genes in the region, amplification or loss of a shorter region will only be reflected by the genes involved and will have an effect only if the genes are relevant to the disease. If the genes are assessed on the expression level, other factors also affect their suitability as markers. For example, the level of expression may differ among the markers, their degradation rate, and, particularly for protein markers, their availability. For example, it is of great advantage if the markers are secreted. Also, the degree of disease involvement of the marker is expected to be relevant. If the marker is directly affecting the course of the disease even small differences in its expression level may be significant, while the levels of markers that are only affected indirectly may be moderated by many factors and therefore changes in their levels will be less significant. It is very challenging to identify optimum markers for the various aberrations and expression pathways of relevance for cancer. Approaches based on global screening of genes are likely to find candidates for the most relevant pathways and aberrations, but they will not find the optimal set because of the high false discovery rates in the statistical analysis. To identify optimal markers, the system must be reduced to study only the relevant genes. As already mentioned for the breast cancer hotspot regions 17q12, 8q24, and 11q13, consensus key marker genes have been identified, while for 8p11-p12 and 20q13 the most relevant markers have remained elusive until disclosed in the herein described invention.
Amplification of the 8p11-p12 region has been associated with increased cancer proliferation rate (high SBR tumor grade, Ki-67 expression, and cyclin E expression) and reduced survival. Although the amplification and over-expression of specific genes in this region have been reported, there is conflicting information on which genes can function as bona fide oncogenes and serve as key diagnostic, prognostic and theranostic markers (Bernard-Pierrot I, Gruel N, Stransky N, et al. Characterization of the recurrent 8p11-p12 amplicon identifies PPAPDC1B, a phosphatase protein, as a new therapeutic target in breast cancer. Cancer Res 2008;68:7165-75; Gelsi-Boyer V, Orsetti B, Cervera N, et al. Comprehensive profiling of 8p11-12 amplification in breast cancer. Mol Cancer Res 2005;3:655-67). These reports propose several candidates as target for amplification including FGFR1 and PPAPDC1B. A recent functional study proposed PPAPDC1B as the amplification target (Bernard-Pierrot I, Gruel N, Stransky N, et al. Characterization of the recurrent 8p11-12 amplicon identifies PPAPDC1B, a phosphatase protein, as a new therapeutic target in breast cancer. Cancer Res 2008;68:7165-75). However, until now researchers have studied the genomic amplification process. These studies reveal genes that are amplified and may serve as markers for amplification of the region, but those markers may not be best for the assessment of breast cancer disease.
A path for cancer spread is via peripheral blood. Tumor cells shed from the primary tumor, enter circulation, and find homing organs where they develop into metastases. Circulating tumor cells (CTC) can be enriched from peripheral blood using a variety of technologies that rely on, for example, the expression of certain antibodies or cell size. Currently only one such system is approved by the FDA: CellSearch (Veridex, Raritan, NJ), which relies on counting of CTC's. Recent studies suggest the technique may underestimate the number of CTC's. Furthermore, the system does not characterize the CTC's. Other techniques, such as the AdnaTest (AdnaGen, Langenhagen, Germany) which also enriches for CTC's, analyze the CTC's for expression markers. CTC's from breast cancer patients, for example, are analyzed for the markers EpCAM, MUC-1, and HER2. Besides giving an indication of the CTC load by the overall expression of the tumor markers, the test also indicates features of the spreading cells. Recent publications disclose measurements of even larger number of markers from the CTC's (Sieuwerts et al., J. Natl. Cancer Inst. 2009, 101, 61-66). These studies are very promising, but suffer from drawbacks. Most importantly, the enriched CTC's from the peripheral blood are not pure, but the sample contains also other cells, most likely leukocytes but possibly also other cells. This compromises attempts to normalize the data in order to separate the response of markers stemming from the CTC's from markers deriving from the other cells present. Of course, markers exclusively expressed in cancer cells should come from the CTC's. But few markers, if any, are expressed exclusively in cancer cells due to illegitimate expression in normal cells. Still, these techniques provide no count of the cells for normalization to determine the load, which remains a problem.
Therefore, markers that indicate disease state and progression without relying on traditional normalization to cell count or housekeeping genes should have important advantages.
There is thus a need for novel molecular markers to assess breast cancer. The present application addresses that need.
SUMMARY OF THE INVENTION
In the present invention, not only genome amplification of genes within the 8p11-p12 region is studied in tumor biopsies, but also the general activation of the genes in this region. Most unexpectedly it was found that the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, which has never before been regarded as an important cancer marker nor used in any of the panels mentioned above, is in fact the best cancer marker among the genes in the 8p11-12 region.
Thus, one object of the present invention is to provide a method to diagnose and/or prognose cancer in a biological sample in vitro, said method comprising the steps of a) assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, and wherein said changes are indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof,
b) comparing the amount of changes assessed in a) above to a positive and/or negative control, thereby diagnosing and/or prognosing cancer.
The method may include diagnosing and/or prognosing cancer when the cancer is breast cancer. The method may include diagnosing and/or prognosing breast cancer when the breast cancer is invasive breast cancer.
The biological sample may be any kind of biopsy, blood, serum, plasma, urine, saliva, bone marrow, and/or cell populations. The one or more molecular markers assessed in the method of invention may be selected from the group consisting of genomic sequences or parts thereof, transcription products, translation products, and polypeptides.
The genomic sequence may be selected from the group consisting of genes and gene fragments, non-coding gene sequences and epigenetic signatures.
The transcription product may be selected from the group consisting of primary transcripts, spliced primary transcripts, mRNA, microRNA, cDNA, and non-coding RNA.
The translation product may be selected from the group consisting of peptides, polypeptides, proteins or fragments thereof, and post translationally modified proteins. The translation product may be an isoform of the WHSC1L1 protein or a fragment thereof, and/or a post translationally modified isoform of the WHSC1L1 protein.
The WHSC1L1 gene, or a gene product thereof may be the sole molecular marker assessed in the 8p11-p12 chromosomal region.
The one or more molecular markers can be assessed by Array comparative genomic hybridization (array-CGH), Polymerase Chain Reaction (PCR) based techniques, Illumina's Gene Expression BeadArray technology, Fluorescence in situ hybridization (FISH), Enzyme-linked immunosorbent assay (ELISA), Enzyme immunoassay (EIA) and/or Western Blot.
The activity of the WHSC1L1 gene may be a change in the copy number of the WHSC1L1 gene.
The activity of the WHSC1L1 gene may be an amplification, gain, heterozygous loss or homozygous deletion of the WHSC1L1 gene.
The activity of the WHSC1L1 gene product may be enhanced or abnormally suppressed expression of said gene product.
The copy number of the WHSC1L1 gene can be measured by real-time quantitative PCR.
The copy number of the WHSC1L1 gene can be measured by real-time quantitative PCR using the primers: 5'-GCAACAAGCATGACTCATCAAGAT-3' (SEQ ID NO: 1) and 5'-ACACAAGATCGCCAACCTGAA-3' (SEQ ID NO: 2).
The copy number of the WHSC1L1 gene can be measured by real-time quantitative PCR using the primers: 5'-GGCTTATAATTTCTACACCAAA-3' (SEQ ID NO: 3) and 5'-GAACGAGTTCTAATTGATCTTC-3' (SEQ ID NO: 4) optionally in combination with a probe.
The probe may be a hydrolysis probe with the sequence: 5'-AAGCCAACGCAGAGTGTATCATCT-3' (SEQ ID NO: 5).
The expression level of the WHSC1L1 gene may be assessed relative to the expression of one, preferably two, and more preferably 5-10 reference genes.
The expression levels of two or more splice variants of the WHSC1L1 gene may be assessed relative to each other. The activity of the WHSC1L1 gene, or gene products thereof may be assessed by the number of DNA copies of the WHSC1L1 genome in combination with a relative expression level of the WHSC1L1 gene transcript or translation products thereof.
At least 2, 4, 6, 8, 10, 20, 30, 40, 50, 60, 70, 80 or 90 further markers for cancer may be assessed in combination with the one or more molecular markers indicating the activity of the WHSC1L1 gene or gene products thereof.
The further markers for cancer may be selected from the group consisting of v-myc myelocytomatosis viral oncogene homolog (avian; MYC), Human Epidermal growth factor Receptor 2 (HER2), and StAR-related lipid transfer (START) domain containing 3 (STARD3).
An amplification of a gene coding for the further marker for cancer may be assessed. An expression of a gene coding for the further marker for cancer may be assessed.
An enhanced expression level of the WHSC1L1 gene transcription and/or translation products, but no amplification of the 8p11-p12 region may be an indication of a negative prognosis of cancer.
A relative expression of the short (CCDS6105) and long (CCDS43729) isoform of WHSC1L1 may be assessed.
An assessment of isoforms of WHSC1L1 may be made by measuring differential expression of said isoforms.
An assessment of isoforms of WHSC1L1 may be made by measuring differential expression of a set of transcripts that includes the transcript producing the short WHSC1L1 isoform (CCDS6105) relative to a set of transcripts that does not include it, or a set of transcripts that includes the transcript producing the long WHSC1L1 isoform (CCDS43729) relative to a set of transcripts that does not include it.
Cancer cells enriched from circulation (CTC) may be analyzed.
A further object of the invention is to provide a kit for in vitro diagnosing and/or prognosing cancer in a biological sample, said kit comprising means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, and wherein said changes indicate an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof.
The kit may further comprise a positive and/or a negative control.
The kit may further comprise instructions to the method as disclosed above.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows genome-wide frequency plots of gains and losses in tumors harboring 8p11-p12 amplification (n=45), gain (n=18), loss (n=16) and tumors lacking these aberrations (n=71).
Figure 2 shows an identification of the 8p11-p12 region in a diploid breast tumor using array-CGH.
Figure 3 shows a verification of 8p11-p12 amplification in the cells of a breast tumor using FISH.
Figure 4 shows the expression of WHSC1L1 in the tumor samples harboring 8p11-p12 amplification and tumor samples with normal DNA dosage measured with quantitative real-time PCR.
Figure 5 shows effects of A) all forms of 8p11-p12 genetic aberrations, B) 8p11-p12 amplification, C) 8p11-p12 loss and D) 8p11-p12 gain on patient overall survival rates.
Figure 6 shows effects of A) WHSC1L1 gene amplification, B) WHSC1L1 over-expression and C) WHSC1L1 gene amplification and over-expression on patient overall survival rates using the Illumina gene expression system.
Figure 7 shows effects of either WHSC1L1 gene amplification and over-expression, or no WHSC1L1 gene amplification but over-expression.
Figures 8A-B show the total expression of WHSC1L1-S in the tumor sample harboring WHSC1L1 gene amplification compared to the tumor sample with normal DNA dosage levels.
Figure 9 shows effects of total WHSC1L1-S expression on patient overall survival rates using the WHSC1L1 antibody with immunohistochemistry.
Figures 10A-H show the sensitivity, specificity and dynamic range of the herein invented qPCR assays for WHSC1L1.
Figures 11A-C show classification using (A) Principal Component Analysis (PCA) or (B) hierarchical clustering of CTC samples collected from 21 patients with a panel of 31 markers; and (C) classifies the samples based on total expression of WHSC1L1 and expression of the short splice variant of WHSC1L1.
Figures 12A and B compare the ratio between the expression of the short splice variant of WHSC1L1 and total WHSC1L1 expression among patients classified as survivors and non-survivors.
DETAILED DESCRIPTION OF THE INVENTION
In order to provide the correct treatment for cancer patients it is imperative to diagnose the disease as accurately as possible. It is necessary to have a better understanding of the genetic modulations, the involvement of specific oncogenes (i.e. genes whose product may transform normal cells to cancer cells), tumor suppressor genes and the activities of molecular pathways involved. The present inventors have now very unexpectedly found that the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene located in the 8p11-p12 chromosomal region can be used as a powerful molecular marker in the assessment of breast cancer.
Therefore, one aspect of the present invention concerns a method for in vitro diagnosing and/or prognosing cancer in a biological sample, said method comprising the steps of a) assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, and wherein said changes are indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof, b) comparing the amount of change assessed in a) above to a positive and/or negative control, thereby diagnosing and/or prognosing cancer.
The biological sample may be any kind of biopsy including fresh-frozen tissue samples, paraffin-embedded biopsy material, or archived biopsy material, blood, serum, urine, saliva, bone marrow, and cell populations that may comprise normal cells or cancer cells.
The results obtained from performing the in vitro method of the invention may be used in the assessment of a diagnosis and/or prognosis of cancer in a subject. A "subject" as used herein is intended to mean any living animal or human. The term subject includes, but is not limited to, humans, nonhuman primates such as chimpanzees and other apes and monkey species, farm animals such as cattle, sheep, pigs, goats and horses, domestic mammals such as dogs and cats, laboratory animals including rodents such as mice, rats and guinea pigs, and the like. The term does not denote a particular age or sex. Thus, adult and newborn subjects, as well as fetuses, whether male or female, are intended to be covered. In preferred embodiments, the subject is a mammal, including humans and non-human mammals. In one embodiment, the subject is a human.
As used herein the term "cancer" refers to a physical condition in mammals that is typically characterized by a group of cells that display uncontrolled growth (division beyond the normal limits), invasion (intrusion on and destruction of adjacent tissues), and sometimes metastasis (spread to other locations in the body via lymph or blood).
Examples of cancers include but are not limited to breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, and brain cancer. The cancer can be any of the above-mentioned types of cancer but preferably is a solid tumor of epithelial origin, such as breast cancer, and can, for example, be tumors in invasive breast cancer (locally advanced, locally recurrent or metastatic), or stage II, stage III and stage IV breast cancer.
The term breast cancer as used herein encompasses several different subtypes of breast cancer. Invasive breast cancer may include tumors stratified by the number of lesions detected at the time of diagnosis, histological cell type, pathological tumor size, level of differentiation, DNA content (diploid (i.e. the normal (expected) amount of chromosomes in a normal cell, i.e. two copies of each autosome and two sex chromosomes), aneuploid (i.e. the chromosomal amount deviating from that detected in a normal cell as a result of chromosomal gains or losses or multiploid)), steroid hormone receptor content, HER2/neu expression, triple-negative status and the number of detected axillary lymph nodes (i.e. the lymph nodes located under the armpit to which cells from a primary breast tumor often spread). Axillary lymph node-negative (pN0) refers to primary breast tumors that have not spread to these nodes while axillary lymph node-positive (pN1) refers to those that have. Specifically, the breast cancer may include single or multiple lesions detected at the same time of diagnosis (synchronous) or at least 3 months apart (metachronous), in the same breast (ipsilateral) or in both breasts (bilateral). The breast cancer may include tumors of the group selected from the type of tissue from which the cancer arises, such as ductal (tumors originating from the ducts of the breast), lobular (tumors originating from the lobules of the breast), or rare cases originating from connective tissue.
As used herein the term "pathological tumor size" is intended to mean the physical tumor volume. pT1 is a tumor 2 cm (3/4 of an inch) or less across; pT2 is a tumor more than 2 cm but not more than 5 cm (2 inches) across; pT3 is a tumor more than 5 cm across; pT4 is a tumor of any size growing into the chest wall or skin, including inflammatory breast cancer. The tumors included in the present invention may vary in size from smaller than 2 cm (3/4 of an inch) to over 5 cm (2 inches) across, as well as in differentiation levels. The Scarff-Bloom-Richardson (SBR) grade system is used to classify tumors based on three morphological features. These morphological features include the degree of tumor tubule formation, mitotic activity, and nuclear pleomorphism of tumor cells (also referred to as nuclear grade). The system is organized into three grades: SBR grade I (well differentiated), SBR grade II (moderately differentiated), and SBR grade III (poorly differentiated). In addition, the expression profiles of proliferation-related genes can be used to stratify estrogen receptor positive breast cancers according to histologic grade. As used herein this stratification is referred to as the "Genomic Grade Index (GGI)", where low GGI corresponds frequently to SBR grade I and high GGI corresponds frequently to SBR grade III. As used herein the terms "axillary lymph node negative" and "axillary lymph node positive" refer to subjects with the absence or presence of cancer cells in the axillary lymph nodes. Lastly, a triple negative breast cancer status is assigned based on the presence, or lack thereof, of three important receptors: estrogen, progesterone, and the human epidermal growth factor receptor 2 (HER2/neu) oncogene. None of these receptors are present in triple negative breast cancers (estrogen receptor negative, progesterone receptor negative, HER2/neu negative), which therefore are not responsive to receptor-targeted treatments.
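The size-based pT categories described above can be sketched as a small lookup. This is only an illustration: the function name `classify_pt` and the boolean flag for chest wall or skin involvement are hypothetical and are not part of the disclosure.

```python
def classify_pt(size_cm: float, invades_chest_wall_or_skin: bool = False) -> str:
    """Return the pathological tumor size category described in the text.

    Both the name and the signature are illustrative assumptions.
    """
    if size_cm <= 0:
        raise ValueError("tumor size must be positive")
    # pT4 is defined by growth into the chest wall or skin, regardless of size.
    if invades_chest_wall_or_skin:
        return "pT4"
    if size_cm <= 2.0:   # 2 cm or less across
        return "pT1"
    if size_cm <= 5.0:   # more than 2 cm but not more than 5 cm
        return "pT2"
    return "pT3"         # more than 5 cm across
```

For example, `classify_pt(3.0)` falls in the pT2 range, while a tumor of any size with chest wall or skin involvement is reported as pT4.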
In the described invention the method involves assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, and said changes are indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof. The term "molecular marker" is here intended to mean any molecule that can be measured and whose abundance reflects the activity of targeted genes in the 8p11-p12 chromosomal region, which spans an approximately 8 Mb region and encompasses about 61 known genes. More specifically the method of the invention involves assessing changes in expression levels of one or more molecular markers indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof. The WHSC1L1 gene, which may also be referred to in the literature as NSD3, spans over 90 kb of the 8p12 locus and consists of 24 exons. The WHSC1L1 protein belongs to the histone-lysine methyltransferase family and SET2 (Su(var)3-9, Enhancer-of-zeste, Trithorax) subfamily. The function of the WHSC1L1 protein is still rather unknown, but it may encompass activities such as histone methyltransferase activity, preferentially methylating Lys-4 and Lys-27 of histone H3. Methylation of Lys-4 initiates an epigenetic transcriptional activation, while methylation of Lys-27 initiates an epigenetic transcriptional repression. Within the context of the present invention "the WHSC1L1 gene, or gene products thereof" refers to a contiguous stretch of nucleotide bases within the genome that is transcribed into a transcription product, more specifically an mRNA. As used herein, the term transcription product can refer not only to the mRNA transcribed from the DNA, i.e. a transcript, but also the DNA copy of the mRNA, e.g. cDNA. Such mRNA is subsequently translated into a translation product, i.e. a "gene product" such as e.g. a polypeptide or protein. The gene may have multiple sections, parts or regions, e.g. coding and non-coding sections. The complete gene comprises all of the sections but alternative splicing of the gene may give rise to one or more fragments of a gene consisting of less than all the sections.
The present invention encompasses all known isoforms representing different splice variants of the WHSC1L1 gene as published by the Ensembl project database, http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000147548. The longest, i.e. isoform WHSC1L1-001 (Transcript ID: ENST00000317025 in the Ensembl project database and Consensus Coding Sequence (CCDS) 43729), has 1437 amino acids (Protein ID ENSP00000313983). WHSC1L1-001 contains two PWWP (proline-tryptophan-tryptophan-proline) domains, which are present in proteins of nuclear origin and are involved in protein-protein interactions. It further contains four PHD-type zinc finger motifs (homeodomain), a SET and Post-SET domain, and an AWS (associated with SET) domain with lysine methyltransferase function. Some amino acid residues are missing from the other four isoforms, but all amino acid positional information regarding these isoforms refers to the longest isoform WHSC1L1-001.
Isoform WHSC1L1-002 (Transcript ID: ENST00000316985 and CCDS6105) contains 645 amino acids (Protein ID ENSP00000313410). Amino acids in positions 620-645 are substituted and amino acids in positions 646-1437 are missing; the isoform therefore contains only one of the two PWWP domains. Isoform WHSC1L1-003 (Transcript ID: ENST00000433384) is 1388 amino acids long (Protein ID ENSP00000393284). It lacks the PHD-type 3 domain as amino acid positions 871-919 are missing.
Isoform WHSC1L1-005 (Transcript ID: ENST00000527502) is 1426 amino acids long (Protein ID ENSP00000434730).
Isoform WHSC1L1-008 (Transcript ID: ENST00000534155) is 25 amino acids long (Protein ID ENSP00000432544).
Isoform WHSC1L1-009 (Transcript ID: ENST00000529223) is 282 amino acids long (Protein ID ENSP00000435422).
Isoform WHSC1L1-010 (Transcript ID: ENST00000534539) is 15 amino acids long (Protein ID ENSP00000431598).
Isoform WHSC1L1-011 (Transcript ID: ENST00000528627) is 77 amino acids long (Protein ID ENSP00000435073).
Isoform WHSC1L1-201 (Transcript ID: ENST00000446459) is 1374 amino acids long (Protein ID ENSP00000387883), and is missing residues in positions 67-129.
The short isoform (WHSC1L1-002) and the long isoform (WHSC1L1-001) have been reported to be co-expressed in many tissues, with the short being the prevalent form. Any molecular marker arising from any of the different forms of the WHSC1L1 gene, or from isoforms produced from the splice variants thereof, is encompassed in the present invention.
Examples of molecular markers that are assessed in the present invention may include genomic sequences or fragments thereof, transcription products, translation products and polypeptides. As used herein, the terms "genome(s)" or "genomic sequences" are intended to mean the hereditary information of an organism typically encoded in nucleic acids, either DNA (deoxyribonucleic acid), or RNA, i.e. ribonucleic acid which is a nucleic acid produced with DNA as template, and includes genes and non-coding gene sequences. The genome may refer to the nucleic acids making up one set of chromosomes of an organism (haploid genome) or both sets of chromosomes of an organism (diploid genome) depending on the context in which it is used. The genome may be at least partially isolated, part of a nucleus, and/or in a cell, such as, but not limited to, a germ cell or a somatic cell, i.e. a cell of the body, not including the reproductive cells. In some embodiments, one or more genomes may include, but not be limited to, nuclear, organellar and/or mitochondrial genomes. Examples of genome molecular markers that are included in the present invention are genes and fragments thereof, non-coding gene sequences and epigenetic signatures.
The genomic sequences or genes give rise to products such as transcription products and translational products which also are molecular markers that may be assessed in the method of the invention. As used herein the term "transcription products", generally refers to any equivalent RNA copies of a sequence of DNA which may be unmodified RNA, or modified RNA of any length. Included are primary transcripts (i.e. the nucleic acid sequences corresponding to the transcribed portions of a gene), spliced primary transcripts, messenger RNA (mRNA), microRNA, complementary DNA (cDNA) i.e. a DNA molecule synthesized from an mRNA template, genomic DNA (gDNA), non-coding RNA, epigenetic signatures. In addition, triple-stranded regions comprising RNA or DNA or both RNA and DNA are included in the term transcription products. The strands in such regions may be from the same molecule or from different molecules. Moreover, DNAs or RNAs comprising unusual bases such as inosine, or modified bases such as tritiated bases are also encompassed in the term transcription products.
As used herein the term "translation product" is intended to encompass products that are a result of the translation of messenger RNA (mRNA) by the ribosome to produce a specific amino acid chain, i.e. peptides such as oligopeptides, polypeptides or proteins. The term "peptide" as used herein refers to an amino acid chain of any length greater than two amino acids in which the amino acid residues are linked by peptide bonds or modified peptide bonds into a polypeptide. An oligopeptide is a polypeptide of between 2 and 20 amino acids. A protein is one or more polypeptides more than about 50 amino acids long, often folded into a globular form. Unless otherwise specified, the terms "oligo peptide", "polypeptide" and "protein" also encompass various modified forms thereof. Such modified forms may be naturally occurring modified forms or chemically modified forms, post translationally modified polypeptides, proteins or fragmented or degraded proteins.
Examples of modified forms include, but are not limited to, glycosylated forms, phosphorylated forms, myristoylated forms, palmitoylated forms, ribosylated forms, acetylated forms, ubiquitinated forms. Modifications also include intra-molecular crosslinking and covalent attachment to various moieties such as lipids, flavin, biotin, polyethylene glycol or derivatives thereof. In addition, modifications may also include cyclization, branching and cross-linking.
Examples of fragmented or degraded proteins may be truncated proteins, which are proteins that have not achieved their full length or proper form, and thus are missing some of the amino acid residues that are present in the normal protein. A truncated protein generally cannot perform the function for which it was intended because its structure is incapable of doing so. Further, amino acids other than the conventional twenty amino acids encoded by genes may also be included in a polypeptide or protein. Thus, the translation products of the invention include, but are not limited to, oligopeptides, polypeptides, proteins, truncated proteins or isoforms thereof, post-translationally modified proteins or isoforms thereof, or fragmented or degradation products of a protein, and secreted peptides that originate from a protein.
The molecular markers may be assessed by different techniques that are well known to the person skilled in the field of molecular genetics, proteomics, and chemistry. The following techniques are of particular relevance for the present invention:
Array comparative genomic hybridization (also CMA, Chromosomal Microarray Analysis, Microarray-based comparative genomic hybridization, array-CGH, a-CGH, aCGH, or Virtual Karyotype) is a technique to detect genomic copy number variations at a higher resolution level than chromosome-based comparative genomic hybridization (CGH). DNA from a test sample and normal reference sample are labeled differentially, using different fluorophores, and hybridized to several thousand probes. The probes are derived from most of the known genes and non-coding regions of the genome, and printed on a glass slide. The ratio of the fluorescence intensity of the test to that of the reference DNA is then calculated, to measure the copy number changes for a particular location in the genome. Details for how this technique may be applied in the present invention are described in Example 1a below.
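The test-to-reference intensity ratio described above is commonly expressed on a log2 scale, so that equal dosage gives 0, a doubling gives +1 and a halving gives -1. The sketch below illustrates that calculation for a single probe. The log2 convention and the function name `log2_ratio` are assumptions made for illustration; the text itself only specifies that a ratio is calculated.

```python
import math


def log2_ratio(test_intensity: float, reference_intensity: float) -> float:
    """Log2 of the test/reference fluorescence intensity ratio for one probe.

    Illustrative helper only; the name and the log2 scale are assumptions.
    """
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)
```

A positive value suggests more copies in the test sample than in the reference at that probe location, and a negative value suggests fewer.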
Polymerase Chain Reaction (PCR) based techniques, such as quantitative real-time PCR (qPCR) and digital PCR (dPCR), amplify a single or a few copies of a piece of DNA across several orders of magnitude, generating thousands to millions of copies of a particular DNA sequence. The method relies on thermal cycling, consisting of cycles of repeated heating and cooling of the reaction for DNA melting and enzymatic replication of the DNA. Primers (short DNA fragments) containing sequences complementary to the target region along with a DNA polymerase are key components to enable selective and repeated amplification. As PCR progresses, the DNA generated is itself used as a template for replication, setting in motion a chain reaction in which the DNA template is exponentially amplified. PCR can be extensively modified to perform a wide array of genetic manipulations. In the experiments performed to support the present invention a commercially available qPCR assay was ordered from Life Technologies, which is a leading global provider of qPCR assays. The assay detects the main isoforms, and two novel highly optimized assays based on specific primer pairs that detect all known protein coding isoforms were invented (for details see Example 4). As used herein the term primer is intended to mean a synthetic oligonucleotide that may be modified or contain modified bases and is sufficiently complementary to a target DNA sequence to be extended by a polymerase in for example the polymerase chain reaction. Furthermore, combinations of qPCR assays that reflect relative expression of WHSC1L1 splice variants and thus WHSC1L1 isoforms were invented. An approach to classify the CTC samples based on the relative expression of those transcripts was invented.
The invention has the advantage that it does not rely on normalization with reference genes, which is a very unreliable approach for aneuploid cells and in situations when the cell population is heterogeneous.
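One way to express the abundance of one transcript relative to another without a reference gene is to compare the quantification cycle (Cq) values of the two qPCR assays directly. The sketch below assumes that both assays amplify with the same efficiency (a perfect doubling per cycle by default). The function and parameter names are hypothetical, and this formula is a common qPCR convention rather than one stated in the text.

```python
def relative_expression(cq_target: float, cq_comparator: float,
                        efficiency: float = 1.0) -> float:
    """Abundance of the target transcript relative to the comparator transcript.

    Assumes equal amplification efficiency for both assays; efficiency=1.0
    means the amount of product doubles in every cycle. Illustrative only.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    # A lower Cq means more starting template, hence comparator minus target.
    return (1.0 + efficiency) ** (cq_comparator - cq_target)
```

For instance, if the short splice variant assay gives a Cq two cycles later than the assay for total WHSC1L1, the short variant would be estimated at one quarter of the total under these assumptions.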
The Illumina Gene Expression BeadArray technology is based on 3 micron perfectly spherical beads that sit in micro-wells that are formed in either the end of a fiber optic bundle or on a planar surface such as a silica slide. The beads have a very uniform spacing of 5.7 micron and are loaded at an even depth of 2 micron. These beads act as the functional elements of the array and are evenly covered with about 700 to 800 thousand copies of a specific oligonucleotide that acts as the capture sequence in a genotyping or gene expression assay. The beads themselves are not colored and are coated with unlabelled oligonucleotides, but these in turn hybridize to fluorescently tagged reaction products from bioassays. Further details of how this technique may be used in the present invention can be found in Example 2 below.
Fluorescence in situ hybridization (FISH) is a cytogenetic technique that is used to detect and localize the presence or absence of specific DNA sequences on chromosomes. FISH uses fluorescent probes that bind to only those parts of the chromosome with which they show a high degree of sequence similarity. In short, first a probe is constructed. The probe must be large enough to hybridize specifically with its target but not so large as to impede the hybridization process. The probe is tagged directly with fluorophores, with targets for antibodies or with biotin. Tagging can be done in various ways, such as nick translation or PCR using tagged nucleotides. Then, an interphase or metaphase chromosome preparation is produced. The chromosomes are firmly attached to a substrate, usually glass. Repetitive DNA sequences must be blocked by adding short fragments of DNA to the sample. The probe is then applied to the chromosome DNA and incubated for approximately 12 hours while hybridizing. Several wash steps remove all non-hybridized or partially-hybridized probes. The results are then visualized and quantified using a microscope that is capable of exciting the dye and recording images. Specific application of this technique as used in the present invention may be seen in detail in Example 1b below.
Enzyme-linked immunosorbent assay (ELISA) or Enzyme immunoassay (EIA) are biochemical techniques used mainly in immunology to detect the presence of an antibody or an antigen in a sample. In simple terms, the sample with an unknown amount of antigen is immobilized on a solid support (usually a polystyrene microtiter plate) either non-specifically (via adsorption to the surface) or specifically (via capture by another antibody specific to the same antigen, in a "sandwich" ELISA). After the antigen is immobilized the detection antibody is added, forming a complex with the antigen. The detection antibody can be covalently linked to an enzyme, or can itself be detected by a secondary antibody which is linked to an enzyme through bioconjugation. Between each step the plate is typically washed with a mild detergent solution to remove any proteins or antibodies that are not specifically bound. After the final wash step the plate is developed by adding an enzymatic substrate to produce a visible signal, which indicates the quantity of antigen in the sample.
Immunohistochemistry or IHC refers to the process of detecting antigens (e.g., proteins) in cells of a tissue section by exploiting the principle of antibodies binding specifically to antigens in biological tissues. Immunohistochemical staining is widely used in the diagnosis of abnormal cells such as those found in cancerous tumors. Specific molecular markers are characteristic of particular cellular events such as proliferation or cell death (apoptosis). Visualising an antibody-antigen interaction can be accomplished in a number of ways. In the most common instance, an antibody is conjugated to an enzyme, such as peroxidase, that can catalyse a colour-producing reaction. Alternatively, the antibody can also be tagged to a fluorophore, such as fluorescein or rhodamine. The direct method is a one-step staining method and involves a labeled antibody (e.g. FITC-conjugated antiserum) reacting directly with the antigen in tissue sections. While this technique utilizes only one antibody and therefore is simple and rapid, its sensitivity is lower because there is little signal amplification compared with indirect methods, and it is less commonly used than indirect methods. The indirect method involves an unlabeled primary antibody (first layer) that binds to the target antigen in the tissue and a labeled secondary antibody (second layer) that reacts with the primary antibody. After immunohistochemical staining of the target antigen, a second stain is often applied to provide contrast that helps the primary stain stand out. Many of these stains show specificity for discrete cellular compartments or antigens, while others will stain the whole cell.
Western Blot is an analytical technique used to detect specific proteins in a given sample of tissue homogenate or extract. It uses gel electrophoresis to separate native or denatured proteins by the length of the polypeptide (denaturing conditions) or by the 3-D structure of the protein (native/non-denaturing conditions). The proteins are then transferred to a membrane (typically nitrocellulose or PVDF), where they are probed (detected) using antibodies specific to the target protein.
As used herein the term "activity of the WHSC1L1 (Wolf-Hirschhorn syndrome candidate 1-like 1) gene" includes but is not limited to activities such as changes in the gene copy number, i.e. changes in the number of copies of the gene of interest. Such a change may be classified as amplification, gain, loss and deletion on the genome level. As used herein, a "gain" in copy number has occurred when about 1-2 more copies than the normal gene copy number (but less than amplification levels) are present, an "amplification" when there is more than 2.5-fold the normal gene copy number, a "heterozygous loss" when whole or small DNA segments of a chromosome have been lost, and a "homozygous deletion" when both copies of a gene have been lost.
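The copy number categories defined above can be illustrated with a short classifier. This sketch assumes a normal copy number of two (one copy per homologous chromosome) and an integer copy number estimate. It treats one remaining copy as a heterozygous loss, and any increase that does not exceed 2.5-fold the normal number as a gain. The function name and these boundary choices are assumptions made for illustration.

```python
def classify_copy_number(copies: int, normal_copies: int = 2) -> str:
    """Classify an estimated gene copy number using the definitions in the text.

    Illustrative only: normal_copies=2 and the handling of boundary values
    are assumptions, not part of the disclosure.
    """
    if copies < 0:
        raise ValueError("copy number cannot be negative")
    if copies == 0:
        return "homozygous deletion"   # both copies lost
    if copies < normal_copies:
        return "heterozygous loss"     # one copy lost
    if copies > 2.5 * normal_copies:
        return "amplification"         # more than 2.5-fold the normal number
    if copies > normal_copies:
        return "gain"                  # extra copies below amplification level
    return "normal"
```

With these assumptions, three or four copies are reported as a gain and six or more as an amplification.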
More specifically, the activity may include assessing the WHSC1L1 gene copy number, gene sequence variations that affect WHSC1L1 transcription, or the expression level of the WHSC1L1 gene. By most methods gene expression is measured relatively, i.e., comparing the expression of a potential oncomarker to that of a reference gene, whose expression supposedly should not depend on the disease. Such measurements usually provide more accurate quantitative information, since normalization with the reference gene removes much of the confounding variation introduced by the handling of the sample, including degradation of the sample material that may have occurred during the sample preservation, transport, storage and possibly thawing of the sample. There is thus an advantage of measuring the expression of the WHSC1L1 gene in combination with the expression of other genes for normalization purposes. The reference gene should preferably not be located in the 8p11-p12 region, since its expression should be independent of the disease. There is an advantage in using more than one reference gene and normalizing to their geometric mean expression levels, since the average value will be a more robust normalizer. Preferably 2, 3, 4, 5, 6, 7, 8, 9, 10 or more reference genes are used for normalization purposes. Suitable reference genes are, for example, those available in the Human Endogenous Control Gene Panel from TATAA Biocenter (www.tataa.com). The activity of the WHSC1L1 gene may also be reflected by any changes in expression of products of this gene, i.e. changes in the expression levels of transcription or translation products (as defined above) produced by this gene. Such changes may e.g. be an enhanced or abnormally suppressed expression on the mRNA and/or protein level.
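Normalization to the geometric mean of several reference genes, as described above, can be sketched as follows. The function name `normalize_expression` and the use of linear-scale expression values are assumptions made for illustration; `statistics.geometric_mean` requires Python 3.8 or later.

```python
from statistics import geometric_mean
from typing import Sequence


def normalize_expression(target_level: float,
                         reference_levels: Sequence[float]) -> float:
    """Divide the target gene level by the geometric mean of the reference genes.

    All values are linear-scale expression levels (not Cq values).
    Illustrative helper only.
    """
    if not reference_levels:
        raise ValueError("at least one reference gene is required")
    if target_level < 0 or any(level <= 0 for level in reference_levels):
        raise ValueError("expression levels must be positive")
    return target_level / geometric_mean(reference_levels)
```

The geometric mean is less sensitive than the arithmetic mean to a single reference gene with an unusually high level, which is one reason it is often chosen for this purpose.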
In the present invention the expression level is considered to be enhanced when there is more than 1.5-fold expression compared to the normal gene expression level and the expression is considered abnormally suppressed when the expression is below 50 % of the normal level. As used herein the term "normal level" is the gene expression level in a sample known to be devoid of any abnormal activity of the WHSC1L1 gene or gene product. When used in connection with polypeptide or protein "activity" encompasses any physiological or biochemical activities displayed by, or associated with, a particular protein or protein complex including, but not limited to, activities exhibited in biological processes and cellular functions, the ability to interact with or bind another molecule or a moiety thereof, a binding affinity or specificity to certain molecules, the in vitro or in vivo stability (e.g., protein degradation rate, or in the case of protein complexes, the ability to maintain the form of a protein complex), antigenicity, immunogenicity, and enzymatic activities. It should be pointed out that the activity of the WHSC1L1 gene or gene product can be assessed by measuring the WHSC1L1 protein in one or several of its isoforms, truncated variants or peptide fragments. Such activities may be detected or assayed by any of a variety of suitable methods as will be apparent to skilled artisans.
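As a minimal sketch, the two expression thresholds defined above (more than 1.5-fold the normal level for enhanced expression, below 50 % of the normal level for abnormally suppressed expression) may be applied as follows. The function name and the label "within normal range" for intermediate values are assumptions made for illustration.

```python
def classify_expression(sample_level, normal_level):
    """Classify a sample's expression relative to the normal level.

    Thresholds follow the definitions in the text: > 1.5-fold is enhanced,
    < 50 % of normal is abnormally suppressed. The label for everything in
    between is an assumption made for this sketch.
    """
    if normal_level <= 0:
        raise ValueError("normal_level must be positive")
    fold = sample_level / normal_level
    if fold > 1.5:
        return "enhanced"
    if fold < 0.5:
        return "abnormally suppressed"
    return "within normal range"

# Hypothetical normalized expression values (illustration only).
print(classify_expression(3.2, 2.0))
print(classify_expression(0.8, 2.0))
```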
In the second step of the method, b) the change in activity is compared to a positive and/or negative control. Positive controls confirm that the procedure is competent in observing the effect (therefore minimizing false negatives). Negative controls confirm that the procedure is not observing an unrelated effect (therefore minimizing false positives). A positive control is a procedure that is very similar to the actual experimental test, but which is known from previous experience to give a result that is hypothesized to occur in the treatment group (positive result). A negative control is known to give a negative result. The positive control confirms that the basic conditions of the experiment were able to produce a positive result, even if none of the actual experimental samples produce a positive result. The negative control demonstrates the base-line result obtained when a test does not produce a measurable positive result; often the value of the negative control is treated as a "background" value to be subtracted from the test sample results, or be used as the "100%" value against which the test sample results are weighed. In the present invention the negative control may represent a "normal sample", i.e. a sample known to be devoid of any abnormal activity of the WHSC1L1 gene or gene product. The activity of the WHSC1L1 gene or gene product is thus compared to a "normal sample" and any difference in activity of the tested sample compared to that of the normal sample may be used in the assessment of a diagnosis and/or prognosis of cancer. For example, biopsy material, blood, serum, urine, saliva, bone marrow, and cell population samples known to be diseased or healthy in the species to be analyzed may be used as positive and/or negative controls.
In the present invention at least one of the molecular markers to be assessed should indicate the activity of the WHSC1L1 gene or gene product thereof. However, preferably more than one molecular marker indicating the activity of the WHSC1L1 gene or gene product thereof is assessed in the method of the present invention. In one embodiment of the invention, the molecular markers assessed are the gene coding for WHSC1L1, and the relative transcription level of the same WHSC1L1 gene. To the great surprise of the inventors, it is seen that an increase in the mRNA levels of the WHSC1L1 gene, i.e. more than 1.5-fold the normal gene expression level, is found very frequently among breast cancer patients (for details see example 2 below). Using the method of the invention it was found that the WHSC1L1 gene was over-expressed in 75-93% of the tumors having an 8p11-p12 amplification; the spread reflects variation in hybridization among the probes used (Table 1, column 1), which target different regions of the gene, and the BAC clones (Table 1, column 2). Furthermore, for a large number of the breast tumors 8p11-p12 chromosomal aberrations are not found, but by measuring mRNA expression of a number of genes located in the 8p11-p12 region, it was found that the WHSC1L1 gene was over-expressed in 29-45% (depending on the probe and the BAC clone used, Table 1, columns 1-2) of these cases. Hence, by measuring WHSC1L1 mRNA over-expression most of the tumors with chromosomal aberrations in the 8p11-p12 region but also tumors with no 8p11-p12 chromosomal aberration are detected. Many of these tumors would go undetected if assayed based on DNA dosage only. From example 2 below it is seen that the herein invented approach based on WHSC1L1 gene over-expression applies to 45-53% (Table 1, column 12) of all the studied tumors. This is far better than when using any other molecular marker in the 8p11-p12 region, and possibly better than with any other marker whatsoever.
Although according to this invention a large number of breast tumors can be detected, some still go undetected because the novel WHSC1L1 marker is not over-expressed in all tumors. Hence, there is an advantage in combining the assessment of cancer using molecular markers for the WHSC1L1 gene or gene products thereof with measurement of further markers for cancer. Thus at least 2, 4, 6, 8, 10, 20, 30, 40, 50, 60, 70, 80 or even 90 or more further markers for cancer may be assessed simultaneously. These further markers should preferably be chosen from other chromosomal regions frequently aberrant in breast cancer, including the 17q12, 8q24, 11q13 and 20q13 regions. Some preferred further markers include but are not limited to v-myc myelocytomatosis viral oncogene homolog (avian; MYC), Human Epidermal growth factor Receptor 2 (HER2), and StAR-related lipid transfer (START) domain containing 3 (STARD3). Advantageously the amplification of the genes of the further markers of cancer or preferably the expression thereof is measured, but is not limited thereto. There is less advantage in choosing another marker of the 8p11-p12 region, since according to the invention WHSC1L1 is by far the most powerful one, but of course, in situations where very large numbers of markers can be assessed, adding another marker from the 8p11-p12 region may help detect some of those cases that are missed when assaying for only WHSC1L1.
Common approaches to predict survival time, time to relapse, and other clinical parameters based on molecular markers are typically multivariate in nature, i.e. they take into account the responses of multiple markers as discussed above, whose contributions are weighted. Examples of mathematical approaches behind optimization of predictive models are multivariate regression models, artificial neural networks, support vector machines, principal component analysis, self-organizing maps, potential curves, decision trees, clustering methods and random forests. Because of the very high predictive power of WHSC1L1 based on this invention, multivariate models in which WHSC1L1 is given high weight compared to other markers will be more powerful.
Any change in activity compared to the positive and/or negative control may be used to assess a diagnosis and/or prognosis of cancer. The term "diagnosis" as used herein is defined to encompass the following processes either individually or cumulatively depending upon the clinical context: determining the presence of disease, determining the nature of a disease, and distinguishing one disease from another. The term "prognosis" as used herein is defined to encompass forecasting as to the probable outcome of a disease state, determining the prospect as to recovery from a disease as indicated by the nature and symptoms of a case, monitoring the disease status of a patient, monitoring a patient for recurrence of disease, and/or determining the preferred therapeutic regimen for a patient.
In the present invention, as revealed in Example 3 below, it is seen that the degree of WHSC1L1 over-expression reflects the severity of the disease, and that over-expression is a negative prognostic factor, showing substantially shorter survival rates for patients with WHSC1L1 over-expression compared to patients with normal WHSC1L1 gene transcript level. Furthermore, there is shorter survival of patients with extensive WHSC1L1 over-expression compared to patients with only modest over-expression. According to the invention the degree of WHSC1L1 over-expression can be used as prognostic factor, wherein the likelihood of survival is assessed based on the WHSC1L1 expression level. Of particular interest is to correlate the level of WHSC1L1 expression to survival time.
As discussed above, there are two main forms of WHSC1L1 that differ in length and the present invention teaches how they can be exploited to further improve the diagnostic and theranostic relevance of the invention (see examples 5 and 6). qPCR assays were invented that selectively quantify the short transcript variant of WHSC1L1, the long transcript variant of WHSC1L1 and both variants of WHSC1L1. Unexpectedly and most interestingly, as seen below, the majority of survivors with circulating tumor cell (CTC) positive samples based on the AdnaGen, express predominantly the short variant of WHSC1L1. Clearly, a measurement of an elevated relative expression of the short variant of WHSC1L1 compared to the long variant, i.e. the differential expression between the two isoforms, using the assays of the present invention provides an indicator for the survival of cancer patients as can be seen in the examples below.
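Purely as an illustrative sketch, the relative expression of the short variant compared to the long variant may be estimated from the quantification cycle (Cq) values of the two variant-specific qPCR assays. The sketch assumes that both assays amplify with the same efficiency (a factor of 2 per cycle unless stated otherwise); this assumption, the function name and the Cq values are hypothetical and are not taken from examples 5 and 6.

```python
def short_to_long_ratio(cq_short, cq_long, efficiency=2.0):
    """Estimate the short:long transcript ratio from two Cq values.

    Assumes both assays share the same per-cycle amplification factor
    (`efficiency`), which is a simplification made for this sketch.
    A lower Cq means more template, so the exponent is cq_long - cq_short.
    """
    return efficiency ** (cq_long - cq_short)

# Hypothetical Cq values (illustration only).
ratio = short_to_long_ratio(cq_short=24.0, cq_long=26.0)
print(ratio)
```

A ratio above 1 indicates predominance of the short variant; where the assays differ in efficiency, a standard curve for each assay would be needed instead.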
Further aspects of the invention encompass a kit for in vitro diagnosing and/or prognosing cancer in a biological sample, said kit comprising
i) means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, wherein said changes indicate an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof,
ii) instructions to use said means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, wherein said changes indicate an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof, according to any of the methods described herein.
Thus, a further aspect of the invention provides a kit comprising means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, wherein said changes are indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof.
Further, said kit may include positive or negative controls and control samples, such as biopsy material, blood, serum, urine, saliva, bone marrow, and cell population samples known to be diseased or healthy of the species to be analysed. Further, said kit may include instructional materials disclosing, for example, use of the means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, or means of use for a particular reagent. The instructional materials may be written, in an electronic form (e.g., computer diskette or compact disk) or may be visual (e.g., video files). The kits may also include additional components to facilitate the particular application for which the kit is designed. Thus, for example, the kit can include buffers and other reagents routinely used for the practice of a particular disclosed method. Such kits and appropriate contents are well known to those of skill in the art.
The kit may further comprise, in an amount sufficient for at least one assay, the means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region described herein as a separately packaged reagent, as well as separate instructions for its use to selectively recognize oligomeric form compared to corresponding monomers.
Instructions for use of the packaged reagent are also typically included. Such instructions typically include a tangible expression describing reagent concentrations and/or at least one assay method parameter such as the relative amounts of reagent and sample to be mixed, maintenance time periods for reagent/sample admixtures, temperature, buffer conditions, and the like.
Said kit may further include a carrier means, such as a box, a bag, a satchel, plastic carton (such as moulded plastic or other clear packaging), wrapper (such as, a sealed or sealable plastic, paper, or metallic wrapper), or other container.
In some examples, kit components will be enclosed in a single packaging unit, such as a box or other container, which packaging unit may have compartments into which one or more components of the kit can be placed. In other examples, a kit includes one or more containers, for instance vials, tubes, and the like that can retain, for example, one or more biological samples to be tested.
Other kit embodiments include, for instance, syringes, cotton swabs, or latex gloves, which may be useful for handling, collecting and/or processing a biological sample. Kits may also optionally contain implements useful for moving a biological sample from one location to another, including, for example, droppers, syringes, and the like. Still said kit may include disposal means for discarding used or no longer needed items (such as subject samples, etc.). Such disposal means can include, without limitation, containers that are capable of containing leakage from discarded materials, such as plastic, metal or other impermeable bags, boxes or containers.
In the following examples the invention will be described in more detail. However, the described embodiments mentioned below are only given as examples and should not be limiting to the present invention. Other solutions, uses, objectives, and functions within the scope of the invention as claimed in the below described patent claims should be apparent for the person skilled in the art.
Example 1a: Molecular characterization of 8p11-p12 genetic aberrations in primary invasive breast tumors
Studies to narrow down the boundaries flanking the 8p11-p12 amplification and deletion regions were performed on primary invasive breast tumors and carried out as follows: The tumors were screened for genetic aberrations (herein the term aberration is intended to mean a change in the genetic composition of a biological sample in comparison with a normal sample) using 38K array comparative genomic hybridization (array-CGH). In total, 229 breast tumor specimens from female patients (forty axillary lymph node-negative (pN0), 102 diploid/aneuploid, and eighty-seven multiple breast tumor lesions) were analyzed using array-CGH with 38,043 reporters (i.e. individual sequences of a DNA probe used on a microarray). Normal genomic DNA (gDNA) obtained from females or males was used as reference. In brief, two micrograms of tumor and reference gDNA were differentially labeled with Cy3-dCTP and Cy5-dCTP fluorochromes, respectively. Labeled tumor and reference DNA samples were pooled and mixed with unlabeled Human Cot-1 DNA to mask repetitive sequences. Hybridization probes were denatured for 15 min at 70°C, pre-annealed for 30 min at 37°C, and co-hybridized to arrays pretreated with the Universal Microarray Hybridization Kit (Corning). The hybridized arrays were incubated for 72 h at 37°C. Following hybridization, slides were washed in 2xSSC/0.1% SDS for 15 min at room temperature, 2xSSC/50% formamide for 15 min at 45°C, 2xSSC/0.1% SDS for 30 min at 45°C, 2xSSC for 15 min at room temperature and dried by centrifugation. Data preprocessing and normalization were performed using the web-based BioArray Software Environment system (BASE; Saal LH, Troein C, Vallon-Christersson J, Gruvberger S, Borg A, Peterson C. BioArray Software Environment (BASE): a platform for comprehensive management and analysis of microarray data. Genome Biol 2002;3:SOFTWARE0003, Internet address: http://base2.thep.lu.se/onk/). Further analysis to segment the data into regions of gains and losses was performed using the Rank Segmentation algorithm with Nexus Copy Number Professional 4.1 software (BioDiscovery). The array-CGH data were segmented into regions of gains and losses using log2-ratio thresholds for gain, i.e. when about 1-2 more copies than the normal gene copy number (but less than amplification levels) are produced (log2-ratio >+0.2), amplification, i.e. more than 2.5-fold the normal gene copy number (log2-ratio >+0.5), loss, i.e. the heterozygous loss of whole or small DNA segments of a chromosome (log2-ratio <-0.2), and homozygous deletion, i.e. the loss of both copies of a gene (log2-ratio <-1.0) with a 0.01 p-value cutoff.
The 8p11-p12 aberration was revealed in 83 tumors (47/83 amplification, 20/83 gain, 16/83 loss). In the genome-wide frequency plot of Figure 1, the x-axis corresponds to the genomic region from chromosome 1 to X and the y-axis to the percentage of gains and losses in the given chromosomal region, respectively. The core of the amplified region (Figure 2) displayed a heterogeneous pattern with a complex structure, varying in amplitude and size among the tumor specimens. The amplicon mapped to a 12.1 Megabase (Mb) region from 31.9-43.9 Mb (from telomere to centromere) with 5 subregions of amplification and 9 minimal common amplification peaks (range, 41.2-377.4 kb) from 34.3-42.5 Mb, identifying loci for candidate genes involved in breast cancer development and progression. One of the smallest peaks and notably the most common mapped to a 67.9 kb region spanning the WHSC1L1 (Wolf-Hirschhorn syndrome candidate 1-like 1) gene on chromosome band 8p12. The WHSC1L1 gene was amplified in 32/47 samples, consisting of four Basal-like, three HER2/ER-, and twenty-five Luminal subtype B tumors.
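The log2-ratio thresholds stated above can be sketched as a simple classification of segment means. This is a minimal illustration only: it applies the stated cut-offs to already segmented values and does not reproduce the Rank Segmentation algorithm or its p-value cutoff; the function name, the label "no aberration" and the example values are assumptions.

```python
from collections import Counter

def classify_segment(log2_ratio):
    """Map a segment's mean log2-ratio to a copy number class.

    Cut-offs follow the text: gain > +0.2, amplification > +0.5,
    loss < -0.2, homozygous deletion < -1.0. The stricter class is
    tested first so that it takes precedence.
    """
    if log2_ratio > 0.5:
        return "amplification"
    if log2_ratio > 0.2:
        return "gain"
    if log2_ratio < -1.0:
        return "homozygous deletion"
    if log2_ratio < -0.2:
        return "loss"
    return "no aberration"

# Hypothetical segment means (illustration only).
segment_means = [0.85, 0.31, 0.02, -0.4, -1.3]
calls = [classify_segment(value) for value in segment_means]
summary = Counter(calls)
print(calls)
```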
Amplification of the 8p11-p12 region was more prevalent in ipsilateral (P < 0.001), aneuploid (P < 0.001), pathological tumor size pT3 (P < 0.001), deceased patients (P = 0.004), high Genetic Grade Index (GGI; P = 0.009), and tumors from patients older than 50 years at the time of diagnosis (P = 0.019); 8p11-p12 gain was more prevalent in tumors from patients younger than 50 years at the time of diagnosis (P = 0.03), synchronous (P = 0.038), and progesterone positive tumors (P = 0.039); 8p11-p12 loss was more prevalent in deceased patients (P = 0.016), pN0 (P = 0.033), and triple negative tumors (P = 0.042); lastly, tumors lacking aberrations at the 8p11-p12 locus were more prevalent in diploid (P < 0.001), unilateral (P < 0.001), SBR grade II (P < 0.001), long-term survivor patients (P < 0.001), low GGI (P < 0.001), 1-3 positive axillary lymph nodes (P = 0.018), pN0 (P = 0.019), ductal (P = 0.039), pT2 (P = 0.043), and pT1 tumors (P = 0.044).
These findings indicate that amplification as well as loss of the 8p11-p12 region are associated with a more malignant phenotype in breast cancer, i.e. these aberrations are present in tumors from short-term survivors (deceased patients), in one or both of the multiple primary lesions originating from patients with more than one primary tumor (i.e. the tumor or abnormally growing cells which grows at the anatomical location where the abnormal growth of cells initiated), high GGI tumors, and triple negative tumors. On the other hand, gain of the 8p11-p12 genomic region and the absence of this aberration were associated with less aggressive tumors, i.e. tumors from long-term survivors, smaller tumors, progesterone positive tumors, and low GGI tumors.
Example 1b: Verification of 8p11-p12 genetic aberrations in the cells of breast tumors using Fluorescence in situ hybridization (FISH)
To further investigate the genomic region, a panel of 42 overlapping BAC (Bacterial Artificial Chromosome) clones building an assembled set of overlapping DNA sequences (a contig) over the altered region was isolated. Genetic aberrations identified at the 8p11-p12 locus using array-CGH were verified with fluorescence in situ hybridization (FISH; Figure 3). The FISH analyses of the 8p11-p12 region were carried out on touch preparation imprints from fresh cuts of frozen tumor samples to characterize and narrow down the number of genes within the region that may be targeted for amplification and/or deletion. The tumor imprints were air-dried overnight at room temperature and fixed and gradually dehydrated through an ethanol (EtOH) gradient (70%, 80%, and 100% EtOH for 3 min each). The slides were stored at -20°C in 100% EtOH. One slide was counterstained with 4',6-diamidino-2-phenylindole (DAPI) in an antifade solution (Vectashield™, Vector Laboratories, Inc., Burlingame, CA, USA) for analysis of the quality of the imprint and fixation process.
Interphase FISH was performed on unstained imprints. Hybridization probes for FISH were prepared from BAC clones selected by compiling array-CGH data from breast tumor samples revealing amplifications in the 8p11-p12 chromosome region. Forty-two BAC clones were chosen which appeared in the peaks of amplification of the array-CGH graphs. The BAC clones (BacPac Resources, CHORI, Oakland, CA, USA) used in FISH experiments were from the RPCI-11 (RP11) and Caltech whole genome libraries (CTD) unless otherwise stated. The BAC clones were inoculated from bacteria stabs with a sterile culture loop and grown on agar plates containing Luria-Bertani (LB) media supplemented with chloramphenicol (12.5 μg/ml). To prevent the isolation of degraded DNA, two single colonies were separately inoculated into two 10 ml snap-cap polypropylene tubes containing 1 ml LB media. The bacteria were cultured for 4 hours at 37°C in a shaking incubator. Half of the bacteria suspensions (500 μl) were frozen at -80°C for further study, while an additional 5 μl of the bacteria suspension was used for further culturing in 4 ml LB media (plus chloramphenicol) overnight (up to 16 hours) at 37°C in a shaking incubator. Bacterial suspensions in growing phase were centrifuged at 3,000 rpm for 10 min. The supernatants were discarded, each pellet was resuspended in 0.3 ml P1 solution (50 mM Tris pH 8, 10 mM EDTA, 100 μg/ml RNase A, stored at 4°C), and transferred to eppendorf tubes. The bacterial lysate was treated with 0.3 ml P2 solution (1 M NaOH, 10% SDS) and incubated for 5 min at room temperature, allowing the appearance of the suspension to alter from very turbid to almost transparent. The suspensions were treated slowly with 0.3 ml P3 solution (3 M KOAc pH 5.5, stored at 4°C) and placed on ice for at least 5 min. A thick white precipitate of protein and E. coli was produced. The suspensions were centrifuged at 10,000 rpm for 10 min at 4°C. The supernatant was carefully transferred to a new eppendorf tube containing 0.8 ml ice cold isopropanol and mixed by inverting. The tubes were incubated on ice for at least 5 min. The BAC DNAs were precipitated by centrifugation at 4°C for 15 min. The DNA pellet was washed with 1 ml 70% EtOH and centrifuged for 5 min. The air-dried DNA pellets were dissolved in 200 μl Tris EDTA (TE) buffer at 37°C and stored at 4°C. These BAC DNA samples were purified using phenol/chloroform and the DNA pellet resuspended in RNase-free ddH2O at 37°C and stored at 4°C.
To ensure the integrity and to estimate the concentration of the BAC DNA, 5 μl of each DNA sample and 1 μl 6x Blue Gel Loading Dye were analyzed by electrophoresis on a 1% agarose gel, stained with GelRed, and visualized with UV illumination using GelDoc 2000 (Bio Rad, Hercules, CA, USA). Each BAC DNA probe was labeled by nick translation using biotin-16-dUTP or digoxigenin-11-dUTP (Roche, Mannheim, Germany) for one hour at 15°C. To ensure the integrity of the labeled DNA, 5 μl of each DNA sample and 1 μl 6x Blue Gel Loading Dye were electrophoresed on a 1% agarose gel. Ideally, the length of the DNA fragments was expected to be between 200-500 base pairs (bp) following the procedure. If the DNA samples were too large, additional DNase I (3 μg/ml) and Polymerase I (Kornberg) were added and incubated at 15°C for 10-30 minutes.
Equal amounts of two differentially labeled probes (2 μg) were pooled pair wise, together with unlabeled Cot-1 DNA (Invitrogen, Carlsbad, CA and Roche, Mannheim, Germany). The Cot-1 DNA was included to suppress repetitive sequences. The probe DNA mixtures were precipitated with 0.1 volume 3 M sodium acetate (NaOAc) pH 5.2, 2.5 volumes ice cold 100% EtOH and incubated overnight at -20°C. The probe DNA mixtures were centrifuged for 30 minutes at 13000-15000 rpm at 4°C, washed in 0.5 ml ice cold 70% EtOH, and centrifuged again for an additional 30 minutes.
The DNA samples were air-dried and resuspended in 200 μl hybridization mixture (50% formamide, 10% dextran sulfate, and 2xSSC). Dual-color FISH analysis was performed on interphase cells using BAC biotin- and digoxigenin-labelled probes. All BACs were checked for chimerism and chromosomal sub-localization at 8p11-p12 using FISH on metaphase chromosomes prepared from normal lymphocytes. Metaphase chromosomes and/or interphase cells were denatured with 70% formamide/2xSSC at 75°C for 2 min, followed by dehydration through a graded ice cold ethanol series (70%, 80%, and 100% EtOH) for 3 min each, and allowed to air-dry. Hybridization probe mixture (100 ng) was denatured at 76°C for 5 min, put on ice for 1 min, pre-annealed at 37°C for 10 min, and hybridized to the denatured cells on the slides. Each sample was covered by a cover slip and sealed with rubber cement. Hybridization was performed at 37°C for 24-72 hours in a humidified chamber. Slides were washed with an SSC gradient (2xSSC for 5 min, 0.5xSSC in a 72°C water bath for 2 min, 0.1xSSC for 3 min while shaking), and dipped momentarily in 4xSSC/Tween 20. Before applying the antibodies, the slides were treated with a blocking solution (3% bovine serum albumin (BSA)) at 37°C for 10 min and briefly washed in 4xSSC/Tween 20. In the first staining stage, a fluorochrome mixture consisting of 1% BSA, fluorescein isothiocyanate (FITC) for biotin (Roche, Mannheim, Germany), and Rhodamine for digoxigenin (Roche, Mannheim, Germany) was applied to the slides, coverslipped, and incubated in a humidified chamber at 37°C for 30-60 min. The slides were then washed in 4xSSC/Tween 20 for 3x5 min while shaking. Cells were counterstained with DAPI in an antifade solution.
The slides were analyzed using a Leica DMRA2 fluorescent microscope (Leica, Wetzlar, Germany) equipped with an ORCA Hamamatsu CCD (charge-coupled device) camera (Hamamatsu City, Japan) and filter cubes specific for FITC, Rhodamine, and UV for DAPI visualization. Digitalized black and white images were captured using the Leica CW4000 software package (Cambridge, UK). At least 100 nuclei with intact morphology on the basis of DAPI counterstaining were scored from each clinical specimen. Clumped, damaged, and overlapping nuclei, and nuclei located outside the area of probe hybridization, were excluded from the analysis. If fewer than 100 cells were on the slide, then as many nuclei as possible were evaluated. To determine the gene copy number and the pattern of gene amplification, a gene amplification classification system was applied where specimens were classified as low level amplification when the number of specific hybridization signals varied from 5 to 10 signals, moderate amplification when the number of signals varied from 11 to 20, and as high level amplification when the number of signals was > 20. An average DNA copy number was calculated taking into consideration the clonal heterogeneity seen in the analyzed specimens. Hybridization signals were observed in a variety of patterns from a clustered to a slightly scattered pattern. These patterns could vary between cells in the same tissue sample, as well as between tissue samples. The tested BAC clones were highly amplified in a number of tumors, but not amplified with the same pattern in all analyzed tumors. In Figure 3 two differentially labeled BAC clones were cohybridized to touch preparations of a fresh frozen tumor sample. The large cell shows abnormal copy numbers (with up to 50 hybridization signals) while the smaller cell shows normal copy numbers (with two hybridization signals).
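The amplification classification system described above may be sketched as follows. The signal ranges (5 to 10, 11 to 20, more than 20) are those stated in the text; the label "below amplification range" for fewer than 5 signals, the use of a simple arithmetic mean as the average copy number, and the example counts are assumptions made for illustration.

```python
def classify_amplification(signal_count):
    """Classify a hybridization signal count using the ranges given in the text."""
    if signal_count > 20:
        return "high level amplification"
    if signal_count >= 11:
        return "moderate amplification"
    if signal_count >= 5:
        return "low level amplification"
    # Counts below 5 are not classified in the text; this label is an assumption.
    return "below amplification range"

def mean_copy_number(signal_counts):
    """Arithmetic mean of per-nucleus signal counts (assumed averaging method)."""
    if not signal_counts:
        raise ValueError("at least one scored nucleus is required")
    return sum(signal_counts) / len(signal_counts)

# Hypothetical per-nucleus signal counts from one specimen (illustration only).
signals_per_nucleus = [2, 2, 6, 12, 50]
per_nucleus_calls = [classify_amplification(n) for n in signals_per_nucleus]
average = mean_copy_number(signals_per_nucleus)
print(per_nucleus_calls, average)
```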
FISH is a powerful and effective method for assessing copy number changes on a cell-to-cell basis. Three types of FISH methods were used in this study, namely, 1) metaphase-, 2) interphase-, and 3) Fiber-FISH. The BAC clones were checked for sub-chromosomal localization and chimerism by hybridizing the BAC probes to metaphase chromosomes prepared from normal lymphocytes. Chimeric clones exist in genomic libraries, such as BAC libraries, which can give misleading results. Interphase FISH was performed on imprints from breast tumor samples using a total of forty-two BAC clones. Lastly, Fiber-FISH was used on stretched free chromatin samples when signals in interphase FISH were difficult to distinguish. Performing Fiber-FISH on free chromatin samples produces linear hybridization lines rather than focal signals, which are often easy to count. Used together, these three FISH methods were applied to verify the previous array-CGH results, to get higher resolution, and subsequently to narrow down the genes with elevated copy numbers in breast tumor imprints. The adjacent BAC clones containing overlapping DNA sequences were paired into hybridization probes, when possible. This pairing style allowed for higher resolution and a comparison of the hybridization patterns for each BAC clone. Most paired clones displayed similar hybridization patterns. The number of hybridization signals varied from normal diploid patterns to high level amplification patterns with over fifty signals, which differed between tumor samples and BAC clones. The hybridization patterns also varied according to the tumor specimen and the BAC clone. There were two main types of observed patterns: either the hybridization signals were clustered in set positions in the interphase nuclei or the signals were scattered. Recent 24-Color 3D FISH studies have shown that human chromosomes have fixed positions in interphase nuclei (Bolzer A, Kreth G, Solovei I, et al. Three-dimensional maps of all chromosomes in human male fibroblast nuclei and prometaphase rosettes. PLoS Biol 2005;3:e157). The clustered hybridization signals may represent homogeneously staining regions (HSRs) at 8p11-p12 which are located in the set position for chromosome 8 in the interphase nuclei or translocations on another chromosome. The scattered hybridization patterns can be explained as possible translocation events with DNA sequences from chromosome 8 on other chromosomes.
Analyzed tumor samples from the same patient were either surgically removed from multiple foci in the same breast or from different breasts. The genetic profiles for these tumors were often drastically different. For example, one tumor from a patient may appear normal at the 8p11-p12 region, while the other showed low-level amplification; or gain in one tumor and high-level amplification in the other; or different levels of amplification depending on the progression of tumorigenesis. This could indicate that the tumors may either be genetically distinct or that the second tumor contained genetic material from the primary tumor as well as additional DNA mutations.
Example 2: Effects of 8p11-p12 amplification on gene expression patterns in primary invasive breast tumors
Studies of how 8p11-p12 genetic aberrations impact the expression patterns of specific genes were performed on 150 primary invasive breast tumors (45 tumors with 8p11-p12 amplification, 18 tumors with 8p11-p12 gain, 16 tumors with 8p11-p12 loss, and 71 tumors lacking 8p11-p12 genetic aberrations) using Illumina HumanHT-12 expression Beadchips. Total RNA from the same extraction was used for both expression profiling and subsequent validation with quantitative real-time PCR (qPCR). The RNA samples were processed using Illumina HumanHT-12 Whole-Genome Expression BeadChips (Illumina), according to the manufacturer's instructions. The expression microarrays contained approximately 49,000 probes representing more than 25,400 RefSeq (Build 36.2, Release 22) and Unigene (Build 199) annotated genes. Images and raw signal intensities were acquired using the Illumina BeadArray Reader scanner and BeadScan 3.5.31.17122 (Illumina) image analysis software, respectively. Data preprocessing and quantile normalization were applied to the raw signal intensities using BASE. Further data processing was performed in Nexus Expression 2.0 (BioDiscovery) using log2-transformed, normalized expression values and a variance filter. Normalized values from five normal breast samples profiled with Illumina HumanWG-6 Expression Beadchips (Gene Expression Omnibus, accession number GSE17072) were used as reference. Differentially expressed genes were determined using the Benjamini-Hochberg method to control for the false discovery rate (FDR) with FDR-corrected p-values <0.01.
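For illustration, the Benjamini-Hochberg adjustment referred to above may be sketched in plain Python as follows. This is a generic implementation of the standard step-up procedure and is not the code used in Nexus Expression; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical raw p-values for five transcripts (illustration only).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.01 for p in adjusted_p]
print(adjusted_p, significant)
```

With the cutoff of 0.01 used above, a transcript is reported as differentially expressed only if its adjusted p-value falls below that cutoff.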
Further analysis to verify the expression patterns of 11 genes within the 8p11-p12 locus was performed in 84 tumors using qPCR. Validation of the expression microarray data was performed using qPCR with pre-designed TaqMan® Gene Expression Assays (Applied Biosystems). In brief, 82 total RNA samples were used to validate the expression patterns of 16 transcripts on 8p11-p12 and three endogenous controls, i.e. PPIA, PUM1, and HPRT1. The endogenous controls were selected based on their constitutive expression using the Illumina HumanHT-12 platform. The geometric mean of the three endogenous controls was used to normalize the data and relative gene expression levels were calculated with the relative standard curve method. The Student's t-test was calculated to determine the difference in expression between studied groups and the Spearman correlation coefficients (two-tailed) to establish the relationship between microarray and qPCR expression patterns.
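As an illustration of the normalization step, the following Python sketch divides a target quantity by the geometric mean of the three endogenous controls. The numerical quantities are hypothetical values of the kind that would be read off relative standard curves; only the control gene names come from the text.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive quantities, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_to_controls(target_quantity, control_quantities):
    """Divide a target quantity by the geometric mean of the control quantities."""
    return target_quantity / geometric_mean(control_quantities)


# Hypothetical relative quantities (arbitrary units) for one tumor sample.
controls = {"PPIA": 2.0, "PUM1": 4.0, "HPRT1": 8.0}
normalized = normalize_to_controls(12.0, list(controls.values()))
```

A geometric rather than arithmetic mean keeps one highly expressed control from dominating the normalization factor.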
To determine the impact of DNA amplification in the 8p11-p12 genomic region on gene expression patterns, expression profiling was analyzed using two different approaches to delineate genes within the amplicon with oncogenic potential. First, a comparison of expression patterns from tumors harboring 8p11-p12 amplification and those lacking aberrations was performed and revealed a large number of differentially regulated transcripts (n=1933) between the two groups, among which 22 transcripts (18 genes) were located within the 8p11-p12 region (Benjamini-Hochberg adjusted p-value < 0.01). Next, a correlation analysis between DNA and relative mRNA levels was also performed. In brief, Illumina HumanHT-12 probe nucleotide sequences were mapped to genomic locations (NCBI Build 35) using sequences downloaded from the UCSC Genome Browser (Internet address: http://hgdownload.cse.ucsc.edu/goldenPath/hg17/chromosomes/). A pair-wise comparison of the Illumina probe and BAC clone nucleotide sequences was then conducted to generate Illumina-BAC probe pairs with 100% sequence similarity. In total, 327 Illumina-BAC probe pairs spanning the 8p11-p12 genomic region were selected from smoothed array-CGH data. The statistical analysis was performed in three sequential steps. Firstly, copy number alteration (CNA) induced genes were assessed by Pearson correlation (Q<0.05) and the change in relative mRNA levels as a function of DNA copy numbers was estimated by robust piecewise linear regression. Secondly, for Illumina-BAC probes showing significant association between CNA and expression patterns, a t-test was performed to estimate the difference between the mean relative mRNA values for tumors with gene amplification versus tumors without this aberration.
Lastly, probe pairs with Pearson correlation r > 0.7 were selected for further analysis to assess the number of tumors displaying gene over-expression (log2 > 0.58) when amplified (log2ratio > 0.5), as well as the number of tumors displaying over-expression in the absence of gene amplification (-0.2 < log2ratio < 0.5). Statistical analyses were performed in R/Bioconductor. These analyses showed that gene copy number impacts gene expression levels. Of the 327 Illumina-BAC probe pairs tested for correlation between DNA copy number and relative mRNA levels, 115 probe pairs showed a significant association using Pearson correlation (Benjamini-Hochberg adjusted p-value < 0.5). Using this approach, the 18 identified genes were narrowed down to 11 unique genes (R > 0.7). Specifically, the WHSC1L1 gene had the highest DNA/mRNA correlation in 116 breast tumors (45 with 8p11-p12 amplification and 71 with no 8p11-p12 genetic aberrations). In addition, the WHSC1L1 gene was also over-expressed (log2≥0.58, at least 1.5-fold change) in 39 out of 87 tumors in the absence of 8p11-p12 amplification. Table 1 shows the statistical analysis of the effect of the DNA gene dosage on relative mRNA levels in 116 primary invasive breast tumors. Eleven significantly regulated genes (R>0.7; Benjamini-Hochberg adjusted p-value <0.01) were identified in the 8p11-p12 amplification region using array-CGH and expression analysis.
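The third analysis step can be sketched in Python as follows. The cutoffs are those stated in the text (amplification at array-CGH log2ratio > 0.5, normal copy number at -0.2 < log2ratio < 0.5, over-expression at log2 > 0.58); the six data points and all function names are hypothetical, and the original analysis was performed in R/Bioconductor rather than with this code.

```python
import math

AMPLIFICATION_CUTOFF = 0.5     # array-CGH log2ratio above which a gene counts as amplified
NORMAL_LOWER_BOUND = -0.2      # lower bound of the "no amplification" window
OVEREXPRESSION_CUTOFF = 0.58   # expression log2 ratio, roughly a 1.5-fold change


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def tally_overexpression(copy_log2, expr_log2):
    """Count over-expressing tumors with and without gene amplification."""
    counts = {"amp_n": 0, "amp_over": 0, "normal_n": 0, "normal_over": 0}
    for c, e in zip(copy_log2, expr_log2):
        over = e > OVEREXPRESSION_CUTOFF
        if c > AMPLIFICATION_CUTOFF:
            counts["amp_n"] += 1
            counts["amp_over"] += over
        elif NORMAL_LOWER_BOUND < c < AMPLIFICATION_CUTOFF:
            counts["normal_n"] += 1
            counts["normal_over"] += over
    return counts


# Hypothetical values for one probe pair across six tumors (illustrative only).
copy_log2 = [0.9, 0.7, 0.1, 0.0, -0.1, 0.3]
expr_log2 = [1.5, 1.0, 0.2, 0.7, 0.0, 0.3]
r = pearson_r(copy_log2, expr_log2)
counts = tally_overexpression(copy_log2, expr_log2)
```

The percentage columns of a table such as Table 1 then follow by dividing each over-expression count by the corresponding number of tumors.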
Column 1 in Table 1 shows the Illumina HumanHT-12 probe, column 2 shows the BAC clone and column 3 shows the targeted gene. Column 4 shows the number of tumors with amplification in the particular BAC clone, column 5 shows the number of those tumors with the targeted gene over-expressed, and column 6 shows the percentage of tumors with amplification that have the target gene over-expressed. Column 7 shows the number of tumors without amplification, column 8 shows the number of these that have the target gene over-expressed, and column 9 shows the percentage of tumors without amplification that have the target gene over-expressed. Finally, column 10 shows the total number of tumors analyzed using the particular BAC clone, column 11 shows the number of these in which the target gene is over-expressed, and column 12 shows the percentage of all tumors analyzed using the particular BAC clone that have the target gene over-expressed. Note: Each gene is represented by several BAC clones and Illumina transcripts in the array-CGH and expression platforms. Gene amplification designated at array-CGH log2ratio > 0.5 and normal DNA copy numbers designated at log2ratio < 0.5, over-expression designated relative to normal breast tissue log2ratio > 0.58 (1.5-fold change). Abbreviation: ND, not determined.
With two of the Illumina probes used here (ILMN_1807379, ILMN_1666715) WHSC1L1 was over-expressed in 75-93% of the tumors with 8p11-p12 amplification based on 38K array comparative genomic hybridization (array-CGH) with BAC clones (Table 1, column 6). Hence, by measuring WHSC1L1 mRNA over-expression most of the tumors with chromosomal aberrations in the 8p11-p12 region are detected. This is better than for other markers in the region, though some give almost as good results (e.g., 55-85% positive signals with BAG4). For a large number of the breast tumors 8p11-p12 chromosomal aberrations are not found (Table 1, column 7). These tumors would go undetected if assayed based on DNA dosage only. By measuring mRNA expression, it is seen that the WHSC1L1 gene is over-expressed in 29-45% (depending on the BAC clone and probe used) of these cases (Table 1, column 9). This is far more than when assaying any of the other genes in the 8p11-p12 region.
Table 1
Gene-specific Illumina-BAC probe pairs (columns 1-3); gene amplification (columns 4-6); normal DNA copy numbers (columns 7-9); all tumors (columns 10-12)
Illumina probe | BAC clone | Gene symbol | Tumors with gene amplification (n) | Tumors with gene amplification & over-expression (n) | Tumors with gene amplification & over-expression (%) | Tumors with normal DNA copy numbers (n) | Tumors with normal DNA copy numbers & gene over-expression (n) | Tumors with normal DNA copy numbers & gene over-expression (%) | Total number of tumors (n) | Total number of tumors with gene over-expression (n) | Tumors with gene over-expression (%)
ILMN_1727996 | RP11-90P5 | BAG4 | 25 | 21 | 84 | 83 | 7 | 8 | 108 | 28 | 26
ILMN_1727996 | RP11-636F12 | BAG4 | 28 | 24 | 86 | 80 | 6 | 8 | 108 | 30 | 28
ILMN_1727996 | RP11-389E22 | BAG4 | 20 | 18 | 90 | 84 | 7 | 8 | 104 | 25 | 24
ILMN_1807379 | RP11-636F12 | WHSC1L1 | 28 | 26 | 93 | 80 | 25 | 31 | 108 | 51 | 47
ILMN_1767324 | RP11-156L3 | EIF4EBP1 | 20 | 11 | 55 | 78 | 3 | 4 | 98 | 14 | 14
ILMN_1697919 | RP11-636F12 | WHSC1L1 | 28 | 19 | 68 | 80 | 1 | 1 | 108 | 20 | 19
ILMN_2179063 | RP11-636F12 | DDHD2 | 28 | 19 | 68 | 80 | 5 | 6 | 108 | 24 | 22
ILMN_1767324 | RP11-594D10 | EIF4EBP1 | 23 | 13 | 57 | 86 | 7 | 8 | 109 | 20 | 18
ILMN_1748908 | RP13-580P15 | PROSC | 30 | 5 | 17 | 83 | 0 | 0 | 113 | 5 | 4
ILMN_1675406 | RP11-636F12 | PPAPDC1B | 28 | 15 | 54 | 80 | 3 | 4 | 108 | 18 | 17
ILMN_1778673 | RP11-802B19 | GOLGA7 | 24 | 2 | 8 | 88 | 0 | 0 | 112 | 2 | 2
ILMN_2218450 | RP11-389E22 | LSM1 | 20 | 14 | 70 | 84 | 4 | 5 | 104 | 18 | 17
ILMN_1665554 | RP11-168H8 | BRF2 | 29 | 14 | 48 | 83 | 1 | 1 | 112 | 15 | 13
ILMN_2179063 | RP11-601G22 | DDHD2 | 29 | 21 | 72 | 80 | 2 | 3 | 109 | 23 | 21
ILMN_1705871 | CTD-2385A20 | DDHD2 | 31 | 14 | 45 | 80 | 0 | 0 | 111 | 14 | 13
ILMN_1666715 | RP11-350N15 | WHSC1L1 | 29 | 22 | 76 | 84 | 38 | 45 | 113 | 60 | 53
[A further part of Table 1 is present in the source only as an image (Figure imgf000036_0001) and is not reproduced here.]
Table 1 summarizes data for gene transcripts with the strongest DNA-mRNA correlation (r > 0.7) for tumors without genetic alterations; genes not listed are over-expressed in less than 8% of these cases. Since protein production depends on the presence of RNA and in general increases with the level of mRNA, WHSC1L1 protein is expected to be the most frequently over-expressed protein marker in the 8p11-p12 region. Hence, the present invention offers better means to detect tumors in breast cancer patients with normal DNA copy number in the 8p11-p12 chromosomal region than when using any other biomarker.
Columns 10-12 in Table 1 summarize results for all the tumors, i.e., those with and those without 8p11-p12 aberration. The approach invented here, based on WHSC1L1 gene over-expression, applies to up to 53% (Table 1, column 12) of all the studied tumors. This is far better than when using any other molecular marker in the 8p11-p12 region, and possibly better than with any other marker whatsoever.
Expression of WHSC1L1 can, according to this invention, also be measured by means other than the bead-based Illumina system. Figure 4 shows the much higher expression of WHSC1L1 in the tumor samples harboring 8p11-p12 amplification compared to tumor samples with normal DNA dosage levels when measured by quantitative real-time PCR (qPCR). The expression of WHSC1L1 was measured using a pre-designed TaqMan® Gene Expression Assay from Life Technologies that detects the splice variants WHSC1L1-001 and WHSC1L1-002. These findings are significant and novel, as previous studies of the 8p11-p12 amplicon have identified candidate genes based solely on the principle of the amplification process, while oncogenes are frequently activated by more than one molecular mechanism. The results of the present investigation indicate that WHSC1L1 is activated by several mechanisms, including gene amplification, making this gene one of the most relevant targets within this region.
Example 3: Effects of 8p11-p12 genetic aberrations on patient overall survival
Studies showing how 8p11-p12 genetic aberrations correlate with patient overall survival were performed using Kaplan-Meier survival curves (Log Rank, Mantel-Cox). All important forms of genetic aberrations of the 8p11-p12 region (P = 0.000005), including amplification (P = 0.000044) and loss (P = 0.00538), were associated with reduced survival rates, while gain of this region was not significantly indicative of unfavorable patient outcome (P = 0.08; Figure 5A-D). Further analyses were performed to assess whether WHSC1L1 gene amplification and/or WHSC1L1 over-expression have an impact on patient outcome. Figures 6A-C show the results of these analyses, indicating that WHSC1L1 gene amplification (P = 0.010) has an adverse effect on patient overall survival rates; WHSC1L1 over-expression has borderline statistical significance (P = 0.066). In addition, patients with WHSC1L1 gene amplification and over-expression or only over-expression have reduced overall survival rates compared to patients with neither WHSC1L1 gene amplification nor over-expression (P = 0.014). Finally, patients with extensive (> 2-fold) over-expression of WHSC1L1 have significantly lower survival rates than patients with only modest (1.5 - 2-fold) over-expression of WHSC1L1 (Figure 7). This correlation between WHSC1L1 gene expression level and survival makes the WHSC1L1 gene transcript level a most suitable marker for prognostics either as a sole biological marker or as a critical, possibly dominant component in a multimarker based survival score calculation.
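For readers unfamiliar with Kaplan-Meier curves, the estimator itself can be sketched in a few lines of Python. The follow-up times and event flags below are hypothetical, the function name is an illustrative choice, and the log-rank (Mantel-Cox) comparison between groups is not covered by this sketch.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with at least one death.

    times  : follow-up time per patient
    events : 1 if the patient died at that time, 0 if the observation was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the fraction of at-risk patients surviving this time point.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in months for six patients (illustrative only).
times = [5, 8, 8, 12, 20, 24]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Censored patients contribute to the risk set up to their last follow-up time but never trigger a drop in the curve, which is what distinguishes this estimate from a simple fraction of survivors.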
Example 4: Effects of WHSC1L1 protein expression on patient overall survival
Protein expression and localization of WHSC1L1 was determined using immunohistochemistry on formalin-fixed, paraffin-embedded tissues (FFPE; n = 127) from invasive breast carcinomas. The FFPE samples were obtained from the Pathology Department at Sahlgrenska University Hospital in accordance with the Declaration of Helsinki and approved by the Medical Faculty Research Ethics Committee (Gothenburg, Sweden). The tissue specimens corresponded to 95 breast cancer patients, occasionally prepared in duplicate or triplicate, in addition to five metastases to the axillary lymph nodes. These specimens were previously profiled using array-CGH and the Illumina system.
Optimal antibody dilutions and assay conditions were achieved for immunohistochemistry using breast carcinoma as positive controls. Four-micrometer FFPE sections were subsequently immunostained with an antibody specific for the short WHSC1L1 isoform (WHSC1L1-S). The sections were pretreated using the Dako PTLink system (Dako, Carpinteria, CA, USA) for 60 min and processed on an automated Dako Autostainer platform using the Dako Envision™ FLEX High pH Link Kit (pH 9) for WHSC1L1-S (Sigma-Aldrich HPA018893, 1:200 dilution). Peroxidase-catalyzed diaminobenzidine was used as the chromogen, followed by hematoxylin counterstain. The slides were then rinsed with deionized water, dehydrated in absolute alcohol, followed by 95% alcohol, cleared in xylene, and mounted. To assess the degree of inflammatory infiltration, one FFPE section was also stained with hematoxylin and eosin (H&E). Immunostaining was scored using a semi-quantitative method to categorize the percentage of stained invasive cells (0% = 0; 1-10% = 1; 11-50% = 2; 51-80% = 3; 81-100% = 4) exhibiting total, nuclear, or cytoplasmic staining (negative staining = 0; weak staining = 1; moderate staining = 2; strong staining = 4). The staining index (SI) was determined to define low (SI <6) or high (SI >6) expression by calculating the product of the percentage and intensity of positively stained cells. The mean staining intensity was used for replicate samples.
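The scoring scheme can be written out as a short Python sketch. The percentage categories and the SI cutoffs are taken from the text; the function names are illustrative, the intensity score is passed in as already assigned by the observer, and the handling of an SI of exactly 6 is an assumption because the text gives only SI <6 and SI >6.

```python
def percentage_score(percent_stained):
    """Map the percentage of stained invasive cells to the 0-4 category score."""
    if percent_stained <= 0:
        return 0
    if percent_stained <= 10:
        return 1
    if percent_stained <= 50:
        return 2
    if percent_stained <= 80:
        return 3
    return 4


def staining_index(percent_stained, intensity_score):
    """Staining index (SI): percentage category multiplied by the intensity score."""
    return percentage_score(percent_stained) * intensity_score


def expression_class(si):
    """Classify expression as low or high from the staining index."""
    # Assumption: an SI of exactly 6 is treated as low; the text leaves this case open.
    return "high" if si > 6 else "low"


# Hypothetical specimen: 90% of invasive cells stained with moderate intensity (score 2).
si = staining_index(90, 2)
label = expression_class(si)
```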
Hematoxylin and eosin staining showed that 21/127 (17%) FFPE specimens did not contain tissue from invasive tumors, but frequently consisted of a mixture of normal, inflammatory, hyperplastic, and/or in situ tissues. Inflammatory infiltration in the specimens ranged from minimal to strong. Subsequently, all specimens were immunostained (despite the lack of invasive tissue) to identify subcellular protein expression in invasive and non-invasive tissue. In general, WHSC1L1 was not expressed in inflammatory and normal cell structures; however, X cases displayed positive staining. In invasive tissue, ubiquitous cytoplasmic expression was observed, but nuclear staining was predominantly observed in tissues harboring amplification of the WHSC1L1 gene. Minimal nuclear staining was also observed in one ductal in situ case. The WHSC1L1-S antigen was detected in the cytoplasm for 53/90 (59%) cases, of which 37 showed low expression and 16 high expression. Two of the positive cases showed simultaneous expression in the cytoplasm and nucleus. Both of these cases were from specimens which displayed amplification of the WHSC1L1 gene. In all other cases, expression was exclusively cytoplasmic. As the WHSC1L1-S antigen was predominantly expressed in the cytoplasm, only total protein expression was scored. Figure 8 shows (A) weak cytoplasmic staining in the invasive and in situ components of a tumor lacking WHSC1L1 gene amplification and (B) moderate cytoplasmic and strong nuclear staining in invasive epithelial cells in a tumor harboring WHSC1L1 gene amplification, using archived breast carcinoma tissue and the WHSC1L1 antibody (1:200; Sigma-Aldrich). (A) and (B) were taken at X100 magnification. It is seen that the tumor sample harboring WHSC1L1 gene amplification shows much higher total expression of WHSC1L1-S compared to the tumor sample with normal DNA dosage levels. Further analyses were performed to assess whether WHSC1L1-S protein expression has an impact on patient outcome. Figure 9 shows effects of total (cytoplasmic and nuclear staining) WHSC1L1-S expression on patient overall survival rates using the WHSC1L1 antibody (1:200; Sigma-Aldrich) with immunohistochemistry. It is seen that WHSC1L1-S protein expression (P = 0.022) has an adverse effect on patient overall survival rates.
Example 5: Quantification of WHSC1L1 gene expression using novel, highly specific quantitative real-time PCR
Critical for any testing of molecular markers is a highly specific and sensitive assay that targets all the relevant variants of the transcript that may be present. At least five transcript variants of the WHSC1L1 gene have been described, and in order to obtain the highest sensitivity it is essential that an assay targets all of the transcript variants with high specificity, high tolerance for interfering substances and high efficiency. No such assays were commercially available. Therefore, an assay was designed that detects all known splice variants of WHSC1L1 with high specificity, high sensitivity, high accuracy, wide dynamic range and high PCR efficiency, and that is not sensitive to the interfering agents typically present in complex sample matrices. After extensive in silico design, testing, and confirmations, and then extensive screening of many candidates under challenging conditions, we unexpectedly found these assays to perform exceedingly well:
Dye based (D)-Assay:
Forward (Fwd) primer: 5'-GCAACAAGCATGACTCATCAAGAT-3' (SEQ ID NO: 1)
Reverse (Rev) primer: 5'-ACACAAGATCGCCAACCTGAA-3' (SEQ ID NO: 2)
PCR-product: 225 bp
Targets transcripts:
uc003xli.2_WHSC1L1:1117+1341,
uc011lbm.1_WHSC1L1:1117+1341,
uc003xlj.2_WHSC1L1:1117+1341; and
uc010lwe.2_WHSC1L1:1117+1341
(UCSC genome browser, November 6, 2011: http://genome.ucsc.edu/cgi-bin/hgPcr?command=start).
Probe based (P)-Assay:
Fwd primer: 5'-GGCTTATAATTTCTACACCAAA-3' (SEQ ID NO: 3)
Rev primer: 5'-GAACGAGTTCTAATTGATCTTC-3' (SEQ ID NO: 4)
Probe: 5'-AAGCCAACGCAGAGTGTATCATCT-3' (SEQ ID NO: 5)
PCR-product: 130 bp
Targets transcripts:
uc003xli.2_WHSC1L1:2143+2272,
uc011lbm.1_WHSC1L1:2143+2272,
uc003xlj.2_WHSC1L1:2143+2272; and
uc010lwe.2_WHSC1L1:2143+2272
(UCSC genome browser, November 6, 2011: http://genome.ucsc.edu/cgi-bin/hgPcr?command=start).
Dye based (D)-assays usually reach higher PCR efficiency and therefore are often more reproducible than probe based (P)-assays, and they are also inexpensive in comparison to P-assays, which have higher specificity. The D-Assay is designed to detect all splice variants of WHSC1L1 gene transcripts using a sequence non-specific dye such as SYBR Green I, Chromofy, SYTO9, EvaGreen and the like. The assay spans between exons 2 and 4, which are present in all splice variants, and does not amplify genomic DNA under the qPCR conditions used. The P-Assay is also able to detect all variants of WHSC1L1 gene transcripts, but with even higher specificity by means of a probe, preferably a hydrolysis probe also known by the brand name TaqMan®, but it can be used with dye as reporter as well. The P-assay spans between exon 7 and does not amplify genomic DNA under the qPCR assay conditions used here. Assays that amplify only certain splice variants of WHSC1L1, which are expected to produce different combinations of WHSC1L1 isoforms, were also designed. In particular, an assay that amplifies the transcript producing the short isoform but not the long isoform was designed:
WHSC1L1_S_Fwd: ACAGTTCCTCAGGCTACAGTGAAGA (SEQ ID NO: 6)
WHSC1L1_S_Rev: CTCAATCGCTGCGGAGACGG (SEQ ID NO: 7)
This assay amplifies a 108-base-long segment of the transcript (uc003xlj.2_WHSC1L1:2334+2441) based on the UCSC genome browser (November 6, 2011: http://genome.ucsc.edu/cgi-bin/hgPcr?command=start).
An assay that amplifies the transcript for the long isoform but not the short isoform was also designed:
WHSC1L1_L_Fw: ACAGTTCCTCAGGCTACAGTGAAGA (SEQ ID NO: 8)
WHSC1L1_L_Rev: CTCAATCGCTGCGGAGACGG (SEQ ID NO: 9)
This assay amplifies an 85-base-long segment of the transcripts (uc003xli.2_WHSC1L1:2449+2533, uc011lbm.1_WHSC1L1:2449+2533 and uc010lwe.2_WHSC1L1:2449+2533) based on the UCSC genome browser (November 6, 2011: http://genome.ucsc.edu/cgi-bin/hgPcr?command=start).
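The stated product sizes can be cross-checked against the in-silico PCR coordinates with a small Python sketch. It assumes that the "start+end" notation gives 1-based, inclusive positions on the transcript; the function name and dictionary labels are illustrative and not part of the UCSC tool.

```python
import re


def amplicon_length(hit):
    """Length in bases of a hit written as 'name:start+end' (1-based, inclusive)."""
    match = re.search(r":(\d+)\+(\d+)$", hit)
    if match is None:
        raise ValueError(f"unrecognized hit format: {hit!r}")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1


# One representative hit per assay, taken from the listings above.
hits = {
    "D-assay": "uc003xli.2_WHSC1L1:1117+1341",
    "P-assay": "uc003xli.2_WHSC1L1:2143+2272",
    "short-isoform assay": "uc003xlj.2_WHSC1L1:2334+2441",
    "long-isoform assay": "uc003xli.2_WHSC1L1:2449+2533",
}
lengths = {name: amplicon_length(hit) for name, hit in hits.items()}
```

Under that assumption the computed lengths agree with the product sizes given in the text (225, 130, 108 and 85 bases).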
Fresh-frozen tissue samples were homogenized into a fine powder using the Mikro-Dismembrator S ball mill (Sartorius Stedim Biotech), followed by isolation of total RNA using the RNeasy Lipid Tissue Mini Kit (Qiagen) according to the manufacturer's instructions. RNA quality was assessed using the RNA 6000 Nano LabChip Kit with Agilent 2100 Bioanalyzer (Agilent Technologies). Complementary DNA was produced using SuperScript™ III First-Strand Synthesis SuperMix for qRT-PCR (Invitrogen) using 1 µg total RNA. A qPCR master mix was prepared as: 2 µl iQ SYBR Green Supermix (BioRad, 170-8882), 0.3 µl of 10 µM of each primer and 2.7 µl water (8 µl in total, with 300 nM of each primer). The serial dilution of the cDNA for the standard curve was done in TE/LPA buffer (4 ml 1×TE was mixed with 3.2 µl 25 µg/µl LPA (5-6575 Sigma), which gave 20 ng/µl of LPA). Two microliters of the cDNA was mixed with 8 µl of the master mix, giving a total reaction volume of 10 µl. The reaction mix was PCR amplified using an ABI 7500 Fast qPCR instrument with the protocol: activation for 3 minutes at 95°C, followed by 40 cycles of 3 seconds at 95°C and 30 seconds at 60°C. After completed PCR a melt curve was recorded between 60 and 95°C.
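The stated primer and LPA concentrations follow from the dilution rule C1·V1 = C2·V2, as the Python sketch below shows. It assumes that the 300 nM figure refers to the final 10 µl reaction and that the TE/LPA volume is the sum of the two components; the function name is illustrative.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Dilution rule C1*V1 = C2*V2 solved for C2 (same unit as stock_conc)."""
    return stock_conc * stock_volume / total_volume


# Primer: 0.3 µl of a 10 µM (10,000 nM) stock in a 10 µl reaction, in nM.
primer_nM = final_concentration(10_000, 0.3, 10.0)

# LPA carrier: 3.2 µl of a 25 µg/µl (25,000 ng/µl) stock added to 4 ml (4000 µl) TE, in ng/µl.
lpa_ng_per_ul = final_concentration(25_000, 3.2, 4000.0 + 3.2)
```

With these inputs the primer concentration is 300 nM and the LPA concentration is about 20 ng/µl, matching the values in the text.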
The new qPCR assays invented here for the WHSC1L1 gene were tested and evaluated for sensitivity, specificity and dynamic range. The results can be seen in Figure 10 A-H, wherein A) shows a standard dilution series with the D-assay evidencing essentially unlimited dynamic range; B) shows melt curves of products formed with the D-assay evidencing perfect primer specificity with negligible formation of aberrant primer products; C) shows a qPCR analysis with the D-assay of representative cancer samples evidencing high specificity and robust signals when testing biological material in complex samples; D) shows a standard dilution series with the P-assay evidencing essentially unlimited dynamic range; E) shows melt curves of products formed with the P-assay evidencing perfect primer specificity with negligible formation of aberrant primer products; F) shows a qPCR analysis with the P-assay of representative cancer samples evidencing high specificity and robust signals when testing biological material in complex samples; G) shows standard curves performed with the D- and the P-assays evidencing very high PCR efficiencies, essentially unlimited dynamic range and a sensitivity only limited by sampling ambiguity and losses during sample handling; H) shows WHSC1L1 transcripts (total RNA) measured with the D-assay in selected representative cancer samples. Substantially more WHSC1L1 transcripts are found in samples wherein the region is amplified, and there is also strong correlation between the amount of WHSC1L1 transcripts measured with the novel qPCR assay and transcripts measured with the Illumina assays (Example 2).
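PCR efficiency is commonly derived from the slope of a standard curve (quantification cycle versus log10 of input amount) as E = 10^(-1/slope) - 1. The Python sketch below applies this standard relation to a hypothetical ten-fold dilution series; the Cq values and function names are illustrative and are not the data behind Figure 10.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit returning (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def pcr_efficiency(slope):
    """Efficiency as a fraction (1.0 means perfect doubling in every cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series: log10 of relative input and measured Cq.
log10_input = [0, 1, 2, 3, 4]
cq = [30.0, 26.5, 23.0, 19.5, 16.0]
slope, intercept = fit_line(log10_input, cq)
efficiency = pcr_efficiency(slope)
```

A slope of about -3.32 corresponds to 100% efficiency, so shallower or steeper fitted slopes translate directly into the percentage efficiencies quoted for an assay.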
It can be concluded that standard dilution series and corresponding standard curves evidence excellent performance with essentially unlimited dynamic range, excellent PCR efficiencies (D-assay 91%; P-assay 86%), negligible formation of aberrant side products and a sensitivity only limited by sampling ambiguity and losses during sample handling. The novel assays also perform excellently on representative samples.
Example 6: Prognostics based on classification of circulating tumor cells characterized by expression profiling measurements based on panels that include WHSC1L1
Blood samples (7.5 ml) were collected from 30 metastatic breast cancer patients before the start of a new line of therapy. The inclusion criteria were: age above 18 years; patients with measurable or evaluable metastatic breast cancer; Eastern Cooperative Oncology Group (ECOG) scores for performance status of 0-2; no severe uncontrolled comorbidities or medical conditions; no second malignancies. Patients had either a relapse or were diagnosed for BC earlier (D1 year) and were about to start chemotherapy or had documented progressive BC before receiving a new endocrine, chemo- or experimental therapy. Informed consent for participation in the study was obtained from all patients. The blood samples were enriched for CTCs using immunomagnetic cell capture with the CancerSelect Breast Cancer test from AdnaGen (Langenhagen, Germany). One part of each CTC sample was analyzed for the expression of EpCAM, MUC-1, and HER2 by classical parallel PCR and capillary electrophoresis using the CancerDetect Breast Cancer test from AdnaGen (Langenhagen, Germany). A second part was pre-amplified by PCR for a limited number of cycles with TaqMan® PreAmp Master Mix according to the manufacturer's instructions using in-house-designed assays at a final concentration of 25 nM. Pre-amplified cDNA was used as template for qPCR analysis of 31 transcripts (TOP2A, ADAM17, PARP1, VEGF, VEGFR, PRG, ESR, MTOR, AKT2, STATB1, PTEN, KRT19, EPCAM, AURKA, MUC1, CD45, CXCR1, UPA, MCM, CXCR1, KI67, TWIST, ALDH1, p53, HER2, kRAS, C-myc, the short splice variant of WHSC1L1 (WHSC1L1-S), and total expression of WHSC1L1 (WHSC1L1-L)) on the BioMark™ HD System using 48x48 Dynamic Array™ integrated fluidic circuits (IFCs) (Fluidigm, USA).
The IFC chip was primed in the NanoFlex™ IFC Controller (Fluidigm, USA) prior to analysis. The assay and sample mixtures were prepared according to the manufacturer's protocol for EvaGreen (Fluidigm, USA) and loaded into each sample/detector inlet of the dynamic array chip according to the loading scheme. The chip was inserted into the IFC Controller for the loading procedure, followed by thermal cycling on the BioMark qPCR System. The cycling program for SsoFast EvaGreen Supermix was modified as follows: 3 minutes initial denaturation and enzyme activation at 95°C followed by 40 cycles of cycling at 95°C for 15 seconds, 60°C for 20 seconds, and 72°C for 20 seconds. The chip was resubmitted into the BioMark system for melt curve analysis using the recommended protocol. BioMark Gene Expression Data Analysis and BioMark Melt Curve Analysis software were used to obtain Ct values and Tm data. The data was analyzed for stably expressed genes that could serve as normalizers, but no such genes were found. Instead, the data analyzed was either normalized to the average expression of all genes (global normalization) or not normalized, which corresponds to normalization to the sample amount extracted. The lack of a suitable normalizer produces some ambiguity in the results, although the main features are seen. The data was further autoscaled (subtracting the mean expression of every gene and dividing by its standard deviation) to give all markers the same weights. Figure 11 shows (A) classification using Principal Component Analysis (PCA) of CTC samples collected from 21 patients with a panel of 31 markers including assays with differential sensitivity for splice variants of WHSC1L1. Stars indicate patients that are still alive and hexagons indicate deceased patients. There is clear separation of the live and deceased patients along PC1 as well as along PC2, evidencing that the markers and tests invented here are of prognostic value. (B) shows a classification of the same data using hierarchical clustering.
Three main clusters of patients are seen, two with mainly deceased patients and one with mainly survivors, clearly evidencing the prognostic potential of the test. The genes also form three main clusters: a) AURKA, MCM, KRT19, HER2, MUC, EPCAM, TOP2, VEGFA, CXCR1 and WHSC1L1-S; b) TWIST, SCCB, PGR, UPA, VEGFR, KI67, KRAS, ALDH and ESR7; c) TP53, PARP, CTSD, CD24, ADAM, AKT, MYC, CD45, PTEN, MTOR, SATB and WHSC1L1-T. Notably, the short variant and total expression of WHSC1L1 are found in different clusters, evidencing that they are important for the classification. The data presented in figures (A) and (B) are autoscaled values of the (non-normalized) measured quantities, while (C) shows classification of the data based on the expression of the short splice variant of WHSC1L1 and the total expression of WHSC1L1 only. The scales of the two axes are in log base 2 and arbitrarily set to 0 when expression was not detected (below limit of detection). The lowest positive signal is set arbitrarily to 4 (corresponding to 2^4 = 16 times higher). Expression of WHSC1L1 is found in all samples and there is correlation between high expression of WHSC1L1 and CTC positivity based on the AdnaTest (p<0.05), evidencing that the invention is also useful as a cancer marker for analysis of samples collected from blood.
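The two preprocessing steps described for the CTC data, global normalization and autoscaling, can be sketched in Python as follows. The small matrix of log-scale expression values is hypothetical, the function names are illustrative, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text does not specify which was used.

```python
import statistics


def global_normalize(sample):
    """Subtract the average of all genes in a sample from each gene (log scale)."""
    mean_all = statistics.mean(sample)
    return [value - mean_all for value in sample]


def autoscale(matrix):
    """Per gene (column): subtract the mean and divide by the standard deviation."""
    columns = list(zip(*matrix))
    means = [statistics.mean(col) for col in columns]
    # Assumption: sample standard deviation; a constant gene cannot be autoscaled.
    stdevs = [statistics.stdev(col) for col in columns]
    if any(sd == 0 for sd in stdevs):
        raise ValueError("a gene with zero variance cannot be autoscaled")
    return [[(value - m) / sd for value, m, sd in zip(row, means, stdevs)]
            for row in matrix]


# Hypothetical data: rows are patient samples, columns are three genes (log2 units).
data = [
    [20.0, 25.0, 30.0],
    [22.0, 27.0, 29.0],
    [24.0, 23.0, 31.0],
]
scaled = autoscale(data)
normalized_first_sample = global_normalize(data[0])
```

After autoscaling every gene has mean 0 and unit standard deviation, so a highly expressed marker cannot dominate a subsequent PCA or hierarchical clustering merely because of its scale.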
There are two main forms of WHSC1L1 that differ in length, and the present invention teaches how they can be exploited to further improve the diagnostic and theranostic relevance of the invention. qPCR assays were invented that selectively quantify the short transcript variant of WHSC1L1, the long transcript variant of WHSC1L1 and both variants of WHSC1L1. Unexpectedly and most interestingly, as seen in Figure 11 C, the majority of survivors with CTC positive samples based on the AdnaGen test express predominantly the short variant of WHSC1L1. Figure 12 A compares separately the expression (expressed as log2) of the short transcript of WHSC1L1 and total expression of WHSC1L1 among survivors and among deceased patients that were CTC positive, showing that the total expression of WHSC1L1 is a more negative indicator of survival than the short transcript coding for the short splice variant of WHSC1L1. The relative expression of the short and long splice variants can be measured using the invented assays specific for the short and long transcript variants, respectively, or it can be calculated from measurements of either an assay specific for the short splice variant or an assay specific for the long splice variant combined with measurement of total WHSC1L1 expression. The relative expression of the short to long splice variant of WHSC1L1 among survivors and deceased patients is shown in Figure 12 B. The y-scale is arbitrary, because the relative sensitivities of the two qPCR assays have not been determined, but this has no relevance for the statistical comparison (p=0.5 in t-test and p=0.04 in Mann-Whitney test). An advantage of the herein invented approach of comparing the relative expression of two (or more) splice variants of the same gene is that normalization to reference genes, which often is unreliable in cancer diagnostics, is redundant. Clearly, the measurement of an elevated relative expression of the short variant of WHSC1L1 compared to the long variant using the assays of the present invention provides a positive indicator for the survival of cancer patients.
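The two ways of obtaining the short-to-long ratio can be sketched in Python. The sketch assumes roughly 100% PCR efficiency and equal sensitivity for all assays, which the text states has not been determined, so the resulting scale is arbitrary; the Cq values and function names are hypothetical.

```python
import math


def quantity(cq):
    """Relative template amount for a Cq value, assuming doubling in every cycle."""
    return 2.0 ** (-cq)


def short_to_long_ratio(cq_short, cq_long):
    """Ratio from assays specific for the short and the long transcript variant."""
    return quantity(cq_short) / quantity(cq_long)


def short_to_long_ratio_from_total(cq_short, cq_total):
    """Ratio from the short-variant assay combined with a total-expression assay."""
    short = quantity(cq_short)
    long_variant = quantity(cq_total) - short  # total = short + long on a linear scale
    if long_variant <= 0:
        raise ValueError("short-variant signal is not below the total signal")
    return short / long_variant


# Hypothetical Cq values for one CTC sample (illustrative only).
ratio_direct = short_to_long_ratio(24.0, 26.0)
cq_total = 24.0 - math.log2(1.25)  # total signal consistent with the two values above
ratio_indirect = short_to_long_ratio_from_total(24.0, cq_total)
```

Because both measurements come from the same sample and the same gene, the amount of input material cancels in the ratio, which is why no reference gene is needed.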

Claims

1. A method for in vitro diagnosing and/or prognosing cancer in a biological sample, said method comprising the steps of
a) assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, and wherein said changes are indicative of an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof,
b) comparing the amount of changes assessed in a) above to a positive and/or negative control, thereby diagnosing and/or prognosing cancer.
2. A method according to claim 1, wherein the cancer is breast cancer.
3. A method according to claim 1, wherein the breast cancer is invasive breast cancer.
4. A method according to any of claims 1-3, wherein the biological sample is a biopsy, blood, serum, plasma, urine, saliva, bone marrow, and/or cell populations.
5. A method according to any of claims 1-4, wherein the one or more molecular markers are selected from the group consisting of genomic sequences or parts thereof, transcription products, translation products, and polypeptides.
6. A method according to claim 5, wherein the genomic sequence is selected from the group consisting of genes and gene fragments, non-coding gene sequences, and epigenetic signatures.
7. A method according to claim 5, wherein the transcription product is selected from the group consisting of primary transcripts, spliced primary transcripts, mRNA, microRNA, cDNA and non-coding RNA.
8. A method according to claim 5, wherein the translation product is selected from the group consisting of peptides, polypeptides, proteins or fragments thereof, and post translationally modified proteins.
9. A method according to claim 5 and 8, wherein the translation product is an isoform of the WHSC1 L1 protein or a fragment thereof, and/or a post translationally modified isoform of the WHSC1 L1 protein.
10. A method according to any of claims 1-9, wherein the WHSC1L1 gene, or a gene product thereof is the sole molecular marker assessed in the 8p11-p12 chromosomal region.
11. A method according to any one of claims 1-10, wherein the one or more molecular markers are assessed by Array comparative genomic hybridization (array-CGH), Polymerase Chain Reaction (PCR) based techniques, Illumina's BeadArray technology, Fluorescence in situ hybridization (FISH), Enzyme-linked immunosorbent assay (ELISA), Enzyme immunoassay (EIA) and/or Western Blot.
12. A method according to any one of claims 1-11, wherein the activity of the WHSC1L1 gene is a change in the copy number of the WHSC1L1 gene.
13. A method according to claim 12, wherein the activity of the WHSC1L1 gene is an amplification, gain, heterozygous loss or homozygous deletion of the WHSC1L1 gene.
14. A method according to any one of claims 1-11, wherein the activity of the WHSC1L1 gene product is enhanced or abnormally suppressed expression of said gene product.
15. A method according to claims 12 and 13, wherein the copy number of the WHSC1L1 gene is measured by real-time quantitative PCR.
16. A method according to claim 15, wherein the copy number of the WHSC1L1 gene is measured by real-time quantitative PCR using the primers: 5'-GCAACAAGCATGACTCATCAAGAT-3' (SEQ ID NO: 1) and 5'-ACACAAGATCGCCAACCTGAA-3' (SEQ ID NO: 2).
17. A method according to claim 15, wherein the copy number of the WHSC1L1 gene is measured by real-time quantitative PCR using the primers: 5'-GGCTTATAATTTCTACACCAAA-3' (SEQ ID NO: 3) and 5'-GAACGAGTTCTAATTGATCTTC-3' (SEQ ID NO: 4) optionally in combination with a probe.
18. A method according to claim 17 wherein the probe is a hydrolysis probe with the sequence: 5'-AAGCCAACGCAGAGTGTATCATCT-3' (SEQ ID NO: 5).
19. A method according to any one of claims 15-18, wherein the expression level of the WHSC1L1 gene is assessed relative to the expression of one, preferably two, and more preferably 5-10 reference genes.
20. A method according to any one of claims 1-19, wherein the activity of the WHSC1L1 gene, or gene products thereof is assessed by the number of DNA copies of the WHSC1L1 genome in combination with a relative expression level of the WHSC1L1 gene transcript or translation products thereof.
21. A method according to any one of claims 1-20, wherein at least 2, 4, 6, 8, 10, 20, 30, 40, 50, 60, 70, 80 or 90 further markers for cancer are assessed in combination with the one or more molecular markers indicating the activity of the WHSC1L1 gene or gene products thereof.
22. A method according to claim 21, wherein the further markers for cancer are selected from the group consisting of v-myc myelocytomatosis viral oncogene homolog (avian; MYC), Human Epidermal growth factor Receptor 2 (HER2), and StAR-related lipid transfer (START) domain containing 3 (STARD3).
23. A method according to any one of claims 21-22, wherein an amplification of a gene coding for the further marker for cancer is assessed.
24. A method according to any one of claims 21-22, wherein an expression of a gene coding for the further marker for cancer is assessed.
25. A method according to claim 20, wherein an enhanced expression level of the WHSC1L1 gene transcription and/or translation products, but no amplification of the 8p11-p12 region indicates a negative prognosis of cancer.
26. A method according to any of the claims 1-9, wherein the relative expression of two or more splice variants or isoforms of WHSC1L1 is assessed.
27. A method according to claim 26 wherein the relative expression of the short (CCDS6105) and long (CCDS43729) isoform of WHSC1L1 are assessed.
28. A method according to claims 26-27, wherein the assessment of isoforms of WHSC1L1 is made by measuring differential expression of said isoforms.
29. A method according to claim 28, wherein the assessment of isoforms of WHSC1L1 is made by measuring differential expression of a set of transcripts that includes the transcript producing the short WHSC1L1 isoform (CCDS6105) relative to a set of transcripts that does not include it, or a set of transcripts that includes the transcript producing the long WHSC1L1 isoform (CCDS43729) relative to a set of transcripts that does not include it.
30. A method according to claims 1-29, wherein the analyzed sample are cancer cells enriched from circulation (CTC).
31. A kit for in vitro diagnosing and/or prognosing cancer in a biological sample, said kit comprising means for assessing changes in expression levels of one or more molecular markers in the 8p11-p12 chromosomal region, and wherein said changes indicate an activity of the Wolf-Hirschhorn syndrome candidate 1-like 1 (WHSC1L1) gene, or gene products thereof.
32. The kit of claim 31, further comprising a positive and/or a negative control.
33. The kit according to any of claims 30-31, further comprising instructions to the method according to any of claims 1-29.
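
For orientation, the assessment relative to reference genes and to a control that the claims refer to can be sketched with the widely used 2^-ΔΔCq calculation. The source does not state which calculation is used, so this is a swapped-in technique and not the claimed method. All function names, the assumed doubling per cycle, the assumption of a diploid control, and the example Cq values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def delta_cq(target_cq: float, reference_cqs: Sequence[float]) -> float:
    """Target Cq minus the mean Cq of one or more reference genes."""
    if not reference_cqs:
        raise ValueError("at least one reference gene is required")
    # Averaging Cq values corresponds to a geometric mean of the quantities.
    return target_cq - mean(reference_cqs)


def relative_quantity(
    sample_target_cq: float,
    sample_reference_cqs: Sequence[float],
    control_target_cq: float,
    control_reference_cqs: Sequence[float],
) -> float:
    """Fold change of the sample against the control, assuming doubling per cycle."""
    ddcq = delta_cq(sample_target_cq, sample_reference_cqs) - delta_cq(
        control_target_cq, control_reference_cqs
    )
    return 2.0 ** (-ddcq)


def estimated_copy_number(fold_change: float, control_copies: float = 2.0) -> float:
    """Scale the fold change by the copies assumed in the control (diploid by default)."""
    return control_copies * fold_change


# Made-up Cq values: target two cycles earlier in the sample than in the control.
fold = relative_quantity(22.0, [20.0, 20.0], 24.0, [20.0, 20.0])
print(fold)                         # 4.0
print(estimated_copy_number(fold))  # 8.0
```

The same fold-change function applies to gene-copy measurements on DNA and to expression measurements on cDNA. Only the scaling to an absolute copy number is specific to DNA.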
PCT/SE2011/000201 2010-11-05 2011-11-07 Molecular marker for cancer WO2012060760A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE1001081-7 2010-11-05
SE1001081 2010-11-05

Publications (1)

Publication Number Publication Date
WO2012060760A1 (en) 2012-05-10

Family

ID=46024699

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2011/000201 WO2012060760A1 (en) 2010-11-05 2011-11-07 Molecular marker for cancer

Country Status (1)

Country Link
WO (1) WO2012060760A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1961825A1 (en) * 2007-02-26 2008-08-27 INSERM (Institut National de la Santé et de la Recherche Medicale) Method for predicting the occurrence of metastasis in breast cancer patients
US20090203051A1 (en) * 2006-06-09 2009-08-13 The Regents Of The University Of California Targets in breast cancer for prognosis or therapy
WO2011096211A1 (en) * 2010-02-03 2011-08-11 Oncotherapy Science, Inc. Whsc1 and whsc1l1 for target genes of cancer therapy and diagnosis

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GELSI-BOYER V. ET AL.: "Comprehensive profiling of 8p11-12 amplification in breast cancer", MOLECULAR CANCER RESEARCH, vol. 3, 2005, pages 655 - 667 *
HANNEMANN J. ET AL.: "Classification of ductal carcinoma in situ by gene expression profiling", BREAST CANCER RESEARCH, vol. 8, 2006, pages R61 *
SUNG MI KIM ET AL.: "Characterization of a novel WHSC1-associated SET domain protein with H3K4 and H3K27 methyltransferase activity", BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, vol. 345, 2006, pages 318 - 323 *
YANG Z-Q. ET AL.: "Transforming properties of 8p11-12 amplified genes in human breast cancer", vol. 70, 12 October 2010 (2010-10-12), pages 8487 - 8497 *
ZHOU Z. ET AL.: "The NSD3L histone methyltransferase regulates cell cycle and cell invasion in breast cancer cells", BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, vol. 398, 1 July 2010 (2010-07-01), pages 565 - 570 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8952026B2 (en) 2013-03-14 2015-02-10 Epizyme, Inc. PRMT1 inhibitors and uses thereof
US9023883B2 (en) 2013-03-14 2015-05-05 Epizyme, Inc. PRMT1 inhibitors and uses thereof
US9045455B2 (en) 2013-03-14 2015-06-02 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9120757B2 (en) 2013-03-14 2015-09-01 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9133189B2 (en) 2013-03-14 2015-09-15 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9346761B2 (en) 2013-03-14 2016-05-24 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9365527B2 (en) 2013-03-14 2016-06-14 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9394258B2 (en) 2013-03-14 2016-07-19 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9440950B2 (en) 2013-03-14 2016-09-13 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9447079B2 (en) 2013-03-14 2016-09-20 Epizyme, Inc. PRMT1 inhibitors and uses thereof
US9475776B2 (en) 2013-03-14 2016-10-25 Epizyme, Inc. PRMT1 inhibitors and uses thereof
US9598374B2 (en) 2013-03-14 2017-03-21 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9630961B2 (en) 2013-03-14 2017-04-25 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9724332B2 (en) 2013-03-14 2017-08-08 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9732041B2 (en) 2013-03-14 2017-08-15 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9765035B2 (en) 2013-03-14 2017-09-19 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US9776972B2 (en) 2013-03-14 2017-10-03 Epizyme Inc. Pyrazole derivatives as arginine methyltransferase inhibitors and uses thereof
US9868703B2 (en) 2013-03-14 2018-01-16 Epizyme, Inc. PRMT1 inhibitors and uses thereof
US9943504B2 (en) 2013-03-14 2018-04-17 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US10039748B2 (en) 2013-03-14 2018-08-07 Epizyme, Inc. PRMT1 inhibitors and uses thereof
US10081603B2 (en) 2013-03-14 2018-09-25 Epizyme Inc. Arginine methyltransferase inhibitors and uses thereof
US10227307B2 (en) 2013-03-14 2019-03-12 Epizyme, Inc. PRMT1 inhibitors and uses thereof
US10632103B2 (en) 2013-03-14 2020-04-28 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US10800743B2 (en) 2013-03-14 2020-10-13 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US11185531B2 (en) 2013-03-14 2021-11-30 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof
US11512053B2 (en) 2013-03-14 2022-11-29 Epizyme, Inc. Arginine methyltransferase inhibitors and uses thereof

Similar Documents

Publication Publication Date Title
WO2012060760A1 (en) Molecular marker for cancer
AU2016349644B2 (en) Representative diagnostics
Hanna et al. Chromogenic in-situ hybridization: a viable alternative to fluorescence in-situ hybridization in the HER2 testing algorithm
Baak et al. Genomics and proteomics in cancer
EP2430193B1 (en) Markers for detection of gastric cancer
Codrich et al. Integrated multi-omics analyses on patient-derived CRC organoids highlight altered molecular pathways in colorectal cancer progression involving PTEN
Murria Estal et al. MicroRNA signatures in hereditary breast cancer
AU2020207053A1 (en) Genomic profiling similarity
CA3177323A1 (en) Immunotherapy response signature
Radhakrishnan et al. Analysis of chromosomal aberration (1, 3, and 8) and association of microRNAs in uveal melanoma
KR20230011905A (en) Panomic genomic prevalence score
Morley-Bunker et al. Assessment of intra-tumoural colorectal cancer prognostic biomarkers using RNA in situ hybridisation
Kim et al. Tumor immune microenvironment is influenced by frameshift mutations and tumor mutational burden in gastric cancer
Tanami et al. Involvement of cyclin D3 in liver metastasis of colorectal cancer, revealed by genome-wide copy-number analysis
WO2019169336A1 (en) Methods for prostate cancer detection
McIntyre et al. MYB-NFIB gene fusions identified in archival adenoid cystic carcinoma tissue employing NanoString analysis: an exploratory study
EP2550534A1 (en) Prognosis of oesophageal and gastro-oesophageal junctional cancer
KR20170127774A (en) Method for predicting prognosis of breast cancer patients using gene deletions as biomarkers
US20230368915A1 (en) Metastasis predictor
WO2007043418A1 (en) Method for prediction of postoperative prognosis for patient with pulmonary adenocarcinoma, and composition for use in the prediction
Jain et al. Biomarkers of cancer
Gill et al. Cytogenetics to multiomics in biology of cancer
Gemoll et al. Applying genomics and proteomics in translational surgical oncology research
WO2014207167A1 (en) Methods for monitoring treatment response and relapse in breast cancer
Sato et al. Biomarkers of Uterine Fibroids

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11838314

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11838314

Country of ref document: EP

Kind code of ref document: A1