WO1999037764A2 - New members of the glypican gene family - Google Patents

New members of the glypican gene family

Publication number
WO1999037764A2
Authority
WO
WIPO (PCT)
Prior art keywords
gene
glypican
nucleotide sequence
sequence
probes
Prior art date
Application number
PCT/EP1999/000329
Other languages
French (fr)
Other versions
WO1999037764A3 (en)
Inventor
Mark Paul Dittmar Veugelers
Guido Joseph Frans David
Original Assignee
Vlaams Interuniversitair Instituut Voor Biotechnologie
Priority date
Filing date
Publication date
Application filed by Vlaams Interuniversitair Instituut Voor Biotechnologie
Priority to AU24229/99A
Publication of WO1999037764A2
Publication of WO1999037764A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/475 Growth factors; Growth regulators
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C07K14/4701 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
    • C07K14/4725 Proteoglycans, e.g. aggrecan

Definitions

  • The present invention relates to the characterization and chromosomal localization of new members of the glypican gene family and to the use of members of this family in diagnostics and/or therapeutics.
  • Glypicans are glypiated cell surface heparan sulfate proteoglycans, the first of which was originally identified in human lung fibroblasts (David et al., 1990).
  • The five known members of this family have similar core protein sizes (about 60 kDa), share a unique and very conserved cysteine spacing, and are linked to the cell membrane by a glycosyl phosphatidyl inositol (GPI) anchor.
  • Those five known glypicans are of vertebrate origin and include glypican (glypican-1, David et al., supra), Cerebroglycan (glypican-2, Stipp et al.
  • All the structural features of the vertebrate glypicans are also present in the product of dally (division abnormally delayed), a locus identified in Drosophila melanogaster by genetic screening for mutants affecting cell division patterning in the developing central nervous system (Nakato et al., 1995). Besides disturbing cell cycling in the nervous system, dally mutations also affect viability and produce morphological defects in several adult tissues, including the eyes, antennae, wings and genitalia.
  • The dally mutants, the well-established co-receptor activities of the cell surface proteoglycans for various ligands that are known to mediate developmental instructions, and the tissue- and stage-specific expressions of the glypicans all implicate the glypican group of integral membrane proteoglycans in the control of cell division and patterning during development.
  • This contention has recently been corroborated by the identification of mutations in GPC3, the gene coding for the human homologue of OCI-5 (glypican-3), that cause the Simpson-Golabi-Behmel overgrowth syndrome (SGBS) (Pilia et al., 1996).
  • This X-linked condition, which clinically has to be differentiated from the autosomal Beckwith-Wiedemann syndrome, is characterized by pre- and post-natal overgrowth with visceral and skeletal anomalies, and is associated with a high risk for developing embryonal tumors, including Wilms' tumor and neuroblastoma.
  • Chromosomal assignment of the genes for the members of the glypican family and the identification of potentially additional members in this family may be of general relevance for the understanding of somatic overgrowth and tumor predisposition.
  • Glypican, the homologue of OCI-5 (glypican-3) and glypican-5 have been identified in human.
  • The corresponding genes GPC1, GPC3 and GPC5 have been localized to chromosomes 2q35-q37, Xq26 and 13q32, respectively (Vermeesch et al., 1995; Pilia et al., 1996; Veugelers et al., 1997).
  • The cDNA nucleotide and derived amino acid sequences of these genes are given in figures 3, 4 and 5, respectively.
  • It is the object of the present invention to provide new members of the glypican family, and to study their possible implications in various medical indications. It is a further object of the invention to use the information derivable from the members of the glypican gene family for designing diagnostic methods and kits and/or for the development of therapeutics.
  • Two novel human cDNAs were identified encoding glypican-related proteins. The corresponding gene for the first was mapped to chromosome 13q32. In this application this gene will be identified as GPC6, whereas the protein encoded by the gene will be called glypican-6.
  • The predicted primary structure of the GPC6 protein was found to show significant sequence similarity to glypican (glypican-1), to the human homologue of OCI-5 (glypican-3), to glypican-5, to the glypican-related proteins Cerebroglycan (glypican-2 of the rat) and K-glypican (glypican-4 of the mouse), and to the gene product of the dally locus in Drosophila melanogaster.
  • The similarity pertains to a conserved sequence motif that is present in all seven proteins and includes a set of 14 conserved cysteine residues found in specific positions.
  • Glypican-6 is, however, more similar to K-glypican and human glypican-4 (see below) than to the other glypicans.
  • Glypican-6 is more similar to glypican-1 and glypican-2 than to glypican-3 and glypican-5.
  • The gene encoding glypican-6 (GPC6) has a similar exon-intron organization to the gene encoding glypican-4 (GPC4, as was now found according to the invention) and the gene encoding glypican-1 (GPC1).
  • This organization differs from the exon-intron organization of the gene encoding glypican-3 (GPC3) and that of the gene encoding glypican-5 (GPC5), while GPC3 and GPC5, in turn, resemble one another in terms of their intron-exon organizations.
  • A cDNA is provided that encodes the human homologue of K-glypican (glypican-4), the corresponding gene of which localizes to chromosome Xq26 in very close proximity to the gene for glypican-3.
  • the GPC3 and GPC4 genes are adjacent, or near adjacent, to one another on chromosome Xq26, while the GPC5 and GPC6 genes are adjacent, or near adjacent, to one another on chromosome 13q32.
  • a member (GPC3, respectively GPC5) of one of the glypican subfamilies is physically linked to a member (GPC4, respectively GPC6) of another glypican subfamily. This indicates that the glypican subfamilies and various members of these families may have arisen from the duplications of one ancestral glypican gene and ancestral gene cluster. Maintenance of the physical associations between these genes during evolution suggests that these genes may also be functionally linked.
  • GPC3 the gene for glypican-3
  • GPC4 the gene for glypican-4
  • Various diagnostic tests could be developed in order to detect aberrations in the genes that encode glypicans and aberrations in the expression levels of these genes.
  • This knowledge can be used to develop therapeutic compounds that restore the physical damage caused by the mutant gene.
  • The aberrations in the gene comprise for example deletions or translocations within either or both of the two genes, but also mutations in either or both of them. These aberrations may lead to the absence of gene products or to abnormal gene products. Thus, the expression level of the gene may be used as another parameter indicating the presence of one or more aberrant genes.
  • GPC3 and GPC4 can be extrapolated to other members of the glypican gene family.
  • GPC5 and GPC6 are likewise associated. Aberrations in these genes can be identified in a similar manner as herein described for GPC3 and GPC4.
  • SSCP single strand conformation polymorphism
  • RFLP restriction fragment length polymorphism
  • gel electrophoresis Southern blot analysis
  • PCR DNA sequencing
  • Diagnostic methods according to the invention are based on the information derivable from the gene and/or its gene product.
  • Such information comprises the nucleotide sequence, either sense or antisense, of the gene and the complementary strand thereof, and the amino acid sequence of the gene product encoded by the coding sequence of the gene.
  • the information derivable from the gene or gene product can very well be defined by a person skilled in the art by referring to figures 1 to 6.
  • Figures 1 to 5 disclose the nucleotide sequence of the human cDNAs for glypicans 1 and 3 to 6, as well as the derived amino acid sequence of the protein encoded by the cDNA.
  • Figure 6 gives an alignment of the predicted amino acid sequences and the position of the exon boundaries for each of them. This information can be used to define so-called derivatives.
  • Derivatives of the nucleotide sequence of the gene are the gene itself, either isolated or synthetic, fragments of the gene, either isolated or synthetic and having a length that is smaller than the complete gene; primers, comprising at least 10 consecutive gene specific nucleotides, preferably about 20 gene specific consecutive nucleotides of the nucleotide sequence of the gene; longer oligonucleotides up to the full length of the sequence of the gene; antisense variants of the gene, the fragments or the primers; antibodies directed to the gene, fragments, primers or complementary strands thereof; any specific ligand for DNA that can be used as a specific probe, peptide nucleic acid probes.
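The primer and antisense derivatives listed above can be illustrated with a minimal Python sketch. It is only a sketch under stated assumptions: the sequence `SEQ` is a made-up placeholder rather than any sequence from the figures, the helper names `antisense` and `candidate_primers` are hypothetical, and no check for gene specificity is performed, so the windows it yields are merely candidates of the stated length.

```python
# Sketch only: SEQ is a made-up placeholder, not a sequence from figures 1 to 5.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def candidate_primers(seq: str, length: int = 20, min_length: int = 10):
    """Yield (1-based start, window) for each run of `length` consecutive nucleotides."""
    if length < min_length:
        raise ValueError(f"primer length must be at least {min_length}")
    seq = seq.upper()
    for start in range(len(seq) - length + 1):
        yield start + 1, seq[start:start + length]


SEQ = "ATGGCACGTTTCGGATTGCCAGCTCTTCTATGCACC"  # hypothetical placeholder
sense_primers = list(candidate_primers(SEQ, length=20))
# Antisense variant of every candidate window
antisense_primers = [(pos, antisense(p)) for pos, p in sense_primers]
```

Whether a given window is actually gene specific would have to be established separately, for example by searching it against other sequences; that step is outside this sketch.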
  • transcripts mRNA sequences of the gene, from which in turn cDNA, antisense RNA, antisense cDNA, antibodies directed to the transcript, sense and antisense cDNA, antisense RNA and any specific ligand for RNA that can be used as a specific probe can be derived.
  • Derivatives of the amino acid sequence include the isolated or synthetic gene product (also called protein or polypeptide); isolated or synthetic peptides, comprising a specific sequence of consecutive amino acids encoded by the gene, antibodies directed to the gene product or peptides and any specific ligand for peptides that can be used as a specific probe.
  • Other derivatives are heparan sulfate chains or heparan sulfate structures, antibodies directed to heparan sulfate structures present on the product of the natural or synthetic gene as a result of the posttranslational modification of these gene products, any specific ligand for heparan sulfate that can be used as a specific probe.
  • the gene or cDNA may be used for the transfection of cells, which transfection results in cells expressing or secreting the desired glypican.
  • the transfected cells can be used to produce transgenic animals therefrom, which in case the gene is an aberrant gene, can be used to study the effect of the aberration or to test medicaments.
  • Natural glypicans may be isolated, or recombinant (wild type or mutated) glypicans produced (in transfected cells or transgenic animals), for use as therapeuticals.
  • Such therapeuticals may be used to mimic the biological effects of the glypicans (control of cell growth and differentiation), in attempts to remedy the effects of absolute or relative deficiencies of these genes or to enhance the effects of the normal genes.
  • Therapeuticals based on modified glypican gene sequences may also be used to block the effects of the glypicans, in attempts to remedy the effects of absolute or relative overexpressions or activities of the products of the various glypican genes.
  • Recombinant soluble glypicans may be used as decoy receptors for antagonizing the effects of factors that depend on membrane-anchored glypicans, whereas the delivery of membrane-intercalatable glypicans to cells may restore cellular sensitivity to these factors. Diagnostic methods according to the invention comprise but are not limited to the following.
  • a method for diagnosing aberrations in a glypican encoding gene comprises isolation of the gene from cells expected to be harboring an aberrant gene; and comparison of the nucleotide sequence of the gene thus obtained with the nucleotide sequence of a wild type gene.
  • The term wild type gene as used in this application is intended to encompass a gene from a non-affected individual.
  • sequences given in figures 1 to 5 are representatives for wild type gene sequences.
  • Comparison between the potentially aberrant and wild type nucleotide sequences can be performed at various levels. On a first level it can be established whether the expected aberration(s) has (have) resulted in restriction fragment length polymorphism. In order to do this the isolated gene and a wild type comparison gene are separately digested with one or more selected restriction enzymes. The digest thus obtained is separated on a gel revealing a pattern of bands. Differences in the pattern indicate the presence of differences in the restriction sites present in the polynucleotide and thus changes in the sequence thereof. Deletions can be detected by means of any nucleic acid amplification technique such as the Polymerase Chain Reaction (PCR). For this, probes are identified corresponding to various parts of the gene to be diagnosed, for example exons. Amplification between a set of probes will only occur if the part of the gene to which a selected set of probes should hybridize is still present. In addition or as an alternative the length of the amplified fragment indicates whether any part is deleted.
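The restriction fragment comparison described above can be mimicked in silico. The following Python sketch is an illustration under simplifying assumptions, not a description of the procedure used in the application: the two sequences are invented toy strings, the function name `digest` is hypothetical, and the cut is placed at the start of the recognition site rather than at the enzyme's true cleavage position (which shifts fragment lengths by a few nucleotides but does not change whether the band patterns differ).

```python
def digest(seq: str, site: str) -> list[int]:
    """Return sorted fragment lengths after cutting `seq` at every `site` occurrence.

    Simplification: the cut is placed at the first base of the recognition site.
    """
    seq, site = seq.upper(), site.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


SITE = "GAATTC"  # EcoRI recognition sequence
wild_type = "AAAAGAATTCTTTTTTGAATTCCC"  # toy sequence with two sites
candidate = "AAAAGAATTCTTTTTTGAGTTCCC"  # toy sequence, second site lost by a point change

wt_bands = digest(wild_type, SITE)
candidate_bands = digest(candidate, SITE)
pattern_differs = wt_bands != candidate_bands
```

A differing list of fragment lengths corresponds to a differing band pattern on the gel; identical lists do not exclude sequence changes that leave every restriction site intact.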
  • Point mutations can be identified by more sophisticated techniques such as SSCP (single-strand conformation polymorphism screening), heteroduplex analyses, DNA-chips, chemical and enzymatic methods, sequencing of PCR products, denaturant gradient gel electrophoresis or other state of the art methods that may become available in the future.
  • Translocations can be detected by hybridizing a set of chromosomes with a first probe that hybridizes to a part of the glypican gene that is not likely to be involved in the translocation, inversion or deletion and a second probe that hybridizes to a part of the glypican gene that is likely to be involved in such aberration. When a translocation has occurred the second probe will be found on another chromosome than the first one. If the probable translocation partner is identified, an additional set of probes can be used which hybridize to the translocated part and the remaining part, respectively of the translocation partner and bearing a different label from the first set of probes. Upon translocation one of the probes of the first set will be found on the chromosome of the translocation partner and one probe of the second set will be found on the chromosome of the glypican gene and vice versa.
  • Identifying inversions and deletions works in a similar way with two probes, one that hybridizes to a part of the gene that is not likely to be involved in the inversion or deletion and a second probe that is likely to be involved in such aberration.
  • the second probe will be found closer to or further away from the first probe than in a non-aberrant chromosome.
  • deletions one of the probes will be missing on the aberrant chromosome.
  • Diagnosis can also be performed at the level of the (potentially absent or aberrant) protein encoded by the glypican gene.
  • Antibodies directed to the gene product or protein can be used on Western blots to detect the presence of the protein in the cell or to assess the amount of protein present.
  • the diagnostic tests of the invention can be performed on various source materials.
  • RFLP, deletion PCR, SSCP and chromosome analyses are for example performed on blood cells or tissue biopsy samples of the patient and his or her family. Furthermore, tumor cells and normal cells of these subjects may be used. For protein analysis, tissue samples, sera, tissue fluids of patients and family, pleura exudates, ascites etc. may be used.
  • Figure 1 shows the nucleotide sequence of the glypican-6 cDNA, comprising the coding sequence of the newly identified GPC6 gene, and the predicted amino acid sequence.
  • Figure 2 shows the nucleotide sequence of the glypican-4 cDNA, comprising the coding sequence of the GPC4 gene, and the predicted amino acid sequence.
  • Figure 3 shows the nucleotide sequence of the human glypican-1 cDNA, comprising the coding sequence of the GPC1 gene, and the predicted amino acid sequence (Genbank accession number X54232; David et al., J. Cell Biol. 111, 3165-3176, 1990).
  • Figure 4 shows the nucleotide sequence of the human glypican-3 cDNA, comprising the coding sequence of the GPC3 gene, and the predicted amino acid sequence (Genbank accession number Z37987).
  • Figure 5 shows the nucleotide sequence of the human glypican-5 cDNA, comprising the coding sequence of the GPC5 gene, and the predicted amino acid sequence (Genbank accession number U66033; Veugelers et al., Genomics 40, 24-30, 1997).
  • Figure 6 shows the alignment of the predicted amino acid sequences of the members of the glypican family.
  • GPC1 is human glypican (David et al., 1990)
  • GPC3 is the translated ORF of MXR7 (GenBank #Z37987) and the human homologue of rat OCI-5
  • GPC4 is human glypican-4 (see example 2) and the human homologue of K-glypican (Watanabe et al., 1995)
  • GPC5 is human glypican-5 (Veugelers et al., 1997).
  • the set of fourteen conserved cysteines and the putative glycosaminoglycan attachment sites are outlined by underlining.
  • Figure 7A shows a Northern blot for GPC6 of human fetal Brain (lane 1), Lung (lane 2), liver (lane 3) and Kidney (lane 4) RNA.
  • the positions of RNA size markers (kb) are indicated in the abscissa.
  • Figure 7B shows a Northern blot for GPC6 of human adult Heart (lane 1) , Brain (lane 2) , Placenta
  • Figure 7C shows a Northern blot for GPC6 of human adult Spleen (lane 1), Thymus (lane 2), Prostate (lane 3), Testis (lane 4), Ovary (lane 5), Small Intestine (lane 6), Colon Mucosal lining (lane 7), and Peripheral Blood Leukocyte (lane 8) RNA.
  • the positions of RNA size markers (kb) are indicated in the abscissa.
  • Figure 8 shows heparan sulfate core protein expression in control and GPC-transfectant Namalwa cells.
  • This antibody reacts with the desaturated uronates that are generated by heparitinase and that remain in association with the core protein after the enzyme treatment and during electrophoresis.
  • After heparitinase it therefore detects all heparan sulfate proteoglycan core proteins present in the sample. The positions of protein size markers are indicated in the abscissa.
  • Hase: heparitinase; Case: chondroitinase ABC.
  • Control: wild type Namalwa cells (wt) and Namalwa cells transfected with pREP4 (0); GPC1, GPC4 and GPC6: Namalwa cells transfected with, respectively, the plasmids glyp1-pREP4, glyp4-pREP4 and glyp6-pREP4.
  • Figure 9 shows heparan sulfate expression in control and GPC-transfectant Namalwa cells. FACS analysis of non-digested and heparitinase-digested cells, using the native HS-specific 10E4 antibody (non digested cells (- • - • -)) and delta-HS-specific 3G10 antibody (digested cells ( ).
  • Namalwa cells transfected with pREP4; GPC1, GPC3, GPC4, GPC5 and GPC6: Namalwa cells transfected with, respectively, the plasmids glyp1-pREP4, glyp3-pREP4, glyp4-pREP4, glyp5-pREP4 and glyp6-pREP4.
  • Figure 10 shows the chromosomal localization of GPC6 to chromosome band 13q32 as a photo (A) and a schematic representation thereof (B). Arrows indicate colored bands.
  • Figure 11A shows the Northern blot for GPC4 of human fetal Kidney (lane 1) , Liver (lane 2) , Lung (lane 3) and Brain (lane 4) RNA.
  • The upper part of the figures represents the hybridization with the GPC4 probe; the lower part the hybridization with a β-actin control probe.
  • Figure 11B shows the Northern blot for GPC4 of human adult Heart (lane 1), Brain (lane 2), Placenta (lane 3), Lung (lane 4), Liver (lane 5), Skeletal Muscle (lane 6), Kidney (lane 7) and Pancreas (lane 8) RNA.
  • Figure 11C shows the Northern blot for GPC4 of human adult Spleen (lane 1), Thymus (lane 2), Prostate (lane 3), Testis (lane 4), Ovary (lane 5), Small Intestine (lane 6), Colon, mucosal lining (lane 7), and Peripheral Blood Leukocyte (lane 8) RNA.
  • the positions of RNA size markers (kb) are indicated in the abscissa.
  • Figure 12A illustrates the chromosomal localization of GPC4 to chromosome Xq26 and relative order of GPC3 and GPC4.
  • FISH was performed using either BAC 35H9 or BAC 68G14 on metaphase spreads prepared from PHA-stimulated normal peripheral blood leukocytes (Figure 12A).
  • FISH was performed with BACs for GPC4 (35H9 and 68G14, labeled in red) and BACs for GPC3 (166D10 and 36D20, labeled in green) on PHA-stimulated cell lines GM3884, GM13034 and GM0097 (Figures 12B, 12C and 12D).
  • Chromosomes were counterstained with DAPI, and the images were taken using a cooled CCD device. Arrows indicate the positive signals at chromosome Xq26.
  • Figure 13A and 13B show a BAC/PAC contig linking GPC4 to GPC3 on Xq26.
  • Figure 13C shows glypican deletions found in SGBS patients. STSs are indicated by black circles; exons are indicated by grey squares. Not drawn to scale (the distance between SWXD1698 and exon-8 of GPC3 is approximately 250 kb; see Pilia et al., 1996).
  • Table 1 shows the primers used in 5'-RACE experiments for the identification of GPC6.
  • Table 2 shows the percentages of amino acid identities between glypicans.
  • Table 3 shows the primers used in the RACE experiments with GPC4.
  • Table 4 shows the gene specific primers used for sequencing of the GPC4 gene.
  • Table 5 shows the novel STSs MV1, MV2 and MV3.
  • Table 6 shows localization of FISH signals in
  • Table 7 shows the intron-exon organization of GPC4.
  • Table 8 shows primers to be used in deletion analysis of GPC3 and GPC4.
  • Table 9 shows the primers for use in SSCA of GPC4.
  • Table 10 shows the results of deletion and SSCP screening in 8 patients with SGBS.
  • Table 11 shows primer pairs for deletion PCR of
  • Table 12 shows primer pairs for deletion PCR of GPC6.
  • the isolation of one of the cDNAs of the present invention started from EST database entries showing significant similarity with (cDNA coding for) glypicans.
  • the cDNA was found in a cDNA library of fetal brain.
  • EST entries (including homology data) were retrieved from dbEST using a text string based query interface (http://www.ncbi.nlm.nih.gov/dbEST/index.html). Protein alignments were made using the program Clustal. DNA alignments were made using the program GENEPRO.
  • The primer sets used for the 5'-RACEs (5' rapid amplification of cDNA ends) are given in Table 1.
  • The cDNAs were amplified from a library of adaptor-ligated double strand human fetal brain cDNA (Clontech, Palo Alto, CA) through a two-step PCR protocol. In the first PCR a gene-specific primer was used and an anchor primer provided by the supplier. Then 1 µl of each first PCR was used as template for a second PCR, using a second gene-specific nested primer (cf.
  • Clone zh83a06 from the Soares fetal liver/spleen library (Lennon et al., 1996), which had yielded EST No. AA001322, was obtained from the I.M.A.G.E. Consortium (http://www-bio.llnl.gov/bbrp/image/image.html) through Research Genetics, Inc. (Huntsville, AL). This clone (ID: 427858) was completely sequenced, yielding residues 1835-2748 of the composite cDNA sequence that is shown in figure 1.
  • Metaphase spreads were prepared from PHA-stimulated human peripheral blood lymphocytes cultured for 72 hours. Prior to FISH, slides were treated with RNAse A and pepsin as described (Wiegant et al., 1991). Human Cot-1 DNA (Life Technologies) was used as a competitor. Denaturation of the slides and probes, hybridization, and subsequent cytochemical detection of the hybridization signals were performed as previously described (Vermeesch et al., 1995). Chromosomes were counterstained with DAPI and the slides were mounted in Vectashield mounting medium (Vector Laboratories Inc, Burlingame, CA). The signal was visualized by digital imaging microscopy using a cooled charge-coupled device camera (Photometrics Ltd, Arlington, AZ). Merging and pseudocoloring were performed using the Smart Capture software (Vysis, Stuttgart, Germany).
  • the membranes for the Northern blots were obtained from Clontech. Hybridization was performed for two hours at 68 °C, using Expresshyb solution (Clontech) according to the manufacturer's specifications.
  • The probe was either a 32P-oligolabeled BamHI-XbaI fragment from the I.M.A.G.E. clone 427858 (corresponding to residues 2147-2488 of the GPC6 sequence) or a HindIII-BamHI fragment from the GPC6 composite cDNA sequence (corresponding to residues 1724-2147 of the GPC6 sequence).
  • Post-hybridization washing included two washes with 2x SSC, 0.05% SDS (5 min at RT; 30 min at 60°C) and a high stringency wash with 0.1x SSC, 0.1% SDS (30 min at 65°C).
  • NotI-BstEII and BstEII-AvaI fragments from overlapping RACE clones, and the AvaI-HindIII fragment of I.M.A.G.E. clone 427858, were ligated together in pCR2.1.
  • NotI-XbaI and XhoI-XbaI fragments from this construct, containing the Kozak sequence and initiator ATG, the full coding sequence and the stop codon, were subcloned in, respectively, pcDNAIII and pBluescript.
  • The full length cDNA was released from pBluescript with KpnI and NotI, and ligated into pREP4, yielding the plasmid glyp6-pREP4.
  • Namalwa cells (ATCC CRL 1432) were routinely grown in DMEM/F12 medium supplemented with 10% FCS and L-glutamine. For transfection, the cells were prewashed with Ca- and Mg-free PBS and incubated for 10 min at 4°C (10^7 cells in 1 ml Ca/Mg-free PBS) with 30 µg glyp6-pREP4 plasmid before electroporation at 240 V and 960 µF (Gene Pulser, Biorad).
  • Figure 1 represents the merged sequences of these clones, and the predicted structure of the protein encoded by the message that corresponds to this cDNA.
  • the sequence features an ATG start codon, in a Kozak sequence context, at position 586 and a TAA stop codon at position 2251.
  • Two AATAAA sequences are present at positions 2598 and 2690.
  • the open reading frame in the sequence codes for a protein of 555 residues.
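The stated start codon position, stop codon position and protein length can be cross-checked with simple arithmetic. The short Python sketch below assumes that both positions are 1-based and refer to the first nucleotide of the respective codon (the text does not say so explicitly); the function name `orf_codons` is a hypothetical helper.

```python
def orf_codons(atg_start: int, stop_start: int) -> int:
    """Number of amino acid codons between the first base of ATG and the first base of the stop codon."""
    coding_nt = stop_start - atg_start  # nucleotides from ATG up to, but excluding, the stop codon
    codons, remainder = divmod(coding_nt, 3)
    if remainder != 0:
        raise ValueError("start and stop codons are not in the same reading frame")
    return codons


# Positions quoted for the glypican-6 cDNA: ATG at 586, TAA at 2251
residues = orf_codons(586, 2251)  # (2251 - 586) / 3 = 1665 / 3 = 555
```

Under that assumption the two positions are in frame and give 555 codons, in agreement with the 555 residues stated for the open reading frame.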
  • the protein sequence starts and terminates with hydrophobic signal peptide-like sequences. It contains no asparagines that correspond to potential N-glycosylation sites, and contains four serine-glycine dipeptide sequences.
  • Ser-Gly dipeptide sequences occur towards the C-terminus of the protein, and form part of a direct Ser-Gly repeat sequence.
  • This Ser-Gly tetra-repeat sequence is flanked, both upstream and downstream, by acidic amino acids (D/E), reproducing a motif that has been reported to promote the assembly of heparan sulfate in proteoglycans.
  • The downstream acidic residues occur within the sequence CMDDVC, and may reproduce a motif (a small acidic loop supported by a disulfide bond) that is shared by most glypicans (except glypican-2).
  • This loop follows the SG repeats in the glypicans 1, 4 and 6, but interrupts or precedes the SG repeats in the glypicans 3 and 5.
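The Ser-Gly repeat with acidic flanks described above can be searched for with a regular expression. The Python sketch below is illustrative only: the peptide `TOY` is an invented string that merely embeds the CMDDVC element quoted in the text, the function name `sg_repeats` is hypothetical, and the flank width of six residues is an arbitrary assumption rather than a value taken from the application.

```python
import re


def sg_repeats(protein: str, flank: int = 6) -> list[dict]:
    """Find runs of two or more consecutive Ser-Gly dipeptides and report acidic flanks."""
    protein = protein.upper()
    hits = []
    for m in re.finditer(r"(?:SG){2,}", protein):
        upstream = protein[max(0, m.start() - flank):m.start()]
        downstream = protein[m.end():m.end() + flank]
        hits.append({
            "start": m.start() + 1,                 # 1-based position of the first Ser
            "repeats": len(m.group()) // 2,         # number of SG dipeptides in the run
            "acidic_upstream": bool(re.search(r"[DE]", upstream)),
            "acidic_downstream": bool(re.search(r"[DE]", downstream)),
        })
    return hits


TOY = "MKLDDESGSGSGSGCMDDVCAA"  # invented peptide, not the glypican-6 sequence
motifs = sg_repeats(TOY)
```

On a real protein the same scan would be run on the sequence given in the corresponding figure, and the flank width adjusted to whatever neighbourhood is considered relevant.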
  • Alignment of this predicted protein sequence with the protein sequences of the other known members of the glypican family revealed significant sequence similarities (figure 6).
  • This similarity included the 14 cysteines and the position and identity of several additional amino acid residues that are conserved in all glypicans identified so far.
  • the entire protein showed 63% of sequence identity to human glypican-4, 44% of identity to human glypican-1, and 24-25% identity to the human glypicans 3 and 5.
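Percent identity figures of this kind are derived from a pairwise alignment. The Python sketch below shows one common way to compute such a value; it is an assumption, since the text does not state which convention was used. Here identical residues are counted over the aligned columns in which neither sequence has a gap. The function name `percent_identity` and the two toy strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over alignment columns without a gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment: five ungapped columns, four of them identical
identity = percent_identity("ACDE-F", "ACDQGF")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which yields somewhat different percentages for the same alignment.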
  • The glypican family of cell surface heparan sulfate proteoglycans may be composed of discrete subfamilies: one comprising glypicans 4 and 6, and possibly also glypican-1 (and 2); the second comprising glypicans 3 and 5.
  • The message is expressed at very high levels in ovary, and at high levels in liver, kidney, small intestine and colon (mucosal lining).
  • The message is also present at low levels in heart, brain, placenta, lung, skeletal muscle, pancreas, spleen, thymus, prostate and testis (figures 7B and C).
  • the message is undetectable in peripheral blood leukocytes.
  • In adult kidney the probes also detected a second less abundant message of approximately 5.8 kb.
  • Adult heart and adult skeletal muscle yielded an extra band of ~3.9 kb.
  • the glypican-6 insert was subcloned in the pREP4 episomal expression vector and transfected in Namalwa cells.
  • the Namalwa cells used for these experiments had previously been shown to express little endogenous heparan sulfate, but to support the synthesis of large amounts of heparan sulfate when transfected with cDNAs (cloned in pREP4) that code for syndecans or glypican-1.
  • Two BACs for GPC6, 114A17 and 182F5, were used to localize GPC6 to chromosome band 13q32 by fluorescent in situ hybridization on metaphase chromosomes (figure 10; BAC 114A17). From this it follows that GPC5 (closely related to GPC3) and GPC6 (closely related to GPC4) map in close proximity of one another on 13q32, mimicking the clustering of the GPC3 and GPC4 genes on chromosome Xq26 (see example 2).
  • the isolation of the cDNA that is used in the present invention for diagnostics started from a partial cDNA for human glypican-4.
  • a cDNA comprising the complete coding sequences for human glypican-4 was found in a cDNA library of fetal brain.
  • EST entries were retrieved from dbEST using either a text string based query interface (http://www.ncbi.nlm.nih.gov/dbEST/index.html), or by BLAST searches using the BLAST-server
  • a partial cDNA for human GPC4 was obtained by PCR on a human fetal kidney library (pKGP-PCR) .
  • the sequence of this cDNA was used to design the primers for the RACE experiments and the isolation of cDNA for the complete coding sequence of human GPC4.
  • the 5'-RACE and 3'-RACE experiments were performed on a library of adaptor-ligated ds fetal brain cDNA, using the Marathon cDNA Amplification kit from Clontech (Palo Alto, CA).
  • the cDNAs were amplified through a two-step PCR protocol. The first PCR used a gene-specific primer (Table 3) and an anchor primer provided by the supplier.
  • the pKGP-PCR probe (corresponding to residues 422-1497 of the GPC4 cDNA sequence shown in figure 2) and the NotI-BglII fragment (residues 1-386) of the GPC4 cDNA were gel purified, 32P-labelled and used to screen a human genomic BAC library (Research Genetics, Inc., Huntsville, AL). Two BACs, 35H9 and 151D8, were isolated with the PCR probe, and one BAC, 68G14, with the NotI-BglII fragment.
  • BAC 166D10 which contained exon-3 of GPC3
  • BAC 36D20 which contained exon-2 of GPC3.
  • Metaphase spreads were prepared from PHA-stimulated human peripheral blood lymphocytes cultured for 72 hours. Prior to FISH, slides were treated with RNAse A and pepsin as described (Wiegant et al., 1991). Human Cot-1 DNA (Life Technologies) was used as a competitor. Denaturation of the slides and probes, hybridization, and subsequent cytochemical detection of the hybridization signals were performed as previously described (Vermeesch et al., 1995). Chromosomes were counterstained with DAPI and the slides were mounted in Vectashield mounting medium (Vector Laboratories Inc, Burlingame, CA). The signal was visualized by digital imaging microscopy using a cooled charge-coupled device camera (Photometrics Ltd, Arlington, AZ). Merging and pseudocoloring were performed using the Smart Capture software (Vysis, Stuttgart, Germany).
  • Exon-intron boundaries were determined by cycle-sequencing of BAC DNA using gene specific primers. Alternatively, BAC DNA was subcloned in plasmids, verified for the presence of GPC4 exons (by PCR and Southern blotting) and subsequently sequenced.
  • ATCC American Type Culture Collection
  • BACs 35H9 and 68G14 were sequenced and used to construct the novel sequence-tags (STS) MV1, MV2 and MV3 (Table 5).
  • the PAC-library from P. de Jong was screened with 32P-oligolabeled probes for MV1, exon-1 GPC4, exon-2 GPC4 and exon-8 GPC3. PAC content was verified by PCR. STSs for GPC4 exons are described in Table 5; STSs SWXD1698, SWXD1165 and sWXD2342 have been described by others (see The Genome Database, http://www.gdb.org).
  • the reaction cycles for STSs MV1, MV2 and MV3 were: 94 °C for 30 sec, 55 °C for 30 sec, 72 °C for 30 sec, for 35 cycles. Cycling was preceded by a 150 sec incubation at 94 °C.
  • the membranes for the Northern blots were obtained from Clontech. Hybridization was performed for two hours at 68 °C, using Expresshyb solution (Clontech) according to the manufacturer's specifications.
  • the probe was either a 32P-oligolabeled NotI-BglII fragment from one of the 5'-RACE clones (corresponding to residues 1-386 of the sequence shown in figure 2), or a 32P-oligolabeled BamHI-BamHI fragment from the composite cDNA sequence constructed in pREP4 (corresponding to residues 1148-2291 of the sequence shown in figure 2).
  • Dehybridisation included washing at room temperature for 30 min with 2.0x SSC, 0.05% SDS and a high stringency wash for 30 min with 0.1x SSC, 0.1% SDS at 65 °C.
  • Genomic DNA was obtained from one newly identified patient, counseled at the Center for Human Genetics (CME) of the University of Leuven (with informed consent from the parents); from the lymphoblastoid cell lines AG0817, AG0857, AG0893, AG0946, AG0969, and FY0367 (database IDs) from the European Collection of Cell Cultures (ECACC); and from the fibroblastic cell lines GM13034, GM3884, GM0097 (ATCC), all established from patients with SGBS. All patient DNAs were analyzed by PCR. The reaction cycles were: 94 °C for 30 sec, annealing temperature for 30 sec, 72 °C for 30 sec, for 35 cycles. Cycling was preceded by a 2.5 minute incubation at 94 °C.
  • reaction products were analyzed by electrophoresis in 2% agarose gels or, alternatively, were analyzed for single-strand conformation polymorphisms (SSCP) in non-denaturing polyacrylamide gels as described previously (Matthijs et al., 1997).
  • PCR products with variant SSCs and controls were sequenced, either directly after gel purification or after T/A cloning in pCR2.1 (Invitrogen). In the latter case, several independent clones from independent amplifications were characterized by Dye Primer Cycle Sequencing.
  • a NotI-EcoRI fragment from a 5'-RACE clone, an EcoRI-PstI fragment from pKGP, and a PstI-BamHI fragment from a 3'-RACE clone were ligated together in pBluescript.
  • a NotI-NotI fragment containing GPC4 was isolated from this construct and ligated in pcDNAIII and pREP4, yielding glyp4-pcDNAIII and glyp4-pREP4, respectively.
  • Namalwa cells (ATCC CRL 1432) were routinely grown in DMEF12 medium supplemented with 10% FCS and
  • the cells were prewashed with Ca- and Mg-free PBS and incubated for 10 min at 4 °C (10^7 cells in 1 ml Ca/Mg-free PBS) with 30 µg glyp4-pREP4 plasmid before electroporation at 240 V and 960 µF (Gene Pulser, Biorad). Selection was started 48 h later with 250 µg/ml of hygromycin B. Stable transfection was achieved after 12 days.
  • 26 pcDNAIII selected in media containing 400 µg/ml of G418, and panned on 10E4 antibody.
  • AATAAA sequence potential polyadenylation signal
  • the protein sequence starts and terminates with hydrophobic signal peptide-like sequences. It contains three serine-glycine dipeptide sequences. All three Ser-Gly dipeptide sequences occur towards the C-terminus of the protein, and two of these form part of a direct Ser-Gly repeat sequence. These Ser-Gly sequences are flanked, both upstream and downstream, by acidic amino acids (D/E), reproducing a motif that has been reported to promote the assembly of heparan sulfate in proteoglycans (Zhang et al., 1995). Because of the presence of three Ser-Gly repeats, glypican-4 would be predicted to have up to three heparan sulfate chains implanted on its core protein.
  • D/E acidic amino acids
  • the acidic residue downstream of the Ser-Gly repeat occurs within the sequence CEYQQC, and may reproduce a motif (a small acidic loop supported by a disulfide bond) that is shared by most glypicans (except glypican-2) .
  • This loop follows the SG repeats in the glypicans -1, -4 and -6, but interrupts or precedes the SG repeats in the glypicans -3 and -5.
  • both probes, corresponding to either residues 1-386 or residues 1148-2291 of the GPC4 sequence, detected two messages, one of 2.9 kb and one of 4.3 kb.
  • the messages were expressed in several, but not all, of the human fetal and adult tissues tested (see figure 11).
  • the origin of these two bands is not known, but could be due to the alternative usage of multiple polyadenylation signals, alternative splicing or, less likely, cross-hybridization with messages for other (possibly yet to be identified) members of the glypican gene family.
  • In fetal tissues the messages were expressed in brain, kidney and lung, but were barely detectable in liver.
  • the glypican-4 insert was subcloned in the pREP4 episomal expression vector and transfected in Namalwa cells.
  • the Namalwa cells used for these experiments had previously been shown to express little endogenous heparan sulfate, but to support the synthesis of large amounts of heparan sulfate when transfected with cDNAs (cloned in pREP4) that code for syndecans or glypican-1.
  • Three BACs were identified for GPC4: BAC 35H9, BAC 151D8, and BAC 68G12. BACs 35H9 and 151D8 contained exons 2 to 9 of GPC4, while BAC 68G12 contained exon-1 of GPC4.
  • FISH performed on metaphase chromosomes, localized all BACs for GPC4 to Xq26 (figure 12) . Since GPC3 had also been localized to chromosome band Xq26 (Pilia et al .
  • GPC3 (closely related to GPC5) and GPC4 (closely related to GPC6) probably mapped in proximity of one another on Xq26, mimicking the clustering of the GPC5 and GPC6 genes on chromosome 13q32 (see example 1) .
  • SGBS Simpson-Golabi-Behmel
  • FIG. 13 shows the BAC/PAC contig containing the entire GPC4 gene and linking both GPC3 and GPC4. This contig indicates that both glypicans form a tandem array with exon-1 and the promoter region of GPC4 lying adjacent to the last exon of GPC3.
  • the GPC4 gene exon/intron structure is schematically shown in Table 7.
  • Genomic DNA was obtained from one newly identified patient, counseled at the Center for Human Genetics (CME) of the University of Leuven (with informed consent from the parents) ; from the lymphoblastoid cell lines AG0817, AG0857, AG0893, AG0946, AG0969, and FY0367 (database IDs) from the
  • ECACC European Collection of Cell Cultures
  • ATCC fibroblastic cell lines GM13034, GM3884, GM0097
  • All patient DNAs were analyzed by PCR.
  • the reaction cycles were: 94 °C for 30 sec, annealing temperature for 30 sec, 72 °C for 30 sec, for 35 cycles. Cycling was preceded by a 2.5 minute incubation at 94 °C. Primers and annealing temperatures are given in Table 8.
  • the reaction products were analyzed by electrophoresis in 2% agarose gels.
  • PCR primer pairs were designed for amplification of all exons of GPC4 (including exon/intron boundaries, Table 11) and the corresponding PCR products were analyzed for single-strand conformation polymorphisms (SSCP) in non-denaturing polyacrylamide gels as described previously (Matthijs et al., 1997). PCR products with variant SSCs and controls were either directly sequenced after gel purification or T/A cloned in pCR2.1 (Invitrogen). Several independent clones from independent amplifications were characterized by Dye Primer Cycle Sequencing.
  • SSCP single-strand conformation polymorphisms
  • W296 corresponds to one of the residues that are strictly conserved in all glypicans identified so far.
  • Deletion of one T nucleotide (del T875) leading to a frame shift mutation and termination, was the basis for the variant SSC of exon 3 in a third patient (b) .
  • SSCA of the GPC4 exons revealed polymorphisms for exons 7 and 8, in one and the same patient. Sequencing of the PCR product of exon-7 in this patient identified a G>T mutation leading to a substitution of D391 by E in glypican-4 (c).
  • GPC-4 Ex-1 5'-ggggCATCgTTCTTgTTgAA PR07-44 F-E (AS)
  • GPC-4 Ex-2 5'-TCATCAAACTTCTTgTAACg PR10-26 F-E (AS)
  • GPC-4 Ex-3 5'-AAGTGGTACTGGGAGTTCAC PR10-50
  • F-E (AS) GPC-4 Ex-4: 5'-CTCCAGCAAGCACAAATATG PR06-40
  • F-E (AS) GPC-4 Ex-5: 5'-CTTCTgAgACACTTgAACAC PR06-41
  • GPC-4 Ex-6 5'-CCAgTCggTCCAAACTAgTg PR09-11
  • GPC-4 Ex-8 5'-AgTCCACgTCgTTCCCATTg PR09-10
  • F-E (AS) GPC-4 Ex-9: 5'-CTCCACTCTCTCTg

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Toxicology (AREA)
  • Pharmaceuticals Containing Other Organic And Inorganic Compounds (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention relates to a novel polynucleotide encoding a new glypican-related protein ('glypican-6') and the gene for glypican-4 as well as derivatives of both genes for use in methods of diagnosis and therapy. Derivatives comprise for example fragments of the gene either isolated or synthetic and having a length that is smaller than the complete gene; primers, comprising at least 10 consecutive gene specific nucleotides, preferably about 20 gene specific consecutive nucleotides of the nucleotide sequence of the gene; longer oligonucleotides up to the full length of the gene; antisense variants of the gene, the fragments or the primers; antibodies directed to the gene, fragments, primers or complementary strands thereof; any specific ligand for DNA that can be used as a specific probe, peptide nucleic acid probes.

Description

NEW MEMBERS OF THE GLYPICAN GENE FAMILY
The present invention relates to the characterization and chromosomal localization of new members of the glypican gene family and to the use of members of this family in diagnostics and/or therapeutics.
Glypicans are glypiated cell surface heparan sulfate proteoglycans, the first of which was originally identified in human lung fibroblasts (David et al., 1990). The five known members of this family have similar core protein sizes (about 60 kDa), share a unique and highly conserved cysteine spacing, and are linked to the cell membrane by a glycosyl phosphatidyl inositol (GPI) anchor. These five known glypicans are of vertebrate origin and include glypican (glypican-1, David et al., supra), cerebroglycan (glypican-2, Stipp et al., 1994), OCI-5 (glypican-3, Filmus et al., 1995), K-glypican (glypican-4, Watanabe et al., 1995) and glypican-5 (Veugelers et al., 1997).
All the structural features of the vertebrate glypicans are also present in the product of dally (division abnormally delayed), a locus identified in Drosophila melanogaster by genetic screening for mutants affecting cell division patterning in the developing central nervous system (Nakato et al., 1995). Besides disturbing cell cycling in the nervous system, dally mutations also affect viability and produce morphological defects in several adult tissues, including the eyes, antennae, wings and genitalia. The dally mutants, the well established co-receptor activities of the cell surface proteoglycans for various ligands that are known to mediate developmental instructions, and the tissue- and stage-specific expression of the glypicans all implicate the glypican group of integral membrane proteoglycans in the control of cell division and patterning during development. This contention has recently been corroborated by the identification of mutations in GPC3, the gene coding for the human homologue of OCI-5 (glypican-3), that cause the Simpson-Golabi-Behmel overgrowth syndrome (SGBS) (Pilia et al., 1996). This X-linked condition, which clinically has to be differentiated from the autosomal Beckwith-Wiedemann syndrome, is characterized by pre- and post-natal overgrowth with visceral and skeletal anomalies, and is associated with a high risk for developing embryonal tumors, including Wilms' tumor and neuroblastoma.
It was therefore anticipated by the present inventors that chromosomal assignment of the genes for the members of the glypican family and the identification of potentially additional members in this family may be of general relevance for the understanding of somatic overgrowth and tumor predisposition. So far, only glypican, the homologue of OCI-5 (glypican-3) and glypican-5 have been identified in humans. The corresponding genes GPC1, GPC3 and GPC5 have been localized to chromosomes 2q35-q37, Xq26 and 13q32, respectively (Vermeesch et al., 1995; Pilia et al., 1996; Veugelers et al., 1997). The cDNA nucleotide and derived amino acid sequences of these genes are given in figures 3, 4 and 5, respectively.
It is thus the object of the present invention to provide new members of the glypican family, and to study their possible implications in various medical indications. It is a further object of the invention to use the information derivable from the members of the glypican gene family for designing diagnostic methods and kits and/or for the development of therapeutics.
In the research that led to the present invention two novel human cDNAs were identified encoding glypican-related proteins. The corresponding gene for the first was mapped to chromosome 13q32. In this application this gene will be identified as GPC6, whereas the protein encoded by the gene will be called glypican-6. The predicted primary structure of the GPC6 protein was found to show significant sequence similarity to glypican (glypican-1), to the human homologue of OCI-5 (glypican-3), to glypican-5, to the glypican-related proteins cerebroglycan (glypican-2 of the rat) and K-glypican (glypican-4 of the mouse), and to the gene product of the dally locus in Drosophila melanogaster. The similarity pertains to a conserved sequence motif that is present in all seven proteins and that includes a set of 14 conserved cysteine residues found in specific positions. Additional similarities include the overall sizes of the proteins, the presence of N-terminal and C-terminal signal peptide-like sequences (the first predicted to be involved in the membrane translocation of the nascent polypeptide, the second in the temporary membrane anchorage and subsequent glypiation of the proteins), and the presence of glycosaminoglycan attachment consensus sequences close to the C-termini of the proteins.
Glypican-6 is, however, more similar to K-glypican and human glypican-4 (see below) than to the other glypicans. As to these other glypicans, glypican-6 is more similar to glypican-1 and glypican-2 than to glypican-3 and glypican-5. The gene encoding glypican-6 (GPC6) has a similar exon-intron organization to the gene encoding glypican-4 (GPC4, as was now found according to the invention) and the gene encoding glypican-1 (GPC1). This organization differs from the exon-intron organization of the gene encoding glypican-3 (GPC3) and that of the gene encoding glypican-5 (GPC5), while GPC3 and GPC5, in turn, resemble one another in terms of their intron-exon organizations. This indicates the possible existence of (at least two) glypican subfamilies: one comprising, so far, the glypicans 1, 2, 4 and 6; the other comprising, so far, the glypicans 3 and 5.
According to a further aspect of the invention, a cDNA is provided that encodes the human homologue of K-glypican (glypican-4), the corresponding gene of which localizes to chromosome Xq26 in very close proximity to the gene for glypican-3. Thus the GPC3 and GPC4 genes are adjacent, or near adjacent, to one another on chromosome Xq26, while the GPC5 and GPC6 genes are adjacent, or near adjacent, to one another on chromosome 13q32. In these two examples a member (GPC3, respectively GPC5) of one of the glypican subfamilies is physically linked to a member (GPC4, respectively GPC6) of another glypican subfamily. This indicates that the glypican subfamilies and various members of these families may have arisen from the duplications of one ancestral glypican gene and ancestral gene cluster. Maintenance of the physical associations between these genes during evolution suggests that these genes may also be functionally linked.
Furthermore, it was found according to the invention that besides the gene for glypican-3 (GPC3) also the gene for glypican-4 (GPC4) may show aberrations in patients having disorders and diseases involving abnormal cell growth and behavior, like somatic overgrowth and tumor formation. With this information various diagnostic tests could be developed in order to detect aberrations in the genes that encode glypicans and aberrations in the expression levels of these genes. Moreover, this knowledge can be used to develop therapeutic compounds that restore the physical damage caused by the mutant gene.
The aberrations in the gene comprise for example deletions or translocations within either or both of the two genes, but also mutations in either or both of them. These aberrations may lead to the absence of gene products or to abnormal gene products. Thus, the expression level of the gene may be used as another parameter indicating the presence of one or more aberrant genes.
In the research that led to the invention various aberrations have been found in patients suffering from Simpson-Golabi-Behmel syndrome, including deletion of the entire GPC4 gene and exons 7 and 8 of GPC3; deletion of exons 1 and 2 of GPC3; a T>A mutation in exon 3 of GPC3 leading to a W296>R substitution in glypican-3; a C>T mutation in exon 8 of GPC4 leading to an A42>V substitution in glypican-4; and deletion of one dT nucleotide in GPC3 resulting in a frame shift mutation and premature termination of the protein.
The finding for GPC3 and GPC4 can be extrapolated to other members of the glypican gene family. GPC5 and GPC6 are likewise associated. Aberrations in these genes can be identified in a similar manner as herein described for GPC3 and GPC4.
Various molecular biological techniques can be used to find these types of aberration in the genes. These techniques include but are not limited to single-strand conformation polymorphism (SSCP) screening, restriction fragment length polymorphism (RFLP) screening, gel electrophoresis, Southern blot analysis, PCR, DNA sequencing, etc.
Diagnostic methods according to the invention are based on the information derivable from the gene and/or its gene product. Such information comprises the nucleotide sequence, either sense or antisense, of the gene and the complementary strand thereof, and the amino acid sequence of the gene product encoded by the coding sequence of the gene. The information derivable from the gene or gene product can very well be defined by a person skilled in the art by referring to figures 1 to 6. Figures 1 to 5 disclose the nucleotide sequence of the human cDNAs for glypicans 1 and 3 to 6, as well as the derived amino acid sequence of the protein encoded by the cDNA. Figure 6 gives an alignment of the predicted amino acid sequences and the position of the exon boundaries for each of them. This information can be used to define so-called derivatives.
Derivatives of the nucleotide sequence of the gene are the gene itself, either isolated or synthetic, fragments of the gene, either isolated or synthetic and having a length that is smaller than the complete gene; primers, comprising at least 10 consecutive gene specific nucleotides, preferably about 20 gene specific consecutive nucleotides of the nucleotide sequence of the gene; longer oligonucleotides up to the full length of the sequence of the gene; antisense variants of the gene, the fragments or the primers; antibodies directed to the gene, fragments, primers or complementary strands thereof; any specific ligand for DNA that can be used as a specific probe, peptide nucleic acid probes.
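As a hedged illustration of the primer and antisense derivatives described above, the following Python sketch enumerates every window of a chosen length (default 20, minimum 10 nucleotides) from a nucleotide sequence and derives the antisense (reverse-complement) variant of a sequence. The function names and the short input strings are assumptions made for illustration only; whether a given window is actually gene specific would still have to be checked against other sequences, which this sketch does not attempt.

```python
# Minimal sketch, not the method of the application: helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the antisense (reverse-complement) variant of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_primers(seq: str, length: int = 20, min_length: int = 10) -> list[str]:
    """List every window of `length` consecutive nucleotides in `seq`.

    The 10-nucleotide floor mirrors the "at least 10 consecutive gene
    specific nucleotides" wording; specificity itself is not tested here.
    """
    if length < min_length:
        raise ValueError(f"primer length must be at least {min_length}")
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


if __name__ == "__main__":
    toy = "ACGTACGTACGTTTGACCAGT"  # made-up sequence, not taken from any figure
    for primer in candidate_primers(toy, length=20):
        print(primer, reverse_complement(primer))
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check when preparing sense and antisense variants.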
Other derivatives are transcripts (mRNA sequences) of the gene, from which in turn cDNA, antisense RNA, antisense cDNA, antibodies directed to the transcript, sense and antisense cDNA, antisense RNA and any specific ligand for RNA that can be used as a specific probe can be derived.
Derivatives of the amino acid sequence include the isolated or synthetic gene product (also called protein or polypeptide); isolated or synthetic peptides, comprising a specific sequence of consecutive amino acids encoded by the gene, antibodies directed to the gene product or peptides and any specific ligand for peptides that can be used as a specific probe. Other derivatives are heparan sulfate chains or heparan sulfate structures, antibodies directed to heparan sulfate structures present on the product of the natural or synthetic gene as a result of the posttranslational modification of these gene products, any specific ligand for heparan sulfate that can be used as a specific probe.
Furthermore, the gene or cDNA may be used for the transfection of cells, which transfection results in cells expressing or secreting the desired glypican. The transfected cells can be used to produce transgenic animals therefrom, which in case the gene is an aberrant gene, can be used to study the effect of the aberration or to test medicaments. Alternatively, natural glypicans may be isolated or recombinant (wild type or mutated) glypicans produced (in transfected cells or transgenic animals) for use as therapeuticals. Such therapeuticals may be used to mimic the biological effects of the glypicans (control of cell growth and differentiation), in attempts to remedy the effects of absolute or relative deficiencies of these genes or to enhance the effects of the normal genes. With the appropriate modifications of the glypican gene sequences, therapeuticals based on modified glypican gene sequences may also be used to block the effects of the glypicans, in attempts to remedy the effects of absolute or relative overexpressions or activities of the products of the various glypican genes. As a non-limiting example, recombinant soluble glypicans may be used as decoy receptors for antagonizing the effects of factors that depend on membrane-anchored glypicans, whereas the delivery of membrane-intercalatable glypicans to cells may restore cellular sensitivity to these factors.
Diagnostic methods according to the invention comprise but are not limited to the following.
A method for diagnosing aberrations in a glypican encoding gene comprises isolation of the gene from cells expected to be harboring an aberrant gene, and comparison of the nucleotide sequence of the gene thus obtained with the nucleotide sequence of a wild type gene. The term "wild type gene" as used in this application is intended to encompass a gene from a non-affected individual. The sequences given in figures 1 to 5 are representatives for wild type gene sequences.
Comparison between the potentially aberrant and wild type nucleotide sequences can be performed at various levels. On a first level it can be established whether the expected aberration(s) has (have) resulted in restriction fragment length polymorphism. In order to do this the isolated gene and a wild type comparison gene are separately digested with one or more selected restriction enzymes. The digest thus obtained is separated on a gel revealing a pattern of bands. Differences in the pattern indicate the presence of differences in the restriction sites present in the polynucleotide and thus changes in the sequence thereof. Deletions can be detected by means of any nucleic acid amplification technique such as the Polymerase Chain Reaction (PCR). For this, probes are identified corresponding to various parts of the gene to be diagnosed, for example exons. Amplification between a set of probes will only occur if the part of the gene to which a selected set of probes should hybridize is still present. In addition or as an alternative the length of the amplified fragment indicates whether any part is deleted.
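The restriction fragment comparison described above can be sketched in silico. The Python example below is a minimal sketch under stated assumptions: it cuts a linear sequence at every occurrence of a recognition site and compares the resulting fragment lengths, which correspond to the band pattern on a gel. The function names and the two toy sequences are hypothetical; the only external fact used is that EcoRI recognizes GAATTC and cuts after the first G.

```python
# Minimal sketch of an in-silico RFLP comparison; names and sequences are assumptions.

def fragment_lengths(seq: str, site: str, cut_offset: int = 0) -> list[int]:
    """Fragment lengths (longest first) after cutting a linear sequence
    at every occurrence of `site`, `cut_offset` bases into the site."""
    seq, site = seq.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    lengths = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(lengths, reverse=True)


def rflp_differs(wild_type: str, patient: str, site: str, cut_offset: int = 0) -> bool:
    """True when the two digests would give different band patterns."""
    return (fragment_lengths(wild_type, site, cut_offset)
            != fragment_lengths(patient, site, cut_offset))


if __name__ == "__main__":
    ecori = "GAATTC"                       # EcoRI site, cut as G^AATTC (offset 1)
    wild_type = "AAAAGAATTCTTTTTT"         # toy sequence with one EcoRI site
    patient = "AAAAGAGTTCTTTTTT"           # toy variant: the site is destroyed
    print(fragment_lengths(wild_type, ecori, 1))
    print(fragment_lengths(patient, ecori, 1))
    print(rflp_differs(wild_type, patient, ecori, 1))
```

A point change that neither creates nor destroys a site leaves the pattern unchanged, which is why the text goes on to list more sensitive methods for point mutations.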
Point mutations can be identified by more sophisticated techniques such as SSCP (single-strand conformation polymorphism screening) , heteroduplex analyses, DNA-chips, chemical and enzymatic methods, sequencing of PCR products, denaturant gradient gel electrophoresis or other state of the art methods that may become available in the future.
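Once a PCR product has been sequenced, a point mutation can be located by comparing it with the wild type coding sequence. The Python sketch below is an illustration only: it assumes two in-frame coding sequences of equal length, uses the standard genetic code, and reports each substitution in a notation similar to the "W296>R" style used in this application. The function name and the nine-nucleotide toy sequences are hypothetical and are not taken from the figures.

```python
# Minimal sketch: report nucleotide substitutions and the resulting residue change.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def point_substitutions(wild_type: str, patient: str) -> list[tuple[int, str, str]]:
    """Compare two equal-length, in-frame coding sequences.

    Returns (1-based nucleotide position, nucleotide change, residue change)
    for every mismatch. Insertions and deletions are out of scope here.
    """
    wild_type, patient = wild_type.upper(), patient.upper()
    if len(wild_type) != len(patient):
        raise ValueError("sequences must have equal length (no indels handled)")
    changes = []
    for i, (w, p) in enumerate(zip(wild_type, patient)):
        if w == p:
            continue
        start = i - i % 3                      # first base of the affected codon
        wt_aa = CODON_TABLE.get(wild_type[start:start + 3], "?")
        pt_aa = CODON_TABLE.get(patient[start:start + 3], "?")
        residue = start // 3 + 1               # 1-based residue number
        changes.append((i + 1, f"{w}>{p}", f"{wt_aa}{residue}>{pt_aa}"))
    return changes


if __name__ == "__main__":
    # Toy example: TGG (Trp) becomes AGG (Arg) through a T>A change.
    print(point_substitutions("ATGTGGGAT", "ATGAGGGAT"))
```

A single-nucleotide deletion, such as the frame shift mentioned in this application, changes the sequence length and would need an alignment step before this comparison.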
Other diagnostic methods comprise the in situ detection of physical changes like translocations, inversions or deletions. Translocations can be detected by hybridizing a set of chromosomes with a first probe that hybridizes to a part of the glypican gene that is not likely to be involved in the translocation, inversion or deletion and a second probe that hybridizes to a part of the glypican gene that is likely to be involved in such aberration. When a translocation has occurred the second probe will be found on another chromosome than the first one. If the probable translocation partner is identified, an additional set of probes can be used which hybridize to the translocated part and the remaining part, respectively of the translocation partner and bearing a different label from the first set of probes. Upon translocation one of the probes of the first set will be found on the chromosome of the translocation partner and one probe of the second set will be found on the chromosome of the glypican gene and vice versa.
Identifying inversions and deletions works in a similar way with two probes, one that hybridizes to a part of the gene that is not likely to be involved in the inversion or deletion and a second probe that is likely to be involved in such aberration. In case of an inversion the second probe will be found closer to or further away from the first probe than in a non-aberrant chromosome. In the case of deletions one of the probes will be missing on the aberrant chromosome. The description and examples of this application give various examples of "parts of the gene that are likely to be involved in an aberration" .
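The two-probe logic described above for translocations, inversions and deletions can be summarized as a small decision procedure. The Python sketch below is a simplified illustration, not a validated scoring scheme: the `ProbeSignal` type, the `classify_signals` function, the arbitrary position units and the 10% tolerance are all assumptions introduced here.

```python
# Minimal sketch of interpreting two in situ hybridization probe signals.
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeSignal:
    chromosome: Optional[str]   # None when no signal is observed
    position: float = 0.0       # location along the chromosome, arbitrary units


def classify_signals(first: ProbeSignal, second: ProbeSignal,
                     expected_distance: float, tolerance: float = 0.1) -> str:
    """Classify an aberration from a reference probe (`first`) and a probe
    on the part of the gene likely to be involved (`second`)."""
    if first.chromosome is None or second.chromosome is None:
        return "deletion"                       # one probe is missing
    if first.chromosome != second.chromosome:
        return "translocation"                  # probes on different chromosomes
    observed = abs(first.position - second.position)
    if abs(observed - expected_distance) > tolerance * expected_distance:
        return "inversion"                      # probes closer or further apart
    return "no aberration detected"


if __name__ == "__main__":
    reference = ProbeSignal("X", 10.0)
    print(classify_signals(reference, ProbeSignal("X", 12.0), expected_distance=2.0))
    print(classify_signals(reference, ProbeSignal("13", 5.0), expected_distance=2.0))
    print(classify_signals(reference, ProbeSignal("X", 15.0), expected_distance=2.0))
    print(classify_signals(reference, ProbeSignal(None), expected_distance=2.0))
```

In practice the spacing of signals on metaphase chromosomes is judged visually, so the numeric tolerance here only stands in for that judgment.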
Diagnosis can also be performed at the level of the (potentially absent or aberrant) protein encoded by the glypican gene. Antibodies directed to the gene product or protein can be used on Western blots to detect the presence of the protein in the cell or to assess the amount of protein present.
The diagnostic tests of the invention can be performed on various source materials. RFLP, deletion PCR, SSCP and chromosome analyses are for example performed on blood cells or tissue biopsy samples of the patient and his or her family. Furthermore, tumor cells and normal cells of these subjects may be used. For protein analysis, tissue samples, sera, tissue fluids of patients and family, pleura exudates, ascites etc. may be used.
All these diagnostic methods are based on the information that can be derived from the various genes and the gene products thereof. This information is given in the following figures.
Figure 1 shows the nucleotide sequence of the glypican-6 cDNA, comprising the coding sequence of the newly identified GPC6 gene, and the predicted amino acid sequence.
Figure 2 shows the nucleotide sequence of the glypican-4 cDNA, comprising the coding sequence of the GPC4 gene, and the predicted amino acid sequence.
Figure 3 shows the nucleotide sequence of the human glypican-1 cDNA, comprising the coding sequence of the GPC1 gene, and the predicted amino acid sequence (Genbank accession number X54232; David et al., J. Cell Biol. 111, 3165-3176, 1990).
Figure 4 shows the nucleotide sequence of the human glypican-3 cDNA, comprising the coding sequence of the GPC3 gene, and the predicted amino acid sequence (Genbank accession number Z37987).
Figure 5 shows the nucleotide sequence of the human glypican-5 cDNA, comprising the coding sequence of the GPC5 gene, and the predicted amino acid sequence (Genbank accession number U66033; Veugelers et al., Genomics 40, 24-30, 1997).
Figure 6 shows the alignment of the predicted amino acid sequences of the members of the glypican family. GPC1 is human glypican (David et al., 1990), GPC3 is the translated ORF of MXR7 (GenBank #Z37987) and the human homologue of rat OCI-5, GPC4 is human glypican-4 (see example 2) and the human homologue of K-glypican (Watanabe et al., 1995), GPC5 is human glypican-5 (Veugelers et al., 1997). The set of fourteen conserved cysteines and the putative glycosaminoglycan attachment sites are outlined by underlining. Serines occurring in SGXG sequence contexts are indicated in bold. Alignment was done using the Clustal V program (Higgins, 1994). Single carets under the sequences indicate exon-intron boundaries occurring within codons; double carets indicate exon-intron boundaries occurring between codons.
In the following examples reference is made to the following additional figures.
Figure 7A shows a Northern blot for GPC6 of human fetal Brain (lane 1), Lung (lane 2), Liver (lane 3) and Kidney (lane 4) RNA. The positions of RNA size markers (kb) are indicated in the abscissa.
Figure 7B shows a Northern blot for GPC6 of human adult Heart (lane 1), Brain (lane 2), Placenta (lane 3), Lung (lane 4), Liver (lane 5), Skeletal Muscle (lane 6), Kidney (lane 7) and Pancreas (lane 8) RNA.
Figure 7C shows a Northern blot for GPC6 of human adult Spleen (lane 1), Thymus (lane 2), Prostate (lane 3), Testis (lane 4), Ovary (lane 5), Small Intestine (lane 6), Colon Mucosal lining (lane 7), and Peripheral Blood Leukocyte (lane 8) RNA. The positions of RNA size markers (kb) are indicated in the abscissa.
Figure 8 shows heparan sulfate core protein expression in control and GPC-transfectant Namalwa cells. Western blotting of non-digested, heparitinase-digested, doubly heparitinase- and chondroitinase ABC-digested, and chondroitinase ABC-digested proteoglycan fractions, using the delta-HS-specific mAb 3G10. This antibody reacts with the desaturated uronates that are generated by heparitinase and that remain in association with the core protein after the enzyme treatment and during electrophoresis. After heparitinase it therefore detects all heparan sulfate proteoglycan core proteins present in the sample. The positions of protein size markers are indicated in the abscissa. Hase: heparitinase; Case: chondroitinase ABC. Control: wild type Namalwa cells (wt) and Namalwa cells transfected with pREP4 0; GPC1, GPC4 and GPC6: Namalwa cells transfected with, respectively, the plasmids glyp1-pREP4, glyp4-pREP4 and glyp6-pREP4.
Figure 9 shows heparan sulfate expression in control and GPC-transfectant Namalwa cells. FACS analysis of non-digested and heparitinase-digested cells, using the native HS-specific 10E4 antibody (non-digested cells (---)) and the delta-HS-specific 3G10 antibody (digested cells ( )). Control: Namalwa cells transfected with pREP4; GPC1, GPC3, GPC4, GPC5 and GPC6: Namalwa cells transfected with, respectively, the plasmids glyp1-pREP4, glyp3-pREP4, glyp4-pREP4, glyp5-pREP4 and glyp6-pREP4.
Figure 10 shows the chromosomal localization of GPC6 to chromosome band 13q32 as a photo (A) and a schematic representation thereof (B). Arrows indicate colored bands.
Figure 11A shows the Northern blot for GPC4 of human fetal kidney (lane 1), liver (lane 2), lung (lane 3) and brain (lane 4) RNA. The upper part of the figures represents the hybridization with the GPC4 probe; the lower part the hybridization with a β-actin control probe.
Figure 11B shows the Northern blot for GPC4 of human adult heart (lane 1), brain (lane 2), placenta (lane 3), lung (lane 4), liver (lane 5), skeletal muscle (lane 6), kidney (lane 7) and pancreas (lane 8) RNA.
Figure 11C shows the Northern blot for GPC4 of human adult spleen (lane 1), thymus (lane 2), prostate (lane 3), testis (lane 4), ovary (lane 5), small intestine (lane 6), colon mucosal lining (lane 7) and peripheral blood leukocyte (lane 8) RNA. The positions of RNA size markers (kb) are indicated in the abscissa.
Figure 12A illustrates the chromosomal localization of GPC4 to chromosome Xq26 and relative order of GPC3 and GPC4. For initial chromosomal localization of GPC4, FISH was performed using either BAC 35H9 or BAC 68G14 on metaphase spreads prepared from PHA-stimulated normal peripheral blood leukocytes (Figure 12A). For relative ordering of the GPC genes, FISH was performed with BACs for GPC4 (35H9 and 68G14, labeled in red) and BACs for GPC3 (166D10 and 36D20, labeled in green) on the PHA-stimulated cell lines GM3884, GM13034 and GM0097 (Figures 12B, 12C and 12D). Chromosomes were counterstained with DAPI, and the images were taken using a cooled CCD device. Arrows indicate the positive signals at chromosome Xq26.
Figure 13A and 13B show a BAC/PAC contig linking GPC4 to GPC3 on Xq26. Figure 13C shows glypican deletions found in SGBS patients. STSs are indicated by black circles; exons are indicated by grey squares. Not drawn to scale (the distance between SWXD1698 and exon-8 of GPC3 is approximately 250 kb; see Pilia et al., 1996).
The following tables, which precede the references, give inter alia suitable primers for use in the various diagnostic methods:
Table 1 shows the primers used in 5'-RACE experiments for the identification of GPC6.
Table 2 shows the percentages of amino acid identities between glypicans.
Table 3 shows the primers used in the RACE experiments with GPC4.
Table 4 shows the gene specific primers used for sequencing of the GPC4 gene.
Table 5 shows the novel STSs MV1, MV2 and MV3.
Table 6 shows localization of FISH signals in SGBS patients.
Table 7 shows the intron-exon organization of GPC4.
Table 8 shows primers to be used in deletion analysis of GPC3 and GPC4.
Table 9 shows the primers for use in SSCA of GPC4.
Table 10 shows the results of deletion and SSCP screening in 8 patients with SGBS.
Table 11 shows primer pairs for deletion PCR of GPC5.
Table 12 shows primer pairs for deletion PCR of GPC6.
The present invention will be further illustrated in the accompanying examples that are in no way intended as a limitation of the present invention.
EXAMPLES
EXAMPLE 1
Identification and characterization of glypican-6
1.1 Introduction
The isolation of one of the cDNAs of the present invention (GPC6) started from EST database entries showing significant similarity with (cDNA coding for) glypicans. The cDNA was found in a cDNA library of fetal brain.
1.2 Materials and methods
1.2.1 Bioinformatics
EST entries (including homology data) were retrieved from dbEST using a text-string-based query interface (http://www.ncbi.nlm.nih.gov/dbEST/index.html). Protein alignments were made using the program Clustal. DNA alignments were made using the program GENEPRO.
1.2.2 Molecular cloning of human GPC6
The primer sets used for the 5'-RACEs (5' rapid amplification of cDNA ends), corresponding to two GPC-like ESTs (GenBank Accession Nos. N87558 and AA001322), are given in Table 1. The cDNAs were amplified from a library of adaptor-ligated double-stranded human fetal brain cDNA (Clontech, Palo Alto, CA) through a two-step PCR protocol. The first PCR used a gene-specific primer and an anchor primer provided by the supplier. Then 1 μl of each first PCR was used as template for a second PCR, using a second gene-specific nested primer (cf. Table 1) and a nested anchor primer provided by the supplier. The products of the second step were analyzed by electrophoresis in a 0.6% agarose gel. Distinct bands were gel purified using the QIQUICK II DNA clean-up System (Qiagen), T/A-cloned in the plasmid pCR2.1 (Invitrogen), and sequenced using a Pharmacia A.L.F. DNA Sequencer, with Dye Primer Cycle Sequencing chemistry on double-stranded plasmid templates. In total, 5 independent clones from two separate 5'-RACE experiments (2 from the 5'-RACE-1 and 3 from the 5'-RACE-2 experiment) were sequenced.
Clone zh83a06 from the Soares fetal liver/spleen library (Lennon et al., 1996), which had yielded EST No. AA001322, was obtained from the I.M.A.G.E. Consortium (http://www-bio.llnl.gov/bbrp/image/image.html) through Research Genetics, Inc. (Huntsville, AL). This clone (ID: 427858) was completely sequenced, yielding residues 1835-2748 of the composite cDNA sequence that is shown in figure 1.
1.2.3 BAC cloning and chromosomal localization of GPC6 by FISH
To isolate genomic clones for GPC6, part of the cDNA insert from I.M.A.G.E.-clone 427858 was isolated with EcoRI/PstI, gel purified, labeled and used to screen a human genomic BAC library (Research Genetics, Inc., Huntsville, AL). Two BACs, 114A17 and 182F5, were isolated. Their authenticity was verified by Southern blotting. These results indicated that the BACs 114A17 and 182F5 contain exons 6-9 of GPC6. BAC DNA was labeled with bio-16-dUTP (Sigma) by nick-translation using a commercial kit (Life Technologies, Gaithersburg, MD). Metaphase spreads were prepared from PHA-stimulated human peripheral blood lymphocytes cultured for 72 hours. Prior to FISH, slides were treated with RNAse A and pepsin as described (Wiegant et al., 1991). Human Cot1 DNA (Life Technologies) was used as a competitor. Denaturation of the slides and probes, hybridization, and subsequent cytochemical detection of the hybridization signals were performed as previously described (Vermeesch et al., 1995). Chromosomes were counterstained with DAPI and the slides were mounted in Vectashield mounting medium (Vector Laboratories Inc, Burlingame, CA). The signal was visualized by digital imaging microscopy using a cooled charge-coupled device camera (Photometrics Ltd, Tucson, AZ). Merging and pseudocoloring were performed using the Smart Capture software (Vysis, Stuttgart, Germany).
1.2.4 Northern blotting
The membranes for the Northern blots were obtained from Clontech. Hybridization was performed for two hours at 68°C, using Expresshyb solution (Clontech) according to the manufacturer's specifications. The probe was either a 32P-oligolabeled BamHI-XbaI fragment from the I.M.A.G.E.-clone 427858 (corresponding to residues 2147-2488 of the GPC6 sequence) or a HindIII-BamHI fragment from the GPC6 composite cDNA sequence (corresponding to residues 1724-2147 of the GPC6 sequence). Dehybridisation included two washes with 2.0× SSC, 0.05% SDS (5 min at RT; 30 min at 60°C) and a high stringency wash with 0.1× SSC, 0.1% SDS (30 min at 65°C).
1.2.5 Construction of expression plasmids and cell transfection
The NotI-BstEII and BstEII-AvaI fragments from overlapping RACE-clones, and the AvaI-HindIII fragment of I.M.A.G.E.-clone 427858 were ligated together in pCR2.1. NotI-XbaI and XhoI-XbaI fragments from this construct, containing the Kozak sequence and initiator ATG, the full coding sequence and the stop codon, were subcloned in, respectively, pcDNAIII and pBluescript. For episomal expression in Namalwa cells, the full length cDNA was released from pBluescript with KpnI and NotI, and ligated into pREP4, yielding the plasmid glyp6-pREP4.
Namalwa cells (ATCC CRL 1432) were routinely grown in DMEF12 medium supplemented with 10% FCS and L-glutamine. For transfection, the cells were prewashed with Ca- and Mg-free PBS and incubated for 10 min at 4°C (10^7 cells in 1 ml Ca/Mg-free PBS) with 30 μg glyp6-pREP4 plasmid before electroporation at 240 V and 960 μF (Gene Pulser, Biorad).
Selection was started 48 h later with 250 μg/ml of hygromycin B. Stable transfection was achieved after 12 days. Expression of heparan sulfate in the transfectants was analyzed by FACS, using the HS-specific antibody 10E4 and the delta-HS-specific antibody 3G10, and by Western blotting, using the 3G10 antibody as described before (David et al., 1992; Steinfeld et al., 1996). Stably expressing transfectant Namalwa clones were isolated from cells transfected with linearised glyp6-pcDNAIII, selected in media containing 400 μg/ml of G418, and panned on 10E4 antibody. (The HS-specific antibodies 10E4 and 3G10 have been isolated, characterized and produced in our own laboratory (David et al., 1992). They are now also commercially available from Seikagaku Co.)
1.3 Results
1.3.1 Molecular cloning of glypican-6 (GPC6) , a novel glypican
During the screening of public EST databases, it was noticed that there was one EST from human fetal heart (GenBank Acc. No. N87558; clone not available) and one EST from fetal liver/spleen (GenBank Acc. No. AA001322; available as clone 427858 from the I.M.A.G.E. Consortium) with very high homology (but not identical) to human GPC1 and GPC4 (approximately 70% identity at the nucleotide- and encoded amino acid-sequence level). It was assumed that these might represent (a) novel glypican(s), and primers (annealing to regions with significant sequence divergence from GPC1 and GPC4) were designed which would amplify the corresponding cDNA(s) (see Table 1). These primers were used in RACE experiments on the same human fetal brain cDNA library that was used for the isolation of GPC5 (Veugelers et al., 1997) and human GPC4 (example 2).
From the analysis of their sequences, it appeared that all clones from these RACE experiments and the I.M.A.G.E.-clone 427858 represented overlapping cDNAs.
Figure 1 represents the merged sequences of these clones, and the predicted structure of the protein encoded by the message that corresponds to this cDNA. The sequence features an ATG start codon, in a Kozak sequence context, at position 586 and a TAA stop codon at position 2251. Two AATAAA sequences (potential polyadenylation signals) are present at positions 2598 and 2690. The open reading frame in the sequence codes for a protein of 555 residues. The protein sequence starts and terminates with hydrophobic signal peptide-like sequences. It contains no asparagines that correspond to potential N-glycosylation sites, and contains four serine-glycine dipeptide sequences. All four Ser-Gly dipeptide sequences occur towards the C-terminus of the protein, and form part of a direct Ser-Gly repeat sequence. This Ser-Gly tetra-repeat sequence is flanked, both upstream and downstream, by acidic amino acids (D/E) , reproducing a motif that has been reported to promote the assembly of heparan sulfate in proteoglycans. The downstream acidic residues occur within the sequence CMDDVC, and may reproduce a motif (a small acidic loop supported by a disulfide bond) that is shared by most glypicans (except glypican-2) . This loop follows the SG repeats in the glypicans 1, 4 and 6, but interrupts or precedes the SG repeats in the glypicans 3 and 5. Alignment of this predicted protein sequence with the protein sequences of the other known members of the glypican family (glypicans 1, 3, 4, and 5 as identified in man, and glypican-2 as identified in rat) revealed significant sequence similarities (figure 6) . This similarity included the 14 cysteines and the position and identity of several additional amino acid residues that are conserved in all glypicans identified so far. The entire protein showed 63% of sequence identity to human glypican-4, 44% of identity to human glypican-1, and 24-25% identity to the human glypicans 3 and 5. 
Comparison with rat glypican-2 showed only 41% of identity (Table 2). Since human and rodent glypican sequences (where both were available) had always proven highly similar (~90% of sequence identity), it seemed unlikely that this protein represented the human homologue of cerebroglycan (glypican-2). This protein was therefore designated as glypican-6, the sixth member of the vertebrate glypican family.
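The reported length of the open reading frame can be checked against the codon positions given in the text. The sketch below is only an illustration of that arithmetic; it assumes that the quoted positions refer to the first nucleotide of the start and stop codons in 1-based cDNA coordinates, and the function name is hypothetical.

```python
def orf_residues(atg_pos: int, stop_pos: int) -> int:
    """Count the amino acids encoded between a start and a stop codon.

    Assumption: both positions are the 1-based coordinate of the first
    nucleotide of the respective codon, so the stop codon is not counted.
    """
    coding_nt = stop_pos - atg_pos
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_nt // 3


# GPC6 (figure 1): ATG at 586, TAA at 2251
print(orf_residues(586, 2251))   # 555
# GPC4 (figure 2, example 2): ATG at 213, TAA at 1881
print(orf_residues(213, 1881))   # 556
```

Under this assumption both values agree with the protein lengths stated in the examples (555 residues for glypican-6 and 556 residues for glypican-4).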
This alignment further indicated that the glypicans 4 and 6 were more closely related to one another (63% identity) than to the other glypicans (only 20-40% identity), and that the glypicans 3 and 5 were more closely related to one another (43% identity) than to the other glypicans (~20% identity) (see Table 2). The high similarity of glypican-4 and glypican-6 became even more striking when the N-terminal and C-terminal hydrophobic signal peptide-like sequences (absent from the mature proteins) were excluded from the alignments (identity rising to 70%). These data suggest that the glypican family of cell surface heparan sulfate proteoglycans (as known today) may be composed of discrete subfamilies: one comprising glypicans 4 and 6, and possibly also glypican-1 (and 2); the second comprising glypicans 3 and 5.
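As an illustration of how a pairwise identity percentage such as those in Table 2 can be derived from two rows of a multiple alignment, a minimal sketch follows. The text does not state how gapped columns were treated; the sketch assumes that columns with a gap in either sequence are skipped, and the two short sequences are invented toy data, not glypican sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy rows of an alignment (hypothetical): 4 identical out of 6 ungapped columns
print(round(percent_identity("MKC-LDE", "MRCALDD"), 1))  # 66.7
```

Trimming the N-terminal and C-terminal signal peptide-like regions before calling such a function corresponds to the comparison of mature proteins described above.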
1.3.2 Expression of glypican-6
In Northern blotting experiments (figures 7A, 7B and 7C), two different GPC6 probes (one from the 3' UTR, corresponding to residues 2147-2488 of the sequence shown in figure 1, the other corresponding to residues 1724-2147 of the sequence shown in figure 1) detected a transcript of ~7 kb in all fetal tissues analyzed here. High levels of expression were apparent in fetal kidney, moderate levels in fetal lung and fetal liver, and a low level of expression in fetal brain (figure 7A). In adult tissues the message appears to be expressed almost ubiquitously. The message is expressed at very high levels in ovary, and at high levels in liver, kidney, small intestine and colon (mucosal lining). The message is also present at low levels in heart, brain, placenta, lung, skeletal muscle, pancreas, spleen, thymus, prostate and testis (figures 7B and C). The message is undetectable in peripheral blood leukocytes. In adult kidney the probes also detected a second, less abundant message of approximately 5.8 kb. Adult heart and adult skeletal muscle yielded an extra band of ~3.9 kb. These data indicated differential and only partially overlapping expression of the GPC6 and other GPC messages in the different tissues, further evidence that the various GPCs are distinctive transcripts.
1.3.3 Identification of glypican-6 as a heparan sulfate proteoglycan
To test if glypican-6 would drive the synthesis of heparan sulfate, the glypican-6 insert was subcloned in the pREP4 episomal expression vector and transfected in Namalwa cells. The Namalwa cells used for these experiments had previously been shown to express little endogenous heparan sulfate, but to support the synthesis of large amounts of heparan sulfate when transfected with cDNAs (cloned in pREP4) that code for syndecans or glypican-1. Analysis of heparitinase-digested proteoglycan from transfectant and control cells by Western blotting, using the delta-HS-specific mAb 3G10, confirmed that the transfectant cells expressed heparan sulfate proteoglycan core proteins of ~65 kDa (major band) and ~18-14 kDa (minor bands, possibly metabolic degradation products) that were not detectable in the control cells (figure 8). Major bands of ~65 kDa and minor bands of smaller sizes were also observed for transfectant Namalwa cells expressing glypican-1 or glypican-4 (figure 8) and glypican-3 or glypican-5 (not shown). FACS analyses of these cells with the HS-specific mAb 10E4 and, after heparitinase, with mAb 3G10 revealed a dramatic increase in the expression of cell surface HS in the transfectants (figure 9).
1.3.4 Chromosomal mapping of GPC6
Two BACs for GPC6, 114A17 and 182F5, were used to localize GPC6 to chromosome band 13q32 by fluorescent in situ hybridization on metaphase chromosomes (figure 10; BAC 114A17). From this it follows that GPC5 (closely related to GPC3) and GPC6 (closely related to GPC4) map in close proximity to one another on 13q32, mimicking the clustering of the GPC3 and GPC4 genes on chromosome Xq26 (see example 2).
EXAMPLE 2
Identification and characterization of human glypican-4
2.1 Introduction
The isolation of the cDNA that is used in the present invention for diagnostics (GPC4) started from a partial cDNA for human glypican-4. A cDNA comprising the complete coding sequences for human glypican-4 was found in a cDNA library of fetal brain.
2.2 Materials and methods
2.2.1 Bioinformatics
EST entries (including homology data) were retrieved from dbEST either through a text-string-based query interface (http://www.ncbi.nlm.nih.gov/dbEST/index.html) or by BLAST searches using the BLAST server (http://www.ncbi.nlm.nih.gov/BLAST/). Protein alignments were made using the program Clustal (Higgins et al., 1994). DNA alignments were made using the program GENEPRO.
2.2.2 Molecular cloning of human GPC4
A partial cDNA for human GPC4 was obtained by PCR on a human fetal kidney library (pKGP-PCR). The sequence of this cDNA was used to design the primers for the RACE experiments and the isolation of cDNA for the complete coding sequence of human GPC4. The 5'-RACE and 3'-RACE experiments were performed on a library of adaptor-ligated ds fetal brain cDNA, using the Marathon cDNA Amplification kit from Clontech (Palo Alto, CA). The cDNAs were amplified through a two-step PCR protocol. The first PCR used a gene-specific primer (Table 3) and an anchor primer provided by the supplier. Then 1 μl of the first PCR reaction was used as template for the second PCR reaction, using a second gene-specific nested primer and a nested anchor primer provided by the supplier. The products of the second PCR were analyzed by electrophoresis in a 0.6% agarose gel. Distinct bands were gel purified using the Qiaquick DNA clean-up system (Qiagen), T/A-cloned in the plasmid pCR2.1 (Invitrogen), and sequenced using a Pharmacia A.L.F. DNA Sequencer, with Dye Primer Cycle Sequencing chemistry on double-stranded plasmid templates. In total, 3 independent RACE clones were sequenced for the 5'-RACE and 3 independent RACE clones were sequenced for the 3'-RACE.
Additionally, clone zx12d12 from the Soares 9 week normal fetus cDNA library (Lennon et al., 1996) was obtained from the I.M.A.G.E. Consortium (http://www-bio.llnl.gov/bbrp/image/image.html) through Research Genetics, Inc. (Huntsville, AL). This clone (ID: 786263) was completely sequenced, yielding residues 1443-2315 of the composite cDNA sequence shown in figure 2. All the sequences obtained for the coding region (residues 213-1883) were derived from at least two different RACE products.
2.2.3 BAC cloning and chromosomal localization of GPC4 by FISH
To isolate genomic clones for GPC4, the pKGP-PCR probe (corresponding to residues 422-1497 of the GPC4 cDNA sequence shown in figure 2) and the NotI-BglII fragment (residues 1-386) of the GPC4 cDNA were gel purified, 32P-labelled and used to screen a human genomic BAC library (Research Genetics, Inc., Huntsville, AL). Two BACs, 35H9 and 151D8, were isolated with the PCR probe, and one BAC, 68G14, with the NotI-BglII fragment. The authenticity of these clones was verified by Southern blotting and by cycle-sequencing of the exon-intron boundaries, using gene-specific primers derived from the GPC4 cDNA sequence (Table 4). BAC DNA was labeled with bio-16-dUTP (Sigma) by nick-translation, using a commercial kit (Life Technologies, Gaithersburg, MD).
A similar strategy was used to isolate BACs for GPC3. Using cDNAs corresponding to residues 1-2300 and 1-408 of the GPC3 sequence (GenBank Acc. No. Z37987), two BACs were identified: BAC 166D10, which contained exon-3 of GPC3, and BAC 36D20, which contained exon-2 of GPC3.
Metaphase spreads were prepared from PHA-stimulated human peripheral blood lymphocytes cultured for 72 hours. Prior to FISH, slides were treated with RNAse A and pepsin as described (Wiegant et al., 1991). Human Cot1 DNA (Life Technologies) was used as a competitor. Denaturation of the slides and probes, hybridization, and subsequent cytochemical detection of the hybridization signals were performed as previously described (Vermeesch et al., 1995). Chromosomes were counterstained with DAPI and the slides were mounted in Vectashield mounting medium (Vector Laboratories Inc, Burlingame, CA). The signal was visualized by digital imaging microscopy using a cooled charge-coupled device camera (Photometrics Ltd, Tucson, AZ). Merging and pseudocoloring were performed using the Smart Capture software (Vysis, Stuttgart, Germany).
2.2.4 GPC4 gene structure and BAC/PAC contig of the GPC3/GPC4 gene cluster on Xq26
Exon-intron boundaries were determined by cycle-sequencing of BAC DNA using gene specific primers. Alternatively, BAC DNA was subcloned in plasmids, verified for the presence of GPC4 exons (by PCR and Southern blotting) and subsequently sequenced.
YACs yWXD363, yWXD2789-I, yWXD440, yWXD736, yWXD69, yWXD808, yWXD6857-I, yWXD6858-I, yWXD3373, yWXD2704-I, yWXD6142, yWXD2724-I from the Xq26 contig (Pilia et al., 1996) were obtained from the American Type Culture Collection (ATCC) and verified for GPC4 content by PCR and Southern blotting.
The ends of BACs 35H9 and 68G14 were sequenced and used to construct the novel sequence-tagged sites (STSs) MV1, MV2 and MV3 (Table 5).
The PAC library from P. de Jong was screened with 32P-oligolabeled probes for MV1, exon-1 of GPC4, exon-2 of GPC4 and exon-8 of GPC3. PAC content was verified by PCR. STSs for GPC4 exons are described in Table 5; STSs SWXD1698, SWXD1165 and sWXD2342 have been described by others (see The Genome Database, http://www.gdb.org). The reaction cycles for STSs MV1, MV2 and MV3 were: 94°C for 30 sec, 55°C for 30 sec, 72°C for 30 sec, for 35 cycles. Cycling was preceded by a 150 sec incubation at 94°C.
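The cycling conditions stated above for the STSs MV1, MV2 and MV3 can be written down as a small data structure, which makes the total programmed hold time easy to check. This is only a sketch: the class and variable names are hypothetical, temperatures are assumed to be in °C, and ramp times between steps (which depend on the thermal cycler) are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # hold time


# Values taken from the text; the names are illustrative only.
INITIAL_DENATURATION = Step(94, 150)
CYCLE = (Step(94, 30), Step(55, 30), Step(72, 30))
N_CYCLES = 35


def total_hold_seconds(initial: Step, cycle: tuple, n_cycles: int) -> int:
    """Sum of programmed hold times, excluding ramping between steps."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


seconds = total_hold_seconds(INITIAL_DENATURATION, CYCLE, N_CYCLES)
print(seconds, "s =", seconds / 60, "min")  # 3300 s = 55.0 min
```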
2.2.5 Northern blotting
The membranes for the Northern blots were obtained from Clontech. Hybridization was performed for two hours at 68°C, using Expresshyb solution (Clontech) according to the manufacturer's specifications. The probe was either a 32P-oligolabeled NotI-BglII fragment from one of the 5'-RACE clones (corresponding to residues 1-386 of the sequence shown in figure 2), or a 32P-oligolabeled BamHI-BamHI fragment from the composite cDNA sequence constructed in pREP4 (corresponding to residues 1148-2291 of the sequence shown in figure 2). Dehybridisation included washing at room temperature for 30 min with 2.0× SSC, 0.05% SDS and a high stringency wash for 30 min with 0.1× SSC, 0.1% SDS at 65°C.
2.2.6 Mutational analysis of the GPC4 gene
From the characterization of the corresponding intron/exon boundaries in GPC4, primer pairs were designed for the amplification of all exons of the human GPC4 gene, to permit deletion and mutational analysis (Table 8). For GPC3 deletion analysis, we designed new primers for the amplification of all exons (Table 8). Genomic DNA was obtained from one newly identified patient, counseled at the Center for Human Genetics (CME) of the University of Leuven (with informed consent from the parents); from the lymphoblastoid cell lines AG0817, AG0857, AG0893, AG0946, AG0969, and FY0367 (database IDs) from the European Collection of Cell Cultures (ECACC); and from the fibroblastic cell lines GM13034, GM3884, GM0097 (ATCC), all established from patients with SGBS. All patient DNAs were analyzed by PCR. The reaction cycles were: 94°C for 30 sec, annealing temperature for 30 sec, 72°C for 30 sec, for 35 cycles. Cycling was preceded by a 2.5 minute incubation at 94°C. The reaction products were analyzed by electrophoresis in 2% agarose gels or, alternatively, were analyzed for single-strand conformation polymorphisms (SSCP) in non-denaturing polyacrylamide gels as described previously (Matthijs et al., 1997). PCR products with variant SSCs and controls were sequenced, either directly after gel purification or after T/A cloning in pCR2.1 (Invitrogen). In the latter case, several independent clones from independent amplifications were characterized by Dye Primer Cycle Sequencing.
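The exon-by-exon PCR described above yields, for each patient, a presence/absence pattern of amplicons. The following sketch shows one way such a pattern could be summarized as runs of consecutive non-amplifying exons. It is not the inventors' procedure: the function name and the example pattern are hypothetical, and a missing product is only suggestive of a deletion when control amplifications on the same DNA succeed (the condition is X-linked, so a hemizygous deletion is expected to give no product at all).

```python
def deleted_runs(amplified: dict) -> list:
    """Group exons that failed to amplify into runs of consecutive exon numbers.

    `amplified` maps exon number -> True if a PCR product was obtained.
    Returns a list of (first_exon, last_exon) tuples.
    """
    runs = []
    run_start = None
    previous = None
    for exon in sorted(amplified):
        failed = not amplified[exon]
        contiguous = previous is not None and exon == previous + 1
        if failed and run_start is not None and contiguous:
            pass  # the current run simply extends to this exon
        else:
            if run_start is not None:
                runs.append((run_start, previous))  # close the open run
            run_start = exon if failed else None
        previous = exon if failed else previous
    if run_start is not None:
        runs.append((run_start, previous))
    return runs


# Hypothetical pattern for one patient: exons 3-5 and exon 8 give no product
pattern = {1: True, 2: True, 3: False, 4: False, 5: False, 6: True, 7: True, 8: False}
print(deleted_runs(pattern))  # [(3, 5), (8, 8)]
```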
2.2.7 Construction of expression plasmids and cell transfection
To construct a partial GPC4 cDNA, containing the entire coding region, a NotI-EcoRI fragment from a 5'-RACE clone, an EcoRI-PstI fragment from pKGP, and a PstI-BamHI fragment from a 3'-RACE clone were ligated together in pBluescript. A NotI-NotI fragment containing GPC4 was isolated from this construct and ligated in pcDNAIII and pREP4, yielding respectively glyp4-pcDNAIII and glyp4-pREP4.
Namalwa cells (ATCC CRL 1432) were routinely grown in DMEF12 medium supplemented with 10% FCS and L-glutamine. For transfection, the cells were prewashed with Ca- and Mg-free PBS and incubated for 10 min at 4°C (10^7 cells in 1 ml Ca/Mg-free PBS) with 30 μg glyp4-pREP4 plasmid before electroporation at 240 V and 960 μF (Gene Pulser, Biorad). Selection was started 48 h later with 250 μg/ml of hygromycin B. Stable transfection was achieved after 12 days. Expression of heparan sulfate in the transfectants was analyzed by FACS, using the HS-specific antibody 10E4 and the delta-HS-specific antibody 3G10, and by Western blotting, using the 3G10 antibody as described before (David et al., 1992; Steinfeld et al., 1996). Stably expressing transfectant Namalwa clones were isolated from cells transfected with linearized glyp4-pcDNAIII, selected in media containing 400 μg/ml of G418, and panned on 10E4 antibody.
2.3 Results
2.3.1 Molecular cloning of human glypican-4 (GPC4)
The combination of 5'-RACE and 3'-RACE experiments, performed on a library of adaptor-ligated ds fetal brain cDNA, yielded the complete coding sequence for human GPC4. Figure 2 represents the merged sequences of the RACE clones, pKGP and EST zx12d12 (identified as GPC4 from BLAST searches of public databases) and the predicted structure of the protein encoded by the message that corresponds to this cDNA. The sequence features an ATG start codon, preceded by a Kozak sequence, at position 213 and a TAA stop codon at position 1881. One AATAAA sequence (potential polyadenylation signal) is present at position 3697, and a stretch of polyA starts at position 3706. The predicted amino acid sequence of human GPC4 (556 residues) was found to be highly homologous to that of mouse GPC4 (K-glypican) (93.5% sequence identity) (see also Table 2).
The protein sequence starts and terminates with hydrophobic signal peptide-like sequences. It contains three serine-glycine dipeptide sequences. All three Ser-Gly dipeptide sequences occur towards the C-terminus of the protein, and two of these form part of a direct Ser-Gly repeat sequence. These Ser-Gly sequences are flanked, both upstream and downstream, by acidic amino acids (D/E), reproducing a motif that has been reported to promote the assembly of heparan sulfate in proteoglycans (Zhang et al., 1995). Because of the presence of three Ser-Gly repeats, glypican-4 would be predicted to have up to three heparan sulfate chains implanted on its core protein. The acidic residue downstream of the Ser-Gly repeat occurs within the sequence CEYQQC, and may reproduce a motif (a small acidic loop supported by a disulfide bond) that is shared by most glypicans (except glypican-2). This loop follows the SG repeats in the glypicans -1, -4 and -6, but interrupts or precedes the SG repeats in the glypicans -3 and -5.
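The Ser-Gly features described above can be located in a protein sequence with a simple scan. The sketch below is illustrative only: the flank width of 10 residues is an arbitrary assumption (the text does not define how close the acidic residues must be), the function name is hypothetical, and the test sequence is an invented toy peptide rather than the glypican-4 sequence of figure 2.

```python
import re


def ser_gly_sites(protein: str, flank: int = 10) -> list:
    """List Ser-Gly dipeptides with flags for nearby acidic residues (D/E).

    Returns tuples (1-based position of the serine,
                    acidic residue within `flank` residues upstream,
                    acidic residue within `flank` residues downstream).
    """
    sites = []
    # The lookahead finds every SG, including adjacent repeats such as SGSG.
    for match in re.finditer(r"(?=SG)", protein):
        pos = match.start()
        upstream = protein[max(0, pos - flank):pos]
        downstream = protein[pos + 2:pos + 2 + flank]
        sites.append((pos + 1,
                      bool(re.search(r"[DE]", upstream)),
                      bool(re.search(r"[DE]", downstream))))
    return sites


toy_peptide = "MAAEDLSGSGAKDDVCLLL"  # hypothetical sequence for illustration
print(ser_gly_sites(toy_peptide))  # [(7, True, True), (9, True, True)]
```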
2.3.2 Expression of human glypican-4
In Northern blotting experiments, both probes, corresponding either to residues 1-386 or to residues 1148-2291 of the GPC4 sequence, detected two messages, one of 2.9 kb and one of 4.3 kb. The messages were expressed in several, but not all, of the human fetal and adult tissues tested (see figure 11). The origin of these two bands is not known, but could be due to the alternative usage of multiple polyadenylation signals, alternative splicing or, less likely, cross-hybridization with messages for other (possibly yet to be identified) members of the glypican gene family. In fetal tissues the messages were expressed in brain, kidney and lung, but were barely detectable in liver. In adult tissues the message is highly abundant in skeletal muscle, pancreas, kidney, placenta, lung, heart, spleen, testis, ovary, colon and small intestine. Less intense bands were seen in brain, thymus and prostate, and barely detectable bands were seen in the liver. The message appears to be absent from peripheral blood leukocytes. ESTs for human GPC4 were also present in libraries prepared from a 9 week old fetus, pregnant uterus, fetal heart, adult lung, placenta and colon. The expression pattern of human GPC4 is almost the same as that of murine K-glypican, with the exception of mGPC4 being abundantly expressed in the liver.
2.3.3 Identification of glypican-4 as a heparan sulfate proteoglycan
To test if glypican-4 would support the synthesis of heparan sulfate, the glypican-4 insert was subcloned in the pREP4 episomal expression vector and transfected in Namalwa cells. The Namalwa cells used for these experiments had previously been shown to express little endogenous heparan sulfate, but to support the synthesis of large amounts of heparan sulfate when transfected with cDNAs (cloned in pREP4) that code for syndecans or glypican-1. Analysis of heparitinase-digested proteoglycan from transfectant and control cells by Western blotting, using the delta-HS-specific mAb 3G10, confirmed that the transfectant cells expressed heparan sulfate proteoglycan core proteins of ~65 kDa (major band) and ~18-14 kDa (minor bands) that were not detectable in the control cells (figure 8). FACS analyses of these cells with the HS-specific mAb 10E4 and, after heparitinase, with mAb 3G10 revealed a dramatic increase in the expression of cell surface HS in the transfectants (figure 9).
2.3.4 Chromosomal mapping of GPC4
Three BACs were identified for GPC4 : BAC 35H9, BAC 151D8, and BAC 68G12. BACs 35H9 and 151D8 contained exons 2 to 9 of GPC4 , while BAC 68G12 contained exon-1 of GPC4. FISH, performed on metaphase chromosomes, localized all BACs for GPC4 to Xq26 (figure 12) . Since GPC3 had also been localized to chromosome band Xq26 (Pilia et al . , 1996), these results suggested that GPC3 (closely related to GPC5) and GPC4 (closely related to GPC6) probably mapped in proximity of one another on Xq26, mimicking the clustering of the GPC5 and GPC6 genes on chromosome 13q32 (see example 1) . The relative orientation of GPC4 and GPC3 was determined by FISH on cell lines of Simpson-Golabi-Behmel (SGBS) syndrome patients with translocations in the GPC3 gene (Table 6 and figure 12) .
These FISH data indicate that the GPC4 gene lies centromeric to GPC3. Since there is a YAC contig covering Xq26, it was decided to look for YACs containing GPC4. The following YACs were tested by Southern blotting and PCR for the presence of GPC4 exons: yWXD363, yWXD2789-I, yWXD440, yWXD736, yWXD69, yWXD808, yWXD6857-I, yWXD6858-I, yWXD3373, yWXD2704-I, yWXD6142-I, yWXD2724-I. Only YACs yWXD3373 and yWXD6858-I were found to be positive for exon 2 to exon 9 of GPC4. No YACs were found positive for exon 1 of GPC4. Moreover, only YACs yWXD6142 and yWXD2704 were positive for exon 8 of GPC3. These data suggested that some YACs might have undergone internal deletions, and led to the construction of a new BAC/PAC contig. Figure 13 shows the BAC/PAC contig containing the entire GPC4 gene and linking both GPC3 and GPC4. This contig indicates that both glypicans form a tandem array, with exon 1 and the promoter region of GPC4 lying adjacent to the last exon of GPC3. The GPC4 gene exon/intron structure is schematically shown in Table 7.
EXAMPLE 3
Glypican involvement in the Simpson-Golabi-Behmel syndrome
3.1 Introduction
Recently, deletions and translocations involving the gene for glypican-3 (GPC3) have been shown to occur in patients with the Simpson-Golabi-Behmel overgrowth syndrome (Pilia et al., 1996). Not all patients with this X-linked condition, however, are affected by mutations of the GPC3 gene that can easily be demonstrated (Lindsay et al., 1997). GPC4 was mapped by the present inventors on Xq26, in close proximity to GPC3, in an interval such that it would be deleted in at least one family with SGBS (Pilia et al., family c and figure 13C). Therefore, the possibility was investigated of Xq mutations in patients with SGBS that affect GPC4, in addition to or rather than GPC3. The results show that in some patients this is indeed the case.
3.2 Materials and methods
From the characterization of the corresponding intron/exon boundaries in GPC4 (analysis of the BACs described in example 2) and GPC3, new primers were designed for the amplification of all exons. Genomic DNA was obtained from one newly identified patient, counseled at the Center for Human Genetics (CME) of the University of Leuven (with informed consent from the parents); from the lymphoblastoid cell lines AG0817, AG0857, AG0893, AG0946, AG0969, and FY0367 (database IDs) from the European Collection of Cell Cultures (ECACC); and from the fibroblastic cell lines GM13034, GM3884, GM0097 (ATCC), all established from patients with SGBS. All patient DNAs were analyzed by PCR. The reaction cycles were: 94°C for 30 sec, annealing temperature for 30 sec, 72°C for 30 sec, for 35 cycles. Cycling was preceded by a 2.5 minute incubation at 94°C. Primers and annealing temperatures are given in Table 8. The reaction products were analyzed by electrophoresis in 2% agarose gels. PCR primer pairs were designed for amplification of all exons of GPC4 (including exon/intron boundaries, Table 11) and the corresponding PCR products were analyzed for single-strand conformation polymorphisms (SSCP) in non-denaturing polyacrylamide gels as described previously (Matthijs et al., 1997). PCR products with variant SSCs and controls were either directly sequenced after gel purification or T/A cloned in pCR2.1 (Invitrogen). Several independent clones from independent amplifications were characterized by Dye Primer Cycle Sequencing.
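The cycling conditions described above can be written down as a small program table, which makes the total programmed hold time easy to check. The following is only a sketch: the names Step, INITIAL, CYCLE and hold_time_seconds are illustrative and do not come from the original protocol, the annealing temperature is left open because it is primer-specific (Table 8), and ramping time between temperatures is ignored.

```python
from typing import NamedTuple, Optional, Sequence


class Step(NamedTuple):
    label: str
    temp_c: Optional[float]  # None where the temperature is primer-specific (Table 8)
    seconds: float


# Values taken from the text: 2.5 min at 94 C, then 35 cycles of 30 s / 30 s / 30 s.
INITIAL = Step("initial denaturation", 94.0, 150.0)
CYCLE = (
    Step("denaturation", 94.0, 30.0),
    Step("annealing", None, 30.0),
    Step("extension", 72.0, 30.0),
)
N_CYCLES = 35


def hold_time_seconds(initial: Step, cycle: Sequence[Step], n_cycles: int) -> float:
    """Sum of programmed hold times; ramping between temperatures is not modelled."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


total = hold_time_seconds(INITIAL, CYCLE, N_CYCLES)
print(f"programmed hold time: {total:.0f} s ({total / 60:.1f} min)")
```

Under these assumptions the program amounts to 3300 s of hold time, i.e. 55 minutes before ramping is taken into account.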
3.3 Results
A summary of the PCR and SSC analyses is given in Table 10. These analyses identified one patient with a deletion that involved the entire GPC4 gene and part of the GPC3 gene (exons 7 and 8). No other GPC4 deletions were detected, but a partial deletion of GPC3 (exons 1 and 2) was also identified in the patient diagnosed at CME. SSCA of the GPC3 exons revealed polymorphism for exon 3. Two of the patients with a variant exon 3 were brothers, and sequencing of the corresponding PCR products identified a T>A mutation leading to a missense substitution of R for W296 in glypican-3 ((a) in Table 10). Interestingly, W296 corresponds to one of the residues that are strictly conserved in all glypicans identified so far. Deletion of one T nucleotide (del T875), leading to a frame shift mutation and termination, was the basis for the variant SSC of exon 3 in a third patient (b). SSCA of the GPC4 exons revealed polymorphisms for exons 7 and 8, in one and the same patient. Sequencing of the PCR product of exon 7 in this patient identified a G>T mutation leading to a substitution of D391 by E in glypican-4 (c). Sequencing of the PCR product of exon 8 in this patient and controls identified a C>T substitution leading to a substitution of V by A as residue 442 in glypican-4 (d). It may be noted that the RACE experiments also yielded V as residue 442 and D as residue 391 (see figure 2), and that these residues have not been conserved in the glypicans. Moreover, the IMAGE Consortium cDNA clone for GPC4 also had a V at position 442, and the plasmid pKGP had an E at position 391.
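The W296R change reported for glypican-3 can be illustrated at the codon level. Tryptophan has a single codon (TGG) in the standard genetic code, so a T>A change within that codon can only produce AGG, which encodes arginine. The sketch below uses a deliberately minimal codon table; the names STANDARD_CODE and point_mutants are illustrative only and are not part of the original analysis.

```python
from typing import Iterator

# Subset of the standard genetic code, limited to the two codons this sketch touches.
STANDARD_CODE = {"TGG": "W", "AGG": "R"}


def point_mutants(codon: str, ref: str, alt: str) -> Iterator[str]:
    """Yield every codon obtained by changing a single `ref` base to `alt`."""
    for i, base in enumerate(codon):
        if base == ref:
            yield codon[:i] + alt + codon[i + 1:]


trp_codon = "TGG"
mutants = list(point_mutants(trp_codon, "T", "A"))
for mutant in mutants:
    print(f"{trp_codon} ({STANDARD_CODE[trp_codon]}) -> {mutant} ({STANDARD_CODE[mutant]})")
```

The only T in the tryptophan codon is at the first position, so the single possible T>A product is AGG (R), consistent with the substitution of R for W296 described above.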
Table 1
Primers used in the 5'-RACE experiments for GPC6
5'-ATTCCACTCTGTGTCGAGGTCAGCCTGA-3' (1425-1452): first primer (1)
5'-ACAGTATGGGCAGTACAGCATCTTCATG-3' (1329-1358): nested primer (1)
5'-CCTGAGCCACTGGATTCATCACTTG-3' (2048-2072): first primer (2)
5'-GCCATAATCTGCTGTCTGATGAAAGTG-3' (1956-1982): nested primer (2)
(The numbers in parentheses refer to the corresponding residues in the composite cDNA sequence shown in figure 1).
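When working with primer lists such as Table 1, two routine checks are to derive the reverse complement of each primer (the strand it anneals to) and to compare the primer length with the length of the residue span quoted for it. The sketch below does both. The helper names Primer, reverse_complement and span_length are illustrative, and the assumption that the quoted coordinates are inclusive is ours, not a statement from the original.

```python
from typing import NamedTuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def span_length(start: int, end: int) -> int:
    """Length of a residue span, assuming inclusive coordinates."""
    return end - start + 1


class Primer(NamedTuple):
    name: str
    seq: str
    start: int
    end: int


# Sequences and coordinates as listed in Table 1.
PRIMERS = [
    Primer("first primer (1)", "ATTCCACTCTGTGTCGAGGTCAGCCTGA", 1425, 1452),
    Primer("nested primer (1)", "ACAGTATGGGCAGTACAGCATCTTCATG", 1329, 1358),
    Primer("first primer (2)", "CCTGAGCCACTGGATTCATCACTTG", 2048, 2072),
    Primer("nested primer (2)", "GCCATAATCTGCTGTCTGATGAAAGTG", 1956, 1982),
]

for p in PRIMERS:
    status = "ok" if len(p.seq) == span_length(p.start, p.end) else "length differs from span"
    print(f"{p.name}: {len(p.seq)} nt, span {span_length(p.start, p.end)}, {status}")
    print(f"  anneals to 5'-{reverse_complement(p.seq)}-3'")
```

A reported difference between primer length and span length does not by itself say which of the two values is in error; it only flags the entry for comparison against the sequence of figure 1.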
Table 2
Percent amino acid identities between the glypicans
Figure imgf000034_0001
Table 3
Primers used in the RACE experiments for GPC4
5'-CGAGCCCAGAAGTCATTTAGCATTTCTTCC-3': 5'-RACE GPC4
5'-CACAAACATATCATTCAGGGATTTCTCTGC-3': nested 5'-RACE GPC4
5'-ATTTGAAGATCTGTCCCCAGGGTTCTAC-3': 3'-RACE GPC4
5'-AGTGTGGTCAGCGAACAGTGCAATC-3': nested 3'-RACE GPC4
5'-CCAACTGTGATCTCGCCTTGTTTCT-3': nested 3'-RACE GPC4
Table 4
PR08-37 F-E (AS) GPC-4 Ex-1: 5'-ggggCATCgTTCTTgTTgAA
PR07-44 F-E (AS) GPC-4 Ex-2: 5'-TCATCAAACTTCTTgTAACg
PR10-26 F-E (AS) GPC-4 Ex-3: 5'-AAGTGGTACTGGGAGTTCAC
PR10-50 F-E (AS) GPC-4 Ex-4: 5'-CTCCAGCAAGCACAAATATG
PR06-40 F-E (AS) GPC-4 Ex-5: 5'-CTTCTgAgACACTTgAACAC
PR06-41 F-E (AS) GPC-4 Ex-6: 5'-CCAgTCggTCCAAACTAgTg
PR09-11 F-E (AS) GPC-4 Ex-8: 5'-AgTCCACgTCgTTCCCATTg
PR09-10 F-E (AS) GPC-4 Ex-9: 5'-CTCCACTCTCTCTgCATAAC
PR10-38 F-E (AS) GPC-4 Ex-9: 5'-CCAACTGTGATCTCGCCTTGTTTCT
PR10-12 F-E GPC-4 Ex-1: 5'-TCCCgTCCggTCCCAAAggT
PR10-13 F-E GPC-4 Ex-2: 5'-CCTCTTggACTTCATgAATg
PR10-14 F-E GPC-4 Ex-3: 5'-TAAATgACTTCTgggCTCgC
PR06-38 F-E GPC-4 Ex-3: 5'-TgCAgAgAAATCCCTgAATg
PR10-44 F-E GPC-4 Ex-4: 5'-ATGATCTACTGCTCCCACTG
PR10-33 F-E GPC-4 Ex-4: 5'-TGAGAAGAAATCACCTGCTC
PR10-15 F-E GPC-4 Ex-5: 5'-ATTgTTCTTCCCCAAACCTC
PR10-34 F-E GPC-4 Ex-6: 5'-CTTCCCTAACATGACGACAC
PR06-39 F-E GPC-4 Ex-7: 5'-ggTTACTgATgTCAAggAgA
PR11-11 F-E GPC-4 Ex-8: 5'-CGAATCATCTTACTGGGGTCA
PR11-38 F-E GPC-4 Ex-9: 5'-CAGAACTGGGGTACAGCCTG
PR11-39 F-E GPC-4 Ex-9: 5'-CTACATCTTAGTGGTGACCT
Table 5
5'-TCATGACTAGTTTCTTGCACGG-3': STS MV1a
5'-TGAAAATCCACATGATTGGAAA-3': STS MV1b
5'-AAGCTTGAAGGGTGCTCAGA-3': STS MV2a
5'-ATTTCCTGCTGCTGGTCACT-3': STS MV2b
5'-TCTCCTTTCCCTGGACTAACC-3': STS MV3a
5'-TGAGTCAAATTAAAGAGCAAGGC-3': STS MV3b
Table 6
Figure imgf000036_0001
Table 7
Figure imgf000036_0002
Table 8
Figure imgf000037_0001
Figure imgf000038_0001
#: Stratagene Optiprime buffer (number); Perkin Elmer Buffer (P)
T: extension temperature
Figure imgf000039_0001
Figure imgf000039_0002
Table 10
Patient ID
0817 0857 0893 0946 0969 0367* 13084 pCME
GPC-3 Exon
1 + + + + + + + -
2 + + + + + + + -
3 + (a) + (a) + (b) + + + + +
4 + + + + + + + +
5 + + + + + + + +
6 + + + + + + + +
7 + + + - + + + +
8 + + + - + + + +
GPC-4 Exon
1 + + + - + + + +
2 + + + - + + + +
3 + + + - + + + +
4 + + + - + + + +
5 + + + - + + +
6 + + + - + + +
7 + + + (c) - + + +
8 + + + (d) - + + +
9 + + + - + + +
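The exon-level deletions discussed in section 3.3 can be read off a presence/absence matrix such as Table 10 in a mechanical way. The sketch below encodes only the rows that are complete in the table as reproduced here (GPC-3 exons 1 to 8 and GPC-4 exons 1 to 4, eight entries each) and drops the (a) and (b) annotations, which mark sequence variants rather than missing products. The names PATIENTS, GPC3_ROWS, GPC4_ROWS and deleted_exons are illustrative only.

```python
from typing import Dict, List, Sequence

# Column order as in Table 10 (the asterisk on 0367 is omitted here).
PATIENTS = ["0817", "0857", "0893", "0946", "0969", "0367", "13084", "pCME"]

# One character per patient: "+" = PCR product obtained, "-" = no product.
GPC3_ROWS: Dict[int, str] = {
    1: "+++++++-",
    2: "+++++++-",
    3: "++++++++",
    4: "++++++++",
    5: "++++++++",
    6: "++++++++",
    7: "+++-++++",
    8: "+++-++++",
}
GPC4_ROWS: Dict[int, str] = {exon: "+++-++++" for exon in (1, 2, 3, 4)}


def deleted_exons(rows: Dict[int, str], patients: Sequence[str]) -> Dict[str, List[int]]:
    """Map each patient with at least one missing product to the exons that failed."""
    calls: Dict[str, List[int]] = {}
    for exon in sorted(rows):
        for patient, mark in zip(patients, rows[exon]):
            if mark == "-":
                calls.setdefault(patient, []).append(exon)
    return calls


print("GPC3:", deleted_exons(GPC3_ROWS, PATIENTS))
print("GPC4 (exons 1-4 only):", deleted_exons(GPC4_ROWS, PATIENTS))
```

With these rows the calls are exons 7 and 8 of GPC3 plus the tested GPC4 exons for patient 0946, and exons 1 and 2 of GPC3 for the patient diagnosed at CME, in line with the results described in section 3.3.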
Table 11
Figure imgf000041_0001
REFERENCES
1. David et al., J. Cell Biol. 111:3165-3176 (1990)
2. David et al., J. Cell Biol. 119:961-975 (1992)
3. Filmus et al., Biochem. J. 311:561-565 (1995)
4. Higgins, Methods Mol. Biol. 25:307-318 (1994)
5. Matthijs et al., Nat. Genet. 16:88-92 (1997)
6. Nakato et al., Drosophila Development 121:3687-3702 (1995)
7. Pilia et al., Nat. Genet. 12:241-247 (1996)
8. Steinfeld et al., J. Cell Biol. 133:405-416 (1997)
9. Stipp et al., J. Cell Biol. 124:149-160 (1994)
10. Vermeesch et al., Genomics 25:327-329 (1995)
11. Veugelers et al., Genomics 40:24-30 (1997)
12. Watanabe et al., J. Cell Biol. 130:1207-1218 (1995)
13. Wiegant et al., Genomics 10:345-368 (1991)
14. Zhang et al., J. Biol. Chem. 270:27127-27135 (1995)

Claims

1. Polynucleotide encoding a glypican-related protein, identified herein as "glypican-6" and comprising at least the coding sequence of the nucleotide sequence as depicted in figure 1.
2. Glypican-6 gene (GPC6) in isolated form comprising at least as the coding sequence the nucleotide sequence as depicted in figure 1 optionally interrupted by one or more introns, and optionally operably linked to transcription and translation regulatory sequences.
3. Polynucleotide encoding a glypican-related protein, identified herein as "glypican-4" and comprising at least the coding sequence of the nucleotide sequence as depicted in figure 2.
4. Glypican-4 gene (GPC4) in isolated form comprising at least as the coding sequence the nucleotide sequence as depicted in figure 2 optionally interrupted by one or more introns, and optionally operably linked to transcription and translation regulatory sequences.
5. Derivatives of the polynucleotide sequence as depicted in figures 1 or 2, which derivatives are selected from fragments of the gene as claimed in claim 2 or 4, either isolated or synthetic and having a length that is smaller than the complete gene; primers, comprising at least 10 consecutive gene specific nucleotides, preferably about 20 gene specific consecutive nucleotides of the nucleotide sequence of the gene; longer oligonucleotides up to the full length of the gene; antisense variants of the gene, the fragments or the primers; antibodies directed to the gene, fragments, primers or complementary strands thereof; any specific ligand for DNA that can be used as a specific probe, peptide nucleic acid probes.
6. Derivatives as claimed in claim 5, which derivatives are selected from transcripts (mRNA sequences) of the gene, cDNA, antisense RNA, antisense cDNA, antibodies directed to the transcript, sense and antisense cDNA, antisense RNA and any specific ligand for RNA that can be used as a specific probe.
7. Derivatives as claimed in claim 5, which derivatives comprise at least part of the amino acid sequence encoded by the coding sequence of the nucleotide sequence depicted in figure 1 or 2 and selected from the isolated or synthetic gene product (protein or polypeptide); isolated or synthetic peptides, comprising a specific sequence of consecutive amino acids encoded by the gene, antibodies directed to the gene product or peptides and any specific ligand for peptides that can be used as a specific probe.
8. Polynucleotides of claim 1 and 3, for use in diagnosis and/or therapy.
9. Gene of claim 2 and 4 and/or derivatives of claims 5-7 for use in diagnosis and/or therapy.
10. Method for diagnosing aberrations in a glypican encoding gene, comprising isolation of the gene from cells expected to be harboring an aberrant gene; and comparing the nucleotide sequence of the gene thus obtained with the nucleotide sequence of a wild type gene.
11. Method as claimed in claim 10, wherein comparing the nucleotide sequence of the gene to be diagnosed with the nucleotide sequence of a wildtype gene is performed by restriction fragment length polymorphism screening, comprising separately digesting the isolated gene and a wild type comparison gene with one or more selected restriction enzymes, separating the digest thus obtained on a gel to reveal a pattern of bands, and comparing the patterns of the isolated gene and the wildtype gene.
12. Method as claimed in claim 10, wherein comparing the nucleotide sequence of the gene to be diagnosed with the nucleotide sequence of a wildtype gene is performed by means of Polymerase Chain Reaction (PCR), between probes corresponding to various parts of the gene to be diagnosed, for example exons, separating the reaction mixture thus obtained on a gel to reveal a pattern of bands, and comparing the patterns of the isolated gene and the wildtype gene.
13. Method as claimed in claim 10, wherein comparing the nucleotide sequence of the gene to be diagnosed with the nucleotide sequence of a wildtype gene is performed by single-strand conformation polymorphism screening.
14. Method as claimed in claim 10, wherein comparing the nucleotide sequence of the gene to be diagnosed with the nucleotide sequence of a wildtype gene is performed by DNA sequencing.
15. Method for the in situ detection of physical changes in a glypican gene, like translocations, inversions or deletions, by the in situ hybridization of labeled probes with a set of chromosomes.
16. Method as claimed in claim 15, wherein translocations can be detected by hybridizing a set of chromosomes with a first probe that hybridizes to a part of the glypican gene that is not likely to be involved in the translocation, and a second probe that hybridizes to a part of the glypican gene that is likely to be involved in such aberration, wherein translocations are identified when the second probe is detected on another chromosome than the first probe.
17. Method as claimed in claim 16, further comprising identification of the translocation partner, and using in addition probes hybridizing to the translocated part and the remaining part of the translocation partner and bearing a different label from the first set of probes.
18. Method as claimed in claim 15, wherein an inversion is identified if the second probe is found closer to or further away from the first probe than in a non-aberrant chromosome.
19. Method as claimed in claim 15, wherein deletions are detected when one or both of the probes are not present on the aberrant chromosome.
20. Method as claimed in claims 15-19, wherein the gene to be diagnosed is the glypican 3 gene and the probes are as given in Table 10.
21. Method as claimed in claims 15-19, wherein the gene to be diagnosed is the glypican 4 gene and the probes are as given in Table 10 and/or 11.
22. Method as claimed in claims 15-19, wherein the gene to be diagnosed is the glypican 5 gene and the probes are as given in Table 13.
23. Method as claimed in claims 15-19, wherein the gene to be diagnosed is the glypican 6 gene and the probes are as given in Table 14.
24. Method as claimed in claims 15-19, wherein the gene to be diagnosed is the glypican 1 gene and the probes are derivable from figure 3.
25. Method for diagnosing the expression pattern of glypican genes, wherein antibodies directed to the gene product or protein are reacted with Western blots of cell extracts to detect the presence of the protein in the cell or to assess the amount of protein present.
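The two-probe decision logic of claims 16, 18 and 19 can be summarized as a simple classification rule. The following is an illustrative sketch only and forms no part of the claims: the Signal type, the arbitrary position units, the reference_distance and tolerance parameters and the order in which the tests are applied are all assumptions introduced for the example.

```python
from typing import NamedTuple, Optional


class Signal(NamedTuple):
    chromosome: str
    position: float  # arbitrary units along the chromosome (hypothetical)


def classify(
    first: Optional[Signal],
    second: Optional[Signal],
    reference_distance: float,
    tolerance: float = 0.0,
) -> str:
    """Classify a two-probe FISH result; None means the probe gave no signal."""
    # Claim 19: one or both probes absent from the aberrant chromosome.
    if first is None or second is None:
        return "deletion"
    # Claim 16: second probe detected on another chromosome than the first.
    if first.chromosome != second.chromosome:
        return "translocation"
    # Claim 18: probes closer together or further apart than in a normal chromosome.
    observed = abs(first.position - second.position)
    if abs(observed - reference_distance) > tolerance:
        return "inversion"
    return "no aberration detected"


print(classify(Signal("X", 10.0), Signal("X", 12.0), reference_distance=2.0))
print(classify(Signal("X", 10.0), Signal("13", 40.0), reference_distance=2.0))
print(classify(Signal("X", 10.0), Signal("X", 18.0), reference_distance=2.0))
print(classify(Signal("X", 10.0), None, reference_distance=2.0))
```

In practice the distance comparison would need a tolerance that reflects the resolution of the hybridization, which is why it is exposed as a parameter here rather than fixed.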
PCT/EP1999/000329 1998-01-27 1999-01-20 New members of the glypican gene family WO1999037764A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU24229/99A AU2422999A (en) 1998-01-27 1999-01-20 New members of the glypican gene family

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP98200226.3 1998-01-27
EP98200226 1998-01-27

Publications (2)

Publication Number Publication Date
WO1999037764A2 true WO1999037764A2 (en) 1999-07-29
WO1999037764A3 WO1999037764A3 (en) 2000-02-03

Family

ID=8233325

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1999/000329 WO1999037764A2 (en) 1998-01-27 1999-01-20 New members of the glypican gene family

Country Status (2)

Country Link
AU (1) AU2422999A (en)
WO (1) WO1999037764A2 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003100429A2 (en) * 2002-05-23 2003-12-04 Sunnybrook And Women's College Health Sciences Centre Diagnosis of hepatocellular carcinoma
WO2004022595A1 (en) * 2002-09-04 2004-03-18 Chugai Seiyaku Kabushiki Kaisha CONSTRUCTION OF ANTIBODY USING MRL/lpr MOUSE
WO2006090136A2 (en) * 2005-02-22 2006-08-31 University Court Of The University Of Edinburgh Genetic screening of animals
US7303914B2 (en) * 2003-01-30 2007-12-04 Hongyang Wang Monoclonal antibody against human hepatoma and use thereof
US20120040452A1 (en) * 2005-08-09 2012-02-16 Oncotherapy Science, Inc. Glypican-3 (gpc3)-derived tumor rejection antigenic peptides useful for hla-a2-positive patients and pharmaceutical comprising the same
WO2016112423A1 (en) 2015-01-16 2016-07-21 Minomic International Ltd. Glypican epitopes and uses thereof
CN107338306A (en) * 2017-07-23 2017-11-10 嘉兴允英医学检验有限公司 A kind of kit for GPC1mRNA detection of expression
WO2018038046A1 (en) * 2016-08-22 2018-03-01 中外製薬株式会社 Gene-modified non-human animal expressing human gpc3 polypeptide
JP2020141702A (en) * 2020-05-28 2020-09-10 グリピー ホールディングス ピーティーワイ リミテッド Glypican epitopes and uses thereof
US11612149B2 (en) 2015-07-10 2023-03-28 Chugai Seiyaku Kabushiki Kaisha Non-human animal having human CD3 gene substituted for endogenous CD3 gene

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
DATABASE EMBL [Online] Accession Number AA001322, 20 July 1996 (1996-07-20) HILLIER L ET AL: "zh83a06.r1 Soares fetal liver spleen 1NFLS S1 Homo sapiens cDNA clone 427858 5' similar to SW:GLYP_HUMAN P35052 GLYPICAN PRECURSOR" XP002109642 cited in the application *
DATABASE EMBL [Online] Accession Number N87558, 25 July 1996 (1996-07-25) LIEW C C: "LL1806F Fetal heart, Lambda ZAP Express Homo sapiens cDNA clone LL1806 5' similar to K-glypican." XP002109643 cited in the application *
FILMUS J ET AL: "Identification of a new membrane-bound heparan sulphate proteoglycan" BIOCHEMICAL JOURNAL., vol. 311, 15 October 1995 (1995-10-15), pages 561-565, XP002109639 LONDON, GB ISSN: 0264-6021 cited in the application *
HUBER R ET AL: "Analysis of exon/intron structure and 400 kb of genomic sequence surrounding the 5'-promoter and 3'-terminal ends of the human glypican 3 (GPC3) gene" GENOMICS., vol. 45, no. 1, 1 October 1997 (1997-10-01), pages 48-58, XP002109641 SAN DIEGO., US ISSN: 0888-7543 cited in the application *
VEUGELERS M ET AL: "Characterization of glypican-5 and chromosomal localization of human GPC5, a new member of the glypican gene family." GENOMICS., vol. 40, no. 1, 15 February 1997 (1997-02-15), pages 24-30, XP002109640 SAN DIEGO., US ISSN: 0888-7543 cited in the application *
WATANABE K.: "K-glypican: a novel GPI-anchored heparan sulfate proteoglycan that is highly expressed in developing brain and kidney." THE JOURNAL OF CELL BIOLOGY., vol. 130, no. 5, September 1995 (1995-09), pages 1207-1218, XP002109638 cited in the application *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003100429A3 (en) * 2002-05-23 2004-02-19 Sunnybrook & Womens College Diagnosis of hepatocellular carcinoma
WO2003100429A2 (en) * 2002-05-23 2003-12-04 Sunnybrook And Women's College Health Sciences Centre Diagnosis of hepatocellular carcinoma
US7883853B2 (en) 2002-05-23 2011-02-08 Sunnybrook Health Sciences Centre Diagnosis of hepatocellular carcinoma
WO2004022595A1 (en) * 2002-09-04 2004-03-18 Chugai Seiyaku Kabushiki Kaisha CONSTRUCTION OF ANTIBODY USING MRL/lpr MOUSE
US7303914B2 (en) * 2003-01-30 2007-12-04 Hongyang Wang Monoclonal antibody against human hepatoma and use thereof
WO2006090136A2 (en) * 2005-02-22 2006-08-31 University Court Of The University Of Edinburgh Genetic screening of animals
WO2006090136A3 (en) * 2005-02-22 2007-02-15 Univ Edinburgh Genetic screening of animals
US20120040452A1 (en) * 2005-08-09 2012-02-16 Oncotherapy Science, Inc. Glypican-3 (gpc3)-derived tumor rejection antigenic peptides useful for hla-a2-positive patients and pharmaceutical comprising the same
US8535942B2 (en) * 2005-08-09 2013-09-17 Oncotherapy Science, Inc. Glypican-3 (GPC3)-derived tumor rejection antigenic peptides useful for HLA-A2-positive patients and pharmaceutical comprising the same
WO2016112423A1 (en) 2015-01-16 2016-07-21 Minomic International Ltd. Glypican epitopes and uses thereof
CN107531755A (en) * 2015-01-16 2018-01-02 美侬米克国际有限公司 Glypican epitope and application thereof
JP2018504903A (en) * 2015-01-16 2018-02-22 ミノミック インターナショナル リミティド Glypican epitopes and uses thereof
CN107531755B (en) * 2015-01-16 2022-02-25 美侬米克国际有限公司 Phosphatidylglycoproteoglycan epitopes and uses thereof
EP3245219A4 (en) * 2015-01-16 2018-06-06 Minomic International Ltd. Glypican epitopes and uses thereof
US11612149B2 (en) 2015-07-10 2023-03-28 Chugai Seiyaku Kabushiki Kaisha Non-human animal having human CD3 gene substituted for endogenous CD3 gene
JPWO2018038046A1 (en) * 2016-08-22 2019-06-20 中外製薬株式会社 Genetically modified non-human animal expressing human GPC3 polypeptide
WO2018038046A1 (en) * 2016-08-22 2018-03-01 中外製薬株式会社 Gene-modified non-human animal expressing human gpc3 polypeptide
US11793180B2 (en) * 2016-08-22 2023-10-24 Chugai Seiyaku Kabushiki Kaisha Gene-modified mouse expressing human GPC3 polypeptide
CN107338306A (en) * 2017-07-23 2017-11-10 嘉兴允英医学检验有限公司 A kind of kit for GPC1mRNA detection of expression
JP2020141702A (en) * 2020-05-28 2020-09-10 グリピー ホールディングス ピーティーワイ リミテッド Glypican epitopes and uses thereof

Also Published As

Publication number Publication date
WO1999037764A3 (en) 2000-02-03
AU2422999A (en) 1999-08-09

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

NENP Non-entry into the national phase in:

Ref country code: KR

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: CA