WO2009055760A1 - Promoter detection and analysis - Google Patents

Promoter detection and analysis

Info

Publication number
WO2009055760A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
dna
tag
promoter
vector
Prior art date
Application number
PCT/US2008/081240
Other languages
French (fr)
Inventor
Xavier Danthinne
Yongsheng Ma
Original Assignee
Od260, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Od260, Inc.
Priority to CN2008801233105A (patent CN101918578A)
Priority to EP08841807A (patent EP2209903A4)
Publication of WO2009055760A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1051 Gene trapping, e.g. exon-, intron-, IRES-, signal sequence-trap cloning, trap vectors
    • C12N15/1065 Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags

Definitions

  • The present disclosure relates to methods for detecting regulatory elements in a cell sample. More specifically, the disclosure relates to methods for detecting regulatory elements in multiple cell samples at the same time and uses arising therefrom. The present disclosure also provides a vector for detection and analysis of regulatory elements.
  • The genes of all living organisms are encoded by the nucleic acids DNA and RNA. Each gene encodes a protein that may be produced by the organism through expression of the gene.
  • The systems that regulate gene expression respond to a wide variety of developmental and environmental stimuli, thus allowing each cell type to express a unique and characteristic subset of its genes, and to adjust the dosage of particular gene products as needed.
  • The importance of dosage control is underscored by the fact that targeted disruption of key regulatory molecules in mice often results in drastic phenotypic abnormalities (Johnson, R. S., et al., Cell, 71:577-586 (1992)), just as inherited or acquired defects in the function of genetic regulatory mechanisms contribute broadly to human disease.
  • Standard molecular biology techniques have been used to analyze the expression of genes in a cell by measuring nucleic acids. These techniques include PCR, northern blot analysis, or other types of DNA probe analysis such as in situ hybridization. Each of these methods allows one to analyze the transcription of only known genes and/or small numbers of genes at a time (Nucl. Acids Res. 19, 7097-7104 (1991); Nucl. Acids Res. 18, 4833-4842 (1990); Nucl. Acids Res. 18, 2789-2792 (1989); European J. Neuroscience 2, 1063-1073 (1990); Analytical Biochem. 187, 364-373 (1990); Genet. Annal Techn. Appl. 7, 64-70 (1990); GATA 8(4), 129-133 (1991); Pro. Natl.
  • Gene expression has also been monitored by measuring levels of the gene product (i.e., the expressed protein) in a cell, tissue, organ system, or even organism.
  • Measurement of gene expression by measuring the protein gene product may be performed using antibodies known to bind to the particular protein to be detected. A difficulty arises in needing to generate antibodies to each protein to be detected.
  • Measurement of gene expression via protein detection may also be performed using 2-dimensional gel electrophoresis, wherein proteins can be, in principle, identified and quantified as individual bands, and ultimately reduced to a discrete signal.
  • In order to positively analyze each band, each band must be excised from the membrane and subjected to protein sequence analysis (e.g., Edman degradation). However, it tends to be difficult to isolate a sufficient amount of protein to obtain a reliable protein sequence. In addition, many of the bands often contain multiple proteins.
  • Another difficulty associated with quantifying gene expression by measuring an amount of protein gene product in a cell is that protein expression is an indirect measure of gene expression. It is impossible to know from a protein present in a cell when the expression of that protein occurred. Thus, it is difficult to determine whether the protein expression changes over time due to cells being exposed to different stimuli. The measurement of the amount of particular activated transcription factors has been used to monitor gene expression.
  • A reporter gene (often simply reporter) is a gene that researchers often attach to another gene of interest in cell culture, animals or plants. Certain genes are chosen as reporters because the characteristics they confer on organisms expressing them are easily identified and measured, or because they are selectable markers.
  • Reporter genes are generally used to determine whether the gene of interest has been taken up by or expressed in the cell or organism population.
  • Researchers place the reporter gene and the gene of interest in the same DNA construct to be inserted into the cell or organism.
  • This is usually in the form of a circular DNA molecule called a plasmid.
  • Reporter genes that induce visually identifiable characteristics usually involve fluorescent proteins, for example green fluorescent protein (GFP), or luminescent enzymes such as luciferase.
  • reporter genes include, for example, beta-galactosidase, X-gal, and chloramphenicol acetyltransferase (CAT).
  • CAT chloramphenicol acetyltransferase
  • the reporter gene's expression is independent of the gene of interest's expression, which is an advantage when the gene of interest is only expressed under certain specific conditions or in tissues that are difficult to access
  • selectable-marker reporters such as CAT
  • the transfected population of bacteria can be grown on a substrate that contains chloramphenicol. Only those cells that have successfully taken up the construct containing the CAT gene will survive and multiply under these conditions.
  • Reporter genes can also be used to assay for the expression of the gene of interest,
  • the reporter is directly attached to the gene of interest to create a
  • the two genes are under the same promoter and are transcribed into a single
  • region is usually included so that the reporter and the gene of interest will only minimally
  • Reporter genes can be used to assay for the activity of a particular promoter in a cell
  • EPD Eukaryotic Promoter Database
  • TRRD Transcription regulator
  • Promoters are generally
  • Typical computer algorithms for promoter prediction are based on
  • promoters of a given organism may provide a global view of transcription networks.
  • the transcription start site is defined with standard molecular biology tools such as
  • kb is cloned and demonstrated to have promoter activity by performing a reporter assay in a
  • regulation may be obtained by applying different induction or repression agents in transient
  • protein levels may not always correlate with mRNA levels.
  • reporter assays e.g. chloramphenicol acetyl-
  • the time difference between the first and last sample may be significant
  • a second reporter cassette has to be included as an internal control. In some instances,
  • the assay relies on the fluorescent reporter GFP for detection and screens
  • FACS fluorescence-activated cell sorting
  • object of the present disclosure is to provide a novel reporter system that is specific
  • the present disclosure provides a method for the detection and analysis of DNA
  • the present disclosure provides a method
  • for detecting DNA regulatory sequences comprising: a) inserting a promoter sequence
  • promoter sequence candidate is inserted in a position to drive transcription of the TAG
  • labeled mRNA, cDNA or probe is analyzed with an array wherein the array comprises
  • the labeled mRNA is identical or complementary sequence to the TAG sequence.
  • the labeled mRNA is preferably identical or complementary sequence to the TAG sequence.
  • cDNA or probe hybridizes to the array and the label of the mRNA, cDNA or probe has a
  • the present disclosure provides a method for the detection
  • candidates are integrated into vectors that comprise a TAG sequence, one or more multiple-
  • RNA stabilization fragment such as
  • a transcription termination signal such as a poly A
  • promoter sequence candidates are integrated into a vector comprising a TAG sequence, one
  • the present disclosure provides a method for the detection
  • the vector comprises a TAG sequence, one or more multiple-
  • the vector is a plasmid
  • RNA stabilization fragment is from an alpha-globin gene.
  • the transcription termination signal is a poly A signal.
  • the present disclosure provides a method for the detection
  • the vector comprises a TAG sequence, one or more multiple-
  • the vector is a plasmid.
  • the RNA stabilization fragment is from an alpha-globin gene.
  • the transcription termination signal is a
  • the DNA recombination sequences are attP1 and attP2
  • the present disclosure provides a method for the detection
  • MCS multiple cloning sites
  • RNA synthesis preferably a T7 promoter sequence; a unique reporter TAG, a specific MA segment useful to synthesize probes from RNA, wherein the MA segment is comprised
  • RNA stabilization fragment preferably from a hemoglobin or alpha-globin gene
  • promoter sequence candidate inserts are cloned into a host, preferably Escherichia coli.
  • Suitable bead compositions include those used in peptide, nucleic acid and
  • organic moiety synthesis including but not limited to, plastics, ceramics, glass, polystyrene,
  • the vector is a plasmid.
  • the label of the mRNA, cDNA or probe has a detectable response.
  • the reporters are short oligonucleotide TAGs.
  • TAG sequence is between about 16 base pairs and about 200 base pairs, more
  • pairs, more preferably between about 50 base pairs and about 75 base pairs, more preferably
  • the method enables the unbiased quantification of various mRNAs by hybridization under the same temperature and ionic strength conditions.
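The TAG constraints stated above (random nucleotide sequence, a length between about 16 and about 200 base pairs, and hybridization of all TAGs under the same temperature and ionic strength conditions) can be illustrated with a minimal Python sketch. The names `gc_fraction` and `make_tags`, the 60 base pair default, and the GC-fraction window used as a rough proxy for similar hybridization behavior are assumptions made for illustration; the disclosure does not state how TAGs are selected.

```python
import random

BASES = "ACGT"


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def make_tags(n: int, length: int = 60, gc_window=(0.45, 0.55), seed: int = 0) -> list[str]:
    """Draw n distinct random TAGs of equal length.

    Candidates are kept only if their GC fraction lies inside gc_window
    (an illustrative assumption, not a rule taken from the disclosure).
    """
    if not 16 <= length <= 200:
        raise ValueError("length outside the 16-200 bp range given in the text")
    low, high = gc_window
    rng = random.Random(seed)
    tags: set[str] = set()
    while len(tags) < n:
        candidate = "".join(rng.choice(BASES) for _ in range(length))
        if low <= gc_fraction(candidate) <= high:
            tags.add(candidate)
    return sorted(tags)


tags = make_tags(5)
```

Equal length and a narrow composition window are only one plausible way to keep the TAGs comparable on an array; a real design would also screen for cross-hybridization between TAGs.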
  • the method enables the
  • a single population of cells creating a competitive environment for the various promoters to
  • vectors preferably plasmids, purified
  • amounts of all the DNA promoter sequences are mixed and used for transfection of a single
  • amounts of the vectors can be obtained by: 1) making the vector library; 2) array the vector
  • library (e.g., 96 well plate); 3) take an equal fraction from each clone and pool them all; 4)
  • the transformation agent e.g., a vector, plasmid or virus
  • amounts of the vector can be obtained by: 1) making the vector library; 2) array the vector
  • library (e.g., 96 well plate); 3) grow each clone individually (e.g., in a deep-well plate in case
  • transformation agent e.g., vector, plasmid or virus
  • transfect the vector or plasmid or
  • obtained by: 1) making the vector library; 2) array the vector library (e.g., 96 well plate); 3)
  • transformation agent e.g., vector, plasmid or virus
  • transfect vector or
  • vector can be obtained by: 1) making the vector library; 2) take a fraction from each clone,
  • extract transformation agent e.g., vector, or plasmid
  • transfect vector or plasmid or infect virus
  • reporter cell line determines the TAG of interest (e.g., high level of expression)
  • TAG of interest e.g., high level of expression
  • TAG of interest e.g., colony hybridization
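The pooling steps above aim at equal amounts of each vector in the mixture used for transfection. One way this could be done after individual extraction is sketched below in Python. The function name `pooling_volumes`, the well labels, and the idea of measuring a DNA concentration per well are assumptions for illustration; the disclosure only says that equal fractions or equal amounts are pooled. Equal mass approximates equal molar amounts only when the vectors are of similar size.

```python
def pooling_volumes(conc_ng_per_ul: dict[str, float], mass_ng: float = 100.0) -> dict[str, float]:
    """Volume (in µL) to draw from each well so every clone contributes mass_ng of vector."""
    volumes = {}
    for well, conc in conc_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"well {well} has no measurable DNA")
        volumes[well] = mass_ng / conc
    return volumes


# Hypothetical concentrations for two wells of a 96-well plate.
volumes = pooling_volumes({"A1": 50.0, "A2": 200.0})
```

With the hypothetical readings shown, the more dilute well A1 contributes a larger volume than A2, so both clones enter the pool at the same mass.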
  • the present disclosure provides a method for the detection
  • sequence preferably attP1 or attP2, a negative selection marker, preferably ccdB, a
  • RNA synthesis such as a T7 promoter sequence, a MAc
  • RNA stabilization fragment preferably from the
  • hemoglobin or alpha-globin gene, and transcription termination signal, such as a poly A-
  • inserts are cloned into a host, preferably Escherichia coli, and the clones are arrayed into a
  • the vector is a plasmid.
  • the label of the mRNA, cDNA or probe has a detectable response.
  • the disclosure provides a method for the detection and
  • a negative selection marker such as ccdB, a nucleotide sequence useful to
  • RNA synthesis preferably a T7 promoter sequence, a MA segment, a translation stop
  • RNA stabilization fragment preferably a hemoglobin or alpha-globin gene
  • transcription termination signal preferably a poly A-signal, and wherein the DNA promoter
  • sequence candidate is located such that it drives the transcription of the TAG sequence.
  • the present disclosure provides a method for the detection
  • marker a nucleotide sequence useful to enable RNA synthesis, a MA segment, a translation
  • host preferably Escherichia coli, and the clones are arrayed into a 96-well plate and grown to
  • the purified vector mixture is transfected into a cell line of interest; and (e) the RNA is
  • the vector is a plasmid.
  • the DNA recombination sequence is attP1 or attP2.
  • RNA synthesis is a T7 promoter sequence.
  • RNA stabilization fragment is from the hemoglobin
  • the label of the mRNA, cDNA or probe has a detectable
  • the present disclosure provides a method for the detection
  • DNA promoter sequences comprising: (a) integrating a DNA promoter
  • marker a nucleotide sequence useful to enable RNA synthesis, a MA segment, a translation
  • the DNA promoter sequence candidate is located such that it drives the transcription of the
  • host preferably Escherichia coli
  • the clones are arrayed into a 96-well plate and grown to
  • the vector is a plasmid.
  • the DNA recombination sequence is attP1 or attP2.
  • RNA stabilization fragment is from the
  • the transcription termination signal is a poly
  • the label of the mRNA, cDNA or probe has a detectable response.
  • the disclosure provides a method for
  • sequences such as genomic library, comprising: (a) mixing promoter sequence candidates with TAG-vectors, wherein the TAG-vector comprises: multiple cloning sites (MCS) for
  • promoter sequence candidate at least one DNA recombination sequence, such as
  • promoter sequence inserts such as, for example, a ccdB gene, a T7 promoter sequence to
  • RNA stabilization fragment such as, for example, alpha-globin or hemoglobin
  • promoter sequence candidate inserts are cloned into a host, preferably Escherichia coli, and
  • mixture is transfected into a cell line of interest; and (e) the RNA is extracted, labeled, and
  • the TAG-vector is a TAG-plasmid.
  • the label of the mRNA, cDNA or probe has a detectable response.
  • the disclosure provides a method for
  • nucleotide sequences such as a genomic library, comprising: (a) mixing promoter sequence
  • TAG-vector comprises: multiple cloning sites (MCS)
  • a negative selection marker, a nucleotide sequence useful to enable RNA synthesis, a unique
  • MA segment is comprised of approximately 25% A, 25% T, 25% G,
  • 431 inserts are cloned into a host, preferably Escherichia coli, and the clones are arrayed into a
  • RNA is extracted, labeled, and quantified by
  • the vectors are plasmids.
  • the DNA recombination sequence is attP1
  • the negative selection marker is ccdB.
  • the nucleotide sequence to enable RNA synthesis is a T7 promoter sequence.
  • the RNA stabilization fragment is from the alpha-globin gene.
  • the label of the mRNA, cDNA or probe has a detectable
  • the disclosure provides a method for
  • sequences such as a genomic library, comprising (a) mixing promoter sequence candidates
  • TAG-vector comprises: multiple cloning sites (MCS) for TAG-vectors.
  • MCS multiple cloning sites
  • RNA wherein the MA segment is comprised of approximately 25% A, 25% T, 25% G,
  • and 25% C, a three frame translation stop codon, a RNA stabilization fragment, and a
  • vectors are transfected into a cell line of interest and wherein the use of internal controls is
  • RNA is extracted, labeled, and quantified by hybridization to the
  • DNA TAG sequences arrayed on a membrane or glass support.
  • the DNA recombination sequence is attP1 or attP2.
  • the negative selection marker is ccdB.
  • the nucleotide sequence to enable RNA synthesis is a T7 promoter sequence.
  • the transcription termination signal is a
  • the label of the mRNA, cDNA or probe has a detectable response
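The readout described above (labeled RNA hybridized to an array of DNA TAG sequences, each TAG reporting on one promoter candidate) can be turned into a per-promoter comparison with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `relative_activity`, the single background value, the TAG and promoter identifiers, and normalization to total signal are all hypothetical choices, not the quantification procedure of the disclosure.

```python
def relative_activity(
    signal: dict[str, float],
    background: float,
    tag_to_promoter: dict[str, str],
) -> dict[str, float]:
    """Background-subtract each TAG spot and express it as a share of the total signal."""
    corrected = {tag: max(value - background, 0.0) for tag, value in signal.items()}
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no signal above background")
    return {tag_to_promoter[tag]: value / total for tag, value in corrected.items()}


# Hypothetical spot intensities for three TAGs read from one array.
shares = relative_activity(
    {"TAG1": 110.0, "TAG2": 40.0, "TAG3": 10.0},
    background=10.0,
    tag_to_promoter={"TAG1": "promoter_A", "TAG2": "promoter_B", "TAG3": "promoter_C"},
)
```

Because every TAG-vector is transfected into the same cell population, shares of one array's total signal can be compared directly between promoter candidates in this sketch.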
  • the disclosure provides a method for analysis and detection of
  • candidates are, for example, selected from computer-predicted promoter sequence candidates.
  • TAG-vector comprises: multiple cloning sites
  • selection marker a nucleotide sequence useful to enable RNA synthesis, a unique
  • approximate 60 base pair reporter TAG, a specific MA segment useful to synthesize probes from RNA, wherein the MA segment is comprised of about 25% A, 25% T, 25% G, and 25%
  • RNA is extracted, labeled, and quantified by hybridization to the DNA TAG
  • the TAG-vectors are TAG-
  • the DNA recombination sequence is attP1 or attP2.
  • the negative selection marker is ccdB.
  • the nucleotide sequence to enable RNA synthesis is a T7 promoter sequence.
  • RNA stabilization fragment is from the
  • the transcription termination signal is a poly A-signal.
  • the label of the mRNA, cDNA or probe has a detectable response.
  • the disclosure provides a method for detection and analysis of
  • TAG-vector comprises: multiple cloning sites for
  • the MA segment is comprised of about 25% A, 25% T, 25% G, and 25% C, a three frame
  • are transfected into a cell line of interest; and (e) the RNA is extracted, labeled, and
  • the TAG-vectors are TAG-plasmids.
  • the DNA recombination sequence is attP1 or attP2.
  • the negative selection marker is ccdB.
  • the nucleotide sequence to enable RNA synthesis is a T7 promoter sequence.
  • the RNA stabilization fragment is from the alpha-globin gene.
  • the transcription termination signal is a poly A-signal.
  • the label of the mRNA, cDNA or probe has a detectable response.
  • the disclosure provides a method for the detection and
  • candidates are, for example, selected from computer-predicted promoter sequence candidates.
  • TAG-vector comprises: multiple cloning sites
  • selection marker a nucleotide sequence useful to enable RNA synthesis, a unique
  • the MA segment is comprised of about 25% A, 25% T, 25% G, and 25%
  • are cloned into a host, preferably Escherichia coli, and the clones are arrayed into a 96-well
  • the TAG-vectors are TAG-plasmids.
  • the DNA recombination sequence is attP1 or attP2.
  • the negative selection marker is
  • the nucleotide sequence to enable RNA synthesis is a T7 promoter
  • RNA stabilization fragment is from the alpha-globin gene.
  • the transcription termination signal is a poly A-signal.
  • the label of the mRNA, cDNA or probe has a detectable response.
  • In a preferred embodiment, the present disclosure provides a vector into which a DNA promoter sequence candidate is inserted into
  • a MA segment, a translation stop codon, a RNA stabilization fragment, and a transcription
  • DNA promoter sequence candidate is located such that it can drive the transcription of the TAG sequence.
  • the vector is a plasmid.
  • the present disclosure provides for a plasmid vector
  • the MA sequence is either MA5 or MA4.
  • the MA sequence is located 3' from the TAG sequence.
  • the luciferase is located 3' from the TAG sequence.
  • gene sequence is partial luciferase gene sequence or the full luciferase gene sequence.
  • the translational stop sequence is a translational stop sequence in at least one
  • the DNA recombination sequences are attP1 and attP2.
  • the present disclosure provides a plasmid vector into which a
  • DNA promoter sequence is inserted into comprising a TAG sequence, one or more multiple-
  • polymerase promoter sequence, a MA segment, a translation stop codon, a RNA stabilization
  • the vector is a
  • the TAG sequence is between about 16 base pairs to about 200 base
  • the TAG sequence is about 60 base pairs.
  • TAG sequence is located 3' to the inserted promoter sequence and 5' to a transcription
  • the DNA promoter sequence is an enhancer.
  • translation stop codon is a three frame translation stop codon.
  • stabilization fragment is from an alpha-globin gene.
  • the transcription termination is from an alpha-globin gene.
  • RNA polymerase promoter sequence is a T7
  • the disclosure provides for a vector.
  • the disclosure provides
  • nucleotide sequence for use in the detection and analysis of a promoter nucleotide sequence
  • comprising: a T7 promoter, a TAG sequence, a MA sequence, and a poly A-signal.
  • the promoter sequence candidate is selected from
  • nucleotide sequences such as a genomic library, deletion or site-directed
  • the TAG sequence is a DNA sequence composed of random nucleotides.
  • the length of the TAG sequence is short, preferably between about 16
  • preferably between about 40 base pairs to about 100 base pairs, more preferably between
  • each TAG sequence will have approximately equivalent
  • the MA segment is comprised of about 25% A
  • the disclosure provides a method where a nucleotide
  • sequence is used for the detection and analysis of a promoter nucleotide sequence
  • comprising: a T7 promoter sequence, a TAG sequence, a MA sequence, and a poly A-signal.
  • a DNA promoter sequence candidate may be selected from promoter sequence candidates
  • sequences such as a genomic library, deletion or site-directed mutants of a specific promoter
  • tissue-specific promoters, artificial promoters, etc.
  • sequence is a DNA sequence comprised of short, random nucleotides preferably between
  • the present disclosure provides a cloning vector comprising a
  • TAG sequence; a transcription termination signal, preferably a poly A-signal; a nucleotide
  • RNA sequence useful to enable RNA synthesis preferably a T7 promoter sequence; and a MA
  • nucleotide sequence useful to enable RNA synthesis preferably a T7
  • a cloning vector is provided wherein the cloning vector
  • is comprised of a DNA promoter sequence candidate, a TAG sequence, a transcription
  • termination signal preferably a polyA signal; a nucleotide sequence useful to enable RNA
  • preferably a poly A-signal, are located on the sense DNA strand.
  • In another embodiment of the present disclosure, a cloning vector is provided wherein
  • the cloning vector is comprised of a TAG sequence, a transcription termination signal,
  • preferably a poly A-signal, a nucleotide sequence useful to enable RNA synthesis, preferably
  • transcription termination signal preferably a poly A-signal
  • a cloning vector is provided wherein the cloning vector is comprised of a
  • TAG sequence preferably a poly A-signal; a nucleotide
  • RNA synthesis preferably a T7 promoter sequence, and a MA
  • TAG sequence is located 3' to the DNA promoter sequence candidate and
  • the transcription termination signal preferably a poly A-signal, is located 3 ' to the TAG
  • a cloning vector is provided wherein
  • the cloning vector is comprised of a TAG sequence, a transcription termination signal,
  • preferably a poly A-signal, a nucleotide sequence useful to enable RNA synthesis, preferably
  • cloning vector is provided wherein the cloning vector is comprised of a DNA promoter
  • signal a nucleotide sequence useful to enable RNA synthesis, preferably a T7 promoter
  • termination signal preferably a poly A-signal
  • a cloning vector is provided wherein
  • the cloning vector is comprised of a TAG sequence, a transcription termination signal,
  • preferably a poly A-signal, a nucleotide sequence useful to enable RNA synthesis, preferably
  • the TAG sequence is located 5' to the transcription termination
  • transcription termination signal is 3' to a DNA promoter
  • TAG sequence and TAG sequence is operably linked to the transcription termination signal
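The element order stated in these embodiments (TAG sequence 3' to the DNA promoter sequence candidate and 5' to the transcription termination signal) can be checked mechanically for a planned construct. The Python sketch below is illustrative only: the function name `is_valid_layout`, the feature labels, and the use of start coordinates on the sense strand are assumptions, not part of the disclosure.

```python
# Expected 5' to 3' order on the sense strand, as stated in the text.
EXPECTED_ORDER = ["promoter_candidate", "TAG", "termination_signal"]


def is_valid_layout(features: list[tuple[str, int]]) -> bool:
    """Return True if the named features appear in the expected 5' to 3' order.

    features holds (name, start position) pairs; extra features are ignored.
    """
    positions = dict(features)
    try:
        starts = [positions[name] for name in EXPECTED_ORDER]
    except KeyError:
        return False  # a required element is missing
    return all(a < b for a, b in zip(starts, starts[1:]))


# Hypothetical coordinates for one construct.
ok = is_valid_layout([
    ("promoter_candidate", 100),
    ("TAG", 900),
    ("termination_signal", 1100),
])
```

A check like this only confirms relative order; it says nothing about whether the promoter candidate actually drives transcription of the TAG.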
  • a cloning vector is provided wherein
  • the cloning vector is comprised of a pair of MCS, a TAG sequence, a transcription
  • termination signal preferably a poly A-signal, a nucleotide sequence useful to enable RNA
  • the present disclosure provides an array-based method for promoter detection and
  • the method provides for transcriptional products that are tagged as they are
  • the method fulfills the need for reduction of labor, costs, and provides for
  • transformation e.g., electroporation, lipofection.
  • nucleic acids are written left to right in 5' to 3' orientation; amino acid
  • Amino acids may be referred to herein by either their commonly known three
  • amplified refers to the construction of multiple copies of a nucleic acid
  • Amplification systems include, for example, the
  • PCR polymerase chain reaction
  • LCR ligase chain reaction
  • TAS transcription-based amplification system
  • SDA strand displacement amplification
  • the product of amplification is termed an amplicon.
  • array refers to an array containing nucleic acid samples.
  • An array may be
  • microarray refers to an array containing
  • nucleic acid samples also referred to as microscopic DNA 'spots,' bound to solid substrates
  • occupied by each sample is usually 50-200 µm in diameter, nucleic acid samples representing
  • the solid substrate may include
  • Macroarrays may be such as those available commercially (Clontech)
  • Beads may be of those used in peptide, nucleic acid and organic
  • Microarrays allow the genes of a given sample to be simultaneously monitored
  • Microarrays may be fabricated by
  • nucleic acid samples may be manually deposited.
  • the term "DNA microarray" may apply to several different forms of the technology, each differing in the type of nucleic acid applied
  • "assay marker" or a "reporter gene" refers to a gene that can be detected, or
  • the expression of the reporter gene may be measured at either the RNA level, or
  • the gene product may be detected in experimental assay protocol, such as
  • marker enzymes, antigens, amino acid sequence markers, cellular phenotypic markers,
  • reporter gene is a gene that
  • include fluorescent proteins, luciferase, beta-galactosidase, and selectable markers, such as
  • cDNA refers to DNA synthesized from a mature mRNA template.
  • RNA (pre-mRNA); b) the same cell processes the pre-mRNA strands by splicing out introns,
  • cloning host cell refers to a host cell that contains a cloning vector.
  • cloning vector refers to a DNA molecule such as a plasmid, cosmid, or
  • bacterial phage or virus, such as, for example retroviruses, adeno-associated adenoviruses,
  • Cloning vectors typically contain one or a small number of
  • Selectable marker genes may include genes that provide
  • detectable marker encompasses both the selectable markers and assay
  • markers refers to a variety of gene products to which cells
  • markers such as receptors for adherence ligands allowing selective adherence, and the like.
  • detectable response refers to any signal or response that may be detected
  • responses include, but are not limited to, radioactive decay and energy (e.g., fluorescent,
  • a detectable response may be the result of an assay to measure one
  • a biologic material such as melting point, density, conductivity, surface
  • a "detection reagent" is any
  • Detection reagents include any of a variety of molecules, such as
  • antibodies, nucleic acid sequences and enzymes.
  • a detection reagent may comprise a marker.
  • DNA recombination sequences refers to nucleic acid sequence that
  • BP Clonase may be used to mediate the lambda recombination reactions. Transferring a gene
  • the expression clone contains the gene of interest recombined into the
  • Gateway® cloning system (Invitrogen Inc., Carlsbad, CA).
  • Electroporation is done with electroporators,
  • the solution is pipetted into a glass or plastic cuvette which has two Al electrodes
  • expression system refers to a genetic sequence which includes a protein
  • the expression system will include a
  • regulatory element such as a promoter or enhancer, to increase transcription and/or
  • regulatory element may be located upstream or downstream of the protein encoding region
  • expression vector refers to a DNA molecule comprising a gene that is
  • gene expression is placed under the control of certain
  • regulatory elements including promoters, tissue specific regulatory elements, and enhancers.
  • homing endonucleases refers to double stranded DNases that have large
  • Introns are spliced out of precursor RNAs, while
  • inteins are spliced out of precursor proteins. Homing endonucleases are named using
  • endonuclease recognition sites are extremely rare. For example, an 18 base pair recognition
  • host cell encompasses any cell which contains a vector and preferably
  • Host cells may be prokaryotic cells such as Escherichia coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells
  • the term as used herein means any cell which may be in culture or in vivo as part of a
  • hybridization refers to the process of combining complementary, single-
  • IRES internal ribosome entry site
  • An IRES is included to initiate translation of selectable marker protein coding sequences.
  • suitable IRESes include the mammalian
  • IRES of the immunoglobulin heavy-chain-binding protein (BiP).
  • Other suitable IRESes are
  • IRESes include those from the picornaviruses.
  • encephalomyocarditis virus preferably nucleotide numbers 163-746
  • poliovirus preferably
  • the IRES are located in the long 5' untranslated regions of the picornaviruses
  • isolated refers to material, such as a nucleic acid or a protein, which is: (1)
  • the isolated material optionally comprises
  • a naturally occurring nucleic acid becomes an isolated nucleic acid if it is altered, or if it is transcribed from DNA which has been altered, by means of
  • nucleic acid e.g., a promoter
  • nucleic acids which are "isolated" as defined herein are also provided.
  • referred to as "heterologous" nucleic acids.
  • cell refers to "transfection" or "transformation" or "transduction" and includes reference to
  • the incorporation of a nucleic acid into a eukaryotic or prokaryotic cell where the nucleic acid
  • may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid or
  • label refers to incorporation of a detectable marker
  • a nucleic acid that can be detected or measured.
  • Various methods of labeling nucleic acids are known in the art.
  • 2002 may be used.
  • labels for nucleic acids include, but are not limited to,
  • radioisotopes e.g., 32P-labeled NTPs and dNTPs; 35S-labeled NTPs and
  • MA segment also referred to as a “MA sequence,” refers to a nucleotide
  • complementary primer can anneal and initiate the synthesis of the first strand cDNA in order
  • the MA sequence is usually 20 to 30 nucleotides in length
  • mixing refers to combining, joining, uniting, associating, fusing, or
  • refers to a short segment of DNA which contains many (usually 20+) sites recognized by
  • nucleic acid refers to a deoxyribonucleotide or ribonucleotide polymer in
  • nucleotide refers to a chemical compound that consists of a heterocyclic
  • is a derivative of purine or pyrimidine, and the sugar is the pentose deoxyribose or ribose
  • Nucleotides are the monomers of nucleic acids, with three or more bonding together in order
  • Nucleotides are the structural units of RNA, DNA, and several
  • the cofactors CoA, FAD, FMN, NAD, and NADP.
  • the purines include adenine (A), and guanine (G); the pyrimidines include cytosine (C), thymine (T), and uracil (U).
  • population indicates that all cells within that population are genetically identical.
  • operably linked refers to a functional linkage between a promoter and a
  • nucleic acid sequences being linked are contiguous and, where necessary to join two
  • optical density refers to the absorbance of an optical element for a given
  • PCR polymerase chain reaction
  • polynucleotide refers to a deoxyribopolynucleotide, ribopolynucleotide, or
  • a polynucleotide can be full-length or a
  • one or more amino acid residue is an artificial chemical analogue of a corresponding amino acid residue
  • polypeptides are not entirely linear.
  • polypeptides may be
  • branched as a result of ubiquitination, and they may be circular, with or without branching,
  • branched circular polypeptides may be synthesized by non-translation natural process and by
  • primer refers to a nucleic acid which, when hybridized to a strand of
  • DNA is capable of initiating the synthesis of an extension product in the presence of a
  • the primer preferably is sufficiently long to hybridize
  • a primer may also be used on RNA, for
  • promoter refers to a region of DNA upstream, downstream, or distal, from
  • T7, T3 and Sp6 are RNA polymerase
  • promoters are a means to demarcate which genes
  • promoter sequence candidate refers to a nucleotide sequence that contains
  • a promoter sequence candidate may be provided by a
  • promoterless refers to a protein coding sequence contained in a vector
  • the vector, plasmid, viral or otherwise may contain a promoter, but that promoter
  • protein coding sequence refers a nucleotide sequence encoding a
  • Protein coding sequences include those commonly
  • protein coding sequences include those coding
  • sequences include thymidine kinase, beta-galactosidase, tryptophan synthetase, neomycin
  • DHFR dihydrofolate reductase
  • expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA,
  • expression vector includes, among other sequences, a nucleic acid to be transcribed, a
  • a transcription termination signal such as a poly-A signal.
  • recombinant host refers to any prokaryotic or eukaryotic cell that contains
  • regulatory sequence also called regulatory region or regulatory element
  • reporter cell line refers to prokaryotic or eukaryotic cells that contain a
  • restriction enzyme refers to an enzyme that
  • the enzyme makes two incisions, one through each of the
  • Type I, Type II, Type III, and Type II
  • the sites of actual cleavage are at variable distances from these recognition sites
  • the restriction enzyme is independent of its methylase, and cleavage occurs at very
  • Type IIb enzymes cut sequences twice at both sites outside of
  • the sites are generally, but not necessarily, palindromic, (because restriction
  • enzymes usually bind as homodimers) and a particular enzyme may cut between two
  • RT-PCR refers to amplifying a defined piece of a ribonucleic acid (RNA) molecule
  • RNA strand is first reverse transcribed into its DNA complement or complementary DNA
  • selectable marker refers to a gene introduced into a cell, especially a
  • adenosine deaminase (thymidine, hypoxanthine, 9-β-D-xylofuranosyl adenine, T-
  • Negative selectable markers may utilize: cytosine deaminase (5-
  • sense refers to the general concept used to compare the polarity of nucleic acid
  • TAG refers to a DNA sequence composed of random nucleotides
  • the length of the TAG sequence is short, preferably between about
  • are preferably different or distinct enough to avoid annealing to each other at times when the
  • oligonucleotide is present as a single strand.
  • sequence should not be self-
  • each TAG sequence will have approximately equivalent
  • each TAG sequence has approximately
  • transcription termination signal refers to a section of genetic sequence that
  • marks the end of gene or operon on genomic DNA for transcription. In prokaryotes, two
  • signals where a hairpin structure forms within the nascent transcript that disrupts the mRNA-
  • signals are recognized by protein factors that co-transcriptionally cleave the nascent RNA at a
  • are distinct from termination codons that occur in the mRNA and are the stopping signal for
  • translation which may also be called nonsense codons.
  • translational stop sequence refers to a sequence which codes for the
  • the translational stop sequence may be in
  • transfection refers to the introduction of foreign DNA into eukaryotic or
  • Transfection typically involves opening transient holes in cells to allow the
  • HEPES-buffered saline solution containing phosphate ions is combined with a
  • MgCl2 or RbCl can be
  • Liposomes are small, membrane-
  • lipid-cation based transfection is typically used.
  • Other methods of transfection include
  • gives the cell some selection advantage, such as resistance towards a certain toxin. If the
  • Geneticin also known as G418, which is a toxin that can be

Abstract

The present disclosure describes an array-based method for promoter detection and analysis. Promoter sequence candidates are analyzed simultaneously in one reaction vial utilizing a vector comprising a TAG sequence wherein transcriptional products are tagged as they are synthesized, in such a way that one specific transcript is labeled with only one type of tag, and one tag labels only one type of transcript. The transcriptional output is analyzed on conventional arrays.

Description

PROMOTER DETECTION AND ANALYSIS
This application claims priority to U.S. Patent Application Ser. No. 11/925,837 filed 27 October 2007.
This invention was made with government support under Grant 1R43HG003559 awarded by the National Institutes of Health. The government has certain rights in the invention.
TECHNICAL FIELD
The present disclosure relates to methods for detecting regulatory elements in a cell sample. More specifically, the disclosure relates to methods for detecting regulatory elements in multiple cell samples at the same time and uses arising therefrom. The present disclosure also provides a vector for detection and analysis of regulatory elements.
BACKGROUND
The genes of all living organisms are encoded by the nucleic acids DNA and RNA. Each gene encodes a protein that may be produced by the organism through expression of the gene. The systems that regulate gene expression respond to a wide variety of developmental and environmental stimuli, thus allowing each cell type to express a unique and characteristic subset of its genes, and to adjust the dosage of particular gene products as needed. The importance of dosage control is underscored by the fact that targeted disruption of key regulatory molecules in mice often results in drastic phenotypic abnormalities (Johnson, R. S., et al., Cell, 71:577-586 (1992)), just as inherited or acquired defects in the function of genetic regulatory mechanisms contribute broadly to human disease.
Standard molecular biology techniques have been used to analyze the expression of genes in a cell by measuring nucleic acids. These techniques include PCR, northern blot analysis, or other types of DNA probe analysis such as in situ hybridization. Each of these methods allows one to analyze the transcription of only known genes and/or small numbers of genes at a time (Nucl. Acids Res. 19, 7097-7104 (1991); Nucl. Acids Res. 18, 4833-4842 (1990); Nucl. Acids Res. 18, 2789-2792 (1989); European J. Neuroscience 2, 1063-1073 (1990); Analytical Biochem. 187, 364-373 (1990); Genet. Anal. Techn. Appl. 7, 64-70 (1990); GATA 8(4), 129-133 (1991); Proc. Natl. Acad. Sci. USA 85, 1696-1700 (1988); Nucl. Acids Res. 19, 1954 (1991); Proc. Natl. Acad. Sci. USA 88, 1943-1947 (1991); Nucl. Acids Res. 19, 6123-6127 (1991); Proc. Natl. Acad. Sci. USA 85, 5738-5742 (1988); Nucl. Acids Res. 16, 10937 (1988)).
Measurement of the levels of mRNA has also been used to monitor gene expression. Since proteins are translated from mRNA, it is possible to detect transcription by measuring the amount of mRNA present.
One common method, called "hybridization subtraction", allows one to look for changes in gene expression by detecting changes in mRNA expression (Nucl. Acids Res. 19, 7097-7104 (1991); Nucl. Acids Res. 18, 4833-4842 (1990); Nucl. Acids Res. 18, 2789-2792 (1989); European J. Neuroscience 2, 1063-1073 (1990); Analytical Biochem. 187, 364-373 (1990); Genet. Anal. Techn. Appl. 7, 64-70 (1990); GATA 8(4), 129-133 (1991); Proc. Natl. Acad. Sci. USA 85, 1696-1700 (1988); Nucl. Acids Res. 19, 1954 (1991); Proc. Natl. Acad. Sci. USA 88, 1943-1947 (1991); Nucl. Acids Res. 19, 6123-6127 (1991); Proc. Natl. Acad. Sci. USA 85, 5738-5742 (1988); Nucl. Acids Res. 16, 10937 (1988)).
Gene expression has also been monitored by measuring levels of the gene product (i.e., the expressed protein) in a cell, tissue, organ system, or even organism. Measurement of gene expression by measuring the protein gene product may be performed using antibodies known to bind to the particular protein to be detected. A difficulty arises in needing to generate antibodies to each protein to be detected. Measurement of gene expression via protein detection may also be performed using 2-dimensional gel electrophoresis, wherein proteins can be, in principle, identified and quantified as individual bands, and ultimately reduced to a discrete signal. In order to positively analyze each band, each band must be excised from the membrane and subjected to protein sequence analysis (e.g., Edman degradation). However, it tends to be difficult to isolate a sufficient amount of protein to obtain a reliable protein sequence. In addition, many of the bands often contain multiple proteins. Another difficulty associated with quantifying gene expression by measuring an amount of protein gene product in a cell is that protein expression is an indirect measure of gene expression. It is impossible to know from a protein present in a cell when the expression of that protein occurred.
Thus, it is difficult to determine whether the protein expression changes over time due to cells being exposed to different stimuli. The measurement of the amount of particular activated transcription factors has been used to monitor gene expression. Transcription in a cell is controlled by activated transcription factors which bind to DNA at sites outside the core promoter for the gene and activate transcription. Since activated transcription factors activate transcription, detection of their presence is useful for measuring gene expression. Transcriptional activators are found in prokaryotes, viruses, and eukaryotes.
In molecular biology, a reporter gene (often simply reporter) is a gene that researchers often attach to another gene of interest in cell culture, animals or plants. Certain genes are chosen as reporters because the characteristics they confer on organisms expressing them are easily identified and measured, or because they are selectable markers. Reporter genes are generally used to determine whether the gene of interest has been taken up by or expressed in the cell or organism population. To introduce a reporter gene into an organism, researchers place the reporter gene and the gene of interest in the same DNA construct to be inserted into the cell or organism. For bacteria or eukaryotic cells in culture, this is usually in the form of a circular DNA molecule called a plasmid. It is important to use a reporter gene that is not natively expressed in the cell or organism under study, since the expression of the reporter is being used as a marker for successful uptake of the gene of interest.
Commonly used reporter genes that induce visually identifiable characteristics usually encode fluorescent or luminescent proteins; for example, green fluorescent protein (GFP) and luciferase. Other reporters include, for example, beta-galactosidase (detected with the substrate X-gal) and chloramphenicol acetyltransferase (CAT).
Many methods of transfection and transformation - two ways of expressing a foreign or modified gene in an organism - are effective in only a small percentage of a population subjected to the techniques. Thus, a method for identifying those few successful gene uptake events is necessary. Reporter genes used in this way are normally expressed under their own promoter, independent from that of the introduced gene of interest; the reporter gene can be expressed constitutively ("always on") or inducibly with an external intervention such as the introduction of IPTG in the beta-galactosidase system. As a result, the reporter gene's expression is independent of the gene of interest's expression, which is an advantage when the gene of interest is only expressed under certain specific conditions or in tissues that are difficult to access. In the case of selectable-marker reporters such as CAT, the transfected population of bacteria can be grown on a substrate that contains chloramphenicol. Only those cells that have successfully taken up the construct containing the CAT gene will survive and multiply under these conditions.
Reporter genes can also be used to assay for the expression of the gene of interest, which may produce a protein that has little obvious or immediate effect on the cell culture or organism. In these cases the reporter is directly attached to the gene of interest to create a gene fusion. The two genes are under the same promoter and are transcribed into a single messenger RNA that is translated into a single polypeptide chain. In these cases it is important that both proteins be able to properly fold into their active conformations and interact with their substrates despite being fused. In building the DNA construct, a segment of DNA coding for a flexible polypeptide linker region is usually included so that the reporter and the gene of interest will only minimally interfere with one another.
Reporter genes can be used to assay for the activity of a particular promoter in a cell or organism. In this case there is no separate "gene of interest"; the reporter gene is simply placed under the control of the target promoter and the reporter gene product's activity is quantitatively measured. The results are normally reported relative to the activity under a "consensus" promoter known to induce strong gene expression.
In the past few years, the sequencing of numerous genomes, both eukaryotic and prokaryotic, has generated an enormous amount of data. Although detection of coding regions is common, the major challenge is to annotate the functional non-coding sequences, in particular those involved in gene transcription. Because transcription plays a pivotal role in regulating important processes such as morphogenesis, cell differentiation, tissue specificity, hormonal communication, and cellular stress responses, a need for the identification and functional characterization of transcriptional promoters exists. The methods for detection and analysis of transcriptional promoters can be divided into two categories: computational methods and experimental methods.
Computational methods for promoter studies incorporate the many public and private databases containing information gathered from studies published by hundreds of laboratories and conducted using conventional labor-intensive and time-consuming approaches. The Eukaryotic Promoter Database (EPD) and the Transcription Regulatory Regions Database (TRRD) contain 1,871 and 703 entries of human promoters, respectively. Other promoter databases, such as TransFac and DBTSS, contain almost 9,000 promoter sequences. However, most of these are derived from in silico primer extension assays (e.g., TransFac), or contain only data about the putative transcriptional start site (e.g., DBTSS). The small numbers of experimentally validated human promoters compared to the 35,000 expected human genes indicate the magnitude of the work still to be done.
Numerous computer-based promoter prediction methods have been developed (Scherf et al., J. Mol. Biol. 297(3):599-606, 2000; Werner, T. Brief Bioinform. 1(4):372-80, 2000; Loots et al., Gen. Res. 12:832-839, 2002). These methods are limited by the lack of a reliable, standard protocol to predict and identify promoter regions. Promoters are generally only a few base pairs (bp) long, and are embedded within the massive genome. Thus, promoters are much more difficult to find and are easier to confuse than long, patterned coding sequences. Typical computer algorithms for promoter prediction are based on comparisons of unknown sequences with known elements, a strategy which does not allow for identification of new types of promoter elements. Thus, computer-based searches for promoter elements are incomplete and always require experimental confirmation.
Computational methods based on microarray data have been used to investigate genome-wide transcriptional regulation (Pilpel et al., Nat. Gen. 29(2):153-9, 2001). These techniques allow for the identification of novel functional motif combinations in the promoters of a given organism, and may provide a global view of transcription networks. However, the data provided from these methods also need confirmation by experimental means.
The experimental methods for investigation of a promoter region and subsequent characterization usually follow a basic protocol. First, upon identification of a new coding sequence, the transcription start site is defined with standard molecular biology tools such as S1 mapping, primer extension, or 5'RACE. Second, the upstream genomic region (up to 10 kb) is cloned and demonstrated to have promoter activity by performing a reporter assay in a transient transfection system. Third, deletion and point mutation analyses are performed to define the important transcriptional cis-acting elements; information about transcriptional regulation may be obtained by applying different induction or repression agents in transient transfection assays. Finally, the transcription factors involved in promoter regulation are identified by DNase I footprinting, electrophoretic mobility shift assay (EMSA) in the presence or absence of mutant probes and competitors, and EMSA supershift assay.
Transient-transfection based experimental methods have several disadvantages. These methods measure reporter protein level instead of mRNA level, which is the direct product of the transcription; protein levels may not always correlate with mRNA levels. There are a limited number of reporter assays available (e.g. chloramphenicol acetyltransferase, β-galactosidase, luciferase, green fluorescent protein (GFP), β-glucuronidase) and the utilization of the same reporter to compare various promoters implies that these promoters must be tested separately and thus these assays are labor-intensive and time-consuming. Since each of the many steps involved (i.e. transfection, induction, harvest, reporter detection) is performed separately for each promoter investigated, usually in duplicate or triplicate, the handling of more than 20 constructs simultaneously is challenging. For each step performed, the time difference between the first and last sample may be significant; therefore incubation periods, cell and reagent quality, for example, may differ from one sample to the other, thus introducing more experimental variation. Large amounts of material and reagents are required. Additionally, in order to compare a series of promoters to each other, a second reporter cassette has to be included as an internal control. In some instances, the detection of this control may be as time-consuming and labor-intensive as for the first reporter, and subject to experimental errors. The expression of this internal control can also compete with the gene expression driven by the promoter of interest, and affect the results of the assay. Some assays, such as luciferase and GFP assays, require expensive instrumentation.
Kim et al. reported an experimental method for isolation and identification of promoters in the human genome (Kim et al. Genome Research 15:830-839, 2005). However, the use of antibodies to identify regions that may be associated with active transcription and the required binding of both RNAP and TFIID as criteria for promoters may lead to the elimination of some promoters that only show partial binding.
Khambata-Ford et al. reported an experimental method for identification of promoter regions in the human genome by using a retroviral plasmid library-based functional reporter gene assay (Khambata-Ford et al., Gen. Res. 13:1765-1774, 2003). However, in addition to allowing potentially lethal disruption of the target cell genome by random integration of the retroviral vector, the assay relies on the fluorescent reporter GFP for detection and screens the cells via fluorescence-activated cell sorting (FACS).
Trinklein et al. reported an experimental method for identification and functional analysis of human transcriptional promoters (Trinklein et al., Gen. Res. 13:308-312, 2003) by using a draft sequence of the human genome and cDNA libraries. However, for further analysis and identification of promoter sequences they used a luciferase-based transfection assay.
The sequencing of genomes has generated a huge amount of data that needs to be annotated. Computational methods are available to detect putative transcriptional promoter regions, but they are not 100% efficient and must be confirmed by experimentation. Unfortunately, the experimental procedures that are currently available to study promoters are time-consuming, laborious, and not easily adapted to large numbers of promoters. Therefore, new techniques for transcriptional studies are needed.
SUMMARY
The foregoing disadvantages of the previously described methods are overcome by providing a novel reporter system that incorporates unique, non-coding DNA sequences. The object of the present disclosure is to provide a novel reporter system that is specific, inexpensive, and provides an efficient means of promoter detection.
The present disclosure provides a method for the detection and analysis of DNA promoter sequences. In a preferred embodiment, the present disclosure provides a method for detecting DNA regulatory sequences comprising: a) inserting a promoter sequence candidate into a vector wherein the vector comprises a TAG sequence and wherein the promoter sequence candidate is inserted in a position to drive transcription of the TAG sequence; b) the vector containing the inserted promoter sequence candidate is inserted into a cloning host cell; c) cloning host cells containing different promoter sequence candidates are grown to the same optical density, pooled and the vectors therein are extracted, purified and inserted into a reporter cell line; d) mRNA is extracted from the reporter cell lines wherein the mRNA is directly labeled or is used as template for cDNA or probe synthesis; and e) the labeled mRNA, cDNA or probe is analyzed with an array wherein the array comprises identical or complementary sequence to the TAG sequence. Preferably, the labeled mRNA, cDNA or probe hybridizes to the array and the label of the mRNA, cDNA or probe has a detectable response.
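Step e) relies on array features that are identical or complementary to each TAG, and on each TAG being paired with exactly one promoter sequence candidate. The following Python sketch illustrates one way such a probe table might be derived. The function names, the TAG sequences, and the candidate names are hypothetical and are not taken from the disclosure; whether the identical or the complementary strand is spotted would depend on whether labeled mRNA, cDNA or another probe is hybridized.

```python
# Illustrative sketch only: the names and sequences used with these helpers
# are hypothetical and do not come from the disclosure.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_array_probes(tag_by_candidate: dict[str, str]) -> dict[str, str]:
    """Map each array probe (complementary to a TAG) back to its candidate.

    Raises ValueError if two candidates share a TAG, because the readout
    assumes a one-to-one pairing of TAG and promoter sequence candidate.
    """
    probes: dict[str, str] = {}
    for candidate, tag in tag_by_candidate.items():
        probe = reverse_complement(tag)
        if probe in probes:
            raise ValueError(
                f"TAG reused by {probes[probe]!r} and {candidate!r}"
            )
        probes[probe] = candidate
    return probes
```

Calling `build_array_probes({"candidate_1": "AAAC", "candidate_2": "GGGT"})` would return a table keyed by the complementary probe sequences, so that a hybridization signal at a given array feature can be traced back to a single candidate.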
In another embodiment, the present disclosure provides a method for the detection and analysis of DNA promoter sequence candidates wherein DNA promoter sequence candidates are integrated into vectors that comprise a TAG sequence, one or more multiple-cloning sites, one or more DNA recombination sequences, a negative selection marker, nucleotide sequences useful for the detection of mRNA sequences such as a T7 promoter sequence and a MA segment, a translation stop codon, a RNA stabilization fragment such as the one from the alpha-globin gene, and a transcription termination signal, such as a poly A signal, and wherein the DNA promoter sequence candidates are located such that they drive the transcription of the TAG sequences.
In another embodiment, the present disclosure provides a method for the detection and analysis of DNA promoter sequences wherein DNA promoter sequence candidates are integrated into a vector comprising a TAG sequence, one or more multiple-cloning sites, both of attP1 and attP2 sequences, a negative selection marker wherein the negative selection marker is the ccdB gene, a T7 promoter sequence, a MA segment, a translation stop codon, an alpha-globin RNA stabilization fragment, and a poly A signal, and wherein the DNA promoter sequence candidate drives the transcription of the TAG sequence.
In another embodiment, the present disclosure provides a method for the detection and analysis of DNA promoter sequences wherein DNA promoter sequence candidates are integrated into a vector wherein the vector comprises a TAG sequence, one or more multiple-cloning sites, both of attP1 and attP2 sequences, a negative selection marker, a T7 promoter sequence, a MA sequence wherein the MA sequence is comprised of approximately 25% A, 25% T, 25% G, and 25% C, a translation stop codon, a RNA stabilization fragment, and a transcription termination signal, and wherein the DNA promoter sequence candidate drives the transcription of the TAG sequence. Preferably, the vector is a plasmid. Preferably, the RNA stabilization fragment is from an alpha-globin gene. Preferably, the transcription termination signal is a poly A signal.
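The composition requirement on the MA sequence (approximately 25% each of A, T, G and C) lends itself to a simple check. The sketch below is a minimal illustration in Python; the function names and the 5 percentage point tolerance are assumptions chosen for the example, since the text says only "approximately".

```python
from collections import Counter


def base_fractions(seq: str) -> dict[str, float]:
    """Return the fraction of A, T, G and C in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {base: counts.get(base, 0) / len(seq) for base in "ATGC"}


def is_balanced(seq: str, tolerance: float = 0.05) -> bool:
    """True if every base is within `tolerance` of 25% of the sequence.

    The default tolerance is an assumed value, not one given in the text.
    """
    return all(
        abs(fraction - 0.25) <= tolerance
        for fraction in base_fractions(seq).values()
    )
```

A candidate MA sequence that fails `is_balanced` could be redrawn or adjusted before it is built into the vector; the threshold would need to be set to whatever "approximately" is taken to mean in practice.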
In another embodiment, the present disclosure provides a method for the detection and analysis of DNA promoter sequences wherein DNA promoter sequence candidates are integrated into a vector wherein the vector comprises a TAG sequence, one or more multiple-cloning sites, one or more DNA recombination sequences, a negative selection marker, a T7 promoter sequence, a MA sequence wherein the MA sequence is comprised of approximately 25% A, 25% T, 25% G, and 25% C, a translation stop codon wherein the translation stop is in three frames, a RNA stabilization fragment, and a transcription termination signal, and wherein the DNA promoter sequence candidate is located such that it drives the transcription of the TAG sequence. Preferably, the vector is a plasmid. Preferably, the RNA stabilization fragment is from an alpha-globin gene. Preferably, the transcription termination signal is a poly A signal. Preferably, the DNA recombination sequences are attP1 and attP2.
261 In another embodiment, the present disclosure provides a method for the detection
262 and analysis of DNA promoter sequences comprising: (a) integrating DNA promoter
263 sequence candidates within TAG-vectors, wherein the DNA promoter sequence candidate is
264 located such that it drives the transcription of the TAG sequence, wherein the TAG-vector
265 comprises: multiple cloning sites (MCS) for inserting DNA promoter sequence candidate,
266 DNA recombination sequences, such as att? 1 and atiP2, between which DNA promoter
267 sequence candidates can be inserted; a negative selection marker to maximize the recovery of
268 clones containing promoter sequence inserts, such as ccdB; a nucleotide sequence useful to
269 enable RNA synthesis, preferably a T7 promoter sequence; a unique reporter TAG; a specific
270 MA segment useful to synthesize probes from RNA, wherein the MA segment is comprised
271 of approximately 25% A, 25% T, 25% G, and 25% C; a three frame translation stop codon;
272 RNA stabilization fragment, preferably from a hemoglobin or alpha-globin gene; and a
273 transcription termination signal, such as a poly A-signal; (b) the TAG-vectors with the
274 promoter sequence candidate inserts are cloned into a host, preferably Escherichia coli, and
275 the clones are arrayed into a 96-well plate and grown to about the same cell density; (c) the
276 resultant clones are pooled, and the vectors wherein are purified; (d) the purified vector
277 mixture is transfected into a cell line of interest; and (e) the RNA is extracted, labeled, and
278 quantified by hybridization to the DNA TAG sequences arrayed on a membrane or glass
279 support, or beads. Suitable bead compositions include those used in peptide, nucleic acid and
280 organic moiety synthesis, including but not limited to, plastics, ceramics, glass, polystyrene,
281 methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium
282 dioxide, latex or cross-linked dextrans such as sepharose, cellulose, nylon, cross-linked
283 micelles and teflon, any of which may be used (see Microsphere Detection Guide, Bangs Laboratories,
284 Fishers Ind.). Preferably, the vector is a plasmid. Preferably, the label of the mRNA, cDNA
285 or probe has a detectable response.
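Step (e) above yields one hybridization signal per arrayed TAG. The following is a minimal sketch, not the disclosed method, of how such signals might be turned into relative promoter activities. The background subtraction and the normalization to the pooled total are assumptions made for illustration, and the function name and TAG labels are hypothetical.

```python
def relative_activity(signals, background=0.0):
    """Convert raw per-TAG hybridization signals into fractions of the total.

    signals: dict mapping a TAG label to its raw signal intensity.
    background: assumed uniform background intensity to subtract.
    """
    # Subtract background and clip at zero so noise cannot go negative.
    corrected = {tag: max(value - background, 0.0) for tag, value in signals.items()}
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no signal above background")
    # Each promoter's share of the total signal in the pooled transfection.
    return {tag: value / total for tag, value in corrected.items()}


if __name__ == "__main__":
    example = {"TAG01": 300.0, "TAG02": 100.0, "TAG03": 0.0}  # hypothetical readings
    for tag, share in relative_activity(example).items():
        print(tag, round(share, 3))
```

Because all candidates are transfected together, comparing each TAG's share of the pooled signal is one simple way to rank promoters against one another.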
286 In another embodiment of the present disclosure, a method is provided wherein each
287 DNA promoter sequence candidate under investigation (for example, computer-predicted
288 DNA promoter sequence candidates, DNA fragments from a collection of nucleotide
289 sequences, such as a genomic library, deletion or site-directed mutants of a specific DNA
290 promoter, tissue-specific promoters, artificial promoters, etc.) drives the transcription of a
291 unique mRNA that consists of a short oligonucleotide TAG embedded in the 5' end of a
292 luciferase coding sequence, wherein equimolar amounts of the various promoters under
293 investigation are pooled and transfected into a cell line, and wherein the mRNA levels are
294 quantified by hybridization to the TAG oligonucleotides in an array format. In another
295 embodiment, the reporters are short oligonucleotide TAGs. In another embodiment, the
296 length of the TAG sequence is between about 16 base pairs and about 200 base pairs, more
297 preferably between about 20 base pairs and about 175 base pairs, more preferably between
298 about 25 base pairs and about 150 base pairs, more preferably between about 30 base pairs
299 and about 125 base pairs, more preferably between about 45 base pairs and about 100 base
300 pairs, more preferably between about 50 base pairs and about 75 base pairs, more preferably
301 about 65 base pairs, and most preferably 60 bp. In another embodiment, all the TAG
302 sequences are designed to have approximately the same melting temperature; this feature
303 allows for the unbiased quantification of various mRNAs by hybridization under the same
304 temperature and ionic strength conditions. In another embodiment, the method enables the
305 detection and quantification of mRNA levels, instead of reporter protein levels, and is
306 unaffected by potentially interfering translation and posttranslational events as in the
307 conventional reporter assays. In another embodiment of the present disclosure, each of the
308 clones containing a TAG vector, preferably a plasmid, is grown to about the same cell
309 density, and the purified vectors, preferably plasmids, of these clonal cultures, containing
310 every DNA promoter sequence candidate, are mixed, and the resulting mixture is transfected into
311 a single population of cells creating a competitive environment for the various promoters to
312 recruit transcription factors. In another embodiment, vectors, preferably plasmids, purified
313 from the clonal cell cultures of about equal cell density and containing about equimolar
314 amounts of all the DNA promoter sequences are mixed and used for transfection of a single
315 population of cells and the need for internal controls is eliminated. There are several ways to
316 obtain equimolar amounts of the vectors that carry the various candidate promoter-TAG
317 combinations that are used to transfect reporter cell lines. In another embodiment, equimolar
318 amounts of the vectors can be obtained by: 1) making the vector library; 2) arraying the vector
319 library (e.g., in a 96-well plate); 3) taking an equal fraction from each clone and pooling them all; 4)
320 growing all clones together, assuming the same growth rate and a yield of the same amount of vector
321 per cell; 5) extracting the transformation agent (e.g., a vector, plasmid or virus); and 6) transfecting
322 the vector or plasmid into (or infecting with the virus) a reporter cell line. Alternately, equimolar
323 amounts of the vector can be obtained by: 1) making the vector library; 2) arraying the vector
324 library (e.g., in a 96-well plate); 3) growing each clone individually (e.g., in a deep-well plate in the case
325 of bacteria); 4) taking an equal fraction from each clone and pooling them all; 5) extracting the
326 transformation agent (e.g., vector, plasmid or virus); and 6) transfecting the vector or plasmid into (or
327 infecting with the virus) the reporter cell line. Alternately, equimolar amounts of the vector can be
328 obtained by: 1) making the vector library; 2) arraying the vector library (e.g., in a 96-well plate); 3)
329 growing each clone individually (e.g., in a deep-well plate in the case of bacteria); 4) extracting the
330 transformation agent (e.g., vector, plasmid or virus) and quantifying it; 5) taking an equal fraction
331 from each clone (e.g., vector, plasmid or virus) and pooling them all; and 6) transfecting the vector or
332 plasmid into (or infecting with the virus) the reporter cell line. Alternately, equimolar amounts of the
333 vector can be obtained by: 1) making the vector library; 2) taking a fraction from each clone
334 and pooling them all; 3) growing all the clones together, assuming the same growth rate and a yield of
335 the same amount of vector per cell; 4) extracting the transformation agent (e.g., vector, plasmid
336 or virus); 5) transfecting the vector or plasmid into (or infecting with the virus) the reporter cell line and determining
337 the TAG of interest (e.g., high level of expression); and 6) finding the clone in the vector library
338 that contains the TAG of interest (e.g., by colony hybridization).
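The variant above in which each purified vector is quantified before pooling can be sketched as a small calculation. This is an illustrative sketch under stated assumptions, not part of the disclosure: it assumes double-stranded DNA with an average mass of about 650 g/mol per base pair, and the function name, clone labels, and example concentrations are hypothetical.

```python
AVG_BP_MASS = 650.0  # assumed average g/mol per base pair of double-stranded DNA


def equimolar_volumes(preps, pmol_each):
    """Volume (in microliters) to take from each prep so every vector contributes pmol_each.

    preps: dict mapping a clone label to (concentration in ng/uL, vector size in bp).
    """
    volumes = {}
    for clone, (ng_per_ul, size_bp) in preps.items():
        if ng_per_ul <= 0 or size_bp <= 0:
            raise ValueError(f"invalid concentration or size for {clone}")
        # pmol = ng * 1000 / (bp * 650), so ng needed = pmol * bp * 650 / 1000.
        ng_needed = pmol_each * size_bp * AVG_BP_MASS / 1000.0
        volumes[clone] = ng_needed / ng_per_ul
    return volumes


if __name__ == "__main__":
    preps = {"A1": (100.0, 5000), "A2": (200.0, 5000)}  # hypothetical measurements
    print(equimolar_volumes(preps, pmol_each=0.1))
```

Vectors carrying inserts of different lengths differ in molar mass, which is why the sketch works in moles rather than in nanograms.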
339 In another embodiment, the present disclosure provides a method for the detection
340 and analysis of DNA promoter sequences comprising: (a) integrating a DNA promoter
341 sequence candidate into a vector, preferably a plasmid, wherein the plasmid comprises a
342 TAG sequence, one or more multiple-cloning sites, at least one DNA recombination
343 sequence, preferably attP1 or attP2, a negative selection marker, preferably ccdB, a
344 nucleotide sequence useful to enable RNA synthesis, such as a T7 promoter sequence, a MA
345 segment, a translation stop codon, a RNA stabilization fragment, preferably from the
346 hemoglobin or alpha-globin gene, and transcription termination signal, such as a poly A-
347 signal, and wherein the DNA promoter sequence candidate is located such that it drives the
348 transcription of the TAG sequence; (b) the vectors with the promoter sequence candidate
349 inserts are cloned into a host, preferably Escherichia coli, and the clones are arrayed into a
350 96-well plate and grown to the same cell density; (c) the resultant clones are pooled, and the
351 vectors wherein are purified; (d) the purified vector mixture is transfected into a cell line of
352 interest wherein the use of internal controls is eliminated; and (e) the RNA is extracted,
353 labeled, and quantified by hybridization to the DNA TAG sequences arrayed on a membrane
354 or glass support. Preferably, the vector is a plasmid. Preferably, the label of the mRNA,
355 cDNA or probe has a detectable response.
356 In another embodiment, the disclosure provides a method for the detection and
357 analysis of DNA promoter sequences comprising integrating a DNA promoter sequence
358 candidate into a vector, preferably a plasmid, wherein the vector comprises a TAG sequence,
359 one or more multiple-cloning sites, at least one DNA recombination sequence, preferably
360 attP1 or attP2, a negative selection marker, such as ccdB, a nucleotide sequence useful to
361 enable RNA synthesis, preferably a T7 promoter sequence, a MA segment, a translation stop
362 codon, an RNA stabilization fragment, preferably from a hemoglobin or alpha-globin gene, and
363 transcription termination signal, preferably a poly A-signal, and wherein the DNA promoter
364 sequence candidate is located such that it drives the transcription of the TAG sequence.
365 In another embodiment, the present disclosure provides a method for the detection
366 and analysis of DNA promoter sequences comprising: (a) integrating a DNA promoter
367 sequence candidate into a vector wherein the vector comprises a TAG sequence, one or more
368 multiple-cloning sites, at least one DNA recombination sequence, a negative selection
369 marker, a nucleotide sequence useful to enable RNA synthesis, a MA segment, a translation
370 stop codon, a RNA stabilization fragment, and a transcription termination signal, and wherein
371 the DNA promoter sequence candidate is located such that it drives the transcription of the
372 TAG sequence; (b) the vectors with the promoter sequence candidate inserts are cloned into a
373 host, preferably Escherichia coli, and the clones are arrayed into a 96-well plate and grown to
374 the same cell density; (c) the resultant clones are pooled, and the vectors wherein are purified;
375 (d) the purified vector mixture is transfected into a cell line of interest; and (e) the RNA is
376 extracted, labeled, and quantified by hybridization to the DNA TAG sequences arrayed on a
377 membrane or glass support. Preferably, the vector is a plasmid. Preferably, the DNA
378 recombination sequence is attP1 or attP2. Preferably, the nucleotide sequence useful to
379 enable RNA synthesis is a T7 promoter sequence. Preferably, the transcription termination
380 signal is a poly A-signal. Preferably, the RNA stabilization fragment is from the hemoglobin
381 or alpha-globin gene. Preferably, the label of the mRNA, cDNA or probe has a detectable
382 response.
383 In another embodiment, the present disclosure provides a method for the detection
384 and analysis of DNA promoter sequences comprising: (a) integrating a DNA promoter
385 sequence candidate into a vector wherein the vector comprises a TAG sequence, one or more
386 multiple-cloning sites, at least one DNA recombination sequence, a negative selection
387 marker, a nucleotide sequence useful to enable RNA synthesis, a MA segment, a translation
388 stop codon, a RNA stabilization fragment, and a transcription termination signal, and wherein
389 the DNA promoter sequence candidate is located such that it drives the transcription of the
390 TAG sequence; (b) the vectors with the promoter sequence candidate inserts are cloned into a
391 host, preferably Escherichia coli, and the clones are arrayed into a 96-well plate and grown to
392 the same cell density; (c) the resultant clones are pooled, and the vectors wherein are purified;
393 (d) the purified vector mixture is transfected into a cell line of interest and wherein the use of
394 internal controls is eliminated upon transfecting the cells with vectors purified from the
395 clonal cell populations which are of the same cell density; and (e) the RNA is extracted,
396 labeled, and quantified by hybridization to the DNA TAG sequences arrayed on a membrane
397 or glass support. Preferably, the vector is a plasmid. Preferably, the DNA recombination
398 sequence is attP1 or attP2. Preferably, the nucleotide sequence useful to enable RNA
399 synthesis is a T7 promoter sequence. Preferably, the RNA stabilization fragment is from the
400 hemoglobin or alpha-globin gene. Preferably, the transcription termination signal is a poly
401 A-signal. Preferably, the label of the mRNA, cDNA or probe has a detectable response.
402 In another embodiment of the present disclosure, the disclosure provides a method for
403 detection and analysis of DNA promoter nucleotide sequences in a collection of nucleotide
404 sequences, such as a genomic library, comprising: (a) mixing promoter sequence candidates
405 with TAG-vectors, wherein the TAG-vector comprises: multiple cloning sites (MCS) for
406 inserting promoter sequence candidate, at least one DNA recombination sequence, such as
407 attP1 or attP2, a negative selection marker to maximize the recovery of clones containing
408 promoter sequence inserts, such as, for example, a ccdB gene, a T7 promoter sequence to
409 enable RNA synthesis, a unique approximate 60 base pair reporter TAG, a specific MA
410 segment useful to synthesize probes from RNA, wherein the MA segment is comprised of
411 approximately 25% A, 25% T, 25% G, and 25% C, a three frame translation stop codon, a
412 RNA stabilization fragment, such as, for example, alpha-globin or hemoglobin, and
413 transcription termination signal, preferably a poly A-signal; (b) the TAG-vectors with the
414 promoter sequence candidate inserts are cloned into a host, preferably Escherichia coli, and
415 the clones are arrayed into a 96-well plate and grown to the same cell density; (c) the
416 resultant clones are pooled, and the vectors wherein are purified; (d) the purified vector
417 mixture is transfected into a cell line of interest; and (e) the RNA is extracted, labeled, and
418 quantified by hybridization to the DNA TAG sequences arrayed on a membrane or glass
419 support. Preferably, the TAG-vector is a TAG-plasmid. Preferably, the label of the mRNA,
420 cDNA or probe has a detectable response.
421 In another embodiment of the present disclosure, the disclosure provides a method for
422 the detection and analysis of DNA promoter nucleotide sequences in a collection of
423 nucleotide sequences, such as a genomic library, comprising: (a) mixing promoter sequence
424 candidates with TAG-vectors, wherein the TAG-vector comprises: multiple cloning sites
425 (MCS) for inserting promoter sequence candidate, at least one DNA recombination sequence,
426 a negative selection marker, a nucleotide sequence useful to enable RNA synthesis, a unique
427 approximate 60 base pair reporter TAG, a specific MA segment useful to synthesize probes
428 from RNA, wherein the MA segment is comprised of approximately 25% A, 25% T, 25% G,
429 and 25% C, a three frame translation stop codon, a RNA stabilization fragment, and
430 transcription termination signal; (b) the TAG-vectors with the promoter sequence candidate
431 inserts are cloned into a host, preferably Escherichia coli, and the clones are arrayed into a
432 96-well plate and grown to the same cell density; (c) the resultant clones are pooled, and the
433 vectors wherein are purified; (d) the purified vectors are transfected into a cell line of interest
434 and no internal controls are utilized and (e) the RNA is extracted, labeled, and quantified by
435 hybridization to the DNA TAG sequences arrayed on a membrane or glass support.
436 Preferably, the vectors are plasmids. Preferably, the DNA recombination sequence is attP1
437 or attP2. Preferably, the negative selection marker is ccdB. Preferably, the nucleotide
438 sequence to enable RNA synthesis is a T7 promoter sequence. Preferably, the RNA
439 stabilization fragment is from the alpha-globin gene. Preferably, the transcription termination
440 signal is a poly A-signal. Preferably, the label of the mRNA, cDNA or probe has a detectable
441 response.
442 In another embodiment of the present disclosure, the disclosure provides a method for
443 detection and analysis of DNA promoter nucleotide sequences in a collection of nucleotide
444 sequences, such as a genomic library, comprising (a) mixing promoter sequence candidates
445 with TAG-vectors, wherein the TAG-vector comprises: multiple cloning sites (MCS) for
446 inserting promoter sequence candidate, at least one DNA recombination sequence, a negative
447 selection marker, a nucleotide sequence useful to enable RNA synthesis, a unique
448 approximate 60 base pair reporter TAG, a specific MA segment useful to synthesize probes
449 from RNA, wherein the MA segment is comprised of approximately 25% A, 25% T, 25% G,
450 and 25% C, a three frame translation stop codon, a RNA stabilization fragment, and a
451 transcription termination signal, (b) the TAG-vectors with the promoter sequence candidate
452 inserts are cloned into a host, preferably Escherichia coli, and the clones are arrayed into a
453 96-well plate and grown to the same cell density, (c) the resultant clones, containing about
454 equal amounts of vectors are pooled, and the vectors wherein are purified; (d) the purified
455 vectors are transfected into a cell line of interest and wherein internal controls are
456 not utilized, and (e) the RNA is extracted, labeled, and quantified by hybridization to the
457 DNA TAG sequences arrayed on a membrane or glass support. Preferably, the TAG-vectors
458 are TAG-plasmids. Preferably, the DNA recombination sequence is attP1 or attP2.
459 Preferably, the negative selection marker is ccdB. Preferably, the nucleotide sequence to
460 enable RNA synthesis is a T7 promoter sequence. Preferably, the RNA stabilization
461 fragment is from the alpha-globin gene. Preferably, the transcription termination signal is a
462 poly A-signal. Preferably, the label of the mRNA, cDNA or probe has a detectable response.
463 In another embodiment, the disclosure provides a method for analysis and detection of
464 a plurality of DNA promoter nucleotide sequences in a plurality of samples, comprising: (a)
465 mixing DNA promoter sequence candidates, wherein the DNA promoter sequence
466 candidates are, for example, selected from computer-predicted promoter sequence candidates,
467 DNA fragments from a collection of nucleotide sequences, such as a genomic library,
468 deletion or site-directed mutants of a specific promoter, tissue-specific promoters, artificial
469 promoters, etc., with TAG vectors, wherein the TAG-vector comprises: multiple cloning sites
470 for inserting DNA promoter sequence candidate, DNA recombination sequences, a negative
471 selection marker, a nucleotide sequence useful to enable RNA synthesis, a unique
472 approximate 60 base pair reporter TAG, a specific MA segment useful to synthesize probes
473 from RNA, wherein the MA segment is comprised of about 25% A, 25% T, 25% G, and 25%
474 C, a three frame translation stop codon, a RNA stabilization fragment, and a transcription
475 termination signal; (b) the TAG-vectors with the promoter sequence candidate inserts are
476 cloned into a host, preferably Escherichia coli, and the clones are arrayed into a 96-well plate
477 and grown to the same cell density; (c) the resultant clones are pooled, and the vectors
478 wherein are purified; (d) the purified plasmid mixture is transfected into a cell line of interest;
479 and (e) the RNA is extracted, labeled, and quantified by hybridization to the DNA TAG
480 sequences arrayed on a membrane or glass support. Preferably, the TAG-vectors are TAG-
481 plasmids. Preferably, the DNA recombination sequence is attP1 or attP2. Preferably, the
482 negative selection marker is ccdB. Preferably, the nucleotide sequence to enable RNA
483 synthesis is a T7 promoter sequence. Preferably, the RNA stabilization fragment is from the
484 alpha-globin gene. Preferably, the transcription termination signal is a poly A-signal.
485 Preferably, the label of the mRNA, cDNA or probe has a detectable response.
486 In another embodiment, the disclosure provides a method for detection and analysis of
487 a plurality of DNA promoter nucleotide sequences in a plurality of samples, comprising: (a)
488 mixing DNA promoter sequence candidates, wherein the promoter sequence candidates are,
489 for example, selected from computer-predicted promoter sequence candidates, DNA
490 fragments from a collection of nucleotide sequences, such as a genomic library, deletion or
491 site-directed mutants of a specific promoter, tissue-specific promoters, artificial promoters,
492 etc., with TAG vectors, wherein the TAG-vector comprises: multiple cloning sites for
493 inserting promoter sequence candidate, DNA recombination sequence, a negative selection
494 marker, a nucleotide sequence useful to enable RNA synthesis, a unique approximate 60 base
495 pair reporter TAG, a specific MA segment useful to synthesize probes from RNA, wherein
496 the MA segment is comprised of about 25% A, 25% T, 25% G, and 25% C, a three frame
497 translation stop codon, a RNA stabilization fragment, and a transcription termination signal;
498 (b) the TAG-vectors with the promoter sequence candidate inserts are cloned into a host,
499 preferably Escherichia coli, and the clones are arrayed into a 96-well plate and grown to the
500 same cell density; (c) the resultant clones contain about equal amounts of vector and are
501 pooled, and the vectors wherein are purified; (d) about equal amounts of the purified vectors
502 are transfected into a cell line of interest; and (e) the RNA is extracted, labeled, and
503 quantified by hybridization to the DNA TAG sequences arrayed on a membrane or glass
504 support. Preferably, the TAG-vectors are TAG-plasmids. Preferably, the DNA
505 recombination sequence is attP1 or attP2. Preferably, the negative selection marker is ccdB.
506 Preferably, the nucleotide sequence to enable RNA synthesis is a T7 promoter sequence.
507 Preferably, the RNA stabilization fragment is from the alpha-globin gene. Preferably, the
508 transcription termination signal is a poly A-signal. Preferably, the label of the mRNA, cDNA
509 or probe has a detectable response.
510 In another embodiment, the disclosure provides a method for the detection and
511 analysis of a plurality of DNA promoter nucleotide sequences in a plurality of samples,
512 comprising (a) mixing DNA promoter sequence candidates, wherein the promoter sequence
513 candidates are, for example, selected from computer-predicted promoter sequence candidates,
514 DNA fragments from a collection of nucleotide sequences, such as a genomic library,
515 deletion or site-directed mutants of a specific promoter, tissue-specific promoters, artificial
516 promoters, etc., with TAG vectors, wherein the TAG-vector comprises: multiple cloning sites
517 for inserting promoter sequence candidate, DNA recombination sequence, a negative
518 selection marker, a nucleotide sequence useful to enable RNA synthesis, a unique
519 approximate 60 base pair reporter TAG, a specific MA segment useful to synthesize probes
520 from RNA, wherein the MA segment is comprised of about 25% A, 25% T, 25% G, and 25%
521 C, a three frame translation stop codon, a RNA stabilization fragment, and a transcription
522 termination signal, (b) the TAG-vectors with the DNA promoter sequence candidate inserts
523 are cloned into a host, preferably Escherichia coli, and the clones are arrayed into a 96-well
524 plate and grown to the same cell density, (c) the resultant clones are pooled, and the vectors
525 wherein are purified; (d) about equal amounts of the purified vectors are transfected into a
526 cell line of interest and wherein the use of internal controls is eliminated; and (e) the RNA is
527 extracted, labeled, and quantified by hybridization to the DNA TAG sequences arrayed on a
528 membrane or glass support. Preferably, the TAG-vectors are TAG-plasmids. Preferably, the
529 DNA recombination sequence is attP1 or attP2. Preferably, the negative selection marker is
530 ccdB. Preferably, the nucleotide sequence to enable RNA synthesis is a T7 promoter
531 sequence. Preferably, the RNA stabilization fragment is from the alpha-globin gene.
532 Preferably, the transcription termination signal is a poly A-signal. Preferably, the label of the
533 mRNA, cDNA or probe has a detectable response.
534 The present disclosure provides a vector. In a preferred embodiment, the present
535 disclosure provides a vector into which a DNA promoter sequence candidate is inserted,
536 comprising a TAG sequence, one or more multiple-cloning sites, at least one DNA
537 recombination sequence, a negative selection marker, a RNA polymerase promoter sequence,
538 a MA segment, a translation stop codon, a RNA stabilization fragment, and a transcription
539 termination signal, and wherein the DNA promoter sequence candidate is located such that it
540 can drive the transcription of the TAG sequence. Preferably, the vector is a plasmid.
541 In another embodiment, the present disclosure provides for a plasmid vector
542 comprising: a region for insertion of a putative promoter sequence wherein a MCS is located
543 both 5' and 3' to the putative promoter sequence; one or more DNA recombination
544 sequences; a T7 sequence; a TAG sequence; a luciferase gene sequence; a MA sequence; and
545 a translational stop sequence. Preferably, the MA sequence is either MA5 or MA4.
546 Preferably, the MA sequence is located 3' from the TAG sequence. Preferably, the luciferase
547 gene sequence is a partial luciferase gene sequence or the full luciferase gene sequence.
548 Preferably, the translational stop sequence is a translational stop sequence in at least one
549 reading frame, more preferably at least two reading frames, and most preferably in three
550 reading frames. Preferably, the DNA recombination sequences are attP1 and attP2.
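A translational stop in three reading frames means that a stop codon is encountered regardless of the frame in which a ribosome reads through the region. The following is a minimal sketch of how a candidate stop segment could be checked. The function name and the example sequence are hypothetical and are not taken from the disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons, written as sense-strand DNA


def frames_with_stop(seq):
    """Return the set of forward reading frames (0, 1, 2) that contain a stop codon."""
    seq = seq.upper()
    frames = set()
    for frame in range(3):
        # Walk the sequence codon by codon, starting at this frame's offset.
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                frames.add(frame)
                break
    return frames


if __name__ == "__main__":
    candidate = "TAACTAGCTGA"  # hypothetical segment with a stop in each frame
    print(sorted(frames_with_stop(candidate)))
```

A segment qualifies as a three frame stop when the returned set contains all of 0, 1, and 2.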
551 In another embodiment, the present disclosure provides a plasmid vector into which a
552 DNA promoter sequence is inserted, comprising a TAG sequence, one or more multiple-
553 cloning sites, one or both of attP1 and attP2 sequences, a negative selection marker, a RNA
554 polymerase promoter sequence, a MA segment, a translation stop codon, a RNA stabilization
555 fragment, and a transcription termination signal, and wherein the DNA promoter sequence is
556 located such that it drives the transcription of the TAG sequence. Preferably, the vector is a
557 plasmid. Preferably, the TAG sequence is between about 16 base pairs and about 200 base
558 pairs, more preferably the TAG sequence is about 60 base pairs. Preferably, the
559 TAG sequence is located 3' to the inserted promoter sequence and 5' to a transcription
560 termination signal. Preferably, the DNA promoter sequence is an enhancer. Preferably, the
561 translation stop codon is a three frame translation stop codon. Preferably, the RNA
562 stabilization fragment is from an alpha-globin gene. Preferably, the transcription termination
563 signal is a poly-A signal. Preferably, the RNA polymerase promoter sequence is a T7
564 promoter sequence.
565 In another embodiment, the disclosure provides for a vector. The disclosure provides
566 a nucleotide sequence for use in the detection and analysis of a promoter nucleotide sequence
567 comprising: a T7 promoter, a TAG sequence, a MA sequence, and a poly A-signal. In
568 another embodiment of the disclosure, the promoter sequence candidate is selected from
569 promoter sequence candidates provided by a computer-predicted model, DNA fragments
570 from a collection of nucleotide sequences, such as a genomic library, deletion or site-directed
571 mutants of a specific promoter, tissue-specific promoters, artificial promoters, etc. In another
572 embodiment, the TAG sequence is a DNA sequence composed of random nucleotides. In
573 another embodiment, the length of the TAG sequence is short, preferably between about 16
574 base pairs and about 200 base pairs, more preferably between about 20 base pairs and about 150
575 base pairs, more preferably between about 30 base pairs and about 120 base pairs, more
576 preferably between about 40 base pairs and about 100 base pairs, more preferably between
577 about 50 base pairs and about 75 base pairs, and most preferably about 60 base pairs. Within a
578 plurality of TAG sequences, each TAG sequence will have approximately equivalent
579 amounts of the nucleotides A, T, G, and C such that each TAG sequence has approximately
580 the same melting temperature as the other TAGs. The same melting temperature will allow
581 for the unbiased quantification of various mRNAs by hybridization under the same
582 temperature and ionic strength conditions. In another embodiment, the specific MA segment
583 is useful to synthesize probes from RNA, and the MA segment is comprised of about 25% A,
584 25% T, 25% G, and 25% C.
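The link between equal base composition and equal melting temperature can be illustrated with a short sketch. It assumes a commonly used salt-adjusted approximation, Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%GC) - 675/N, which depends only on GC content, length, and salt. More accurate nearest-neighbor models also depend on base order, so this is an approximation and not the disclosed design procedure. The function names and the 0.05 M sodium concentration are assumptions for illustration.

```python
import math
import random


def balanced_tag(length=60, rng=None):
    """Random TAG with exactly 25% each of A, C, G and T (length must be a multiple of 4)."""
    if length % 4 != 0:
        raise ValueError("length must be a multiple of 4 for an exact 25% split")
    rng = rng or random.Random()
    bases = list("ACGT" * (length // 4))
    rng.shuffle(bases)  # random order, fixed composition
    return "".join(bases)


def approx_tm(seq, na_molar=0.05):
    """Approximate melting temperature (deg C) from GC content, length and [Na+]."""
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 675.0 / len(seq)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    for _ in range(3):
        tag = balanced_tag(60, rng)
        print(tag, round(approx_tm(tag), 1))
```

Under this approximation every 60 base pair TAG with the same composition gets the same estimated melting temperature, which is the property that allows all TAGs to be hybridized under a single set of conditions.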
585 In another embodiment, the disclosure provides a method where a nucleotide
586 sequence is used for the detection and analysis of a promoter nucleotide sequence
587 comprising: a T7 promoter sequence, a TAG sequence, a MA sequence, and a poly A-signal.
588 A DNA promoter sequence candidate may be selected from promoter sequence candidates
589 provided by a computer-predicted model, DNA fragments from a collection of nucleotide
590 sequences, such as a genomic library, deletion or site-directed mutants of a specific promoter,
591 tissue-specific promoters, artificial promoters, etc. In preferred embodiments, the TAG
592 sequence is a short DNA sequence composed of random nucleotides, preferably between
593 about 16 base pairs and about 200 base pairs, more preferably between about 20 base pairs and
594 about 150 base pairs, more preferably between about 30 base pairs and about 120 base pairs,
595 more preferably between about 40 base pairs and about 100 base pairs, more preferably
596 between about 50 base pairs and about 75 base pairs, and most preferably about 60 base pairs.
597 In another embodiment, the present disclosure provides a cloning vector comprising a
598 TAG sequence; a transcription termination signal, preferably a poly A-signal; a nucleotide
599 sequence useful to enable RNA synthesis, preferably a T7 promoter sequence; and a MA
600 sequence, wherein the nucleotide sequence useful to enable RNA synthesis, preferably a T7
601 promoter sequence, and the MA sequence are on the antisense DNA strand. In another
602 embodiment of the present disclosure, a cloning vector is provided wherein the cloning vector
603 is comprised of a DNA promoter sequence candidate, a TAG sequence, a transcription
604 termination signal, preferably a polyA signal; a nucleotide sequence useful to enable RNA
605 synthesis, preferably a T7 promoter sequence; and a MA sequence, wherein the DNA
606 promoter sequence candidate, the TAG sequence, and the transcription termination signal,
607 preferably a poly A-signal, are located on the sense DNA strand.
608 In another embodiment of the present disclosure, a cloning vector is provided wherein
609 the cloning vector is comprised of a TAG sequence, a transcription termination signal,
610 preferably a poly A-signal, a nucleotide sequence useful to enable RNA synthesis, preferably
611 a T7 promoter sequence, and a MA sequence, wherein the DNA promoter sequence candidate
612 is located 5' to the TAG sequence and wherein the TAG sequence is located 5' to the
613 transcription termination signal, preferably a poly A-signal. In another embodiment of the
614 present disclosure, a cloning vector is provided wherein the cloning vector is comprised of a
615 TAG sequence, a transcription termination signal, preferably a poly A-signal; a nucleotide
616 sequence useful to enable RNA synthesis, preferably a T7 promoter sequence, and a MA
617 sequence, and the TAG sequence is located 3' to the DNA promoter sequence candidate and
618 the transcription termination signal, preferably a poly A-signal, is located 3' to the TAG
619 sequence.
620 In another embodiment of the present disclosure, a cloning vector is provided wherein
621 the cloning vector is comprised of a TAG sequence, a transcription termination signal,
622 preferably a poly A-signal, a nucleotide sequence useful to enable RNA synthesis, preferably
623 a T7 promoter sequence, and a MA sequence, wherein the DNA promoter sequence is
624 operably linked to the TAG sequence. In another embodiment of the present disclosure, a
625 cloning vector is provided wherein the cloning vector is comprised of a DNA promoter
626 sequence candidate, a TAG sequence, a transcription termination signal, preferably a poly A-
627 signal, a nucleotide sequence useful to enable RNA synthesis, preferably a T7 promoter
628 sequence, and a MA sequence, and the TAG sequence is operably linked to the transcription
629 termination signal, preferably a poly A-signal.
630 In another embodiment of the present disclosure, a cloning vector is provided wherein
631 the cloning vector is comprised of a TAG sequence, a transcription termination signal,
632 preferably a poly A-signal, a nucleotide sequence useful to enable RNA synthesis, preferably
633 a T7 promoter sequence, and a MA sequence, wherein the DNA promoter sequence is located
634 5' to the TAG sequence, the TAG sequence is located 5' to the transcription termination
635 signal, preferably a poly A-signal, the transcription termination signal is 3' to a DNA promoter
636 sequence candidate, and the DNA promoter sequence candidate is operably linked to the
637 TAG sequence and the TAG sequence is operably linked to the transcription termination signal.
638 In another embodiment of the present disclosure, a cloning vector is provided wherein
639 the cloning vector is comprised of a pair of MCS, a TAG sequence, a transcription
640 termination signal, preferably a poly A-signal, a nucleotide sequence useful to enable RNA
641 synthesis, preferably a T7 promoter sequence, and a MA sequence, and a MCS is located 5'
642 of the DNA promoter sequence candidate and a MCS is located 3' of the DNA promoter
643 sequence candidate.
644 The present disclosure provides an array-based method for promoter detection and
645 analysis. The method provides for transcriptional products that are tagged as they are
646 synthesized, in such a way that one specific transcript is labeled with only one type of TAG,
647 and one TAG labels only one type of transcript. All promoter sequence candidates are
648 analyzed simultaneously in one reaction vial. The transcriptional output is analyzed on
649 conventional arrays and can be detected with procedures that do not require expensive
650 instrumentation. The method reduces labor and costs, and provides for
651 the detection of promoter regions from genomic libraries and other related advantages.
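The one-to-one relationship between TAGs and transcripts described above can be illustrated with a minimal sketch. The candidate names, TAG names, and signal values below are hypothetical placeholders, not data from this disclosure; the sketch only shows how array signals keyed by TAG could be mapped back to promoter sequence candidates once each candidate carries its own distinct TAG.

```python
from collections import Counter


def is_one_to_one(tag_by_promoter):
    """Return True if no TAG is shared by two promoter sequence candidates."""
    counts = Counter(tag_by_promoter.values())
    return all(n == 1 for n in counts.values())


def decode_array(signal_by_tag, tag_by_promoter):
    """Map array signals (keyed by TAG) back to promoter sequence candidates."""
    if not is_one_to_one(tag_by_promoter):
        raise ValueError("each TAG must label exactly one promoter candidate")
    promoter_by_tag = {tag: p for p, tag in tag_by_promoter.items()}
    return {
        promoter_by_tag[tag]: signal
        for tag, signal in signal_by_tag.items()
        if tag in promoter_by_tag
    }


# Hypothetical library: three candidates, each cloned next to its own TAG.
tag_by_promoter = {
    "candidate_A": "TAG_01",
    "candidate_B": "TAG_02",
    "candidate_C": "TAG_03",
}
# Hypothetical normalized spot intensities read from the array.
signal_by_tag = {"TAG_01": 0.9, "TAG_03": 0.1}

activity = decode_array(signal_by_tag, tag_by_promoter)
```

Because the mapping is invertible, a signal at one array position can be attributed to a single candidate even though all candidates were assayed together in one reaction vial.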
652 These and other embodiments of the present disclosure will become apparent upon
653 reference to the detailed description and illustrative examples which are intended to
654 exemplify non-limiting embodiments of the disclosure. All references disclosed herein are
655 hereby incorporated by reference in their entirety as if each was incorporated individually.
656
657 GLOSSARY
658 Unless defined otherwise, all technical and scientific terms used herein have the same
659 meaning as commonly understood by one of ordinary skill in the art to which this disclosure
660 belongs. Generally, the nomenclature used herein and the laboratory procedures in cell
661 culture, molecular genetics, and nucleic acid chemistry and hybridization described below are
662 those well known and commonly employed in the art. Standard techniques are used for
663 recombinant nucleic acid methods, polynucleotide synthesis, and microbial culture and
664 transformation (e.g., electroporation, lipofection). Generally, enzymatic reactions and
665 purification steps are performed according to the manufacturer's specifications. The
666 techniques and procedures are generally performed according to conventional methods in the
667 art and various general references (see generally, Sambrook et al. Molecular Cloning: A
668 Laboratory Manual, 2d ed. (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
669 N. Y., which is incorporated herein by reference) which are provided throughout this
670 document. Units, prefixes, and symbols may be denoted in their SI accepted form. Unless
671 otherwise indicated, nucleic acids are written left to right in 5' to 3' orientation; amino acid
672 sequences are written left to right in amino to carboxyl orientation, respectively. Numeric
673 ranges are inclusive of the numbers defining the range and include each integer within the
674 defined range. Amino acids may be referred to herein by either their commonly known three
675 letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
676 Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
677 accepted single-letter codes. Unless otherwise provided for, software, electrical, and
678 electronics terms as used herein are as defined in The New IEEE Standard Dictionary of
679 Electrical and Electronics Terms (5th edition, 1993). As employed throughout the
680 disclosure, the following terms, unless otherwise indicated, shall be understood to have the
681 following meanings and are more fully defined by reference to the specification as a whole:
682 The term "amplified" refers to the construction of multiple copies of a nucleic acid
683 sequence or multiple copies complementary to the nucleic acid sequence using at least one of
684 the nucleic acid sequences as a template. Amplification systems include, for example, the
685 polymerase chain reaction (PCR) system, ligase chain reaction (LCR) system, nucleic acid
686 sequence based amplification (NASBA, Cangene, Mississauga, Ontario), Q-Beta Replicase
687 systems, transcription-based amplification system (TAS), and strand displacement
688 amplification (SDA). See, e.g., Diagnostic Molecular Microbiology: Principles and
689 Applications, D. H. Persing et al., Ed., American Society for Microbiology, Washington,
690 D.C. (1993). The product of amplification is termed an amplicon.
691 The term "array" refers to an array containing nucleic acid samples. An array may be
692 a "macroarray" or a "microarray." The term "microarray" refers to an array containing
693 nucleic acid samples, also referred to as microscopic DNA 'spots,' bound to solid substrates,
694 such as glass microscope slides, plastic, or silicon wafers. Because the physical area
695 occupied by each sample is usually 50-200 μm in diameter, nucleic acid samples representing
696 multiple samples, including, for example, entire genomes, genomic libraries, synthesized
697 DNA samples from computer predicted models, or deletion mutants of promoters under
698 investigation etc., may be bound to the solid substrate. The solid substrate may include
699 membranes or beads. Macroarrays may be such as those available commercially (Clontech)
700 or synthesized manually. Beads may be of those used in peptide, nucleic acid and organic
701 moiety synthesis, including but not limited to, plastics, ceramics, glass, polystyrene,
702 methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium
703 dioxide, latex or cross-linked dextrans such as sepharose, cellulose, nylon, cross-linked
704 micelles and Teflon may all be used (see Microsphere Detection Guide, Bangs Laboratories,
705 Fishers, Ind.). Microarrays allow the genes of a given sample to be simultaneously monitored
706 with respect to some experimental condition of interest. Microarrays may be fabricated by
707 the mechanical deposition of nucleic acid samples onto a solid substrate. Alternatively, the
708 nucleic acid samples may be manually deposited. The term "DNA microarray" may apply to
709 several different forms of the technology, each differing in the type of nucleic acid applied
710 and the method of application.
711 The term "assay marker" or a "reporter gene" refers to a gene that can be detected, or
712 'followed.' The expression of the reporter gene may be measured at either the RNA level, or
713 at the protein level. The gene product may be detected in an experimental assay protocol, such
714 as marker enzymes, antigens, amino acid sequence markers, cellular phenotypic markers,
715 nucleic acid sequence markers, and the like. A "reporter gene" (or "reporter") is a gene that
716 researchers may attach to another gene of interest in cell culture, bacteria, animals, or plants.
717 Some reporters are selectable markers, or confer characteristics upon organisms
718 expressing them allowing the organism to be easily identified and measured. To introduce a
719 reporter gene into an organism, researchers place the reporter gene and the gene of interest in
720 the same DNA construct to be inserted into the cell or organism. For bacteria or eukaryotic
721 cells in culture, this is usually in the form of a plasmid. Commonly used reporter genes may
722 include fluorescent proteins, luciferase, beta-galactosidase, and selectable markers, such as
723 chloramphenicol resistance, and ccdB.
724 The term "cDNA" refers to DNA synthesized from a mature mRNA template. cDNA
725 is most often synthesized from mature mRNA using the enzyme reverse transcriptase. The
726 enzyme operates on a single strand of mRNA, generating its complementary DNA based on
727 the pairing of RNA bases (A, U, G, C) to their DNA complements (T, A, C, G). There
728 are several methods known for generating cDNA, for example, to obtain eukaryotic cDNA
729 whose introns have been spliced: a) a eukaryotic cell transcribes the DNA (from genes) into
730 RNA (pre-mRNA); b) the same cell processes the pre-mRNA strands by splicing out introns,
731 and adding a poly-A tail and 5' Methyl-Guanine cap; c) this mixture of mature mRNA
732 strands is extracted from the cell; d) a poly-T oligonucleotide primer is hybridized onto the
733 poly-A tail of the mature mRNA template. (Reverse transcriptase requires this double-
734 stranded segment as a primer to start its operation.); e) reverse transcriptase is added, along
735 with deoxynucleotide triphosphates (A, T, G, C); f) the reverse transcriptase scans the mature
736 mRNA and synthesizes a sequence of DNA that complements the mRNA template. This
737 strand of DNA is complementary DNA. (see also Current Protocols in Molecular Biology,
738 John Wiley & Sons).
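The base-pairing rule in the cDNA definition above can be illustrated with a minimal sketch. The function name and the example mRNA are hypothetical; the sketch assumes an unmodified mRNA written 5' to 3' and returns the first-strand cDNA also written 5' to 3', so the complement is read in reverse.

```python
# Pairing of RNA bases to their DNA complements, as stated in the definition.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the first-strand cDNA (5' to 3') for an mRNA given 5' to 3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna.upper()))


# Hypothetical short transcript ending in a poly-A tail.
mrna = "AUGGCCAAAAAA"
cdna = first_strand_cdna(mrna)
```

In this example the cDNA begins with a run of T residues, which corresponds to the poly-T primer annealed to the poly-A tail in step d).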
739 The term "cloning host cell" refers to a host cell that contains a cloning vector.
740 The term "cloning vector" refers to a DNA molecule such as a plasmid, cosmid, or
741 bacterial phage, or virus, such as, for example retroviruses, adeno-associated viruses,
742 lentivirus, baculoviruses and adenoviruses, that has the capability of replicating
743 autonomously in a host cell. Cloning vectors typically contain one or a small number of
744 restriction endonuclease recognition sites at which foreign DNA sequences can be inserted in
745 a determinable fashion without loss of essential biological function of the vector, as well as a
746 selectable marker gene that is suitable for use in the identification and selection of cells
747 transformed with the cloning vector. Selectable marker genes may include genes that provide
748 tetracycline resistance, ampicillin resistance, or other observable features, such as with the
749 ccdB gene.
750 The term "detectable marker" encompasses both the selectable markers and assay
751 markers. The term "selectable markers" refers to a variety of gene products to which cells
752 transformed with an expression construct can be selected or screened, including drug-
753 resistance markers, antigenic markers useful in fluorescence-activated cell sorting, adherence
754 markers such as receptors for adherence ligands allowing selective adherence, and the like.
755 When the nucleic acid is prepared or altered synthetically, advantage can be taken of known
756 codon preferences of the intended host where the nucleic acid is to be expressed.
757 The term "detectable response" refers to any signal or response that may be detected
758 in an assay, which may be performed with or without a detection reagent. Detectable
759 responses include, but are not limited to, radioactive decay and energy (e.g., fluorescent,
760 ultraviolet, infrared, visible) emission, absorption, polarization, fluorescence,
761 phosphorescence, transmission, reflection or resonance transfer. Detectable responses also
762 include chromatographic mobility, turbidity, electrophoretic mobility, mass spectrum,
763 ultraviolet spectrum, infrared spectrum, nuclear magnetic resonance spectrum and x-ray
764 diffraction. Alternatively, a detectable response may be the result of an assay to measure one
765 or more properties of a biologic material, such as melting point, density, conductivity, surface
766 acoustic waves, catalytic activity or elemental composition. A "detection reagent" is any
767 molecule that generates a detectable response indicative of the presence or absence of a
768 substance of interest. Detection reagents include any of a variety of molecules, such as
769 antibodies, nucleic acid sequences and enzymes. To facilitate detection, a detection reagent
770 may comprise a marker.
771 The term "DNA recombination sequences" refers to nucleic acid sequence that
772 provides for efficient transfer of DNA fragments across multiple systems and into multiple
773 vectors. Any DNA fragment flanked by a recombination site can be transferred into any
774 vector that has a corresponding site. Orientation and reading frame are maintained with
775 efficiencies (typically 99%), effectively eliminating the need for secondary sequencing or
776 subcloning after the initial entry clone is made. The transfer of DNA fragments makes use of
777 lambda phage-based site-specific recombination instead of restriction endonuclease and
778 ligase to insert a gene of interest into an expression vector. The DNA recombination
779 sequences, for example, attL, attR, attB, and attP, and enzyme mixtures, for example, LR and
780 BP Clonase, may be used to mediate the lambda recombination reactions. Transferring a gene
781 into a destination vector is accomplished in two steps: 1) clone the gene of interest into an
782 entry vector and 2) mix the entry clone containing the gene of interest in vitro with the
783 appropriate expression vector (destination vector) and enzyme mix. Site-specific
784 recombination between the att sites (attR x attL or attB x attP) generates an expression clone
785 and a by-product. The expression clone contains the gene of interest recombined into the
786 destination vector backbone. Following transformation and selection in E. coli, the expression
787 clone is ready to be used for expression in the appropriate host. This lambda-based system is
788 also known as the Gateway® cloning system (Invitrogen Inc., Carlsbad, CA).
789 The term "electroporation" refers to a significant increase in the electrical
790 conductivity and permeability of the cell plasma membrane caused by an externally applied
791 electrical field. It is used as a way of introducing some substance into a cell, such as loading
792 it with a piece of coding DNA, a molecular probe, or a drug. Pores are formed when the
793 voltage across a plasma membrane exceeds its dielectric strength. If the strength of the
794 applied electrical field and/or duration of exposure to it are properly chosen, the pores formed
795 by the electrical pulse reseal after a short period of time, during which extracellular
796 compounds have a chance to enter into the cell. However, excessive exposure of live cells to
797 electrical fields can result in cell death. Electroporation is done with electroporators,
798 instruments which create the electric current and send it through the cell solution, typically
799 bacteria. The solution is pipetted into a glass or plastic cuvette which has two aluminum electrodes
800 on its sides. For example, for bacterial electroporation, a suspension of around 50 μl is
801 usually used. Prior to electroporation it is mixed with the plasmid to be transformed. The
802 mixture is pipetted into the cuvette, the voltage is set on the electroporator (2,400 volts is
803 often used) and the cuvette is inserted into the electroporator and an electric current is
804 applied. Immediately after electroporation 1 ml of liquid medium is added to the bacteria (in
805 the cuvette or in a microcentrifuge tube), and the tube is incubated at the bacteria's optimal
806 temperature for an hour or more and then it is spread on an agar plate (see Ausubel, Current
807 Protocols in Molecular Biology, Wiley).
808 The term "equimolar" refers to having an equal concentration of moles in one liter of
809 solution.
810 The term "expression system" refers to a genetic sequence which includes a protein
811 encoding region which is operably linked to all of the genetic signals necessary to achieve
812 expression of the protein encoding region. Traditionally, the expression system will include a
813 regulatory element such as a promoter or enhancer, to increase transcription and/or
814 translation of the protein encoding region, or to provide control over expression. The
815 regulatory element may be located upstream or downstream of the protein encoding region,
816 or may be located at an intron (non coding portion) interrupting the protein encoding region.
817 Alternatively it is also possible for the sequence of the protein encoding region itself to
818 comprise regulatory ability.
819 The term "expression vector" refers to a DNA molecule comprising a gene that is
820 expressed in a host cell. Typically, gene expression is placed under the control of certain
821 regulatory elements including promoters, tissue specific regulatory elements, and enhancers.
822 Such a gene is said to be "operably linked to" the regulatory elements.
823 The term "functional splice acceptor" refers to any individual functional splice
824 acceptor or functional splice acceptor consensus sequence that permits the construct of the
825 disclosure to be processed such that it is included in any mature, biologically active mRNA,
826 provided that it is integrated in an active chromosomal locus and transcribed as a contiguous
827 part of the pre-messenger RNA of the chromosomal locus.
828 The term "homing endonucleases" refers to double stranded DNases that have large,
829 asymmetric recognition sites (12-40 base pairs) and coding sequences that are usually
830 embedded in either introns or inteins. Introns are spliced out of precursor RNAs, while
831 inteins are spliced out of precursor proteins. Homing endonucleases are named using
832 conventions similar to those of restriction endonucleases with intron-encoded endonucleases
833 containing the prefix, "I-" and intein endonucleases containing the prefix, "PI-". Homing
834 endonuclease recognition sites are extremely rare. For example, an 18 base pair recognition
835 sequence will occur only once in every 7 x 10^10 base pairs of random sequence. This is
836 equivalent to only one site in 20 mammalian-sized genomes. However, unlike standard
837 restriction endonucleases, homing endonucleases tolerate some sequence degeneracy within
838 their recognition sequence. As a result, their observed sequence specificity is typically in the
839 range of 10-12 base pairs. Homing endonucleases do not have stringently-defined
840 recognition sequences in the way that restriction enzymes do. That is, single base changes do
841 not abolish cleavage but reduce its efficiency to variable extents. The precise boundary of
842 required bases is generally not known.
843 The term "host cell" encompasses any cell which contains a vector and preferably
844 supports the replication and/or expression of the vector. Host cells may be prokaryotic cells
845 such as Escherichia coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian
846 cells. The term as used herein means any cell which may be in culture or in vivo as part of a
847 unicellular organism, part of a multicellular organism, or a fused or engineered cell culture.
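As a rough check on the recognition-site frequency quoted in the homing endonuclease definition above, the expected spacing of an exact n-base-pair site in random sequence is 4^n base pairs, assuming the four bases are equally likely and independent. The genome size of 3 x 10^9 base pairs used below is an assumed round figure for a mammalian-sized genome, not a value from this disclosure.

```python
def expected_spacing_bp(site_length):
    """Expected number of base pairs between exact occurrences of one site
    of the given length, assuming equiprobable, independent bases."""
    return 4 ** site_length


ASSUMED_MAMMALIAN_GENOME_BP = 3e9  # assumption: round figure for illustration

spacing = expected_spacing_bp(18)
genomes_per_site = spacing / ASSUMED_MAMMALIAN_GENOME_BP
```

Here `spacing` is about 6.9 x 10^10 base pairs, consistent with the rounded figure of 7 x 10^10, and `genomes_per_site` comes out at roughly 20 genomes under the stated assumption.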
848 The term "hybridization" refers to the process of combining complementary, single-
849 stranded nucleic acids into a single molecule. Nucleotides will bind to their complement
850 under normal conditions, so two perfectly complementary strands will bind (or 'anneal') to
851 each other readily. However, due to the different molecular geometries of the nucleotides, a
852 single inconsistency between the two strands will make binding between them more
853 energetically unfavorable. Measuring the effects of base incompatibility by quantifying the
854 rate at which two strands anneal can provide information as to the similarity in base sequence
855 between the two strands being annealed.
856 The term "internal ribosome entry site" (IRES) refers to an element which permits
857 attachment of a downstream coding region or open reading frame with a cytoplasmic
858 polysomal ribosome for purposes of initiating translation thereof in the absence of any
859 internal promoters. An IRES is included to initiate translation of selectable marker protein
860 coding sequences. Examples of suitable IRESes that can be used include the mammalian
861 IRES of the immunoglobulin heavy-chain-binding protein (BiP). Other suitable IRESes are
862 those from the picornaviruses. For example, such IRESes include those from
863 encephalomyocarditis virus (preferably nucleotide numbers 163-746), poliovirus (preferably
864 nucleotide numbers 28-640) and foot and mouth disease virus (preferably nucleotide numbers
865 369-804). Thus, the IRES are located in the long 5' untranslated regions of the picornaviruses
866 which can be removed from their viral setting and linked to unrelated genes to produce
867 polycistronic mRNAs.
868 The term "isolated" refers to material, such as a nucleic acid or a protein, which is: (1)
869 substantially or essentially free from components that normally accompany or interact with it
870 as found in its naturally occurring environment. The isolated material optionally comprises
871 material not found with the material in its natural environment; or (2) if the material is in its
872 natural environment, the material has been synthetically (non-naturally) altered by deliberate
873 human intervention to a composition and/or placed at a location in the cell (e.g., genome or
874 subcellular organelle) not native to a material found in that environment. The alteration to
875 yield the synthetic material can be performed on the material within or removed from its
876 natural state. For example, a naturally occurring nucleic acid becomes an isolated nucleic
877 acid if it is altered, or if it is transcribed from DNA which has been altered, by means of
878 human intervention performed within the cell from which it originates. See, e.g., Compounds
879 and Methods for Site Directed Mutagenesis in Eukaryotic Cells, Kmiec, U.S. Pat. No.
880 5,565,350; In Vivo Homologous Sequence Targeting in Eukaryotic Cells; Zarling et al.,
881 PCT/US93/03868. Likewise, a naturally occurring nucleic acid (e.g., a promoter) becomes
882 isolated if it is introduced by non-naturally occurring means to a locus of the genome not
883 native to that nucleic acid. Nucleic acids which are "isolated" as defined herein are also
884 referred to as "heterologous" nucleic acids.
885 The term "inserted" or "introduced" in the context of inserting a nucleic acid into a
886 cell, refers to "transfection" or "transformation" or "transduction" and includes reference to
887 the incorporation of a nucleic acid into a eukaryotic or prokaryotic cell where the nucleic acid
888 may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid or
889 mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (e.g.,
890 transfected mRNA).
891 The terms "label" or "labeled" refers to incorporation of a detectable marker or
892 molecule, e.g., by incorporation of radiolabeled nucleoside triphosphates or radioisotopes to
893 a nucleic acid that can be detected or measured. Various methods of labeling nucleic acids are
894 known in the art (see Short Protocols in Molecular Biology, 5th Ed., John Wiley & Sons,
895 2002) and may be used. Examples of labels for nucleic acids include, but are not limited to,
896 the following: radioisotopes (e.g., 32P-labeled NTPs and dNTPs; 35S-labeled NTPs and
897 dNTPs; 3H; 14C; 125I), fluorophores and fluorescent labels (e.g., FITC; rhodamine; lanthanide
898 phosphors; cyanine (Cy3, Cy5); fluorescein; coumarin; SYBR Green); and digoxigenin-11-
899 dUTP.
900 The term "MA segment", also referred to as a "MA sequence," refers to a nucleotide
901 sequence located downstream from the TAG and upstream of the transcription termination
902 signal in the TAG plasmids and their derivatives. All mRNAs synthesized from the various
903 promoters studied in a single experiment will contain the same MA sequence, to which a
904 complementary primer can anneal and initiate the synthesis of the first strand cDNA in order
905 to make hybridization probes. The MA sequence is usually 20 to 30 nucleotides in length,
906 but may be longer provided the MA sequence does not contain any secondary structure, such
907 as hairpin loops, which would prevent an efficient cDNA synthesis. The MA sequence is
908 composed of approximately 50% GC, such that the melting temperature ranges from about
909 70°C to about 75°C. MA sequences are unique among all published nucleotide databases, so
910 that only the TAG-transcripts will serve as template for cDNA synthesis. MA sequences do
911 not contain any of the restriction sites that are used elsewhere in the TAG plasmids for
912 cloning purposes. It cannot function as (or does not contain) a transcriptional promoter or
913 transcription termination signal.
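Several of the MA sequence criteria above (length, GC content, melting temperature, absence of cloning restriction sites) can be screened with a minimal sketch. The function names, the example sequence, and the list of restriction sites are hypothetical. The disclosure does not state how the melting temperature is calculated; the Wallace rule (2°C per A or T, 4°C per G or C) is used here only as an assumed rough estimate. Secondary structure and database uniqueness are not checked.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C by the Wallace rule
    (an assumption; the source does not specify a formula)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[b] for b in reversed(seq.upper()))


def check_ma_candidate(seq, forbidden_sites):
    """Return a dict of pass/fail results for the screened criteria."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    return {
        "length_20_to_30": 20 <= len(seq) <= 30,
        "gc_about_50_percent": 0.45 <= gc_fraction(seq) <= 0.55,
        "tm_70_to_75": 70 <= wallace_tm(seq) <= 75,
        "no_cloning_sites": not any(
            site in seq or site in rc for site in forbidden_sites
        ),
    }


# Hypothetical cloning sites (EcoRI, BamHI, HindIII) and candidate sequence.
cloning_sites = ["GAATTC", "GGATCC", "AAGCTT"]
candidate = "ATCGGATCAGTCCAGTACGTCAGT"
report = check_ma_candidate(candidate, cloning_sites)
```

A candidate that passes this screen would still need to be checked for hairpin loops and compared against published nucleotide databases before use.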
914 The term "mixing" refers to combining, joining, uniting, associating, fusing, or
915 ligating at least two distinct nucleotide sequences such that they become one fragment.
916 The term "multiple cloning site," also referred to as an "MCS" or a "polylinker"
917 refers to a short segment of DNA which contains many (usually 20+) sites recognized by
918 restriction enzymes or other endonucleases such as homing endonucleases.
919 The term "nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer in
920 either single- or double-stranded form, and unless otherwise limited, encompasses known
921 analogues having the essential nature of natural nucleotides in that they hybridize to single-
922 stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide
923 nucleic acids).
924 The term "nucleotide" refers to a chemical compound that consists of a heterocyclic
925 base, a sugar, and one or more phosphate groups. In the most common nucleotides the base
926 is a derivative of purine or pyrimidine, and the sugar is the pentose deoxyribose or ribose.
927 Nucleotides are the monomers of nucleic acids, with three or more bonding together in order
928 to form a nucleic acid. Nucleotides are the structural units of RNA, DNA, and several
929 cofactors: CoA, FAD, FMN, NAD, and NADP. The purines include adenine (A), and
930 guanine (G); the pyrimidines include cytosine (C), thymine (T), and uracil (U).
931 The terms "oligoclonal" and "polyclonal" applied to cell populations indicate a
932 population of cells where some cells within that population are not genetically identical to the
933 rest of the cells of that population. Conversely, the term "monoclonal" or "monoclonal cell
934 population" indicates that all cells within that population are genetically identical.
935 Differences in the "genetic identity" of a population of cells in the context of this disclosure
936 arise by random retroviral integration into different genomic insertion sites.
937 The term "operably linked" refers to a functional linkage between a promoter and a
938 second sequence, wherein the promoter sequence initiates and mediates transcription of the
939 DNA sequence corresponding to the second sequence. Generally, operably linked means that
940 the nucleic acid sequences being linked are contiguous and, where necessary to join two
941 protein coding regions, contiguous and in the same reading frame.
942 The term "optical density" refers to the absorbance of an optical element for a given
943 wavelength per unit distance. Typically, bacterial cultures are measured at a wavelength of
944 600 nm.
945 The term "polymerase chain reaction" or "PCR" refers to a procedure described in
946 U.S. Pat. No. 4,683,195, the disclosure of which is incorporated herein by reference.
947 The term "polynucleotide" refers to a deoxyribopolynucleotide, ribopolynucleotide, or
948 analogs thereof that have the essential nature of a natural ribonucleotide in that they
949 hybridize, under stringent hybridization conditions, to substantially the same nucleotide
950 sequence as naturally occurring nucleotides and/or allow translation into the same amino
951 acid(s) as the naturally occurring nucleotide(s). A polynucleotide can be full-length or a
952 subsequence of a native or heterologous structural or regulatory gene. Unless otherwise
953 indicated, the term includes reference to the specified sequence as well as the complementary
954 sequence thereof. Thus, DNAs or RNAs with backbones modified for stability or for other
955 reasons are "polynucleotides" as that term is intended herein. Moreover, DNAs or RNAs
956 comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to
957 name just two examples, are polynucleotides as the term is used herein. It will be appreciated
958 that a great variety of modifications have been made to DNA and RNA that serve many
959 useful purposes known to those of skill in the art. The term polynucleotide as it is employed
960 herein embraces such chemically, enzymatically or metabolically modified forms of
961 polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses
962 and cells, including among other things, simple and complex cells.
963 The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to
964 refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which
965 one or more amino acid residue is an artificial chemical analogue of a corresponding
966 naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The
967 essential nature of such analogues of naturally occurring amino acids is that, when
968 incorporated into a protein that protein is specifically reactive to antibodies elicited to the
969 same protein but consisting entirely of naturally occurring amino acids. The terms
970 "polypeptide", "peptide" and "protein" are also inclusive of modifications including, but not
971 limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid
972 residues, hydroxylation and ADP-ribosylation. It will be appreciated, as is well known and as
973 noted above, that polypeptides are not entirely linear. For instance, polypeptides may be
974 branched as a result of ubiquitination, and they may be circular, with or without branching,
975 generally as a result of posttranslational events, including natural processing events and events
976 brought about by human manipulation which do not occur naturally. Circular, branched and
977 branched circular polypeptides may be synthesized by non-translation natural process and by
978 entirely synthetic methods, as well.
979 The term "primer" refers to a nucleic acid which, when hybridized to a strand of
980 DNA, is capable of initiating the synthesis of an extension product in the presence of a
981 suitable polymerization agent. The primer preferably is sufficiently long to hybridize
982 uniquely to a specific region of the DNA strand. A primer may also be used on RNA, for
983 example, to synthesize the first strand of cDNA.
The term "promoter" refers to a region of DNA upstream, downstream, or distal, from
the start of transcription and involved in recognition and binding of RNA polymerase and
other proteins to initiate transcription. For example, T7, T3 and SP6 are RNA polymerase
promoter sequences. In RNA synthesis, promoters are a means to demarcate which genes
should be used for messenger RNA creation and, by extension, control which proteins the cell
manufactures. Promoters represent critical elements that can work in concert with other
regulatory regions (enhancers, silencers, boundary elements/insulators) to direct the level of
transcription of a given gene.
The term "promoter sequence candidate" refers to a nucleotide sequence that contains
a putative promoter sequence. A promoter sequence candidate may be provided by a
computer-predicted model, DNA fragments from a collection of nucleotide sequences, such
as a genomic library, deletion or site-directed mutants of a specific promoter, tissue-specific
promoters, artificial promoters, etc.
The term "promoterless" refers to a protein coding sequence contained in a vector,
retrovirus, adenovirus, adeno-associated virus or retroviral provirus that is not directly or
significantly under the control of a promoter within the vector, whether it be in RNA or DNA
form. The vector, plasmid, viral or otherwise, may contain a promoter, but that promoter
cannot be positioned or configured such that it directly or significantly regulates the
expression of the promoterless protein coding sequence.
The term "protein coding sequence" refers to a nucleotide sequence encoding a
polypeptide gene which can be used to distinguish cells expressing the polypeptide gene from
those not expressing the polypeptide gene. Protein coding sequences include those commonly
referred to as selectable markers. Examples of protein coding sequences include those encoding
a cell surface antigen and those encoding enzymes. A representative list of protein coding
sequences includes thymidine kinase, beta-galactosidase, tryptophan synthetase, neomycin
phosphotransferase, histidinol dehydrogenase, luciferase, chloramphenicol acetyltransferase,
dihydrofolate reductase (DHFR), hypoxanthine guanine phosphoribosyl transferase
(HGPRT), CD4, CD8 and hygromycin phosphotransferase (HYGRO).
The term "recombinant" refers to a cell or vector that has been modified by the
introduction of a heterologous nucleic acid or the cell that is derived from a cell so modified.
Thus, for example, recombinant cells express genes that are not found in identical form
within the native (non-recombinant) form of the cell or express native genes that are
otherwise abnormally expressed, under-expressed or not expressed at all as a result of
deliberate human intervention. The term "recombinant" as used herein does not encompass
the alteration of the cell or vector by naturally occurring events (e.g., spontaneous mutation,
natural transformation/transduction/transposition) such as those occurring without deliberate
human intervention.
The term "recombinant expression cassette" refers to a nucleic acid construct,
generated recombinantly or synthetically, with a series of specified nucleic acid elements
which permit transcription of a particular nucleic acid in a host cell. The recombinant
expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA,
virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of an
expression vector includes, among other sequences, a nucleic acid to be transcribed, a
promoter, and a transcription termination signal such as a poly-A signal.
The term "recombinant host" refers to any prokaryotic or eukaryotic cell that contains
either a cloning vector or an expression vector. This term also includes those prokaryotic or
eukaryotic cells that have been genetically engineered to contain the cloned genes, or gene of
interest, in the chromosome or genome of the host cell.
The term "regulatory sequence" (also called regulatory region or regulatory element)
refers to a promoter, enhancer or other segment of DNA where regulatory proteins such as
transcription factors bind preferentially. Regulatory sequences control gene expression and thus
protein expression.
The term "reporter cell line" refers to prokaryotic or eukaryotic cells that contain a
reporter or assay marker.
The term "restriction digestion" refers to a procedure used to prepare DNA for
analysis or other processing. Also known as DNA fragmentation, it uses a restriction enzyme
to selectively cleave strands of DNA into shorter segments.
The term "restriction enzyme" (or restriction endonuclease) refers to an enzyme that
cuts double-stranded DNA. The enzyme makes two incisions, one through each of the
phosphate backbones of the double helix without damaging the bases. Restriction enzymes
are classified biochemically into four types, designated Type I, Type II, Type III, and Type
IV. In Type I and Type III systems, both the methylase and restriction activities are carried
out by a single large enzyme complex. Although these enzymes recognize specific DNA
sequences, the sites of actual cleavage are at variable distances from these recognition sites,
and can be hundreds of bases away. Both require ATP for their proper function. In Type II
systems, the restriction enzyme is independent of its methylase, and cleavage occurs at very
specific sites that are within or close to the recognition sequence. Type II enzymes are
further classified according to their recognition site. Most Type II enzymes cut palindromic
DNA sequences, while Type IIa enzymes recognize non-palindromic sequences and cleave
outside of the recognition site. Type IIb enzymes cut sequences twice at both sites outside of
the recognition sequence. In Type IV systems, the restriction enzymes target only methylated
DNA.
The terms "restriction sites" or "restriction recognition sites" refer to particular
sequences of nucleotides that are recognized by restriction enzymes as sites to cut the DNA
molecule. The sites are generally, but not necessarily, palindromic (because restriction
enzymes usually bind as homodimers), and a particular enzyme may cut between two
nucleotides within its recognition site, or somewhere nearby.
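By way of illustration only, the palindromic property of a restriction site can be checked computationally: a site is palindromic when it reads the same as its own reverse complement. The following Python sketch is not part of the disclosure. The function names are hypothetical, and the two recognition sequences used in the usage note (GAATTC for EcoRI and GGTCTC for BsaI) are well-known examples that do not appear in the text above.

```python
# Illustrative sketch only; helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A site is palindromic if it equals its own reverse complement."""
    site = site.upper()
    return site == reverse_complement(site)


def find_sites(dna, site):
    """Return 0-based start positions of every occurrence of site in dna,
    including overlapping occurrences, on the given strand only."""
    dna, site = dna.upper(), site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions
```

For instance, `is_palindromic("GAATTC")` is true, `is_palindromic("GGTCTC")` is false, and `find_sites("AAGAATTCGGGAATTCTT", "GAATTC")` returns `[2, 10]`. For a non-palindromic site, the opposite strand would have to be searched separately with the reverse complement of the site.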
The term "reverse transcription" or "reverse transcription polymerase chain reaction"
(RT-PCR) refers to amplifying a defined piece of a ribonucleic acid (RNA) molecule. The
RNA strand is first reverse transcribed into its DNA complement or complementary DNA,
followed by amplification of the resulting DNA using polymerase chain reaction.
The term "selectable marker" refers to a gene introduced into a cell, especially a
bacterium or to cells in culture, that confers a trait suitable for artificial selection. Selectable
markers are a type of reporter gene used in laboratory microbiology, molecular biology, and genetic
engineering to indicate the success of a transfection or other procedure meant to introduce
foreign DNA into a cell. For example, analysis of gene function frequently requires the
formation of cells that contain the studied gene in a stably integrated form. In some
situations, few cells may stably integrate DNA; thus a dominant selectable marker is used to
permit isolation of stable transfectants. Selectable markers may include antibiotic resistance
genes (for example, ampicillin resistance) and 'suicide' genes (for example ccdB). Positive
selective markers may utilize adenosine deaminase (thymidine, hypoxanthine,
9-β-D-xylofuranosyl adenine, 2'-deoxycoformycin), aminoglycoside phosphotransferase
(neomycin, G418, gentamycin, kanamycin), bleomycin (bleomycin, phleomycin, zeocin),
cytosine deaminase (N-(phosphonacetyl)-L-aspartate, inosine, cytosine), dihydrofolate
reductase (methotrexate, aminopterin), histidinol dehydrogenase (histidinol),
hygromycin-B-phosphotransferase (hygromycin-B), puromycin-N-acetyl transferase (puromycin),
thymidine kinase (hypoxanthine, aminopterin, thymidine, glycine), and xanthine-guanine
phosphoribosyltransferase (xanthine, hypoxanthine, thymidine, aminopterin, mycophenolic
acid, L-glutamine). Negative selectable markers may utilize cytosine deaminase
(5-fluorocytosine), diphtheria toxin, ccdB, and HSV-TK.
The term "selectively hybridizes" refers to hybridization, under stringent
hybridization conditions, of a nucleic acid sequence to a specified nucleic acid target
sequence to a detectably greater degree (e.g., at least 2-fold over background) than its
hybridization to non-target nucleic acid sequences and to the substantial exclusion of
non-target nucleic acids. Selectively hybridizing sequences typically have at least about 80%
sequence identity, preferably 90% sequence identity, and most preferably 100% sequence
identity (i.e., complementary) with each other.
The term "sense" refers to the general concept used to compare the polarity of nucleic
acid molecules to other nucleic acid molecules. Generally, a DNA sequence is called "sense"
if its sequence is the same as that of a messenger RNA copy that is translated into protein.
The sequence on the opposite strand is complementary to the sense sequence and is therefore
called the "antisense" sequence.
The term "TAG" refers to a DNA sequence composed of random nucleotides, in
which each position has an equal probability of having any of the four deoxynucleotides (A,
C, T, and G). Other bases, such as inosine, uracil, 5-methylcytosine, 8-azaguanine,
2,6-diaminopurine, 5-bromouracil, and other derivatives may be incorporated in their nucleotide
form into the sequences. The length of the TAG sequence is short, preferably between about
16 bp and about 200 bp, more preferably between about 20 and about 150 bp, more preferably
between about 30 and about 120 bp, more preferably between about 40 and about 100 bp, more
preferably between about 50 and about 75 bp, and most preferably about 60 bp. The sequences
are preferably different or distinct enough to avoid annealing to each other at times when the
oligonucleotide is present as a single strand. In addition, the sequence should not be
self-complementary, so as to avoid the formation of primer-dimers during amplification. Within a
plurality of TAG sequences, each TAG sequence will have approximately equivalent
amounts of the nucleotides A, T, G, and C such that each TAG sequence has approximately
the same melting temperature as the other TAGs. A common melting temperature allows for
the unbiased quantification of various mRNAs, each containing a different TAG sequence, by
hybridization under the same temperature and ionic strength conditions. Within a plurality of
TAG sequences, the nucleotide sequence of each individual TAG sequence is unique to the
individual TAG of the plurality.
The term "transcription termination signal" refers to a section of genetic sequence that
marks the end of a gene or operon on genomic DNA for transcription. In prokaryotes, two
classes of transcription termination signals are known: 1) intrinsic transcription termination
signals, where a hairpin structure forms within the nascent transcript that disrupts the
mRNA-DNA-RNA polymerase ternary complex; and 2) Rho-dependent transcription termination
signals that require Rho factor, an RNA helicase protein complex, to disrupt the nascent
mRNA-DNA-RNA polymerase ternary complex. In eukaryotes, transcription termination
signals are recognized by protein factors that co-transcriptionally cleave the nascent RNA at a
polyadenylation signal (i.e., "poly-A signal" or "poly-A tail"), halting further elongation of the
transcript by RNA polymerase. The subsequent addition of the poly-A tail at this site
stabilizes the mRNA and allows it to be exported outside the nucleus. Termination sequences
are distinct from termination codons, which occur in the mRNA and are the stopping signal for
translation, and which may also be called nonsense codons.
The term "translational stop sequence" refers to a sequence which codes for the
translational stop codons. In some embodiments, the translational stop sequence may be in
one, two, or three reading frames.
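As a non-limiting illustration of a translational stop sequence that covers one, two, or three reading frames, the following Python sketch reports which forward reading frames of a DNA sequence contain a stop codon. It assumes the standard stop codons TAA, TAG, and TGA written in the sense-strand DNA alphabet. The function name and the example sequence in the usage note are hypothetical and are not taken from the disclosure.

```python
# Illustrative sketch only; names and the example sequence are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def frames_with_stop(seq):
    """Return the set of forward reading frames (0, 1, 2) in which
    at least one complete codon of seq is a stop codon."""
    seq = seq.upper()
    frames = set()
    for frame in range(3):
        # Step through complete codons only, starting at this frame offset.
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                frames.add(frame)
                break
    return frames
```

For example, `frames_with_stop("TAAATAGATGA")` returns `{0, 1, 2}`, because TAA, TAG, and TGA fall in frames 0, 1, and 2 respectively, whereas `frames_with_stop("ATGTAA")` returns `{0}`.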
The term "transfection" refers to the introduction of foreign DNA into eukaryotic or
prokaryotic cells. Transfection typically involves opening transient holes in cells to allow the
entry of extracellular molecules, typically supercoiled plasmid DNA, but also siRNA, among
others. There are various methods of transfecting cells. One method is by calcium
phosphate. HEPES-buffered saline solution containing phosphate ions is combined with a
calcium chloride solution containing the DNA to be transfected. When the two are
combined, a fine precipitate of calcium phosphate will form, binding the DNA to be
transfected on its surface. The suspension of the precipitate is then added to the cells to be
transfected. The cells take up the precipitate and the DNA. Alternatively, MgCl2 or RbCl can be
used. Other methods of transfection include electroporation, heat shock, proprietary
transfection agents, dendrimers, and the use of liposomes. Liposomes are small,
membrane-bounded bodies that fuse to the cell membrane, releasing DNA into the cell. For eukaryotic
cells, lipid-cation based transfection is typically used. Other methods of transfection include
use of the gene gun and viruses. For stable transfection, another gene is co-transfected, which
gives the cell some selection advantage, such as resistance towards a certain toxin. If the
toxin, towards which the co-transfected gene offers resistance, is then added to the cell
culture, only those cells with the foreign genes inserted into their genome will be able to
proliferate, while other cells will die. After applying this selection pressure for some time,
only the cells with a stable transfection remain and can be cultivated further. A common
agent for stable transfection is Geneticin, also known as G418, which is a toxin that can be
neutralized by the product of the neomycin resistance gene (see Bacchetti and Graham.
Transfer of the gene for thymidine kinase to thymidine kinase-deficient human cells by
purified herpes simplex viral DNA. 1977. Proc. Natl. Acad. Sci. USA 74(4):1590-94).
Conventional transient transfection assays may incorporate internal controls, such as
pRL-SV40 (Promega, Inc.), and may be used in combination with any experimental reporter vector
to co-transfect mammalian cells.
The term "transformation" refers to the genetic alteration of a cell resulting from the
introduction, uptake, and expression of foreign genetic material (DNA or RNA). In bacteria,
transformation refers to a genetic change brought about by taking up and expressing DNA,
and "competence" refers to a state of being able to take up DNA. Competent cells may be
generated by a laboratory procedure in which cells are passively made permeable to DNA,
using conditions that do not normally occur in nature; thus cells that have been manipulated
to accept foreign DNA are called "competent cells". These procedures are comparatively
easy and simple, and can be used to genetically engineer bacteria. These procedures may
include chilling cells in the presence of divalent cations, such as CaCl2, which prepares the
cell walls to become permeable to plasmid DNA. Cells are incubated with the DNA and then
briefly heat shocked (e.g., 42°C for 30-120 seconds), which causes the DNA to enter the cell.
This method works well for circular plasmid DNAs. Electroporation is another way to allow
DNA to enter cells and involves briefly shocking cells with an electric field of 100-200 V.
Plasmid DNA enters cells via the holes created in the cell membrane by the electric shock;
natural membrane-repair mechanisms close these holes afterwards. Yeasts may be
transformed, for example, by High Efficiency Transformation (see Gietz, R. D., and R. A.
Woods. 2002. Transformation of Yeast by the LiAc/SS Carrier DNA/PEG Method. Methods in
Enzymology 350:87-96); the Two-hybrid System Protocol (see Gietz, R. D., B. Triggs-Raine,
A. Robbins, K. C. Graham, and R. A. Woods. 1997. Identification of proteins that interact with
a protein of interest: Applications of the yeast two-hybrid system. Mol Cell Biochem
172:67-79); and the Rapid Transformation Protocol (see Gietz, R. D., and R. A. Woods. 2002.
Transformation of Yeast by the LiAc/SS Carrier DNA/PEG Method. Methods in Enzymology
350:87-96).
The term "vector" refers to a nucleic acid used in transfection of a host cell and into
which can be inserted a polynucleotide. Vectors are frequently replicons. Expression vectors
permit transcription of a nucleic acid inserted therein. Some common vectors include
plasmids, cosmids, viruses, phages, recombinant expression cassettes, and transposons. The
term "vector" may also refer to an element which aids in the transfer of a gene from one
location to another. Vectors may include expression vectors and cloning vectors.
The following terms are used to describe the sequence relationships between two or
more nucleic acids or polynucleotides: (a) "reference sequence", (b) "comparison window",
(c) "sequence identity", (d) "percentage of sequence identity", and (e) "substantial identity".
The term "reference sequence" refers to a sequence used as a basis for sequence comparison.
A reference sequence may be a subset or the entirety of a specified sequence; for example, as
a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence.
The term "comparison window" refers to a contiguous and specified segment of a
polynucleotide sequence, wherein the polynucleotide sequence may be compared to a
reference sequence and wherein the portion of the polynucleotide sequence in the comparison
window may comprise additions or deletions (i.e., gaps) compared to the reference sequence
(which does not comprise additions or deletions) for optimal alignment of the two sequences.
Generally, the comparison window is at least 20 contiguous nucleotides in length, and
optionally can be 30, 40, 50, 100, or longer. Those of skill in the art understand that to avoid
a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide
sequence, a gap penalty is typically introduced and is subtracted from the number of matches.
All TAGs should lack homology to other TAGs used within the same assay.
Depending on the method by which the probe is made, homology of the TAG with known
nucleic acid sequences may be acceptable. For example, if the probe is made by labeling
mRNA directly, for example with polyA polymerase (see, for example, Aviv and Leder, Proc
Natl Acad Sci U S A. 1972 Jun;69(6):1408-12), the TAG-containing mRNAs, the
endogenous mRNAs and possibly the tRNA and rRNA may be labeled as well.
Hybridization by these latter RNAs may interfere with detection by the probe. In that case, the
TAGs should not have homology with any known sequence that is transcribed into RNA, including
mRNA, tRNA, rRNA, etc. If the probe is made by labeling the first-strand cDNA, there are
two possibilities: 1) if oligo(dT) is used as a primer, all first strand cDNA synthesized from
mRNAs will be labeled, including the TAG-containing mRNAs and the endogenous mRNAs.
These latter cDNAs may interfere with detection by the probe, thus the TAGs should not
have homology with any known sequence that is transcribed into RNA; and 2) if
oligo(dT)+anchor is used as a primer "B" (where the anchor would be a short stretch of
nucleotides corresponding to the 3' end of the mRNA, immediately preceding the poly A),
only cDNAs synthesized from mRNAs terminated by the same or similar transcription
termination signal as the one used for the TAG constructs will be labeled. Thus if a particular
kind of endogenous mRNA is recognized by the oligo(dT)-anchor primer, that specific
mRNA would interfere with detection by the probe; therefore the TAG should not share
homology with that specific mRNA. If the probe is made by PCR, in addition to the
homology considerations discussed above with regard to the synthesis of the first strand
cDNA, there are two additional considerations. First, linear amplification of the first strand
cDNA is made using a primer (A) corresponding to a region common to all the TAG-mRNAs
that is located 5' to the TAG. This situation may arise when the vector (plasmid or viral
DNA) from which the probe may be made is removed and the primer B used for the
first strand cDNA synthesis is removed as well. Accordingly, if the first strand cDNA was
synthesized using oligo(dT) as the primer, then the TAGs should not have homology with any
known sequence that is transcribed into mRNA and that shares sequence identity with primer
A; and if the first strand cDNA was synthesized using oligo(dT)-anchor as the primer, then
the TAGs should not have homology with any known sequence that is transcribed into mRNA
and that shares sequence identity with both the 3' end of the TAG-mRNA and primer A. Second,
exponential amplification of the first strand cDNA using primer (A) and the oligo(dT)-based
primer occurs. In this situation, the antisense strand may be used as a probe and the assay
membrane printed with the sense-strand oligonucleotides, so that the vector does not have
to be removed, as discussed above. Thus, at times, one can use TAGs with sequences that are
found elsewhere in databases. A specific TAG should not share sequence homology with any
other TAG used simultaneously in the same assay or with any DNA or RNA molecule that
will be labeled during the synthesis of the probe, regardless of the method used to synthesize
the probe.
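Purely as an illustrative sketch, the TAG design constraints described herein (random sequence, approximately balanced base composition so that melting temperatures are similar, no self-complementarity, and no homology between TAGs used in the same assay) can be approximated by a simple rejection-sampling procedure. The Python code below is not the method of the disclosure. The function names, the GC window of 45% to 55% (used here as a crude proxy for a common melting temperature), and the use of shared 10-mers as a stand-in for "homology" are all assumptions chosen for illustration. Screening against endogenous transcripts, as discussed above, would require an external database search and is not shown.

```python
import random

# Illustrative sketch only; thresholds and helper names are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def is_self_complementary(seq, k):
    """True if some k-mer of seq also occurs in its reverse complement,
    i.e. two stretches of the same molecule could anneal to each other."""
    return not kmers(seq, k).isdisjoint(kmers(reverse_complement(seq), k))


def design_tags(n, length=60, k=10, gc_range=(0.45, 0.55), seed=0,
                max_tries=100000):
    """Draw random candidates and keep those passing every filter."""
    rng = random.Random(seed)
    tags = []
    seen = set()  # k-mers, on both strands, of all accepted TAGs
    for _ in range(max_tries):
        if len(tags) == n:
            break
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        if not gc_range[0] <= gc_fraction(cand) <= gc_range[1]:
            continue  # composition too skewed for a shared melting temperature
        if is_self_complementary(cand, k):
            continue  # could form hairpins or primer-dimers
        both = kmers(cand, k) | kmers(reverse_complement(cand), k)
        if not both.isdisjoint(seen):
            continue  # shares a k-mer with an already accepted TAG
        tags.append(cand)
        seen |= both
    if len(tags) < n:
        raise RuntimeError("could not design enough TAGs; relax the filters")
    return tags
```

Calling `design_tags(5)` returns five 60-nucleotide sequences, none of which shares a 10-mer (on either strand) with any other. A stricter design could replace the GC window with a nearest-neighbor melting temperature calculation.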
Methods of alignment of sequences for comparison are well-known in the art.
Optimal alignment of sequences for comparison may be conducted by the local homology
algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981); by the homology
alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970); by the search
for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. 85:2444 (1988); by
computerized implementations of these algorithms, including, but not limited to: CLUSTAL
in the PC/Gene program by Intelligenetics, Mountain View, Calif.; GAP, BESTFIT, BLAST,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer
Group (GCG), 575 Science Dr., Madison, Wis., USA; the CLUSTAL program is well
described by Higgins and Sharp, Gene 73:237-244 (1988); Higgins and Sharp, CABIOS
5:151-153 (1989); Corpet, et al., Nucleic Acids Research 16:10881-90 (1988); Huang, et al.,
Computer Applications in the Biosciences 8:155-65 (1992), and Pearson, et al., Methods in
Molecular Biology 24:307-331 (1994). The BLAST family of programs which can be used
for database similarity searches includes BLASTN for nucleotide query sequences against
nucleotide database sequences; BLASTX for nucleotide query sequences against protein
database sequences; BLASTP for protein query sequences against protein database
sequences; TBLASTN for protein query sequences against nucleotide database sequences;
and TBLASTX for nucleotide query sequences against nucleotide database sequences. See
Current Protocols in Molecular Biology, Chapter 19, Ausubel, et al., Eds., Greene Publishing
and Wiley-Interscience, New York (1995).
Unless otherwise stated, sequence identity/similarity values provided herein refer to
the value obtained using the BLAST 2.0 suite of programs using default parameters. Altschul
et al., Nucleic Acids Res. 25:3389-3402 (1997). Software for performing BLAST analyses is
publicly available, e.g., through the National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring
sequence pairs (HSPs) by identifying short words of length W in the query sequence, which
either match or satisfy some positive-valued threshold score T when aligned with a word of
the same length in a database sequence. T is referred to as the neighborhood word score
threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for
initiating searches to find longer HSPs containing them. The word hits are then extended in
both directions along each sequence for as far as the cumulative alignment score can be
increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters
M (reward score for a pair of matching residues; always >0) and N (penalty score for
mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to
calculate the cumulative score. Extension of the word hits in each direction is halted when:
the cumulative alignment score falls off by the quantity X from its maximum achieved value;
the cumulative score goes to zero or below, due to the accumulation of one or more
negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm
parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN
program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation
(E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid
sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E)
of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl.
Acad. Sci. USA 89:10915).
In addition to calculating percent sequence identity, the BLAST algorithm also
performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin &
Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity
provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an
indication of the probability by which a match between two nucleotide or amino acid
sequences would occur by chance. BLAST searches assume that proteins can be modeled as
random sequences. However, many real proteins comprise regions of nonrandom sequences
which may be homopolymeric tracts, short-period repeats, or regions enriched in one or more
amino acids. Such low-complexity regions may be aligned between unrelated proteins even
though other regions of the protein are entirely dissimilar. A number of low-complexity filter
programs can be employed to reduce such low-complexity alignments. For example, the SEG
(Wootton and Federhen, Comput. Chem., 17:149-163 (1993)) and XNU (Claverie and States,
Comput. Chem., 17:191-201 (1993)) low-complexity filters can be employed alone or in
combination.
As used herein, "sequence identity" or "identity" in the context of two nucleic
acid or polypeptide sequences refers to the residues in the two sequences which are the same
when aligned for maximum correspondence over a specified comparison window. When
percentage of sequence identity is used in reference to proteins it is recognized that residue
positions which are not identical often differ by conservative amino acid substitutions, where
amino acid residues are substituted for other amino acid residues with similar chemical
properties (e.g. charge or hydrophobicity) and therefore do not change the functional
properties of the molecule. Where sequences differ in conservative substitutions, the percent
sequence identity may be adjusted upwards to correct for the conservative nature of the
substitution. Sequences which differ by such conservative substitutions are said to have
"sequence similarity" or "similarity". Means for making this adjustment are well-known to
those of skill in the art. Typically this involves scoring a conservative substitution as a partial
rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for
example, where an identical amino acid is given a score of 1 and a non-conservative
substitution is given a score of zero, a conservative substitution is given a score between zero
and 1. The scoring of conservative substitutions is calculated, e.g., according to the algorithm
of Myers and Miller, Computer Applic. Biol. Sci., 4:11-17 (1988), e.g., as implemented in
the program PC/GENE (Intelligenetics, Mountain View, Calif., USA).
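The partial-credit scoring described above (identical residue scored 1, non-conservative substitution scored zero, conservative substitution scored between zero and 1) can be sketched as follows. This Python example is illustrative only and is not the algorithm referenced above or the PC/GENE implementation. The amino acid groupings, the partial score of 0.5, and the function names are assumptions, and the two sequences are assumed to be already aligned without gaps.

```python
# Illustrative sketch only. The groups below are one hypothetical way of
# binning residues by similar chemical properties; real scoring schemes differ.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("STNQ"),  # polar, uncharged
    set("AG"),    # small
]


def pair_score(a, b, partial=0.5):
    """1 for identity, `partial` for a conservative substitution, else 0."""
    if a == b:
        return 1.0
    if any(a in group and b in group for group in CONSERVATIVE_GROUPS):
        return partial
    return 0.0


def similarity_percent(seq1, seq2, partial=0.5):
    """Percent similarity of two equal-length, pre-aligned protein sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum(pair_score(a, b, partial) for a, b in zip(seq1, seq2))
    return 100.0 * total / len(seq1)
```

With these assumed groups, `similarity_percent("MKLV", "MRLA")` gives 62.5: two identical positions, one conservative K/R substitution scored 0.5, and one non-conservative V/A substitution scored zero. Strict identity for the same pair (setting `partial=0.0`) would be 50.0.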
As used herein, "percentage of sequence identity" means the value determined by
comparing two optimally aligned sequences over a comparison window, wherein the portion
of the polynucleotide sequence in the comparison window may comprise additions or
deletions (i.e., gaps) as compared to the reference sequence (which does not comprise
additions or deletions) for optimal alignment of the two sequences. The percentage is
calculated by determining the number of positions at which the identical nucleic acid base or
amino acid residue occurs in both sequences to yield the number of matched positions,
dividing the number of matched positions by the total number of positions in the window of
comparison and multiplying the result by 100 to yield the percentage of sequence identity.
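The calculation just described can be written out as a minimal sketch. The Python function below assumes the two sequences have already been optimally aligned by some other means and that gaps are represented by the character "-". The function name and the example sequences in the usage note are hypothetical.

```python
# Illustrative sketch only; assumes a pre-computed alignment with "-" for gaps.
def percent_identity(aligned_query, aligned_reference):
    """Matched positions divided by total positions in the window, times 100."""
    if len(aligned_query) != len(aligned_reference) or not aligned_query:
        raise ValueError("aligned sequences must be non-empty and equal length")
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q == r and q != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_query)
```

For the ten-position window formed by `"ACGT-ACGTA"` aligned against `"ACGTTACCTA"`, eight positions carry the identical base, so `percent_identity` returns 80.0.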
1323 The term "substantial identity" of polynucleotide sequences means that a polynucleotide
1324 comprises a sequence that has at least 70% sequence identity, preferably at least 80%, more
1325 preferably at least 90% and most preferably at least 95%, compared to a reference sequence
1326 using one of the alignment programs described using standard parameters. One of skill will
1327 recognize that these values can be appropriately adjusted to determine corresponding identity
1328 of proteins encoded by two nucleotide sequences by taking into account codon degeneracy,
1329 amino acid similarity, reading frame positioning and the like. Substantial identity of amino
1330 acid sequences for these purposes normally means sequence identity of at least 60%, or
1331 preferably at least 70%, 80%, 90%, and most preferably at least 95%. Another indication that
1332 nucleotide sequences are substantially identical is if two molecules hybridize to each other
1333 under stringent conditions. However, nucleic acids which do not hybridize to each other
1334 under stringent conditions are still substantially identical if the polypeptides which they
1335 encode are substantially identical. This may occur, e.g., when a copy of a nucleic acid is
1336 created using the maximum codon degeneracy permitted by the genetic code. One indication
1337 that two nucleic acid sequences are substantially identical is that the polypeptide which the
1338 first nucleic acid encodes is immunologically cross reactive with the polypeptide encoded by
1339 the second nucleic acid. The term "substantial identity" in the context of a peptide indicates
1340 that a peptide comprises a sequence with at least 70% sequence identity to a reference
1341 sequence, preferably 80%, more preferably 85%, most preferably at least 90% or 95% sequence
1342 identity to the reference sequence over a specified comparison window. Optionally, optimal
1343 alignment is conducted using the homology alignment algorithm of Needleman and Wunsch,
1344 J. Mol. Biol. 48:443 (1970). An indication that two peptide sequences are substantially
1345 identical is that one peptide is immunologically reactive with antibodies raised against the
1346 second peptide. Thus, a peptide is substantially identical to a second peptide, for example,
1347 where the two peptides differ only by a conservative substitution. Peptides which are
1348 "substantially similar" share sequences as noted above except that residue positions which are
1349 not identical may differ by conservative amino acid changes.
1350 Methods of extraction of RNA are well-known in the art and are described, for
1351 example, in J. Sambrook et al., "Molecular Cloning: A Laboratory Manual" (Cold Spring
1352 Harbor Laboratory Press, Cold Spring Harbor, N. Y., 1989), vol. 1, ch. 7, "Extraction,
1353 Purification, and Analysis of Messenger RNA from Eukaryotic Cells," incorporated herein by
1354 this reference. Other isolation and extraction methods are also well-known, for example in F.
1355 Ausubel et al., "Current Protocols in Molecular Biology" (John Wiley & Sons). Typically,
1356 isolation is performed in the presence of chaotropic agents such as guanidinium chloride or
1357 guanidinium thiocyanate, although other detergents and extraction agents can alternatively be
1358 used. Typically, the mRNA is isolated from the total extracted RNA by chromatography over
1359 oligo(dT)-cellulose or other chromatographic media that have the capacity to bind the
1360 polyadenylated 3'-portion of mRNA molecules. Alternatively, but less preferably, total RNA
1361 can be used. However, it is generally preferred to isolate poly(A)+RNA.
1362 The method employs several basic steps to achieve its objective. First, a library of
1363 DNA TAGs is designed. The DNA TAG sequences are composed of random nucleotides.
1364 Each DNA TAG sequence, in one embodiment of approximately 60 bp in length, is unique
1365 among a plurality of TAG sequences, i.e. a specific TAG does not share sequence homology
1366 with any other TAG used simultaneously in the same assay and with any DNA or RNA
1367 molecule that will be labeled during the synthesis of the probe, regardless of the method used
1368 to synthesize the probe. The TAG sequences have similar physical properties so that a
1369 plurality of the TAG sequences can be used for hybridization under similar conditions.
1370 Second, pTAG-basic plasmids are constructed. Third, the TAG sequences are inserted into
1371 the pTAG-basic plasmids. Fourth, promoter array membranes are prepared. Fifth, promoter
1372 sequence candidates are inserted into the pTAG plasmids. Sixth, the pTAG plasmids with the
1373 promoter sequence candidate inserts are transfected into host cells, and the RNA extracted.
1374 The RNA or the resultant cDNA derived from the extracted RNA is then labeled, hybridized
1375 to the promoter array membrane, and analysis performed. Thus, the present disclosure
1376 describes an array-based method for promoter detection and analysis. The method provides
1377 for transcriptional products that are tagged as they are synthesized, in such a way that one
1378 specific transcript is labeled with only one type of TAG, and one TAG labels only one type of
1379 transcript. All promoter sequence candidates are analyzed simultaneously in one reaction
1380 vial. The transcriptional output is analyzed on conventional arrays.
1381
1382 BRIEF DESCRIPTION OF THE DRAWINGS
1383
1384 FIGURE 1. Flow diagram of array-based promoter detection and analysis.
1385 FIGURE 2. BrightStar-Plus membranes spotted manually (left) or using a robot (right) with a
1386 collection of reverse-strand TAG oligonucleotides.
1387 FIGURES 3A and 3B. Comparative analysis of the activity of 42 promoters in a single
1388 population of HEK 293 cells. The 42 promoter-TAG plasmids and 8 promoter-less TAG-
1389 reporter plasmids were mixed in equimolar amounts and transfected into the same cell
1390 population. Total RNA was extracted 14 hours after transfection. RNA was labeled using
1391 the linear amplification method, and biotin-labeled probes were hybridized on the TAG-
1392 spotted membranes (Fig 3A). Hybridization was revealed by chemiluminescence, and
1393 quantified by densitometry (Fig 3B). The macroarray membrane was made by spotting
1394 manually each oligonucleotide as a diagonal doublet.
1395 FIGURES 4A and 4B. Comparison of the transcriptional activities of 92 promoters in a single
1396 cell population. The 92 promoter-TAG plasmids and 8 promoter-less TAG-reporter plasmids
1397 were mixed in equimolar amounts and transfected into the same cell population. Total RNA
1398 was extracted 14 hours after transfection. RNA was labeled using the linear amplification
1399 method, and biotin-labeled probes were hybridized on the TAG-spotted membranes (Fig.
1400 4A). Hybridization was revealed by chemiluminescence, and quantified by densitometry
1401 (plain bars) (Fig. 4B). The relative luciferase activities obtained with each plasmid construct
1402 were obtained from previously published work and are shown at the bottom (empty bars)
1403 (Fig. 4B). The numbers at the bottom of the figure refer to the list of promoters described in
1404 Table 1. The luciferase data obtained with the various OM promoters (# 59-73), defensin
1405 promoters (# 74-85), and other promoters studied by Coleman (Coleman, S., et al.
1406 Experimental analysis of the annotation of promoters in the public database. Hum. Mol.
1407 Genet., 2002. 11(16): 1817-1821) were generated in different experimental conditions and
1408 should not be compared between each other. The macroarray membrane was made by
1409 spotting each oligonucleotide as a quadruplet, using a Biorobotics MicroGrid array spotting
1410 robot (Genomic Solutions, Ann Arbor, MI) at the microarray facility of the University of
1411 Idaho Environmental Biotechnology Institute (Moscow, ID).
1412 FIGURES 5A and 5B. Validation of the Promoter Detective method with a set of 35
1413 promoter-TAG plasmids. The autoradiogram (Fig. 5A) was obtained by hybridizing
1414 radioactive TAG-cDNA probes to a membrane spotted with the complementary TAG strands.
1415 The identity of the spots is indicated by numbers on the left side of the autoradiogram, and on
1416 the bottom of the bar chart (Fig. 5B). The bar chart summarizes the intensities of the various
1417 spots, relative to the signal obtained with the CMV promoter (= 100).
1418 FIGURE 6. Flow diagram for the construction of the pTAG reporter plasmid.
1419 FIGURE 7. Plasmid map of the pTAG-basic vector.
1420 TABLE 1. List of 100 promoter sequences used within the examples. Each promoter is
1421 described with its symbol, length, and Refseq or GenBank accession number. The TAG
1422 identification number with which it is associated is also indicated.
1423
1424 DETAILED DESCRIPTION
1425 The present disclosure provides a method for the detection and analysis of DNA
1426 promoter sequences. Figure 1 provides a general flow chart. The disclosure provides for the
1427 construction of a vector library containing potential DNA promoter sequence candidates that
1428 may be present, for example, in a collection of nucleotide sequences, such as a genomic
1429 library, in computer-predicted promoter regions, or in deletion mutants of promoters under
1430 investigation, etc. Each clone generated potentially drives the transcription of a unique
1431 reporter gene composed of a well-defined, approximately 60-bp long DNA TAG composed
1432 of random nucleotides. The transcriptional properties of the various constructs are analyzed
1433 by pooling equimolar amounts of vectors and transfecting them into a cell line of interest.
1434 RNA is extracted, cDNA synthesized and labeled, directly or indirectly, and quantified by
1435 hybridization to the DNA TAGs arrayed on a membrane, glass, or bead support (see Figure 1
1436 for a general schematic diagram). Suitable bead compositions may include those used in
1437 peptide, nucleic acid and organic moiety synthesis, including but not limited to plastics,
1438 ceramics, glass, polystyrene, methylstyrene, acrylic polymers, paramagnetic materials, thoria
1439 sol, carbon graphite, titanium dioxide, latex or cross-linked dextrans such as sepharose,
1440 cellulose, nylon, cross-linked micelles and Teflon, all of which may be used (see Microsphere
1441 Detection Guide, Bangs Laboratories, Fishers, Ind.).
1442 The design, operation and applications for the present disclosure will now be
1443 described in greater detail.
1444 1. Design of a library of DNA TAGs that will be transcribed by the putative DNA promoter
1445 sequences.
1446 The TAG DNA sequences were DNA sequences composed of random nucleotides,
1447 that is each position had an equal probability of having any of the four deoxynucleotides (A,
1448 C, T, and G). Other bases, such as inosine, uracil, 5-methylcytosine, 8-azaguanine, 2,6-
1449 diaminopurine, 5-bromouracil, and other derivatives may be incorporated in their nucleotide
1450 form into the oligonucleotides. The length of the TAG sequence was short, preferably
1451 between about 16 bp to about 200 bp, although a shorter or longer length may be used, but
1452 typically about 60 bp. Within a plurality of TAG sequences, each TAG sequence had
1453 approximately equivalent amounts of the nucleotides A, T, G, and C such that each TAG
1454 sequence had approximately the same melting temperature as the other TAGs. A common
1455 melting temperature allowed for the unbiased quantification of various mRNAs by
1456 hybridization under the same temperature and ionic strength conditions. Within a plurality of
1457 TAG sequences, the nucleotide sequence of each individual TAG sequence was unique
1458 amongst the plurality of TAGs. Each TAG did not share sequence homology with any other
1459 TAG used simultaneously in the same assay and with any DNA or RNA molecule that was
1460 labeled during the synthesis of the probe, regardless of the method used to synthesize the
1461 probe. A 60 bp length of random nucleotides of the TAG sequence allowed for generation of
1462 a large number of unique TAGs that were highly unlikely to be found in nature.
1463 Additionally, the longer length of the TAG (e.g., about 60 bp) allowed for use of
1464 hybridization temperatures (e.g., 70° C) that were high enough to prevent nonspecific
1465 hybridization with partially homologous sequences. The GC content and thus melting
1466 temperature was normalized across the plurality of TAGs to ensure identical hybridization
1467 conditions for all of the TAG probes. To minimize cross-hybridization and for the highest
1468 specificity, all oligonucleotides were selected so that any shared stretch of sequence identity was
1469 no longer than six (6) bases. Low-complexity sequences with stretches of more than four (4)
1470 identical nucleotides were not allowed, thus avoiding difficulties in sequence similarity
1471 searching. Upon generation of the TAG sequences, the sequences were verified for the
1472 absence of homology amongst themselves. In some embodiments, the TAG sequences may
1473 be examined against sequences deposited in public databases such as GenBank, EMBL,
1474 DDBJ, and PDB using NCBI BLASTN to aid in determining if non-intended binding may
1475 occur. Oligonucleotides are generally synthesized as single strands by standard chemistry
1476 techniques, including automated synthesis. Many methods have been described for
1477 synthesizing oligonucleotides containing a randomized base. For example, a randomized
1478 position can be achieved by in-line mixing or using pre-mixed phosphoramidite precursors
1479 during an automated procedure (see, Ausubel et al., Current Protocols in Molecular Biology,
1480 Green Publishing, N.Y., 1995). Oligonucleotides are subsequently deprotected and may be
1481 purified by precipitation with ethanol, chromatographed using a size-exclusion or reversed-
1482 phase column, denaturing polyacrylamide gel electrophoresis, high-pressure liquid
1483 chromatography (HPLC), or other suitable method.
1484
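By way of illustration, and not by way of limitation, the selection rules described in this section (random 60-mers, balanced base composition, no run of more than four identical nucleotides, and no shared stretch longer than six bases between TAGs) may be sketched as a simple rejection-sampling routine. The 45-55% GC window, the seed, and all function names are assumptions made for this sketch only. The sketch compares forward strands only and does not perform the optional database screening (e.g., BLASTN) mentioned above.

```python
import random

BASES = "ACGT"


def has_long_run(seq, max_run=4):
    """Return True if seq holds more than max_run identical bases in a row."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return True
    return False


def gc_fraction(seq):
    """Fraction of G and C bases, used here as a simple proxy for Tm."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmers(seq, k):
    """Set of all substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def design_tags(n_tags, length=60, max_shared=6, gc_range=(0.45, 0.55),
                seed=0, max_tries=100_000):
    """Greedy rejection sampling of random TAG candidates."""
    if n_tags < 1:
        return []
    rng = random.Random(seed)
    k = max_shared + 1  # a shared 7-mer means more than 6 identical bases
    tags, seen = [], set()
    for _ in range(max_tries):
        cand = "".join(rng.choice(BASES) for _ in range(length))
        if has_long_run(cand):
            continue  # low-complexity stretch
        if not gc_range[0] <= gc_fraction(cand) <= gc_range[1]:
            continue  # GC content outside the assumed window
        cand_kmers = kmers(cand, k)
        if cand_kmers & seen:
            continue  # shares a stretch with an already accepted TAG
        tags.append(cand)
        seen |= cand_kmers
        if len(tags) == n_tags:
            return tags
    raise RuntimeError("could not find enough TAGs; relax the constraints")
```

A greedy search of this kind becomes progressively slower as the library grows, because each accepted TAG excludes more candidate subsequences; for large libraries a different search strategy would likely be needed.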
1485 2. Construction of TAG-plasmids
1486 The TAG plasmids were derived from pTAG-basic (Figure 7). This plasmid
1487 incorporates a pair of SfiI sites which generate two distinct 3 nucleotide-long nonsymmetrical
1488 sticky ends suitable for the directional insertion of the TAG oligonucleotides. The plasmid
1489 also incorporates a modified cDNA encoding firefly luciferase (luc+). This 1650 bp cDNA
1490 was excised from the commercially available pGL3 using the restriction enzymes NcoI and
1491 XbaI. The wild-type coding region had been modified, in order to eliminate consensus
1492 sequences recognized by genetic regulatory proteins, thus helping to ensure that this reporter
1493 gene is unaffected by spurious host transcriptional signals. The plasmid also incorporates a
1494 97 bp long α-globin 3'UTR. The high stability of α-globin mRNA, with a half-life from
1495 24 to 60 hours, is attributed to a C-rich cis element in its 3'UTR, to which a protein complex
1496 binds to stabilize the mRNA. This protein complex is highly conserved from mouse to
1497 human and is found in a wide spectrum of tissues and cell lines. This sequence is sufficient
1498 to increase luciferase mRNA stability, with a half-life of 7 hours. The plasmid also
1499 incorporates the SV40 polyA signal to efficiently polyadenylate the luciferase transcript, thus
1500 resulting in up to a five-fold increase of steady-state mRNA levels. The plasmid also
1501 incorporates a high copy number origin of replication from pUC19, but may alternatively
1502 contain a low copy number origin of replication, such as pBR322 ColE1 ori/rop (15-20
1503 copies per chromosome), pACYC177 p15A ori (10-12 copies per chromosome) or the
1504 CopyControl system (1, 10-50 copies per chromosome). Additionally, the plasmid
1505 incorporates the ampicillin and kanamycin resistance genes for selection of the pTAG
1506 derivatives in E. coli, the λ attP1 and attP2 sites for inserting promoter sequences by
1507 recombination using the Gateway system, and a MCS for inserting promoter sequence
1508 candidates by DNA ligation. The MCS was present in two structurally different but
1509 functionally equivalent copies flanking the ccdB gene, a configuration that allows for using
1510 the ccdB gene as a selection marker for plasmids that incorporate promoter sequences, by
1511 recombination or by ligation. The CcdB protein targets DNA gyrase and inhibits its catalytic
1512 reactions. Cells taking up unreacted vectors with the ccdB gene will not grow. The plasmid
1513 also incorporates a short, synthetic polyA signal based on the highly efficient polyA signal of
1514 the rabbit β-globin gene. Placed upstream of the MCS, it will terminate spurious
1515 transcription, which may initiate within the vector backbone.
1516
1517 3. Insertion of DNA TAGs into pTAG-basic
1518 Typically, TAGs were obtained by annealing complementary 63 bp oligonucleotides
1519 [(+)strand: (N)60:ATA; (-)strand: (N)60:GTG] that are then ligated into SfiI-digested pTAG-
1520 basic, although oligonucleotides of differing lengths can be used, preferably between about
1521 16 bp to about 200 bp, more preferably between about 20 to about 150 bp, more preferably
1522 between about 30 to about 120 bp, more preferably between about 40 to about 100 bp, more
1523 preferably between about 50 to about 75 bp, and most preferably about 60 bp. The ligation
1524 reaction was electroporated into a host strain, for example E. coli DB3.1, which contains a
1525 gyrase mutation (gyrA462) that renders it resistant to the CcdB protein. Because the sticky ends
1526 generated by both SfiI sites are incompatible, a very low background of self-circularized
1527 pTAG-basic vectors, or vectors with multiple TAGs in tandem, was generated. The presence
1528 of the TAGs in the various plasmids was verified by DNA sequencing. High-throughput
1529 production of TAGs followed a similar methodology. Synthesis of 63 bp oligonucleotides
1530 was performed in two 96-well plates ((+) and (-) strands, respectively). The (+) and (-)
1531 strands were annealed in a 96-well plate, and ligated with SfiI-digested, gel-purified pTAG-
1532 basic. The ligation mixture was electroporated into electro-competent E. coli DB3.1 host
1533 cells, using a 96-well electroporation plate. The bacterial clones were seeded into a 96-Deep-
1534 Well plate and the cultures were incubated for 18-24 hours at 37°C at 250 rpm using a
1535 microtiter plate incubator shaker. Plasmid DNA purification was performed, either manually
1536 or via automation, for example using a BioRobot 3000 (Qiagen, Valencia, CA), and the
1537 presence of the TAGs verified via DNA sequencing (96-well format).
1538
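As a non-limiting sketch, the pair of 63-nucleotide oligonucleotides ordered for a given 60-nucleotide TAG may be derived as shown below. The sketch assumes that the three-nucleotide tails (ATA on the (+) strand, GTG on the (-) strand) sit at the 3' end of each strand, so that the annealed duplex carries two different 3' overhangs; this reading of the "(N)60:ATA" notation, and the function names, are assumptions of the sketch.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def tag_oligo_pair(tag, plus_tail="ATA", minus_tail="GTG"):
    """Return the (+) and (-) strand oligonucleotides, both written 5' to 3'.

    The 60-nt cores are exact reverse complements of each other; the
    3-nt tails are assumed to stay single-stranded after annealing.
    """
    tag = tag.upper()
    if set(tag) - set("ACGT"):
        raise ValueError("TAG must contain only A, C, G and T")
    plus = tag + plus_tail
    minus = reverse_complement(tag) + minus_tail
    return plus, minus
```

Because the two tails differ, each annealed TAG can be ligated in only one orientation, which is consistent with the directional insertion described above.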
1539 4. Preparation of Promoter Array Membranes
1540 Oligonucleotide arrays were manufactured using nylon membranes. The (-) strand
1541 TAG oligonucleotides were synthesized in a 96-well plate format and resuspended in buffer,
1542 for example TE, pH 7.5, at a concentration of 100 μg/ml. Nylon membranes, for example
1543 Nytran SuperCharge (Whatman PLC, Middlesex, UK), were cut (2 cm x 4 cm) to fit 5.0 ml
1544 glass hybridization tubes. Oligonucleotides were either spotted manually in duplicate on the
1545 membranes (0.2 μl/spot) or printed as arrays using an array spotting robot, for
1546 example a Biorobotics MicroGrid (Genomic Solutions, Ann Arbor, MI). After spotting, the
1547 membranes were UV cross-linked twice using a Stratalinker 1800 at 120 mJ/sec, then baked
1548 at 70°C for 1-2 hours. The printed membranes were sealed in parafilm and stored at -20°C.
1549 The quality of the membranes was validated by hybridizing 10% of the membranes with
1550 biotin-labeled (+) strand oligonucleotide TAGs. The 3' end of the TAG oligonucleotides was
1551 labeled using terminal transferase and biotin-16-ddUTP. All TAGs were mixed together in
1552 equimolar amounts. The TAG mixture (100 pmol) was incubated in the presence of 1.0 nmol
1553 biotin-16-ddUTP and 50 U terminal transferase, following the manufacturer's
1554 recommendations. After a 15 minute incubation at 37°C, the end-labeled TAG probes were
1555 precipitated with LiCl, centrifuged and resuspended in ddH2O. The labeling efficiency was
1556 checked by spotting a serial dilution of the labeling reaction and a standard on the nylon
1557 membrane. Detection was performed by chemiluminescence, for example with alkaline
1558 phosphatase-conjugated streptavidin, following the manufacturer's recommendations.
1559 Quantification was performed by densitometry. Upon validation of the quality of the biotin-
1560 labeled probes, the quality of the arrays was assessed by hybridizing the probes to the
1561 membranes using standard procedures, detecting them by chemiluminescence, and measuring
1562 the intensity of each spot by densitometry. The membranes were accepted upon observation
1563 of less than 5% variation in intensity and spot size.
1564
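The acceptance criterion above may be illustrated with the following sketch. The disclosure does not state how "variation" is computed; the sketch assumes it is the coefficient of variation (population standard deviation divided by the mean) across all spots on a membrane. The function names and the input layout are hypothetical.

```python
from statistics import mean, pstdev


def percent_variation(values):
    """Coefficient of variation, in percent (an assumed definition)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean signal is zero")
    return 100.0 * pstdev(values) / m


def membrane_passes(intensities, spot_sizes, limit=5.0):
    """Accept a membrane when both spot intensity and spot size vary by less than limit percent."""
    return (percent_variation(intensities) < limit
            and percent_variation(spot_sizes) < limit)
```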
1565 5. Construction of promoter-TAG plasmids
1566 Promoter sequence candidates were inserted into TAG plasmids using two methods.
1567 First, promoter sequence candidates were extracted from existing plasmids using
1568 endonucleases such as restriction enzymes and inserted into the pTAG plasmids, between
1569 sites located in the multiple cloning sites. Promoter sequence and pTAG plasmids were
1570 assembled by DNA ligation using standard protocols (see Crowe et al., Improved cloning
1571 efficiency of polymerase chain reaction (PCR) products after proteinase K digestion.
1572 Nucleic Acids Res. 1991 Jan 11; 19(1):184; Ausubel, F. M., et al., Short Protocols in
1573 Molecular Biology). Alternatively, promoter sequences were amplified by PCR, using
1574 primers carrying attB1 and attB2 extensions, and using mammalian genomic DNA or other
1575 plasmids as templates. The PCR products were inserted into the pTAG plasmids using the
1576 Gateway® recombination system. A promoter sequence candidate may be provided by a
1577 computer-predicted model, DNA fragments from a collection of nucleotide sequences, such
1578 as a genomic library, deletion or site-directed mutants of a specific promoter, tissue-specific
1579 promoters, artificial promoters, etc. Clones containing the pTAG plasmids with the promoter
1580 inserts were cultured in LB medium in the presence of 50 μg/ml ampicillin or 25 μg/ml
1581 kanamycin. At various time points during cell growth, aliquots of each culture were taken,
1582 the cell density measured spectrophotometrically at 600 nm, and equal volumes of culture
1583 pooled. Plasmid DNA was extracted using an alkaline lysis method and purified using anion-
1584 exchange resin. In order to verify that all plasmids were present in the mixture in equimolar
1585 concentrations, the following manipulation was performed. All plasmids in the DNA mixture
1586 were linearized by restriction digestion, and separated on an agarose gel (0.7%). The
1587 resultant DNA fragments, with sizes ranging from 5 to 15 kb, were stained with ethidium
1588 bromide and quantitated by densitometry using a gel documentation system. The linearity of
1589 the assay was verified by quantifying serial dilutions of the plasmid restriction digestion.
1590
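The equimolarity check described above may be sketched as follows, purely for illustration. The sketch assumes that ethidium bromide band intensity scales roughly with DNA mass, so that intensity divided by fragment length approximates a relative molar amount; it also shows the mass of each plasmid corresponding to an equimolar pool. Function names, plasmid names, and sizes are hypothetical.

```python
def equimolar_masses(sizes_bp, total_ug):
    """Mass of each plasmid (in ug) so that all are present in equal moles.

    For equal molar amounts, mass is proportional to plasmid length.
    """
    total_bp = sum(sizes_bp.values())
    return {name: total_ug * bp / total_bp for name, bp in sizes_bp.items()}


def relative_molar_amounts(band_intensity, sizes_bp):
    """Molar amount of each plasmid relative to the pool average.

    Assumes band intensity is proportional to DNA mass, so that
    intensity / length is proportional to moles.
    """
    molar = {name: band_intensity[name] / sizes_bp[name] for name in sizes_bp}
    average = sum(molar.values()) / len(molar)
    return {name: value / average for name, value in molar.items()}
```

In an equimolar pool, every value returned by `relative_molar_amounts` is expected to be close to 1.0.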
1591 6. Transfection and RNA extraction
1592 The purified plasmid DNA mixture containing equimolar amounts of the promoter
1593 plasmids was transfected into HL60, U937, and 293 cell lines. Per transfection, 1 x 10^7
1594 viable U937 cells were washed and resuspended in 0.4 ml RPMI medium. Plasmid DNA (20
1595 μg) was added and the cell/DNA suspension was mixed gently by inversion. After a 5
1596 minute incubation at 25°C, the cells were electroporated using a BTX ECM-600
1597 electroporator with the following settings: 500 V capacitance and resistance, 950 μF
1598 capacitance, 186 ohms resistance, 200 V charging voltage. After the electroshock, the cells
1599 were transferred into a 10 cm diameter tissue culture dish containing 10 ml RPMI medium
1600 supplemented with 10% FBS. After 2 to 5 hours incubation at 37°C, cells were harvested by
1601 centrifugation at 10,000 rpm for 30 seconds. Cell pellets were lysed by addition of 300 μl Trizol
1602 reagent and total RNA was extracted according to the manufacturer's protocol (Invitrogen,
1603 Carlsbad, CA) (see also Current Protocols in Molecular Biology, John Wiley & Sons). RNA
1604 was precipitated with isopropyl alcohol, resuspended in RNase-free TE, pH 7.5, and
1605 quantified by measuring the absorbance at 260 nm and 280 nm (ratio ~2). RNA integrity was
1606 verified by agarose gel electrophoresis and ethidium bromide staining. The 28S and 18S
1607 rRNAs, represented in discrete individual bands, had a 2:1 intensity ratio. RNA samples with
1608 a visible degree of degradation were not further processed. In parallel, an equimolar mixture
1609 of promoter-less TAG plasmids was transfected and analyzed for mRNA expression using
1610 the array. This control detected the possible presence of cryptic promoter activity in the
1611 TAGs. The promoter-less TAG plasmids yielding above-background signals were discarded.
1612
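The RNA quality checks described above may be illustrated with the following sketch. It relies on the common convention that one A260 unit corresponds to roughly 40 μg/ml of RNA in a 1 cm path; the acceptance windows around the A260/A280 ratio of about 2 and the 28S:18S ratio of 2:1 are assumptions of the sketch, not values taken from the disclosure, and the function names are hypothetical.

```python
def rna_concentration_ug_per_ml(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate RNA concentration from A260 (about 40 ug/ml per A260 unit)."""
    return a260 * 40.0 * dilution_factor / path_cm


def rna_sample_ok(a260, a280, band_28s, band_18s,
                  purity_range=(1.8, 2.2), rrna_range=(1.5, 2.5)):
    """Screen a sample on purity (A260/A280) and integrity (28S:18S).

    Both windows are illustrative thresholds chosen for this sketch.
    """
    purity = a260 / a280
    rrna_ratio = band_28s / band_18s
    return (purity_range[0] <= purity <= purity_range[1]
            and rrna_range[0] <= rrna_ratio <= rrna_range[1])
```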
1613 7. Labeling, hybridization, and detection
1614 Radioactive cDNA probes were synthesized from total RNA. The total RNA was
1615 purified with Trizol (Invitrogen) and the concentration of the RNA was determined by the
1616 OD260 reading. One to five micrograms of total RNA was mixed with MA5-a oligo (5'-
1617 TAGTCACTTCGATCGCTGAGG-3') ([SEQ ID NO. 1]), and the nucleotides dATP, dTTP,
1618 dGTP, and 32P-dCTP. The reaction was incubated at 80°C for 3 minutes and then cooled to
1619 42°C. Next, 10X reverse transcription buffer (NEB), RNase inhibitor, and M-
1620 MuLV reverse transcriptase (NEB) were added. The reaction was mixed and incubated at 42°C for 60
1621 minutes, then denatured at 90°C for 10 minutes.
1622 The radioactive probes were hybridized to the membrane using Ultrahyb-oligo
1623 hybridization buffer (Ambion, Inc.) at 60°C overnight. After washing the membrane twice
1624 with 2X SSC/1% SDS and twice with 1X SSC/1% SDS at 60°C, the bound probes were
1625 detected by autoradiography, using for example, Kodak Biomax Light Film (Carestream
1626 Health, Inc., New Haven, CT). The density of each spot was quantified with computer
1627 software, for example, Kodak 1D Image Analysis Software (Carestream Health, Inc., New
1628 Haven, CT).
1629 In an alternate embodiment, biotin-labeled cDNA probes were synthesized from the
1630 total RNA. The probes were synthesized using the AmpoLabeling-LPR method developed
1631 by SuperArray Bioscience Corporation. This method increased the sensitivity of cDNA
1632 arrays by amplifying the cDNAs obtained by reverse transcription by up to 30 rounds of
1633 Linear Polymerase Replication (LPR). A 300 nucleotide long region from the 5' end of the
1634 luciferase mRNAs, encompassing the 60 nucleotide TAGs, was reverse transcribed and
1635 amplified in the presence of biotin-labeled dUTP. The total RNA was annealed with a primer
1636 complementary to the MA4 segment, in a thermal cycler at 70°C for 3 minutes, cooled to
1637 37°C and incubated at 37°C for 10 minutes. The annealed product was reverse transcribed
1638 using MMLV reverse transcriptase in the presence of RNasin Ribonuclease Inhibitor. After
1639 inactivation of the reverse transcriptase and RNA hydrolysis at 85°C, the cDNAs were
1640 amplified by LPR with primer 5'-GGCTCGGCCTCTGAGCTAAT-3' ([SEQ ID NO. 2])
1641 located immediately upstream of the TAG, in the presence of biotin-16-dUTP, and a
1642 thermostable DNA-dependent DNA polymerase, using the following program: 85°C for 5
1643 minutes, then 30 cycles of 85°C for 1 minute, 50°C for 1 minute, 72°C for 1 minute, followed
1644 by 72°C for 5 minutes. The probe was then checked for biotin incorporation by making serial
1645 dilutions of the probe synthesis reaction, spotting 1 μl aliquots on a HyBond nylon membrane
1646 and detecting the probe using the ECL chemiluminescent detection kit. Probes that were
1647 detectable at 1000-fold dilutions or higher were used in the hybridizations.
1648 The hybridization of the biotinylated probes to the membranes was performed using
1649 the Ultrahyb-oligo hybridization buffer (Ambion Inc.), at 60°C overnight. After washing the
1650 membrane twice with 2xSSC, 1% SDS and twice with 1xSSC, 1% SDS at 60°C, the bound
1651 probes were detected by chemiluminescence using a streptavidin-alkaline phosphatase
1652 conjugate and following the manufacturer's protocol (CDP-Star Universal Detection Kit,
1653 Sigma). The image was acquired with a Kodak image station 440 for 1 hour (Figure 3A,
1654 Figure 4A, and Figure 5A). The density from each spot was quantified using the Kodak 1D
1655 Image Analysis software. The data presented in Figures 3A and 3B and Figures 4A and 4B
1656 show that: a) all the "blank" reporter-TAG plasmids which lack promoter sequences (# 10,
1657 19, 26, 28, 30, 35, 39, and 47 in Table 1) give very low intensity signals, a fact which
1658 suggests the absence of intrinsic promoter activity from the plasmid backbone; b) with the
1659 series of defensin promoters (#74-85), the clone expressing the highest mRNA level (# 79) is
1660 also the one expressing the highest level of luciferase. The data presented in Figures 5A and
1661 5B show that: a) as expected, the viral CMV promoter appeared to be the strongest, a fact
1662 which is well-documented in the scientific literature (U.S. Patents Nos. 5,168,062 and
1663 5,385,839; Cayer et al. J Immunol Methods. 2007 Apr 30;322(1-2):118-27; Sakurai et al. Gene
1664 Ther. 2005 Oct;12(19):1424-33; Fabre et al. J Gene Med. 2006 May;8(5):636-45); b) the
1665 GAPDH (glyceraldehyde-3-phosphate dehydrogenase) promoter was able to drive very high
1666 expression levels, which is consistent with observations made by others (Hirano T et al.,
1667 Biosci Biotechnol Biochem. 1999;63(7):1223-7; Punt PJ et al. Gene. 1990;93(1):101-9;
1668 Nagashima T et al., Biosci Biotechnol Biochem. 1994;58(7):1292-6); c) the ferritin light-
1669 chain promoter was about 40% stronger than the ferritin heavy chain promoter, a fact that
1670 supports findings made by Cairo et al. in rat liver (Biochem J. 1991; 275 (Pt 3):813-6); d)
1671 promoters OM3 (TAG61) and Def6 (TAG77) produced the strongest hybridization signals in
1672 their respective groups (OM and Defensin promoters), a fact which correlates with the
1673 luciferase activities determined previously (Ma et al., Nucleic Acids Res. 1999;27(23):4649-
1674 57; Ma et al. J Biol Chem. 1998 Apr 10;273(15):8727-40). Taken together, these data
1675 validate the present disclosure compared to other methods.
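By way of illustration only, spot densities may be expressed relative to a reference promoter set to 100, as is done for the CMV promoter in Figure 5B. The sketch below additionally subtracts the mean signal of the promoter-less blank TAGs as background; that subtraction, the function name, and the example labels are assumptions of the sketch rather than steps stated in the disclosure.

```python
from statistics import mean


def relative_promoter_activity(spot_density, blanks, reference):
    """Scale background-corrected spot densities so the reference equals 100.

    spot_density maps a TAG or promoter label to its densitometry value;
    blanks lists the labels of promoter-less TAGs used as background.
    """
    background = mean(spot_density[b] for b in blanks)
    ref_signal = spot_density[reference] - background
    if ref_signal <= 0:
        raise ValueError("reference signal does not exceed background")
    return {
        name: max(0.0, 100.0 * (density - background) / ref_signal)
        for name, density in spot_density.items()
        if name not in blanks
    }
```

Values below background are clipped to zero so that a weak spot is reported as inactive rather than as a negative activity.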
1676 The following examples are offered by way of illustration, and not by way of
1677 limitation.
1678
1679 EXAMPLES
1680 Example 1. Construction of 100 pTAG-reporter plasmids
1681 One hundred pTAG-plasmids featuring a multiple cloning site (MCS), attP sequences,
1682 a ccdB gene, a T7 promoter, a unique 60 bp-long reporter TAG, a specific MA4 segment, a
1683 3-frame translation stop codon, a hemoglobin RNA stabilization fragment and a poly-A
1684 signal were constructed. The construction was performed in 6 steps (Figure 6). First, a
1685 partial MCS was inserted between the SfiI sites of plasmid pGL4 (Promega, Madison, WI).
1686 All the cloning sites from the original pGL4 plasmid were deleted and replaced with EcoRI,
1687 KpnI, SacI, NheI, XhoI, and BglII sites, followed by two sets of SfiI/BglI sites separated by a
1688 CG dinucleotide. The two sets of SfiI sites allowed for the directional insertion of TAG
1689 sequences. The dinucleotide CG between the SfiI sites created a unique restriction site
1690 (SmaI/XmaI), which proved useful to facilitate plasmid digestion with SfiI, either by
1691 insertion of a ~170 bp-long spacer fragment to dissociate both SfiI sites, or by digestion of
1692 the plasmid sequentially with SmaI and then SfiI.
1693 In the second step, a second partial MCS was inserted between the XhoI and BglII sites of
1694 pGL4-12. The resulting plasmid (pGL-1256) contained BglII, ApaI, NruI, KpnI, XhoI, SacI,
1695 BglII, NheI, EcoRV, and MluI sites following the existing MCS. As a result, pGL-1256
1696 contained two structurally different but functionally equivalent MCS surrounding the ApaI
1697 and NruI sites, a feature useful for cloning promoter sequence candidates in the TAG-
1698 plasmids.
1699 In the third step, the sequence encoding the luciferase reporter gene (NcoI-XbaI
1700 fragment) was replaced with an 80-mer oligonucleotide which contained a specific 25 bp-
1701 long sequence (MA4), a three-frame translation stop codon, and a RNA stabilization
1702 sequence derived from human alpha globin gene. The MA4 facilitated the synthesis of TAG-
1703 specific probes from mRNAs.
1704 In the fourth step, the resulting plasmid 1256MA4 was digested with EcoRV and MluI,
1705 which allowed for insertion of an oligonucleotide that contained the bacteriophage T7 RNA
1706 polymerase promoter sequence. The presence of the T7 promoter allowed for synthesis of
1707 biotinylated RNA probes by in vitro transcription, a method which increased the sensitivity of
1708 the assay by at least one order of magnitude.
In the fifth step, the Gateway® sequences attP - ccdB - chloramphenicol-resistance gene were amplified by PCR using plasmid pDONR-201 (Invitrogen Inc., Carlsbad, CA) as template and the following primers: sense-tcgggccccaaataatgattttattttgactgatag [SEQ ID NO. 3] and antisense-atgggcccaaataatgattttattttgactgatagtgacctgttc [SEQ ID NO. 4]. The PCR product was inserted into the ApaI site of plasmid 1256MA4T7, generating plasmid 1256MA4T7att.
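As a quick consistency check on the cloning strategy, the sketch below looks for the ApaI recognition sequence (GGGCCC) in the two primers quoted above and reports their length and GC content. The helper names `site_position` and `gc_fraction` are illustrative and do not come from the source.

```python
# Primer sequences as listed in the text (SEQ ID NO. 3 and 4).
sense = "tcgggccccaaataatgattttattttgactgatag"
antisense = "atgggcccaaataatgattttattttgactgatagtgacctgttc"

APAI = "GGGCCC"  # ApaI recognition sequence (palindromic)

def site_position(primer: str, site: str = APAI) -> int:
    """Return the 0-based index of the site in the primer, or -1 if absent."""
    return primer.upper().find(site)

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)

for name, primer in (("sense", sense), ("antisense", antisense)):
    print(name, len(primer), "nt; ApaI site at index", site_position(primer),
          "; GC = %.2f" % gc_fraction(primer))
```

Both primers carry the site two bases from the 5' end, which is consistent with inserting the PCR product into the ApaI site of the vector.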
Finally, plasmid 1256MA4T7att was digested with BglI, and 60 bp-long double-stranded oligonucleotides (TAGs) were directionally inserted into the plasmid. In total, we created 100 reporter plasmids, pTAG-Reporter 1 to 100. Ninety-two of these plasmids were used to generate the promoter-TAG plasmids. The remaining 8 pTAG-Reporter plasmids were used as blanks.
These 100 pTAG-Reporter plasmids are used for cloning putative promoters into the MCS, using either conventional methods (restriction digestion and ligation) or the GATEWAY® technology with attB-modified PCR products.
Example 2. Manual and Robotic Production of Macro-Array Membranes
First, three nylon membranes, BrightStar-Plus (Ambion Inc., Austin, TX), Tropilon-Plus (Applied Biosystems, Foster City, CA), and Nytran SuperCharge (Whatman PLC, Middlesex, UK), were compared for their suitability for printing with short oligonucleotides. The 63 bp-long oligonucleotides complementary to the TAGs present on the TAG-reporter plasmids were manually spotted on the membranes and hybridized with the biotin end-labeled sense TAG oligonucleotides. BrightStar-Plus was selected for use in subsequent experiments because this membrane produced the best results in terms of low background and sharpness of the signal spots, and because its rough surface produced stronger signals than the smooth surfaces of the other two membranes without increasing the background. The nylon membranes were cut (2 x 4 cm) to fit 5-mL glass hybridization tubes and the 8-well hybridization plates (SuperArray Inc., Frederick, MD).
Next, the amount of oligonucleotide to be spotted on the membrane was optimized. Stock solutions of all the reverse-strand TAG oligonucleotides were made by reconstituting the lyophilized products in TE pH 7.5 to 100 μM. Serial dilutions of 20X, 60X, 180X, 540X, and 1620X were made. Using a 2 μL Pipetman, the diluted oligonucleotides (0.2 μL) were spotted manually, in duplicate, on the membrane. Following hybridization of the membrane with biotin end-labeled sense-strand TAG oligonucleotide probes, detection of the signals was performed by chemiluminescence using the Southern-Star kit (Applied Biosystems, Foster City, CA). The 20-fold dilutions produced strong and clean signal spots and were selected.
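The amount of oligonucleotide delivered per spot follows directly from the stock concentration, the fold dilution, and the spotted volume. The sketch below works through that arithmetic for the dilution series above; the function name `spotted_pmol` is illustrative.

```python
STOCK_UM = 100.0          # stock concentration, µM (from the text)
SPOT_VOLUME_UL = 0.2      # volume spotted per spot, µL (from the text)
DILUTIONS = [20, 60, 180, 540, 1620]  # three-fold dilution series

def spotted_pmol(dilution: int) -> float:
    """Picomoles of oligonucleotide per spot at a given fold dilution."""
    conc_um = STOCK_UM / dilution        # µM is the same as pmol/µL
    return conc_um * SPOT_VOLUME_UL

for d in DILUTIONS:
    print(f"{d}X: {STOCK_UM / d:.3f} µM, {spotted_pmol(d):.4f} pmol per spot")
```

Under these figures, the selected 20-fold dilution corresponds to about 1 pmol of oligonucleotide per spot.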
The same diluted oligonucleotides (n = 100) (Figure 2) were printed using a Biorobotics MicroGrid array spotting robot (Genomic Solutions, Ann Arbor, MI) at the microarray facility of the University of Idaho Environmental Biotechnology Institute (Moscow, ID). Each oligonucleotide was printed as a quadruple spot. Both types of membranes were air-dried at room temperature for 10 min and then UV-crosslinked twice using a Stratalinker 1800 (Stratagene) at 120 mJ/sec, then baked at 70 °C for 2 hours. The printed membranes were then sealed in parafilm and stored at 4 °C. The size of the membrane was designed to fit into convenient small containers such as 2-mL microcentrifuge tubes and 8-well plates.
Example 3. Cloning of 92 Human and Viral Promoter Sequences into the TAG-Reporter Plasmids
Ninety-two human and viral promoter sequences (TABLE 1) were cloned into the TAG-reporter plasmids using the Gateway® system. They included 12 defensin promoters, 15 Oncostatin M promoters, 57 genomic DNA fragments from both EPD and chromosome 21 that have been studied experimentally for promoter activity, and 8 well-known promoters (SV40, CMV, wild-type and mutant RSV, GAPDH, HSP, FerL, and FerH).
First, the promoter sequences were amplified by PCR, using human chromosomal DNA or plasmids as templates and primers carrying attB sequence extensions. The PCR products were inserted into the pTAG-reporter plasmids in place of the ccdB and chloramphenicol-resistance genes by in vitro recombination using the BP clonase (Invitrogen, Carlsbad, CA). The recombinant plasmids were introduced into E. coli TOP10 using the heat-shock procedure, and amplified. Recombinant clones lacking promoter inserts were obtained at a frequency of about 1:200. To identify the correct clones, the plasmid DNA of each clone was prepared and analyzed separately by agarose gel electrophoresis. Plasmid DNAs were quantified by spectrophotometry. Finally, equimolar amounts were pooled at a final concentration of 0.4 μg DNA/μL.
In the context of screening plasmid libraries of putative promoters, E. coli clones are arrayed in 96-well plates. The bacteria (not their plasmid DNA) are pooled and amplified in the same flask. Their plasmid DNA is purified in a single preparation before being transfected into the same cell population.
Example 4. Testing the Promoter Detective Method with 92 Promoter-TAG Plasmids
The method was performed with the 92 promoter-TAG and 8 blank reporter-TAG plasmids. Different amounts (4, 16, and 64 μg) of equimolar mixtures of these plasmids were transfected into HEK 293 cells using Lipofectamine 2000 (Invitrogen, Carlsbad, CA). After 14 and 25 hours of culture at 37 °C, cells were harvested. Total RNA was extracted and purified using the TRIzol-based method (Invitrogen, Carlsbad, CA). Biotin-labeled cDNA probes were synthesized from the total RNA using the AmpoLabeling LPR method (SuperArray Bioscience Corp., Frederick, MD). The sensitivity of cDNA arrays was increased by amplifying the cDNAs obtained by reverse transcription by up to 30 rounds of Linear Polymerase Replication (LPR). A 300-nucleotide-long region, encompassing the 60-nucleotide TAGs, was reverse transcribed and amplified in the presence of biotin-labeled dUTP. Total RNA (2.5 μg) was annealed with a primer complementary to the MA4 segment in a thermal cycler at 70 °C for 3 minutes, cooled to 37 °C, and incubated at 37 °C for 10 minutes. The annealed product was reverse transcribed using MMLV reverse transcriptase. After RNA hydrolysis at 85 °C, the cDNAs were amplified by LPR with primer 5'-GGCTCGGCCTCTGAGCTAAT-3' [SEQ ID NO. 2], located immediately upstream of the TAG, in the presence of biotin-16-dUTP and a thermostable DNA-dependent DNA polymerase, with the following program: 85 °C for 5 minutes; then 30 cycles of 85 °C for 1 minute, 50 °C for 1 minute, and 72 °C for 1 minute; followed by 72 °C for 5 minutes. The probe was then checked for biotin incorporation by making serial dilutions of the probe synthesis reaction, spotting 1 μL aliquots onto a HyBond nylon membrane (Amersham, Little Chalfont, UK), and detecting the probe using the ECL chemiluminescent detection kit. Probes detectable at 1000-fold dilutions or higher were used in the hybridizations.
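The cycling program given above can be written down as data to check its total programmed time and to contrast linear with exponential amplification. This is a minimal sketch: ramp times are ignored, and the copy-number bound is an idealization of single-primer extension rather than a measured yield.

```python
# (label, temperature in °C, minutes) for each step, as given in the text.
INITIAL = ("initial denaturation", 85, 5)
CYCLE = [("denature", 85, 1), ("anneal", 50, 1), ("extend", 72, 1)]
FINAL = ("final extension", 72, 5)
N_CYCLES = 30

def total_hold_minutes() -> int:
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(minutes for _, _, minutes in CYCLE)
    return INITIAL[2] + N_CYCLES * per_cycle + FINAL[2]

def max_linear_copies(cycles: int = N_CYCLES) -> int:
    """Upper bound on labeled strands per cDNA template with a single primer.

    With one primer, only the original template is copied in each cycle,
    so the yield grows linearly (at most one new strand per cycle).
    """
    return cycles

print(total_hold_minutes(), "min of hold time")
print("at most", max_linear_copies(), "copies per template, versus",
      2 ** N_CYCLES, "for ideal exponential PCR")
```

Linear amplification gives a far smaller yield than PCR, but each template contributes in proportion to its abundance, which suits a comparison of relative transcript levels.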
The hybridization of the biotinylated probes to the membranes was performed using the Ultrahyb-oligo hybridization buffer (Ambion Inc.) at 60 °C overnight. After washing the membrane twice with 2x SSC, 1% SDS and twice with 1x SSC, 1% SDS at 60 °C, we detected bound probes by chemiluminescence using a streptavidin-alkaline phosphatase conjugate and following the manufacturer's protocol (CDP-Star Universal Detection Kit, Sigma). The image was acquired with a Kodak Image Station 440 for 1 hour (Figure 4A). The density of each quadruple spot was quantified using the Kodak 1D Image Analysis software. The results indicate that a) all the "blank" reporter-TAG plasmids, which lack promoter sequences (#10, 19, 26, 28, 30, 35, 39, and 47 in Table 1), give very low intensity signals, which suggests the absence of intrinsic promoter activity from the plasmid backbone; and b) within the series of defensin promoters (#74-85), the clone expressing the highest mRNA level (#79) is also the one expressing the highest level of luciferase.
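One way to summarize quadruple-spot densities is to average the replicates for each TAG and subtract the mean signal of the promoter-less blanks. The sketch below shows that calculation. All density values are hypothetical and chosen only for illustration, and the background-subtraction step is an assumption rather than a procedure stated in the text.

```python
from statistics import mean

# Hypothetical densities (arbitrary units) for four replicate spots per TAG.
# These numbers are illustrative only and are not data from the experiment.
spots = {
    "TAG79": [812.0, 790.0, 805.0, 821.0],
    "TAG74": [240.0, 255.0, 231.0, 248.0],
    "TAG10 (blank)": [12.0, 15.0, 11.0, 14.0],
    "TAG19 (blank)": [10.0, 13.0, 12.0, 9.0],
}

def spot_means(data):
    """Average the replicate spots for each TAG."""
    return {tag: mean(values) for tag, values in data.items()}

def background(means):
    """Mean signal of the blank (promoter-less) reporter-TAG plasmids."""
    return mean(v for tag, v in means.items() if "blank" in tag)

means = spot_means(spots)
bg = background(means)
# Clamp at zero so that blanks never report a negative signal.
corrected = {tag: max(v - bg, 0.0) for tag, v in means.items()}

for tag, value in corrected.items():
    print(f"{tag}: {value:.1f}")
```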
Example 5. Testing the Promoter Detection Method with 35 Promoter-TAG Plasmids
The method was tested with a set of 35 promoter-TAG plasmids. Twenty μg of an equimolar mixture of these plasmids was transfected into U937 cells by electroporation. After 7 hours of culture at 37 °C, cells were harvested. Total RNA was extracted and purified using the TRIzol-based method (Invitrogen, Carlsbad, CA) and quantified by spectrophotometry (absorbance at 260 nm).
Radioactive cDNA probes were synthesized as follows. One microgram of total RNA in 6.3 μL H2O was mixed with 0.7 μL of 100 μM MA5-a oligonucleotide (5'-TAGTCACTTCGATCGCTGAGG-3') [SEQ ID NO. 1], 1.1 μL of 5 mM each of dATP/dTTP/dGTP, and 1.9 μL 32P-dCTP. The reaction mixture was heated to 80 °C for 3 minutes and then cooled down to 42 °C. Then 1.5 μL 10x reverse transcription buffer (New England Biolabs), 0.75 μL RNase inhibitor, and M-MuLV reverse transcriptase (New England Biolabs) were added, and the reaction was performed at 42 °C for 60 minutes. The probes were then denatured at 90 °C for 10 minutes.
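The component volumes listed above can be tallied to check the annealing mix and the amount of primer used. The final step is an inference, not a stated value: it assumes the 10x buffer is used at 1x, which would imply a 15 μL reaction and leave the remaining volume for the enzyme and any added water.

```python
# Component volumes in µL, as listed in the text.
pre_mix = {
    "total RNA (1 µg) in water": 6.3,
    "MA5-a primer, 100 µM": 0.7,
    "dATP/dTTP/dGTP, 5 mM each": 1.1,
    "32P-dCTP": 1.9,
}
added_after_annealing = {
    "10x RT buffer": 1.5,
    "RNase inhibitor": 0.75,
}

pre_volume = sum(pre_mix.values())                      # annealing mix, µL
primer_pmol = pre_mix["MA5-a primer, 100 µM"] * 100.0   # µL x pmol/µL

# Assumption: the 10x buffer is used at 1x, implying a 15 µL reaction.
implied_final_volume = added_after_annealing["10x RT buffer"] * 10
implied_remaining_volume = (implied_final_volume - pre_volume
                            - sum(added_after_annealing.values()))

print(f"annealing mix: {pre_volume:.1f} µL, primer: {primer_pmol:.0f} pmol")
print(f"implied volume left for enzyme/water: {implied_remaining_volume:.2f} µL")
```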
The hybridization of the radioactive probes to the membranes was performed using the Ultrahyb-oligo hybridization buffer (Ambion Inc.) at 60 °C overnight. After washing the membrane twice with 2x SSC, 1% SDS and twice with 1x SSC, 1% SDS at 60 °C, bound probes were detected by autoradiography using Kodak Biomax Light film. The density of each spot was quantified using the Kodak 1D Image Analysis software (Figures 5A and 5B), where the autoradiogram was obtained by hybridizing radioactive TAG-cDNA probes to a membrane spotted with complementary TAG strands. The intensities of the various spots were compared relative to the signal obtained with the CMV promoter. As expected, the viral CMV promoter appeared to be the strongest, which is well documented in the scientific literature (U.S. Patent Nos. 5,168,062 and 5,385,839; Cayer et al., J Immunol Methods. 2007 Apr 30;322(1-2):118-27; Sakurai et al., Gene Ther. 2005 Oct;12(19):1424-33; Fabre et al., J Gene Med. 2006 May;8(5):636-45). The GAPDH (glyceraldehyde-3-phosphate dehydrogenase) promoter was able to drive very high expression levels, which is consistent with observations made by others (Hirano T et al., Biosci Biotechnol Biochem. 1999;63(7):1223-7; Punt PJ et al., Gene. 1990;93(1):101-9; Nagashima T et al., Biosci Biotechnol Biochem. 1994;58(7):1292-6). Also, the ferritin light-chain promoter was about 40% stronger than the ferritin heavy-chain promoter, which supports findings made by Cairo et al. in rat liver (Biochem J. 1991;275(Pt 3):813-6). Promoters OM3 (TAG61) and Def6 (TAG77) produced the strongest hybridization signals in their respective groups (OM and defensin promoters), which correlates with the luciferase activities determined previously (Ma et al., Nucleic Acids Res. 1999;27(23):4649-57; Ma et al., J Biol Chem. 1998 Apr 10;273(15):8727-40). Taken together, these data validate the present disclosure in comparison with other methods.
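Expressing each spot intensity relative to the CMV signal, as described above, is a simple normalization. The sketch below shows one way to compute it. The density values are hypothetical and were chosen only so that the light-chain to heavy-chain ratio mirrors the roughly 40% difference reported in the text; they are not measurements from Figures 5A and 5B.

```python
# Hypothetical spot densities (arbitrary units); not data from the figures.
densities = {"CMV": 1000.0, "GAPDH": 820.0, "FerL": 350.0, "FerH": 250.0}

def relative_to(reference: str, values: dict) -> dict:
    """Express every signal as a percentage of the reference promoter."""
    ref = values[reference]
    if ref <= 0:
        raise ValueError("reference signal must be positive")
    return {name: 100.0 * v / ref for name, v in values.items()}

relative = relative_to("CMV", densities)
# Fractional excess of the light-chain signal over the heavy-chain signal.
ferl_vs_ferh = densities["FerL"] / densities["FerH"] - 1.0

for name, pct in relative.items():
    print(f"{name}: {pct:.0f}% of CMV")
print(f"FerL is {ferl_vs_ferh:.0%} stronger than FerH")
```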
TABLE 1
       Gene         Promoter   RefSeq or             Gene                      Promoter   RefSeq or
TAG #  symbol       size (bp)  Accession #    TAG #  symbol                    size (bp)  Accession #
1      MT1B         471        M13484         51     SV                        330        N/A
2      PROC         495        NM_000312      52     CMV                       655        N/A
3      MMP1         477        NM_002421      53     RSV                       396        N/A
4      CEA          508        NM_002483      54     RSV303                    396        N/A
5      GAS          539        NM_000805      55     GAPDH                     532        N/A
6      H3FL         506        NM_003537      56     HSP                       464        N/A
7      RUN3         356        K00777         57     FerL                      270        N/A
8      SLC9A1       509        XM_046881      58     FerH                      180        N/A
9      ADAMTS1      560        NM_006988      59     OM1 (pGL3BomB1)           189        BC011589
10     Blank                                  60     OM2 (N1)                  304        BC011589
11     CCT8         528        NM_006585      61     OM3 (3STAT)               300        BC011589
12     CRYZL1       583        NM_005111      62     OM4 (3STATm)              300        BC011589
13     DAF          557        NM_000574      63     OM5 (3STATmm)             300        BC011589
14     GABPA        611        NM_002040      64     OM6 (N1 Ap1)              304        BC011589
15     IFNAR1       667        NM_000629      65     OM7 (N1 Sp1 mutation)     304        BC011589
16     KRT1         520        NM_006121      66     OM8 (N1 3STATmm)          304        BC011589
17     LHB          494        NM_000894      67     OM9 (RI)                  194        BC011589
18     NEFL         495        NM_006158      68     OM10 (StuI)               94         BC011589
19     Blank                                  69     OM11 (2STATm)             194        BC011589
20     NEG9         407        N/A            70     OM12 (N1 2STATmm)         304        BC011589
21     IVL          500        NM_005547      71     OM13 (1STAT)              109        BC011589
22     APOE         509        NM_000041      72     OM14 (1STATm)             109        BC011589
23     C21ORF33     689        NM_004649      73     OM15 (TATA)               31         BC011589
24     DSCR4        688        NM_005867      74     Def3 (B/3)                619        AA321199
25     FTCD         596        NM_006657      75     Def4 (AvaI)               497        AA321199
26     Blank                                  76     Def5 (Hindi)              321        AA321199
27     ITGB2        647        NM_000211      77     Def6 (HinfI)              299        AA321199
28     Blank                                  78     Def7 (ApoI)               203        AA321199
29     TFF1         605        NM_003225      79     Def8 (Sau96I (7))         164        AA321199
30     Blank                                  80     Def9 (ScrFI (9))          144        AA321199
31     WRB          639        NM_004627      81     Def10 (ScrFI (TATA))      144        AA321199
32     AMY2B        488        NM_020978      82     Def11 (Tru9I)             111        AA321199
33     BCKDHA       481        NM_000709      83     Def12 (Tru9ITATA)         111        AA321199
34     CA3          518        NM_005181      84     Def13 (Tru9ITATAm)        111        AA321199
35     Blank                                  85     Def14 (Tru9ITATAm2)       111        AA321199
36     H4FG         222        NM_003542      86     ALB                       517        NM_000477
37     NEG13        376        N/A            87     NEG11                     468        N/A
38     NEG18        503        N/A            88     HLCS                      645        NM_000411
39     Blank                                  89     NEG12                     522        N/A
40     NEG21        444        N/A            90     NEG1                      500        N/A
41     NEG22        418        N/A            91     NEG6                      480        N/A
42     NEG23        259        N/A            92     ORM1                      499        NM_000607
43     NEG2         285        N/A            93     PKNOX1                    593        NM_004571
44     NEG3         460        N/A            94     USP16                     581        NM_006447
45     NEG5         488        N/A            95     IGSF5                     622        AF121782
46     NEG7         466        N/A            96     NEG10                     406        N/A
47     Blank                                  97     NEG16                     202        N/A
48     RNU4C        305        M15957         98     NEG17                     339        N/A
49     SH3BGR       588        007341         99     PCP4                      625        NM_006198
50     NEG19        483        N/A            100    TCRD                      333        M21624

Claims

LISTING OF CLAIMS We claim:
1. A method for detecting DNA regulatory sequences comprising: a) inserting a promoter sequence candidate into a vector wherein the vector comprises a TAG sequence and wherein the promoter sequence candidate is inserted in a position to drive transcription of the TAG sequence; b) the vector containing the inserted promoter sequence candidate is inserted into a cloning host cell; c) cloning host cells containing different promoter sequence candidates are grown to the same optical density, pooled and the vectors therein are extracted, purified and inserted into a reporter cell line; d) mRNA is extracted from the reporter cell lines wherein the mRNA is directly labeled or is used as template for cDNA or probe synthesis; and e) the labeled mRNA, cDNA or probe is analyzed with an array wherein the array comprises identical or complementary sequence to the TAG sequence.
2. The method of claim 1, wherein the vector is a plasmid.
3. The method of claim 1, wherein the TAG sequence is between about 16 base pairs to about 200 base pairs.
4. The method of claim 1, wherein step (a) further comprises inserting a plurality of promoter sequence candidates into a plurality of vectors wherein each vector is comprised of a unique TAG sequence.
5. The method of claim 1, wherein the cloning host cells are in a single reaction vial, wherein the vectors from within the cloning host cells are purified, and about equal amounts of the purified vectors are transferred into reporter cell lines.
6. The method of claim 1, wherein the cloning host cells are in individual reaction vials, wherein the DNA from the cloning host cells within each individual reaction vial is purified, and wherein the purified DNA from each cloning host cell is pooled in equimolar amounts and the vectors therein are inserted into a reporter cell line.
7. The method of claim 1, wherein the cDNA or probe contains a label.
8. The method of claim 1, wherein the mRNA is directly labeled.
9. The method of claim 1, wherein the mRNA is analyzed with an array, wherein the array comprises complementary sequence to the TAG sequence, and wherein the complementary sequence is the antisense strand.
10. The method of claim 1, wherein the cDNA is analyzed with an array, wherein the array comprises complementary sequences to the cDNA of the TAG sequences, and wherein the complementary sequence is the sense strand.
11. The method of claim 1, wherein the labeled mRNA, cDNA or probe hybridizes to the array and the label of the mRNA, cDNA or probe has a detectable response.
12. The method of claim 1, wherein the vector into which the DNA promoter sequence candidate is inserted into comprises a TAG sequence, one or more multiple-cloning sites, one or more DNA recombination sequences, a negative selection marker, a RNA polymerase promoter sequence, a MA segment, a translation stop codon, a RNA stabilization fragment, and a transcription termination signal, and wherein the DNA promoter sequence candidate is located such that it can drive the transcription of the TAG sequence.
13. The method of claim 12, wherein the RNA stabilization fragment is from an alpha-globin gene.
14. The method of claim 12, wherein the transcription termination signal is a poly-A signal.
15. The method of claim 12, wherein the RNA polymerase promoter sequence is a T7 promoter sequence.
16. The method of claim 12, wherein the DNA recombination sequences are selected from the group consisting of attP1 and attP2.
17. The method of claim 12, wherein the TAG sequence is located 3' to the promoter sequence and 5' to the transcription termination site.
18. A vector into which a DNA promoter sequence candidate is inserted into comprising a TAG sequence, one or more multiple-cloning sites, at least one DNA recombination sequence, a negative selection marker, a RNA polymerase promoter sequence, a MA segment, a translation stop codon, a RNA stabilization fragment, and a transcription termination signal, and wherein the DNA promoter sequence candidate is located such that it can drive the transcription of the TAG sequence.
19. The vector of claim 18, wherein the vector is a plasmid.
20. The vector of claim 18, wherein the TAG sequence is between about 16 base pairs to about 200 base pairs.
21. The vector of claim 18, wherein the TAG sequence is located 3' to the inserted promoter sequence and 5' to a transcription termination signal.
22. The vector of claim 18, wherein the RNA stabilization fragment is from an alpha-globin gene.
23. The vector of claim 18, wherein the transcription termination signal is a poly-A signal.
24. The vector of claim 18, wherein the RNA polymerase is a T7 promoter sequence.
25. The vector of claim 18, wherein the DNA recombination sequence is selected from the group consisting of attP1 and attP2.
PCT/US2008/081240 2007-10-27 2008-10-26 Promoter detection and analysis WO2009055760A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN2008801233105A CN101918578A (en) 2007-10-27 2008-10-26 Promoter detection and analysis
EP08841807A EP2209903A4 (en) 2007-10-27 2008-10-26 Promoter detection and analysis

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/925,837 US20090111099A1 (en) 2007-10-27 2007-10-27 Promoter Detection and Analysis
US11/925,837 2007-10-27

Publications (1)

Publication Number Publication Date
WO2009055760A1 true WO2009055760A1 (en) 2009-04-30

Family

ID=40580078

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/081240 WO2009055760A1 (en) 2007-10-27 2008-10-26 Promoter detection and analysis

Country Status (4)

Country Link
US (1) US20090111099A1 (en)
EP (1) EP2209903A4 (en)
CN (1) CN101918578A (en)
WO (1) WO2009055760A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2344676A1 (en) * 2008-09-25 2011-07-20 Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. Combinatorial synthesis and use of libraries of short expressed nucleic acid sequences for the analysis of cellular events
WO2017001570A3 (en) * 2015-06-30 2017-03-23 Ethris Gmbh Atp-binding cassette family coding polyribonucleotides and formulations thereof

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4021197B2 (en) * 2000-03-17 2007-12-12 アンチキャンサー, インコーポレイテッド Whole body optical imaging of gene expression and its use
WO2012151503A2 (en) * 2011-05-04 2012-11-08 The Broad Institute, Inc. Multiplexed genetic reporter assays and compositions
EP4219709A1 (en) * 2016-11-03 2023-08-02 Temple University - Of The Commonwealth System of Higher Education Dna plasmids for the fast generation of homologous recombination vectors for cell line development
CN114581265B (en) * 2022-03-08 2022-09-20 北京女娲补天科技信息技术有限公司 System and method for analyzing eating preference of diner
CN116343917B (en) * 2023-03-22 2023-11-10 电子科技大学长三角研究院(衢州) Method for identifying transcription factor co-localization based on ATAC-seq footprint

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996034109A1 (en) * 1995-04-25 1996-10-31 Vical Incorporated Single-vial formulations of dna/lipid complexes
US6057111A (en) * 1996-11-13 2000-05-02 Quark Biotech, Inc. Gene identification method
US6395473B1 (en) * 1998-07-14 2002-05-28 Merck & Co., Inc. Adenoviral based promoter assay
WO2002042325A2 (en) * 2000-10-31 2002-05-30 Zycos Inc. Cyp1b1 nucleic acids and methods of use
US20040219516A1 (en) * 2002-07-18 2004-11-04 Invitrogen Corporation Viral vectors containing recombination sites
WO2005118792A1 (en) * 2004-06-01 2005-12-15 Avigen, Inc. Compositions and methods to prevent aav vector aggregation
US20060067948A1 (en) * 2001-03-30 2006-03-30 Allen Jane F Viral vectors
US20060154369A1 (en) * 2005-01-10 2006-07-13 Guang-Hsiung Kuo Promoter sequences from WSSV immediate early genes and their uses in recombinant DNA techniques

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5168062A (en) * 1985-01-30 1992-12-01 University Of Iowa Research Foundation Transfer vectors and microorganisms containing human cytomegalovirus immediate-early promoter-regulatory DNA sequence
ATE196311T1 (en) * 1993-12-09 2000-09-15 Univ Jefferson COMPOUNDS AND METHODS FOR SITE-SPECIFIC MUTATION IN EUKARYOTIC CELLS
CA2440148C (en) * 2001-03-09 2012-07-10 Gene Stream Pty Ltd Novel expression vectors
US7026123B1 (en) * 2001-08-29 2006-04-11 Pioneer Hi-Bred International, Inc. UTR tag assay for gene function discovery
US20070161031A1 (en) * 2005-12-16 2007-07-12 The Board Of Trustees Of The Leland Stanford Junior University Functional arrays for high throughput characterization of gene expression regulatory elements

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HILSON P. ET AL.: "Versatile Gene-Specific Sequence Tags for Arabidopsis Functional Genomics: Transcript Profiling and Reverse Genetics Applications", GENOME RESEARCH, vol. 14, 2004, pages 2176 - 2189, XP008123382 *
See also references of EP2209903A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2344676A1 (en) * 2008-09-25 2011-07-20 Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. Combinatorial synthesis and use of libraries of short expressed nucleic acid sequences for the analysis of cellular events
WO2017001570A3 (en) * 2015-06-30 2017-03-23 Ethris Gmbh Atp-binding cassette family coding polyribonucleotides and formulations thereof
GB2560250A (en) * 2015-06-30 2018-09-05 Ethris Gmbh ATP-Binding cassette family coding polyribonucleotides and formulations thereof
US11479768B2 (en) 2015-06-30 2022-10-25 Ethris Gmbh ATP-binding cassette family coding polyribonucleotides and formulations thereof

Also Published As

Publication number Publication date
EP2209903A1 (en) 2010-07-28
US20090111099A1 (en) 2009-04-30
CN101918578A (en) 2010-12-15
EP2209903A4 (en) 2011-06-22

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200880123310.5

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08841807

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2008841807

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE