EP1240311A1 - DNA molecules encoding human NHL, a DNA helicase - Google Patents

DNA molecules encoding human NHL, a DNA helicase

Info

Publication number
EP1240311A1
EP1240311A1 EP00983952A EP00983952A EP1240311A1 EP 1240311 A1 EP1240311 A1 EP 1240311A1 EP 00983952 A EP00983952 A EP 00983952A EP 00983952 A EP00983952 A EP 00983952A EP 1240311 A1 EP1240311 A1 EP 1240311A1
Authority
EP
European Patent Office
Prior art keywords
nhl
protein
seq
host cell
expression vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP00983952A
Other languages
German (de)
French (fr)
Inventor
Xiaomei Liu
Chang Bai
Michael L. Metzker
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Merck and Co Inc
Original Assignee
Merck and Co Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Merck and Co Inc
Publication of EP1240311A1
Legal status: Withdrawn

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/90: Isomerases (5.)

Definitions

  • the present invention relates in part to isolated nucleic acid molecules (polynucleotides) which encode NHL, a putative DNA helicase.
  • the present invention also relates to recombinant vectors and recombinant hosts which contain a DNA fragment encoding NHL, substantially purified forms of associated NHL, associated mutant proteins, and methods associated with identifying compounds which modulate NHL, which will be useful in the treatment of various neoplastic disorders, given that this gene is located at 20q13.3 and immediately adjacent to M68/DcR3, which is involved in tumor growth.
  • a human genomic fragment representing this portion of the human genome, along with three additional genes (M68/DcR3, SCLIP, and ARP).
  • chemotherapeutic agents inhibit helicases, including actinomycin C1, daunorubicin and nogalamycin (Tuteja, et al., 1997, Biochem. Biophys. Res. Comm. 236(3):636-640), and a prostate cancer drug, CI-958 (Lun, et al., 1998, Cancer Chemother. Pharmacol. 42(6):447-453).
  • some topoisomerases have been shown to have anti-cancer activity.
  • in view of the link between helicase-encoding genes and chemotherapeutic agents, it would be advantageous to identify additional genes which reside within chromosomal regions associated with a disease state such as cancer, as well as a gene which encodes a type of protein which may be associated with that disease.
  • the present invention addresses and meets this need by disclosing a DNA molecule encoding a DNA helicase with a chromosomal location suggestive of association with cancer.
  • the present invention relates to an isolated or purified nucleic acid molecule (polynucleotide) which encodes a novel mammalian DNA helicase.
  • the present invention also relates to an isolated nucleic acid molecule (polynucleotide) which encodes mRNA which expresses a novel human DNA helicase, NHL.
  • a preferred aspect of the present invention relates to an isolated or purified DNA molecule which encodes human NHL, the nucleotide sequence as set forth in Figure 1A-B and SEQ ID NO:1.
  • the present invention also relates to biologically active fragments or mutants of SEQ ID NO: 1 which encode a mRNA molecule expressing a novel DNA helicase, NHL.
  • Any such biologically active fragment and/or mutant will encode either a protein or protein fragment which at least substantially mimics the biological properties of the human NHL protein disclosed herein in Figure 2 and as set forth as SEQ ID NO:2.
  • Any such polynucleotide includes but is not necessarily limited to nucleotide substitutions, deletions, additions, amino-terminal truncations and carboxy-terminal truncations such that these mutations encode mRNA which express a functional NHL protein in a host cell, so as to be useful for screening for agonists and/or antagonists of NHL activity.
  • the present invention also relates to recombinant vectors and recombinant hosts, both prokaryotic and eukaryotic, which contain the substantially purified nucleic acid molecules disclosed throughout this specification.
  • the present invention also relates to a substantially purified form of a human NHL protein which comprises the amino acid sequence disclosed in Figure 2 and set forth as SEQ ID NO:2.
  • a preferred aspect of this portion of the present invention is a NHL protein which consists of the amino acid sequence disclosed in Figure 2 and set forth as SEQ ID NO:2.
  • a substantially purified NHL protein, preferably a human NHL protein, obtained from a recombinant host cell containing a DNA expression vector which comprises a nucleotide sequence as set forth in SEQ ID NO:1 and expresses the respective NHL protein.
  • it is especially preferred that the recombinant host cell be a eukaryotic host cell, such as a mammalian cell line.
  • the present invention also relates to biologically active fragments and/or mutants of a NHL protein comprising the amino acid sequence as set forth in SEQ ID NO:2, including but not necessarily limited to amino acid substitutions, deletions, additions, amino terminal truncations and carboxy-terminal truncations such that these mutations provide for proteins or protein fragments of diagnostic, therapeutic or prophylactic use and would be useful for screening for selective modulators, including but not limited to agonists and/or antagonists for human NHL pharmacology.
  • a preferred aspect of the present invention is disclosed in Figure 2 and is set forth as SEQ ID NO:2, a respective amino acid sequence which encodes human NHL. Characterization of one or more of these DNA helicase-like proteins allows for screening methods to identify novel NHL modulators that may be useful in the treatment of human neoplastic disorders. The modulators selected through such screening and selection protocols may be used alone or in conjunction with other cancer therapies. As noted above, heterologous expression of a NHL protein will allow the pharmacological analysis of compounds which modulate NHL activity and hence may be useful in various cancer therapies. To this end, heterologous cell lines expressing a NHL protein can be used to establish functional or binding assays to identify novel NHL modulators.
  • the present invention also relates to polyclonal and monoclonal antibodies raised in response to either the NHL or a biologically active fragment of NHL.
  • the present invention relates to transgenic mice comprising altered genotypes and phenotypes in relation to NHL and its in vivo activity.
  • the present invention also relates to NHL fusion constructs, including but not limited to fusion constructs which express a portion of the NHL protein linked to various markers, including but in no way limited to GFP (Green fluorescent protein), the MYC epitope, and GST. Any such fusion constructs may be expressed in the cell line of interest and used to screen for NHL modulators.
  • the present invention relates to methods of expressing mammalian NHL, and preferably human NHL, biological equivalents disclosed herein, assays employing these gene products, recombinant host cells which comprise DNA constructs which express these proteins, and compounds identified through these assays which act as agonists or antagonists of NHL activity.
  • the present invention also relates to the isolated genomic sequence which comprises SEQ ID NO:1, a 115 kb genomic fragment set forth herein as SEQ ID NO:3.
  • An especially preferred aspect of this portion of the invention is the region of the genomic fragment of SEQ ID NO:3 which comprises the regulatory and coding regions of human NHL, as well as intervening sequences (introns).
  • This 115 kb fragment contains at least the coding region of four genes, NHL, M68/DcR3, SCLIP and ARP. As discussed herein, it has been shown that this region of chromosome 20 is associated with tumor growth.
  • an aspect of this invention also comprises the use of one or more regions of this 115 kb genomic sequence to identify compounds which up or downregulate expression of one or more of the genes localized within this 115 kb region, wherein this up or down regulation results in an interference of tumor growth.
  • a transcription element of one of these four genes may be responsible for M68/DcR3 (and/or NHL) overexpression in tumors, and if M68 or NHL overexpression in tumors has a causal role, blockage of M68/DcR3 or NHL overexpression in tumors by interfering with this transcription site will be useful.
  • It is an object of the present invention to provide an isolated nucleic acid molecule (e.g., SEQ ID NO:1) which encodes a novel form of human NHL, or fragments, mutants or derivatives of human NHL as set forth in Figure 2 and SEQ ID NO:2.
  • Any such polynucleotide includes but is not necessarily limited to nucleotide substitutions, deletions, additions, amino-terminal truncations and carboxy-terminal truncations such that these mutations encode mRNA which express a protein or protein fragment of diagnostic, therapeutic or prophylactic use and would be useful for screening for selective modulators of human NHL activity.
  • it is especially preferred that the recombinant host cell be a eukaryotic host cell, such as a mammalian cell line.
  • NHL proteins or biological equivalent to screen for modulators, preferably selective modulators, of human NHL activity. Any such compound may be useful in screening for and selecting compounds active against human neoplastic disorders.
  • substantially free from other nucleic acids means at least 90%, preferably 95%, more preferably 99%, and even more preferably 99.9%, free of other nucleic acids.
  • a human NHL DNA preparation that is substantially free from other nucleic acids will contain, as a percent of its total nucleic acid, no more than 10%, preferably no more than 5%, more preferably no more than 1%, and even more preferably no more than 0.1%, of non-NHL nucleic acids.
  • Whether a given NHL DNA preparation is substantially free from other nucleic acids can be determined by such conventional techniques of assessing nucleic acid purity as, e.g., agarose gel electrophoresis combined with appropriate staining methods, e.g., ethidium bromide staining, or by sequencing.
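The purity thresholds stated above (no more than 10%, 5%, 1% or 0.1% of non-NHL nucleic acid) amount to a simple calculation. The following Python sketch is illustrative only: the function name, the tier labels and the idea of passing measured amounts in arbitrary units are assumptions, not part of the specification.

```python
def purity_tier(nhl_amount: float, total_amount: float) -> str:
    """Classify a preparation by its percentage of non-NHL material.

    Amounts may be in any unit (e.g. ng), as long as both use the same one.
    The tier labels are hypothetical names for the thresholds in the text.
    """
    if total_amount <= 0 or not 0 <= nhl_amount <= total_amount:
        raise ValueError("amounts must satisfy 0 <= nhl_amount <= total_amount, total > 0")

    contaminant_pct = 100.0 * (total_amount - nhl_amount) / total_amount

    # Thresholds from the text, checked from strictest to most permissive.
    tiers = (
        (0.1, "99.9% free"),
        (1.0, "99% free"),
        (5.0, "95% free"),
        (10.0, "90% free (substantially free)"),
    )
    for max_contaminant_pct, label in tiers:
        if contaminant_pct <= max_contaminant_pct:
            return label
    return "not substantially free"


if __name__ == "__main__":
    print(purity_tier(nhl_amount=97.0, total_amount=100.0))
```

The same arithmetic would apply to a protein preparation, with non-NHL protein taken as a percent of total protein.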
  • “substantially free from other proteins” or “substantially purified” means at least 90%, preferably 95%, more preferably 99%, and even more preferably 99.9%, free of other proteins.
  • a NHL protein preparation that is substantially free from other proteins will contain, as a percent of its total protein, no more than 10%, preferably no more than 5%, more preferably no more than 1%, and even more preferably no more than 0.1%, of non-NHL proteins.
  • Whether a given NHL protein preparation is substantially free from other proteins can be determined by such conventional techniques of assessing protein purity as, e.g., sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) combined with appropriate detection methods, e.g., silver staining or immunoblotting.
  • the terms “isolated NHL protein” or “purified NHL protein” also refer to NHL protein that has been isolated from a natural source. Use of the term “isolated” or “purified” indicates that NHL protein has been removed from its normal cellular environment.
  • an isolated NHL protein may be in a cell-free solution or placed in a different cellular environment from that in which it occurs naturally.
  • isolated does not imply that an isolated NHL protein is the only protein present, but instead means that an isolated NHL protein is substantially free of other proteins and non-amino acid material (e.g., nucleic acids, lipids, carbohydrates) naturally associated with the NHL protein in vivo.
  • a NHL protein preparation that is an isolated or purified NHL protein will be substantially free from other proteins and will contain, as a percent of its total protein, no more than 10%, preferably no more than 5%, more preferably no more than 1%, and even more preferably no more than 0.1%, of non-NHL proteins.
  • a “functional equivalent” or “biologically active equivalent” means a protein which does not have exactly the same amino acid sequence as naturally occurring NHL, due to alternative splicing, deletions, mutations, substitutions, or additions, but retains substantially the same biological activity as NHL.
  • Such functional equivalents will have significant amino acid sequence identity with naturally occurring NHL and genes and cDNA encoding such functional equivalents can be detected by reduced stringency hybridization with a DNA sequence encoding naturally occurring NHL.
  • a naturally occurring NHL disclosed herein comprises the amino acid sequence shown as SEQ ID NO:2 and is encoded by SEQ ID NO:1.
  • a nucleic acid encoding a functional equivalent has at least about 50% identity at the nucleotide level to SEQ ID NO: 1.
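A threshold such as "at least about 50% identity at the nucleotide level" presupposes some way of computing identity. The sketch below is one minimal approach in Python, assuming the two sequences have already been aligned (gaps written as `-`) and that identity is counted over ungapped columns only. The specification does not say which convention it uses, so both the function and its denominator are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Columns in which either sequence has a gap ('-') are skipped, so the
    denominator is the number of ungapped columns (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped column: not counted
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences, not taken from SEQ ID NO:1.
    identity = percent_identity("ATG-CA", "ATGGCT")
    print(f"{identity:.1f}% identity; meets 50% threshold: {identity >= 50.0}")
```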
  • a conservative amino acid substitution refers to the replacement of one amino acid residue by another, chemically similar, amino acid residue. Examples of such conservative substitutions are: substitution of one hydrophobic residue (isoleucine, leucine, valine, or methionine) for another; substitution of one polar residue for another polar residue of the same charge (e.g., arginine for lysine; glutamic acid for aspartic acid).
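The definition of a conservative substitution above can be expressed as a small lookup. This Python sketch covers only the residue groups named in that definition (the hydrophobic set isoleucine, leucine, valine and methionine, and the same-charge pairs arginine/lysine and glutamic acid/aspartic acid). The group table and function name are illustrative assumptions, and other groupings are possible.

```python
# One-letter codes for the residue groups named in the definition above.
CONSERVATIVE_GROUPS = (
    frozenset("ILVM"),  # hydrophobic: isoleucine, leucine, valine, methionine
    frozenset("RK"),    # positively charged: arginine, lysine
    frozenset("DE"),    # negatively charged: aspartic acid, glutamic acid
)


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues differ but fall in the same listed group."""
    orig = original.upper()
    repl = replacement.upper()
    if orig == repl:
        return False  # identical residues are not a substitution
    return any(orig in group and repl in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    for pair in (("L", "V"), ("R", "K"), ("K", "E")):
        print(pair, is_conservative(*pair))
```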
  • the term “mammalian” will refer to any mammal, including a human being.
  • Figure 1A-B shows the nucleotide sequence which comprises the open reading frame which encodes human NHL, the nucleotide sequence set forth as SEQ ID NO:1.
  • the initiating Met residue (ATG) and the stop codon (TAG) are underlined.
  • Figure 2 shows the amino acid sequence of human NHL as set forth in SEQ ID NO:2.
  • Figure 3 shows the alignment of amino acid sequences of human NHL to ERCC2/RAD3 gene family members.
  • Rep D: Dictyostelium discoideum
  • RAD3: S. cerevisiae
  • RAD15: S. pombe
  • XP_GroupD: Homo sapiens
  • Figure 4 shows Northern analysis of NHL expression in multiple human tissues.
  • Figure 5A-B shows the genomic structure of the NHL gene (Figure 5A) and the entire 115 kb genomic region (Figure 5B) containing the NHL, M68/DcR3, SCLIP and ARP genes.
  • the present invention relates to an isolated or purified nucleic acid molecule (polynucleotide) which encodes a novel mammalian DNA helicase.
  • An especially preferred aspect of this invention relates to an isolated nucleic acid molecule (polynucleotide) which encodes mRNA which expresses a novel human DNA helicase, NHL.
  • M68/DcR3 is a secreted TNFR member that is overexpressed in a number of human tumors.
  • M68/DcR3 is located at 20q13.3, a known site that is associated with frequent gene amplification in cancer.
  • M68/DcR3 protein binds to FASL and inhibits FAS-mediated apoptosis.
  • genes tightly linked to M68/DcR3 may be coregulated (e.g., co-overexpressed and/or amplified in tumors).
  • the transcript was identified through exon prediction using GRAIL2 and sequence alignment to a contiguous 4.5 kilobase region of chromosome 4 (88% sequence identity). The complete exon structure of NHL was subsequently confirmed by RT-PCR analysis. Multiple sequence alignment of NHL to known helicases showed that NHL contains all seven critical helicase domains.
  • BLAST analysis of the predicted 1,219 amino acid sequence revealed an approximately 26% sequence identity and 48% sequence similarity to the RAD3/ERCC2 gene family of DNA helicases (Naumovski et al., 1985 Mol. Cell Biol. 5:17-26; Reynolds et al., 1985 Nucleic Acid Res 13:2357-72; Weber et al., 1990 EMBO J. 9:1437-1447).
  • a preferred aspect of the present invention relates to an isolated or purified DNA molecule which encodes human NHL, the nucleotide sequence as set forth in Figure 1A-B and SEQ ID NO:1, which is as follows:
  • the above-exemplified isolated DNA molecule shown in Figure 1A-B and SEQ ID NO:1 comprises 4946 nucleotides, with an initiating Met at nucleotides 828-
  • TAG termination codon are underlined.
  • the present invention also relates to biologically active fragments or mutants of SEQ ID NO:1 which encode a mRNA molecule expressing a novel DNA helicase, NHL.
  • Any such biologically active fragment and/or mutant will encode either a protein or protein fragment which at least substantially mimics the biological properties of the human NHL protein disclosed herein in Figure 2 and as set forth as SEQ ID NO:2.
  • any such polynucleotide includes but is not necessarily limited to nucleotide substitutions, deletions, additions, amino-terminal truncations and carboxy-terminal truncations such that these mutations encode mRNA which express a functional NHL protein in a host cell, so as to be useful for screening for agonists and/or antagonists of NHL activity.
  • the isolated nucleic acid molecules of the present invention may include a deoxyribonucleic acid molecule (DNA), such as genomic DNA and complementary DNA (cDNA), which may be single (coding or noncoding strand) or double stranded, as well as synthetic DNA, such as a synthesized, single stranded polynucleotide.
  • the isolated nucleic acid molecule of the present invention may also include a ribonucleic acid molecule (RNA).
  • the present invention also relates to recombinant vectors and recombinant hosts, both prokaryotic and eukaryotic, which contain the substantially purified nucleic acid molecules disclosed throughout this specification.
  • the degeneracy of the genetic code is such that, for all but two amino acids, more than a single codon encodes a particular amino acid. This allows for the construction of synthetic DNA that encodes the NHL protein where the nucleotide sequence of the synthetic DNA differs significantly from the nucleotide sequence of SEQ ID NO:1 but still encodes the same NHL protein as SEQ ID NO:1.
  • Such synthetic DNAs are intended to be within the scope of the present invention. If it is desired to express such synthetic DNAs in a particular host cell or organism, the codon usage of such synthetic DNAs can be adjusted to reflect the codon usage of that particular host, thus leading to higher levels of expression of the NHL protein in the host. In other words, this redundancy in the various codons which code for specific amino acids is within the scope of the present invention. Therefore, this invention is also directed to those DNA sequences which encode RNA comprising alternative codons which code for the eventual translation of the identical amino acid, as shown below:
  • the present invention discloses codon redundancy which may result in differing DNA molecules expressing an identical protein.
  • a sequence bearing one or more replaced codons will be defined as a degenerate variation.
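The codon redundancy described above can be checked directly against the standard genetic code. The following Python sketch builds the standard table, lists the amino acids encoded by a single codon, and tests whether two DNA sequences are degenerate variations of one another. The helper names (`translate`, `is_degenerate_variation`) and the toy sequences are illustrative assumptions and are not taken from SEQ ID NO:1.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in DNA form, in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop, '*') they encode.
SYNONYMOUS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMOUS[aa].append(codon)

# Amino acids with exactly one codon: the "all but two" exceptions.
SINGLE_CODON = sorted(
    aa for aa, codons in SYNONYMOUS.items() if aa != "*" and len(codons) == 1
)


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variation(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ in nucleotides but encode the same protein."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    print("single-codon amino acids:", SINGLE_CODON)
    print("leucine codons:", SYNONYMOUS["L"])
    print(is_degenerate_variation("ATGCTGAAA", "ATGTTAAAG"))
```

Adjusting codon usage for a particular host would then amount to choosing, for each amino acid, among the entries in `SYNONYMOUS` according to that host's codon preferences, which are not given in the text.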
  • mutations either in the DNA sequence or the translated protein which do not substantially alter the ultimate physical properties of the expressed protein. For example, substitution of valine for leucine, arginine for lysine, or asparagine for glutamine may not cause a change in functionality of the polypeptide.
  • DNA sequences coding for a peptide may be altered so as to code for a peptide having properties that are different than those of the naturally occurring peptide.
  • Methods of altering the DNA sequences include but are not limited to site directed mutagenesis. Examples of altered properties include but are not limited to changes in the affinity of an enzyme for a substrate or a receptor for a ligand.
  • the present invention also relates to recombinant vectors and recombinant hosts, both prokaryotic and eukaryotic, which contain the substantially purified nucleic acid molecules disclosed throughout this specification.
  • the nucleic acid molecules of the present invention encoding a NHL protein, in whole or in part, can be linked with other DNA molecules, i.e., DNA molecules to which the NHL coding sequence is not naturally linked, to form "recombinant DNA molecules" which encode a respective NHL protein.
  • the novel DNA sequences of the present invention can be inserted into vectors which comprise nucleic acids encoding NHL or a functional equivalent. These vectors may be comprised of DNA or RNA; for most cloning purposes DNA vectors are preferred.
  • Typical vectors include plasmids, modified viruses, bacteriophage, cosmids, yeast artificial chromosomes, and other forms of episomal or integrated DNA that can encode a NHL protein. It is well within the purview of the skilled artisan to determine an appropriate vector for a particular gene transfer or other use.
  • Included in the present invention are DNA sequences that hybridize to SEQ ID NO:1 under stringent conditions.
  • a procedure using conditions of high stringency is as follows: Prehybridization of filters containing DNA is carried out for 2 hours to overnight at 65°C in buffer composed of 6X SSC, 5X Denhardt's solution, and 100 μg/ml denatured salmon sperm DNA. Filters are hybridized for 12 to 48 hrs at 65°C in prehybridization mixture containing 100 μg/ml denatured salmon sperm DNA and 5-20 × 10⁶ cpm of ³²P-labeled probe.
  • Washing of filters is done at 37°C for 1 hr in a solution containing 2X SSC, 0.1% SDS. This is followed by a wash in 0.1X SSC, 0.1% SDS at 50°C for 45 min. before autoradiography.
  • Other procedures using conditions of high stringency would include either a hybridization step carried out in 5X SSC, 5X Denhardt's solution, 50% formamide at 42°C for 12 to 48 hours or a washing step carried out in 0.2X SSPE, 0.2% SDS at 65°C for 30 to 60 minutes.
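The first high-stringency procedure above can be written out as data, which makes the progressive drop in salt concentration across the washes explicit. This Python sketch assumes the conventional recipe in which 1X SSC is 0.15 M NaCl and 0.015 M sodium citrate. The `Step` class, the `PROTOCOL` tuple and the `nacl_molar` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Assumed conventional composition of 1X SSC.
NACL_M_PER_1X_SSC = 0.15
CITRATE_M_PER_1X_SSC = 0.015


@dataclass(frozen=True)
class Step:
    name: str
    ssc_x: float      # SSC strength, e.g. 6.0 for 6X SSC
    temp_c: int       # temperature in degrees Celsius
    duration: str     # duration as stated in the text


# Steps of the first high-stringency procedure, as given in the text.
PROTOCOL = (
    Step("prehybridization", 6.0, 65, "2 h to overnight"),
    Step("hybridization", 6.0, 65, "12 to 48 h"),
    Step("first wash", 2.0, 37, "1 h"),
    Step("second wash", 0.1, 50, "45 min"),
)


def nacl_molar(ssc_x: float) -> float:
    """NaCl molarity of an SSC buffer of the given strength."""
    return ssc_x * NACL_M_PER_1X_SSC


if __name__ == "__main__":
    for step in PROTOCOL:
        print(
            f"{step.name:<17} {step.ssc_x:>4}X SSC  "
            f"{nacl_molar(step.ssc_x):.3f} M NaCl  {step.temp_c} C  {step.duration}"
        )
```

Lower salt and higher temperature both make a wash more stringent, so the final 0.1X SSC wash at 50°C is the step that most limits which hybrids survive.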
  • the present invention also relates to a substantially purified form of a human NHL protein which comprises the amino acid sequence (1219 amino acid residues) disclosed in Figure 2 and set forth as SEQ ID NO:2.
  • a preferred aspect of this portion of the present invention is a NHL protein which consists of the amino acid sequence disclosed in Figure 2 and set forth as SEQ ID NO:2, as follows:
  • GPSQSSGPPH GPAASEWGL* (SEQ ID NO: 2).
  • the present invention also relates to biologically active fragments and/or mutants of the human NHL protein comprising the amino acid sequence as set forth in SEQ ID NO:2, including but not necessarily limited to amino acid substitutions, deletions, additions, amino terminal truncations and carboxy-terminal truncations such that these mutations provide for proteins or protein fragments of diagnostic, therapeutic or prophylactic use and would be useful for screening for agonists and/or antagonists of NHL function.
  • Another preferred aspect of the present invention relates to a substantially purified, fully processed NHL protein obtained from a recombinant host cell containing a DNA expression vector which comprises a nucleotide sequence as set forth in SEQ ID NO:1 and expresses the human NHL protein. It is especially preferred that the recombinant host cell be a eukaryotic host cell, such as a mammalian cell line.
  • this invention includes modified NHL polypeptides which have amino acid deletions, additions, or substitutions but that still retain substantially the same biological activity as a respective, corresponding NHL. It is generally accepted that single amino acid substitutions do not usually alter the biological activity of a protein (see, e.g., Molecular Biology of the Gene, Watson et al., 1987, Fourth Ed., The Benjamin/Cummings Publishing Co., Inc., page 226; and Cunningham & Wells, 1989, Science 244:1081-1085).
  • the present invention includes a polypeptide where one amino acid substitution has been made in SEQ ID NO:2 wherein the polypeptide still retains substantially the same biological activity as a corresponding NHL protein.
  • the present invention also includes polypeptides where two or more amino acid substitutions have been made in SEQ ID NO:2 wherein the polypeptide still retains substantially the same biological activity as a corresponding NHL protein.
  • the present invention includes embodiments where the above-described substitutions are conservative substitutions.
  • polypeptides that are functional equivalents of NHL and have changes from the NHL amino acid sequence that are small deletions or insertions of amino acids could also be produced by following the same guidelines (i.e., minimizing the differences in amino acid sequence between NHL and related proteins). Small deletions or insertions are generally in the range of about 1 to 5 amino acids.
  • the effect of such small deletions or insertions on the biological activity of the modified NHL polypeptide can easily be assayed by producing the polypeptide synthetically or by making the required changes in DNA encoding NHL and then expressing the DNA recombinantly and assaying the protein produced by such recombinant expression.
  • the present invention also includes truncated forms of NHL which contain the region comprising the active site of the enzyme.
  • truncated proteins are useful in various assays described herein, for crystallization studies, and for structure-activity-relationship studies.
  • the present invention also relates to isolated nucleic acid molecules which are fusion constructions expressing fusion proteins useful in assays to identify compounds which modulate wild-type NHL activity, as well as generating antibodies against NHL.
  • One aspect of this portion of the invention includes, but is not limited to, glutathione S-transferase (GST)-NHL fusion constructs.
  • GST-NHL fusion proteins may be expressed in various expression systems, including Spodoptera frugiperda (Sf21) insect cells (Invitrogen) using a baculovirus expression vector (pAcG2T, Pharmingen).
  • Another aspect involves NHL fusion constructs linked to various markers, including but not limited to GFP (Green fluorescent protein), the MYC epitope, and GST.
  • any such fusion constructs may be expressed in the cell line of interest and used to screen for modulators of one or more of the NHL proteins disclosed herein.
  • Any of a variety of procedures may be used to clone NHL. These methods include, but are not limited to, (1) a RACE PCR cloning technique (Frohman, et al., 1988, Proc. Natl. Acad. Sci. USA 85: 8998-9002). 5' and/or 3' RACE may be performed to generate a full-length cDNA sequence. This strategy involves using gene-specific oligonucleotide primers for PCR amplification of NHL cDNA.
  • These gene-specific primers are designed through identification of an expressed sequence tag (EST) nucleotide sequence which has been identified by searching any number of publicly available nucleic acid and protein databases; (2) direct functional expression of the NHL cDNA following the construction of a NHL-containing cDNA library in an appropriate expression vector system; (3) screening a NHL-containing cDNA library constructed in a bacteriophage or plasmid shuttle vector with a labeled degenerate oligonucleotide probe designed from the amino acid sequence of the NHL protein; (4) screening a NHL-containing cDNA library constructed in a bacteriophage or plasmid shuttle vector with a partial cDNA encoding the NHL protein.
  • This partial cDNA is obtained by the specific PCR amplification of NHL DNA fragments through the design of degenerate oligonucleotide primers from the amino acid sequence known for other helicases which are related to the NHL protein; (5) screening a NHL-containing cDNA library constructed in a bacteriophage or plasmid shuttle vector with a partial cDNA or oligonucleotide with homology to a mammalian NHL protein.
  • This strategy may also involve using gene-specific oligonucleotide primers for PCR amplification of NHL cDNA identified as an EST as described above; or (6) designing 5' and 3' gene specific oligonucleotides using SEQ ID NO: 1 as a template so that either the full-length cDNA may be generated by known RACE techniques, or a portion of the coding region may be generated by these same known RACE techniques to generate and isolate a portion of the coding region to use as a probe to screen one of numerous types of cDNA and/or genomic libraries in order to isolate a full-length version of the nucleotide sequence encoding NHL.
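Several of the cloning strategies above rely on gene-specific oligonucleotide primers designed from a known sequence. As a minimal sketch, the Python below takes a forward primer from the start of a target region and a reverse primer as the reverse complement of its end, and estimates melting temperature with the Wallace rule (2°C per A/T, 4°C per G/C), which is only a rough guide for short oligonucleotides. The function names, the default primer length and the toy template are assumptions, and the template is not SEQ ID NO:1.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def primer_pair(template: str, start: int, end: int, length: int = 20):
    """Forward and reverse primers flanking template[start:end] (0-based, end exclusive).

    The forward primer matches the first `length` bases of the region; the
    reverse primer is the reverse complement of the last `length` bases.
    """
    region = template.upper()[start:end]
    if len(region) < 2 * length:
        raise ValueError("region too short for two non-overlapping primers")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


if __name__ == "__main__":
    toy_template = "ATGGCCAAATTTGGGCCCTAG"  # hypothetical sequence
    fwd, rev = primer_pair(toy_template, 0, len(toy_template), length=6)
    print("forward:", fwd, "Tm ~", wallace_tm(fwd))
    print("reverse:", rev, "Tm ~", wallace_tm(rev))
```

For 5' or 3' RACE, one of the two primers would instead anneal to an adapter or poly(A)-derived sequence, which this sketch does not model.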
  • libraries as well as libraries constructed from other cell types or species types, may be useful for isolating a NHL-encoding DNA or a NHL homologue.
  • Other types of libraries include, but are not limited to, cDNA libraries derived from other cells.
  • suitable cDNA libraries may be prepared from cells or cell lines which have NHL activity.
  • the selection of cells or cell lines for use in preparing a cDNA library to isolate a cDNA encoding NHL may be done by first measuring cell-associated NHL activity using any known assay available for such a purpose.
  • Preparation of cDNA libraries can be performed by standard techniques well known in the art. Well known cDNA library construction techniques can be found, for example, in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, New York. Complementary DNA libraries may also be obtained from numerous commercial sources, including but not limited to Clontech Laboratories, Inc. and Stratagene.
  • DNA encoding NHL may also be isolated from a suitable genomic DNA library. Construction of genomic DNA libraries can be performed by standard techniques well known in the art. Well known genomic DNA library construction techniques can be found in Sambrook, et al., supra. One may prepare genomic libraries, especially in P1 artificial chromosome vectors, from which genomic clones containing the NHL gene can be isolated, using probes based upon the NHL nucleotide sequences disclosed herein. Methods of preparing such libraries are known in the art (Ioannou et al., 1994, Nature Genet. 6:84-89).
  • the amino acid sequence or DNA sequence of a NHL or a homologous protein may be necessary.
  • a respective NHL protein may be purified and the partial amino acid sequence determined by automated sequenators. It is not necessary to determine the entire amino acid sequence, but the linear sequence of two regions of 6 to 8 amino acids can be determined for the PCR amplification of a partial NHL DNA fragment.
  • the DNA sequences capable of encoding them are synthesized. Because the genetic code is degenerate, more than one codon may be used to encode a particular amino acid, and therefore, the amino acid sequence can be encoded by any of a set of similar DNA oligonucleotides.
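Because of this degeneracy, a short stretch of determined amino acid sequence corresponds to a whole set of possible oligonucleotides. The Python sketch below counts and enumerates that set using the standard genetic code. The peptide used is a made-up six-residue example rather than a sequence from SEQ ID NO:2, and the function names are illustrative.

```python
from itertools import product
from math import prod

# Standard genetic code in DNA form, in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide.upper())


def encoding_oligos(peptide: str):
    """Yield every DNA sequence that encodes the peptide."""
    for combo in product(*(CODONS_FOR[aa] for aa in peptide.upper())):
        yield "".join(combo)


if __name__ == "__main__":
    peptide = "MWHEKD"  # hypothetical region chosen for low degeneracy
    print(peptide, "is encoded by", degeneracy(peptide), "oligonucleotides")
    print("LSR is encoded by", degeneracy("LSR"), "oligonucleotides")
```

Regions rich in methionine and tryptophan, and poor in leucine, serine and arginine, keep the oligonucleotide pool small, which is one reason such regions are commonly favored for degenerate primer design.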
  • the nucleotide sequence of a region of an expressed sequence may be identified by searching one or more available genomic databases.
  • Gene-specific primers may be used to perform PCR amplification of a cDNA of interest from either a cDNA library or a population of cDNAs.
  • the appropriate nucleotide sequence for use in a PCR-based method may be obtained from SEQ ID NO:1 either for the purpose of isolating overlapping 5' and 3' RACE products for generation of a full-length sequence coding for NHL, or to isolate a portion of the nucleotide sequence coding for NHL for use as a probe to screen one or more cDNA- or genomic-based libraries to isolate a full-length sequence encoding NHL or NHL-like proteins.
  • This invention also includes vectors containing a NHL gene, host cells containing the vectors, and methods of making substantially pure NHL protein comprising the steps of introducing the NHL gene into a host cell, and cultivating the host cell under appropriate conditions such that NHL is produced.
  • the NHL so produced may be harvested from the host cells in conventional ways. Therefore, the present invention also relates to methods of expressing the NHL protein and biological equivalents disclosed herein, assays employing these gene products, recombinant host cells which comprise DNA constructs which express these proteins, and compounds identified through these assays which act as agonists or antagonists of NHL activity.
  • the cloned NHL cDNA obtained through the methods described above may be recombinantly expressed by molecular cloning into an expression vector (such as pcDNA3.neo, pcDNA3.1, pCR2.1, pBlueBacHis2 or pLITMUS28) containing a suitable promoter and other appropriate transcription regulatory elements, and transferred into prokaryotic or eukaryotic host cells to produce recombinant NHL.
  • Expression vectors are defined herein as DNA sequences that are required for the transcription of cloned DNA and the translation of their mRNAs in an appropriate host. Such vectors can be used to express eukaryotic DNA in a variety of hosts such as bacteria, blue green algae, plant cells, insect cells and animal cells.
  • Specifically designed vectors allow the shuttling of DNA between hosts such as bacteria-yeast or bacteria-animal cells.
  • An appropriately constructed expression vector should contain: an origin of replication for autonomous replication in host cells, selectable markers, a limited number of useful restriction enzyme sites, a potential for high copy number, and active promoters.
  • a promoter is defined as a DNA sequence that directs RNA polymerase to bind to DNA and initiate RNA synthesis.
  • a strong promoter is one which causes mRNAs to be initiated at high frequency.
  • cDNA molecules including but not limited to the following can be constructed: a cDNA fragment containing the full-length open reading frame for NHL as well as various constructs containing portions of the cDNA encoding only specific domains of the protein or rearranged domains of the protein. All constructs can be designed to contain none, all or portions of the 5' and/or 3' untranslated region of a NHL cDNA. The expression levels and activity of NHL can be determined following the introduction, both singly and in combination, of these constructs into appropriate host cells.
  • this NHL cDNA construct is transferred to a variety of expression vectors (including recombinant viruses), including but not limited to those for mammalian cells, plant cells, insect cells, oocytes, bacteria, and yeast cells. Techniques for such manipulations are described in Sambrook, et al., supra, and are well known and available to the artisan of ordinary skill in the art. Therefore, another aspect of the present invention includes host cells that have been engineered to contain and/or express DNA sequences encoding the NHL protein. An expression vector containing DNA encoding a NHL-like protein may be used for expression of NHL in a recombinant host cell.
  • Expression vectors may include, but are not limited to, cloning vectors, modified cloning vectors, specifically designed plasmids or viruses.
  • Commercially available mammalian expression vectors which may be suitable for recombinant NHL expression include but are not limited to, pcDNA3.neo (Invitrogen), pcDNA3.1 (Invitrogen), pCI-neo (Promega), pLITMUS28, pLITMUS29, pLITMUS38 and pLITMUS39 (New England Biolabs), pcDNAI, pcDNAIamp (Invitrogen), pcDNA3 (Invitrogen), pMC1neo (Stratagene), pXT1 (Stratagene), pSG5 (Stratagene), EBO-pSV2-neo (ATCC 37593), pBPV-1 (8-2)
  • bacterial expression vectors may be used to express recombinant NHL in bacterial cells.
  • Commercially available bacterial expression vectors which may be suitable for recombinant NHL expression include, but are not limited to pCR2.1 (Invitrogen), pET11a (Novagen), lambda gt11 (Invitrogen), and pKK223-3 (Pharmacia).
  • a variety of fungal cell expression vectors may be used to express recombinant NHL in fungal cells.
  • Commercially available fungal cell expression vectors which may be suitable for recombinant NHL expression include but are not limited to pYES2 (Invitrogen) and Pichia expression vector (Invitrogen).
  • a variety of insect cell expression vectors may be used to express recombinant protein in insect cells.
  • Commercially available insect cell expression vectors which may be suitable for recombinant expression of NHL include but are not limited to pBlueBacIII and pBlueBacHis2 (Invitrogen), and pAcG2T (Pharmingen).
  • Recombinant host cells may be prokaryotic or eukaryotic, including but not limited to, bacteria such as E. coli, fungal cells such as yeast, mammalian cells including, but not limited to, cell lines of bovine, porcine, monkey and rodent origin; and insect cells including but not limited to Drosophila and silkworm derived cell lines.
  • one insect expression system utilizes Spodoptera frugiperda (Sf21) insect cells (Invitrogen) in tandem with a baculovirus expression vector (pAcG2T, Pharmingen).
  • mammalian species which may be suitable and which are commercially available, include but are not limited to, L cells L-M(TK-) (ATCC CCL 1.3), L cells L-M (ATCC CCL 1.2), Saos-2 (ATCC HTB-85), 293 (ATCC CRL 1573), Raji (ATCC CCL 86), CV-1 (ATCC CCL 70), COS-1 (ATCC CRL 1650), COS-7 (ATCC CRL 1651), CHO-K1 (ATCC CCL 61), 3T3 (ATCC CCL 92), NIH/3T3 (ATCC CRL 1658), HeLa (ATCC CCL 2), C127I (ATCC CRL 1616), BS-C-1 (ATCC CCL 26), MRC-5 (ATCC CCL 171) and CPAE (ATCC CCL 209).
  • CTAAGCACAC CCAGGCAGGT GTCCTGGCAG ATGAGGACCA CATGCAGAGC CTCGGCCAGC 120
  • ATCTGACTAG TGTGATCTCG CAAGGAACAT TCCAGACACA GTGGAGCTAG AAGGTTCTTC 840 TCCAAACAAG GAATCCCCAG GGGATCAAAT TGTTTTGCAT CGGCCAGACA TGGTGGCTCA 900
  • CAGTCCCAGC ACTGTACTAA AAATCTACAC GGGGCCGGGC ATGGTGGCAC ATGCCTGTAG 1080
  • GACGTGCCAT AACCAAGAAG CCCCAGCCAC ACCCAGACCC GATGTGGCCA CAAGGGGTGA 1620
  • CTTTCCTCTG CAACTGTGGG CTTACGGGGC AAAGAAGTCC AGGCCTCCAG GTGGAGGATC 1860
  • CAAGACACCC ACAGAGGAGA GCTCTAAGCC ACAACTGTGT ACGAAGACAA CTGTGCAGGA 2640 TTTTATTACT ACAACATTTT TGTTTTCTTT TTTTTTTTTT TTTGAGACTG AGTCTCGCTC 2700
  • CACATTTCAA ATGGGTAACT CCAGTGTCCT TGATGCTCCT GCGACATGTT CGTGAGACTT 3060
  • ATGGTTCACA CTCCTTACCC TGCCGCTTTG TCTTGTATCC AATAAATAGC GCAACCTGGC 4080 ATTCGGGGCC GCTACCAGTC TCCGCGTCTT GGTGGTAGTG GTCCCCCAGG CCCAGCTGTC 4140
  • AAATGGCAAA TTAGACACAC ACATGTGGGC CGGGTACAGT GGCTCGCGCC TGTAATTCCA 5340
  • CCCTGCCTAC AGGACCCTGA GAGCTAGGGG AAGGCGTTAT CCTGAACTGT GTCCCCCGTA 6000
  • GAGATAATTT AAATGAGGTC ATATAAGTTG
  • GCCCTCATCC AGTAAGACTT TGACCTTCTG 6120 GTGGTTTTTT TTTTTTTGGA
  • GACTGGGTCT CACTCTATCA CTCAGGTTGG AGTACAGTGG 6180
  • ATCTGTCTCC CTCGGCCTCC TGCAGTGCTG GAATTACAGG TATGAGCCAC CGCGCCTGGC 6420 CGACCGTGAC CTTCTAAGAA GTGAAAGAGA AAGATCTTTC TCTCTCCCTC CCTCTCCATC 6480
  • GGCAGCCGCGCG GCCACGGTGT CAGGGCTCAG GTGAGGAGAG TTGGATATGG GACTGGGCCT 6960 ACCCCGAGGC TGCTTCCACC CAGACGCCTG GGTGGGTGAC ACGAAAGCTG GGCTCAGTTG 7020
  • CACGGCGTCC CAAGGGAGGG ACTTGGGCAC TGCCTCTCTG GGCAAGAGTG GGGAGGTGTG 7260 GGGTGGGAGA TGTCTGGAAA CATCATGGAC AC TGCCGGG AAAACACGGA AGCTGTGCAC 7320
  • GCCTCCACCA CCCTGACATG CAGGAGGGAG GTCAAAGCCT CGGGTCCAAC AACAGGCTCC 7560 ACAGCAAGGG AAGAAAGGCA GGAAGGAACT CAGGGCCAGG TCCTCCCAGG CAGCAGCTGC 7620
  • CTGCACGCTG TCCACCAAGG GAGGTCTGAC CTACACCGCA CAGGGGTTGG CAGTCTAGAG 7680
  • GGCCAACACA GTGAAATCCT GTCTTGACTA AAAATACTAA AAATTAGCCA GGCATGGTGG 10680
  • TTCTTGCTAA ATCTTACTCA ACCGACATTT TCCAGCATGG GAACATTTTT CTGAATGTCT 11640 TAGGGAGAGG AAGTCCGCAA GAGAACAAAA GGTCCTCAGG CCACCCTAGC TTCTTTTCCT 11700
  • AAAAATACAA AAATTAGCCG GGCGCGGTGG CAGGTGCCTG TAATCTCAGC TACTTGGGAG 12840 GCTGAGGCAG GAGAATCGCT TGAACCTGGG CAGCAGAGGT TGCAGTGAGC CAAGATCATG 12900
  • CTACTGATCT CCCGTGCTGA CTTCGGGG TTTAACTCTC ACTGAGGAGA CGCTGCTTTC 13680 ATAAGGGTAA GCTCAGCAGG GGCAACTAAA GTCATTTAAG CAGAGAGCTG CAAAGAGGCA 13740
  • CCGTGTTAGC CAGGCTGCTC TCAAACTCCT GACCTCATGA TCCGCCCACG TCGGGCTCCC 15000
  • GGCACTCACC CCGATCGCAT AGCATAGCTG ATACCCCGAT CCCACCCCAG TCCCATAGCC 17100
  • CTTCCTTCTA AAATATTTAT CATTTTTGTT TTGGGGATTT TTTTGGTTTG GTTTTTTTTG 19140
  • AACACGGTGA AACCCCGTCT CTACTAAAAA TACAAAAAAT TAGCCGGGCG TGGTAGCGGG 19440
  • AAAATACAAA AAAATTAGCC AGGCATGGTG ACGGGCGCCT GTAATCTCAG CTACTTGGGA 20640
  • TGGTGGCTCA CACCTGTAAT CCCAGCTACG TGGCAGGCTG AGGCAGGAGA ATCGCTTGAA 21240 CCTGGGAGGC GGAGGTTGTA GGGAGCTGAG ATCGCACCAC TGCACTCCAG CCTGGGCAAC 21300
  • ATCTCTACTA AAAATACAAA AGTTAGCTGG GTGTGTACAT GTAGTCTCAG CTACTTGGGA 21960
  • CAGCCACAGT CAATACCTCG CTTCTGCAGG GACGGTGGCT GCCAGAGTGG GAGGCTTTGG 22260
  • CTTTGCCCGA CACGAGTGCA CAGCAGGCTG TGGGGGAGCA ACTGGTTGAG TCAGGCCTCC 24780 ACTTGTGCCG TATCCCCACC TGCTTTGCTG GACACCCCTG TTTGGGGGGC ACCCACTGCT 24840
  • TAATCCCAGC TACTTGGGAG ACTGAGGCAG GAGAATCGCT TATAACCTGG GAGGTGGAGG 25620 TTGCAGTGAG CTGAGATCAC ACCGCTACAC TCTAGCTTGG GCAACAAGAG TGAAACTCCG 25680
  • GCAGATCTTC ACTCCCAGAC AGGGAGCCCG CAGCTGCCCC CGACCCCACA GGTGCAGGAC 28860 ACACACAGAC AGTTCAACCA TGTCTTAAAC ACACAGGTGT TTATTTAATT GTTCATTTGA 28920
  • CGCCATCCTC 29400 CAGCTGACCG TCCTCCAAGG CCAGCACTGG GCGTCCAAGG GAAAGAAGGA ACTCAGCCCA 29460
  • TCCTGCGCAC CTCGGCCGCGCG TGCAGCTCCT GCAGGACAGG GGGCGGGAGG GCCTGAGGGC 30600 GGGGGTGGCT TGGGGCGACT CCGGGAACCC CCAGGCGC AGGCCGTGGC GCCCTGGCAC 30660
  • ATTTAGCCAT CTATTACTGC GGCTAGTTAC TGTCCCGCCA GGACCAGACT CTGGACCTGC 30900 CTCGTGCGCT GCTGGGGACG CCCAGTAAAC ACGGGAGGAG CCCCCGACCC CCACCCCAGC 30960
  • CTTCTCCTCC GCCTGGCGGC TGAAGTTGTT ATTCTCCTCC AGCGCCTTGT GCAGCACCTC 31620
  • GGGCCTGCCC ACCTGGGCCC CCGTTTTCCC TCCCCATGGC TGCCTCTATC ATGTCTCTGT 33780 GAGACACGGA GCTGCCCAGC ACGCTCTCTT GTGTGTCTCC ACACCGCCGG CCCCTTCGTC 33840
  • CTGTCTTTCT CCCTGAGTGC ATCTTTCTGT GATTCCTTGT CACTGTGTGT CTTTCTGACT 34500
  • GATAGTCTCG ATCTCCTGAC CTCGTGATCC GCCCGCCTCG ACCTCCCAAA GTGCTGGGAT 36000
  • CTCCAGGCCC CTACCCTTCA GCTCATCCTT CCTTATCACA CATCCAAAAC TCTGAATGTG 37740
  • ATCTCGGCTC ACGGCAAGCT CCGCCTCCCG GATTCACGCC ATTCTCCTGC CTCAGCCTCC 38940
  • GGCTCACTGC AACCTCTGCC TCCCCAGTTC AAGTGATTCT CCTGCCTCAG CCTCCCAAGT 39240
  • TGAGGTTTCA CCATGTTGGC CAGGCTGGTC TTGAACTCCT GACCTCCGGT GATCTGCCCA 39360 CCTCAGCCTC CCAAAGTGCT GGGATGACAG GCGTGAGCCC CCGCGCCTGG CCCCCCGCAG 39420
  • TTCTCGCAGT GAGCTGGGCT TGTTTTGTCT CCCTGCTTCT CTTTGTACTA AACATTAGAT 39600 ACCGAGGAAA TGCGGATTGG CCTTTGGATG ATTCATGAGC AGGAGTCAGA AAAAGGCACC 39660
  • GGAGAGGTGC CAGCACCGTC ATCTCTACCC AGATAAGGAG ACCCAGGTCC TGAGAGGTTA 41040
  • CTCCCAGCTC AGCCCCAGGA ACCGAGCCCA TGGGGAGGGA CCGTCAGGGA AAGGCTGTCA 41220
  • TGACTCAGAC TTCAGCTCAG TCCACAGGAC AGCCTTTTCT GGCCACTGCT CTCAGGAGAT 46020 GAGATGTGTG GCTGCAAAAG GTAAACTCCT GGCTCCTGAG CAGGCTCTGG GCAATCTGCT 46080
  • AAATATAAAA AATTAGCCGG GCTTAGTGGT GCACACCTGT AATCCCAGCT ACTTGAGAGG 46320 CTGAGGCAGG AGAATCACTT GAACCCAGGA GGTGGAGGTT GCAGTGAGCC AAGATTGTGC 46380
  • TAGTCCCAGT TCCCGGGCGG GATTGAGGCT TAGAGAAGTT GAGTGATTTG CTGAGGGCTG 48060 CACGGGTTGG CATCCCGGCA TGCTCTTTCG CTACTTTGGC TGCATCTGGT TGCCCACCCG 48120
  • CTCTTCAGCC CTCACGCTCTCT TGTGGAAGTC GCGGAATTAC TGCAGGCGGA ACTTGCAGCA 48300
  • CTGTGGGCGT CTTTTCCAGA GAAGGACGGA GTTGTGGGGC GGGAGGATAA GGCAAGGCCC 48360 AGCCACTTCG CATCTTCGCC CCGCCAGCTC CTCGAGATGG GATATACCAG GGTTGCTCTC 48420
  • CTTCAGCTGC GCACTCTGCC CTTCCTCCCA CAGATCCACT TGTGCCGTAA GAAGGTGGCA 52140 AGTCGCTCCT GTCATTTCTA CAACAACGTA GAAGGTACAA GCAGCTGGGT GGGACCAGGG 52200
  • GCACACACCA CCACCCCCTG CTAATTTTTG TATTTTTAGT AGAGACGGGG TTTTACCATG 61200 TTGGCCAGGC TGGTCTTGAA CTCCTGACCT CGTGATCCGC CCGCCTCGGC CTCCCAAAGT 61260
  • CTTCCTGACT GCGGTGGCCG GGGGCTCCCA GGGCATCGTG GCCGTCTGTC TTGCTGAGCG 61440
  • GGTATTGTTC AGTAGTTCTG GTATTTTCCA AAGACCTATG TCTTCTCCCA GCCAGTATCA 63540 ACTTGGCCTC TACTGTGTAA AACTGGAAAA CTCTACTTTG TGAAGCTGAG TTGGGAGCAT 63600
  • AAAAATTAGC CAGGTGTGGT GGTGTGCTCC TGTGGTCCAA GCTTTTCTGG AGGCCGAAGT 63720
  • AAGTTGGCAT TTGTTTAGTA CAGAAGTTAT CAGGTGTTCT GGCTTTAGAA TCCCTTTATA 64620
  • CTCCTCTCCA CAGTCCCCCA ACCCCACCTC TCTAACGGGG TGGACGGCCG CCTCTTTCCA 64920
  • AACTTCAGTT TTCATCCCTA TCTGTTCCCC CACCCCTTTG GAGATGGGGT CTCACTCTGT 65220
  • CTATAGTCCC AGCTAGTCGG GAGACAGACA CGAGAATTGC TTGAACCTGG GACATGGAGG 66660
  • CTCTTCCTTT CCATGTTGGT GTCCTTTTTT CCATGCCAGG AATCCTGGTT CTCAAGGGCG 67200
  • CTGGTCCAGT CCGTCATTTG AGCACAGGTG CCTGTTAGAA CGAGACCTTC TTGTTAGGAC 68160 GATGAGTGTC CCAGCCACCA CCTCTTTTGG ACTCCGGGAG GCCTGGAACG TTCTGAACGC 68220
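To make the degeneracy point in the excerpts above concrete (more than one codon may encode a given amino acid, so a peptide can be encoded by a set of similar oligonucleotides), the following minimal Python sketch counts how many distinct DNA sequences could encode a short peptide under the standard genetic code. The function name `oligo_degeneracy` and the example peptide are hypothetical illustrations and are not taken from the NHL sequence.

```python
# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons in total).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def oligo_degeneracy(peptide: str) -> int:
    """Return how many distinct DNA sequences encode the given peptide."""
    total = 1
    for residue in peptide.upper():
        # KeyError here means the residue is not one of the 20 standard amino acids.
        total *= CODON_COUNTS[residue]
    return total


# Hypothetical 6-residue peptide, chosen only for illustration.
print(oligo_degeneracy("MWKDEF"))
```

Peptide stretches rich in single-codon residues (Met, Trp) give the smallest oligonucleotide pools, while residues such as Leu, Ser and Arg each multiply the pool size by six.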

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)
  • Enzymes And Modification Thereof (AREA)

Abstract

The present invention discloses isolated nucleic acid molecules (polynucleotides) which encode NHL, a putative DNA helicase. The present invention in turn relates to recombinant vectors and recombinant hosts which contain a DNA fragment encoding NHL, substantially purified forms of associated NHL, associated mutant proteins, and methods associated with identifying compounds which modulate NHL, which will be useful in the treatment of various neoplastic disorders. Both a genomic clone containing regulatory and intron sequences, as well as the exon structure and open reading frame of human NHL are disclosed.

Description

TITLE OF THE INVENTION
DNA MOLECULES ENCODING HUMAN NHL, A DNA HELICASE
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit, under 35 U.S.C. §119(e), of U.S. provisional application 60/169,970 filed December 9, 1999.
STATEMENT REGARDING FEDERALLY-SPONSORED R&D
Not Applicable
REFERENCE TO MICROFICHE APPENDIX
Not Applicable
FIELD OF THE INVENTION
The present invention relates in part to isolated nucleic acid molecules (polynucleotides) which encode NHL, a putative DNA helicase. The present invention also relates to recombinant vectors and recombinant hosts which contain a DNA fragment encoding NHL, substantially purified forms of associated NHL, associated mutant proteins, and methods associated with identifying compounds which modulate NHL, which will be useful in the treatment of various neoplastic disorders, given that this gene is located at 20q13.3 and immediately adjacent to M68/DcR3, which is involved in tumor growth. Also included within the present invention is a human genomic fragment representing this portion of the human genome, along with three additional genes (M68/DcR3, SCLIP, and ARP).
BACKGROUND OF THE INVENTION
Naumovski et al. (1985, Mol. Cell Biol. 5:17-26), Reynolds et al. (1985, Nucleic Acids Res. 13:2357-2372) and Weber et al. (1990, EMBO J. 9:1437-1447) disclose members of the RAD3/ERCC2 gene family of DNA helicases.
It is known that several chemotherapeutic agents inhibit helicases, including actinomycin C1, daunorubicin and nogalamycin (Tuteja, et al., 1997, Biochem. Biophys. Res. Comm. 236(3):636-640), and a prostate cancer drug, CI-958 (Lun, et al., 1998, Cancer Chemother. Pharmacol. 42(6):447-453). In addition, some topoisomerases have been shown to have anti-cancer activity.
Despite the identification of the aforementioned helicase-encoding genes and chemotherapeutic agents, it would be advantageous to identify additional genes which reside within chromosomal regions associated with a disease state such as cancer as well as a gene which encodes a type of protein which may be associated with that disease. The present invention addresses and meets this need by disclosing a DNA molecule encoding a DNA helicase with a chromosomal location suggestive of association with cancer.
SUMMARY OF THE INVENTION
The present invention relates to an isolated or purified nucleic acid molecule (polynucleotide) which encodes a novel mammalian DNA helicase.
The present invention also relates to an isolated nucleic acid molecule (polynucleotide) which encodes mRNA which expresses a novel human DNA helicase, NHL.
A preferred aspect of the present invention relates to an isolated or purified DNA molecule which encodes human NHL, the nucleotide sequence as set forth in Figure 1A-B and SEQ ID NO:1.
The present invention also relates to biologically active fragments or mutants of SEQ ID NO:1 which encode an mRNA molecule expressing a novel DNA helicase, NHL. Any such biologically active fragment and/or mutant will encode either a protein or protein fragment which at least substantially mimics the biological properties of the human NHL protein disclosed herein in Figure 2 and as set forth as SEQ ID NO:2. Any such polynucleotide includes but is not necessarily limited to nucleotide substitutions, deletions, additions, amino-terminal truncations and carboxy-terminal truncations such that these mutations encode mRNA which express a functional NHL protein in a host cell, so as to be useful for screening for agonists and/or antagonists of NHL activity. The present invention also relates to recombinant vectors and recombinant hosts, both prokaryotic and eukaryotic, which contain the substantially purified nucleic acid molecules disclosed throughout this specification.
The present invention also relates to a substantially purified form of a human NHL protein which comprises the amino acid sequence disclosed in Figure 2 and set forth as SEQ ID NO:2.
A preferred aspect of this portion of the present invention is a NHL protein which consists of the amino acid sequence disclosed in Figure 2 and set forth as SEQ ID NO:2.
Another preferred aspect of the present invention relates to a substantially purified NHL protein, preferably a human NHL protein, obtained from a recombinant host cell containing a DNA expression vector which comprises a nucleotide sequence as set forth in SEQ ID NO:1 and expresses the respective NHL protein. It is especially preferred that the recombinant host cell be a eukaryotic host cell, such as a mammalian cell line. The present invention also relates to biologically active fragments and/or mutants of a NHL protein comprising the amino acid sequence as set forth in SEQ ID NO:2, including but not necessarily limited to amino acid substitutions, deletions, additions, amino-terminal truncations and carboxy-terminal truncations such that these mutations provide for proteins or protein fragments of diagnostic, therapeutic or prophylactic use and would be useful for screening for selective modulators, including but not limited to agonists and/or antagonists for human NHL pharmacology.
A preferred aspect of the present invention is disclosed in Figure 2 and is set forth as SEQ ID NO:2, the amino acid sequence of human NHL. Characterization of one or more of these DNA helicase-like proteins allows for screening methods to identify novel NHL modulators that may be useful in the treatment of human neoplastic disorders. The modulators selected through such screening and selection protocols may be used alone or in conjunction with other cancer therapies. As noted above, heterologous expression of a NHL protein will allow the pharmacological analysis of compounds which modulate NHL activity and hence may be useful in various cancer therapies. To this end, heterologous cell lines expressing a NHL protein can be used to establish functional or binding assays to identify novel NHL modulators.
The present invention also relates to polyclonal and monoclonal antibodies raised in response to either the NHL or a biologically active fragment of NHL.
The present invention relates to transgenic mice comprising altered genotypes and phenotypes in relation to NHL and its in vivo activity.
The present invention also relates to NHL fusion constructs, including but not limited to fusion constructs which express a portion of the NHL protein linked to various markers, including but in no way limited to GFP (Green fluorescent protein), the MYC epitope, and GST. Any such fusion constructs may be expressed in the cell line of interest and used to screen for NHL modulators.
Therefore, the present invention relates to methods of expressing mammalian NHL, and preferably human NHL, biological equivalents disclosed herein, assays employing these gene products, recombinant host cells which comprise DNA constructs which express these proteins, and compounds identified through these assays which act as agonists or antagonists of NHL activity.
The present invention also relates to the isolated genomic sequence which comprises SEQ ID NO:1, a 115 kb genomic fragment set forth herein as SEQ ID NO:3. An especially preferred aspect of this portion of the invention is the region of the genomic fragment of SEQ ID NO:3 which comprises the regulatory and coding regions of human NHL, as well as intervening sequences (introns). This 115 kb fragment contains at least the coding region of four genes, NHL, M68/DcR3, SCLIP and ARP. As discussed herein, it has been shown that this region of chromosome 20 is associated with tumor growth. Therefore, an aspect of this invention also comprises the use of one or more regions of this 115 kb genomic sequence to identify compounds which up- or downregulate expression of one or more of the genes localized within this 115 kb region, wherein this up- or downregulation results in an interference of tumor growth. For example, a transcription element of one of these four genes may be responsible for M68/DcR3 (and/or NHL) overexpression in tumors, and if M68 or NHL overexpression in tumors has a causative role, blockage of M68/DcR3 or NHL overexpression in tumors by interfering with this transcription site will be useful.
It is an object of the present invention to provide an isolated nucleic acid molecule (e.g., SEQ ID NO:1) which encodes a novel form of human NHL, or fragments, mutants or derivatives of human NHL as set forth in Figure 2 and SEQ ID NO:2. Any such polynucleotide includes but is not necessarily limited to nucleotide substitutions, deletions, additions, amino-terminal truncations and carboxy-terminal truncations such that these mutations encode mRNA which express a protein or protein fragment of diagnostic, therapeutic or prophylactic use and would be useful for screening for selective modulators of human NHL activity.
It is a further object of the present invention to provide the mammalian, and especially human, NHL proteins or protein fragments encoded by the nucleic acid molecules referred to in the preceding paragraph.
It is a further object of the present invention to provide recombinant vectors and recombinant host cells which comprise a nucleic acid sequence encoding mammalian, and especially human, NHL protein and biological equivalents thereof.
It is an object of the present invention to provide a substantially purified form of human NHL, as set forth in Figure 2 and SEQ ID NO:2.
It is another object of the present invention to provide a substantially purified recombinant form of a NHL protein which has been obtained from a recombinant host cell transformed or transfected with a DNA expression vector which comprises and appropriately expresses a complete open reading frame as set forth in SEQ ID NO:1, resulting in a functional, processed form of NHL. It is especially preferred that the recombinant host cell be a eukaryotic host cell, such as a mammalian cell line.
It is an object of the present invention to provide for biologically active fragments and/or mutants of mammalian, and especially human, NHL, such as set forth in SEQ ID NO:2, including but not necessarily limited to amino acid substitutions, deletions, additions, amino terminal truncations and carboxy-terminal truncations such that these mutations provide for proteins or protein fragments of diagnostic, therapeutic and/or prophylactic use.
It is also an object of the present invention to use NHL proteins or biological equivalents to screen for modulators, preferably selective modulators, of human NHL activity. Any such compound may be useful in screening for and selecting compounds active against human neoplastic disorders.
As used herein, "substantially free from other nucleic acids" means at least 90%, preferably 95%, more preferably 99%, and even more preferably 99.9%, free of other nucleic acids. Thus, a human NHL DNA preparation that is substantially free from other nucleic acids will contain, as a percent of its total nucleic acid, no more than 10%, preferably no more than 5%, more preferably no more than 1%, and even more preferably no more than 0.1%, of non-NHL nucleic acids. Whether a given NHL DNA preparation is substantially free from other nucleic acids can be determined by such conventional techniques of assessing nucleic acid purity as, e.g., agarose gel electrophoresis combined with appropriate staining methods, e.g., ethidium bromide staining, or by sequencing.
As used herein, "substantially free from other proteins" or "substantially purified" means at least 90%, preferably 95%, more preferably 99%, and even more preferably 99.9%, free of other proteins. Thus, a NHL protein preparation that is substantially free from other proteins will contain, as a percent of its total protein, no more than 10%, preferably no more than 5%, more preferably no more than 1%, and even more preferably no more than 0.1%, of non-NHL proteins. Whether a given NHL protein preparation is substantially free from other proteins can be determined by such conventional techniques of assessing protein purity as, e.g., sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) combined with appropriate detection methods, e.g., silver staining or immunoblotting. As used interchangeably with the terms "substantially free from other proteins" or "substantially purified", the terms "isolated NHL protein" or "purified NHL protein" also refer to NHL protein that has been isolated from a natural source. Use of the term "isolated" or "purified" indicates that NHL protein has been removed from its normal cellular environment. Thus, an isolated NHL protein may be in a cell-free solution or placed in a different cellular environment from that in which it occurs naturally. The term isolated does not imply that an isolated NHL protein is the only protein present, but instead means that an isolated NHL protein is substantially free of other proteins and non-amino acid material (e.g., nucleic acids, lipids, carbohydrates) naturally associated with the NHL protein in vivo. Thus, a NHL protein that is recombinantly expressed in a prokaryotic or eukaryotic cell and substantially purified from this host cell which does not naturally (i.e., without intervention) express this protein is of course "isolated NHL protein" under any circumstances referred to herein.
As noted above, a NHL protein preparation that is an isolated or purified NHL protein will be substantially free from other proteins and will contain, as a percent of its total protein, no more than 10%, preferably no more than 5%, more preferably no more than 1%, and even more preferably no more than 0.1%, of non-NHL proteins.
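The purity tiers defined above (no more than 10%, 5%, 1% or 0.1% of non-NHL material) can be expressed as a small classification. The following Python sketch is illustrative only: the function name `purity_tier`, the tier labels, and the input format (amount of NHL material and total amount, in the same units) are assumptions and do not describe how purity is actually measured from a gel or a sequencing run.

```python
# (maximum contaminant percentage, label), ordered from strictest to loosest.
TIERS = [
    (0.1, "99.9% free"),
    (1.0, "99% free"),
    (5.0, "95% free"),
    (10.0, "90% free"),
]


def purity_tier(nhl_amount: float, total_amount: float) -> str:
    """Return the strictest purity tier met by a preparation."""
    if total_amount <= 0 or not (0 <= nhl_amount <= total_amount):
        raise ValueError("require total_amount > 0 and 0 <= nhl_amount <= total_amount")
    # Percentage of the preparation that is not NHL material.
    contaminant_pct = 100.0 * (total_amount - nhl_amount) / total_amount
    for limit, label in TIERS:
        if contaminant_pct <= limit:
            return label
    return "not substantially free"


# Hypothetical preparation: 97 units of NHL protein in 100 units of total protein.
print(purity_tier(97.0, 100.0))
```

A preparation with 3% contaminants meets the "preferably 95%" tier but not the "more preferably 99%" tier, which is what the loop over ordered thresholds captures.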
As used interchangeably herein, "functional equivalent" or "biologically active equivalent" means a protein which does not have exactly the same amino acid sequence as naturally occurring NHL, due to alternative splicing, deletions, mutations, substitutions, or additions, but retains substantially the same biological activity as NHL. Such functional equivalents will have significant amino acid sequence identity with naturally occurring NHL and genes and cDNA encoding such functional equivalents can be detected by reduced stringency hybridization with a DNA sequence encoding naturally occurring NHL. For example, a naturally occurring NHL disclosed herein comprises the amino acid sequence shown as SEQ ID NO:2 and is encoded by SEQ ID NO:1. A nucleic acid encoding a functional equivalent has at least about 50% identity at the nucleotide level to SEQ ID NO:1.
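The "at least about 50% identity at the nucleotide level" criterion presupposes an alignment between the candidate sequence and SEQ ID NO:1. A minimal Python sketch of the identity calculation over two pre-aligned sequences follows. The function name `percent_identity`, the use of "-" for gaps, and the choice to compute identity over non-gap columns only are assumptions, since the text does not specify how identity is to be computed; the toy sequences are not NHL sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of non-gap alignment columns in which both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, for illustration only.
identity = percent_identity("ATGCC-GATA", "ATGACTGTTA")
print(f"{identity:.1f}% identity; meets 50% threshold: {identity >= 50.0}")
```

Other conventions (for example, counting gap columns in the denominator) give lower values for the same alignment, so any threshold should be quoted together with the convention used.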
As used herein, "a conservative amino acid substitution" refers to the replacement of one amino acid residue by another, chemically similar, amino acid residue. Examples of such conservative substitutions are: substitution of one hydrophobic residue (isoleucine, leucine, valine, or methionine) for another; substitution of one polar residue for another polar residue of the same charge (e.g., arginine for lysine; glutamic acid for aspartic acid).
As used herein, the term "mammalian" will refer to any mammal, including a human being.
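The examples of conservative amino acid substitution given above can be encoded as a simple lookup. The Python sketch below covers only the groups named in the text (hydrophobic Ile, Leu, Val, Met; positively charged Arg, Lys; negatively charged Glu, Asp). The function name `is_conservative` and the one-letter residue codes are illustrative assumptions, and a fuller treatment would typically rely on a substitution matrix rather than fixed groups.

```python
# Residue groups taken from the examples in the definition (one-letter codes).
CONSERVATIVE_GROUPS = [
    frozenset("ILVM"),  # hydrophobic: isoleucine, leucine, valine, methionine
    frozenset("RK"),    # polar, positively charged: arginine, lysine
    frozenset("ED"),    # polar, negatively charged: glutamic acid, aspartic acid
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` stays within one group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


print(is_conservative("R", "K"))  # arginine for lysine
print(is_conservative("R", "E"))  # opposite charges
```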
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1A-B shows the nucleotide sequence which comprises the open reading frame which encodes human NHL, the nucleotide sequence set forth as SEQ ID NO:1. The initiating Met residue (ATG) and the stop codon (TAG) are underlined.
Figure 2 shows the amino acid sequence of human NHL as set forth in SEQ ID NO:2.
Figure 3 shows the alignment of amino acid sequences of human NHL to ERCC2/RAD3 gene family members. Rep D (Dictyostelium discoideum); RAD 3 (S. cerevisiae); RAD 15 (S. pombe) and XP_GroupD (Homo sapiens).
Figure 4 shows Northern analysis of NHL expression in multiple human tissues.
Figure 5A-B shows the genomic structure of the NHL gene (Figure 5A) and the entire 115 kb genomic region (Figure 5B) containing the NHL, M68/DcR3, SCLIP and ARP genes.
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to an isolated or purified nucleic acid molecule (polynucleotide) which encodes a novel mammalian DNA helicase. An especially preferred aspect of this invention relates to an isolated nucleic acid molecule (polynucleotide) which encodes mRNA which expresses a novel human DNA helicase, NHL.
The gene M68/DcR3 is a secreted TNFR member that is overexpressed in a number of human tumors. M68/DcR3 is located at 20ql3.3, a known site that is associated with frequent gene amplification in cancer. M68 DcR3 protein binds to FASL and inhibit FAS mediated apoptosis. Thus, genes tightly linked to M68/DcR3 may be coregulated (e.g. co overexpressed and/or amplified in tumors). During the course of cloning the genomic M68/DcR3 fragment and identifying genes that are linked to M68/DcR3 at 20ql3.3, three genes, including a novel gene that is similar to the Rad3/ERCC2 helicase family, were identified (termed NHL) in the immediately adjacent (overlapping) region. Given NHL's chromosomal location and the frequent association of DNA helicases with human genetic disorders (mutations in DNA helicases have been found associated with multiple diseases, including xeroderma pigmentosum, Cockayne's syndrome, Bloom's syndrome, and Werner's syndrome), NHL is a candidate for contribution to certain human neoplastic disorders. To this end, the genomic clone for this gene is disclosed and the complete sequence is determined. The transcript was identified through exon prediction using GRAIL2 and sequence alignment to a contiguous 4.5 kilobase region of chromosome 4 (88% sequence identity). The complete exon structure of NHL was subsequently confirmed by RT-PCR analysis. Multiple sequence alignment of NHL to known helicases showed that NHL contains all the seven critical helicase domains. BLAST analysis of the predicted 1,219 amino acid sequence revealed an approximately 26% sequence identity and 48% sequence similarity to the RAD3/ERCC2 gene family of DNA helicases (Naumovski et al., 1985 Mol. Cell Biol. 5:17-26; Reynolds et al., 1985 Nucleic Acid Res 13:2357-72; Weber et al., 1990 EMBO J. 9:1437-1447). The mRNA expression pattern of NHL was also examined in multiple human tissues. Radiation hybrid chromosomal mapping reconfirms that it is linked to M68/DcR3 locus. 
A preferred aspect of the present invention relates to an isolated or purified DNA molecule which encodes human NHL, the nucleotide sequence as set forth in Figure 1A-B and SEQ ID NO:1, which is as follows:
AGTCAGCCCT GCTGCCAGCC AGTGCCGGGT GCTGGGGACT CAGGGAGGCC CGCCGGGACC ACTGCGGGAC AGTGAGCCGA GCAGAAGCTG GAACGCAGGA GAGGAAGGAG AGGGGGCGGT CAGGGCTCTC AGGAGCCGGG TCCTGGGCAA GGCGCAGCCG TTTTCAAATT TTCAGGAAAG CGGTCGGCTC ACACTCGAGC AGTAAAAAGA TGCCTCTGGG GAGGAGGCCC GTGCAGCTCT CCGGGCAATG GTGGTGGCTC GGCCTAGAGA GGCGGTAGTG GAACGCAGAC CCTGGTGGGG GAATGACATC AAGGGAGGAG ACGGGCGGGA CCCCAGATTT CTGCCTGTGG GCGATGGAAG TGAGGTTCAC TGGCCAGCGG AGCCGGACAC AGAACGCGCA AAACGCCGTG TAGGCCTGGA GGAGCCGAAG AGCAGGCGGA CCCCCTCCGC GGGGGAACAG TTTCCGCCGG GAGCACAAAG CAACGGACCG GAAGTGGGGG GCGGAAGTGC AGTGGGCTCA GCGCCGACTG CGCGCCTCTG CCCGCGAAAA CTCTGAGCTG GCTGACAGCT GGGGACGGGT GGCGGCCCTC GACTGGAGTC GGTTGAGTTC CTGAGGGACC CCGGTTCTGG AAGGTTCGCC GCGGAGACAA GTGAGCAGTC TGTGCCATAG GGATTCTCGA AGAGAACAGC GTTGTGTCCC AGTGCACATG CTCGCATCGC TTACCAGGAG TGCCCGAGAC CCTAAGATGT TCGGAGTGGT TTTTTCGCAC AGACCCGAAT AGCCTGCCCC TCAGCCACGC TCTGTGCCCT TCTGAGAACA GGCTGATATG CCCAAGATAG TCCTGAATGG TGTGACCGTA GACTTCCCTT TCCAGCCCTA CAAATGCCAA CAGGAGTACA TGACCAAGGT CCTGGAATGT CTGCAGCAGA AGGTGAATGG CATCCTGGAG AGCCCTACGG GTACAGGGAA GACGCTGTGC CTGCTGTGCA CCACGCTGGC CTGGCGAGAA CACCTCCGAG ACGGCATCTC TGCCCGCAAG ATTGCCGAGA GGGCGCAAGG AGAGCTTTTC CCGGATCGGG CCTTGTCATC CTGGGGCAAC GCTGCTGCTG CTGCTGGAGA CCCCATAGCT TGCTACACGG ACATCCCAAA GATTATTTAC GCCTCCAGGA CCCACTCGCA ACTCACACAG GTCATCAACG AGCTTCGGAA CACCTCCTAC CGGCCTAAGG TGTGTGTGCT GGGCTCCCGG GAGCAGCTGT GCATCCATCC TGAGGTGAAG AAACAAGAGA GTAACCATCT ACAGATCCAC TTGTGCCGTA AGAAGGTGGC AAGTCGCTCC TGTCATTTCT ACAACAACGT AGAAGAAAAA AGCCTGGAGC AGGAGCTGGC CAGCCCCATC CTGGACATTG AGGACTTGGT CAAGAGCGGA AGCAAGCACA GGGTGTGCCC TTACTACCTG TCCCGGAACC TGAAGCAGCA AGCCGACATC ATATTCATGC CGTACAATTA CTTGTTGGAT GCCAAGAGCC GCAGAGCACA CAACATTGAC CTGAAGGGGA CAGTCGTGAT CTTTGACGAA GCTCACAACG TGGAGAAGAT GTGTGAAGAA TCGGCATCCT TTGACCTGAC TCCCCATGAC CTGGCTTCAG GACTGGACGT CATAGACCAG GTGCTGGAGG AGCAGACCAA GGCAGCGCAG CAGGGTGAGC CCCACCCGGA GTTCAGCGCG GACTCCCCCA GCCCAGGGCT GAACATGGAG CTGGAAGACA TTGCAAAGCT GAAGATGATC CTGCTGCGCC TGGAGGGGGC 
CATCGATGCT GTTGAGCTGC CTGGAGACGA CAGCGGTGTC ACCAAGCCAG GGAGCTACAT CTTTGAGCTG TTTGCTGAAG CCCAGATCAC GTTTCAGACC AAGGGCTGCA TCCTGGACTC GCTGGACCAG ATCATCCAGC ACCTGGCAGG ACGTGCTGGA GTGTTCACCA ACACGGCCGG ACTGCAGAAG CTGGCGGACA TTATCCAGAT TGTGTTCAGT GTGGACCCCT CCGAGGGCAG CCCTGGTTCC CCAGCAGGGC TGGGGGCCTT ACAGTCCTAT AAGGTGCACA TCCATCCTGA TGCTGGTCAC CGGAGGACGG CTCAGCGGTC TGATGCCTGG AGCACCACTG CAGCCAGAAA GCGAGGGAAG GTGCTGAGCT ACTGGTGCTT CAGTCCCGGC CACAGCATGC ACGAGCTGGT CCGCCAGGGC GTCCGCTCCC TCATCCTTAC CAGCGGCACG CTGGCCCCGG TGTCCTCCTT TGCTCTGGAG ATGCAGATCC CTTTCCCAGT CTGCCTGGAG AACCCACACA TCATCGACAA GCACCAGATC TGGGTGGGGG TCGTCCCCAG AGGCCCCGAT GGAGCCCAGT TGAGCTCCGC GTTTGACAGA CGGTTTTCCG AGGAGTGCTT ATCCTCCCTG GGGAAGGCTC TGGGCAACAT CGCCCGCGTG GTGCCCTATG GGCTCCTGAT CTTCTTCCCT TCCTATCCTG TCATGGAGAA GAGCCTGGAG TTCTGGCGGG CCCGCGACTT GGCCAGGAAG ATGGAGGCGC TGAAGCCGCT GTTTGTGGAG CCCAGGAGCA AAGGCAGCTT CTCCGAGACC ATCAGTGCTT ACTATGCAAG GGTTGCCGCC CCTGGGTCCA CCGGCGCCAC CTTCCTGGCG GTCTGCCGGG GCAAGGCCAG CGAGGGGCTG GACTTCTCAG ACACGAATGG CCGTGGTGTG ATTGTCACGG GCCTCCCGTA CCCCCCACGC ATGGACCCCC GGGTTGTCCT CAAGATGCAG TTCCTGGATG AGATGAAGGG CCAGGGTGGG GCTGGGGGCC AGTTCCTCTC TGGGCAGGAG TGGTACCGGC AGCAGGCGTC CAGGGCTGTG AACCAGGCCA TCGGGCGAGT GATCCGGCAC CGCCAGGACT ACGGAGCTGT CTTCCTCTGT GACCACAGGT TCGCCTTTGC CGACGCAAGA GCCCAACTGC CCTCCTGGGT GCGTCCCCAC GTCAGGGTGT ATGACAACTT TGGCCATGTC ATCCGAGACG TGGCCCAGTT CTTCCGTGTT GCCGAGCGAA CTATGCCAGC GCCGGCCCCC CGGGCTACAG CACCCAGTGT GCGTGGAGAA GATGCTGTCA GCGAGGCCAA GTCGCCTGGC CCCTTCTTCT CCACCAGGAA AGCTAAGAGT CTGGACCTGC ATGTCCCCAG CCTGAAGCAG AGGTCCTCAG GGTCACCAGC TGCCGGGGAC CCCGAGAGTA GCCTGTGTGT GGAGTATGAG CAGGAGCCAG TTCCTGCCCG GCAGAGGCCC AGGGGGCTGC TGGCCGCCCT GGAGCACAGC GAACAGCGGG CGGGGAGCCC TGGCGAGGAG CAGGCCCACA GCTGCTCCAC CCTGTCCCTC CTGTCTGAGA AGAGGCCGGC AGAAGAACCG CGAGGAGGGA GGAAGAAGAT CCGGCTGGTC AGCCACCCGG AGGAGCCCGT GGCTGGTGCA CAGACGGACA GGGCCAAGCT CTTCATGGTG GCCGTGAAGC AGGAGTTGAG CCAAGCCAAC TTTGCCACCT TCACCCAGGC CCTGCAGGAC TACAAGGGTT CCGATGACTT CGCCGCCCTG 
GCCGCCTGTC TCGGCCCCCT CTTTGCTGAG GACCCCAAGA AGCACAACCT GCTCCAAGGC TTCTACCAGT TTGTGCGGCC CCACCATAAG CAGCAGTTTG AGGAGGTCTG TATCCAGCTG ACAGGACGAG GCTGTGGCTA TCGGCCTGAG CACAGCATTC CCCGAAGGCA GCGGGCACAG CCGGTCCTGG ACCCCACTGG AAGAACGGCG CCGGATCCCA AGCTGACCGT GTCCACGGCT GCAGCCCAGC AGCTGGACCC CCAAGAGCAC CTGAACCAGG GCAGGCCCCA CCTGTCGCCC AGGCCACCCC CAACAGGAGA CCCTGGCAGC CAACCACAGT GGGGGTCTGG AGTGCCCAGA GCAGGGAAGC AGGGCCAGCA CGCCGTGAGC GCCTACCTGG CTGATGCCCG CAGGGCCCTG GGGTCCGCGG GCTGTAGCCA ACTCTTGGCA GCGCTGACAG CCTATAAGCA AGACGACGAC CTCGACAAGG TGCTGGCTGT GTTGGCCGCC CTGACCACTG CAAAGCCAGA GGACTTCCCC CTGCTGCACA GGTTCAGCAT GTTTGTGCGT CCACACCACA AGCAGCGCTT CTCACAGACG TGCACAGACC TGACCGGCCG GCCCTACCCG GGCATGGAGC CACCGGGACC CCAGGAGGAG AGGCTTGCCG TGCCTCCTGT GCTTACCCAC AGGGCTCCCC AACCAGGCCC CTCACGGTCC GAGAAGACCG GGAAGACCCA GAGCAAGATC TCGTCCTTCC TTAGACAGAG GCCAGCAGGG ACTGTGGGGG CGGGCGGTGA GGATGCAGGT CCCAGCCAGT CCTCAGGACC TCCCCACGGG CCTGCAGCAT CTGAGTGGGG CCTCTAGGAT GTGCCCAGCC TGCCACACCG CCTCCAGGAA GCAGAGCGTC ATGCAGGTCT TCTGGCCAGA GCCCCAGTGA GTGCCCACGG AGGCCCCCAG CACACCCAAC GTGGCTTGAT CACCTGCCTG TCCAGCTCTG GTGGGCCAAG AACCCACCCA ACAGAATAGG CCAGCCCATG CCAGCCGGCT TGGCCCGCTG CAGGCCTCAG GCAGGCGGGG CCCATGGTTG GTCCCTGCGG TGGGACCGGA TCTGGGCCTG CCTCTGAGAA GCCCTGAGCT ACCTTGGGGT CTGGGGTGGG TTTCTGGGAA AGTGCTTCCC CAGAACTTCC CTGGCTCCTG GCCTGTGAGT GGTGCCACAG GGGCACCCCA GCTGAGCCCC TCACCGGGAA GGAGGAGACC CCCGTGGGCA CGTGTCCACT TTTAATCAGG GGACAGGGCT CTCTAATAAA GCTGCTGGCA GTGCCC ( SEQ ID NO : 1 ) .
The above-exemplified isolated DNA molecule shown in Figure 1A-B and SEQ ID NO:1 comprises 4946 nucleotides, with an initiating Met at nucleotides 828-830 and a "TAG" termination codon at nucleotides 4585-4587. The initiating Met and TAG termination codon are underlined.
The present invention also relates to biologically active fragments or mutants of SEQ ID NO:1 which encode a mRNA molecule expressing a novel DNA helicase, NHL. Any such biologically active fragment and/or mutant will encode either a protein or protein fragment which at least substantially mimics the biological properties of the human NHL protein disclosed herein in Figure 2 and as set forth as SEQ ID NO:2. Any such polynucleotide includes but is not necessarily limited to nucleotide substitutions, deletions, additions, amino-terminal truncations and carboxy-terminal truncations such that these mutations encode mRNA which express a functional NHL protein in a host cell, so as to be useful for screening for agonists and/or antagonists of NHL activity.
The isolated nucleic acid molecules of the present invention may include a deoxyribonucleic acid molecule (DNA), such as genomic DNA and complementary DNA (cDNA), which may be single (coding or noncoding strand) or double stranded, as well as synthetic DNA, such as a synthesized, single stranded polynucleotide. The isolated nucleic acid molecule of the present invention may also include a ribonucleic acid molecule (RNA). The present invention also relates to recombinant vectors and recombinant hosts, both prokaryotic and eukaryotic, which contain the substantially purified nucleic acid molecules disclosed throughout this specification.
The degeneracy of the genetic code is such that, for all but two amino acids, more than a single codon encodes a particular amino acid. This allows for the construction of synthetic DNA that encodes the NHL protein where the nucleotide sequence of the synthetic DNA differs significantly from the nucleotide sequence of SEQ ID NO:1 but still encodes the same NHL protein as SEQ ID NO:2. Such synthetic DNAs are intended to be within the scope of the present invention. If it is desired to express such synthetic DNAs in a particular host cell or organism, the codon usage of such synthetic DNAs can be adjusted to reflect the codon usage of that particular host, thus leading to higher levels of expression of the NHL protein in the host. In other words, this redundancy in the various codons which code for specific amino acids is within the scope of the present invention. Therefore, this invention is also directed to those DNA sequences which encode RNA comprising alternative codons which code for the eventual translation of the identical amino acid, as shown below:
A=Ala=Alanine: codons GCA, GCC, GCG, GCU
C=Cys=Cysteine: codons UGC, UGU
D=Asp=Aspartic acid: codons GAC, GAU
E=Glu=Glutamic acid: codons GAA, GAG
F=Phe=Phenylalanine: codons UUC, UUU
G=Gly=Glycine: codons GGA, GGC, GGG, GGU
H=His=Histidine: codons CAC, CAU
I=Ile=Isoleucine: codons AUA, AUC, AUU
K=Lys=Lysine: codons AAA, AAG
L=Leu=Leucine: codons UUA, UUG, CUA, CUC, CUG, CUU
M=Met=Methionine: codon AUG
N=Asn=Asparagine: codons AAC, AAU
P=Pro=Proline: codons CCA, CCC, CCG, CCU
Q=Gln=Glutamine: codons CAA, CAG
R=Arg=Arginine: codons AGA, AGG, CGA, CGC, CGG, CGU
S=Ser=Serine: codons AGC, AGU, UCA, UCC, UCG, UCU
T=Thr=Threonine: codons ACA, ACC, ACG, ACU
V=Val=Valine: codons GUA, GUC, GUG, GUU
W=Trp=Tryptophan: codon UGG
Y=Tyr=Tyrosine: codons UAC, UAU
Therefore, the present invention discloses codon redundancy which may result in differing DNA molecules expressing an identical protein. For purposes of this specification, a sequence bearing one or more replaced codons will be defined as a degenerate variation. Also included within the scope of this invention are mutations either in the DNA sequence or the translated protein which do not substantially alter the ultimate physical properties of the expressed protein. For example, substitution of valine for leucine, arginine for lysine, or asparagine for glutamine may not cause a change in functionality of the polypeptide.
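By way of illustration only, the codon redundancy described above can be sketched in Python. The sketch builds the standard genetic code, translates the first five codons of the NHL open reading frame as they appear in SEQ ID NO:1 (written here as RNA), and enumerates every degenerate variation encoding the same five residues. The function names `translate` and `degenerate_variants` are hypothetical and chosen for this example; they are not part of the disclosure.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons are ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse lookup: amino acid -> list of synonymous codons ("*" marks stop codons).
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def translate(rna):
    """Translate an RNA string codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TO_AA[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def degenerate_variants(peptide):
    """Yield every RNA sequence that encodes the given peptide (hypothetical helper)."""
    for codons in product(*(AA_TO_CODONS[aa] for aa in peptide)):
        yield "".join(codons)


# First five codons of the NHL coding region (ATG CCC AAG ATA GTC), written as RNA.
native = "AUGCCCAAGAUAGUC"
variants = list(degenerate_variants(translate(native)))
print(translate(native), len(variants))
```

Enumeration grows multiplicatively with peptide length, so a sketch of this kind is practical only for short stretches; for a full-length coding sequence one would choose a single codon per residue according to the codon usage of the intended host.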
It is known that DNA sequences coding for a peptide may be altered so as to code for a peptide having properties that are different from those of the naturally occurring peptide. Methods of altering the DNA sequences include but are not limited to site-directed mutagenesis. Examples of altered properties include but are not limited to changes in the affinity of an enzyme for a substrate or a receptor for a ligand.
The present invention also relates to recombinant vectors and recombinant hosts, both prokaryotic and eukaryotic, which contain the substantially purified nucleic acid molecules disclosed throughout this specification. The nucleic acid molecules of the present invention encoding a NHL protein, in whole or in part, can be linked with other DNA molecules, i.e., DNA molecules to which the NHL coding sequence is not naturally linked, to form "recombinant DNA molecules" which encode a respective NHL protein. The novel DNA sequences of the present invention can be inserted into vectors which comprise nucleic acids encoding NHL or a functional equivalent. These vectors may be comprised of DNA or RNA; for most cloning purposes DNA vectors are preferred. Typical vectors include plasmids, modified viruses, bacteriophage, cosmids, yeast artificial chromosomes, and other forms of episomal or integrated DNA that can encode a NHL protein. It is well within the purview of the skilled artisan to determine an appropriate vector for a particular gene transfer or other use.
Included in the present invention are DNA sequences that hybridize to SEQ ID NO:1 under stringent conditions. By way of example, and not limitation, a procedure using conditions of high stringency is as follows: Prehybridization of filters containing DNA is carried out for 2 hours to overnight at 65°C in buffer composed of 6X SSC, 5X Denhardt's solution, and 100 μg/ml denatured salmon sperm DNA. Filters are hybridized for 12 to 48 hrs at 65°C in prehybridization mixture containing 100 μg/ml denatured salmon sperm DNA and 5-20 X 10^6 cpm of 32P-labeled probe. Washing of filters is done at 37°C for 1 hr in a solution containing 2X SSC, 0.1% SDS. This is followed by a wash in 0.1X SSC, 0.1% SDS at 50°C for 45 min. before autoradiography. Other procedures using conditions of high stringency would include either a hybridization step carried out in 5X SSC, 5X Denhardt's solution, 50% formamide at 42°C for 12 to 48 hours or a washing step carried out in 0.2X SSPE, 0.2% SDS at 65°C for 30 to 60 minutes.
Reagents mentioned in the foregoing procedures for carrying out high stringency hybridization are well known in the art. Details of the composition of these reagents can be found in, e.g., Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, New York. In addition to the foregoing, other conditions of high stringency which may be used are well known in the art.
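As a hedged aid to the procedure above, the following Python sketch converts the SSC strengths named in the hybridization and wash steps into molar concentrations and into the volume of concentrated stock needed. It assumes the conventional definition of 1X SSC as 0.15 M sodium chloride and 0.015 M sodium citrate (i.e., a 20X stock of 3 M NaCl and 0.3 M sodium citrate); this composition is not stated in the present specification and should be checked against the laboratory manual cited above. The function names are hypothetical.

```python
# Assumed composition of 1X SSC (conventional definition; not given in this specification).
SSC_1X_MOLAR = {"NaCl": 0.150, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each SSC solute at the given strength (e.g. 6 for 6X)."""
    return {solute: round(fold * molar, 4) for solute, molar in SSC_1X_MOLAR.items()}


def stock_volume_ml(target_fold, final_volume_ml, stock_fold=20.0):
    """Volume of concentrated stock (default 20X) needed to prepare the final volume."""
    return final_volume_ml * target_fold / stock_fold


# SSC strengths named in the high-stringency procedure above.
steps = {"prehybridization/hybridization": 6.0, "first wash": 2.0, "second wash": 0.1}

for name, fold in steps.items():
    print(name, ssc_molarity(fold), f"{stock_volume_ml(fold, 500):.1f} ml of 20X stock per 500 ml")
```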
The present invention also relates to a substantially purified form of a human NHL protein which comprises the amino acid sequence (1219 amino acid residues) disclosed in Figure 2 and set forth as SEQ ID NO:2. A preferred aspect of this portion of the present invention is a NHL protein which consists of the amino acid sequence disclosed in Figure 2 and set forth as SEQ ID NO:2, as follows:
MPKIVLNGVT VDFPFQPYKC QQEY TKVLE CLQQKVNGIL ESPTGTGKTL CLLCTT AWR
EHLRDGISAR KIAERAQGEL FPDRALSS G NAAAAAGDPI ACYTDIPKII YASRTHSQLT
QVINELRNTS YRPKVCVLGS REQLCIHPEV KKQESNHLQI HLCRKKVASR SCHFYNNVEE
KS EQELASP ILDIEDLVKS GSKHRVCPYY LSRNLKQQAD IIFMPYNYLL DAKSRRAHNI
D KGTWIFD EAHNVEKMCE ESASFD TPH D ASG DVID QVLEEQTKAA QQGEPHPEFS
ADSPSPG NM ELEDIAKLKM IL R EGAID AVE PGDDSG VTKPGSYIFE FAEAQITFQ
TKGCI DS D QIIQHLAGRA GVFTNTAGLQ K ADIIQIVF SVDPSEGSPG SPAGLGALQS
YKVHIHPDAG HRRTAQRSDA STTAARKRG KVLSYWCFSP GHSMHE VRQ GVRS ILTSG
T APVSSFAL EMQIPFPVCL ENPHIIDKHQ IWVGWPRGP DGAQLSSAFD RRFSEEC SS
LGKALGNIAR WPYG LIFF PSYPVMEKSL EF RARDLAR KMEALKPLFV EPRSKGSFSE
TISAYYARVA APGSTGATFL AVCRGKASEG DFSDTNGRG VIVTGLPYPP RMDPRWLKM
QFLDEMKGQG GAGGQF SGQ E YRQQASRA VNQAIGRVIR HRQDYGAVFL CDHRFAFADA
RAQLPSWVRP HVRVYDNFGH VIRDVAQFFR VAERTMPAPA PRATAPSVRG EDAVSEAKSP
GPFFSTRKAK SLD HVPSLK QRSSGSPAAG DPESSLCVEY EQEPVPARQR PRGLLAA EH
SEQRAGSPGE EQAHSCSTLS L SEKRPAEE PRGGRKKIRL VSHPEEPVAG AQTDRAK FM
VAVKQE SQA NFATFTQA Q DYKGSDDFAA LAACLGPLFA EDPKKHNLLQ GFYQFVRPHH
KQQFEEVCIQ LTGRGCGYRP EHSIPRRQRA QPVLDPTGRT APDPKLTVST AAAQQLDPQE
HLNQGRPHLS PRPPPTGDPG SQPQWGSGVP RAGKQGQHAV SAYLADARRA LGSAGCSQLL
AA TAYKQDD DLDKVLAVLA ALTTAKPEDF PLLHRFSMFV RPHHKQRFSQ TCTDLTGRPY
PGMEPPGPQE ERLAVPPVLT HRAPQPGPSR SEKTGKTQSK ISSFLRQRPA GTVGAGGEDA
GPSQSSGPPH GPAASEWGL* (SEQ ID NO: 2).
The present invention also relates to biologically active fragments and/or mutants of the human NHL protein comprising the amino acid sequence as set forth in SEQ ID NO:2, including but not necessarily limited to amino acid substitutions, deletions, additions, amino-terminal truncations and carboxy-terminal truncations such that these mutations provide for proteins or protein fragments of diagnostic, therapeutic or prophylactic use and would be useful for screening for agonists and/or antagonists of NHL function.
Another preferred aspect of the present invention relates to a substantially purified, fully processed NHL protein obtained from a recombinant host cell containing a DNA expression vector which comprises a nucleotide sequence as set forth in SEQ ID NO:1 and expresses the human NHL protein. It is especially preferred that the recombinant host cell be a eukaryotic host cell, such as a mammalian cell line.
As with many proteins, it is possible to modify many of the amino acids of NHL protein and still retain substantially the same biological activity as the wild type protein. Thus this invention includes modified NHL polypeptides which have amino acid deletions, additions, or substitutions but that still retain substantially the same biological activity as a respective, corresponding NHL. It is generally accepted that single amino acid substitutions do not usually alter the biological activity of a protein (see, e.g., Molecular Biology of the Gene, Watson et al., 1987, Fourth Ed., The Benjamin/Cummings Publishing Co., Inc., page 226; and Cunningham & Wells, 1989, Science 244:1081-1085). Accordingly, the present invention includes a polypeptide where one amino acid substitution has been made in SEQ ID NO:2 wherein the polypeptide still retains substantially the same biological activity as a corresponding NHL protein. The present invention also includes polypeptides where two or more amino acid substitutions have been made in SEQ ID NO:2 wherein the polypeptide still retains substantially the same biological activity as a corresponding NHL protein. In particular, the present invention includes embodiments where the above-described substitutions are conservative substitutions.
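By way of example, and not limitation, single conservative substitutions of the kind described above can be enumerated with a short Python sketch. The three substitution pairs used (valine/leucine, arginine/lysine, asparagine/glutamine) are those named earlier in this specification; they are an illustrative subset, not an exhaustive definition of "conservative". The peptide is the first ten residues of SEQ ID NO:2, and the function name `single_conservative_variants` is hypothetical.

```python
# Conservative pairs named in this specification (illustrative, not exhaustive).
CONSERVATIVE_PAIRS = [("V", "L"), ("R", "K"), ("N", "Q")]

# Build a symmetric lookup so each residue maps to its conservative partner.
SWAP = {}
for a, b in CONSERVATIVE_PAIRS:
    SWAP[a] = b
    SWAP[b] = a


def single_conservative_variants(peptide):
    """Return (label, sequence) for every variant carrying exactly one conservative substitution.

    Labels use the usual residue-position-residue form with 1-based positions, e.g. K3R.
    """
    variants = []
    for pos, residue in enumerate(peptide, start=1):
        if residue in SWAP:
            replacement = SWAP[residue]
            mutated = peptide[:pos - 1] + replacement + peptide[pos:]
            variants.append((f"{residue}{pos}{replacement}", mutated))
    return variants


# First ten residues of SEQ ID NO:2.
for label, seq in single_conservative_variants("MPKIVLNGVT"):
    print(label, seq)
```

Whether any given variant in fact retains substantially the same biological activity is an experimental question; the sketch only lists candidates for such testing.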
One skilled in the art would also recognize that polypeptides that are functional equivalents of NHL and have changes from the NHL amino acid sequence that are small deletions or insertions of amino acids could also be produced by following the same guidelines (i.e., minimizing the differences in amino acid sequence between NHL and related proteins). Small deletions or insertions are generally in the range of about 1 to 5 amino acids. The effect of such small deletions or insertions on the biological activity of the modified NHL polypeptide can easily be assayed by producing the polypeptide synthetically or by making the required changes in DNA encoding NHL and then expressing the DNA recombinantly and assaying the protein produced by such recombinant expression.
The present invention also includes truncated forms of NHL which contain the region comprising the active site of the enzyme. Such truncated proteins are useful in various assays described herein, for crystallization studies, and for structure-activity-relationship studies.
The present invention also relates to isolated nucleic acid molecules which are fusion constructs expressing fusion proteins useful in assays to identify compounds which modulate wild-type NHL activity, as well as in generating antibodies against NHL. One aspect of this portion of the invention includes, but is not limited to, glutathione S-transferase (GST)-NHL fusion constructs. Recombinant GST-NHL fusion proteins may be expressed in various expression systems, including Spodoptera frugiperda (Sf21) insect cells (Invitrogen) using a baculovirus expression vector (pAcG2T, Pharmingen). Another aspect involves NHL fusion constructs linked to various markers, including but not limited to GFP (green fluorescent protein), the MYC epitope, and GST. Again, any such fusion constructs may be expressed in the cell line of interest and used to screen for modulators of one or more of the NHL proteins disclosed herein.
Any of a variety of procedures may be used to clone NHL. These methods include, but are not limited to, (1) a RACE PCR cloning technique (Frohman, et al., 1988, Proc. Natl. Acad. Sci. USA 85: 8998-9002). 5' and/or 3' RACE may be performed to generate a full-length cDNA sequence. This strategy involves using gene-specific oligonucleotide primers for PCR amplification of NHL cDNA.
These gene-specific primers are designed through identification of an expressed sequence tag (EST) nucleotide sequence which has been identified by searching any number of publicly available nucleic acid and protein databases; (2) direct functional expression of the NHL cDNA following the construction of a NHL-containing cDNA library in an appropriate expression vector system; (3) screening a NHL-containing cDNA library constructed in a bacteriophage or plasmid shuttle vector with a labeled degenerate oligonucleotide probe designed from the amino acid sequence of the NHL protein; (4) screening a NHL-containing cDNA library constructed in a bacteriophage or plasmid shuttle vector with a partial cDNA encoding the NHL protein. This partial cDNA is obtained by the specific PCR amplification of NHL DNA fragments through the design of degenerate oligonucleotide primers from the amino acid sequence known for other helicases which are related to the NHL protein; (5) screening a NHL-containing cDNA library constructed in a bacteriophage or plasmid shuttle vector with a partial cDNA or oligonucleotide with homology to a mammalian NHL protein. This strategy may also involve using gene-specific oligonucleotide primers for PCR amplification of NHL cDNA identified as an EST as described above; or (6) designing 5' and 3' gene-specific oligonucleotides using SEQ ID NO:1 as a template so that either the full-length cDNA may be generated by known RACE techniques, or a portion of the coding region may be generated by these same known RACE techniques to generate and isolate a portion of the coding region to use as a probe to screen one of numerous types of cDNA and/or genomic libraries in order to isolate a full-length version of the nucleotide sequence encoding NHL.
It is readily apparent to those skilled in the art that other types of libraries, as well as libraries constructed from other cell types or species types, may be useful for isolating a NHL-encoding DNA or a NHL homologue. Other types of libraries include, but are not limited to, cDNA libraries derived from other cells.
It is readily apparent to those skilled in the art that suitable cDNA libraries may be prepared from cells or cell lines which have NHL activity. The selection of cells or cell lines for use in preparing a cDNA library to isolate a cDNA encoding NHL may be done by first measuring cell-associated NHL activity using any known assay available for such a purpose.
Preparation of cDNA libraries can be performed by standard techniques well known in the art. Well known cDNA library construction techniques can be found for example, in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, New York. Complementary DNA libraries may also be obtained from numerous commercial sources, including but not limited to Clontech Laboratories, Inc. and Stratagene.
It is also readily apparent to those skilled in the art that DNA encoding NHL may also be isolated from a suitable genomic DNA library. Construction of genomic DNA libraries can be performed by standard techniques well known in the art. Well known genomic DNA library construction techniques can be found in Sambrook, et al., supra. One may prepare genomic libraries, especially in P1 artificial chromosome vectors, from which genomic clones containing the NHL gene can be isolated, using probes based upon the NHL nucleotide sequences disclosed herein. Methods of preparing such libraries are known in the art (Ioannou et al., 1994, Nature Genet. 6:84-89).
In order to clone a NHL gene by one of the preferred methods, the amino acid sequence or DNA sequence of a NHL or a homologous protein may be necessary. To accomplish this, a respective NHL protein may be purified and the partial amino acid sequence determined by automated sequenators. It is not necessary to determine the entire amino acid sequence, but the linear sequence of two regions of 6 to 8 amino acids can be determined for the PCR amplification of a partial NHL DNA fragment. Once suitable amino acid sequences have been identified, the DNA sequences capable of encoding them are synthesized. Because the genetic code is degenerate, more than one codon may be used to encode a particular amino acid, and therefore, the amino acid sequence can be encoded by any of a set of similar DNA oligonucleotides. Only one member of the set will be identical to the NHL sequence but others in the set will be capable of hybridizing to NHL DNA even in the presence of DNA oligonucleotides with mismatches. The mismatched DNA oligonucleotides may still sufficiently hybridize to the NHL DNA to permit identification and isolation of NHL encoding DNA. Alternatively, the nucleotide sequence of a region of an expressed sequence may be identified by searching one or more available genomic databases. Gene-specific primers may be used to perform PCR amplification of a cDNA of interest from either a cDNA library or a population of cDNAs. As noted above, the appropriate nucleotide sequence for use in a PCR-based method may be obtained from SEQ ID NO:1 either for the purpose of isolating overlapping 5' and 3' RACE products for generation of a full-length sequence coding for NHL, or to isolate a portion of the nucleotide sequence coding for NHL for use as a probe to screen one or more cDNA- or genomic-based libraries to isolate a full-length sequence encoding NHL or NHL-like proteins.
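The selection of a short amino acid region for degenerate oligonucleotide design, as described above, can be sketched in Python as follows. The sketch is illustrative only: it scans the first twenty residues of SEQ ID NO:2 for the six-residue window whose set of encoding oligonucleotides is smallest, then enumerates that set and confirms that exactly one member matches the corresponding stretch of SEQ ID NO:1. The window width of six is the lower bound of the "6 to 8 amino acids" stated above; the function names `degeneracy`, `least_degenerate_window` and `oligo_pool` are hypothetical.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code; codons are ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Amino acid -> list of synonymous DNA codons.
AA_TO_CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    AA_TO_CODONS.setdefault(aa, []).append("".join(codon))


def degeneracy(peptide):
    """Number of distinct DNA oligonucleotides that encode the peptide."""
    return prod(len(AA_TO_CODONS[aa]) for aa in peptide)


def least_degenerate_window(protein, width=6):
    """Return (degeneracy, 1-based start, peptide) for the least degenerate window."""
    windows = [
        (degeneracy(protein[i:i + width]), i + 1, protein[i:i + width])
        for i in range(len(protein) - width + 1)
    ]
    return min(windows)


def oligo_pool(peptide):
    """Enumerate every DNA oligonucleotide that encodes the peptide."""
    return ["".join(c) for c in product(*(AA_TO_CODONS[aa] for aa in peptide))]


n_terminus = "MPKIVLNGVTVDFPFQPYKC"   # residues 1-20 of SEQ ID NO:2
count, start, peptide = least_degenerate_window(n_terminus)
pool = oligo_pool(peptide)
native = "TTCCAGCCCTACAAATGC"        # the same residues as encoded in SEQ ID NO:1
print(count, start, peptide, pool.count(native))
```

In practice the choice of region would also weigh melting temperature, position within the protein and the degeneracy of the 3' end of the primer, none of which is modeled here.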
This invention also includes vectors containing a NHL gene, host cells containing the vectors, and methods of making substantially pure NHL protein comprising the steps of introducing the NHL gene into a host cell, and cultivating the host cell under appropriate conditions such that NHL is produced. The NHL so produced may be harvested from the host cells in conventional ways. Therefore, the present invention also relates to methods of expressing the NHL protein and biological equivalents disclosed herein, assays employing these gene products, recombinant host cells which comprise DNA constructs which express these proteins, and compounds identified through these assays which act as agonists or antagonists of NHL activity.
The cloned NHL cDNA obtained through the methods described above may be recombinantly expressed by molecular cloning into an expression vector (such as pcDNA3.neo, pcDNA3.1, pCR2.1, pBlueBacHis2 or pLITMUS28) containing a suitable promoter and other appropriate transcription regulatory elements, and transferred into prokaryotic or eukaryotic host cells to produce recombinant NHL. Expression vectors are defined herein as DNA sequences that are required for the transcription of cloned DNA and the translation of their mRNAs in an appropriate host. Such vectors can be used to express eukaryotic DNA in a variety of hosts such as bacteria, blue-green algae, plant cells, insect cells and animal cells. Specifically designed vectors allow the shuttling of DNA between hosts such as bacteria-yeast or bacteria-animal cells. An appropriately constructed expression vector should contain: an origin of replication for autonomous replication in host cells, selectable markers, a limited number of useful restriction enzyme sites, a potential for high copy number, and active promoters. A promoter is defined as a DNA sequence that directs RNA polymerase to bind to DNA and initiate RNA synthesis. A strong promoter is one which causes mRNAs to be initiated at high frequency. To determine the NHL cDNA sequence(s) that yields optimal levels of NHL, cDNA molecules including but not limited to the following can be constructed: a cDNA fragment containing the full-length open reading frame for NHL as well as various constructs containing portions of the cDNA encoding only specific domains of the protein or rearranged domains of the protein. All constructs can be designed to contain none, all or portions of the 5' and/or 3' untranslated region of a NHL cDNA. The expression levels and activity of NHL can be determined following the introduction, both singly and in combination, of these constructs into appropriate host cells.
Following determination of the NHL cDNA cassette yielding optimal expression in transient assays, this NHL cDNA construct is transferred to a variety of expression vectors (including recombinant viruses), including but not limited to those for mammalian cells, plant cells, insect cells, oocytes, bacteria, and yeast cells. Techniques for such manipulations are described in Sambrook, et al., supra, and are well known and available to the artisan of ordinary skill in the art. Therefore, another aspect of the present invention includes host cells that have been engineered to contain and/or express DNA sequences encoding the NHL protein. An expression vector containing DNA encoding a NHL-like protein may be used for expression of NHL in a recombinant host cell. Such recombinant host cells can be cultured under suitable conditions to produce NHL or a biologically equivalent form. Expression vectors may include, but are not limited to, cloning vectors, modified cloning vectors, specifically designed plasmids or viruses. Commercially available mammalian expression vectors which may be suitable for recombinant NHL expression include, but are not limited to, pcDNA3.neo (Invitrogen), pcDNA3.1 (Invitrogen), pCI-neo (Promega), pLITMUS28, pLITMUS29, pLITMUS38 and pLITMUS39 (New England Biolabs), pcDNAI, pcDNAIamp (Invitrogen), pcDNA3 (Invitrogen), pMC1neo (Stratagene), pXT1 (Stratagene), pSG5 (Stratagene), EBO-pSV2-neo (ATCC 37593), pBPV-1(8-2) (ATCC 37110), pdBPV-MMTneo(342-12) (ATCC 37224), pRSVgpt (ATCC 37199), pRSVneo (ATCC 37198), pSV2-dhfr (ATCC 37146), pUCTag (ATCC 37460), and λZD35 (ATCC 37565). Also, a variety of bacterial expression vectors may be used to express recombinant NHL in bacterial cells. Commercially available bacterial expression vectors which may be suitable for recombinant NHL expression include, but are not limited to, pCR2.1 (Invitrogen), pET11a (Novagen), lambda gt11 (Invitrogen), and pKK223-3 (Pharmacia). In addition, a variety of fungal cell expression vectors may be used to express recombinant NHL in fungal cells. Commercially available fungal cell expression vectors which may be suitable for recombinant NHL expression include, but are not limited to, pYES2 (Invitrogen) and Pichia expression vector (Invitrogen). Also, a variety of insect cell expression vectors may be used to express recombinant protein in insect cells. Commercially available insect cell expression vectors which may be suitable for recombinant expression of NHL include, but are not limited to, pBlueBacIII and pBlueBacHis2 (Invitrogen), and pAcG2T (Pharmingen).
Recombinant host cells may be prokaryotic or eukaryotic, including but not limited to, bacteria such as E. coli, fungal cells such as yeast, mammalian cells including, but not limited to, cell lines of bovine, porcine, monkey and rodent origin; and insect cells including but not limited to Drosophila- and silkworm-derived cell lines. For instance, one insect expression system utilizes Spodoptera frugiperda (Sf21) insect cells (Invitrogen) in tandem with a baculovirus expression vector (pAcG2T, Pharmingen). Also, mammalian species which may be suitable and which are commercially available, include but are not limited to, L cells L-M(TK-) (ATCC CCL 1.3), L cells L-M (ATCC CCL 1.2), Saos-2 (ATCC HTB-85), 293 (ATCC CRL 1573), Raji (ATCC CCL 86), CV-1 (ATCC CCL 70), COS-1 (ATCC CRL 1650), COS-7 (ATCC CRL 1651), CHO-K1 (ATCC CCL 61), 3T3 (ATCC CCL 92), NIH/3T3 (ATCC CRL 1658), HeLa (ATCC CCL 2), C127I (ATCC CRL 1616), BS-C-1 (ATCC CCL 26), MRC-5 (ATCC CCL 171) and CPAE (ATCC CCL 209).
As disclosed in Example section 1, a 115 kb BAC clone (from Genome Systems) was subcloned and subjected to restriction and sequence analysis. Four genes at chromosome location 20q13.3 were identified, including M68/DcR3, NHL, SCLIP and ARP (Figure 5A). The nucleotide sequence of this BAC clone, hbml68, is presented as follows:
TGAAGAGCTT TGACCAAGAG GCTGTGACGA GGCCCTACGA GGACTCTGGC TCTCCTCCTG 60
CTAAGCACAC CCAGGCAGGT GTCCTGGCAG ATGAGGACCA CATGCAGAGC CTCGGCCAGC 120
CCACCAATGC CCGGATATGC AAGTGAGCCC AGCCTGGACC CCCCGGCGAG GCCCAGCAGC 180
ACCAGCCCAG GCCCGAAAAC CTTAAGAAAT GACCAGTGTC TGCTGCTTTA AGCCACCAAG 240
CTCTGCGGTG GTTTGTTAGG CTGCAAGCAT GGCTAATTCA GAAACTGCCA GAAACAAGCA 300
CTGCTGTCCC CAGCCTGGGA CACACAGCAC CGCCTCTGCG TGGGGAGAGG GCACAGGCTA 360
AGGGCACAAA TGCCATCCCA GACCCGGCTC TTGTGTGTGG AAGGGGCCAC TGTGCCATGA 420
GGCAGAGGAA ACCTTGGCAG GACCTTATGC CACAGCAATT TAAAAGAGAA GAAACAGGCT 480
GGGCGTGGTG GCTCATGCCT ATAATCCCAG CACTTTGGGA GGCCAAGGTG GTGGATCACT 540
TGAGGTCAGG AGTTCAAGAC CAGCCTGGCC AATATGGTGA AACCCTGTCT CTACGAAAAA 600
TACAAAATTT AGGCAGGCGT GGTGGCGGGT GCCTGTAATC CCTGCTATTC AGGAGGCTGA 660
GGCAAGAGAT TTACTTGAAC CCAGGAGGTG GAGGCTGCTG CAGTGAGCTG AGATCATGCC 720
ACTGCACTCC AGCCTGTGTG ACGGAGTGAG ACTTGGTCTC AAAAAAAAAA AAGGAAACAC 780
ATCTGACTAG TGTGATCTCG CAAGGAACAT TCCAGACACA GTGGAGCTAG AAGGTTCTTC 840
TCCAAACAAG GAATCCCCAG GGGATCAAAT TGTTTTGCAT CGGCCAGACA TGGTGGCTCA 900
AGCCTGTAAC CCCAGTGCTT CGGGAGGCTG AGGTGGGAGG ACTGCTTGAG TCCAGGAGTT 960
CAAGACTAGC TTGGGCAACA CAGTGAGAGC CCATTAGCCA GGCGTGGTGG CACATGCCTG 1020
CAGTCCCAGC ACTGTACTAA AAATCTACAC GGGGCCGGGC ATGGTGGCAC ATGCCTGTAG 1080
AGTCCCAGCT ACTCAGGAGG CTGAGGCAGG ACGATTCCTT GAACCCAGGA GGTCACGGCT 1140
GCCATGAGCC GTGACTGTGC CACTGCACTC CAGTCTGTGC AACAGAACGA GACTCTGTTT 1200
CGAAAAACAA AAAATCATTT CATGTCTCCA GTTTCTCCAC TGGCAAAAGA CTCTGTCAAG 1260
GTAAAAAATG GTTCTGACCC ACAGAAATCT AAGAAAGGAA AAAATATAAA AAATAGAAAA 1320
TTTAAAAAAG AGATGGTCTC AGAATAAAGA CCAACCTGGG CTATGGTTGT CACTCTTCCC 1380
TCACACCTTA GAAAGCTTTC TGGCCGCATC TGGCCAAAGG GCCACCCTGC CCCATCTTGG 1440
ATCAGTGAGG TGCCTTCGAA CAAGCCACCT GCCCTGGAGC CCGTCCTGTC TTGTCTGCCA 1500
CCGCACGCTC AGTAGGGGAG GGGAAGTCGC TAGGTTTTAG TTC CCAGTC TCTGGATCAA 1560
GACGTGCCAT AACCAAGAAG CCCCAGCCAC ACCCAGACCC GATGTGGCCA CAAGGGGTGA 1620
GCTGGGAAGG CCCAGGAAAA GGCGGGAGGC GGACGAATGG AAATGTCATT CTGTGGCCAC 1680
AGAAATGATC TCAACGTTTT GTAACTTCCT ACCAAGAGGC AGTCTTAGCT CTGCCCTTGA 1740
ACCAGCACTT GGTGATGTCG CTTGCGTCAA TCAAGGCAAC AGAAGTGAGC AGGAGGCCCA 1800
CTTTCCTCTG CAACTGTGGG CTTACGGGGC AAAGAAGTCC AGGCCTCCAG GTGGAGGATC 1860
ACAGACCGGG CAAAGCAGAG GAGAGCCACC CAGCCGAGCC TACCTGTGCC TCAGACTGCC 1920
TCCCTCCAGA GACCCCTGTG GCCAAGGCCA CCCAGACCAG CAGGTCCTTG CCAAGCTGTC 1980
AGCTGACGAC AGGGGTTGGT GAGGCCGGCC CAGACCAGCA GAACCACGAA CCAACCAACA 2040
GAATTAAAAA TAATAACAAC TATGTCTTGT CTTAAGCCAC TAAGTTTTGG ATGGTTTCTT 2100
TCTTTCTTTT TCTTTTTTTT TTTCGGAGAC GCAGTCTCAC TCTGTTGCCC AGGCTGGAGT 2160
GCAGTGGCGC AATCTTGGCT CACTGCAAGC TCTGCCCCCC GGATTCACGC CATTCCCCTG 2220
CCTCAGCCTC CTGAGTAACT GGGACTACAG GTGCCTGCCA TTGGGTGTTT TCTTAAACAG 2280
CAAAAGAAAA CTGACACAAT CATAAACAGA GCAAGCAAGA GAACTTGGCA ATTATTTCCT 2340
CTCTACTTCT CACTGTTCTT CAAAGAGTTA ACTCAAGCAT AAGATGTGAG CAAATTCTTT 2400
TAACATCCTA GAAAAAAAGC TCCTACTCAG TGTTCATAAA GCAAAGCTAA CCTACAGGAG 2460
CCACCTTCCA CAGTGACCAC AGGAAACCAA GACAGCAAGT GGGACACCAG CCTCCAGGGC 2520
ACTGCGCCAG CCGTGCGCCT GTGTCTGCCA CTGCCCTGGT CCGTCACTGC CACCAGCCGG 2580
CAAGACACCC ACAGAGGAGA GCTCTAAGCC ACAACTGTGT ACGAAGACAA CTGTGCAGGA 2640
TTTTATTACT ACAACATTTT TGTTTTCTTT TTTTTTTTTT TTTGAGACTG AGTCTCGCTC 2700
TGTCACCCAG GCTGGAGTGC AGTGGCACAA TCTCGGCTCA CTGTAACCTC CATCTCCCTG 2760
GTTCAAGCAA TTCTCCTGCT GCAGCCTCCC AACTGGATTA CAGGCGCCCG CCACCACGCC 2820
TGGCTAATTT TTGTACTTTT AGTAGAGATG GGGTTTCACC ATGTTGGCCA GACTGGTCTC 2880
AAATTCCTGA CAAGTGATCC ACCCACCCTG GCCTCCCAAA GTGCTGGGAT TACAGGTGTG 2940
AGCCACTGCG CCTGGCCCAT TTTTGTTTAT CAATAAAAAT GTACTTAATG TTGAACTCTC 3000
CACATTTCAA ATGGGTAACT CCAGTGTCCT TGATGCTCCT GCGACATGTT CGTGAGACTT 3060
CTCTTGGGTG TGAGAGTCTA GCATGTGGGT GGTCTGGACA GGAGGGGGAG GGAAGAGTGC 3120
AGAGCCGGGC AGGGTAAAGA GACCCCCTAG GATGTGAAGG CCGCCCTGCA TTTGTCAGAC 3180
TGGGCAACAC CCACTCCATC AGATGGACCC TGGTATGGGC GGCAAGCCAC CTAGGTGCCG 3240
AGGCAAGAGA CCGAGGGCAC GAGCTGTTCC GGTGTAATAA AATGCATAAA ATAAGAATAG 3300
TTATACTAGA TATAGATCAT AAATATGATT ATATATGAAT ATCATTCATC ATTAGTTTGT 3360
AGCAATTACT CTTTATTCCA ATATTATAAT AATCCTTGCC TAAGCATAAC CTAGGAAAAA 3420
CTAGGAAATC ATAACCTAGG AAAAACTAGG CCATACAGAG ATAGGAGCTG AGGGGACATA 3480
GTGAGAACTG ACCAGAAGAC AAGAGTGCGA GCCTTCTGTT ATGCCTGGAC AGGGCCACCA 3540
GAGGGCTCCT TGGTCTAGCG GTAACGCCAG CATCTGGGAA GACGCCCGTT GCCAAGTGGA 3600
CCGTGGTCTA GCGGTAGCCT CAGTGTCAAG GAAAAACACC CGCTACTTAG CAAACCAGGA 3660
AAGAGAGTCT CCCTTTCCCC GGGGGAGTTT AGAGAAGACT CTACTCCTCC ACCTCTTGCG 3720
GAGGGCCTGA CATCAGTCAG GCCCGCCCGC AGTTATCCGG AGGCCTAACC GTCTCCCTGT 3780
GATGCTGTGC TTCAGTGGTC ACGCTCCTAG TCCGCCTTCA TGTTCCATCC TGTGCACCTG 3840
GCTCTGCCTT CTAGATAGCA GCAGCAAATT AGTGAAAGTA CTGAAAGTCT CTGATAAGCA 3900
GAAATAATGG CGTAAGCGGT CTCTCTCTCT CTCTCCTCTC TCTCTGCCTC AGCTGCCAGG 3960
AAGGGAAGGG CCCCCTGGCC AGTGGGCACG TGACCCACAT GACCTTACCT ATCACTGGAC 4020
ATGGTTCACA CTCCTTACCC TGCCGCTTTG TCTTGTATCC AATAAATAGC GCAACCTGGC 4080
ATTCGGGGCC GCTACCAGTC TCCGCGTCTT GGTGGTAGTG GTCCCCCAGG CCCAGCTGTC 4140
TTTTTCTTTT ATCTTTGTCT TGTGTCTTTA TTTCTACACT CTCTCATCTC CGCATACGAG 4200
GAGAAAACCC ACCAACCCTG TGGGGCTGGT CCCTACACCC TGGCTTTGTA GACTGGAGCC 4260
TAGGCACGAC TCAGCTGCTG TAGTGAATTG CGATCCTCCA AACCCAGCAA GGCACCTGCA 4320
GGACATCTGG CCCAGTCTCC TCGTTGAGCC AGTTCACGAA AAAGAGACTT TTCTGAGTGA 4380
CATGCTAATG GGCAATATGA GGACTAAATG GGATGGTCTC CAACTTGGAC AAACCAACAG 4440
TAAAAGCCAC TTTGCGGGGA AAGAAACTTT TCCTTTTTTC TTTTTTTTGA GACAGGATCT 4500
CACCCTGTCA CCCAGGCTGC AGTGCAGTGG CATGACCTTG GCTCACTGCA GCCTCAACCT 4560
CTCTCAGGCT CAAGCAATCC TCCCGCCTCA ACCTCCCATG CAGCTGGGAC CATAGGTGCA 4620
TGCCACCACA CCCAAATAAT TTTTATATTT TTTGTAGAGA CGAGGTTTCA CTATGTTGCT 4680
CGGGCTGGTC TCAACTCCTG GGCTCAAGCA ACCCTCCCAC CTCAGCCTCC CAAAGTGCTC 4740
AGATTACAGG CAGGAGCCAC CAGGCCTGGC CAACATAGGA AGAAATTTAA ATTTGAATTG 4800
AATATTAGAA GAGATGAAAA TTCATCAACA TGGAAAGACA AAGATCATTA ACTAAAGCCA 4860
AACCAGAATG GAAGCTGTGT GTACAGTGGG GTCTCATGCT GGGAACGCGA GGGGCACGTG 4920
CAGGGCTCCA CGGTGTGGCG ACGCCCCATG CTCCCTTTGT GGGGGTTCAT CCAGCGGAAC 4980
ATGAGGACCT GGGGTGCTTT TCAACATGTA CGTGAGTTTA ATAATAAAAA GGTTTAAGGA 5040
AAGAAAAATT CATATGTTTC TATATAAACA GAACATCTGG AAAGATCTAT TCTAAGGTGT 5100
TGACAGTAGG AATCTCTAGG TAGTAGTAAT ATGGCCTTTT TGAATTTTTG CTTATCAGTA 5160
TTTTCTAATT TTCTTTTTCT TTCTAAATAA TTCTAGCTAT GAAATAATTT TCTACCATAT 5220
ATATTTTGTA ATAAAAATGG TTATATTTAA TTTTTTAAAG GCTGTACAAA CTTCCTGATA 5280
AAATGGCAAA TTAGACACAC ACATGTGGGC CGGGTACAGT GGCTCGCGCC TGTAATTCCA 5340
GCACTTTGGG AGGCTGAGGC AGGCAGATCA CCTAAGGTCA GGAGTTTGAG ACCAGCCTGG 5400
CCAACATGGT GAAACCCCGT CTCTACTAAA TATACAAAAA TGAGCTGGAT GTGGTGGCAC 5460
ACACCTATAG TGCCAGCTAC TTGGGAAGCT GAGGCAGGAA AATTGCTTCA ACCCGGGAGG 5520
CAGAGGTTGT AGTGAGCCGA GATCATGCCA CTGCACTCCA GCCTAGGCAA CAAGAGCGAG 5580
ACTCCAACTC AAAAAAAAAT AAAAATAACA CACACGTGAA TAGGCTCCTC ATGGAAGTCA 5640
TCACAACAAT GCAGAGGGAA GAGCTTCCAA AGTGTAAACC CAGAAGCGAG GAGCAGGAGG 5700
GTGCGCGCAG ACGCAGAGAG CAGCAAGGTG CAGACTGAGA GGCGGAGGCT GGCCGTGGGG 5760
AGATGACTGA TGCTCAGTTT ATACCCCAAA TCCGTAAATC TAGAGGCCTG GCACATCAAC 5820
TACCTCTGCC AGCAGGAATG AGGGAAAGGA GGGCAACCAA AAGATGTCCC ACCCTCACCC 5880
ATCCAGCTAC CTGCCATCCT CAGCCCCACT GGCAGAAGAC CCTGAGAGGT GGAGGCAGGC 5940
CCCTGCCTAC AGGACCCTGA GAGCTAGGGG AAGGCGTTAT CCTGAACTGT GTCCCCCGTA 6000
AAATTCATAT GTTGAAGGCC TCATCCCCAG TGTGACTGTA TTTAAAGATG GGGTCTTCAG 6060
GAGATAATTT AAATGAGGTC ATATAAGTTG GCCCTCATCC AGTAAGACTT TGACCTTCTG 6120
GTGGTTTTTT TTTTTTTGGA GACTGGGTCT CACTCTATCA CTCAGGTTGG AGTACAGTGG 6180
CACGATCACG GCTCACTGCT GTCTCCAACT CCTGGGCTCA GGTGATCCTC CTGCTTCAGC 6240
CTCCTGAGTA GCTGGGACTA CAGGTGCTTA CCACCGCACC CAGCTGGTGG TGCATTGTGT 6300
TTTTTGTAGA GATGGGGTTT TGCCATGTCG CCCAGGCTGG TCCTGAACTG GGCTCAAGTG 6360
ATCTGTCTCC CTCGGCCTCC TGCAGTGCTG GAATTACAGG TATGAGCCAC CGCGCCTGGC 6420
CGACCGTGAC CTTCTAAGAA GTGAAAGAGA AAGATCTTTC TCTCTCCCTC CCTCTCCATC 6480
ATGAGGACAC AGCAAGAAGT CGGCCATCTG CAAGGTAGAA AGCGAGTCCT CCCAACAGCT 6540
GAACCTGGCA GACCCTGATC TTGGACTTCA GCCTTCAGAG CTGTAAGAAA ATAACTCTCT 6600
GCTGTTCAGG CCACGCGGTC TACGGCAGCC CGAGCAGACT AAGACACACG CCATCTGGGG 6660
AGTCAGACCA GATCAGGAAG AAAGGCCTAG AGCTCAGGAT ACTGAAGGTC CCAACCCGGT 6720
GCTGGACCAG ACCACCCCGG CAGCCGCGGC CACGGAGTCA CGGCTCGGGT GAGGTGACCT 6780
GGACACCATC CCGGCAGCCG CGGCCACGGA GTCACGGCTC GGGTGAGGTG ACCTGGACAC 6840
CATCCCGGCA GCCGCGGCCA CGGTGTCACG GCTCGGATGA GATGACTCGG ACACCACCCC 6900
GGCAGCCGCG GCCACGGTGT CAGGGCTCAG GTGAGGAGAG TTGGATATGG GACTGGGCCT 6960
ACCCCGAGGC TGCTTCCACC CAGACGCCTG GGTGGGTGAC ACGAAAGCTG GGCTCAGTTG 7020
GGATCAGAGC AGCCTCTCCC CAGGTCAGAA ATGACCCTGG GCTCCTCACA GTAGCCCTAG 7080
GGCACCATGA GAAAGCTACG TGGACTTCTC TGACCAAGGG TCACTGCTGC CACACTACTC 7140
ATTGCAGGCC ATGTCAGGGC TCAGCTGAGG AGACGTGGAC ACCACCCCAG CAGCCGCGGC 7200
CACGGCGTCC CAAGGGAGGG ACTTGGGCAC TGCCTCTCTG GGCAAGAGTG GGGAGGTGTG 7260
GGGTGGGAGA TGTCTGGAAA CATCATGGAC AC TGCCGGG AAAACACGGA AGCTGTGCAC 7320
CAAGGTGCTG ACAAAGGAAA AAGGAGAATG GAGGTGTGAA CATCCAGCTA GCAGGTCCCA 7380
CTCAGAAACT CCTGCATTTC CAGACATGGC CACCAGCTCT GTGGATGAGA CAGGGGAGGA 7440
CAGGGTACCT CACACCAGGA ACCCACACAG GTCCATGTCT TGCTCTGTGA TCACACAACA 7500
GCCTCCACCA CCCTGACATG CAGGAGGGAG GTCAAAGCCT CGGGTCCAAC AACAGGCTCC 7560
ACAGCAAGGG AAGAAAGGCA GGAAGGAACT CAGGGCCAGG TCCTCCCAGG CAGCAGCTGC 7620
CTGCACGCTG TCCACCAAGG GAGGTCTGAC CTACACCGCA CAGGGGTTGG CAGTCTAGAG 7680
TCGTCCTCTG TCAAACGGTG AGAAAGTCAA AAGCTCATGC TCAGTGATAT GCTAGGTCAG 7740
CATGAAGATG CCACACATGA GACACAGCAA GGATGAGACC AACGGGAAGA CTGCCCCAGA 7800
CCAGAGCCCC AGAGCCCTCT GGGGAGGAAG AATAAGGATG GCAGCCTGGG ACTGCCCGGG 7860
GCTGACTCTG CCTTTATTTC ACCCCAGCAG AGGCAGGAGT GACACCGGCT CACAGCAGGA 7920
GCAGCTCTGC CACCTCCTAG CAGTTCCACC TACGGGCAGC AAAACAAAGC TGGCAGTTTG 7980
GGCAAATGTT AGCGTTTTTG CCAACTAACA TTTGAATCGG ACATCTGGTA CAGAGATGAG 8040
GAAGAAAACA CTCACAGTTT CATGAAGACT GTCAAGAAAA TCACTGACTC TTCACTTCAT 8100
TTATGAAAGG CCAGCTCTCT GACATCCCTA CCACTCCCTC TCACATGAGA AATCACGGCC 8160
TTTCAGGACG TGGAGCCACG TGGCCATGCA GGTACGGGAG GCCTCCCCGC AGCTGCAGCT 8220
GGGTCTTCTG GTCCCCGTGC CATTTCTGCT TTTCTTCGCT CTCTACTTAC ACACACATTT 8280
GAGTCCAGTC TCAGAAGAAC TGGAACTAGA AAAATCCTGA CACTTGTCCC TTACTACGTT 8340
AATGCCAGCT GTGCCAAGGA CAGCCCAACC CAAGCCCCCA TCAGCCCCAA TGGCACCGAG 8400
GCCCGAGCTT ACCCGTGAGG GGCCAAGTTG GTCGTCACCA ACACGGTCTT CACCCCCTCC 8460
ACACCACTGC CGTCCACTGC AGTGTCCGGA GTTGTCACAA CCACCACCTC CTCCATGTGC 8520
ACACTCACGT CGGGAGTCGC CATGGCTCAG CGGAAGGGGA CGCCCAGGCC AGCAGCGTCA 8580
GTCCTCCAGG GTCCCAAGTC CTGGAGGAAG CAAGGCAGGG CACAGGGATG GAGTCATCTC 8640
CACATCCACA CAACATAGC CTCACAAAGG CATCTCTAAT CAGCTCCAAA GACCCACCCT 8700
TGAGTCCCAG ACTGCTACCT CCTGACAAAA ACGAGCGGCA ACAGAAGGGC TACTCCAGGC 8760
TCTGGTTCCG AGGGCGGTGT AAGCGCACTC CACCCGTTTT TCCCACTGGA TAAGCCGAAA 8820
CCCTTGGGTA GAAAGCACAG AGCCACTCCC TCCACGTGGG GCTCAGAGCA GGAGGACAGG 8880
AGGGGCCTGG AATTCCAAGC AACTTCCCTG GACGCAGGCT CCCGGCTTGC CAGTTCTTCC 8940
GTCTCTCCTG GCCTGAACTC AAAGCCAGCC CCAATCCCTG AACTGAGTTT CAGGTGCAGA 9000
AAGCACTCCA AGAAGTCCTC GCTGGTCTGT GGAACGGGAA GGGAAACCCA TTCAAGACAG 9060
AAAGAGAGGA GGGAAACGCC CTGGGTTTTT TTGGGTTTTT GGGTTTTTTT TGAGACGGAG 9120
TCTCGCTCTG TCGCCCAGGC TGGAATGCAG TGGCACGACC TCGGCTCACT GCAAGCTCCA 9180
CCTCCTGGGT TCAAGTGATT CTCCTGCCTC AGCCTCTCCA ATTGCTGGGA TTACAGGTTT 9240
CACCATGTTG CCCAGGCTGG TCTCAAACTC CTGACCTCAG GTGATCCACT CACCTCGGCC 9300
TCCCAAAGTG CTGGGATTGC AGGTGTGAGG CACCATGCCT GGCCTGCCCC GGGTTTAAAA 9360
ATTATTATTA TTTTGTCTTT CCTGGCTTTG CCTTCAGCAA GTCCAACCCC TGCTAAAACC 9420
CGGTGATAAT GGCTGTCCTG GCCCAAAAAG CTTGGAGACA GGGGAATCTT CCTCCTGACT 9480
AAAGGAATGG TGGCCCAAGA GTGTGGGGGC TCCCTGTTGC CCTCTCACTC TCCATCCCCT 9540
ACCTAGCACA GGGAACACAA AAGCCCCTGG TTTCCAGCCA GAGGGCAACG AGCCTGGAGT 9600
CAGAGTGTGG GGGAGGCGAC AAGAGGAGAG GGGAGAAGAG AGGATGGCAC ACAGCTGTGT 9660
GTGAGCGCCT GGGTCGTCCC AAGACAGTCT CTACGTGGTC CTGACCCTAA AGGGCAAAGG 9720
GAAGAAAACT GACCTACAGG ATAGGCCACT GCCCAGGTCT CAGATGGGCC CCAGTGGCGC 9780
ATATGGGACA GATCCACAGT GCACTGGAAA GTCTCTAAAA TAAACTGGCC TAAGAACACA 9840
GACACAGGAA CGGGGTGCAA AATTTGCAGC CTGAACCTAA CCAGGTCGAT TTCTTGCTAT 9900
GAAAAAAAAA AGTCTACATT CTCTGTGAAA CTTAAAACAA GACCTAGAGT CCATAGCACA 9960
GTAGTCAAAG CATCCAGAAC ACGATCAAAC TTCCTGGCAA AGGGTAGTCT GGTTGATTCT 10020
CAAAGGAACA AATACACAAG AGAAGCTGGC TCTTGAACGC AGAATCCAGA GACTTTCAGG 10080
TGCTATCGGA CCAGCTCCAA GAGGAAAGCA AACATTGTCA ACCAAGTGGA AAGAAAATCT 10140
TGGTATAGAA ACAGGAGTTA TAACCAAACA GAAATGTGAA AATTAAAAAC GACAACCAAA 10200
AGAAAATACA CAAAGCTGGG ATAGTCTCAG CTACTCGGAA GGCGGGGCTG GAGGATCGTT 10260
TGAGCCTAGG AGATTGAGGC TGCAATGAGC TGTGATCACA CCACCGCACT CCAGTCTGGG 10320
CAACAGAGTG AGAACTCTCT CAAAAAACGA AAAAGAAAGA AAGTAGAACA GAAGTGACCA 10380
GGGGCTGGGG GAGGGAGTAC AGGGAGTTGT TCTTTAATGA GTACAGAATT TCTGTTTGGG 10440
ATGATGAAAA GCTCTGGAAA TGGACGGCGG TGATGGCTGC ACAATCACTG TGGCTGTTCT 10500
GAATGGTGCT GAACCACACA TTTAAAAACA GTTAAAATGG GCTGGGCGTG GTGGCTCACG 10560
CCTGTAATCC CAGCACTTTG GGAGGCGGAT CGCCTGAGGT CAGGAGTTCG AGACCATCCT 10620
GGCCAACACA GTGAAATCCT GTCTTGACTA AAAATACTAA AAATTAGCCA GGCATGGTGG 10680
CAGGCACCTG TAGTCCCAGC TACTTGGGAG GCTGGGGCAG GAGACCTGCT TGAACCCAGG 10740
AGGCAGAGGT TGCAGTGAGC CGAGATCGTG CCACTGCACT CCAGCCTGGG CAACAAGAGC 10800
GAAACTCCAT CTCAAAAAAA AAAAAAAAAA AAAAAAAAAA AAGTTTAAAA TGGTTAAATT 10860
TTATGTTATG TATATTTTAC CGTAATAAAA ACACTGTAAT GCTACTATAA TAGAATGACT 10920
CATTAGGATT AGATATAGAC TAGAAAGTAC AGAATATAAA AACTTTTTAA ACAAAGAAAA 10980
ATTTTCATGG CCAGGCATGG TGTCACACCT GTAATCCCAG GACTTTGGGA GGCCAAGGCA 11040
AGAGGAATGC TTGAGCTCAG GGGTTTGAGA CCAGCCTGGG CAACACAGCA ACACCCCATC 11100
TCTGCTAAAT AAATAATAAA AAATAGCCAG GCATGGTGGT GTGCACGCCT GTAGTTGCAG 11160
CTACTCTGGA GGCTGAGGCA GGAGGATC C TTAAGCCCAG GAGGTCAAGG CTGCAGTGAG 11220
CCATGGTTGT GCCACTGCGC TCCAGCCTGG GCAACAGATC AAGACCTTGT CACAAAAAAA 11280
AGAAAGAAAG AAAAGAAAAA AGAAAGAAAA TAAAATCTTC CAGAACTTTT AAAATCATCA 11340
TTGTTAATAT AAAAATAACA TCACCTGCCC CTAGGACTGT AACAAACAAG TGTGTCTAAG 11400
GACAGGAGTG GGTCCACCCC AACCTGGCAC GCAGTGGTCC CCTGCGGAGA GTCTGGCCCT 11460
GCACTCACTA AGAGGAGGCA CTCATAGCCC AGCCAGGCCT CTGCAATTAT GCCTTCAATG 11520
CCAGAACTAA CTCACCCAAA CTGAACAATC GATCACAAAA TGTGCCTTCA GGTCTCAAGG 11580
TTCTTGCTAA ATCTTACTCA ACCGACATTT TCCAGCATGG GAACATTTTT CTGAATGTCT 11640
TAGGGAGAGG AAGTCCGCAA GAGAACAAAA GGTCCTCAGG CCACCCTAGC TTCTTTTCCT 11700
CCATTCCACA GGCTGTCTTT TGTCTGGGTA TGCACTGGAC CAGGGGGCTC TACTTCTTCC 11760
TACCTGGGCA TGGGTCTCCA CACAACTCCA AGGTAAAGGG CCACAGGCAA GATAAAGGGG 11820
AGAAAAGAAA GCTACGATTT CCTGGGCCAC CAATCGCAAA TGGCAGCCAG TCTCTGAAGT 11880
AACCCTTGAC CAGAGATCCA AGGAACCAAG AAATGTAGGT GATCTGAACA GAGGGGATGG 11940
TGGTTAAACA CCATGAAGGA AAGACCCATT CTCAAAGAAA AGGAAGCAAA AAGAAACCGT 12000
GGGGAGCTGG GTACCACCCG CAGCAAAGAC CCCGCACGCG TTACTGACGC CAGCCTGGCC 12060
TGGGAGAGCA GTGAGTGTGG CGGACGGTGA GTGGCGGGGA GGGCTGTGGT AGGTTTAGGG 12120
TAAGAAGGGG CAGCGCCCAG AGCCCAGAGA ACACCAGTGA GGGCTCCACA GGAACACTAC 12180
TCAAAGTATT CACGGAACAC ATCTAAACAC AAGCACTAAG GACTAAGTGC GAGGGACAAG 12240
AAAATATTCC CCGTTTCCTG TTTCAGGAGG GTATCGAAAA TGAGTGATGG AAGGAAAATG 12300
TATTGTTTAA ATGAGGAAAA AAAATTTTTA CAAATTAAGA ACATCCTGGA ACATGATGAG 12360
CCGTTTACTG TCACTCAATT TAAATGGTGG CCATCTAGGA CAGAGCGCCT AAGGGGAAAG 12420
GGGGCTCACA GGTGAACCCC TCCAGCTGCT GGTGGGCAAT TTCCCATTAG GGCATCAGGG 12480
TCTCTGAAGA CTGTCTTCAG ATGCTTTTTA GCCAGGAAAG TTACAATGAT GAATTCGTTT 12540
ACACTGGCGG AATTACTTCG TATTTCTCAA ATATAATGTT TTCACTAGCA TAACTTTGTT 12600
GTTGTAGACT TAGGCTTCAA AATAAAGAAC TTTAAACAAA CATGAATAAA AAGCCACTTT 12660
AGGCCGGGCG CGGTGGCTCA CACTTGTAAT CCCAGCACTT TGGGAGGCCG CGGCGGGTGG 12720
ATCATAAGGT CAGAAGTTCA AAGACCAGCC TGATCAATAC GGTGAAACCC CGTCTCTACT 12780
AAAAATACAA AAATTAGCCG GGCGCGGTGG CAGGTGCCTG TAATCTCAGC TACTTGGGAG 12840
GCTGAGGCAG GAGAATCGCT TGAACCTGGG CAGCAGAGGT TGCAGTGAGC CAAGATCATG 12900
CCACTGCACT CAAGCCTGGG TGACAGAGTG AGACTCTCTC TTAAAAAAAA AAAGCCACTT 12960
TAAAATTTTA CTCAGGCCAG GTGTGGTGGC TCACGCCCAT AATCCTAGCA CTTTGGGAGG 13020
CCGAGGCGAG CAGATCACCT GAGGTCAGGA GTTAGACCAG CCTGGCCAAC ATGGTAAAAC 13080
CTTGTCTCTA CTGAAAACAC AAAAATTAGC TGGGCGTGGT GGTGTGCCCA TGTAATCCCA 13140
GCTACTCAGG AGGCTGAAGT GAGAGAACTG CTTGAACCCG GGAGGCAGAG GCTGCAGTGT 13200
GCCAAGACTG CACCACTACA CTTCAGCCTG GGCGACAGAG CAAGACCCTG TCTCAGAAAA 13260
AAAAAAAATT CAAAAATTTG GCCAGGCGTG GTGGCTCACG CCTGTAATCC CATCACTTTG 13320
GAAGGCCGAG GCGGGTGGAT CACCTGAGGT CAGGAATTCA AGACCAGCCT GGCCACCATG 13380
ATGAAACCCT GTCTCTACTA AAAATACAAA AAAAAAAAAA CAAATTGGCC GGGCATGGTG 13440
GCGGGTGCCT GTAATCCCAC CTACTTGGGA GGCTGAGGCA GGAGAATCTC TCGAACTCCG 13500
GAGGCAGAGG TTGCAGCGAG CCAAGATTGT GCCACTGCAC TCCAGCCTAG ACAACAGAGC 13560
GAGACTCTGT CTCAAAAAAA AAAAAATTAA AATTAAAAAA TAAAAATTTC ATTTAAAATA 13620
CTACTGATCT CCCGTGCTGA CTTCTCGGGG TTTAACTCTC ACTGAGGAGA CGCTGCTTTC 13680
ATAAGGGTAA GCTCAGCAGG GGCAACTAAA GTCATTTAAG CAGAGAGCTG CAAAGAGGCA 13740
ACAGCCTCAC TGCAGGCAGG GGTCCTCGTC ACAGCTTCAG GGCTTTGCAG AGGATTACGC 13800
AATGTACACG CACAAAACTG AATTCCAGCC TCTCCATTGG CAACTGCATA CATACATATA 13860
TTCTTTTTTT GAGACGGAGT CTCGCTCTGT AGCCCAGGTT GGACTGCAGT GGCCCGATCT 13920
CGGCTCAATG CAAGCTCTGC CTCCCGGGTT CAAGCGATTC TCTTGCCTCA GCCTCCTGAG 13980
TAGCTGGGAT TACAGGCGCC CACCACCACG CCCGGCTAAT TTTTGTATTT TTAGTAGAGA 14040
CGGGGTTTCA CCATGTTGGC CAGGACAGTC TCGATCTCCT GACCTCGTGA TCCGCCCGCC 14100
TCTGCCTCCC AAAGTGCTGG GATTACAGGC GTGAGCCACT GAGCCTGGCC TCCAATGGCA 14160
ACTATATTAA AGGTTCAAAG CAATATGCAC AAAAGTTACC TCACAGAAAA TAGTGCAAGT 14220
CCTTGATACA ATGCTCTTTA GACACAGAAG AAGCACTATA GAATAGAGCA CCTCGCCCTA 14280
TTGCCTTCCC AAGGGCGAGC ACCCCCTCCT CTCTCCACAG CTCCTTCTTT GTTTTTTTGA 14340
GATGGAGTCT CGCTCTGTCA CCCAGGCTGG AGTGCAATGG CAAAATCTTG GCTCACTGCA 14400
ACCTCCGCCT CCCGGGTTGA AGTGATTCTC CTGCCTCAGC CTCCCGAGTA GCTGGGACTA 14460
CAGGCACCCA ACACGCCTAG CTAATTTTTG CATTTTTGGT AGAGACGGGG TTTCATCATG 14520
TTGGCCAGGC TGGTCTCGAA CTCCTGACCT CCAGTGATCC TCCCACCTTG ACCTCCCATA 14580
GTGCTGGGAT TATAGGTGTG AGCCACTACA CCTGGCCTCT CCACAGCCCC TTCTGTGTTG 14640
AAGCCAAGAC CCACCCAGCT TTGATCCCAA GGCTTGGGTT CCCCACTAGT GTGAAGTGAG 14700
TTTCCAAATT ATTAGGTAAA TCAGATATGA GAAAATATTT TATTTTACTT TTTTTTTTTT 14760
GAGACGCAAT CTTGCTCCGT CACCCAGGCT GGAGTGCAAT GGCACCATCT CCACTCACTG 14820
CAACCTCTGC CTTCTGGGTT CAAGCAATTC TCCTGCCTCA GCCTCCCAAC TAGCTGGGAT 14880
TACAAGTGCA CACCACCACG CCCGGCTAAC TTTTGTATTT TTAGTAGAGA CAGGGTTTCA 14940
CCGTGTTAGC CAGGCTGCTC TCAAACTCCT GACCTCATGA TCCGCCCACG TCGGGCTCCC 15000
AAAGTGGTGG GATTACAGGT GTGAGCCATC ACACCTGGCC CAAGAAAATA TTTTTAAACT 15060
AGTATTCTTG ACCGGCACGG TCAACACTGA TGTAATTGAA ACTGTTGTAT TTGAAGTGTT 15120
AGCAAAGAAA GAGAATTCTG GTTCAACAGA AAAGTCAGTC ACGACTTTTC AGTCACGCAT 15180
GAATTACACA GTAACCAAAT AGATAACATG CCATGACTGA CGACGGGCCC ACAACAAATC 15240
AGCTCCGACC AACAGGGTCC ACACCACCAT GGGTCTACAC AGATCCAGGT CCCGCCTGTG 15300
AGCCTACAGT GACGCGGGCC CCTGTGGGGT GGTCCCTGCA GGTCAGGTCC CTGAGAGTGG 15360
GTCCCAGTGG GGTGATCCCT GCGGGTCGCG TCCCTGCGAG TTGGGTGCCT GCCGGGTGGC 15420
CCCTGCGGGT CGGGTGCCTG CGGGGTGGTC CCTATGGGTC GCGTCCCTGC GGGTCGGGTG 15480
CCTGCGGGGT GGCCCCTGGG AATCGCGTCC CTGCGGGTCG GGTGCCTGCG GGGTGGCCCC 15540
TGGGGATCGC GTCCCTGCGG GTCGGGTGCC TGCGGGGTGG CCCCTGGGGA TCGCGTCCCT 15600
GCGGGTCGGG TGCCTGCGGG GTGGTCCTTG TGGGTCGCGT CCCTGTGGGG TGGTCCCTGT 15660
GGGTCGCGTC CCTGTGGGGT GGCCCCTGCG GGTCGCGTGG TGGCCCCTGC GGGTCGGGTG 15720
CCTGCGGGGT GGTCCCTGTG GGTCGCGTCC CTGCGGGTCG GGTGCCTGCG GGGTGGTCCC 15780
TGCGGGTCGC ACCCCTGCGG CGTGGTCCCC CCGGGATGGG TCCACCGAGG AGGCCGCTGG 15840
AGGCCGAGCC CGCGCCCGCC CGCGGCGCCA AGATGGAGGC AGGAAGCGCC GCCGCCCGCG 15900
CCCGCCACCG CCCGCGCCGC CCGCCTGACG CCGCCGTTGC GCCTGACGCC GCCGCCCGCG 15960
CGGCCGCCCC TCCCCCGGCC CTCCCCTCCC CCCGCCGTAA CGTCCTGACG CTCCGCAGGG 16020
ACCCCTGACT GGACGGCGGC GCGTGAGCGG AGCGAGAGGC CTCGCCGCGG GGGGGCCGCG 16080
GGCTCGCCGG CGCCGCTTAC CTGGGGCCGC GCCGGGCCTG CTTAGGCACC CGGCGGGGGC 16140
GGCGGCGTCG GGAGCTGCGG CGGCGGCGGG CGGCGGCGGC GGCCGCGGGC TTCGCTCCTT 16200
GTTGGGGATT CGGCGGCGGC GGCGGCGCGG GCGCGCGCTT CCTAGTGACG CAGGCGGCGG 16260
GGCCGCGCAC GCACGGGGCT GGGAGGGCCG GACACTTATT TGGCGCTCGC GGAGGAGGAA 16320
GGCGGGGCCG TGAAATAAGG CCCGACGGGC CCCGGGGCGC GTGCGCGGAC CGACACTGTC 16380
AGCTCCTAAC GCCGCAGGTT CCTCCTGGTC CCCGAGGCCC CCGGTCGGGC GTTGCCTGCC 16440
CCGCGCGGGC GGCCGGGCCG AGGGACGATG GTCAGTGGAC GGACGGCGCC AGGGAGCAGT 16500
GCCCACGCGC GGCAGGGCGG TACCTTCAGG CCTCCAGGTA CGGGCGCTCC TCGCCCGGAC 16560
GCTGCTGTGT GTGAATGGGC GCGAGGGGAC TCCCCTGCGG GGCGGACGCC TGAACACGAG 16620
GCTGTGGAGG AGGACGCTGT AGGGTGCGCG GACTCACGCG GAACATGCCA GAGGCTCAGC 16680
CAGCCACGGC GCTCCCAGCG TGGAGGGCGA GGGGCATCCG GGAGCGGCCG GGAGGGCTCG 16740
GTCACCCCTC AAGCTGTCAC CCCAGTCCCA CAACCAGCAC CCCGATCCTA TCGCAGTCCC 16800
ACAGCCGACA CCCCGATCCC ACCCCTGCCC AACAGCCGGC ACCCACCCCA ATCCCATAGC 16860
TAACACCCCG GTCCCACCGC TGTCCCACGG CCGGCACCCC GATCCCACCC CAGTCCCGCA 16920
GCTGGCACCC CGATCCCACC CCAGCCCAAC AGCTGGCACC CACCCCGATC CCACCGCTGT 16980
CCCACAGCCG GCACCCCGAT CCCACCCCAG TCCCGCAGCC GGCACCCCGA TCCCACAGCC 17040
GGCACTCACC CCGATCGCAT AGCATAGCTG ATACCCCGAT CCCACCCCAG TCCCATAGCC 17100
AGCACCCCGA TCCCACCCCA GTCCCATAGC CAGCACCTCG ATCCCATAGA TGACACCCCG 17160
ATCACGCCCC AGTCCTATAG CCCGCACCCC GATCCCACCC GAGTCCCGCA GCCGGCACCC 17220
CATCCCACCC ATGTCCCACA GTCGGCACCC CGATCCCACT CGGATCCGGC AGCCAGCTTG 17280
GATCCTGTGG CCCTCCTCCA GCCCCCAGGG CTCATTTATA TGTTTTATTG GCAGAGGCTG 17340
GGGCTGGCTC TGTTGGCCTC TGTGCTGGGT TTCTTCCTCT GCACCGCAGG ACTGGCTCTC 17400
CTGACCTCTC CAGGTGTCAT CGAACACCCT TGTGCTTGCT GTCACCCGCT GCCTGTCTGC 17460
AGGATCCCGG ATTCCGTATC AGGGGACCGA AATTAGTCGG AAAATAGGAA GCAGGTGCTC 17520
GCTTGGATGG AACCCTGACC CTGTGCTCAC ACTTGTAGGA GGAGGGCTCT GCAGGCCGCC 17580
TCCCGGAACG GGAGGTTCCC AAGCCACTGC ACTTCGGAGG GGCTGTAATT AGAGTTGCAC 17640
ATTCATTCAG TTCCCAGTAA AGTAGAACGT GCTCCAGCCA GTGAGGAAAA GGTGTTTTTA 17700
AAAATTAGAT TGGCCGAGTG CGGTGGCTCA TGCCTTTTAC CTCAACACTT TGGGAGACAA 17760
AGGTGGGAGG ATCACCTGTG GCCAGGAGTT CAAGACCAGC CTGGGCAACA GAGCCTGTCT 17820
CTGGGGAAGA ATAAAAAAAA AAATTGAGCC TTTGTCAGTG CTACTATTTT ATTATCTGGT 17880
AAATATGAGA GGGTTCACGC GGTCTATGTG TGTCATTTAT CTGAGTTTGC CTATCGTCAC 17940
GTTTTGGAAA TAAATGTCAA TAAAGTCGAA GAGGAGTGCT GAGGGGGGCC TGGGGATGGG 18000
AGGGTGGCTA CATCATGCCT GTGTGTTGCG CAAGCCCACC GAGGTCGGCC TGGGGTGAGC 18060
CCTGGGGCCT GTTCTGCCTC CTTCACTCTG GGGCTCCAAG AGACAAACTG GGCAACAAGA 18120
GAGAAACTCC ATCTAAAAAA AAAGAAAAAT CACCTCCAAG ATAACTTAGC TTTCTTCTGC 18180
TGGCATAACA AATTATCTCA AACTTAGTCG CTTAAAAATG CAAATTTAGG CTGAGTGCGG 18240
AGGCTCACGC CCATAATCCT AGCACTTTGG GAGGCCAAGG CAGGATTGCT TGAGGCCAGG 18300
AGTTCGAGAC CAACATGGCC AGAACTGTCT CTTTTTAAAA AATGCAAATG TGTCCGGCAC 18360
GGTGGCTCAC GCCTATAATC CCAGCACTTT GTGAGGCCAA GGCGGGCAGA TCACGAGGTC 18420
AGGAGATAGA GACCATCCTG GCTAACACTG TGAAACCCCC TCTCTACTAA AAATACAAAA 18480
AATTAGCCTG GCGTGGTGGC AGGCGCCTGT AGTCCCAGCT ACTCGGGAGG CTGAGGCAGG 18540
AGAATGGCGT GAACCCAGGA AGCGGAGCTT GCAGTGAGCC GAGATGGCGC CACTGCACTC 18600
CAGCCTAGGC AACAGAGCAA GACTCCGTCT CAAAAAATAA ATAAATAAAA CTGCAAATGT 18660
ATTCTCTAAC TGTTCTGTAG GTCGGAAGTC CAGCCCAGCC TCACTCCGCC AAAATCAGGG 18720
TGTCTGCAGG GCCGATTGCT TTTGGAGCTC CAGGGGAGAA GCTGTTCTGG CCTTTCCAGT 18780
TTCTGGAAGC ACTTGAGCCC CTTGTCTCGT GGCCTATCCC ACACCTGAAA GCCAGCCAAA 18840
GCCAGTTGAG TCCTCACCCT GTTGGCCCCG ACACTGATCT CCTGCCTCCC TCATCTGCTG 18900
TCAAGGCCCC TTGTGATGAC ATGGGGCCAC CAGCTGGCCC AGGGCACCTC CTGTCAGAGT 18960
CCGCCGACCA GTGACCTTCA TTCCATCTGT CGCTGTAATT CCCCTTTGCT TGGAACCAAC 19020
GTTCACAGAT CCCAGGGGTT AGGATGTGAA TATCTTGGGC AGGGCTGTGG GGGGGCTATT 19080
CTTCCTTCTA AAATATTTAT CATTTTTGTT TTGGGGATTT TTTTGGTTTG GTTTTTTTTG 19140
AGACAGAGTC TCGCTCTGTC GCCCAGGTTG GAGTGCAATG GTGCAATCTC AGCTCACTGC 19200
AACCTCTGCC TCCGGGCAGA CGTGAGCCAC TGCACCAGGC CTGTTTTTGT TTTTGTTTGT 19260
TTTGTTTTGT TTTTGAGATG GAGTCTCGGC CGGGCGCGGT GGCTCACGCC TGTAATCCCA 19320
GCACTTTGGG AGGCCGAGGC GGGCGGATCA CGAGGTCAGG AGATCGAGAC CATCCTGGCT 19380
AACACGGTGA AACCCCGTCT CTACTAAAAA TACAAAAAAT TAGCCGGGCG TGGTAGCGGG 19440
CGCCTGTAGT CCCAGCTACT CGGGAGGCTG AGGCAGGAGA ATGGCGTGAA CCCGGGAGGC 19500
GGAGCTTGCA GTGAGCCGAG ATCGCGCCAC TGCACTCCAG CCTGGGCGAC AGAGCGAGAC 19560
TCCGTCTCAA AAAAAAAAAA AAAAAAAAAA AAAAAAAGAG ATGGAGTCTC ACTTTGTCAC 19620
CCAGGCTGGA GTGTAGTGGC GGGATTATAG GTACGCGCCA TCATGCCCAG TTACTTTTTG 19680
TATTTTTAGT AGAGACAGGG TTTTACCATG TTGGTCAGAC TGGTCTCAAA CTCCTGATCT 19740
CAGGTAATCC ACCCGCCTCA GCCTCCCAAA GTGCTGGGAT TACAGACGTG AGCCACCGTG 19800
TCTGGCCATA TTTATTAACT ACAAAGGGAA AGATGATAAT TTTTTTTTTT GAGATGGAGT 19860
CTCACTCTGT CACCCAGGCT GGAGTACAAT AGCGTGATCT TGGCTCACTG AAACCTCTGC 19920
CTCCCAGGTT CAAGCGATTC TCCTGCCTCA GCCTCCCAAC TAGCTGGGAT TACAGGCGCA 19980
CGCTACCAAG CCCAGCTAAT TTTTGTATTT TTAGTAGAAA CGGAGTTTCA CCATGTTGGT 20040
GAGGCTGGTC TCGAACTCCT GACCTTGTGA TCTGCCCACC TCGGCCTCCC AAAGTGCTGG 20100
GATTATAGGC ATGAGCCACT GCAACCGGCT GAAAGATGGT AATTTTAAAG TAGAGAAACT 20160
GGGTTGGCTG GGCATGGTGG CTTATGCCTG TAAGCTCAGC ACTTTGGAAG TCCAAGGCAA 20220
GAGGATCGCT TGAGTCCAGG AGTTTGAGAC CAGCCTGGAC AATATAGCAA GACCCCATCT 20280
CCGCAAAAGC TAAAAAGTTA GCCAGGTGTG GCGGCACATG CCTGTAGTCC CAGCTACTCA 20340
GGAGGCTGAC GTGGGAGGAT CACTTGAGAC CAGGAGGTCA AGGCTGAAGT GAGCTGTTAT 20400
TGTGCCACTG CACTCAGCCT GGGCAACAGA GCGAGAGTCT GTCTCCAAAG GTAAAAAAAG 20460
GTCCAGGCAC AGTGGCTCAC ACCTGTAATC TCAGCACTTT GGGAGGCCGA GGCGGGCAGA 20520
TTCGTTGAGG TCAGGAGTTC AAAACGAGCC TGGCTAAATG GTGAAACCCC GTCTCTACTA 20580
AAAATACAAA AAAATTAGCC AGGCATGGTG ACGGGCGCCT GTAATCTCAG CTACTTGGGA 20640
GACTGAGGCA GGAGAATCAT GTAAACCCAG GAGGCTGAGG TTGCAGCGAG CCAAGATCAT 20700
GCCACTGCAC TTCAGCCTGG GCGACAGAGC AAGACTGTCT CAAAACAAAA CAAAAGAATC 20760
TTGAGTCCTG AGTTCCTCTA AGGGAAATTC CAGGCACCTC GCCACCCTTG ACAGGCAAAG 20820
GAACAATCTG ATGAGGAAGA AGATAGAAAC AGCTTAAACA ATAGTCTCCC GGCCGGGGGC 20880
AGTGGCTCAC GCCTGTAATC TGAGCACTTT GGGAGGCCGA GGCGGGTGGA TCACAAGGTC 20940
AAGAGATCAA GACCATCCTG GCTAACATGG TGAAACCCCG TCTCTACTAA AAATACAAAA 21000
AATTAGCCGG GCGTGGTGGT GGGTGCCTGT AGTCCCAGCT ACTCGGGAGG CTGAGGCAGG 21060
AGAATGGCGT GAACCCAGGA GGCGGAGCTT TCAGTGAGCT GAGATCGCGC CTCTGCACTC 21120
CAGCCTGGGC GACAGAGCCT CGAGACTCC TCTCAAAAAA AAAAAAAAAT TAGCTGGGTG 21180
TGGTGGCTCA CACCTGTAAT CCCAGCTACG TGGCAGGCTG AGGCAGGAGA ATCGCTTGAA 21240
CCTGGGAGGC GGAGGTTGTA GGGAGCTGAG ATCGCACCAC TGCACTCCAG CCTGGGCAAC 21300
AGAGCGAGAC TCTGTCTCAA AAAAAAAAAA AAAAAACAAA AAAACAATAG TCTCCCAAGT 21360
AAGTCAGAGT CACAAGGTGT TTTGATTCCC TGTGGAAACT AAAATATAAC AGCTTAACAT 21420
ATGTTCTTGA GTTATTTTTC AGAAACTTGG ACATCCACCA GGTGGAAAAT GCTGAGCTAG 21480
GAACAGTGGC TATAATTTCA GCCTTTTGAG AGGCCAAGGT GGAAGGATCA CTTGAGGCCA 21540
GGAGTTAGAG ACCAGCCTGG CCAACATGGT GAAACCCCGT CTCTAGTAAA AATACAAATA 21600
TTAGCTGGGC ATGGTGGTGC AACCTGAAAT CCCAGCTACT TGGGAGACCT AGCTGGGAGG 21660
ATCGCTTGAA CCTGGTAGGA GGAGTTTGCA GTGAGCTGAA ATTGTGCCAC TGCACTCTAG 21720
CCTGGGCAAC AGAGTGAGAC TCTGTCTCAA AAAATAAATA AATAAAAAGA GAAAAAAGTG 21780
TTGCCTGCAG GCCGGGCACA GTGGCTCACG CCTGTAATCC CAACACTTTG GGAGGCCGAG 21840
ATGGGCAGAT CACCTGAGGT CAGGAGTGCA AGAACAGCCT GGCCAACATG GTGAAACCCC 21900
ATCTCTACTA AAAATACAAA AGTTAGCTGG GTGTGTACAT GTAGTCTCAG CTACTTGGGA 21960
AGCTGAGGCA GGAGAATCTC TTCAACCGGG GAGGTGGAGG TTGCGATGAG CTGAGATC C 22020
GCCACCACAC TCCATCCAGC CTGGGTGACA GAGTGAGACT CCATCTCAAA GCAAAAAAAG 22080
AAACATAGGT GGGACCCTTG GTGTGTCCTT AGGGCATGAT GGTTGAGGTA TACTGCTGGT 22140
CCTGTCATGT AAAAGAAAAC GAGCCGACTC TGTGTCTACT GGAGAAAGCA CTGCATATAT 22200
CAGCCACAGT CAATACCTCG CTTCTGCAGG GACGGTGGCT GCCAGAGTGG GAGGCTTTGG 22260
TAGCACCCAT GTCGTGGAAT CACAATGTTG TCGATAGCTC TGGGGTCTTG TACAAAATGC 22320
CAGATCCTCC CATTTGGTTT CCTTATGGGA AGGATCGCAG TACTATAATA CATGGGCTTG 22380
TGCAAGGGAT CATTATACCC TTTTCTCTTT TTTTGCTTTT CTTTGAGACA GAGTTTCACT 22440
CTCGTCACCC AGGCTGGAGT GCAATGGCGC GATCTTGGCT CACTGCAACC TCCACCTCCT 22500
GGGTTCAAGT GATTTTCCTG GCTCAGCCTT CTGAGTAGCT GGGATTACAC ATGCCCGCCA 22560
CCAGGCCTGA CTTATTTTTG TATTTTTAGT AGAGACAGGG TTTCACCAAG TTGGTCAGGC 22620
TGGTCTTGAA CTCCTGACCT CAGGTGATCC ACCCACCTCG GCCTCCCAAA GTGTTGGGAT 22680
TTCAGGCATA AGCCACCAGG CCCAGCCTTT CTTTCTTTTT AAAATTAATC TTTGTTTAAA 22740
AATACTCTCA TTTTTTATTT AATTGTAGCA CTCCTAGATC CCGAAAGCAG ATACACTCTT 22800
GTTATGGGTC TGATTCTTTT CATTGCTTCA CGCCTTAGAG GATATTGTCC AATACTGGAT 22860
AAAAGTTTAC TCAGGTCTAC TTCCACTTTA ACGGGGATGG CTGAATATCT CTTCCACTTG 22920
GCTGTTTGTT TATAATGAAC TGACAAACAT ACAAATTTTC TTGAGTTCTG TGAGACATTC 22980
TAGTAAATCA TCTAACCTGA AGAGCAGGTT GTGAGAACCC CTGATTTAGA AAGCCCAGTG 23040
GTCATAAATA TAAGTGGCTC TGGACTGGCT CCCGGGGTCT GAAGTGTGGG CAGTCGGTTA 23100
GGATTGAGCC CTTGTAATTT GTAGGATCTG ACACACACTC CAGGAAGGCA GTGTCAGAAT 23160
TTACCTGTAT TATATTGGAC ACCCAGTTAG CGTTTGGAGA ATTGGTTGCT GGTATAGAAA 23220
AATACCAAAT ATTTTATGTC AGGGGAGTGA AAGAAAAAAC AAAAACCCGG CCGGGCGCGG 23280
TGGCTCACGC CTGTCATCCC AGCACTTTGG GAGGCCGAGA CGGGCGGATC ACGAGGTCAG 23340
GAGATCGAGA CCATCCTGGC TAACACGGTG AAACCCCATC TCTACTAAAA ATACAAAAAT 23400
TAGCCGGGCG TGGTGGCGCG CGCCTGTAGT CCCAGCTACT CGGGAGGCTG AGGCAGGAGA 23460
ATGGCGTGAA CCCGGGAGGC GGAGCTTGCA GTGAGCCCAG ATCGCGCCAC CGCACTCCAG 23520
CCTGGGCGAC AGAGCGAGAC TCCGTCTCAA AAAAAAAAAA CAAAAAAAAA AAACAAAAAA 23580
AAAAAACCCA TACACTTTAA GGAAAGCAAC TGACAGCATT TGTTACCAGT GATAAAATTT 23640
GAGCTTTGAA GTAAGAATAA CAATTTTGCC ATTGTGCCCG GGCCAAGAAA AAAAAAAGAA 23700
TTTTGCCATT GTGAAAGGCT TCCCAGTACT TTCTGATGAG CTTGACGGTG ATATTAACAA 23760
ATAACTTTTT TTTTTTTTTT TTGAGATGGG GTCTTGCTCT GTCACCCAGG CTGGAGTGCA 23820
GTGGTTCAAT CTCAGCTCAC TGCAACCTCC GCCTCCCAGG TTCAAGCGAT TCTCCTGCCT 23880
CAACGTCCCA AGTCGCTGGA CTACAGGTGT GCGCCACCAC GTCCAGATAA TTTTTGTATT 23940
TTTAGTAGAG ATGGGGTTTC ACCATGTTGC CCAGACTGGT CTCAAACTCG TGACCTCAGG 24000
CGACCCGCCC ACCTCGGCCT CCCAAAGGTG GGAGGCCTTG CTGGGATTAG AGGTATGAGC 24060
CGCTGCACCT GGCCTCTTGT CCTTGTGTTT TGCAGTGATG CAATGACCAT GTCTTACATT 24120
TGCAACCAGA AAAAAAGGTT AGTGTAACAA TGTTTATCCT GTTTTTCCCA GAGTAGACAT 24180
TATGAAGATT AAAAAAATTT GAAAGTGTTT TGAATATAAT AAACTATGCT ATACACACAA 24240
CATTTTGGTG ACTAGAAATA CAAGTTTATT GTTTGTTGTT TGTTGAGACA GGGCCCTGCT 24300
CTGTCTCCCA GGCTGGGTGG CACAATCATG GCTCACTACA GTCTTGAACT CCTGGGCTTA 24360
AGCGATCCTC CCACCTCAGC CTCCAGAGTA GCTGGGACTG CAAACGAGCA CCACCACGCC 24420
TGGCTAATAT TTGTATTTTT TGTAGAGATG GGGTTTCACC ATGTTGCCCA GACTGGTCTC 24480
AAACTCCTGG GCTCAAGCAA TGCTCCTGCC TCGGCCTCCC AAAGTGCTGG GATCACAAGT 24540
ATGAGCCACT GCACCCGGCT GAGTTTCTGT TGTTTTAAGC CGCTTCATTT GTGGTACTTC 24600
TTACAGCAGT CCCAGGAAAC TGAGCAACTG CAGAACATCA AAATTGTTTT TCTTCAGCAA 24660
AAGGAGAAGC ACTTGTGGTT GGCACCAGCT TTTCCTGTGC TCACTTCTGC ATGGCCGCAC 24720
CTTTGCCCGA CACGAGTGCA CAGCAGGCTG TGGGGGAGCA ACTGGTTGAG TCAGGCCTCC 24780
ACTTGTGCCG TATCCCCACC TGCTTTGCTG GACACCCCTG TTTGGGGGGC ACCCACTGCT 24840
GCCCCAGACA CCAAGCAAGC ACCAGCTGTG TCCAAAACTT ACAGTCACTG TCTTGGCCCG 24900
TTTTGTGCTG CTGTAACAGA ATGCCACAGA CTGGGTAATT TAATACAGAA CAGAAATTTA 24960
TTTCCTCAAA GTTTTGGAGG CTGGGAAGTC CAAGAGCAAG GGGCCATCAG GTCAGGGCCT 25020
GGTCTCTGCT TCCACGATGG CACCTTGACC ACCGTGTCCT CACGTGGTCA GAGAGAGCCC 25080
ACTCCCAGGA GCCCTTTTAA TAGAGCAGAA CACTGCTGCG CTGCGGTTAA GTTTCCAACA 25140
CGTGAACTTC GGAGGTGACA CATTCAGATC ATAGCAGTCA CTCTAGGCAG AGTGTCTGAT 25200
GTGGTTTTAA AATACGTTCA CAGACTGGCC GGGCACTGTA GCTCACGTCT GTAATCCCAA 25260
CAGTTTGGGA GGCCAAGGTG GGTGGATCAC CTGAGGTCAG GAGTTCAAGA CCAGCCTCAC 25320
CAACATGGTG AAACCCCATC TCTACTAAAA ATACAAAATT AGCCAGGTGG TGCATGCCTG 25380
TAATCCCAGC TACTCGGGAG GCCGAGGCTG GAGAATCGCT TGAATCCAGG AGGTGGAGGT 25440
TACAGTGAGT CGAGATCATG CCATTGCACT CCAGCCTGGG CAACAAGAGC GAAACTCTGT 25500
CTCAAAAAAT AAAATAAAAT AAAATACATT CACAAGGCCG GGCACTGTGG CTCACGCCTG 25560
TAATCCCAGC TACTTGGGAG ACTGAGGCAG GAGAATCGCT TATAACCTGG GAGGTGGAGG 25620
TTGCAGTGAG CTGAGATCAC ACCGCTACAC TCTAGCTTGG GCAACAAGAG TGAAACTCCG 25680
TCTCAAAAAA GTAAAATAAG GCCCTGCAGG CATGGTGGCC CACACCTGTA ATCCCAGCAC 25740
TTTAGGAGGC CAAGGCGGTC GGATCACGAG GTCAGGAGTT CGAGACCAGC CTGGCCAACA 25800
TGATGAAACC CCGTCTCTAC TAGCCTAGCC AACATGGGGA AACCCTGTCT CTACTAAAAA 25860
TAC AAAATT AGCCGGGCAT GGTGGTGCGT GCCTGTAATC CCAGCTACTC AGGAGGCTGA 25920
GGCAGGAGAA TCGCTTGAAC CCAGGAAGCA GAGGGTGCAG TGAGCCAAGA TTGCGCCGCT 25980
GCTCTCTAGC CTGGGCGACA GAGCGAGACT CCATCTCTAA ATAAATAAAT AAAATAAGAA 26040
AATAAAATAT GTTCACAAAT CCTTTGACAT TCCTCACCTC AAAAGCTGGA ACCCAACTCC 26100
CTCCTAAGCA TGAGTCTTCT CAGTGACTCA CTTCTAACAG CAGAACTTAC ATGGTTCCCC 26160
ACACCCAGAG GACATTGGGT TCCTCCCAAT ATCCCCCCAC CCAGCGACCC CCACCCAGGT 26220
CGCTGGCTTT GGGTCCCCCA GAGCCATGTT TCAAGGACAC TCAGGCAGCC CCTGGATGTC 26280
CATGTGGTAA GGAATGAAGG CCTCCTGCCT GCAGCCTCGG GAGGGAGCAT TCTCAGAAGA 26340
GGATGCCCCA CCTCCTGCCC AGCCTTCAGA TGGCCAGGAC CTCGTCCAAC GTCCTGACTG 26400
CAACATCATG AGAGACTCCG AGCCAGAAAC CCCCAGGTTT TGTACTCCTG ACTTATGGGA 26460
ACTGACAGAT AATGTTCGTT GTTAATTAAG GGGTGACTTG TCACACACAA TAGGTCACTA 26520
AACAGCTCTG TCTGGCCTCC CAGGAGGAGC CTGCCTTTCC TTTTCTTCAT GGGAAAAGTG 26580
CGATCAGTTT GTGAAGGAAT GTCCGCCCCC ACTTGATGCC AGAGGCTCCA CATGGTGACT 26640
GTCATAAACT CCATCTGCCC TCAGTGCCTT GCCAGCACCC GGCCTGCGAT CAGCTTGGTC 26700
TTGCGGGAGG CCAAGGCCCA CGTGTGTTTG TGTGTGGTGT CTGTGTCTGC GTGCCCATGC 26760
ATGCCCAGGG TACAGGGATG CCATATACAA ATTCTTTCAA TGTTGTATGT GGCATGTGTG 26820
TGTCTGTATG CCCAGGATAC AGGGATGCTA TATACAAACT CTGTTTTTTC GTTTTTTTTT 26880
TTTTGAGACA GAGTCTTGCT GTTTCGCCCA GGCCGGACTG CAGTGGCGCT ATCTCGGCTC 26940
ACTGCAAGCT CCACCTCCCG GGTTCACGCC ATCCTCCTGC CTCAGCCTCC TGAGTAGCTG 27000
GAACTACAGG CGCCCGCCAC CACACCCGGC TAATTTTTTG TATTTTTAGT AGAGACGGGG 27060
TTTCACCATG TTAGCCAGGA TGGTCTTGAT CTCCTGACCT CGTGATCCAC CCGCCTCAGC 27120
CTCCCAAAGT GCTGGGATTA CAGGCATGAG CCACCACGCC TGGCCTACAA ACTCTTTCTT 27180
TTTTTTTTTT TTTTTTTTGA GATGGAGTCT CACTGTCTTC CAGGCTGGAG TGCAGTGATG 27240
CGATCTCAGC TCACTGCAAG CTCCACCTCC CGGGTTCATG CCATTCTCCT GCCTCAGCCT 27300
CCCAAGTAGC TGGGACTACA GGCACACACC ACCACGCCCA GCTAATTTTT TGTGTTTTTA 27360
GCAGAGATGG GGTTTCACCA TGTTAGCCAG GATGGTCTCG ATCTCCTGAC CTCGTGATCC 27420
GCCCGCCTCG GCCTCCCAAA GTGCTGGGAT TACAGGCGTG AGCCACTGCG CCCAGCCTGC 27480
AAACTCTTTC AATGTCTTTC TTTTCTCTCT CCTGCCATCT TCTCCCTTGC AGATTTCTTT 27540
TGTCTCTACG TCTTCCCCAG CTGAGTCCGA GGTCCTGACT TGCCCACGCT CCCTGGACTG 27600
GAGGAGAGGT GATAGCAAGA GCTCCTTCAA GCCCAGGAAT GCCACCAGGG CTGCCCCGGG 27660 AGAGGAGGAA GCTGGGTCTC TCGGGGTTGT GGGGACCAGA CACCCTTCTA AGACATGGAC 27720
TCAGCACAGA AAGTCTAGAC ATCCACTACA AACACATCTC CCTCCTAACA GGGGGCCCCT 27780
GGGCACCCCA AGTGGCTGTT TGGTGGGACA GGCATGTCCA TCAGTCAGAA TATCTTTATT 27840
TTTTATTTTT TATTTTTTAT TTTTGAGAGA GTTTCACTGG AGTGCAATGG CACGATCTCA 27900
GCTCCCTACA ACCTCCGCCT CCCAGGTTCA AGCGATTCTC CTGCCTCAGC CTGCCACGTA 27960 GCTGGGATTA CAGGTGTGAG CCACCACACC CAGCTAATTT TTTTTTTTTT TTTTTGAGAT 28020
GGAGTCTCGA GGCTCTGTCG CCCAGGCTGG AGTGCAGAGG CGCGATCTCA GCTCACTGAA 28080
AGCTCCGCCT CCTGGGTTCA CGCCATTCTC CTGCCTCAGC CTCCCGAGTA GCTGGGATTA 28140
CAGGCATGAG CCACCGCGCC CGGCCAATTT TGTATTTTTA GTAGAGACAG GGTTTCACCA 28200
TGTTGGTCAG GCTGGTCTTG AACTCCTGAC CTCAGGTGAT CCACCTCCCT CGGCCTCCCA 28260 AAGTGCTGGG ATTACAGGCC TGAGCCACCA CGCCCAGCCC AGAATGTCTT CTTACTTTTT 28320
ATTACTCTGT CCCCCATCCT GGGTCCAGAC CTGTGACCGT GAACAACCGG CTGCCCAGGG 28380
GTGAATGGGG TGAGTGGGGT GAGTCCACAG AACAGTGGGG TGCAGCCCCA GGGGTCTCGT 28440
AGCACCTGCC CCCAGGTCAG GAAGTCCCAC AGCCTAGAGG CTCCAGCCTC AGATGCATAC 28500
ATATGTAGGC CCTGCCCTTT CCTCCTGAGC GGCGGGCCAC AGAGTCCTGA ACAACAGGAA 28560 GCCCCTGAGG AGGGCTCCGC CCTGAGGGAG GGCAGGGGAG CCCCCGCCAG CCCCACCCAC 28620
AGCAGCGGGC CCTGCCACCC CCCACCCTGA CACCTCACCC CTTGGATTCC AGAGAGGAAA 28680
GTGGGCTTGT GTGTAGTTTA CATGCTCATA TCTTAAAATC ACCGTTGTCA ATAGAACAAT 28740
TCATAATAAT GATGATAAAA TAAGATTTAT AACCAGCTTC AGTCTGGAGA TACACACAGA 28800
GCAGATCTTC ACTCCCAGAC AGGGAGCCCG CAGCTGCCCC CGACCCCACA GGTGCAGGAC 28860 ACACACAGAC AGTTCAACCA TGTCTTAAAC ACACAGGTGT TTATTTAATT GTTCATTTGA 28920
TTGAATTTTT AAGTTCACTT TACTACGTGG ATGAGATGGG TGCATATTAC AGTAGGCTTT 28980
CGCTATGAGC GCTGCCACCA TGAGGAATAT CCCAGCCCTC AGTTCTGCTT CCCTTTCTGA 29040
GTCCCACAAA AGCCAGATGT GGACAGCCTT GGGTTCCCAT CCCAGCTGGC TGCTCCTTCT 29100
GGGGCTGTCT TGGTGGGGAG AGGGAGATGG GGCAGTGGGT CCCTGCTGAC CCCTGAGCCC 29160 TGCAGGGGTC AGGATCCTCC CGTGGTCCCT GGGTGTGGCT CTGGAAGACA CTGGCAGTGC 29220
CCGGCCAAGG CCTCCCGCAG GATGGAAGTT GAGGGCCCTG GCTCTGGGTC CTAAGAGAAC 29280
TCAGCCGCCC CCTTCACACT TTACAGCAAG GGGCCAGGCA GCAGCTTTGG GATGGGGCTT 29340
CCGTGGAGAA GTGGGGGATG CTGCAGTGGT ACAAAGACAG CCTCCCCCAC CGCCATCCTC 29400 CAGCTGACCG TCCTCCAAGG CCAGCACTGG GCGTCCAAGG GAAAGAAGGA ACTCAGCCCA 29460
GAGGGTGTGG GCAGGAGAGG CCTGGAGTCA GGCCTCCACC CACAGCCCCC TCTGGGTGCC 29520
AAGTGGGAAG GGTGTTGGGG CTGGCTTGGG AACCTTACCC GCTGCCCTTC CAACACCTGG 29580
ATCTGTGGGC AGCGGTCCCA CAAAATCCCC CTTGGGGCTC CCTGAGGAGG ACTTGTGGCT 29640
GCCGCTTCCA CCAGGGCAGA GGGCACAGGA GGGGCCAGCA CTCCAAAGGG CTCTAGGGTG 29700 GGTCTTTCAA GGACATCTGC AAAGCCCTGG TGGGGAGGGG CCTGGGCCAG AGGCTCTTTG 29760
GAACTCTTGC ACTTCTGAGT GGGGGACTGT CCATGCTGCC CACAACCTCT AGACCATGCA 29820
GCCTGCTCAT GGGTCCCTGG CAGAGAATGC CCACTCCCCA GCAGACTCAG GGCAGGCCCC 29880
CAACTGCAGG CTTCCAGGAA GGCCCAGGGT GTCCACCTCA CGCCAGGTGG TCTCAGAGGA 29940
CCCCTGTGCA ACCACATTAA GGAAAGCTGC AGCCCCCACC CACCCGCCTG CCAGTTCAAC 30000 AAGCACCGGC TGCACACGCA GGCTCCCAGG CACCATCACC CCCCTCCCCC GTCGCCCCTC 30060
CCTCACGGGG AGCCCCTTCC CCCTGGAAAG ACAGCAGGTA CTGTAGCCTC GCCTGCTGGC 30120
CAGGGGCGCC GGCTCAGAGG ACCTGCCCTG ACCTGCACGT GCTGACCAGA CAGCCCAGCG 30180
TAAGGACCCG CGATCCCACG CCACCGCCCT GGGTTTACCA CGGTCACCAC CACCTCTCTC 30240
ACAGGGCCCC CGGGGGACCC AGCCGCGCCC GGCCTGGTGT CTGCACCGAG GGACCGCGTC 30300 TCACGCCCGG CGGCTCCTGC AGGGGAAGCC GTGGTCAGCG ACTCACCACG AGGACAGGGC 30360
AGGGCGGCTG AGTGCGGAAG AGAAGCATGA AGCTGGGGGC GGGGGTGGGG GAGGAGGAAC 30420
AAAAGTTGCA TCTAGACAGA GGTGAACGAA ACAAAACCAA AACCCGAACG TGTTCCGTCG 30480
CAGGATGGGC GCCGCCCGTC CCGGGCCCTT AGCCCGACAT CTCTTCTCGC TGCTCCTTGT 30540
TCCTGCGCAC CTCGGCCGCG TGCAGCTCCT GCAGGACAGG GGGCGGGAGG GCCTGAGGGC 30600 GGGGGTGGCT TGGGGCGACT CCGGGAACCC CCAGGCGCGC AGGCCGTGGC GCCCTGGCAC 30660
CCGCCCGGCC TCATCCGGGC TGGCCTTCGG CAGGACCCTG ACTGAGTTGA GGGGGCGGGA 30720
GCACCGGGGA GGCGCAGAGC AAGGCCAGGG ACCAAGGACG GGTTTCCTGG GAGCTGGCTG 30780
GGCCCCGCTT CTAGCTCGTA CCGGAGCCGA GCTTCCTTCA GGGCACTTTC AATATAATGA 30840
ATTTAGCCAT CTATTACTGC GGCTAGTTAC TGTCCCGCCA GGACCAGACT CTGGACCTGC 30900 CTCGTGCGCT GCTGGGGACG CCCAGTAAAC ACGGGAGGAG CCCCCGACCC CCACCCCAGC 30960
TCAGCGCCTC GGAGTCCCCG GCCCCGCTCT GCGCCCCTCC GAGCTCCGCC CTAGCCCCGC 31020
CCCCGCCCAG TGCCCCGCCC CCTGCCTGCT GCTAGCCCTG CCCCCGCCCC GGCCCCTGCC 31080
CGCTCCGAGC TCCGCCCTGG CCCCGCCCCG GCCCCTGCCC GCTCCGAGCT CCGCCCTGGC 31140
CCCGCCCCCC GCCCAGTGCC CCGCCCCCTG CCTGCTGCTA GCCCTGCCCC CGCCCCGGCC 31200 CCTGCCCGCT CCGAGCTCCG CCCCGGCCCC GCCCCGGCCC CTGCCCGCTC CGAGCTCCGC 31260
CCTGGCCCCG CCCCCGCCCA GTGCCCCGCC CCCTGACTGC TGCTAGCCCT GCCCCCGCCC 31320
CGGCCCCTGC CCGCTCCGAG CTCCGCCCCG GCCCCGCCCC GGCCCCTGCC CGCTCCGAGC 31380
TCCGCCCCGG CCCCGCCCCG GCCCCTGCCC GCTCCGAGCT TCGCCCCGGC CCCGCCCCGG 31440 CCCCTGCCCG CTCCGAGCTC CGCCCCGGCC CCGCCCCCGC ACCTTCTCGC GCAGCCGCTC 31500
GCGCAGTGCG GCCAGGTGTG CCTCGCGGAT CTCCTTGCTG AGCTCCATCT TGTAGTTGAG 31560
CTTCTCCTCC GCCTGGCGGC TGAAGTTGTT ATTCTCCTCC AGCGCCTTGT GCAGCACCTC 31620
GCGCTCGTGC TCGCGCCGCT CCGCCAGCTG CTTCAGCACC TGCGCCTCCT GCGTCTGTGC 31680
GGGGCCGGCG GGCGCGCGTG AGCGGCAACC CCGGGCCCTG CCCGGCCGGA CTCCTCCCTG 31740 CTCTCCGCCT CCCGCCCAGC GCCCGCTCGC CTCACCTGGC GCCTCCACCT GCCCAGGCCT 31800
CGGTGGGCGC CGGGACCCCC GGGCGCTGCC CTGGGAACCC TCGCCTGCCA TCCGGCCTGT 31860
GGTCGGGGCA GGGCCAGGGG GTCGCGATCC GCCGCCCCCG CCCCCGTCCC TGCCTCGCGC 31920
GCGGGTCCCG CGGTCCTGGC TGCGCCCAGG GCCCCCGCCA TACCCTGCCG CCACTGCACA 31980
CCCTGCCCTG CGCGTCTGCC CCTCCAAGGA CCAGCAGCAA GAAACCCTAA ACTTGTGGGC 32040 GGTCTCTGAG CTTTGTCTCT TCCTCGGACA TCCGCCCACT GAGCAGAGTA GCTGCTTGTT 32100
ACACACCGGG TTCCCAGCTC CCAATTAGGT GCCCAGGAGC GGAGGGTCCC CAGGGATGCT 32160
GGGGGAGGGG CCGGCTGGTG ACCCCTGGGA GGAGAGCGGG GCAGCAGGAC CCGCACCCAC 32220
ATGCCAGTCC CTACTAGTCA GCCCTGTGAA CCCTGGTCTC TGGCCTCACC GGGAAGGGAA 32280
CGGAGCCGCT TCCCCTGCCC AATGCGTTGG CCTCCAGGGT GGCACCCCCA AAAGGACATT 32340 TTTATCTCTG TTTCAGTCTC AGAGGGGCTG GTGGGAGGGG AGGCTGCAGG GAGGGGACCT 32400
GGAGCCCACA CCCACCTCTC CCAGGGCCCC TCCGCCCTCC AGCAAGCCTC AGGGTCTTCA 32460
CACATGAGGC CCTTCCTCCA GCTTCCCTGT CTGGGAGAGG GATGCCCCAC CCGACGTCCC 32520
CAGGGCCCAT CTGGGGACCA CCCCCTAGCA TCCTGCTGGC CCTGACAAGG GTGCCTCCCA 32580
CCCTCACCAG AGGCTCCTGC TCCTTCCAGG TGGCCGCCTC GGAACCCTTC CTCCTCTCCA 32640 TCCCTTTCTT TTTTTGTTCT TGTTTGTTTT TTGAAATGGA GTCTCACCCT GTCGCCCGGG 32700
CTGAGGAGTG CAGTGGCGCA GTCTCGGCTC ACTGCATCCT CCACTTCTTG GGTTCAAGCA 32760
ATTCCCCTGC CTCAGACTCC CTAGTAGGTG GGATTACAGG TGTGCACCAC CACACCTGGC 32820
TAATTTTGTA TTTTTAGTAC AGATGGGGTT TCACCATGTT GGCCAGGCTG ATCTTGAACT 32880
TCCAACCTCA AGTGATCTGC CTGCCTCAGC TTCCCAAAGT TCTGGGATTA CAGGCGTGAG 32940 CCACCACACC CGGCCTCTCC CCATCCCATT CTTATCTCTC AGAAAGAGGC CCAGGGAGCC 33000
ACAGCCCCTC CTGCTCCAGG CCAAGGCACT GACCAAGCCT GTCCGGGAGC ACCCTGCTTC 33060
TTGCAGGCCC TGTCCCCGTG GGCCGCCTCC GTTGAAACTC CTGGGGGGTG GGGGATGGAG 33120
GACTCCTTGC CTTCCTCCGC TCCTCGGCTG CCTCCAGCCG CTTTTGCAGC TCCTCCAGGG 33180
AGGTGTCCTT CTTCTTGGGT GGGGAGGAGA GCATAGGGCT CTCTGGGGAC AGGTCAGAAG 33240 GGGACTTGAG GATGACCTCG AAGCTCTGGC CTGAGGCCCG CTTGTCCAGC TGCTTCACCT 33300
CCATGTCTGC AGGGCAAGAC CAGAGTAGAG CTTCAGAGGC CCGGCCAGGG CATGGCGTGG 33360
GCTGAGCGGG ATGCTCCCAG CACACATCCA ACCCCAGGGC TGGGCGAGAG GGGGTGGCTG 33420
CTCCCGCAGG AATCCCAGGC TTCAGCCCCC AGGATGGGCC CCTTCCCCCT AGAACCTCCC 33480 TCTCCAGAGG CAGCCAGGAC GGGAGTTCAG AGAGACTGCC GGAGGCCGGG GGAAAAGGTG 33540
AGGTGGGCAG GCACCGCAGG GAAGGGCAGG CGGCAGCCAG GCACTCACCC CCGTACTGGT 33600
AGACGGTATT GGGGTGCGGC TGTGTGTAGA AGCAGGAGCA GATGAGCGAC AGCACCGACA 33660
GCTCCTTCAT CTTCTCCTTG TAGGCTGTGG GCACAAGGCT GGGCTGAGCA AGCACCACTG 33720
GGGCCTGCCC ACCTGGGCCC CCGTTTTCCC TCCCCATGGC TGCCTCTATC ATGTCTCTGT 33780 GAGACACGGA GCTGCCCAGC ACGCTCTCTT GTGTGTCTCC ACACCGCCGG CCCCTTCGTC 33840
TCTCCAGCTC TCTCGCTTCC AGACGTCGGC ACTGTCTCCG TGGTGTGTCC CCTGCCTTCT 33900
GTCTCTCTCG CCCTCTGCCT CTCCCCGCTT TTCCTCTCTC TCGGCATTAA TGTCTGTCTC 33960
ATCTTCCACA CTGACTTGTT TCTCCATCCT TCTCCTGCCT GCTGTGGTCT GAATGTTTCC 34020
ATTACCCAAA ACTCATGTGT TGAAATCGTA ACCCCAAGGT GCCGGTGTGC GGAGGTGAGG 34080 CATTCGGAGG GAATTAGGCC ATGAGGATAG AGCCCTCCTA AGTGGCCCCA GAGTGGGGCT 34140
TCAGAGAACT CCCTCACCTT CCATCATGTG AGGACACAGC CAGAAGACGC CACCCGTCTA 34200
TGTACCAGGA GGCGAGACCT CTCCAGGCAC CGACTCTGCC GGCACCTTGA TCCTGGACTT 34260
TCTGGCCTCC AGAGCGATGG GAAATAAGTT CCTGTCGTCT ATAAACCACT CAGTCTCAGG 34320
TACCTGCCCA GACTGACAAA GTGGCTACCC CTGCCTGTCT GGGTCTCTGT TTACCTTCTG 34380 TGTGTCTGAC TCTGTCACTG TCATTGTATC TTTCTGTGTC TCTGGGGGTA GCCCCTGACT 34440
CTGTCTTTCT CCCTGAGTGC ATCTTTCTGT GATTCCTTGT CACTGTGTGT CTTTCTGACT 34500
CTTACCTCCC TCTGTCCCGC TACTTCTCTC TCCCCTCCTC CTCCTTCCCA CTCCTCGCCA 34560
GCTCAAGCAG GCAAGATTTA CTCATGACGG GACCAGCACA GATGCAAACC CTCTGTGGGC 34620
AGGACTTTCT TGGGCTGTAA ACCTGGATGA AGCCCTCAGA CCCTCCTTTT TCCTTCCCAA 34680 TGATTGTGTG GTCACCTTGA GATGAAACCA GGCCCTCTCC AGGCACATGC TCTCTGTCTA 34740
TCTAGGGCTG GGCTTGGGCC ACTGATGCCA CCAAGGAGCA AGGGAGGGAA GCTGTCCGTT 34800
CAGCACCACA GCCAGCCCTC TTGCCCATTC AGGTCAATCA AGTGCCCACC AGCCAGTGTC 34860
CCTGCTGCCC AACCCAAACC AGAAGCAAGC CGGGCTCCTG TGGCCCTGTG CCCTGTCAGG 34920
GGAAGAGGAA GGCGCCTGCT GTCACAGTGA AAATAATTTA GCTCTTTTGG TCTATTCAGG 34980 GCGAACCTCA TTCCTAAGCA GACACGCTGG CCCGGTTTCT CACTAGTGCT CGATAATCCT 35040
TTTGGCTGGG TGCAGTGGCT CATTTAACTG TAATCCCAGC ACTTTGGGAG GCCAAGGCAG 35100
GTGGAACACC TGAGGTCAGG AGTTTGAGAC CAGCCTGACC AACATGGTGA AACCCGATCT 35160
CTACTAAAAA TATAAAAATT AGCCAGGCGT GGTGGCAGGC ACCTGTAATC CTAGCTACTT 35220
GGGAGGCTGA GGCAGGAGAA TCGCTTGAAC CTGGGAGGCG GAGGTTGCAG TGAGCCGAGG 35280 TCGCGCCATC GCACTCCAGC CTGGGTGACA GTGTGAGACT CCGTCTCAAA ACAGAAAGAA 35340
AAAGAGAGAG AGGAAGAAAG GAAGGAGGGA GGGAGGGAGG AAAAGAAGAA AGGAAAGGAA 35400
AGGAAGACAG ACAAGGCAGA AGTAATCAAG CCTTTCATGG TGAGCTGGGT CTTCTGGTGA 35460
CAGTGCAGAG AATGGTCTGT CCTGACTTAA ATTTCCTGGT GACCTACACT TTTCTGGACA 35520 GAGCAGCACA GAGCCCAAGA GGGTGTAAGG AGGAGCAGAA AGGAATCCCA GGGTGGGCAG 35580
GCCCGTGCGA GAGCCTTTGG GGGAAGGAAT GAGACTTTGA GCCGGGAAGC GAGGCAAAGC 35640
TACCTGTCTT GGTCATTGTC TTCAGGGAGG GAGATGGAGG GGGACCAGGT GGGGGAGCCT 35700
CACAGGGGAC TTTGGTCTGA CTTGTCAAGT TTTCTTTTTT TCTTTTTGAG ATGGAGTCTT 35760
GCACTGTTGC CCAGGCTGCA GTGCAGTGGT GCGATCTCGG CTCACCGCAA GCTCCGCCTC 35820 CTGGGTTCAC ACCATTCTCC TGCCTCAGCC TCCCGAGTAG CTGGGACCAC AGGCACCGCC 35880
ACCACACCCA GCTAATTTTT TGTATTTTTA GTAGAGACGG GGTTTCACTA TATTAGCCAG 35940
GATAGTCTCG ATCTCCTGAC CTCGTGATCC GCCCGCCTCG ACCTCCCAAA GTGCTGGGAT 36000
TACAGGTGTG AGCCACTGTG CCTGGCCTAC TTTATTTTTT AGAAACAGGA CTGTGCTCTG 36060
TTGCCCATGC TGGAGTGTAG GGTGCAGCTG TGCGGTTCAC TGCAGCCTTG AACTTCTGGG 36120 CTTGACGGAT CCTGCCATCT TAGCAGCTGG GACTACAGGT GCATGCCAGC ACACCAGTTT 36180
TCTTTTTTTT TTTATCTCTG CTCACTGCAA TTCCGCCTCC TGGGTTCTAG CGATTCTCCT 36240
GCCTCAGCCT CCCAAGTAGC AGGGATTACA CGCACATGCC ACCACACCCG GCTAATTTTT 36300
GTATTTTTAG TAGAGACAGG GTTTCACTAT GTTGGTCAGG CTGGTCTTGA GCCACCGCGC 36360
CCGCCCGGCC TACACACCAG CTTAAAAAAA AGAAAAAAAT AGCTGGGCGT GGTGGCTCAT 36420 GCCTGTAATC CCAGCACTTT GGGAGGCTGA GGCAGGCAGA TCACCTGAGG TCAGGAGTTC 36480
AAGACCAACC TGGCCAACAT GGCGAAACCC TGTCTCTACT ACAAATATAA AAATCAGCCA 36540
GGCGTGGTGG CGGGCTCCTC TAATTCCAGC TACTTGGGAG GCTGAGGCAG GAGAATCACT 36600
TGAACCCGGG AGGTGGAGGT TGAAGTGAGC CAAGATCGAG CTACTGCACT CCAGCCTGGG 36660
AGCAAGACTC CCGTCTCAAA AAAAAAAAAA AAATTTGTAG TGGTATGGAG GCCGGGCATG 36720 GTGGCTCACG CCTGTAATCC CAGAACTTTG AGGGGCCAAG GCGGGCAGAT CATGAGGTCA 36780
GGAGTTCGAG ACCAGCCTGA CCAACATGAT GAAACCCTGT CTCTACTAAA AATAACAAAA 36840
ATTAGCCAGG CATGGTGGCG GGCACGTGTA GTCCCAGCTA CTCGGGAGAC TGAGACGGGA 36900
GAATCGCTTG AACCCAGGAG GCAGAGGTTG CAGTGAGCTG AGATCACGCC ACTGCACTCC 36960
AGCCTGGGTG ACAGAGTGAG ACTCTGTCTC AAAAACAAAC ACAAACAAAC ATAT TATAT 37020 ATACATGTAT ATATATAATA TATATATACG TATATATACA CGTGTATATA TATAATATAT 37080
ATACGTATAT ATACACGTGT ATATATAATA TATATACGTA TATATGTATA TATTAATATA 37140
TATACGTATA TATACACGTG TATATATTAA TATATATACG TATATATACA CGTGTGTATA 37200
TATTAATATA TATACGTATA TATGTGTGTG TGTGTATATA TATATGTATA TATATATATA 37260
TATATACATA TATATATACA GAGAGAGAGA GAGTAGTGAT AGGTCTTGCT GTCTTGTCCA 37320 GGCTGATCTT GAACTCCCGG CCTCAAGAGA CCCTCCCACC TCAGCCTCCC AAAGCACTAG 37380
GATTATAGGT GTAAGCCACA GTACCTAGCC TATTAAAAAT TAATGTTAAA CAAGAGGATG 37440
TGATGAGGGA GTTAGAGGGT GTGCCAGCCA TGTGTTCCAC AGCAGCAGGT CAGGAGACAT 37500
TGGGGACATT TAGAGGAGCT GAAGAGGTGG CCAACCCTGT GCTCAGGAGG ACGGGGGAGG 37560 GAGAGAGCAA GAGGGAGTTT GGGCTGGGGC AGAACGTACC TGGGTCCTGA GAGGATAAGA 37620
AGGTAGGGAC TTGGCCCCTC CAGGCCTGAC TCTGCCAGCA ACCAGCTCCC TATCAGCAGA 37680
CTCCAGGCCC CTACCCTTCA GCTCATCCTT CCTTATCACA CATCCAAAAC TCTGAATGTG 37740
GCCGGGCGCA GTGGCTCACG CCTGTAATCC CAGAACTTTG GGAGGCTGAG GCAGGAGGAT 37800
CGCTTGAGAA CAAGAGTTTG AGACCAGCCT AGGCAACATG GTGAAACCCC ATCTCTACTA 37860 AAAATATAAA AATTAGCTGG GTGTGGTGGC ACATGCCTGT TGCCCCAGCT ACTCAGGAGG 37920
CTGAGGCAGG AGAATCACTT GAGCCTGGAA GGCGGAAGTT GTAGTGAGCA GAGATTGTGC 37980
CACTGCGTTC CAGCCTGGGC AACACAGCGA GACTCTGTCT CAAAAAACAA AAACTGGAAT 38040
GTGTTTACCA TAAAGGCCAG AAAATGTGAT TAACAGCTGC TCAAAGCCCC TGTCTGCCCT 38100
AAGCCTGAAA TTTTCACCGA AAAAAAGATC TGTAGGCTCA TACAGAGGAA GGACAAACAC 38160 CAGGGAGGCT CTCTTCCAGT TTGCTTCACC TCAGCAAGCA GACGGCTGGC AGCAATTTGG 38220
GGGCAGGTGT GAGCACCTGC ATCATCAGGA AAGAAGGGGC ACGGTGGGGA CGCAGGTCAG 38280
ACCTCTCACA GGTCTTGGCT CTGCCCAGGA GACACGTGTC CAACTGAGAG GTGAGGAACT 38340
GGGTTCTGCA GCTGCAGACA CAGGTGCGGC TCAGCATCTG ATGGCCACGG AGACCCCCTG 38400
GCTTGGCTTC TCCCAGCTGG TGGCCCATGA GGAGCTTCTA TCCCAAGAGA CTGTCCCTCA 38460 AGGAGCAAGT GGGACCAGGT ACCCACAGGA CGGAGCCTGG GAGTGAGGCC TGCCCTGTGG 38520
TCTGGCTACA GGGAGGAAGG GCAGATTGGA GGGGGCAGGA CAGCAGGTCA GGAATTGGCC 38580
AACTCTGGAG AGAGCAAGCA AGGGGAAGTC TGCGCACAGG GCAGGGCTGG TCAGGGGCGA 38640
GGCAGGGC T TGGACCAGTA TTTTCAGAGC TGGTGAGGCT TAAAGAGCAT GTCTACTGCC 38700
TCTTATTACA GAGAGAGGAT GCCGAGGCCC AGACCCATCC AGGCCACCTC TCCACAGACA 38760 CAGCTGGTGC CAGGGAAGCC CCTCCCAGAG CCTCAAGGCA TTGCTCCCTC TCTCTCTCTC 38820
TTTTTGTTTT TTTGGAGACG GAGTCTCACT CTGTCTCCCA GGCTGGAGTG CAGTGGTACA 38880
ATCTCGGCTC ACGGCAAGCT CCGCCTCCCG GATTCACGCC ATTCTCCTGC CTCAGCCTCC 38940
CGAATAGCTG GGACTACAGG CGCCCGCCAC CACGCCCAGC TAATTTTTTG TATTTTTAGT 39000
AGAGACGGGG TTTCACTGTG TTAGCCAGGA TGGTCTCGAT CTCCTGACCT TGTGATCCGC 39060 CCGTCTCAGC CTCCCAAAGT GCTGGGATTA CAGGTGTGAG CCACCGCGCC TGGACTTTTT 39120
TTTTTTTTTA AGACGGGGTC TCACTCTGTC ACCCAGGCTG GAGTGCAGTG GCGCGATGTC 39180
GGCTCACTGC AACCTCTGCC TCCCCAGTTC AAGTGATTCT CCTGCCTCAG CCTCCCAAGT 39240
AGCTAGAATT ACAGGCACAT GCCACCATGC CCAGCTAATT TTCTGTATTT TTAGTAGAGA 39300
TGAGGTTTCA CCATGTTGGC CAGGCTGGTC TTGAACTCCT GACCTCCGGT GATCTGCCCA 39360 CCTCAGCCTC CCAAAGTGCT GGGATGACAG GCGTGAGCCC CCGCGCCTGG CCCCCCGCAG 39420
TGCTGGGATT ACAGGCGTGA GCCCCCGCGC CCGGCCCCTC CCTCTCTTTG ACTCCCTTCT 39480
TTCTCACCGC CCCCTCCCCA CCATCCTTCC CCTTCACTGA CTTCAGGGAG TTAAAAACAA 39540
TTCTCGCAGT GAGCTGGGCT TGTTTTGTCT CCCTGCTTCT CTTTGTACTA AACATTAGAT 39600 ACCGAGGAAA TGCGGATTGG CCTTTGGATG ATTCATGAGC AGGAGTCAGA AAAAGGCACC 39660
AGGTTGGCCT CAAGCAGCAG GGTATAGTAG TGCCCGCTCC CAGGGTCACA CCTCACGCCC 39720
ACCCCTCCCG CCGTCCAGGT GGATGGTGCC CACTCCCAGG GTCACACCTC ACGCCCACCC 39780
CTCCCGCCGT CCAGGTGGAT GGTGCCCACT CCCAGGGTCA CACCTCACGC CCACCCCTCC 39840
CGTCGCCCAG GTGGATGGTG CCCACTCCCA GGGTCACACC TCACGCCCGC CCCTCCCACC 39900 CACCCGGGTG GATGGTGCCC GCTCCCAGGG TCACACCTGA CGCCCACCCG GGTGGATGGT 39960
GCCCGCTCCC AGGGTCACAC CTCACGCCCA CCCCTCCCGC CCGCCCGGGT GGATGGTGCC 40020
CGCTCCCAGG GTCACACCTC ACGCCCACCC CTCCCGCCGT CCAGGTGGAT GGTGCCCACT 40080
CCCAGGGTCA CACCTCACGC CCACCCCTCC CGCCGCCCAG GTGGATGGTG CCCACTCCCA 40140
GGGTCACACC TCACACCCAC CCCTCCCGCC CACCCGGGTG GATGCCCTTA TCAGCTCTCC 40200 TTCTCCTTCT CTTTCGTCTT CTTCGTCTTC CTCCTCTTCT TTCTTCTTTT TTTTTTTTTT 40260
TAGAAAGAGT TTCTACTCTT GCTGCCCAGG CTGGAGTGCA ATGGCACAAT CTCAGCTCAC 40320
TGCAACCTCC CTCTCCCCGG GTCAAGCAAT TATCCTGCCT CAGTCTCCCA GATTGCTGGG 40380
ATCACAGGAG TGTGTCACCA CACCTGGCTA ATTTTGTACT TTTAGCAGAG AGGGGGGATT 40440
TCACCATGTT GGCCAGGCTA GTCTCGAACT CTTGACCTCA GTTTATCCAC CGGCCTCAGC 40500 CTCTCAAAGT GCTGGGATTA CAGGCATGAG CCACCCTATC TGCCTCACTT CTACAGAGGA 40560
GGAATGAAGG CTCAGAGAGG GCAAGCATTC CACCCAGCAT CACACAGAGT GCCGGGTGAG 40620
AGCCCAGTCA TGAGCCTGGG CCTGACTGCA GGCTCCTGTT GGGAGCTCGC GGAGGTGGGG 40680
GATCTGTCCA GAACTGAGAG GCCAGGGGAC CACAGTGGCC TCTGACCCCT GGAGGGCCCT 40740
GGAGGCTGCT GCCGGCTCCC CCCGGGGGCA GATGGAGGTC ACTGTCACCC AGGCTGCTTC 40800 TCATGGTGCC AGGAGCACAG CATGGCAGGA GCCACCAGCC GATTTGCCTT TCCCTGGGCA 40860
GGAAACTCAG AAATGTGGCT ACCACAGTCA GGCTGCTTGA CGTGCGGTGA GCACTCATCT 40920
CTTAGCAGGC AAGCGGCCAA GCACCTTTCC TGAAATATTG AGGCCTCAGA ACAAGCCCCA 40980
GGAGAGGTGC CAGCACCGTC ATCTCTACCC AGATAAGGAG ACCCAGGTCC TGAGAGGTTA 41040
GGCAGCTCGG ACAACACCAC ACAGCTGGAG GAGGTCAGAC TCTGGGTTGC AGAAGGAGAA 41100 TGTGAGCAGA GGCCACAAAA GAGCGAGGAG CCAGTGCCCA GATGCCGAGA TGCCCTCGCC 41160
CTCCCAGCTC AGCCCCAGGA ACCGAGCCCA TGGGGAGGGA CCGTCAGGGA AAGGCTGTCA 41220
GGAAGGGCAG GAGGCGGCCC TGGAGAGGAC GGCGCTGCCC TCAGGGGCAG GAGGGGAGTC 41280
CCCTCCGCTG AGAGCCCCCC CACCCCCAGT ATCCCCGGGG GTGTCCAGGA GGAGGCGGAG 41340
GGAGGAAGCG CAGATGGACA GGACTCCCAG ATAGGGTGGG GAGGTGTGGC CGGTGACACA 41400 CACGGTCCCC TCCTGGCAGG TGCTGAAGTC ACCTGGAGCC TCCAAGCCCG TGGGGCCTGA 41460
GGGGCGGGGT CAGGTCGGGC ACGCGTGGGT GGGCGGAGTT CTGCGCCCCG GGCCAAGGCG 41520
CCCGAGTTGA ACCAGTCAGC TCGGGAGAGG GACCGCGGCG ACCTGTCCCG GGGGCGTAAG 41580
AAAAGGTGGG AGGGAGTGCG GCTCGTGAAC GGGGGCGGCG ATGGGAAGGA GGTGCGGCCC 41640 TTCGTCCTGT CCTCCCAAAC GTCGAGTGAA AAACGAAGCG GGTTCTGCGG CCTCGCGGCG 41700
GAGCAGAGCG TTTCGGGAAG GGCGGGCCCA GCGTCCTCGC GCCCGAGGTC GCCCGGCAGC 41760
TCCCCTGCGT CCAGAATCCG CCCCCCGCCC GGGCCTGCGC CCGCCCCTCC GCCTGAGCTC 41820
CGCGCGGGAC GGGCCGGGAG GCCGGGGTGG GCGCTACCTT CGAAGGCGGT GGGTCCGCCC 41880
CGCGGGAGGT GGAGGGGCGG GAGGGGCGGA GCCCTCTGGT CTCCGGAGGG TTTGGGGATC 41940 GCAGTCGCCC CTCCCCCATC CAGACCCCGC GGCGCAAAGG GCAGTGGCTT TTCTGGCCAG 42000
AGCAGGTGGC GCGGGCGTCG CAAAGGGTGG TCCCCGAGGC CGCAGCGGTG TGGGGGGAGG 42060
GCGCGGTCCC CCTCACTCCG GGCTCCGCCG TGTCTGGCCC GCCCCCCTCC TTCAGCGCCC 42120
CCTCCAGCCC CTGTGCTGCA CTGGCGCGGG GAGCGCCGGG TTCCCGGCTG GGGCTTTGGC 42180
AGAGGGTCCC ACCCTCTCCC CGCCTCCCCA CGAAGGCTCT GGCGGACCCA GATCTCGGGT 42240 CGCCGGACGC CCCAGGGACC CCGCCCGCAC ATCGCGAGCG CGCCCACCCG GTCGCGAGCC 42300
CACGCCCGGG TCTGGGAGCC ACCCTGCGGC AGTCGCGCCC TGCGTGGCAC GCTGCTCCCC 42360
CAGGGGCGAG GCGCCCCCGC CCGACGTCCC GGTCCCGAGC GCTCCCCGGC GCGGCGCCTC 42420
GCAGCCCAGC GCCCCACCAG CCCCGCCGGC GCCGCAGACC CCAGCCTCGG GCGGGTCGGG 42480
CCCAGGCTTG CAACGCGCAG GGTAGGAGAA GGGAAATTGG CGTCCGCTGC CGGCCGCTGC 42540 CCCAGGCGAG GCCAGACGAG GCCTCTGCTC AGATCCCGCC GCCCCACAAA GCCCGTGGCC 42600
CCGGAGCCTA CCGGAAATGG TGCTGGCCAT GGTGCTGGCG GCGGTTGGGC CTGCGGAGGC 42660
TGGAGAGGCG CAAGTGGCGG CCGGAGCTGC AGACGGCTGG TGCTGCAGTG CCGGGGAGGG 42720
GAGGGGAGAG GAGTGGAGGG AGCGAGGGCG GGCGGGAGGC GGGCGCGGCG GGAGAGAGAG 42780
AGGGAGGGAG ACAGAGGGAG AGAGAGAGAG GGTTGGGGGA AGGAGCGGGG GGAGGAGGGA 42840 GGGAGGGTTG GGGGAAGGAG AGAGAGAGAG AGAGAGACTG CGGGGGCGGG GGAAGGAGGG 42900
AGGGAGGAAG GGAGGGAGGA AGAGAGAGAG GAGCAAGCGC CTGGCTGCGG AAGGGGCCGC 42960
GGCTCTCAGG GGGAGAGGGC GGAGGAGGGG GGCTACCCGA ACTGCAACAA GACCCCCCAC 43020
CCTCCAACCG CTCACAGCGG GACAGCTGCT TCTCCAACTT GGCTTTGTGA GGCCTGAGAG 43080
TGGGGTGGGG GTGGAGATGA GCCCCCATTC CCCAGGGCAG GCGGGGCAGG GGCAATGCCG 43140 GAGGAGCAGG TCCCACCCAT GGGGTGGGGC CGCAGAGCTC TTCGCCGCCA AGGCCGCTGT 43200
AGGCTGGGCT GGCGCCAACA GGGTCCAGGT CTGTGCCTGC CATCGGAGAG GATGCCACAG 43260
CCACAGGGGT GGGCGCTGGC CTGGAGGCCT CCAAGGGGCA TCTCCTGTGA GCCCAGGGGA 43320
TGGGCAGGAT CTGAGCGGAG AAGAGTGAAA GTGGAGGAGT GAGGCCAGAA CAAAGGCTTT 43380
GCCGTGAAAG AGGTGGTTTC CCGCCTGGGC TCAGACCTTC ACTCACTGTG TGGCCCAGGC 43440 CAAGGGCAAG CGTCTGACCT CGCTGGGCCT TTGTTTCTCA GGGGTAAGAT GAAACAATGA 43500
TGCCCCCAGA CGATGGAGAG GAGGGGTGCC AGGGTTGTGC GCACTTAGTG AGTGGGGGGC 43560
AACCTATCCT GCCTCCCCCT CTCCTCATAA CTCCCAAAGG GAAAGCCTGG TAGGCAAACG 43620
GAGCGTCTTT GCCATTGCAG GGATGAAGCC ACCGAGGCAG GGAGAAAAGT GCTTTGCCCT 43680 ACAAGCAACT AAGTCATAGG GCCAGGAGCA AAACCCTGAA AACCTCAGGA GACTTGCAGA 43740
GCCATGAGGC TGGCTCAGCA ACACAAAAGC CAGGGGCAAG CCTCAGCTCT AGCAGTGCGG 43800
TGGGAGCACC CAAGGCCAGT CACATCCTAG GGTGGCCTGG AGAGTCCTGA CCCCTGACGT 43860
GCAAGCCGGC ATCATCCCCG GGACTGTGAG TCTGGTGGGG GTGATGCCCA GGAATGTGAC 43920
ATTGTGTGGC CCAGAGGTAC CCTTAAGACT GGAGGATCAC CAGGCGGGCC CTGACCTCAT 43980 CACAGGAGCC CTTTAAAAGC AGTTTCCTTT GCCTGGTTGA AGAAATCGGA GGGATCAAAC 44040
CAAAGAAGGT TTTCTGTTGT TGAGATGAGG GGGCCACGTG GCAAGGATCT GAGAACTGCT 44100
CCCAGCCAAC AGCCAGCAAG ACAACAAGAC CTTAACTGCA AGGAAGTGAG TTCTGCCAAC 44160
AAGAAGAGAA TGGGCTTGGA GGCAGGTTTG ACCCCAGGGC CTCCACACAA GAACTGAGCC 44220
CAACTGCCCA CTTGGTTTCA GCCTTGGGTT ACTAAGAATT AGGAGGTAAT GAATGAGAGT 44280 TGTTTTAAGC TGTTGGTTTT GTGGTGATTT GCTATGAAGC CATATCAAAC TAATATACAC 44340
ACAGAGGTGT TGGCCCCTGG GCCATTCCTA GGAAGCCAGC TCTGCGAAGG AGGAAGAAGG 44400
GCAGAGAGGC ACACAGAGCT GCCCACCACA GCAGCTGTGT CCTCCCTGTT GGCCACCACA 44460
GTAGCAGTTG GGGATGGTCA GCATCCTTCA GGCAGACTCC AGCCCCGGGT GCTGGAGCTC 44520
AGGTGCTAGG GATCAAGAGA AGTAGCCCTC TCTGGGACCT CCAGAGTCTT CTCATGTGGG 44580 TGGGGTAGGA CCCACCCAGT CAGGCTCAGA GCACCGCAAT GCCTCACACT CATTGTGACT 44640
CTGGCCAGGC CCTCTCTGAG CCTCTGTGTC CTCATCTGGA GCACAGGGAC CAGGTGTGTG 44700
GAAGCCCGTG GCATAGTGCC AGGAACAC G TAGATGTGCA CAGTGTGCAC TAGCAGGAAC 44760
ACACAACAGG GGTACTGACT GTCAGCACCT AGGCAGGCAC ACGCAATGGG GTACTGACTG 44820
TCAGCCATAC TGACTGTCAG CGTGCTAGCA GGCATACACA ACAGCTGTAC TGACAGCACA 44880 CTAGCAGGCA CATGCCATAG GTGTACTGAC TCTCAGTGCA CTGGCAGGCA CACGCAATAG 44940
GAGTAATGAC AGCATGCTGG CAGGCACACA ATAGCTGTAC TGACTGTTTG CCCCAATATA 45000
GTGCCAGGTC TTGGAGCAGA TTTTGACTTC TCACCAAGAT CAAATGCAGA AAGTGCACGA 45060
GCATTTCAAA GATGTTTTTC ACATGCACAT TAGTGCTAGT TAAAAAAATG TTTTGACTGG 45120
GTGCAGTGGC TCACAACTGT AATCCCAACA CTTTGGGGGG CCGAGGTGGG CAGATCACCT 45180 GAGGTCAGGA GTTTGAGACC AGCCTGGCCA ACATGGTGAA ACCCCATCTA CCCTAAAAAT 45240
ACAAAAATTA GCCAGGTGTG GTGGCAGGTG CCTGTAATCT CAGCTACTTT GGAGGCTGAA 45300
GCAGGAGAAT CACTTGAATC CAGGAGGCAG AGGTTGCAGT GAGCCGAGAT CCCACCACTG 45360
CACTCCAGCC TGGGCAACAA TATCAAGACT CCACCTCAAA AAAAAAAATG TTTTTCATAA 45420
AGTGTGACTT TTATCAGACC TCTGCATTCT TGAAATTAAC TCTGGCTTGG CTGGGCGTGG 45480 TGGCCCACAC CTGTAATCTT AACACTTTGG GAGGCTGAGG TGGGCAGATC ACGAGGTCAG 45540
GAGTTCAAGA CCAGCCTGAC CAACATGATG AAACCCCATC TCTACTAAAA ATACAAAAAT 45600
TAGCCGGGCG TGGTGGCATG CACCTGTAAT CCCAGCTACT CAGGAGGCTG AGGCAGGAGA 45660
ATCGCTTGAA CCCAGGAGGT GGAGGTTGCA GGGAGCCGAG ATCGCACCAC TCTATTCCAG 45720 CCTGGGCGAC AGAGCAAGAC TCTGTCTCAA AAAAAAAAAA GAAAGAAAGA AATTAACTCT 45780
GGCTCCTAGA AGGAGCCCTA TATCTCAGCA GGACACTCAG TCATTCAACA GACATCTGTC 45840
AAGCACCTGC TGTATGCTGG AGCTGTGGGT ACGTCAGCAA TTAGAGGAAG AGGGCAGGGG 45900
TACAGGAGTT CCTGACCACC CCAGGCCAGC ACGCTCCTAT AGCAGCTGGC AAGGAGCAGA 45960
TGACTCAGAC TTCAGCTCAG TCCACAGGAC AGCCTTTTCT GGCCACTGCT CTCAGGAGAT 46020 GAGATGTGTG GCTGCAAAAG GTAAACTCCT GGCTCCTGAG CAGGCTCTGG GCAATCTGCT 46080
CAACGCTCTG TGCCTCACTT TCTCACCCAG AAAGTGTGGA CAATGAGAGG ACTTATCTGG 46140
CTGGGCGCGG TGGCTCACGC CTGTAATCCC AGCACTTTGG GAGGCCGAGG CGGGTGGATC 46200
ACCTGAGGTC AGGAGTTCAA GACCTGCCTG GCCAACACGG TCAAACTCCA TCTCTACTAA 46260
AAATATAAAA AATTAGCCGG GCTTAGTGGT GCACACCTGT AATCCCAGCT ACTTGAGAGG 46320 CTGAGGCAGG AGAATCACTT GAACCCAGGA GGTGGAGGTT GCAGTGAGCC AAGATTGTGC 46380
CACTGCACTC CAGCCTGGGC AAAAAGCCAA AACTCTGTCT CAAAGAAAAA AGAATCATGG 46440
CAGAAGGTGA AGTCTATGTT AGTCCCAGTT CCCAGGTCGT ACATGGCGGC AGGAGAAAGA 46500
GAGAGAGAAG GGGAAACTGC CACTTTTAAA CCATCGGGTC TCCTGAGCAC TCACTGTCAG 46560
AACAGCCTGG AGGAAACTGA CCGCATGATC CAACCACCTC CCTCCAGGTC CCTCCCTCCA 46620 CACGTGGGGA TTACAATTCG AGGTGAGACT TGGGTGGAGA CACAGAGCCG AACCATATCA 46680
GCATGTATGG GGGGCACTGA AACTTGTGCT TGGTGCCCAT TCATTCAACG AGTGTGTGTG 46740
GCTGGTCTCC TCATCTTCAA CTCCCTGCCG AGTCTCAGAT AGGCAGCCTG CAGTTCCTTC 46800
ACCACAACAG GCACATGGGG CTGGGTGCCA GTGAGTGCTG GGGCTTCTCC GAGCACTATC 46860
TCACACCCAG GAGCGTGGGC ACGCATGGC TTCGCATGTG CCGTCAGTGG ACATTAAACA 46920 CAGCCATGAA GAAGCCACGA AGAAGTGCTG CCTGCCGGCC GTGCGCGGTC ACGCAGCGCC 46980
AACTCCCTCC TGGGGCCTTC TGGGGCCTTC TGGGGCATGG GAGCTGGGGC CGCCTGAGAC 47040
AAACATCCGT GACGCTGGGC TGACCCCACA GAACGGTGCG GGCCTCGCTC TTGGAGTCAG 47100
CCCTGCTGCC AGCCAGTGCC GGGTGCTGGG GACTCAGGGA GGCCCGCCGG GACCACTGCG 47160
GGACAGTGAG CCGAGCAGAA GCTGGAACGC AGGAGAGGAA GGAGAGGGGG CGGTCAGGGC 47220 TCTCAGGAGC CGGGTCCTGG GCAAGGCGCA GCCGTTTTCA AATTTTCAGG AAAGCGGTCG 47280
GCTCACACTC GAGCAGTAAA AAGATGCCTC TGGGGAGGAG GCCCGTGCAG CTCTCCGGGC 47340
AATGGTGGTG GCTCGGCCTA GAGAGGCGGT AGTGGAACGC AGACCCTGGT GGGGGAATGA 47400
CATCAAGGGA GGAGACGGGC GGGACCCCAG ATTTCTGCCT GTGGGCGATG GAAGTGAGGT 47460
TCACTGGCCA GCGGAGCCGG ACACAGAACG CGCAAAACGC CGTGTAGGCC TGGAGGAGCC 47520 GAAGAGCAGG CGGACCCCCT CCGCGGGGGA ACAGTTTCCG CCGGGAGCAC AAAGCAACGG 47580
ACCGGAAGTG GGGGGCGGAA GTGCAGTGGG CTCAGCGCCG ACTGCGCGCC TCTGCCCGCG 47640
AAAACTCTGA GCTGGCTGAC AGCTGGGGAC GGGTGGCGGC CCTCGACTGG AGTCGGTTGA 47700
GTTCCTGAGG GACCCCGGTT CTGGAAGGTT CGCCGCGGAG ACAAGTGAGC AGTGAGTCGC 47760 AGTGACCCTA CAAGTGGTTC TTTTACCCGA GCGGCTCGTA GGCGCGTTGC GGTTTTTCGA 47820
AACTACAGCT CCCGGCAGGC CCCAAGCCGC CCTCGGGGCC GCGGGTCGGC GGATTGGCCG 47880
CGCTGCATTT TGGGACCTGT AGTTTCCTGC GCTCGTGGCG CTGGCGCCGC GGCGTTGGCT 47940
GAGCCCTTGA CCGGGGCTGG AGGGAAGGGC CGACATTCAG TGTGTCCGCG TCTGTTCTGT 48000
TAGTCCCAGT TCCCGGGCGG GATTGAGGCT TAGAGAAGTT GAGTGATTTG CTGAGGGCTG 48060 CACGGGTTGG CATCCCGGCA TGCTCTTTCG CTACTTTGGC TGCATCTGGT TGCCCACCCG 48120
GGCGGATGGG GAATGGACTC CAGCCAGCCA GGAGGGCAGA GGGCTGGAGA GGCAGGGCCG 48180
GAGGTTCAGA CCCTCCGCTC TGACGTTGCG CCTGGTGAGG CCGGGAGGGG TGCCGCTTGC 48240
CTCTTCAGCC CTCACGCTCT TGTGGAAGTC GCGGAATTAC TGCAGGCGGA ACTTGCAGCA 48300
CTGTGGGCGT CTTTTCCAGA GAAGGACGGA GTTGTGGGGC GGGAGGATAA GGCAAGGCCC 48360 AGCCACTTCG CATCTTCGCC CCGCCAGCTC CTCGAGATGG GATATACCAG GGTTGCTCTC 48420
CAACCCTCTC CGCAGGAGGG ACTGATGGAA ACGCCTGGGA AAGTAGCCCG GTACCCACAA 48480
AGGCTGTCTA CAAACAGAGT CTTACTGTCT TTCCCAGGTC TGTGCCATAG GGATTCTCGA 48540
AGAGAACAGC GTTGTGTCCC AGTGCACATG CTCGCATCGC TTACCAGGAG TGCCCGAGAC 48600
CCTAAGATGT TCGGAGTGGT TTTTTCGCAC AGACCCGAAT AGCCTGCCCC TCAGCCACGC 48660 TCTGTGCCCT TCTGAGAACA GGCTGATATG CCCAAGATAG TCCTGAATGG TGTGACCGTA 48720
GACTTCCCTT TCCAGCCCTA CAAATGCCAA CAGGAGTACA TGACCAAGGT CCTGGAATGT 48780
CTGCAGCAGG TAGAGCACAG GCCCCGAGGA AAGGACTGCG GGTGGGTGGA GCTTCAGCCA 48840
GGACGGGGTG TGCTTCCCTC TCCCGGCCCA TTCCAGCCAG GCCCCTCCGG GCCAGAGGCA 48900
GCGTCTGTCA TAAAAAGGGC TGGTGTTCCA GGTGGGGTCA GAGAGAGGAT TGACAAGTAA 48960 AAACGATCGT CCTTTGAAGG GGGCCGGCCC CTCCACACCT GTGGGTATTT CTCATCAGGC 49020
GGGACGAGAG ACTGAGAAAA TGAATAAGAC ACAGAGACAA AGTATAGAGA GAAAAGTGGG 49080
CCCAGGGGAC CGGCGCTCAG CATACAGAGG ACCTGCACCG GCACCAGTCT CTGAGTTTCC 49140
TCAGTATTCA TTAATTACTA TTTTCACTAT CTCAGCAAGA GGAATGCGGC AGGACAGCAA 49200
GGTGATAGTG GGGAGAAGGT CAGCAAGAAA ACGTGAGCAA AGGAATCTGG GTCACAAATA 49260 AGTTCAAGGG AAGGTACTAT GCCTGGATGT GCACGTAGGC TAGTTTTATG CTTTTCTCCA 49320
CCCAAACATC TCGGTGGAGT AAAGAGTAAC AGAGCAGCAT TGCTGCCAAT ATGTCTCGCC 49380
TCCTGCCACA GGGCGGCTTT TCTCCTATCT CAGAATTGAA CAAATGTACA ATCGGGTTTT 49440
ATACCGAAAC ATTCAGTTCC CAGGGGCAGG CAGGAGACAG TGGCCTTCCT CTATCTCGAC 49500
TGCAAGAGGC TTTCCTCTTT TACTAATCCT CAGCACAGAC CCTTCACGGG TGTTGGGCTG 49560 GGGGACTGTC AGGTCTTTCC CATCCCACGA GGCCATATTT CAGACTATCA CATGGAGAGA 49620
AACCTTGGGC AATACCCGGC TTTCCAGGGC AGAGGTCCCT GCGGCTTTCC GCAGTGCATC 49680
GTGCCCCTGG TTTATCGAGA CTGGAGAATG GCGATGACTT TTACCAAGCA TACTGCCTGT 49740
AAACATATTG TTAACAAGGC ATGTTCTGCA CAGCTCTAGA TCCCTTAAAC CTTGATTCCA 49800 TACAACACAT GTTTCTGTGA GCTCAAGGCT GGGGCAAAGT TACAGATTAA CAGCATCTTA 49860
GGGCAAAGCA ATTGTTCAGG GTACAGGTCA AAATGGAGTG TGTTATGTCT TCCCTTTCTA 49920
CATAGACACA GTAACAGTCT GATCTCTCTT TTCCCTACAG TCCTTGAGGG TGACAGACTT 49980
AGGAGTGCCT TGGGGGCCTC TCTGAGGAGC AGCTGATATT CACGGGTCAG GAGGAAGCAT 50040
TTCCATTAGA GGGGCAGCCG GTGGCCAGCC TCACTTGGAA GGTCTTTGAA CCTCGGGGGT 50100 GCAGGGAGGT GGCAGTGGTG CAGGTTGCCT TCTCCTGGGT TCCTTGAGGT GCCCTCTTGT 50160
ACCCGGCTCA CACCCTTCCC CTCCCCGAGT TTCCTGCTCA GGTTCCCGTC TGAGAGCTTG 50220
TATGTAGGAC GTCAGATAGG ACAGCATAAA TGTTTGGATC CAGAAACGCA GAACAGTTTC 50280
CTATTTTGAG ACTTGACACC TAATTAGTCA TCTTACTATT TAAGCTGAAA AATAGTGTCG 50340
TGTTTTGGGT AACGTTCTGC AAATCGTTTG CTAATGGCGG CTGAGTTGCT TCACGCCCTT 50400 TAGGGCAAGA GTGGGACTTG CCTGTGGACT TCTCCGCGGT CCCACAGGGC TCTCGCCACC 50460
TGGCAGTGGC CTCTGCATCT GCAAAGAGCT GCCCGCTGGC TGCCGAAGCT TGTCTCAGGG 50520
CAGCTTGTGT GGCCTCGCCT CTTCCTGGCT TCCCCGTAAC CCTTGCTCCG AACTCCGTTC 50580
AGAAGGTGAA TGGCATCCTG GAGAGCCCTA CGGGTACAGG GAAGACGCTG TGCCTGCTGT 50640
GCACCACGCT GGCCTGGCGA GAACACCTCC GAGACGGCAT CTCTGCCCGC AAGATTGCCG 50700 AGAGGGCGCA AGGAGAGCTT TTCCCGGATC GGGCCTTGTC ATCCTGGGGC AACGCTGCTG 50760
CTGCTGCTGG AGACCCCATA GGTGACCCTA GTTCCCAGGC CTCTCCTGGC CTCCTGTGGG 50820
GATGGTTGGC AAGGGATGGC GCTGAGGGTG GGGTGGGCCC ATGGGGACTC CTGCCGTCTC 50880
TCAAGCAGAA CTCAAGGAGA ATTTTTTAGC TGCTGTAT A TTTCTCGCCA TCGTGGGTGT 50940
AAACCTAGGG TTGGGCTTTT TTGCTGAATT AGGGCACGGC AGATGCCCAC TTCACCCATT 51000 TTTGATAAAC CAGTATCTGG GGTGTCAGAT TCTTGGCTGT CTGCAGGGCC GAGTTAGCCG 51060
AATGCCACCT GCCTTTGATA CGTGAGAACG TTGTCTGAGA ACCGTGACTT CTGTGCTTGC 51120
TTGTGTCTGG TCAGCTTGCT ACACGGACAT CCCAAAGATT ATTTACGCCT CCAGGACCCA 51180
CTCGCAACTC ACACAGGTCA TCAACGAGCT TCGGAACACC TCCTACCGGT GGGTCAGACG 51240
AGTTTACACC TGTCTCGGGG TCCTCAAGAG AACCAGCTTG GCATGGTGCT GAGTCCACAG 51300 CCCCATGCTG TGCTGTGGTG GAGGGTGGTG GTCTTTCTAG ACGCTCCCCC GAAGTGTGCA 51360
GAGCGCTGGT GCCCAGGGGT GGGGTGCGGC CTGGGCTGCC TCCAATGCCC ATTACTTGTG 51420
AGGAAGCAGC TTTGCATCTG TGTGCTGACC TTGGGCGGGC GTCCTGAGCT CCTCGCAGGT 51480
GCTGTTGTAG CAGCTGTGCA GTAGGTCAGG GCTGGCCCCC AGTGCAGCTT TGCACATGAA 51540
GTAGGAGGAG GCCCTGCTGC TTGTCAGAGC CCAGCAGAGT CTTGGTGTTC TGTCGGGTTC 51600 CTGTGGCCGG ACCAGTGGCA GGGTGCTGTG GAAGCTGTCG AATCTCCTCC CTCTGTCCAG 51660
TACCCCCGCT CGTCTTCTAG CTCCCTCCTA CGCCCGGGCC ACGTTTCAGT TATGCTCACT 51720
TCCTCTGACC GCCGAGGCTC CTGCGTGTCT CCATACAGCT CACGCTGCAG GGCCACGCTG 51780
TGGGTGTTGG AGACAGCTCC TCCTCGACCC ACGGTGCTCT CTCCCACCAG GCCTAAGGTG 51840 TGTGTGCTGG GCTCCCGGGA GCAGCTGTGC ATCCATCCTG AGGTGAAGAA ACAAGAGAGT 51900
AACCATCTAC AGGTAGGCTC CTGGGCTCCC GCTCCGGCTC AGTGTCCGAC AGGCGAGTGC 51960
TGCTGGGTGT CCAGAGCCCC AGGCTGCGCT CCCGCTGGGC TAGGGTTTGA AGTTCACTGG 52020
GGGACTGCAG GGGAGGACCT GGTGGGGGTG GGGACTGGCT TCGGTCCTTT CTTGGCCGTG 52080
CTTCAGCTGC GCACTCTGCC CTTCCTCCCA CAGATCCACT TGTGCCGTAA GAAGGTGGCA 52140 AGTCGCTCCT GTCATTTCTA CAACAACGTA GAAGGTACAA GCAGCTGGGT GGGACCAGGG 52200
TCGGGTTGGA GTGTGTGCAG CCTCTCAGGG TGGAGCTCAG TGGTGTCACA GCCTGGTTGT 52260
GCTTGCCCGG TGGGGCGGCC AGTGCGGCCA TGTACCTGGG CCCTGTCTTC TGACTCGGGG 52320
CCACCCATGT TAGACTTCTG TGTGGAAGAG CTCACACAGT GGTCTGAGAC AGCCAGCCGG 52380
CAAGACTGCC TCTGGCTGGT GCCTGGGGCC TTGGATTTTG GGAAGGCTCC CTCCATTTCC 52440 TGATGAGAGG GTCTCCCTGC ACCTAACCTG CTGGTGCAAA CAGTAGGGGT TTTGCTGAAC 52500
ACCGGCTTTC TCTTCGGGGA CTTTGTTGCT TGCCCAGCAG CAGGTGCTCC AGTGACCGGC 52560
CCTCATACCA TCTTGGGAGG GTGTCCTGGA AGCCGTGTCT GGCCTCCCGC GACCCTGCCC 52620
CGTGTGTCTT TTTCCTGTGC TGACCTTGCT GCGGAAAATT ATGGCCCTGA GTGTGACTCC 52680
AGGCTGAGTC CTGTGGGTCC AACACGGGAT GCCTTGGGGC CTCTTCTGGA GACGGGATGT 52740 GAGTGACAGG AGCCGGCCGG GGCAGCTTGC CCTGTGACTG CACGTGGCCA CAGCCTGTGA 52800
GGGCCGGGGG TGCTTCTCCA CCCACGTGGC TGCCCCTCGG GTATGTCAAG GGCTTCTGGG 52860
GCTCATCACG GGGTCCTAGA GACAGTGGCA GGGTGCACCC CCGTTGGCTG CCCTTACAGT 52920
TTCTGTGACC TGAGGGTGGC ATCTGTGCAG TCGGCGCGGT CTGTGCTTCT GTGGGATCAG 52980
GGTTCCCTCT GTTTCCTGCC TCAGTTGGGG CTCAAGCCTC AGGTGAGGTG GCCCCGGAGC 53040 ACTCAGAAGG CATCGGCGGT CCTGTGGGCT GCTTTCTGCA CTCACGTTTG CTGAGTGCTC 53100
AGTGTGCCAG GACTGAGGAC CCTGAAGCTG CTCTTGTATT TAGGGCGGCG CTCCCCTGGC 53160
AGAGACTGAG CCAGGTGGTC CCGCATGACC CACTACCAGG CGTTTCTGGG CCCTGGCCCT 53220
TGGAGGGACA GGGTGGGCGG AACATGGGCC TGCAGGGAGG CTCCCGCTTA CTGGAGGCAT 53280
GTGCTGTGTT GCTGGAGACA TCCTCTGTGT TGCTTCTTGT TCGCTGTGGT TTTTGGTCTG 53340 GTGGCACCAA GGACCCTCAG TCATCTTGAT GTGTGGTTGT CCAGGCCTTT TTGTTGGTCC 53400
TAAGAAGGGG CTCTGCCTTT GTGCCCCCAG GTTCCCTGAC AGGAGCTGCC GGCTCGTCCC 53460
GGTGATGCCT GCAGGACGTG ACTCTGGGAC GGGGGGTTGG GCAGATGTGC TGATGGAAAT 53520
TCTCAAGCAG GCGTCATTTC CGAGGTCCTC ACCTGGATTT CCAGGACAGG AGTGCCTGCT 53580
GGGTGTCCCC AGTCCCATGC AGCGGGGGTC CTTGGGATAG CATGGAACGC TGAGCATGGG 53640 CCTGGCCGGC CGTGGTCCTG GACAAGGGCA GTGCCCCGGT GGCTGCTGGG CCTGGGACCT 53700
GGTGGGGACG CTGGGCCTGG TACCTGGTGG GGATGCTGGG CCTGGGACCT GGTGGGGAGG 53760
CCTCTGACTG CCTCCTGGTG CTGCTTCCGT CTGTGTTAGG CCTCTGGGTA TTGGGGCCCC 53820
CATCTGTCTC CTCCTCCAGG CCTGTGGACT CAGACCAGGA AGACACAGGC CAGCCCCTGC 53880 CTGTCCCCCT TGGCTTGGGC TCTCACTGCC CGACCTGGCG GGAGGTTGCC TAGCCGTGAA 53940
CCTTCGCACC CTGTCTGCCA CCGGACAGGC TGTGAGGGGG TGTCTGCAGC ACCTGCACCG 54000
GCCTGAGCAT CTTCAGAGTG GGCTGCAGCT CCTGGAGGGG TCTGAGAGGA AGGGAGGCAG 54060
GTATTTTGGG CGAATGAGGA GACAGCTGGA GAGCTGGCAC CCTTCCTGGC CTGCGTCCTG 54120
TGAGGACTCT GGTTGGGGAC AGCAAGCTTG GGGTCAGCCT GGGGCAGAGC CTCTGGGACG 54180 GCCCCGCCCC TCGTGCCCCT TCCCCTCGCA GCTCCTGTCC TCGCCCCGCC CTCAGCTCTC 54240
CGCCAGGCAA GGTTTGGCAA GTGCCGCTGT GCGGCAGTGC CTGCTGATTG GCTGGTCTGT 54300
TGCTATGGTG CTGCCCAGGG GTGTGCTTTT CCTCCCCTGC CTTCCCTGCT ATCCCTGGGA 54360
GTATCTGGGG TTGGGTCATC GCTGGTGTGT GTGAGTGTGT GTGTGTGTGT ATGTGCACGT 54420
GTGCATATGT GTGCGCTTCT GGCCTCTGCA GCTGAGTCCT GGCCCTCGGG GGGCCTGGCA 54480 CCTCCTGGGG ACAGGCACAA AGCAGCCATG ATGGAGTCGG GAGCTGGGGG AGGCCCCATT 54540
GCCCCACGTG GCTGCCCTGT GACTCTGGGG TGCTTGTTAG AAGAGGTATC TGGTTCTGTC 54600
TGTGTTTAAG CAACTCCCTA AGGAATTCTT GTGGTTCCAG TTTGGGGGGC CTGTACTGTA 54660
GAGGCAAGGG AGGGGCAGGA CATCCCCCAG ACTCTGACTT CTGAAGCCTT TTCTGCCCGG 54720
GGCCTCTCCG CCAGTACAGG CAGTGTCCTT TGCCAGGGCT GCCATGCTGC AGAGGGGAGT 54780 GGGCCACTGT TTAGCCCAGG AAAACCTGGC TCTCCCTTAG CTGGAAGTTC TGGGCCTGTT 54840
GTGGTTGGCA GGGAAGCTGA GTGACGGTGC TAATCACAGG GGCACCTGCA GGGGTTTGTG 54900
GGAGATGCCT CTGTGGGTTG GGGCGATAGG CTGAGGGGCT GTTCTTCCCT GCCCTGAGGA 54960
GGGCTGAGTG TAGCCGCCAC TCCTGTCCTG TCTTGGGCTG TCTCGGAGAG GATGCGTAGA 55020
ACCCTCGGGA TCCTGCTGGC CTCCGTCTGG TCCACCCTGA ACCTCAGGCC TTCTGGGGGC 55080 AGAGGAGGAT TCCCTCAGGA TCACTCGGGT GGGGGCCTCT CTTGGGCACC TGAGACCCTC 55140
AGTGGGTGCT TTGTGGCGCG TTCACGGTTG GTGGGGGACG CCCAGCCCTG CCCGCCGTGT 55200
AGGAGCCGTT CTGTCCTGGG CATCCCCCTG TGGTCTGGGA CTTAGTGGAC CCTGAGGGTG 55260
TGTGTTTACC CCTGCCTCAC ACCTGCAGAA AAAAGCCTGG AGCAGGAGCT GGCCAGCCCC 55320
ATCCTGGACA TTGAGGACTT GGTCAAGAGC GGAAGCAAGC ACAGGTGAGA CCCCTCAGTG 55380 AGGCCACGAC CACTGTCCTT CCATGGCCCA GCTCTCCTGT GACCTGTGGA GGCCCGGATA 55440
TATTTCTTCA CTTTTCTTTG TTCCTTTTTA AATTATGAAA CTAACCACCA TTCAGTACGA 55500
AAAAGTTTAA GCAGCTCTGA GGAAGATAGA GTAAAAAATT GTCTCCCTCT TCCCTGGCCC 55560
TCAGCCATCC CCGGTGGCCA CCGTGGAGTG TGGACGGAGC CCTGCAGGCC TGTGTCTGTG 55620
CGGAAGCACG CGCAGTTTTG TCTGCACAGA CTGTCCTGCA GTTGGCTGTT TTCACTCAGC 55680 GTTGTGGGTA TAGCTTCCCA TGCTGGTGCT GGCAGCTCGG CCTTGTTCTT TTGAGGACAG 55740
CAGATGTCTC CTATGTCTAC CTCTTACAGC TTCAGAGATT CAAGTTATAA TAAAGCTCTT 55800
CTTATATTGA GGGGGAAACC TCCCTCCCCC TTTTTTTTGA AACAGGGTCT CGCTCTGCTA 55860
CCCAGGCTGC AGTGCAGTGT CACAGTCTTG GCTCACTGCA GCCTCAGCCT CCCAGGCTCA 55920 AGCGATTTTC CCACCTCAGC CTCCCAAGTA GCCGGGACTG CAGGCACGCA CCACCATGCC 55980
TGGTTAATTT TTGTATTTTT TGTACAGACA GGGTCTCACT CTGTTGCTCA GGCCAGTCTC 56040
CTGAGCTCGA GAGTTCCACC TGCCTTGGCC TCCCAAAGTG CTGGGATTAC AGGCGTGAGA 56100
CCCCATGCCT GGCCAGCTCT TTTTTTTTTT TTTTTTTTTT TTGAGACGGA GTCTCGCTCT 56160
GTCGCCCAGG CTGGAGTGCA GTGGTGCGAT CTCGGCTCAC TGCAAGCTCC GCCTCCCGAG 56220 TTCACGCCAT TCTCCTGCCT CAGCCTCCCG AGTAGCTGGG ACTACAGGTG CCCGCCACCA 56280
CGTCTGGCTA ATTTTCTGTA TTTTTAGTAG AGACGGGGTT TCACCGTGTT AGCCAGGATG 56340
GTCTCGATCT TCTGACCTTG TGATCCGCCC ACCTCGGCCT CCCAAAGTGC TGGGATTACA 56400
GGAGTGAGCC ACCGCGCCCG GCCCAGCTCT GCTTTTTCTT AGTGGTTCTG CGTTGTGTTT 56460
GTTTCTATCC AGGAATAGGG TTGGTTTT C TTTTCCATCG AGTTTTTAAA GAGACGACGA 56520 TTTACATGGT CGGAAACTCA CGAGGACTCC CCATCCCTTG GTCGGAAACT CACATGGACT 56580
CCCCATCCCT TGGTCAGAAA CTCACGTGGA CTCCCATCCA TCCCAGGCAG CAGCTTCCCA 56640
CCTGGGCCCT ACGTGCAGGA TGAGGGCTCC TTCCGGGTCA GAAGACATGG CGGCCTCGGG 56700
GCACCGTCCC CTGCATGGGG TGCTCACAGG ATCTTCTCCT CTCTCCTTCC CAGGGTGTGC 56760
CCTTACTACC TGTCCCGGAA CCTGAAGCAG CAAGCCGACA TCATATTCAT GCCGTACAAT 56820 TACTTGTTGG ATGCCAAGGT GGGGGCTCAG TCCTGTAGCT GACGACTCCT GATGTCCAGG 56880
GGTGTCCCTG GGCTTGGGAA CAGCTGTCCG AGCCTTTGCT GCTTCAGGGC CTTAGATCAG 56940
CAGGCCTGGG TGGGAGGACT CACCTCTGTC ACTGGGCAGG GGCTCAACCT GGCCAGACAC 57000
ACTTGTGAGC AGCCCCAGGC CACAGGTCAG TTTTCTGAGC AGTCTGGGAG CGGGCAGGCT 57060
GGTGGGAGTG AGGAGAGACC TCCAGGCTGT GGTCCATAGG CCAGTGCCCG CTCTTGATCC 57120 TGACAGCTCA GGTTCTCTCC TTCACGTCAG GCCATGGGAG GCACCGAGAA CACAGGAAGC 57180
CCACTGACTC CCCTCTTCCC AGCGCGTGCC CGGCCCCACA CTCACTCCCC CTCCCAGCAT 57240
GTGCCCGGCT TCACACTCAC TCCCCTCTTC CCAGTGCATG CCCGGCCCCA CACTCACTCC 57300
CCCCACAGCA TGTGCCCGGC CTGACACTCA CTCCCCTCCT CCCAGTGTGT GCCCAGCCCC 57360
ACTCCCTTCC GCCCCGTGTG CCCAGCCCCA CGCTCACTCC CCCCGCCAGC ATGTGCCCGG 57420 CCCCACACTC AACTCCCCTC CTCCCAGTGT GTGCCCGGCC CTGCTGCCCT CCTCCCCATG 57480
TGCCCTGCTT TTGTGCCCCA CACTTTTTAC TTAGTGCAGG TGGGATCACA CGCCACGGGT 57540
CAATGGTTTG TGTGTTCACG TGACGATGGC GTGGTGACGT TTCCAGATCC CGTCGTTGGT 57600
TCGCTCATTC TCGGGGTGTA TATTTATTGA GAGCTCATCA TGCTGGGTGC TATTCCAGGC 57660
ATAGCAAGAC TGGCTTCACT CACATGGAGC TTTGATTCTA GTGGTGGGGA CAGGTGGACA 57720 GCAAAAGAGT AAGCACGTGA GCTGACGATA CTGAAGGGAA ATAGAGCAGA GGGAGGAGGC 57780
GGAGACCGAG CCAAGCGGGC CCAAGTGCGA TGTCGGCGGG AGGTGGGGAA TGCTGGTGGG 57840
TCTGAGGGGA GCCTCAGCAG GTGCAGCAGA GCAAGGGAAG AGGTGAGTGG GGGCGGCTGG 57900
GGGGCCGACT CCTGGGAAGC TGTAGCAGAA CCCCACAGAG AGCTGGTGAG GTTTGCCGTG 57960 GTTGTGGGTG ACTCGGTGCT TTGAGCCCTG GCTGCCCCTG GGAACCATCT GGAGAGCTTC 58020
TAACCCAACC AGGCCCCTCC CTGGGACAGT TATATCACAG CTGGTAAGCC GAGTCTAACA 58080
CTTTCACGGA AACGCAGAAG ATCTAAAACA GCAAGATGAC CGTGAAGAAG AACAGAGCTG 58140
GAGGACTCAC CTCGCTGGTT TCAAGACTCC TCTAAAGCTG CAGGAGTGGA GGTGGAGATG 58200
GCCCAGCTCA GGCACAGGCC TGCAGGCCAT GGAGAAGGCA GCAAGCTCAA GCTGACCCAC 58260 ACGCATGTGG TCATTGTTTT TTTTTTCAGT TGGAATCTCA CTCTGTCACC CAGGTTGGAG 58320
TGCAGTGGCA CCATCTCGGC TCACTGCAGC CCCCGCCCCT AGGTTCTAGC GATTCTCCCA 58380
CATCAGCCTC CCGAGTAGCT GGGATTACAG GCGTGCGCCA CCATGCCTGG CCCTTGGTGA 58440
TTGTTTTTTG ACAAACATGC CAATTTAATT GAGAGAGGAA ATGAAGGTTG ATTTCTGGTT 58500
TTCTGAAAAA ATGGTGCTAA GAACAGCTGG ATATCTGTTC GGAAAACAGT GAATCTTAAC 58560 TCTTGTTTTA CCCTGTATAA ACCTAAATGT AAAAGCTAAA CTAAAAGTTA TAGAAAGGAA 58620
CATGGGGGAG GTCTTTGCAA CTTTGGGGTA GGCAGAGATT TCTTAGTATG GATACACAAG 58680
GCACTAGCCA TGAAGAAAAA CATTAAAATT TAGACTTCAC CAAAATTTAA AGCTTCAACT 58740
CTGTGGAAGA GTTGAGAAAA TGAAAAAGCA GTTAAAGAAA GGGAGAAAAT ACTTCTTTCA 58800
AAGGACTTAA AAAATTTTTT CAGCCCTCCT CTGATTTGAA AGGACCTTTG ACCAGAGTAT 58860 GTAAAATTCT CCCATAACTA AGCAAACAAC CCACTTAACC ACTGGGAAGG GATCTGGACA 58920
GACGTTTCAC CAAGATGGGT GGAATGGCCA GTTAACCACT GGGAGAGCAT CCGGACAGAC 58980
GTTTCGCCAA GATGGGTGGA ATGGCCAGTT AACCACTGGG AGAGCATCCG GACAGACGTT 59040
TCGCCAAGAT GGGTGGAATG GCCAGTTAAC CACTGGGAGA GCATCCGGAC AGACGTTTCG 59100
CCAAGATGGG TGGAATGGCC AGTTAACCAC TGGGAGAGCA TCCGGACAGA CGTTTCGCCA 59160 AGATGGGTGG AATGGCCAGT TAACCACTGG GAGAGCATCC GGACAGACGT TTCGCCAAGA 59220
TGGGTGGAAT GGCCAGTTAA CCACTGGGAG AGCATCCGGA CAGACGTTTC GCCAAGATGG 59280
GTGGAATGGC CAGTTAACCA CTGGGAGAGC ATCCGGACAG ACGTTTCGCC AAGATGGGTG 59340
GAATGGCCAG TTAACCACTG GGAGAGCATC CGGACAGACG TTTCGCCAAG ATGGGTGGAA 59400
TGGCCAGTTA ACCACTGGGA GAGCATCCGG ACAGACGTTT CGCCAAGATG GGTGGAATGG 59460 CCAGTTAACC ACTGGGAGAG CATCCGGACA GACGTTTCGC CAAGATGGGT GGAATGGCCA 59520
GTTAACCACT GGGAGAGCAT CCGGACAGAC GTTTCACCAA GGTGGATGGA ATGACCAGTT 59580
GAGCACATGG AAAGTCGCCC AGCATCTCCA GTCATAGGAG AAGGCAGATT AAAGCCACGG 59640
GGAGCCGACA CTGTGGTCCC ACTGGCATGG CTGAAATTCA GAAGCCCTGA GTGTGGCATG 59700
AGGATGTGGA ACAGCTGGAT CTCATCCATC GCTGTGAAGT TGTGTAGCCA CTCCACAAAC 59760 GTGTGGCAAA CAGCCGAGCC GGGAGAAGGG AAGACGTGTT CAAAGATTCA TATGTGGCCA 59820
GGCTCAGTGG CTCACGCCTG TAATCCCAGA ACTTTAGGGG CCAAGGCTGG GGGATCGCTT 59880
AAGCCCAGGA GTTTGAGACC AGCCTAGGCA ACATAGGGAG ACCCCATCTC AAAAAAAAAA 59940
AAAAAGAAAA AAGAAAAGAC TTCAGTGTGC AGGTTTACCA GAGTTTTGTT TGCAGTTGCC 60000 AAAACTGGGA AGCAGCCCGC GTGAGCCCAT CCACAGGTGA ATGGACAGAC CGTGGTACCC 60060
GAACACTAAC AGCAGCCACG GGCGTGGACT GTGGTCACAC AGCAGCAGGG AGCCGATGAG 60120
TCTCGGACAT GCTAACCCAG AGAGGCCCAT TGAGGAGGAC CTACTGTTTT TTGTGTTTTT 60180
GTTTTTTGTT TTGAAATGGA GTCTCGCTCT GTGGTGCAGG CTGGAGTGCA GTGGTGTGGT 60240
CTTGGCTCAC TGCAGCTTCC GCCTCTTGGG TTCAAACAGT TCTCCTGCCT CAGCCTTCCG 60300 AGTAGCTGGG ACTACAGGCA CCCGCCACCA CACCCGGCTA ATTTTTGTAT TTTCAGTAGA 60360
GACGGCAGTT CGCCATGTTG GCCAGGCTGG TCCCAAACTC CTGACCTTGT CATCCACTCA 60420
CTTTGGCCTC CCAAAGTGCT GAGGTTGCAG GCATGAACCA CCGCACCCGG CTGGACCTAC 60480
TGTTTTATTC CATTTATGTG ACACTCTATT AATAGAAAAG GCAGGGGTGG GGCTGGTGGT 60540
TATATGGTGC ACATAACTGC CAGAACTCAG TACACTTAAA ATGAACATCT TAATGTGTGA 60600 AATTTTTTTT TTTGAGACGG GGTCTTGCTC TGTCACCCAG GCTAGAGTGC AGTGGTGCGA 60660
TCTCCACTCA CTGCAAGCTC TGCCTCCTGG GTTCACGCCA TTCTCCTGCC TCAGCCTCCC 60720
GAGTAGCTGG GACTACAGGC GCCCGCCACC ACGCCTGGCT AATTTTTTTT TTTTTTTTGT 60780
ATTTTTAGTA GAGACGGGGT TTCACAGTGT TCGCCAGGCT GGTCTCGATC TCCTGACCTC 60840
GTGATCCGCC TGCCTCGGCC TCCGAAAGTG CTGGGCTTGC AGGCGTGAGC CACCATGCCC 60900 GGCCAATGTG TGAAAATTTA AAAGTACCAA AGCTGGACCC CACCCCAGAT TGCTCCCATG 60960
ACACTCTGTG GGTGGGACCT GGGAGTTGGG TTTTGTTTTG TTTTGTTTTG TTTTTGAGAT 61020
GAAGTCTCAC TCTGTCGCCT AGGCTGGAGT GCAGTGACAC AATCTCGGCT CACATTAACC 61080
TCTGCCTCCC AGATGAAAGC GATTCTCCTG CCTCAGCCTT CTGAGTAGCT GGGATTACAG 61140
GCACACACCA CCACCCCCTG CTAATTTTTG TATTTTTAGT AGAGACGGGG TTTTACCATG 61200 TTGGCCAGGC TGGTCTTGAA CTCCTGACCT CGTGATCCGC CCGCCTCGGC CTCCCAAAGT 61260
GCTGGGATTA CAGGCGTGAG CCACCGCGCC TGGCTGGGAG TTGGGTTTGT AAATCTCCCT 61320
GAGTGGGGCT GGGGCAGGGA ACTGCTGGGT CTGGGTCTTC CTGGCTCCTC TGGTCTGTGG 61380
CTTCCTGACT GCGGTGGCCG GGGGCTCCCA GGGCATCGTG GCCGTCTGTC TTGCTGAGCG 61440
TGGCACGTGC CTTTCCATGC TGTGGAGGAG CGTCTCCCGG TATGGCGAAC TGCTGGTTAG 61500 GGTGGGGCGG TGTTGCCAGG TCATCCAGGT CTGGCCTCTG CTCTCGACAT CGCCGGCGCT 61560
GTTGCTCATC TGCGCTTGTG ATGTTCGATG CCTGCTGCAC ATGTCTTGGC TTCCCTCTTT 61620
CCCGGCCTCT GTGAGCTCCA GCGCTGCGTC CCTTCTCTTC CTCCTGTAGA GCCGCAGAGC 61680
ACACAACATT GACCTGAAGG GGACAGTCGT GATCTTTGAC GAAGCTCACA ACGTGGTGAG 61740
TCTCCGCTGG CCTCCTAAAC ACCTCCTATT GCTTCTGGCC TTTTTGTCAA GAGCCACGCA 61800 AACCTTTCTG GAGGGGCTCT GGCCAAACTC CTGAAGCCCT AGGTGCCCAG GACTGGGGAC 61860
TGAGCACACC AGGAGCTTCT GCCACCCCCT CCCGCCCTGA TCCGATGCCT CTGCTGGGGC 61920
TGGAGACTGG CCAGCTGGGC CAGGGACCTG CCCGTCAGGC GCAGGGCCCC CACAGGCCGC 61980
TCACCAGACC CTTTCCCTCC AGCCAGCTCG GGGTCAGCCT GGGCCAGGGC TGTCTCCTCT 62040 GCCCTCGGCA GCAGCAGGCT TGTGGTCTTG CCTGCAGTGT CTCTGCCCTT CCGGCCACAT 62100
GGCTTGAGAC TGAGGCAGGA GAATCGCTTG AACCTTGGAG GCAGAGGCTG CAGTGAGCCA 62160
GGATCACACC ACTGCATTCC AGCCTGGGTG ACAAAGCGGG ATTCTGTGTC AAAAAAAAAA 62220
ATGTTGACTG GGCGCGCTAG CTCATGCCTA TAATCCCAGC ACTTTGGGAG GCTGAGGTGG 62280
GCGGATCACG AGGTCAAGAG ATCAAGACCA TCCTGGCCAA CATAGTGAAA CACCGTCTCT 62340 ACTAAAAATA CAAAAAAATT AGCTGGGCGT GGTGGCGTGT GCCTATAGTC CCAGCTACTC 62400
AGGAGGCTGA GGCAGGAGAA TCACTCGAAC CCAGGAGGTA GAGGTTGCAA TGAGCCAAGA 62460
TCACACCACT GTACTCCAGC CTGGTGACAG AGCAAGACTC CGTCTCAAAA AAAATAAAAT 62520
CAAAAAGAAT AATTGGCAAT TCCAGTGAAA TAATTGTTTG TTTGTTTGTT GAGACAGGGT 62580
CTCCTTCTGT CGTCCAGGCT GGAGTTCAGT GGTATGATCT TGGCCCACTG CAACCTCCAC 62640 CTCCTGGGCT CAAGCCATCC TCCCACCTCA GCCTCCCGAG TAGCCGGGAC TACAGGTGCA 62700
CACCACCACG CCCGGCTAAT TTTTGTATTT TTTGTAGAGG CGGGGTTTCC CAGCGTTGCC 62760
CAGGCTGGTC TTGAACCCCT GAGCTCAAGT GATCTGCCCA CCTTGGCCTC CCAAAGTGCT 62820
GGGATTACAG GTGTGAGCCA CCGCGCCCGG CCTGAAACAA TCGTTTCTAA ATATTGGTGT 62880
GGGCCACACA GTCATGTTTG GACCTACTTG TGGCCTTTTA CAGACCCCAG GCCAAGGCTT 62940 TGGGAACTTG GCTGTCAGCC TCCTGTGCCT TCTGCACCCC CACCCCATTT CTGCTTTCTG 63000
GAACCCCCGA TCCTGTCCTG TTCTGTGGTG ATTCGGGTGT GCTTGGGCTC TAGGAGAAGA 63060
TGTGTGAAGA ATCGGCATCC TTTGACCTGA CTCCCCATGA CCTGGCTTCA GGACTGGACG 63120
TCATAGACCA GGTGCTGGAG GAGCAGACC AGGCAGCGCA GCAGGGTGAG CCCCACCCGG 63180
AGTTCAGCGC GGACTCCCCC AGCCCAGGTG CGTTCATAGC CAGACTGCTT GGTCCTGAGG 63240 CCTGCGCTGC TGCAGGGTGA GCCCCACCCG GAGTTCAGCA CGGACTCCCC CAGCCCAGGT 63300
GCGTTCATAG CCAGGCTGCT TGGTCCTGAG GCCCGTGCTA CTGCAGTGGG CAGCCTGCCC 63360
TGTGGCTGTG TGTGGTCGGC CTGGGCACCA TCTATTCAGG CTGGCACTGC AGGGCATCCG 63420
CTTCTCTCAG AGGCTTCTTG GGTGTGAATT CTTCAGGGTC CTGTAGCCTG TGGAAGGGCT 63480
GGTATTGTTC AGTAGTTCTG GTATTTTCCA AAGACCTATG TCTTCTCCCA GCCAGTATCA 63540 ACTTGGCCTC TACTGTGTAA AACTGGAAAA CTCTACTTTG TGAAGCTGAG TTGGGAGCAT 63600
CGCTTGAGGC CAGGAGTTTG AGACCAGCCT GGGCAACATG GCGGAACCTC GCCCCTGCCA 63660
AAAAATTAGC CAGGTGTGGT GGTGTGCTCC TGTGGTCCAA GCTTTTCTGG AGGCCGAAGT 63720
GGGAGGCGTG CTTGAGCCTG GGAGGCAGAG CTTCCGGTGC CCCAGATGAC TCCACTGCAC 63780
TCCAGCCTGG GCGGCAGAGT GAGGCCATCT CAAAAAAAAA AAAAAGGAAA ACTAAATATA 63840 TTCACTGTAA GGGCATTTTG CATCTTTAAA TGACCCACAA ATCTGGCATG CATCAGCTGC 63900
TCTGCCTGTA GGTTCCTTCC CAGTGTTTGT CCAGAGGTGT ATTTCCACAC AGCGCTAGTC 63960
ACGGCATATG TGGAAAACGT GGAAACCCTT CATGGATGTT GTCAGTTGGT CTATATTTTC 64020
TTTCTTTTTT TTTTTTTTGA GATGGAGTTT CACTTTTGTT GCCCAGGCTG GAGTGCAATG 64080 GCGCGATCTT GGCTCACTGC AACCTCCGCC TCCTGGGTTC AAGCAATTCT CCTGCCTCAG 64140
CCTCCCAAGT AGCTGGGATC ACAGGCGTGC ACCACCACGC CCAGCTAATT TTGTATTTTT 64200
AGTAGAGATG GTTTCTCCGT GTTGGCCAGG CTGGTCTCGA ACTCCTGACC TCACGTGATC 64260
CACCCGCTTC GGCCTCCCAA AGTGCTGGGA TTACAGGCGT GAGCCGCCAC GCCCGGCCTT 64320
TGTCCATATT TTCTACATGG CTTCTGTAAA CAGCTGACTA GGAGTCTGTG TGAATATCTT 64380 CATAGGTTCT GCTGTGACAC TACTTGCTCG TGAGCATCTC CAGGTGTAAA CAGCATCAGC 64440
TTCCCCCATT TTCCTTTAAA ATCGCACATG TGGACGGACA CCACGGGGAC CCTGGACCCT 64500
GGGGAGCCCC GTCCTCACCC TTCTCACCAG GATGGCTGCT TGGTAGAGAG TGAGTTTGCA 64560
AAGTTGGCAT TTGTTTAGTA CAGAAGTTAT CAGGTGTTCT GGCTTTAGAA TCCCTTTATA 64620
TATATATATA TATACATATA TTTAAGTGAC AGGGTCTCAC TCTGTTGCCC AGGCTGGAAT 64680 GTGGTGGTAC AATCAAAGTT CCCTGTAGCC TCGGCCTCCT GGGCTCATGG GATCTTCCCG 64740
TCTCAGCGTC TTAAAGCGCC GGGACCACAG GTGTGCACCA CTGCCACCGG CTCTCAAGAT 64800
TGCCACGCAG GGAGTTGCAG TGGGGGAAGG GGTTCCTGGG ACTTTGAACG CTCCACCTCC 64860
CTCCTCTCCA CAGTCCCCCA ACCCCACCTC TCTAACGGGG TGGACGGCCG CCTCTTTCCA 64920
TCCTTCGCTT GGCGCAGGGT GGGGAGAGTG ACAGGTCTCC TTCCCTCATC TCGGCAGCTG 64980 CCATTTCATC GCTTACATAA CGTGGGAGAA ACATCCACCC ACCCCCAGGC CTGTGTGAAC 65040
ATCACCACGG GGCCTTCTCC ACTCTTCAGT TTTGTTAGTT ACTTGATGTG CAGGGCTTTT 65100
TGTTGTAACT AGTGGGGGAC GTGTGGTGGG GTGGGCTTCT GCCATCTCAT TCAGGACCAG 65160
AACTTCAGTT TTCATCCCTA TCTGTTCCCC CACCCCTTTG GAGATGGGGT CTCACTCTGT 65220
CACCCAGGCT GGAGAGCGGT GGTGCCATCA CGGCTCACTG CAGCCTCCAC CTCCTGCAGC 65280 CTCCACCTCT TGGGCTCAAG TGATCCTCCT GCCTCGGCCT CCCAAGCTCC TGGGACTACA 65340
GGCGTGTGCC ACTGTGCTTG GCAGGGTCCA TTCTTTTCCT CACACTTTAT TTATTGAAGA 65400
GCCCAGGCCG TTTACCCTGC AGAGTCGGAA TCTGTACAGG AGGGGCAGCC ACACGAGTTC 65460
CCCGGTTTAC TCTGAACTTA GGTGGCTTGA GGGCCCCAGT TAGACTGCGG CCACCGTTTG 65520
CCGGGCTCCA GATGGGACGT CCTTTCTATC AGAAGGCTCA CAGTATCTCC TTTCCCGTTT 65580 CTTCCCATGT GAACATTGTT GCTGCTGAAC ACCTGAATAT GTTAATCACT GGGGGCTTGC 65640
AAGATGGCAG TGTGCTAATT CCATCATCTA GTCAGTTAGC AGGAATAACT TAGGACCACG 65700
CCCTGCACCA TATCAGCTAT GTGGTGATCC CATTCACACA GGAAAGGTGG GACAAATGCT 65760
GGGGGTGGGC CGGGTGTGCT GTCTCACACC TGTCATCCCA GCACTTTGGG AGGCCCAGGC 65820
AGGCGGATCA CGAGGTCAGA GATTGAGACC ATCCTGGCCA ACACGGTGAA ACCCCGTCTC 65880 TACTAAAAAT ACAAAAAAAT TAGCCAGGTG TGGTGGTGCA TGCTTGTAAT CCCAGCTACT 65940
TGGGAGGCTG AGGCAGGAGA ATCACTTGAA CCCAGGAGGC GGAGGTTGCA GTGAGCCGAG 66000
ATCGCACCAT TGCACTCCAG CCTGGCAACA GAGCGAGACT CCGTCTCAAA AATCAATCAG 66060
TCAATCAAGT GTCATCACTG AATGTTTGTG TGTGAACGTG GGGATTGGTC CTGCCCCATG 66120 CTCCCTCCTG AATCTCACTC CTGACCTCAG TTGCTGCACC TTGAGGTGTT TTCTGTGGGC 66180
TCTTGTGTCC TGACCCCGGC GGTTGTGGCC TCTGCTGTCT GGGAGTCAGG ATTTTTCACA 66240
CTCATGTCCT GCTCCAGACC TGGAATCAGC CAAGTCTCCA AGAAGCCCTG CTTTCTTTTC 66300
CTGCAAGACG GTATTTCAAG ACCCGCCGTG CGGCAGCGGG TTGGTCATGG TTACTGGGTT 66360
GGTCGTTGTG ACTGGGTGTT TTCGTGGAGA TACAGCCATA CGCACAGGTG TGTTCACAAA 66420 TGTTAATTCT AAAGGTCAAA CACCCGGCCA GGCATAAGGG CTCAGCGGTA ATCCCAGCAC 66480
TTTGGGAGAC CAAGACTGGT GGATCACCTG AGGTCAGGAG TTTAAGACCA GCCTGAGCAA 66540
CAGGGTGAAA CCCCATCTCT ACTAAAAATG CGAAAATTAG CCGGGCATGG TGGCGCACAC 66600
CTATAGTCCC AGCTAGTCGG GAGACAGACA CGAGAATTGC TTGAACCTGG GACATGGAGG 66660
TTGCAGTGAG CAGAGATGGC GCTGCTGCAC CCCTGCCTGG GTGACAGAGT GACACCCTGT 66720 CTCAAAAATG AATAGATAAA TAAAGATAAA ACACCTGCTC CTCTTGGTGT CTCCAGTTTG 66780
GATTTGGCCT GTGTAGCCTC TTCCTTCGCC TGTTGGTGGA TTTGGCCTGC ACGGATTCTG 66840
TGTGGCCTCT TCCTTCCCCT GTTGGTGGAT TTGGCCTGCA CGGATTCTGT GTGGCCTCTT 66900
CCTTCCCCTG TTGGTGGATT TGGCCTGCAC GGATTCTGTG TGGCCTCTTC CTTCCCCTGT 66960
TGGTGGATTT GGCCTGCACG GATTCTGTGT GGCCTCTTCC TTCCCCTGTT GGTGGATTTG 67020 GCCTGCACGG ATTCTGTGTG GCCTCTTCCT TCCCCTGTTG GTGGATTTGG CCTGCACGGA 67080
TTCTGTGTGG CCTCTTCCTT CCCATGTTGG TGGATTTGGC CTGCATGGAT TCTGTGTGGC 67140
CTCTTCCTTT CCATGTTGGT GTCCTTTTTT CCATGCCAGG AATCCTGGTT CTCAAGGGCG 67200
GGGTTGTTGG CACGAGCGTG ATGCAGACTG CCTTTGCTGC CTTTCTCTTG CCCAGGGCTG 67260
AACATGGAGC TGGAAGACAT TGCAAAGCTG AAGAGTAAGT GTTGCCCTCC CCGCCTCCTT 67320 GCAGCTGGGT GGGGCCTCCT CCTTGCGAGG AGGTGGGTGA CACCTCCTCG ACCCACAGTG 67380
ATCCTGCTGC GCCTGGAGGG GGCCATCGAT GCTGTTGAGC TGCCTGGAGA CGACAGCGGT 67440
GTCACCAAGC CAGGGAGGTG AGAGGCGGGG AGCCAGCCCC TTCACTGCAG GCCCAGCCTA 67500
GAGCTAGAAA CGGGCCATGG TGCAGTCCTG GGCTGTCACA TCACGAGTGA GGCCTGTTTT 67560
CAGGCCTGTT TTCCCTTTTT GAGACCTGGG AGGAGCACCT GCTTTGCATG ATCTGGTTGC 67620 TGAGATGTTG AGAGGAGCAG CACACACTCC CACGGGACAG CACACAGCCC CCCACGGAAC 67680
GGCACACACA CCCATGGAAC AGCACACACA CTCCCACGAA CAGCACACAC ACTCCCACGA 67740
ACAGCACACA CACTCCCACG GAACAGCACA CACACCCACG GAACGGCACA CACACCCACG 67800
GAACAGCACA CACACTCCCA CGGAACAGCA CACACACCCA CGGAACGGCA CACACTCCCA 67860
CGGAACAGCA CACTCTCCCA CGGAACAGCA CACTCTCCCA CGGAACAGCA CACACACTCC 67920 CACGGAACAG CACACACACC CACGGAACGG CACACACTCC CACGGAACAG CAGACTCTCC 67980
CACGGAACAG CACACACACT CCCACAGACA GCACACACAC ACCCACGGAA CAGCACACTC 68040
TCCCACGCGG GGCCGCTGGG TTTCCTGCAG TTTCTCCTCC TCCAGGCCTT TCCCTGGACC 68100
CTGGTCCAGT CCGTCATTTG AGCACAGGTG CCTGTTAGAA CGAGACCTTC TTGTTAGGAC 68160 GATGAGTGTC CCAGCCACCA CCTCTTTTGG ACTCCGGGAG GCCTGGAACG TTCTGAACGC 68220
TCCGTGGGGC TCCAGTCTTC TCCGCAGCCA GGGCAGCAGG GTTTGCTGTC TGTCCTGCAG 68280
GCAGATGAGG AGTCAGGGCT GGGGCCTGTG TGGGGGCTCT CCTGAGCGCG CAGCCGCCGA 68340
GGTGGAGCGT GTTCTGCCTG AGCGCCGACC TGGTCGGGGG AATCCCAGTT GCTTCCAGGT 68400
GGAGCCACTG TCCTCAGCGT AATGCTCAAG GCTCTGGCCT GGCTCCTCGG CCACCCTGCA 68460 CCCTCAGGGT CCCCTCCTGT AGCTTCTGCT GCCCCATCAC TGTCACTCTC CAAAGCTTTG 68520
GGGACTCTGC CCAGAGCCAC CGCCTCCCAG AAGCCCCTGA CAACCTCTTG ACGACCCCCT 68580
AGTGACCCCA TCCCTCCCCT CTGACGGCGG CCCCTGCTCT GAGGCGGCTT CTTTTCCTCG 68640
GTGCTGTTCT CGTGCTGGCC AGGCCTCCTC TCCCCACCTG GAGGCTCCTG AGGGCGGAGG 68700
CCTCTCACCT CCAATGCTGG CGTCCCCTGG AGGGCTGAAT TTGTTTCCGA GGGAAGGAAA 68760 CTTCCACAGT TGTTGCCTTC AGTTCCAAAG CTGCAGCCTG ATTTCCCCCT CCAGGCTCGA 68820
GCCTGTTTTC TTCTCGGCAG CTACATCTTT GACCAGTGTC GTCCCCCCTC AGGCCCGAGC 68880
CTGCCTTCTT CTCCTCAGTT CCCAAAGCTG CAGTCTGGTC CCCCCGCCAG GCTCGAGCCT 68940
GCCTTCTTCT CCTCGGCAGC TACATCTTTG AGCTGTTTGC TGAAGCCCAG ATCACGTTTC 69000
AGACCAAGGG CTGCATCCTG GACTCGCTGG ACCAGATCAT CCAGCACCTG GCAGGACGTG 69060 AGTGCTGGCA CGGGGTCTTT GGTGCGGGCA AATGTGGCGT AGGGGGTGCA GCAGGCCTCC 69120
ATCTTGGCAG TCAGGGCTCC CCTGGCCGTC ACCTGGCCGT CAGCAGGAAC AGGCCCACAG 69180
AACCTCATCT TCTGATCGGG GCGTGGAGGC GTTAGTGCCA CTTGCCAGCT GCCGTAGAGC 69240
CTGTCCCAGT TCTGCAGCTG GCGGCTTCGT CCTACAGCCT CATCCCATTA TTCTGCTTTT 69300
GAGAAAGAGC AGCCCAAGGC CCTAGCTGGC TTGTGGGGCC TCTGGCTTCT CCACACCACC 69360 CCGAGTTCTG CTTCTCAGAG TTGTGGGGTC CAGAGGCTTT GCCCAGAGGC GGTGTCCCCA 69420
TGGGCTGCTC TGGTTTGAGA CGCCGGGCCC AGCGGGGTCT CTCCTCTGCT GCGCTCCCGG 69480
GTGCTGGGGA GGGTGGCTTT TGCTGCTTCA ACCCTTAGGC GACCATAGAG CCTCTTTTCA 69540
AGTCCCACTG ACCCCCTTGG AGACTCTGTC CCTGCCTGGC TTCTCTCCTG GCTGCTGGGA 69600
AGAGCAGGCG AACTGCCCGC CCTGAATGGA TGCTGCGCTC CACCCTGGGC CCCCCATTGG 69660 GCAGGAGATG GAGCTTGGCA GTCGGGCTGA GCGGGCTCAT GCTGGAAGGG CCGGGGCTGG 69720
GGTCGGGGCC TCCCCTGCCT GCAGTGTGGG TGTCAGCGCC CTGCTGCCCT CCAGGTGCTG 69780
GAGTGTTCAC CAACACGGCC GGACTGCAGA AGCTGGCGGA CATTATCCAG GTGGGGCCTG 69840
CTCCTCTGTG GCATCTCCTT CCCTGATGGA AGCCGGGCGG GTGCCTTCTC CTGCTGTATT 69900
AGTTAACTGA TTCTAGACTT GGGGATGGGA GAAAGGCCCC TACACCACCT GTTTCTGATT 69960 GGCAAACTCT CGGCTCCTTT CCAGTGCCCT AAACCCACAC TGGGCCTCCT GCAGGGATGG 70020
GGGAGGACGA GGTCTGGTGG CACATGCCCA GGGTGATGCT GGTGAGGGAG GACGCAAAGG 70080
ACAGTGGGGG CCGGGGAGCC GCTCCTGCCC TGTCCGGGCC CTCAGGCCAG GGGGGACCCA 70140
CTGCTGGCAG CCCCAGCAGC CCCAGCTGCA CGCAGATGAA GAGCTCTGGA CACACGCGGC 70200 TTCCTGAACA GCTTCTCCAG GGACAGACAA ATGGGGACCC TGCAGGTTCC CGGCAGGGGT 70260
GTCCCTGGGA GCCCATGATT GGGGGTGCGA CCCTGGCCCC CTTCTCATTG GCCCCGTCCT 70320
GTCCTGCAAT GCCCGTCCCA TGTGAGGTCT GCTTCTGGCT CCATGCCTAT GGCAGCACCT 70380
GCTTTCCCTG GCGTAGAGGT GCTTGTCCGG TTTGTGGAGG GCACGCCCCA TTTTGGGTGC 70440
TCTGGGCACG TTGCCTCTCC GGGGCCTCGG TGGCTTTTTT AGAAGCAGAC TCAGAAGTCC 70500 CTGACTGGGG AAGCCAAGGC ACAGGTGGCT GTGTGGAGCC CTGTGAGGCC TCCTCTGTGC 70560
TGCCCACGCT GTACCTGCTG GCCACACGAG ATCATGGCAG GGTTAGGCAG GGCTGCCCAG 70620
CGCTATGACA GCTTCATGAG TGTCCATCTG GCCTGTGGGG TGCTTGAGCT GGGGGAGGCC 70680
GCAGAAGAAC CCTGGGATGC ATGGCTGGCC TGTGCATGCT GCTGGGCATG GAGCTGCAGA 70740
TCCCGGAACA AGCAGGCACT GCCTTCTCCT TCACAGACGC AGCTCTGAGC GGGGGCGAGA 70800 CCTGGGCAGG GACCAGGTGG GGTGGGCACA GGGTGGTGGG GCCCAGGCTC AGCCCTCCCT 70860
CCACTGTGGC CGTCTCTGTG GCCAGTGACG CCACAGCCTG TGTCTTCTCT GTGCGGTAGC 70920
TGGGGCTGGA AGGACAGCAC TGCCTTGTCC TCCCAACTCC TCCCCAAAGG CACGGTGGGC 70980
ATCCCAGGCC CAGACCCCTC TGTCTGTGGC TCCTGCCTGC CAAGGGCTGC TGTGCTGTCC 71040
CGCATGGAGT GTGGTTGGCT CTTCAAGCAG GAGGCCGTGC ACCTATCAGG CGGACCTGCT 71100 TCCATGTCCC TGATGGGTCA CTGCAAAGCA CCTCCAGCAC ATGGCCAGGC GAGGTAGCCC 71160
TGCAGCCCAG GGCCTGGAGG GCAGGTGTGA GCTGGCCCGG GCCTGTCCCT CCCTGGAATA 71220
CAGCTTCCCA GGCTCCCACT TATGGAGAAG TCTCCTCCAC ACTATGGAAC TGAATCCTAG 71280
AATGTGGCTT CTGAGGTTCC TACACTCGAA CTGAATCCTG GAATGCGGCT TCCAAGGCTT 71340
CCAGCTATGG AGAAGACTCC ACACTCTGGA ACCGAATCCT GGAACGCGGC CTCCCAGGCC 71400 CCCAGCTATG GAGAAGACTC CACACTCTGG AACCGAATCC TGGAACGCGG CCTCCCAGGC 71460
CCCCAGCTAT GGAGAAGACT CCACACTCTG GAACCGGATC CTGGAACGCG GCCTCCCAGC 71520
CTCCCACTTA AGGAGAAGTC TCCACACTCT GGAACCGGAT CCTGGAACGT GGCCTCCCAG 71580
GCCCCCACTT AAGGAGAAGA CTCCACACTC TGGAACCGAA TCCTGCACAC TCCATCGGTT 71640
TGGAATTTCC TTTGGCTGCT GCTCTAAGTA GCCGCTGGTG GATGACTCAG CTTCTGCCAG 71700 CCCTCGGGTG CCTGGAGGAT GAGGGACTGC ACACAGTGCT CACCCGCGTT GGCTCCTGAG 71760
CCCCTGCAGG TGTGGGCGGT GCCCATAGGG CTGGTGCTGG GTTGGGCCTG CAGCCCTGAG 71820
TCACAGGTGA CCCTGGGGGC AGAGTGGGGC CAGTGGCCCC AGGAAGAGGA TGTGGGATGC 71880
ACAGCTCAGC TGGAGGCGAA CTCCAGGCAG GGTCAGGCCG TGTGCTCGGA AGTCAGGGCT 71940
TAGCTGGAGG CAAACTCTGG GCAGTGCTGG CCCGTGTTGG GGAACCAGTT GCCCCTGGGC 72000 CCCCGTGAGA CTGCTGGGTC CTCATCCCTC TCTGCCTGAG GCCGGAGCTG CCCTGGGCTG 72060
AGGCACAGGG GGATTTGTGG TGGTGTTTTT TTGAGAAAGG GTCTCGCTTT GTCACCCCGG 72120
CTGGAGTGCA GGGGCTTGAT CACAGCTCAC TGCAGCCTCA ACCTCCTGGG CCCAAGTGAT 72180
CCTCTTGCCT CAGCCACCCG AGGAGCTGTG AACACAGGTG TGCACCACCG CACTCAGCTA 72240 ATTTTTAAAA TTTTTTTGTA GAGATGAGGT CTTGCCATGT TTCCCAGGCT GGTCTCAAAC 72300
TCCTGGGCTC AGGCAGTCTG CCCGCCTTGG CCTCCCAAAG TGCTGGGATT ACAGGCAAGA 72360
GCTTCCATGC CTGCCCAGCA GAAGGCTTTT CGAAGGAAGC TGTTTCCTGA GGCAGACTCA 72420
GCCCTGCTCA TGGCAGCCAC CAGCGTGGGG GTGAACTTGT TCTGTTACTT CCATCCCCGT 72480
GGGCCAAATG CTTTGGTAAA ACACAAGGCC CTGTGTTTAG CTGTCTTGAC AGTGAAAATG 72540 GCTGGGAAGG AAGGAAGGAA CGGAAGGAAA TTTCTCTCTC CTTCTGTGCG TACCCAGGCA 72600
CGTGCACATG CATGCAGAGT ACGCACACAC GCACGCACGC CTGCACAAAT CCACGCATGT 72660
TGCCAAGTCT CTGTGTTCCA GCCGTGGTGT CTGCCCCCCG GTGTTCTCTA GTTCGGCTTC 72720
TCCGCATTTC TGTGAATGAT TCCGGCTTCT TGGTGTTCCC AGCAGAACTC CCTCAAGTCT 72780
GCGGCGGGGC TCTGACGGCG GTGGCTTGGC TGACATGGCC ACATTGCTGA GCCTGTTGGG 72840 GGCTTTGCGT TCCTGTTCTG GCCGTTTTTG GCTCGTTTTC CAGGAACGGT CGTCACGCGC 72900
TCCTCTCCTA GTGCAGGCAT CATTCCTTTC CCATTGATTT GCAGGGTTCT CTGTAAGTTC 72960
TGAGGATCCC ATATACATAT ACTCTCTGTA AGTTCTGAGG ATCCCATATA CATATTCTCT 73020
CTCTAAGTTC TGAGGATCCC ATATACATAT TCTCTCTCTA AGTTCTGAGG ATCCCATGCC 73080
GACATACATA TTCTTTCCTT GTCTCATGCT GGTCATTTTT TCCATTTTCA TGACAGGTTT 73140 GGTGAACACA TGTTTCCTTG TCAGATTTTT GTTCTGAGCT TGTGCCTCCC GACCAAGATG 73200
CTAAACCGGG TCTTGTGTAT TCTCCAAACT GCACTGTAGA GTGACGGAGC TTTGTGTCTG 73260
GGCCTCCATG CCTTCTGACG TCACCTGTGG GGGTGTGAAA GGCAGACTCT ACCTTGATTT 73320
TTCCCAGCAC GCCACACCGG TGGTTCTGTG CGCTGACCGA GCGGCTCGGC TTCCCCCAAC 73380
TCCACTGGGC ACCTGCCACA CTTTTCCTCA TGTTTTTGTT CACTGTGGTT TTGTCGTAAG 73440 TCCTGGTGTT GGCCTGAACC AATTTCTTTT TGTTTGTTTT TGAGACAGAG TTTTGCTCTT 73500
GTTGCCCAGG CTGGAGTGCA GTGGCGCGAT CTCGGCTCAC TGCAAGCTCC GCCTCCCGGG 73560
TTCACGCCAT TCTCCTGCCT CAGCCTCCCA AATACCTGGG ATTATAGGCA CCTGCCACCA 73620
CGCCTGGCTA ATTTTTTGTA TTTTTAGTAG AGACGAGGTT TCACCGTGTT AGCCAGGATG 73680
GTCTCGATCT CCTGACCTCG TGATCCGCCT CCCAAAGTGC TGGGATTACA GGCATGAGCC 73740 ACCGTGCCCA GCCTGATATT TTTAGTAGAA ATGGGGTTTT GCCATGTTGG CCAGGCTGGT 73800
CTCGAACTCC TGACCTCAGG TGATCCTCTC ACCTTGGCCT CCCAGAGTGC TGGGATTACG 73860
GGTGTGAGCC ACCACGCCCG GCCTCTTGTT CTTTTGAAAC CTGCCCTGAC GTTTTTTCCA 73920
TAGTGCATCT TGGAGTCAGC GTGTCTACTT CCTGTAAAAA TCTTACTGTG ATTTTGACTA 73980
GAATGTGTTG AATTCCTGTT TTTTTTTTGA GTCAGGGTCT CTCTGTTGCC CAGGCTGGAG 74040 TGCAGTGGGA CCATCACAGC TCACTGCAGC CTCAACCTCC TGGGCTCAGG GGATCCTCTC 74100
AGCTCAACCT CCCAAGTAGC TGGGACCACA GGCACATGCC ACCATGCCCG GCTAGGTTTT 74160 TTTTTTTTTT TTTTTGGTGA ACACCCTGGG GTTGCACCAT GTTGCCCAGG CTGGTCTCGA 74220
ACTCCTGGGT TCGGGCAGTT TGCTCCTCTC AGCCTCCCGG AGTGCTGGGA TTACAGGCCT 74280 GAGCCACTGC ACTAGGCCAT GTTGAATTTC TAGATTAATT TGGGGCCCTC AGGGGCACAG 74340
AGAGGAGGGC TGGGCCAGTT GGCGGGAGGA GAGGCCCCTC GGGCTGCCGC ATTTTCAGTG 74400
CATGGAGATG GCCTATGTTG GGGGAACACA GAGCTCACCG GGGGTCCCTG CAGGGAGGAG 74460
AAAGGGTCAG GCAGGTGCCA GCTCCTGTCC ATTGGCCTGG GGCTGCATGA TGGCAGGGGC 74520
CGGTGAACCG ATGACCCCTG GGTGTCCTGT GACCTTCTGT GTATGCGGCT GATGCTGCAG 74580 AAAGTCGGGT GGCCTCAGGC TCCTGACGGG GCTGCACTTC CTCTGCCTTT CAGATTGTGT 74640
TCAGTGTGGA CCCCTCCGAG GGCAGCCCTG GTTCCCCAGC AGGGCTGGGG GCCTTACAGT 74700
CCTATAAGGT AGGGGCCACC TCCAGGAGGC AGGTGGAGGG CAGCCCTTGT TCCCCGGCAG 74760
GGCTGGGGGC CTTACAGTCC TATAAGGTGG GGGCCACCTC CAGGAGGCAG GTGGGGCTGG 74820
GGGTCTTCTG GTCCTAAAAG GTAAGGGGCT GCCCCCAGGA CATGGGCGGG GCCTCCACAC 74880 TCCTGGTCCT GTCCCCTCCA GGTGCACATC CATCCTGATG CTGGTCACCG GAGGACGGCT 74940
CAGCGGTCTG ATGCCTGGAG CACCACTGCA GCCAGAAAGC GAGGTACAGA CCTGGGCCCA 75000
CACGCTCCCC GCCCGCCCGG GTGCAGTGCC CGGCACCACC ATGCCACAGG CTAGGCACAT 75060
GCCCAGCCGT GGATCTCCTG CCCCCATGGG CCTGGCCACC TTCTCCATAT CCAGGCCAAT 75120
CCAGAGCATT CTCCTCACTG TCCCTCTGAA GATTGGAGTT ACTGAGAGAC GTAGGAGATG 75180 GCCTGATGGC ACCGTGACCT GCCCAGAGTC ACCTGGTTGG TGGTGGCAGA GCCACAGCCC 75240
AGCCAGGCCT CCCTGCTGGG ACACGCTCGT TTATGCCGAG GCCGTCAGCA CAGAGCCTCC 75300
ACAGTGAGGC ACGGCTCTGC CTGCTGCCTC CACGCAGCGC CTGGCCGGGC CAAGCCTCAG 75360
GGTCACATCT GAAGGGGGCC CGGCTGGCCC TGTTGTCCGA AGCCCCTGGT GCGCTCAGCC 75420
CCGAGGCCCC ACGTGCCTTC TTGGCTTCCT GTGCTCCGTG GCGTCTTCGA GTCGGTGCTG 75480 CCGGGGACGC TGTGTGGATG GGGTCTGTGA GTGTGCCCTC GGCTCCGTGT CCGGAGCCCT 75540
GTGGTTCTTG GGGTGTATCT GGCCCCACCC CCACTGCGTG GTGTCCAGGG TGGGGCTTCA 75600
CGGCTGCAGC TGCGGGAGCT GCTGCCCCTG CCTTGTGCTC CAGTGGGGCC TTGCCTCTGG 75660
GCTTGGTTCG TCCCTCTCTG GAACATTCTT TCTCAGCTGC TGTCCGACCC ATGGTGGCAT 75720
GACGTGGCCC TGGCTGAAGC AGCCCTTGTG CGGTTGCTGT GGTTGGGTCT GCCTGGCCGA 75780 GCCGGAAGGG AAGGGCTGGG AGGGCGTCAG GGTGGCGTGG CTTGACCCCC GCTCGGTGAT 75840
GGTCCTGCAG CAAGGCCTCT CCCAGCAGGA AGCGTCCATC CCGGGGGGAG GCCGGCGCCC 75900
CTCACGCAGT TGGGGTTGCG GGAGGCAGTG CGTGCCTGAG GCAGCCGGTG CACAGATTCC 75960
AAGGGCCTGG AATCTGTTTG TTCCATTGAC CTCTGATGTC ACTTGACTTC TCAGAAGCAG 76020
CCACTCCCTG CACTGGGCGT TTGTAGGAAA TGAGCTCCTG GAGGAGGGGG TGGGGAAGTT 76080 CCCCCATTGC AGGGCACACT CAGCCCCAGG AAGGAAACGT GCCTCGTCCC TGCTGACTCC 76140
GAATCGCAGT CAGAGTCGTT CTGCTTGTGC CGTGTTGAAT TCCCGGCATC CGGCATCCAG 76200
ACTCAGCCTC CTCCCCAGGC CACGGCCGCC GTGGCCAGTC GGTCAAGCCC TTCTAGGAAC 76260
TTCCTTTGAG CTGGCGCCCT TGTTCACTGC TGACGCCACT CAGAGGCTTG TGCACGTGTC 76320 CTGCTTCCAG GCAGAGCTGG GAACTCGCAC CCCGTCTTCT GCACGCGGCC GTGGAATGTC 76380
GGGATGCCGG CGCTTCCTTC CCGTGTGCTC TTGGCGGGGT GGGCTTCTTG CCCTGAGCCG 76440
CATGTCACAG TTTCTGCAGA AGTTTAGGGT TGGAGTGGGC TGACCTCTCT GCAGGTGTCC 76500
CCAGCCTCTG CCTGGGGTCT GCCTCCTACT CCCAGGACCC CCTGTCCCCC AGAGGGGCCC 76560
CAAGCTGGCA GGCTCACACT CAGGGCAGCC TCCTTTGTTC TGACTTCTGC ACAGTGGGCC 76620 TGGGTGGCTG CCCGCGGCTC GCTTGCTTGA TGCCAGTGGG TGGAGAGGGT GATGGGCAGA 76680
GAGGCAGGTG GTCAGGCCCC CAGTCCCGTC CTCACACTCT GTGCCCTCTG CCGCCCCCCG 76740
CCCCACAGGG AAGGTGCTGA GCTACTGGTG CTTCAGTCCC GGCCACAGCA TGCACGAGCT 76800
GGTCCGCCAG GGCGTCCGCT CCCTCATCCT TACCAGCGGC ACGCTGGCCC CGGTGTCCTC 76860
CTTTGCTCTG GAGATGCAGA TGTACGGGCC ACCCCTGCCA GGGCCTGAGC ACCGGTGACA 76920 CCTCTGACAT CAGCGGGGTG GAAGTGGTGG GGGTCCCCAT GAGCCGGGTG CTGGGGGTCT 76980
CGGGCCTCGA GGGCTAAAGG GGTGCTGGTG CACTTCCCCA CTGTCTGCTC CCTCTGGCCA 77040
CGCTCAGCCC TTTCCCAGTC TGCCTGGAGA ACCCACACAT CATCGACAAG CACCAGATCT 77100
GGGTGGGGGT CGTCCCCAGA GGCCCCGATG GAGCCCAGTT GAGCTCCGCG TTTGACAGAC 77160
GGTGAGGGCC TGTCCCTGGG CCCTGCTGGG GTGGGAGGTG GGGGAGCACT GAGGCCTGAG 77220 GCCCTGAGCA GTGGCCTCTC CGGCTCTAGG TTTTCCGAGG AGTGCTTATC CTCCCTGGGG 77280
AAGGCTCTGG GTGAGTGCCC TGAATGCCCC AGCTGTGCGC ATCCTGGATC CTGGACCCCT 77340
GCTCCCAAGA GCTGGTAGGG ACCCCTGCAG ACATCCTGCC CCTGCCTTGA CCCCGGCCCC 77400
TGCACTTCCA GGCAACATCG CCCGCGTGGT GCCCTATGGG CTCCTGATCT TCTTCCCTTC 77460
CTATCCTGTC ATGGAGAAGA GCCTGGAGTT CTGGCGGGTG CGTCTCCCCT GTGTTCTGGG 77520 CGGGGTGGGT GAGGGCAGGG CTGGAGCATG AAGCAGGCAG TGGTCACAGC TCCTGCTTGC 77580
CCTCATCGGA TCGGCGGCGT GACCAGGGCT GCCGTGTCCC TGCCTCTTCC TCCCACAGGC 77640
CCGCGACTTG GCCAGGAAGA TGGAGGCGCT GAAGCCGCTG TTTGTGGAGC CCAGGAGCAA 77700
AGGCAGCTTC TCCGAGGTCG GCACTTGGCC GGGGCTCTGG GCCTGCTGCC CCCTCGTGCC 77760
TCCCCTGCCT CTCACAGCTT CCCCAAGGCT GACCACTGGC CCTGACCATG GGCTCCGGCG 77820 GCTCCCGCTG CCTCTTCAGG GCTCCTGCGT TTCCTTCCTG GCCCTGAGTG TTGCCTCTTA 77880
TCTTACAAAG CCCCCAGCAC CGGGTGGGTG TGGTAACAGT GGCCCTCCTG TCTGAGTAGC 77940
CCTAGTCGGC CACCCTGGCC CTGGGGTTCC CCGTGTTTTC TGGGAAGCAC TGAGCAGGCG 78000
TGGGGTCAGC CTGGGATCCG TGCCAGGAAG AAGCTTCCAG AACCCGATTG GCCTTCCTGG 78060
CTAGGACGAT CCTTCATCTT GGAGCATGAG ACCTGGGTCT CCCTCATGGG GGAGGAAGGG 78120 GCTGGGGGGG GGCTCCAGGC TCAGCCTCAC CAACTTTCCT TCCAGACCAT CAGTGCTTAC 78180
TATGCAAGGG TTGCCGCCCC TGGGTCCACC GGCGCCACCT TCCTGGCGGT CTGCCGGGGC 78240
AAGGTGAGCT CTCCAGGGCC CTCTGCCCTG ACCTGGTTGC CTGTTCCCTG GTGGGTGCTT 78300
ATGGCTCCCC AGCAGACTCT GGGCCCTGGG GGCTGCCCGG TCCCCTCCTT GGGTCCCACG 78360 AGAGCGACTG CTGGCCCTGC TGGGAGCGTG TCCTGCTCTG GGCCTGGGCA GGCAGGATGG 78420
GAGTTTCCTG GCCACAAGAG TTGGAGGTGG CGTCTGGGAG CTGTGGACCC CAAGTGGGGT 78480
CCTGACCCAC AGATGGAGCT TCCTCCCACC CCTGGTTGGG GACGGAGCCT CGGGGAAGGT 78540
GGCTGGGCTG GGTGTGGGCA CCAGGGAGAG GAGCCCCCAC GGCCCCAGGC AGCTCCCTGG 78600
TGTGTCCCCT AGGCCAGCGA GGGGCTGGAC TTCTCAGACA CGAATGGCCG TGGTGTGATT 78660 GTCACGGGCC TCCCGTACCC CCCACGCATG GACCCCCGGG TTGTCCTCAA GATGCAGTTC 78720
CTGGATGAGA TGAAGGGCCA GGGTGGGGCT GGGGGCCAGG TGAGTTACAG CAGGGTGGGG 78780
CTGGGGTAAG GCGGTCTGGT GACTGAGCCC CCGCCCCGTG GCCAAGGGAG CCCCCGTGAC 78840
CGAGCCGCCT CGCCCCACAG TTCCTCTCTG GGCAGGAGTG GTACCGGCAG CAGGCGTCCA 78900
GGGCTGTGAA CCAGGCCATC GGGCGAGTGA TCCGGCACCG CCAGGACTAC GGAGCTGTCT 78960 TCCTCTGTGA CCACAGGTGC GTGCAGTCCG GTGGCAGGCG CGGCGCCAGG GGACACGCCC 79020
ACACCCCACT GGGCCCCTGG ACTCTCCTTC CCCACATGAG GCCCCGTCTC CTCCAGAGCC 79080
TCTCCGGCTA CTCGGGGTCA GCGTGGGGCC CCTGCAGCAG ATGAGGGTCT TCACTTCGGT 79140
GAACTGAACC CTTGAAGCGG CTGTGGGCAG GGCAGCAGGG CTATGGCCAC CCCCCAGGTT 79200
CGCCTTTGCC GACGCAAGAG CCCAACTGCC CTCCTGGGTG CGTCCCCACG TCAGGGTGTA 79260 TGACAACTTT GGCCATGTCA TCCGAGACGT GGCCCAGTTC TTCCGTGTTG CCGAGCGAAC 79320
TGTGAGTTCC TGCCCAGGGA GGGGATGAGG GTGTTGTCCC CAGAGGAGCC AGAAATGGGT 79380
CCACCCACCC CCATGGTTCT GCAGATGCCA GCGCCGGCCC CCCGGGCTAC AGCACCCAGT 79440
GTGCGTGGAG AAGATGCTGT CAGCGAGGCC AAGTCGCCTG GCCCCTTCTT CTCCACCAGG 79500
AAAGCTAAGA GTCTGGACCT GCATGTCCCC AGCCTGAAGC AGAGGTCCTC AGGTGCGGAC 79560 GGGCAGCGCT GGGTGGGCGG TGTGGGGGTG GCGGAGCGGG CGGCGTGGGG CGGGCAGCAC 79620
CAGGCGCCCA GGGCGGAGGC GACTCACCTG GCTTTGTGCG CTTCCCCTCC CACCTCCAAA 79680
GGCTGCCTCT CCCTCCTAGG GCAGGGCCCC CACGGGCTGC AACCCTCCCC TACAGGCAGA 79740
GAACGCCCCA GGCAAGGATG CCCCCCGAGG CTGAGACTCC CCCCAATAGC AGGGAGGACA 79800
CCCACAGGCA GGACCCCAAG TGCTGGGACT CTCCCCCAAG AGGGGCTTTG CCACAGGCAG 79860 GGACCCCAGC TGGGGCCCCC CGTGGGCTTC ACTGCGCACT CGGGTGCCCC TGCAGGGTCA 79920
CCAGCTGCCG GGGACCCCGA GAGTAGCCTG TGTGTGGAGT ATGAGCAGGA GCCAGTTCCT 79980
GCCCGGCAGA GGCCCAGGGG GCTGCTGGCC GCCCTGGAGC ACAGCGAACA GCGGGCGGGG 80040
AGCCCTGGCG AGGAGCAGGT ACAGTTCCAG GGCCTTGGGA TGGACACAGA CCCTCTGTCT 80100
CCTGAGGCCA ACCCGACCCC GCCCATCTGG CCTCAGGCAC CTCCCCACAC ACCCCTGTAA 80160 ATCCCCTGCC TGGCAGGCAG GCGGGCAAGC GGGCGGGGGA TCCCAGCTGC CTGGCTGTCT 80220
GTGGGTCCTC CACCCCACCT CACCCACAGG CTGCTGGCTC CCAGGTGGTG CATGCCCTGG 80280
CCCTCCGCGG GTGCCCCCCA CATCACTTTG GTTCTCTGGC GGGTCAGCTT GGCTCAGTGC 80340
ACTCAAGGTC GGGTGCCCCT GCCACTGGCT GCGCTTGAGG CTGGCCTTTC TCCAGGAATG 80400 TGCTGCGGGT GGAACCCAGG TTCCTTCTTC CTTGGGGCCT TTTGCCCCAG AAGCCCATAA 80460
TTCCTCAGGC CAACCCGAAA TTTTCTCCCT GCTTCCTGCT GGGAGCCATT CCCCTCTTCC 80520
TGCCCATCCC TGCCCTTCAG GCCCCTGGAG TGAGCTCCAG GTGCAGGCAC CAGGCACCTG 80580
TGTCCCCTTC CTGCCAGCCC CTCGCTGTGG TCGGACTGTC TTCCCTGGAC CTGCTCTTAC 80640
AAGTCACCAC CTGCGAGCCT CATGAGCCGC TGGTGTGACT TGGACAGGAC CAAGTTGTGG 80700 CACTGTCACC GGGGTGTGCT GTGCCCCCCT CCCCCGACCT CCATCTTGGC TCAGGGCTCC 80760
TTGGGACCAT CTTCCCTGTG CGTCCAGGTG CTTTGGGACC CCAGAGTGTG TGGTTGGGGT 80820
CTGTGTGTGG TTGTGAGCTG TGTCCTCCTC AGGCCCACAG CTGCTCCACC CTGTCCCTCC 80880
TGTCTGAGAA GAGGCCGGCA GAAGAACCGC GAGGAGGGAG GAAGAAGATC CGGCTGGTCA 80940
GCCACCCGGT GCGTGAGCTG TCCCTGCACC TGTGCCGACC ACCATAGACA CGCATGGGAA 81000 CGCAGCCGTG GGTGCCCCCA GCCACGGCTG GTCCCGATGG GACCAGGGAA TCCACCCCCA 81060
GGAGCTGATG TCCAGGGCAG CTGTGATGCT GACGGCCAGG GGCTCAAGTG TGTGGTTTCT 81120
TCTGCAGGGG GCTCATGAGT CCCAGCTGGA ATCAGGCCCC ACCCTTGGGC AGGTTTGGCA 81180
TGGGGCCTGC AGCACTGGGC TTGGCCCTGG CATTTCCCTC AAGTGTGGAT GCACACCTGC 81240
CTCATGTGAG GGACACAGCC CATTCCTAGC CTTGGATCAA AGAACGGAGT TATAGCCGGA 81300
GCCAGGAAGC CCCCTGCCTG CTGGAAAACC CCAAGTGTGG CGGCCTTTGT CCATGTCCCT 81360
TGGCTTCTGG GAAGAACTGG GTGGTGCCCA GGCAGGGCTG GTGCCATCAG GAAGTGGGTG 81420
GCTGCTGAGG GGCCTGGGCT GGCGAGGGCC TGGGTGGGGA GTGCCTGGGC CGCCCCTGCC 81480
TTGGTTTCCA CGTTTCCGTG TTGGTCTGGG GTGTGTAGAG AGATGGGCAC TGCTCATCCG 81540
GAAGCCCCTC CTTGTGCGCT GCCATCCTGG GAGCCTCAGC CGCATCCGCT GTGGGGCAGG 81600
GGGCTTGAGG GAGGAGGAGA GAGACGGGCC ATGCAGGACC CCTGGCTTGA GGCAGAGCCA 81660
ATCTACCCTT TGCCCATTCA CTGCTCTCAG TTCCCTGCCA GCCTCTCACT GTGTGACCTC 81720
AGACGGGCCC AGCCCCACAG CTTTCTTCCC GCAGCCCCTC CCTATGTCCA TCCAGCCAGC 81780
CAGTTTCTCA GGCAGCAGCC CCACCTCGGC AGTCACTGTC CCAGGGAACG CTCAATGTTC 81840
CAAGGAAGGC TCTGCAGCCC CAGGGACCAG ATGATGAGGC TGGCCCTGAT GGAGCCTCGG 81900
GCCTGTGTCC TGCAGGAGGA GCCCGTGGCT GGTGCACAGA CGGACAGGGC CAAGCTCTTC 81960
ATGGTGGCCG TGAAGCAGGA GTTGAGCCAA GCCAACTTTG CCACCTTCAC CCAGGCCCTG 82020
CAGGACTACA AGGGTTCCGA TGACTTCGCC GCCCTGGCCG CCTGTCTCGG CCCCCTCTTT 82080
GCTGAGGACC CCAAGAAGCA CAACCTGCTC CAAGGTGCCC TGGCTTGCAG AGGCCACCCA 82140
CCCTGAGGGC AGTGCTGCCG CCGCGTGTGG GGTGGGGGCC ATCTGGGTCC AAGGTGGTCT 82200
CTGTTCTCTA GAGAAAAAGG GGCAGATGGG GACAGACGCC CCTTCCTCTA CAGGCTTCTA 82260
CCAGTTTGTG CGGCCCCACC ATAAGCAGCA GTTTGAGGAG GTCTGTATCC AGCTGACAGG 82320
ACGAGGCTGT GGCTATCGGC CTGAGCACAG CATTCCCCGA AGGCAGCGGG CACAGCCGGT 82380
CCTGGACCCC ACTGGTAAAT GGGGCCCCAG GTGGGACCCT CAGACTCCTG CGTGGAAGGC 82440
AGTGTGGGCC AGAGTCCTGG GCTGCTTGGG GTGGGCATCC TCGGGCCCTG CTTGGCCCCG 82500
CCTCTCTGTT CCCCTATGGG AGTGATGGGG GCCTCCACCT CCACCACCAG CACCAGCAGC 82560
ACCACCTCCA CCTTCACCAC CACCACCTCC ACCACCACCA CCTCCACCAC CTCCACCTCC 82620
ACCACCTCCA CCACCTCCAC CACCTCCACC ACCACCACCA CCTCCACCAC CACCACCACC 82680
ACCACCTCCA CCACCACCAC CACCACCACC ACCTCCACCT CCACCACCTC CACCACCACC 82740
TCCACCTCCA CCACCACCAC CACCTCCACC TCCACCACCT CCACCTCCAC CTCCACCACC 82800
ACCACCTCCA CCACCACCAC CACCACCTCC ACCTCCACCA GCAGCAGCAT CACTTGTTGG 82860
GGAGACCCTG TGCAACTCCA TGCACAGCCC TGTCCCTGCC ATAGCCCCGA CCCCTAAGCA 82920
CAGCCCTGTC CAACTGCCAC ACGTCCCCTG CCTCCCATGC ATGGTCCTGG GGGGTCAACT 82980
GCACACGCCA GGGTCCTAGG GTCCTAGACC CCTGTCCTCC CTGTTTCTGC CTCTGTTTGG 83040
GGTGGAGTCC AAGTCTCCAG AGGCGGAAGC ATCTGTGTTC GTGTGTTAAT GAACAGCCCC 83100
TACAGAGTTC CCCTAGTTCA CCCAGGGGGG AACCTAGCCT GTTGGGACGA CCCCAGATCC 83160
CTTCTGGGCT TGGTACTCAC TGGGATATCC TCATGCCTGC ACCCAGCCTA CGGCTCTGAG 83220
CTCCTGAGTG GGGCTTTGGC CTGCCCGCCA CTGTTCCAGC CCCCATCCAG CAGGCTGGTG 83280
TCTCCTCTGA TGCCCCCAGC ACCCAGGCGT GTACCTGCCT GGGTTTTCCC GCCCTGGTCT 83340
GAGGTGGGTG AGGCCTGGCC TCCCTAGCCA GCCCTGCCCC CCCACCCCAG GGAACTTTCC 83400
AGATGCTCCC GACCAGCTTT GTGGCTCTAC ATCTCTTCAT CAGGAAGAAC GGCGCCGGAT 83460
CCCAAGCTGA CCGTGTCCAC GGCTGCAGCC CAGCAGCTGG ACCCCCAAGA GCACCTGAAC 83520
CAGGGCAGGC CCCACCTGTC GCCCAGGCCA CCCCCAACAG GTAGCTGACT CCTGAACCGT 83580
GTGCAGCCTA CGACTTGGTG GGTCCCTCAG TGGCTTCACG AGGCTAACTC TTGAGTGTGG 83640
CCGGGGCTGC CCCTGTGGGG AGCCATCTCA TGGTGGGGAC TGCTCCCGGT TCTGCACCCC 83700
GCAGTTGTCC TGAGCAGCTC TCCAGGAGTT CCTGGAGGAA GGGCGGGCAG GGCGGTGGGA 83760
CTCTCAGTCC TCCACCCCAG CGCCACTCTG AGCCATGCTA CTCCCACACC AGGAGACCCT 83820
GGCAGCCAAC CACAGTGGGG GTCTGGAGTG CCCAGAGCAG GGAAGCAGGG CCAGCACGCC 83880
GTGAGCGCCT ACCTGGCTGA TGCCCGCAGG GCCCTGGGGT CCGCGGGCTG TAGCCAACTC 83940
TTGGCAGCGC TGACAGCCTA TAAGCAAGAC GACGACCTCG ACAAGGTGCT GGCTGTGTTG 84000
GCCGCCCTGA CCACTGCAAA GCCAGAGGAC TTCCCCCTGC TGCACAGCAA GTGGCCCTGG 84060
CGTGGGGAAC AGCCGGTGGG GTGGGGGGCA GGGGACAAAA TGGGGGCTGT GCCGGGTCTG 84120
ATTGAAGCTC CCCGCAGGGT TCAGCATGTT TGTGCGTCCA CACCACAAGC AGCGCTTCTC 84180
ACAGACGTGC ACAGACCTGA CCGGCCGGCC CTACCCGGGC ATGGAGCCAC CGGGACCCCA 84240
GGAGGAGAGG CTTGCCGTGC CTCCTGTGCT TACCCACAGG GCTCCCCAAC CAGGTAGGGC 84300
ACCTGCCTGG CTGCTCCTGG CAGCGCCCCA ACCGCACGCA GCCCTGGGAG TGAGCAGCAA 84360
AGCCCCAGGC CCCCCTCAGA CTCAAGTCTC TGTCTCCAGG CCCCTCACGG TCCGAGAAGA 84420
CCGGGAAGAC CCAGAGCAAG ATCTCGTCCT TCCTTAGACA GAGGCCAGCA GGGACTGTGG 84480
GGGCGGGCGG TGAGGATGCA GGTCCCAGCC AGTCCTCAGG ACCTCCCCAC GGGCCTGCAG 84540
CATCTGAGTG GGGTGAGCCT CATGGGAGAG ACATCGCTGG GCAGCAGGCC ACGGGAGCTC 84600
CGGGCGGGCC CCTCTCAGCA GGCTGTGTGT GCCAGGGCTG TGGGGCAGAG GACGTGGTGC 84660
CCTTCCAGTG CCCTGCCTGT GACTTCCAGC GCTGCCAAGC CTGCTGGCAA CGGCACCTTC 84720
AGGTTGGTGC CTGGCCACTA CAGTTCCTGC TGGGTGTAGC CCCAGGTGAT GGGCTGAGGG 84780
GGAAAGGGCA GGCCCTTGTC CTGGTGGCAA CGCCTGGCAG ACGTGTGCAG TGGGCCGGTT 84840
GTCTCACAGG CCTCTAGGAT GTGCCCAGCC TGCCACACCG CCTCCAGGAA GCAGAGCGTC 84900
ATGCAGGTCT TCTGGCCAGA GCCCCAGTGA GTGCCCACGG AGGCCCCCAG CACACCCAAC 84960
GTGGCTTGAT CACCTGCCTG TCCAGCTCTG GTGGGCCAAG AACCCACCCA ACAGAATAGG 85020
CCAGCCCATG CCAGCCGGCT TGGCCCGCTG CAGGCCTCAG GCAGGCGGGG CCCATGGTTG 85080
GTCCCTGCGG TGGGACCGGA TCTGGGCCTG CCTCTGAGAA GCCCTGAGCT ACCTTGGGGT 85140
CTGGGGTGGG TTTCTGGGAA AGTGCTTCCC CAGAACTTCC CTGGCTCCTG GCCTGTGAGT 85200
GGTGCCACAG GGGCACCCCA GCTGAGCCCC TCACCGGGAA GGAGGAGACC CCCGTGGGCA 85260
CGTGTCCACT TTTAATCAGG GGACAGGGCT CTCTAATAAA GCTGCTGGCA GTGCCCAGGA 85320
CGGTGTCTTC GTGGCCTGGG CTTGGTGGTG GGAGTTGAGG GACAGGGAGT TGGCAGAGGC 85380
CCCTCCCAGC CTGCCATGTG ACACTGTACT TCCTCCACGG TGGGCTCAGC CCTGCCCTCA 85440
TCCTCACAGC CGCAGCCAAG CTGCAGTTGG TAGGGGATCC ACCGACACAC CAGGCTGCCT 85500
GGGCTGGTCT CTGGGTTGGG AGCTGCCCCA GGTGCTGAGG AGGGCAGCTC CCTGGCTGGT 85560
GAGGCCCCTC CCAGAACCAC CCTTGGACTG AGCTCTGGGG AGGGATGGTA CCAGGTGGGT 85620
GAGGGGGGCT GCCTGGGGAG GGAGGGGTTC CTATGGGGCG TGGCGAGGCT GGCCCAGCCC 85680
TCTCCCCGCC CATATATGTA GGGCAGCAGC AGGATGGGCT TCTGGACTTG GGCGGCCCCT 85740
CCGCAGGCGG ACCGGGGGCA AAGGAGGTGG CATGTCGGTC AGGCACAGCA GGGTCCTGTG 85800
TCCGCGCTGA GCCGCGCTCT CCCTGCTCCA GCAAGGACCA TGAGGGCGCT GGAGGGGCCA 85860
GGCCTGTCGC TGCTGTGCCT GGTGTTGGCG CTGCCTGCCC TGCTGCCGGT GCCGGCTGTA 85920
CGCGGAGTGG CAGAAACACC CACCTACCCC TGGCGGGACG CAGAGACAGG GGAGCGGCTG 85980
GTGTGTGCCC AGTGCCCCCC AGGCACCTTT GTGCAGCGGC CGTGCCGCCG AGACAGCCCC 86040
ACGACGTGTG GCCCGTGTCC ACCGCGCCAC TACACGCAGT TCTGGAACTA CCTGGAGCGC 86100
TGCCGCTACT GCAACGTCCT CTGCGGGGAG CGTGAGGAGG AGGCACGGGC TTGCCACGCC 86160
ACCCACAACC GCGCCTGCCG CTGCCGCACC GGCTTCTTCG CGCACGCTGG TTTCTGCTTG 86220
GAGCACGCAT CGTGTCCACC TGGTGCCGGC GTGATTGCCC CGGGTGAGAG CTGGGCGAGG 86280
GGAGGGGCCC CCAGGAGTGG TGGCCGGAGG TGTGGCAGGG GTCAGGTTGC TGGTCCCAGC 86340
CTTGCACCCT GAGCTAGGAC ACCAGTTCCC CTGACCCTGT TCTTCCCTCC TGGCTGCAGG 86400
CACCCCCAGC CAGAACACGC AGTGCCAGCC GTGCCCCCCA GGCACCTTCT CAGCCAGCAG 86460
TTCCAGCTCA GAGCAGTGCC AGCCCCACCG CAACTGCACG GCCCTGGGCC TGGCCCTCAA 86520
TGTGCCAGGC TCTTCCTCCC ATGACACGCT GTGCACCAGC TGCACTGGCT TCCCCCTCAG 86580
CACCAGGGTA CCAGGTGAGC CAGAGGCCTG AGGGGGCAGC ACACTGCAGG CCAGGCCCAC 86640
TTGTGCCCTC ACTCCTGCCC CTGCACGTGC ATCTAGCCTG AGGCATGCCA GCTGGCTCTG 86700
GGAAGGGGCC ACAGTGGATT TGAGGGGTCA GGGGTCCCTC CACTAGATCC CCACCAAGTC 86760
TGCCCTCTCA GGGGTGGCTG AGAATTTGGA TCTGAGCCAG GGCACAGCCT CCCCTGGGGA 86820
GCTCTGGGAA AGTGGGCAGC AATCTCCTAA CTGCCCGAGG GGAAGGTGGC TGGCTCCTCT 86880
GACACGGAGA AACCGAGGCC TGATGGTAAC TCTCCTAACT GCCTGAGAGG AAGGTGGCTG 86940
CCTCCTCTGA CATGGGGAAA CCGAGGCCCA ATGTTAACCA CTGTTGAGAA GTCACAGGGG 87000
GAAGTGACCC CCTTAACATC AAGTCAGGTC CGGTCCATCT GCAGGTCCCA ACTCGCCCCT 87060
TCCGATGGCC CAGGAGCCCC AAGCCCTTGC CTGGGCCCCC TTGCCTCTTG CAGCCAAGGT 87120
CCGAGTGGCC ACTCCTGCCC CCTAGGCCTT TGCTCCAGCT CTCTGACCGA AGGCTCCTGC 87180
CCCTTCTCCA GTCCCCATCG TTGCACTGCC CTCTCCAGCA CGGCTCACTG CACAGGGATT 87240
TCTCTCTCCT GCAAACCCCC CGAGTGGGGC CCAGAAAGCA GGGTACCTGG CAGCCCCCGC 87300
CAGTGTGTGT GGGTGAAATG ATCGGACCGC TGCCTCCCCA CCCCACTGCA GGAGCTGAGG 87360
AGTGTGAGCG TGCCGTCATC GACTTTGTGG CTTTCCAGGA CATCTCCATC AAGAGGCTGC 87420
AGCGGCTGCT GCAGGCCCTC GAGGCCCCGG AGGGCTGGGG TCCGACACCA AGGGCGGGCC 87480
GCGCGGCCTT GCAGCTGAAG CTGCGTCGGC GGCTCACGGA GCTCCTGGGG GCGCAGGACG 87540
GGGCGCTGCT GGTGCGGCTG CTGCAGGCGC TGCGCGTGGC CAGGATGCCC GGGCTGGAGC 87600
GGAGCGTCCG TGAGCGCTTC CTCCCTGTGC ACTGATCCTG GCCCCCTCTT ATTTATTCTA 87660
CATCCTTGGC ACCCCACTTG CACTGAAAGA GGCTTTTTTT TAAATAGAAG AAATGAGGTT 87720
TCTTAAAGCT TATTTTTATA AAGCTTTTTC ATAAAACTGG TTGTAGTTGC ACAGCTACTG 87780
GGAGGGCAGC CGGGGACACC TGAGCCGCCC GCTGTGCCCA GATCCCTCAG GCTGCCTGCC 87840
ATCAGAACTG CTGCCCGGGG CTTCCCCTAC CTCAGACAGA CCCTCCCTGG GAGGATCAGT 87900
GGGGAGTGCC ACCTCTGCCC CCAGTGGCTG TGGCACGTGG CAGGGGCCCC TGAAGCTCAG 87960
CGAGGGTCAG GGCCTGGGAG GGTATCATTG CTGGAAGAAC AGGATGGGGC TCAGGCCAGC 88020
CCTAGTCGCC GGGGCCCACA CTAACCCCCC ACTTATGAAT TCCTCCCACT CCCAACTCAC 88080
AGGGGATTTC CCGAGAGGGG ACCTGCCAAA GACCTCCTCC AGGCCTCCCA TGCTTCCCGG 88140
GAAGTGAAGC TTCTCCCCCT CTGGGGCAGG CTCTGAAGCC TCCCGATGCA CCCAGAGCAA 88200
CCAGGGGGCT GCACCAGCCA CTCGCCTCCC CAGCACGGCC AGGTTCCCGG GGCTGGAGGT 88260
CCCCCCCAGG TCCTGGGAAC CAACCTGCAG AACACACACA GGGTCCCCTG GAGAGGACGC 88320
GGGGACTTCC AGGGCCCGAC TCCTGTGAGT CACAGCCCCG CAGCTGCTGC GCCACCCCCA 88380
CCCTGACTCA TGCCCCTTCC CAGCAGCTCC TCCCAGGACC CCATGTCCTT CCCACATCCG 88440
CAGGAAGGGA GTGCCTGGAC TCTCCAGGCC CACCTGGGGA GCCCCTCACC TGCCCACCAG 88500
CCCCTGAGCA GCCCAGTAAC ACCATCACCG TGTCCAACAG CCAGGAGCCT CCACCCTCCA 88560
GGAGGGAAGG GATGGACAGA GCCACACTCG CCGTCTTTAT TTTGCACTCA CCCTGGGTGA 88620
CACTGGGCAG GCCGCTCCTG CCCACAGCCA GACTGAGGAA GAACACAGCA CTCGGCAGGC 88680
CCAGTGGGGT CCGTGCAGGG AGGACCCCAG GACCAGCCTT ACTCCCGAGC AGGGGACACA 88740
GGGCCCCACA GAGAACCCCT CCGGGAGGTT CTCTCCTGGC TGGGGGAGGG CTCTGGACCC 88800
CCACAAACAC TCCCCAACTT GCGGGGCTGG GGCATAAAAA CAGCCACTCC CAGCAGGCCC 88860
CCTCAGCTTT TTGCATCAGT CAGCTCCCTC CCGGGGGATT AGGGTGAGGT GAAGCCAGGC 88920
CCAGGCGTGG GGTATAGGTC TTCCCCCGCA GGCCTCAGCC CTGTCCCGAG GCTGCATCAC 88980
AATCCAGGGC CCCCGCTGGC CTTTGGGAAC ATGGCCTGGG TCTTCCTCAA GGCAAGATCA 89040
GCCCCAGACC ACTTCCGGGG TCACGGGGTC ACAGGGCAGA AGCCAGATGG CAGCCATGGC 89100
TGACGGGCCT CCTCCTCGAT GGGGCGGAGA CAGCCACGGG GTCTCCCGAG GGTCCCACAG 89160
GGCTGTCCTC ATGCAGCCCA AGCCAGCCTG AGCACTGGAG CCCCAATTCC CAACCAGGTC 89220
TCCCTCAGAC CCCCCAGAAA GGGCCTCGAA AGGCCGCCGC TGCGCCCTGT GGAAAGGCTG 89280
CCGCTGCAGG GCCTGGGCCA GCCGGGCTGC CAGACTCCCC TCCAAAGCCT CCGGATGCCT 89340
ACGCTTTTCC AGACATAGAG GAAAGTTTGT CTTCGAGAAA ACAAAGTAAA TAGAAGAACC 89400
CCAAAGCAAA GCAAACCCAC CCCCCAGATC AGCAGCATGG GAGCCAACAG GAGGCCACTC 89460
CTCCAGCACC AGGGGACCAG CCGTCCCGAC GGCAGCGCGG CTGCGCCTAC GTGATGTCCC 89520
TCTGCCGCGG CGGCCGGTGC ACATTCCGCA CGACACACTT CACCATCCAC TCGATGCCCT 89580
CGCGCACCCC TTTGCTGTGA AGACAGCGGG TGTGAGGCGG GGGGTCTCGG TCCCCAAAGC 89640
CCCCGCAGGT GCAGCCCCCA CTCACCCTGT GAGGGCCGAG CAGGCCTGGG TCAGGCAATC 89700
GCGCCTGCCG ATCTTGCTGG TGCAGTCGCT GAAGGCCGTC TTGATGTCAG GGATTGAGAG 89760
GCACGTCTGG GGGAGGTAAG GCCGTGAGGA GCAGCCCCCA CGTCTGGCCC TGTCCTGCCT 89820
GTGGGCCCGG GACTCTCAGA AGGGCGTATG CCCTTCACCC CAGGGAAACA GCCAGAGCTC 89880
CACCAGGGTC CCAGTGTCTC CCACAGAGAC CACAGCAGTG AGGACCCTGT GCTCAGCCCG 89940
AGGCTGAACA TGGCTGGTAG TGCCTGAGAC AAACTAGACG TCCACACGGC TCCAAGGAGT 90000
CCACCCCCCA TCCCCTCCCT GGGGGACACC CTGAGCCCCG AGGTGGGGCG CTGAGGACTG 90060
AGGCCTCCTG GGCAGTGGCG GAGGCAGGTC CCAGGGGCCC ACACAGCCGG GGATGATGGA 90120
GAGGTGGGAG CCCTGCATCA GTGATGGGGG CAGTCTGCAG TCATGGTGGC TTCTGCTCAC 90180
AACCACCTGC CCAGTCTTCA AAAAGCAGCC CTCCCCTCCC CTTTTCCTCC GAGGGGAGAC 90240
CCCTGCCCCG TACCAGATGT CCCTCTTGTC GGCTGAGATT GTAGGGGAGG CCAGCCTTAC 90300
AGGCTGGGGG CAACAGAGCC ACCCCAGAGA AGGCAGGAAG TGAAGATTCA CCCGGCCCTC 90360
TGGACGCCGG GCTGCTTCTG TGCAAAGCCA CTCCAAGAGA ACAGCTAGAA CTCAGCGTGG 90420
CCAGTGCTCC CGGGGGCAGT GGCACCTCAG AGGGGTCTTG AGGGGCTGCC CTGGGGGTGG 90480
GGCTGGCACA GATGCCACCT CCAAGGGTAG CAGGAACAGG TAAGGGTCAG AGCTGACTCC 90540
CACCAGGGCC CCAGCATCAC TTCTTTGAGC TCTGAGTTTC ACCTGGGTGT CCCCACAGCT 90600
TGGCCACACA CTCCTGAGAC ACGGCCGCCC TCCTGGGGAG AGGTGCCCTG CATAGCAGGA 90660
AGAGGCCTCT GGGCGCCTGC CCTGAGGTGG GAGAACCTCC AGGGCTGGCA GCAGCAGGTC 90720
TGGAGAGGAA CCAAGCTTGG GAAGCTGCTG GGGGCAGGGC AGGCCTTGAG AATGGCTCTG 90780
TACCCCCTGG GCAGTCACTG GGCCTGGGGT GTCTGGGTGC ACACCTACTC CCCTTGCTGT 90840
GGGGGAGGCT GGGGACTCGG GAAGCTGCTG CGGGAGGCAG GGGTGGGGCT CACCTCCACA 90900
TCCTGCTTGT TGGCCAGCAC CAAGACGGGG ACACCGCACA GCGCCTCGCT GGTCACCACC 90960
TTCTCTGGGG AGGGCAGGAG AGGCAGCGCC TCACACCCAG CATCCTGCCT CTGACTGCCC 91020
AGGGGCCCAC AGGCGTGGAC ACTGTGACAG CCACTCCCTC TGCCCCCCCC CCGTCACCCA 91080
CTAGGCAGGA GCACTTCTGA CCAGACACTG AGCCTGCCCC AGGCACAGAG CTGCCCAAGC 91140
TGGACCTGCC CCCACTCACC ATCCATCCCT CCCAGAGCAG CCAGGCCGCA CTCACCAAAC 91200
GCCTGCTTGG ACTCAGCCAG CCTCTCCTCG TCGGTGGAGT CAATGACGTA GATGACGCCG 91260
TGACACTCCG CATAATACTG GGAGGAAGCA CCAGGAGTTG GGGCTCAGTC CCCACCCTGC 91320
CAAGGGCCAG CAGAGCCAGG CCTGTGTCAT GGCCACAGTG AGGGGCTCAC ATGAGGAAGG 91380
GGCAAGAGGG CAGCCCCCAA CTGCAAGACC CTTCTGGGAT GCATTCTGGG GTTGCGGGGA 91440
GATCTGGTGG AGGTGTCCCC AGACGCTGCT CCTGAGAACC TGCCGGCAAC CTTTGGCCTG 91500
ATGGTGGCCA AAGGTGAAAG ACAGGGATTG GGCCAGGCGT GGTGGCTCAC ACTTATTATC 91560
CCAACACTTT GGGAGGCAGA AGCAGGAGGA TCACCTGAGC CCACTTCACG GCCAACCTGG 91620
GCAACACAGT GAGACTCCGT CTGTACAAAA GCTTATGGTA ATGTGCGCCT GCAGTCCTAG 91680
CTACTCGGGA GGCTGAGGTG GGAGGATGGC TTGAGCCTGG GAGGTTGAGG CTGTAGTGAG 91740
CTCTGATCAC ACCACTGCAC TCCAGCCTGG GTGAGAATGA GAGACCCTGT CTCAAAAAAA 91800
AGATAGGGTT TGGGGGCTGG AGGAACCTAG ACCACAGCCT GGCCCGTTGA GGGAGTGCAC 91860
CTGTGGGGCT CTGTGCCAGC ACCTCGCACA GGGAGGGAGT GTGGCCATGC GGATAAGACT 91920
GACCAGCACC ATCTACGAAG CGAGCCTTCC CTGCCAGGAC AGGGCCAGAG TCACTGAGCT 91980
CAGACCTCTG CAGCCTGGGC TGGTCAGTCC TGGGCTCGCT GGCAACACTC CTGGGCAAGA 92040
CAGGGCACAG CCCCTGCAGC CTCAGGTACA AGTGCTGAGC CCTGGACCAG ATGAGTGCAC 92100
CTCTATCTCA ATCAGAAAAA AACACAGCAA ACTCCGCGTC CACGTGGAGC AGACAACAGC 92160
TCACATTTGC CACTTTGCCT CCAGGCTGTG CCAGCTCTCC TGTCCAGGCA TGAGTGCCCA 92220
GAGACCTAGA ACTGGATGCT GACCAGGTAG GACAAGCTGG TGGTCAGTGT GTTAAGACAC 92280
ACACACCCGA GAGCATGAGA AGCCAGGAGG CACAGCCCAA CTCTCCGAAA TCCTTAGGGT 92340
GTCTGAGCAG GGAGTACCAG ACAACCCCAT CCCAGTGCCA GACAAGCTTG TGCACCTGCA 92400
CTTCCCACAG AGGAGAGAAG CCTGTGCACC TGCACTTCCC ACAGTGGAAA GGAGGAGGCC 92460
CAAGGCCAGG CCCCCCCACC CCCAGGAACT TCCCACAGTG GAGAGGAGGC CCAAGGCCAG 92520
GCGCCCTCCA GGGTTCTGCA GGTAGCGAGG CCCCCCCACC CCCAGGAACT TCTCTGGCCT 92580
ACAGACAGGT CCCACACAGA GGCCGCCAAC CCCTCAAGGG ACCCTGCAGT GTGCCGGCTG 92640
TCTGCTGCTG ACACAAGGGA GCAGGCGGAC CCTAAGGTGG AGACCTCTGT GGCAGGAGGG 92700
GCGGCTCTGT GGAGGCTGCA GCAAGCCCAG TGAGAGAATC TCCACGTGGC TCCTGGGGCT 92760
TCTGAGCAGG GTGGCAGAAG GTTCATGTGC AACCGGGTCC TGGACCATGG GACCACGTGG 92820
CCAGAGCCAC CCATCACACC TACCAGGCAC AAGGTGCACA GCCCAGCAGG GCCGCAGTGG 92880
ACGGGAGCGA CACCTCAGGG CTGAGTGCGG GCAGGACCCA GAGCCCCACG CCCCAGTGGA 92940
GGCGTCACAG CAGTGGTCAT TGTGGGGTGC CCCACAAGGA GGGGGAAGAG GGAGGTGTCC 93000
CAGCGTGGCT CCTGGCTGGC CAGCTGACCC CAGTGGAGCA GTCAGAGGGA CTGTGGGTCT 93060
GAGTTTTTCT CCCCAGCAGC AATGGGAGCT CCCCAACTGC AAAGTGCCAG CCAGCCTGAG 93120
AGACTAGTGT TACAGCAAAG AACCCAGGAG CTGAGGTCCT GGCACATGCC ACACATGTGG 93180
ACACCAACCC AGGGTCCAGC CCCAGGACGA GGCCAATTCG CAATGACGCC CCTTTCTGTG 93240
GTGCTGGCTC TGCACAAGGA TGCAGGATAC AGGAACCAGG GTGGGAGCAG GGGCCTCCCT 93300
TCCGGTCCCT CCCAGTGACC TAGGGGGGTC CCTGCAGCTG ATCCTCCCAG CTCTGAGCTC 93360
AGCAGGGTCA GGGGTCCCGG CCACTAGAGC AGCACATACT CAGCAGACAC GCTGAATGAC 93420
GAGCCACAGC TGCCTCATGG GCATGACTTG CACCTCATGT CTAGGAGACC CTGGTGGGCA 93480
GGAGATGGGG CTGCCATCCC ACAGCTGTCC CACAGCTGGG GACCCAGGGA GCCACTGGCC 93540
CCACCACGGT GGTGTCTGGA GAAGGGCTCA GACTGCCAGG AAGTCGCACC CCAGCAGAAG 93600
TGGTAGTGAA TTGGGAGGGC ACTCAAGGAA GGGCTGTGCA GCCCCAAGAC CAGCAGCAAG 93660
GATGGGCTAC AGTGGCCCCC TTAAGTCTCC CTCTTCCAGT TTCGCCTTAA GAGAGGCCCT 93720
CAGGACCTTG GAGGAACCCC TCTCCAACGT GGAAGTGTGG GTCCACATAG GGCTGCAGCT 93780
GTGGCCAGTG CAGGCATCTC TGGCCCCACT GTATTCTTGC TTCATGTTGG AGAACACTGC 93840
ACCAGCAGAT GGTCTCATTT TGGTTTCTGT GGGACCCACT TTGGCTGCAA AGAGCCACAC 93900
TGCCAGGTCA CACCTGCCCA GGGCAGCCCA CACTGGGGAC CCACCAGGCC ATGGTGTGAA 93960
GTCCCGGCCA GCCTGGCCCC ACATGGCACA GCATAGCCAG TTCTCCTCCA GGGCTCCCTG 94020
CTGGGCCAAC CACAGCTCTG CGGATCCTGC TGCCTGAGTC GACCTCTCCT CTCCCGTCCT 94080
CCCTGCCTTC CTGGTGCCGA CCCCCAGTGT GCATCCTGTA CCTCGACCTG TCTCAGCATC 94140
TGTGCCTGAG ACACCGGCCT GTGACAAGAT CATCATCATC TGTGTCACTC CCCAAGCATG 94200
CTGCGCACTG GACACACAGG CCCTGACTCA ACTTGTCCTG TCTGACTTCA GTGGTCCTAC 94260
AGGATCTATC AGAGATCACT TGGCCATGGG AGAAATGTCT TCTTGGCTAG AAGTCACAGC 94320
AGGAGGGGAC ACTTTGGGGG CGCCTAGGAA AGGGGAACTA GGATCAAAAA AGAGATCAGG 94380
ACCTGGGCAC TCAGCTCTAG AGATGGCATC AGGGCAGCCA AGGCACTGGG GACACCCCAC 94440
ACCCACTGTG CCAGCCTAGG GCAGGGAGCC CGAGGAAGCC ACAGGCTCTG CCCTGCTCAG 94500
TGCTGGACTC AGTGCCTGGC CCAGGCTGAG AAGGAGATAA ACTGCAGCCT TGGGGGTGTG 94560
GGGAAGGGGC ACCACACTGG GATCTCAGAA ATGCCCAAAA CCTGTGTCAA AATAGGAGAC 94620
TGCCGCTGTG AGACCCTGAG GAGTCTTCTG GTGATCATGG AAGAACAAAT GTTAAGCTAG 94680
AACTGAAGGA ACCTCATCAG GGGAGAGGCA GCCATCCTGC CGTCCCCACA TCTGGTCTTT 94740
GCCATTTCTG TGTCCTGTGG TGGTCAGCAG CAAGGTCTCT GAGCCGAAAG GAGGCACTCA 94800
CTTTGGAGGA GTGCAGGGTC CCCAGGTCCC CACACTTTGT CTTGTCCTGA CTGAGAAAGA 94860
AACAGACTGC CCTGACCTCT CTGACTTGGC CAGCGAGGTT GCCCTTAGGC TCAAACCCAA 94920
GCCAGGGTTT GAACATTCCC AGACACTTGT AAGATGTTTA GGTTGTTAAC ATAATGTTCA 94980
GGTTTCAAAA CATTGAAAGA AACTAGCCCC AGCCCTGAAC CCAGATCCCC CCCGGCTTCA 95040
GGCATGACC GTGAACACGC CCTTCTCTCA CTGGTCACCT GAGGATGCCG CACTCTGTC 95100
ACAGGTTCCC CTAATACATG CTCTGATCTG ATCGCCTTGG CATTTAGTGA TTCTTTCCCT 95160
GGAATTCTCC ACTGGCCCCA TCGCAGGGAA CTCCCAAGTG GGAAACTCCC CTACCACCAC 95220
TTTTGGGGCA ACTTCAGCTA AGGGTTCAGC TGGGACAAAA CAGGGAGCCA CTCGGGAACC 95280
TGGGACAGGA CCAGAGAGAA AACCCGAGGG ACAGAGTGGG TAAGGAAAGC TGCTGAGGAA 95340
GGGCCCAAAG GGCACTCTGG AAAGAAGTGG CACTGGAGGG CTGGGGTGGG GGTGGTCCTG 95400
GCCAGGGAGT CTTACCTTGT CCCACAAAGA CTGCAGCTCT TCCTGCCCTC CTAAGTCCCA 95460
GAACATGAGC CGAGCCTTTC CCACATCCAC AGTGCCGACT GGGGAGAGGA GGAAACAGGC 95520
AAGGCTCATG ACCTTGGTCC TCGACACACC CAGTCCCAGC TCTCCCAGGG GATGGGGCAA 95580
ACCATGCTGG TGCCACTCAA ATGAGACTTG AGAGGGGCCC GACAGGGCTG TGGCCACGGG 95640
CCAGCTGGAC TGTGAATATC ACGGCATCCT CAAGGCCCCA AACCCACAGC CTGCTATTGA 95700
GACCCTTACT GTTTAGGCCC ACGGTGGTGG TGATTTTGGA TAGACTCATC CCCTTGTAGT 95760
TCTTGTTAAA TCGGGTTTTC GACTGCTCCA GGAAGGTCTG AGGAGAGAGG CAGAGGCGAA 95820
ACACATCAAG GAGGGGCTAT ACTGGCTTCC AAATATCCTT ACTCAGGTCT GTTCTTTAAA 95880
AGACAGAAAC AGAAACAGAG CAACACTCTG CTCTTCAGGA GGCTGGTGGT GACTATCCTG 95940
CCGTCTCAGG TGAAATTTGG CTTCCGTCTG GGTAGTGAAC GTGCAGCTGA CAGCACAAAA 96000
CCGAAGGGGG CGCCGCCAGG CCGTGGGAAA GGTGCGCGCA AGGGCGTGGG CACTCACCGT 96060
CTTCCCAGCA TTGTCCAGGC CCAGGATCAG GATGCAGTAC TCGTCCTTCT GAAACATGTA 96120
CTTGTACAAG CCCGACAGCA GCGTGTACAT CCTGCCCTGG GCACCCCAAC ATAGGTCAGT 96180
GTGCAGCCAG AAAGCACCTC CCCTCCCCCG GGCTTCTCCA CGGTGGTCAG TGGCGCCCCA 96240
CGTCCAGCCG ACCGCTCAGG ACGAGAGCCT GGGGGCCATT CCCGACTCCT CGTCCCTCTC 96300
CCACCCCGTC CCTCTGTAAC TTCTCCCAGG TCAGCCGCCA CTGTGTCCTG CTCACAGCAA 96360
TGACTGCGAC CTCTCCGCAT ACACATCGGT TCCGGCCCCT CCCCTGCTCG CGGGACTACC 96420
CAGCCGGGTG TTCACAGTGA GCTCAGCCGC GCTCCCGCCC TCCCCCGAGG CTTCGCTCCC 96480
ACGCTTCACG CGCGCGGAAC GGGGAACACA CTCGCTGCAG CCCCGCCTGG GCCACGGCAC 96540
CCTCGAGCGC CAGCCCCGCG CCCCACCCGG GAGCAGCGAG CCACCGGCGC GCTCCCCAGG 96600
AGCCCCTGCA GGCGCCGGGT AGGGACGCCC CATCACCCCA TTTCTTAAAA CGGGGACGGC 96660
CCTGGGGGGA GCGGACTACA GGGCGGGTGA GCAGCGGCGC GGCTGCTCCT GGAGTGCACC 96720
TGGAGGCGGC GCGCGGCTGG CAGGGAACGA CTGCGAAGGA AGAACCTGGG TCGCGGCCCC 96780
CGGCTACGTC CGCCCCAAGC CGCCGCCGCC AGGTCTGAGG CTCCCCGACA AGCAGCCAAA 96840
GCTGGCTCCT GTCACACCCG CGTCCCACCT CGAGTCCTGG GCCGCCCCTC GGGCCTCGCG 96900
CCTCACCGCA CAGCCTGCGG CCTACCTGCG TCCGCCGCGC CCTCGGAGCC GCTGCTGCTG 96960
ACCCCCGCTG ACCTCCGCTG ACCCCGCGCT AACCCCGCGC GGCGCCTGAC GGGACGCGGG 97020
CCGGCCTCAG GGAATGAGCT GAACCGCGTC CCAGCGGCCT CCGCGCTCCG CTTCCCGGCT 97080
GCCCCCGCGC GCCAAGCACT TCCGGAAGCG GCGGCGCTCG GGAGGAAGTG CCGATCGGCT 97140
GCTGGGGCGA AAAGGGGGCG CCGGGCCGCT CTAGCCGGTG AGGCCGGCGG GCTCTCTGTG 97200
GCTGCGGCTG GGAAACCGCG CGGAGGAGGT GCCCGGCCGG GGACCAGGTG GCCGCGGTTT 97260
GCGGGGACGC GGCCCTGGCC AGACAGAAGA GACGCCGGGC GGGGGGGCGC GGCCGGCCTG 97320
GAAGGCGGCG GGCGCGGCGG GTGGGCTCGG CGGAGGGTGA GGCGGCGGGG CGCCCCGCGG 97380
GGAAGGGGCT CCGGAGTGAC GCGGGACCCG GCTAGCGGCG AGCCCACGGC GGCTCGGAAG 97440
GGAAGCGCGG AGCCTGAGCG GGGGTACCCG GGCTGCGACC TCTGCGCTGG GAGCTGTGCC 97500
TCTGAGCCGG TGTCTCCCCG AGGGAAAGGG GACGTGCCCG TGCCCGTGCC CGCCCTCAGG 97560
CTGTGGGGTC GGTCCCGAGA CGCGGGGCTC AGCTGGCTTC TCTTCTTGCA GCCCTGGTCC 97620
AGCGCCTCCC TCTCTCAGCA TGGACGAGGA GAGCCTGGAG TCGGCCTTGC AGACCTACCG 97680
TGCGCAGCTG CAGCAGGTGG AGCTGGCCTT GGGCGCCGGC CTGGATTCGT CTGAGCAGGC 97740
TGACCTGCGC CAGCTGCAGG GGGACCTGAA GGAGCTCATC GAGCTCACCG AGGCCAGCCT 97800
GGTGTCTGTC AGGAAGAGCA GGTTGTTGGC CGCGCTGGAC GAAGAGCGCC CGGGCCGCCA 97860
GGAAGATGCT GAGTACCAGG CTTTCCGGGA GGCCATCACT GAGGCGGTGG AGGCACCAGC 97920
AGCGGCCCGT GGGTCCGGAT CAGAGACCGT TCCTAAAGCA GAGGCGGGGC CAGAATCTGC 97980
GGCAGGTGGG CAGGAGGAGG AAGAGGGAGA GGACGAGGAA GAGCTGAGTG GGACAAAGGT 98040
GAGCGCGCCC TACTACAGCT CCTGGGGCAC TCTGGAGTAT CACAACGCCA TGGTGGTGGG 98100
AACGGAAGAG GCGGAGGATG GCTCGGCGGG TGTCCGTGTG CTTTACCTGT ACCCCACTCA 98160
CAAGTCTCTG AAGCCGTGCC CGTTCTTCCT GGAGGGAAAG TGCCGCTTTA AGGAGAACTG 98220
CAGGTAAAGC CCTTTGTTGT CAGATGCCAA CCTTAGGGGC GTAAGGGGCA CGCACACAGG 98280
GTCGGGTCAG GATCGGCCCT CCCTTTGCTT TGCAGTTTTG TCTCAGCTTC CTGGGGCAGG 98340
CGTGCTTTGA CAGCTGTGTC TGTGTTCAGG CGTCTACGTC TTCCTTCTGG GGTGAATCAA 98400
GAAGCATGGA AGGAGGCCAG GCGCGGTGGC TCACGCCTGT AATCCCAGCA CTTTAGGAAG 98460
CCGAGGCGGG CAGATCACCT GAGGTCAGGA GTTCAAGACC ACGCTGGTCA ACATGGTGAA 98520
ACCCCATCTC CTTAAAAACA CAAAAATGAA CCGGTCGTGG TGGCGCGCAC CTGTGGTCCT 98580
GGCTACTCAG GAGGCTGAGG CAGGAGAATT GGTTGAACCC AGGAGGCCGA GTTTGCAGTG 98640
AGTGGAGATG CAGCCACTGT ACTGCAGCCC GAGCAGCAGT GCAAGGCTTA TGTGGAAGAG 98700
AGTAGGTCTC CAGCCTATCG TCAGTTTTTT TTTGGTGGTT GTTTTAATTT TTTTTGAGAC 98760
AGGGTCTTAC TTTGTCAACC AGGCTGGAGT GCAGTGGCAT AGTCCTGGCT CACTGCAGCC 98820
TGGACCTCCT GGGCTCAACC GATCCTCCTG CCTCAGCCCC CCTAGGAGCT GGGCTACAGA 98880
CTCACGCTAC TACACCCAGC TAATTTTTAT ATTACTATAA TTTTTTATCT TTTTTTTGAG 98940
ACGGAGTCTT GTTCTGTTGC CCAGGCTGGA GTGCAGTGGC GTGATCTCGG CTCACTGCAA 99000
GCTCCGCCTC CCGGGTTCAC GCCATTCTCC TGCCTCAGCC TCCCGAGTAG CTGGGACTAC 99060
AGGCGCCCGC CACCATGTCT GGCTAATTTT CTGTATTTTT AGTAGAGACG GGGTTTCACC 99120
ATGTTAGCCA GGATGGTCTC AATCTCCTGA CCTCGTGATC CGCCCACCTT GGCCTCCCAA 99180
AGTGCTGGGA TGACAAGCGT GAGCCACCGC GCCTGGCCTT TTTTTTTTGG AGACAGAGTT 99240
TCACTCTCCT CACCCAGGCT GGAGTGTAGT GGCGCAATCT CAGCTTACCG CAACCTCTGT 99300
CTCCCGGGTT GAAGTAATTC TCTACCTCAG CGTCCAGAGT AGCTGGCATT ACAGGCGCCC 99360
GCCACCACAC TCGGCTAATT TTTTGTATTT TTAGTAGAGT CGGAGATTCA CCATCTTGGC 99420
CAGGCTGGTC TTGAACTCCT GACCTCGTGA TCCACCCACC TTGGCCTCCC AAAGTGCTGG 99480
GATCACAGGC GTGAGCCACT GCGCCTGGCC CTGTTGTTAG TTTTATTCTC TAGAGTTCAA 99540
CTTTTAAATT TTACTTTCAT GGAGATTTTC AAACATACCC CAAATTAGAG AGTTTAGCAT 99600
AATCACCGCC CACGGTCCAT CATCCAATGT CGTCATTTAT TAATATTTTC CCAGTCTCAT 99660
TTTGTCTGTT CTCCCTGCCC TATTTTTTTC TTTCCTGGGC CATTTTAAAG CAAATTCCAG 99720
AAGTTACTGG TTTTTTCCAA TTATGAATAC TTCATAGTTG CATCTCTAAT CTAACTGATT 99780
AGGAAATTAC TTAAAAAGTA ACTTTTTGGA AGTCCAAGTC CGATGTGAGG ACAAAAAAGA 99840
GTAACTTCTG TGTCATAATA GGTAACACAT TTAATGGTAA TACCTCTTCC ATATTCAAAT 99900
ATGAACAATT ATTACTGTAA TGTCTCTATT TCCCTAAGCG CATAGCTTTA TTTTTCCTCC 99960
TTTTTACTTT TCTCTTAGAA GAAATATTTA CCAAGCCTTC TAGTAGGTAA TTTTCTTTTT 100020
TAGCCAATAG TTCAGGCTGA CCGTGTAACC ATCCCTAGTT CTAGTTCTAG TTCTTTGAAT 100080
GTCTTCCTTT TTTTTTTTTT TTGAAACAGC GTCTTGCTGC TCTGTCACCC AGGCTGGAGT 100140
GCAGTGGCAC AATCTCGGCT CACTGCAATC TCCGCCTCCC TGGCCCAAGC CATCCTCCCA 100200
CCTCAGCCTC CCTAATAGCT GATACTACAA GTGTGCACTG CCACGCCCAG CTAATTTTTG 100260
TATTTTTTGT AGAGACGGGA TTTCACCATA TTACCCAGGT CTCGAATTCC TGATCCCTTT 100320
GATGAGAGAT CTGACACATC CCTGTGGTGC TCCCTCTGGA CCAGGCACTG CTCCAAGGGT 100380
TTCATATACT TTCATTCATC TGTGCAACAG CCCTGTAGGT AGGCCCTGCA GTCACACCAT 100440
CTGACAGAGG AGGAAACAGG AGTAGAAGAA CTGAGTGGTC CAGGGCTTCA AGGCTCAGAG 100500
GGCTCCAGTT GCCCCCAGCC CTCGTTCCGT CCCCTGCTCC ACCCAGTGCT GCTTGCCATG 100560
TCGGCATCAG GCCTGATCTG AAAGCTTCCG GAGCATCTTA CAGACGTCCA CCTTGCCACC 100620
ATTCAGGACT GATAAGTTCT CTTGGATTTG CGTTGGACCT TTTTTTTTTT TTTAAGATGG 100680
AGTTTCACTG TTGTTGCCCA GGCTAGAGTA CAATGGCACG ACCTCCACCT CCTGGGTTCA 100740
AGGGATTCTC CTGCCTCAGC CTCCCAAGTA GCTGGGATTA CAGGCGCCTG TCACCACGTG 100800
GTGCCCAGCT AATTTTTATA TTTTTAGTAG AGGCAGGGTT TCACCGTGTT GGCCAGGCTG 100860
GTCTCGAACC CTTGACCTCA GGTGATCCCG CCTTGGTTTC CCAAAGTGCT GGGATTACAG 100920
GCATGAGCCA CCACACCCGG CCCAGGATTT CTTTATATAT TCTGGATATC ATCCCTTATG 100980
AAGTATATAG TTTGCAGATA TTTGCTCCCA TTGTTTGGGT TGTCTTTTCA CTTGATATAG 101040
TGTCCTTTGA TGCACAAACA TTTTAAATTT TGATGCAGTG CAATTTATTG TTTCTTTATT 101100
GCCTATGTTT TTGTCATCAG GTTTAAGAAA CCACCTCATC CATAGTTATG AGGATTTTCA 101160
CCTATGTTTT CTTCTAAGAG TTCTGTAGTT TTAGCTGTTA AATTTAGGTC TTTGATCCAT 101220
TTTGAGTTAA TTTTTGTATA TGTTATTAGG TGAGGGTCCA CTTTATTCTT TTGCATGTGG 101280
ATTTCCAGTT TTCCCAGCAC CATTTGTTTA AAAGACTGCT TTTTCTCCAC TGAATGGTCT 101340
TGGCACTTTT GTCCAAAATC AATTGGCAAT ATATGTAAGG GTTTATTTCT GAGCTCTCTC 101400
TCCTGTTCCA TTGGTGTATA TGTGCCAGTA CCACACTGTT CTGATTATTA TAGCTTTGTG 101460
ATAAGTTTTA AACTCAGGAA GTGGTAGTTA TTCACCATTT GCTCCTCTTT TTCAAGTTTG 101520
TTTTGTTTCT GGATCCTTTG CAATTTCATA TGAATTTTAG GATCGGCTTG TCCAATTCTG 101580
CATAAAAGAC AGTTTGAATT TTGATATGGA TTGCATAGAA TGTGTAGATC TGTTTGGGGC 101640
ACATTGTCAT CTTTACAATA TTAAGCCTTC TGGCTGGGTG TGGTGGCTGA CGCCTGTAAT 101700
CCCAGTACTT TGGGAGGCTG AGGCGGGCAT ATCACTTGAG GTCAGGAGTT CAAGACCAGC 101760
CTGGCCAACG TGGTGAAACC CCGTCTCTAC TAAAAATAAA AAACAAATTA GTCGGAGGTG 101820
GTGCACACCT GTAATCCCAG CTACAGGAGA GGGTGAGGCA GGAGAATCGC TTGAACCTGG 101880
GAGGAGGAGG TTGCAGTGAG CTGAGATCAT GCCACTGCAC TCCAGCCTGG GTAACAGAGG 101940
GAGACTCCAT CTTAAACAAC AACAATAACA GAAGAAAAAA ACAGTATTAA GTCTTCCAAT 102000
TCATGAATGA AGGATCTGTC CATTTATTTA CGTCTTTAAT TTCTTTCAAC AGTATTTTGT 102060
ACTGTTCAAG TCTTGCACAT TCTTGGTTAA ATAAGTATTA TTTTTGATGC TTCTCTAAGG 102120
AATTGTTTTT CTTTTCCTTT TTTTTTTTGA GACAGAGTCT TGCTCTGTCA CCCAGGCTGG 102180
AGTGCAGTGG CACAATCTTG GCTCACTGCA ACCTCTGCCT CCCGGGTTCA AGCAATTCTT 102240
CTGCTCAGCC TCCCAAGTAG CTGGGATCAC AGGTGCCTGC CACCACACCC AGCTAATTTT 102300
TTTTTTTGAG ATGGAGTCTT GCTCTGTTGC CCAGGCTGGA GTGAAGTGGC CCAATCTTGG 102360
CTCACTGCAA GCTCCACCTC CCGGGTTCAC ACCATTCTTC CGCCTCAGCC TCCTGAGTCG 102420
CTGGGAATAC AGGTGCCTGC CACCACGCCC AGCTAATTTT TTGTATTTTT AGTAGAGATG 102480
GGGTTTCACC ATGTAGCCAG GATGGTCTCG AACTCTTGAC CTCAGGTGAT CTGCCTGCCT 102540
CGGCCTCCCA AAGTGCTGGG ATTACAGATG TGAGCCACTG TGCCCGGCTC GAGTTGTTTT 102600
CCTTAGTTAC ATTTTCAGGC TGTTTGTTGC TAGTATATAG AAATACAAGC TGGGCACCGT 102660
GGCTCACGCC TGTAATCCCA GCACTTTGGG AGGCCAAGGC GGGTGGATCA CCTGTGGTCA 102720
GGAGTTCGAG ACCAGCCTGG CCAACATGGT GAAATCCAGC CTCTATTAAA AATACAAAAA 102780
TTAGTCTGGC ATGGTGGCAG GTGCCTGTAA TCCCATCTAC TCAGGAGGCT GAGGCAAGAG 102840
AATTGCTTGA ACCTGGGAGG CGGAGGTTGC AGTGAGCTGA GATCGCGCCA TTGCACTCCA 102900
GCTTGGGGAA CAAGAGTGAG ACTTCATCTC AAAAAAAAAA AAAAAGAAAT ACAGTGGATT 102960
TTTTTATGTT AATCCTGTAT TGATTGCTGA ATTGGTTTAT TAGTGCTAAT AGGATTTTTT 103020
ATGCACTATT TAGGATTTTC GATATATACA ATCATATATA TTCAATATAT ACAATTAATA 103080
TATATGTGAA TAGAGATAAT TGTAGTCTTT GTTTCTAGTT TGCATGGCAT TTATTTCTTT 103140
TTCTTGCTTA ACTGCCTTAG CTAGAACTTC AAGTACGATG TTGAATAAAA GTGACTAGAG 103200
CGGGCCGGGG GTGGTGGCTC ACACCTGTGT TCCCAGCACT TTGGGAGGTG GAAGTGGGCA 103260
GATCACTTGA GATCAGCAGT TTGAGACCAG CCTGGCCAAC ACGGCGAAAC CCCATCTCTA 103320
CTAAAAATAC AAAAATTAGC TGGGTGAGGT GATGTGCACC TGTAGTCCCA GCTACTTGAG 103380
AGGGTGAGAC ATGAGAATTG CTTGAACCTG GGGGGCGGAG GTTGCAGTGA GCCAAGATCA 103440
TGCCACTCCA CTCCAGCCTG GACGACAGAG CAAGAACCCT GTCTTTAAAA AAAAAAAAAA 103500
AAAAGTGGCT AGAACAAACA TCTTTATCTT GTTCCTGATC TTAGGTGGAA AACTTTTTTG 103560
TTCCTGATAT TAGGTGGAAA ACTTTTAGTC TTTCACTGTT GAATATGATG TTACTTGTAG 103620
GTTTTCTGTA GATTCCCTTT ATCGAGTTGA GGAAATTCTC TTATATTCAT AGTGTGTTGA 103680
GTGTTTTTTA TCATGAAAGG GTGTTGATTT TTTTTTTAAA GATAGGGTCT TGTTCTGTCA 103740
CCCAGGCTGG AGGGCAGTGG CATGATCATG GCTCACTGCA ACCTCGAATT CCTGGGCTCA 103800
GGGGATCCTC CTACTTCATC CTCCTGAGTA GGTGAGACTA CAGGCATGAG CCACCATGCC 103860
CAGCTAATTT TTTAATTTTT CTGTAGAGGT AGGGTCCTGC TTTGCTGCCC AGGCTGGTCT 103920
TAAACTCCAG GGCTCAAGCA ATCCTGCCTC AGCCTCCCAA AGTGCTGAGA TTACAGGGGT 103980
GAGTCACTGC ACTGCACCCA GCTGTGTGGG ATTTTTCAAA TGCTTTTTTC CTTTAGATGA 104040
TCATGTGTGG TTTTTTTCCT TTCATTTTGT TAATGTGGTA TATTGATTTT CGTATGTTGA 104100
ACCATCCTTG AATTCCTCAG ATAAAGCACG CATATTCATG GCGTATTATC TCTTTATTAT 104160
TATTTTTTTT GTAGAGATGA GATTTCACTC TGTTGCCCAA GCTGGTCTCA AACTCCTGGG 104220
CTAAAGTGAT CCTCCTGCCT CAGCCTCCGA AAGCGCTGGG ATTATAGGCA TGAGCCACTT 104280
GGCCCTATCT TTTTTCTTTT TCTTTTTTTT TTTTTTTTGA GACAGAGTCT CACTCTGTCG 104340
CCGGGCTGGA GTGAGTGGCG CGATCTCGGC TCACTGCAAC CTCCATCTCC CGGGTTCAAG 104400
CAATTCTCCT GCCTCAGCCT CCTGAGTAGC TGGGACTACA GGTGCCCGCC ACTATGCCCA 104460
GCTAATTTTT TGTGTTTTTA GTTGAGACGG TGTTTTGCCA TGTTGGACAG GCTGGTCTTG 104520
CACTCCTGAC CTCGTGATTC ACCCACCTTG GCCTCCCGAA GTGCTGGGAT TACAGGCATG 104580
AGCCACCGCA GCGAGCCTTA TCTTTTTAAC AGTTAAAAGT TTAAGGCCTT ATCATGTAAT 104640
AACATTGCTG GATTTGATTT GCTGCTGTTT TGTTGAGAAT ATTTGCATCT GTATTGATAA 104700
GGGATATTGG TCTGTAGTTT TCTTTTCTTG GCATGTCTTT GTATAGCTTT GATGCCAGCA 104760
TAATATTGGC CTCATAGAAT GAGTTAGGAA GTATTCTTTA TATTATGGGA AGAGGTAAAA 104820
AGGGATTGGT GTTAATTCTT CTTCAAATGT TTGATAGAAT TCAACAGTGA AGTGATAT T 104880
ACAATCATAT ATATAGAGAG AGAGAGAGAG AGAGATGGAC TTTTCTTTTG TTGGAAGTTT 104940
ATTGACTATT GATTCAATTT CCTTATTGAA ATTGACTTTT CTTTTTGGAA GCTAAAATGT 105000
ATAACTGTAG TGAAAGTTTC TGAACTTTTC TTTCATTGGA 
AGTTTTTTGA CTACTGATTC 105060
TTTATTTGTT ATAGGTCTAT TCAGATTTTC TGTTTCTTCT TGAGTCAGTT TGGTCTCGCT 105120
CTGTCGCCCA GGCTGGAGTG CAGTGGTGCC ATCTTGGCTC ACTGCAACTT CTACCTCCCG 105180
AGTTCAAGTG ATTCTCCCAC CTCAGCCTCC CCAGTATCTC GGACTACAGG CGCACGCCAG 105240
CATACCTGGC TAATTTTTGT ATTTTTAGTA GGAACAGCAT TTCACCATGT TGGCCAGGCT 105300
GGTCTCGAAC TCCTGACCTC AGGTGATCCA CCCGCCTCGG CCTCACAAAG TGCTGGGACT 105360
ACAGACATAA GCCACCGCGT CCAGCCTTGA GTCAGTTTAG ATAGTTTGCA TGCATGTTTC 105420
TAGGAATTTG TCCATTTTGT TTATGTTATC TAATCTGTTA CCATACAATT GTTCATAGTA 105480
TCCTTTTATA GCCCTAGTTA TTTCTGTAAG ATCAGTAGTA ATAGCTCCAC TTTCTCTCTT 105540
GGTTTTAGCA ATTTGAGTCA TCTCTTTTCT TCTTCTTTTT TTTTTTTTGA GATGGAGTCT 105600
CACTGTGTCA CCCAGGCTGG AGTGCAGTGG CATGATCTTG GCTCACTGCA ACCCCTGCCT 105660
CCCAGGTTCA AGCAATTCTG CCTTAGCCTC CTGAGTAGCT GGGATTACAG GTGTGAGCCA 105720
CCACACCCAG CTAGTTTTGT TTTGTTTTTT TGTTTTTGAG ACGGAGTCTG TTTCTGTCTC 105780
CCAGGCTGGA GTGCAGTGGT GCAATCTCAC TCATTGCAAC CTCCGACTCC CAGATTCCAG 105840
CAATTCTCCT GCCTCAGCCT CCCGAGTAGC TGGAACTATA GGCGTGCACC ACCACGCCTG 105900
GCTGATTTTT ATATTTTTAG TAGAGATGGG ATTTCACCAT GTTGGCCAGG CTGGTCTTGG 105960
ACTCCCTACC TGAGGTGATC CGCCCACCTT GGCCTCCCAA AGTGCTGGGA TTATAGGCAT 106020
GAGCCACCAT GCCCAGCCAG TTTTTGTATT TTTAGTAGAG ATGGGGTTTC TCCCTGTCGG 106080
CCAGGCTGGT CTTGAAATCC TGACCTCAGG TTATCCACCA GCCTTGGCCT CCCAAAGTGC 106140
TAGGATTACA GGCATGAGCC ACCACGCATG GCCTGTCTTT TCTTCTTGGT CATTTTCGCT 106200
AAAGGTTTGT CAATTTTGTT GATCTTTTTT GTTGCTGATC TCTATTGTTT TCCCATTCTG 106260
TTTCATTTAT TTCCATTTTA ACCTTTGTTT CCTTTTTTCT GCTGGTTTGG GTTTAATTTG 106320
CTCTTTTTTT CCCCTAATTT TTCAAGGTAT ACAGTTAAGT TATTGATTTG AGATCTCTTT 106380
TTTCTTTTCT TTTTTTTTTT TTTTTTTTTT TTTGGTTGCT GTTGAGATGG AGTCTCCCTC 106440
TGTCACCCAG ACTGGAGTGC AGTGGCATGA TCTCAGCTCA CTGCAGCCTC CGCCGCCCAG 106500
GCGATTCTCC TGCCTCAGCC TCCTGAGTAG ACGTTTCCCG GCCAAGGTGT TTCTTTTTGA 106560
ATGTAAGCAT TTACAGCTAC AGATTTCCCT CTAAACACTG CTTTCACTGC ATTCCATAAG 106620
ATTGTTTTTT GTTGTTTTTT GTTGTTGTTT TGTTGTTTGA GACACAGTCT CACTCTGTTG 106680
CCGTTTGGAG AGCAGCGATG CGATCATAGC TCTGTAGCCT TGAGCTCCTG GACTCAATCA 106740
GTCCTCCTGC CTCAGCCTCC CAAGTAGCTG GGACTACAGG TGTACACCAC TGCACCTAAC 106800
TAATTTCTTT TATAAGTTTT TGCAGAGGCC AGGCACAGTG GCTCACACCT GTAATCCCAG 106860
CACTTTGGGA GGCCAAGGTG GGTGGATCAC CTAAGGTCAG GAGTTCGAGA CCAGCCTGGC 106920
CGACAGGGAG AAACCCCATC TCTACTAAAA ATACAAAAAT TAGCTGGGCG TGGTGGCAGG 106980
TGCCTGTAAT CCCAGCTACT CAGGAGGCTG AGGCAGGAGA ATCGCTTGAA CCTGGGAGGC 107040
AGAGGTTGCA GTGAGCCAGG ATCACACCAT TGCACTCCAG CCTGGGTAAC AAAAGCAAAA 107100
CTCCATCTCA AGAAAAGAAA AAAAAAAGTT TTTGCAGAGA CAGGGTATCA CTTTGTTGCC 107160
CAGGCTGGTC TCAAACTCCT GACTTGAAGG AGTCCTACTG CCTCAGCCTC CCAAAGTGCT 107220
GAGATTATGG GCAAGAGCCA CCGCACCCTG CCACTTGGCT GTTTTGTTCT GTTGTATTTC 107280
CATTTTCATT GATCTCAAGA CATCCTAATC TCCCTTTTGT TTTTTTGTTC GACTTACTGG 107340
TTATTCAAGA GTGTCTTTAT TTCTGCATAT TTGTAAATTT TCCAAAAAAG TTTTTCTTTC 107400
TTTTTTTTTT GAGAAAGGGT CTTGCTCTGT CGCCCAGGCT GGAGAATGGT GGTGCACAAT 107460
CTTGCCTCAC TGCAACCTCT GCCTCCCGGG TTCAAGTGAT CCTCCCACCT CAGCCTTCCC 107520
AGTAGCTGGG ATTACAGGCA CACACCACCA CACCTGGCTA ATTTTTGTAT TTTAGTCTTA 107580
ACGTGCTGGT CAGACTGGTC TCGAATTCCT GACCTCAGGT GATCTGCCCG CCTTGGCCTC 107640
CCAAAGCACT GGGATTACAG GCGTGAAACA CCATGCCCAG CCCCCAATTT TTTTTTTTTA 107700
ATAGAGAGAA GGTCTCACTC AAGCCCAGGC TGGTCTTGAA CTCCTGAGCT CAAGCTGTCA 107760
TCCCTCCTCG GCCTCCCAAG GTGCTGAGAT TACAGGTGTG AGTCACAGTA CCTGGCCTTC 107820
TTTCAAGACT TTAAAAATGC CATCTTGGCT GGGCACGGTG GCTCACGCCT GTAATCCCAG 107880
CACTTTGGGA GGCCGAGGTG GGCAGATCAC GAGGTCAGGA GATCAAGACC ACCCTGGCTA 107940
ACATGGTGAA ACCCTGTCTC TACTAAAAAT ACAAAAAATT AACCAGGTGT GGTGGCAGGT 108000
GCCTGTAGTC CCAGCTACTC GGGAAGCTGA AGCAGGAGAA TGGCGTGAAC CCGGGAGGTG 108060
GAGCTTGCAG TGAGCTGAGA TCACACCACT GTACTCCAGC CTGGGCAACA GTGCGAGACT 108120
CCGTCTCAAA AAAAAAAAAA AAAATGTCAT CTCACTGCCT TCTGGTCCAA TAGTTTCTGA 108180
TGAGAAATTG GCTGTTAATC TTATTGAGGA ACATTTATAT ATTGACTAGT CACTTGTCTC 108240
TTGCTGTTTT AGGAGATTCT CTATCTTTGG GTTTCAGCAG TTTGATTATA ATGTATCAGT 108300
GTGGATCCCT CAATTTATAA GCTACTTGGA GTTCATTGGA CTTCTTGGAT GTGTAAATTC 108360
ATGTCTTTCA TTAAATTTGC AAAGTTTCAG CTACTATTCT TTGCATCTTG AAATACTAGT 108420
TTTGTTTCTT TCTGTCTGTT TGCCGCTTAT GGAACTTTAT GCATACATTG ATGTGCTTCA 108480
TGGTGTAGCA CAGGTCCCTT GGGCTCTAGG CATTTTTCTT TGTTCTTTTT TTCTTTCTGC 108540
TCCTCATTTT GGATAAATTC AGCTGACCTG TCCTCAAGTT CACTGTTTCT TTCTTCTTCC 108600
TTCTCAAATC TGCTGTTGAA ACTTCTGGTG AAATTTTCAC TACAGTTACT GTACTTTTTA 108660
GCTCCAAAGT TTCTATTTGG TTTCTTTCTG TAGTAATTAT CACTTTACTA GTATTCTCTA 108720
TTTGGTTAGA CATGGTTCTT TTGTTTTCCT TTAGTTCATT ATCCATGGTT TCCTTTATTT 108780
TTAAATTTCT TTTTATTTAG TTATTAATTT TTTTTTTTTT TGAAGCGGGG TTTCACTCTT 108840
GTCACCCAGG CTGGCAGGCA ACGTCACAAT CTTGGCTCAC TACAACCTCC GCCTCCTGGG 108900
TTCAAGTGAT TCTCCTGCCT CAGCCTCCCA AGTAGCTGGG ATTATAGGCA TGTGCCACCA 108960
CACCCACCTA ATTTTTGGTA TTTTTAGTAG AAACTGGGTT TCACCACATT GGCCAGACTG 109020
GTCTTAAACT ACTAACCTCA GGTGATCTGT CCGCCTCAGC CTCCCAAAAT GCTGGGATTA 109080
CAGATGTGAG CCACTGTGCC CAGCCTCTTT TTTTAGTGTA TTTAAGGTAA TTGATTGAAA 109140
GTTTTTGTCT AGTCATTCAA ATGTCTAGGC TTCCTCAGGA ACAGTTTCTA TTAATTTCTT 109200
TATTTTTAAA AAATTTTTTT TAATTTTCTT TTTTTTTTAG ATGGAGTCTC ACTCTATAGC 109260
CTAGGCTGGA GTGCAATGGC TTGATCTTGG CTCACTGCAA CCTCTGCCTC CTGGGTTCAA 109320
GCGATTCTCC TGCTTCAGCC TCCTGAGTAG CTGGGACTAT AGGTGCGTGC CACCACTCCT 109380
GGCTAATTTT TTGTATTTTC AGTAGAGACA TGGTTTTGCC GTGTTAGCCA GGATGGTCTC 109440
GATCTCGTGA CCTCATGATC CTCCTGCCTC GGCCTCCCAA AGTGCTGGAA TTACAGGTGT 109500
GAGCCACCGC GCCCAGCCTA TTTTTTATTT TTTGAGACAA AGTCTCCCTC TCTCACCCAG 109560
GCTGTAGTGC AGTGGCACAA CCCTGGCACA CTGCAGCCTT AACCGTCCAG GCTTAAGTGA 109620
GTCTCCCACC TTAGTCTCCT GAGTAGCTAG AACTACAAGC ATGTGCCACC ATGCCTGGCT 109680
GGTTGTGTTG TTACTGTTTT AGACACAGGG TCTTGCTACA TTTCTCTGAC TGGTCTTGAA 109740
CTCCTGGGCT CAAGCAGTCA TCCCACCTTG GCCTCCCAAG GTGTTGAGAT TACAGGTGTG 109800
AGCCACCGCA CCCGGCCTGT TAATTTCTTT ATTTCCGGTG AATGGGCCAC ACTTTCTTGT 109860
TTCTTTGCAT GCCTTGTAAT TTTTTGTTGA AACCTGCACA ATTTGAAGAT GATAATGTGG 109920
TTACTTTGAA AATCAGATCC TCCGCCCTCT GCAGGGTTCA TTGTTGCTGT TTGTTGTGGA 109980
TTGTCGTTTC TCGTTTGTTT AGTTACTTTC CTGACCTTTT TAAATAAAGA CTATATTCTG 110040
TCAGGGGTGC TTGTTTCTGT TCTTTTAGGT TAGTGGTTAG CTTGTGCTTT GAAAGAGATT 110100
TCTTTAAATA TCTAGTGGCA AAAAGGATAA AGAGGCCGGG CGCAGTGGCT CACGCCTGTA 110160
ATGCTAGGAC TTTGGGAAGT GGAGGCGGGT GGATCACTTG AGGTCAGGAG TTTAAGATCA 110220
GCCTGGCCAG TATGGTGAAA CCCTGTCTCT ACTAAAAATA CAAAAATTAA CCGGGCATGG 110280
TGGCACCTGC CTGTAGTCCC AGCTACTGGG AAGACTGAGG CAGGAGAATC GCTTCAATCC 110340
AGGGGGCGGA GGTTGCAGTG AGCTGAGATT GCGCCATTGC ACTCCAGCCT GGGCAACAGA 110400
GCGAGACTCT GTCTCAAATA AAAAAAAAAA AAAAAGGATA AAGAGTGTCT TCCATCCTTT 110460
CCAGGTTGCC TCTGTACTGG GGCAAGTCCT TCAGTGTCCG CCAGGCTGTT CACGGCTTTT 110520
CCTCAGCCTT TACTTCTCGC TCCCATGGAG CCTAAGGATG AACCAGAGGT GAAAGTTGAG 110580
GGCCTCCTCA GGTGTTTCTG AGCCCCTGTC TAGCCCCAGC TGTGTGCATG GCCTTCTGGA 110640
TTTCCAAGCA TGAACAGGAG CTTTCCAAAG CCCTTAGACC TTCATGTAGC TCTTTTCCCA 110700
GCCTCTTCCT TCCTAGGCTT TTCTGTCAGC TCTTTGCCCA TCTGTTGTTG TCCCTCCCCC 110760
ACAACTTCAG GTAGTATCTA CCTGTAAATG CCTTCAGGCC AGGCGCGGTG GCTCATACCT 110820
GTTATCCCAG CACTTTGGGA GGCCGAGGCG GGTGAATTGC TTGAGGTCAG GAGTTCGAGA 110880
CCAGCCTGGC CAACATGGTG AAGCCCCGTC TCTAGTAAAA ATACAAAAAT TAGCTGGGCG 110940
TGGTGGGTGC CTGTAATCTC AGCTACTCGG GAGGCTGAAG CAGGAGAATT GCTTGAGCCT 111000
GGGAGGCGGA GGTTGCAGTG AGCTGAGATC GTGCCATTGC ACTCCAGCCT GGGCGACAGA 111060
GTGAGACTCC ATCTCGGGGA AAAAAAAAAA AAAAAAATGC CATCAACAGC ACGACCCTGG 111120
AGGCTGCCCC AGCCCTGAGA GAGTTCGAGG GGGTGAAACA AAGGCAAGCC CTTCAGGGAG 111180
ACACTAGAAA GATCCAAATG CATAAGCAGG ATTCCTTGAG AAAAGGTCTG TATCATCCCT 111240
TCTGACACCA GCAAGCCACA TCAGAAATAC AGGTTGCCTT CCCCATGGCT ACATGTGAGC 111300
TGGTAGTAGT GGCTGAGCAG AAATAGCCCA GCTGTCCTCC TGAAATTTAG CAGGGTCTTA 111360
CTTCATTGAG CAGTCATCTG GTTCGTAGAC ACCAGAGTTA CAGAAAAGTT TATTGGGAGG 111420
TTTTGACAGT TTAATAGAAA AAAGTTTATT GTGACAGTTT TGACAGCTGA ATAGAAAAAA 111480
GTTTACTGTG ACAGTTTTGA CAGCAGAATA GTTGCTTTGC TGGAGAGACG GATCTTTGGA 111540
GCTGCCAACT CCATCATTTT GGTGATATCC AGCTCTGTTG CTGAATTTTT AGCTATGCTG 111600
TTTTAAGTTA TTTTCTTAGT GGTTGCTCTA GAGATGACAA TGTGCATCTT TAACTTACCA 111660 CAATGTACTT CAGATTATTA CTAACTTAAC ACTTAAAGTA CAGCATTTTT TTTTTTATGG 111720 AGTTTCACTC TGTCACCCAG GCTGGAGTGC AATGGTGTGA TCTCGGCTCA CTGCAACCTC 111780 CGCCTCCCAG GTTCACGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG GACTACAGGC 111840 ACCCCCACCA CACCCGGCTA ATTTTGTATT TTTAGTAGAG ATGAGGTTTC ACCATGTTGG 111900 TCAGGCTGGT CTCGAACTGC TGACCTCAGG TGATCCGCCC ATCTTGGCCT CCCAAAGTGC 111960 TGGGATTACA GGTGTGAGCG ACTGCACTGA GCCTAAGTAT GGCAACGTGT CTATAACATA 112020 GATCTACTTC CGTTGTACTA TGACATAGTT CCCCCTCCAT TTTCCTATAG CACAGTCCCA 112080 ACCTCCCTTT TCCTCTGACA TAGTTCCATC CTCCCTCCTC CTATGACGTC CTCCCTTCTC 112140 CTCTGGCATA GCTCCATCCT CCCTTCTCCT ATGACACAGC TCCATCCTCC CTTCTCCTCT 112200 GACACAGCTC CATCCTCCCT TCTCCTATGA CACAGCTCCA TCCTCCCTTC TCCTCTGACA 112260 TAGCTCCATC CTCCCTTCTC CTATGTCATA GCTCCATCCT CCCTTCTCCT CTGACACAGC 112320 TCCATCCTCC CTTCTCCTCT GGCATAGCTC CATCCTCCCT TCTCCTATGA CACAGCTCCA 112380 TCCTCCCTTC TCCTATGACA CAGCTCCATC CTCCCTTCTC CTATGACACA GCTCCATCCT 112440 CCCTTCTCCT ATGACACAGC TCCATCCTCC CTTCTCCTCT GGCATAGCTC CATCCTCCCT 112500 TCTCCTCTGA CATAGCTCCA TCCTCCCTTC TCCTCTGACA TAGCTCCATC CTCCCTTCTC 112560 CTCTGACATA GCTCCATCCT CCCTTCTCCT CTGACATAGC TCCATCCTCC CTTCTCCTCT 112620 GACATAGCTC CATCCTCCCT TCTCCTCTGA CATAGTTCCA TCCTCCCTTG TCCTCTGACA 112680 TAGCTCCATC CTCCCTTCTC CTCTGACATA GCTCCATCCC CTCTTCTCCT TCATGTATTA 112740 TTGCCATATA TACATTTATG TATGTTATAA CTTCAGCTCT TCAGCGTTAT AATTATTGCT 112800 TCAAAAGT T TTTGAAAGAA GTTGCCTGGA GGCAGTGGCT TATGCCTTTA ACTCCAGCAC 112860
TTTTGGGGGC TGAGGTGGGC AGATCGCCTG AGCCAGGGAG TTGGAGACCA GCCTGGGCAA 112920
CATGACGAAA CCCATCTCCA CCAAAATTAC AAAAAATTAG TCTGGCATGG TGGCACGCGC 112980
CTGTAGTCCC AGCTATTTGG GGGAGGATCC CAGCTAAGGT GGGAGGATCA CTTGAGCCTG 113040 GGAAGTCAAG GCTGCAGTGA GCTGAGATTG TGCCACTGCA CTCCAGCCTG GGTGCAGATC 113100
TTATCTCAGA AGTAAAGGGA CTAGGAATGG TGGCTTTTAT CTCTAATCCC AGCACTTTGG 113160
GAGGCTGAGG TGAGTGGATC ACCGGAGGTC AGGAGTTTAA GACCAGCCTG GCCAACATGG 113220
TGAAACCCCG TCTCTACTAA AAATACAAAA AGTAGCCGGG TGTGGTGGTG GGTGTCTGTA 113280
ATCCCAGCTA CTCGGGAGGC TGAGGCAAGA GAATCGCTTG AACCTGGGAA GCGGAGGTTG 113340 CAGTGAGCAA GATCGCACCA CTGCATTACA GCCTAGATGA CAGAGCGAGA CTCTGCCTAA 113400
AAAAAAAAAA AAAAAGAAAA GAAAAGAAAT TAAGATCTAG ACACTGTGGT TCATGCCTGT 113460
AATCCCAAAG CCTTGGGAGG CCAAGGCAGG AGGATCACTT GAGGCCAGGA GTTCAACACC 113520
AGCCTGGGCA ACATAGCGAG ACTCCATCTC TATTTAAAAA AGAAAGAAAT TCAAAGAGAA 113580
AAAAAGTATA CTTGTTTTTT TGTATCATCC ATATTTTACC TTTCTTTTTT TTGCCCCTTT 113640 TTCTTTCCTG TGAATTTGAG TTACTGTCTA GTGTCATTTC CTTTTAGTCT GAAGAACTTC 113700
ATTTAGAATT TTTTTTTTTT TTTGAGACAA AGTCTCACTG TGTTGCCCAG GCTGGAGTGC 113760
AATGGTGCAG TCTCAGATCA CTGCAACCTC TGCCTCCCTG GTTAGAGTGA TTTTCCTGCC 113820
TCAGCCTCCC AAGTAGCTGA GACTGCAGGC ACCTGCCACC ACCCCCAGCC AATTTTTTTG 113880
GTATTTTTAG TAGAGACAGG GTTTCACTAT GTTGGCCAGG CTGGTCTCGA ATTCATGACC 113940 TCATGATCTG CCTGTCCTGG CCTCCCAAAA TGCTGGGATT ACCATGAGCC ACCACGCCCA 114000
GCCCATTTAG AATTTCTTTT TTTTTTTTTT TTTTGAGATG GGGTCTCGCT CTTGTTTCCC 114060
AGGCTGGAGT GCAGTGGCAC GATCTCGGCT CACTGCGAGC TCCGCCTCCC GGGTTCACGC 114120
CATTCTCCTG CCTCAGCCTC CCGAGTAGCT GGGATTACAG GCGCCTGCCA CCACGCCCAC 114180
CTAATTTTTT GTATTTTTAG GAGAGATGGG GTTTCACCAT GTTAGCCAGG ATGGTCTTGA 114240 TCTCCTGACC TCGTGATCCG CCCGCCTTGG CCTCCCAAAG TGCTGGGATT ACAGGCGTGA 114300
GCCACCGCGC CCGGCTAGAA TTTCTTGTAG GACAGGCTTG CTAGCAACCA ATTCAGTGTT 114360
TATTTGGGAA TGTCTTTATT TCAGCTTCAT TTTTTGAAGG ATAGTTTAGC TGGCTATAGA 114420
ATTATTAATT GATCATTCTT TTCAGTGTTT AAAAGTGTCA TCATGCTACC TTCTGGGTTC 114480
CATTGTTTCT GATGAGAAGT CATCTGTCAA ATTGTCCCTT TGTACTTGAA GAATTATCTT 114540 TTTTTCTCTT GATGTTTTCA AGATTTTCTC TTTGTCTTTG GCCTTTAGTA GTTTGTGATG 114600 TATCTAGGTG TGGATCTCTT GGTGTGCATC GTATTTGGGC TTCAGTAAGC CTCTTAGATT 114660
CATAGATTAA TGTTTTGTTT TGTTTTACCA AATTTGGAGA GTTTTTACTC ATCATTTCAA 114720
CAAATTTTTT TCCTGCCCCT CTCTCATCTC CTTTTGGGAG TACCACTGCA TGTATGTTGG 114780
TGTGCGTTCT CTA (SEQ ID NO: 3) 114793.
The present invention also relates to a portion of SEQ ID NO:3 which comprises 5' regulatory regions, exons, introns and 3' non-translated regions which comprise the human NHL gene of the present invention. Such regulatory sequences may be found within the various regions of this 115 kb fragment. The 5' portion of SEQ ID NO:1 begins at nucleotide 47095 of SEQ ID NO:3, the initiating ATG of human NHL is from nucleotide 48687-48689 of SEQ ID NO:3, the termination 'TAG' codon is from nucleotide 84855-84857, while the 3' terminus of SEQ ID NO:1 as disclosed herein (GCAGTGCCC) corresponds to nucleotides 85308-85316. To this end, one preferred aspect of the invention is an isolated genomic fragment or fragments which comprise from about nucleotide 47000 to about nucleotide 85500 of SEQ ID NO:3, which comprises the portion of the genomic clone encoding the mRNA transcript responsible for human NHL (see Figure 5A-B). The genomic sequence encoding NHL contains 35 exons (Figure 5A). An especially preferred aspect of the invention is a human genomic fragment or fragments which comprise from about nucleotide 47095 to about nucleotide 85316 of SEQ ID NO:3. As noted in regard to SEQ ID NO:1, the present invention also relates to DNA vectors and recombinant hosts which comprise at least a portion of SEQ ID NO:3. Portions of the 115 kb genomic fragment may be housed in multiple vector/hosts so as to optimize handling of the DNA sequences within SEQ ID NO:3.
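The landmark positions given above can be checked with simple arithmetic. The following Python sketch is illustrative only: it assumes the stated positions are 1-based and inclusive, and the names `LANDMARKS`, `span_length` and `extract` are hypothetical helpers that do not appear in the disclosure. The short toy string stands in for SEQ ID NO:3.

```python
# Landmark positions within SEQ ID NO:3 as stated in the text.
# Assumption: coordinates are 1-based and inclusive.
LANDMARKS = {
    "cdna_5prime_start": 47095,        # first nucleotide of SEQ ID NO:1
    "start_codon": (48687, 48689),     # initiating ATG
    "stop_codon": (84855, 84857),      # termination TAG
    "cdna_3prime_end": (85308, 85316), # GCAGTGCCC
}


def span_length(start, end):
    """Number of nucleotides in the 1-based inclusive interval start..end."""
    if start > end:
        raise ValueError("start must not exceed end")
    return end - start + 1


def extract(seq, start, end):
    """Return the 1-based inclusive slice seq[start..end]."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


# Length of the especially preferred fragment (47095 to 85316).
fragment_len = span_length(LANDMARKS["cdna_5prime_start"],
                           LANDMARKS["cdna_3prime_end"][1])

# Genomic distance from the first base of the ATG through the last base
# of the TAG; this interval includes introns.
atg_to_tag_len = span_length(LANDMARKS["start_codon"][0],
                             LANDMARKS["stop_codon"][1])

# Toy demonstration of 1-based extraction on a stand-in sequence.
toy = "CCATGAAATAGCC"
toy_start_codon = extract(toy, 3, 5)
```

Under these assumptions the especially preferred fragment spans 38,222 nucleotides, and the interval from the initiating ATG through the TAG codon covers 36,171 nucleotides of genomic sequence, introns included.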
Therefore, the present invention relates to the isolated genomic sequence which is set forth as SEQ ID NO:3, a region of SEQ ID NO:3 which contains the coding and non-coding region of human NHL, as well as cis-acting sequences within SEQ ID NO:3 which effect regulation of transcription of one or more of the genes localized within this 115 kb human genomic fragment, including regulatory regions affecting levels of NHL, M68/DcR3, SCLIP and ARP. As noted above, this region of chromosome 20 (20q13.3) is associated with tumor growth. Therefore, an aspect of this invention also comprises, as one example, the use of one or more regulatory regions of this 115 kb genomic sequence as a target to antagonize the effect of a transcriptional factor(s) which normally upregulates expression of a gene which has a causative role in tumor growth. Alternatively, compounds may be selected which interact with a specific cis-acting sequence to upregulate a gene within this region, where upregulation results in a decrease in tumor growth.
The present invention is also directed to methods of screening for compounds which modulate the expression of DNA or RNA encoding a NHL protein. Compounds which modulate these activities may be DNA, RNA, peptides, proteins, or non-proteinaceous organic molecules. Compounds may modulate by increasing or attenuating the expression of DNA or RNA encoding NHL, or the function of the NHL-based protein. Compounds that modulate the expression of DNA or RNA encoding NHL or the biological function thereof may be detected by a variety of assays. The assay may be a simple "yes/no" assay to determine whether there is a change in expression or function. The assay may be made quantitative by comparing the expression or function of a test sample with the levels of expression or function in a standard sample. Kits containing NHL, antibodies to NHL, or modified NHL may be prepared by known methods for such uses.
The DNA molecules, RNA molecules, recombinant protein and antibodies of the present invention may be used to screen and measure levels of NHL. The recombinant proteins, DNA molecules, RNA molecules and antibodies lend themselves to the formulation of kits suitable for the detection and typing of NHL. Such a kit would comprise a compartmentalized carrier suitable to hold in close confinement at least one container. The carrier would further comprise reagents such as recombinant NHL or anti-NHL antibodies suitable for detecting NHL. The carrier may also contain a means for detection such as labeled antigen or enzyme substrates or the like.
The assays described above can be carried out with cells that have been transiently or stably transfected with NHL. The expression vector may be introduced into host cells via any one of a number of techniques including but not limited to transformation, transfection, protoplast fusion, and electroporation. Transfection is meant to include any method known in the art for introducing NHL into the test cells. For example, transfection includes calcium phosphate or calcium chloride mediated transfection, lipofection, infection with a retroviral construct containing NHL, and electroporation. The expression vector-containing cells are individually analyzed to determine whether they produce NHL protein. Identification of NHL-expressing cells may be done by several means, including but not limited to immunological reactivity with anti-NHL antibodies, labeled ligand binding, and the presence of host cell-associated NHL activity.
The specificity of binding of compounds showing affinity for NHL is shown by measuring the affinity of the compounds for recombinant cells expressing NHL. Expression of human NHL and screening for compounds that bind to NHL or that inhibit the binding of a known, radiolabeled ligand of NHL provides an effective method for the rapid selection of compounds with high affinity for NHL. Such ligands need not necessarily be radiolabeled but can also be nonisotopic compounds that can be used to displace bound radiolabeled compounds or that can be used as activators in functional assays. Compounds identified by the above method are likely to be agonists or antagonists of NHL and may be peptides, proteins, or non-proteinaceous organic molecules.
Accordingly, the present invention is directed to methods for screening for compounds which modulate the expression of DNA or RNA encoding a NHL protein as well as compounds which affect the function of the NHL protein. Methods for identifying agonists and antagonists of other receptors are well known in the art and can be adapted to identify agonists and antagonists of NHL. For example, Cascieri et al. (1992, Molec. Pharmacol. 41: 1096-1099) describe a method for identifying substances that inhibit agonist binding to rat neurokinin receptors and thus are potential agonists or antagonists of neurokinin receptors. The method involves transfecting COS cells with expression vectors containing rat neurokinin receptors, allowing the transfected cells to grow for a time sufficient to allow the neurokinin receptors to be expressed, harvesting the transfected cells and resuspending the cells in assay buffer containing a known radioactively labeled agonist of the neurokinin receptors either in the presence or the absence of the substance, and then measuring the binding of the radioactively labeled known agonist of the neurokinin receptor to the neurokinin receptor. If the amount of binding of the known agonist is less in the presence of the substance than in the absence of the substance, then the substance is a potential agonist or antagonist of the neurokinin receptor. Where binding of the substance such as an agonist or antagonist to NHL is measured, such binding can be measured by employing a labeled substance or agonist. The substance or agonist can be labeled in any convenient manner known to the art, e.g., radioactively, fluorescently, enzymatically. Therefore, the present invention includes assays by which modulators of NHL are identified. As noted above, methods for identifying agonists and antagonists are known in the art and can be adapted to identify compounds which affect in vivo levels of NHL.
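The displacement criterion described above (less binding of the labeled agonist in the presence of the substance than in its absence) can be expressed as a percent inhibition. The Python sketch below is illustrative only. The function names, the optional subtraction of nonspecific binding, and the 50% cutoff are assumptions made for the example; the text itself requires only that binding be lower in the presence of the substance.

```python
def percent_inhibition(bound_without, bound_with, nonspecific=0.0):
    """Percent reduction in specific binding of the labeled agonist.

    bound_without: signal (e.g. counts per minute) without the substance
    bound_with:    signal in the presence of the substance
    nonspecific:   background signal subtracted from both (assumed input)
    """
    specific_without = bound_without - nonspecific
    specific_with = bound_with - nonspecific
    if specific_without <= 0:
        raise ValueError("no specific binding in the absence of the substance")
    return 100.0 * (1.0 - specific_with / specific_without)


def is_potential_modulator(bound_without, bound_with,
                           nonspecific=0.0, threshold=50.0):
    """Flag a substance whose inhibition meets a hypothetical cutoff."""
    inhibition = percent_inhibition(bound_without, bound_with, nonspecific)
    return inhibition >= threshold


# Example with made-up counts: 1000 cpm without, 500 cpm with the substance.
example_inhibition = percent_inhibition(1000.0, 500.0)
example_flag = is_potential_modulator(1000.0, 500.0)
```

A cutoff is a convenience for ranking candidates in a screen; any reduction in binding would satisfy the criterion as stated in the text.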
Accordingly, the present invention includes a method for determining whether a substance is a potential modulator of mammalian NHL levels that comprises:
(a) providing test cells by transfecting cells with an expression vector that directs the expression of NHL in the cells;
(b) exposing the test cells to the substance;
(c) measuring the amount of binding of the substance to NHL;
(d) comparing the amount of binding of the substance to NHL in the test cells with the amount of binding of the substance to control cells that have not been transfected with NHL or a portion thereof; wherein if the amount of binding of the substance is greater in the test cells as compared to the control cells, the substance is capable of binding to NHL.
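The comparison in step (d) can be sketched as follows. This Python example is illustrative only and assumes that binding has been measured in replicate wells for both transfected test cells and untransfected control cells. The function names and the use of a simple mean are assumptions; the method as recited requires only that binding be greater in the test cells than in the control cells.

```python
from statistics import mean


def fold_over_control(test_binding, control_binding):
    """Ratio of mean binding in test cells to mean binding in control cells."""
    control_mean = mean(control_binding)
    if control_mean <= 0:
        raise ValueError("control binding must be positive")
    return mean(test_binding) / control_mean


def binds_to_nhl(test_binding, control_binding):
    """True if mean binding is greater in test cells than in control cells."""
    return mean(test_binding) > mean(control_binding)


# Made-up replicate measurements in arbitrary units.
test_wells = [120.0, 130.0, 125.0]      # cells transfected with NHL
control_wells = [40.0, 45.0, 50.0]      # untransfected control cells

substance_binds = binds_to_nhl(test_wells, control_wells)
ratio = fold_over_control(test_wells, control_wells)
```

In practice a statistical test across replicates would be a more robust decision rule than a comparison of means; the sketch keeps to the criterion stated in the method.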
The conditions under which step (b) of the method is practiced are conditions that are typically used in the art for the study of protein-ligand interactions: e.g., physiological pH; salt conditions such as those represented by such commonly used buffers as PBS or in tissue culture media; a temperature of about 4°C to about 55°C. The assays described above can be carried out with cells that have been transiently or stably transfected with NHL. Transfection is meant to include any method known in the art for introducing NHL into the test cells. For example, transfection includes calcium phosphate or calcium chloride mediated transfection, lipofection, infection with a retroviral construct containing NHL, and electroporation. Where binding of the substance or agonist to NHL is measured, such binding can be measured by employing a labeled substance or agonist. The substance or agonist can be labeled in any convenient manner known to the art, e.g., radioactively, fluorescently, enzymatically.
Therefore, the specificity of binding of compounds having affinity for NHL is shown by measuring the affinity of the compounds for recombinant cells expressing the cloned receptor or for membranes from these cells. Expression of the cloned receptor and screening for compounds that bind to NHL or that inhibit the binding of a known, radiolabeled ligand of NHL to these cells provides an effective method for the rapid selection of compounds with high affinity for NHL. Such ligands need not necessarily be radiolabeled but can also be nonisotopic compounds that can be used to displace bound radiolabeled compounds or that can be used as activators in functional assays. It is also possible to construct assays wherein compounds are tested for an ability to modulate helicase activity in an in vitro- or in vivo-based assay. Compounds identified by the above method again are likely to be agonists or antagonists of NHL and may be peptides, proteins, or non-proteinaceous organic molecules. As noted elsewhere in this specification, compounds may modulate by increasing or attenuating the expression of DNA or RNA encoding NHL, or by acting as an agonist or antagonist of the NHL receptor protein. Again, these compounds that modulate the expression of DNA or RNA encoding NHL or the biological function thereof may be detected by a variety of assays. The assay may be a simple "yes/no" assay to determine whether there is a change in expression or function. The assay may be made quantitative by comparing the expression or function of a test sample with the levels of expression or function in a standard sample. Expression of NHL DNA may also be performed using in vitro produced synthetic mRNA.
Synthetic mRNA can be efficiently translated in various cell-free systems, including but not limited to wheat germ extracts and reticulocyte extracts, as well as efficiently translated in cell based systems, including but not limited to microinjection into frog oocytes, with microinjection into frog oocytes being preferred.
Following expression of NHL in a host cell, NHL protein may be recovered to provide NHL protein in active form. Several NHL protein purification procedures are available and suitable for use. Recombinant NHL protein may be purified from cell lysates and extracts by various combinations of, or individual application of salt fractionation, ion exchange chromatography, size exclusion chromatography, hydroxylapatite adsorption chromatography and hydrophobic interaction chromatography. In addition, recombinant NHL protein can be separated from other cellular proteins by use of an immunoaffinity column made with monoclonal or polyclonal antibodies specific for full-length NHL protein, or polypeptide fragments of NHL protein.
Polyclonal or monoclonal antibodies may be raised against NHL or a synthetic peptide (usually from about 9 to about 25 amino acids in length) from a portion of NHL disclosed in SEQ ID NO:2. Monospecific antibodies to NHL are purified from mammalian antisera containing antibodies reactive against NHL or are prepared as monoclonal antibodies reactive with NHL using the technique of Kohler and Milstein (1975, Nature 256: 495-497). Monospecific antibody as used herein is defined as a single antibody species or multiple antibody species with homogenous binding characteristics for NHL. Homogenous binding as used herein refers to the ability of the antibody species to bind to a specific antigen or epitope, such as those associated with NHL, as described above. Human NHL-specific antibodies are raised by immunizing animals such as mice, rats, guinea pigs, rabbits, goats, horses and the like, with an appropriate concentration of NHL protein or a synthetic peptide generated from a portion of NHL with or without an immune adjuvant. Preimmune serum is collected prior to the first immunization. Each animal receives between about 0.1 mg and about 1000 mg of NHL protein associated with an acceptable immune adjuvant. Such acceptable adjuvants include, but are not limited to, Freund's complete, Freund's incomplete, alum-precipitate, water in oil emulsion containing Corynebacterium parvum and tRNA. The initial immunization consists of NHL protein or peptide fragment thereof in, preferably, Freund's complete adjuvant at multiple sites either subcutaneously (SC), intraperitoneally (IP) or both. Each animal is bled at regular intervals, preferably weekly, to determine antibody titer. The animals may or may not receive booster injections following the initial immunization. Those animals receiving booster injections are generally given an equal amount of NHL in Freund's incomplete adjuvant by the same route. 
Booster injections are given at about three week intervals until maximal titers are obtained. At about 7 days after each booster immunization or about weekly after a single immunization, the animals are bled, the serum collected, and aliquots are stored at about -20°C.
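The timing described above (boosters at about three-week intervals, a bleed about 7 days after each booster) can be laid out as a calendar. The Python sketch below is illustrative only. The function name, the fixed number of boosters, and the assumption that the first booster falls three weeks after the initial immunization are all hypothetical; in the text the boosters continue until maximal titers are obtained.

```python
from datetime import date, timedelta


def booster_schedule(day0, n_boosters, interval_weeks=3, bleed_delay_days=7):
    """Return (label, date) pairs for immunization, boosters and bleeds.

    day0:             date of the initial immunization
    n_boosters:       number of boosters to plan (hypothetical; in practice
                      boosters continue until titers are maximal)
    interval_weeks:   spacing between injections, about three weeks
    bleed_delay_days: delay between a booster and the bleed, about 7 days
    """
    events = [("initial immunization", day0)]
    for i in range(1, n_boosters + 1):
        boost_day = day0 + timedelta(weeks=interval_weeks * i)
        events.append((f"booster {i}", boost_day))
        events.append((f"bleed after booster {i}",
                       boost_day + timedelta(days=bleed_delay_days)))
    return events


# Example plan with two boosters starting on an arbitrary date.
plan = booster_schedule(date(2024, 1, 1), 2)
```

Preimmune serum collection, which the text places before the first immunization, is not modeled here.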
Monoclonal antibodies (mAb) reactive with NHL are prepared by immunizing inbred mice, preferably Balb/c, with NHL protein. The mice are immunized by the IP or SC route with about 1 mg to about 100 mg, preferably about 10 mg, of NHL protein in about 0.5 ml buffer or saline incorporated in an equal volume of an acceptable adjuvant, as discussed above. Freund's complete adjuvant is preferred. The mice receive an initial immunization on day 0 and are rested for about 3 to about 30 weeks. Immunized mice are given one or more booster immunizations of about 1 to about 100 mg of NHL in a buffer solution such as phosphate buffered saline by the intravenous (IV) route. Lymphocytes, from antibody positive mice, preferably splenic lymphocytes, are obtained by removing spleens from immunized mice by standard procedures known in the art. Hybridoma cells are produced by mixing the splenic lymphocytes with an appropriate fusion partner, preferably myeloma cells, under conditions which will allow the formation of stable hybridomas. Fusion partners may include, but are not limited to: mouse myelomas P3/NS1/Ag 4-1; MPC-11; S-194 and Sp 2/0, with Sp 2/0 being preferred. The antibody producing cells and myeloma cells are fused in polyethylene glycol, about 1000 mol. wt., at concentrations from about 30% to about 50%. Fused hybridoma cells are selected by growth in hypoxanthine, thymidine and aminopterin supplemented Dulbecco's Modified Eagle's Medium (DMEM) by procedures known in the art. Supernatant fluids are collected from growth positive wells on about days 14, 18, and 21 and are screened for antibody production by an immunoassay such as solid phase immunoradioassay (SPIRA) using NHL as the antigen. The culture fluids are also tested in the Ouchterlony precipitation assay to determine the isotype of the mAb.
Hybridoma cells from antibody positive wells are cloned by a technique such as the soft agar technique of MacPherson, 1973, Soft Agar Techniques, in Tissue Culture Methods and Applications, Kruse and Paterson, Eds., Academic Press.
Monoclonal antibodies are produced in vivo by injection of pristane-primed Balb/c mice, approximately 0.5 ml per mouse, with about 2 x 10 to about 6 x 10 hybridoma cells about 4 days after priming. Ascites fluid is collected at approximately 8-12 days after cell transfer and the monoclonal antibodies are purified by techniques known in the art.
In vitro production of anti-NHL mAb is carried out by growing the hybridoma in DMEM containing about 2% fetal calf serum to obtain sufficient quantities of the specific mAb. The mAb are purified by techniques known in the art.
Antibody titers of ascites or hybridoma culture fluids are determined by various serological or immunological assays which include, but are not limited to, precipitation, passive agglutination, enzyme-linked immunosorbent assay (ELISA) and radioimmunoassay (RIA) techniques. Similar assays are used to detect the presence of NHL in body fluids or tissue and cell extracts.
It is readily apparent to those skilled in the art that the above described methods for producing monospecific antibodies may be utilized to produce antibodies specific for NHL peptide fragments, or a respective full-length NHL.
NHL antibody affinity columns are made, for example, by adding the antibodies to Affigel-10 (Biorad), a gel support which is pre-activated with N-hydroxysuccinimide esters such that the antibodies form covalent linkages with the agarose gel bead support. The antibodies are then coupled to the gel via amide bonds with the spacer arm. The remaining activated esters are then quenched with 1M ethanolamine HCl (pH 8). The column is washed with water followed by 0.23 M glycine HCl (pH 2.6) to remove any non-conjugated antibody or extraneous protein. The column is then equilibrated in phosphate buffered saline (pH 7.3) and the cell culture supernatants or cell extracts containing full-length NHL or NHL protein fragments are slowly passed through the column. The column is then washed with phosphate buffered saline until the optical density (A280) falls to background, then the protein is eluted with 0.23 M glycine-HCl (pH 2.6). The purified NHL protein is then dialyzed against phosphate buffered saline.
Pharmaceutically useful compositions comprising modulators of NHL may be formulated according to known methods such as by the admixture of a pharmaceutically acceptable carrier. Examples of such carriers and methods of formulation may be found in Remington's Pharmaceutical Sciences. To form a pharmaceutically acceptable composition suitable for effective administration, such compositions will contain an effective amount of the protein, DNA, RNA, modified NHL, or either NHL agonists or antagonists including tyrosine kinase activators or inhibitors.
Therapeutic or diagnostic compositions of the invention are administered to an individual in amounts sufficient to treat or diagnose disorders. The effective amount may vary according to a variety of factors such as the individual's condition, weight, sex and age. Other factors include the mode of administration.
The pharmaceutical compositions may be provided to the individual by a variety of routes such as subcutaneous, topical, oral and intramuscular. The term "chemical derivative" describes a molecule that contains additional chemical moieties which are not normally a part of the base molecule. Such moieties may improve the solubility, half-life, absorption, etc. of the base molecule. Alternatively the moieties may attenuate undesirable side effects of the base molecule or decrease the toxicity of the base molecule. Examples of such moieties are described in a variety of texts, such as Remington's Pharmaceutical Sciences.
Compounds identified according to the methods disclosed herein may be used alone at appropriate dosages. Alternatively, co-administration or sequential administration of other agents may be desirable.
The present invention also has the objective of providing suitable topical, oral, systemic and parenteral pharmaceutical formulations for use in the novel methods of treatment of the present invention. The compositions containing compounds identified according to this invention as the active ingredient can be administered in a wide variety of therapeutic dosage forms in conventional vehicles for administration. For example, the compounds can be administered in such oral dosage forms as tablets, capsules (each including timed release and sustained release formulations), pills, powders, granules, elixirs, tinctures, solutions, suspensions, syrups and emulsions, or by injection. Likewise, they may also be administered in intravenous (both bolus and infusion), intraperitoneal, subcutaneous, topical with or without occlusion, or intramuscular form, all using forms well known to those of ordinary skill in the pharmaceutical arts.
Advantageously, compounds of the present invention may be administered in a single daily dose, or the total daily dosage may be administered in divided doses of two, three or four times daily. Furthermore, compounds for the present invention can be administered in intranasal form via topical use of suitable intranasal vehicles, or via transdermal routes, using those forms of transdermal skin patches well known to those of ordinary skill in that art. To be administered in the form of a transdermal delivery system, the dosage administration will, of course, be continuous rather than intermittent throughout the dosage regimen. For combination treatment with more than one active agent, where the active agents are in separate dosage formulations, the active agents can be administered concurrently, or they each can be administered at separately staggered times.
The dosage regimen utilizing the compounds of the present invention is selected in accordance with a variety of factors including type, species, age, weight, sex and medical condition of the patient; the severity of the condition to be treated; the route of administration; the renal, hepatic and cardiovascular function of the patient; and the particular compound thereof employed. A physician or veterinarian of ordinary skill can readily determine and prescribe the effective amount of the drug required to prevent, counter or arrest the progress of the condition. Optimal precision in achieving concentrations of drug within the range that yields efficacy without toxicity requires a regimen based on the kinetics of the drug's availability to target sites. This involves a consideration of the distribution, equilibrium, and elimination of a drug.
The present invention also relates to a non-human transgenic animal which is useful for studying the ability of a variety of compounds to act as modulators of NHL, or any alternative functional NHL, in vivo or, by providing cells for culture, in vitro. In reference to the transgenic animals of this invention, reference is made to transgenes and genes. As used herein, a transgene is a genetic construct including a gene. The transgene is integrated into one or more chromosomes in the cells in an animal by methods known in the art. Once integrated, the transgene is carried in at least one place in the chromosomes of a transgenic animal. Of course, a gene is a nucleotide sequence that encodes a protein, such as one or a combination of the cDNA clones described herein. The gene and/or transgene may also include genetic regulatory elements and/or structural elements known in the art. A type of target cell for transgene introduction is the embryonic stem cell (ES). ES cells can be obtained from pre-implantation embryos cultured in vitro and fused with embryos (Evans et al., 1981, Nature 292:154-156; Bradley et al., 1984, Nature 309:255-258; Gossler et al., 1986, Proc. Natl. Acad. Sci. USA 83:9065-9069; and Robertson et al., 1986 Nature 322:445-448). Transgenes can be efficiently introduced into the ES cells by a variety of standard techniques such as DNA transfection, microinjection, or by retrovirus-mediated transduction. The resultant transformed ES cells can thereafter be combined with blastocysts from a non-human animal. The introduced ES cells thereafter colonize the embryo and contribute to the germ line of the resulting chimeric animal (Jaenisch, 1988, Science 240: 1468-1474). It will also be within the purview of the skilled artisan to produce transgenic or knock-out invertebrate animals (e.g., C. elegans) which express the NHL transgene in a wild type background as well as in C. elegans mutants knocked out for one or both of the NHL subunits.
These organisms will be helpful in further determining the dominant negative effect of NHL as well as selecting from compounds which modulate this effect.
The present invention also relates to a non-human transgenic animal which is heterozygous for a functional NHL gene native to that animal. As used herein, functional is used to describe a gene or protein that, when present in a cell or in vitro system, performs normally as if in a native or unaltered condition or environment. The animal of this aspect of the invention is useful for the study of the retinal specific expression or activity of NHL in an animal having only one functional copy of the gene. The animal is also useful for studying the ability of a variety of compounds to act as modulators of NHL activity or expression in vivo or, by providing cells for culture, in vitro. It is reiterated that as used herein, a modulator is a compound that causes a change in the expression or activity of NHL, or causes a change in the effect of the interaction of NHL with its ligand(s), or other protein(s). In an embodiment of this aspect, the animal is used in a method for the preparation of a further animal which lacks a functional native NHL gene. In another embodiment, the animal of this aspect is used in a method to prepare an animal which expresses a non-native NHL gene in the absence of the expression of a native NHL gene. In particular embodiments the non-human animal is a mouse. In further embodiments the non-native NHL is a wild-type human NHL which is disclosed herein, or any other biologically equivalent form of human NHL gene as also disclosed herein. In reference to the transgenic animals of this invention, reference is made to transgenes and genes. As used herein, a transgene is a genetic construct including a gene. The transgene is integrated into one or more chromosomes in the cells in an animal by methods known in the art. Once integrated, the transgene is carried in at least one place in the chromosomes of a transgenic animal. Of course, a gene is a nucleotide sequence that encodes a protein, such as human or mouse NHL.
The gene and/or transgene may also include genetic regulatory elements and/or structural elements known in the art.
Another aspect of the invention is a non-human animal embryo deficient for native NHL expression. This embryo is useful in studying the effects of the lack of NHL on the developing animal. In particular embodiments the animal is a mouse.
The animal embryo is also useful as a source of cells lacking a functional native NHL gene. The cells are useful in in vitro culture studies in the absence of NHL.
An aspect of this invention is a method to obtain an animal in which the cells lack a functional NHL gene native to the animal. The method includes providing a gene for an altered form of the NHL gene native to the animal in the form of a transgene and targeting the transgene into a chromosome of the animal at the place of the native NHL gene. The transgene can be introduced into the embryonic stem cells by a variety of methods known in the art, including electroporation, microinjection, and lipofection. Cells carrying the transgene can then be injected into blastocysts which are then implanted into pseudopregnant animals. In alternate embodiments, the transgene-targeted embryonic stem cells can be coincubated with fertilized eggs or morulae followed by implantation into females. After gestation, the animals obtained are chimeric founder transgenic animals. The founder animals can be used in further embodiments to cross with wild-type animals to produce F1 animals heterozygous for the altered NHL gene. In further embodiments, these heterozygous animals can be interbred to obtain the non-viable transgenic embryos whose somatic and germ cells are homozygous for the altered NHL gene and thereby lack a functional NHL gene. In other embodiments, the heterozygous animals can be used to produce cell lines. In preferred embodiments, the animals are mice. A further aspect of the present invention is a transgenic non-human animal which expresses a non-native NHL on a native NHL null background. In particular embodiments, the null background is generated by producing an animal with an altered native NHL gene that is non-functional, i.e. a knockout.
The animal can be heterozygous (i.e., having a different allelic representation of a gene on each of a pair of chromosomes of a diploid genome) or homozygous (i.e., having the same representation of a gene on each of a pair of chromosomes of a diploid genome) for the altered NHL gene and can be hemizygous (i.e., having a gene represented on only one of a pair of chromosomes of a diploid genome) or homozygous for the non-native NHL gene. In preferred embodiments, the animal is a mouse. In particular embodiments the non-native NHL gene can be a wild-type or mutant allele including those mutant alleles associated with a disease. In further embodiments, the non-native NHL is a human NHL. In a further embodiment the non-native NHL gene is operably linked to a promoter. As used herein, operably linked is used to denote a functional connection between two elements whose orientation relevant to one another can vary. In this particular case, it is understood in the art that a promoter can be operably linked to the coding sequence of a gene to direct the expression of the coding sequence while placed at various distances from the coding sequence in a genetic construct. An aspect of this invention is a method of producing transgenic animals having a transgene including a non-native NHL gene on a native NHL null background. The method includes providing transgenic animals of this invention whose cells are heterozygous for a native gene encoding a functional NHL protein and an altered native NHL gene. These animals are crossed with transgenic animals of this invention that are hemizygous for a transgene including a non-native NHL gene to obtain animals that are both heterozygous for an altered native NHL gene and hemizygous for a non-native NHL gene. The latter animals are interbred to obtain animals that are homozygous or hemizygous for the non-native NHL and are homozygous for the altered native NHL gene. 
In particular embodiments, cell lines are produced from any of the animals produced in the steps of the method. The transgenic animals and cells of this invention are useful in the determination of the in vivo function of a non-native NHL in the central nervous system and in other tissues of an animal. The animals are also useful in studying the tissue and temporal specific expression patterns of a non-native NHL throughout the animals. The animals are also useful in determining the ability for various forms of wild-type and mutant alleles of a non-native NHL to rescue the native NHL null deficiency. The animals are also useful for identifying and studying the ability of a variety of compounds to act as modulators of the expression or activity of a non-native NHL in vivo, or by providing cells for culture, for in vitro studies.
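As an illustration of the crossing scheme described above, the following Python sketch enumerates the genotype classes expected from each cross. It assumes simple Mendelian segregation, a single-copy transgene that is unlinked to the native NHL locus, a hemizygous-transgene parent that is wild type at the native locus, and equal viability of all genotypes; none of these assumptions is stated in this disclosure, and the allele labels and function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Return {genotype: expected fraction} for a one-locus cross.

    Each parent is a 2-tuple of alleles; each allele is assumed to be
    transmitted with probability 1/2 (simple Mendelian segregation).
    """
    counts = Counter(tuple(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def two_locus(native, transgene):
    """Combine two one-locus results, assuming the loci are unlinked."""
    return {(n, t): pn * pt
            for n, pn in native.items()
            for t, pt in transgene.items()}


# Hypothetical allele labels:
#   "+"  functional native NHL      "ko" altered (non-functional) native NHL
#   "Tg" non-native NHL transgene   "0"  no transgene at the insertion site
HET_KO = ("+", "ko")
WILD_TYPE = ("+", "+")
HEMI_TG = ("0", "Tg")
NO_TG = ("0", "0")

# Cross 1: +/ko animal without the transgene x +/+ animal hemizygous for it.
f1 = two_locus(cross(HET_KO, WILD_TYPE), cross(NO_TG, HEMI_TG))
p_f1_target = f1[(("+", "ko"), ("0", "Tg"))]

# Cross 2: interbreed animals that are +/ko and hemizygous for the transgene.
f2 = two_locus(cross(HET_KO, HET_KO), cross(HEMI_TG, HEMI_TG))
p_rescue = sum(p for (native, tg), p in f2.items()
               if native == ("ko", "ko") and "Tg" in tg)

print(p_f1_target)  # 1/4
print(p_rescue)     # 3/16
```

Under these assumptions, about one quarter of the offspring of the first cross carry both the altered native allele and the transgene, and about three sixteenths of the offspring of the second cross are homozygous for the altered native gene while carrying at least one copy of the non-native gene.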
As used herein, a "targeted gene" or "knockout" (KO) is a DNA sequence introduced into the germline of a non-human animal by way of human intervention, including but not limited to, the methods described herein. The targeted genes of the invention include nucleic acid sequences which are designed to specifically alter cognate endogenous alleles. An altered NHL gene should not fully encode the same NHL as native to the host animal, and its expression product can be altered to a minor or great degree, or absent altogether. In cases where it is useful to express a non-native NHL gene in a transgenic animal in the absence of a native NHL gene we prefer that the altered NHL gene induce a null lethal knockout phenotype in the animal. However, a more modestly modified NHL gene can also be useful and is within the scope of the present invention.
A type of target cell for transgene introduction is the embryonic stem cell (ES). ES cells can be obtained from pre-implantation embryos cultured in vitro and fused with embryos (Evans et al., 1981, Nature 292:154-156; Bradley et al., 1984, Nature 309:255-258; Gossler et al., 1986, Proc. Natl. Acad. Sci. USA 83:9065-9069; and
Robertson et al., 1986, Nature 322:445-448). Transgenes can be efficiently introduced into the ES cells by a variety of standard techniques such as DNA transfection, microinjection, or retrovirus-mediated transduction. The resultant transformed ES cells can thereafter be combined with blastocysts from a non-human animal. The introduced ES cells thereafter colonize the embryo and contribute to the germ line of the resulting chimeric animal (Jaenisch, 1988, Science 240:1468-1474).
The methods for evaluating the targeted recombination events as well as the resulting knockout mice are readily available and known in the art. Such methods include, but are not limited to DNA (Southern) hybridization to detect the targeted allele, polymerase chain reaction (PCR), polyacrylamide gel electrophoresis (PAGE) and Western blots to detect DNA, RNA and protein.
The following examples are provided to illustrate the present invention without, however, limiting the same hereto.
EXAMPLE 1
Characterization of DNA Molecules Encoding NHL
M68/DcR3 identification - The human osteoprotegerin (OPG) sequence (Acc. # U94332), which is a member of the TNFR-related family, was used to search GenBank using the programs TBLASTN and TFASTX3 to identify novel gene family members. Two EST sequences (GenBank Acc. # AA155701 and AA025672) were identified that showed sequence similarities to the cysteine repeats of the OPG sequence. These EST sequences were then used to identify additional EST sequences, which formed a single EST cluster (GenBank Acc. #s aa577603, aa603704, aa613366, aa158406, w67560, aa325843, aa155646, aa025673, aa514270, m91489). Two clones, which were derived from colon tumor and germ cell tumor libraries (Research Genetics, Inc.), were further characterized. DNA sequence analysis revealed two alternatively spliced forms of the 5'-end UTR of M68/DcR3. The M68/DcR3 open reading frame was confirmed by sequence analysis of clones obtained by PCR cloning from a normal human cDNA library (Clontech).
M68/DcR3 BAC identification and sequencing - To further delineate the gene structure of M68/DcR3, genomic DNA was obtained using a human "Down to the Well"™ genomic bacterial artificial chromosome (BAC) library (Genome Systems, Inc.) according to the manufacturer's protocol. Two sets of PCR primers, C68.36F: 5'-CACAGGTTCAGCATGTTTGTGCGTC-3' (SEQ ID NO:4) and C68.275R: 5'-CACAGTCCCTGCTGGCCTCTGTCTA-3' (SEQ ID NO:5), and E68.715F: 5'-CAGGACATCTCCATCAAGAGGCTGC-3' (SEQ ID NO:6) and E68.972R: 5'-AATAAGAGGGGGCCAGGATCAGTGC-3' (SEQ ID NO:7), were used to carry out PCR reactions to identify positive wells that contained the full-length M68/DcR3 gene. The PCR conditions used were 94°C for 9 min, 35 cycles of (94°C for 30 sec, 68°C for 3 min), followed by 72°C for 10 min. Two positive BAC clones were identified and characterized by restriction digestion and BAC-end sequence analyses, of which hbml68 was selected for shotgun sequencing. A shotgun library for BAC hbml68 was constructed using a conventional strategy. Briefly, two 150-ml bacterial cultures were combined and purified using a modified protocol of the plasmid-Maxi kit (QIAGEN) followed by CsCl gradient purification. After butanol extraction and isopropanol precipitation, BAC DNA was nebulized at 10 psi for 60 seconds to generate randomly sheared fragments. Following ethanol precipitation, the fragments were end-repaired using T4 polymerase (Promega), and BstXI adaptors (Invitrogen) were ligated overnight. Removal of excess, unligated adaptors and size selection were performed using a cDNA sizing column (Life Technologies, Inc.) to generate genomic fragments in the size range of 1500 to 3000 bp. Adaptor-ligated fragments were cloned into a modified pBlueScript SK+ vector (Stratagene) and transformed into XL2-Blue ultracompetent cells (Stratagene).
Approximately 1000 clones were isolated, plasmids were purified using the Turbo miniprep kits (QIAGEN), and both plasmid ends were sequenced with the BigDye terminator kits (Perkin-Elmer). Sequence data were assembled using Phred/Phrap/Consed where single-stranded and gap regions were closed using a directed sequencing strategy.
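The BAC-screening PCR program given above (94°C for 9 min; 35 cycles of 94°C for 30 sec and 68°C for 3 min; 72°C for 10 min) can be written out as a small data structure to make the cycling schedule explicit. The sketch below is illustrative only: the `Step` class and `hold_minutes` function are hypothetical names, and the total counts programmed hold times only, ignoring temperature ramping, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One thermocycler hold: target temperature and duration."""
    temp_c: float
    seconds: float


def hold_minutes(initial, cycle, n_cycles, final):
    """Total programmed hold time in minutes (ramp times are ignored)."""
    total = initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds
    return total / 60


# Values taken from the BAC-screening conditions in the text.
bac_screen_minutes = hold_minutes(
    initial=Step(94, 9 * 60),
    cycle=[Step(94, 30), Step(68, 3 * 60)],
    n_cycles=35,
    final=Step(72, 10 * 60),
)
print(bac_screen_minutes)  # 141.5
```

The combined 68°C step reflects the two-step cycling stated in the text, in which annealing and extension share one temperature.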
NHL identification and sequencing - The genomic clone for the NHL gene was obtained and sequenced. The transcript was identified through exon prediction using GRAIL2 and sequence alignment to a contiguous 4.5 kilobase region of chromosome 4 (88% sequence identity). The complete exon structure of NHL was subsequently confirmed by RT-PCR using polyA RNA from a human colorectal adenocarcinoma cell line, SW480 (Clontech). Primers were designed based on the genomic sequences that were predicted to be exons. RT-PCR reactions were carried out with SW480 polyA RNA using standard conditions with TaqGold enzyme at 94°C for 12 min, 35 cycles of (94°C for 30 sec, 60°C for 30 sec, and 68°C for 2-6 min), followed by 68°C for 7 min. Most sequence confirmation was accomplished by RT-PCR, although the first junction, between exons 1 and 2, was confirmed by 5' RACE and the junctions between exons 26-29 were confirmed by RCCA. The primers used were as follows:
Junction of Exons    Confirmed by Primers
H01/H02              hdkw (5' RACE)
H02/H03              hdiy, hdiz
H03-H09              hdid, hdie, hdja, hdjb
H09-H13              hdja, hdie
H13-H18              hdje, hdjf
H18-H23              hdjg, hdjh
H23-H26              hdji, hdjj
H26-H29              hdkv, r543 (RCCA)
H29-H31              hdij, hdmu, hdnd, hdne
H31/H32              hdij, hdmu
H32/H34              hdip, hdil, hdmv, hdik, hdli
H34/H35              hdng, hdnh
HDID - 5'-GTGAATGGCATCCTGGAGAG-3' (SEQ ID NO:8);
HDIE - 5'-GTCTCCAGGCAGCTCAACAG-3' (SEQ ID NO:9);
HDIJ - 5'-ACCCTGTCCCTCCTGTCTGA-3' (SEQ ID NO:10);
HDIY - 5'-AGACCCTAAGATGTTCGGAG-3' (SEQ ID NO:11);
HDIZ - 5'-GATGACCTGTGTGAGTTGCG-3' (SEQ ID NO:12);
HDJA - 5'-CGCAACTCACACAGGTCATC-3' (SEQ ID NO:13);
HDJB - 5'-GGAGTCAGGTCAAAGGATGC-3' (SEQ ID NO:14);
HDJC - 5'-GCATCCTTTGACCTGACTCC-3' (SEQ ID NO:15);
HDJD - 5'-GGTCTGAAACGTGATCTGGG-3' (SEQ ID NO:16);
HDJE - 5'-CCCAGATCACGTTTCAGACC-3' (SEQ ID NO:17);
HDJF - 5'-CGATGATGTGTGGGTTCTCC-3' (SEQ ID NO:18);
HDJG - 5'-GGAGAACCCACACATCATCG-3' (SEQ ID NO:19);
HDJH - 5'-CGTGTCTGAGAAGTCCAGCC-3' (SEQ ID NO:20);
HDJI - 5'-GGCTGGACTTCTCAGACACG-3' (SEQ ID NO:21);
HDJJ - 5'-ACAGCATCTTCTCCACGCAC-3' (SEQ ID NO:22);
HFMU - 5'-AGTCCTCTGGCTTTGCAGTG-3' (SEQ ID NO:23);
HDKV - 5'-TGTGCGTGGAGAAGATGCTG-3' (SEQ ID NO:24);
HDKW - 5'-GGCTGGAAAGGGAAGTCTAC-3' (SEQ ID NO:25);
HDND - 5'-TGGTTCAGGTGCTCTTGGGG-3' (SEQ ID NO:26);
HDNE - 5'-CGTGAAGCAGGAGTTGAGCC-3' (SEQ ID NO:27);
HDIK - 5'-ATCTTGCTCTGGGTCTTCCC-3' (SEQ ID NO:28);
HDIL - 5'-CACTGCAAAGCCAGAGGACT-3' (SEQ ID NO:29);
HDIP - 5'-ATAAGCAAGACGACGACCTC-3' (SEQ ID NO:30);
HDLI - 5'-CTATTCTGTTGGGTGGGTTC-3' (SEQ ID NO:31);
HDMV - 5'-CGTGCCTCCTGTGCTTACCC-3' (SEQ ID NO:32);
HDNG - 5'-CAGACCCCAAGGTAGCTCAG-3' (SEQ ID NO:33);
HDNH - 5'-GGAAGACCCAGAGCAAGATC-3' (SEQ ID NO:34).
Amplified products were subjected to direct sequencing after purification from an agarose gel or were cloned into a TOPO PCR cloning vector (Invitrogen) for sequencing. Multiple sequence alignment of NHL to known helicases showed that NHL contains all seven critical helicase domains. BLAST analysis of the predicted 1,219 amino acid sequence (see Figure 2, SEQ ID NO:2) revealed approximately 26% sequence identity and 48% sequence similarity to the RAD3/ERCC2 gene family of DNA helicases (see Figure 3). Review of this sequence data shows that two partial human cDNA clones (Acc. Nos. al080127 and ab029011) are deposited. Clone al080127 covers exons 25-35, while ab029011 covers exons 9-35. Ab029011 starts at amino acid 240 of the full length human NHL protein disclosed herein, but also differs at exon 35 and appears to be a fusion transcript with M68. This cDNA was isolated from brain tissue, which has been known to express rare transcripts.
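Several of the RT-PCR primers listed above are exact reverse complements of one another (for example HDIZ and HDJA). This is consistent with forward and reverse primers placed at a shared site so that adjacent amplicons overlap, although that interpretation is an inference and is not stated in this disclosure. The following Python sketch shows how such pairs can be found; the primer sequences are copied from the list above, while the function names are hypothetical.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Subset of the primers listed in the text (5' to 3').
PRIMERS = {
    "HDIZ": "GATGACCTGTGTGAGTTGCG",
    "HDJA": "CGCAACTCACACAGGTCATC",
    "HDJB": "GGAGTCAGGTCAAAGGATGC",
    "HDJC": "GCATCCTTTGACCTGACTCC",
    "HDJD": "GGTCTGAAACGTGATCTGGG",
    "HDJE": "CCCAGATCACGTTTCAGACC",
}


def reverse_complement_pairs(primers):
    """List (name_a, name_b) pairs whose sequences are reverse complements."""
    names = sorted(primers)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if reverse_complement(primers[a]) == primers[b]
    ]


pairs = reverse_complement_pairs(PRIMERS)
print(pairs)  # [('HDIZ', 'HDJA'), ('HDJB', 'HDJC'), ('HDJD', 'HDJE')]
```

The same check can be run over the full primer list to flag transcription errors, since a single mistyped base in one member of a pair breaks the match.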
EXAMPLE 2
Northern Analysis of human NHL Expression
Messenger RNA (mRNA) was obtained from human brain, heart, skeletal muscle, colon, thymus, spleen, kidney, liver, small intestine, placenta, lung, and peripheral blood leukocytes. Two μg of polyA+ RNA were run in each lane of a denaturing formaldehyde 1% agarose gel and transferred to a charge-modified nylon membrane. The probe was made using a 733 bp fragment derived from nt 1174-1907 of the NHL cDNA. This fragment was labeled via the 32P-dCTP random priming method (Ambion). Hybridization was carried out in ExpressHyb (Clontech) according to the manufacturer's protocol except for the final wash, which was at 55°C. Membranes were exposed to X-ray film with an intensifying screen at -80°C overnight. The Northern data are presented in Figure 4. Note hybridization of the NHL probe to an approximately 4.4 kb transcript. The 7.5 kb transcript may suggest an alternative splicing of the NHL RNA.
EXAMPLE 3
Chromosomal localization
To map the position of M68/NHL in the human genome, primers C68.36F and C68.275R were used to carry out PCR reactions on 93 clones of the MIT GeneBridge 4 panel (Research Genetics), and the results were submitted to MIT for analysis. M68/DcR3 was mapped to the extreme telomere of chromosome 20, at 20q13.3, 28 cR from D20S173 with a lod score of 13. An analogous procedure was also carried out with the 83 clones of the Stanford G3 radiation hybrid panel, with PCR results submitted to the Stanford Genome Center for analysis. Analysis using another pair of PCR primers specific to NHL yielded the same result. For fluorescence in situ hybridization (FISH) analysis, the normal human male fibroblast cell line L136 (Coriell Cell Repository, Camden, NJ) was arrested in mitosis with colcemid (10 μg/ml). A human chromosome 20 α-satellite probe (Vysis, Downers Grove, IL) was directly labeled with SpectrumOrange dUTP and was used to identify chromosome 20. The M68 BAC clone was directly labeled with SpectrumGreen dUTP by nick translation (Vysis). Slides were counterstained with DAPI stain and viewed under an Olympus microscope with narrow blue and DAPI/TRITC filters. Fifty metaphase cells were scored to verify that the M68 probe was located on the same chromosome as the human chromosome 20 probe. Radiation hybrid chromosomal mapping reconfirms that NHL is linked to the M68 locus, at 20q13.3.

Claims

WHAT IS CLAIMED IS:
1. A purified DNA molecule encoding a mammalian NHL protein.
2. A purified DNA molecule of claim 1 encoding a human NHL protein which comprises the amino acid sequence
MPKIVLNGVT VDFPFQPYKC QQEYMTKVLE CLQQKVNGIL ESPTGTGKTL CLLCTTLAWR EHLRDGISAR KIAERAQGEL FPDRALSSWG NAAAAAGDPI ACYTDIPKII YASRTHSQLT QVINELRNTS YRPKVCVLGS REQLCIHPEV KKQESNHLQI HLCRKKVASR SCHFYNNVEE KSLEQELASP ILDIEDLVKS GSKHRVCPYY LSRNLKQQAD IIFMPYNYLL DAKSRRAHNI DLKGTWIFD EAHNVEKMCE ESASFDLTPH DLASGLDVID QVLEEQTKAA QQGEPHPEFS ADSPSPGLNM ELEDIAKLKM ILLRLEGAID AVELPGDDSG VTKPGSYIFE LFAEAQITFQ TKGCILDSLD QIIQHLAGRA GVFTNTAGLQ KLADIIQIVF SVDPSEGSPG SPAGLGALQS YKVHIHPDAG HRRTAQRSDA WSTTAARKRG KVLSYWCFSP GHSMHELVRQ GVRSLILTSG TLAPVSSFAL EMQIPFPVCL ENPHIIDKHQ IWVGWPRGP DGAQLSSAFD RRFSEECLSS LGKALGNIAR WPYGLLIFF PSYPVMEKSL EFWRARDLAR KMEALKPLFV EPRSKGSFSE TISAYYARVA APGSTGATFL AVCRGKASEG LDFSDTNGRG VIVTGLPYPP RMDPRWLKM QFLDEMKGQG GAGGQFLSGQ EWYRQQASRA VNQAIGRVIR HRQDYGAVFL CDHRFAFADA RAQLPSWVRP HVRVYDNFGH VIRDVAQFFR VAERTMPAPA PRATAPSVRG EDAVSEAKSP GPFFSTRKAK SLDLHVPSLK QRSSGSPAAG DPESSLCVEY EQEPVPARQR PRGLLAALEH SEQRAGSPGE EQAHSCSTLS LLSEKRPAEE PRGGRKKIRL VSHPEEPVAG AQTDRAKLFM VAVKQELSQA NFATFTQALQ DYKGSDDFAA LAACLGPLFA EDPKKHNLLQ GFYQFVRPHH KQQFEEVCIQ LTGRGCGYRP EHSIPRRQRA QPVLDPTGRT APDPKLTVST AAAQQLDPQE HLNQGRPHLS PRPPPTGDPG SQPQWGSGVP RAGKQGQHAV SAYLADARRA LGSAGCSQLL AALTAYKQDD DLDKVLAVLA ALTTAKPEDF PLLHRFSMFV RPHHKQRFSQ TCTDLTGRPY PGMEPPGPQE ERLAVPPVLT HRAPQPGPSR SEKTGKTQSK ISSFLRQRPA GTVGAGGEDA GPSQSSGPPH GPAASEWGL* (SEQ ID NO : 2 ) .
3. An expression vector for expressing a NHL protein in a recombinant host cell wherein said expression vector comprises a DNA molecule of claim 2.
4. A host cell which expresses a recombinant NHL protein wherein said host cell contains the expression vector of claim 3.
5. A process for expressing a NHL protein in a recombinant host cell, comprising:
(a) transfecting the expression vector of claim 3 into a suitable host cell; and,
(b) culturing the host cells of step (a) under conditions which allow expression of said NHL protein from said expression vector.
6. A purified DNA molecule encoding a human NHL protein which consists of the amino acid sequence MPKIVLNGVT VDFPFQPYKC QQEYMTKVLE CLQQKVNGIL ESPTGTGKTL CLLCTTLAWR EHLRDGISAR KIAERAQGEL FPDRALSSWG NAAAAAGDPI ACYTDIPKII YASRTHSQLT QVINELRNTS YRPKVCVLGS REQLCIHPEV KKQESNHLQI HLCRKKVASR SCHFYNNVEE KSLEQELASP ILDIEDLVKS GSKHRVCPYY LSRNLKQQAD IIFMPYNYLL DAKSRRAHNI DLKGTWIFD EAHNVEKMCE ESASFDLTPH DLASGLDVID QVLEEQTKAA QQGEPHPEFS ADSPSPGLNM ELEDIAKLKM ILLRLEGAID AVELPGDDSG VTKPGSYIFE LFAEAQITFQ TKGCILDSLD QIIQHLAGRA GVFTNTAGLQ KLADIIQIVF SVDPSEGSPG SPAGLGALQS YKVHIHPDAG HRRTAQRSDA WSTTAARKRG KVLSYWCFSP GHSMHELVRQ GVRSLILTSG TLAPVSSFAL EMQIPFPVCL ENPHIIDKHQ IWVGWPRGP DGAQLSSAFD RRFSEECLSS LGKALGNIAR WPYGLLIFF PSYPVMEKSL EFWRARDLAR KMEALKPLFV EPRSKGSFSE TISAYYARVA APGSTGATFL AVCRGKASEG LDFSDTNGRG VIVTGLPYPP RMDPRWLKM QFLDEMKGQG GAGGQFLSGQ EWYRQQASRA VNQAIGRVIR HRQDYGAVFL CDHRFAFADA RAQLPSWVRP HVRVYDNFGH VIRDVAQFFR VAERTMPAPA PRATAPSVRG EDAVSEAKSP GPFFSTRKAK SLDLHVPSLK QRSSGSPAAG DPESSLCVEY EQEPVPARQR PRGLLAALEH SEQRAGSPGE EQAHSCSTLS LLSEKRPAEE PRGGRKKIRL VSHPEEPVAG AQTDRAKLFM VAVKQELSQA NFATFTQALQ DYKGSDDFAA LAACLGPLFA EDPKKHNLLQ GFYQFVRPHH KQQFEEVCIQ LTGRGCGYRP EHSIPRRQRA QPVLDPTGRT APDPKLTVST AAAQQLDPQE HLNQGRPHLS PRPPPTGDPG SQPQWGSGVP RAGKQGQHAV SAYLADARRA LGSAGCSQLL AALTAYKQDD DLDKVLAVLA ALTTAKPEDF PLLHRFSMFV RPHHKQRFSQ TCTDLTGRPY PGMEPPGPQE ERLAVPPVLT HRAPQPGPSR SEKTGKTQSK ISSFLRQRPA GTVGAGGEDA GPSQSSGPPH GPAASEWGL* (SEQ ID NO : 2 ) .
7. An expression vector for expressing a NHL protein in a recombinant host cell wherein said expression vector comprises a DNA molecule of claim 6.
8. A host cell which expresses a recombinant NHL protein wherein said host cell contains the expression vector of claim 7.
9. A process for expressing a NHL protein in a recombinant host cell, comprising:
(a) transfecting the expression vector of claim 7 into a suitable host cell; and,
(b) culturing the host cells of step (a) under conditions which allow expression of said NHL protein from said expression vector.
10. A purified DNA molecule which comprises the nucleotide sequence as set forth in SEQ ID NO: 1.
11. An expression vector for expressing a NHL protein in a recombinant host cell wherein said expression vector comprises a DNA molecule of claim 10.
12. A host cell which expresses a recombinant NHL protein wherein said host cell contains the expression vector of claim 11.
13. A purified DNA molecule which consists of the nucleotide sequence as set forth in SEQ ID NO:1.
14. An expression vector for expressing a NHL protein in a recombinant host cell wherein said expression vector comprises a DNA molecule of claim 13.
15. A host cell which expresses a recombinant NHL protein wherein said host cell contains the expression vector of claim 14.
16. A purified DNA molecule of claim 13 which consists of the nucleotide sequence from about nucleotide 828 to about nucleotide 4587, as set forth in SEQ ID NO:1.
17. An expression vector for expressing a NHL protein in a recombinant host cell wherein said expression vector comprises a DNA molecule of claim 16.
18. A host cell which expresses a recombinant NHL protein wherein said host cell contains the expression vector of claim 17.
19. A substantially purified NHL protein which comprises the amino acid sequence as set forth in SEQ ID NO:2.
20. A substantially purified NHL protein which consists of the amino acid sequence as set forth in SEQ ID NO:2.
21. A substantially purified NHL protein which comprises the amino acid sequence as set forth in SEQ ID NO:2, wherein said protein is a product of a DNA expression vector comprising SEQ ID NO:1 and contained within a recombinant host cell.
22. A method of identifying modulators of NHL activity, comprising:
(a) combining a test compound with a NHL protein, wherein NHL comprises the amino acid sequence as set forth in SEQ ID NO:2; and,
(b) measuring the effect of the test compound on the NHL protein.
23. An isolated DNA molecule which comprises the nucleotide sequence as set forth in SEQ ID NO:3.
24. An isolated DNA molecule of claim 20 which comprises from about nucleotide 47000 to about nucleotide 85500 of SEQ ID NO:3.
25. An isolated DNA molecule of claim 23 which comprises from about nucleotide 47095 to about nucleotide 85316 of SEQ ID NO:3.
26. A substantially purified NHL protein of claim 21 wherein said protein is a product of a DNA expression vector comprising from about nucleotide 828 to nucleotide 4587, as set forth in SEQ ID NO:1, and contained within a recombinant host cell.
EP00983952A 1999-12-09 2000-12-07 Dna molecules encoding human nhl, a dna helicase Withdrawn EP1240311A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16997099P 1999-12-09 1999-12-09
US169970P 1999-12-09
PCT/US2000/033065 WO2001042434A1 (en) 1999-12-09 2000-12-07 Dna molecules encoding human nhl, a dna helicase

Publications (1)

Publication Number Publication Date
EP1240311A1 true EP1240311A1 (en) 2002-09-18

Family

ID=22617972

Family Applications (1)

Application Number Title Priority Date Filing Date
EP00983952A Withdrawn EP1240311A1 (en) 1999-12-09 2000-12-07 Dna molecules encoding human nhl, a dna helicase

Country Status (4)

Country Link
EP (1) EP1240311A1 (en)
JP (1) JP2003523181A (en)
CA (1) CA2395378A1 (en)
WO (1) WO2001042434A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7285267B2 (en) 1997-01-14 2007-10-23 Human Genome Sciences, Inc. Tumor necrosis factor receptors 6α & 6β
DE69837996T2 (en) 1997-01-14 2008-02-28 Human Genome Sciences, Inc. TUMOR NECROSE FACTOR RECEPTORS 6 ALPHA & 6 BETA

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5466576A (en) * 1993-07-02 1995-11-14 Fred Hutchinson Cancer Research Center Modulation of PIF-1-type helicases
US5843737A (en) * 1994-12-30 1998-12-01 Chen; Lan Bo Cancer associated gene protein expressed therefrom and uses thereof
US5888792A (en) * 1997-07-11 1999-03-30 Incyte Pharmaceuticals, Inc. ATP-dependent RNA helicase protein

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0142434A1 *

Also Published As

Publication number Publication date
JP2003523181A (en) 2003-08-05
CA2395378A1 (en) 2001-06-14
WO2001042434A1 (en) 2001-06-14

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20020709

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20030924