AU774841B2 - Development of novel anti-microbial agents based on bacteriophage genomics - Google Patents

Development of novel anti-microbial agents based on bacteriophage genomics

Info

Publication number
AU774841B2
Authority
AU
Australia
Prior art keywords
bacteriophage
bacterial
protein
phage
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU15815/00A
Other versions
AU1581500A (en
Inventor
Michael Dubow
Philippe Gros
Jerry Pelletier
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Targanta Therapeutics Inc
Original Assignee
Targanta Therapeutics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/407,804 (US6982153B1)
Application filed by Targanta Therapeutics Inc
Publication of AU1581500A
Application granted
Publication of AU774841B2
Assigned to TARGANTA THERAPEUTICS INC. (Request to Amend Deed and Register; Assignors: PHAGETECH, INC.)
Anticipated expiration
Legal status: Ceased

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/02 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving viable microorganisms
    • C12Q1/18 Testing for antimicrobial activity of a material
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/04 Antibacterial agents
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2795/00 Bacteriophages
    • C12N2795/00011 Details
    • C12N2795/10011 Details dsDNA Bacteriophages
    • C12N2795/10022 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2795/00 Bacteriophages
    • C12N2795/00011 Details
    • C12N2795/10011 Details dsDNA Bacteriophages
    • C12N2795/10041 Use of virus, viral particle or viral elements as a vector
    • C12N2795/10043 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector

Landscapes

  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Oncology (AREA)
  • Communicable Diseases (AREA)
  • Toxicology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Analytical Chemistry (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biotechnology (AREA)
  • Virology (AREA)
  • Microbiology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Agricultural Chemicals And Associated Chemicals (AREA)
  • Peptides Or Proteins (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)

Description

WO 00/32825 PCT/IB99/02040
DESCRIPTION
Development of Novel Anti-Microbial Agents Based on Bacteriophage Genomics
BACKGROUND OF THE INVENTION
The present invention relates to the field of antibacterial agents and the treatment of infections of animals or other complex organisms by bacteria.
The frequency and spectrum of antibiotic-resistant infections have, in recent years, increased in both the hospital and community. Certain infections have become essentially untreatable and are growing to epidemic proportions in the developing world as well as in institutional settings in the developed world. The staggering spread of antibiotic resistance in pathogenic bacteria has been attributed to microbial genetic characteristics, widespread use of antibiotic drugs, and changes in society that enhance the transmission of drug-resistant organisms. This spread of drug resistant microbes is leading to ever increasing morbidity, mortality and health-care costs.
Ironically, it is the very success of antibiotics, resulting in their widespread use, that has contributed the most to rising numbers of drug-resistant bacterial strains. The longer a bacterial strain is exposed to a drug, the more likely it is to acquire resistance. Today, a total of 160 antibiotics, all based on a few basic chemical structures and targeting a small number of metabolic pathways, have found their way to market. Over-prescription of these drugs, as well as the failure of patients to comply with the complete antibiotic regimen, has led to the rapid emergence of antibiotic-resistant strains. Such misuse of prescriptions, careless use of antibiotics in virtually all commercial production of beef and fowl, and changing societal conditions, such as the growth of day-care centers, increased long-term care in hospitals, and increased mobility of the population, have provided an environment where drug-resistant microbes can emerge and spread. Thus, virtually all common infectious bacteria are becoming, or have already become, resistant to one or more groups of antibiotics. Such resistance now reaches all classes of antibiotics currently in use, including β-lactams, fluoroquinolones, aminoglycosides, macrolide peptides, chloramphenicol, tetracyclines, rifampicin, folate inhibitors, glycopeptides, and mupirocin.
Over the last 45 years bacteria have adapted genetically to avoid the destruction/alteration of the essential pathways that these chemotherapeutic agents target. Antibiotic-resistant bacterial strains are now emerging at a higher rate than the rate at which new antibiotics are being developed. The consequence of this dilemma has been a dramatic increase in the cost of treating infections that would otherwise easily succumb to routine antibiotic therapy. Furthermore, and perhaps most importantly, the emergence of multiple drug-resistant pathogenic bacteria has led to a significant increase in morbidity and mortality, particularly in institutional settings.
Most major pharmaceutical companies have on-going drug discovery programs for novel anti-microbials. These are based on screens for small molecule inhibitors (natural products, bacterial culture media, libraries of small molecules, combinatorial chemistry) of crucial metabolic pathways of the micro-organism of interest (e.g., bacteria, fungi, parasites, worms). The screening process is largely for cytotoxic compounds and in most cases is not based on a known mechanism of action of the compounds. Pharmaceutical companies have large programs in this area.
Classical drug screening programs are being exhausted and many of these pharmaceutical companies are looking towards rational drug design programs.
Several small to mid-size biotechnology companies as well as large pharmaceutical companies have developed systematic high-throughput sequencing programs to decipher the genetic code of specific micro-organisms of interest. The goal is to identify, through sequencing, biochemical pathways or intermediates that are unique to the microorganism. Knowledge of this may, in turn, form the rationale for a drug discovery program based on the mechanism of action of the identified enzymes/proteins. Genome Therapeutics Corp., The Institute for Genomic Research, Human Genome Sciences Inc., and other companies have such sequencing programs in place. However, one of the most critical steps in this approach is the ascertainment that the identified proteins and biochemical pathways 1) are nonredundant and essential for bacterial survival, and 2) constitute suitable and accessible targets for drug discovery.
SUMMARY OF THE INVENTION
While animals such as humans are, on occasion, infected by pathogenic bacteria, bacteria also have natural enemies. A number of host-specific viruses, known as bacteriophages or phages, infect and kill bacteria in the natural environment. Such bacteriophages generally have small compact genomes and bacteria are their exclusive hosts. Many known bacteria are host to a large number of bacteriophages that have been described in the literature. From the 1940s to the 1960s, phage biology was an area of active research. As a testimony to this, the study of phages which infect and inhibit the enteric bacterium Escherichia coli (E. coli) contributed much to the early understanding of molecular biology and virology.
As is generally understood, bacteriophages (or phages) are viruses that infect and kill bacteria. They are natural enemies of bacteria and, over the course of evolution, have developed proteins (products of DNA sequences) which enable them to infect a host bacterium, replicate their genetic material, usurp host metabolism, and ultimately kill their host. The scientific literature well documents the fact that many known bacteria have a large number of such bacteriophages (Ackermann and DuBow, 1987) that can infect and kill them (for example, see the ATCC bacteriophage collection at http://www.atcc.org).
This invention utilizes the observation that bacteriophages successfully infect and inhibit or kill host bacteria, targeting a variety of normal host metabolic and physiological traits, some of which are shared by all bacteria, pathogenic and nonpathogenic alike. The term "pathogenic" as used herein denotes a contribution to or implication in disease or a morbid state of an infected organism. The invention thus involves identifying and elucidating the molecular mechanisms by which phages interfere with host bacterial metabolism, an objective being to provide novel targets for drug design. Whether the phage blocks bacterial RNA transcription or translation, or attacks other important metabolic pathways, such as cell wall assembly or membrane integrity, the basic blueprint for a phage's bacteria-inhibiting ability is encoded in its genome and can be unlocked using bioinformatics, functional genomics, and proteomics. By these means, the invention utilizes sequence information from the genomics of bacteriophage to identify novel antimicrobials that can be further used to actively and/or prophylactically treat bacterial infection.
Two important components of the invention thus are: i) the identification of bacteria-inhibiting phage open reading frames ("ORF"s) and corresponding products that can be used to develop antibiotics based on amino acid sequence and secondary structural characteristics of the ORF products, and ii) the use of bacteriophages to map out essential bacterial target genes and homologs, which can in turn lead to the development of suitable anti-microbial agents. These two avenues represent new and general methods for developing novel antimicrobials.
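As a rough illustration of what locating phage ORFs in a genome sequence involves computationally, the following minimal sketch scans the forward strand of a DNA string in all three reading frames. It is not the method of the invention: the function name `find_orfs`, the restriction to ATG start codons, the forward-strand-only scan, and the `min_codons` cutoff are all assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon.
    `end` is exclusive and includes the stop codon; `min_codons`
    counts the start and stop codons as well.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

# Toy sequence (not from any phage): one ORF in frame 2.
print(find_orfs("CCATGAAATTTTAGGG"))
```

A practical pipeline would also scan the reverse complement and consider alternative start codons; those are omitted here to keep the sketch short.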
The invention thus concerns the identification of bacteriophage ORFs that supply bacteria-inhibiting functions. In this regard, use of the terms "inhibit", "inhibition", "inhibitory", and "inhibitor" all refer to a function of reducing a biological activity or function. Such reduction in activity or function can, for example, be in connection with a cellular component, e.g., an enzyme, or in connection with a cellular process, e.g., synthesis of a particular protein, or in connection with an overall process of a cell, e.g., cell growth. In reference to bacterial cell growth, for example, an inhibitory effect (i.e., a bacteria-inhibiting effect) may be bacteriocidal (killing of bacterial cells) or bacteriostatic (i.e., stopping or at least slowing bacterial cell growth). The latter slows or prevents cell growth such that fewer cells of the strain are produced relative to uninhibited cells over a given period of time. From a molecular standpoint, such inhibition may equate with a reduction in the level of, or elimination of, the transcription and/or translation of a specific bacterial target(s), or reduction or elimination of activity of a particular target biomolecule.
It is particularly advantageous to evaluate a plurality of different phage ORFs for inhibitory activity that may be from one, but is preferably from a plurality of different phage. For example, evaluating ORFs from a number of different phage of the same bacterial host provides at least two advantages. One is that the multiple phages will provide identification of a variety of different targets. Second, it is likely that multiple phage will utilize the same cellular target.
As used herein, the terms "bacteriophage" and "phage" are used interchangeably to refer to a virus which can infect a bacterial strain or a number of different bacterial strains.
In the context of this invention, the term "bacteriophage ORF" or "phage ORF" or similar term refers to a nucleotide sequence in or from a bacteriophage. In connection with a particular ORF, the terms refer to an open reading frame which has at least sequence identity, preferably at least 97% sequence identity, more preferably at least 98% sequence identity with an ORF from the particular phage identified herein (e.g., with an ORF as identified herein) or to a nucleic acid sequence which has the specified sequence identity percentage with such an ORF sequence.
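To make the sequence identity thresholds concrete, the sketch below computes percent identity between two sequences that have already been aligned, with `-` marking a gap. The helper names, the gap character, and the choice of denominator (all aligned columns except those where both sequences have a gap) are assumptions for illustration; conventions for the denominator vary between tools, and the text does not specify one.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the aligned columns of two sequences.

    Both inputs must be the same length (already aligned). Columns
    where both sequences carry a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

def meets_identity(aligned_a, aligned_b, threshold=97.0):
    """True if identity reaches the threshold (97% is one value named in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold

print(percent_identity("ATGC", "ATGA"))
```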
According to a first aspect of the present invention there is provided a method for identifying at least one bacterial target for antibacterial agents, comprising: contacting a bacterial protein with a bacteriophage polypeptide that inhibits bacterial growth, wherein said bacteriophage polypeptide is a polypeptide encoded by said bacteriophage or a variant of said encoded polypeptide; determining whether said bacteriophage polypeptide binds to said bacterial protein; and identifying any said bacterial protein bound by said bacteriophage polypeptide, wherein binding of said bacteriophage polypeptide to said bacterial protein is indicative that said bacterial protein is a target.
According to a second aspect of the present invention there is provided a method for identifying at least one bacterial target for antibacterial agents, comprising: contacting at least one homolog of a bacterial protein that binds with a bacteriophage polypeptide that inhibits bacterial growth; and determining whether said bacteriophage polypeptide binds to said homolog; wherein binding of said homolog by said bacteriophage polypeptide is indicative that said homolog is a target for antibacterial agents.
According to a third aspect of the present invention there is provided a method for identifying a bacterial target for screening antibacterial agents, comprising: identifying a full-length bacteriophage protein or a bacteriophage polypeptide variant which inhibits bacterial growth when introduced into a bacteria; contacting said bacteriophage protein or variant with a full-length bacterial protein, or a bacterial protein fragment or a homolog thereof; and determining whether said bacteriophage polypeptide or variant binds to said full-length bacterial protein, or said bacterial protein fragment or homolog thereof; wherein binding of said bacteriophage protein or variant with said bacterial protein, fragment or homolog is indicative that said bacterial protein, fragment or homolog is a target for screening antibacterial agents.
According to a fourth aspect of the present invention there is provided a method for identifying a bacterial target for screening antibacterial agents, comprising: providing a bait capable of inhibiting bacterial growth when introduced into a bacteria, said bait being selected from the group consisting of full-length bacteriophage proteins and bacteriophage polypeptide variants; providing a prey, selected from the group consisting of full-length bacterial proteins, bacterial protein fragments and homologs thereof; contacting said bait with said prey under conditions suitable for allowing formation of specific bait:prey complex(es); and identifying a prey forming any said bait:prey complex(es); wherein formation of bait:prey complexes is indicative that said prey is a bacterial target against which antibacterial agents may be screened.
According to a fifth aspect of the present invention there is provided a method for identifying at least one target for antibacterial agents, comprising: contacting a bacterial protein with a bacteriophage polypeptide that inhibits bacterial growth; determining whether said bacteriophage polypeptide binds to said bacterial protein; and identifying any said bacterial protein bound by said bacteriophage polypeptide, wherein binding of said bacteriophage polypeptide to said bacterial protein is indicative that said bacterial protein is a said target.
According to a sixth aspect of the present invention there is provided a method for identifying an antibacterial agent, comprising: identifying at least one target for antibacterial agents according to any one of the above aspects; and screening for at least one compound that binds to or reduces the level of activity of said at least one target; wherein identification of a compound that binds to or reduces the level of activity of said at least one target is indicative that said compound is an antibacterial agent.
According to a seventh aspect of the present invention there is provided a method of making an antibacterial agent, comprising the steps of: identifying an antibacterial agent according to the sixth aspect; and synthesizing said antibacterial agent in an amount sufficient to inhibit bacterial growth.
According to an eighth aspect of the present invention there is provided a method for inhibiting a bacterium, comprising the steps of: making an antibacterial agent according to the seventh aspect; and contacting said bacterium with said antibacterial agent.
According to a ninth aspect of the present invention there is provided a method for treating a bacterial infection in a non-human animal suffering from an infection, comprising: making an antibacterial agent according to the seventh aspect; and administering to said non-human animal a therapeutically effective amount of said antibacterial agent.
According to a tenth aspect of the present invention there is provided an isolated, purified, or enriched nucleic acid molecule at least 15 nucleotides in length, wherein said molecule comprises at least a portion of a bacteriophage sequence, and wherein said bacteriophage is selected from the group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1, with the proviso that said bacteriophage sequence is other than the nucleotide sequence shown in Table 32.
According to an eleventh aspect of the present invention there is provided a recombinant vector comprising at least one nucleic acid sequence according to the tenth aspect.
According to a twelfth aspect of the present invention there is provided a recombinant cell comprising a vector according to the eleventh aspect.
According to a thirteenth aspect of the present invention there is provided an isolated, purified, or enriched polypeptide comprising at least a portion of an antimicrobial protein, wherein said polypeptide is encoded by a bacteriophage selected from the group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1, with the proviso that said polypeptide is encoded by a bacteriophage sequence other than the nucleotide sequence shown in Table 32.
According to a fourteenth aspect of the present invention there is provided a computer readable device when used in the method according to any one of the first to the ninth aspects, said device having recorded therein a nucleotide sequence of a portion of at least one bacteriophage genome of Staphylococcus aureus bacteriophage 77, bacteriophage 3A, or bacteriophage 96, a nucleotide sequence at least 95% identical to a said nucleotide sequence, a ribonucleic acid equivalent, a degenerate equivalent, a homologous sequence, or at least one amino acid sequence encoded by said nucleotide sequence; and a nucleotide sequence or amino acid sequence analysis program, wherein said program can perform at least one sequence analysis on said nucleotide or amino acid sequence.
Also disclosed herein is a method for identifying a bacteriophage nucleic acid coding region encoding a product active on an essential bacterial target by identifying a nucleic acid sequence encoding a gene product which provides a bacteria-inhibiting function when the bacteriophage infects a host bacterium, preferably one that is an animal or plant pathogen, more preferably a bird or mammalian pathogen, and most preferably a human pathogen. The bacteriophage is an uncharacterized bacteriophage. Thus, the method excludes, for example, phage λ, φX174, M13 and other E. coli-specific bacteriophage that have been studied with respect to gene number and/or function. It also excludes, for example, the nucleic acid coding regions described in Tables 12-14, and in preferred embodiments, excludes the phage in which those regions are naturally located.
In connection with bacteriophage, the term "uncharacterized" means that a certain bacteriophage's genome has not yet been fully identified such that the genes having function involved in inhibiting host cells have not been identified. In particular, phage for which the description of genomic or protein sequence was first provided herein are uncharacterized. Phage sequences for which host bacteria-inhibiting functions have been identified prior to the filing of the present application (or alternatively prior to the present invention) are specifically excluded from the aspects involving utilization of sequences from uncharacterized bacteriophage, except that aspects may involve a plurality of phage where one or more of those phage are uncharacterized and one or more others have been characterized to some extent. A number of different bacteria-inhibiting phage ORFs are indicated in Tables 11-14.
The phage ORFs or sequences identified therein are not within the term "uncharacterized"; alternatively, in preferred embodiments the phage containing those ORFs are excluded from this term. Further, any additional phage ORFs (or alternatively the phage which contain those ORFs) which have previously been described in the art as bacteria-inhibiting ORFs are expressly excluded; those ORFs or phage are known to those skilled in the art and the exclusion can be made express by specifically naming such ORFs or phage as needed (likewise for uncharacterized targets as described below). For the sake of brevity, such a listing is not expressly presented, as such information is readily available to those skilled in the art.
Stating that an agent or compound is "active on" a particular cellular target, such as the product of a particular gene, means that the target is an important part of a cellular pathway which includes that target and that the agent acts on that pathway.
Thus, in some cases the agent may act on a component upstream or downstream of the stated target, including on a regulator of that pathway or a component of that pathway.
By "essential", in connection with a gene or gene product, is meant that the host cannot survive without, or is significantly growth compromised in the absence, depletion, or alteration of, functional product. An "essential gene" is thus one that encodes a product that is beneficial, or preferably necessary, for cellular growth in vitro in a medium appropriate for growth of a strain having a wild-type allele corresponding to the particular gene in question. Therefore, if an essential gene is inactivated or inhibited, that cell will grow significantly more slowly, preferably less than 20%, more preferably less than 10%, most preferably less than 5% of the growth rate of the uninhibited wild-type, or not at all, in the growth medium. Preferably, in the absence of activity provided by a product of the gene, the cell will not grow at all or will be non-viable, at least under culture conditions similar to the in vivo conditions normally encountered by the bacterial cell during an infection. For example, absence of the biological activity of certain enzymes involved in bacterial cell wall synthesis can result in the lysis of cells under normal osmotic conditions, even though protoplasts can be maintained under controlled osmotic conditions. In the context of the invention, essential genes are generally the preferred targets of antimicrobial agents. Essential genes can encode target molecules directly or can encode a product involved in the production, modification, or maintenance of a target molecule.
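The growth-rate thresholds in the definition of "essential" (less than 20%, 10%, or 5% of the uninhibited wild-type growth rate) can be applied as in the following minimal sketch. The function names and the tier labels are hypothetical, chosen only to mirror the "preferably / more preferably / most preferably" wording; how growth rate is measured is left to the experimenter.

```python
def relative_growth(inhibited_rate, wild_type_rate):
    """Growth rate of the inhibited strain as a percent of wild-type."""
    if wild_type_rate <= 0:
        raise ValueError("wild-type growth rate must be positive")
    return 100.0 * inhibited_rate / wild_type_rate

def essentiality_tier(percent_of_wild_type):
    """Map relative growth to the preference tiers named in the text."""
    if percent_of_wild_type <= 0:
        return "no growth"
    if percent_of_wild_type < 5:
        return "most preferred (<5%)"
    if percent_of_wild_type < 10:
        return "more preferred (<10%)"
    if percent_of_wild_type < 20:
        return "preferred (<20%)"
    return "threshold not met"

# Illustrative numbers only: doublings per hour, inhibited vs wild-type.
print(essentiality_tier(relative_growth(3, 20)))
```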
A "target" refers to a biomolecule that can be acted on by an exogenous agent, thereby modulating, preferably inhibiting, growth or viability of a cell. In most cases such a target will be a nucleic acid sequence or molecule, or a polypeptide or protein. However, other types of biomolecules can also be targets, e.g., membrane lipids and cell wall structural components.
The term "bacterium" refers to a single bacterial strain, and includes a single cell, and a plurality or population of cells of that strain unless clearly indicated to the contrary. In reference to bacteria or bacteriophage, the term "strain" refers to bacteria or phage having a particular genetic content. The genetic content includes genomic content as well as recombinant vectors. Thus, for example, two otherwise identical bacterial cells would represent different strains if each contained a vector, e.g., a plasmid, with different phage ORF inserts.
In preferred embodiments, the phage is Staphylococcus aureus phage 77, 3A, 96, or 44AHJD, Enterococcus sp. phage 182, or Streptococcus pneumoniae phage Dp-1.
In preferred embodiments, the phage is selected from. Preferred embodiments involve expressing at least one recombinant phage ORF(s) in a bacterial host followed by inhibition analysis of that host. Inhibition following expression of the phage ORF is indicative that the product of the ORF is active on an essential bacterial target.
Such evaluation can be carried out in a variety of different formats, such as on a support matrix such as a solidified medium in a petri dish, or in liquid culture.
Preferably a plurality of phage ORFs are expressed in at least one bacterium. The plurality of phage ORFs can be from one or a plurality of phage. With respect to a single phage or at least one phage in a plurality of phages, the plurality of expressed ORFs preferably represents at least 10%, more preferably at least 20%, 40%, or still more preferably at least 80% or 90%, and most preferably at least 95% of the ORFs in the phage genome. For a plurality of phage, the plurality of expressed ORFs preferably represents at least 10%, more preferably at least or 60%, still more preferably at least 80% or 90%, and most preferably at least of the ORFs in the phage genome of each phage. The plurality of phage ORFs can be expressed in a single bacterium, or in a plurality of bacteria where one ORF is expressed in each bacterium, or in a plurality of bacteria where a plurality of ORFs are expressed in at least one or in all of the plurality of bacteria, or combinations of these.
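The coverage figures above are simple fractions of a genome's ORFs. A minimal sketch of that arithmetic follows; the function names and the ORF identifiers are hypothetical, and the default threshold of 10% is the lowest value stated for a single phage.

```python
def orf_coverage(expressed_orfs, genome_orfs):
    """Percent of a phage genome's ORFs that were expressed.

    Both arguments are iterables of ORF identifiers. Expressed ORFs
    that are not part of this genome are ignored.
    """
    genome = set(genome_orfs)
    if not genome:
        raise ValueError("genome has no ORFs")
    return 100.0 * len(genome & set(expressed_orfs)) / len(genome)

def meets_coverage(expressed_orfs, genome_orfs, min_percent=10.0):
    """True if coverage reaches the chosen threshold."""
    return orf_coverage(expressed_orfs, genome_orfs) >= min_percent

# Hypothetical identifiers for a four-ORF genome.
genome = ["orf1", "orf2", "orf3", "orf4"]
print(orf_coverage(["orf1", "orf2"], genome))
```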
In embodiments of the above aspect (as well as in other aspects herein) in which a plurality of phage are utilized, a plurality of phage have the same bacterial host species; have different bacterial host species; or both. The plurality of phage includes at least two different phage, preferably at least 3, 4, 5, 6, 8, 10, 15, 20, or more different phage. Indeed, more preferably, the plurality of phage will include 50, 100, or more phage. As described herein, the larger number of phage is useful to provide additional target and target evaluation information useful in developing antibacterial agents, for example, by providing identification of a larger range of bacterial targets, and/or providing further indication of the suitability of a particular target (for example, utilization of a target by a number of different unrelated phage can suggest that the target is particularly stable and accessible and effective) and/or can indicate alternate sites on a target which interact with different inhibitors.
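The idea that a target used by several unrelated phage is a stronger candidate can be expressed as a simple tally over observed phage-target interactions. The sketch below is illustrative only: the function names, the `min_phage` cutoff, and the phage and target labels are hypothetical placeholders, not data from this specification.

```python
from collections import defaultdict

def phage_per_target(interactions):
    """Group (phage, target) pairs into {target: sorted list of phage}."""
    grouped = defaultdict(set)
    for phage, target in interactions:
        grouped[target].add(phage)
    return {target: sorted(phage) for target, phage in grouped.items()}

def shared_targets(interactions, min_phage=2):
    """Targets bound by ORF products from at least `min_phage` distinct phage."""
    grouped = phage_per_target(interactions)
    return sorted(t for t, phage in grouped.items() if len(phage) >= min_phage)

# Placeholder observations: each pair means an inhibitory ORF product
# from that phage was found to bind that bacterial target.
observed = [
    ("phage_A", "target_X"),
    ("phage_B", "target_X"),
    ("phage_B", "target_Y"),
    ("phage_C", "target_X"),
]
print(shared_targets(observed))
```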
Further embodiments involve confirmation of the inhibitor function of the phage ORF, such as by utilizing or incorporating a control(s) designed to confirm the inhibitory nature of the ORF(s) being evaluated. The control can, for example, be provided by expression of an inactive or partially inactive form of the ORF or ORF product, and/or by the absence of expression of the ORF or ORF product in the same or a closely comparable bacterial strain as that used for expression of the test ORF.
The reduced level of activity or the absence of active ORF product in the control will thus not provide the inhibition provided by a corresponding inhibitory ORF, or will provide a distinguishably lower level of inhibition. An inactivated or partially inactivated control has a mutation(s), in the coding region or in flanking regulatory elements, that reduce(s) or eliminate(s) the normal function of the ORF. Thus, the inhibition of a bacterium following expression of a phage ORF is determined by comparison with the effects of expression of an inactivated ORF or the response of the bacteria in the absence of expression in the same or similar type bacterium. Such determination of inhibition of the bacterium following expression of the ORF is indicative of a bacteria-inhibiting function. These manipulations are routinely understood and accomplished by those of skill in the art using standard techniques. In embodiments utilizing absence of expression of the ORF, the bacteria can, for example, contain an empty vector or a vector which allows expression of an unrelated sequence which is preferably non-inhibitory. Alternatively, the bacteria may have no vector at all. Combinations of such controls or other controls may also be utilized as recognized by those skilled in the art.
In embodiments involving expression of a phage ORF in a bacterial strain, in preferred embodiments that expression is inducible.
By "inducible" is meant that expression is absent or occurs at a low level until the occurrence of an appropriate environmental stimulus provides otherwise. For the present invention such induction is preferably controlled by an artificial environmental change, such as by contacting a bacterial strain population with an inducing compound (i.e., an inducer). However, induction could also occur, for example, in response to build-up of a compound produced by the bacteria in the bacterial culture, e.g., in the medium. As uncontrolled or constitutive expression of inhibitory ORFs can severely compromise bacteria to the point of eradication, such expression is undesirable in many cases because it would prevent effective evaluation of the strain and inhibitor being studied. For example, such uncontrolled expression could prevent any growth of the strain following insertion of a recombinant ORF, thus preventing determination of effective transfection or transformation. A controlled or inducible expression is therefore advantageous and is generally provided through the provision of suitable regulatory elements, e.g., promoter/operator sequences that can be conveniently transcriptionally linked to a coding sequence to be evaluated. In most cases, the vector will also contain sequences suitable for efficient replication of the vector in the same or different host cells and/or sequences allowing selection of cells containing the vector, i.e., "selectable markers." Further, preferred vectors include convenient primer sequences flanking the cloning region from which PCR and/or sequencing may be performed.
As knowledge of the nucleotide sequence of phage ORFs is useful for assisting in the identification of phage proteins active against essential bacterial host targets, preferred embodiments involve the sequencing of at least a portion of the phage genome in combination with the above methods. This can be done before, after, or independently of expression and inhibition of the ORF in the bacteria, and provides information on the nature and characteristics of the ORF. Such a portion is preferably at least 10%, 20%, 40%, 80%, 90%, or 100% of the phage genome. For embodiments in which a plurality of phage are utilized, preferably each phage is sequenced to an extent as just specified.
Such sequencing is preferably accompanied by computer sequence analysis to define and evaluate ORF(s), ORF products, structural motifs or functional properties of ORF products, and/or their genetic control elements. Thus, certain embodiments incorporate computer sequence analyses of nucleic acid and/or amino acid sequences.
Further, existing data banks can provide phage sequence and product information which can be utilized for analysis and identification of ORFs in the sequence.
Computer analysis may further employ known homologous sequences from other species that suggest or indicate conserved underlying biochemical function(s) for the inhibitory or potentially inhibitory ORF sequence(s) being evaluated. This can include the sequences of signature motifs of identified classes of inhibitors.
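As a purely illustrative aid, the following minimal Python sketch shows one simple way in which candidate ORFs might be defined by computer in a phage nucleotide sequence. It is not the analysis used herein. The function name `find_orfs`, the ATG-only start rule, the restriction to the forward strand, and the `min_codons` cutoff are all assumptions made for the example.

```python
# Illustrative forward-strand ORF scan; all names and cutoffs are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=33):
    """Return (start, end, frame) tuples for ATG-initiated ORFs.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    Only the three forward reading frames are scanned. min_codons counts
    the codons before the stop codon and is a hypothetical cutoff.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG since the last stop codon
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

A fuller analysis would typically also scan the reverse complement and consider alternative start codons; those steps are omitted here for brevity.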
In the context of the phage nucleic acid sequences, e.g., gene sequences, of this invention, the terms "homolog" and "homologous" denote nucleotide sequences from different bacteria or phage strains or species or from other types of organisms that have significantly related nucleotide sequences, and consequently significantly related encoded gene products, preferably having related function. Homologous gene sequences or coding sequences have at least 70% sequence identity (as defined by the maximal base match in a computer-generated alignment of two or more nucleic acid sequences) over at least one sequence window of 48 nucleotides, more preferably at least 80 or 85%, still more preferably at least 90%, and most preferably at least The polypeptide products of homologous genes have at least 35% amino acid sequence identity over at least one sequence window of 18 amino acid residues, more preferably at least 40%, still more preferably at least 50% or 60%, and most preferably at least 70%, 80%, or 90%. Preferably, the homologous gene product is also a functional homolog, meaning that the homolog will functionally complement one or more biological activities of the product being compared. For nucleotide or amino acid sequence comparisons where a homology is defined by a sequence identity, the percentage is determined using BLAST programs with default parameters (Altschul et al., 1997, "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs," Nucleic Acids Res. 25:3389-3402).
Any of a variety of algorithms known in the art which provide comparable results can also be used, preferably using default parameters. Performance characteristics for three different algorithms in homology searching are described in Salamov et al., 1999, "Combining sensitive database searches with multiple intermediates to detect distant homologues." Protein Eng. 12:95-100. Another exemplary program package is the GCG(TM) package from the University of Wisconsin.
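The window-based identity criterion described above can be sketched as follows. This is a minimal illustration only: it assumes that a pairwise alignment has already been produced by some other program (it does not run BLAST), and the function names and the treatment of gap columns as mismatches are assumptions made for the example.

```python
def max_window_identity(aln_a, aln_b, window):
    """Highest percent identity over any run of `window` aligned columns.

    aln_a and aln_b are the two rows of a pre-computed pairwise alignment
    ('-' marks a gap). Gap columns are counted as mismatches (an assumption).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or len(aln_a) < window:
        raise ValueError("alignment is shorter than the requested window")
    matches = [int(a == b and a != "-")
               for a, b in zip(aln_a.upper(), aln_b.upper())]
    current = sum(matches[:window])  # matches in the first window
    best = current
    for i in range(window, len(matches)):
        # Slide the window one column to the right.
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return 100.0 * best / window


def meets_identity_threshold(aln_a, aln_b, window=48, min_identity=70.0):
    """Defaults mirror the 48-nucleotide, 70% nucleotide criterion above."""
    return max_window_identity(aln_a, aln_b, window) >= min_identity
```

For polypeptide comparisons the same sketch could be called with `window=18` and `min_identity=35.0`, corresponding to the amino acid criterion stated above.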
Homologs may also or in addition be characterized by the ability of two complementary nucleic acid strands to hybridize to each other under appropriately stringent conditions. Hybridizations are typically and preferably conducted with probe-length nucleic acid molecules, preferably 20-100 nucleotides in length. Those skilled in the art understand how to estimate and adjust the stringency of hybridization conditions such that sequences having at least a desired level of complementarity will stably hybridize, while those having lower complementarity will not. For examples of hybridization conditions and parameters, see, e.g., Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor University Press, Cold Spring, N.Y.; Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology. John Wiley & Sons, Secaucus, N.J. Homologs and homologous gene sequences may thus be identified using any nucleic acid sequence of interest, including the phage ORFs and bacterial target genes of the present invention.
A typical hybridization, for example, utilizes, besides the labeled probe of interest, a salt solution such as 6xSSC (NaCl and sodium citrate base) to stabilize nucleic acid strand interaction, a mild detergent such as 0.5% SDS, together with other typical additives such as Denhardt's solution and salmon sperm DNA. The solution is added to the immobilized sequence to be probed and incubated at suitable temperatures to preferably permit specific binding while minimizing nonspecific binding. The temperature of the incubations and ensuing washes is critical to the success and clarity of the hybridization. Stringent conditions employ relatively higher temperatures, lower salt concentrations, and/or more detergent than do non-stringent conditions. Hybridization temperatures also depend on the length, complementarity level, and nature (i.e., "GC content") of the sequences to be tested. Typical stringent hybridizations and washes are conducted at temperatures of at least 40 °C, while lower stringency hybridizations and washes are typically conducted at 37 °C down to room temperature (~25 °C). One of skill in the art is aware that these conditions may vary according to the parameters indicated above, and that certain additives such as formamide and dextran sulphate may also be added to affect the conditions.
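To illustrate the dependence of hybridization temperature on probe length and GC content noted above, the following sketch applies the commonly used Wallace rule of thumb (about 2 °C per A or T and 4 °C per G or C). This rule is not taken from the present description. It is a rough approximation generally applied to short oligonucleotides in high-salt solution and does not account for formamide or other additives. The function name is hypothetical.

```python
def wallace_tm(oligo):
    """Rough melting temperature estimate in degrees C (Wallace rule).

    Tm ~ 2 * (A + T) + 4 * (G + C). An approximation for short probes only;
    it is an assumption of this example, not a method from the text.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

Under this approximation, two probes of equal length can differ substantially in estimated melting temperature solely because of GC content, which is consistent with adjusting temperatures to the sequences being tested.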
By "stringent hybridization conditions" is meant hybridization conditions at least as stringent as the following: hybridization in 50% formamide, 5X SSC, 50 mM NaH2PO4, pH 6.8, 0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and Denhardt's solution at 42 °C overnight; washing with 2X SSC, 0.1% SDS at 45 °C; and washing with 0.2X SSC, 0.1% SDS at 45 °C.
In sequence comparison analyses, an ORF, or motif, or set of motifs in a bacteriophage sequence can be compared to known inhibitor sequences, e.g., homologous sequences encoding homologous inhibitors of bacterial function.
Likewise, the analysis can include comparison with the structure of essential bacterial gene products, as structural similarities can be indicative of similar or replacement biological function. Such analysis can include the identification of a signature, or characteristic motif(s) of an inhibitor or inhibitor class.
Also, the identification of structural motifs in an encoded product, based on nucleotide or amino acid sequence analysis, can be used to infer a biochemical function for the product. A database containing identified structural motifs in a large number of sequences is available for identification of motifs in phage sequences. The database is PROSITE, which is available at www.expasy.ch/cgi-bin/scanprosite. The identification of motifs can, for example, include the identification of signature motifs for a class or classes of inhibitory proteins. Other such databases may also be used.
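By way of a hedged illustration, motif patterns written in PROSITE pattern syntax can be converted to regular expressions and scanned against a predicted ORF product. The sketch below handles only the basic syntax elements (`x`, `[...]`, `{...}`, `(n)`, `(n,m)`, and the `<` and `>` anchors outside brackets). The function names and the example pattern are assumptions for the example, and the sketch is not the ScanProsite tool itself.

```python
import re


def prosite_to_regex(pattern):
    """Convert a basic PROSITE-syntax pattern to a Python regular expression."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = repeat = ""
        if element.startswith("<"):      # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # C-terminal anchor
            suffix, element = "$", element[:-1]
        if element.endswith(")"):        # repetition: (n) or (n,m)
            element, _, count = element[:-1].partition("(")
            repeat = "{" + count + "}"
        if element == "x":               # any residue
            core = "."
        elif element.startswith("{"):    # excluded residues
            core = "[^" + element[1:-1] + "]"
        else:                            # single residue or [allowed set]
            core = element
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_motifs(pattern, protein):
    """Return (start, matched_text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(protein.upper())]
```

For example, `find_motifs("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.", seq)` would report each position in `seq` at which that illustrative pattern occurs.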
In aspects and preferred embodiments described herein, in which a bacterium or host bacterium is specified, the bacterium or host bacterium is preferably selected from a pathogenic bacterial species, for example, one selected from Table 1.
Preferably, an animal or plant pathogen is used. For animals, preferably the bacterium is a bird or mammalian pathogen, still more preferably a human pathogen.
In aspects and preferred embodiments involving a bacteriophage or sequences from a bacteriophage, one or more bacteriophage are preferably selected from those listed in Table 1. Those exemplary bacteriophage are readily obtained from the indicated sources.
In some cases, it is advantageous to utilize phage with non-pathogenic host bacteria. The genome, structural motif, ORF, homolog, and other analyses described herein can be performed on such phage and bacteria. Such analysis provides useful information and compositions. The results of such analyses can also be utilized in aspects of the present invention to identify homologous ORFs, especially inhibitor ORFs in phage with pathogenic bacterial hosts. Similarly, identification of a target in a non-pathogenic host can be used to identify homologous sequences and targets in pathogenic bacteria, especially in genetically closely related bacteria. Those skilled in the art are familiar with bacterial genetic relationships and with how to determine relatedness based on levels of genomic identity or other measures of nucleotide sequence and/or amino acid sequence similarity, and/or other physical and culture characteristics such as morphology, nutritional requirements, or minimal media to support growth.
Also in preferred embodiments, an embodiment of this aspect is combined with an embodiment of the following aspect.
A related aspect of the invention provides methods for identifying a target for antibacterial agents by identifying the bacterial target(s) of at least one uncharacterized or untargeted inhibitor protein or RNA from a bacteriophage. Such identification allows the development of antibacterial agents active on such targets.
Preferred embodiments for identifying such targets involve the identification of binding of target and phage ORF products to one another. The phage ORF products may be subportions of a larger ORF product that also binds the host target. In preferred embodiments, the phage protein or RNA is from an uncharacterized bacteriophage in Table 1. This aspect preferably includes the identification of a plurality of such targets in one or a plurality of different bacteria, preferably in one or a plurality of bacteria listed in Table 1.
In preferred embodiments of this aspect and other aspects of this invention involving particular phage ORFs or phage sequences, the ORF is Staphylococcus aureus phage 77 ORF 17, 19, 43, 102, 104, or 182 as identified in U.S. application 09/407,804, S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.
As indicated for the above aspect, preferably the method involves the use of a plurality of different phage, and thus a plurality of different phage inhibitors and/or inhibitor ORFs.
In addition to uncharacterized phage ORF products, it is also useful to identify the targets of phage ORF products which are known to be inhibitors of host bacteria, but where the target has not been identified. Thus, such inhibitors can likewise be utilized as "untargeted" inhibitor phage ORFs and ORF products, proteins or RNAs.
In the context of inhibitor proteins or RNAs from a phage, the term "uncharacterized" means that a bacteria-inhibiting function for the protein has not previously been identified. Preferably, but not necessarily, the sequence of the protein or the corresponding coding region or ORF was not described in the art before the filing of the present application for patent (or alternatively prior to the present invention). Thus, this term specifically excludes any bacteria-inhibiting phage protein and its associated bacterial target which has been identified as inhibitory before the present invention or alternatively before the filing of the present application, for example those identified in Tables 12-14 or otherwise identified herein. For example, in E. coli, phage T7 genes 0.7 and 2.0 target the host RNA polymerase, and phage T4 gp55/gp33 alter the specificity of host RNA polymerase. The T4 regB gene product also targets the host translation apparatus. As with the uncharacterized bacteriophage ORFs or bacteriophage above, for such identified proteins, the sequences encoding those proteins are excluded from the uncharacterized inhibitor proteins.
The term "fragment" refers to a portion of a larger molecule or assembly. For proteins, the term "fragment" refers to a molecule which includes at least 5 contiguous amino acids from the reference polypeptide or protein, preferably at least 8, 10, 12, 20, 30, 50 or more contiguous amino acids. In connection with oligo- or polynucleotides, the term "fragment" refers to a molecule which includes at least contiguous nucleotides from a reference polynucleotide, preferably at least 24, 30, 36, 60, 90, 150, or more contiguous nucleotides.
Preferred embodiments involve identification of binding using methods for distinguishing bound molecules, for example, affinity chromatography, immunoprecipitation, crosslinking, and/or genetic screen methods that permit protein:protein interactions to be monitored. One of skill in the art is familiar with these techniques and the common materials utilized (see, e.g., Coligan, J. et al. (eds.) (1995) Current Protocols in Protein Science. John Wiley & Sons, Secaucus, N.J.). Genetic screening for the identification of protein:protein interactions typically involves the co-introduction of both a chimeric bait nucleic acid sequence (here, the phage ORF to be tested) and a chimeric target nucleic acid sequence that, when coexpressed and having affinity for one another in a host cell, stimulate reporter gene expression to indicate the relationship. A "positive" can thus suggest a potential inhibitory effect in bacteria. This is discussed in further detail in the Detailed Description section below. In this way, new bacterial targets can be identified that are inhibited by specific phage ORF products or derivatives, fragments, mimetics, or other molecules.
Other embodiments involve the identification and/or utilization of mutant targets by virtue of their host's relatively unresponsive nature in the presence of expression of ORFs previously identified as inhibitory to the non-mutant or wild-type strain. Such mutants have the effect of protecting the host from an inhibition that would otherwise occur and indirectly allow identification of the precise responsible target for follow-up studies and anti-microbial development. In certain embodiments, rescue from inhibition occurs under conditions in which a bacterial target or mutant target is highly expressed. This is performed, for example, through coupling of the sequence with regulatory element promoters, as known in the art, which regulate expression at levels higher than wild-type, at a level sufficiently higher that the inhibitor can be competitively bound to the highly expressed target such that the bacterium is detectably less inhibited.
Identification of the bacterial target can involve identification of a phage-specific site of action. This can involve a newly identified target, or a target where the phage site of action differs from the site of action of a previously known antibacterial agent or inhibitor. For example, phage T7 genes 0.7 and 2.0 target the host RNA polymerase, which is also the cellular target for the antibacterial agent rifampin. To the extent that a phage product is found to act at a different site than previously described inhibitors, aspects of the present invention can utilize those new, phage-specific sites for identification and use of new agents. The site of action can be identified by techniques well-known to those skilled in the art, for example, by mutational analysis, binding competition analysis, and/or other appropriate techniques.
Once a bacterial host target protein or nucleic acid or mutant target sequence has been identified and/or isolated, it too can be conveniently sequenced, sequence analyzed (e.g., by computer), and the underlying gene(s) and corresponding translated product(s) further characterized. Preferred embodiments include such analysis and identification. Preferably such a target has not previously been identified as an appropriate target for antibacterial action.
Certain embodiments include the identification of at least one inhibitory phage ORF or ORF product, as described for the above aspect, and thus are a combination of the two aspects.
Additionally, the invention provides methods for identifying targets for antibacterial agents by identifying homologs of a bacterial target (e.g., from S. aureus, Enterococcus faecalis or other Enterococci, and Streptococcus pneumoniae) of a bacteriophage inhibitory ORF product. Such homologs may be utilized in the various aspects and embodiments described herein as described for the host Enterococcus sp. for bacteriophage 182.
Other aspects of the invention provide isolated, purified, or enriched specific phage nucleic acid and amino acid sequences, subsequences, and homologs thereof for phage selected from uncharacterized phage listed in Table 1, preferably from bacteriophage 77, 3A, 96, 44AHJD (Staphylococcus aureus host bacterium), Dp-1 (Streptococcus pneumoniae host), or 182 (Enterococcus host) or other phage listed in Table 1 for those bacteria. For example, such sequences do not include sequences identified in any of Tables 11-14. Nucleotide sequences of this aspect are at least nucleotides in length, preferably at least 18, 21, 24, or 27 nucleotides in length, more preferably at least 30, 50, or 90 nucleotides in length. In certain embodiments, longer nucleic acids are preferred, for example those of at least 120, 150, 200, 300, 600, 900 or more nucleotides. Such sequences can, for example, be amplification oligonucleotides (e.g., PCR primers), oligonucleotide probes, sequences encoding a portion or all of a phage-encoded protein, or a fragment or all of a phage-encoded protein. In preferred embodiments, the nucleic acid sequence contains a sequence which is within a length range with a lower length as specified above, and an upper length limit which is no more than 50, 60, 70, 80, or 90% of the length of the corresponding full-length ORF. The upper length limit can also be expressed in terms of the number of base pairs of the ORF (coding region). In preferred embodiments, the nucleic acid sequence is from Staphylococcus aureus phage 77 ORF 17, 19, 43, 102, 104, or 182 as identified in U.S. application 09/407,804, S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.
As it is recognized that alternate codons will encode the same amino acid for most amino acids due to the degeneracy of the genetic code, the sequences of this aspect include nucleic acid sequences utilizing such alternate codon usage for one or more codons of a coding sequence. For example, all four nucleic acid sequences GCT, GCC, GCA, and GCG encode the amino acid alanine. Therefore, if for an amino acid there exists an average of three codons, a polypeptide of 100 amino acids in length will, on average, be encoded by 3^100, or approximately 5 x 10^47, nucleic acid sequences.
Thus, a nucleic acid sequence can be modified (e.g., a nucleic acid sequence from a phage as specified above) to form a second nucleic acid sequence encoding the same polypeptide as encoded by the first nucleic acid sequence using routine procedures and without undue experimentation. Thus, all possible nucleic acid sequences that encode the specified amino acid sequences are also fully described herein, as if all were written out in full, taking into account the codon usage, especially that preferred in the host bacterium. The alternate codon descriptions are available in common textbooks, for example, Stryer, BIOCHEMISTRY 3rd ed., and Lehninger, BIOCHEMISTRY 3rd ed., along with many others. Codon preference tables for various types of organisms are available in the literature. Sequences with alternate codons at one or more sites can also be utilized in the computer-related aspects and embodiments herein. Because of the number of sequence variations involving alternate codon usage, for the sake of brevity, individual sequences are not separately listed herein. Instead the alternate sequences are described by reference to the natural sequence with replacement of one or more (up to all, e.g., up to 3, 5, 10, 15, 20, 30, or more) of the degenerate codons with alternate codons from the alternate codon table (Table or a modified table applicable to a particular organism that has differing codon usage, preferably with selection according to preferred codon usage for the normal host organism or a host organism in which a sequence is intended to be expressed. Those skilled in the art also understand how to alter the alternate codons to be used for expression in organisms where certain codons code differently than shown in the "universal" codon table.
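The codon degeneracy arithmetic discussed above can be checked with a short sketch. The example below builds the standard ("universal") codon table and counts the distinct nucleic acid sequences that encode a given peptide. The function names are hypothetical, and the sketch assumes the standard genetic code and ignores organism-specific codon preference.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(amino_acid):
    """Sorted list of codons encoding a one-letter amino acid code."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encodings(peptide):
    """Number of distinct coding sequences for a peptide (no stop codon)."""
    total = 1
    for aa in peptide.upper():
        n = len(synonymous_codons(aa))
        if n == 0:
            raise ValueError("unknown amino acid: " + aa)
        total *= n
    return total
```

Here `synonymous_codons("A")` returns the four alanine codons named above, and the integer `3 ** 100` evaluates to approximately 5 x 10^47, in agreement with the estimate given for a 100-residue polypeptide averaging three codons per residue.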
For amino acid sequences or polypeptides, sequences contain at least 5 peptide-linked amino acid residues, and preferably at least 6, 7, 10, 15, 20, 30, or 40 amino acids having identical amino acid sequence as the same number of contiguous amino acid residues in a particular phage ORF product. In some cases longer sequences may be preferred, for example, those of at least 50, 60, 70, 80, or 100 amino acids in length. In preferred embodiments, the amino acid sequence contains a sequence which is within a length range with a lower length as specified above, and an upper length limit which is no more than 50, 60, 70, 80, or 90% of the length of the corresponding full-length ORF product. The upper length limit can also be expressed in terms of the number of amino acid residues of the ORF product. In preferred embodiments, the amino acid sequence or polypeptide has bacteria-inhibiting function when expressed or otherwise present in a bacterial cell which is a host for the bacteriophage from which the sequence was derived.
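The length criteria above can be expressed as a simple check. In this minimal sketch the function name, the default values (a lower limit of 5 residues and an upper limit of 90% of the full-length ORF product), and the rounding down of the upper limit are assumptions chosen from among the ranges recited above. The check does not assess bacteria-inhibiting function.

```python
def is_qualifying_fragment(fragment, orf_product, min_len=5, max_fraction=0.9):
    """True if `fragment` is a contiguous stretch of `orf_product` whose
    length lies between min_len and max_fraction of the full-length product.

    The upper limit is rounded down to a whole number of residues
    (an assumption of this example).
    """
    fragment = fragment.upper()
    orf_product = orf_product.upper()
    upper_limit = int(max_fraction * len(orf_product))
    within_range = min_len <= len(fragment) <= upper_limit
    return within_range and fragment in orf_product
```

The same check could be applied to nucleotide fragments by passing nucleotide strings and the corresponding lower length limit.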
By "isolated" in reference to a nucleic acid is meant that a naturally occurring sequence has been removed from its normal cellular (e.g., chromosomal) environment or is synthesized in a non-natural environment (e.g., artificially synthesized). Thus, the sequence may be in a cell-free solution or placed in a different cellular environment. The term does not imply that the sequence is the only nucleotide chain present, but that it is essentially free (about 90-95% pure at least) of non-nucleotide material naturally associated with it, and thus is distinguished from isolated chromosomes.
The term "enriched" means that the specific DNA or RNA sequence constitutes a significantly higher fraction (2-5 fold) of the total DNA or RNA present in the cells or solution of interest than in normal or diseased cells or in cells from which the sequence was originally taken. This could be caused by a person by preferential reduction in the amount of other DNA or RNA present, or by a preferential increase in the amount of the specific DNA or RNA sequence, or by a combination of the two. However, it should be noted that enriched does not imply that there are no other DNA or RNA sequences present, just that the relative amount of the sequence of interest has been significantly increased.
The term "significant" is used to indicate that the level of increase is useful to the person making such an increase, and is an increase relative to other nucleic acids of at least about 2-fold, more preferably at least 5- to 10-fold or even more. The term also does not imply that there is no DNA or RNA from other sources. The other source DNA may, for example, comprise DNA from a yeast or bacterial genome, or a cloning vector such as pUC19. This term distinguishes from naturally occurring events, such as viral infection, or tumor type growths, in which the level of one mRNA may be naturally increased relative to other species of mRNA. That is, the term is meant to cover only those situations in which a person has intervened to elevate the proportion of the desired nucleic acid.
It is also advantageous for some purposes that a nucleotide sequence be in purified form. The term "purified" in reference to nucleic acid does not require absolute purity (such as a homogeneous preparation). Instead, it represents an indication that the sequence is relatively more pure than in the natural environment (compared to the natural level, this level should be at least 2-5 fold greater, e.g., in terms of mg/mL). Individual clones isolated from a cDNA library may be purified to electrophoretic homogeneity. The claimed DNA molecules obtained from these clones could be obtained directly from total DNA or from total RNA. The cDNA clones are not naturally occurring, but rather are preferably obtained via manipulation of a partially purified naturally occurring substance (messenger RNA). The construction of a cDNA library from mRNA involves the creation of a synthetic substance (cDNA), and pure individual cDNA clones can be isolated from the synthetic library by clonal selection of the cells carrying the cDNA library. Thus, the process which includes the construction of a cDNA library from mRNA and isolation of distinct cDNA clones yields an approximately 10^6-fold purification of the native message. Thus, purification of at least one order of magnitude, preferably two or three orders, and more preferably four or five orders of magnitude is expressly contemplated.
The terms "isolated", "enriched", and "purified" with respect to nucleic acids, above, may similarly be used to denote the relative purity and abundance of polypeptides (i.e., multimers of amino acids joined one to another by α-carboxyl:α-amino group (peptide) bonds). These, too, may be stored in, grown in, screened in, and selected from libraries using biochemical techniques familiar in the art. Such polypeptides may be natural, synthetic or chimeric and may be extracted using any of a variety of methods, such as antibody immunoprecipitation, other "tagging" techniques, conventional chromatography and/or electrophoretic methods. Some of the above utilize the corresponding nucleic acid sequence.
As indicated above, aspects and embodiments of the invention are not limited to entire genes and proteins. The invention also provides and utilizes fragments and portions thereof, preferably those which are "active" in the inhibitory sense described above. Such peptides or oligopeptides and oligo or polynucleotides have preferred lengths as specified above for nucleic acid and amino acid sequences from phage; corresponding recombinant constructs can be made to express the encoded same.
Also included are homologous sequences and fragments thereof.
Nucleic acid sequences of the present invention can be isolated using a method similar to those described herein or other methods known to those skilled in the art.
In addition, such nucleic acid sequences can be chemically synthesized by well-known methods. Also, by having particular phage ORFs, e.g., the phage ORFs identified herein (e.g., anti-bacterial ORFs of the present invention, portions thereof, or oligonucleotides derived therefrom as described), other antimicrobial sequences from other bacteriophage sources can be identified and isolated using methods described here or other methods, including methods utilizing nucleic acid hybridization and/or computer-based sequence alignment methods.
The invention also provides bacteriophage antimicrobial DNA segments from other phages based on nucleic acids and sequences hybridizing to the presently identified inhibitory ORF under high stringency conditions or sequences that are highly homologous. The bacteriophage segment from a specific phage, an antimicrobial DNA segment, can be used to identify a related segment from another unrelated phage based on stringent conditions of hybridization or on being a homolog based on nucleic acid and/or amino acid sequence comparisons. As with identified inhibitory sequences, such homologous coding sequences and products can be used as antimicrobials, to construct active portions or derivatives, to construct peptidomimetics, and to identify bacterial targets.
The nucleotide and amino acid sequences identified herein are believed to be correct, however, certain sequences may contain a small percentage of errors, 1- In the event that any of the sequences have errors, the corrected sequences can be readily provided by one skilled in the art using routine methods. For example, the nucleotide sequences can be confirmed or corrected by obtaining and culturing the relevant phage, and purifying phage genomic nucleic acids. A region or regions f interest can be amplified, by PCR from the appropriate genomic templateusing primers based on the described sequence. The amplified regions can then be sequenced using any of the available methods a dideoxy termination method).
WO 00/32825 PCT/IB99/02040 19 This can be done redundantly to provide the corrected sequence or to confirm that the described sequence is correct. Alternatively, a particular sequence or sequences can be identified and isolated as an insert or inserts in a phage genomic library and isolated, amplified, and sequenced by standard methods. Confirmation or correction of a nucleotide sequence for a phage gene provides an amino acid sequence of the encoded product by merely reading off the amino acid sequence according to the normal codon relationships; alternatively or in addition, the gene can be expressed in a standard expression system and the polypeptide product sequenced by standard techniques. The sequences described herein thus provide unique identification of the corresponding genes, coding sequences, and other sequences, allowing those sequences to be used in the various aspects of the present invention.
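As an illustration of reading off an amino acid sequence "according to the normal codon relationships", the following is a minimal sketch that translates a confirmed coding sequence with the standard genetic code. The function name `translate`, the use of "X" for unrecognized codons, and the example sequence are assumptions made for illustration and do not come from the specification.

```python
from itertools import product

# Standard genetic code, listed with bases in T, C, A, G order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(coding_sequence):
    """Translate a DNA or RNA coding sequence, stopping at the first stop codon.

    Codons containing characters other than A, C, G, T/U are reported as "X"
    (an illustrative choice for ambiguous base calls).
    """
    dna = coding_sequence.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical 12-nucleotide sequence: ATG GCC AAA TAA
print(translate("ATGGCCAAATAA"))
```

Re-running such a translation after a sequence has been corrected by re-sequencing gives the corrected amino acid sequence directly, without the need to sequence the polypeptide product.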
In other aspects, the invention provides recombinant vectors and cells harboring at least one of the phage ORFs or portion thereof, or bacterial target sequences described herein. As understood by those skilled in the art, vectors may be provided in different forms, including, for example, plasmids, cosmids, and virus-based vectors. See, e.g., Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor University Press, Cold Spring Harbor, N.Y. See also, Ausubel, F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology, John Wiley & Sons, Secaucus, N.J.
In preferred embodiments, the vectors will be expression vectors, preferably shuttle vectors that permit cloning, replication, and expression within bacteria. An "expression vector" is one having regulatory nucleotide sequences containing transcriptional and translational regulatory information that controls expression of the nucleotide sequence in a host cell. Preferably the vector is constructed to allow amplification from vector sequences flanking an insert locus. In certain embodiments, the expression vectors may additionally or alternatively support expression and/or replication in animal, plant and/or yeast cells due to the presence of suitable regulatory sequences, e.g., promoters, enhancers, 3' stabilizing sequences, primer sequences, etc. In preferred embodiments, the promoters are inducible and specific for the system in which expression is desired, e.g., bacteria, animal, plant, or yeast.
The vectors may optionally encode a "tag" sequence or sequences to facilitate protein purification. Convenient restriction enzyme cloning sites and suitable selective marker(s) are also optionally included. Such selective markers can be, for example, antibiotic resistance markers or markers which supply an essential nutritive growth factor to an otherwise deficient mutant host, e.g., tryptophan, histidine, or leucine in the Yeast Two-Hybrid systems described below.
The term "recombinant vector" relates to a single- or double-stranded circular nucleic acid molecule that can be transfected into cells and replicated within or independently of a cell genome. A circular double-stranded nucleic acid molecule can be cut and thereby linearized upon treatment with appropriate restriction enzymes. An assortment of nucleic acid vectors, restriction enzymes, and the knowledge of the nucleotide sequences cut by restriction enzymes are readily available to those skilled in the art. A nucleic acid molecule encoding a desired product can be inserted into a vector by cutting the vector with restriction enzymes and ligating the two pieces together. Preferably the vector is an expression vector, e.g., a shuttle expression vector as described above.
By "recombinant cell" is meant a cell possessing introduced or engineered nucleic acid sequences, e.g., as described above. The sequence may be in the form of or part of a vector or may be integrated into the host cell genome. Preferably the cell is a bacterial cell.
In another aspect, the invention also provides methods for identifying and/or screening compounds "active on" at least one bacterial target of a bacteriophage inhibitor protein or RNA. Preferred embodiments involve contacting such a bacterial target or targets (e.g., bacterial target proteins) with a test compound, and determining whether the compound binds to or reduces the level of activity of the bacterial target (e.g., a bacterial target protein). Preferably this is done either in vivo (i.e., in a cell-based assay) or in vitro, e.g., in a cell-free system under approximately physiological conditions.
The compounds that can be used may be large or small, synthetic or natural, organic or inorganic, proteinaceous or non-proteinaceous. In preferred embodiments, the compound is a peptidomimetic, as described herein, a bacteriophage inhibitor protein or fragment or derivative thereof, preferably an "active portion", or a small molecule.
In preferred embodiments, the bacterial target is a target of a phage ORF identified herein, S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.
In particular embodiments, the methods include the identification of bacterial targets or the site of action of an inhibitor on a bacterial target as described above or otherwise described herein.
In embodiments involving binding assays, preferably binding is to a fragment or portion of a bacterial target protein, where the fragment includes less than 70%, 60%, 50%, 40%, or 30% of an intact bacterial target protein. Preferably, the at least one bacterial target includes a plurality of different targets of bacteriophage inhibitor proteins, preferably a plurality of different targets. The plurality of targets can be in or from a plurality of different bacteria, but preferably is from a single bacterial species.
A "method of screening" refers to a method for evaluating a relevant activity or property (e.g., a bacteria-inhibiting activity) of a large plurality of compounds, rather than just one or a few compounds. For example, a method of screening can be used to conveniently test at least 100, more preferably at least 1000, still more preferably at least 10,000, and most preferably at least 100,000 different compounds, or even more.
In the context of this invention, the term "small molecule" refers to compounds having molecular mass of less than 2000 Daltons, preferably less than 1500, still more preferably less than 1000, and most preferably less than 600 Daltons.
Preferably but not necessarily, a small molecule is not an oligopeptide.
In a related aspect or in preferred embodiments, the invention provides a method of screening for potential antibacterial agents by determining whether any of a plurality of compounds, preferably a plurality of small molecules, is active on at least one target of a bacteriophage inhibitor protein or RNA. Preferred embodiments include those described for the above aspect, including embodiments which involve determining whether one or more test compounds bind to or reduce the level of activity of a bacterial target, and embodiments which utilize a plurality of different targets as described above.
The identification of bacteria-inhibiting phage ORFs and their encoded products also provides a method for identifying an active portion of such an encoded product. This also provides a method for identifying a potential antibacterial agent by identifying such an active portion of a phage ORF or ORF product. In preferred embodiments, the identification of an active portion involves one or more of mutational analysis, deletion analysis, or analysis of fragments of such products. The method can also include determination of a 3-dimensional structure of an active portion, such as by analysis of crystal diffraction patterns. In further embodiments, the method involves constructing or synthesizing a peptidomimetic compound, where the structure of the peptidomimetic compound corresponds to the structure of the active portion. In this context, "corresponds" means that the peptidomimetic compound structure has sufficient similarities to the structure of the active portion that the peptidomimetic will interact with the same molecule as the phage protein and preferably will elicit at least one cellular response in common which relates to the inhibition of the cell by the phage protein.
In preferred embodiments, the ORF or ORF product is or is derived or obtained from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014 or product thereof.
The methods for identifying or screening for compounds or agents active on a bacterial target of a phage-encoded inhibitor can also involve identification of a phage-specific site of action on the target.
Preferably in the methods for identifying or screening for compounds active on such a bacterial target, the target is uncharacterized; the target is from an uncharacterized bacterium from Table 1; the site of action is a phage-specific site of action.
Further embodiments include the identification of inhibitor phage ORFs and bacterial targets as in aspects above.
An "active portion" as used herein denotes an epitope, a catalytic or regulatory domain, or a fragment of a bacteriophage inhibitor protein that is responsible for, or a significant factor in, bacterial target inhibition. The active portion preferably may be removed from its contiguous sequences and, in isolation, still effect inhibition.
By "mimetic" is meant a compound structurally and functionally related to a reference compound that can be natural, synthetic, or chimeric. In terms of the present invention, a "peptidomimetic," for example, is a compound that mimics the activity-related aspects of the 3-dimensional structure of a peptide or polypeptide in a non-peptide compound, for example mimics the structure of a peptide or active portion of a phage- or bacterial ORF-encoded polypeptide.
A related aspect provides a method for inhibiting a bacterial cell by contacting the bacterial cell with a compound active on a bacterial target of a bacteriophage inhibitor protein or RNA, where the target was uncharacterized. In preferred embodiments, the compound is such a protein, or a fragment or derivative thereof; a structural mimetic, e.g., a peptidomimetic, of such a protein or fragment; a small molecule; the contacting is performed in vitro; the contacting is performed in vivo in an infected or at risk organism, e.g., an animal such as a mammal or bird, for example, a human, or other mammal described herein; the bacterium is selected from a genus and/or species listed in Table 1; the bacteriophage inhibitor protein is uncharacterized; the bacteriophage inhibitor protein is from an uncharacterized phage listed in Table 1; the phage inhibitor protein is from one of S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.
In the context of targets in this invention, the term "uncharacterized" means that the target was not recognized as an appropriate target for an antibacterial agent prior to the filing of the present application or alternatively prior to the present invention. Such lack of recognition can include, for example, situations where the target and/or a nucleotide sequence encoding the target were unknown, situations where the target was known, but where it had not been identified as an appropriate target or as an essential cellular component, and situations where the target was known as essential but had not been recognized as an appropriate target due to a belief that the target would be inaccessible or otherwise that contacting the cell with a compound active on the target in vitro would be ineffective in cellular inhibition, or ineffective in treatment of an infection. Methods described herein utilizing bacterial targets, e.g., for inhibiting bacteria or treating bacterial infections, can also utilize "uncharacterized target sites", meaning that the target has been previously recognized as an appropriate target for an antibacterial agent, but where an agent or inhibitor of the invention is used which acts at a different site than that at which the previously utilized antibacterial agent acts, i.e., a phage-specific site. Preferably the phage-specific site has different functional characteristics from the previously utilized site. In the context of targets or target sites, the term "phage-specific" indicates that the target or site is utilized by at least one bacteriophage as an inhibitory target and is different from previously identified targets or target sites.
In the context of this invention, the term "bacteriophage inhibitor protein" refers to a protein encoded by a bacteriophage nucleic acid sequence which inhibits bacterial function in a host bacterium. Thus, it is a bacteria-inhibiting phage product.
In the context of this invention, the phrase "contacting the bacterial cell with a compound active on a bacterial target of a bacteriophage inhibitor protein" or equivalent phrases refer to contacting with an isolated, purified, or enriched compound or a composition including such a compound, but specifically does not rely on contacting the bacterial cell with an intact phage which encodes the compound.
Preferably no intact phage are involved in the contacting.
Related aspects provide methods for prophylactic or therapeutic treatment of a bacterial infection by administering to an infected, challenged or at risk organism a therapeutically or prophylactically effective amount of a compound active on a target of a bacteriophage inhibitor protein or RNA, or as described for the previous aspect.
Preferably the bacterium involved in the infection or risk of infection produces the identified target of the bacteriophage inhibitor protein or alternatively produces a homologous target compound. In preferred embodiments, the host organism is a plant or animal, preferably a mammal or bird, and more preferably, a human or other mammal described herein. Preferred embodiments include, without limitation, those as described for the preceding aspect.
Compounds useful for the methods of inhibiting, methods of treating, and pharmaceutical compositions can include novel compounds, but can also include compounds which had previously been identified for a purpose other than inhibition of bacteria. Such compounds can be utilized as described and can be included in pharmaceutical compositions.
In preferred embodiments of this and other aspects of the invention utilizing bacterial target sequences of a bacteriophage inhibitory ORF product, the target sequence is encoded by a Staphylococcus nucleic acid coding sequence, preferably S. aureus, a Streptococcus nucleic acid coding sequence, preferably Streptococcus pneumoniae, or Enterococcus nucleic acid coding sequence. Possible target sequences are described herein by reference to sequence source sites.
The amino acid sequence of a polypeptide target is readily provided by translating the corresponding coding region. For the sake of brevity, the sequences are described by reference to the GenBank entries instead of being written out in full herein. In cases where the TIGR or GenBank entry for a coding region is not complete, the complete sequence can be readily obtained by routine methods, e.g., by isolating a clone in a phage host genomic library, and sequencing the clone insert to provide the relevant coding region. The boundaries of the coding region can be identified by conventional sequence analysis and/or by expression in a bacterium in which the endogenous copy of the coding region has been inactivated and using subcloning to identify the functional start and stop codons for the coding region.
In the context of nucleic acid or amino acid sequences of this invention, the term "corresponding" indicates that the sequence is at least 95% identical, preferably at least 97% identical, and more preferably at least 99% identical to a sequence from the specified phage genome, a ribonucleotide equivalent, a degenerate equivalent (utilizing one or more degenerate codons), or a homologous sequence, where the homolog provides functionally equivalent biological function.
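The identity thresholds in this definition can be checked mechanically once two sequences have been aligned. The sketch below is illustrative only: the specification does not say how percent identity is computed, so the convention used here (matching columns divided by alignment length, with gap columns counted as mismatches) and the names `percent_identity` and `identity_tier` are assumptions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as "-" and are never counted as matches. The denominator
    is the full alignment length (one of several conventions in use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def identity_tier(pct):
    """Map a percent identity onto the tiers named in the definition above."""
    if pct >= 99.0:
        return "at least 99% (more preferred)"
    if pct >= 97.0:
        return "at least 97% (preferred)"
    if pct >= 95.0:
        return "at least 95% (corresponding)"
    return None  # below the 95% threshold


# Hypothetical 20-nucleotide pair differing at one position: 19/20 identical.
reference = "ATGC" * 5
candidate = "ATGC" * 4 + "ATGA"
pct = percent_identity(reference, candidate)
print(pct, identity_tier(pct))
```

Note that the definition also covers ribonucleotide, degenerate, and functionally equivalent homologous sequences, which a simple identity count of this kind does not capture.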
By "treatment" or "treating" is meant administering a compound or pharmaceutical composition for prophylactic and/or therapeutic purposes. The term "prophylactic treatment" refers to treating a patient or animal that is not yet infected but is susceptible to or otherwise at risk of a bacterial infection. The term "therapeutic treatment" refers to administering treatment to a patient already suffering from infection.
The term "bacterial infection" refers to the invasion of the host organism, animal or plant, by pathogenic bacteria. This includes the excessive growth of bacteria which are normally present in or on the body of the organism, but more generally, a bacterial infection can be any situation in which the presence of a bacterial population(s) is damaging to a host organism. Thus, for example, an organism suffers from a bacterial infection when excessive numbers of a bacterial population are present in or on the organism's body, or when the effects of the presence of a bacterial population(s) are damaging to the cells, tissue, or organs of the organism.
The terms "administer", "administering", and "administration" refer to a method of giving a dosage of a compound or composition, e.g., an antibacterial pharmaceutical composition, to an organism. Where the organism is a mammal, the method is, e.g., topical, oral, intravenous, transdermal, intraperitoneal, intramuscular, or intrathecal. The preferred method of administration can vary depending on various factors, e.g., the components of the pharmaceutical composition, the site of the potential or actual bacterial infection, the bacterium involved, and the infection severity.
The term "mammal" has its usual biological meaning referring to any organism of the Class Mammalia of higher vertebrates that nourish their young with milk secreted by mammary glands, e.g., mouse, rat, and, in particular, human, bovine, sheep, swine, dog, and cat.
In the context of treating a bacterial infection a "therapeutically effective amount" or "pharmaceutically effective amount" indicates an amount of an antibacterial agent, as disclosed for this invention, which has a therapeutic effect.
This generally refers to the inhibition, to some extent, of the normal cellular functioning of bacterial cells that renders or contributes to bacterial infection.
The dose of antibacterial agent that is useful as a treatment is a "therapeutically effective amount." Thus, as used herein, a therapeutically effective amount means an amount of an antibacterial agent that produces the desired therapeutic effect as judged by clinical trial results and/or animal models. This amount can be routinely determined by one skilled in the art and will vary depending on several factors, such as the particular bacterial strain involved and the particular antibacterial agent used.
In connection with claims to methods of inhibiting bacteria and therapeutic or prophylactic treatments, "a compound active on a target of a bacteriophage inhibitor protein" or terms of equivalent meaning differ from administration of or contact with an intact phage naturally encoding the full-length inhibitor compound. While an intact phage may conceivably be incorporated in the present methods, the method at least includes the use of an active compound as specified different from a full-length inhibitor protein naturally encoded by a bacteriophage and/or a delivery or contacting method different from administration of or contact with an intact phage encoding the full-length protein. Similarly, pharmaceutical compositions described herein at least include an active compound different from a full-length inhibitor protein naturally encoded by a bacteriophage or such a full-length protein is provided in the composition in a form different from being encoded by an intact phage. Preferably the methods and compositions do not include an intact phage.
In accord with the above aspects, the invention also provides antibacterial agents and compounds active on bacterial targets of bacteriophage inhibitor proteins or RNAs, where the target was uncharacterized as indicated above. As previously indicated, such active compounds include both novel compounds and compounds which had previously been identified for a purpose other than inhibition of bacteria.
Such previously identified biologically active compounds can be used in embodiments of the above methods of inhibiting and treating. In preferred embodiments, the targets, bacteriophage, and active compound are as described herein for methods of inhibiting and methods of treating. Preferably the agent or compound is formulated in a pharmaceutical composition which includes a pharmaceutically acceptable carrier, excipient, or diluent. In addition, the invention provides agents, compounds, and pharmaceutical compositions where an active compound is active on an uncharacterized phage-specific site.
In preferred embodiments, the target is as described for embodiments of aspects above.
Likewise, the invention provides a method of making an antibacterial agent.
The method involves identifying a target of a bacteriophage inhibitor polypeptide or protein or RNA, screening a plurality of compounds to identify a compound active on the target, and synthesizing the compound in an amount sufficient to provide a therapeutic effect when administered to an organism infected by a bacterium naturally producing the target. In preferred embodiments, the identification of the target and identification of active compounds include steps or methods and/or components as described above (or otherwise herein) for such identification. Likewise, the active compound can be as described above, including fragments and derivatives of phage inhibitor proteins, peptidomimetics, and small molecules. As recognized by those skilled in the art, peptides can be synthesized by expression systems and purified, or can be synthesized artificially. In preferred embodiments the inhibitory phage ORF product is from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.
As indicated above, sequence analysis of nucleotide and/or amino acid sequences can beneficially utilize computer analysis. Thus, in additional aspects the invention provides computer-related hardware and media and methods utilizing and incorporating sequence data from uncharacterized phage, e.g., uncharacterized phage listed in Table 1, preferably at least one of S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014, or of Staphylococcus aureus phage 44AHJD, Enterococcus sp. phage 182, or Streptococcus pneumoniae phage Dp-1. In general, such aspects can facilitate the above-described aspects. Various embodiments involve the analysis of genetic sequence and encoded products, as applied to evaluating bacteriophage inhibitor ORFs and compounds and fragments related thereto. The various sequence analyses, as well as function analyses, can be used separately or in combination, as well as in preceding aspects and embodiments. Use in combination is often advantageous as the additional information allows more efficient prioritizing of phage ORFs for identification of those ORFs that provide bacteria-inhibiting function.
In one aspect, the invention provides a computer-readable device which includes at least one recorded amino acid or nucleotide sequence corresponding to one of the specified phage and a sequence analysis program for analyzing a nucleotide and/or amino acid sequence. The device is arranged such that the sequence information can be retrieved and analyzed using the analysis program. The analysis can identify, for example, homologous sequences or the indicated %s of the phage genome and structural motifs. Preferably the sequence includes at least 1 phage ORF or encoded product, more preferably at least 10%, 20%, 30%, 40%, 50%, 70%, or 100% of the genomic phage ORFs and/or equivalent cDNA, RNA, or amino acid sequences. Preferably the sequence or sequences in the device are recorded in a medium such as a floppy disk, a computer hard drive, an optical disk, computer random access memory (RAM), or magnetic tape. The program may also be recorded in such medium. The sequences can also include sequences from a plurality of different phage.
In this context, the term "corresponding" indicates that the sequence is at least 95% identical, preferably at least 97% identical, and more preferably at least 99% identical to a sequence from the specified phage genome, a ribonucleotide equivalent, a degenerate equivalent (utilizing one or more degenerate codons), or a homologous sequence, where the homolog provides functionally equivalent biological function.
Similarly, the invention provides a computer analysis system for identifying biologically important portions of a bacteriophage genome. The system includes a data storage medium, as identified above, which has recorded thereon a nucleotide sequence corresponding to at least a portion of at least one uncharacterized bacteriophage genome, a set of program instructions to allow searching of the sequence or sequences to analyze the sequence, and an output device, where the portion includes at least the sequence length as specified in the preceding aspect. The output device is preferably a printer, a video display, or a recording medium. More than one output device may be included. For each of the present computer-related aspects, the bacteriophage are preferably selected from the uncharacterized phage listed in Table 1, more preferably from bacteriophage 77, 3A, 96, 44AHJD (S. aureus), Dp-1 (Streptococcus pneumoniae), or 182 (Enterococcus).
In keeping with the computer device aspects, the invention also provides a method for identifying or characterizing a bacteriophage ORF by providing a computer-based system for analyzing nucleotide or amino acid sequences, as described above. The system includes a data storage medium which has recorded a sequence or sequences as described for the above devices, a set of instructions as in the preceding aspect, and an output device as in the preceding aspect. The method further involves analyzing at least one sequence, and outputting the analysis results to at least one output device.
In preferred embodiments, the analysis identifies a sequence similarity or homology with a sequence or sequences selected from bacterial ORFs encoding products with related biological function; ORFs encoding known inhibitors; and essential bacterial ORFs. Preferably the analysis identifies a probable biological function based on identification of structural elements or characteristic or signature motifs of an encoded product or on sequence similarity or homology. Preferably the uncharacterized bacteriophage is from Table 1, more preferably at least one of bacteriophage 77, 3A, 96, 44AHJD (S. aureus), Dp-1 (Streptococcus pneumoniae), or 182 (Enterococcus). In preferred embodiments, the method also involves determining at least a portion of the nucleotide sequence of at least one uncharacterized bacteriophage as indicated, and recording that sequence on data storage medium of the computer-based system. In preferred embodiments, the analysis identifies a sequence similarity or homology with a S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.
As used in the claims to describe the various inventive aspects and embodiments, "comprising" means including, but not limited to, whatever follows the word "comprising". Thus, use of the term "comprising" indicates that the listed elements are required or mandatory, but that other elements are optional and may or may not be present. By "consisting of" is meant including, and limited to, whatever follows the phrase "consisting of". Thus, the phrase "consisting of" indicates that the listed elements are required or mandatory, and that no other elements may be present.
By "consisting essentially of" is meant including any elements listed after the phrase, and limited to other elements that do not interfere with or contribute to the activity or action specified in the disclosure for the listed elements. Thus, the phrase "consisting essentially of" indicates that the listed elements are required or mandatory, but that other elements are optional and may or may not be present depending upon whether or not they affect the activity or action of the listed elements.
Further embodiments will be apparent from the following Detailed Description and from the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
FIGURE 1A and 1B are flow schematics showing the manipulations used to convert pT0021, an arsenite inducible vector containing the luciferase gene, into pTHA or pTM, two ars inducible vectors. Vector pTHA contains BamH I, Sal I, and Hind III cloning sites and a downstream HA epitope tag. Vector pTM contains Bam HI and Hind III cloning sites and no HA epitope tag.
FIGURE 2 is a schematic representation of the cloning steps involved to place the DNA segments of any of ORFs 17/ 19/ 43/ 102/ 104/ 182 or other sequences into pTHA to assess inhibitory potential. For subcloning into pTM or pT0021, individual ORFs were amplified by PCR using oligonucleotides targeting the ATG and stop codons of the ORFs. Using this strategy, Bam HI and Hind III sites were positioned immediately upstream or downstream, respectively, of the start and stop codons of each ORF. Following digestion with Bam HI and Hind III, the PCR fragments were subcloned into the same sites of pT0021 or pTM. Clones were verified by PCR and direct sequencing.
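The primer strategy described for FIGURE 2 can be sketched as follows. This is a simplified illustration, not the primers actually used: the annealing length, the function names, and the example ORF are hypothetical, and the few extra 5' bases that are commonly added so that restriction enzymes cut efficiently near the end of a PCR product are omitted. The recognition sequences GGATCC (Bam HI) and AAGCTT (Hind III) are the standard ones for these enzymes.

```python
BAMHI_SITE = "GGATCC"    # Bam HI recognition sequence
HINDIII_SITE = "AAGCTT"  # Hind III recognition sequence (palindromic)
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(orf, anneal_len=18):
    """Return (forward, reverse) primers, both written 5' to 3'.

    The forward primer places a Bam HI site immediately upstream of the ATG;
    the reverse primer places a Hind III site immediately downstream of the
    stop codon. anneal_len is an assumed annealing length.
    """
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF must begin with ATG")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must end with a stop codon")
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    forward = BAMHI_SITE + orf[:anneal_len]
    reverse = reverse_complement(orf[-anneal_len:] + HINDIII_SITE)
    return forward, reverse


# Hypothetical 27-nucleotide ORF, used only to show the primer layout.
fwd, rev = design_primers("ATGGCTAAAGGTCTGACCGAAATTTAA", anneal_len=9)
print(fwd)
print(rev)
```

Because the amplified fragment then carries a Bam HI site at one end and a Hind III site at the other, digestion with both enzymes allows it to be inserted in a defined orientation into a vector cut with the same pair of enzymes.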
FIGURE 3 shows a schematic representation of the functional assays used to characterize the bactericidal and bacteriostatic potential of all predicted ORFs (>33 amino acids) encoded by bacteriophage 77. Fig. 3A) Functional assay on semi-solid support media. Fig. 3B) Functional assay in liquid culture.
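The size cutoff mentioned for FIGURE 3 (predicted ORFs of more than 33 amino acids) can be illustrated with a simple six-frame scan. The specification does not state here how the ORFs were predicted, so the rule used below (an ORF runs from the first in-frame ATG to the next in-frame stop codon) and the function name `find_orfs` are assumptions for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(dna, longer_than_aa=33):
    """List ORFs encoding more than `longer_than_aa` amino acids.

    Each result is (strand, start, end, n_amino_acids). Coordinates are
    0-based and half-open, include the stop codon, and for the "-" strand
    refer to the reverse-complemented sequence.
    """
    results = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon
                elif start is not None and codon in STOP_CODONS:
                    n_aa = (i - start) // 3  # stop codon not counted
                    if n_aa > longer_than_aa:
                        results.append((strand, start, i + 3, n_aa))
                    start = None
    return results


# Hypothetical test sequence: one 41-codon ORF (ATG + 40 GCT codons + TAA).
example = "CC" + "ATG" + "GCT" * 40 + "TAA" + "G"
print(find_orfs(example))
```

Candidate ORFs found this way could then be filtered further, for example by the presence of an upstream ribosomal binding site, before being cloned for the functional assays.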
FIGURE 4A, B, and C is a bar graph showing the results of a screen in liquid media to assess bacteriostatic or bactericidal activity of 93 predicted ORFs (>33 amino acids) encoded by bacteriophage 77. Growth inhibition assays were performed as detailed in the Detailed Description. The relative growth of Staphylococcus aureus transformants harboring a given bacteriophage 77 ORF (identified on the bottom of the graph), in the absence or presence of arsenite, is plotted relative to growth of a Staphylococcus aureus transformant containing ORF 5, a non-toxic bacteriophage 77 ORF (which is set at 100%). Each bar represents the average obtained from three Staph A transformants grown in duplicate. Bacteriophage 77 ORFs showing significant growth inhibition consist of ORFs 17, 19, 102, 104, and 182.
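The normalization described for FIGURE 4 (growth of each transformant expressed relative to the non-toxic ORF 5 control, which is set at 100%, with each value averaged over three transformants grown in duplicate) can be sketched as below. The readings are made-up numbers, the type of growth measurement is not specified in this passage, and the function name `relative_growth` is hypothetical; the sketch handles a single condition, so it would be run once for cultures without arsenite and once for cultures with arsenite.

```python
from statistics import mean


def relative_growth(readings_by_orf, control="ORF 5"):
    """Express mean growth of each ORF transformant as a % of the control.

    readings_by_orf maps an ORF label to its replicate growth readings
    (e.g. three transformants x two cultures = six values).
    """
    control_mean = mean(readings_by_orf[control])
    if control_mean == 0:
        raise ValueError("control growth must be non-zero")
    return {
        orf: 100.0 * mean(values) / control_mean
        for orf, values in readings_by_orf.items()
    }


# Hypothetical readings for one condition (induced cultures).
induced = {
    "ORF 5": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],          # non-toxic control
    "ORF 17": [0.25, 0.25, 0.25, 0.25, 0.25, 0.25],   # strongly inhibited
    "ORF 20": [0.9, 1.1, 1.0, 1.0, 0.95, 1.05],       # essentially unaffected
}
for orf, pct in relative_growth(induced).items():
    print(f"{orf}: {pct:.0f}% of control")
```

An ORF whose relative growth drops markedly only in the induced condition is a candidate inhibitory ORF; what counts as "significant" inhibition is a threshold choice that this sketch does not make.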
FIGURE 5 shows a block diagram of major components of a general purpose computer.
FIGURE 6 shows an ORF map for Streptococcus pneumoniae bacteriophage Dp-1 showing the ORF identifiers, genomic locations, and orientations of the identified ORFs that were found to have ribosomal binding sites and thus are expected to be expressed.
FIGURE 7 shows a schematic representation of the arsenite-inducible expression system present in a shuttle vector designed to express individual Streptococcus bacteriophage Dp-1 ORFs in Streptococcus. Various modifications can be readily made to such a vector, or other vectors can be readily constructed to provide inducible expression of ORFs in a particular host bacterium using well-known techniques.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The invention may be more clearly understood from the following description.
The tables will first be briefly described.
Table 1 is a listing of a large number of available bacteriophage that can be readily obtained and used in the present invention.
Table 2 shows the complete nucleotide sequence of the genome of Staphylococcus aureus bacteriophage 77.
Table 3 shows a list of all the ORFs from Bacteriophage 77 that were screened in the functional assay to identify those with anti-microbial activity.
Table 4 shows the predicted nucleotide sequence, predicted amino acid sequence, and physicochemical parameters of ORFs 17/19/43/102/104/182. These include the primary amino acid sequence of the predicted protein, the average molecular weight, amino acid composition, theoretical pI, hydrophobicity map, and predicted secondary structure map.
Table 5 shows homology search results. BLAST analysis was performed with ORFs 17/19/43/102/104/182 against the NCBI non-redundant nucleotide and Swissprot databases. The results of this search indicate that: I) ORF 17 has no significant homology to any gene in the NCBI non-redundant nucleotide database, II) ORF 19 has significant homology to one gene in the NCBI non-redundant nucleotide database (the gene encoding ORF 59 of bacteriophage phi PVL), III) ORF 43 has significant homology to one gene in the NCBI non-redundant nucleotide database (the gene encoding ORF 39 of phi PVL), IV) ORF 102 has significant homology to one gene in the NCBI non-redundant nucleotide database (the gene encoding ORF 38 of phi PVL), V) ORF 104 has no significant homology to any gene in the NCBI non-redundant nucleotide database, and VI) ORF 182 has significant homology to one gene in the NCBI non-redundant nucleotide database (the gene encoding ORF 39 of phi PVL).
Table 6 is a table from Alberts et al., MOLECULAR BIOLOGY OF THE CELL, 3rd ed., showing the redundancy of the "universal" genetic code.
Table 7 shows the complete nucleotide sequence of Staphylococcus aureus bacteriophage 3A.
Table 8 is a listing of the ORFs identified in Staphylococcus aureus bacteriophage 3A.
Table 9 shows the complete nucleotide sequence of Staphylococcus aureus bacteriophage 96.
Table 10 is a listing of the ORFs identified in Staphylococcus aureus bacteriophage 96.
Table 11 is a listing of sequences deposited in the NCBI public database (GenBank) for bacteriophage listed in Table 1.
Table 12 is a listing of phage which encode a known lysis function including the identified lysis gene.
Table 13 is a listing of bacteriophage which encode holin genes, where holin genes encode proteins which form pores and eventually enable other enzymes to kill the host bacterium.
Table 14 is a listing of bacteriophage which encode kil genes.
Table 15 is a list of Staphylococcus aureus sequences identified by accession number which may include sequences from genes coding for target sequences for the phage 77-encoded antimicrobial proteins or peptides. The sequences were obtained by searching GenBank for listings.
Table 16 shows the nucleotide sequence of the genome of Staphylococcus aureus phage 44 AHJD.
Table 17 lists and shows the sequence position of the 73 ORFs predicted to be encoded by Staphylococcus aureus bacteriophage 44 AHJD that are greater than 33 amino acids.
Table 18 shows the ORF sequences and putative amino acid sequences for the Staphylococcus aureus bacteriophage 44AHJD ORFs greater than 33 amino acids.
Table 19 shows the similarities in sequence identified between predicted Staphylococcus aureus bacteriophage 44 AHJD ORFs and sequences present in public databases.
Table 20 shows the homology alignments between predicted Staphylococcus aureus bacteriophage 44AHJD ORFs and the corresponding protein sequences present in public sequence databases.
Table 21 shows the complete nucleotide sequence of the genome of Enterococcus bacteriophage 182.
Table 22 lists and shows the sequence position of the 80 ORFs identified in bacteriophage 182 and that are greater than 33 amino acids.
Table 23 shows the nucleotide and predicted amino acid sequence of all ORFs identified in bacteriophage 182.
Table 24 shows the similarities identified to date in sequence between Enterococcus phage 182 ORFs greater than 33 amino acids and sequences present in public sequence databases.
Table 25 shows the predicted amino acid sequence as well as the predicted secondary structures map for two Enterococcus bacteriophage 182 ORFs.
Table 26 shows the homology alignments between predicted Enterococcus bacteriophage 182 ORFs and the corresponding protein sequences present in public sequence databases.
Table 27 lists Enterococcus sequences listed in GenBank providing possible Enterococcal target sequences for inhibitory Enterococcus bacteriophage 182 ORFs and other compounds with antibacterial activity.
Table 28 shows the complete nucleotide sequence of the genome of Streptococcus bacteriophage Dp-1.
Table 29 lists and shows the sequence position of the 273 ORFs identified in Pneumococcal bacteriophage Dp-1 that are greater than 33 amino acids, 85 of which are predicted to be expressed in Dp-1 as having a ribosomal binding site. That set of ORFs is shown in the attached drawings.
Table 30 shows the nucleotide and predicted amino acid sequence of all 273 ORFs identified in bacteriophage Dp-1 that are identified as being expressed.
Table 31 shows the similarities identified in sequence between Streptococcus phage Dp-1 ORFs greater than 33 amino acids and sequences present in public sequence databases.
Table 32 shows the 4731 bp sequence of Dp-1 published by Sheehan et al. (1997).
Table 33 lists Streptococcus pneumoniae sequences listed in GenBank providing possible target sequences for inhibitory Streptococcus pneumoniae bacteriophage Dp-1 ORFs and other compounds with antibacterial activity.
Background: As indicated above, the present invention is concerned, in part, with the use of bacteriophage coding sequences and the encoded polypeptides or RNA transcripts to identify bacterial targets for potential new antibacterial agents. Thus, the invention concerns the selection of relevant bacteria. Particularly relevant bacteria are those which are pathogens of a complex organism such as an animal (e.g., mammals, reptiles, and birds) or a plant. Examples include Staphylococcus aureus, Enterococcus species, and Streptococcus pneumoniae. However, the invention can be applied to any bacterium (whether pathogenic or not) for which bacteriophage are available or which are found to have cellular components closely homologous to components targeted by phage of another bacterium.
Thus, the invention also concerns the bacteriophage which can infect a selected bacterium. Identification of ORFs or products from the phage which inhibit the host bacterium both provides an inhibitor compound and allows identification of the bacterial target affected by the phage-encoded inhibitor. Such targets are thus identified as potential targets for development of other antibacterial agents or inhibitors and the use of those targets to inhibit those bacteria. As indicated above, even if such a target is not initially identified in a particular bacterium, such a target can still be identified if a homologous target is identified in another bacterium.
Usually, but not necessarily, such another bacterium would be a genetically closely related bacterium. Indeed, in some cases, a phage-encoded inhibitor can also inhibit such a homologous bacterial cellular component.
The demonstration that bacteriophage have adapted to inhibiting a host bacterium by acting on a particular cellular component or target provides a strong indication that that component is an appropriate target for developing and using antibacterial agents in therapeutic treatments. Thus, the present invention provides additional guidance over mere identification of bacterial essential genes, as the present invention also provides an indication of accessibility of the target to an inhibitor, and an indication that the target is sufficiently stable over time (i.e., not subject to high rates of mutation), as phage acting on that target were able to develop and persist. Thus, the present invention identifies a subset of essential cellular components which are particularly likely to be appropriate targets for development of antibacterial agents.
The invention also, therefore, concerns the development or identification of inhibitors of bacteria, in addition to the phage-encoded inhibitory proteins (or RNA transcripts), which are active on the targets of bacteriophage-encoded inhibitors. As described herein, such inhibitors can be of a variety of different types, but are preferably small molecules.
The following description provides preferred methods for use in the various aspects of the invention. However, as those skilled in the art will readily recognize, other approaches can be used to obtain and process relevant information. Thus, the invention is not limited to the specifically described methods. In addition, the following description provides a set of steps in a particular order. That series of steps describes the overall development involved in the present invention. However, it is clear that individual steps or portions of steps may be usefully practiced separately, and, further, that certain steps may be performed in a different order or even bypassed if appropriate information is already available or is provided by other sources or methods.
Selecting and Growing Phage, and Isolating DNA
Conceptually, the first step involves selecting bacterial hosts of interest.
Preferably, but not necessarily, such hosts will be pathogens of clinical importance.
Alternatively, because bacteria all share certain fundamental metabolic and structural features, these features can be targeted for study in one strain, for example a nonpathogenic one, and extrapolated to similarly succeed in pathogenic ones.
Nonpathogenic strains may also exhibit initial advantages in being not only less dangerous, but also, for example, in having better growth and culturing characteristics and/or better developed molecular biology techniques and reagents. Consequently, the invention advantageously provides the ability to target virtually any bacteria, but preferably pathogenic bacteria, with antimicrobial compounds designed and/or developed using bacteriophage inhibitory proteins and peptides from phage with nonpathogenic and/or pathogenic hosts.
We have selected Staphylococcus aureus, Streptococcus pneumoniae, various Enterococci, and Pseudomonas aeruginosa as initial exemplary pathogens. These bacteria are a major cause of morbidity and mortality in hospital-based infections, and the appearance of antibiotic resistance in all three organisms makes it increasingly difficult to treat benign infections involving these organisms. Such infections can include, for example, otitis media, sinusitis, and skin and airway infections (Neu, H.C. (1992). Science 257, 1064-1073). However, the approach described below is clearly applicable to any human bacterial pathogens including but not restricted to Mycobacterium tuberculosis, Neisseria gonorrhoeae, Haemophilus influenzae, Acinetobacter, Escherichia coli, Shigella dysenteriae, Streptococcus pyogenes, Helicobacter pylori, and Mycoplasma species. This invention can also be applied to the discovery of anti-bacterial compounds directed against pathogens of animals other than humans, for example, sheep, cattle, swine, dogs, cats, birds, and reptiles.
Similarly, the invention is not limited to animals, but also applies to plants and plant pathogens.
In general, the bacteria are grown according to standard methodologies employed in the art, including solid, semi-solid or liquid culturing, which procedures can be found in or extrapolated from standard sources such as Maloy, S.R., Stewart, and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold Spring Harbor Laboratory Press, or Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor University Press, Cold Spring, N.Y., or Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology. John Wiley & Sons, Secaucus, N.J. Culture conditions are selected which are adapted to the particular bacterium, generally using culture conditions known in the art as appropriate, or adaptations of those conditions.
Nucleic acids within these bacteria can be routinely extracted through common procedures such as described in the above-referenced manuals and as generally known to those skilled in the art. Those nucleic acid stocks can then be used to practice the other inventive aspects described below.
Selection and Growth of Bacteriophage, and Isolation of DNA
The second step involves assembling a group of bacteriophages (phage collection) for one or more of the targeted bacterial hosts. While the invention can be utilized with a single bacteriophage for a pathogen or other bacterium, it is preferable to utilize a plurality of phage for each bacterium, as comparisons between a plurality of such phage provide useful additional information. Non-limiting examples of phage and sources for some of the above-mentioned pathogenic bacteria are found in Table 1. The criteria used to select such phages are that they are infectious for the microbe targeted, and replicate in, lyse, or otherwise inhibit growth of the bacterium in a measurable fashion. These phages can be very different from one another (representing different families), as judged by criteria such as morphology (head, tail, plate, etc.), and similarity of genome nucleotide sequence (cross-hybridization). Since such diverse bacteriophages are expected to block bacterial host metabolism and ultimately inhibit by a variety of mechanisms, their combined study will lead to the identification of different mechanisms by which the phages independently inhibit bacterial targets. Examples include degradation of host DNA (Parson and Snustad, D.P. (1975). J. Virol. 15, 221-444) and inhibition of host RNA transcription (Severinova, Severinov, K. and Darst, S.A. (1998). J. Mol. Biol. 279, 9-18). This, in turn, yields novel information on phage proteins that can inhibit the targeted microbe. As explained below, this 1) forms the basis of novel drug discovery efforts based on knowledge of the [...] inhibitory protein (e.g., peptide fragments or peptidomimetics) and/or 2) leads to the identification of bacterial biochemical pathways, the proteins of which are essential or significant for survival of the targeted microbe, and which enzymatic steps or chemical reactions can be targeted by classical drug discovery methods using molecular inhibitors, for example, small molecule inhibitors.
Bacteriophage are generally either of two types, lytic or filamentous, meaning they either outright destroy their host and seek out new hosts after replication, or else continuously propagate and extrude progeny phage from the same host without destroying it. Regardless of the phage life cycle and type, preferred embodiments incorporate phage which impede cell growth in measurable fashion and preferably stop cell growth. To this end, lytic phage are preferred, although certain nonlytic species may also suffice, if sufficiently bacteriostatic.
Various procedures that are commonly understood by those of skill in the art can be routinely employed to grow, isolate, and purify phage. Such procedures are exemplified by those found in common laboratory aids such as Maloy, S.R., Stewart, and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold Spring Harbor Laboratory Press; Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor University Press, Cold Spring, N.Y.; and Ausubel, F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John Wiley & Sons, Secaucus, N.J. The techniques generally involve the culturing of infected bacterial cells that are lysed naturally and/or with chemical assistance, for example, by the use of an organic solvent such as chloroform that destroys the host cells, thereby liberating the phage within. Following this, the cellular debris is centrifuged away from the supernatant containing the phage particles, and the phage are then selectively precipitated out of the supernatant using various methods, usually employing alcohols and/or other chemical compounds such as polyethylene glycol (PEG). The resulting phage can be further purified using various density gradient/centrifugation methodologies. The resulting phage are then chemically lysed, thereby releasing their nucleic acids, which can be conveniently precipitated out of the supernatant to yield a viral nucleic acid supply of the phage of interest.
Exemplary bacteriophage are indicated in Table 1, along with sources where those phage may be obtained.
Exemplary bacteria include the reference bacteria for the identified bacteriophage, available from the same sources.
Characterizing Bacteriophage Genomes for ORFs
The third step involves systematically characterizing the genetic information contained in the phage genome. Within this genetic information is the sequence of all RNAs and proteins encoded by the phage, including those that are essential or instrumental in inhibiting their host. This characterization is preferably done in a systematic fashion. For example, this can be done by first isolating high molecular weight genomic DNA from the phage using standard bacterial lysis methods, followed by phage purification using density gradient ultracentrifugation, and extraction of nucleic acid from the purified phage preparation. The high molecular weight DNA is then analyzed to determine its size and to evaluate a proper strategy for its sequencing.
The DNA is broken down into smaller size fragments by sonication or partial digestion with frequently cutting restriction enzymes such as Sau3A to yield predominantly 1 to 2 kilobase length DNA, which DNA can then be resolved by gel electrophoresis followed by extraction from the gel.
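The fragmentation step can be previewed in silico before any bench work. The following is a minimal sketch, assuming a linear genome held as a plain string and the Sau3A recognition sequence GATC (cut 5' of the G); the function names, the `cut_probability` parameter, and the 1 to 2 kilobase size window defaults are illustrative assumptions rather than part of the described method. A partial digest is modeled by cutting each site independently with a fixed probability.

```python
import random
import re

SAU3A_SITE = "GATC"  # Sau3A recognition sequence; the enzyme cuts 5' of the G


def partial_digest(sequence, cut_probability=1.0, rng=None, site=SAU3A_SITE):
    """Return fragments of a linear molecule after a simulated digest.

    Each recognition site is cut independently with `cut_probability`
    (1.0 models a complete digest, lower values a partial digest).
    """
    rng = rng or random.Random()
    seq = sequence.upper()
    # A lookahead finds every site, including overlapping occurrences.
    sites = [m.start() for m in re.finditer(f"(?={site})", seq)]
    cuts = [pos for pos in sites if rng.random() < cut_probability]
    bounds = [0] + cuts + [len(seq)]
    # Drop the empty fragment that arises when a cut falls at position 0.
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_len=1000, max_len=2000):
    """Keep fragments in the size window that would be excised from the gel."""
    return [f for f in fragments if min_len <= len(f) <= max_len]
```

Because a complete digest with a four-base cutter yields fragments far shorter than 1 to 2 kilobases on average, lowering `cut_probability` is what shifts the simulated size distribution toward the window used for library construction.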
The ends of the fragments are enzymatically treated to render them suitable for cloning and the pools of fragments are cloned in a bacterial plasmid to generate a library of the phage genome. Several hundred of these random DNA fragments contained in the plasmid vector are isolated as clones after introduction into an appropriate bacterium, usually Escherichia coli. They are then individually expanded in culture and the DNA from each individual clone is purified. The nucleotide sequences of the inserts of these clones are determined by standard automated or manual methods, using oligonucleotide primers located on either side of the cloning site to direct polymerase-mediated sequencing (e.g., the Sanger sequencing method or a modification of that method). Other sequencing methods can also be used.
The sequence of individual clones is then deposited in a computer, and specific software programs (for example, Sequencher™, Gene Codes Corp.) are used to look for overlap between the various sequences, resulting in ordering of contig sequences and ultimately providing the complete sequence of the entire bacteriophage genome (one such example is given in Table 2 for Staphylococcus aureus bacteriophage 77; others are also provided herein). This complete nucleotide sequence is preferably determined with a redundancy of at least 3- to 5-fold (number of independent sequencing events covering the same region) in order to minimize sequencing errors.
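The redundancy criterion lends itself to a simple check once reads have been placed on the assembled genome. The sketch below is illustrative only: it assumes each read is already reduced to a hypothetical (start, end) pair of 1-based inclusive genome coordinates, and the function names and the 3-fold default are assumptions chosen to mirror the lower bound stated above.

```python
def coverage_depth(genome_length, reads):
    """Return per-base depth for reads given as 1-based inclusive (start, end)."""
    # Difference array: +1 where a read begins, -1 just past where it ends.
    diff = [0] * (genome_length + 2)
    for start, end in reads:
        if not 1 <= start <= end <= genome_length:
            raise ValueError(f"read ({start}, {end}) lies outside the genome")
        diff[start] += 1
        diff[end + 1] -= 1
    depth, running = [], 0
    for position in range(1, genome_length + 1):
        running += diff[position]
        depth.append(running)
    return depth


def under_covered(depth, min_fold=3):
    """List 1-based positions sequenced fewer than `min_fold` times."""
    return [i + 1 for i, d in enumerate(depth) if d < min_fold]
```

Positions returned by `under_covered` would be candidates for additional sequencing before the genome sequence is treated as final.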
Preferably, the bacterial strain used as a phage host should not possess any other innate plasmids, transposons, or other phage or incompatible sequences that would complicate or otherwise make the various manipulations and analyses more difficult.
Commercially available computer software programs are used to translate the nucleotide sequence of the phage to identify all protein sequences encoded by the phage (hereafter called open reading frames or ORFs). (Customized software can clearly also be used.) As phages are known to transcribe their genome into RNA from both strands and in all six possible reading frames, and as evolutionary constraints have forced the phage to conserve all of its vital protein sequences in as small a genome as possible, it is straightforward to identify all the proteins encoded by the phage by simple examination of the 6 translation frames of the genome. Once these ORFs are identified, they are cataloged into a phage proteome database (Table 3 lists ORFs identified from phage 77; ORF lists are also provided for other exemplary phage).
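The six-frame examination can be sketched in a few lines. This is an illustrative sketch, not the commercial software referred to above: it assumes the standard genetic code, reports coordinates on the forward strand including the stop codon, and takes the first permitted start codon after each stop. The names `find_orfs`, `min_aa`, and `starts` are hypothetical; the default of 34 residues reflects the ">33 amino acids" cutoff used for the listed ORFs.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(genome, min_aa=34, starts=("ATG",)):
    """Scan all six frames; return (strand, start, end, protein) tuples.

    start/end are 1-based forward-strand coordinates and include the stop codon.
    """
    genome = genome.upper()
    n = len(genome)
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, n - 2, 3):
                codon = seq[i:i + 3]
                if orf_start is None and codon in starts:
                    orf_start = i
                elif CODON_TABLE.get(codon) == "*":
                    if orf_start is not None:
                        body = "".join(CODON_TABLE.get(seq[j:j + 3], "X")
                                       for j in range(orf_start + 3, i, 3))
                        protein = "M" + body  # initiator codon is read as Met
                        if len(protein) >= min_aa:
                            if strand == "+":
                                span = (orf_start + 1, i + 3)
                            else:
                                span = (n - i - 2, n - orf_start)
                            orfs.append((strand, span[0], span[1], protein))
                    orf_start = None
    return orfs
```

Under this coordinate convention the amino acid count equals (end - start + 1) / 3 - 1. Passing a wider `starts` tuple, for example one that includes TTG or CTG, changes which ORFs are reported, which is the kind of parameter adjustment that can produce different ORF sets from the same genome.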
This analysis is preferably performed for each phage under study. The process of ORF identification can be varied depending on the desired results. For example, the minimum length for the putative encoded polypeptide can be varied, and/or putative coding regions that have an associated Shine-Dalgarno sequence can be selected. In the case of phage 77 ORFs, such parameter adjustment was performed and resulted in the identification of ORFs as listed herein. Different parameters resulted in the identification of different ORFs for phage 77, as shown in the following table:

ORF ID            Genomic        a.a.   Start    ORF ID            Genomic        a.a.   Start
(1st set of       position       size   codon    (2nd set of       position       size   codon
parameters)                                      parameters)
77ORF016          23269-24024    251    TTG      77ORF017          23269-23982    237    ATG
77ORF019          39845-40501    218    ATA      77ORF019          39851-40501    216    ATG
77ORF050          29268-29564    98     ATG      77ORF182          29268-29564    98     ATG
77ORF050          29268-29564    98     ATG      77ORF043          29304-29564    86     ATG
77ORF067          34312-34551    79     CTG      77ORF104          34393-34551    52     ATG
77ORF146          29051-29212    53     ATG      77ORF102          29051-29212    53     ATG

Identifying and Characterizing Inhibitory Phage ORFs
The fourth step entails identifying the phage protein or proteins or RNA transcripts that have the ability to inhibit their bacterial hosts. This can be accomplished, for example, by either or both of two non-mutually exclusive methods.
The first method makes use of bioinformatics. Over the past few years, a large amount of nucleotide sequence information and corresponding translated products have become available through large genome sequencing projects for a variety of organisms including mammals, insects, plants, unicellular eukaryotes (yeast and fungi), as well as several bacterial genomes such as E. coli, Mycobacterium tuberculosis, Bacillus subtilis, Staphylococcus aureus and many others. Such sequences have been deposited in public databases (for example, the non-redundant sequence database at GenBank and the SwissProt protein sequence database; http://www.ncbi.nlm.nih.gov) and can be freely accessed to compare any specific query sequence to those present in such databases. For example, GenBank contains over 1.6 billion nucleotides corresponding to 2.3 million sequence records. Several computer programs and servers (e.g., TBLASTN) have been created to allow the rapid identification of homology between any given sequence from one organism and that of another present in such databases, and such programs are public and available free of charge.
In addition, it has been well established that basic biochemical pathways can be conserved in very distant organisms (for example bacteria and man), and that the proteins performing the various enzymatic steps in these pathways are themselves conserved at the amino acid sequence level. Thus, proteins performing similar functions (e.g., DNA repair, RNA transcription, RNA translation) have frequently preserved key structural signatures, identifiable by similarities across regions of proteins (domains and motifs). The antimicrobials of the present invention will preferably target features and targets that are highly characteristic or conserved in microbes, and not higher organisms.
Most genomes encode individual proteins or groups of proteins that can be assembled into protein families that have been evolutionarily conserved. Therefore, similarity between a new query sequence and that of a member of a protein family (reference sequences from public databases) can immediately suggest a biochemical function for the novel query sequence, which in our case is a phage ORF.
The sequence homology between individual members of evolutionarily distant members of a protein family is usually not randomly distributed along the entire length of the sequence but is often clustered into "motifs" and "domains". These correspond to key three-dimensional folds that form key catalytic and/or regulatory structures that perform key biochemical function(s) for the group of proteins.
Commercially available computer software programs can identify such motifs in a new query sequence, again providing functional information for the query sequence.
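Motif identification of this kind can be illustrated with a small pattern scanner. The sketch below assumes motifs written in PROSITE-style pattern syntax (elements separated by hyphens, `x` for any residue, square brackets for allowed residues, curly braces for excluded residues, parentheses for repeat counts) and converts them to regular expressions; the function names are hypothetical, and the pattern used in the usage note is a made-up toy pattern, not an entry from any database.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P}: any residue except P
        element = element.replace("(", "{").replace(")", "}")   # x(2,4): 2 to 4 repeats
        element = element.replace("x", ".")                      # x: any residue
        element = element.replace("<", "^").replace(">", "$")   # N- and C-terminal anchors
        parts.append(element)
    return "".join(parts)


def scan_motif(protein, pattern):
    """Return (1-based position, matched segment) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(protein.upper())]
```

For instance, `scan_motif("MKCAACGHLL", "C-x(2)-C-{P}-H")` reports a single hit starting at residue 3. Real motif databases carry additional annotation and scoring that this sketch does not attempt to reproduce.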
Such structural and functional motifs have also been derived from the combined analysis of primary sequence databases (protein sequences) and protein structure databases (X-ray crystallography, nuclear magnetic resonance) using so-called "threading" methods (Rost, B. and Sander, C. (1996). Ann. Rev. Biophys. Biomol. Struct. 25, 113-136).
Such motifs and folds are themselves deposited in public databases which can be directly accessed (for example, SwissProt database; 3D-ALI at EMBL, Heidelberg; PROSITE). This basic exercise leads to a structural homology map in which each of the phage ORFs has been probed for such similarities, and where initial structural and functional hits are identified (selected examples of sequence homologies detected between individual ORFs from the genome of Staphylococcus aureus bacteriophage 77 and sequences deposited in public databases are shown in Table 5 for ORFs 17/19/43/102/104/182).
This analysis can point out phage proteins with similarity to proteins from other phages (such as those for E. coli) playing an important role in the basic biochemical pathways of the phage (such as DNA replication, RNA transcription, tRNAs, coat protein and assembly). Selected examples of such proteins include integrase and capsid protein. Therefore, this analysis enables identification and elimination of non-essential ORFs as candidates for an inhibitor function, as well as the identification of (potentially) useful ones.
In addition, this analysis can point out specific ORFs as possible inhibitor ORFs. For example, these ORFs may encode proteins or enzymes that alter bacterial cell structure, metabolism or physiology, and ultimately viability. Examples of such proteins present in the genome of Staphylococcus aureus bacteriophage 77 include orf14 (deoxyuridine triphosphatase from bacteriophage T5), and orf15 (sialidase).
(These ORF identifications are as listed in provisional application 60/110,992.) Other examples include ORFs 9 and 12 of S. aureus phage 44 AHJD, which encode the putative lysis functions found in many bacteriophages (a "holin" and an "amidase").
In addition, it is well known that bacterial and eukaryotic viruses can usurp pathways from their host in order to use them to their advantage in blocking host cellular pathways upon infection. The phage can achieve this by 1) directly producing an inhibitor of a key host pathway (e.g., T7 gene 0.5), 2) directly producing a novel activity (e.g., T4 DNA polymerase), and 3) altering concentrations of cell components by producing similar functions (e.g., T4 transfer RNAs). The identification of sequence similarity between phage ORFs and bacterial host genome sequences will be highly indicative of such a mechanism. (Selected examples of such homologies are listed in Figure 4 of the provisional application 60/110,992 and include orf4 (homologous to autolysin), orf20 (hypothetical protein from Staphylococcus aureus) and orf29 (hypothetical protein from Staphylococcus aureus).) These ORFs can be analyzed by a standard biochemical approach to directly test their inhibitor functions (as described below).
Alternatively, a homology search may reveal that a given phage ORF is related to a protein present in the databases having an activity known to be inhibitory, e.g., inhibition of host RNA polymerase by E. coli bacteriophage T7. Such a finding would implicate the phage ORF product in a related activity. This will also suggest that a new antimicrobial could be derived by a mimetic approach (e.g., a peptidomimetic) imitating this function or by a small molecule inhibitor to the bacterial target of the phage ORF, or any steps in the relevant host metabolic pathway (e.g., by high throughput screening of small molecule libraries). Selected examples of such similarity between ORFs of Staphylococcus aureus bacteriophage 77 and proteins with inhibitor functions for bacterial hosts are listed in Figure 4 of the provisional application 60/110,992.
These include orf9 (similar to bacteriophage P kilA function), and orf4 (autolysin of Staphylococcus aureus, amidase enzymatic activity).
A reason for the biochemical study of individual ORFs for inhibitor function is that their expression or overexpression will block cellular pathways of the host, ultimately leading to arrest and/or inhibition of host metabolism. In addition, such ORFs can alter host metabolism in different ways, including modification of pathogenicity. Therefore, individual ORFs identified above are expressed, preferably overexpressed, in the host and the effect of this expression or overexpression on host metabolism and viability is measured. This approach can be systematically applied to every ORF of the phage, if necessary, and does not rely on the absolute identification of candidate ORFs by bioinformatics. Individual ORFs are resynthesized from the phage genomic DNA, by the polymerase chain reaction (PCR), preferably using oligonucleotide primers flanking the ORF on either side. These single ORFs are preferably engineered so that they contain appropriate cloning sites at their extremities to allow their introduction into a new bacterial expression plasmid, allowing propagation in a standard bacterial host such as E. coli, but containing the necessary information for plasmid replication in the target microbe such as S. aureus (hereafter referred to as shuttle vector). Shuttle vectors and their use are well known in the art.
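The primer design implied by this cloning strategy can be sketched as follows. The sketch assumes a Bam HI site (GGATCC) placed immediately upstream of the start codon and a Hind III site (AAGCTT) immediately downstream of the stop codon, as described for Figure 2. The 18-nucleotide annealing length, the three-base 5' clamp, and all function names are illustrative assumptions, not values from the description.

```python
BAMHI = "GGATCC"    # Bam HI recognition sequence
HINDIII = "AAGCTT"  # Hind III recognition sequence


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def has_internal_site(orf, sites=(BAMHI, HINDIII)):
    """True if the ORF itself contains a cloning site and would be cut on digestion."""
    orf = orf.upper()
    return any(site in orf for site in sites)


def design_primers(orf, anneal_len=18, clamp="CGC"):
    """Return (forward, reverse) primers, written 5' to 3'.

    `orf` runs from the start codon through the stop codon on the coding strand.
    `clamp` is a hypothetical short 5' extension placed ahead of each site.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    forward = clamp + BAMHI + orf[:anneal_len]
    reverse = clamp + HINDIII + reverse_complement(orf[-anneal_len:])
    return forward, reverse
```

An ORF for which `has_internal_site` returns True would need a different pair of enzymes or a different cloning route, since digestion of the PCR product would cleave within the coding sequence.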
Such shuttle vectors preferably also contain regulatory sequences that allow inducible expression of the introduced ORF. As the candidate ORF may encode an inhibitor function that will eliminate the host, it is beneficial that it not be expressed prior to testing for activity. Thus, screening for such sequences when expressed in a constitutive fashion is less likely to be successful when the inhibitor is lethal. In the exemplary inducible system presented in Figures 1A, 1B, 2, and 7, regulatory sequences from the ars operon of S. aureus are used to direct individual ORF expression in S. aureus (or other bacteria in which the ars system is functional). The ars operon encodes a series of proteins which normally mediate the extrusion of arsenite and other trivalent oxyanions from the cells when they are exposed to such toxic substances in their environment. The operon encoding this detoxifying mechanism is normally silent and only induced when arsenite-related compounds are present (Tauriainen, S. et al. (1997) App. Env. Microb., Vol. 63, No. 11, p. 4456-4461). Therefore, individual phage ORFs can be expressed in S. aureus in an inducible fashion by adding to the culture medium non-toxic arsenite concentrations during the growth of individual S. aureus clones expressing such individual phage ORFs. Toxicity of the phage inhibitor ORF for the host is monitored by reduction or arrest of growth under induction conditions, as measured by optical density in liquid culture or after plating the induced cultures on solid medium. Subsequently, interference of the phage ORF with the host biochemical pathways ultimately leading to reduced or arrested host metabolism can be measured by pulse-chase experiments using radiolabeled precursors of either DNA replication, RNA transcription, or protein synthesis. Similar constructs can be made and used for other bacteria using well-known techniques.
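The optical density readout from the liquid culture assay can be reduced to a relative growth figure per ORF. The sketch below is illustrative: it assumes replicate optical density readings under induction keyed by ORF name, normalizes each ORF to a non-toxic control ORF set at 100% (as is done with ORF 5 in Figure 4), and flags ORFs falling below a cutoff. The 50% threshold, the function names, and any readings supplied to it are hypothetical.

```python
from statistics import mean


def relative_growth(od_by_orf, control="ORF5"):
    """Express mean optical density of each ORF as a percentage of the control."""
    control_mean = mean(od_by_orf[control])
    if control_mean <= 0:
        raise ValueError("control culture shows no growth")
    return {orf: 100.0 * mean(values) / control_mean
            for orf, values in od_by_orf.items()}


def flag_inhibitory(relative, threshold=50.0):
    """Return ORFs whose relative growth falls below `threshold` percent."""
    return sorted(orf for orf, pct in relative.items() if pct < threshold)
```

Running the same calculation on uninduced cultures gives a baseline against which induction-dependent inhibition can be distinguished from effects of the plasmid itself.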
Those skilled in the art are familiar with a variety of other inducible systems which can also be used for the controlled expression of phage ORFs, including, for example, lactose (see Stratagene's LacSwitchTM I system; La Jolla, CA) and tetracycline-based systems (see, e.g. Clontech's Tet On/Tet OffrM system; Palo Alto, CA). The arsenite-inducible system described is further depicted in Figures 1, 2 and 7.
The selection or construction of shuttle vectors and the selection and use of inducible systems are well known and thus other shuttle vectors appropriate for other bacteria can be readily provided by those skilled in the art, for use in other bacterial species.
Standard methodologies for expressing proteins from constructs, and isolating and manipulating those proteins, for example in cross-linking and affinity chromatography studies, may be found in various commonly available and known laboratory manuals. See, e.g., Current Protocols in Protein Science, John Wiley & Sons, Secaucus, and Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor University Press, Cold Spring, N.Y.
It has been found that certain phage or other viruses inhibit host cells, at least in part, by producing an antisense RNA which binds to and inhibits translation from a bacterial RNA sequence. Thus, in the case of potentially inhibitory RNA transcripts encoded by the phage genome, a strong indicator of a possible inhibitory function is provided by the identification of a phage sequence which is identical to or fully complementary (or with only a small percentage of mismatch, preferably less than most preferably less than to a bacterial sequence. This approach is convenient in the case of bacteria that have been essentially completely sequenced, as the comparison can be performed by computer using public database information.
The inhibitory effect of the transcript can be confirmed using expression of the phage sequence in a host bacterium. If needed, such inhibition can also be tested by transfecting the cells with a vector that will transcribe the phage sequence to form RNA in such manner that the RNA produced will not be translated into a polypeptide.
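The computer comparison mentioned above can be sketched as a naive window scan for phage segments that are complementary to a bacterial sequence. The window size, the mismatch allowance, and the function names are illustrative assumptions; a real analysis of whole genomes would normally rely on an indexed alignment tool rather than this quadratic loop.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_antisense_matches(phage: str, bacterial: str,
                           window: int = 20, max_mismatches: int = 1):
    """Return (phage_pos, bacterial_pos, mismatches) for every phage window
    whose reverse complement matches a bacterial window within the
    allowed number of mismatches."""
    phage = phage.upper()
    bacterial = bacterial.upper()
    hits = []
    for i in range(len(phage) - window + 1):
        # Sequence that this phage window would base-pair with.
        probe = reverse_complement(phage[i:i + window])
        for j in range(len(bacterial) - window + 1):
            mismatches = sum(a != b for a, b in zip(probe, bacterial[j:j + window]))
            if mismatches <= max_mismatches:
                hits.append((i, j, mismatches))
    return hits
```

Searching for identical rather than complementary segments would use the same loop with the phage window itself as the probe.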
Inhibition under such conditions provides a strong indication that the inhibition is due to the transcript rather than to an encoded polypeptide.
In an alternative, the expression of an ORF in a host bacterium is found to be inhibitory, but the inhibition is found to be due to an RNA product of the genomic coding region. For antisense inhibition, the sequence of the bacterial target nucleic acid sequence can be identified by inspection of the phage sequence, and the full sequence of the relevant coding region for the bacterial product can be found from a database of the bacterial genomic sequence or can be isolated by standard techniques (e.g., a clone in a genomic library can be isolated which contains the full bacterial ORF, and then sequenced).
In either case, the identification of a target which is inhibited by an RNA transcript produced by a phage provides both the possible inhibition of bacteria naturally containing the same target nucleic acid sequence, as well as the ability to use the target sequence in screening for other types of compounds which will act directly on the target nucleic acid sequence or on a polypeptide product expressed or regulated, at least in part, by the target of the inhibitory phage RNA.
In some cases it will be found that the target of an inhibitory phage RNA or protein has previously been found to be a target for an antibacterial agent. In such cases, the phage inhibitor can still provide useful information if it is found that the phage-encoded product acts at a different site than the previously identified antibacterial agent or inhibitor, e.g., acts at a phage-specific site. For many targets, action at a different site provides highly beneficial characteristics and/or information.
For example, an alternate site of inhibitor action can at least partially overcome a resistance mechanism in a bacterium. As an illustration, in many cases, resistance is due, in large part, to altered binding characteristics of the immediate target to the antibacterial agent. The altered binding is due to a structural change which prevents or destabilizes the binding. However, the structural change is frequently quite local, so that compounds which bind at different local sites will be unaffected or affected to a much lesser degree. Indeed, in some cases the local sites will be on a different molecule and so may be completely unaffected by the local structural change creating resistance to the original agent(s). An example of resistance due to altered binding is provided by methicillin-resistant Staphylococcus aureus, in which the resistance is due to an altered penicillin-binding protein.
In other cases, a new site of action can have improved accessibility as compared to a site acted on by a previously identified agent. This can, for example, assist in allowing effective treatment at lower doses, or in allowing access by a larger range of types of compounds, potentially allowing identification of more potential active agents.
Another advantage is that the structural characteristics of a different site of action will lead to identification and/or development of inhibitors with different structures and different pharmacological parameters. This can allow a greater range of possibilities when selecting an antibacterial agent.
Yet further, different sites often produce different inhibitory characteristics in the target organism. This is commonly the case for multi-domain target proteins.
Thus, inhibition targeting an alternate site can produce more efficacious action, e.g., faster killing, slower development of resistance, lower numbers of surviving cells, and different secondary effects (for example, different nutrient utilization).
Staphylococcus aureus phage 77
As indicated above, the present invention is concerned, in part, with the use of bacteriophage 77 coding sequences and the encoded polypeptides or RNA transcripts to identify bacterial targets for potential new antibacterial agents.
As described, phage 77 ORFs 17, 19, 43, 102, 104, and 182 have been found to have bacteria inhibiting function. Identification of ORFs 17, 19, 43, 102, 104, and 182 and products from the phage which inhibit the host bacterium both provides an inhibitor compound and allows identification of the bacterial target affected by the phage-encoded inhibitor. Such a target is thus identified as a potential target for development of other antibacterial agents or inhibitors and the use of those targets to inhibit those bacteria. As indicated above, even if such a target is not initially identified in a particular bacterium, such a target can still be identified if a homologous target is identified in another bacterium. Usually, but not necessarily, such another bacterium would be a genetically closely related bacterium. Indeed, in some cases, an inhibitor encoded by phage 77 ORF 17, 19, 43, 102, 104, or 182 can also inhibit such a homologous bacterial cellular component. Possible bacterial target sequences are described herein by reference to sequence source sites. In preferred embodiments, the sequence encoding the target corresponds to a S. aureus nucleic acid sequence available from numerous sources including S. aureus sequences deposited in GenBank, S. aureus sequences found in European Patent Application No. 97100110.7 to Human Genome Sciences, Inc. filed January 7, 1997, S. aureus sequences available from TIGR at http://www.tigr.org/tdb/mdb/mdb.html, and S. aureus sequences available from the Oklahoma University S. aureus sequencing project at the following URL: http://www.genome.ou.edu/staph new.html. Such possible targets are particularly applicable to S. aureus phages 77, 3A, 96, and 44 AHJD.
The amino acid sequence of a polypeptide target is readily provided by translating the corresponding coding region. For the sake of brevity, the sequences are not reproduced herein. Also, in preferred embodiments, a target sequence corresponds to a S. aureus coding sequence corresponding to a sequence listed in Table 15 herein. The listing in Table 15 describes S. aureus sequences currently listed with GenBank. Again, for the sake of brevity, the sequences are described by reference to the database accession numbers instead of being written out in full herein.
In cases where an entry for a coding region is not complete, the complete sequence can be readily obtained by routine methods, e.g., by isolating a clone in a phage host S. aureus genomic library, and sequencing the clone insert to provide the relevant coding region. The boundaries of the coding region can be identified by conventional sequence analysis and/or by expression in a bacterium in which the endogenous copy of the coding region has been inactivated and using subcloning to identify the functional start and stop codons for the coding region.
Staphylococcus aureus phage 44 AHJD
The present invention also can utilize the identification of naturally occurring DNA sequence elements within Staphylococcus aureus bacteriophage 44AHJD which encode proteins with antimicrobial activity.
Such identification can utilize bioinformatics identification of specific proteins (ORFs) utilized by Staphylococcus aureus bacteriophage 44AHJD during the viral life cycle, resulting in a slowing or arrest of growth of the bacterial host, or in death of the Staphylococcus aureus host, including lysis of the infected bacteria. Thus, some of the bacteriophage 44AHJD DNA sequences encoding these proteins (ORFs) are predicted to encode antimicrobial functions. Information derived from these DNA sequences and translated ORFs can, in turn, be utilized to develop inhibitory compounds by peptidomimetics that can also function as antimicrobials. In addition, the identification of the host bacterial proteins that are targeted and inhibited by the antimicrobial bacteriophage ORFs can itself provide novel targets for drug discovery.
The methodology described above is used to identify and characterize DNA sequences from Staphylococcus sp. bacteriophage 44 AHJD that have antimicrobial activity. As described in the Examples, the Staphylococcus aureus propagating strain (PS 44A), obtained from the Felix d'Herelle Reference Centre (#HER 1101), was used as a host to propagate its phage 44AHJD, also obtained from the Felix d'Herelle Reference Centre (#HER 101). By sequencing, we found that bacteriophage 44AHJD consists of 16,668 bp (Table 16) predicted to encode 73 ORFs greater than 33 amino acids (Tables 17 and 18). Computational analysis of the predicted protein products of Staphylococcus aureus bacteriophage 44AHJD identified homologs in public sequence databases as listed in Tables 19 and 20, along with the accompanying list of related proteins.
From this analysis, it is apparent that 3 genes (ORF 3, 7, and 8) are related to structural proteins found in other bacteriophages. These include genes predicted to encode a tail protein, an upper collar/connector protein of the phage virion, and a lower collar protein. Bioinformatics has also identified one gene whose product is likely involved in phage DNA synthesis. One gene (ORF 1) shows significant homology to DNA polymerases of a number of bacteriophages, bacteria and fungi, and the product of this gene is likely responsible for replicating the genetic material of bacteriophage 44AHJD. ORF 2 encodes a protein with homology to the dinC gene of Bacillus subtilis that encodes a protein involved in teichoic acid biosynthesis. Teichoic acid is a polyphosphate polymer found in some, but not all, Gram positive organisms (and not in Gram negative organisms), where it is attached to the peptidoglycan layer. The phage protein may thus be involved in the synthesis of this material for incorporation into the cell wall, allowing enhanced lysis by the phage lysis enzymes or, as many enzymes can function in "reverse reactions", may be involved in its degradation, allowing for penetration of the peptidoglycan and phage genome entry into the cell following adsorption. The similarity between Staphylococcus aureus bacteriophage 44AHJD and E. coli phage T7 indicates that they may share similar mechanisms of replication and growth. Both phages belong to the Podoviridae Family of bacteriophages and are members of the "T7-like" Genus of this Family (Ackermann and DuBow; VIth ICTV Report).
Two genes, ORF 9 and 12, were identified with the potential to encode antimicrobial protein products. The homology alignments are shown in Tables 19 and 20. The predicted product of ORF 9 is related to a class of genes which encodes lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide cell wall structure of a variety of micro-organisms, including that from the Staphylococcus aureus bacteriophage Twort. ORF 12 of Staphylococcus aureus bacteriophage 44AHJD shows homology to a set of lysis proteins from several bacteriophages. These lysis proteins are also referred to as holins, and represent phage-encoded lysis functions required for transit of the phage murein hydrolases (lysozyme) to the periplasm, where they can digest the cell wall and thus lyse the bacterium.
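The ORF prediction step summarized above (ORFs greater than 33 amino acids) can be illustrated with a simple six-frame scan. This is a sketch under stated assumptions: ORFs are taken to start at the first ATG in a frame and to end at the next in-frame stop codon, alternative start codons are ignored, and coordinates on the minus strand refer to the reverse-complemented sequence. The criteria actually used to produce the tables are not given in this passage and may differ.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(dna: str, min_aa: int = 34):
    """Return (strand, start, end, n_aa) for ATG-initiated ORFs encoding at
    least min_aa amino acids (default 34, i.e. greater than 33).

    start/end are 0-based, end-exclusive positions on the scanned strand,
    and end includes the stop codon.
    """
    dna = dna.upper()
    results = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first start codon in this open stretch
                elif start is not None and codon in STOP_CODONS:
                    n_aa = (pos - start) // 3  # stop codon not counted
                    if n_aa >= min_aa:
                        results.append((strand, start, pos + 3, n_aa))
                    start = None
    return results
```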
Thus, in particular embodiments, the present invention provides a nucleic acid sequence isolated from Staphylococcus aureus bacteriophage 44AHJD comprising at least a portion of one of the genes described above with antimicrobial activity. For example, ORF 1 encodes a DNA polymerase function. This polymerase may utilize host-derived accessory proteins for its activity when replicating the phage template, sequestering such proteins from use by the bacterial polymerase, resulting in inhibition of DNA replication, cell division, and cell growth. Alternatively, ORF 9 directly encodes a polypeptide with antimicrobial activity. ORF 9 is predicted to encode an amidase, a protein known to act as a cell wall degrading enzyme. ORF 12 likely encodes a holin function required for transit of the phage amidase (gene 9 product) to the periplasm. When this type of gene product from Bacillus phage phi 29 (gene 14), was cloned in Escherichia coli, cell death ensued (Steiner et al., 1993).
Thus, production of proteins from Bacillus phage phi 29 gene 14 in E. coli resulted in cell death, whereas production of protein from Bacillus phage phi 29 gene 14 concomitantly with the phi 29 lysozyme or unrelated murein-degrading enzymes led to lysis, suggesting that membrane-bound protein 14 induces a nonspecific lesion in the cytoplasmic membrane (Steiner et al., 1993).
The present invention also provides the use of the Staphylococcus bacteriophage 44 AHJD antimicrobial ORFs or ORF products as pharmacological agents, either wholly or in part and derivatives, as well as the use of corresponding peptidomimetics, developed from amino acid or nucleotide sequence knowledge derived from Staphylococcus bacteriophage 44 AHJD killer ORFs.
Enterococcus phage 182
Bacteriophage 182 was obtained from the Felix D'Herelle phage collection (Ste. Foy, Quebec) and infects Enterococcus sp. Group D. The genome of Enterococcus bacteriophage 182 consists of 17,833 bp (Table 21) and is predicted to encode 80 ORFs greater than 33 amino acids (Tables 22 and 23). Computational analysis of the predicted protein products of Enterococcus bacteriophage 182 was performed in order to identify protein products related to those deposited in public databases. Bacteriophage 182 protein products which detected sequences with significant sequence similarity in public databases are listed in Tables 24 and 26, along with the accompanying list of related proteins.
From this analysis, it is apparent that 5 genes (ORF 001, 004, 007, 009, and 011) are related to structural proteins of several Bacillus phages (e.g., Bacillus bacteriophage PZA, phi-29, and B103). These include genes predicted to encode a tail protein (ORF 001), a head protein (ORF 004), an upper collar protein (ORF 007), a lower collar protein (ORF 009), and a pre-neck appendage protein (ORF 011). Two genes are predicted to encode products which direct phage morphogenesis; these are ORF 005 and 019.
Bioinformatics has also identified three genes whose products are likely involved in phage DNA synthesis. One gene, ORF 002, shows significant homology to DNA polymerases of a number of bacteriophages, and the product of this gene is likely responsible for replicating the genetic material of bacteriophage 182. ORF 006 encodes a protein with homology to the encapsidation proteins of several other bacteriophages, including Bacillus phage phi-29 (P11014), PZA (P07541), and B103 (X99260) and Streptococcus phage CP-1 (Z47794). These gene products catalyze the in vivo and in vitro genome-encapsidation reaction (Garvey et al., 1985). Proteins involved in genome packaging have been shown to have additional activities that affect biochemical reactions in other phages and their hosts. For example, the coat protein of the RNA bacteriophage MS2 interacts with viral RNA to translationally repress replicase synthesis (Pickett and Peabody, 1993). This protein-RNA interaction also plays a role in genome encapsidation, enveloping a single copy of the viral genome in a protein shell composed of many molecules of coat protein. In addition, the bacteriophage λ terminase enzyme can be lethal to E. coli when expressed, suggesting cleavage of packaging sites in the bacterial chromosome. Also present within bacteriophage 182 is a gene, ORF 010, that encodes a protein that is related to the terminal proteins of Bacillus phage Nf (P06812), Bacillus phage GA-1 (X96987) and Bacillus phage B103 (X99260). DNA terminal proteins are linked to the 5' ends of both strands of the genome and are essential for DNA replication, playing a role in initial priming of DNA replication. The similarity between Enterococcus bacteriophage 182 and Bacillus phages phi-29, PZA, and B103 indicates that they may share similar mechanisms of replication and growth.
Protein-primed DNA replication is a well described phenomenon, and in the phi-29-like phages, the ends of the DNA serve as origins and termini of replication (Gutiérrez et al., 1986; Yoshikawa et al., 1985).
There is also a gene (ORF 015) that encodes a protein showing homology to an early protein product of Bacillus bacteriophage PZA and the single-strand nucleic acid binding protein of bacteriophage B103.
Two genes, ORF 008 and 014, were identified with the potential to encode anti-microbial protein products. The homology alignments are shown in Tables 24 and 26 and biochemical features of the predicted polypeptides are shown in Table 25. The predicted product of ORF 008 is related to a class of genes which encodes lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide cell wall structure of a variety of micro-organisms. ORF 014 of Enterococcus phage 182 shows homology to a set of lysis proteins from Bacillus bacteriophage phi-29, PZA, and B103. These lysis proteins are also referred to as holins and represent phage-encoded lysis functions required for transit of the phage murein hydrolases (lysozyme) to the periplasm, where they can digest the outer cell wall and thus lyse the bacterium.
Thus, the present invention provides a nucleic acid sequence obtained from Enterococcus bacteriophage 182 comprising at least a portion of a phage 182 ORF, preferably an inhibitory ORF, and more preferably at least a portion of one of the genes described above with anti-microbial activity. For example, ORF 002 encodes a DNA polymerase function. This polymerase may utilize host-derived accessory proteins for its activity when replicating the phage template, sequestering such proteins from use by the bacterial polymerase, resulting in inhibition of DNA replication, cell division, and cell growth. Alternatively, ORFs 008 or 014 directly encode polypeptides with anti-microbial activity. ORF 008 is predicted to encode an autolytic lysozyme, a protein known to have anti-microbial activity (Martin et al., 1998). ORF 014 likely encodes a holin function required for transit of the phage murein hydrolases to the periplasm. When the related product from Bacillus phage phi 29 (gene 14) was cloned in Escherichia coli, cell death ensued (Steiner et al., 1993).
Thus, production of proteins from Bacillus phage phi 29 gene 14 in E. coli resulted in cell death, whereas production of protein from Bacillus phage phi 29 gene 14 concomitantly with the phi 29 lysozyme or unrelated murein-degrading enzymes led to lysis, suggesting that membrane-bound protein 14 induces a nonspecific lesion in the cytoplasmic membrane (Steiner et al., 1993).
The present invention also provides the use of the Enterococcus bacteriophage 182 anti-microbial ORFs as pharmacological agents, either wholly or in part and derivatives, as well as the use of corresponding peptidomimetics, developed from amino acid or nucleotide sequence knowledge derived from Enterococcus bacteriophage 182 killer ORFs. This can be done where the structure of the peptidomimetic compound corresponds to the structure of the active portion of a product of an ORF. In this analysis, the peptide backbone is transformed into a carbon based hydrophobic structure that can retain cytostatic or cytocidal activity for the bacterium. This is done by standard medicinal chemistry methods, measuring growth inhibition of the various molecules in liquid cultures or on solid medium. These mimetics also represent lead compounds for the development of novel antibiotics. In this context, "corresponds" means that the peptidomimetic compound structure has sufficient similarities to the structure of the active portion of a product of one of the Enterococcus ORFs listed, that the peptidomimetic will interact with the same molecule as the product of the ORF, and preferably will elicit at least one cellular response in common which relates to the inhibition of the cell by the phage protein.
To validate the identity of an ORF as a killer ORF, it is preferably expressed in the host or other test bacterial organism and the effect of this expression on bacterial growth and replication is assessed. Therefore, all individual ORFs identified herein (e.g., those identified above) can be expressed, preferably overexpressed, in a suitable host bacterium (e.g., a host Enterococcus), and the effect of this expression or overexpression on host metabolism and viability can be measured.
Individual ORFs can be resynthesized from the phage genomic DNA by the polymerase chain reaction (PCR) using oligonucleotide primers flanking the ORF on either side. Those skilled in the art are familiar with the design and synthesis of appropriate primer sequences. These single ORFs are preferably engineered so that they contain appropriate cloning sites at their extremities to allow their introduction into a new bacterial expression plasmid, allowing propagation in a standard bacterial host such as E. coli, but containing the necessary information for plasmid replication in the target microbe, Enterococcus sp. (hereafter referred to as a shuttle vector).
This shuttle vector also preferably contains regulatory sequences that allow inducible expression of the introduced ORF. As the candidate ORF may encode a killer function that will eliminate the host, it is highly advantageous that it not be expressed (or at least not expressed at a substantial level) prior to testing for activity; thus screening for such sequences in a constitutive fashion is less likely to be successful (lethality). In an example presented in Fig. 7, regulatory sequences from the ars operon are used to direct individual ORF expression in Enterococcus. The ars operon encodes a series of proteins which normally mediate the extrusion of arsenite and several other trivalent oxyanions from the cells when they are exposed to such toxic substances in their environment. The operon encoding this detoxifying mechanism is normally silent and only induced when arsenite-related compounds are present.
Therefore, individual phage ORFs can be expressed in Enterococcus or other suitable host in an inducible fashion by adding to the culture medium non-toxic arsenite concentrations during the growth of individual Enterococcus (or other host cell) clones expressing such individual phage ORFs. Toxicity of the phage killer ORF for the host is monitored by reduction or arrest of growth under induction conditions, as measured by optical density in liquid culture or after plating the induced cultures on solid medium. Subsequently, interference of the phage ORF with the host biochemical pathways ultimately leading to reduced or arrested host metabolism can be measured by pulse-chase experiments using radiolabeled precursors of either DNA replication, RNA transcription, or protein synthesis.
Of course, other inducible regulatory sequences (e.g., promoters, operators, etc.) may be used (e.g., systems using positive induction of expression or systems using release of repression). A variety of such systems are known to those skilled in the art and can be utilized in the present invention.
Nucleic acid sequences of the present invention can be isolated using a method similar to those described herein or other methods known to those skilled in the art.
In addition, such nucleic acid sequences can be chemically synthesized by well-known methods. Having the phage 182 ORFs, anti-bacterial ORFs of the present invention, portions thereof, or oligonucleotides derived therefrom as described, other anti-microbial sequences from other bacteriophage sources can be identified and isolated using methods described here or other methods, including methods utilizing nucleic acid hybridization and/or computer-based sequence alignment methods.
The invention also provides bacteriophage anti-microbial DNA segments from other phages based on nucleic acids and sequences hybridizing to the presently identified inhibitory ORF under high stringency conditions or sequences which are highly homologous. The bacteriophage anti-microbial DNA segment from bacteriophage 182 can be used to identify a related segment from another unrelated phage based on stringent conditions of hybridization or on being a homolog based on nucleic acid and/or amino acid sequence comparisons. As with the phage 182 inhibitory sequences, such homologous coding sequences and products can be used as antimicrobials, to construct active portions or derivatives, to construct peptidomimetics, and to identify bacterial targets.
Enterococcus sequences are listed in Table 27 by accession number, providing identification of possible targets of Enterococcus phage inhibitory ORF products, e.g., from phage 182.
Streptococcus pneumoniae
As indicated in the Summary above, the present invention is concerned with the use of Streptococcus sp. bacteriophage Dp-1 coding sequences and the encoded polypeptides or RNA transcripts to identify bacterial targets for potential new antibacterial agents.
Streptococcus pneumoniae is an important cause of community-acquired pneumonia and a major cause of otitis media, sinusitis, and meningitis in children and adults. In Spain and other Mediterranean countries, the majority of S. pneumoniae isolates are relatively resistant to penicillin (Klugman, 1990; Fenoll et al., 1991; Jorgensen et al., 1990). These strains also have decreased susceptibility to broad-spectrum cephalosporins, which are frequently used in the empiric treatment of meningitis and other serious invasive bacterial infections. High-level resistance of pneumococci has been encountered in Hungary where 70% of children who were colonized with S. pneumoniae carried penicillin resistant strains that were also resistant to tetracycline, erythromycin, trimethoprim/sulfamethoxazole, and 30% resistant to chloramphenicol (Neu, 1992). The resistance of pneumococci to macrolides such as erythromycin averages 20-25% in France, ~20% in Japan, and <10% in Spain (Neu, 1992).
The antimicrobial susceptibilities and distribution of serotypes of the 42 isolates of S. pneumoniae in southern Taiwan from invasive infections have been recently determined (Hseuh et al., 1996). Resistance rates among these isolates were: erythromycin, 61.9%; clindamycin, 47.6%; chloramphenicol, 19%; and tetracycline, 73.8%. Resistance to three or more classes of antibiotics was found in 33.3% of the isolates. Bacteremic pneumonia and primary bacteremia accounted for 64.3% of the infections and mortality was 42.6%. Given the severity of these infections despite adequate antibiotic therapy, there is clearly a need for introduction of new therapeutic options to prevent mortality due to invasive S. pneumoniae infections.
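As a quick arithmetic check, the reported resistance rates can be converted back to whole numbers of isolates. The sketch below assumes that each percentage was computed over all 42 isolates; that denominator is an assumption for the example, and the bacteremia and mortality figures are left out because the passage does not state their denominators.

```python
TOTAL_ISOLATES = 42  # number of isolates stated in the text

# Resistance rates as reported (percent of isolates).
REPORTED_RATES = {
    "erythromycin": 61.9,
    "clindamycin": 47.6,
    "chloramphenicol": 19.0,
    "tetracycline": 73.8,
    "three or more classes": 33.3,
}


def implied_count(percent: float, total: int = TOTAL_ISOLATES) -> int:
    """Nearest whole number of isolates corresponding to a percentage."""
    return round(percent * total / 100)


def is_consistent(percent: float, total: int = TOTAL_ISOLATES) -> bool:
    """True if the implied count reproduces the percentage to one decimal."""
    count = implied_count(percent, total)
    return abs(100 * count / total - percent) < 0.05


implied_counts = {drug: implied_count(pct) for drug, pct in REPORTED_RATES.items()}
```

Under that assumption the rates correspond to 26, 20, 8, 31, and 14 isolates respectively, each of which rounds back to the reported percentage.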
Pneumococcal phages belong to four families and they present a great variety in morphology, including lytic and temperate phages (for a review, see Garcia et al., 1997). Examples of lytic phages are Cp-1 and Dp-1, whereas examples of temperate phages are HB-3, EJ-1, and HB-746. The complete nucleotide sequence and functional organization of Cp-1 has been reported (Martin et al., 1996). Cp-1 has a 19,345 bp double-stranded DNA genome, with a terminal protein covalently linked to its 5' ends, that replicates by a protein primed mechanism. The phage contains 29 ORFs, 23 on one strand and 6 on the opposite. When these predicted proteins were compared to sequences compiled in GenBank EMBL databases, to ORFs showed significant similarity to proteins of bacteriophage 29 that infects B. subtilis (Martin et al., 1996). The similar proteins corresponded to those involved in DNA replication (terminal protein and DNA polymerase), structural and morphogenic proteins (major head, collar, connector, tail, and encapsidation proteins), and proteins involved in lysis function (holin and lysozyme). In its strategy of lysis, the holin gene product inserts itself into the cell membrane, allowing access of the lysozyme to the peptidoglycan..
Expression of the Cp-1 holin protein in E. coli results in cell death after 2-hours of induction, but did not lead to lysis (Garcia et al., 1997). Cells harboring a plasmid construction with holin and lysozyme genes together did lyse after induction and the WO 00/32825 PCT/IB99/02040 viability loss was similar to that of the culture expressing holin alone. Cloning of these lytic genes in S. pneumoniae showed that both genes had the same effect as in E.
coli. That is, holin itself did not lyse the culture but the viability loss was noticeable, whereas both holin and lysozyme together were capable of lysing M31, an amidase deleted mutant (Garcia et al., 1997).
Recently, a small portion kbp) of a second S. pneumoniae phage, Dp-1, has been sequenced (Sheehan et al., 1997). This portion contains the genes coding for the lytic system (Sheehan et al., 1997) and shows a modular organization similar to that described for Cp-l. However, in this case, a single chimeric protein appears to be made in which the N-terminal domain is highly similar to that of the murein hydrolase coded by a gene found in the phage BK5-T that infects Lactococcus lactis, and the Cterminal domain is homologous to holins. Thus, both functions appear to have been combined in a novel chimeric protein.
Bacteriophage Dp-1 was obtained from Dr. P. Garcia (Departamento de Microbiologia Molecular, Centro de Departamento de Investigaciones Biologicas, Consejo Superior de Investigaciones Cientificas, Velazquez, Madrid, Spain). We found that Dp-1 has a double-stranded DNA genome of 56,506 bp, predicted to encode 85 ORFs greater than 33 amino acids and with upstream Shine-Dalgarno motifs for translation initiation (Tables 28 30, and Fig. Computational analysis of the predicted protein products of Streptococcus bacteriophage Dp-1 protein products, which detected homologs in public databases, are listed inTable 31, along with the accompanying list of related proteins.
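The additional Shine-Dalgarno criterion mentioned above can be illustrated with a simple upstream motif check. The core motif (GGAGG), the allowed spacing of 4 to 12 nucleotides between the motif and the start codon, and the function name are assumptions chosen for this sketch; the passage does not state the motif definition or spacing that were actually applied.

```python
def has_shine_dalgarno(genome: str, start_codon_pos: int,
                       motif: str = "GGAGG",
                       min_spacing: int = 4, max_spacing: int = 12) -> bool:
    """Return True if `motif` occurs upstream of the start codon with
    min_spacing..max_spacing nucleotides between the end of the motif and
    the first base of the start codon.

    start_codon_pos is the 0-based index of the first base of the start codon.
    """
    genome = genome.upper()
    for spacing in range(min_spacing, max_spacing + 1):
        motif_end = start_codon_pos - spacing   # index just past the motif
        motif_start = motif_end - len(motif)
        if motif_start >= 0 and genome[motif_start:motif_end] == motif:
            return True
    return False
```

A filter of this kind would be applied to each candidate ORF after the length cut-off, keeping only those whose start codon is preceded by a plausible ribosome binding site.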
From this analysis, it is apparent that several predicted genes of Dp-1 encode polypeptides that are related to structural proteins. ORFs 001, 002, 004, and 030 are predicted to encode tail proteins, minor structural proteins, and minor capsid proteins (Table 31). We also note the identification of several gene products that are likely involved in DNA synthesis. These include ORF 3, which encodes DNA polymerase; ORF 8, which encodes a SWI/SNF helicase-related protein; ORF 10, which encodes a protein showing homology to recA; and ORF 13, a dnaZX-like ORF. In E. coli, RapA is an RNA polymerase (RNAP)-associated protein with ATPase activity and is a homolog of the eukaryotic SWI/SNF family, a set of proteins whose members are involved in transcription activation, nucleosome remodeling, and DNA repair. RapA forms a stable complex with RNAP, as if it were a subunit of RNAP, and it is possible that the ORF 8 product behaves similarly or acts in a dominant-negative fashion to inhibit the activity of RapA. Mutation of the essential E. coli dnaZX gene results in a block in DNA chain elongation during replication (Maki et al., 1988). The dnaZX gene has only one open reading frame for a 71-kDa polypeptide, from which the two distinct DNA polymerase III holoenzyme subunits, tau (71 kDa) and gamma (47 kDa), are produced. The tau subunit is the precursor of the gamma subunit, and the gamma subunit is produced by a -1 frameshift causing early termination of translation (Tsuchihashi et al., 1990). These proteins show single-strand DNA binding properties that are ATPase (and dATPase) dependent and are thought to increase the processivity of the core DNA polymerase enzyme (Lee et al., 1987).
There are several Dp-1 ORFs which encode proteins predicted to play a role in cellular metabolic pathways. These include polypeptides involved in coenzyme PQQ synthesis (ORFs 20, 29, 38). Pyrrolo-quinoline quinone (PQQ) is the non-covalently bound prosthetic group of many quinoproteins catalysing reactions in the periplasm of Gram-negative bacteria. Most of these involve the oxidation of alcohols or aldose sugars. Interestingly, ORFs 20, 29, and 30 also show homology to the exoenzyme S regulon (Frank, 1997). Proteins encoded by the P. aeruginosa exoenzyme S regulon may be involved in a contact-mediated translocation mechanism to transfer anti-host factors directly into eukaryotic cells disrupting eukaryotic signal transduction through ADP-ribosylation (Frank, 1997).
There is also a protein with similarity to GTP cyclohydrolase I (ORF 21) and ORF 41 which shows homology to dUTPase (Table 31). GTP cyclohydrolase I is an enzyme that catalyzes the first reaction in the pathway for the biosynthesis of the pteridine, a cofactor of the monooxygenases of the aromatic amino acids. Disruption of the homologous gene in Saccharomyces cerevisiae leads to a recessive conditional lethality due to folinic acid auxotrophy, that can be complemented with the mammalian or bacterial GTP cyclohydrolase I enzymes (Nardese et al., 1996; Mancini et al., 1999).
ORF 16 shows high homology to autolysin. This region of the phage sequence was previously reported (Sheehan et al., 1997) and encompasses 4 kbp of our sequence. The sequence published by Sheehan et al. (1997) is shown in Table 32.
Thus, the present invention provides a nucleic acid sequence obtained from Streptococcus bacteriophage Dp-1 comprising at least a portion of a phage Dp-1 ORF, preferably an inhibitory ORF, and more preferably at least a portion of one of the genes described above with anti-microbial activity. For example, ORF 013 encodes a protein with homology to the gamma subunit of DNA polymerase (dnaX gene). This protein may act in a dominant-negative fashion to sequester the host DNA polymerase for its own replication, thus inhibiting host DNA replication. The dnaX gene product is essential for E. coli replication (Kodaira et al., 1983).
In certain preferred embodiments of the present invention, the bacterial target of a bacteriophage inhibitor ORF product, e.g., an inhibitory protein or polypeptide, is encoded by a Streptococcus nucleic acid coding sequence from a host bacterium for bacteriophage Dp-1. As above, possible target sequences are described herein by reference to sequence source sites. The sequence encoding the target preferably corresponds to a Streptococcus nucleic acid sequence available from The Institute for Genomic Research (TIGR), or available from GenBank or another public database. The TIGR Streptococcus sequences are publicly available from The Institute for Genomic Research at URL: http://www.tigr.org. The amino acid sequence of a polypeptide target is readily provided by translating the corresponding coding region. For the sake of brevity, the sequences are not reproduced herein. Also, in preferred embodiments, a target sequence corresponds to a Streptococcus pneumoniae coding sequence listed in Table 33 herein. Sequences for other Streptococcal species are also available from TIGR and/or from GenBank. The listing in Table 33 describes Streptococcus sequences currently deposited in GenBank. Again, for the sake of brevity, the sequences are described by reference to the GenBank entries instead of being written out in full herein. In cases where the TIGR or GenBank entry for a coding region is not complete, the complete sequence can be readily obtained by routine methods, e.g., by isolating a clone in a phage Dp-1 host Streptococcus sp. genomic library, and sequencing the clone insert to provide the relevant coding region. The boundaries of the coding region can be identified by conventional sequence analysis and/or by expression in a bacterium in which the endogenous copy of the coding region has been inactivated, using subcloning to identify the functional start and stop codons for the coding region.
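As an illustration of how an amino acid sequence is obtained by translating a coding region, the following is a minimal sketch using the standard genetic code. The function name and the handling of alternative start codons are assumptions made for this example and are not taken from the source; the example sequences in the usage note are invented.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding region (DNA or RNA) up to the first stop codon."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3:
        raise ValueError("coding region length must be a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":          # stop codon ends translation
            break
        protein.append(aa)
    # In bacteria, the alternative start codons GTG and TTG are read as
    # methionine at the initiation position.
    if protein and cds[:3] in ("GTG", "TTG"):
        protein[0] = "M"
    return "".join(protein)
```

For example, `translate("ATGGCTAAATAA")` gives `"MAK"`, and a coding region beginning with GTG is reported with an initial methionine.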
In the various aspects of this invention involving Dp-1 sequences, the sequence is preferably not contained in the sequence described in Sheehan et al., 1997 (Table 32).
Validating Identified Inhibitory Phage ORFs
A fifth step involves validating the identified phage inhibitor ORF by independent methods, and delineating further possible smaller segments of the ORFs that have inhibitory activity. Several methods exist to validate the role of the identified ORF as an inhibitor ORF.
One example utilizes the creation of a mutant variant of the phage ORF in which the candidate ORF carries a partial or complete loss-of-function mutation that is measurable as compared with the non-mutant ORF. Comparison of the effects of expression of the loss-of-function mutant with those of the normal ORF provides confirmation of the identification of an inhibitor ORF where the loss-of-function mutant provides a measurably lower level of inhibition, preferably no inhibition. The loss of function may be conditional, e.g., temperature sensitive.
Once validation of the inhibitor ORF is achieved, a bi-directional deletion analysis can be carried out using the same experimental system to identify the minimal polypeptide segment that has inhibitor activity. This may be carried out by a variety of means, e.g., by exonuclease or PCR methodologies, and is used to determine if a relatively small segment of the ORF (i.e., the product of the ORF) still possesses inhibitory activity when isolated away from its native sequence. If so, a portion of the ORF encoding this "active portion" can be used as a template for the synthesis of novel anti-microbial agents, further allowing derivation of the peptide sequence using modified peptides and/or peptidomimetics.
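The design of a bi-directional deletion series can be sketched computationally as follows. This is only an illustration of how in-frame truncations from both ends might be enumerated: the step size, the minimum fragment length, and the `is_inhibitory` callback (a stand-in for the experimental inhibition assay) are hypothetical and are not specified in the source.

```python
def deletion_series(cds, step=10, min_len=10):
    """Yield (n_del, c_del, fragment) for nested in-frame deletions.

    cds     : coding sequence without the stop codon
    step    : codons removed per deletion step (assumed value)
    min_len : smallest fragment, in codons, worth testing (assumed value)
    """
    n = len(cds) // 3
    for n_del in range(0, n, step):               # codons removed from the N-terminal end
        for c_del in range(0, n - n_del, step):   # codons removed from the C-terminal end
            if n - n_del - c_del >= min_len:
                yield n_del, c_del, cds[3 * n_del: 3 * (n - c_del)]


def minimal_active(cds, is_inhibitory, step=10, min_len=10):
    """Return (length_nt, n_del, c_del) of the shortest active fragment, or None.

    is_inhibitory is a hypothetical callback standing in for the wet-lab assay.
    """
    active = [(len(frag), n_del, c_del)
              for n_del, c_del, frag in deletion_series(cds, step, min_len)
              if is_inhibitory(frag)]
    return min(active) if active else None
```

In practice each fragment would be cloned and assayed in the same expression system used to identify the inhibitor ORF, and the callback would be replaced by those measurements.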
In creation of certain peptidomimetics, the peptide backbone is transformed into a carbon-based hydrophobic structure that can retain inhibitor activity against the bacterium. This is done by standard medicinal chemistry methods, typically monitored by measuring growth inhibition of the various molecules in liquid cultures or on solid medium. These mimetics can also represent lead compounds for the development of novel antibiotics.
Recently, a major effort has been undertaken by the pharmaceutical industry and their biotechnology partners for the sequencing of bacterial pathogen genomes.
The rationale is that the systematic sequencing of the genome will identify all of the bacterial proteins and therefore this proteome will be the target for designing novel inhibitor antibiotics. Although systematic, this approach has several major problems.
The first is that analysis of primary amino acid sequences of bacterial proteins does not immediately reveal which protein will be essential for viability of the bacterium, and target validation is thus a major issue. The second problem is one of redundancy, as several biochemical pathways are either structurally duplicated in bacteria (different isoforms of the same enzyme), or functionally duplicated by the presence of salvage pathways in the event of a metabolic block in one pathway (different nutritional conditions). The third is that even a valid target may not be structurally or functionally amenable to inhibition by small molecules because of inaccessibility (sequestration of target).
Therefore, there is considerable interest within the pharmaceutical and biotechnology industry in identifying key targets for drug discovery amongst the mass of novel targets generated by large-scale genomic sequencing projects.
On the other hand, and underscoring the instant invention, the phages herein described have, over millions of years, evolved specific mechanisms to target such key biochemical pathways and proteins. In the few cases where inhibition by phages has been elucidated, such bacterial targets are invariably rate-limiting in their respective biochemical pathways, are not redundant, and/or are readily accessible for inhibition by the phage (or by another inhibitory compound).
Therefore, the sixth step of this invention involves identifying the host biochemical pathways and proteins that are targeted by the phage inhibitory mechanisms.
Identifying, Validating, and Characterizing Bacterial Host Target Proteins and Affected Pathways
A rationale for this step is that the inhibitor ORF product from the phage physically interacts with and/or modifies certain microbial host components to block their function. Exemplary approaches which can be used to identify the host bacterial pathways and proteins that interact with, and preferably also are inhibited by, phage ORF product(s) are described below.
One approach is a genetic screen to determine physiological protein:protein interaction, for example, using a yeast two-hybrid system. In this assay, the phage ORF is fused to the carboxyl terminus of the yeast Gal4 activation domain II (amino acids 768-881) to create a bait vector. A cDNA library of cloned S. aureus sequences, engineered into a plasmid in which the S. aureus sequences are fused to the DNA binding domain of Gal4, is also generated. These plasmids are introduced alone, or in combination, into yeast strain Y190, previously engineered with chromosomally integrated copies of the E. coli lacZ and the selectable HIS3 genes, both under Gal4 regulation (Durfee, Becherer, Chen, Yeh, Yang, Kilburn, Lee, and Elledge, S.J. (1993). Genes Dev. 7, 555-569). If the two proteins expressed in yeast interact, the resulting complex will activate transcription from promoters containing Gal4 binding sites. A lacZ and a HIS3 gene, each driven by a promoter containing Gal4 binding sites, have been integrated into the genome of the host yeast system used for measuring protein-protein interactions. Such a system provides a physiological environment in which to detect potential protein interactions. This system has been extensively used to identify novel protein-protein interaction partners and to map the sites required for interaction (for example, to identify interacting partners of translation factors (Qiu, Garcia-Barrio, and Hinnebusch, A.G. (1998). Mol Cell Biology 18, 2697-2711), transcription factors (Katagiri, Saito, Shinohara, Ogawa, Kamada, Nakamura, Y., and Miki, Y. (1998). Genes, Chromosomes Cancer 21, 217-222), and proteins involved in signal transduction (Endo, Masuhara, Yokouchi, Suzuki, R., Sakamoto, Mitsui, Matsumoto, Tanimura, Ohtsubo, Misawa, H., Miyazaki, Leonor Taniguchi, Fujita, Kanakura, Komiya, and Yoshimura, A. Nature. 387, 921-924)).
This approach has also been used in many published reports to identify interaction between mammalian viral and mammalian cell proteins.
For example, the non-structural protein NS1 of parvovirus is essential for viral DNA amplification and gene expression and is also the major cytopathic effector of these viruses. A yeast two-hybrid screen with NS1 identified a novel cellular protein of unknown function that interacts with NS1, called SGT, for small glutamine-rich tetratricopeptide repeat (TPR)-containing protein (Cziepluch C., Kordes E., Poirey R., Grewenig A., Rommelaere J., and Jauniaux J.C. (1998). J Virol. 72, 4149-4156). In another screen, the adenovirus E3 protein was recently shown to interact with a novel tumor necrosis factor alpha-inducible protein and to modulate some of the activities of E3 (Li Y., Kang J., and Horwitz M.S. (1998). Mol Cell Biol. 18, 1601-1610). In yet another recent screen, the herpes simplex virus 1 alpha regulatory protein ICP0 was found to interact with (and stabilize) the cell cycle regulator cyclin D3 (Kawaguchi Y., Van Sant C., and Roizman B. (1997). J Virol. 71, 7328-7336).
Another two-hybrid system for identifying protein:protein interactions is commercially available from STRATAGENE™ as the CYTO-TRAP™ system (Chang et al., Strategies Newsletter 11(3), 65-68 (1998) (from Stratagene)). The system is a yeast-based method for detecting protein:protein interactions in vivo, using activation of the Ras signal transduction cascade by localizing a signal pathway component, human Sos (hSos), to its activation site in the yeast plasma membrane.
The system uses a temperature-sensitive Saccharomyces cerevisiae mutant, strain cdc25H, which contains a point mutation at amino acid residue 1328 of the cdc25 gene. This gene encodes a guanyl nucleotide exchange factor which binds and activates Ras, leading to cell growth. The mutation in the cdc25 gene prevents host growth at 37°C, but at a permissive temperature of 25°C, growth is normal. The system utilizes the ability of hSos to complement the cdc25 defect and activate the yeast Ras signaling pathway. Once hSos is expressed and localized to the plasma membrane, the cdc25H yeast strain grows at 37°C. Localization of hSos to the plasma membrane occurs through a protein:protein interaction. A protein of interest, or bait, is expressed as a fusion protein with hSos. The library, or target, proteins are expressed with the myristylation membrane-localization signal. The yeast cells are then incubated under restrictive conditions (37°C). If the bait and the target protein interact, the hSos protein is recruited to the membrane, activating the Ras signaling pathway and allowing the cdc25H yeast strain to grow at the restrictive temperature.
The protein targets of phage inhibitory ORFs can also be identified using bacterial genetic screens. One approach involves the overexpression of a phage inhibitory protein in a mutagenized bacterial host species, followed by plating the cells and searching for colonies that can survive the antimicrobial activity of the inhibitory ORF. These colonies are then grown, and their DNA is extracted and cloned into an expression vector that contains a replicon of a different incompatibility group from the plasmid expressing the original ORF. This library is then introduced into a wild-type host bacterium in conjunction with an expression vector driving synthesis of the phage ORF, followed by selection for surviving bacteria. Thus, the survivors presumably contain a DNA fragment from the original mutagenized host bacterial genome that can protect the cell from the antimicrobial activity of the inhibitory phage ORF. This fragment can be sequenced and compared with the corresponding sequence of the bacterial host to determine in which gene the mutation lies. This approach enables one to determine the targets and pathways that are affected by the killing function.
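The final comparison step, locating the mutation relative to annotated host genes, can be sketched as below. The sketch assumes that the protective fragment has already been aligned to the wild-type sequence at a known offset and that it differs only by base substitutions; the function names, coordinates, and gene names are hypothetical.

```python
def find_substitutions(wild_type, fragment, offset):
    """List (position, wild_type_base, mutant_base) for each mismatch.

    fragment is assumed to align to wild_type starting at the 0-based
    position offset, with no insertions or deletions.
    """
    diffs = []
    for k, base in enumerate(fragment):
        ref = wild_type[offset + k]
        if base != ref:
            diffs.append((offset + k, ref, base))
    return diffs


def genes_hit(diffs, genes):
    """Name the genes containing a mismatch.

    genes maps a gene name to (start, end), 0-based with end exclusive.
    """
    return sorted({name
                   for pos, _, _ in diffs
                   for name, (start, end) in genes.items()
                   if start <= pos < end})
```

A real analysis would first align the sequenced fragment to the host genome with a dedicated alignment tool, which also handles insertions and deletions.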
A second approach is based on identifying protein:protein interactions between the phage ORF product and bacterial (e.g., S. aureus) proteins using a biochemical approach based, for example, on affinity chromatography. This approach has been used, for example, to identify interactions between lambda phage proteins and proteins from their E. coli host (Sopta, Carthew, and Greenblatt, J. (1985) J. Biol. Chem. 260, 10353-10369). The phage ORF is fused to a peptide tag (e.g., glutathione-S-transferase, 6xHIS, and/or calmodulin binding protein) within a commercially available plasmid vector that directs high-level expression upon induction of a suitably responsive promoter driving the fusion's expression. The translated fusion protein is expressed in E. coli, purified, and immobilized on a solid phase matrix via, for example, the tag. Total cell extracts from the host bacterium, e.g., S. aureus, are then passed through the affinity matrix containing the immobilized phage ORF fusion protein; host proteins retained on the column are then eluted under different conditions of ionic strength, pH, detergents, etc., and characterized by gel electrophoresis and other techniques. Appropriate controls are run to guard against nonspecific binding to the resin. Target proteins thus recovered should be enriched for those that bind the phage protein/peptide of interest and are subsequently electrophoretically or otherwise separated, purified, sequenced, or biochemically analyzed. Usually sequencing entails individual digestion of the proteins to completion with a protease (e.g., trypsin), followed by molecular mass and amino acid composition and sequence determination using, for example, mass spectrometry, e.g., by MALDI-TOF technology (Qin, Fenyo, Zhao, Hall, Chao, Wilson, Young, R.A. and Chait, B.T. (1997). Anal. Chem. 69, 3995-4001).
The sequences of the individual peptides from a single protein are then analyzed by the bioinformatics approach described above to identify the S. aureus protein interacting with the phage ORF. This analysis is performed by a computer search of the S. aureus genome for an identified sequence. Alternatively, all tryptic peptide fragments of the proteins encoded by the S. aureus genome can be predicted by computer software, and the molecular masses of such fragments compared to the molecular masses of the peptides obtained from each interacting protein eluted from the affinity matrix. The responsible gene sequence can be obtained, for example, by using synthetic degenerate nucleic acid sequences to pull out the corresponding homologous bacterial sequence.
Alternatively, antibodies can be generated against the peptide and used to isolate nascent peptide/mRNA transcript complexes, from which the mRNA can be reverse transcribed, cloned, and further characterized using the procedures discussed herein.
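The mass-matching alternative described above, in which predicted tryptic fragments are compared with observed peptide masses, can be sketched as follows. This is a simplified illustration, not the software used by the authors: it assumes complete digestion, cleavage after K or R except before P, unmodified residues, neutral monoisotopic masses (a MALDI-TOF spectrum would typically report singly protonated ions), and an arbitrary 0.01 Da tolerance. The function names and the example proteome in the usage note are hypothetical.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (complete digestion assumed)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def score_proteins(observed, proteome, tol=0.01):
    """Rank proteins by how many observed masses match a predicted tryptic peptide.

    proteome maps protein name to sequence; tol is an assumed tolerance in Da.
    """
    scores = {}
    for name, seq in proteome.items():
        predicted = [peptide_mass(p) for p in tryptic_digest(seq)]
        scores[name] = sum(any(abs(m - q) <= tol for q in predicted) for m in observed)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Given a dictionary of predicted protein sequences for the host genome, the top-ranked entry returned by `score_proteins` is the candidate interacting protein; in practice the tolerance and scoring would be tuned to the instrument.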
A variety of other binding assay methods are known in the art and can be used to identify interactions between phage proteins and bacterial proteins or other bacterial cell components. Such methods that allow or provide identification of the bacterial component can be used in this invention for identifying putative targets.
Validation of the interaction between the phage ORF product and the bacterial proteins or other components can be obtained by a second independent assay (e.g., co-immunoprecipitation or protein-protein crosslinking experiments (Qiu, Garcia-Barrio, and Hinnebusch, A.G. (1998). Mol Cell Biology 18, 2697-2711; Brown, S. and Blumenthal, T. (1976). Proc. Natl. Acad. Sci. USA 73, 1131-1135)).
Finally, the essential nature of the identified bacterial proteins is preferably determined genetically by creating a constitutive or inducible partial or complete loss-of-function mutation in the gene encoding the identified interacting bacterial protein.
This mutant is then tested for bacterial survival and replication.
The protein target of the phage inhibitor function can also be identified using a genetic approach. Two exemplary approaches will be delineated here. The first approach involves the overexpression of a predetermined phage inhibitor protein in mutagenized host bacteria, e.g., S. aureus, followed by plating the cells and searching for colonies that can survive the inhibitor. These colonies will then be grown, and their DNA extracted and cloned into an expression vector that contains a replicon of a different incompatibility group, and preferably a different selectable marker, than the plasmid expressing the phage inhibitor. Thus, host DNA fragments from the mutant that can protect the cell from phage ORF inhibition can be sequenced and compared with the corresponding sequence of the bacterial host to determine in which gene the mutation lies.
This approach allows rapid determination of the targets and pathways that are affected by the inhibitor.
Alternatively, the bacterial targets can be determined in the absence of selecting for mutations using an approach known as "multicopy suppression". In this approach, the DNA from the wild type host is cloned into an expression vector that can coexist, as previously described, with one containing a predetermined phage inhibitor. Those plasmids that contain host DNA fragments and genes that protect the host from the phage inhibitor can then be isolated and sequenced to identify putative targets and pathways in the host bacteria.
Regardless of the specific mode of identification, screening assays may additionally utilize gene fusions to specific "reporter genes" to identify a bacterial gene(s) whose expression is affected when the host target pathway is affected by the phage inhibitor. Such gene fusions can be used to search a number of small molecule compounds for inhibitors that may affect this pathway and thus cause cell inhibition.
This approach will allow the screening of a large number of molecules on petri dishes or 96-well format by monitoring for a simple color change in the bacterial colonies.
In this manner, we can validate host targets and classes of compounds for further study and clinical development. These inhibitors also represent lead compounds for the development of other antibiotics.
Bioinformatics and comparative genomics are preferably then applied to the identified bacterial gene products to predict biochemical function. The biochemical activity of the protein can be verified in vitro in cell free assays or in vivo in intact cells. In vitro biochemical assays utilizing cell-free extracts or purified protein are established as a basis for the screening and development of inhibitors.
These inhibitors, preferably small molecule inhibitors, may comprise peptides, antibodies, products from natural sources such as fungal or plant extracts, or small molecule organic compounds. In general, small molecule organic compounds are preferred. These compounds may, for example, be identified within large compound libraries, including combinatorial libraries. For example, a plurality of compounds, preferably a large number of compounds, can be screened to determine whether any of the compounds binds or otherwise disrupts or inhibits the identified bacterial target.
Compounds identified as having any of these activities can then be evaluated further in cell culture and/or animal model systems to determine the pharmacological properties of the compound, including the specific anti-microbial ability of the compound.
For mixtures of natural products, including crude preparations, once a preparation or fraction of a preparation is shown to have an anti-microbial activity, the active substance can be isolated and identified using techniques well known in the art, if the compound is not already available in a purified form.
Identified compounds possessing anti-microbial activity and similar compounds having structural similarity can be further evaluated and, if necessary, derivatized according to synthesis and/or modification methods available in the art selected as appropriate for the particular starting molecule.
Derivatization of identified anti-microbials
In cases where the identified anti-microbials above might represent peptidal compounds, the in vivo effectiveness of such compounds may be advantageously enhanced by chemical modification using the natural polypeptide as a starting point and incorporating changes that provide advantages for use, for example, increased stability to proteolytic degradation, reduced antigenicity, improved tissue penetration, and/or improved delivery characteristics.
In addition to active modifications and derivative creations, it can also be useful to provide inactive modifications or derivatives for use as negative controls or introduction of immunologic tolerance. For example, a biologically inactive derivative which has essentially the same epitopes as the corresponding natural antimicrobial can be used to induce immunological tolerance in a patient being treated. The induction of tolerance can then allow uninterrupted treatment with the active anti-microbial to continue for a significantly longer period of time.
Modified anti-microbial polypeptides and derivatives can be produced using a number of different types of modifications to the amino acid chain. Many such methods are known to those skilled in the art. The changes can include, for example, reduction of the size of the molecule, and/or the modification of the amino acid sequence of the molecule. In addition, a variety of different chemical modifications of the naturally occurring polypeptide can be used, either with or without modifications to the amino acid sequence or size of the molecule. Such chemical modifications can, for example, include the incorporation of modified or non-natural amino acids or non-amino acid moieties during synthesis of the peptide chain, or the post-synthesis modification of incorporated chain moieties.
The oligopeptides of this invention can be synthesized chemically or through an appropriate gene expression system. Synthetic peptides can include both naturally occurring amino acids and laboratory synthesized, modified amino acids.
Also provided herein are functional derivatives of anti-microbial proteins or polypeptides. By "functional derivative" is meant a "chemical derivative," "fragment," "variant," "chimera," or "hybrid" of the polypeptide or protein, which terms are defined below. A functional derivative retains at least a portion of the function of the protein, for example reactivity with a specific antibody, enzymatic activity or binding activity.
A "chemical derivative" of the complex contains additional chemical moieties not normally a part of the protein or peptide. Such moieties may improve the molecule's solubility, absorption, biological half-life, and the like. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable side effect of the molecule, and the like. Moieties capable of mediating such effects are disclosed in Alfonso and Gennaro (1995). Procedures for coupling such moieties to a molecule are well known in the art. Covalent modifications of the protein or peptides are included within the scope of this invention. Such modifications may be introduced into the molecule by reacting targeted amino acid residues of the peptide with an organic derivatizing agent that is capable of reacting with selected side chains or terminal residues, as described below.
Cysteinyl residues most commonly are reacted with alpha-haloacetates (and corresponding amines), such as chloroacetic acid or chloroacetamide, to give carboxymethyl or carboxyamidomethyl derivatives. Cysteinyl residues also are derivatized by reaction with bromotrifluoroacetone, chloroacetyl phosphate, N-alkylmaleimides, 3-nitro-2-pyridyl disulfide, methyl 2-pyridyl disulfide, p-chloromercuribenzoate, 2-chloromercuri-4-nitrophenol, or chloro-7-nitrobenzo-2-oxa-1,3-diazole.
Histidyl residues are derivatized by reaction with diethylpyrocarbonate at pH 5.5-7.0 because this agent is relatively specific for the histidyl side chain. Para-bromophenacyl bromide also is useful; the reaction is preferably performed in 0.1 M sodium cacodylate at a controlled pH.
Lysinyl and amino terminal residues are reacted with succinic or other carboxylic acid anhydrides. Derivatization with these agents has the effect of reversing the charge of the lysinyl residues. Other suitable reagents for derivatizing primary amine-containing residues include imidoesters such as methyl picolinimidate; pyridoxal phosphate; pyridoxal; chloroborohydride; trinitrobenzenesulfonic acid; O-methylisourea; 2,4-pentanedione; and transaminase-catalyzed reaction with glyoxylate.
Arginyl residues are modified by reaction with one or several conventional reagents, among them phenylglyoxal, 2,3-butanedione, 1,2-cyclohexanedione, and ninhydrin. Derivatization of arginine residues requires that the reaction be performed in alkaline conditions because of the high pKa of the guanidine functional group.
Furthermore, these reagents may react with the groups of lysine as well as the arginine alpha-amino group.
Tyrosyl residues are well-known targets of modification for introduction of spectral labels by reaction with aromatic diazonium compounds or tetranitromethane.
Most commonly, N-acetylimidazole and tetranitromethane are used to form O-acetyl tyrosyl species and 3-nitro derivatives, respectively.
Carboxyl side groups (aspartyl or glutamyl) are selectively modified by reaction with a carbodiimide such as 1-cyclohexyl-3-(2-morpholinyl-(4-ethyl)) carbodiimide or 1-ethyl-3-(4-azonia-4,4-dimethylpentyl) carbodiimide. Furthermore, aspartyl and glutamyl residues are converted to asparaginyl and glutaminyl residues by reaction with ammonium ions.
Glutaminyl and asparaginyl residues are frequently deamidated to the corresponding glutamyl and aspartyl residues. Alternatively, these residues are deamidated under mildly acidic conditions. Either form of these residues falls within the scope of this invention.
Derivatization with bifunctional agents is useful, for example, for crosslinking component peptides to each other or the complex to a water-insoluble support matrix or to other macromolecular carriers. Commonly used cross-linking agents include, for example, 1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, for example, esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl esters such as 3,3'-dithiobis(succinimidylpropionate), and bifunctional maleimides such as bis-N-maleimido-1,8-octane. Derivatizing agents such as methyl-3-[(p-azidophenyl)dithio]propioimidate yield photoactivatable intermediates that are capable of forming crosslinks in the presence of light. Alternatively, reactive water-insoluble matrices such as cyanogen bromide-activated carbohydrates and the reactive substrates described in U.S. Patent Nos. 3,969,287; 3,691,016; 4,195,128; 4,247,642; 4,229,537; and 4,330,440 are employed for protein immobilization.
Other modifications include hydroxylation of proline and lysine, phosphorylation of hydroxyl groups of seryl or threonyl residues, methylation of the alpha-amino groups of lysine, arginine, and histidine side chains (Creighton, T.E., Proteins: Structure and Molecular Properties, W.H. Freeman Co., San Francisco, pp. 79-86 (1983)), acetylation of the N-terminal amine, and, in some instances, amidation of the C-terminal carboxyl groups.
Such derivatized moieties may improve the stability, solubility, absorption, biological half life, and the like. The moieties may alternatively eliminate or attenuate any undesirable side effect of the protein complex. Moieties capable of mediating such effects are disclosed, for example, in Alfonso and Gennaro (1995).
The term "fragment" is used to indicate a polypeptide derived from the amino acid sequence of the protein or polypeptide having a length less than the full-length polypeptide from which it has been derived. Such a fragment may, for example, be produced by proteolytic cleavage of the full-length protein. Preferably, the fragment is obtained recombinantly by appropriately modifying the DNA sequence encoding the proteins to delete one or more amino acids at one or more sites of the C-terminus, N-terminus, and/or within the native sequence.
Another functional derivative intended to be within the scope of the present invention is a "variant" polypeptide that either lacks one or more amino acids or contains additional or substituted amino acids relative to the native polypeptide. The variant may be derived from a naturally occurring polypeptide by appropriately modifying the protein DNA coding sequence to add, remove, and/or to modify codons for one or more amino acids at one or more sites of the C-terminus, N-terminus, and/or within the native sequence.
A functional derivative of a protein or polypeptide with deleted, inserted and/or substituted amino acid residues may be prepared using standard techniques well known to those of ordinary skill in the art. For example, the modified components of the functional derivatives may be produced using site-directed mutagenesis techniques (as exemplified by Adelman et al., 1983, DNA 2:183; Sambrook et al., 1989) wherein nucleotides in the DNA coding sequence are modified such that a modified coding sequence is produced, and thereafter expressing this recombinant DNA in a prokaryotic or eukaryotic host cell, using techniques such as those described above. Alternatively, components of functional derivatives of complexes with amino acid deletions, insertions and/or substitutions may be conveniently prepared by direct chemical synthesis, using methods well known in the art.
Insofar as other anti-microbial inhibitor compounds identified by the invention described herein may not be peptidal in nature, other chemical techniques exist to allow their suitable modification as well, according to the desirable principles discussed above.
Administration and Pharmaceutical Compositions
For the therapeutic and prophylactic treatment of infection, the preferred method of preparation or administration of anti-microbial compounds will generally vary depending on the precise identity and nature of the anti-microbial being delivered. Thus, those skilled in the art will understand that administration methods known in the art will also be appropriate for the compounds of this invention.
The particularly desired anti-microbial can be administered to a patient either by itself, or in pharmaceutical compositions where it is mixed with suitable carriers or excipient(s). In treating an infection, a therapeutically effective amount of an agent or agents is administered. A therapeutically effective dose refers to that amount of the compound that results in amelioration of one or more symptoms of bacterial infection and/or a prolongation of patient survival or patient comfort.
Toxicity, therapeutic and prophylactic efficacy of anti-microbials can be determined by standard pharmaceutical procedures in cell cultures and/or experimental organisms such as animals, for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that exhibit large therapeutic indices are preferred. The data obtained from these cell culture assays and animal studies can be used in formulating a range of dosage for use in humans. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized.
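As a minimal sketch of the ratio described above, the following Python function computes a therapeutic index from an LD50 and an ED50 expressed in the same units. The function name and the dose values are hypothetical and are used only for illustration; they are not data from this specification.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50; both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only to illustrate the arithmetic.
example_index = therapeutic_index(ld50=500.0, ed50=25.0)
print(f"therapeutic index = {example_index}")
```

A larger value indicates a wider margin between the effective dose and the lethal dose, which is why compounds with large indices are preferred.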
For any compound identified and used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. Such information can be used to more accurately determine useful doses in organisms such as plants and animals, preferably mammals, and most preferably humans. Levels in plasma may be measured, for example, by HPLC or other means appropriate for detection of the particular compound.
The exact formulation, route of administration and dosage can be chosen by the individual physician in view of the patient's condition (see, e.g., The Pharmacological Basis of Therapeutics, 1975, Ch. 1, p. 1).
It should be noted that the attending physician would know how and when to terminate, interrupt, or adjust administration due to toxicity, organ dysfunction, or other systemic malady. Conversely, the attending physician would also know to adjust treatment to higher levels if the clinical response were not adequate (precluding toxicity). The magnitude of an administered dose in the management of the disorder of interest will vary with the severity of the condition to be treated and the route of administration. The severity of the condition may, for example, be evaluated, in part, by standard prognostic evaluation methods. Further, the dose, and perhaps the dose frequency, will also vary according to the age, body weight, and response of the individual patient. A program comparable to that discussed above also may be used in veterinary or phytomedicine.
Depending on the specific infection target being treated and the method selected, such agents may be formulated and administered systemically or locally, i.e., topically. Techniques for formulation and administration may be found in Alfonso and Gennaro (1995). Suitable routes may include, for example, oral, rectal, transdermal, vaginal, transmucosal, intestinal, parenteral, intramuscular, subcutaneous, or intramedullary injections, as well as intrathecal, intravenous, or intraperitoneal injections.
For injection, the agents of the invention may be formulated in aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or physiological saline buffer. For transmucosal administration, penetrants appropriate to the barrier to be permeated are used in the formulation.
Such penetrants are generally known in the art.
Use of pharmaceutically acceptable carriers to formulate identified antimicrobials of the present invention into dosages suitable for systemic administration is within the scope of the invention. With proper choice of carrier and suitable manufacturing practice, the compositions of the present invention, in particular those formulated as solutions, may be administered parenterally, such as by intravenous injection. Appropriate compounds can be formulated readily using pharmaceutically acceptable carriers well known in the art into dosages suitable for oral administration.
Such carriers enable the compounds of the invention to be formulated as tablets, pills, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be treated.
Agents intended to be administered intracellularly may be administered using techniques well known to those of ordinary skill in the art. For example, such agents may be encapsulated into liposomes, then administered as described above.
Liposomes are spherical lipid bilayers with aqueous interiors. All molecules present in an aqueous solution at the time of liposome formation are incorporated into the aqueous interior. The liposomal contents are both protected from the external microenvironment and, because liposomes fuse with cell membranes, are efficiently delivered into the cell cytoplasm. Additionally, due to their hydrophobicity, small organic molecules may be directly administered intracellularly.
Pharmaceutical compositions suitable for use in the present invention include compositions wherein the active ingredients are contained in an effective amount to achieve the intended purpose. Determination of the effective amounts is well within the capability of those skilled in the art.
In addition to the active ingredients, these pharmaceutical compositions may contain suitable pharmaceutically acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically. The preparations formulated for oral administration may be in the form of tablets, dragees, capsules, or solutions, including those formulated for delayed release or only to be released when the pharmaceutical reaches the small or large intestine.
The pharmaceutical compositions of the present invention may be manufactured in a manner that is itself known, by means of conventional mixing, dissolving, granulating, dragee-making, levitating, emulsifying, encapsulating, entrapping or lyophilizing processes.
Pharmaceutical formulations for parenteral administration include aqueous solutions of the active anti-microbial compounds in water-soluble form.
Alternatively, suspensions of the active compounds may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions may contain substances which increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may also contain suitable stabilizers or agents which increase the solubility of the compounds to allow for the preparation of highly concentrated solutions.
Pharmaceutical preparations for oral use can be obtained by combining the active compounds with solid excipient, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium alginate.
Dragee cores are provided with suitable coatings. For this purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification or to characterize different combinations of active compound doses.
Pharmaceutical preparations which can be used orally include push-fit capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, stabilizers may be added.
The above methodologies may be employed either actively or prophylactically against an infection of interest.
Computer-related Aspects and Embodiments
In addition to the provision of compounds as chemical entities, nucleotide sequences, or fragments thereof at least 95%, preferably at least 97%, more preferably at least 99%, and most preferably at least 99.9% identical to phage inhibitor sequences can also be provided in a variety of additional media to facilitate various uses.
Thus, as used in this section, "provided" refers to an article of manufacture, rather than an actual nucleic acid molecule, which contains a nucleotide sequence of the present invention, e.g., a nucleotide sequence of an exemplary bacteriophage or a sequence encoding a bacterial target or a fragment thereof, preferably a nucleotide sequence at least 95%, more preferably at least 99% and most preferably at least 99.9% identical to such a bacteriophage or bacterial sequence, for example, to a polynucleotide of an unsequenced phage listed in Table 1, preferably of bacteriophage 77 (S. aureus host) or bacteriophage 3A (S. aureus host) or bacteriophage 96 (S. aureus host). Such an article provides a large portion of the particular bacteriophage genome or bacterial gene and parts thereof (e.g., a bacteriophage open reading frame (ORF)) in a form which allows a skilled artisan to examine and/or analyze the sequence using means not directly applicable to examining the actual genome or gene or subset thereof as it exists in nature or in purified form as a chemical entity.
In one application of this aspect, a nucleotide sequence of the present invention can be recorded on computer readable media. As used herein, "computer readable media" refers to any medium that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable media can be used to create an article of manufacture which includes one or more computer readable media having recorded thereon a nucleotide sequence or sequences of the present invention. Likewise, it will be clear to those of skill how additional computer readable media that may be developed also can be used to create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.
As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide sequence information of the present invention.
A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can, for example, be presented in a word processing text file, formatted in commercially available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
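By way of illustration only, recording a sequence as an ASCII text file can be sketched in Python as below. The FASTA layout (a ">" header line followed by wrapped sequence lines) is one common plain-text convention and is an assumption here, as are the function names, the record name and the short sequence; the specification does not prescribe any particular file format.

```python
import os
import tempfile


def write_fasta(path, name, sequence, width=60):
    """Record one sequence as an ASCII file: a '>' header, then wrapped lines."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(f">{name}\n")
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


def read_fasta(path):
    """Read back a single-record file written by write_fasta."""
    name, chunks = None, []
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
            elif line:
                chunks.append(line)
    return name, "".join(chunks)


# Hypothetical record; the sequence is a placeholder, not phage data.
demo_path = os.path.join(tempfile.mkdtemp(), "example.fa")
write_fasta(demo_path, "example_record", "ATGGCT" * 25)
print(read_fasta(demo_path)[0])
```

The same record could equally be stored as a row in a database table; the choice depends on how the stored information will be accessed.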
Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium. Thus, by providing in computer readable form a nucleotide sequence of an unsequenced bacteriophage, such as an exemplary bacteriophage listed in Table 1 or of a sequence encoding a bacterial target or a fragment thereof, preferably a nucleotide sequence at least 95%, more preferably at least 99% and most preferably at least 99.9% identical to such a bacteriophage or bacterial sequence, for example, to a polynucleotide of bacteriophage 77 (S. aureus host), bacteriophage 3A (S. aureus host), bacteriophage 96 (S. aureus host), bacteriophage 44AHJD (S. aureus host), bacteriophage Dp-1 (Streptococcus pneumoniae host), or bacteriophage 182 (Enterococcus host), the present invention enables the skilled artisan to routinely access the provided sequence information for a wide variety of purposes.
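The identity thresholds recited above (95%, 99%, 99.9%) presuppose some way of scoring two sequences against each other. The following Python sketch shows one simple convention, assuming the two sequences have already been aligned to equal length with "-" marking gaps and that a gap opposite a residue counts as a mismatch. Other conventions exist, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumptions of this sketch: inputs are pre-aligned and of equal length,
    '-' marks a gap, columns that are gaps in both sequences are ignored,
    and a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Placeholder sequences differing at one of ten positions.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, identity >= 95.0)
```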
Those skilled in the art understand that a variety of different search or analysis software can implement sequence search and analysis algorithms, e.g., the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms. For example, such search algorithms can be implemented on a Sybase system and used to identify open reading frames (ORFs) within the bacteriophage genome which contain homology to ORFs or proteins from other viruses, e.g., other bacteriophage, and other organisms, e.g., the host bacterium. Among the ORFs discussed herein are protein encoding fragments of the bacteriophage genomes which encode bacteria-inhibiting proteins or fragments.
The present invention further provides systems, particularly computer-based systems, which contain the sequence information described. Such systems are designed to identify, among other things, useful fragments of the bacteriophage genomes.
As used herein, "a computer-based system" refers to the hardware, software, and data storage media used to analyze the nucleotide sequence information of the present invention. The minimum hardware of the computer-based systems of the present invention comprises a central processing unit (CPU), input device, output device, and data storage medium or media. A skilled artisan will readily recognize that any of the currently available general purpose computer-based systems is suitable for use in the present invention, as well as a variety of different specialized or dedicated computer-based systems.
As stated above, the computer-based systems of the present invention comprise data storage media having stored therein a nucleotide sequence of the present invention and the necessary hardware and software for supporting and implementing a search and/or analysis program.
As used herein, "data storage media" refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.
As used herein, "search program" refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means.
Search means are used to identify fragments or regions of the present genomic sequences which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting homology searches and/or sequence analyses can be adapted for use in the present computer-based systems.
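As a deliberately simplified stand-in for the search means described above, the following Python sketch locates exact occurrences of a target sequence on both strands of a stored sequence. It is not BLAST or MacPattern and performs no homology scoring; the function names and the short test sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_target(genome: str, target: str):
    """Return sorted (position, strand) pairs for exact matches of target.

    Positions are 0-based offsets on the forward strand. A '-' hit means the
    reverse complement of the target was found at that forward-strand offset.
    """
    genome = genome.upper()
    hits = []
    for strand, query in (("+", target.upper()), ("-", reverse_complement(target))):
        start = genome.find(query)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(query, start + 1)  # allow overlapping matches
    return sorted(hits)


# Placeholder sequence containing the target once on each strand.
print(find_target("TTATGGCCAAAGGCCATAA", "ATGGCC"))
```

A production search program would additionally tolerate mismatches and gaps and would report a score for each match.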
As used herein in connection with sequence searches and analyses, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. Also, the target sequence length is preferably selected to include sequence corresponding to a biologically relevant portion of an encoded product, for example a region which is expected to be conserved across a range of source organisms. Preferably the sequence length of a target polypeptide sequence is from 100 amino acids, more preferably 7-50 or 7-100 amino acids, and still more preferably 10-80 or 10-100 amino acids. Preferably the sequence length of a target polynucleotide sequence is from 15-300 nucleotide residues, more preferably from 21-240 or 21-300, and still more preferably 30-150 or 30-300 nucleotide residues.
However, it is well recognized that searches for commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length. Likewise, it may be desirable to search and/or analyze longer sequences.
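The relationship between target length and chance matches can be made concrete with a rough estimate. The Python sketch below assumes a single-stranded database of independent, equally frequent residues, which real genomes only approximate; under that assumption the expected number of chance occurrences of a specific target of length L in a database of length N is (N - L + 1) / k^L, where k is the alphabet size (4 for nucleotides, 20 for amino acids). The function name and database sizes are hypothetical.

```python
def expected_random_hits(database_length: int, target_length: int,
                         alphabet_size: int = 4) -> float:
    """Expected chance occurrences of one specific target in random sequence.

    Assumes independent, uniformly distributed residues and one strand only.
    """
    positions = database_length - target_length + 1
    if positions <= 0:
        return 0.0
    return positions * alphabet_size ** (-target_length)


# Hypothetical database sizes, for illustration only.
for length in (6, 15, 30):
    print(length, expected_random_hits(1_000_000, length))
```

Each added nucleotide divides the expected count by four, which is why longer targets are far less likely to match by chance.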
As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzymatic active sites and signal sequences. Nucleic acid target motifs include, but are not limited to promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).
A variety of structural formats for the input and output devices can be used to input and output the information in the computer-based systems of the present invention. A preferred format for an output device ranks fragments of the bacteriophage or bacterial sequences possessing varying degrees of homology to the target sequence or target motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the target sequence or target motif and identifies the degree of homology contained in the identified fragment.
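A ranked output of the kind described above can be sketched in Python as follows. Here the degree of homology is represented by percent identity with aligned length as a tie-breaker; that choice, the record fields, and the example values are assumptions made for illustration and are not results from this specification.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    fragment_id: str
    percent_identity: float
    aligned_length: int


def rank_hits(hits: List[Hit]) -> List[Hit]:
    """Order hits from most to least similar; longer alignments break ties."""
    return sorted(hits,
                  key=lambda h: (h.percent_identity, h.aligned_length),
                  reverse=True)


# Hypothetical search results.
example_hits = [
    Hit("fragment_A", 82.5, 120),
    Hit("fragment_B", 99.0, 45),
    Hit("fragment_C", 82.5, 300),
]
for rank, hit in enumerate(rank_hits(example_hits), start=1):
    print(rank, hit.fragment_id, hit.percent_identity, hit.aligned_length)
```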
A variety of comparing methods and/or devices and/or formats can be used to compare a target sequence or target motif with the sequence stored in data storage media to identify sequence fragments of the bacteriophage or bacterium in question.
One skilled in the art can readily recognize that any one of the publicly available homology search programs can be used as the search program for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known to those of skill, or later developed, also may be employed in this regard.
Figure 6 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes appropriate software for reading the control logic and/or the data from the removable storage medium 116 once it is inserted into the removable medium storage device 114.
A nucleotide sequence of the present invention may be stored in a well-known manner in the main memory 108, any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for accessing and processing the sequence (such as search tools, comparing tools, etc.) resides in main memory 108, in accordance with the requirements and operating parameters of the operating system, the hardware system and the software program or programs.
The data storage medium in which the sequence is embodied and the central processor need not be part of a single stand-alone computer, but may be separated so long as data transfer can occur. For example, the processor or processors being utilized for a search or analysis can be part of one general purpose computer, and the data storage medium can be part of a second general purpose computer connected to a network, or the data storage medium can be part of a network server. As another example, the data storage medium can be part of a computer system or network accessible over telephone lines or other remote connection method.
EXAMPLES
Example 1. Growth of Staph A bacteriophage 77 and purification of genomic DNA.
The Staphylococcus aureus propagating strain (PS 77; ATCC #27699) was used as a host to propagate its respective phage 77 (ATCC 27699-B). Two rounds of plaque purification of phage 77 were performed on soft agar essentially as described in Sambrook et al. (1989). Briefly, the PS 77 strain was grown overnight at 37°C in Nutrient broth [NB: 0.3% Bacto beef extract, 0.5% Bacto peptone (Difco Laboratories) and 0.5% NaCl (w/v)]. The culture was then diluted 20x in NB and incubated at 37°C until the .2 (early log phase) with constant agitation. In order to obtain single plaques, phage 77 was subjected to 10-fold serial dilutions using phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin) and µl of each dilution was used to infect 0.5 ml of the cell suspension in the presence of 400 µg/ml CaCl2. After incubation of 15 min at room temperature, 2 ml of melted soft agar kept at 45°C (NB supplemented with 0.6% agar) was added to the mixture and poured onto the surface of 100 mm nutrient agar plates (Bacto beef extract, 0.5% Bacto peptone, 0.5% NaCl and 1.5% Bacto agar). After overnight incubation at 30°C, a single plaque was isolated, resuspended in 1 ml of phage buffer by end over end rotation for 2 hrs at 20°C, and the phage suspension was diluted and used for a second infection as described above. After overnight incubation at 30°C, a single plaque was isolated and used as a stock.
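The arithmetic behind a 10-fold serial dilution and the resulting titer estimate can be sketched in Python as follows. The plaque count, the number of dilution steps and the plated volume in this sketch are hypothetical placeholders and are not values taken from this example; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def dilution_series(steps: int, fold: int = 10):
    """Cumulative dilution after each step, e.g. 1/10, 1/100, 1/1000."""
    return [Fraction(1, fold ** (step + 1)) for step in range(steps)]


def titer_pfu_per_ml(plaques: int, dilution: Fraction, plated_volume_ul: int) -> Fraction:
    """Estimate stock titer: plaques / (dilution x volume plated in ml)."""
    return Fraction(plaques * 1000) / (dilution * plated_volume_ul)


series = dilution_series(6)
# Hypothetical plate: 42 plaques from 100 microlitres of the 10^-6 dilution.
estimate = titer_pfu_per_ml(42, series[-1], 100)
print(f"{float(estimate):.2e} pfu/ml")
```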
The propagation procedure for bacteriophage 77 was modified from the agar layer method of Swanström and Adams (1951). Briefly, the PS 77 strain was grown to stationary phase overnight at 37°C in Nutrient broth. The culture was then diluted twenty-fold in NB and incubated at 37°C until the OD.o= The suspension (15 x 10^7 bacteria) was then mixed with 15 x 10^5 plaque forming units (pfu) to give a ratio of 100 bacteria/phage particle in the presence of 400 µg/ml of CaCl2. After incubation for 15 min at 20°C, 7.5 ml of melted soft agar (NB plus 0.6% agar) were added to the mixture and poured onto the surface of 150 mm nutrient agar plates and incubated 16 hrs at 30°C. To collect the phage plate lysate, 20 ml of NB were added to each plate and the soft agar layer was collected by scraping off with a clean microscope slide followed by shaking of the agar suspension for 5 min to break up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g) in a JA-10 rotor (Beckman) and the supernatant fluid (lysate) was collected and subjected to a treatment with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate the phage particles, the phage suspension was adjusted to 10% PEG 8000 and M of NaCl followed by incubation at 4°C for 16 hrs. The phage was recovered by centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C on a GS-6R table top centrifuge (Beckman). The pellet was resuspended with 2 ml of phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin). The phage suspension was extracted with 1 volume of chloroform and further purified by centrifugation on a cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS rotor centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm (67,000 x g) at 4°C.
Banded phage was collected and ultracentrifuged again on an isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 h at 4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage suspension by adding 20 mM EDTA, 50 mg/ml Proteinase K and 0.5% SDS and incubating for 1 h at 65°C, followed by successive extractions with 1 volume of phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris pH 8.0, 1 mM EDTA).
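The infection ratio used in the propagation step of this example (15 x 10^7 bacteria mixed with 15 x 10^5 pfu) can be checked with a short calculation. The Python sketch below uses those two figures from the text; the function names are hypothetical, and the reciprocal quantity (phage per bacterium, commonly called the multiplicity of infection) is included only for convenience.

```python
def bacteria_per_phage(bacteria: float, pfu: float) -> float:
    """Number of bacterial cells per plaque forming unit."""
    if bacteria <= 0 or pfu <= 0:
        raise ValueError("counts must be positive")
    return bacteria / pfu


def multiplicity_of_infection(bacteria: float, pfu: float) -> float:
    """Phage particles per bacterial cell (reciprocal of the ratio above)."""
    return 1.0 / bacteria_per_phage(bacteria, pfu)


ratio = bacteria_per_phage(15e7, 15e5)
moi = multiplicity_of_infection(15e7, 15e5)
print(ratio, moi)
```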
Example 2. DNA sequencing of Bacteriophage 77 genome
Four micrograms of phage 77 DNA was diluted in 200 µl of TE (10 mM Tris, [pH 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonication was performed (550 Sonic Dismembrator™, Fisher Scientific). Samples were sonicated under an amplitude of 3 µm with bursts of 5 s spaced by 15 s cooling in ice/water for 3 to 4 cycles. The sonicated DNA was then size fractionated by electrophoresis on 1% agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the agarose gel and purified using a commercial DNA extraction system according to the instructions of the manufacturer (Qiagen), with a final elution of 50 µl of 1 mM Tris (pH The ends of the sonicated DNA fragments were repaired with a combination of T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as follows. Reactions were performed in a reaction mixture (final volume, 100 µl) containing sonicated phage DNA, 10 mM Tris-HCl [pH 50 mM NaCl, 10 mM MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA polymerase (New England Biolabs) for 20 min at 12°C followed by addition of 12.5 units of Klenow large fragment (New England Biolabs) for 15 min at room temperature. The reaction was stopped by two phenol/chloroform extractions and the DNA was precipitated with ethanol and the final DNA pellet was resuspended in µl of H2O.
Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II site of pKS II+ vector (Stratagene) that had been dephosphorylated by treatment with calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation reaction contained 100 ng of vector DNA, 2 to 5 µl of repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New England Biolabs) and was incubated overnight at 16°C.
Transformation and selection of bacterial clones containing recombinant plasmids was performed in E. coli DH 101 according to standard procedures (Sambrook et al., 1989).
Recombinant clones were picked from agar plates into 96-well plates containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers flanking the Hinc II cloning site of the pKS II+ vector. PCR amplification of foreign insert was performed in a 15 µl reaction volume containing 10 mM Tris (pH mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec denaturation at 94°C, 30 sec annealing at 57°C, and 2 min extension at 72°C, followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 to 2 kbp were selected and plasmid DNA was prepared from the selected clones using QIAprep™ spin miniprep kit (Qiagen).
The nucleotide sequence of the extremities of each recombinant clone was determined using an ABI 377-36 automated sequencer with two types of chemistry: ABI prism Big Dye™ primer or ABI prism Big Dye™ terminator cycle sequencing ready reaction kit (Applied Biosystems). To ensure co-linearity of the sequence data and the genome, all regions of the phage genome were sequenced at least once from both directions on two separate clones. In areas where this criterion was not initially met, a sequencing primer was selected and phage DNA was used directly as sequencing template employing ABI prism Big Dye™ terminator cycle sequencing ready reaction kit.
Example 3. Bioinformatic management of primary nucleotide sequence from Phase 77.
Phage 77 sequence contigs were assembled using SequencherTM 3.1 software (GeneCodes). To close contig gaps, sequencing primers were selected near the edge of WO 00/32825 PCT/I B99/02040 79 the contigs. Phage DNA was used directly as sequencing template employing ABI prism BIG DYETM terminator cycle sequencing ready reaction kit. The complete sequence of bacteriophage 77 is shown in Table 2.
A software program was developed and used on the assembled sequence of bacteriophage 77 to identify all putative ORFs larger than 33 codons. Other ORF identification software can also be utilized, preferably programs which allow alternative start codons. The software scans the primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon. Three possible selections can be made for defining the nature of the start codon: I) selection of ATG; II) selection of ATG or GTG; and III) selection of any of ATG, GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds to the one reported by the NCBI (http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the bacterial genetic code.
When an appropriate start codon is encountered, a counting mechanism is employed to count the number of codons (groups of three nucleotides) between this start codon and the next stop codon downstream of it. If a threshold value of 33 is reached, or exceeded, then the sequence encompassed by these two codons (start and stop codons) is defined as an ORF. This procedure is repeated, each time starting at the next nucleotide following the previous stop codon found, in order to identify all the other putative ORFs. The scan is performed on all three reading frames of both DNA strands of the phage sequence.
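The scanning procedure described above can be sketched as follows. This is only an illustrative Python sketch and not the software program actually used. It assumes the standard stop codons TAA, TAG and TGA, and it assumes that the codon count runs from the start codon through the last codon before the stop codon. The names `find_orfs`, `scan_frame` and `Orf` are hypothetical.

```python
from typing import List, NamedTuple

# Assumption: the three standard stop codons of the bacterial genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# The three start-codon selections named in the text.
START_CODON_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}


class Orf(NamedTuple):
    strand: str   # "+" for the given strand, "-" for its reverse complement
    frame: int    # 0, 1 or 2
    start: int    # 0-based index of the start codon on the scanned strand
    end: int      # index one past the last base of the stop codon
    codons: int   # codons from the start codon up to (not including) the stop


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan_frame(seq: str, frame: int, strand: str, starts, min_codons: int) -> List[Orf]:
    orfs = []
    i = frame
    while i + 3 <= len(seq):
        if seq[i:i + 3] not in starts:
            i += 3
            continue
        # Walk codon by codon from the start codon to the next in-frame stop.
        j = i
        while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
            j += 3
        if j + 3 > len(seq):
            break  # reached the end of the sequence without a stop codon
        codons = (j - i) // 3
        if codons >= min_codons:
            orfs.append(Orf(strand, frame, i, j + 3, codons))
        i = j + 3  # resume at the nucleotide following the stop codon
    return orfs


def find_orfs(genome: str, start_set: str = "III", min_codons: int = 33) -> List[Orf]:
    seq = genome.upper()
    starts = START_CODON_SETS[start_set]
    found = []
    # All three reading frames of both DNA strands.
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            found.extend(scan_frame(s, frame, strand, starts, min_codons))
    return found
```

Coordinates reported for the "-" strand refer to the reverse-complemented sequence and would need to be mapped back to genome coordinates before use.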
Sequence homology (BLAST) searches for each ORF are then carried out using an implementation of BLAST programs, although any of a variety of different sequence comparison and matching programs can be utilized as known to those skilled in the art. Downloaded public databases used for sequence analysis include: i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z); ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z); iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z); iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z); v) S. aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-1k.fa); vi) Streptococcus pyogenes (ftp://ftp.genome.ou.edu/pub/strep/strep- k.fa); vii) Streptococcus pneumoniae (ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z); viii) Mycobacterium tuberculosis CSU#9 (ftp://ftp.tigr.org/pub/data/m_tuberculosis/TB_091097.Z); and ix) Pseudomonas aeruginosa (http://www.genome.washington.edu/pseudo/data.html).
The results of the homology searches performed on the ORFs are shown in Table
Example 4. Subcloning of Bacteriophage 77 ORFs into a Staph A inducible expression system.
The shuttle vector pT0021, in which the firefly luciferase (lucFF) expression is controlled by the ars (arsenite) promoter/operator (Tauriainen et al., 1997), was modified in the following fashion. Two oligonucleotides corresponding to a short antigenic peptide derived from the hemagglutinin protein of influenza virus (HA epitope tag) were synthesized (Field et al., 1988). The sense strand HA tag sequence (with BamHI, SalI and HindIII cloning sites) is: 5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3' (where upper case letters denote the nucleotide sequence of the HA tag); the antisense strand HA tag sequence (with a HindIII cloning site) is: 5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3' (where upper case letters denote the sequence of the HA tag). The two HA tag oligonucleotides were annealed and ligated into pT0021 vector which had been digested with BamHI and HindIII. This manipulation resulted in replacement of the lucFF gene by the HA tag. This modified shuttle vector containing the arsenite inducible promoter, the arsR gene, and HA tag was named pTHA. A diagram outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A.
Each ORF encoded by bacteriophage 77 that was larger than 33 amino acids and had a Shine-Dalgarno sequence upstream of the initiation codon was selected for functional analysis for bacterial inhibition. In total, 98 ORFs were selected and screened as detailed below. A list of these is presented in Table 3. Each individual ORF, from initiation codon to last codon (excluding the stop codon), was amplified from phage genomic DNA using the polymerase chain reaction (PCR). For PCR amplification of ORFs, each sense strand primer targets the initiation codon and is preceded by a BamHI restriction site (5'-cgggatcc-3'), and each antisense oligonucleotide targets the penultimate codon (the one before the stop codon) of the ORF and is preceded by a SalI restriction site (5'-gcgtcgaccg-3'). The PCR product of each ORF was gel purified and digested with BamHI and SalI. The digested PCR product was then gel purified using the Qiagen kit as described, ligated into BamHI and SalI digested pTHA vector, and used to transform E. coli bacterial strain DH10β (as described above). As a result of this manipulation, the HA tag is set in frame with the ORF and is positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant pTHA/ORF clones were picked and their insert sizes were confirmed by PCR analysis using primers flanking the cloning site. The names and sequences of the primers that were used for the PCR amplification were: HAF: 5'-TATTATCCAAAACTTGAACA-3'; HAR: 5'-CGGTGGTATATCCAGTGATT-3'.
The sequence integrity of cloned ORFs was verified directly by DNA sequencing using primers HAF and HAR. In cases where verification of ORF sequence could not be achieved by one pass with the sequencing primers, additional internal primers were selected and used for sequencing.
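The ORF amplification primer design rule given in this example (BamHI tail plus the start of the ORF for the sense primer, SalI tail plus the reverse complement of the end of the coding region, stop codon excluded, for the antisense primer) can be illustrated with a short Python sketch. The annealing length of 18 nucleotides and the function name `design_orf_primers` are assumptions made for illustration only; the text does not state the length of the ORF-specific portion of each primer.

```python
# Restriction-site tails quoted in the text (lower case, as in the source).
BAMHI_TAIL = "cgggatcc"
SALI_TAIL = "gcgtcgaccg"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_orf_primers(orf: str, anneal_len: int = 18):
    """Return (sense, antisense) primers, both written 5' to 3'.

    `orf` runs from the initiation codon through the stop codon.
    `anneal_len` is an assumed length for the ORF-specific part.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must be in frame and end with a stop codon")
    coding = orf[:-3]  # drop the stop codon so the HA tag can be fused in frame
    if len(coding) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    sense = BAMHI_TAIL + coding[:anneal_len]
    antisense = SALI_TAIL + reverse_complement(coding[-anneal_len:])
    return sense, antisense
```

For example, `design_orf_primers("ATGAAACCCGGGTTTTAA", anneal_len=6)` gives a sense primer ending in `ATGAAA` and an antisense primer ending in `AAACCC`.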
Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) was used as a recipient for the expression of recombinant plasmids. Electroporation was performed essentially as previously described (Schenk and Laddaga, 1992). Selection of recombinant clones was performed on Luria-Broth agar (LB-agar) plates containing 30 µg/ml of kanamycin.
For each ORF introduced in the pTHA plasmid, 3 independent transformants were isolated and used to individually inoculate cultures in 5 ml of TSB containing 30 µg/ml kanamycin, followed by growth to saturation (16 hrs at 30°C). An aliquot of this stationary phase culture was used to generate a frozen glycerol stock of the transformant (stored at -80°C). The remaining culture was used for plasmid DNA extraction. Bacterial cells were harvested by centrifugation at 3000 x g at 22°C for min. The pellet was resuspended in 200 µl 25% sucrose containing 25 U/ml of lysostaphin and incubated for 15 min at 37°C. Then, 400 µl of alkaline SDS solution (SDS, 0.2 N NaOH) were added, mixed well and incubated for 7 min at room temperature. After the alkaline SDS treatment, 300 µl of ice-cold 3 M sodium acetate pH 4.8 were added, and the mix was immediately spun at 13,000 x g for 15 min at room temperature. The supernatant was transferred to a new 1.5 ml conical centrifuge tube and 650 µl of isopropanol (stored at room temperature) were added. The mix was then centrifuged at 13,000 x g for 5 min. The supernatant fluid was discarded, and the pellet was washed with 70% ethanol and resuspended in 320 µl sterile distilled water.
The presence of individual phage 77 ORF DNA inserts in the plasmid was verified by PCR amplification using 1.5 µl transformant miniprep DNA in a PCR with primers flanking the cloning site of the ORF in the pTHA vector (HAF and HAR). The composition of the PCR reaction and the cycling parameters are identical to those employed for library screening described above.
Example 5. Functional assay for bacterial inhibitory activity of bacteriophage 77 ORFs.
The anti-microbial activity of individual phage 77 ORFs was monitored by two growth inhibitory assays, one on solid agar medium, the other in liquid medium.
In general, Staphylococcus bacteria transformed with expression plasmids containing individual ORFs were grown in normal TSA medium and stored in 19% glycerol. At pre-determined times, arsenite was added to the culture to induce transcription of the phage 77 ORFs cloned immediately downstream from an arsenite-inducible promoter in the pTHA expression plasmid.
The effect of ORF induction on bacterial growth characteristics was then monitored and quantitated. The growth inhibition assay on solid medium was performed by streaking pTHA/ORF-containing S. aureus transformants onto LB-Kn and TSA-Kn plates containing increasing concentrations of sodium arsenite (2.5; 5; and 7.5 µM). Arsenite is used to induce the expression of cloned DNA in the pTHA vector. In parallel, 3 µl of 1/10 and 1/100 dilutions of the frozen cultures of the pTHA/ORF transformants were spotted as single drops onto LB-Kn and TSA-Kn plates containing increasing concentrations of sodium arsenite (2.5; 5; and 7.5 µM).
The plates were then incubated 16 hrs at 37°C, and the effect of arsenite-induced ORF expression on bacterial growth was monitored and quantitated by comparing the extent of growth to that seen in control plates. As positive controls for growth inhibition, the holin/lysin genes of the Staphylococcus aureus phage Twort (Loessner et al., 1998) were subcloned into the pTHA ars inducible vector and used.
For the growth inhibition assay in liquid medium, stationary phase cultures were prepared by inoculating 2.5 ml TSB-Kn with frozen S. aureus RN4220 transformants containing phage 77 ORFs cloned in pTHA vector followed by incubation for 16 hrs at 37°C. These cultures were then diluted 1/100 in the same medium, and the bacteria were allowed to grow for 2 hrs at 37°C to reach early log phase. 150 µl of such culture were then mixed with 2.35 ml TSB-Kn medium with or without arsenite (the final concentration of arsenite in the medium was 0 or 5 µM). After 3.5 hrs incubation at 37°C with shaking at 250 rpm, 100 µl of bacterial culture was removed from each tube for OD565 measurement. Serial ten-fold dilutions of the culture in buffered saline solution (0.85% NaCl) were then spotted onto TSB-Kn plates. The plates were incubated at 37°C for 16 hrs and the number of surviving colonies was counted the following day. The growth inhibitory property of individual ORFs was then quantitated by comparing CFU numbers under normal or arsenite-induction conditions. A schematic flow of the inhibition analysis is shown in Fig. 3 (also applicable to inhibition analysis for the other phage and bacteria pointed out herein). Inhibition results are shown in Figures 4A-C.
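The comparison of CFU numbers with and without arsenite can be expressed as a simple calculation. The following Python sketch is illustrative only: the spot volume is not stated for the liquid assay, so it is left as a parameter, and the function names `cfu_per_ml` and `surviving_fraction` are hypothetical.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, volume_plated_ul: float) -> float:
    """Back-calculate CFU/ml of the undiluted culture from one counted spot.

    dilution_factor is the fold dilution of the plated sample
    (e.g. 1e4 for the fourth ten-fold serial dilution).
    volume_plated_ul is the spotted volume in microliters (an assumed input).
    """
    if volume_plated_ul <= 0 or dilution_factor <= 0:
        raise ValueError("volume and dilution factor must be positive")
    return colonies * dilution_factor * 1000.0 / volume_plated_ul


def surviving_fraction(cfu_induced: float, cfu_uninduced: float) -> float:
    """Ratio of CFU/ml with arsenite to CFU/ml without arsenite."""
    if cfu_uninduced <= 0:
        raise ValueError("uninduced control must have a positive CFU count")
    return cfu_induced / cfu_uninduced
```

A surviving fraction close to 1 would indicate no measurable effect of ORF induction, while values well below 1 would indicate growth inhibition; any cut-off applied to this ratio is a choice for the experimenter and is not specified here.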
Example 6. Identification of Cecropin Signature Motif in Staphylococcus aureus Bacteriophage 3A ORF.
The genome sequence of S. aureus bacteriophage 3A was determined and analyzed essentially as described for bacteriophage 77 in the examples above.
Upon BLAST analysis of the identified open reading frames of phage 3A, the presence of an amino acid sequence corresponding to a cecropin signature motif was observed.
This motif (WDGHKTLEK) is located at amino acid positions 481-489. Cecropins were originally identified in proteins from the cecropia moth and are recognized as potent antibacterial proteins that constitute an important part of the cell-free immunity of insects. Cecropins are small proteins (31-39 amino acid residues) that are active against both Gram-positive and Gram-negative bacteria by disrupting the bacterial membranes. Although the mechanisms by which the cecropins cause cell death are not fully understood, they are generally thought to involve channel formation and membrane destabilization.
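Locating such a motif in a translated ORF amounts to an exact substring search with 1-based amino acid coordinates. The Python sketch below shows this under stated assumptions: the function name `find_motif` is hypothetical, and the sequence used in the usage note is a synthetic placeholder, not the phage 3A ORF.

```python
from typing import List, Tuple


def find_motif(protein: str, motif: str) -> List[Tuple[int, int]]:
    """Return 1-based (first, last) positions of every exact motif occurrence."""
    protein, motif = protein.upper(), motif.upper()
    hits = []
    pos = protein.find(motif)
    while pos != -1:
        hits.append((pos + 1, pos + len(motif)))
        pos = protein.find(motif, pos + 1)  # allow overlapping occurrences
    return hits
```

With a synthetic sequence built as 480 alanine residues followed by WDGHKTLEK, `find_motif` reports the motif at positions 481-489, which is the coordinate convention used in the text. Signature databases usually describe motifs as patterns with permitted substitutions, so an exact match is a simplification.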
The identification of a motif corresponding to a known inhibitor suggests that the product of ORF002 is also an inhibitory compound. Such inhibitory activity can be confirmed as described herein or by other methods known in the art. Confirmation of the inhibitory activity would indicate that the ORF product could serve as the basis for construction of mimetic compounds and other inhibitors directed to the target of the ORF002 product.
Boman and Hultmark, 1987, Ann. Rev. Microbiol. 41:103-126.
Boman, 1991, Cell 65:205-207.
Boman et al., 1991, Eur. J. Biochem. 201:23-31.
Wang et al., J. Biol. Chem. 273:27438-27448.
Example 7. Growth of Staphylococcus aureus bacteriophage 44AHJD.
Staphylococcus aureus propagating strain (PS 44A) (Felix d'Herelle Reference Centre #HER 1101) was used as a host to propagate its respective phage 44AHJD (Felix d'Herelle Reference Centre #HER 101). Two rounds of plaque purification of phage 44AHJD were performed on soft agar essentially as described in Sambrook et al. (1989). Briefly, the Staphylococcus aureus PS strain was grown overnight at 37°C in Nutrient Broth [NB: 3 g Bacto Beef Extract, 5 g Bactopeptone per liter (Difco Laboratories 0003-17-8), supplemented with 0.5% NaCl]. The culture was then diluted 20 fold in NB and incubated at 37°C until the OD reached 0.2. In order to obtain single plaques, phage 44AHJD was subjected to 10-fold serial dilutions using the phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin) and 10 µl were used to infect 0.5 ml of the cell suspension in the presence of 400 µg/ml of CaCl2. After incubation for 15 min at room temperature, 2 ml of melted soft agar (NB supplemented with 0.6% of agar) were added to the mixture and poured onto the surface of 100 mm nutrient agar plates (3 g Bacto Beef extract, 5 g Bactopeptone, NaCl and 15 g of Bacto agar per liter (Difco Laboratories 0001-17-0)). After overnight incubation at 37°C, a single plaque was isolated and resuspended in 1 ml of phage buffer by end over end rotation for 2 h at room temperature, and the phage suspension was diluted and used for a second infection as described above. After overnight incubation at 37°C, a single plaque was isolated and used as a stock.
Large scale purification of bacteriophage and preparation of phage DNA was as follows.
The propagation was carried out using the agar layer method described by Swanström and Adams (1951). Briefly, the PS 44A strain was grown to stationary phase overnight at 37°C in Nutrient Broth. The culture was then diluted in NB and incubated at 37°C until the A540 = 0.2. The suspension (15 x 10^7 bacteria) was then mixed with 15 x 10^5 phage particles to give a ratio of 100 bacteria per phage particle in the presence of 400 µg/ml of CaCl2. After incubation for 15 min at room temperature, 7.5 ml of melted soft agar were added to the mixture, which was poured onto the surface of 150 mm nutrient agar plates and incubated overnight at 37°C. To collect the lysate, 20 ml of NB were added to each plate and the soft agar layer was collected by scraping off with a clean microscope slide and shaken vigorously for 5 min to break up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g) using a JA-10 rotor (Beckman) and the supernatant (lysate) was collected and treated with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate the phage particles, 10% PEG 8000 and 0.5 M NaCl were added to the lysate and the mixture was incubated on ice for 16 h. The phage was recovered by centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C on a GS-6R table top centrifuge (Beckman).
The pellet was resuspended in 2 ml of phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin). The phage suspension was extracted with 1 volume of chloroform and further purified by centrifugation on a preformed cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS rotor in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm (67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again on an isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 h at 4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage suspension by adding 20 mM EDTA, 50 µg/ml Proteinase K and 0.5% SDS and incubating for 1 h at 65°C, followed by successive extractions with 1 volume of phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 1 mM EDTA).
Example 8. DNA sequencing of the Bacteriophage 44 AHJD genome.
Four µg of phage DNA was diluted in 200 µl of TE pH 8.0 in a 1.5 ml Eppendorf tube and sonication was performed (550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an amplitude of 3 µm with bursts of 5 s spaced by 15 s cooling in ice/water for 3 to 4 cycles. The sonicated DNA was then size fractionated by electrophoresis on 1% agarose gels.
Fractions ranging from 1 to 2 kbp were excised from the agarose gel and purified using a commercial DNA extraction system according to the instructions of the manufacturer (Qiagen) and eluted in 50 µl of 1 mM Tris-HCl pH. The ends of the sonicated DNA fragments were repaired with a combination of T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I as follows. Reactions were performed in a final volume of 100 µl containing DNA, mM Tris-HCl pH 8.0, 50 mM NaCl, 10 mM MgCl2, 1 mM DTT, 5 µg BSA, 100 µM of each dNTP and 15 units of T4 DNA polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5 units of Klenow fragment (New England Biolabs) for 15 min at room temperature. The reaction was stopped by two phenol/chloroform extractions and the DNA was ethanol precipitated and resuspended in 20 µl of H2O.
Cloning of the sonicated phage DNA into pKSII vector and transformation: Blunt-ended DNA fragments were cloned by ligation directly into the HincII site of the pKSII vector (Stratagene) dephosphorylated with calf intestinal alkaline phosphatase (New England Biolabs). A typical reaction contained 100 ng of vector and 2 to 5 µl of repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New England Biolabs), and was incubated overnight at 16°C.
Transformation and selection of positive clones was performed in the host strain DH10β of E. coli using ampicillin as a selective antibiotic as described in Sambrook et al. (1989).
Recombinant clones were picked from agar plates into 96-well plates containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers flanking the HincII cloning site of the pKS vector. PCR amplification of the potential foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris-HCl (pH 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C, followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 to 2 kbp were selected and plasmid DNA was prepared from the selected clones using the QIAprep™ spin miniprep kit (Qiagen).
The nucleotide sequence of the extremities of each recombinant clone was determined using an ABI 377-36 automated sequencer with two types of chemistry: ABI PRISM BigDye™ primer cycle sequencing (21M13 primer: #403055; M13REV primer: #403056) or ABI PRISM BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and the genome, all regions of the phage genome were sequenced at least once from both directions on two separate clones. In areas where this criterion was not initially met, a sequencing primer was selected and phage DNA was used directly as sequencing template employing the ABI PRISM BigDye™ terminator cycle sequencing ready reaction kit.
Example 9. Bioinformatic management of primary nucleotide sequence.
Sequence contigs were assembled using Sequencher™ 3.1 software (GeneCodes). To close contig gaps, sequencing primers were selected near the edge of the contigs. Phage DNA was used directly as sequencing template employing the ABI PRISM BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems; #4303152). The complete sequence of Staphylococcus aureus bacteriophage 44AHJD is shown in Table 16.
A software program was used on the assembled sequence of bacteriophage 44AHJD to identify all putative ORFs larger than 33 codons. The software scans the primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon.
Three possible selections can be made for defining the nature of the start codon: I) selection of ATG; II) selection of ATG or GTG; and III) selection of any of ATG, GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds to the one reported by the NCBI (http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the bacterial genetic code. When an appropriate start codon is encountered, a counting mechanism is employed to count the number of codons (groups of three nucleotides) between this start codon and the next stop codon downstream of it. If a threshold value of 33 is reached, or exceeded, then the sequence encompassed by these two codons is defined as an ORF. This procedure is repeated, each time starting at the next nucleotide following the previous stop codon found, in order to identify all the other putative ORFs. The scan is performed on all three reading frames of both DNA strands of the phage sequence.
The predicted ORFs for bacteriophage 44AHJD are listed in Tables 17 and 18.
Sequence homology searches for each ORF were carried out using an implementation of BLAST programs. Downloaded public databases used for sequence analysis include: i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z); ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z); iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z); iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z); v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-1k.fa); vi) Streptococcus pneumoniae (ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z); vii) PRODOM (ftp://ftp.toulouse.inra.fr/pub/prodom/currentrelease/prodom99J.o-rbl ast.gz); viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/); and ix) TREMBL (ftp://www.expasy.ch/databases/sp_trnrdb/fasta/).
The results of the homology searches performed on the ORFs of bacteriophage 44AHJD are shown in Tables 19
Example 10. Sub-Cloning of Bacteriophage 44 AHJD ORFs.
Expression preferably utilizes a shuttle expression vector which is arranged such that expression of the exogenous bacteriophage 44 AHJD ORF sequence is inducible. For example, the shuttle vector pT0021, in which the firefly luciferase (lucFF) expression is controlled by the ars (arsenite) promoter/operator (Tauriainen et al., 1997), can be modified in the following fashion. Two oligonucleotides corresponding to a short antigenic peptide derived from the hemagglutinin protein of influenza virus (HA epitope tag) were synthesized (Field et al., 1988). The sense strand HA tag sequence (with BamHI, SalI and HindIII cloning sites) is: 5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3' (where upper case letters denote the nucleotide sequence of the HA tag); the antisense strand HA tag sequence (with a HindIII cloning site) is: 5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3' (where upper case letters denote the sequence of the HA tag). The two HA tag oligonucleotides were annealed and ligated into pT0021 vector which had been digested with BamHI and HindIII. This manipulation resulted in replacement of the lucFF gene by the HA tag. This modified shuttle vector containing the arsenite inducible promoter, the arsR gene, and HA tag was named pTHA. A diagram outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A (another useful vector construct is shown in Fig. 1B).
Each ORF encoded by bacteriophage 44 AHJD that is larger than 33 amino acids and has a Shine-Dalgarno sequence upstream of the initiation codon can be selected for functional analysis for bacterial inhibition. Each individual ORF, from initiation codon to last codon (excluding the stop codon), can be amplified from phage genomic DNA using the polymerase chain reaction (PCR). For PCR amplification of ORFs, each sense strand primer targets the initiation codon and is preceded by a BamHI restriction site (5'-cgggatcc-3'), and each antisense oligonucleotide targets the penultimate codon (the one before the stop codon) of the ORF and is preceded by a SalI restriction site (5'-gcgtcgaccg-3'). The PCR product of each ORF can be gel purified and digested with BamHI and SalI. The digested PCR product can then be gel purified using the Qiagen kit as described, ligated into BamHI and SalI digested pTHA vector, and used to transform E. coli bacterial strain DH10β (as described above). As a result of this manipulation, the HA tag is set in frame with the ORF and is positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant pTHA/ORF clones can be picked and their insert sizes confirmed by PCR analysis using primers flanking the cloning site. The following primers can be used for PCR amplification: HAF: 5'-TATTATCCAAAACTTGAACA-3'; HAR: 5'-CGGTGGTATATCCAGTGATT-3'.
The sequence integrity of cloned ORFs can be verified directly by DNA sequencing using primers HAF and HAR. In cases where verification of ORF sequence cannot be achieved by one pass with the sequencing primers, additional internal primers will be selected and used for sequencing.
Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) will be used as a recipient for the expression of recombinant plasmids. Electroporation will be performed essentially as previously described (Schenk and Laddaga, 1992). Selection of recombinant clones will be performed on Luria-Broth agar (LB-agar) plates containing 30 µg/ml of kanamycin.
Alternatively, a constitutive promoter can be used to drive expression of the introduced ORF, and cell growth can be compared to that of control bacterial cells containing the parental vector lacking any introduced phage ORF. Recombinant plasmids will be introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using electroporation as previously described (Schenk and Laddaga, 1992).
Cloning of ORFs with a Shine-Dalgarno sequence.
ORFs with a Shine-Dalgarno sequence are selected for functional analysis of bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop codon), can be amplified by PCR from phage genomic DNA. For PCR amplification of ORFs, each sense strand primer starts at the initiation codon and is preceded by a restriction site, and each antisense strand primer starts at the last codon (excluding the stop codon) and is preceded by a different restriction site. The PCR product of each ORF will be gel purified and digested with the restriction enzymes whose sites are contained in the PCR oligonucleotides. The digested PCR product is then gel purified using the Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial strain DH10β. Recombinant clones are then picked and their insert sizes confirmed by PCR analysis using primers flanking the cloning site as well as by restriction digestion.
The sequence fidelity of cloned ORFs can be verified by DNA sequencing using the same primers as used for PCR. In cases where verification of ORFs cannot be achieved by one pass of sequencing using primers flanking the cloning site, internal primers can be selected and used for sequencing. Recombinant plasmids can be introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using electroporation as previously described (Schenk and Laddaga, 1992).
Induction of gene expression from the ars promoter.
If an inducible promoter such as the ars promoter is used, induction can be assessed, for example, by either of the following two methods.
1. Screening on agar plates.
The functional identification of killer ORFs can be performed by spreading an aliquot of S. aureus transformed cells containing phage 44 AHJD ORFs onto agar plates containing different concentrations of sodium arsenite (2.5; 5; and 7.5 µM). The plates are incubated overnight at 37°C, after which growth of the ORF transformants on plates that contain arsenite is compared with growth on plates without arsenite.
2. Quantification of growth inhibition in liquid medium.
Cells containing different recombinant plasmids can be grown overnight at 37°C in LB medium supplemented with the appropriate antibiotic selection. These are then diluted to mid log phase (OD540) with fresh media containing antibiotic and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at different final concentrations (ranging from 2.5 to 10 µM) and the culture is incubated for an additional 2 hrs at 37°C. The effect of expression of the phage 44 AHJD ORFs on bacterial cell growth is then monitored by measuring the OD540 and comparing the rate of growth to that of the culture not containing inducer. As positive controls for growth inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, Lubitz, W. and Blasi, U. 1993. Virology 193: 1033-1036) and the holin/lysin genes of the Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, Wendlinger, G., Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters 162: 265-274) can be subcloned into the ars inducible vector. An aliquot of the induced and uninduced culture can also be plated out on agar plates containing an appropriate antibiotic selection but lacking inducer. Following incubation overnight at 37°C, the number of colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but detectable, number of colonies on the agar plates when grown in the presence of inducer as compared to when grown in the absence of inducer. Any ORF showing full bactericidal activity will show no colonies on the agar plates when grown in the presence of inducer as compared to when grown in the absence of inducer.
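The colony-count criteria stated above can be written as a small decision rule. The Python sketch below is a literal encoding of those criteria and nothing more: the function name `classify_orf_activity` and the returned labels are hypothetical, and in practice replicate counts and a statistical test would be needed before treating a lower count as meaningful.

```python
def classify_orf_activity(colonies_induced: int, colonies_uninduced: int) -> str:
    """Classify an ORF from colony counts on inducer-free plates.

    colonies_induced: colonies from the culture grown with inducer.
    colonies_uninduced: colonies from the culture grown without inducer.
    """
    if colonies_induced < 0 or colonies_uninduced < 0:
        raise ValueError("colony counts cannot be negative")
    if colonies_uninduced == 0:
        # Without growth in the uninduced control the comparison is uninformative.
        raise ValueError("uninduced control shows no colonies")
    if colonies_induced == 0:
        return "bactericidal"      # no colonies after growth with inducer
    if colonies_induced < colonies_uninduced:
        return "bacteriostatic"    # lower, but detectable, number of colonies
    return "no inhibition"
```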
REFERENCES
Ackermann, H-W. and DuBow, M. S. (1987). Viruses of Prokaryotes. Volumes I and II. CRC Press, Boca Raton, Florida.
Tenover, F.C. and McGowan Jr., J.E. (1998). Bacterial Infections of Humans. Epidemiology and Control. (A.S. Evans and P.S. Brachman, eds.) Plenum Medical Book Company, New York, N.Y. pp. 83-93.
Rusterholtz, and Pohlschroder, M. (1999). Cell 96, 469-470.
Gray, B.M. (1998). Bacterial Infections of Humans. Epidemiology and Control. (A.S. Evans and P.S. Brachman, eds.) Plenum Medical Book Company, New York, N.Y. pp. 673-711.
Sambrook, Fritsch, E.F. and Maniatis, T. (1989). Molecular cloning: A laboratory manual. Cold Spring Harbor Laboratory, New York. Cold Spring Harbor Laboratory Press.
Ausubel, F.M. et al. (1994). Current Protocols in Molecular Biology. John Wiley & Sons, Secaucus, N.J.
Rost, B. and Sander, C. (1996). Ann. Rev. Biophys. Biomol. Struct. 25, 113-136.
Martin, Lopez, Garcia, P. (1998). J Bacteriol 180, 210-217.
Steiner, Lubitz, Blasi, U. (1993). J. Bacteriol. 175, 1038-1042.
Durfee, Becherer, Chen, Yeh, Yang, Kilburn, Lee, W.and Elledge, S.J. (1993). Genes Dev. 7, 555-569.
Qiu, Garcia-Barrio, and Hinnebusch, A.G. (1998). Mol Cell Biol. 18, 2697-2711.
Katagiri, Saito, Shinohara, Ogawa, Kamada, Nakamura and Miki, Y. (1998). Genes, Chromosomes & Cancer 21, 217-222.
Endo, Masuhara, Yokouchi, Suzuki, Sakamoto, Mitsui, K., Matsumoto, Tanimura, Ohtsubo, Misawa, Miyazaki, Leonor N., Taniguchi, Fujita, Kanakura, Komiya, and Yoshimura, A. (1997). Nature 387, 921-924.
Karimova, Pidoux, Ullmann, Ladant, D. (1998) Proc. Natl. Acad. Sci. 5752-5756.
Sopta, Carthew, and Greenblatt, J. (1995) J. Biol. Chem. 260, 10353- 10369.
Qin, Fenyo, Zhao, Hall, Chao, Wilson, Young, R.A. and Chait, B.T. (1997). Anal. Chem. 69, 3995-4001.
Swanström, M. and Adams, M.H. (1951). Proc. Soc. Exptl. Biol. Med. 78: 372-375.
Røder, Wandall, D. Frimodt-Moller, Epersen, Skinhoj, P. and Rosdahl, T. (1999). Arch. Intern. Med. 159: 462-469.
Sanabria, Albert, Goldberg, Pape, L.A. and Cheeseman, S.H. (1990).
Arch. Intern. Med. 150: 1305-1309.
Frimodt-Moller, Epersen, Skinhoj, P. and Rosdahl, V.T. (1997). Clin.
Microbiol. Infect. 3: 297-305.
Harbath, Rutschmann, Sudre, P. and Pittet, D. (1998). Arch. Intern. Med. 158: 182-189.
Steinberg, Clark, C.C. and Hackman, B.0. (1996). Clin. Infect. Dis. 23: 255-259.
Field, Nikawa, Broek, MacDonald, Rodgers, Wilson, Lemner, and Wigler, M. (1988). Purification of a RAS-responsive adenylyl cyclase complex from Saccharomyces cerevisiae by use of an epitope addition method. Mol. Cell. Biol. 8: 2159-2165.
Kreiswirth, BN., Lofdahl, Belley, MJ., O'Reilly, Shlievert, PM., Bergdoll, MS. and Novicks, RP. (1983) Nature 305: 709-712.
Schenk, S. and Laddaga, RA. (1992) FEMS Microbiology Letters 94: 133-138.
Cohen, M.L. (1992) Science 257, 1050-1055.
Example 11. Growth of Enterococcus bacteriophage 182 and purification of genomic DNA.
The Enterococcus propagating strain (PS) (Enterococcus sp. Group D, Felix d'Herelle Reference Centre #HER 1080) was used as host to propagate its respective phage 182 (Felix d'Herelle Reference Centre #HER 80). Two rounds of plaque purification of phage 182 were performed on soft agar essentially as described in Sambrook et al. (1989). Briefly, the Enterococcus sp. PS strain was grown overnight at 37°C in Tryptic Soy Broth [TSB: 17 g Bacto tryptone, 3 g Bacto soytone, 2.5 g Bacto dextrose, 5 g sodium chloride, and 2.5 g dipotassium phosphate per liter (Difco Laboratories #0370-17-3)]. The culture was then diluted 20-fold in TSB and incubated at 37°C with constant agitation until the OD540 reached 0.2 (early log phase). In order to obtain single plaques, phage 182 was subjected to 10-fold serial dilutions using the phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin), and 10 µl of each dilution was used to infect 0.5 ml of the bacterial cell suspension. After incubation for 15 min at 37°C, 2 ml of melted soft agar (TSB supplemented with 0.6% agar) was added to the mixture, which was then poured onto the surface of 100 mm Tryptic Soy Agar plates [TSA: 15 g tryptone peptone, 5 g soytone peptone, 5 g sodium chloride and 15 g of agar per liter (Difco Laboratories #0369-)]. After overnight incubation at 37°C, a single plaque was isolated and resuspended in 1 ml of phage buffer by end-over-end rotation for 2 hrs at room temperature, and the phage suspension was diluted and used for a second infection as described above.
After overnight incubation at 37°C, a single plaque was isolated and used as a stock for all subsequent manipulations.
The propagation procedure for bacteriophage 182 was modified from the agar layer method of Swanström and Adams (1951). Briefly, the Enterococcus sp. PS strain was grown to stationary phase overnight at 37°C in TSB. The culture was then diluted 20-fold in TSB and incubated at 37°C until the A540 = 0.2. The suspension (15 x 10^7 bacteria) was then mixed with 15 x 10^5 plaque forming units (pfu) to give a ratio of 100 bacteria/pfu. After incubation for 15 min at 37°C, 7.5 ml of melted soft agar (TSB plus 0.6% agar) were added to the mixture, which was poured onto the surface of 150 mm TSA plates and incubated for 16 hrs at 37°C. To collect the plate lysate, 20 ml of TSB were added to each plate and the soft agar layer was collected by scraping it off with a clean microscope slide, followed by vigorous shaking of the agar suspension for min to break up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 xg) using a JA-10 rotor (Beckman), and the supernatant fluid (lysate) was collected and treated with 10 µg/ml of DNase I and RNase A for min at 37°C. To precipitate the phage particles, the phage suspension was adjusted to 10% PEG 8000 and 0.5 M NaCl, followed by incubation at 4°C for 16 hrs. The phage was recovered by centrifugation at 4,000 rpm (3,500 xg) for 20 min at 4°C on a GS-6R table top centrifuge (Beckman). The pellet was resuspended in 2 ml of phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin). The phage suspension was extracted with 1 volume of chloroform and further purified by centrifugation on a cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS 55 rotor in an Optima TLX ultracentrifuge (Beckman) for 2 hrs at 28,000 rpm (67,000 xg) at 4°C. Banded phage was collected and ultracentrifuged again on an isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 xg) for 24 hrs at 4°C using a TLV rotor (Beckman). The phages were harvested and dialyzed for 4 hrs at room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage suspension by adding 20 mM EDTA, 50 µg/ml Proteinase K and 0.5% SDS and incubating for 1 hr at 65°C, followed by successive extractions with 1 volume of phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH ], 1 mM EDTA).
Example 12. DNA sequencing of the Bacteriophage 182 genome.
Four micrograms of phage DNA was diluted in 200 µl of TE (10 mM Tris [pH ], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonication was performed (550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an amplitude of 3 µm with bursts of 5 s spaced by 15 s of cooling in ice/water for 3 to 4 cycles. The sonicated DNA was then size fractionated by electrophoresis on 1% agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH ]) as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the agarose gel and purified using a commercial DNA extraction system according to the instructions of the manufacturer (Qiagen), with a final elution of 50 µl of 1 mM Tris [pH ]. The ends of the sonicated DNA fragments were repaired with a combination of T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as follows. Reactions were performed in a reaction mixture (final volume, 100 µl) containing sonicated phage DNA, 10 mM Tris-HCl [pH ], 50 mM NaCl, 10 mM MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5 units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for min at room temperature. The reaction was stopped by two phenol/chloroform extractions, the DNA was precipitated with ethanol, and the final DNA pellet was resuspended in 20 µl of H2O.
Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation reaction contained 100 ng of vector DNA and 2 to 5 µl of repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New England Biolabs) and was incubated overnight at 16°C. Transformation and selection of bacterial clones containing recombinant plasmids was performed in E. coli DH10β according to standard procedures (Sambrook et al., 1989).
Recombinant clones were picked from agar plates into 96-well plates containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence of a phage DNA insert was confirmed by PCR amplification using T3 and T7 primers flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris (pH ), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C, followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 to 2 kbp were selected and plasmid DNA was prepared from the selected clones using the QIAprep™ spin miniprep kit (Qiagen).
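As a rough illustration only, the thermocycling parameters above can be written down as a small data structure, which makes the total programmed hold time easy to check. The class and function names below are hypothetical and not part of the described procedure, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One temperature hold in the PCR program."""
    name: str
    temp_c: int
    seconds: int


# Values taken from the text: 2 min at 94 C, then 20 cycles of
# 30 s at 94 C / 30 s at 58 C / 2 min at 72 C, then 10 min at 72 C.
INITIAL = Step("initial denaturation", 94, 120)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 58, 30),
    Step("extension", 72, 120),
)
N_CYCLES = 20
FINAL = Step("final extension", 72, 600)


def total_hold_seconds(initial=INITIAL, cycle=CYCLE, n_cycles=N_CYCLES, final=FINAL):
    """Sum of programmed hold times; ramping between steps is not modelled."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


if __name__ == "__main__":
    print(total_hold_seconds() / 60, "min of programmed holds")
```

Under these assumptions the program amounts to 72 min of hold time per run, excluding ramping.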
The nucleotide sequence of the extremities of each recombinant clone was determined using an ABI 377-36 automated sequencer with two types of chemistry: ABI Prism BigDye™ primer cycle sequencing (21M13 primer: #403055; M13REV primer: #403056) or the ABI Prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and the genome, all regions of the phage genome were sequenced at least once from both directions on two separate clones. In areas where this criterion was not initially met, a sequencing primer was selected and phage DNA was used directly as sequencing template employing the ABI Prism BigDye™ terminator cycle sequencing ready reaction kit.
Example 13. Bioinformatic management of primary nucleotide sequence.
Sequence contigs were assembled using Sequencher™ 3.1 software (GeneCodes). To close contig gaps, sequencing primers were selected near the edge of the contigs. Phage DNA was used directly as sequencing template employing the ABI Prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems; #4303152). The complete sequence of Enterococcus bacteriophage 182 is shown in Table 21.
A software program was used on the assembled sequence of bacteriophage 182 to identify all putative ORFs of 33 codons or more. The software scans the primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon. Three possible selections can be made for defining the nature of the start codon: I) selection of ATG; II) selection of ATG or GTG; and III) selection of any of ATG, GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds to the one reported by the NCBI (http://www.ncbi.nlm.nih.gov/htbinpost/Taxonomy/wprintgc?mode=c) for the bacterial genetic code. When an appropriate start codon is encountered, a counting mechanism is employed to count the number of codons (groups of three nucleotides) between this start codon and the next stop codon downstream of it. If a threshold value of 33 is reached or exceeded, then the sequence encompassed by these two codons is defined as an ORF. This procedure is repeated, each time starting at the next nucleotide following the previous stop codon found, in order to identify all the other putative ORFs. The scan is performed on all three reading frames of both DNA strands of the phage sequence.
The predicted ORFs for bacteriophage 182 are listed in Tables 22 and 23.
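The ORF scan described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software actually used: the function names are hypothetical, the stop codons are assumed to be TAA, TAG and TGA, the start codon is counted toward the 33-codon threshold while the stop codon is not, and coordinates are reported on the strand being scanned.

```python
# Start codon selections I-III as listed in the text.
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard bacterial stops
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_frame(seq, frame, starts, min_codons=33):
    """Return (start, end) pairs (0-based, end exclusive, stop included)."""
    orfs = []
    i = frame
    while i + 3 <= len(seq):
        if seq[i:i + 3] not in starts:
            i += 3
            continue
        j = i + 3
        while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
            j += 3
        if j + 3 > len(seq):
            break  # no in-frame stop codon downstream of this start
        if (j - i) // 3 >= min_codons:  # codons from start up to the stop
            orfs.append((i, j + 3))
        i = j + 3  # resume just after the stop codon
    return orfs


def find_orfs(seq, start_set="I", min_codons=33):
    """Scan three frames on both strands; return (strand, frame, start, end)."""
    seq = seq.upper()
    starts = START_SETS[start_set]
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in scan_frame(s, frame, starts, min_codons):
                found.append((strand, frame, start, end))
    return found
```

A broader start codon set (selection III) will generally report more and longer candidate ORFs than ATG alone, so the choice of set is a trade-off between sensitivity and the number of spurious predictions.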
Sequence homology searches for each ORF were carried out using an implementation of BLAST programs. Downloaded public databases used for sequence analysis include: i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z); ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z); iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z); iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z); v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staphlk.fa); vi) Streptococcus pyogenes (ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs. 112197.Z); vii) PRODOM (ftp://ftp.toulouse.inra.fr/pub/prodom/current release/prodom99.1.forblast.gz); viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/); ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/). The results of the homology searches performed on the ORFs of bacteriophage 182 are shown in Tables 24 26.
Example 14. Sub-Cloning of Bacteriophage 182 ORFs.
Preparation of the shuttle expression vector
Expression preferably utilizes a shuttle expression vector which is arranged such that expression of the exogenous bacteriophage 182 ORF sequence is inducible.
For example, the plasmid pND50 replicates in E. coli, E. faecalis, and S. aureus (Yamagishi, Kojima, Oyamada, Fujimoto, Hattori, Nakamura, S., and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163). This plasmid can be modified by conventional techniques to insert the inducible arsenite promoter, derived from the shuttle vector pT0021, in which firefly luciferase (lucFF) expression is controlled by the ars promoter/operator from a S. aureus plasmid (Tauriainen, Karp, Chang, W and Virta, M. (1997). Recombinant luminescent bacteria for measuring bioavailable arsenite and antimonite. Appl. Environ. Microbiol. 63:4456-4461). This modified shuttle vector will contain the ars promoter, the arsR gene and a cloning site for introduction of individual phage ORFs downstream from a Shine-Dalgarno sequence.
Other inducible regulatory sequences can be utilized instead of the arsenite-inducible system. An example is a nisin-inducible system. The nisA promoter activity is dependent on the proteins NisR and NisK, which constitute a two-component signal transduction system that responds to the extracellular inducer nisin. The nisin sensitivity and the inducer concentration required for maximal induction vary among strains, but the system is functional in Streptococcus pyogenes, Streptococcus agalactiae, Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis. Significant induction of the nisA promoter (10- to 60-fold induction) can be obtained in all of the species. A vector containing this promoter was published as Eichenbaum Z, Federle MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR (1998) Appl Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, that allow replication and transcription in Enterococcus can also be utilized.
Alternatively, a constitutive promoter can be used (e.g., the β-lactamase promoter is constitutive in E. faecalis; see ref. 1) to drive expression of the introduced ORF, and cell growth can be compared to that of control bacterial cells containing the parental vector lacking any introduced phage ORF. Recombinant plasmids are introduced into E. faecalis strain FA2-2 by electroporation, as previously described (Yamagishi, Kojima, Oyamada, Fujimoto, Hattori, Nakamura, S., and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163).
Cloning of ORFs with a Shine-Dalgarno sequence
ORFs with a Shine-Dalgarno sequence are selected for functional analysis of bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop codon), will be amplified by PCR from phage genomic DNA. For PCR amplification of ORFs, each sense strand primer starts at the initiation codon and is preceded by a restriction site, and each antisense strand primer starts at the last codon (excluding the stop codon) and is preceded by a different restriction site. The PCR product of each ORF will be gel purified and digested with the restriction enzymes whose sites are contained in the PCR oligonucleotides. The digested PCR product is then gel purified using the Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial strain DH10β. Recombinant clones are then picked and their insert sizes confirmed by PCR analysis using primers flanking the cloning site as well as by restriction digestion.
The sequence fidelity of cloned ORFs will be verified by DNA sequencing using the same primers as used for PCR. In cases where an ORF cannot be verified in a single pass of sequencing using primers flanking the cloning site, internal primers will be selected and used for sequencing. Recombinant plasmids will be introduced into E. faecalis strain FA2-2 by electroporation, as previously described (Yamagishi, Kojima, Oyamada, Fujimoto, Hattori, Nakamura, S., and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163).
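The primer layout described above (restriction site followed by the first codons for the sense primer; a different restriction site followed by the reverse complement of the last codons, stop codon excluded, for the antisense primer) can be sketched as follows. This is only an illustration: the function name, the 18-nucleotide annealing length, and the BamHI (GGATCC) and HindIII (AAGCTT) sites in the usage example are assumptions that do not come from the text, and no extra 5' bases are added upstream of the sites.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard bacterial stops
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_orf_primers(orf, sense_site, antisense_site, anneal_len=18):
    """Return (sense_primer, antisense_primer), both written 5'->3'.

    orf            ORF sequence from initiation codon, with or without stop
    sense_site     restriction site placed 5' of the initiation codon
    antisense_site a different restriction site for the antisense primer
    anneal_len     hypothetical length of the ORF-matching portion
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if sense_site == antisense_site:
        raise ValueError("the two restriction sites must differ")
    # Drop the stop codon so the amplified product ends at the last codon.
    coding = orf[:-3] if orf[-3:] in STOP_CODONS else orf
    if len(coding) < anneal_len:
        raise ValueError("ORF shorter than the requested annealing length")
    sense = sense_site + coding[:anneal_len]
    antisense = antisense_site + reverse_complement(coding[-anneal_len:])
    return sense, antisense


if __name__ == "__main__":
    demo_orf = "ATG" + "GCT" * 10 + "TAA"  # toy sequence, not a phage ORF
    print(design_orf_primers(demo_orf, "GGATCC", "AAGCTT", anneal_len=9))
```

Using two different sites, as the text specifies, fixes the orientation of the insert relative to the promoter when the digested product is ligated into the vector.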
Induction of gene expression from the ars promoter.
If an inducible promoter is used, e.g., the ars promoter, induction can be assessed, for example, by either of the following two methods.
1. Screening on agar plates
The functional identification of killer ORFs can be performed by spreading an aliquot of E. faecalis transformed cells containing a phage 182 ORF onto agar plates containing different concentrations of sodium arsenite (2.5; 5; and 7.5). The plates are incubated overnight at 37°C, after which growth of the ORF transformants on plates that contain arsenite is compared to growth on plates without arsenite.
2. Quantification of growth inhibition in liquid medium
Cells containing different recombinant plasmids can be grown overnight at 37°C in LB medium supplemented with the appropriate antibiotic selection. These are then diluted to mid-log phase (OD540 = 0.2) with fresh media containing antibiotic and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated for an additional 2 h at 37°C. The effect of expression of the phage 182 ORFs on bacterial cell growth is then monitored by measuring the OD540 and comparing the rate of growth to that of the culture not containing inducer. As positive controls for growth inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, Lubitz, W. and Blasi, U. 1993. Virology 193: 1033-1036) and the holin/lysin genes of the Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, Wendlinger, G., Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters 162: 265-274) were subcloned into the ars inducible vector. An aliquot of the induced and uninduced culture can also be plated out on agar plates containing an appropriate antibiotic selection but lacking inducer. Following incubation overnight at 37°C, the number of colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but detectable, number of colonies on the agar plates when grown in the presence of inducer as compared to when grown in the absence of inducer. Any ORF showing bactericidal activity will show no colonies on the agar plates when grown in the presence of inducer as compared to when grown in the absence of inducer.
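The read-out rules above can be expressed as a short sketch. It assumes single colony counts per condition and applies no statistical threshold, so any count below the uninduced control is called bacteriostatic; the function names and the relative-growth measure are illustrative choices, not part of the described method.

```python
def classify_orf(colonies_induced, colonies_uninduced):
    """Classify an ORF from colony counts after plating without inducer.

    No colonies from the induced culture  -> "bactericidal"
    Fewer, but detectable, colonies       -> "bacteriostatic"
    Otherwise                             -> "no inhibition"
    """
    if colonies_induced < 0 or colonies_uninduced <= 0:
        raise ValueError("uninduced control must yield colonies")
    if colonies_induced == 0:
        return "bactericidal"
    if colonies_induced < colonies_uninduced:
        return "bacteriostatic"
    return "no inhibition"


def relative_growth(od_start, od_induced, od_uninduced):
    """Growth of the induced culture as a fraction of the uninduced control.

    All arguments are OD540 readings; od_start is the reading at the
    time of inducer addition (0.2 in the protocol above).
    """
    control_gain = od_uninduced - od_start
    if control_gain <= 0:
        raise ValueError("uninduced control did not grow")
    return (od_induced - od_start) / control_gain


if __name__ == "__main__":
    print(classify_orf(0, 200), classify_orf(50, 200))
    print(round(relative_growth(0.2, 0.4, 0.6), 2))
```

In practice replicate wells and plates would be needed before drawing a conclusion from a modest reduction in colony number.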
REFERENCES
1. Cohen, M.L. (1992). Science 257, 1050-1055.
2. Tenover, F.C. and McGowan Jr., J.E. (1998). Bacterial Infections of Humans.
Epidemiology and Control.(A.S. Evans and P.S. Brachman, eds.) Plenum Medical Book Company, New York, N.Y. pp. 83-93.
3. Rusterholtz, and Pohlschroder, M. (1999). Cell 96, 469-470.
4. Neu, H.C. (1992). Science 257, 1064-1073.
Murray, B.E. (1990). Clin. Microbiol. Rev. 3,46-65.
6. Gray, B.M. (1998). Bacterial Infections of Humans. Epidemiology and Control. (A.S. Evans and P.S. Brachman, eds.) Plenum Medical Book Company, New York, N.Y. pp. 673-711.
Sambrook, Fritsch, E.F. and Maniatis, T. (1989). Molecular cloning: A laboratory Manual. Cold Spring Harbor Laboratory, New York. Cold Spring Harbor Laboratory Press.
7. Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology. John Wiley Sons, Secaucus, N.J.
8. Rost, B. and Sander, C. (1996). Ann. Rev. Biophy. Biomol. Struct. 25, 113-136.
9. Garvey, Saedi, and Ito, J. (1985). Gene 40, 311-316.
Pickett, G.G. and Peabody, D.S. (1993). Nucl. Acids Res. 21, 4621-4626.
11. Gutierrez, Vinos, Prieto, Mendez, Hermoso, and Salas, M. (1986).
Virology 155, 474-483.
12. Yoshikawa, Garvey, and Ito, J. (1985). Gene 37, 125-130.
13. Martin, Lopez, Garcia, P. (1998). J Bacteriol 180, 210-217.
14. Steiner, Lubitz, Blasi, U. (1993). J. Bacteriol. 175, 1038-1042.
Durfee, Becherer, Chen, Yeh, Yang, Kilburn, Lee, and Elledge, S.J. (1993). Genes Dev. 7, 555-569.
Qiu, Garcia-Barrio, and Hinnebusch, A.G. (1998). Mol Cell Biol. 18, 2697-2711.
Katagiri, Saito, Shinohara, Ogawa, Kamada, Nakamura and Miki, Y. (1998). Genes, Chromosomes Cancer 21, 217-222.
Endo, Masuhara, Yokouchi, Suzuki, Sakamoto, Mitsui, Matsumoto, Tanimura, Ohtsubo, Misawa, Miyazaki, Leonor Taniguchi, Fujita, Kanakura, Komiya, and Yoshimura, A. (1997).
Nature 387, 921-924.
Karimova, Pidoux, Ullmann, Ladant, D. (1998) Proc. Natl. Acad. Sci. 5752-5756.
Sopta, Carthew, and Greenblatt, J. (1995) J. Biol. Chem. 260, 10353- 10369.
Qin, Fenyo, Zhao, Hall, Chao, Wilson, Young, R.A. and Chait, B.T. (1997). Anal. Chem. 69, 3995-4001.
Swanström, M. and Adams, M.H. (1951). Proc. Soc. Exptl. Biol. Med. 78, 372-375.
Example 15. Growth of Streptococcus bacteriophage Dp-1 and purification of genomic DNA.
The Streptococcus pneumoniae R6 propagating strain (PS) (Tomasz, 1966) was used as host to propagate its respective phage Dp-1 (McDonnell et al., 1975). (Alternatively, Streptococcus (Diplococcus) pneumoniae R36A could be used.
Strain R36A is available from ATCC as #11733 or 27336. Streptococcus pneumoniae is also available from the Felix d'Herelle Reference Center in Quebec, Canada as catalog number HER 1054. Other S. pneumoniae strains are also available from ATCC.) Two rounds of plaque purification of phage Dp-1 were performed on soft agar essentially as described in Sambrook et al. (1989). Briefly, the Streptococcus R6 PS strain was grown overnight at 37°C in K-CAT media [K-CAT: 10 g Bacto casitone, 5 g Bacto tryptone, 1 g yeast extract, 5 g potassium chloride, 0.2% glucose, potassium phosphate buffer [pH 8] and 250,000 units catalase per liter (Boehringer Mannheim #10683600)]. The culture was then diluted 20-fold in K-CAT and incubated at 37°C with constant agitation until the OD540 reached 0.2 (early log phase). In order to obtain single plaques, Dp-1 phage was subjected to 10-fold serial dilutions using the phage buffer (100 mM Tris-HCl [pH ], 100 mM NaCl and 10 mM MgCl2), and 10 µl of each dilution was used to infect 0.5 ml of the cell suspension.
After incubation for 15 min at 37°C, 2 ml of melted soft agar (K-CAT supplemented with 0.8% agar) were added to the mixture, which was poured onto the surface of 100 mm K-CAT agar plates [K-CAT supplemented with 1.2% agar]. After solidification of the soft agar layer, an additional 5 ml of melted soft agar was added to visualize distinct plaques (Ronda et al., 1978). After overnight incubation at 37°C, a single plaque was isolated and resuspended in 1 ml of phage buffer by end-over-end rotation for 2 hrs at room temperature, and the phage suspension was diluted and used for a second infection as described above. After overnight incubation at 37°C, a single plaque was isolated and used as a stock for all subsequent manipulations.
The propagation procedure for bacteriophage Dp-1 was modified from the agar layer method of Swanström and Adams (1951). Briefly, the R6 strain of Streptococcus pneumoniae was grown to stationary phase overnight at 37°C in K-CAT. The culture was then diluted 20-fold in K-CAT and incubated at 37°C until the OD540 reached 0.2. The suspension (15x 10 Bacteria) was then mixed with 15x 10 plaque forming units (pfu) to give a ratio of 100 bacteria/pfu. After incubation for 15 min at 37°C, 7.5 ml of melted soft agar (K-CAT plus 0.8% agar) were added to the mixture, which was poured onto the surface of 150 mm K-CAT agar plates and incubated for 16 hrs at 37°C. After solidification of the soft agar layer, 7.5 ml of melted soft agar were added to each plate. To collect the plate lysate, 20 ml of K-CAT media were added to each plate and the soft agar layers were collected by scraping them off with a clean microscope slide, followed by vigorous shaking of the agar suspension for 5 min to break up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 xg) using a rotor (Beckman), and the supernatant (lysate) was collected and treated with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate the phage particles, the phage suspension was adjusted to 10% PEG 8000 and 0.5 M NaCl, followed by incubation at 4°C for 16 hrs. The phage was recovered by centrifugation at 4,000 rpm (3,500 xg) for 20 min at 4°C on a GS-6R table top centrifuge (Beckman). The pellet was resuspended in 2 ml of phage buffer (100 mM Tris-HCl [pH ], 100 mM NaCl and 10 mM MgCl2). The phage suspension was extracted with 1 volume of chloroform and further purified by centrifugation on a cesium chloride step gradient as described in Sambrook et al. (1989), using a rotor in an Optima TLX ultracentrifuge (Beckman) for 2 hrs at 28,000 rpm (67,000 xg) at 4°C. Banded phage was collected and ultracentrifuged again on an isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 xg) for 24 hrs at 4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 hrs at room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM Tris-HCl [pH 8] and 10 mM MgCl2.
Phage DNA was prepared from the phage suspension by adding 20 mM EDTA, 50 µg/ml Proteinase K and 0.5% SDS and incubating for 1 hr at 65°C, followed by successive extractions with 1 volume of phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH ], 1 mM EDTA).
Example 16. DNA sequencing of the Bacteriophage Dp-1 genome.
Four micrograms of phage DNA was diluted in 200 µl of TE (10 mM Tris [pH ], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonication was performed (550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an amplitude of 3 µm with bursts of 5 sec spaced by 15 sec of cooling in ice/water for 3 to 4 cycles. The sonicated DNA was then size fractionated by electrophoresis on 1% agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH ]) as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the agarose gel and purified using a commercial DNA extraction system according to the instructions of the manufacturer (Qiagen), with a final elution of 50 µl of 1 mM Tris [pH ]. The ends of the sonicated DNA fragments were repaired with a combination of T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as follows. Reactions were performed in a reaction mixture (final volume, 100 µl) containing sonicated phage DNA, 10 mM Tris-HCl [pH ], 50 mM NaCl, 10 mM MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5 units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for min at room temperature. The reaction was stopped by two phenol/chloroform extractions, the DNA was precipitated with ethanol, and the final DNA pellet was resuspended in 20 µl of H2O.
Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation reaction contained 100 ng of vector DNA and 2 to 5 µl of repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl containing 800 units of T4 DNA ligase (New England Biolabs) and was incubated overnight at 16°C. Transformation and selection of bacterial clones containing recombinant plasmids was performed in E. coli DH10β according to standard procedures (Sambrook et al., 1989).
Recombinant clones were picked from agar plates into 96-well plates containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence of a phage DNA insert was confirmed by PCR amplification using T3 and T7 primers flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential foreign inserts was performed in a 15 µl reaction volume containing 10 mM Tris (pH ), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C, followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 to 2 kbp were selected and plasmid DNA was prepared from the selected clones using the QIAprep™ spin miniprep kit (Qiagen).
The nucleotide sequence of the extremities of each recombinant clone was determined using an ABI 377-36 automated sequencer with two types of chemistry: ABI Prism BigDye™ primer cycle sequencing (21M13 primer: #403055; M13REV primer: #403056) or the ABI Prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and the genome, all regions of the phage genome were sequenced at least once from both directions on two separate clones. In areas where this criterion was not initially met, a sequencing primer was selected and phage DNA was used directly as sequencing template employing the ABI Prism BigDye™ terminator cycle sequencing ready reaction kit.
Example 17. Bioinformatic management of primary nucleotide sequence.
Sequence contigs were assembled using Sequencher™ 3.1 software (GeneCodes). To close contig gaps, sequencing primers were selected near the edge of the contigs. Phage DNA was used directly as sequencing template employing the ABI Prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems; #4303152). The complete sequence of Streptococcus bacteriophage Dp-1 is shown in Table 28.
A software program was used on the assembled sequence of bacteriophage Dp-1 to identify all putative ORFs of 33 codons or more. The software scans the primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon. Three possible selections can be made for defining the nature of the start codon: I) selection of ATG; II) selection of ATG or GTG; and III) selection of any of ATG, GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds to the one reported by the NCBI (http://www.ncbi.nlm.nih.gov/htbinpost/Taxonomy/wprintgc?mode=c) for the bacterial genetic code. When an appropriate start codon is encountered, a counting mechanism is employed to count the number of codons (groups of three nucleotides) between this start codon and the next stop codon downstream of it. If a threshold value of 33 is reached or exceeded, then the sequence encompassed by these two codons is defined as an ORF. This procedure is repeated, each time starting at the next nucleotide following the previous stop codon found, in order to identify all the other putative ORFs. The scan is performed on all three reading frames of both DNA strands of the phage sequence.
The predicted ORFs for bacteriophage Dp-1 are listed in Tables 29 and 30, and Fig. 6.
Sequence homology searches for each ORF were carried out using an implementation of BLAST programs. Downloaded public databases used for sequence analysis include: i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z); ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z); iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z); iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z); v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph- k.fa); vi) Streptococcus pyogenes (ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs. 112197.Z); vii) PRODOM (ftp://ftp.toulouse.inra.fr/pub/prodom/current release/prodom99.1.forblast.gz); viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/); ix) TREMBL (ftp://www.expasy.ch/databases/sp_trnrdb/fasta/). The results of the homology searches performed on the ORFs of bacteriophage Dp-1 are shown in Table 31.
Example 18. Sub-Cloning of Bacteriophage Dp-1 ORFs.
Preparation of the shuttle expression vector
Expression preferably utilizes a shuttle expression vector which is arranged such that expression of the exogenous bacteriophage Dp-1 ORF sequence is inducible.
For example, the plasmid pLSE4 replicates in E. coli and S. pneumoniae (Diaz and Garcia, 1990). This plasmid can be modified by conventional techniques to insert the inducible arsenite promoter, derived from the shuttle vector pT0021, in which the firefly luciferase (lucFF) expression is controlled by the ars promoter/operator from a S. aureus plasmid (Tauriainen, Karp, Chang, W and Virta, M. (1997). Recombinant luminescent bacteria for measuring bioavailable arsenite and antimonite. Appl. Environ. Microbiol. 63:4456-4461). This modified shuttle vector will contain the ars promoter, arsR gene and a cloning site for introduction of individual phage ORFs downstream from a Shine-Dalgarno sequence.
Other inducible regulatory sequences can be utilized instead of the arsenite-inducible system. An example is a nisin-inducible system. The nisA promoter activity is dependent on the proteins NisR and NisK, which constitute a two-component signal transduction system that responds to the extracellular inducer nisin. The nisin sensitivity and the inducer concentration required for maximal induction vary among strains, but the system is functional in Streptococcus pyogenes, Streptococcus agalactiae, Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis. Significant induction of the nisA promoter (10- to 60-fold induction) can be obtained in all of these species. A vector containing this promoter was published as Eichenbaum Z, Federle MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR (1998) Appl Environ Microbiol 64, 2763-2769. Other vectors or plasmids which allow replication and transcription in Streptococcus can also be utilized.
Alternatively, a constitutive promoter can be used to drive expression of the introduced ORF, and cell growth is then compared to that of control bacterial cells containing the parental vector lacking any introduced phage ORF. Recombinant plasmids are introduced into S. pneumoniae R6 as previously described (Diaz and Garcia, 1990).
Cloning of ORFs with a Shine-Dalgarno sequence
ORFs with a Shine-Dalgarno sequence are selected for functional analysis of bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop codon), will be amplified by PCR from phage genomic DNA. For PCR amplification of ORFs, each sense strand primer starts at the initiation codon and is preceded by a restriction site, and each antisense strand primer starts at the last codon (excluding the stop codon) and is preceded by a different restriction site. The PCR product of each ORF will be gel purified and digested with the restriction enzymes whose sites are contained in the PCR oligonucleotides. The digested PCR product is then gel purified using the Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial strain DH10β. Recombinant clones are then picked and their insert sizes confirmed by PCR analysis using primers flanking the cloning site as well as by restriction digestion. The sequence fidelity of cloned ORFs will be verified by DNA sequencing using the same primers as used for PCR. In cases where verification of an ORF cannot be achieved by one pass of sequencing using primers flanking the cloning site, internal primers will be selected and used for sequencing. Recombinant plasmids will be introduced into S. pneumoniae R6 as previously described (Diaz and Garcia, 1990).
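The primer layout described above (restriction site followed by the initiation codon for the sense primer, and a different restriction site followed by the reverse complement of the last codon region for the antisense primer) can be illustrated with a short Python sketch. The function name, the BamHI (GGATCC) and HindIII (AAGCTT) sites, the two-nucleotide 5' clamp, and the 18-nucleotide annealing length are all assumptions chosen for illustration; the specification does not name particular sites or primer lengths.

```python
# Illustrative primer layout (hypothetical names and parameters).
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(orf, site_5="GGATCC", site_3="AAGCTT",
                   anneal_len=18, clamp="CG"):
    """Return (sense, antisense) primers for an ORF given 5'->3'.

    site_5 and site_3 are assumed example restriction sites (BamHI and
    HindIII); clamp is an assumed short 5' extension to aid digestion.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    # Drop the stop codon, as the ORF is cloned up to its last sense codon.
    coding = orf[:-3] if orf[-3:] in STOP_CODONS else orf
    # Sense primer: clamp + site + sequence starting at the initiation codon.
    sense = clamp + site_5 + coding[:anneal_len]
    # Antisense primer: clamp + different site + reverse complement of the
    # region ending at the last codon before the stop codon.
    antisense = clamp + site_3 + reverse_complement(coding[-anneal_len:])
    return sense, antisense
```

In practice the two sites would need to be absent from the ORF itself and compatible with the cloning site of the shuttle vector; this sketch does not check either condition.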
Induction of gene expression from the ars promoter.
If an inducible promoter such as the ars promoter is used, induction can be assessed, for example, by either of the following two methods.
1. Screening on agar plates
The functional identification of killer ORFs can be performed by spreading an aliquot of S. pneumoniae transformed cells containing phage Dp-1 ORFs onto agar plates containing different concentrations of sodium arsenite (2.5, 5, and 7.5 µM). The plates are incubated overnight at 37°C, after which growth inhibition of the ORF transformants is assessed by comparing plates that contain arsenite with plates without arsenite.
2. Quantification of growth inhibition in liquid medium
Cells containing different recombinant plasmids can be grown overnight at 37°C in LB medium supplemented with the appropriate antibiotic selection. These are then diluted to the mid log phase (OD540 = 0.2) with fresh media containing antibiotic and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated for an additional 2 hrs at 37°C. The effect of expression of the phage Dp-1 ORFs on bacterial cell growth is then monitored by measuring the OD540 and comparing the rate of growth to that of the culture not containing inducer.
As positive controls for growth inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, Lubitz, W. and Blasi, U. 1993 Virology 193:1033-1036), and the holin/lysin genes of the Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, Wendlinger, G., Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters 162:265-274) can be subcloned into the ars inducible vector.
An aliquot of the induced and uninduced culture can also be plated out on agar plates containing an appropriate antibiotic selection but lacking inducer. Following incubation overnight at 37°C, the number of colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but detectable, number of colonies on the agar plates when grown in the presence of inducer as compared to when grown in the absence of inducer. Any ORF showing full bactericidal activity will show no colonies on the agar plates when grown in the presence of inducer as compared to when grown in the absence of inducer.
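The colony-count interpretation described above can be expressed as a small decision rule. The following Python sketch is illustrative only: the function name and the two-fold reduction threshold used to call a count "lower" are assumptions, since the specification does not state a numerical cut-off.

```python
# Illustrative classification of an ORF from plate counts
# (hypothetical function name and threshold).
def classify_orf(colonies_induced, colonies_uninduced,
                 min_fold_reduction=2.0):
    """Classify an ORF from colony counts with and without inducer.

    min_fold_reduction is an assumed threshold for deciding that the
    induced count is meaningfully lower than the uninduced count.
    """
    if colonies_uninduced <= 0:
        raise ValueError("uninduced control must yield colonies")
    if colonies_induced == 0:
        return "bactericidal"  # no colonies after induction
    if colonies_uninduced / colonies_induced >= min_fold_reduction:
        return "bacteriostatic"  # fewer, but detectable, colonies
    return "no effect"
```

For example, 0 colonies against 200 in the uninduced control would be scored as bactericidal, 40 against 200 as bacteriostatic, and 190 against 200 as no effect under the assumed threshold.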
REFERENCES
Cohen, M.L. (1992) Science 257, 1050-1055.
16. Tenover, F.C. and McGowan Jr., J.E. (1998) Bacterial Infections of Humans. Epidemiology and Control. (A.S. Evans and P.S. Brachman, eds.) Plenum Medical Book Company, New York, N.Y. pp. 83-93.
17. Rusterholtz, and Pohlschroder, M. (1999) Cell 96, 469-470.
18. Klugman, K.P. (1990) Clin. Microbiol. Rev. 3, 171-196.
19. Fenoll, Martin Bourgon, Munoz, Vicioso, Casal, J. (1991) Rev. Infect. Disease 13, 56-60.
Jorgensen, Doern, G. Maher, L. Howell, A. Redding, J. S. (1990) Antimicrob. Agents Chemother. 34, 2075-2080.
21. Neu, H.C. (1992) Science 257, 1064-1073.
Hsueh, P. Wu, J. Hsiue, T. R. (1996) J Formos Med Assoc5, 364-371.
Garcia, Martin, and Lopez, R. (1997) Microbial Drug Res. 3, 165-176.
Martin, Lopez, and Garcia, P. (1996) J. Virol. 70, 3678-3687.
Sheehan, Garcia, Lopez, and Garcia, P. (1997) Mol. Microbiol. 717-725.
Kodaira, Biswas, and Kornberg, A. (1983) Mol. Gen. Genet. 192, 80-96.
Maki, S. and Kornberg, A. (1988) J. Biol. Chem. 263, 6547-6554.
Tsuchihashi Z, Kornberg A. (1990) Proc. Natl. Acad. Sci. USA. 87, 2516-2520.
Lee, S.H. and Walker, J.R. (1987) Proc Natl Acad Sci USA 84, 2713-2717.
Smidt, Steinberg, Rucker, R. (1991) Proc Soc Exp Biol Med 197, 19-26.
Frank, D.W. (1997) Mol Microbiol. 26, 621-629.
Nardese, Gutlich, Brambilla, Carbone, M.L.(1996) Biochem Biophys Res Commun 218, 273-279.
Mancini, Saracino, Buscemi, Fischer, Schramek, Bracher, A., Bacher, Gutlich, Carbone, M.L. (1999) Biochem Biophys Res Commun 255, 521-527.
Sambrook, Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, New York. Cold Spring Harbor Laboratory Press.
22. Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology. John Wiley & Sons, Secaucus, N.J.
23. Rost, B. and Sander, C. (1996) Ann. Rev. Biophys. Biomol. Struct. 25, 113-136.
24. Garvey, Saedi, and Ito, J. (1985) Gene 40, 311-316.
Pickett, G.G. and Peabody, D.S. (1993) Nucl. Acids Res. 21, 4621-4626.
26. Gutiérrez, Vinos, Prieto, Mendez, Hermoso, and Salas, M. (1986) Virology 155, 474-483.
27. Yoshikawa, Garvey, and Ito, J. (1985) Gene 37, 125-130.
28. Martin, Lopez, Garcia, P. (1998) J Bacteriol 180, 210-217.
29. Steiner, Lubitz, Blasi, U. (1993) J. Bacteriol. 175, 1038-1042.
Durfee, Becherer, Chen, Yeh, Yang, Kilburn, Lee, and Elledge, S.J. (1993). Genes Dev. 7, 555-569.
Qiu, Garcia-Barrio, and Hinnebusch, A.G. (1998) Mol Cell Biol. 18, 2697-2711.
Katagiri, Saito, Shinohara, Ogawa, Kamada, Nakamura and Miki, Y. (1998) Genes, Chromosomes Cancer 21, 217-222.
Endo, Masuhara, Yokouchi, Suzuki, Sakamoto, Mitsui, Matsumoto, Tanimura, Ohtsubo, Misawa, Miyazaki, T., Leonor Taniguchi, Fujita, Kanakura, Komiya, and Yoshimura, A. (1997) Nature 387, 921-924.
Karimova, Pidoux, Ullmann, Ladant, D. (1998) Proc. Natl. Acad. Sci. 5752-5756.
Sopta, Carthew, and Greenblatt, J. (1995) J. Biol. Chem. 260, 10353-10369.
Qin, Fenyo, Zhao, Hall, Chao, Wilson, Young, R.A. and Chait, B.T. (1997) Anal. Chem. 69, 3995-4001.
Tomasz, A. (1966) Journal of Bacteriology 91, 1050-1061.
McDonnell, Ronda, LC and Tomasz, A. (1975) Virology 63, 577-582.
Ronda, Lopez, Tomasz, A. and Portoles A. (1978) 26, 221-225.
Swanstrom, M. and Adams, M.H. (1951) Proc. Soc. Exptl. Biol. Med. 78, 372-375.
Diaz E and Garcia JL. (1990) Gene 90, 163-167.
Tauriainen, Karp, Chang, W and Virta, M. (1997). Recombinant luminescent bacteria for measuring bioavailable arsenite and antimonite. Appl. Environ. Microbiol. 63:4456-4461.
All patents and publications mentioned in the specification are indicative of the levels of skill of those skilled in the art to which the invention pertains. All references cited in this disclosure are incorporated by reference to the same extent as if each reference had been incorporated by reference in its entirety individually.
One skilled in the art would readily appreciate that the present invention is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those inherent therein. The specific methods and compositions described herein as presently representative of preferred embodiments are exemplary and are not intended as limitations on the scope of the invention. Changes therein and other uses will occur to those skilled in the art which are encompassed within the spirit of the invention are defined by the scope of the claims.
It will be readily apparent to one skilled in the art that varying substitutions and modifications may be made to the invention disclosed herein without departing from the scope and spirit of the invention. For example, those skilled in the art will recognize that the invention may suitably be practiced using a variety of different bacteria, bacteriophage, and sequencing methods within the general descriptions provided.
The invention illustratively described herein suitably may be practiced in the absence of any element or elements, limitation or limitations which is not specifically disclosed herein. Thus, for example, in each instance herein any of the terms "comprising," "consisting essentially of" and "consisting of" may be replaced with either of the other two terms. The terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention that in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Thus, it should be understood that although the present invention has been specifically disclosed by preferred embodiments and optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention as defined by the appended claims.
In addition, where features or aspects of the invention are described in terms of Markush groups or other grouping of alternatives, those skilled in the art will recognize that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group or other group. For example, if there are alternatives A, B, and C, all of the following possibilities are included: A separately, B separately, C separately, A and B, A and C, B and C, and A and B and C. Thus, for example, for the bacteria and phage specified herein, the embodiments expressly include any subset or subgroup of those bacteria and/or phage. While each such subset or subgroup could be listed separately, for the sake of brevity, such a listing is replaced by the present description.
Thus, additional embodiments are within the scope of the invention and within the following claims.
Table 1. Phages against human and animal pathogenic bacteria
Pathogen name; Phage name; Cat. no.; Origin/reference
Acinetobacter calcoaceticus A3/2 A10/45 A36 B9GP Felix d'Herelle Reference Centre, Quebec, Quebec
BPP
BS46 E13 E14 531 Ap3 J. Bacteriol 1984. 157: 179-183 P78 J. Gen. Microbiol 1986.132: 2633-2636 Acinetobacler Felix d'Herelle Reference haemolyticus Centre,Quebec,Quebec A cinetobacter Felix d'Herelle Reference johnsonii Centre.Quebec,Quebec Acinezobacter sp. BP I J.Virol. 1968.2:716-722 G4, HP2, HP3 Can.J.Microbiol. 1966.12:1023-1030 HP4 J.Virol.1974.13:46-52 Al, 4, 9 &Arch.Virol. 1994.135:345-354 196 HP I GanJ.Microbiol. 1966.12:1023-1030 A 19, A23, A29, J.Microsc (Paris) 1973.16:215-224 A3 1, A33, A34, CR.Hebdo Seances Acad.Sci.Ser D.Sci A3759 2845 Natur(Paris)278:1907-1909 A -t If :4 Actinobacillus actinomycetecomitans FEMS Microbiol Lett 1994. 1 19:329-337-.
WO 00/32825 PTlB9/24 PCT/IB99/02040 Infec. Inimun. 1982. 35: 343-349 Mol.Gen.Genet 1998.258: 323-325 AaoD247 Oral Micriol. Immunol 1997.12: 40-46 A ctinomyces viscosus 43146-BI The American Type Culture Collection i 1 Infect.Immun. 1985 .48:228-233 I I1netI m n.985:45 i ~i Plasmnid 1997.37:141-153 Aeromonas hydrophila 1PM2** PM3 Microbiol.Lett. 1990.57:277-282 Aeh I Felix d'HemeIc Rufciu Aeh2 Centre,Quebec,Quebec PM4 PM6 T7-ah WO 00/32825 WO 0032825PCT/ B 99/02040 A eromonas salmon icida 3 25 29 31 32 4ORR 2 .8t 43 51 56 59.1 ASD37 Felix d'Herelle Reference Centre,Quebec,Quebec 55R.1 Can. JI Microbiol. 1983. 29: 1458-1461 Alteromonas espejiana PM2** 27025-BI1 The American Type Culture Collection Asticacaulis Felix d'Herelle Reference biprosthecum Centre,Quebec,Quebec Asticcacaulis 15261-BI 1 ITe American Type Culture Collection excentricus 15261-B2 15261-133 4Ac21 Ac24 Azotobacter vinelandii 12518-B I The American Type Culture Collection 125 18-B4 125 18-B5 A14 125 18-B9 A21 12518-BIO A31 13705-BI A4 1 PAV I Azotobacter sp. Virology 1972.49:439-452 Bacteroidesfragilis Bf- 1 Rev. Infect. Dis. 1979. 1: 325-336 B40-8 FEMS Microbiol. Lett. 1991. 66: 61-67 Appi. Environ. Microbiol. 1989. 55: 2696- I _______Zentralbl.bakteriol. 1972.222:57-63 Bdellovibrio MAC-I J. Gen. Microbiol. 1987. 133: 3065-3070 Bdellovibrio sp. VL- I J.Virol. 1973.12:1522-1533 Bord etella 214 Zh.Milcrobiol.Epidemiol.Immnuno. 1987.9bra chiseptica WO 00/32825 WO 0032825PCT/I B99/02040 I 7 Bard etella parapertussis Felix d'Herelle Reference Centre,Quebec,Quebec Mol. Gen. Mikrobiol. Virusol. 1988.4: 22-25 Zh.Milcrobiol.Epidemiol.Lmmuno. 1987.5:9- 13 4140511 Zh.Mikrobiol.Epidemiol.rflfflufo. 1987.5:9- 13 t Brucella aborius Felix d'Herelle Reference Centre,Quebec,Quebec 1 4 I23448-BI1 23448-B2 17385-BI 17385-132 The American Type Culture Collection 10/1 24/I 2121XV BK-2. TB Fi** I I DC Zh.Mikrobiol.Epidemloi.immunobioi. 1983.2: 48-52
I
Deiv Biol. Stand. 1984.56: 55-62 Brucella camis R/c Biol. Stand. 1984.56: 55-6-2 Brucella me! itensis- BK-2 23456-B I The American Type Culture Collection Brucella suis Zentralbl.Veterinanned. 1975.22:866-867 WO 00/32825 WO 0032825PCT/l B99/02040 I i TB I Zh.Mikrobiol.Epidemiol.Imiunobiol. 1983.2: 48-52 4 1- 1 Brucella sp.
Can. J. Vet. Res. 1989.53: 319-325 i 1 Res. Vet. Sci. 1988. 44: 45-49 R Zh.Mikrtobiol.Epidemiolimmunobiol. 1983.2: 48 Campylobacter coli 43133-HI The American Type Culture Collection 43134-BI Campylobacter coli 18 43135-BI1 The American Type Culture Collection (Cont'd) 19 43136-BI Campylobacterjejuni 1 35918-HI The American Type Culture Collection 2 35919-HI 3 35920-HI 4 35921-BI ~35918-132 6 35920-B2 7 35922-132 8 35923-HI1 9 35924-BI1 35925-HI I11 35925-132 12 35922-132 13 35924-B2 14 35922-B3 17 43133-HI 18 43134-BI 19 43135-BI 43136-BI Campylobacter HPl J. Med. Microbiol. 1993. 3 8: 245-249 (Helicobacter) pylori Chiamvdia psittaci Chpl** Gen. Virol. 1989. 70: 3381-3390 Clostridium CAK- I J.Bacteriol. 1993.175:3838-3843 acetobutvlicum WO 00/32825 WO 0032825PCTI B99/02040 Clostridium botulinum Nucleic Acids Res. 1990.18:1291 Bioch.Biophys.res.Commnun. 1990.171.1304- 1311 i i Microbiol.imimunol. 1981.25:915-927 i 1 J.Vet.Med.Sci.1992.54:675-684 I CEP~ CEy I I- Clostridium difficile j41 56 1_I J. Clini.Microbiol. 1985.21:251-254 WO 00/32825 WO 0032825PCT/ B 99/02040
I
Clostridium pefringens Rev.Can.Biol. 1977.36:205-215 i 1 FEMS Microbiol.Lett. 1990.54:323-326 Clostridiumn 8074-B 1 The American Type Culture Collection sporo genes 59 17886-BI1 17886-B3 71 17886-B4 725 17886-B5 17886-B6 Clostridium tetani A &B Rev.Can.Biol. 1978.37:43-46 Corynebacterium Vopr.Virusol. 1986.31:577-584 diphteriae Corynebacterium NN 12319-BI 1 'Te American Type Culture Collection pseudotuberculosis Corvnebacterium sp DLC 292 1/49 12052-B I The American Type Culture Collection WO 00/32825 PCT/I 899/02040 Enterococcus faecalis 42 19948-BI1 The American Type Culture Collection Entero coccus faecium 19950-B I The American Type Culture Collection 19953-b2 19953-BI 124 133 WO 00/32825 WO 0032825PCT/I B99/02040 Escherichia coli Escherichia coli (Cornt'd) C204 El f2**
FCZ
fd** Ifl1** MS2** MU9 Mu- I 0x6 p1** P4 sid, R17** Z1K/l ZJ/2 1 1303-B14 11303-BI0 1 1303-B21 8677-B81 11303-B13 1 3706-B4 15766-BI 15766-B I 1242-B5 1 5669-B2 15767-BI1 11303-816 27-65-B1 25065-B2 15669-BI1 15597-BI 21816-B81 23724-B9 15593-BI 25404-B81 29746-B81 2363 1-BI1 25868-B1 25298-BI1 25298-B2 11303-B37 1 1303-B24 11303-B26 11303-B27 11303-B28 11303-B29 11303-B30 11303-B33 11303-B31 11303-B25 11303-B35 11303-B34 11303-B36 11303-B32 13706-B5 11303-B1 11303-132 11303-B3 1 1303-B4 35060-B1 35060-B2 35060-B3 11303-B5 11303-B6 11303-B7 11303-B38 12141 -81 I The American Type Culture Collection ITIe American Type Culture Collection WO 00/32825 WO 0032825PCT/1 B99/02040 Escherichia coli (Cont 'd) 547
UVI
UTV47 UJV375 X C-17 X sus P-3 X sus R-5 sus J-6 X sus 0-8 X sus A-1 1 X ind- 0 92 OV-1 O6X74*- ,OCcs7Oam-3 1 1303-B20 11303-B 17 11303-B 1 1303-B I I 1 1303-B18 1 3706-B2 23724-B2 23724-BI1 23724-B3 23724-B4 23724-B5 23724-B6 23724-B7 23724-B8 35860-BI1 1 3706-B3 1 5597-B2 13706-BI1 49696-BI1 The American Type Culture Collection G4** OK** Biochirn.Biophysica Acta.1992.1130:277-288 BF23** J.Bacteriol. 1977.129:265-275 Mu I J.Ultrastruct.Res. 1966.14:44 1-448 Hp17 J.MoI.Biol. 1991.218:705-721 K3** 0x2* FEBS Lett. 1987.215:145-150 Rbl8**, Rb5l J.Bacteriol. 1990.172:180-186 Rb69** Hi H3, H8, MoI.Gen.Genet. 1990.221:491-494 K9, K18 Oxi Tula** J.Mol.Biol. 1987.196:165-174 Tulb** KIO J.Bacteriol. 1979.140:680-686 Qsr' J.Bacteriol. 1985.162:256-262 B278 J.Gen.Microbiol. 1988.134:1333-133 8 phi 80** FEMS Microbiol.Let. 1994.1 19:71-76 phi m173 Genetika 1985.21:673-675 tf- I _______J.Gen.Microbiol. 1987.133:953-960 P4 phiR73 ______Mol.Microbiol. 1995.18:201-208 12-2 _______J.Gen.Microbiol. 1982.128:2797-2804 PRDI __viroiogy 1990.177:445-451 K3hx ______Mol.Gen.Genet. 1987.206:110-115 933J**& Infect.Ixnmunity. 1986.53:135-140 H19-B** ______J.Bacteriol. 1987.169:4308-4312 Tcp-1 11 Zentralbnl.Bakteriol.Mikrobiol.Hyg. 1988.270: 41-5 1 WO 00/32825 WO 0032825PCT/I B99/02040 N4** _______Vet.Microbiol. 1992.30:203-212 Phi 80 n-p Inst. Pasteur. 1971.120:121-125 Obeta J.Bacteriol. 1978.133:172-177 PICM _______J.Gen.Microbiol. 1978.107:73-83 PA-2* J.Bacteriol. 1990.172:1660-1662 I 86** Mol.Gen.Genet. 1982.1 87:87-95 I 86.IX.B Mol.Microbiol. 1992.6:2629-2642 21** Virology 1983.129:484-489 P4** MicrobiolRev. 1993.57:683-702 82** J.Biol.Chem. 1987.262:11721-11725 PSP3 J.Bacteriol. 1996.178:5668-5675 HK022* Nucleic Acids Res. 1994.22:354-356 fl1AR~~ Nucleic Acids Res. 1986.14:3813-3825 D108* NuleicAcid Re. 196.143813382 Escherichia coti (Cont 'd) Rh4Q J.Mol.Biol. 1997.267:237-249 Ike** J.Mol.Biol. 1985.181:27-39 P22dis Mol.Gen.Genet. 1978.166:233-243 i.Bacteriol. 
1996.178:1484-1486 Ifi Proc.R.Soc.Lond.B.Biol.Sci. 1991.245:23-30 Stx2Phi-I Infect.lrnmun. 1998.66:4100-4107 Stx2Phi-II 18 Virology 1987.156:122-126 J.Gen.Microbiol.1981.126:389-396 AC3 Mol.Microbiol. 1991.5:715-725 WO 00/32825 WO 0032825PCT/I B99/02040 BW- 1 C-1 E920g Esc-7-l11 H 19J Haiti 11(43
KL
3
M
Mu* 0103 0157:H7
PID
ptl Pil~a PR64FS PR772 SS4 I04Q kvljr** 09-1 92 Felix d'Herelle Reference Centre,Quebec,Quebec Haemophilus HP1** Nucleic Acids Res. 1996.24:2360-2368 influenzae S2** Gene 1997. 196: 139-144 Halobacterium S45 Felix d'Herelle Reference cutirubrum Centre,Quebec,Quebec Halobacteri .um Felix d'Herelle Reference Izalobium Centre,Quebec,Quebec Can.J.Microbiol. 1982.28:916-921 Halobacterium Bio.Chem.Hoppe Seyler 1994.375747-757 salinarium WO 00/32825 WO 0032825PCT/1 B99/02040 Kiebsiella oxvtoca Kiebsiella pneumoniae tf- I 1 J.Gen.Microbiol.l987.133:95 3 9 6 0 160 23356-BI I 23357-BI The American Type Culture Collection Ceteubeuee K19Q jFelix d'Herelle Reference FC3-1 FC3-9 _______Can.J.Microbiol.1991.37:27O- 2 7
S
rr'l- I A FEMS Microbiol.Lett. 1991.67:291-297 Klebsiella sp. KiI Mol.Gen.Genet. 1990.221:283-286 Leptospira sp. LEI, LE3 LE4 Res.Microbiol. 1990.141:1 131-1138 Listeria 243 23074-BI1 The American Type Culture Collection monocytogenes 197,1313 Appl.Environ.Microbiol. 1997.63:3374-3377 9425 H387 H387-A _______Appl.Environ.Microbiol. 1993.59:2914-2917 5775,6223 APMIS. 1993.101:160-167 12682 2389, 2671, Intervirology 1994.37:31-35 4211 2685 Zentralbl.Bakteriol.Mikrobiol.Hyg. 1986.26 1:1 2-28 4b, 4ab, 4g 3c Ann.Microbiol (Paris) 1977.128:185-198 A 118, A500 MoI.Microbiol. 1995.16:1231-1241-992 A51I** 1, 3,4, 5,6, 7, 8, Anm.Microbiol. (Panis) 1979.13013:179-189 9,10,11, 14, 16, 17, 19 1/2a, 1/2b, 3c, Clmn.Invest.Med. 1984.7:229-232 4ab, 6a 6b Felix d'Herelle Reference 2685 Centre,Quebec,Quebec Listeria innocua 4211 Felix d'Herelle Reference Centre,Quebec,Quebec Micrococcus luteus 4698-B I The American Type Culture Collection 4698-134 N3 4698-2 N4 4698-B3 Micrococcus luteus N 17 _______Can.J.Microbiol. 1979.25:1027-1035 Mycobacteriurn smegmatis BK-3 BoI** Bo 6 11 Bo 6111 Mc-2 Mc-4
NN
Phagus lacticola
RI
27203-B I 27204-BI1 27205-BI1 27205-133 607-B6 607-137 1 1727-BI 1 1759-BI 607-B I The American Type Culture Collection WO 00/32825 WO 0032825PCT/I B99/02040 HER 317 HER 330 HER 333 HER 335 HER 334 HER 331 HER 316 Felix d'Herelle Refrence Centre,Quebec,Quebec Legendre Leo Roy Sedge Mol.Microbiol. 1993.7:395-405 J.Mol.Biol. 1998.279: 143-164 Proc.Natl.Acad.Sci USA. 1988.84:2833-2837 Mol.Biol.Rep. 1981.30:11-15 Proc.Natl.Acad.Sci.USA 1997.94:10961- 10966 4 29M, 3 1M, 122, 154, 37, 29D, 46, 139,110, 141, 74D, AGI DS6A Arch.Virol.1993.133:39-49 Am.Rev.Respir.Dis. 1975.112:17-22 Mycobacterium 23052-B 1 The American Type Culture Collection fortuitum 27207-BI1 Bo 4 27207-B2 Bo 7 WO 00/32825 WO 0032825PCT/I B99/02040 -r Mycobacteriuyn leprae Ann.Microbiol. (Paris) 1982.133:93-97 Myco bacterium 4. .4tuberculosis Mycobacteriumn sp 25618-BI 256 18-B2 4243-B I The American Type Culture Collection DS6A 110, 139 &33D 1 Arch.Virol. 1993.133:39-49 AGI1,GS4E, BG 1, PH BKl The Biology of Mycobacteria.Academ-ic PressToronto 1982 (Ratledge Stanford) 1982.309-351 Phagus pellegrini
NN
1 1760-BI 1 1761-BI 23239-B 1 The American Type Collection Culture WO 00/32825 WOOO/2825PCTII B99/02040 TM4, ph60, Microbiology 1995.141:1173-1181 ph72, PhAE39, phAE4O BxblI C2 ______Experentia 1969.25:1112-1113 18 115 J.Gen.Virol. 1987.68:949-956 63 Gruzlica 1968.36:617-622 phlei J.Gen.Virol. 1975.29:235-238 butyricum______ MyF3P-59a ______Z.Allg.Mikrobiol. 1968.8:29-37 Bo2a J.Gen.Virol. 1973.20:75-87 D4,D28 D32 J.Exptl.Med. 1966.123:327-340 J.Bacteriol. 1963.86:608-609 Mycobacterium B5 15483-B I The American Type Culture Collection vaccae Mycobacterium phlei 11728-BI1 The American Type Culture Collection 1 1758-BI NN 27086-B2 Bo 2 27086-B 1 Bo 2h Bo 3 Mycoplasma MAVl1* Lnfect.Inimunity. 1995.63:4016-4023 art Mvcoplasma hvorhinis Hr- 1 Arch.Virol. 1983.77:81-85 Mycoplasma Br- I Arch.Virol. 1983.75:1-15 Mycoplasma pulmonis Plasmid 1995. 33: 41-49 Mycoplasma sp.
J.Gen.Microbiol. 1985:131:3117-3126 IJ. Virol.1986.59:584-590 Gene 1994. 141: 1-8 WO 00/32825 WO 0 0 38 2 B99/02040 Microbios 1990. 64: 111-125 j Infection& Immunity 1995. 63: 4016-4023 t 1 Med.BioI.1982.60:1 16-120 MV-L,2 Arch.Virol. 1979.61:289-296 i 1 Acta.i ro.17.243 J.Gen.Virol. 1979.42:315-322 Virology 1973.55:118-126 WO 00/32825 WO 0032825PCT/I B99/02040 I I Science 1971.173:725-727 Neisseria perfiava J.Clin.Microbiol. 1976. 4:87-91 Nocardia eiythrypolis qpC J.Gen.Virol. 1974.23 :247-254 J.Bacteriol. 1976.126:1 104-1107 Pasteurella multocida B225 Arch.Exp.Veterinanned. 1981.35:433-436 B939a Am.J.Vet.Res. 1978.39:1565-1566 Nos.1 115, 32, 967 Vet.Med.Nauki. 1977.14:33-36 1075 Propionibacterium NN 29399-B I The American Type Collection Culture acnes WO 00/32825 WO 0032825PCT/1 B99/02040 Pseudomonas aeruginosa 2 2A 2B 11 16 24 27 44 73 109 113 249 B3 Hoff 2 Hoff 3 Pa Pb PB- I
PC
Pf 12175-BI 12175-132 12175-133 12175-134 14205-BI1 14206-13I 14207-B I 14208-B I 14209-BI1 14210-BI1 142 11-BI 14212-131 142 13-BI 14214-B 1 15692-BI1 14203-BI 14204-B 1 12055-B I 12055-B32 1 5692-133 12055-B33 25102-BI 1 5692-B32 The American Type Culture Collection Felix d'Herelle Reference Centre,Quebec,Quebec 7 &31 Pf3** J.Virol. 1983.47:221-223 (p-MC ______Can.J.Microbiol. 1969.15: 1179-1186 J.Mol.BioI. 1991.218:349-364 PR4* 1979.43:583-592 A7 ______J.Bacteriol. 1992.174:2407-241 1 KFI 1983.93:61-71 Mol.Microbiol. 1993.4: 1703-1709 f2** J.Virol. 1977.24:135-141 WO 00/32825 PCT/I B99/02040 131 qKZ, 2 1, (PNZ, dd( PMN 17, PTB8O, 68, PB-I, E79, 16, 109, 352, 1214, F8, 71, 33 7, M4, qC17, SL2, B17, Li-24, (pnP78, PS17**, (pl, 73, M6, Li-2, 7, (pmnF82, PTB2, PTB2O, PTB42, (pKF77, 3 1, PTB2 1, I 19x, (pPLS27, B3, 258, Hwl2, PM57, PM62, PM 105, 14 8, PM68 1, 198, 218, 222, 242, 246, PC131, cpCl 1, D3112**, Jb19, F7, PM69, PM 13, PM6I1, PM 113, 2 40, 249 269 WO 00/32825 PCT/I B99/02040 Pseudomonas aerugi nosa (Corn''d) 297, 309, 318, 11, Arch.Virol. 1993. 13 1:14 1-151 WO 00/32825 WO 0032825PCT/I B99/02040 Pseudomonas cepacia Felix d'Herelle Reference Centre,Quebec,Quebec Pseudomonasfragi 27362-13I The American Type Culture Collection 27363 B I WY__ Pseudomonas 6Felix d'Herelle Reference phaseolicola _______Centre,Quebec,Quebec Pseudomonas putida gh-I 12633-B]I The American Type Culture Collection Pseudomonas syringae 40492-BI The American Type Culture Collection 21781-B 1 Pseudomonas sp. PPs-G3 49780-BI1 The American Type Culture Collection Salmonella bareilly Sab 2 Felix d'Herelle Reference _________Centre,Quebec,Quebec Salmonella enieritidis 1, 2,3 6 Epiden-iol.Infect. 1995.1 14:227-236 2a, 3a, 4a, 5a, 6a, Vet.Med.Nauki. 1975.12:55-60 7a, 8a, 9a, 20 &21* Salmonella newington Epsilon 34 J.Struct.Biol. 
1995.115:283-289 Salmonella newport 27869-BI The American Type Culture Collection 27869-B32 16-19 Felix d'Herelle Reference Centre,Quebec,Quebec Salmonella pararyphi 19940-B 1 The American Type Culture Collection 12176-B I Paratyphoid A Jersey Felix d'Herelle Reference _________Centre,Quebec,Quebec Salmonella SasLI, SaL2, Sal Indian J.Med.Res. 1997.105:47-52 senflten berg 3, SaL4, SaL5 Salmonella typhimurium P22** SL- I 19585-B I 40282 The American Type Culture Collection MB78** 1982.41: 1038-1043 SE I J.Gen.Microbiol. 1986.132:1035-1041 LT2 ______Virology 1971.45:835-636 Virology 1970.42:621-632 J.Virol. 1985.56: 1034-1036 WO 00/32825 WO 0032825PCT/I B99/02040 P1CM clr-l00 _______Mol.Gen.Genet. 1975.138:113-126 F22 1986.48:139-143 Fels I 1978.38:263-272 Fels 2 Genet.Res. 1986.48:139-143 Px MoI.Gen.Genet. 1970.108:184-202 Plkc ______Virology 1974.60:503-514 A3 A4 _______J.Bacteriol. 1987.169:1003-1009 HT Genet.Res. 1976.27:315-322 I I Salmonella typhimurium (Co n 'd) IRA Microbiol. 1990.30:707-7 16 Mudi Mol.Gen.Genet. 1986.202:327-330 P22 (cir4-1, cir5- Mol.Gen.Genet. 1984.198:105-109 1 cir6- 1) BF23** MoI.Gen.Genet. 1976.147:195-202 Kbl .Bacteriol. 1974.1 17:907-908 P22Idis 1978.41 :367-376 Virology 1990.177:445-451 J.Gen.Microbiol. 1982.128:2797-2804 tf-1I J.Gen.Microbiol. 1987.133:953-960 J.Gen.Microbjol. 1981.126:389-396 Salmonella y phosalyphi 8 23 46 53 163 175 Vii ViVI 19937-BI 19938-B 1 19939-BI1 19942-B 1 19943-BI 19946-BI1 19947-B 1 27870-BI1 27870-B2 The American Type Culture Collection 01 Felix d'Herelle Refrence _________Centre.Quebec,Quebec Vill Chung Hua Liu Hsmng Ping H.T.C. 1992.13:288 J.Gen.Microbiol. 1983.129:3395-33400 Salmonella sp. P3
P**
P9a P9c PlO 102 Chi (X) R34 25957-BI1 25957-B2 25957-B3 25957-B4 25957-B5 19945-B I 9842-B I 97541 The American Type Culture Collection 1 Viroiogzy i968.34.521-530 P14 ______.Microb.Pathog. 1990.8:393-402 PSP3 ______Virology 1992.188:414 Zentralbl.Bakteriol. 1 976.234:294-904 P27 9NA J.Virol.1986.12:921-93 I [Sphaerotilus natans SNI -I Appl.Environ-Microbiol.1979.37:1025-1030 WO 00/32825 WO 0032825PCT/I B99/02040 Shigella dysenteriae 23351-BI The American Type Culture Collection P2 11456b $80 11456a-BI Shigellaflexeneri D20 12 661 -B 1 The American Type Culture Collection SflI** ______Mol.Microbiol. 1997.26:9 3 9- 9 50 SfV** Gene 1997.22:217-227 Sf6** Mol.Microbiol. 1995.18: 2 0 1 2 0 8 Gene 1993.129:99-101 Shigella sonnei C16** (Mask) 1977.11:323-331 Shigella sp 37 23354-B I The American Type Culture Collection Spiroplasma citri SpVl Plasmid 1993.29:193-205 Spiroplasma sp. SpVI-R8A2B Nucleic Acids Res. 1990.18:1293 SpV3 Isr.J.Med.Sci. 1987.23:429-433 Stp V4 J.Bacteriol. 1987.169:4950- 4 9 6 1 Staphylococcus albus Staphylococci Staphylococcal Infections. 1997.
Voll :503-508 (Karger,Basel) WO 00/32825 WO 0032825PCT/1 B99/02040 Staphylococcus aureus 17 29 42D** 42E 47 52 52A 53 54 71 77 79 81 83A 84 88 92 5504'
K
P1 P14 UC 18 27702-B I 27703-BI1 27704-B 1 23360-BI1 23361-B I 27705-B 1 277 12-BI1 27690-B1 27691 -B I 27692-BI1 27693-BI1 27694-B 1 27695-B I 27696-B I 27697-B 1 27698-B 1 27699-B 1 27693-B2 27700-B 1 27701 -B 1 27706-BI1 27707-B 1 27708-B 1 33742 3374 1-BI1 15565 19685-BI 1 1987-BI 1 1988-BI 15752-BI1 'Me American Type Culture Collection WO 00/32825 WO 0032825PCT/l B99/02040 HER 101 HER 239 HER 283 HER 49 Felix d'Herelle Reference Centre,Quebec,Quebec Twort* 0i J.Bacteriol.1988.170:2409-2411 4 13** 02** J.Gen..Microbiol. 1989.135:1679-1697 L54a** J.Bcteriol. 1986.166:385-391 8Ot* Can.J.Microbiol. 1996.43:612-616 94,95 96 _______J.Clin.Microbiol. 1988.26:2395-2401 (p1I31,A3 A 5 Staphylococci Staphylococcal Infections. 1997.
1:503-508 (Karger,Basel) Phi PVT.** Gene 1998.215:57-67 Staphylococcus BaSTC2 Felix d'Herelle Reference carnosus ________Centre,Quebec,Quebec Staphylococcus 1la, 2b, 3a, 4b, Can.J.Microbiol. 1988.34:1358-1361 epidermidis 5a, 6b, 7b, 8c, 9a, 10a, Illb,12a 13b 41, 63, 11811, Res.Virol. 1994.145:1 11-121 138, 245, 336, 392 550 Staphylococcus 1 154A, 1405, Res.Virol.1990.141: 625-635 saprophyicus 1314, 1139 Res.Virol. 1994.145:1 11-121 1259 Staphylococcus sp. Phi 812, Phi 13 1, Virology 1998.246:241-252 SK311 U16 Streptococcus faecalis VD 13 HER44 Felix d'Herelle Reference _________Centre,Quebec,Quebec Streptococcus faecium PEI1 Zentralbl.Bakteriol. 1975.231:421-425 Streptococcus oralis Cp- Cp- FEMS Microbiol.Lett. 1989.65:187-192 WO 00/32825 WO 0032825PCT/l B99/02040 Streptococcus pneumofliae Cp- 1 HER223 Felix d'Herelle Reference Centre.Ouebec.Ouebec Cp-1I* Cp-5 J.Virol. 1981.40:55 1-559 CP-9** Eur.J.Biochem. 1979.101:59-64 w-3I 03-2 Microbial Drug Resistance 1997.3:165-176 HB-623 HB- J.Virol. 1990.64:5149-5155 746 EJ -1 J.Bacteriol. 1992.174:5516-5525 Dp-2 Dp-4 1978.26:221-225 Dp- Virology 1975.63:577-582 w-3 o)-8 J.Virol. 1976.19:659-667 304 _______J.Bacteriol. 1980.141:1298-1304 HB- 1,HB-2, HB-3**, HB-4, He-S HB-6 J.Bacteriol. 1979.138:618-624 Streptococcus T12** Mol. Microbiology. 1997#/23:719-728 pyogenes A-i 12202-B I The American Type Culture Collection A-6 12203-BI 12204-BI 14918 Streptococcus 1 HER 339 Felix d'Herelle Refrence sp./Enterococcus 182 HER 80 Centre,Quebec,Quebec VD1884 HER 323 IA 12169-BI The American Type Culture Collection lB 12170-BI NN 21597-BI 42 19948-BI 118 19951-132 19952-BI Veillonella rodentium N2 ______Antonie Van Leeuwenhoek 1989.56:263-27 1 Vibrio cholerae Psi 92 Interviroloev 1993.36:237-244 VCB- 1,2,3 &4 ______J.lnfetion 1998.36:131 CP-TI J.Virol. 1984.51:163-169 VSK MicrobioI.Lett. 
1996.145:17-22 Phi 138 1986.57:960-967 Phi 149 1985.140:217-223 Fs-2** Microbiology 1998.144:1901-1906
e4 Felix d'Herelle Reference Centre,Quebec,Quebec
Vibrio cholerae (Cont'd) 138 145 149 163 NA4 M4
IV
IV
14100-B1 14100-B2 14100-1330 14100-B34 51352-B1 51352-B2 51352-B3 51352-B4 51352-B5 51352-B6 51352-B7 51352-B8 51352-B9 51352-B10 The American Type Culture Collection
Vibrio costicola UTAK Felix d'Herelle Reference Centre,Quebec,Quebec
Vibrio eltor J.Gen.Virol. 1987.68:1411-1416
Vibrio natriegens nt1, nt6 Felix d'Herelle Reference Centre,Quebec,Quebec
Vibrio parahaemolyticus KVP40** Felix d'Herelle Reference Centre,Quebec,Quebec VF33 VPl 4HA PELC-1
Vibrio sp. ct3a Felix d'Herelle Reference Centre,Quebec,Quebec NN 1 1985-B31 The American Type Culture Collection phi 51582-B1 149 1987.61:3999-4006
Veillonella rodentium N2 Antonie V.Leeuwenhoek.1989.56:263-271
Yersinia enterocolitica 2 3 4 6 7 8 9 φYeO3-12 1, IV VIII Felix d'Herelle Reference Centre,Quebec,Quebec Zentralbl.Bakteriol.Mikrobiol.Hyg. 1982.253:1 .02
Yersinia pestis R 23208-B1 The American Type Culture Collection S 11593-B1 Y 23053-B1 II Zh.Mikrobiol.Epidemiol.Immunobiol. 1990.11 .9
Yersinia pseudotuberculosis PST** 23207-B1 The American Type Culture Collection
Yersinia sp. RD2 Mol.Gen.Mikrobiol.Virusol.
1990.8:18-21
Table 2
>Bacteriophage 77, complete genome sequence, 41708 nucleotides
1 61 121 181 241 301 361 421 481 541 601 661 721 781 841 901 961 1021 1081 1141 1201 1261 1321 1381 1441 1501 1561 1621 1681 1741 1801 1861 1921 1981 2041 2101 2161 2221 2281 2341 2401 2461 2521 2581 2641 2701 2761 2821 2881 2941 3001 3121 3181 3241 3301 3361 3421 3481 3541
gatcaaaata tataaccccc aaaatttagt ataaat taaa attacatgtS aggcgccacc catatcaaaa aaacgccatc gtggaaacaa aaaacatata caaatttatt tatatttctt gggacgtgga cttacacgga aacatcgttt aacgccaaaa ggttattcga tatttttgat attaggtaaa gggttatatc tagtagattg gacgtgggaa gctaagcacg attcatgact gaaagaaata tggtttagac cgatgattac attagaacct tgtcattgaa tgaaaaagtc tggcataaaa tatcgataca gtttactaat agatgaagtc agacgatata ctaatagagg aacatatatg tttagcgatt agtattggaa aaatactgac taacgaggtt cagagaagag ttatcaacgt gacacacttt tgcacaatta cgaaaagaat aaatcaacta tggtaagaat aaatgttgcg ggaaaaaaac gaacgaatta aagttctggt caatcctgaa aaatgatgaa agattaaagg attcgacttg ttataattaa gagctcataa ttatcgcaat *cttggggaac ctcttataac gcttaaagaa *agtagttgat ggaagacata atatgaaagg aataatcaaa tgatgattac ggaaagataa tat tcacgag gaaaaatggt atagataaaa ggcgggaaaa gttaaagaat gatgaaatca gctccttatg tataacacat gaaattcatt aagaaaaata gatgcaatga tttgcttttt aaggcgaacc attgaagaag aagcgaatga ctagcgacta tttgcaaaca atttggttag cctattaaag attgaatata atagctgata cttgaagtac atgtttgcga aatgttgctg agAcgtaaaa gtagacaaag aggtgagaca cttgatttag gatagttgta ggtaatagaa ttatcaagcg ttaatcgtag tacgctttgt actttcacaa gtagaaagtc aaaaactatc atagaaaaat gcaatcgcgc agtaacatgc ttgatgattg acgcttgtat aacgcgaaac gtgaatoaa tcatttacaa ttagacgaat aaagaaaaag cgtcatcgtc tcctaaagat ctcaaatggt aggcaaagtg ggctggtgac ggttagggag cattttaagg aaagaaaggc ggtttaatta aaagaaaaag gaaagaccag caattatcgg ctatgattag ttttaaataa atgatgtata attttccaac atacagatga acggtctaat atcacatctc gaaccgtttt aagttagtaa caaacacaaa atttctttgg gaagaacgtt agcacaaaat attgtaagtt caatgttaca aatataacga atttgcctga atagagagat
ttcgagattt gacattcgtt aatgggaaaa tagttgattg attatagaac ttagaaatcc aacataacgt taaaaatcaa cggatggatt acatgtctaa tgagtattct atatgataga ttgaatttgt ttcaaaagaa atagtttttg taagtgacag atgatgatat tgcaagaggt tattcgaaga aaataagagg tacaagcgtt ctttgataga ctttttctga gtatacctcc ttgagaagtt tcataacaca ggaatgaggt acctgattac atgaaaacac tccaacgaag gttttaacac ggtaacctag aatgttcgta cacatcgaaa taaacttcgc caggtgatga tacaagtatt ttcaagcagc gtgattatga tagccaaact atttattgcc taataaatac agaaagaatt ttttgatgaa attaccattt agc tt tct tt aagtgctatt cattgttgct aatggataac agcaaaaata aaccaaagac tcctgaaatg ttatataagt tgcaagtgta agacgatcca taaaccgtta tttaccattc agttgacctt accaaattta tgcaagtgta tgtaagacaa aatgggatta gtttttaaag tgatattgta aaaagcaata aatatatgga gccggatgga catggctttt agcgct tgat agaaaagata agatctatca tgcgcgagct tgatgtttac gcaacaagtt caaagaat ta attcaaagat catatattta ttacgggaaa gattttgaaa cacaaataaa agg tt ttga t attgagtgag aggtttgatt ctgtttaaca aagcatgtat tc.aatatO L gcggattatg taaaaactac tttgaaaggt ataaatgggt aactagaatt tagctggtag tcacagcaat tgagtccggt gataatttta aatggagatt atataaagac aaggctacgt tttatttact atttaatgct cgaagagaaa gttgatgaat gatctcttta cagaaaatcg caaaggtt ta acagaatt tg agtgattttc aatagtgaag aaacgaaata ataaaccgtg ggtggacgtg gtaaacgtca actgatggtt ttaagtggca aaagaagttg tcagaatacg aaccgttcaa gaaaaagtaa gataatcaaa gggctattat gggtttttgg ttgaccattg gctagagaaa agacgtgcgt catggattac gacaatcctt aataaagagt gttcacgcat gcattaatga tttaaaacta caacaagcgt gtcgctcaaa tacaagttaa atatataaac cttatcgcag gtaacggtta aagtacaaca atattcggaa tctgcctcta ttattcaata tatgaggaat ctaatgagag tacggagaaa cctttattaa ttgaaagata 9aacazz ttaggtgaag gaaaaagcta ggtgatgaag ttacgaaatg tagtgatgaa tgaaatatat agcagcaagt tgctagaatg aaaattcatg atagtcgatg atacctagca gtaatgcttg caatctgaaa agagatgctg gaagacacag atataaattt attaictaca aggat tgt at tcatagctaa ctattttcat tttctacgcc atcaagcaaa agacgggtaa caactaaatc aggggtgtgt aacgtggtgg ttgttagaga aggttaaaaa atgacagaca ctaaaacact ataagcccga tagcaccatg tgtgtattgg tccgaaaaaa 
atgatgtcaa tcgatgatga aatatgggct ttgaggatgc t tgcaccacg tgatgcgttg atatcaaaaa tatatagagc gtatagattt ggaaagatat atgtgaaacg gtcattttaa atataaaacc taatttatga atagctttta aagattatac acaataaagt gaatgatagg gcgcatatga cttttaataa tatctaatgg atgcaataaa cagctgattt aaaagattca caagaataga &aaCLt9L aaccatcaga acagtggtga atgaaagcgg cttggtatgg gatgttgata acacatttaa gcggcatcgc atgattcaca WO 00/32825 WO 0032825PCT/I B99/02040 3601 3661 3721 3781 3841 3901 3961 4021 81 4141 4201 4261 4321 4381 4441 4501 4561 4621 4681 4741 4801 4861 4921 4981 5041 5101 5161 5221 5281 5341 5401 5461 5521 5581 5641 5701 5761 5821 5881 5941 6001 6061 6121 6181 6241 6301 6361 6421 6481 6541 6601 6661 6721 6781 6841 6901 6961 7021 7081 7141 7201 7261 7321 7381 7441 7501 7561 7621 atccttcaag aacatgttgg aacttataga gttttgcgga aag tg ttatc ttaacattga aggaatcaga ttttttaata aatgcgaaaa gaattgtacg gaagctgaaa aatttcttta gaaacaattg ggtattaaaa gtttggggta acagcaattc ggtcctgcgt cttgaaactg gtacaaaaag cttacatttg tcaactaacg ccgtccgatg gttactgctt gttttaacgt aaatttaaag tacggcaaag aaaccagctt gaaatttaaa aggggagt tg aatcaaaaat attattagaa aatcatcgac aatcacttga tgtcgtacga aagaattgat acaattacag aagaaagtgt tataagtata agctgt tggg ggaacgcaaa gaagaacatt gtatcaccag gtgtgaaagt aagagatggt aaataaaaaa gtactgaacc ttgaacgatt aatttgtaaa agtattttga tcatgaagtg gttcaataaa cgacgaccca ccaaatagat gatatctaat tggaaaaccg cattttttat tattaacatt tagtgatatt aaaaacagct aatctcatta ttatgatgaa atggttcaga gtttacaaat agaggttgaa tatctttgat tttcttaaag aactttgtaa agcattaaaa tattgcgcaa tcaaataatq aatgatggct tagtaaaatg gaaagatgta tat tgacgca aatcgatgtt caaaaatagg acgaatttat gtgacatgat gagtttctag tggatatcaa atagaatctt atgctggttt aaatctatgg aaaataaatt ggattgaaag cgttcttaaa gtgtatcggt ctaatccgcg agaaaggtaa cttttgaggt taccatttaa acgttaaagg aaacacttgc cgaaagataa tagaagatac gttgttagag tatccagctg aagtacgaca ctatgcgaat ttattgaatg aaagattgac gcgtataaaa acttatacgc acctgaaata t taagaaacc ctgaaaataa cgagtattga atgacattaa atcttgaaat atttggataa gacaggtgat aaaagttcaa acaactcaaa 
tgaatggata tagaatagta acctaaagct gacgctaaaa attagtcaag taccctaatg atacctacaa gtttttgtta cgcattcaaa gaatatatag aaggaggaaa actggtttag acaaaaacaa tatgctgatg caaatgcatg gatggcgttt caagagcgta cctaaaatcg ggtgaggcac tcagctaaca aaaattttag c caaaaccggc cacttaaagtt ggagaagtga gctgaggcat aaggaaacgt tttgaaaacg ttaaatcgtg atagcaaata gcagatagia aggtcataaa taatgcagta taaccaacta tttacctaaa taagagtgtt cgaagattta gcgtttgaag tgaaattaaa gacagcgttt atttgttcgt aggtactggt aactgatggt cgctacggtt atcagtagcg tcaagcacag tttgaatgtt tctatatgat gttagatgat t aaag ttgc t cgaagaaaca aatttaaaga aagggtataa aagtttatat cattacaaaa gtgaagacaa cataattcag aatcagtgcg gctagataig atagattttt tagaattaca tggtccagaa tggtgtctgg attgtatatt tgaatcaaga taaagacttt aaagcattag gataaggcgt ccttcagaag aaggggaaac catttaattg atgggtggga agggagttga acagaattat taaaagatac cttatactga agtacaatga agttattatg aagaatttaa attaaatggc gtttcgctaa gaggattaca gcggtccaat :gttccctaa acgaagagaa iagacggtac itggagaaac :tttcccttt tgacaaatca ;cgaagaata ttcatcggaa :ggcgacacaI aagatctaaa atgcggttag ggctaaatgc acaatatgca taacagcttt aagtaattga aattatcagc atgactataa aacaacggtg tttgaagaaa tcagcacaaa ggatataaag acaacgaatc ttcttaaaat ggtcaattag gttgttttac gttcaaatcg aaagaccaac gcttatccag aatgaattga gttaaaggta tatacacatt attgagtcta ggttatttag atggatttat gctgtttgga ctataaaatt catagagcac caatcctcgt cgtaccttta aaaagcgtct tgacgattga aggatgagta gagtttttga cttatcaaga cgttatctct actaaacgtt gctggagaaa ttacgtgaat cgtgatccgc tatttcaaaa attatgattc aaagagaatt taatagctgg actcaggagc gtactgttac aaaatggtca ttaatagagc aaaaattgtg tagagagcac tgatgtacct cggagatgag tgaatataat gtctgaacta aacatataga agtaaaacat attaacgaaa aaaaattggt tgaaccaggg agagattcgc acaaggtaaa atttagaaca ggctgagaaa agttgataat .gatggagac tactggaaac actgcggtaa tacgatttaa tcatgctgca agctggtaaa tgatgaagcc aattgtagca ggtaagtaaa aaaaataaat aaatggattt atttatcgga aaccgcaaga ctaaattaca ctttgagtgc aagaaaaact atccattati ccgaaacttc atgctgcgtt caaaagattt aagaagcat t cgattggctt agaaagaaga cgcaagtgtt atgtaacaat taaatgcaaa 
cagttcaaga ctggtggtat acactgcaaa aattagattt ttatgaggtg aatcaacaca gttgaattgt gataagctga agttcaatgg tgatttgctt cttaaagcag attagagaat tttattagaa aatggaggta taaatacgcg aagaagaaaa tagaacaagc aaggtgatta atcgtttgaa gcggaggata agaaaaacat tgctaaggta actgattagt aattaggtgg tgttgagaaa aataagacaa attgatattt gtaaatatca tttattgtta tgtgcatata gcgagaatca aaaatgggaa agctctcgcg gcaagtgcgc gaaggcgcgg gttgaaactg aaaattgttt caaaacaatt gttttattac oattgggatt aaaaagtcag ggtgaaaaag gtgacagagg agtcggttaa atgttgtagt gaaacattag aacaaacaag attgaacaag agcgatacac acgccagagg atgaaagaaa tcaagattcc aacattcgca aagacaaaat agcaaaagca aaaccaaaga tttaccagaa agctgactta tggcgtggct cagtgaagaa aaatgatttt tgcagtggcg aaaccgtcaa acaaggtacg taaataccac ggt tgt taa t tggcgtatat agcaggtaag taatgttcag acaatttgct aaaaggacat ataaaatggt agtacaaagt tgacaaatca caaaacaaga ttaaaagtga gtcaaattta ttgttaaaaa ttaataggtc cacttcaacg tcagaagatg tgttcatttt attattatat tatctcaaac tttacccagt tataaagcaa tagt tcatga tttggcataa attgttgaag gagattggtc cgtgggcct t aagtcaggaa gggcaaaata tgtacaaagt ataatattaa ttgacgatat gttatattgt taagaaataa atgtttcaaa tttacgaggg caaaggcgta aattaaaata gtggagaact :taatgaaga acgtagctgt ctaaagttattctcaagtga tacgtaagta 3cgaagaggc ;taacgaaga tataccagat igagccatct WO 00/32825 WO 0032825PCT/I B99/02040 7681 7741 7801 7861 7921 7981 8041 8101 8161 8221 8281 8341 8401 8461 8521 8581 8641 8701 8761 8821 8881 8941 9001 9061 9121 9181 9241 9301 9361 9421 9481 9541 9601 9661 9721 9781 9841 9901 9961 10021 10081 10141 10201 10261 10321 10381 10441 10501 10561 10621 10681 10741 10801 10861 10921 10981 11041 11101 11161 11221 11281 11341 11401 11461 11521 11581 11641 11701 aatcaaagta gatggtcaag atgagtgaca ttgaaaataa agaagatcca atttgaaatt gatgaagcca ccaattcaca tcgtgaacaa ccagaacatg actctcatga tttcattatg gaggctttaa ttttagaaag tagatgcagc attctgactt acaaacaaag atttagccaa aaaagttacg tacaaaaaac tggcagaaag caaaaatggg ctgttttagg atactgttac ttaaagatgt aagttaatac tgaaattcag caatgggcga aagcggcgca gcgctccaat gggaaaagtc attggggtaa 
aaaagacgcc caggtcctga aaactattga ccgaaagat t ctattgaaag t tgat tgg tt ttgctgctgc atgcagtaac gttttttatc ttggcattgt aatctgaaac gtaattttat cgatatcagc atgaaaacgg tatttgaatt tgcaatttat gtgtaataca tcgttggtga aattaatttg actttggcgg tcagtaaatc atagcgtaaa t ccgtacgaa ctaatttatg tttggaattc gtggaatttt atatcggcgg actgggtcgg acacacatac cagttgggga tccctaacgg gctcaaaagt ttagtttagg caaaagataa attttatgga t caat tctt t agttattgaa ttactgcgga ctataacaat ggagagtatt aaagcaaatg gtatacgaag agagaaatcg gttaaagacc gtgattttca aaataaagcc tggacttaat tgctttccat ttgatgcatt gaggtaaaaa aaatttaaat aaaattaaca gattaaagaa gcaatatgac acaagaatat atcagccgaa tggctgggga tgatggttta tattgcagca tcaagcaaca ttatggcaat aaggttaggt tcatataaca tgcaggtatc agctagtggg gagagctatg aggcgttaat agc tggtaaa ggatatagct tttagcagac agattcccaa taaagtagca tgcgtttgct ttccaattta aa ttggt cct tgtattagct gactaaagta attaggtgta atttagaaat tcaatttatt aatagttgat aatttccatt tattttaaat t tggccggcg aggtgcttta t tggcgagga gaatttagtt gttgctaaaa tttatcagca atcaattttc tacaatagga gaatgcgacg cattaaagat cacaaatatg tatggtaagc tggtaagttg tac tacaaga taagggacgc taaacgtgta atacaacggt tactatgtgg aataggtaaa aaatccaggc aactaaaggt atacacaaca agcacaaggc aaatgtagaa ataaaatggc aaattaaatt caatggattt ctgacagatt taaaagaacg ttactcaagg tgaagattta tgaaaatggt atatcaaaat ttaaccttaa atgggagaaa agatcatttg ggcaacaact cttgatggaa aaggtatctc aacaaacaag tttgaagagt aaaaccagta aaatccattg gcatcaggaa ggcgcaacag tttccagcag tttacaggta ggttctgacg gaagcaagtg ataagtgttg ggctttgaga actgaaatag aacccaagag agcgcaacaa gctattaaag ggcacagtaa atgaataaat cccgtaatgg agtgatggt t gtagtttttg ccattgttag cctatattag ttggctggtt tttgttaatg caacctttcg ttcgcaaaag gttcaagcac tttgtaatta gttaaagcct aatatcatac gtttgggacg caattatggt ggattgatag atttggaatg acaaatatga aaagcgcagt aaagaaattt aatacggtag cgcgatggct gctattaaaa ggaatggata ttagttaaga ggaaatggtc atcacaccta gcacaaactt aaagatatta ggtaccaaat aaacttttaa atgggaattg gatcaaacga attgctacgg gcataagagg aaaattaaaa acaaacgtac aatcgatgat gatggatatg tatgcatgca 
tcaacaaact acatataaag aaagacgcta aaaaataatg ccgtttggtt gaataaaagg cagaaatcaa tcaaatatac ctatcacagg aagaacaggg caaatgagct tcaaaaaagc aag tt t ttga gtaaaggttt aagcttttgc gcagtgaatt atgctgaaac aagaact tga gtgtgcaagc aatatcaaag atacattagc tgaaagaatc cat tcagtgg aagaatttaa gtttagcgat gtggtcgctt accaaacatt taaaattagt aagaattaat ctaaaagatc ggttaggtgc ctagtattgc gaactgtctt tagcagtcgc gtgcaattga ttgattctgt atatttggag ttcaaaatat aaccaattat tgattgtcag ttggcttgat ccgttgtgat ttgtaggtaa caggaatttg caacaaaaag aaaattggtt cattatttag ttagtaattt gaattgcaag tgagttccat aaggactcaa aaatacctaa acggtaagat caaatggttt atacagatac attcaatgtt aatctggtgc ggcttggcga attatatact caggcgacat atattgtatc t taaagcaac gggcaacccc cgtaacatta t taacaccac attgaggacg gt tgtaaaaa cctgatggaa gaggaaacta caatgttgaa acgaagtttt acatttctga agggttattt tttatctata acgaaacttt cgaaaaatca ttataagaaa cgaaaacagt gaattattta tcaagttgaa aagtatggga gatgattggt agaagttgat aaaaaaattg tgttggtgga aaatgccaca cgtacagtta tgttttggat tgatagtatt aattgcttta t ttgaaaaaa gaagaca tta tgaagcattt tagttatcaa taaagattct aggtgcigat caaaaagcta aattgttatt atttataagt aaaggctggt cacagcttta atttacaatt aagtgttaaa taaaaacatc tcaaatcaat atgcaacttt gttcgcgatt tacttgggag taagttcttc gattcttaaa aatacttggt ggacgtaata tatttttgga atctaatact tggcgtcaaa aagaaattgg ccgtttatgg tatagataag tLtaaLtaaLL gttacacact tgcacgtgac tagaaatgaa taccgcttat aaacggaacg atcatcggca taaagttogc tgaagctttt 5 aacaaaagct aatcaatagt agttggtaat tctattttat ttcaattagt acttcatttc aaaatagcac tttacgataa tgaatgcact gaaattttat aaatatggat aaaaatgcca agaaaaagca ttttgaactt ggtttggatt aaaactttaa actgatagtt aacgttgatg gcagaagctc gaaagagaat gctcaaagaa cctaaattaa gtaactgcac aaaggtttag cagaactcat gttttaggag gagtcattct attacccgtg atggtagcaa actaaatacg ttctctcaat gctatatcaa gcagaaattg ggtgcaaagg gaatttttaa gaaagtggct gtatgggctt tctatagcgg ttcagtggta acaattggca ggattgatta actggtccaa gcttataaga caaacattta tttaaacaag ggattcttta attaaagcga tggcaagtga aacataaaag tcaagtttat ggagcagttc gttgttaggt agaagtatat 
tttttattta tggagcagta tcaaaattta atgtcaaata agtaaggtac attaaaagtc 3gtacagagc acattcgcta itgattgaat.
ttacctaaag :ttccaagat tttaactgga 3atgttttag ~gaattgatt ;catggtcta WO 00/32825 WO 0032825PCT/I B99/02040 11761 11821 11881 11941 12001 12061 12121 12181 12241 12301 12361 12421 12481 12541 12601 12661 12721 12781 12841 12901 12961 13021 13081 13141 13201 13261 13321 13381 13441 13501 13561 13621 13681 13741 13801 13861 13921 13981 14041 14101 14161 14221 14281 14341 14401 14461 14521 14581 14641 14701 14761 14821 14881 14941 15001 15061 15121 15181 15242.
15301 15361 15421 15481 15541 15601 15661 15721 15781 agattaagae atttagtcgq cttataccgc aagaagttac atggtaatta actttagcaa ctggtaatac gacattttga gtggtggcgs cgcaaagtat t tgcaaaacg aaagaggaga ctaaacgtgg acattgttag caggtggaaa ttattccaac cagaagtaag acgggt ttga ctttattact ttattgacga aagaatcaac taaagtgaac ttttaattat gcgtaggctt tcacaacggc cgaggaacaa aggaccaata actaacagac agtttcagtt taaaccatct tgatgaggta tgatttcaaa taaggtcggc tcctgatgca agattttcaa agcacaacat atatcatgat caaaaagata ttatatgcgg cattaaagac cggtaagttt ttataagtgg gaaaggcgca aaaaagtgt tttcaatgtt gacggttaaa agatittaac gattcataaa aagagctgaa gcgtgaat tt tatagcgtct gaaaaagaca tgaacaaacc tgaagtttta tagctctaat aggtaaagaa agaaatcaaa gctagttgtg nnnv.I-n 9a Iagccaaaaca tgatttggaa acatagagat cataatttca attacgagaa tagcaatatc caaaatacac tacaagtaac Laagtgctact Icggaatatta :tgcaactgga Iaacgccgatg Ltgtaaaaatt Latcaccacct !cggatttagt Lccctgaacca ftgctacttct tttaggtggt tgaaagtaac cccatcaaga atatactaac acgatatggt agtttttgat agatccagct agggaaaaaa tgatcctagc gaaaatagca atacgctttt aaaagtaaag aacaaaacaa gttttaaaaa gaatcttata attaaaacac gttaaattac aagctgcaca ccttacaaat gtaaatagtg agttacttta accaaagaag ggttggacta ggtgactttg aaaggttggg attacctata atttatgata agaaaaatag tacgactatc c tcagaagag ccagatagac tatcagcgtc atggagatga agggatgtca gtcatcaatg gattctgggt tggcaagata gacaagatta cgtaatgtta aagttccgtg attattaact tatcttgctg acttcagaag gaatacgatg aagcaattat accgtcaaag attgaatatg acagcattaa acagatgacg gagttaaata gttacgtatc tttaacccgc gaaaatagcac gagtttaaca i aacactatagt aaaagtgata c cctgatgttg c gattggataa gaccctgaca agaccatttc ggtggcagac actagtggcg agtggcacga acaggaccac tatttaagga ggaagtggcg cgttataaag taccagtcaa ggattattcc tttaataatc tggggtggtt ggttggtata cgtagaaatg gcgagtaaaa ttattattga caatctaacg gataaaaagg tttagaaaag ttccttggtt cagaaaatgt gttttgatat atgatgacgt aattcaaatc aagaatttac attcagtaac ggactgctga tgattactaa ttaaggatta agatgattac tgatatccaa ttggtgctgg aatgtattgt gtgatggtaa gacatattgt agaataaacc taggtaataa gtaaacctat cagcttctat 
atgggttagg ttatacaaaa aggaaccaat acagtgaat t gatatttata tagatttcct atgacaattc aacgacatcg gggttcaaga atataacaac cattgaaaga gcttacgtac gtacaaccta 3tagatatgt 3taaagattt :tgctgtggg iagcgcaaag -a-at -tca iacgtaagtc :gcacgagat :attgtatgt i ~atatacattc ~gcgattgaa :taaagatgtt :accgccaga z :tgtcttgcg aattaggtggt aagaaaattt aaattaatta atgaaggtgt t tacaagaat t tatcgatat tggtaaagcc atttacatct atgctaagaa caacttatgc gtaaatggat atgcagtgaa aaatcatcgg cagtacatca ttaaacgtgc acttaggtga atgcaatgaa ataagcgtcc aaatgattga atgtgattgc tgaacgcgtc gaggaattgc gtatgtcgaa agatggacgt acctttggtg cttgaatgaa taaagattgg aatacctgtt aggaaataaa cactccttta aaatgatgaa catgcctcct tgaagatatt tcttggcgaa cacgaaacga tgaacaaaaa gt tacttgct tgttacgttg gataatgtat attttctatt tgatatggat catagctgtc ttcattcaat aggtgattta gttgagcgag aatcatacaa gaaaggagat ttctactgat agaaatgctt tgttattata tacgatggac agctaaaccgI tgtgttgagc tacgtcatgg taaaatggtt agtactcaaa agtcgggtta acctgaaaat tcaattcaac aaarnrnmV- f ~gcagttatg :atatcaatt C igaggcagaa :ggtcaacct z :ataatacat c :gtagatggt Laatccagtc z :agatattgg a :ataacaaga g agaagctatg tcattatgga cgattttcca gccatttatg gctattgcg cggtgatgtt tgaaatgagg aaaaggaaga cagtcgagta tcatgaccaa taactgggat ctcaactttt aggtatctca tggtgattac agacggtcat gattttgcat tagccaat ta acaacagcaa agataaagat tatagaaaag tattcaatga agagggtttg tcggggtcta gtacgtaatg ttagtaaagt tactggaacg aagttcacta aatactgcga attgttgaag gattatttta gtttatcata ccaagtaatg ggatataaag gggctcccta ggtaaaggtg t cta ttggt i tataaccaaa aacttggaca aaaacttgga gagaaagagt tatagtgcga acggagattc gtaaaaatag aaatcgtttg cctgaaaacg gagagtgtga gacccttcct gaactgctca agggattcaa ggctacacag tatgcaccag gatacaggtt acttcttatc :tagattttt aagaaaaaca ictaggaaga jacaaagggaa :tacctatgc tcatatgaga ~gcgatacagt ;ttattgctg a Laagagttca a :aaaagttaa a jaattagaat a Latgatatgc t ~atggtcgat g ;agaaagcgc t ggcggtggcg cgtaccgcag tttgtatatc tctggtggtt catttgaaaa gttggtttaa agaaatggac ttatcaatag atccgacaag atgatgcgcg ataaatgctc agagcaaacg gcaatgcagt gcatatgcta ccagaa tgga 
tatgcagcag tcagacttaa caacaaatag tatcagccga cgagaaaggc tagacactat aaataccctc tatataaagg actatttac tttttaacta cttatttcga tcaaagtagt tttcagacca cccgagcaat tggttggtga gtgagtttcg acttaggtgg caactaattt aagcgatgac ccggaagaac atgaaaataa aaggagaccc gaatcgttgt aatttgatca ggatagatgg agtataacgg taccgaaacc atatgcaagc gaagtaatta tctttgatac tacatgtttt tagttagagc tatcatcaga acaaacaatg agatagaatg gcaaatttga gggaagtttc aaactagata atattgagct ;cttattcaa ttgatatgtc igcgtttaga ;ctatatttg ttacttctac :cagagtaaa ~agaatataakagaatcaga ~cgataatat lctttgaacg :ttggtatga ;ga ttgaagc :attcagtga aacaccaaat gatgttgaaa WO 00/32825 WO 0032825PCT/IB199/02040 15841 15901 15961 16021 16081 16141 16201 16261 16321 16381 16441 16501 16561 16621 16681 16741 16801 16861 16921 16981 17041 17101 17161 17221 17281 17341 17401 17461 17521 17581 17641 17701 17761 17821 17881 17941 18001 18061 18121 18181 18241 18301 18361 18421 18481 18541 18601 18661 18721 18781 18841 18901 18961 19021 19081 19141 19201 19261 19321 19381 19441 19501 19561 19621 19681 19741 19801 19861 attaaacaat agaattactg tttagacgct cgaaactgca gaaattacaa taaattatta aacaaaattt taaatcagct aacatcggac tgagagaacg cggattggaa tgagattaaa cattgatgct t tcggaagaa aaacgcagaa ggtcaaagaa acaaaatggt tacactttca atatgatgat tgctgataaa agataaagta tatcaatgtt gaatgattct agacgatatt cggttcactt cggtggttca tggtataaca tgttctggag tccaaacaca taatgcttat tgcgggtatc atatgcaaca acgacgtgat agatgatgca agctaatttg caagttatct tattcttaac agagctgaga tttgattgct aggagaaatt agaacaacaa gattacaagc cacaagaaaa ctgaggaaga atggcaaaag ggtacagaac aaccatgctc ttgttatatc aga tcaga tt tagatactca aaacacgaac acactcaacg aaaaccttag gataagaaca tcgctaatta tcggattaaa agttaagagt gatgcaaaag ocoaacaaaa actgtagtcg gcaaatcaaa gcgccaatta gtggttgata ggaaacaatt tctttatgtt ataataaagc cgcaaaagtt aaattgttga at tt ttat ta aatagcgagt gtgattgatg acgattggtc gatgtttata cagtcacaat ggtttaacgg attgaagcag tataaaacag actttaaaag gaacaaaaac gcaagtattg caagatgatc gagcaacgcg ctaaaggcia agcacagatg aaggaaatca aatatattaa 
aacggagtgg attgatatta gataaaaccg aatagaattg attgaactag tttacgcgac tatatgtcac tctggtacga atcaattcct tct tacgc tt gacaaagtgc tcgagtgacg aggttttcta ggtggagata ggtaataggt ggagatagga catattactt at cgaaaat c ttacctatta gaagatagaa gaagaggtgg gaaggtatag ctaagaatca taatcctgaa cgcgatgtta gtaatcctta aaat tatcaa gtgtagtata aagattttaa aattaactaa tatctccaga tagtctttat atgaatggcg aaattaaatt atgctattca tacgtga tat tagcattatt ttttggagct cagtgcttcg taataacaag otattaoccc ctttatatac aattaaagaa aagaagtaat tatgttaatg caacccagat agcgacaggc aaagat tgaa ggatattgtc gagcgcaaat atttatctat acttagtaga tttataatca ggttggtaga cagatgtaga acactgatga tgaatgaaga ctagagaatc acaaagacgg gtgaaatcaa aatatactga aacaagcaaa ttaaagagaa ctatacaaga gaaacgctga cacagaggaa aattaagaac acgagattgt ctcaagcttt acggtaatag atattgtcaa gaattaaagg gtggtattgt tgaaagacgg attttggtat ttcaatggtg atggtggtgt catcgaatat ctggat taaa gttatattat aagaaagaaa caacaatcga atattcatat tagcttctaa ctgctggcac aatataacga gaacgtggtt aattatcgga agaatttagg cgtatgatcg agaaat tgga tatacaattc aaagcgtata gcactatttt caatacagaa tcaagatttc atctgaagaa caaaaaacaa ggtaacagtt tcttttagaa catcagaagg aggtcaaaaa aaaagaaaga gaaaatgtgg gcgtatgctt tcgctgtgga gcactggctt atacatcgta aattccacta aacgtataaa atataaagct gacacctacg acaaaaaatc ggttggtatg gaaaggctgc aaatatggtc gttttcccgt ttaaatactt acaacacgct taatgatttg aattaaaaat tacacaagct agatgtcaaa aaaatataaa tttgcagtta cacaaaagaa tattgttgaa aga taaagt t tgaccagtta tcaagaagcg ggaatcgcaa tgctcaagct aaagaaagct aacattgact tactaaagaa tcaaaatgtt gaatgtgggg agaaataaac cagtcttaat cggtgacaat gcaacgtact tcacctaaga ttcgacttat ggataaaact cgttgcacta caaaagcaaa ccgatttgca gtttggttct taaaggtctt agcagggtat acagagtaca ctcaatttat aattgggcgt tagagatgaa tgataaagct agacacctat attaaaagag tctatggatt ggagtcaaag attatttatc tacaagaaaa tatacaaaaa aggtttattt acaggaagtt aacgctaaga cgtgtgaaag aacactgaaa agcgggtgta ttagaagaga acccaagagc gaaatagatg gtgcttggtt atgggcatat cgtgtttctg tttattttgg ttgatcttag cacaatcaaa gacaatccaa gaaaataagt aatatgaacg 
aagcagaaaa gatttcagtg aaggtttata aaataattaa caaagtatgg tcacatcatt agtcttttgt aaagcggact aatttagaat ttatttcttg atcgccattt gaagcgt tgg gtcggagaac caattacgtg cgtttagata acgt taaacg agtgatttgt caagaagct t gcgtatgctg aaacttgaag aatgcttata cgctatggtt gagtttaatg acagatggaa ccacgtggta cttcttatcc ttatcaagag aacagatatg tggagaggga tttagaaata attgatggtg tacagtgata acgtcagata caggcaccgg ttcacgctgt gatgagaact gttcaaattg ggcaaattta gacctactgt agacgtactt tcgacatcag caactggaac gagtctgaaa aaacttgata tttgtcacgt catcttatcc aatgcaggat acaggaaat t taaagaaaat tttaaggagg tagtacaaat ttacaacttc aaattgcgga tagttaaaga cagtatgaaa ctgaattggg atgataaaac aagttaacat aaaagaataa tagttgggac aagagaggtg gtttggtaag ataaaaggag cattagtaaa ctatatcatc catctcaaga atagaaaagc acacaaatga atggtttgac ttatgattac tgcttataat aaactatgac tggcggagct tggtcaaaac cagaagctac tacaagcaag ctatgacacc agtatagaaa cagatagait aaataatagc ctaatgttgt actatgtaaa ctgctgaagc aatatcgaaa ccaataatcc taaaatcata atggtaaaat aggcaaaaca cagacaacaa ctcaaattat caaccaatcg caacaatcag ttagattaaa aaaatatgcg agggtcttga ttcaaataca aacgttcaac acaccgctgg aaggtgaaga gtggcatgaa ataatcgggt tgtatttata ctaatgcaga atgattacgg ttaatggacg atatgctgaa c tg taggt ic attcggccgc cgcgtaaata attcaaaagc ttttagctag gatacgtagg atgatgacaa ctgttatcaa aacaaacaag atgaggttaa caacaatgtg tcatttaatt cgacaaagaa tgaaatggtt gacgttaaat agtagttgaa agctatgagt gtggttcaaa aatgctcagc taaattagat gaaagaaaat aatatttggg attaccatgt tgtaagtaat caaacaaa tg tcaattctta aataatactt aggtaaatgg aacagggcaa tttagggtag aattcattag gccaatatgt atcccgtttg agctttitac ggacacgttg tggaacggta WO 00/32825PC/B9004 PCT/I B99/02040 19921 19981 20041 20101 20161 20221 20281 20341 20401 20461 20521 20581 20641 20701 20761 20821 20881 20941 21001 21061 21121 21181 21241 21301 21361 21421 21481 21541 21601 21661 21721 21781 21841 21901 21961 22021 22081 22141 22201 22261 22321 22381 22441 22501 22561 22621 22681 22741 22801 22861 22921 22981 23041 23101 23161 23221 23281 23341 23461 23521 23581 23641 23701 23761 23821 
23881 23941 aaggttggac ttcattatta t tggcaataa aacctaaaaa acggaacaaa taagacatgc atcaagatac ttaaatcaca caagtggtgg tacaagatgt tactaaatgt ttattactaa taatagccgg catcagctaa atgtccctta taagagacgg ttacgtatga gtggacaacg gttttggtaa tatagggaat tttttaacat tattttttta catcaactat gatagagagc taacagttta tagccgggca cactcga tac tcgatacggt attgtggtta caaacgcttg tttggacgct cttgtgttaa aaaaaagggc taactctctg tgtatgtcct tatgtgtgta agctgaggac gaatataaac cattattttt tgcggtct Ca gccattaata ctcacctatg agctctatac catctctaaa cttctttggt ggcgtattta tacgtttgat taaattttga cgttacttta aaaagtcaaa ttctaaacga tacacgtttc ttcattgttc aaaaaataat aaaatacaga aatatatacg agggatctgc tattactgga tgatactatg atcagatata ttttttatcg aaacaatctt ttctttagag catcactatt aaaaactatt aaatctcttg agtttttaat taatggcgtt tgacaatcca agctaaaggt aattatgctt cgaacgcgat aggacatgaa tgcatacggt ggggtatgac gcatgttatt tattaaaaat taatgtatca taaaaatgat tgcgattcat aaacaaaaaa taaaaaagaa ttattcaact cggtgcatat t egtt a a Ca gtttagcacg cttacagtta ttctctcaag tgttatagct ttacatctat atagttttca cggggtgctt gaggccatgt atatatctta tatatttat t CCCtgcgt tggaaaagct cgtgtacgtt aaagccttta agaaaaaggg C ccatCCC c C actcttttca Cgccttagtg aatcgtttgt cctctatcaa Ctcaatacat gtagtatctt gcgatcgttt cgcatacctg tgcatgttat tagttataca agtgtgacgc atagcttctt aatttgttaa gaactgttct aagccagatg gtttttaatt aacattgcct catttatctg ttatCttaa aagggtaggc cgccacttat tgttttaaag aatatattat C C C Caa C C ttattaatgt aattcaataa aaaacttctt aaataatact gataagggaa gcaaagtgtg tctccttgtt agtaaatagt Ctattaatgc gcgcaacctg atgtattCa attattaagc gtagccggtc tttatacgta gttgcattat gttaatgtag attgttctag atctcaagtc aacttaggac gcagaaataa atggattgga ggtaagccta aatccaccag caaggcaatt aattcaagaa tgtattaatg gcgacaggag atttagtatt ttaaataact atttaaatgt agccttcggg ccC Cgttcac Cactactccc ttatgttata atctgactgt acaacataga cccctacaac C CCC Ctgggg aaaaggttaa agagaatgac atatcagttg cagatacctt ctgttacatg taattgcttt tgtgagtagt ttatcctact catagcttgg Ctgctatcct tgtgaccaaa tattCCtgag ttaaagcttg tatcgttcag tCttcgcttc tatttaatat 
tcatatgtcc taaatgtttg ttttgatgtt tttatatg cgcttgacga ctttttgcga tatacggatc atttttcaaa gggctaccca aattataaga gataaacctt tattaattct tttggggtaa ttctgtcaat aataatcttt ttaatatagc cccatttcaa taacatttac aattagaaaa taaactttgg gaatatctga gtttttctat gC tggggtcc ttaggttaaa aagcgactac atggttataa aatatataac acggtggctc gcaataaaaa aaatacattt aattcaatgc aaataagagg atataaatta ttaagaaaaa taggtggttt tgccagcagg acacagtagc ttacaggggt gttatagatg aggtagacaa tacttagaat atttggatgg agataacagg ctagtttC ccaagcatgt cgtagtatat attgctttta tggtcccaca aatgttacat caacaaaacc caaaaaaagg aaa Cgacaaa cggtttacca C Cacaaagga ttagtacaca tgtatacacc taacgatata aactttttta gccttgcata ttcccattgt tgaattgatg tccagcatta gtcaacaCt aacttctaca tataaaatcg ttcttCttct gtgttcgtt aagttgacgc catgtactt tttgattctt atattcaagc cttgttgtCC ttgctttgta ccacatttta tgaaaattgt ttacatggtt taatatatta atttatcagt aacttttctt tttatttaat agtgatgaat tgaattat~t atcaaaattc tatatcctcc ttctttatta ataaaaacct atctaacttt attatgcgtc tgaaactgtg cC Cccctaac aaaaaaagag cga Ccc tgga gcctaatatc aagtcaatca agattatggc agacgcagca agatactatt tgtgacacct tcgtttatct ctatgacttg ggtagctggt Ctatacactc taatgttaaa attacccaac gattacttaC ggcaggtaat aaaaattttg atgttaatat caggtacttc gttatgatgt cactggatgt atgactttag tatagtagga ggagacatct Ccgctataac acagatccta gcagattatt aaccttgata tcatacaagg tgtagcgt agttCttcta ttatagtcg ttcatttccg tttatattta ggatttcct tgcatctttt gcgattCCtc catttgattc ttaacttgga gccccagcaa cgtatctgta atatcttcta ggataattgt tacctgat gtatcaaCt gttcaaat cattcatcta agtttCtCtCC ttcttattca tcgtagtatc catccctcct ataaaaaaag aattaccaaa aaattatatc aacataatat atgcgaaact C tatCt ca a Cctgtgttgt tgcgcgctaa atctttaaat gtattagaat acg C at ac ttatggtttt ttaaattttg atcatttctc acaagacatg aacttaagcg gcagtaatta gcagtaggaa gctaagtatt caagatatgt ttatattggg ggagaaagcg gataaaagta cgtaatgatt gaattaggtt tattctaaat aatgttaaaa gataagaata ggtaataatg aacacaacaa attgctaata agaataagta ctacattaat Ccctatacac ggtacttgcc gttacacatg ttttc C cC cattcccgta gtgaactata Cccttgtcat cgtatcttaa 
CtaaCCtagg tgaaaaaggg caacagtgtt gtgggattaa ctttaaaaat atCCCtgctc CCttttcatc ccaataaact atgattctgc ggcaagttgt tattttctaa ttcttgaacc tgtgaatagt gagctaataa ctaaaatacg ttacctgttc Ccgtcttact aaaatttaac C tgcagaata tgtttaaaag tatcaagcgt ataacgcgtg ttattCtc agacaacact tatacttcgt caaaattggc acgcctgtat aatggtaacg atcttatatc ccgaagaatc tact aatCcgg ,-y-nnrrnr tttctaaact Cttttggta ttaaatttaa acttCtgt catttCtaCC cgaaatctac tttcaccttc gatttccaga ctttaCtctc WO 00/32825 WO 0032825PCT/I B99/02040 24001 24061 24121 24181 24241 24301 24361 24421 24481 24541 24601 24661 24721 24781 24841 24901 24961 25021 25081 25141 25201 25261 25321 25381 25441 25501 25561 25621 25681 25741 25801 25861 25921 25981 26041 26101 26161 26221 26281 26341 26401 26461 26521 26581 26641 26701 26761 26821 26881 26941 27001 27061 27121 27181 27241 27301 27361 27421 2748 1 2 754 1 27601 27661 27721 27781 27841 27901 27961 28021 gctcacactc cataatgaat tatttacgcg ccactagtta attataaact tactttaatt cttaaagtta cgctaaatat gaagcgactt catatatcta tctttaatag at tgaatcac aaaaatactt gcaccacatg ttattctgtt gtgagttgag tcaggaactc gttttagata gataagtgac agaatatcta ttttgcattg caaaagttca ttatagttca t tt tcgt tat gctttaggtg acaacagaca tattttttta tgcaagaacg ttaggacaaa tacaacaaac gatgaaatta ttagacaagc aaagtgaatc cagatgtgag gcgataaccg ttacaaacat tattttgtag agaaatcatg agaaatatga caaagcaaaa gatgtcctac acattaaaag caaaacttac tcggtagctg aacggtgttg attaaaaaga ttggatatca ccaaaagtaa acatcttaaa aattatagca aattttcagt tttcgcaaaa aaagtatcag ccatattaga aagtgaatac ctatgaaaag atgttacaaa catgctagt t tLyLLyciadt tttgagatcc cttttgtaac catgtttttt gttgataaca tcgaacatcg ggtcgagaac atgcttaaat acagctcaag gaaatcgcaa *tcaccaccat cttctttggt cattatgtga aaacttcata ccttttaaac ctttaatcca agattgcttt acgttattaa tgatatcatc cacgcttgat aatcttcttt cattaactaa cttcatgcaa caatatacga catctaattg aaaatatgtt gataagaatc ataagaataa tttttgacat cttgacgcaa gtaatgcctc acttttttaa aatgtttgaa tgaacggtaa tatcagaaag ttattaaagc aacagaaagt agaaaaggtt ttgaagctta acctaagcat ctaaaaagct 
ttaacggtgt agtaacattc cgagagctgg tctgctgaat ttaattttaa gaaaagatat ttgatagcga tcattatcaa acgaaaaaat cagctattcg atccagacta ttttacaaca gtagtgataa atataggaca gtggagaaag aaaaacgaat caggcaaagg aggaggaaca aaagaagtta aaagtaagaa gatttgtcgc catggcttcg aaattaacat aacctagcag agagtttcag aatttagaat actgtttaga dyyLayctcl aacaaacaca agtttcaatt tgaaccatcc acattataca gagaaatgtt ttcatcaagc acggatttga gcaatatgac tgattcaacg tcaacgtcta taacttatcg cgataaatct tactatagtt actgctgaaa catatattta tttcatgtca tcacaataca atacttcgga aagacttact cttaataaaa aatacaaaaa tatgtcatca tactagttta ctcatttgca attgattttt tacatcatac tttatgttgg tttaatattc gttcctatct cttgaaattc ctttttgtgt ct taggaggt gatagtcgaa aactttgtct ttgtaagtta tcaaatgttt aataaaagta cgacaaaacg gaacaacgaa acgaagtgct tgaaagcgat acttcttaat cgatgatatg gtgggtgttg agagctacca tgctgagatt ggacaagctg cgaatcagga tagagaaacc caaacacggt caicattaca gcaagtagaa ttcaatactt aaacagattg ttataactta aattaataat acaacaatac caatggaaca gagaggctat tcaataatga taggaagatt aatcaattca tatcaatttt caaaagttta aattaactat tgcgaaagaa aagaaacaac taagttttag ccttcatatc tttaaagtaa cgaaaggagc caatattcaa attagaagtt agaaaataca tcactatatt tagtgaacct S cacttgtagg ccatctattt ttaggtaact tcttttttta tagacgtctt aaagtgaggt atttctcctt actttgccca tttagagata ccatctaata gcg ta tgtt c tcagcatttg tataattctt gactctttat tagttaagta gacattatcg cccataagcc tctggagaag aattcttttt ttcataattt attatatagg tgacattgtt gattatttga gtgtactcga ttgaagttga ttgggaatac gaacttaata acacatcttc cttaaagaaa gacgcattcg atcaaagagt aacaaagtta ataaccacgc agccgcgttt aggaaaaagg gtaagaacag t tagga ta tg acgcaccaat ttatacagtc gctagaaaat atatacgcaa gtgttgactg gttaacaaac gttggagaac t tcaaa tgg t ccaactcaaa ccagatggtt tttgttaata aatcacatta aaatggcaag cgatttagaa gaggaagctc tcaaaaagct tggagtgaca tcgagaaatc cgatgatttc aaaaataaat aaccctgaac .aaggatttg tagggtctag ctggaaaaac ctagaagttt itaaacaata1 ;aaaaagaaa iagacagcat lattacacag ;accacgcac ;gcaaacgtgc cgtt t tt tga tttgtgaaat cataagtgaa ttttgcaatt tttcaaataa ag tagg taa t tgtttatatt ttactttaat 
ccaaattaat caacgagtgc ct tgt t ttaa atggcgtttc ctcctatgcc attcatctat cgttttcttg tttcatcttg acgcttcacc accttccatt gaaagggttt gttttaatct aagggaaata caaaattggg atactaatac cacaatttaa acaacaaagt ctataaaaga agtaaaggag aaatgaagca taaagtacac ctggtttgga tccaaaaagt ctgatttaac ttatcaacat aaatacattc aggatactca tagaaattga caagatcaaa ttagtgcatc taatcttcga tcaaacgctg cagacaatgt agtataagaa caaaagtatt tagcgaaaat taagaaataa agagtatgga caagtaaagt agtttttagg accaaagaag aaaccaatca gaaatcaata aatcatccga tatgtacaag cttaattcag aaaaactatt gaataaagga taaaactcaaa :gttgcgagc iaaaaatgtt iattttcttta :atacttatc tgcaagcatta itggagaaat itaaagattg :tatcgctca a :caCactaga c :aagacaatat ttagtaaaat aaattccaag tggttgatta ag ttatt tt c gcatgattaa aaatataaga tatattaaag atcactaaac atagtcttcg aattgtacca cataggttcc gtcttcttta agcaccagtt agaagtgact gcggggaggt acgttcttcg gacatttaaa aacatactgg cgacttttct ttcagaagtg aaaa tcaa ta gttatagtta aacttttgat ctttgctata accatggaaa tgttcacaaa gcataacaca tcaaaacctt tcgagacctt aatggtagag agtgaaagcg agagtggcgg ccacattgag gatagtcatt aatgcaagca aaacgaacct caatgccatt aggtcaaaac tgcttctaaa ggtaacatca aattgaacaa agaaaaagag attcgctgac acttaaacaa tggatatctc tctaaaaatc atcacgtaca agaaaaacaa agt tgaaaga gttcaggttc aaaaactcaa ttccgctaaa atgttcatga acttgagtga atttatacat 3gaacaacaa attactcaag igttgcagag tcgatttcct iaatccgaaa lcctccttag icaaacaaat :gcaatcagc ~tttccaaga ~aaaagagca :actgcaaaa :ttcatccaa WO 00/32825 WO 0032825PCT/I B99/02040 28081 gttgaaaaag catggaacac 28141 aacacaatcz 28201 gatgcagtac 28261 caaaacggtz 28321 cttattaaac 28381 ttattcgaaz 28441 acgccaaaac 28501 caaacaactt 28561 acaatggcac 28621 attagtagce 28681 gctgaaaatt 28741 aaataacaac 28801 caccagaaaa 28861 cacaaatcca 28921 accgcaaaga 28981 tgattaatat 29041 aggatattaa 29101 cttagcgatt 29161 cgcaagtatc 29221 ctacttgttg 29281 aacgaaaaac 29341 ttcatgttaa 29401 tacaagttaa 29461 acttagatat 29521 cagacgaaca 29581 agactgtaac 29641 ctgataacaa 29701 gtatggaaga 29761 aaactattga 29821 gtaagcatac 29881 tataaatttg 29941 aaagacgctt 
30001 gacgtagaaa 30061 ttacaggaga 30121 aaacttagag 30181 aatgattggg 30241 caagaagaat 30301 gatgatgaag 30361 aaagctatta 30421 aacggagaaa 30481 aagattagac 30541 acggacgtag 30601 aattatgaaa 30661 taacggctca 30721 caatgataga 30781 taaacataat 30841 attagttact 30901 tcttattggt 30961 gtattttacg 31021 acctatrccg 31081 atcaatgtct 31141 agatttagcg 31201 cgacggtact 31261 actagaaaac 31321 tatagaacaa 31381 accagtagaa 31441 agaaatcagt 31501 agcgtttatg 33S61 aaataaaaco 31621 tcacgcagac 31681 ccactatgac 31741 tggcgttaag 31801 gaggctcaat 31861 atagcactcc 31921 gtggagaaaa 31981 aggcggacaa 32041 ttaatacttc 32101 ttaacattct iatcaattagi Ictactactai itaaacatcgc :gcaagggtgt kttaaagaaac Itaacaggtae aataggagge Fttgtgacgtc Lgggcgttgac ctactgaatc attatacacc Lcacatataga Ltcaattgttt *taatttaggt ttctaaattg *atgagcaaca gtacttatgc *gcaacattca gagcaagtaa ggaggaagtc cggattcgat agatatgaac ggcatcagac ggacagacta ttatatcatt ttcagatatt agcgagtatc gtacgaggag tcaaaaaact cagtatacgg tcgtcattga tcgagaacta tgagagaaaa atatgacatt gagaagttgc acaaattcca gtagcactat cttctcaaag agaaagctag attcaccttc tagaagcaat atcacaggac gcagggtttc gaaaatagat caatttgtac cgattaggta aagttttgtc gatttttcat aagacagata caacaaagca ttttaaggtg tattccgtcg ggatatccac cgcaaaaaaa tcaactagaa ctgcgcgact tttcatcatc ctggcacatt aaacatgtat tcgtttgatg aaaatgttga taatcgteat ttttaaaatc actaattgag aaattcaatg tttaacaaat cccagaaatc i aacaaagati igacatcaati Igcaacgcagi ggattataac atcaatcaci taggacaacaz tattacaaatc Igaaggtttgc Itgactatctz :tgctcgtcgc I aaaggaaage ggcgaagaaa ggagtatgta gtagaaaatt gaagagtatt tttataaaac cgtttctata tgtactacaa cagtatcaaa aagatgtatt tttaagctat aacgtaccaa ttatttaacc attaacttag cgtcataggg agttactcca aatatggatt gtagaacatg aaagataaat aaaaattggc cattaacgaa tcaacacttt cggacaagaa gaatgatgtg tgaacgaatt ctttgttatt caaccctact tgatgtgtta atatattcta aataacaatt tagaaatgga aagcgcaatt aagctggaga atttcacaat cgccgtataa ttaagttaaa acttggtatt ttattaaacc agcaaaaagc atccatttga tggtttaaat ttgctactgg taaaagcaga tattcgcaat aattattaca gttctatgaa aaatacctat 
atgaagcagt tagcgttatg ataaatacca aaggagagaa ct tggcggaa tccgtttagt cc tt tt ttga ccagaaagtt tctaatcccg 3attatgcaac :gcacgtgaci :ttagttggac a ttgtttgagt :atgcctacac i cattcggacc ;aacgcactat ;aagattgage iaacaacaaat :cttttgaagt itagaaatgcc taatttgtgaa Lgaagtacagt tatacattga tgatcagaaa Ictacctagta Lcttcactaca Lagaatgcttt Lcacttaagaa acgaaatagg tcattttaaa ttaaacatgc aagcaataga tcatgaaatg atatgccaat caaatagaaa atcacaaagc actgaggaaa aa tat cgc tg tcaggaaaaa ggtggaacaa gtttatgttg atcaatgttg atgaaaaata gtcagtatgt acaggtcatg atcactattg gctagggcaa aacgctgaac aacaataaga aactaaaaat tactaaagaa attcacagtg cgtatttgaa atatgatttc tcttcctagc gaaatggaaa ttacaaaaag tgaagaaaat aagcagtggc gcaatacatt tgttgaactt agtagaggtt gtgtagagat aacagaattg agttgcaagg gagtgtagaa cggcagaggc tcgcgaacat cttgcatgac aaaggaatga gagattagaa taatacaggt tgtctattac tacttattgt aaacaaatct -gtgctttaae iaaccaaaaat Iagttagcaaa *ggttacgtca -agtattcaat Igtcacacatc iacaagttttt acaaaacaac Lagcacactag *ctttaaccat *tcgccgaaca *aaaaatcata aaagttatac atacaactgg *ttattcacca gcataaaaaa gcagtattat gcatggtcaa ttcaaagaat aaaattcatg cgaaa tcat a aggtcatatg ttatgtcgta tgaatggatt gtaggaggtc ttatataact tagagctagg aatcaagaaa aacaagaacc agaaaaataa ccacgtttgc cggttactga taaatttttt tagttattga agtctaaaaa acagattaat aaggtatcaa aagcgcaaga tgat tgaaga cttctaatac aatttgcaaa taattaaaag acaaatcaag aaagttaaaa aatgatgaag caagaaaaac ttagattttg ttcaatgaag ggcgatgatq aacggggcac caatttggat acaagatacc gaacaaagtc ccggacaata atagaacttc gaaattatga gagttaatag acgagtaagt atgaacagaa cacaacgagc tcgtggataa atagactaag atgctatgca ttttacaaaa ccaggggctg ttctaggttg ttgtttttct taattgctaac tgtatttgca Lgatcattaaa Laaacggattc ggaacgtgag aattagtaag aggagaaaaa cctcctcatc iaaaacctgtg accgaaagat aactattagc gtaccaccaa gcaacaccta ttgaaatatt acaggcactc tggtattagg gcttcacagt ttgcgggatt aaaaaaactg ttcaatataa cgcaaaaata ggcatatcaa gatgagaatg gaagagaaca gctatgaagc aacaaaccaa gagtttaacg acagtgacag acaagaaaaa aaggaaattc tacaagagat cgaaggatca acctcaaatt aactattcaa accaacgttt 
aggaaaactt caaagataaa acaaattaaa atttgatgat gtttgaaaca tcctagcatt gacggtattt aaaagtttta atattgaatt gcaaacaata aattgattga ataccaatga atgaaggtaa ttgttaacaa aacaacaaac atgacgacca agaaagataa acattgactt aaaaactatc actggggcga aaggttatga aactgattat tgttaagcga acaaaatgaa aacatgcgat aagttgatgaaataataaaa tgctgtaaaa gctttaccat taatgtaact tgtcctgact ataatCttat WO 00/32825PC/B9024 PCT/I B99/02040 32161 32221 32281 32341 32401 32461 32521 32581 32641 32701 32761 32821 32881 32941 33001 33061 33121 33181 33241 33301 33361 33421 33481 33541 33601 33661 33721 33781 33841 33901 33961 34021 34081 34141 34201 34261 34321 34381 34441 34501 34561 34621 34681 34741 34801 34861 34921 34981 35041 35101 35161 35221 35281 35341 35401 35461 35521 35581 35701 35761 35821 35881 35941 36001 36061 36121 36181 taaagtgatt aagacatgtc cggttgattc cattctttgg aatgttttaa ggtgggttaa tacttaaagt aaaattgtgg ctttgaattt atttcccaaa cttcaataat ctaacgcaat aagttactac aaagttactt taatggttac gaacct tacc acaaaggaag ccctattgat tattaataat agcatcttct ttttaaacac taggttggag tagcgataaa tcaaaaaata ttgggattag aaaatatcaa cgacttatat ttgcaaatgt attcaatcaa acaaaatgaa tacaaaagaa agcatacgct accaatgttg cgagctagtc aaacacagag caacatcttt tataaattcg ggagcgagat t acgc tcaga atccaaaaac aaaatgccga atcaagtata gaaaagaaaa tatgagcaag aaatacaacg gaatattacc caaccgaaat gacttcgcgt accgaagtag aattggatat attaaagcaa tataaatgca tgtggataaa agagtatgac tgtaatcatt agcgtgggat aagctttaga aaaagattaa ayctauyLaa actggttcga aatcagtaac ttacacatac accacaat ta gaatggtcat ggagtgatga aagagatatc tattactttc gcggaagagc taaaaactga aaaagtttca tatatctaac gtttaaaacc aagaatagca tgagtttttt tttttcacta ttcttgtaaa ttcaaattct gacaagttcc tttatcaata agcgataata tcaataatta tttgcagaaa tttgcaactt aactttggtt atgtacccct aattctgtca acaagtaata ataccctata aatacagcta gattttaaaa taccttagac caaccaactg ggggatatta cctactcatg aagtttgctc gaaatctatg tcaaacgtta aaacaagtac ccaaaatcat atcgcaaaag atggatcgta agattgctaa cacactttaa acaactaact agaatgaaaa gcatggtaac aactcataga t tgcagaacg aagaaaaata 
aagacaacgt ttatgactga aattaggttt ctaagaaagt aatatttaga tcgaattatt tatatctcga caaaacttaa gtaaagcgcc gacgagaacg acgattgata gaaaaagaag aatttaaaaa aataataaac aaatgctgga cgcgccttat acaagcgaga %yaaqaaycca tgtcacttat agaaaagtag ggcgacat tg gcattcgcaa gaggatttag ccatgacaga tgtatcagga acggtcatat ttgaaacata ggagcataaa tttaaaaccc ggagagtctt gctctatatt tcatttgggg ctgtcatcca atgtaaaact ttatttt tag acttctcttt caagttttag cctttaccta aaattatacc cagcaaatgt taacatcttt tatacaacgt atctaaaaat tgacgcaaac atacccctat acaatataaa aagaaattat aaacaaaaga aggtgattga cagaaacact gcacggatca igaaaccact tcgaaaaagg ctactaaaaa aggaatataa atccgtcttt acgctaaaca taatattgca cagtcaaagc tcaaagcgac gtgatattga ataaactttt ttagtgataa aaagagcaag caaagaattt tgaggcacag tcatacacgc ttacttatac aaatgaggtt tagtgaccta acaagcaacg tgagtacaaa aagtaatatg accaaaacta tggcaaactg agctaagatt taagtataca caaaagagaa taaggatacc cgctggcaga ttagaaacgt catataaatt attgtttcta ggcatgcacc ctcgaacgtg LOLLLt1LLLO aaccaaatgt atatgaacaa aaattataga taggtaatgc caaaggcgaa tagcggacgt taacgaacga cgtgccaggt tataaagcaa acttattata ctaaccttac ttattaacgt taacggcagg ataattgttt tagatgatgc ttgaagcttc gtacagaaga gataaataac agaatgtttc aaataggatc agaaaggaga cagatacgat aagtaacaaa tgt taaggaa cgaaattatc gtcaatacct tgacgcaaat tagaatagat cgattactta ttttattaaa tatcaaaaca ttttggcagt attggaacgc attcagcgaa attgaaatgt acacccgaat gcgaaacaag aagagatgca aacagcaata aggttcatac taaagggcat atacaacaaa tttacttgta cagcattgtt agaactaaat aaaagtaaga ttaaaaacta ggcgatgaaa cccgctatcg cgagaagatg tattcgctca aaacgattca atatttgata ggaattgtat aatggcacta gataaacaac attgaagtta ttcagacata ggtaaaacat atgaagtgat tacagaagtt ttacttatat aaatgtagag taacaatttt aacgtgttag taaaagaata aattggaaag tcaagaaatg aacgcaagac ttttattgaa aattaaatac 3ttttacgtc aaagaatact ;tggcacata .ggcaaggtg igtgatttgg aattcctttt taggttatta gtccgatata a tg tact tcg aattatttca tat tag ttt t tagagcagga tatttctttt tttatccaca tacaggccct cataattatt atcaacatga aaccgactta tacggatact actatatctc aaagaaggta attgacgcaa gtcaaagaga 
atattgtcgg aacaaaaaag gcaagatgga gctgagtggc aaatttgagg atgaagtacg aagataaacg gagagatgtg ggttacgagt caacggaaga acagtcaaaa gagtacgtac ggaactggta acggttgctt aatgcagtag ctagatgata gataacagag caaaatatga gtaatcggag aacttgagtg ataggttgta tcgaatat ta gcacagaaga caggagccca aaggcgctca tttagaggtg ttgatagcaa attatgatca gaaagattga tcgacattaa aatacagaaa ggattacgta ctaatgcaac gaatatcagc aacaatcctg gtggaataaa gaaaaaagaa aggttgttgg tagagaaatg agagcgaaag gagtgaagca aacgttaagc caagttacgg ttgtctagag ;atagagtat taaaacatit tccatgtagt tgaaaaagac tatatgagga tttgttaagt attgaaattt ttcataccgt tgattcttta acaaatgaat gcgaacatat cctagaagag ttaaattgtt taaaggtgga tttgatgcgc cacccccaat ctgaccaacc ctgacagcga gcacagcaag gtagaatttc atgaagttaa aaatcaatac atattacaag gcaacccgac cgggcaagca atcaagattt taaacacgga ggtacctcaa acgaaagtta aaagcttgaa gaagtgaata ataaagacgg taaacaacat actacaagcc aaggcttctc aaagccacct ttatgcacat agactacaga tgggtgtaga taggtaaaaa actggcaacg acgatttcag ttcagatatg cgacctattt aggagtgtta tattaaggtc tttcagcgac cgggcttcta gacgatgagt agtagagtgt tatcgaaata atatattgca aggtatgcca cataaaactc cgaggaatta aacaagcata attttgatga acgaaatact tgggcagtgt ataatggcaa gagttttcag aaacaaatgg aaagaggctg taatgagcat aacctgcgca cacagtaccc caccgttaaa ttgacttgtg tttcggctct aaatggcact atttgataca acagaagcaa WO 00/32825 WO 0032825PCT/I B99/02040 36241 ctaactttat 36301 acctgaactt 36361 ttcaaatgat 36421 tgtgactgga 36481 aaaatcaaag 36541 ccggatctat 36601 tttcatccaa 36661 atagttgata 36721 ttcgagattc 36781 tgtttatatg 36841 atgacgttaa 36901 ataaagataa 36961 tgatttcaac 37021 aagatgtgca 37081 aagtaagttt 37141 aattactaag 37201 tgctattgga 37261 tactaactct 37321 caagaagcaa 37381 tctagagaag 37441 aaacgaattc 37501 ttttatggag 37561 ctagtgcgta 37621 agaaagcgca 37681 ttaaagaagg 37741 aatatgagga 37801 gctagaatgc 37861 actgtcgtac 37921 ccagagggct 37981 gtgattgaaa 38041 aatgatgaag 38101 gatggattaa 38161 agaagagttt 38221 tggacaccgg 38281 ggcttcggaa 38341 gtgacgcaat 38401 actgtggcta 38461 aaagagaagt 38521 gaaaatataa 38581 tttattttgt 
38641 atggttatat 38701 ttgaatcaat 38761 tgacggtagt 38821 ggacaatttt 38881 tgtattgaac 38941 tacctaaaat 39001 cacctgaaga 39061 gtgaatttta 39121 gaatgatgcc 39181 atggatgggt 39241 ctacttgtta 39301 caaggaacta 39361 gtattagaca 39421 agcgcagata 39481 tatagaatac 39541 caatgattaa 39601 atgtaactga 39661 attacgtctt 39781 attaggaatt 39841 gtagataaaa 39901 gctttattca 39961 aatcaattta 40021 gattggagct 40081 agtctttaat 40141 acaaatgatt 40201 caatgaattt 40261 tattataaat t ttaaaaggg atccaatggg gttgagcgca tatgtatcaa t taaaaaaga cacaaggaaa atacaaataa ttgaaaaaga aagaaggaga gcagatgtgt tctggaaaga aaaagttatg aggttataaa cggtgtggag tatcgagttt tgaaaatgac ggttatgaga aacaagttat cgagatgagc aaagcaagcg ggtaacgatg gatgacacaa taacggtaat agcgtttgat tattgaactt ggaataggaa ccgaacgaaa tcgaaccaca atgtcggact caggcaagat aacgtgatgg taagcatttt accaaatcaa aactaaagca gtagcggagt acttagtcac gagataatca acgaggcaca gggagtgtgg aaaaacagct ttacgcaagt cacacttatt agtttgatat agaaatgatg aaaggttata gaaattcccg aaaggctaag cagtcctacg tagtttaatt t cgaca tc tt tcacattgcc ttacagataa acaaacaagt tacaagctag actttttaaa acaaatacta gcaagtgtat tcgagcggag tttggtatgt atgaacgagc gttaaagaga gaacaaatat tatattattc aatttagata aatttatcaa agtacaaatc gaatatcaaa cggaaacaat cttgggataa actgttttgt ttaacgataa aatgagatta aatatttttt gtgttcgacg agtaacggaa ctataactct gcctaccaaa tggggagttg agtattattg agtttcaatg atttatgaag aaagaaggag gataitattg tgacgttcac tagataaact ttattgggga catgggatag atgaaagagt atgaataatc gacacagagg gaaatacttg gatgaagcag aatgactaac tcataagacg agaaaaagca attaactagt agacgcggga aatacccttt agatataaaa caaaggcgat agtggaggaa gtaaagacat aacattcaaa gacgtttaca agttaaaaga gaaatgacgg gaaccttttg actataatca gaggagcatg tacatgaaga atgactattt tagttgggat aaaaagtaca attgaagatg atggctaata gatactggag tattgttgga tatctataca atataacaag cat tgaaaat gttaaaagta tttatatccg agactattat attatgatga gtgagtgaat ataaaattat aaataatagg tttttaggta atccgatata caacagaaca agcaaagtaa atagagttaa agattttttt1 aagatatatc gaaaatcaaa ccccaagtta gacttttcat atttactgtt gatgaat taa 
tcaacaggat tcaagtttta gagactaagg acactatatg gcattctaca ctagtatgat acgaaatcga aagtaaaact gggatattgt ccttttatat aaattgttgg cttatcagat tcacaaagca tatagcgaag gtattgcaag taaatztcgga gcgaaaaaat ggt tgctaaa agggaatgac tagggattat acattacaag gatgcaggtt gtgatcaaaa cgtagtggtg tatcatggca ttatatgatg ggtaactatg aaactagctc ttcgaaagtg cttagatcga gattcaacag gttattgagg gatgcagtta atgttaaaat aaaaatatgt agaaaccaac ggaaatgaat tgaattattt aacgatagaa caatgttgag ctgaaataat attttattaa tgaatgaata atgacaatga atactgtcat gtggctagtt agacaagata tccgacttat ggcgataaggI gtcttatacg tcttactagc c.ggctaatga aatgagaata agattccttt aagcatatat ttttacagat gcatgaattt t aaaagcgtat 3gcaatggaa 5 taatccttctt ttatttaaaa z attgaaaaag tcaggtaata gttgatagca caagaggaga ttaaatgggc ttagtgatgg ttccaattga ttgataggtt agaacactag tcttaaacga gttgaaattt ttttaatagt attacaatac tcaagattgt aacttttagc aaatattttt gaacaatata ttaaaagatc ttacgagatt agcgt tgaaa atggaattaa cgaacagtcc agagattgag aaatgctatt ggcaggtcaa taaaactatt atgacatat t cagatgtagc taagtagtaa atttagggat atatagacgc tacaagatgg aa ttggt tat tttcagaacg gttaaggagg gacgaccaca cagagagtaa t taaagtggg taaaactatt tgaaagaatg gtatattaaa cagctgagaa aacgaaatag aaagattatg gaggcagatg caaaaaatat agaaattaaa tgaattaagg tgattaaaaa tattcggtat accaacacaa aagaagacaa tattcaaaaa tagaagttaa aagtaaagaa aatgtatgag1 :gatgtagag :ttatttatg -arnnant-a tcagaaaata ictttagcag tctaacttac :ttaaaaagg :tagattttt ;aaaatgttat ;attttaagat aaatgaattt aaagattcta tcttatgtaa tataacaatg gcgagaaaat attcgttcgt tatccccttc gattgaatta tataaaagaa tgacctaact aaagcttggg gggtacarttt acaggattta tat tcgagag aatgtaactg gaaaatgaga aaaatctttg gtgaagagta gtaacaaaga aagatttaat acaataaaat gttattagtg gacgtgtata caacattcag gttgtctata atcaaaaaat ctcagctgaa tgtgagtata aacgtattta taatatcaag tgaattagaa aagaggcata cgtgcctata tggagcaaaa ttttggggaa tgaacatatt agaagaagcg tcagttgtat tcaggtggag acgagtttta acagatacga ttttattaca tatttgtttt gcagagaact atgattaaca aaaaataaag gataaagaca gctatgttaa acttaaaaat attcgcattg agaattacat gttctatatt 
gaaatttgat aacaatcggt gtagataaa :taggtaagt 3cgccgagtg itttgatcgt taaaats ttataaagag ;aggtgttgt iacgtaaaaa- :taaaaagat :tgatattga :tggatttag :gagtttcaa iaacaattgc taaaatgaa WO 00/32825 WO 0032825PCT/I B99/02040 40321 40381 40441 40501 40561 40621 40681 40741 40801 40861 40921 40981 41041 41101 41161 41221 41281 41341 41401 41461 41521 41581 41641 41701 tgaaaataga aaacgattat aaacagaact atgtggatta agtgatcgtg gtagttaaaa aataacgata gagaaacacg gagttgaaga ggcaaaagta aacaggtaca gatgaggata gaatgggaag aatgcactga tgtaagtccg aatgctaata gaggtatgaa tcttgatact cgattctata tatgaatgtc aagtcgttgg ttaaacaatt aaaaaagaaa aagcgatc gcttataatc aatctttatc tcgataaaag ctatgactat caagagagat ctaaagggta t taaaaagta atatcaaaaa tgcgagaa ta atttgccggg atacattaag cgcttgagtt atatagcaca tcgataagtt cat taaaaca cgacgcatga catgttcaaa acttaagtta agtctaaaga aacaatgtaa atgtagatca tagaaacact ataaatggaa atattgatag ttgataaatt aaagaattat tgtatttgct acaagcactt caacgggt ta atttatatta gcttgaagaa tgaattactt taacccgat t aaatatagt t attaaggttt ttactttggt agcaaagtat gtttattatg acaagaggcg ctaattgtaa tataaggtga atggcaaata gagagacggc tatattatcg gtgtattaaa agacgaaaaa ttttatcact tgaagaacag tattaattta atattgctat agatatatga gaagaataca tcggaggtat tacattcagc gaaagtcatg gaacgatgtg aacggtgtag agatattggg acaagtaaga attggttatg ttagtatcag catcactatg atacattact aacattatga acaagaaaaa aagttaacga ctagaacatc tgtcacaaca tggtaaatac tcagagtacc tttagtcaaa aacaagagga tagtttgtat atgattatct ggattgaatt tgcattgaat acatcgataa aaccagataa caataaagaa atagattgat attgtcctat caagtatatt tgtagcggac attaatattz tgatgtgtcz acacatcaag tgactaaaga gagtgctaga catatgacaa atccggagtt aaaaagaaaa ccccgggtca gacgaaaaat agtttaaaat gatttaaatg cagtattaat act tgatgaa gaagcgaatg gataaagatt ctatcgaaga tgcgggagct gtttagtgat aggtgaaagt tggttgttat acgtagaagg ttttacccta aaagttatta ttttatttat tatagatgag cgaacgtata aagagataat aagcaagcgt tgctcatgac gagatttata aaaaaatcaa WO 00/32825 WO 0032825PCT/I B99/02040 152 Table 3 Name 770RFO05 770RF006 770RF007 770RF008 770RF009 770RFOIO 
770RFO1 I1 770RF012 770RF013 770RF014 770RF015 770RP016 770RF017 770RF01 8 770RF01 9 770RF020 770RF02 1 770RF022 770RF023 770RF024 770RF025 770RF026 770RF027 770RF028 770RF029 770RF030 770RF031 770RF032 770RF033 770RF034 770RF035 770RF036 770RF037 770RF03 8 770RF039 770RF040 770RF041I 770RF042 770RF043 770RF044 770RF045 770RF046 770RF047 770RF048 770RF049 770RF050 770RF051 Position 19572..21026 3976..5 196 21871..23076 2120..3307 31946..32803 26092..26889 2444 1..25208 29788..30576 33620..34399 27760..28512 3291..4028 32867..33610 23269..23982 31169..31840 3985 1..40501 6926..7570 37762..38304 30605..31156 26903..27346 10700..11140 9707.. 10147 40729..41 145 6518..6925 34795..35 199 6117..6521 36478..36879 3915 1..39546 33892..34266 5758..6120 7886..8236 19258.. 19560 36876..37223 102..446 34908..35219 37220..37528 41377..4 1676 35454..35753 5490..5774 29304..29564 1848 1..18768 5216..5500 25663..25935 11159..11425 28776..29039 36013..36255 35753 36007 3893 1..39167 Name 770RF052 770RF053 770RF054 770RF055 770RF058 770RF059 770RF064 770RF065 770RF066 770RF069 770RF070 770RF071 770RF072 770RF073 770RF074 770R.F075 770RF077 770RF079 770RF080 770RF085 770RF092 770RF094 770RF096 770RF098 770RF1 02 770RF104 770RF109 770RFI 12 770RF 117 770RFI 18 770RF 120 770RF1 24 770RF 128 770RF1 30 770RF133 770RF140 770RF147 770RF149 770RF151 770RF155 770RF157 770RF1 67 770RF 175 770RF1 76 770RF1 78 770RF1 79 770RF1 82 Position 1762..2013 3752 1..37757 228 18..23060 17546..17788 18892..19 122 34564. .34785 29574..29795 28528..28746 27494. .27703 3834 1..38547 36269..36475 40498..40701 38735..38938 30945..31148 38544..38738 13673 13870 25357..25605 29089..29280 35204..3 5389 24060. .24242 39706..39876 32226..32393 13606.. 13773 7092..7256 2905 1..29212 34393..34551 18282..18434 39543 39692 27361..27501 38390..38530 36059..36199 33699..33833 14221.. 14355 15675.. 15806 84 14. 
.8542 131131.13235 7029..7148 30668 30787 31837..31953 30278..30391 4044..4157 20692..20799 357 il..35821 6836..6940 35390..35491- 8318..8419 29268..29564 WO 00/32825 PCT/I B99/02040 153 Table 4 770RF017 sequence 23982 atgacgcataatatagaaaaacgcattaataaattaaaaacttct 1 M T H N I E K R IN K L K T S 23937 ggaaatccaaaatttaaaaagttagattcagatattcactattta 16 G N P K F K K L D S D I H Y L 23892 ctcaagagatttgaaggtgaaaaaaaccataaaggtttttatcca 31 L K R F E G E K N H K G F Y P 23847 aagtttaaacaaggagaaatagtttttgtagatttcggtataaac 46 K F K Q G E I V F V D F G I N 23802 gttaataaagaattttctaattcacactttgcaatagtgatgaat 61 V N K E F S N S H F A I V M N 23757 aaaaatgattctaatacggaggatatagtaaatgttattccctta 76 K N D S N T E D I V N V I P L 23712 tcctctaaagaaaacaaaaagtatttaaagatgaattttgatttg 91 S S K E N K K Y L K M N F D L 23667 aaatgggagtattatttaagattgtttttaaatttaattagcgcg 106 K W E Y Y L R L F L N L I S A 23622 caaaataattcagctatattaaaagaagttttcgataaaaaatac 121 Q N N S A I L K E V F D K K Y 23577 caaaaaaacaacacagaattcatcactaaagattattttattgaa 136 Q K N N T E F I T K D Y F I E 23532 tttatatctgatagtttagaaattgaaaataaattaaataaaatt 151 F I S D S L E I E N K L N K I 23487 gacagaaacattaataacatagtatcagcaattgataaggtaaaa 166 D R N I N N I V S A I D K V K 23442 aaattaaaaggtaatagttacgcttgcataaattctttccagccg 181 K L K G N S Y A C I N S F Q P 23397 attagtaagtttcgcataagaaaagttttaccccaaaaaattaaa 196 I S K F R I R K V L P Q K I K 23352 aatccagtaatagattcttcggatattatgttactgataaataga 211 N P V I D S S D I M L L I N R 23307 attaataataatatattgcagatccctgatataagatga 23269 226 I N N N I L Q I P D I R WO 00/32825 WO 0032825PCT/I 899/02040 154 Physico-chemical parameters of ORF 770"F017 1 MTHNIEKRIN KLKTSGNPKF KKLDSDIHYL LKRFEGEKNH KGFYPKFKQG EIVFVDFGIN 61 VNKEFSNSHF AIVMNKUDSN TEDIVNVIPL SSKENKKYLK MNFDLKWEYY LRLFLNLISA 121 QNNSAILKEV FDKKYQKNNT EFITKDYFIE FISDSLEIEN KUJKIDRIIIN NIVSAIDKVK 181 KLKGNSYACI NSFQPISKFR IRKVLPQKIK NPVIDSSDIM LLINRINNNI LQIPDIR Number of amino acids: 237 Average molecular weight (Daltons): 
27887.38
Mean amino acid weight (Daltons): 117.67
Monoisotopic molecular weight (Daltons): 27869.83
Mean amino acid monoisotopic weight (Daltons): 117.59

Amino acid composition

Acid  Symbol  Number  %        Average in Swissprot
Ala   A        5       2.11%   7.58%
Cys   C        1       0.42%   1.66%
Asp   D       14       5.91%   5.28%
Glu   E       13       5.49%   6.37%
Phe   F       16       6.75%   4.09%
Gly   G        6       2.53%   6.84%
His   H        4       1.69%   2.24%
Ile   I       29      12.24%   5.81%
Lys   K       33      13.92%   5.95%
Leu   L       19       8.02%   9.42%
Met   M        4       1.69%   2.37%
Asn   N       30      12.66%   4.45%
Pro   P        7       2.95%   4.9%
Gln   Q        6       2.53%   3.97%
Arg   R        8       3.38%   5.16%
Ser   S       17       7.17%   7.12%
Thr   T        5       2.11%   5.67%
Val   V       11       4.64%   6.58%
Trp   W        1       0.42%   1.23%
Tyr   Y        8       3.38%   3.18%

Number of acidic (negative) amino acids (ED): 27 (11.39%)
Number of basic (positive) amino acids (KR): 41 (17.30%)
Total charge (KRED): 68 (28.69%)
Net charge (KR - ED): 14 (5.91%)
Theoretical pI: 10.01
Total linear charge density: 0.30
Average hydrophobicity: -5.37
Ratio of hydrophilicity to hydrophobicity: 1.41
Percentage of hydrophilic amino acid: 57.81%
Percentage of hydrophobic amino acid: 42.19%
Ratio of %hydrophilic to %hydrophobic: 1.37

77ORF019 sequence

39851 atgaacgagcaaataataggaagcatatatactttagcaggaggt
    1 M N E Q I I G S I Y T L A G G
39896 gttgtgctttattcagttaaagagatttttaggtattttacagat
   16 V V L Y S V K E I F R Y F T D
39941 tctaacttacaacgtaaaaaaatcaatttagaacaaatatatccg
   31 S N L Q R K K I N L E Q I Y P
39986 atatatttagattgttttaaaaaggctaaaaagatgattggagct
   46 I Y L D C F K K A K K M I G A
40031 tatattattccaacagaacagcatgaatttttagatttttttgat
   61 Y I I P T E Q H E F L D F F D
40076 attgaagtctttaataatttagataagcaaagtaaaaaagcgtat
   76 I E V F N N L D K Q S K K A Y
40121 gaaaatgttattggatttagacaaatgattaatttatcaaataga
   91 E N V I G F R Q M I N L S N R
40166 gttaaggcaatggaagattttaagatgagtttcaacaatgaattt
  106 V K A M E D F K M S F N N E F
40211 agtacaaatcagattttttttaatccttcttttgttatggaaaca
  121 S T N Q I F F N P S F V M E T
40256 attgctattataaatgaatatcaaaaagatatatcttatttaaaa
  136 I A I I N E Y Q K D I S Y L K
40301 aatataattaataaaatgaatgaaaatagagcttataatcatatt 151 N I I N K M N E N R A Y N H I 40346 gatagttttatcacttcagagtaccgacgaaaaataaacgattat 166 D S F I T S E Y R R K I N D Y 40391 aatctttatcttgataaatttgaagaacagtttagtcaaaagttt 181 N L Y L D K F E E Q F S Q K F 40436 aaaataaacagaacttcgataaaagaaagaattattattaattta 196 K I N R T S I K E R I I I N L 40481 aacaagaggagatttaaatga 40501 211 N K R R F K WO 00/32825 WO 0032825PCT/I B99/0204() 156 Physico-chemical parameters of ORF 770RF019 1 MNEQIIGSIY TLAGGVVLYS VKEIFRYFTD SNLQRKKINL EQIYPIYLDC FKKAKKMIGA 61 YIIPTEQHEF LDFFDIEVFN NLDKQSKXAY ENVIGPRQMI NLSNRVKA4E DFKMSFNNEF 121 STNQIFFNPS FVMETIAIIN EYQKDISYLK NIINKMNENR AYNHIDSFIT SEYRRKINDY 181 NLYLDKFEEQ FSQKFKINRT SIKERIIINL NKRRFK Number of amino acids: Average molecular weight (Daltons): Mean amino acid weight (Daltons): Monoisotopic molecular weight (Daltons): Mean amino acid monoisotopic weight (Daltons): 216 26026.06 120.49 26009.34 120.41 Amino acid composition Aci Symbo Numb Average Aci Symbo Numb Average d I er in Swissprot d I er in Swissprot Ala A 7 3.24% 7.58% Cys C 1 0.46% 1.66% Asp D 10 4.63% 5.28% Glu E 16 7.41% 6.37% Phe F 19 18.80% 4.09% Gly G 5 2.31% 6.84% His H 2 0.93% 2.24% Ile 1 28 12.96 5.81% Lys K 22 10.1 9 5 Leu L 12 5.56% 9.42% Met M 7 3.24% 2.37% Asn N 23 10.65 Pro P 3 1.39% 4.9% Gln Q 10 4.63% 3.97% Arg R 11 5.09% 5.16% Ser S 13 6.02% 7.12% Thr T 17 13.24% 5.67% Val IV 17 13.24% 6.58% Trp W 10 0.00% 1.23% Tyr IY 113 16.02% 3.18%_ Number of acidic (negative) amino acids (ED): Number of basic (positive) amino acids (KR): 12.04% 33 15.28% 59 27.3 1% Total charge (KRED): Net charge (KR ED): Theoritical pI: Total linear charge density: Average hydrophobicity: Ratio of hydrophilicity to hydrophobicity: Percentage of hydrophilic amino acid: Percentage of hydrophobic amino acid: Ratio of %hydrophilic to %hydrophobic: 3.24% 9.52 0.28 -4.84 1.37 54.17% 45.893% 1.18 WO 00/32825 PCT/I B399/02040 7 770RF043 sequence 29304 
atgtattacgaaataggcgaaatcatacgcaaaaatattcatgtt 1 M Y Y E I G ElII R K N I H V 29349 aacggattcgattttaagctattcattttaaaaggtcatatgggc 16 N G F D F K L F i L K G H M G 29394 atatcaatacaagttaaagatatgaacaacgtaccaattaaacat 31 I S I Q V K D M4 N N V P I K H 29439 gcttatgtcgtagatgagaatgacttagatatggcatcagactta 46 A Y V V D E N D L D M A S D L 29484 tttaaccaagcaatagatgaatggattgaagagaacacagacgaa 61 F N Q A I D E W I E E N T D E 29529 caggacagactaattaacttagtcatgaaatggtag 29564 76 Q D R L I N L V M K W WO 00/32825 WO 0032825PCT/I B99/02040 Physico-chemical parameters of ORF 770"F043 1 MYYEIGEIIR IOJIHVNGFDF KLFILKGHMG ISIQVKDMNN VPI1K{AYVVD ENDLDMASDL 61 FNQAIDEWIE ENTDEQDRLI NLVMKW Number of amino acids: Average molecular weight (Daltons): Mean amino acid weight (Daltons): Monoisotopic molecular weight (Daltons): Mean amino acid monoisotopic weight (Daltons): 86 10186.68 118.45 10180.02 118.37 Amino acid composition Aci 1Symbo Numb *Average Aci Symbo Numb Average d I I er in Swissprot d I er Swissprot Ala A 3 3.49% 7.58% Cys C 0 0.00% 1.66% Asp D 9 1.7 5.28% Glu E 7 8.14% 6.37% Phe I F 4 14.65% 4.09% Gly G 4 4.65% 6.84% His H 3 3.49% 2.24% Tle I 11 12.79 5.81% Lys IK 6 6.98% 5.95% TLeu L 6 6.98% 9.42% Met IM 5 .5.81% 2.37% Asn N 8 9.30% 4.45% Pro IP 1 1.16% 4.9% Gin IQ 3 3.49% 3.97% Arg R -2 2.33% J5.16% Ser S 2 2.33% 7.12% Thr T 1 1.16% 15.67% Val V 6 6.98% 6.58% Trp W 2 2.33% 1.23% TrY 13 3.49% 3.18% Number of acidic (negative) amino acids (ED): Number of basic (positive) amino acids (KR): Total charge (KRED): Net charge (KR ED): 9.30% Theoritical p1: Total linear charge density: Average hydrophobicity: Ratio of hydrophilicity to hydrophobicity: Percentage of hydrophilic amino acid: Percentage of hydrop~hobic amino acid: Ratio of %/hydrophilic to %hydrophobic: 18.60% 8 9.30% 24 27.91% -8 4.38 0.30 -2.80 1.19 48.84% 5 1. 16% 0.95 WO 00/32825 PCT/I B99/02040 9 770RF102 sequence 29051 atgagcaacatttataaaagctacctagtagcagtattatgcttc 1. 
Mv S N I Y K S Y L V A V L C F 29096 acagtcttagcgattgtacttatgccgtttctatacttcactaca 16 T V L A I V L M P F L Y F T T 29141 gcatggtcaattgcgggattcgcaagtatcgcaacattcatgtac 31 A W S I A G F A S I A T F M Y 29186 tacaaagaatgctttttcaaagaataa 29212 46 Y K E C F F K E WO 00/32825 WO 0032825PCT/I B99/02040 160 Physico-chemical parameters of ORF 770RF102 1 MSNIYKSYLV AVLCFTVLAI VLMPFLYFTT AWSIAGFASI ATFMYYKECF FKE Number of amino acids: Average molecular weight (Daltons): Mean amino acid weight (Daltons): Monoisotopic molecular weight (Daltons): Mean amino acid monoisotopic weight (Daltons): 53 6155.42 116.14 6151.07 116.06 Amino acid composition Aci Symbo Numb *Average Aci Symbo Numb *Average d I er in Swissprot d I er in Swissprot Ala A 6 1.2 7.58% Cys C 2 3.7 1.66% Asp D 0 0.00% 5.28% Glu E 2 37 .7 Phe F 7 13.2 4.09% Gly G 1I.8 6.84% His H 0 0.00% 2.24% le 1 4 5 5.81% Lys K 3 5.66% 5.95% Leu L 5 .3 9.42% Met M 3 5.66% 2.37% Asn N 1 1.89 Pro P 1 1.89% 4.9% Gin IQ 0 0.00 3.7 Arg R 0 0.00% 5.16% Ser S 4 7.5 7.12% Thr T 4 7.55% 5.67% Val V 4 7.5 6.58% Tip W 1 1.89% 1.23% Tyr IY 5 94 3.18% Number of acidic (negative) amino acids (ED): Number of basic (positive) amino acids (KR): Total charge (KRED): Theoritical pI: Total linear charge density: Average hydrophobicity: Ratio of hydrophilicity to hydrophobicity: Percentage of hydrophilic amino acid: Percentage of hydrophobic amino acid: 2 3.77% 3 5.66% 9.43% 1 1.89% 8.18 0.13 10.8 1 0.40 28.30% 71.70% WO 00/32825 PCT/I B99/02040 Ratio of %hydrophilic to %hydrophobic: 0.39 WO 00/32825 PCT/I B99/02040 162 770RFI 04 sequence 34393 atggtaaccaaagaatttttaaaaactaaacttgagtgttcagat 1 M V T K EEFL K T K L E C S D 34438 atgtacgctcagaaactcatagatgaggcacagggcgatgaaaat 16 M Y A Q K L I D E A Q G D E N 34483 aggttgtacgacctatttatccaaaaacttgcagaacgtcataca 31 R L Y ID L F I Q K L A E R H T 34528 cgccccgctatcgtcgaatattaa 34551 46 R P A I V E Y WO 00/32825 WO 0032825PCT/I B99/02040 163 Physico-chemical parameters of ORF 770RF104 1 MVTKEFLKTK 
LECSDMYAQK LIDEAQGDEN RLYDLFIQKL AERHTRPAIV EY Number of amino acids: Average molecular weight (Daltons): Mean amino acid weight (Daltons): Monoisotopic molecular weight (Daltons): Mean amino acid monoisotopic weight (Daltons): 52 6193.13 119.10 6189.12 119.02 Amino acid composition Aci Symbo Numb Average Aci Symbo Numb Average d I er in Swissprot d I er in Swissprot Ala A 4 7.69 7.58% Cys C 1 1.92% 1.66% 7.69 11.54 637 Asp D 4 5.28% Glu E 6 6.7 Phe F 2 3.5 4.09% Gly G 1 1.92% 6.84% His H 1 1.92 2.24% le 1 3 5.77% 5.81% Lys K 5 9.62 59%Leu L 6 11.54 9.42% Met M 2 3.5 2.37% Asn N 1 1.92% 14.45% Pro P 1 1.92 4.%Gin Q 3 5.77% 3.97% Arg R 3 5.7 5.16% Ser S 1 1.92% 7.12% Thr T 3 5.7 5.67% Val V 2 3.85% 6.58% I Trp W 10 0.0 1.23% Tyr IY 13 15.77% 3.18% Number of acidic (negative) amino acids (ED): Number of basic (positive) amino acids (ICR): Total charge (ICRED): Net charae (KR ED): 3.85% Theoritical pI: Total linear charge density: Average hydrophobicity: Ratio of hydrophilicity to hydrophobicity: Percentage of hydrophilic amino acid: Percentage of hydrophobic amino acid: 19.23% 8 .38% 18 34.62% -2 5.03 0.38 -5.81 1.47 53.85% 46.15% WO 00/32825 PCT/I B99/02040 Ratio of %bydrophilic to %hydrophobic: 1.17 WO 00/32825 WO 0032825PCT/I B99/02040 165 770RF182 sequence 29268 atgttcaatataaaacgaaaaacggaggaagtcaagatgtattac 1 M F N I1K R K T EE V K M Y Y 29313 gaaataggcgaaatcatacgcaaaaatattcatgttaacggattc 16 E I G E i I R K N I H V N G F 29358 gattttaagctattcattttaaaaggtcatatgggcatatcaata 31 D F K L F I L K G H M G I S I 29403 caagttaaagatatgaacaacgtaccaattaaacatgcttatgtc 46 Q V K D M N N V P I K H A Y V 29448 gtagatgagaatgacttagatatggcatcagacttatttaaccaa 61 V D E N D L D Mv A S D L F N Q 29493 gcaatagatgaatggattgaagagaacacagacgaacaggacaga 76 A I D E W I E E N T D E Q D R 29538 ctaattaacttagtcatgaaatggtag 29564 91 L I N L V M K W WO 00/32825 WO 0032825PCT/I B99/02040 166 Physico-chemical parameters of ORE 770RF182 1 MFNIKRKTEE VKMYYEIGEI IRIGNIHVNGF DFKLFILKGH MGISIQVKDM 
NNVPIKHAYV 61 VDENDLDMAS DLFNQAIDEW IEENTDEQDR LINLVMKW Number of amino acids: Average molecular weight (Daltons): Mean amino acid weight (Daltons): Monoisotopic molecular weight (Daltons): Mean amino acid monoisotopic weight (Daltons): 98 11691.50 119.30 11683.84 119.22 Amino acid composition Aci Symbo Numb Average Aci Symbo Numb Average d I er in Swissprot d I er in Swissprot Ala A 3 3.6 7.58% Cys C 0 0.00% 1.66% Asp D 9 9.18 5.28% Glu E 9 9.18% 6.37% 5.1 Phe F 5 0 4.09% Gly IG 4 4.08% 6.84% His H 3 3.06 12.24% Ile 1 12 12.24 5.81% Lys K 9 9.18 59%Leu L 16 6.12% 9.42% Met M 6 6.12 2.37% Asn N 9 9.18% 4.45% PoP 1 1.02 4%Gln Q 3 3.06% 3.97% Arg R 3 3.6 5.16% Ser S 2 2.04% 7.12% 2.04I Thr T 2 2.4 5.67% Val V 7 7.14% 6.58% Trp W 2 2.04 1.23% Tyr Y 3 3.06% 3.18% Number of acidic (negative) amino acids (ED): Number of basic (positive) amino acids (KR): Tnta! charpe (KRED): Net charge (KR ED): 6.12% Theoritical pI: Total linear charge density: Average hydrophobicity: Ratio of hydrophilicity to hydrophobicity: 18.37% 12 12.24% 30.6 1% -6 4.76 0.33 -3.89 1.28 WO 00/32825 PCT/I 099/02040 Percentage of hydrophilic amino acid: Percentage of hydrophobic amino acid: Ratio of %hydrophilic to %hydrophobic: 5 1.02% 48.98% 1.04 WO 00/32825 WO 0032825PCT/I B99/02040 168 Table BLASTY 2.0.8 [Jan-05-1999] Query= sidIlOO0l7Ilanl77ORF017 Phage 77 ORF 123269-239821-3 (237 letters) Database: nr 393,678 sequences; 120,452,765 total letters Sequences producing significant alignments: gil4493986lembiCAB39045.11 (AL034559) predicted using hexExon;..
gi17306071spIP232501RPII1_YEAST NEGATIVE RAS PROTEIN REGULATOR P gil3097044lembICAA752991 (Y15035) KlR (Cowpox virus] gil2l462451pirj jS73794 hypothetical protein H91_orfl8O Mycopi gil839l01pirlI S04682 ribosomal protein varl yeast (Candida gi gi11331351spIP21358IRMAR_-CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN..
gil2128843[pir1 1H64475 hypothetical protein MJ1409 Methanococ giI1l070171gblAAD39926.11AF126285_-2 (AF126285) RNA polyrnerase gil2l462l0jpirj 573342 hypothetical protein E07_orflEE Mycopi Database: swissprot 79,449 sequences; 28,874,452 total letters Sc Sequences producing significant alignments: (b; Score E (bits) Value 0.010 0.053 0.090 0.090 0.15 0.15 0.20 0.35 0.60 spI P23250 spl P21358 spIQ21444 spl P27240 spl P53192 spl P32908 spf P54683 spJOO31OO RP11_YEAST
RMAR_CANGA
LDLC_CAEEL
RFAY_ECOLI
YGC0_YEAST
SMC1_YEAST
TAGB_DICDI
CYAA_DICDI
NEGATIVE RAS PROTEIN REGULATOR PROTEIN.
MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1.
LDLC PROTEIN HOMOLOG.
LIPOPOLYSACCHARIDE CORE BIOSYNTHESIS PROT.
HYPOTHETICAL 27.1 KD PROTEIN IN ALKI-C1l.
CHROMOSOME SEGREGATION PROTEIN SMC1 (DA-B.
PRESTALK-SPECIFIC PROTEIN TAGB PRECURSOR ADENYLATE CYCLASE, AGGREGATION SPECIFIC ore E its) Value 38 0.014 37 0.040 34 0.35 33 0.46 33 0.60 33 0.60 32 0.78 32 0.78 WO 00/32825 WO 0032825PCT/I B99/02040 169 BLASTP 2.0.8 (Jan-05-1999] Query= sidI100019IlanI77ORF019 Phage 77 ORF139851-4050112 (216 letters) Database: nr 373,355 sequences; 114,214,446 total letters Score E (bits) Value Sequences producing significant alignments: giI3341966ldbjlBAA31932I (AB009866) orf 59 [bacteriophage phi PVL] giJ2689911 (AE000792) B. burgdorferi predicted coding region BB giI11715891embICAA64574I (X95275) frameshift [Plasmodium falcip gil4493986jembICAB39045.11 (AL034559) predicted using hexExon;..
giJ1412571spIP180l91YPI9_-CLOPE HYPOTHETICAL 14.5 KD PROTEIN gill334l21spIP27059IRPOB_-ASTLO DNA-DIRECTED RNA POLYNERASE BETA...
giJ3122231IspIQ588511HISXMETJA HISTIDINOL DEHYDROGENASE (HDH)..
gil3649757IembICAB11106.11 (Z98547) predicted using hexExon; MA giJ2688313 (AE001146) sensory transduction histidine kinase, pu Database: swissprot 79,449 sequences; 28,874,452 total letters e-122 0.058 0.10 0.23 0.29 0.51 0.51 0.66 0.87 Sequences producing significant alignments: Score E (bits) Value sp I P18019 SPIQ58851 spl P27059 spI002224 spl P04931 SI P18011 sp IP18709 sp1064409 sI P21358 spJ0 039 4 5 YP19_CLOPE
HISX_METJA
RPOB_ASTLO
CENE_HUMAN
ARP_PLAFA
IPAB_SHIFL
VTA2_XENLA CP3H_CAVPO
RMAR_CANGA
IPAB_SHIDY
HYPOTHETICAL 14.5 KD PROTEIN (ORF9).
HISTIDINOL DEHYDROGENASE (EC 1.1.1.23) (H.
DNA-DIRECTED RNA POLYMERASE BETA CHAIN (E.
CENTROMERIC PROTEIN E (CENP-E PROTEIN).
ASPARAGINE-RICH PROTEIN (AG319) (ARP) (FRA..
62 KD MEMBRANE ANTIGEN.
VITELLOGENIN A2 PRECURSOR (VTG A2) [CONTA..
CYTOCHROME P450 3A17 (EC 1.14.14.1) (CYPI..
MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1.
62 KD MEMBRANE ANTIGEN.
36 0.079 35 0.14 35 0.14 34 0.31 33 0.53 32 0.69 32 0.90 32 0.90 32 0.90 32 1.2
BLASTP 2.0.8 [Jan-05-1999]
Query= sid|100043|lan|77ORF043 Phage 77 ORF|29304-29564|3 (86 letters)
Database: nr 373,355 sequences; 114,214,446 total letters
Sequences producing significant alignments: (b gil334l947ldbjlBAA3l9131 (AB009866) orf 39 (bacteriophage phi PVL] gil7445181prfI 12014422A FKBP-rapamycin-associated protein [Hono gilll697361spIP423461FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN giIll697351spIP423451FRAP_HUMAN FKBP-RAPANYCIN ASSOCIATED PROTE gi13282239 (U88966) rapamycin associated protein FRAP2 (Homo sa giJ387S4O21embjCAA98l221 (Z73906) cDNA EST EMBL:D64544 comes fr gill0847921pirlIS54091 hypothetical protein YPRO70w yeast (Sa Database: swissprot 79,449 sequences; 28,874,452 total letters core E its) Value 6e-46 0 .84 0.84 0 .84 0 .84 4.2 Sequences producing significant alignments: Score E (bits) Value Sp IP42345 spl P42346 sp P34554 sp10Q24118 spl P80034 spl P22922 sp1Q44363 spf P38255 spIP55822 sp 10584 82 spf P34252 FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP).
FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R.
32 0.24 32 0.24 YNP1_CAEEL
LIO_DROME
ACH2_BOMMO A1AT_BOMMO TRAA_AGRT6
YBU5_YEAST
SH3B_HUMAN YA82_METJA YKK8_YEAST HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C.
LINOTTE PROTEIN.
ANTICHYMOTRYPSIN II (ACHY-II).
ANTITRYPSIN PRECURSOR (AT).
CONJUGAL TRANSFER PROTEIN TRAA.
HYPOTHETICAL 51.3 KD PROTEIN IN PHO5-VPS1.
SH3BGR PROTEIN (21-GLUTAMIC ACID-RICH PRO.
HYPOTHETICAL PROTEIN MJ1082.
HYPOTHETICAL 52.3 KD PROTEIN IN HAP4-AATI.
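Each query header in these reports gives the ORF's nucleotide coordinates on the phage 77 genome, a reading frame, and the length of the translated query in letters. The following is a minimal sketch of how those three fields can be cross-checked. It assumes that the coordinates are 1-based and inclusive, that the span includes the stop codon, and that the header fields are separated by `|` (so that, for example, the 77ORF043 header reads ORF 29304-29564, frame 3). These are readings of the listing, not statements from the patent text. The function names are hypothetical.

```python
# Coordinates, frame and translated length as read from the query headers
# and the 77ORF182 sequence listing (assumed reading: start, end, frame, letters).
ORFS = {
    "77ORF017": (23269, 23982, -3, 237),
    "77ORF019": (39851, 40501, 2, 216),
    "77ORF043": (29304, 29564, 3, 86),
    "77ORF182": (29268, 29564, 3, 98),
}


def protein_length(start, end):
    """Residues encoded by an inclusive span that ends with a stop codon."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return span // 3 - 1  # drop the stop codon


def forward_frame(start):
    """Reading frame (1 to 3) of a forward-strand ORF from its 1-based start."""
    return (start - 1) % 3 + 1


def check(orfs=ORFS):
    """Return {name: True/False} for agreement of coordinates, frame and length."""
    report = {}
    for name, (start, end, frame, letters) in orfs.items():
        length_ok = protein_length(start, end) == letters
        # Reverse-strand frames (negative values) are not checked in this sketch.
        frame_ok = frame < 0 or forward_frame(start) == frame
        report[name] = length_ok and frame_ok
    return report


if __name__ == "__main__":
    for name, ok in check().items():
        print(name, "consistent" if ok else "inconsistent")
```

Under the same assumptions, 77ORF043 and 77ORF182 share the end coordinate 29564 and the same frame, and their starts differ by 36 nucleotides (12 codons), so 77ORF043 would correspond to 77ORF182 read from its internal methionine at residue 13.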
BLASTP 2.0.8 [Jan-05-1999]
Query= sid|100102|lan|77ORF102 Phage 77 ORF|29051-29212|2 (53 letters)
Database: nr 373,355 sequences; 114,214,446 total letters
Sequences producing significant alignments: Score E (bits) Value
gi|3341946|dbj|BAA31912| (AB009866) orf 38 [bacteriophage phi PVL] 3e-20
gi|4325288|gb|AAD17315| (AF123593) voltage-dependent sodium cha 7.1
gi|2649684 (AE001040) A. fulgidus predicted coding region AF092 9.3
Database: swissprot 79,449 sequences; 28,874,452 total letters
Sequences producing significant alignments: Score E (bits) Value
sp|P42087 HUTM_BACSU PUTATIVE HISTIDINE PERMEASE.
sp|P04775 CIN2_RAT SODIUM CHANNEL PROTEIN, BRAIN II ALPHA SUBU...
spIP426l9 YQJFECOLI HYPOTHETICAL 17.2 KD PROTEIN IN EXUR-TDCC WO 00/32825PC/B9/04 PCT/IB99/02040 172 BLASTP 2.0.8 [Jan-05-1999] Query= sidIlO0lO4IlanI770RF104 Phage 77 ORF134393-3455111 (52 letters) Database: nr 373,355 sequences; 114,214,446 total letters Sequences producing significant alignments: giJ2315523 (AF016452) similar to the leucine-rich domains found.
giI43771681gblAAl18990j (AE001666) CT711 hypothetical protein gil3882l71ldbjIBAA344451 (AB018268) KIAA0725 protein [Homo sapi.
Database: swissprot 79,449 sequences; 28,874,452 total letters Score E (bits) Value Score E (bits) value Sequences producing significant alignments: spl P04879 spj P04880 spIQ13 946 spl P35381 Sp IP54659 spl P40397
RRPP_VSVIG
RRPP_VSVIM
CN7A_HUMAN
ATPA_DROME
MVPB_DICDI
YHXD_BACSU
RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48.
RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48.
HIGH-AFFINITY CAMP-SPECIFIC
ATP SYNTHASE ALPHA CHAIN, MITOCHONDRIAL P.
MAJOR VAULT PROTEIN BETA (MVP-BETA).
HYPOTHETICAL OXIDOREDUCTASE IN APRE-COMK

BLASTP 2.0.8 [Jan-05-1999]
Query= sid|122748|lan|77ORF182 Phage 77 ORF|29268-29564|3 (98 letters)
Database: nr 393,678 sequences; 120,452,765 total letters
Sequences producing significant alignments: Score (bits) E Value
gi|3341947|dbj|BAA31913.1| (AB009866) orf 39 [bacteriophage phi.. 182 8e-46
gi|1084792|pir||S54091 hypothetical protein YPR070w yeast (Sa. 35 0.13
gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN.. 32 1.1
gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo. 32 1.1
gi|5051381|emb|CAB44736.1| (AL049653) dJ647M16.2 (FK506 binding. 32 1.1
gi|4826730|ref|NP_004949.1|pFRAP| FK506 binding protein 12-rap.. 32 1.1
gi|3282239 (U88966) rapamycin associated protein FRAP2 [Homo sa. 32 1.1
Database: swissprot 79,909 sequences; 29,054,478 total letters
Sequences producing significant alignments: Score (bits) E Value
sp|P42345 FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) 32 0.29
sp|P42346 FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) 32 0.29
sp|P40557 YIA5_YEAST PUTATIVE DISULFIDE ISOMERASE YIL005W PREC. 29 3.3
sp|Q24118 LIO_DROME LINOTTE PROTEIN. 28 4.4
sp|Q44363 TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA. 28 4.4
sp|P80034 ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II). 28 4.4
sp|P34554 YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C. 28 4.4
sp|P22922 A1AT_BOMMO ANTITRYPSIN PRECURSOR 28 4.4

Table 6

1st position              2nd position              3rd position
(5' end)        U       C       A       G           (3' end)
U               Phe     Ser     Tyr     Cys         U
                Phe     Ser     Tyr     Cys         C
                Leu     Ser     Stop    Stop        A
                Leu     Ser     Stop    Trp         G
C               Leu     Pro     His     Arg         U
                Leu     Pro     His     Arg         C
                Leu     Pro     Gln     Arg         A
                Leu     Pro     Gln     Arg         G
A               Ile     Thr     Asn     Ser         U
                Ile     Thr     Asn     Ser         C
                Ile     Thr     Lys     Arg         A
                Met     Thr     Lys     Arg         G
G               Val     Ala     Asp     Gly         U
                Val     Ala     Asp     Gly         C
                Val     Ala     Glu     Gly         A
                Val     Ala     Glu     Gly         G

Table 7 Bacteriophage 3A, complete genome sequence
1
71 141 211 281 351 421 491 561 631 701 771 841 911 981 1051 1121 1191 1261 1331 1401 1471 1541 1611 1681 1751 1821 1891 1961 2031 2101 2171 2241 2311 2381 2451 2521 2591 2661 2731 2801 2871 2941 3011 3081 3151 3221 3291 3361 3431 3501 3571 3641 3711 3781 3851 1921 3991 4061 4131 4201 4271 4341 4411 4481 4551 4621 4691 caaacgctag caacgcggat aaatttttca aaaagaatat atagaagatt acaaaaaatc ttttattgtc ggttaagaga tgaacttaaa cgagcaatat tattaagaat ccattaagca caagtctatg ggtttaactg cagcacaaag taaagtttta aatgaacctt caccaaaact ataaaaacaa gcaaatatgt tagaaaagaa gggtatttga tgaagaatta gcgcatcgtc atctaaacgt caacttgtat tacagccatg aaagaaacaa aactgcgcag gtttaaagaa ctatttctgg ggttgctaac tatgctgtat aaacgtaatg aaacaagcta ggattctatt gataaaaatt tcagaacatt aagagatgaa catcagatag cgataagtta gatggattga agactataaa ttgatttcag ttataaaaaa acgacagcag ggtatcaatt agatggtcca aaatcataga agacgaaaga actttttatt gtcgaactgg ataaaagcaa atcccaactt gaaaaagcta agagaacacc agctgaacgt atgacgagat gagttttatt gattacccaa ggaaggcaga ccgtgcacga ttggttata tttgcgttag ataatggtaa agttgcagtt ctaacgaaaa aataccctat agagaaiggg tgactaccaa gatgttttaa attggataat gatagagcga acgcattcaa actaaatcaa aaggagcttt gaccttgagc cctgcattga taataataat cctttaatga aatggtatat ttgccgtcta agcaaagcag atatcgtaaa ttatgaataa agttgtttct gatagtggtg ttaaggaggt gaatgttatc gcaaaagaga gattgatcag tcaacttcta agctttatga aataatacgc ttgaaactaa tgaaacgata tgcccttgaa aatgtatgaa gattataaag gaataattct ctgagcagtt ttgattttat tatgtgctaa ttgaacgaga catctatcat aaatgttaat tgaaaaccaa tcacgtgaac tgttcataat atggacatgt tgcattttaa attgatgtgt tgaagaatac aactgatttt aacctgattc tttcatgctt aaatatggtt tttcaaacag tactatgaag aaaacggtgg cctaaaaaat atgtctctga agatatagig ttcaattgcc ctcagtattc ttaaatgcaa attttacttg cagcatacct tattgccaat actaaaacag acagagaaaa aaataggtat caacacaagc agaagtgtac tttaaagcag ggaagattta ccaccagttg aaggtggaga ccacttgaat taagaaaatc tttgaaaggt aaaagaaaat caaaaagtaa aggtgaaata gtgatgtaac tgctacagat ttcaaaaata tataaattca tctggaggca gtgtatttga aaaattaata tctatgtcga 
tgccttagcg tttttatgca caaaaatagt tttttaatga gttaagaaag acagcggatt tacttgaaaa aaagatttag atcaagaaca cttaaaacag tgtctttcgg cttgatagat gaaattttag taagcgtttc gagaacgtcc cagaagatrt catacattto aattaottoa aacacctaaa ttaaacgcga atgcgaaatt ttaaaaatga gaattaaaac aatccttagg tatgattgga caacagaccc aaatattgat atggaagaca atttaacatt gttgaaagac aagtaaaaga gaagcttatc aatctttaaa tgatcatgag ttttaccaaa tqaatttgaa aaaccttcaa tgaaaggggg tctttatatg tgatgacata ttaattaatt aatagtgatt taatgataga tagaactoac aaaaacagct aaaaaagata gttcaagaag attaacaaca tggtatgcag tgtgagagac atcttagata ctattcgatt tatagaaaag gcaacatttt attatcggca gctttgatat ttatggggcg cacaagatgg agaaaatggt tgatgaatct aaggcgatga atccattatg acgcaacgat atacacacat ggggattttt ctcaagagct gcaaggttac cttgttgata tggtagaagc atttagcatc tttggatgat aggtgtctct ataaatttag ggagatttta taaccaaaag cactccaaaa aaataatgaa tttatcagaa acagaggact ttatcgcatt catggattcc aagaagatgg cttattaaca taagatgaat gagcattatg gagttaaaaa attacgggtt aggatttaaa agaaatgttt caataatgtt cagttgaaac atagatggct ttgcagcatt aaggaaacat agagtttatt atattgtcac acgcataaag ctttagccca tggaaaaata ttttcagcta ttacaaagtt tagttaatac agaagtatct taatcaaatt gaaacaatca caaccatcaa agcttttct tttattattc cattcatgct acacatcgtg gcatctaata gataatgcag taagaacctt ccaatgtagg taaagaaaaa aatattattc caagagcctq aagttaacaa tgtatataga gcatacaaac caaacactaa aaggtggatt agcaagtcac tctagaaaat tttgtaaac gtttgtttgg aaaaaatggt gcagaaattc ttaaagctag atcaaaaatt gatgaaattc aacctcttct gggaagagac gacgatgata atgagatgaa gtttaatatc attgtttctt ttacagccgc taagcacaaa gtgcaagata tagtagaaaa tgaaacggaa ttagatggga tagacagaaa tttaaacaca agtattaaag aaaaaattga gatctttttg atctaattcg gatttactta gaaatgaaaa attaaatcca gcaactggaa tggtgcaagg taatcttaca aggcagcaag gtgttgaaat agaaagagta aaaaatgaag aatttaatcg ttatttaagg ataaatgaca atttataccc aagctaagta aagtgataaa atcagtgaaa tgctaaaaat cgctatgagt actgtaggta cagcttattt gcttactgca agtatctcta tcactaaaat aaaagaaaaa atgaaatgcc tgatgaat tg aaagcaggct cgaaagttaa agagttttat cacgctt tac tttcagaacc 
tccaagagtt ttaaaaggtg ttcatggatc aaaaacagct aacatatgaa aaggctggtg ataacttact cggtgactat tcaagggaaa ggaggtaaat cttccaaagg ttgggttcat aaaacaacca atttgttagc cccaaagctt atgccccaag atgaatttaa catctacatt accttagatc ttaatgattc agaagagtgg tttgctaata tagaagagct gtgtgctact gttgaatati agccttatat aattacttat gaaacaagac aaataatatt cggaaactgg tatacagata acataatgcg tagacaattg gggtgtaatt atggctagtt cagtgtcacc aggtaatgca gatgttgttg ataaattgat cattagtccg gaaatgcaaa tgttagaaga cgaaccgtta gctaacgttt agttaaacag gaaactactt gctgatagtg ttagagagtg aattgacacg ttttcaaatg tggtttgaaa tagatgttca gcatcctgca ggtgacacta atgcagaaga agataaagca gaagaagcct aagagcaata cgatgatgta agagaaaaaa gacattatat agtcagaaag tacaacaaag agacacagga cgtcacgcga caacaggtaa atttgctaaa tcatatactt atacagttaa agatgtagat gccttagcag gcaagcgaga atttaacaag gatcaaatac aaatttcgcg cgtcaaacag tatgaagaag tttaaattta acgttaaatc ttcgtagtgg ttactacact taagccgcta acaagcggtg ggtgataaaa tttatttatg atgtcaatga gtgatattgt aactagatga actaggagac agggcatgca atatacaata gcatcaattg ctagtgttat ttcataattc atgggttatg aacagatgct gttagtaatt atgttagatg cagaaacttg gag3ctaatga aataactgct aaagaaagat gtagacaaaa qaaaqtatqt cactagaaga caatgagtta ttaggaggaa caacaattaa aaaataaaaa tcaaacaact agaaacagaa cat tgaagaa aaagaaaaag aagatggtta aagctaaggc tggaggcaca acgtttatta aacactttct aaagaaattg aacattaaag gtttagagat aaacagcaaa agaattaaaa tgcaatttca gatactgtaa tgattcaggt ggtgataagc aaccaattac gtgaaaaagc tagacgatga tgacttcatt.
attcactact aataaattca ttagtaaact gggttgaaaa tcttaccaaa tcgtctaact acagatgtag aagtatttgc cgcactacaa tcaggtctag cagctaaaga acgtaaagat WO 00/32825 PCT/I B99/02 040 4761 4831 4901 4971 5041 5111 5181 5251 5321 5391 5461 5531 5601 5671 5741 5811 5881 5951 6021 6091 6161 6231 6301 6371 6441 6511 6581 6651 6721 6791 6861 6931 7001 7071 7141 7211 7281 7351 7421 7491 7561 7631 7701 7771 7841 7911 7981 8051 8121 8191 8261 8331 8401 8471 8541 8611 8681 8751 8821 8891 8961 9031 9101 9171 9241 9311 9381 9451 9521 9591 9661 9731 9801 9871 9941 10011 taagtcctaa catgtatgat atgcgatatg cagcagaaaa caattatttt ttgtiigtat aagaaaatac taatatcagc atttcgaaaa agattatgac tatgaaagtc tgattctgaa ttttttcaat gtaaaatata cataattatg ttatattccg tggttttatc tacggtaaac ctttaaagaa ttatacaaaa attattcact ttgcaaaaac gaatatatta atatactatt atttaccttc tt caaataat atctggatta gctattatta cggattatgt agtatttggc ggaattaact taactgcatg aggttcatta cgaatagggg tgatttaatt aaagatgact gtgggtactc attaaaaaag atgtaaataa taatccttct aggtcttcta ataaattatt agaaaaatga aatcaatgca agaattcgag gtaggaagtc tgaatgaaca attagctgct aacaccataa ataaagtcac agacttcatg cagaaaacaa gatcacatgt cattttacaa acgctttagc agatttacat caaaattatt agtgttcttt aaaccagtag tatttacaga atgatggaac aacttatgac gtatgatcag caacgtacat cccagctaag ccccaaaagg tgatgaaatg agtttagaag gaaggtctca ttcaatcggc tggaataccc aaatgaccaa tggtaggtga agggccgtat atgaaagata aaattgaata caacattaaa gtgtagaaat agctaagagt agttttaaag aagaaagagc tggttataca aatgaacgga aagaaatttt tgaaaatgct tctgataaac ttgatataac tgcttatttt tataaaaatc caaggttctt ttaaacgcat tttagtagat gacatgagta gcttttttgt tctagaagca tttttaaatg ccagatgaag gagaaatttt tctaccacaa gaaataagaa taaagggata gatagagctt atacgggtgc tgttttaatt agagatggaa agtatagaga attatctgat gaaacttcca atcttagtga aaaacgaata gaagaaacta agcgcatcga ataaagtagg gacatgggaa atgtttgcat tttttaatat tggatctgtt aaagaagttg gaagattacc gtgataacgc caaatggaac aacaaatttc tgcagcagtt aaacctattg actgataaag atgttaaaaa tagacagtgc attcagaatt ttaatgtaac agctaaggct aaattaaatt gtggttgaga 
taagtctgaa ttactattaa acagcgatta gatatatcat aggtttttaa tgaaaaggga gaatttaatg aatttaaaga aggaaaaaat gaagttgtat aaaagcgact gaatcaaagt acaaatcact tagttaaaat ttgatacacc agatattggc cctgaagtgt tgaagaaatt taaatgaagc atctgaattt tagcatagaa gaaatgacta gaatgggtag gccctatgaa aaaaatazac accaagaggt aattataaaa aaggagttgg gcagagctcc aaacatatat aaccttttgt tgttattaca agaatactta attcaaatag agatatctgt tatatcaaca aacgttatgt gatgtcgaga ttttaaaaga ttatacgttg gatgaaaaag gtggtacagt ctaacaaacg tgtttggatg tccaagtgaa gatctaaata aatacaagag caccatacgt ttgcgctact taaaggtact accagagcca acaaaattaa attgtatacg ggtatcatga cggacagtga agatcattca cagttaattc aaaatctgca tatgactaaa aetttaaagg tcagtaactt tatctaattt aaaatggtaa agaatctagt atcatttaca cccgaaacta agtacagcaa cgaataaaac caggagcaat tcacggtgta gtctggacaa attacagtaa atttgaataa aaggagctaa gctatacaaa agaagatgtg agagaataaa aaagaagtac tttaaagatg aaggattgac tgaaagatat atttcgagaa agggagcaga aacaatttat tttgacacac tgggagattt aggcgaatat gcaaaagcaa aaatcagctg attgactata gtggggttcc tgcaagagat ttgcaaaaaa tcgcgcatat agttgctttt caggactaac tgacagaggc tataatacag agaatcggta tttataaagg aatctaagcc tcgcaaaaac tttggagtta ccagataaat aaattctaga cctatttatg atgtagaatc aaatttaatt cgttatcaag t taaggagga gagtttttaa tgatctaaat aaaaaacaag cagttattgg aacagttatt tttagcttgg ctggtgactg aggtaaagaa gaggattctg acagtttcag tttataaagg agaagcggat aaagttgatq aatcaatcac gttgaaatat gctgagggaa cagtaacaaa tacaatgatt acaatgggcg ctaacgcaac tgaagaagat atcaatggtg aataaaagat actgaagtca caagcatcta gtcagttaga gcataccaaa aaatatatat aataagcaat ggcagaagga cccagaagca acaaaagtag atcacaggtt tageaccaga gtactaatga agttaagtct tcgttctaaa gataaaaatg ggagaatctg aagatggttt attcaattga atttaaaaca gatgaacaga aaagttgatg ggagaagcag aattcttcaa caagttcgtt acccagctaa cagaataggg gctttcaaaa agacgacgtc gtagcttctg acaacttatc caaaaggtac tacctcaatt caaaaccaat ggtaaatgct gatgacaatg acaagtgaac atccagagtt cttcagttat cactgctacg tggataatta tttgagacgc aaatttgaaa ttaaagaccg aagcagaaaa atgctatgag aaaaatgaga caaaaagagc gttttgaaca 
agatgagcac aagatgaaga agacttcagaa attttatcga acattaagaa gaaaacagcc gtatgtaaaa aagtgaacaa aaagtcat ta gaatgaaaaa gtagaaggca ggtttaaagc gacaattagg aaaaatcaat ggaaaagtat gtattctcaa gtagaagatg gatgttgaga aagcatattt aagaagcctt aaaatcttcg gtacatcttg ggtaggagag aacaggtcaa ccagtgtacg cgaggagaaa aagcagaagc ttgatggtac tccacaaggt aaaagtattc cccccaaaat taaatcaaag aacaaggtga ttaccaagtg ccaattctag ttgaaccaaa tgttactgtt tctactgacg agaatatctg taaaacagga tatttagaat gacagttatt taaaacttat actgaaccag aatacaacgt cttttagaaa caggtacgga tgaccttgga tgttgttaat caggcgagaa gttggataca gttgaagtag gagaataatt aggcaaagtg gcatgggaag tctcaggcgt cattgcacca gatgagagaa gaagtgacaa cgtctttttt aaaacagaga tagtaaatca agtagattta acaaaagcct aagagatggg t tctgtatgg tacttaatga tttaagaaaa gctgaaatta agtgaaatga ttaaggggtt aaagacagaa agcagtatgg gaacaatctc gtggacatta agagaataaa gaagagactg ctttttggaa gctagaaagg gaccatag gtgtccaaga aagctaatct gtcatcattt aaatgataag cttaaagttc taaacaagat gctactactg ttgaacaata aaacgaaaaa catttaaata aagctgcct acaaatgata tctattggag ggttaggtgc agcaagcagt gctaacgaag ctatgccggg atc-agcaatt gctaatgata cattaggagt t caagcaggt gcatatcaaa cacaactaaa taaacaagaa atagaaaatt atactattaa aaataattta gctcaaagtc ataaaatgac agcattaaaa aaaqacttaa ttgctaaagg tgttatcagt aattctttcg gtgctgcaga ttcaatagag actgcattaa agcttaaaca agttaacgct aaagctagta gaagctaata aatacagaac ttaaaaaagc aacttaaaca gttgagagat aagagcaagt gacgcagtac ggcaatcaag ttcaaaaact cttacgctaa aactaatact gaatcataqc qctaatqtcq gagcgttcaa tagataaagc atttcggcaa actcgctagt ttccctagga cgtacgatga acaagtgcag acttcgaagg aaagcatgtc taatcaagcg tatggaagaa ttggcagcct gcagcagaag caagcggtgc gtttaaaagc atctgatgca tattcaatac atgggagatg aattatcaaa aaaaagaaaa cgaaaatcaa gcagaacaaa agaagcagtc aaaagtacaa aaattaaagc caaaaqctga ttcatccgaa caagcggatg cgatgggcgt gcaaatgtct gttgacttag taggctttaa agaaatggct aaccatgttg cattaaaata aagctaaatc attagcc ctt tataaacgta agcttaagaa cgctaagcat aatgataatc aaacagaaaa aacagctqtt atgaagactt tcatgtcaaa atctacaccg cgagttggag gcgctaaaac 
tgccaaacaa acaactgcaa ctgatttact tgcaggtact aactcagggt gtaaaagtac aagaaaaaca aggttaatat aggcatgaag gataagtctg aaaaaaagat tagtgtaaaa gataaatcta caaatcaacg tagtaaccaa aaagcacttg tttcaaaatc agaatttaat aacaaagaaa ttaacaaaga gaaatttagt attactttag cgattgcaca aagtaaaagt acaatggagg ctgtaatggc tgcgagatca ccagcaaaag tagaggggtc agctaaggaa gacacttctg gagcttcgtt cagcaattga agttttatct tattaggcta gctaatccaa WO 00/32825 PCT/I B99/02040 10081 10151 10221 10291 10361 10431 10501 10571 10641 10711 10781 10851 10921 10991 1106 1 11131 11201 11271 11341 11411 11481 11551 11621 11691 11761 11831 11901 11971 12041 12111 12181 12251 12321 12391 12461 12531 12601 12671 12741 12811 12881 12951 13021 13091 13161 13231 13301 13371 13441 13511 13581 13651 13721 13791 13861 13931 14001 14071 14141 14211 14281 14351 14421 14491 14561 14631 14701 14771 14841 14911 14981 15051 15121 15191 15261 15331 atgaaaaaat agttccaaga tgaagcagca ttgaagaact aacaattagg aggtgcggaa gtaggtttag ttggaagcgc caattcaaaa aaaggcttta caaagctagc agtaagtgga actgctatta ttaacggttt agagtttaaa ggttataaat ccatgggcac agaaaaagct taggtattca tttgtctgac caacatgaaa ggcatgacga agtggatttt tagccttgat ctaatggtga aagtaaaaaa tggCgcrtttt ggattaacaa cgatttttgg ggctaaaggc gcaatgaaat aaggattagc aattttaccg ggcgcaagat caattgcata aggagaaact aar tat crtg ctttgagtga agcttctaaa C tagaaaaat gaatcgttag aattagttga cgcatctatt tatgcatcat ctttCaggt Ct cggagctatg ttcaaacttt CtgctggtgC taaagttttt ataaagtttt gaagtatagg cgatgaccrt aaagcatctg acgtacacta agacaaagca agaaataaaa agcaaaacat gctaaaggtc gagaacaaaa tgaagcgggt gcagctgatt caattgaagc tggatttaca ggccCtgCtg taaatagacg tcaaacctta ttgttaatt tgaaaaacgg agccttaaag aaaaccgcat ttggtggcaa caaaagcttc ctgaaagtag atactgtaaa ttctgaagag aaaaaacttt aggaactcga tttaactaga aatttgttgg catgggtgaa ttgattagac actagcaaca gtggctacaa tagttggCac ccagataaaa ttaatagcta tagcaaatca tgatgaaaga caacctcaaa ggtgctctgg tggtaaagat ttaacgccta Cgattagagc catcttcctg gttggtttag aaaggcttcg ttcttgctgg tggcttaCta atacgtgcag cattgctgaa aatacaatac 
tgtctaatac tttCCtggCC ctacaacagg aaaaaCgtca taaaacctat aaatgttttg aaaaattctg ttaggaCta gccgcaaaat ccttatttgc CCCCtaacag gacctatagg tgctacaata atgatcgtgt ggaatggttc agaaacggta aattattggc ggtgctgtta ggaagctagg aaagaaaagt tttcaaagga tatgaaagat gagtcaacaa gtitaaagga tttatgcaaa agtgttgggg aaaggtgttt caaaagaaac aacaacagaa tcatggaaaa agtacgttta tgaaaattga agcggattta tctaataacc aaaaactcaa gaacttattg ataagtatag actaaagaaa aaaatgactt gcgaattaaa aactcgggtc aaacaacaga ttatagctga aatagaaaaa tgcgttcgat gaacaagaaa aaagagcaag aaCtcaatca gaaaatcaaa aaaatgaaag gactgaaaaa gcgagcaaag aagatgatgt tgctgatcaa aaaaagcaaa agiggtggaa cgctaaagaa gacggcgtaa aaatgaaaaa aagttctgta tctgttaaat cttgggcgca gggctggcta aatgcaaaat ctggagatat atcagtattt atgggaagag tgcggtat tactggtact gtattaaatg gaacattcca caatgacact tggctagacc cgcataatat ttggttaggc atgtcaggtt attgctcaaa agctatctat gtcgtcacta agataaggta tggtatatgc aatcaggtgc agggaatgat tcaggtgtta atgttacttc aactcaatct atcccacaaa tagcgttctt aagaagatat gagatggtta gtatggatgg aattgttatg aataacttag caaatgcaaa aaggaagaag atgataacag at tagtat Ca tctgtgttta atacagctaa taaagggtat aatggaattc ctactgatac atttgaactg aaaagaaatt gaaaagcttg gagcaagagc gtatCCtagt caattaaaga agcagaaaaa cattgctata aaaaataacg agacataagg atgaagtaag ataaagatat tgataaagag tggccttaaa agttggtggt caagaagaaa cagctcgtag aaactaaaac tggcgaagct aatgtggagt ggaatcaaag ggatatcaca ctaaggctat cgactacagg aagtatttac ttcaaaatct atttggaaag acggatatgg ctaataaatc ccgtttggaa aggaacatcg gtattcaaga gcccacgatc aatggtttta gaaaatggct ctgcggctga tttaggtaaa taataaaata tctaaagcca ttagcaggaa agggtgtagc atagaggttc tggaaacgcc tgcaccccaa ggacgagatg ctgaagttac agcggatggg aacttaaagg taatataggt caaaaaaggt gcagaagaaa gataaaatcg gcgatgtgtg taaatattaa ttttggaggc aagaaattaa Cagacaaagt ttgaatatcc aatctggcaa tggtatagac CCtggtatgc tggactgatt acggtggcgg atttatctaa gcaattagca tacaggtaat Ctcgttagag acagctaaag atccagaaaa gaattgaaag aaaaagcttt aagtgatggt cagatttcag aaaatcaaag acgtgacatc actgttaaag aattgagtaa aagaatgcaa agaaacagaa 
atgcttattc aatagacgaa gcaagaaaag caagaaaaaa agaagtggac aagcaatatg tcaacctttc taagtctgaa aaagataaat tattagctat aaaggcaaaa tctaaaaaag atgctgtagt agacgttgtt atggattat ccagtggtcg tgtatataaa aatactgaaa ctaacttcag agaagaccaa aagaagaaaa gtgataagta aaacagagaa aatataaaga aatggtttgg aaatgcttgg tttagtaaaa tgggcagaaa Cgctaatcat Cttggcggcg gaattccaag caaattaagt tcaggttgga gctcagccaa agctaatagt actggtaaat ggtttggaaa agcttggcaa aatcaaacta agcaaaagta ttcagatgcc tcagataaag ggacatcaaa atggtttagc aatgcatata aaagtgcaaa gcgctcgaaa tgggataata tttctagtac agcatggtcg aaatggttta gtaactcata caaatcttta aaaggttgga gttttgatgc aatttcaagt Ccggcatggt ctaacgctaa atcaagaaca tatgaatgga itagagatat tggtaaagac aatgttgcta ataaagctat tggcggttta aatagcatga ttactgataa aaatctcatc aagccaatac ctacattgtc taccgataat tcgggagcat taacgcaacc gacatrgct ccaggtggtg gagttcaaga agtaattcac agggctgacg tggttgttcc actaggagtt ggagatagtg taataaatgc tgttttgcca aaattccatg gtggtacgaa aaagaaagat aaaaaagcag gagaatttgg agctacagct aaaaacacag tggttgaagc agcaggcgat aaaatcaaag atggtgcatc ggattacgta caacatccag ggaaactagt aaataaagta ggactaacgc tacagtaaaa attgctaaag gcgcgtactc aaaatcgtgg tttgaagatt ttggtggtgg aggcgatgga agatttggac gctacacagg tggacttaac tttaatgacg ctactggaac aaacgtttat gccgttaaag gtggtatagc taattctata caaattaaga ccggtgctaa cgaatggaac agacaaggcc aacgtattaa agctggtcaa ctgataggga gagcacactt acatttccaa ttgatgcaag qgtcacatcc atggttgaag tcacttaaag gtagtggcgt tcgaagtggt gcaggcgata tacgtcgtgc agcaaaacga atgggtgtta ttagcttgat tcaacacgaa tcaggaggaa atgcaggtat cgttttacag ggcaatccag caaaaggatt gcttcaatat agaggtcaca acaatatata tagtggttac gatcagttat cacagtttaa cccaagaggt ggttggtctc caagtggtcc aaagcatcaa cttgctgaag Cgggtgaagg agataaacag cgagcaattc aattaactga acaggttatg cgcatcatcg taaataatga tacttctaca gttgaaaaat tgttgaaaca ggtttagaaa aaatattgtc aaaacaaagt gggcatagag ctaattaatg caatcCCCtg taaaaatcat agatggttac cttatatttt tagatgcaag ggctgaaagt ccaaacacca atggtatttt accgggcgca attagttttg cgccttCCtc agatgttata gatttaaatt tatttgagca ttggCCtaga 
gttattactt etcaaatgcc tggtgttaaa Catgcagtga atggttcttc aactgaaatt gaagtaagtt taaatgttta cgatagcgag ttcttattcg actctaattg gatgtttgaa tatactcata catcaaatca atttactatt tggaacggtt acgatttgaa aatattaatt aattaaatg cgagtggagg Catttttaag tacaacaaaa gtatagataa aaacactgat ataaggctgc gggtgatgta agttcgetta catttagaca taacaacaga gcgaatggtg tcccCCCaac caagccaaat Ctaagtgata gttctaatga taattatatg taataacaga tgtaactatt aggtttggct atcgcagaca tgttacatct tctgaatcag ctcttgattt gataaatcca gttaactata atctgcttgg ggaaatatca gagacatcaa ttatgctgtt tat tggcgct gtttgattac tagacgtaaa aacatcactg aaggaadLda tgcaattaga ggaggtttga ttttaatcag aacggagtag atgatggtat tccttattat aaittaaaag ttaattggac cacacctaaa cgaCtcaagc caacaggtga WO 00/32825 PCT/I B99/02040 15401 154 71 l5541 15611 15681 15751 15821 15891 15961 16031 16101 16171 16241 16311 16381 16451 16521 16591 16661 16731 16801 16871 16941 17011 17081 17151 17221 17291 17361 17431 17501 17571 17641 17711 17781 17851 17921 17991 18061 18131 18201 18271 18341 18411 18481 18551 18621 18691 18761 18831 18901 18971 19041 19111 19181 19251 19321 19391 19461 19531 19601 19671 19741 19811 19881 19951 20021 20091 20161 20231 20301 20371 20441 20511 20581 20651 ttgttttag taacattagc gtccttit aattgatttg taggiacatc aacttcataa trtitattga tgatgacgac gcaaatcaaa acgaattagg cccaaatgat tatcaatata ttggaaaaaa aaatggtttt tgcaagtatg t tttagacgg aataaaaaat attgatatat atttaattgc gcagtttggg tttgcaaaaa tagagccaag acttgatagg gtacaaattc gtataaatca ataatggcaa at iggaaagc catttcggta attgtacaga tgctatta aggtcgagat acagagtta gatcgcaagc cagtgctaat ttagacgatt aagaaattct accicttaca gctagaaata taggtggggg cttaaatggt aagtatggta aagaaaataa tatagagata atgagactaa caggagacag tcaggttaat caaatatact atagaacaca ttttatcatg gggatgttaa taccaagaga tcagatattg ggagacat caaaaatgt c agattitcca tttgttcata aaacaagttc gtagctataa gagagagtga aaatatcaaa ccatttaaag aagtttcggt aaaaagttg aaaactatta t tgaaaaagc gcaacttgct caaaagcgat qgggrctac atttgattt aattatccaa tgacttatcg gtcagaatta atggccgaaa gaacagcaaa 
ttatgggact aatccctca atggigigta tgcatatcga gccaggtaaa aatgaattta atttataggi aggtgata tgaaaattta ctagatgtag actttacag tttatagaac tttatcatgg tgaaaaatac agiiacggca tatcacataa agtagcgaaa ctggtaaaac aaacttcggt caaaatgacc taacaaaaac ggcttagaat acggagatat gttttattc atactgatac tgtatctgca gtatacagct gaggaaaaga ataaaagaag gtacttatcg gtaatgaaac agttagattt caagcaaatt aagcaaattt attgataaag gcaagcacgi cttcaaataa aaaagctaag tgacaactca ggtcgcaatc attcgataig ctaatacgca agcaaataaa tgatactcct agatagcgta ttctttgttc tcacatccat ttgtaaacgc aacaagcgct taacagacga tttatacact agtacttga cagaagaagt taaaatcaaa tataaatggg ataccttatg gatgataaaa ataaattaga aatccccaaa cggtaaattg ttagaaagga aggtgcatta tatagagcta aggaaataca agttgcatca gaaagctaaa agcagtaaac aaatctttac gagattgttc aagctcgagt gggaaaccca aacicaaatt tgatatcgag tatcgttg aatgcagtaa tgcaatcctt atggctatat gctaagtcgt tcatggtaca cataacggtt aataatgaga atacattagt tgcaagatgt atttacagga aattctatac agaattgaga agaagtttag acgatgttga caaacgaaac ccaaccaatg taatccaaat aatagaaact gctgactatg gtggaacact atgacaaaga tagtggtaaa tcgtatttc atgattgggc agtgacacag gtggtagagi cagagccagg ictttaciat atggcgtgat gcaggttggt acgcgtaata gttatgcaag cggactggaa ttatgtgcct agatattaac atagtaggca actgaacgta aaggggiagc ggctagttcg taatagtgt agggccatgg actttacacg atttagaaaa cacagcgcat ttcttctgtt cttcttttta gcacgaatag cgattcgagg gtattttaaa atttcaatta tgcagaatta ggtaattcag tttagcatga ttccatctga tggatcgtgc gaaagcaaig taaagaaaaa ggiccaLu~~q aatagccata tatctgatti atatggatga gaaacatgaa ttcaaattgg caaaaagcta aataatccag aacaaagaat gaggtgttag tactaacqgt accgaacggt acaaataaag gctattaatg attacaatac gtaacgctaa attatacgca tggtgtgggc ictacattgi ticcaggtg gtgactttac atcttccaga tggtgatgga gatataaata gagtgggaat agattaaagg aiggattatc attatggttc tactcataat acaattaagc tgtatgaatt gccagaatac tataaaataa agacatcagi atgatcat i tt aaagaa agttttgt agacagcgcc icaaaatcac icrtttagatg itggaaat actgtaaaga agcggtagac tcctgaaaca iiitacaaa acigicagia caitggaaii aaaatiaiaa icciattaga iaccgcaaca aitgggtcta acaataaaaa 
agggcictca citgitttgc iaagicggt tttagaaaig atattitiag cctigiaigt aigttggaac aatacaaagc aaiigitgac aacaaatgaa gatatcgaaa aagactgaat tagaigttaa atgaaitaai gggaiataac aaiagatgaa gigtctica giiattgcac aagaiaaiag attctcctt cgagacaatg gcgciaciig aaaacgataa caggcagtag igatattgat iaaittaaaa aiaggcgaag tggaaaataa cagitgacga iggaaaatit gtatttaata aaacttacaa agaatagagc tgatacaaat gatattaaaa atcagtaatg tatiatgaac tcgattiact aaaggt tgaa tcagtggaat agiact taag taagcgaaaa ctatttggci gaagcgagaa aagaacagct acaactgaca aagctactat aggtggaata cagtctgaaa gagaagaccc igaaaaatca tacgtcgcag cacaggataa ttat ataggi actgaaitaa gcaatgaaat atataactat gaiataggga acagtact ggattgcctc caggaatiat tagtgggaaa aaggatttgg aatttgcgct agaggcatta ciacaittaa gatiitaatg tgaatgaagc aaiiigtgaa ggtgataaag caaataagct ataiggatti gtaccgattg gtaiaattta agtgaitcga ataaaagttt tiaaatattc taactttgai tataagttga caatagattt caaaaataga acagtcttaa atagtgcaaa gcigttagaa taigaaaaaa aggitgttaa aaaggatatg caagcaaatc gtgtatiaat ccagctacac aagacggtat tcaaaatagc cttggtacag gagctttagc tggcttgaca acaattcaac aaagtcaaat taaigiaaac aataaagagg aacctgataa ctggtttgat ctaagaccta atagatatat tcgtttcaag acitcgctga gcaaatttat tataatggta taacgcactt gttgtggcac cacgtaataa ggcgaaacct aaactataag gcaagaattt aatagaacag atggtcaatt tgatgatgag iatacgccia aaccctacat tcacigggaa gataaagttt tigacacat agaattaaat ttatttgiga gcatagtata tatagacagc ttatggattt atgtggaaat cacccitgtc cttgaaaact tgcataaaat aaaagaacat aagaccaaac cagaacttgc catgacacaa tcattgattg atagttttat tagctatggc aiaaatgaaa caatgaatta cagtatccct cacccagaaa gacctagaag taaaaatatt cagggigtga atttaacggc agattcatti aaagctttga aaagaggtat taaaccttta ttatacactg tcttggaagi gaatatgaig aaaaatagtg tgtcgtttta tggttggaac acagcatctg aagggagaat tatgaaaata atatcactaa taaaggagtc cciaaigaig acgtggtagi aacaaaatta gacgaggcta ggaaaataaa gccttcaagc agaiiaciaa aggtgatica actgtcgaat igittgttaa acctttigaa gatgacaagt acitaaatga agagtttggc aatggtggag cttgaiga aaaataciig tatiggtaia tttcgattta gaaacaggag aagaagcgta cciggcgaat 
ttgcggaagc agaaggtttg igctaggtgt tactgtcggt ggtgatggaa ittagaaaia citcacicaa gaggcgttcc ccaatgaggc ctgataaact taagaatctt atcaiacagi tcaaatcgat gatttcccat iaagccacca caaactggcg gigatgtaat acttttgaaa gggigctic tggaagaact gtaaatggga gagagtacci tcattcaica tttaactacg gatgaiacaa aacgttttac tiataigiag aagcttcaaa cacaggtggc cigagatact attgaaaaat taigatagia tataagttaa tgagtaaitt agagaaaict iiicaaatct agatataact tttagaacag aaataatcaa ccgitatiai tgagigaaga aiggiagtig ctccactaga aatattagat taaiiaaacg agaiggaagi taicaagcic tgtcgagaga actaicacat ttaacgtiga cactatatig tigaatiica ggaattagaa taaaaaaigg igaagattai gcgagtctga toataa gcgcaagcai attcaagaac attcgatgag agicagtgaa tagtggtggt ttagtcacaa agaigatggt aagataargc agattactgg acccaatia tttatgttic gcaagctata atttagtagi aacttcagat iacaagcgta aagaaaagaa gcgggttcat ggtctgagig acigttcaaa gtgcccaatc aaaagctaat ttaaiaaaag gtattcggtt atttttgatg gagittagac caatttattt tattaatttt agiccttig gaggaggaaa gaticatig gtgtttatga gttggatta actaaatcia WO 00/32825 PCT/I B99/02040 179 20721 gtcgtacatc tttaactata tcaaacgatg tctatttcga cttaggaagt caaagaggct ctggtgcgaa 20791 cgcaaataga gggacaatta acaaaattat aggagtgaga aaataatgca aatattagtt aacaagcgta 20861 atgagataat ttcatacgct atcattggtg gctttgaaga aggtattgat actgaaaatt taccagaaaa 20931 tttctctcaa gtttttagac ctaaagcctt taaatattca aatggggaaa tagtttttaa cgaagattat 21001 tcagaagaaa aagatgactt gcatcaacag attgacagtg aagaacaaaa cacagtcgct tctgatgaca 21071 tcttacgaaa aatggttgct agtatgcaga aacaagttgt tcaaagtaca aagttatcga tgcaagttaa 21141 taagcaaaat gcactaatgg caaaacaact tgtgacactt aataaaaaat tagaagaggt taaaggagag 21211 actgaaaatg cttaaattaa tttcaccaac attcgaagat attaaaacat ggtatcaatt gaaagaatat 21281 agtaaagaag atatagcgtg gtatgtagat atggaagtta tagataaaga ggaatatgca attattacag 21351 gagaaaagta tccagaaaat ctagagtcat aggttataat cttatggctt tttaatttga ataaagtggg 21421 tggtgtaatg tttggattta ccaaacgaca cgaacaagat tggcgtttaa cgcgattaga agaaaatgat 21491 aagactatgt ttgaaaaatt cgacagaata gaagacagtc tgagaacgca agaaaaaatt 
tatgacaagt 21561 tagatagaaa tttcgaagaa ctaaggcgtg acaaagaaga agatgaaaaa aataaagaga aaaatgctaa 21631 aaatattaga gacatcaaga tgtggattct aggattaata gggacgattc taagtacatt tgttatagcc 21701 ttgttaaaaa ctatttttgg catttaaagg aggtgattac catgcttaag ggaattttag gatatagctt 21771 ttggtcgtgt ttctggttta gtaagtgtaa gtaatagtta agagtcagtg cttcggcact ggctttttat 21841 tttggaaaaa aggagcaaac aaatggatgc aaaagtaata acaagataca tcgtattgat cttagcatta 21911 gtaaatcaat tcttagcgaa caaaggtatt agcccgattc cagtagacga tgagaatata tcatcaataa 21981 tacttactgt tgttgcttta tatactacgt ataaagacaa tccaacatct caagaaggta aatgggcaaa 22051 tcaaaagcta aagaaatata aagctgaaaa caagtataga aaagcaacag ggcaagcgcc aattaaagaa 22121 gtaatgacac ctacgaatat gaacgacaca aatgatttag ggtaggtgtt gaccaatgtt gataacaaaa 22191 aaccaagcag aaaaatggtt tgataattca ttagggaagc agttcaatcc tgatttgttt tatggatttc 22261 agtgttacga ttacgcaaat atgtttttta tgatagcaac aggcgaaagg ttacaaggtt tatacgctta 22331 taatattcca tttgataata aagcaaggat tgaaaaatac gggcaaataa ttaaaaacta tgatagcttt 22401 ttaccgcaaa agttggacat tgtcgttttc ccgtcaaagt atggtggcgg agctggacat gttgaaattg 22471 ttgagagcgc taatctaaac actttcacat cgtttggcca aaattggaat ggtaaaggtt ggacaaatgg 22541 cgttgcgcaa cctggttggg gtcccgaaac cgttacaaga catgttcatt attacgatga cccaatgtat 22611 tttattagat taaatttccc agataaagta agtgttggag ataaagctaa aagcgttatt aagcaagcaa 22681 ctgccaaaaa gcaagcagta attaaaccta aaaaaattat gcttgtagcc ggtcatggtt ataacgatcc 22751 tggagcagta ggaaacggaa caaacgaacg cgattttata cgtaaatata taacgccaaa tatcgctaag 22821 tatttaagac atgccggtca tgaagtcgca ttatatggtg gctcaagtca atcacaagac atgtatcaag 22891 atacagcata cggtgttaat gtaggtaata aaaaagatta tggcttatat tgggttaaat cacaggggta 22961 tgacattgtt ctagaaatac atttagacgc agcaggagaa agcgcaagtg gtgggcatgt tattatctca 23031 agtcaattca atgcagatac tattgataaa agtatacaag atgttattaa aaataactta ggacaaataa 23101 gaggtgtaac acctcgtaac gatttaetaa atgttaacgt atcagcagaa ataaatataa attatcgctt 23171 atctgaatta ggttttatca ctaataaaaa tgatatggat tggattaaga aaaactatga 
cttgtattct 23241 aaattaatag ccggtgcgat tcatggtaag cctatcggtg gtgtgatatc tagtgaggtt aaaacaccag 23311 ttaaaaacga aaagaatccg ccagtgccag caggttatac acccgataaa aataatgtac cgtataaaaa 23381 agaaactggt tattacacag ttgccaatgt taaaggtaat aacgtaaggg acggctattc aactaattca 23451 agaattactg gtgtattacc taataacgca acaatcaaat atgacggcgc atattgtatc aatggctata 23521 gatggattac ttatattgct aatagtggac aacgtcgtta tattgctaca ggagaggtag acaaggcagg 23591 taatagaata agcagttttg gtaagtttag tgcagtttga taattgtata tgatgaatct taggcaggta 23661 cttcggtact tgcctattat ttaaaattaa taaacagtta atttttacat gaatatatta aattttaaaa 23731 aaacaaacgt ttttagtata taaattattt tgtgttcgta ttgtgtgcta tgattaaaaa gttgttatgg 23801 tcaactatat cgtggtttta tgtttattat caatcaaaat ataaattatt rataatttgt ttggtaatga 23871 acgggttttt ttcgaaataa tagtaaaaaa acacatttgt agatatttta aactcggtaa atcttttaat 23941 aaatatttaa ttttattaaa agttaaaaag gtttaatata aaaatgtaat aaaatttata aagaaaggaa 24011 atgattttta tggtcaaaaa aagactatta gctgcaacat tgtcgttagg aataatcact cctattgcta 24081 cttcgtttca tgaatctaaa gctgataaca atattgagaa tattggtgat ggcgctgagg tagtcaaaag 24151 aacagaagat acaagtagcg ataagtgggg ggtcacacaa aatattcagt ttgattttgt taaagataaa 24221 aagtataaca aagacgcttt gattttaaaa atgcaaggtt tiatcaattc aaagactact tattacaatt 24291 acaaaaacac agatcatata aaagcaatga ggtggccttt ccaatacaat attggtctca aaacaaatga 24361 ccccaatgta gatttaataa attatctacc taaaaataaa atagattcag taaatgttag tcaaacatta 24431 ggttataaca taggtggtaa ttttaatagt ggtccatcaa caggaggtaa tggttcattt aattattcaa 24501 aaacaattag ttataataaa ataaaaagta ggtgataaga tgactcaatt tctaggggcg cttcttctta 24571 caggagtttt aggttacata ccatataaat atctaacaat gataggttta gttagtgaaa aaaacaaggt 24641 tatcaatact cctgtattat tgattttttc tattgaaaca tgtttgatat ggttttatag ttttataatt 24711 tttaataatg ttgatttaaa aaatttgaat ttaattcagt tgcttacagg tctaaaagca aatattttgt 24781 ttctatttat ttttgtttta acagtgtttg tatttaatcc tttaattgtt aaatttatta tctggttaat 24851 taatataacc agaaagttta tgaaattgga ttgtataagc ttattagaca aaagagacaa 
gttgtttaat 24921 aacaacggta aaccagtatt tatagtiata aaagactttg aaaacagaat cattgaagag ggtgaactta 24991 aaacctataa ttcagctggt agcgatttcg atttactaga agttgagcga caagatttca aagtatctga 25061 tttaccgtca aacgatgaat tgtatattaa acatacactt gtagacctta aacaacaaat taaattggat 25131 ttatatttaa tgaatgaata ctaatctttt ttcttagctt tttctgataa agtgcttttt aatttttcgc 25201 tggcgcccgg ctttcaaaa cczttgtLLLd t9trtttatc 25271 cgccataaaa ttctcaccac cattcaacgt ctacacttgt aggcgttttt ttatttagta aagtcataat 25341 gaatctictt tggttaactt atctccatct attttttgtg aaacaaattc caagtattta cgcgcattat 25411 gtgacgataa atctttaggt aactcataag tgaatggttg attaccacta gttaaaactt catatactat 25481 agtttctttt tttattttgc aattagttat tttcattata aacttccttt caaacactgc tgaaatagac 25551 gtcttttata ttaaagcgcc acacaggcgc tgttaatcac aatacaactt tgcccattac tttaatatta- 25621 ctaaacgaag cgactttgat atcatcatac ttcggattta gagataccaa attaatatag tcttcgcata 25691 tatctacacg cttgataaga cttactccat ctaatacaac gagtgcaatt gtaccatctt taatagaatc 25761 ttctttctta ataaaagcgt atgttccttg ttttaacata ggttccattg aatcaccatt aactaaaata 25831 caaaaatcag catttgatgg cgtttcgtct tctttaaaaa atacttcttc atgcaatatg tcatcatata 25901 attcttctcc tatgccagca ccagttgcac cacatgcaat atacgatact agtttagact ctttatatcc 25971 atctatagaa gtgactttat tctgttcttc caattgttca tttgcatagt taagtacgtt ttcttggcgg WO 00/32825 PCT/I 899/02040 26041 26111 26181 26251 26321 26391 26461 26531 26601 26671 26741 26811 26881 26951 27021 27091 27161 27231 27301 27371 27441 27511 27581 27651 27721 27791 27861 27931 28001 28071 28141 28211 28281 28351 28421 28491 28561 28631 28701 28771 28841 28911 28981 29051 29121 29191 29261 29331 29401 29471 29541 29611 29681 29751 29821 29891 29961 30031 30101 30171 30241 30311 30381 30451 30521 30591 30661 30731 30801 30871 30941 31011 31081 31151 31221 31291 ggaggtgtga gtttgttgta tatggaagtg acaaatcatt aatcttcaca ttgaagtact ttcattttcc cattttgata tcttgCCttt gttgctaatt gttccatagt catattttta cgaaaccctc cttatataag ataatttcat gcaaaagttg ttgacatcga aacttttatg agggggttca 
atgacaacta gtgtagcaga ggaactaacc aaaaagaagt tgctaaagca gaattaatgg cagagatttt acaacttcag tgattttttt caaactttaa gtttcgaaag taacgttaac caaagaagag ttgaaagaaa accaatcagc tcaggtgcaa ttttcagtaa aaactcaatt tcgcaaaaga tttgtcgcta agtatcagca tggcttcgaa tcaattcatc attaacatta tcaatttttg gagtgacact aaaatttata gagatatcaa aaactattat atgatttcga atgaaggagg aactacaaat ttaattgacg tgtggcatgg aaatcaatgg tctcggatag agaaggtaag aaatatctaa ctattgcctt acaatcctaa atcttttctg aaatgctgaa atagtcacga gcaacgctat ttcgttatga atcttatgtc tatctagagc tctaaatcca taaatttcac ctccttccac ggaacgacaa atgcaagctc aaaacaaaaa ccattagata ttcaaattaa tgacggatat aagaaatacc atacgtaaat aataacttat ttttgagaaa gatattgaaa agctaatttc ctcttttaac ttcgttccaa gttttattgt gtcatcaata atccaagaaa cgaccctgcc tctaatttta aaagtgagta cattactgtt tatgtcgagt aagtggttca cctattttct gtgattaagt ttcatcctat cacctccata aggaataaca aatgaacatt caagaagcaa agattggaaa gaaagtcatc gaactaagat aatagcgatg ggacaaacct tatcagatat aagttataaa cccaactaga gaccaggaat aaattgtttt taaactcatt ttcaaagtaa tactagcata cacgccgttt aggaacccag atgtagtttt tgaaaatact ttgtatgtat gaacctaact ttacacattc taaataatct gtgtatcaaa ttcatcagat atcaagggca aaggagcata aacaaatgaa cacaagatca ctgatgcttc ttcatcctat ttaacggaaa aaaaagtgat tacagctact tagaaataaa gcgaataata acaaacttta acatttatct caccagaaaa cacatatcga ggcgaagaaa tcaattgttt ggagtatgta gaagtacagt gtagaaaatt tatacattga ttattcagca tgatcagaaa gcataaaaaa tggtattagg agcagtgttg tgcttcacgg tcttagcgat attgcgggat tcgcaagtat cgcaacattc gctacttgtt ggagcaagta acagtgcaag tatgacctta caacaaaaaa tactatcaca gaagtttttg ggatatctaa aacacatgca aattggaaag ttggggtatc tggcgtgttg agagatatta gaagaacaat tcgagttatt gaagaacgca tcaagttaat gattcgtcta agaaggtatt tttgaagaat taaaactatt gtagattcat caattgtaca agagaaagtt aatcagttga agaagttaag gaaacttctg gttccttaaa aaagcagata cttctgataa aagctatcta ctatcaaaga agagcattat gaagctagat cactcaaata gagctcatgc ccaccgagta ttaaggcaag tgaaggtatt ctcatgagtt aagtgagtta catLLCag3tc ttttcaaaat tataagcgaa atcaatatta aatgtagaag aaaaatataa 
cgaagctttg tggatttagg taaatacgtc cctgaatctt tgaaattatt gacettaaat acggtaaagg tatggcttgg gcgcatatga actgcttagt aaccacgaat agataacttt tctactgaag tgttaaacca ttagceagac ttgcttataa tgtaagataa agcattcatg tagaacacgt tgttgagtga tgaagagatt gcagaacttt agaaaaatat gcactagatc aagcgaaaga cgctcgcgaa gaatgataac tgatacaaat atgtcgttat cagccaaaat cgttaatttc tttttttCtt t ataaaagt t atgtattctt taaaccatac atcggaatga aagctaaaaa tgacaactaa ttatagcgaa agtaagaatc ggaagattga aaaaagctta taattcagac ttatatatci gaaactacta ttaaaagtga ttaaataagc cttttttctt ctttagcgaa tctaggtaat tgggagataa agtcatctat gaactgatgg atgccttggt cccataagat ctctaatatt ttcgatgaat tcaaaatcat tattagattc acaggagtat ctaagatagc attaccaaca tggcaacctt tattgaagca acaacagtct agtttttaag atctttagca ttgtagagta tgttatcacc gaaggattgc aggaacgtaa caaagttttc aaaggagtga aatttgtgaa atacaactgg acgggaacat aggattatca tgtactcatg atattttata atgagcaatt ttttgcaaca aaatccacac ttgaaccgca ggcaagatta gccaaccaat aaataagaat aaagaagcac ctgatttaac gaaagaattt gaaaaaattg aaagcttagt gcagataaaa cagtgaagag agtagagatg t tggt actgg cattgaagtt ttaatgtatg agttaccaat cggtgaaggt gcagaataca tatataaact aaatgataaa gcaacgcttg cgtctttgta tttggcagtt attaagtcgg agcttcttta tcgaaaacga aaatcaagtt ttaaaaataa gtagaagttt attagcagat ataaaaataa agaagttaga aataatgacg ggaagctcaa tgtacaagat ttgagtgaaa atgaaaagag agaaggctat tgtagtattt gataatcgag gatatttatt aaccttcacc aacgcaagga gttacaaacg aaagct tgat attgagtata cattiaaatg ggaggacact aatgctataa atttagaaga tcatccgatt gttcatgacc gtgaatacaa agtttcagaa tcaataaaaa aagaaagcaa attaaaaaaa gcacttaatt agtgcaagta cttcttgtaa tcccaataac tgcaattacg tcatcaccga agcgagattg caatatcgtg ctaaattata taacaaaaca tactactatg acgaagaagg tccgatctca tttcatcaac tgatggttat gaatttaagt taagagacat actggatgtt atcgagaaat tcatggccag ttcagatcgc aacaaataaa atttatcaaa aataatatta tatttctaag agcaagagtc agcagaaagg atcataaaca tacaaaaaat cttgtctcta aatgatagtt ttttacaatg cagccgatga cctcatggca attttagaaa tgctatcaat tgtctgaaat tgttacatga tttatttaaa tcgtatttta cttccaaaat tattgcaggt 
cggacaagat atattgttgg tccttaggtt gataacaaca gtataggcgt cccacaagtt cttaggagcg gaaatattag tatgcattag atagagaact tagagatgcc aaaaatcata aaagttatac gcaacaccta ttgaaatatt accgtgaaga tgattaatat ttctaaatta aatgagcgac acatataaaa ccgtttctat acttcactac aggaatactt ttatgaagaa gtcttaaata attatataag tatgacaatt tcaattctga tttcaagact taagaaaaaa gttacattta actgttgtag aacgaacaaa gtgatgaccc tttaaggagg agttaatcaa ttacgtgtgc taaatactga caatgccaaa agatgaaaca taaagattat gttttatcag agaaataaac ttaacgaact ttgattttat gaatgcgaga gattcactat gttcttCCtt attaagatca aatacccata aaatattatt aaacaaaagg tgcacttaaa aagataaatc ttaaagttga atggaacaaa aaggcgagaa aatcaataaa ccgctaaaaa atattagaaa cctagcagca ttaactatcg acacgaaaac tataaagtgg at caagtgcg acagaagagt cttcttgcca ageaattttc acttaaagga taataggcga aacaccattg tagattgaat ttgttaacga accaagtgat tttagcttct tcgttgaaat taacgcaatc tcttaaaagg tgacacggaa catcatttca aatgattggg gatacttttt taaatagtgt catcttcgaa taatttaac tctttagtaa t tatacacga tctagcaaag agcttattaa tcaatacagg ataccaccaa cacaaatcca taatttaggt gaagagtatt gctacctatt agcatggtca taaagaaat gagttattaa tgatgttgtt ggaaagattg aacgtaagaa tagagaaata tggcaatatt actatcaact gctcaactgg taggaaaaga tggtgcggat ataaatgcat gctaaactgt ggtacattcg ttaataaagc gtacgtagct gaaacaaaat gtggtgtact acttagatta actatcatac gaaccgattt ttgtagatt c ccaccacatt ctgatgaagt tgtagaaggt tataaacctg gcaagtggag gttcagtttt ttgcgcgaat acgatgtaat tgatgtcatt tcagctatag acattcatac atcaagatta gagtttaaag caaaacaatg tgctgaagaa cagtttaaat atgttgaaga agctttattt atattttcag ataatcctca agttcgcatg cttcaatggg caggtagtca tgcaaaatgt gcctcaaaag gcctgacatc aaaaaatggg aactattetg gttggaagct aaaagttagt tgaagcaggt WO 00/32825 PCT/I B99/02040 31363.
31431 31501 31571 31641 31711 31781 31851 3 192 1 31991 32061 32131 32201 32271 32341 32411 32481 32551 32621 32691 32761 32831 32901 32971 33041 33111 33181 33251 33321 33391 33461 33531 33601 33671 33741 33811 33881 33951 34021 34091 34161 34231 34301 34371 34441 34511 34581 34651 34721 34791 34861 34931 35001 35071 35141 35211 35281 35351 35421 35491 35561 35631 35701 3 51771 35841 35911 35981 36051 36121 36191 36261 36331 36401 36471 36541 36611 aagatattac agaaaccaag taaaattgca gaaggcttta ccagctataa agcaatCtgc atgaaagcaa aagtattaaa ttgaacctca cagtatgcaa tacaagtacg ataaaagcca ggaggcaaag ttcctgcaaa attatcaaga cgcttatttt tagattaacg gattctggaa aacacaaatg gtaataaggg ttggcggtgc aagtgcagca caataggtgg ggtttttagc atatagatat tgaaacatat agatttcgaa atcttaatta gtagataatg agcctttcca agiatgcatt caatgctaat agaatggatt tgcacaatgg gttttaagac tacaaaacca agccaacaaa agttaatgga tatagattac tgtattcgag actgtaattg aacaagcata cattgatgtt aggagctaat aacaggttta gaaaatccta atacctaatt tacaaaagaa tagaaattag attgcaaatg tgatgaacgg gtaagaggtc caacttcaga atttaacaaa aacaacgttt tgacgattta gacgacattt actgctgaag atagcatggt atgcaaaaga cggcttctca aatgttiaat agtgtccgaa ttagctttag ggcattgaag aaaacgagtt tttggaaggc ttgccaagag tagattttat atgaaaaaag gctttagttg gtgaaaatag ggtcaaagtt aaaaacgtat tgcgatttct atagcaaggc gtagaaatac ctagaggttc caaaaggatt gaatttgaat catgcaacat caagcttata tacaatcaga ttgataaaga aatataacgt tataaatatt gtgcgtatca aaggaaaact ttagatatag atacaaacat agatgaagtt gaatatatgc gcgattttga acctatacca catggcttta tgatggaacg aatgttcaag aaatggagtg aagacaatgt taagcaacca tacggcacag tatccacctc ttaaagaatg gtcatgagga gatgaccatg acagatagcg caggataacg aacgagtggc caggctggca aggcgtgaaa tttggaatac gaggaacaga aaaaataatg aaaatagacg ttacttagca ttacgaattt tagaaaagcc acaaggtaaa tgaagatgat tttgacaaac caaaactaaa gtgattacag gaagggcaag aagcaaagta ttgaacaagc tatagaagct tctgaaactt ccattacgtg attaacgcat caagcaaaca ctattgtaag tggtgactat tatcgcagtt ggattgaaca gaagatgatt tcgatgaatt cccactttaa ttttaaagaa agcagtaacg atatttcgaa 
tagcttattc aatagatggt cgctgattat gagacgttta ttcgaaagaa cttgtcttgc ttaattcaat gcgtattggc aaaagataaa gcaggtaaaa ggaagaacaa gaaatttgcc atgtagaagt agaaatgaca ttgggttttt gaccaacata qtgctcgata agcagagtaa atagtcctac acagttattg aacggttcag gattacttaa tctaaaacca gtgtgaaaaa a t a a a a gcattatatt gatttattac aaggtaatga acaatggcgt gtaccggtag gctatcaagg acaaggttta gctgcaatta gttttctaat ttggggtagt ggtgggaagt tcagatactg tcaatgttca actagcagta ttagatgtgt aaagcataac tggcgctgga go Ogatagot atactgtaaa gattgaactg caagttgttg tagtcgagaa aaatggactt aaggaaatcg agtgacgggt ttacttCtCC tcaatgcttc tgttgacatt aaaagaaaat ttggcggact aaggttttag atttagaggt aatgacattg agagaagttt aaaatgcgcg gcgatgaatt agagtcaaat aaaagatgaa acgaaatgtg agagcggaat ccacaaaaac attcacgtga aagcataatg agcgtaatca gcgcactaca catacggcga aactagcatt cgcaataggt tttagcaaag gcgaagtttt catgtaaaga atacttaaac acatatccat gtagtgaatg aagacatttg atacagcgga .gaaaaatta atcggcaaaa aagcattttc taacacttg ctaccgagtc tgataaacga ataaaaatt aaaaaggacg gtatataaac aaaagtaag agcatcatat gcacatattt *tcaatcagt ttaatcattc ctaaatcaga jctaaagaag aaggaaaagt tagtaagttt Loggagatac tgaaagagaa gatgatgtga Lgcacctggt attattgacc aaaacaaaat Lttagagctt caatcaattt atttccattc Lcattcaact tgtagaaaaa ggcgaacctc Lgacactgat gatgaggatt tctiataagt ittgaggtgt caagaatttg aaatttatga togtggtgtc tataaataca cagaagctga ;gaccgatta gtgcgattga catgactaaa ~aattgctct atttgaccct gctgtaaaaa :aaacatttt aataaacaga tgccacctga :tacctgctt cgcttgataa agttggagaa itttaattcg ttatttctct ataccttgta tgaacatgat cttgaaaaat ggcaacaatt attgctaata aaattaaaga ctttccagta :aaacgacag aggtattaag ctttctaaat agaagaattg cttaaacaag ctaaacatat gcttggttaa aggatgaaca aggattagat aagtagcaac aggaaaagct aaaaaaatgc atacaacaaa atgcatgaca tgatgtgcag ggtactggaa gatgggcagg tagaggtgta aattagaaat agcaagagat cttattaaag tcctcaagac ttattaagtc aattagttag agtgattttt ctgcaataga ggcaagagtc tcaacacaca cggaaagata tatgaagcat taaaggcgac cctctcagac aaaaaggaaa gctttaaaag caatgggtgc attggaaatg ggcgtaacgc aaatcctaac atagttaatt atcccgaaag 
acgcatcata cacatggact cctagtggaa gagctttagc ttatccaaaa aatttatggg gttagatctt aaccgtaaat tattgttcaa gcaactgcaa gggatttact atagttggcc atgtccatga tgaagtaatt aaactatcat gaataagcct gttgattggg gttttatatg aaggattagg agtgtgattg agaattccta cagaagtcga aagtgttaat atttatttaa taatccaggt gaactattaa ggaatgatgg ctagaagaaa agttataaga cagaaaaata tcacatatct ccagaacttc attgtgtgga agaaaagact caaaatctaa gaaaaagaga gagaaaaaat cagaaaaaaa atgaagaaga aagaaagaga agattgagac tccgtactgg ttcgatgtca cttataacca gtaacagaaa agtagatatg aacgaagcgc cattgaaatt atagatttta tcgaacaggt aacgcaataa aatacttgtc tagagcacct acgtccaaag agetttogac ttgtgggagt caatttttcg gatctaagag atatctgtat gcacttatta ctttcacggg catatcgtac agagctcgaa acatatataa agcaaCatgg ggagatagaa atgatgaaaa tcaaagttga gaaaatccgg agctatcatt tggcagaaaa tcggtgtttt taaaaattgt tttaaaataa tgaagaagaa gtaaccgaag aaactaagtt tataaatctg catcatatga gaatgctagt aagcattcta catcttaaac gacgacctaa atggaaCacg gttcaaaaga atattacgaa tgaagcaacg tgatgagctt attggagata agcaagtgca tgggataggt attgcaagag gaagagtta at. 
ttgqaat ogaattaaac gaacaaatcg aacaatcagt tattagtgct agattgagga cgtgtataag aaagcgcaag agatgcaatc aaagaagata ttggtcttga tatgaggagg agcaggaaaa tgactaacat gaacgaaatc ataagacgga tgcaggttat aaaaggCagt gatcaaaaca gacgtagct~g tagtggtgta agtagtaaaa cgcatttagt ttagggatta atatcaagaa tgataatgaa cttctggtat agatggaaaa tacaccctac tgtcataaat aaaggcgaca aactagctca gtggaggaat tcgagagtgt ttcagaacgt aattaattaa tattatacaa cagacaaaaa gcgattttat attagttaat tgataggttg tttgaagtgt ataaacgaac gtttaaaaaa ctatgacgtt aatttggaaa aagcaaagtg aatactggtt tactaaqtt aagagagtgc cgttgaaaaa gatttaataa aataaaattt ttatggagga agcgcgtata acggcaatga cgtttgatga aatacttgag tgaagcagta ggaattatga attacaagtg aaactattat gacatatttt cagctaaaac taagcattcc agagggctat gattgaaaca ggcaagatag acgttagaga gtgaggatat tacctgtaac agataaattt attggttatc gtgcctatat gctactttta aCgagattca tgacagaatt gaaggagagt tgatgaagca aacaaagagc acgaatttgg agacgcaaat gtgggcgcga to tatttact gtgtcaaagt agaaggagt C tttcttgcta agcaagtttt tggagaagaa caaaqyxLtgqt gaataaccgc cacagaggga ttattaaaag ggtttaccta atgctatgca cgggtcaagt tgtctataaa caaaagacgc tagaatgcca tgtcgtactt gagccacaag gtcggtttat taactagccg acgcgggata tcatggtaat gagtaacttt ggtcggagtc ttatgtatga atggtagtta ggacacctga actaaagcaa WO 00/32825 PCT/I B99/02040 36681 36751 36821 36891 36961 37031 37101 37171 37241 37311 37381 37451 37521 37591 37661 37731 37801 37871 37941 38011 38081 38151 38221 38291 38361 38431 38501 38571 38641 38711 38781 38851 38921 38991 39061 39131 39201 39271 39341 39411 39481 39551 39621 39691 39761 39831 39901 39971 40041 40111 40181 40251 40321 40391 40461 40531 40601 40671 40741 40811 40881 40951 41021 41091 41161 41231 41301 41371 41441 41511 41581 41651 41721 41791 41861 41931 ggagcaaaag gcttcggaag tgagtgacat gttagaaata tctcaagagt aaaaagacta tctacatttg ctgataaaca agctcaaaca agttcaaggg ggacgtaagc atacacacat aagaagaagc gaaagagaag tgaaaatata agggagtgtg agctaggcaa gtatgtaact tgactttgaa aaaatcagag tgtgtttagc aatagcactg ttattgatct tgttaagtag atcttttcaa gcagttagta 
aattagaaaa agatttaaaa ccgaacatgt tagacaaagt ctgctagtcg tttatctaca acaaacagtt agaactaaag aaagatgttg gcggatttgt gttcaatgtt aacacttgat ttttgcatat tgtttatatt ttaaaacgaa atgtaaatgc attacttcga tgatacaact atttttcttt acctatgaag actgacacat tagaatggcc gcgacccaga agaaaagccg aacttttatt cctgatttat ggtggattgg tgttatacga tgcttgtgaa cagttttgat tactccggtt aatcgactac aagcaattaa ttaacgacaa catggtctga gacgttagaa attgcgtaat gatccaaatt gggaaaatgc catggaataa gttatatcga aaagatttat gcaaaatgcc tatcatccag aagttattta tcaaatactt ctgctggaai cgctcgagta aggtgtaggt aaatctgctt actggtaagg aagcatatga gaaaagctga agttgaagct acattatatt gaagattttc gatgaaactg gtggaagacg aactaaccaa agaagagatc gttccttaac cctgaactag acaggtatta ttgatgaata gaagacgatt ttatcaaggt ggtctgtgcg cttgaagtgt attagaaaga tttctaacgt aaattcgatt tggaaaagat ataagaaata ttgaataaat aacacggtgg tgtaaaaagt gatacaacac ctctttccct ctataaagtt taaaaagtag aagtttggca agaattgatg gaatcgacat tagaaaaata cacctggaac aagaggtgta gaagcaagaa aagggaaagt acagtgtatg tgttatggaa attgatttca aaccacatag tgtttttaga tatggggcta cactaaaaaa atgttagtca tggaaccatt taaatcatct acacagaggc tgatatctat atggccattt gacatggttg tctattaaaa agaaattacc tacaggattt atgggctcaa tcgagaaagg tactttaaac tctgaagaaa agatatatga tgcctgacag agttgataCt agaaaaaaac tatattttag caaaaactac ttcaactatc agaagttaga taagttagag caaacatgat aaagaaagaa tagcggagtg tttttcatag tacacacaaa tcaaaagacg agctatgagg aactaaagct tacgaggcac ggaaatgact gagcaagtat ctgaagtttc gaggtgttgt catttctcat aaaaataatg gctaaagaga cactcaaata cat tggacta aagatttaac cggtggttat atcgattatg caacacataa agatgagtat tatcaaccac atttaccttt aacgtcttca ggaattgttg acgaaaaaca aaataacaag ttagtacgca ctagttataa aatgtctgat attacttcga taaaaggaaa taattttaaa gacatacacc taagagatta aggtgttgaa atggagccag tgctaaaaaa ggcattacaa attaagcatt caaggcaatg ttt ttggcca gaccaaatct aagaagaaat tcttaacacg gatgttgata ttgttgaatg cttaagacaa tatggtgtac atacattttt aatcgiaggt tctcgctgta tgttagggag caacatcgga tttagtgaaa ccagatagaa tacatccttt taaagaacaa ctatcaaaag gggaaaacag tagcacctaa 
gaaagtgtct gtaaccaata taattgatga actuALLaaL taaagacata ggtttggtgt cctatatgaa cat at crcaa aatgacacaa aagagcaatc aagttaaaag aaacaaatac atattatgat atggtaatag aaatatgtgg aagatgaaaa gtgttgaagg ggcgttggt gaaacaatta atcacaatat tgagtacaat ttaaaagaag ctgctcaaga gcatagagag gaagctattg ataggttaat gttagaccca agggaagaga gtgcattttg ttctactaac tttgcctatt tacacttata agcaatgCag gcaatgcagg aaggtacttt aatagcattt atacgtcaat ttagatcgag ttatctattt atgttgttga t agcat t cc tacctagtca aaaggtttac aaatgcagtt caagattatt gacggctaat ctattatcat attgtcattt ccatagaagc tatagaagat tctctgttct aatatgatcg ggcttggtct aaaatgtcta gcaaacgacg tatgactgac ataagtccaa ggcgtaaagt gcattggcct gataaaatat gtaagactaa tagagcctat cgttatacct ctcatcataa tggtgctcaa caaagagcgc atttcgatga caaagctagt aatgaattta ggcaagacgg atccaggcaa aacaaaagat tcaaataaa atatcgtggg gacactgaag tgaatagaac gatgtaaatt tgactatatg aataggtggt gcacggtttt ggcgtttggt caatggaaat tcatacctaa acaagttgac tactttcatt ggcacaacta acgactgcaa atccagagag gggcagaagc taaatactat gcgctcaatc caaagtaaac ccaatcccaa gcaattggga tgtcaccaac aggaaatgta ctctggtaaa gataagggag tcagacaatt ggtctgtata ccaaggaggt tttggggaag tgtcgcatag gtactatttt ccgctactac ctttgtgaca agtaatgttt cctatgagta caacattcaa agattcaaca agttgttgat gcggagagta attaaattag ggcagttgtt accctcacta gcgacgtacg gatgatgcag aggcgccgag ccctgaatta attatattaa caattgtttt atccaactc accggagtac atgaatgcti catgaaaatg aagtcgaacg atttaaaata atgaaaggag cgatgtttca tatcccatg gactttatgc agaagctagc agtctgaaca agccgataca tgctggtcaa gtcatgaatc ataccaccta tgttttatga gactgcgttt agtgattcct cgcagatatc gttggcatgg ccaactagta acgacgcgga caaatgaata tgttgattgg aagattagca gataagcaag acgatagaag aagccataga accatgaagg ttcaactgca cacggatccc gtaagcggta gatgaagacg ctaaaacaga aaaatgatga agttgttaaa aatagtaaat agcgatgatg atcccaaata cagaaattat caaaacaaat cgaatgctta cgatgatagc agtttaagaa gccaccacaa gcgtagcaat atggacataa acgtcttgaa aactaccaaa aaggcattga cctacacttt atggtcctca ctgacagttt agtttctgtt ggcagaactt gcagctacaa cggtttcgtg tcgcttatgg ataaagttga 
tttcctaaga agttgaagtg aactggtcta tatgaacaag gagaagagtt atactgagga atctccatat agacttaact atctttgaaa gattacattg aaagagacaa atagtagagg atcatggaa tgaaggcaat aaaagtggga gaaagtttag aggatttaat catcattttt tgagtgatgc gcaacattga tgcaacaaat ccaatgttgc atcaaattca cctcaacag ctatttttaa tacaaggtga ataaatgaaa attatgttta aaatgggtcg aaaacatatt ctgtagaaat aatttgaaaa cagagatcat ggtaggtgga acatttggcg tgataatgag aaatacggtt agtgaattgc agtcgttaga gggttgatga agttgataag agaaagaaat gatgcattaa tgtgatcaat ataaaaaaga ccaagagtca aaggtttaaa Aacacctaqt ccaaatagtt gagtcttcat ccagtcgtta accgggagct aagagacgga agcgaaagat tatctggata agaaaagtat atgaagaatt agaatggggc atcattaagt_ tgtaagactt atacatgata atattattgt tttataactt tagaggattc aaactataaa cgcagggcat ggattaaact gaattatacc aacaagcaaa tcatgaccga taacacaata agatagcgta agatgttgta gtcgtatcat aggttcaacc taaaggggta acacaaatat gagataacaa ttattattat acaaaaatat gcaaatactt tatgcaatag tatcaacact acaagttgt ttagtcttag aagaaaatac actgtctaca cgtaagagat tcaaatgttg ttttggtgat ctgtttgttt taggggtaac aaattttgta agttaaatgg gccagaagga gtgcatcggc ttataagaat ataaagtgat tacagcattt aaagatacat gaacacctaa taaatggtta tttaaaagtc cgaaagactt catgttttta taagcatgaa tgaaaaagaa gttgtagctc atgatgaaga aggccaacca gcaaccacat atccagcaag atggtcattg attcatcaca gtttatttga tagacagagg caacacatca agttagcgaa acgaatagaa gatacacgct aaacaaacag tagccttatc aatcggaaga agaaggaaca taacggtgca gtccatacag gaaattatag aggagtctca tacttcaaag gtttaaggaa gaacgctgga tacaacaagg tgcaagatta atagtggaga cattaagctg cttatagcac tgggcacact attgtttggt ttggacttac tatagacaag gacaaaatca tacgactatt WO 00/32825 PCT/I B99/02040 42001 42071 42141 42211 42281 42351 42421 42491 42561 42631 42701 42771 42841 42911 42981 43051 gatcaaagag caagaatagc atttaaatat gagatactta t Oagaacaac tgaagcagot aataaagata caatacgaaa gcaaaaggcc cgacataaat tgaccaagca agcatggaag gatattataa acttagataa taatcttaag tgcccatcgg tatataaagc taagcataag attgaaccag acccaacgaa tgagttaatg gaaagtgagt agaagctaaa gaactttgtt tacaaatctg acatgaggca taataacatt aagttaagag 
cagatgcaaa tctaatgtca aaaattagag cttaaaatgt tttacaaaac aaagaaccaa taatggaggt ataagatggg aaatatataa tttaaatgag agaactagac accaacattg gcgacaaggt tattgactaa acttaaagtt acctgaagat gatagaacaa ataggggatg aaagcgatag cgtatcatgc tagtaatatg atagtatcgg catcgctaag cggtgtgtct tataagcatg gtcgtaagtc agatagcatt agatagagat gattgtgcat cacattattt gtttgttata gctgtcacaa ttctaaaaat ttaaataaaa tttttcgccg ggtaccggag cgcaagaaga attgatgaaa gctattaaag aaaggcgtca tatgatatta agccaggaac aacaagaaag agataaatag attgagaatg tgtatggacc gttacaaaaa ggagagccag taagatgtta cgtaacttag aagagatggt cataagaaag taataaggtt aaagtattgg cttgtcacat gcatcgcaat acagttacta aggtatcaaa taacattgtg caaagattgt aaagatgtat aaagttatct gaaagttata ttgttatgc aatcaaagag gtgtaagaga atatcaatac gattggttct atcattcaaa aattatcttt gtcaaatgtg tttacgcgaa atgttgatga agattttaac aaagctttag caaaattcat gcaaatgata atgacaaaag aaattattta aataaaattt tatgcccccc aggcc

Table 8 Bacteriophage 3A ORFs list

SID    LAN      FRA POS          a.a. RBS sequence              STA STO
100379 3AORF001  1  8515..13488  1657 acaggtacggatctaagaaaacttt ttg taa
100380 3AORF002  2  37667..40114  815 tttaaaataatgaaaggagccgaac atg taa
100381 3AORF003  1  32188..34149  653 ttaaagaaattgaggtgtcaagaat ttg tag
100382 3AORF004  3  17457..19370  637 gctattttattagaaaggaaggtgc att taa
100383 3AORF005  1  334..2034     566 agaaaaaagatagttcaagaagaag gtg taa
100384 3AORF006  1  15571..17154  527 cttttatttataggtaggtgattta atg taa
100385 3AORF007  2  19337..20836  499 atgatagtaaaacaagttcagggcc atg taa
100386 3AORF008  3  22176..23630  484 aatgatttagggtaggtgttgacca atg tga
100387 3AORF009  1  40726..42093  455 gtaaatacttttataagaatggtag gtg taa
100388 3AORF010  3  13491..14738  415 gaggcggactaacgctacagtaaaa att taa
100389 3AORF011  2  2039..3277    412 attaaagacataatgcgttaaggag gtg taa
100390 3AORF012  2  4001..5209    402 aaaaaagagaaaaaattaaacgcga atg taa
100391 3AORF013  1  30379..31545  388 attttatgaatgcgagaataaatgc atg taa
100392 3AORF014  2  14738..15562  274 attatatgggaggtttgactaatta atg tag
100393 3AORF015  3  3249..4034    261 cttgaattaagaaaatctttgaaag gtg tag
100394 3AORF016 -2  25587..26273  228 aagaagctaagaaaaaaataaaaat atg tga
100395 3AORF017  3  6729..7370    213 ttaattttaaggaggaaataagca  atg taa
100396 3AORF018  3  24540..25154  204 aataaaataaaaagtaggtgataag atg taa
100397 3AORF019  2  31565..32128  187 ctataaaaattaaaaaggacggtat ata taa
100398 3AORF020  3  36150..36713  187 gcagtaggaattatgacgggtcaag ttg taa
100399 3AORF021  2  24011..24535  174 gtaataaaatttataaagaaaggaa atg tga
100400 3AORF022 -2  12423..12938  171 taaagtaccagtagacaatgtaggt att tga
100401 3AORF023  1  7462..7917    151 aaaataaatcaaaggagaataattt atg taa
100402 3AORF024  1  26731..27174  147 actaaataaaaataaggaggacact atg tga
100403 3AORF025  1  42106..42543  145 taagcataagtaatggaggtataag atg taa
100404 3AORF026  2  35255..35671  138 aagcaactaactttattttaaggag ata taa
100405 3AORF027  2  5888..6298    136 atattggctataatacagtggtttt atc taa
100406 3AORF028 -3  27845..28255  136 ccttttaagatgtttatgatccttt ctg taa
100407 3AORF029  3  34344..34748  134 ttaaggttttagatttagaggtgga atg taa
100408 3AORF030  2  6299..6694    131 tataaaaaaggagttggccagataa atg tag
100409 3AORF031  1  20833..21225  130 ttaacaaaattataggagtgagaaa ata taa
100410 3AORF032 -2  39984..40361  125 aaatagctgttagagggttacccct ata tag
100411 3AORF033  1  7957..8325    122 gaatatctgcgtctttttatttga  ata taa
100412 3AORF034 -2  28506..28871  121 gttatcaacctaaggaggtgataac atg tag
100413 3AORF035 -2  10671..11036  121 tcctagcttcctaacagcaccgcca ata tga
100414 3AORF036  2  30020..30382  120 accaattttaaggaggagttaatca atg tga
100415 3AORF037  2  21818..22165  115 aagtgtaagtaatagttaagagtca gtg tag
100416 3AORF038 -2  42003..42347  114 gtactcactttcaactgcttcaacc atc tga
100417 3AORF039  2  21386..21727  113 tccagaaaatctagagtcataggtt ata taa
100418 3AORF040 -3  29654..29995  113 ttgattaactcctccttaaaattgg ttg taa
100419 3AORF041 -1  4333..4671    112 tactaaatctacatctgatccatga att tga
100420 3AORF042  3  5568..5900    110 taaaaaagtggtaggtgatttttaa atg tga
100421 3AORF043  1  25690..26019  109 taccaaattaatatagtcttcgcat ata tag
100422 3AORF044  3  29676..30005  109 gtcttaaataattatataaggagtt att taa
100423 3AORF045  3  30..353       107 cgctagcaacgcggataaatttttc atg taa
100424 3AORF046  3  27894..28214  106 aagatattgaaaagctaatttcccc ata tga
100425 3AORF047 -2  11907..12227  106 ttcgccgccaaaatgattagcattt ctg tga
100426 3AORF048 -3  40343..40663  106 ccataacacatacactgtatgatct ctg taa
100427 3AORF049 -3  6749..7069    106 tgttaaaccatcttcagattctcca ata taa
100428 3AORF050  1  42700..43014  104 ttatgcaatcaaagaggtgtaagag atg taa
100429 3AORF051 -2  13077..13388  103 ttgtacgtaatcccacacatcgccg a t tga
100430 3AORF052 -3  3722..4024    100 gcatttcatttcctcctaataactc att tga
100431 3AORF053  3  17145..17444   99 tcgagacaatggatatagggagtgt att tag
100432 3AORF054 -1  19915..20211   98 ataatttatagcttgcgaaacataa ata tga
100433 3AORF055 -1  42436..42729   97 aatcgtattgatatgacttacgacc atg
100434 3AORF056  3  40455..40745   96 taaattttgtatacaaggtgaataa atg tga
100435 3AORF057 -1  38665..38952   95 atcatcaccgtcttgccattgacgt att taa
100436 3AORF058 -1  21265..21549   94 gaaatttctatctaacttgtcataa att tga
100437 3AORF059 -2  10278..10562   94 tttagccgcgcttccaactgcacgt att tag
100438 3AORF060  1  5278..5556     92 atatcagccgaataggggtgatgaa atg tag
100439 3AORF061  1  35668..35946   92 tttggaaagaaggagagttgattaa ata taa
100440 3AORF062  2  35912..36187   91 gttaaatttggaatggaattaaaca ata taa
100441 3AORF063  3  36720..36995   91 cggaagtagcggagtgtaaagacat att tga
100442 3AORF064 -2  35694..35969   91 ccgttatacgcgctagcactaataa ctg taa
100443 3AORF065 -2  32697..32972   91 aaccgttttcttttgtaaattaggt ata taa
100444 3AORF066  3  29157..29429   90 caaactttaacatttatctaaagga gtg tag
100445 3AORF067 -2  26661..26930   89 atacttttttagcggaatcggatga ttg taa
100446 3AORF068 -2  9624..9893     89 ttttaatgcatctcccatgtattga ata tga
100447 3AORF069 -3  13847..14110   87 tgcatttcctcctgattcgtgttga atc tga
100448 3AORF070  1  34993..35250   85 tttacgtccaaagagcttttgactt gtg taa
100449 3AORF071  2  34745..35002   85 aaatgttcaagaaatggagtgaagc ata tga
100450 3AORF072 -1  27379..27636   85 tttgtcgttcctcctttaagttgtt ttg taa
100451 3AORF073  2  37367..37615   82 tggtaatagctattatcatttttga att taa
100452 3AORF074 -2  23466..23714   82 cgtttgtttttttaaaatttaatat att taa
100453 3AORF075 -3  2471..2719     82 agtactgtttgaaatcttctaacac ttg tga
100454 3AORF076  1  26047..26292   81 aagtacgttttcttggcggggaggt gtg tag
100455 3AORF077  2  28292..28537   81 aacatcttaaaaggaggaataacaa atg tag
100456 3AORF078 -1  5836..6075     79 ttttgtataaggcttagatttagtc att taa
100457 3AORF079 -2  5460..5699     79 attcagtcgcttttaaaatttctct atc taa
100458 3AORF080 -2  31350..31586   78 cctgtaatcactttagttttattta ata taa
100459 3AORF081 -3  8252..8488     78 aagttttcttaaatccgtacctgta atg tga
100460 3AORF082 -1  35905..36138   77 atatttatagacaacttgacccgtc ata taa
100461 3AORF083 -1  34039..34272   77 atagttcacctggattattaaataa ata tga
100462 3AORF084 -1  12007..12240   77 acatttttttcatttcgccgccaaa atg taa
100463 3AORF085 -2  32367..32597   76 cttacaaggtatagagaaataacga att taa
100464 3AORF086 -2  30618..30848   76 atataatctaagttgaggattatct ata taa
100465 3AORF087 -3  24746..24973   75 ataggttttaagttcaccctcttca atg tga
100466 3AORF088 -3  12980..13204   74 tctttctttttcgtaccaccatgga att tag
100467 3AORF089  3  4290..4508     72 acaggagaagcttatcaatctttaa atg taa
100468 3AORF090  3  28926..29141   71 ttatacacgaaaggagcataaacaa atg taa
100469 3AORF091 -2  13587..13802   71 cttgtcttgctaattgcttagataa atg tag
100470 3AORF092  2  26471..26683   70 aaacgaaacaaaaggagggggttca atg taa
100471 3AORF093 -1  2524..2736     70 tccaccgttttcttcatagtactgt ttg tga
100472 3AORF094 -3  25334..25546   70 tggcgctttaatataaaagacgtct att tga
100473 3AORF095  3  8316..8525     69 aagagatgggaaagacagaagaaca atc tag
100474 3AORF096  2  36992..37198   68 aacaagttcaagggagctatgagga atg tga
100475 3AORF097 -1  32593..32799   68 aaagcttaatacctctgtcgtttat atg taa
100476 3AORF098 -1  15346..15552   68 aatccattaaatcacctacctataa ata tag
100477 3AORF099  1  7225..7428     67 actggtgactggatgaacagaaaag ttg tag
100478 3AORF100 -2  22620..22823   67 cgacttcatgaccggcatgtcttaa ata taa
100479 3AORF101 -1  40060..40260   66 aaccttacagcgagaagggaaagag gtg taa
100480 3AORF102 -1  35035..35235   66 ttctatctccttaaaataaagttag ttg taa
100481 3AORF103 -2  1149..1349     66 atttttttggagtgttgggtaatca ata taa
100482 3AORF104  1  27661..27858   65 aaacaacttaaaggaggaacgacaa atg tga
100483 3AORF105 -2  9420..9617     65 gcctaagtcaaccgcttgattagac atg tga
100484 3AORF106 -2  23244..23438   64 caccagtaattcttgaattagttga ata taa
100485 3AORF107  2  11966..12157   63 tctaaaaaagatgctgtagtagacg ttg taa
100486 3AORF108 -3  35054..35245   63 ttttcatcatttctatctccttaaa ata tag
100487 3AORF109 -3  16010..16201   63 gttcttaatccaatgtactgacag  ttg taa
100488 3AORF110 -1  6184..6372     62 attttcagtgactttataatagtat att taa
100489 3AORF111 -2  16500..16688   62 gtagtcaacaattgctttgtattga ttg tga
100490 3AORF112 -2  8502..8690     62 cttaattctcgcctgatacttttcc att taa
100491 3AORF113  1  34162..34347   61 tatgaaggattaggagtgtgattgc atg tga
100492 3AORF114  2  12356..12541   61 ggatatcacactaaggctatagcta ata taa
100493 3AORF115 -2  7635..7820     61 tgaagttccctcagctacaccgtga att tga
100494 3AORF116 -1  26434..26613   59 tttagcttctgaagttgtaaaatct ctg tga
100495 3AORF117 -3  17804..17983   59 atagccattatttctagcttgtgtc atg tga
100496 3AORF118  2  27899..28075   58 attgaaaagctaatttccccataag att taa
100497 3AORF119 -1  39268..39444   58 acgaaaccggtcaacttgtttagat atg tga
100498 3AORF120 -2  37152..37328   58 tagctattaccatgaaacttcagct ctg taa
100499 3AORF121 -2  18900..19076   58 aaggtactctctcccatttaccact att taa
100500 3AORF122 -1  21550..21723   57 taagcatggtaatcacctcctttaa atg taa
100501 3AORF123 -3  33062..33235   57 aaacgttgttctttaataagatctc ttg tag
100502 3AORF124  2  21212..21382   56 aaattagaagaggttaaaggagaga ctg tag
100503 3AORF125 -1  22051..22221   56 aaatcaggattgaactgcttcccta atg tga
100504 3AORF126 -2  7821..7991     56 tgtttttcctgttttacggtcttta att tga
100505 3AORF127 -3  34712..34882   56 ttgcattacctattgcgaatgctag ttg taa
100506 3AORF128 -3  24056..24226   56 tttttaaaatcaaagcgtctttgtt ata taa
100507 3AORF129 -3  4940..5110     56 cataccatgcagttaatacaaacaa a-a tga
100508 3AORF130  3  27171..27338   55 cagaattaactatcgatgatttcga atg taa
100509 3AORF131 -1  40387..40554   55 ccttctggcataataataattctat ctg taa
100510 3AORF132 -2  1860..2027     55 gcgataacattcacctccttaacgc att tga
100511 3AORF133 -3  42317..42484   55 acaaagttctttcgtattgtagtaa ctg tag
100512 3AORF134  2  12671..12835   54 tcatacaaatctttaaaaggt gga ctg tag
100513 3AORF135 -1  39484..39648   54 ataatagtatttagcttctgcccag att taa
100514 3AORF136  1  29710..29871   53 accttacaacaaaaaatactatcac att taa
100515 3AORF137  1  37186..37347   53 ggcagttgtttgaaaatataaggga gtg taa
100516 3AORF138  2  20996..21157   53 aatggggaaatagtttttaacgaag att taa
100517 3AORF139  3  15114..15275   53 tcaactgaaattgaagtaagtttaa atg taa
100518 3AORF140  3  29442..29603   53 aaaatggtattaggaggattatcaa atg taa
100519 3AORF141 -1  39883..40044   53 tacaccataatcttttccaaatcga att taa
100520 3AORF142 -1  20416.
.20577 53 accacctggaaaagtcccataaaaa att tga 100521 3AORF143 -1 1942..2103 53 ataaagcttagaagttgactgatca atc taa 100522 3A0RF144 -3 39380..39541 53 ttccaccagtttcatctcttaagaa ate taa 100523 3A0RF145 3 20388..20546 52 tctgagtggtcagaattagctatta atg taa 100524 3A0RF146 -2 2358..2516 52 aacatgtccatattatgaacaatca att tga 100525 3A0RF147 -3 5606. .5764 52 gtgatttgtttgtggtagatattca att tga 100526 3A0RF148 2 34145..34300 51 tttacttetccgttttatatgaagg att taa 100527 3AORF149 -1 7918..8073 51 tattctcttgatttactaattctaa ata taa 100528 3AORF1SO -2 11745..11900 51 ttcatccttatgtctttgatcagca ata taa 100529 3AORF151 -3 7097..7252 51 tttaccttcatgatacccgtataca ata tga 100530 3AORF152 1 21652. .21804 50 ctaaaaatattagagacatcaagat gtg taa 100531 3AORF153 2 5381..5533 50 tcggctaagtctgaattactattaa gtg tga 100532 3AORF154 -1 39670..39822 50 ttgataaaatcgtcttctttcaaag ata taa 100533 3AORF155 -1 38233..38385 50 ataggctctacaaaatgcaccaaca att tag 100534 3AORF156 -1 33040. .33192 50 tatctgaaatataatgctttgttaa att tag 100535 3AORF157 -2 10119. .10271 50 cttcaatgatttgctatagctatta att tga 100536 3AORF158 -3 36074. .36226 50 atccgtcttatgatttcgttctggc act taa 100537 3AORF159 -3 18338. .18490 50 taaatagtttctattatttggatta ctg taa 100538 3AORF160 3 39399. .39548 49 gtttggttaatggaaatggcagaac teg taa 100539 3AORF161 -2 8976..9125 49 ttgtacttttagtttttgaacttga ttg tga 100540 3ARF162 -3 31199. .31348 49 tctgtaatatcttcaggtttataac ctg tga 100541 3A0RF163 -3 14459..14608 49 attatcctgagaagaaacagtttga atc tga 100542 3A0RF164 3 25182. .25328 48 ttttttcttagctttttctgataaa gtg tag 100543 3A0RF165 3 28353. .28499 48 aatcttgtctctatgacacggaaag att taa 100544 3AORF166 -1 8899..9045 48 gtactgcgtcacttgctctttttag ttg taa 100545 3A0RF167 -2 411..557 48 taatacaagttgacgtttagatcct ttg tga 100546 3A0RF168 -3 25973. 
.26119 48 gctgagtacttcaatgtgaagatta atg tag 100547 3A0RF169 -3 25151..25297 48 aaaaaaacgcctacaagtgtagacg ttg tag 100548 3AORF170 -3 24995..25141 48 taagaaaaaagattagtattcattc att tag 100549 3AORF171 1 23437..23580 47 aaaggtaataacgtaagggacggct ate tag 100550 3A0RF172 2 32414. .32557 47 ctatttgaccctgctgtaaaaaagt atg taa 100551 3A0RF173 -1 38005..38148 47 ataagttgtatcatcgaagtaatcc aeg taa 100552 3A0RF174 -1 4123. .4266 47 atttaaagattgataagcttctcct gtg tga 100553 3AORF175 -1 3124. .3267 47 ttcatttgaaaatacttagctttca teg tag 100554 3A0RF176 -1 580..723 47 cattttctcCatcttgtgatacagc ata taa 100555 3A0RF177 -2 39819..39962 47 ttagaaatctttctaatttccatag ate tag 100556 3A0RF178 -2 38466. .38609 47 ttagcgtcttcatcttgagcaccat ata ta 100557 3A0RF179 -2 33927. .34070 47 ttttgcccaatcaacaggcttattc atg tga 100558 3AORF180 -2 33555. .33698 47 cgtctttcgggattttacagtatta att tga 100559 3AORF181 -2 29538. .29681 47 atagtattttttgttgtaaggecat att ega 100560 3A0RF182 -3 17099..17242 47 aatatcactactgcctgCataaggt ace tag 100561 3A0RF183 2 23750. .23890 46 ttaaaaaaacaaacgtttttagtat ata taa 100562 3A0RF184 -1 31648. .31788 46 tggaagtttcagatttgcaggaacc ttg tga 100563 3AORF18S -1 30565..30705 46 attttgtttcaaataaagctatcac ate tag 100564 3A0RP186 -1 16951. .17091 46 gagaattcaaagtactagtgtataa atg tga 100565 3A0RF187 -1 7153..7293 46 tatccaacgaatacttttttgaaga att taa 100566 3AORF188 -1 1237. .1377 46 ccagctcttctaaagaaacaatett act taa 100567 3AORF189 -2 33309..33449 46 catttgagaagccgatgcttcatat ate tga 100568 3AORF190 -2 7197. .7337 46 gtaacgaaettgcagaatcctctga atg taa 100569 3A0RF191 -3 41459..41599 46 tcatctgtataaactgcaccgttag aca tag 100570 3A0RF192 3 4863..5000 45 gatgctattattaacgctttagcaq act tag 100571 3A0RF193 3 25965. .26102 45 tatacgatactagtttagactcttt ata tga 100572 3A0RF194 -1 37069. .37206 45 ctagtaagaataataatcttagtat ttg tga 100573 3AORF195 -1 11749. .11886 45 tetgatcagcaatagctaataattt ate tga 1inOA74 IA ORF196 -2 40764. .40901 45 atctttagcaacttgtttaggtget atg tga 100575 3AORF197 -2 31989. 
.32126 45 ggctaaaaaccccacctattgactt ata tga 100576 3A0RF198 -3 36431. .36568 45 tttatttatgacataactaccattc ata tga 100577 3AORF199 -3 33515..33652 45 ttccaaaaattaactatgttaggat ttg tga 100578 3AORF200 -3 21233..21370 45 ataagattataacetatgactctag att -tga 100579 3AORF201 1 23293. .23427 44 aagcctatcggtggtgtgatatcta grg taa 100580 3A0RF202 -1 39088. .39222 44 atagtcaaatttacatcctggctcc att taa 100581 3A0RF203 -1 16309. .16443 44 tttgcttgccgtctaaaatcaactt ata tga 100582 3A0RF204 1 23845. .23976 43 atgtttattatcaatcaaaatataa att taa 100583 3AORF205 1 29500. .29631 43 gtgttgtgcttcacggtcttagcga ttg taa 100584 3AORF206 2 16667. .16798 43 gaaaaateaacagtcttaaatttaa ttg tag WO 00/32825 PCT/I B99/02040 100585 3A0RF207 1 -1 35386. .35517 43 tgcagatttatagactccttcttga atc taa 100586 3A0RF208 -1 30013..30144 43 cagttgagctgtttcatcttttggc att taa 100587 3ARF2 09 -1 28366. .28497 43 taattcctggtctctagttgggttt ata tga 100588 3AORF2 10 -1 15739..15870 43 catcaagcttatttgattccactga gtg tag 100589 3A0RF211 -1 7693..7824 43 taactgaagttccctcagctacacc gtg tga 100590 3A0RF212 -2 4314. .4445 43 qgttctgaaacaatttctttagaaa gtg tag 100591 3A0RF213 -2 4011. .4142 43 tgtttgatgtcttccatatcaatat ttg taa 100592 3A0RF214 -2 1722. .1853 43 tctgtctagtttcaactgaacatta ttg taa 100593 3AORF215 -3 16616..16747 43 tcttcatttgtttgcgtattagcat atc tag 100594 3A0RF216 -3 15833..15964 43 gtcattttgaccgaagttttttgat ttg taa 100595 3AORF217 3 6363..6491 42 gatgcagagctccaaacatatataa att taa 100596 3AORF218 -1 32146..32274 42 aataagctataattaagatttcgaa atc taa 100597 3A0RF219 -1 29800. .29928 42 ctagggtcatcactttgttcgttta atc taa 100598 3A0RF220 -1 18409..18537 42 gcattaacctgatacgcttcttctc cg tag 100599 3A0RF221 -1 13234..13362 42 ttttatcgcctaaccaagatgcacc atc tag 100600 3A0RF222 -1 12313..12441 42 cccaagctttatctgaggcatctga ata tga 100601 3A0RF223 -1 4915..5043 42 tccatcatagttaattccaaaataa ttg taa 100602 3AORF224 -1 2125..2253 42 attaactactttataatcttcatac att taa 100603 3A0RF225 -2 26298. 
.26426 42 tcgtttgtaacaacttgatttaaga ata taa 100604 3A0RF226 -2 17184. .17312 42 cgcctatttttaaattatctaattt att tag 100605 3A0RF227 -2 1425..1553 42 atcttcttcccattctctatagggt att taa 100606 3AORF228 -3 31055. .31183 42 cattttttgatgtcaggcagtttat ata taa 100607 3A0RF229 -3 22592. .22720 42 gttataaccatgaccggctacaagc ata taa 100608 3A0RF230 -1 27883. .28008 41 gaaggcagggtcgtttcttggatta ttg tag 100609 3A0RF231 -2 29988..30113 41 gcttctttaactttctcttgtacaa ttg taa 100610 3A0RF232 -2 22485. .22610 41 tatctgggaaatttaatctaataaa ata tag 100611 3A0RF233 -2 9264..9389 41 aagtttgccgaaatgactttgagct atc tga 100612 3A0RF234 -3 23033..23158 41 acctaattcagataagcgataattt ata tga 100613 3AORF235 1 25558..25680 40 aacactgctgaaatagacgtctttt ata tag 100614 3A0RF236 1 34420..34542 40 acattgagagaagtttcagaaaaat atc taa 100615 3A0RF237 3 38442..38564 40 aagaagctatagaaacttttattc ctg taa 100616 3A0RF238 -1 33628..33750 40 caatcattagaaaaccttttttcat ata taa 100617 3A0RF239 -1 29248..29370 40 tcttctaatttagaaatattaatca atg tag 100618 3AORF240 -2 18156. .18278 40 gtctctcaattctgtatagaatttt att taa 100619 3A0RF241 -2 8088..8210 40 tttcaaggcttttgtataagtttta gtq tga 100620 3A0RF242 -3 39149..39271 40 ttagcaaagcagatttacctacacc ttg taa 100621 3A0RF243 -3 23558..23680 40 aaaattaactgtttattaattttaa ata taa 100622 3AORF244 -3 1697..1819 40 catttcattaaaggattattattaa ata tga 100623 3A0RF245 1 19015..19134 39 agttatgcaaggaatatgatgactt ttg tag 100624 3A0RF246 1 22504. .22623 39 gctaatctaaacactttcacatcgt ttg taa 100625 3A0RF247 -1 40567..40686 39 aaagtatttactrgttctttattcc ata taa 100626 3AORF248 -1 23956..24075 39 tttagattcatgaaacgaagtagca ata taa 100627 3A0RF249 -1 11113. .11232 39 cacctttccccaacacttttacagt atc tga 100628 3A0RF250 -1 8719. 
.8838 39 ttttattagcttctactagctttaa ata taa 100629 3AORF251 -2 16899..17018 39 aactcgtctgttaagcgcttgttga att tga 100630 3AORF252 -3 37025..37144 39 acaactgccctaatttaataactgc att tga 100631 3A0RF253 -3 29138..29257 39 tctacatactccaaacaattgatgg att taa 100632 3AORF254 -3 15476..15595 39 caaatcaattcattaaaatccatta ctg taa 100633 3AORF255 1 13552. .13668 38 ttaatagacaaagtaaaatcgtggt ttg tag 100634 3A0RF256 2 12545..12661 38 aaaagtgcaaagggctggctaacgg ata taa 100635 3A0RF257 2 41870. .41986 38 gggcatggattaaacttacaacaag gtg tga 100636 3A0RF258 3 10827..10943 38 tcaaacttttgaaaaacggtttagg att taa 100637 3A0RF259 -1 34570..34686 38 gtgacatcgaaccagtacggatcac gtg tga 100638 3A0RF260 -1 32389..32505 38 aagcaggtaagccaatacgcattga att tag 100639 3A0RF261 -1 23830..23946 38 cctttttaacttttaataaaattaa ata tga 100640 3AORF262 -1 8158. .8274 38 ccatctctrctggttcagtttctga atc taa 100641 3AORF263 -2 14001..14117 38 ttatacctgcatttcctcctgattc gtg tga 100642 3A0RF264 -2 294..410 38 tttgcttgtttttattttcccttga gtg taa 100643 3A0RF265 -3 42683. .42799 38 tgacaaagataattatctctatcta atg tga 100644 3A0RF266 -3 31979. .32095 38 aatcctcatcatcagtgtctaattc atc taa 100645 3AORF267 -3 26306. .26422 38 ttgtaacaacttgatttaagaatac atc tga 100646 3AORF268 -3 1 0. at vvg tac 100647 3A0RF269 -3 9872. .9988 38 tgagacccctctaaccctgagttag ata tag 100648 3AORF270 1 21829..21942 37 atagttaagagtcagtgcttcggca ctg tag 100649 3A0RF271 2 29468..29581 37 tgagcgacacatataaaagctacct att. taa 100650 3A0RF272 3 2955..3068 37 gagttaaacagattttacttgcagc ata taa 100651 3A0RF273 3 5010. .5123 37 tttggcaaaccagtagtatttacag atl taa 100652 3A0RF274 3 19956. .20069 37 tcaagtatagatgaattaaagcaac ttg tga 100653 3A0RF275 3 39882..39995 37 gatatgttaccaacaggaaatgtag att taa 100654 3AORF276 -1 27211. .27324 37 attaagtgcgcttatttaattagat att tga 100655 3AORF277 -1 13516..13629 37 cgaccgtcattaaagttaagtccac ctg tga 100656 3A0RF278 -1 11893. 
.12006 37 ttttatatacacgaccactggataa atc taa WO 00/32825 PCT/I B99/02040 100657 3AORF279 -2 17535..17648 37 tttgraaagatttgtttactgctgc ttg taa 100658 3AORF280 -2 6474. .6587 37 rcaaaaraagcatctaacrgacrag arg raa 100659 3A0RF281 -2 759. .872 37 ttttgaratcgtgcgtcataatgg art tga 100660 3A0RF282 -3 36608. .36721 37 cccaaaacctcctt gactcgatcta ara tga 100661 3A0RF283 -3 14960..15073 37 tttcagttgaagaaccatctrttaa art taa 100662 3A0RF284 1 18859..18969 36 atgtraacagagccaggtctttact art taa 100663 3A0RF285 2 8237 8347 36 aaaacttataeaaaagccttgaaag ata raa 100664 3A0RF286 3 5157.. 5267 36 tatgatcagcaacgtacattagaca gtg tag 100665 3A0RF287 3 38610..38720 36 tttgatttagtacgcaracaettat atg taa 100666 3A0RF288 -1 36454. .36564 36 tttatgacataactaccattcatac ara tga 100667 3AORF289 -1 30217..30327 36 aacaattttttcataatgctctrct rtg taa 100668 3A0RF290 -1 16678. .16788 36 gcttttttgcaaartctaacagett atc tga 100669 3AORF291 -2 14310..14420 36 gtctagttaaagggaraaccatctc ctg tga 100670 3A0RF292 -2 11457..11567 36 ttctttcaatrctrtgatrttctga ttg tga 100671 3A0RF293 -3 29462. .29572 1 36 ttcataaaagratrccttataaaar atg tag 100672 3A0RF294 -3 22388..22498 36 accattccaatttrggccaaacgat grg -ag 100673 3A0RF295 -3 18629..18739 36 aaaaggaacgectcrtgagtgaagt art tag 100674 3A0RF296 -3 6332 6442 36 tatcagacatgaagtctgaaggtaa atc taa 100675 3A0RF297 1 13984. .14091 35 aaarggttgaagtcacttaaaggra geg tag 100676 3A0RF298 1 40174. .40281 35 aetcaaargtrgcatcatrttrtga gtg taa 100677 3A0RF299 2 1481..1588 35 gccgcgtgtgcracttttgcgrtag ata taa 100678 3AORF300 2 40451. .40558 35 aatataaattrtgtatacaaggtga ara rag 100679 3AORF301 3 25479. .25586 35 accactagttaaaacttcatatact ata raa 100680 3A0RF302 3 32106. .32213 35 gaagatgattrcgatgaattagaea ctg tga 100681 3A0RF303 3 36024. .36131 35 gaeacagagggattattaaaagaga ttg tag 100682 3AORF304 -1 37762. .37869 35 accgacaaatccgccaacatctttt ata rga 100683 3AORF305 -1 24088..24195 35 rrtatctrtaacaaaatcaaacrga ata tga 100684 3A0RF306 -1 19507. 
.19614 35 atcartaggtaattgaaattttaaa ara rga 100685 3A0RF307 -1 16081..16188 35 atgtactgacagttgCagatacagt arc tag 100686 3AORF308 -1 11398. .11505 35 tttctttagttctagttaaaatgrt ttg taa 100687 3A0RF309 -2 33003. .33110 35 aaacagacctcttacccgrtcarca ctg taa 100688 3AORF310 -2 24894. .25001 35 gtaaarcgaaatcgctaccagctga art taa 100689 3AORF311 -2 22005. .22112 35 ttcgtaggtgtcattacttctttaa ttg rag 100690 3A0RF312 -2 21711. .21818 35 aaaaraaaaagccagtgccgaagca ctg tag 100691 3A0RF313 -2 17901. .18008 35 catraggtcrtagacgacttageat ata taa 100692 3A0RF314 -2 16710. .16817 35 taattcagtcttaggagtatcattt art rag 100693 3AORF315 -2 15990. .16097 35 acatatctccgtarcattrgggtaa art tag 100694 3A0RF316 -2 2862..2969 35 aattettcttcatactgtttgacga ttg tag 100695 3A0RF317 -3 40217. .40324 35 tccctaacactactttraaacttt ata tga 100696 3A0RF318 -3 37535..37642 35 tgttcggetcctttcattattttaa ata taa 100697 3A0RF319 -3 34421. .34528 35 trctrcatctttratttgactetgc ata tga 100698 3A0RF320 -3 28262. .28369 35 carttgttggtaatatcttagtrcg atg rga 100699 3A0RF321 1 23989. .24093 34 taaaaaggtttaatataaaaatgta ata tga 100700 3AORF322 1 34660. .34764 34 aagagaagattgagaccatggcttt atg taa 100701 3A0RF323 3 30105. .30209 34 craaatactgaactarcaacgrag art raa 100702 3A0RF324 3 30258. .30362 34 ggaaaagagttccrtaaaaaagcag ara tga 100703 3A0RF325 3 40236. .40340 34 gttgtatcatttttggrgatgcaae art tag 100704 3A0RF326 -1 36964. .37068 34 cgcarcaaeaactgtaaacetttga ttg tga 100705 3A0RF327 -1 35242..35346 34 atttttgtctgttgtataatatttt ctg raa 100706 3AORF328 -1 21916..22020 34 ccatttaccttcttgagatgttgga trg tga 100707 3A0RF329 -1 18820. .18924 34 ggtggcttaacrtccaagaaccaac ctg taa 100708 3A0RF330 -1 15631. .15735 34 ttatgaagttttcacaaartagraa arc 100709 3A0RP331 -2 37998. .38102 34 ttacgcccaatagcttcatactcat ctg tag 100710 3A0RF332 -2 7359..7463 34 rttaraaacctttaaagrtttagrc ara raa 100711 3A0RF333 -3 24584. 
.24688 34 aaaaattataaaactataaaaccat ate taa 100712 3A0RF334 -3 24269..24373 34 tatttttaggtagataatttattaa ate rga 100713 3A0RF335 -3 14273. .14377 34 cacttcageaagttgatgcrttgra arc tga 100714 3A0RF336 2 7559. .7660 33 graactrtatcraatttagaagcgg ara rag 100715 3A0RF337 2 13277. .13378 33 aataraggtaaaaaagcaggagaat rtg rag 100716 3A0RF338 3 9501. .9602 33 taggacgracgargaegatgggcgt arc taa 100717 3A0RF339 3 27348. .27449 33 aratctaattaaataagcgcactta art rga I rAA .LUILo mcfv- 100719 3A0RF341 -1 33421..33522 33 aagctaattcggacacttttccttt trg taa 100720 3A0RF342 -1 29047. .29148 33 trtggcatctctateactcctttag ata taa 100721 3A0RF343 -1 7549..7650 33 atgatacgctgagactagaattgg att. taa 100722 3A0RF344 -1 7297..7398 33 ctgctgaaactgttgcagattttga art-I. -rga 100723 3A0RF345 -2 23850. .23951 33 ttaaacetttttaacttttaaraaa art taa 100724 3AORF346 -2 20607. .20708 33 aaagargtacgaeragatttagrta ate raa 100725 3AORF347 -2 14175. .14276 33 atetgttgttaaagaacgetaataa ctg raa 100726 3A0RF348 -2 6984..7085 33 cgtacactggttgacctgttaaaec are tag 100727 3A0RF349 -2 6882. .6983 33 ragaacgaceaataaetgtatttag aetc taa 100728 3AORF350 -3 40748..40849 33 aactgcaatteactaaatgetgraa gtg rga WO 00/32825 PCT/I B99/02040 100729 3AORF351 -3 38345. .38446 133 ggttagtagaatgtttttcgtataa atc taa 100730 3A0RF352 -3 38081. .38182 I33 Itagttgaaggccaatacattaacct Iatg a I 100731 3A0RF353 -3 35432. .35533 33 tagcattctcatatgatgcagattt ata a 100732 3A0RF354 -3 349S2. 
.35053 33 ttatcctgatacagatatctcttag atc a

Table 9 Bacteriophage 96, complete genome sequence

1 71 141 211 281 351 421 491 561 631 701 771 841 911 981 1051 1121 1191 1261 1331 1401 1471 1541 1611 1681 1751 1821 1891 1961 2031 2101 2171 2241 2311 2381 2451 2521 2591 2661 2731 2801 2871 2941 3011 3081 3151 3221 3291 3361 3431 3501 3571 3641 3711 3781 3851 3991 4061 4131 4201 4271 4341 4411 4481 4551 4621 4691
catagttata ggcttttcag ctatatacca gaaaccttga ttcaatgggg ttcaatcta cgttgacctt gctctttttt atgttcatca aatggcctaa ccttttgcta atatatttaa tcctaatgaa taaggtgcta ttgtagtatc ttaacggcat tatgactcaa taaacaac taatatgttg tatatccttt tttggtacct atgtactgta cccicttttt cgtttagatc gtgatagcta ggatgaataa aaaaatataa agtattgttc catggtgatg aatttagagt acagctaggg tctttcttta aatagcctc aatttacgaa ccgtttcatt agtacgacct tgatgttttt tattaaaaaa tcattccga gaattgttgt gaagcgacat gcctcttatt ttattttcat ctaaattgtt tccatcatcc ctttagtttt gaatcctgac tttcttttct agatgctgtt gctttattct tcctttttgi ggcaaaaaat aataagggta ggcgagctac cccttcctac ttcttttcta aaactatcat tccagcatgt tggtttttgt ccggattatt tcgcaactag gtccgtttgg gtcgcgtggt gcacctgttg cttagatgtg ttattggttt tcgactattg ccatcgcttt gattactat ctgtctttgt
tctctttctt tgtttCggtt atgcacctaa cactaacgca ctagctaata tgctatttgt tttaataaat ctatgatttc tcgtctaaca tctctattaa gacgaaattt caggattaga aaacgaacta ctgaaacgcg taacatatct ttaccgctct cagacattgt tattttgttt cctgatttct ttcgatttct tatcacgttt ttcagaaact gacatacgat tcctccggca gtccaagact ctttaactgt cctttctca tatttcttta tattcaaaaa agttccaata ccgtatatct tcttatattg tactcagaca actcatacaa gttacgtacg ctgagataaa gccgtgtcgt cttgcgtaat gttgccatac gtcaacttgt ggtgggcaag gaaggtctaa taaaaatttc tccttcttga ccacttcaac tccacatttc ataagcaatt tttctttcta tctctaaccc attgcataaa ttatttgcat gaccggctat agtttcttga taaagtaatc tgctaattgt tggacttttg tgccgattga cttaccccga ccgcttcaga tctaagttct ctgataaaat cittctagca gtaatactaa tttaccataa gtaatatcac cttaggtgtt gacatattac tttaagtgat cagaaaattt taaagagttc tctgtaaagg tgataaatta ggcgttacta aacaatctgt caattgtatg ctagccaa attattcaac aatatcactt taagtgataa aggaggaaat ccagtaagga aaattgaagt ggaaggagaa atgcacgagc agataacgcc atacgcaatc gtcaggtcaa aacagaaata tgatcatcat aaacaaagta aaaacgaaaa cattagagaa caccgacgtt aagaaaaact ggtgcttatc tgaagctaca gaagaaacaa aacaagaaat caaaacgg tgcgegaga ctacaatttc gactacatgc gataacaaac caaaaacaac gatgactggt gcgagttcaa gaacgaacgt aattggttcc cgtcacaagc tactttatac aggagaggct gaatatggaa tacatcggat aaaagatgat ctagagaaaa aagtctactc cgaggacaaa agcgttatat aaaaattgac aatacgaatt ataggaggag ttatcaaatg tcacagtctt agcgattgta cttatgccgt aagtatcgca acattcatat actcaaaga caagtaacag tgacaaacat ttatcaaaat tatggctgaa aatattaaaa ctgaataaca agataagatt gcaagtgtca agtaagtgag caggtat acc atttagtcct ttattatctg ccacaagtt gataggcaac ctcgattcgt gt tcgt ccc atatactgca cgaccgaatt aatattcgtt ttttgaatct aaatctctaa ttcctgattt aattgtaaat ccgaaatttt atgattgatt ttccatttt tgtgcttgtt.
gttgattgtt ttctttttcc tctctgcttt ataaaactaa attgttttgt cgatttatca ttgaaaagct atttagttcg tcacttcaa caaatacttg taacttatca ctctcaacgg ctctattgcc ccataattgt tcgaactt cccttcatat taccaaccat cttcgtattt atttcgatt tgaatacttc atattccagg caattctact ctcttatatt ttttcaatac tatcccgccg tccccataaa aatatgcttg aatatgtgtc aagaaaataa ttcctgaca agtaggtgtc titagaaagt attgactct tacgttttgg gtccgcgtta atattaatta ctctagattt tgattttttt tctctgaagc cgtteaaaaa aaataattct aaccaatcat gcagttgttg gaaagacgga gacgccattt attgttgaac agggtgtgtt tcagtggctc gtccattatt gttaatgttt gcttctgctt cctctttctt taatcttttc tcatgattt tttcgtaagt atctataaat cgcttattta aagggatatt tttttgacct ttaggaactt ctcaaatgta tccaatatgt aagcttctac gcgattgttg aatacttcta cgaatcctcg tcccatgcgc tcttcccatc tttctccgc atatttaagt tgagtaatgt ccataatttt aaaatattac taaagttata aggaaagatg tagcatggtt taattttgat actgtttttg catcgctgta aaagtattct gtaccacgaa attgtgataa cttttgatac aatgaacgtt tgtaatattc agcagcgttg cgttttacgt cacttttcct aactattgt aacgacatt c ttctagcatc ggtagctgga gtgttgttct tatctttagt atcgccgtcg atgttttaca tgttttcait aaacattcga tgaccaactt aagt tt ccc gttattaaac tcatttaact gattcatctt atcgaatact attcttcgct aatttcgcgt aatttcgatt attcgttcct aggtactctt caaacccctc cttcgggagt aattctcgat tctttaagcc cgttctcttt cccctttagt tttcttgaaa cgggaggtga atgacacaac gatacattat tatgcgtgcg aaatgacttt aattcagtt ccccacgaag tcttgcacca atcaattgca tatctatttg acaactgtct tccgaacgtt gataccaata ctcaaacctt gttgcctcag cgcaccgcca cctcaaaatt ccacttcttg ctggaccacc caaatattct agattcttct cgttgtttac ttctttCttt ttgctaccgc ccccctttact tccaagatgt cccgtgttgt tattttttaa tataattttg ttttcgataa tccctcgaat ttatatgat cgccatagtg taattgtaga aacgggactg gatctaaaat ttcggataag tgtgtttctt tggtgtctta aaattcatct tcaggtacat agccagagat cataagttgt atcacttaat taaatatcac tacgaaatgc aagatgtcgc aaaaggctta atttaacatt agtatagttg caaatgccaa tcggagaac caactcgaat aataagatgg gaaaaagatg acagaagttg actatacaaa acgcagaatt ggctaaaaaa tgaaatgcaa cccttctc atgttgatag caacgaatct accgctagga aagtacctag caaaaacgtg 
ttaactagaa gtagcgaatt aagacaaaaa agaatcaagc atgcagacgc gaacaaagag aaagctattc agtaaaact tcctatactt acacttttat acacaactta ttattacact gaattacaaa catttaattt cgaagaatta taggtaagga cgicgctgaa actttagggt cgaagatagg ccgatcacc aaattagtgc ggattataca gtttaatctt cgacgcttct aacccaaacg ccgggtaact ccggaagttt cgacccaatg caagcactga gactaatgtt aaagatgacg tcattgatct gaaagaaaat caatcaatca aagagtagcc catacacaaa attcagggat actaattcag aagcgaaaaa aaattgaaat aaatgcgttt ccccaaaaag aacttcat cgg acaaaagcta cactacagcg gaagaataaa attaaatcaa aaagatttct gaaattttaa gtaaaaataa aatgcatgta taccaattta cctagtagca tggtcaattg aaaactgcta aatatacgga caggatacag aacgaaatat gtggcacttc cagatttggt atgatcaatg gcactatgcc caggattCgt cccgcgtcaa ggcagtcaac aaatgaagaa WO 00/32825 PCT/I B99/02040 4761 4831 4901 4971 5041 5111 5181 5251 5321 5391 5461 5531 5601 5671 5741 5811 5881 5951 6021 6092.
6161 6231 6301 6371 6441 6511 6581 6651 6721 6791 6861 6931 7001 7071 7141 7211 7281 7351 7421 7491 7561 7631 7701 7771 7841 7911 7981 8051 8121 8191 8261 8331 8401 8471 8541 8611 8681 8751 8821 8891 8961 9031 9101 9171 9311 9381 9451 9521 9591 9661 9731 9801 9871 9941 10011 gataactttg taaaggctgt aaaattggaa gaggacgaag agctttaacg atatgaataa aatgcgtgac atgaaatggt accgattcaa agagaatgaa gaaaaattac aagatagcaa agttagcaat aaaaaactta acggcgaagt gatgactgaa aatgcaaaca aagataacaa gcagattaag gaaaacggaa acatcatgga tatttatcgt attaacaaag taaaaaaagc cgaaaaaact tagcaaatca taaagataaa aaagaaaata aagacaagga ggtttgaatt tgttccactt gttataaacg aggaggtatg agtaactgat atcaacgatt agagtggcaa attcaagttg gagcaagaag gccaatcaga tttagagggt aaaactaatc agttttataa taccttagta aattataagc aacaaattga cggtaaatca tatattcta acaaagatga agtcatcagt tatcaatatg agcaacatca aaattaacaa atggtaatta aaaattaatt caagaaacta atcaagaggc ttgcaaatgt cgagtttaac acaatacaaa cacaaccaat cttagtagat taggaattaa tcggaactat tgtacitaaa aaaagtttgg aataaagacg gaacagcaag caaatggtaa caatagaaat caatgatgat gaaagacaat gacggtactt ctagaaaacg gatatccgCt gcaaaaaaat attcgcaatg attattacaa acagaattgg gttgcgagag agttaataga cgagtaagtt gttaagcgaa cggaaagcct cacgcagacc cactacgaca aacatgtgtt cgtttgatga taaatatcaa aggagagaaa aatgaataag aatagggtta aacgaagcaa gatggcaaaa cttggatttt ctataaaaag gacatttggg atttgaccgt acaaaatggt caaaatggcc cgacgatgag actacacaga gactaacaaa tttagaaatt atacaaaacc cagttcgaaa tagtaaaagt ctgtattaaa gaactggaat taactctaaa aaagaaacta cgatgagcaa aacacaagca accaaagtta gaaatgtgga gatggatata tcaacagtaa aagtatctat tgagccggta tgt cagacaa agaaaaatag tgaattacgg tcaatttgaa aaagataaga gttaaagatt gtcggaaaga tagatcatca ctataactaa actttacaac d-tnaagttat t-aaatcaRtca tttaataact tgattgatga agtttgttga ggagtttgtc gctcatggat tacaaaatag qgtaagacag gttttgcatt gtctcgaaac aactggcaca gataaaagaa atcaggaact aaattaggca tcgatatttc attcagacag gcaacaagtt acgtgtagca gtagaaaaga ctactttcac aactgaatcg aatcaggcgg aatagaagca agaattgaca gaagaagata taaaactaaa ggacgagaac 
taaagtgtct aaatcgtttc aatggattga aaaagtgaat ctagacagtg tagcaaataa agaatattta aatacacctt tcattaaaca cattaagaaa gctagcattg aagatgtcga atttgcgata ctagatataa attttagtgc tgttgtagtg aagaaaccga aattacaaga tgagggaagt caatctgacg tacttaacgc caaacgtttc aaaaggacgg attcatgaaa gacagagaaa tcgtcccacc attgaactta aataaattta atgaagtagt acagacacct gatttaccgt attccgtcgt aaaagcagaa tgtagagata aaattatgaa actgattata gataaagcgt tggcacatta agcactgtgt ttgcatgact ttactaatag tagtattgca taattcttat agtttagaaa attcaatcaa gacaaat tgg catagagaga ctttaaaagc cgctaccgat aaagaaggcg ctgatgatgt ttagaaatta ttgatgtatt caacccgtac gacaacgaaa cgctagataa attgaaagca acatgtttac agaaatctat tccgatttca aacaaaa actcaaggat gatgagttat ggggattgga aaacatgatg tcagtattga taacgccgga tgataaaagt atttttatag tatcacgtga tggtgtcgag gatgcgagtt gatttacaag ctaactgggt agaaattgaa caaagccttg acggaaaagt act tgagcaa gatttcaatg tggagaaacc tattttaacg tacggcaagt atgaggacgg agtgattaaa attgaaacaa cacttaatga acattatcaa actatcaatc tgttagcaag tgaaccatca attaatccaa tataaaaatt ggtggggact acagatactt attccaacaa ccagatttaa acgaggaaca agagagaatt ttttaagaaa cgcataaacg taattgaata gacagccaga cgaggttttg taactgaaaa aacgtcatac caactaggtg tcggtactgg aacaacggta atgctgccta tccaaaagtt t tggggcgag tttcatcttg caacaatcac aatgacaata aatttattcq gtattaacga atgaaaatca ttttaggagc cacgattgtt gattatcaag cttttgacac aggcaagtat gtaacaatca cattgaacga ccacagagg taagtatttt aaacaaaatt tatgaattac gatgtattac aaaattggtg gtgatgaaga gacatgacgg atgtcataga tttaagcgaa aaataacaca gatgaacagg taataaacca gaacctaaaa atgagtcaac aatcaaaccc tctaggacgt ggtttaaatg tgctactggt gttgaacttg gtagaggttc cggacaataa tagaacttca ctggggcgaa aggttatgaa gaaatcagtc tcgtttatgt ttcatcatca tattatattg ggctacaatc tgaagcagtc ggcagaggta agacaacatc ataatgaaca cgtggataaa agttgatgag atgactatcc gatacaagta acaaattcat tattggctaa ccagaatggc aaaaacaatt aacaaaattt attgcatgta ttatgaaaca ttaaacaaac cacgatgcaa gaggacaaaa cagacgacgt ctcaaagtca agaacagtta gaacacgaaa tactgcaaag aaaacaacaa tttcagataa agaaagtgct catagcacaa 
atggaaaaag ttaaaaaagt taggtacgta aagtcaaaac ggggattatc ccgcctaacc taccagcaat ccgctacaca ccaatggaaa cttcatgaat aagttggcag caataatcgc tacgctactt gaacgaaaag tttaaaacct ttaaaagcaa ctaaagataa ttggatacgg attctttgaa attcsataact qaqttCaaaC ttaaaaacaa ttactaacag acagcgatag ccctaagaag gccgtcgcaa ttaatcgtca ctgaacatag cacaaaatgg aacgtatgtt atcaacaatt tgacttaaca aagttaacga aaiatcacac cgcaagatgt attatcttca actgatggat cttaaagata atcgctaacq tctagacagg ataaaagacc tagcgatgct actttaccgt gtacagaaaa cttattgaaa gcggaagaaa gaaaaaacga aaaacgaaga agataacgag agataaaaaa caaaattata tatcagtata caaatcaagg ctatatattg cgacggatgc acaaactaat taacttagtc agaaacattc gaacaagacg ctggatgcaa tcacaactaa aatggaaaga taaagaagta itacagaata caaaaagaac aaaggttcaa aagtcattca acaactatgt aaaagtaact cggcacattg attgacgcaa acggtaaagg tgggagaata taaaagacat cagcaaacaa taaaactacg tttttaacaa acagaagatg gggcagttgt aaattattga acaactaaga acgtgatatc actatggacg tgtgctacac gcattgtaag ctataagcgg acacgagggc gatagaggca caagaccaaa gaagaacatg agcaagacgg agacaaagat aagacactca tgttgtacaa gcaattagaa ctggtagaac acaatacatt tggagaattt acagtaaaag tttgaaaaca acgaaggtaa aaaaacaata tatcgagtta agatcaatta attaacaaaa tttgtaagac tctcatatgt ctgatgagat gaaacaaaaa attcgctaat gctaatggtc caatacatta caagatacca aacaaagtca cattgattta aaaactatct atagaacaac ccagtagaat caactagaaa tgcgtgactg ttcaatgaaa aatacctatg agtgtagaaa aaccgcaact gtgtaatatg tgaacagaaa caagatgaat gcacgcaatt ggtgttaagt aggctcaata aaatgttgaa ttaccgaaat tagctgaatt acaactcaaa acataaatac tccattttgg agcgagagaa ggtaactaca acaaggctgg tagtggcacg accatcggga tgacccgacc aataccatag tttaagtata ttagtaccaa ttaaatcatt taagcaagat aggtctgaat tacttactaa gaaaacaaat tgaaacctcg aattgagtga tgactaatgc tacaacatcg attttgataa aaccaactgt aaaagctgta catgcgtaag gcacctaaaa atgcagaatg accccgaata aatttggggg cgataacgaa aaacaaccgg acgtactaga ttttcaatta tgtaatggac agagttttta gatgcagata cgttatcaac aagaattatt aacaacctac gaaccaaaat aaaagaagac ggaaccaaga caaattaaga cgggttataa tcgcagcgcg tccctcagtg atacaaaaca tctttcttta 
actggtattg agttaacaaa_ atgcgatgga taaaatcatg gcgagcgcaa gcaatgaggc actgatgcga aagttgatag agacaggcgc aatcatcgta aatgctatcg gacatgaaag gatgattatt ataaccgtga WO 00/32825 PCT/I B99/02040 10081 10151 10221 10291 10361 10431 10501 10571 10641 10711 10781 10851 10921 10991 11061 11131 11201 11271 11341 11411 11481 11551 11621 11691 11761 11831 11901 11971 12041 12111 12181 12251 12321 12391 12461 12531 12601 12671 12741 12811 12881 12951 13021 13091 13161 13231 13301 13371 13441 13511 13581 13651 13721 13791 13861 13-931 14001 14071 14141 14211 14281 14351 14421 14491 14561 14631 14701 14771 14841 14911 14981 15051 15121 15191 15261 15331 cgaagatgac ggaataattg cttattgaaa ggttgggcgg agaaaatcat atattactta gtttattcgc ttaaaggcgc gtggcacaat gtgcgaatat aaatttgaat aggaagggaa gataticaga gaatggatgg caacaacaag atgatgtgga tgacaacata aatttaaggg gatcaactat agatgctcta attaaaaaag taccgaacat ggtggctggg tgttatccaa actacgtaag gtcacttata t atgaacaaa tttatcgaac tgtctagagc tgacttgtgg tgggattaga tgtgattgaa cctaattgtt ctctaagaga tttcacggac tatatataaa ataatggcaa ctgaacaagt tttttcaact acagaggaag tatatgaaaa taaaactatg gtatcaagtt ggaaaaagcc gtcgaaaaag tggagctgaa gtttttaaag gaaaacgtaa ttactgtttt ggtaaaagca ttttagggca gtgcgtataa gtttgatgaa gaagcagtag gtaggagata gagatacaca tctaattaag aataaaaatt ttgaaatttt ttggattaat tttgaat t tt atttaaacga tgatattgat taacagtaga tttaaatgac aattggaaga gtttgcaatt gat tgaaaat gaacaatttg cttactatac tggaacagca gtgacgcaat gagataatca agttaagata agtagctagt ggcgataata agtatcactg aatttgagta tcgatgtatg tcaatagatt gaatgaaatc taccgagaag c cacaggagc c cacgggct t gagtaaatac taccaatatt tacaacctaa actggttgaa tatcagtata tatatgagga catatataaa t aaagaaaa& acaataagac taatcatggc tagtggatat ggtgttaatg taaatgtata gaaaatgaat cgttgttaga aatcatggag aagaagccac accaaatgtt acgcaagaca aggttacggc accgttaaag gagggttaac agactgtgaa aactataaga aaaaggagtg tatctgtatc attataaaac gcaacatgat agattaaaag tgaaagtaaa gatgggcatg tcactgaaga tgattcaatc acattaattt ctttgcatct acgaacatgg tggaggcaat aatgttgact 
acgaacgtga cgaagagctg aaaattagag ctgcagatat gcaaatctat ttacaagaag aagagacaaa gttggacaat aactggaaga atggcacgga ccatttcagc ctatatgagc aatgctaaga tagaaagtaa attcgggaaa gttatagacg gagatgtgaa cttagtgaaa cgcaacaatt gatacgctgg atgcatatat tagagataac aaaaacggaa taagtgaact gaggtggaat tgataaaaaa agccttacat cgagagagca atttgtttaa caagaaatgg atgttaaaca acagtatcca aatggtcatg gatggcaacg aaatatacag caagcatatg atgaccatga aagacaacga gatgtttaaa ttggaatatg aaaaaagaag gtgtttcaat ggttttatac tactgagttt agagagttga ggaaagatgg aaatgtgatt tattaaaact ataatgaiac ctatcactga tgagtacaag gagaacatgt aactacaccc tatactgtcg tgttgaatgt aacatagcca aaaacaaaga cggcgaaacc actcagaggc ttttcacatg aatataatgc aattcaaaag gcaaagcgac ccgattgtag caaatgtata tatcgagact aacgagttat cgcctttcga tgattacgac agagttgaaa aaacacacat taaggagtgt taaaaaatgc cgaaagaaaa agatattaag gtcatcaagt acaaagacaa cgtaaatgaa gacgaaaaga aaattatgac tgatagrgac ctaaaacgat aagagctagg attgcaagca acgatatttg atattiagag aagttgagta caaaggaatt gratttgata gcaaagtaga tatgaatggc actaactatg atcgtatcga aatacaaccg caaagaccga ttacgtatat agccgatttc tctttgtgga ttaaaggtaa ggcgactgaa gttgccaaca tcaaagcgaa tttaacgtgg atatgtaaag cgcctaaata cacaggtcaa gtcagacgta aaagaaaaag agaaatgaag tgatctaatg gatataagaa tacctacaga agttgaatat cagcattacg caaagcgctt agatgacaat ccggacgaat tactaaagta agaggtggaa taaatgaagt tgaacgaagt attcgcaact gtaagtgtcc aagatttgca caatgaaact ggcgtatcaa aagctgagat ggttaactta aatgtattag ataaattggc atttactaga aatcacaaca cgcacaaatt agaggattgg aaatgagtat cgtaaagatt aacggtaaac catataaatt gaacggttta actccaggaa tggttgcaaa aagagtacga gcaccitatg gtatgcgctt agctgagtat aaagaaattg aagagcgtga aatggttagg caacgacgta aagaggctga tgtgcctcaa aaacattctc gtgatccgta ctggttcgat agtgaagcat aatgagcata atcagtaaca gaaaagtaga accggcgcat tacacatacg gcaacattga aattatagat cctcaactag cattcgcaat aggtaatgca atcaaatact aggatttagc aaaggcgaag ttttacgtcc aaagagcttt caaaaacaag ttgattacgt aatgtcatta caggaacaat acgaacaagt taaagctatg agtcataaag aagttagcaa ggatgaagag ctatataacg aatgcatgtc 
gtttggtctg acgatagcgc acgcaaagaa tacttaaacc aatttttcag gcgagtggca catatccatg tagtaaatgg cacttattac ggcgtgaaaa agacatttga tactgctgaa gagctcgaaa aggaacagaa geaaccaact ttattttaga ggagatggaa atgacgctac tcgaactggt ggaatgggca tggaacaatc cagatagaat gggcacgctt ggagaatgta gcgaagtaea aaaagtagta acagataaag atatttttac tgtagaaatc gattgtctag tagaactaaa cgatattgaa ggttttgaaa tagacggtac ttccagagcg ttttatatac taaacgaaga ggagttggta gtatgatgca aacctataaa gtatgtcttt ataaattaaa gaaacattat ttcgtgaaaa gtacgaatga gattcgtaaa aagctcccgt tcgaaactgc aagcatagaa aaccaacaag agaagaatta attaatttca tgaaaaaaca tgagcaaagt gcaataagac actttagagc tcaatcaaaa aagcaacgag atgagcttat cgaggatata gctaagttaa ggcgcacagt caaaaatgaa ttgcttggaa gatacgaaca tgagagcaaa gcgaacagga taggagctct ctatatagga cgaatggaag aactagacgg aacaaatgag tctacgaat aataaccgtg aacaaataga acaatcagtg atcagtacta tactaaaaga gattgaggac gtgtataaga aagcgcaagc tgctattcaa cattcagtta aagaaggtat tgaacttgat gtctataaat atgaggagga gcaggaaaat gagtattagt aacgaaagtc tagagattgt gcaattggtc ggagatatta cagttattag cattatagat tttattacta aaccaattta gaaacgatta aaaaatgtgg tgccgcaccc agttatcaaa aaagataatg tgaaagaggt tcaaaaagaa ttaggtttct ctggattttt atcatttcaa aggataccta tttacattat tagatattac tttgctaacg agcatgagat tgaaagatat gaaatcatcg accaacgtga tgcattgcta gaagaaaagt attattggtt gaataaacgc aagtcagaaa atgaacagat ggaattaaaa cgataggaga taacgaataa atgaataatt aaatacaaaa ggagttcgac gatagaatac cgactagaaa tgaatttgcg gagtgggtta acacacttga gttttttaaa gatacacaat tagatgagat tgctgattac ttagctttca aagaagattt ggaagagact actgaggtta tggttgattt acattcagtt tattttgttc atgtaatgca tacactaaca attgtacaag ttttaataat gccttttttg tacgccaata catacaaaaa gaaaatgaaa aggaaccacg aaagacaaga gtaaagacat cttagatcga gtcaaggagg ttttggggaa gattcaacag gacaaccaca tgaacatttt actgctgcta cggagagtaa agaaggagcg aaagagaagt acgagaaaca agaaaacggt aacgattgat gtagatgaaa acttattagt atatgaatat gacagtgagt taatgtcagc tgatgaagat gacgcattaa aacaagctat acaaattatc gataaattaa aatggaggca gacacaaatg cggtaatgac acagaggggt 
atacttgagg gaatgacaaa gggttatggc aggtcaagtt aagtatataa ccatgaaaca ttataaactg tctgatgatt ggggacgagt gagtggaatg taaagtcggt atacgtaaca taatgaagaa gtgttattaa cctaaatctc ataagacgcc tggaggacga gtaaatgctt cgactggtgg tacgagctag agagtgctta aatttattga tcaattaaaa gaacttttac acagtagcta gtatgattat aacaaccagg taagccatta aactctgact gaagttactt taaaaggtat aatcgaccaa gacgcaggaa acttagtcac gacgtttaca aggagagatg aatgaaatat gagatatcga attgttgatg tacctaaact tgataatagt ctcattgacg aaggatacgt aacattcaaa gttgttgagg gagatgccaa cagaactatt aaaaaaaaga 15401 15471
15541 15611 15681 15751 15821 15891 15961 16031 16101 16171 16241 16311 16381 16451 16521 16591 16661 16731 16801 16871 16941 17011 17081 17151 17221 17291 17361 17431 17501 17571 17641 17711 17781 17851 17921 17991 18061 18131 18201 18271 18341 18411 18481 18551 18621 18691 18761 18831 18901 18971 19041 19111 19181 19251 19321 19391 19461 19531 19601 19671 19741 19811 19881 19951 20021 20091 20161 20231 20301 20371 20441 20511 20581 20651 catgtcgagg aagaaatata ataaagacag aatgatgcct cgatatcttt atctatacag gacaagataa at tcaaaaag acgattggtt aatgattaaa aaagtatata agtctgattt cacacataaa tatggcatac ggcttgagag gagattggga aagaagatgc tttatctact gaattaattt tcatcgatgt gcctatatgg acgtagctcc tgtgatacag ggcaagagcc ataaagaagc aggcagacga cgattaacat acctaaaatg aaaataaaac acctgaagaa aaagctaaga tgaattttac agtcctatga tggctaatat agtttaattg atactggaga tggcaatgat attgctggaa tactgcgatt attcggcgta rggctagtta ccaaaacaaa gaagtatatc agaagacaag ttctatattg tgttagacaa aaatttgata gcgcagacat acaagctagg acagaataca ctttttaaat ttatatccgg caaatattaa gactattatt cttactagcg ttatgacgac ggctaatgat gatgtagagg gatgagggcg gaggtgtcag agtagatgta atgaagtgta atgtattagc tgatgtaata aagcaacgtt gccgaaacca caaggggaaa agcaaataaa aggtatgctc agatgttaaa cacgttgact tttgcttctt agagttattg ctaactctaa attaaataga aacaacttct acagtgacga aaatgacaaa aatgacagaa tatataattg atttgtaaga attatcttaa gattgagaag tgacaaacat acaaaagatg tgggtagttt aattcttgca ttttgagtca tgcacaagat gtaggggcat tttttatatt tgaaaacaat tgattaaatt aacaccgaag aacggaaagc atatattgac gcagggtatt gagtacactt tttaaaaatc ggaaggtttc aaattcccga aaaagtacac ttgaagatga tttcattaaa gaatgaacat gaattaaggg gattaaaaaa cttaaaaata atcgcactga tgcttgttgt aagggacaat tacagataaa caagcaagtc atcgaaaact ttaaaagtag gcgacaaagt tcttatacga atgtatgagc cgccgagtga t agcaaagag ccggaatatg actcaagtaa agaggttgag aagaaaggt t tagcgcgccg atgacgaaaa gacgtggggt atatgttacg taactatttt taaataacta caagaaaagt cgactaaagg cggaaggtac gaatatgagt atgcattcct aatagaaact gataaaatca acggacaaac caaagcgtac agcagacgaa aacgtactaa 
tatttggcgt atcgtttggt tgtagtggaa attatattga tcttaatgat aacggtatgt cagatgaact atctattgtg taggtaagta cttcgcaaag tcaattgtta atagcaattc agttgaagat tttataaatc ataacaggga tgatgagtta tgacactatt aatagccaca ctattaatca cctcctttca gagtaattaa ttgtgctagg taagagtggg gaaaaattgc ggctaaagaa cgctagttta gaaattaaga aacaacttca aaattgaaat aggtaaaaca gaaggaattg tagagttaac tcaagtggtt gcatatatca ctggtgcaag aaatacagat agatataaag gggtgtctgg ggacgacata ttaataggac tgaaataatc gaaattaatg ctatgttaag tggattggtt eatatcgcct tataacaaga ctgacttact agaagttaaa gtagataaat tgtaactgag ttgagcgatc atatgatagg aattgcacag gttgttgtga aat cgcaaca tgcgattatc gcagaaaaga tttaaactgt ttagatgttc cctactacct catttattga cgtaaaggcg actcatagag gaatatctag gtcaagaagt tgtagctaag gatggtatga tgcttgagaa cgacgcaata attgaagcaa atattaaaca agacacctca taacaaatat cagacaggtc atgaagcgtc aatattggta cctaaagcag agtctattaa tgacggtgta cctatcaaag gaggtataga agctgaacaa t caaaatgga aatgacattg aaatagaggg atagaatgac gttaggtaac gaagattgaa caaatagata acggaagtga tcgtcaatga cacgccaaaa ggcctttgaa agtgaagaaa gcgacagctg gaagttttag ctaaaaagaa aaggtgacag tggaacagaa ataaacttaa atctttatat atgacttatt tttacgtgag ttaatgcgtg tgcgaaagat atatatactt gctggtgcaa cattaggtac gattcaaaaa ggcattgagt ttaattttga taaatataat tcattcatgt acagtaaagt aagtggtata ggagctatac gtggtatgac gttagcgcat gaagaggtgt ttgacgagat taagtcacgt gataccaacc ctgaccatcc cgagcattgg ttgttgaaag gtatactgag tcaccaattt aagctcgatg acaataactt ggcttcaaca ccatcaggta tgttctatga acgtaatatc gtatatgccg actttgattt gaatgagaat acgattaaag aatactttgc tggtgtcgac tggggttacg agcactatgg tggtaacttt tattttattg aggagcacgc acaccaattt aaagatattg taagtagata tggcaatatt aatttttact aatttagaag acatagatta cgtgcaatta acgctgataa taagttgttc aaacaaaaca agttacttgt tctttatgat aaatatgttt ggcaccctac aaacggagag cctataaaag atgccatata cacacatact aaacctgaac gattaaggag gatattgaag cacaaggaat attgcctaag catattgagg agagaatggt taatctctat aatagataca agacacatat aattgaagaa aaagaagatt ttgaaactgg tggaaatgta cttaacaact cttttgacag cgaaattgtt gatacacgtg 
atgatttaga tgaaaacgca gaaaaaaacg aaaagttgaa tagtgttgat gatgaggatt ctgaaatagg taaaatggea tatattgata cgaatggtga tattaggatt aagaatatag atattttaga acctacatac tcattgcgct acttttatga gtacgcagag ttttacgata atgcttatta ttatgtattt gttggacgat atgaacattt atttgattac aatccattgt gagatgctga aaaggttatt cacttaattg acgcatatga tagtcagaca cgtttagcat accttgtgtt acgcggtatg caaaagagtg gcgcatttga gttgttcgac aaagatatgg acacaatgat tgagaaccat ttagatcgaa tcgaaaagaa taattctgac gagtttaacg gaaatgtacc tatcattgga aagtgtatga cgtttgagcg taagatgaca gctatgttga taaagcgtaa agggtacaac ttggatgatg atagttattt tccagttaat aagttagaag aaLC~at ~~ata ttaggacaat cacaactagt tgatgatgtt gattacgaat ttaatgacaa attacctgac atagatgaag gtgacgcaaa atattgatga gtatatcgag ggtttaatct ctaaagcaga acttaaagag ataaaacaaa tcatcgcaga tatgtttgag tggactgaat tcaataaata caacaggctc aataaggagt actataggca agtagctaag atgattcaga agtcacaaga cctttattia tatgaaatgg cgagtcaaac atctatgcag tcagctattg aacaacctat tgagttcatt cgtttaatgc tgaaaaagat acgtatgcac attacacaag gtattatgag aatacgtgat gatgtcggca tgtctaaagc tcaatcattg atgtcacaag ctggacttga tagcgcaatg gttgctaaag aagtttattg atgattgggt ggttattgca gcgatactgc acgacctgaa tacatcactg aagtaaacta tcgggtgtgg aggaagttgc aatatggata ggtttaagca agaggtattt aatttgatga cgtgttggac tcgttaagat ggggaaatga cattgtataa gttaatagat ctctaataga gtcacataaa gacgatagag aggcgattag ttggttattt aaagtttata gcaatttgcg atccctataa aaaagatgat cgaggagaag ttggtgtacc tttaacaatg ggtatgagtg acgttaaata tatcatgcgt atgaaactta ggtatcaatt aaacctgata tagacgaaat tgacaaatcc aaaaccaata aaatatcaaa taactcgtat agatgcttat tttgatgtte caacactaca tggagagggt cgtgtggctc ataacggttt acgtgtctgt taataacaaa acatggtgtt cctgttactt accaactttg ccattagaaa gatatggtgc taggttagca tgttattttt gttggcgaca gataatggca ctgattatgt gtattgacgc tttgcaagaa taacaacaaa gagatgatag agcgatgcat caagtgagat aagaaatgat tcaagaaaca cttaacaaaa gatgtaaatg tttgcaaagt cagtaaactt aacttatggc tttagagaac caaagttatt ttatctgcat tttaagttca ctcgtaacat nagt ttcaca acqaacaagqq ggaaaaagaa agtcttgaat caaaataacc 
aatcagaatg gaacaactat ttgctaatcg atgatgatgt gtatgtt-aca aggtacaatg ttgacttatg_ atagaaaaat tccttatgag cgagtaaaga ggtaatcaaa aaaacatcgt gatgaagtat tactctaaga tagctaaagc gtacagaagc aggcagagca gaatatgaag aaacgttggc WO 00/32825 PCT/I B99/02040 20721 20791 20861 20931 21001 21071 21141 21211 2 1281 21351 21421 21491 21561 21631 21701 21771 21841 21911 21981 22051 22121 22191 22261 22331 22401 22471 22541 22611 22681 22751 22821 22891 22961 23031 23101 23171 23241 23311 23381 23451 23521 23591 23661 23731 23801 23871 23941 24011 24081 24151 24221 24291 24361 24431 24501 24571 24641 24711 24781 24851 24921 24991 25061 25131 25271 25341 25411 25481 25551 25621 25691 25761 25831 25901 25971 atgctactaa gaattttaaa aatattaatt gtaaagacga aggtggtaat acgcattaaa aaaaatgaag cactttttta cggactgtta tttgaagaac acgt taaagg gaaaggatta cctgagcaat caaaacgtga tgatagattt aagtatgttc aagatttaga tgaggtaata cgttattcca gctaaaaatg gggtatcaga aattggtgta aaacctctaa agatacacga acacgtgata ctcatcgtca tcaagtgggt gtgttgggca ggcgcccaag gtcgttgcaa attactttat tatattgatg tggtaaaaat gaagttatcc CattcatgaC tgatatggat tttaaaataa aagtaaatgt tccttgtacg aagagataat agagttacaa ctgatttaga tatggttaaa acatctatra gttgtctctt tgctactcga ccttagcatg gggtacgcga agggcaaaaa ggagttttga acaaagacga taaagaagta aaagattatc ctttttagat acagaagaag gtaaacgatt gaatcatgga aagagaaaaa tcttgaggat cagaagaaca aaaacgtatt agtgctcttg gaagttaaga agtaacgcgc taggtaaagc ttaggcgatt ctgatgaaga tactgagcaa aaaaaggcgt tgagtctaaa tttaaatcga cccttcaaat gtaaagtcca ttgaagaaat aaatatggca actccaacat acacgccagg gcagaacaag gtactttaat catgaaagac agccaatgac agcacaaaag aaaaaattta aacggaacgt attcaaactt ctaagcctga attattccgt tatcaaaaga gtttcttaaa ttgcagaggc attttacaaa gcgtttgacc acaacacttc aactagtggt taataattta tacgtagacc gtattaacta cacgttcatt atgctaacgg gaacgagatt atcgttagca ctaatgggtg tctgaagatg ccacgttaac gtgatatgtt cgctttacgt gcttaaacca actgaatagg atatgactat tactgttaca atcacgtcgt actacgcr tcaagaaaaa tggagtggaa agaagatgc t agggatataa aaaaagtatg tcgcagatgt gtatggggac 
agtgtcgtac taaacgagca aagtttcatc aattccctca cactatttct tgtaagcgat aaaacaatta atgtcacaag aatatgacag agtatgaggg tagaatcttt actacgactt aagcaggtgc taagttcgat aagaaaatag actgctgtag cattagctcc gtgggttatc cagtgttata tgctactggt cctggtggta tacaccacat atggtcaagc agcagtactt ttcatagagg taaaagatta atctcagacc gacgctgttt acccatatat gagaaacagt cggtattgt c aagcgcgata ggttatgtgc gatagtcaag cagtattccc acagacataa aaagaaaaac agctgaaact gacttagatc gaaaatgatt tagctgaaat ttaaattaac aattggtaat tggacagttg cgtatatggc tatgttgttc cagaatcatt ttaaatggaa tacagcagaa tacagttgaa tacgaaaaat tctgattcac acacggaaga tataaaaaaa ttgaaaagag gaagatagcg aagatgggag ctagaaacaa agcgatttta acaattagaa aaagcaattg ttggacaaac ttaacaatag caccgaatat ggccaaaagc caaggaaat c atgggcgcag gatatatccc tgaacatgaa ggataggtac ctagatcaaa aagaggctaa ctagtatgat ctcgtgtaca aaaagctaga attccttgaa tcgaaaggag aaaccgcttg t t tcggcat t cagaagtaaa atgggattac attgggatta gacgttacaa gcgacgatgc aggagatatg aagaaggcat ataagagcga ttgaaggcgc aatggct act atgcgtaatg cactatctta cgcacgttac gcatcagatg atattgcata atggctaatc ttgactctta gtgataaaaa tttagatggg ctatttattg aaaatgaatt ttatcgtgag tgatactggc aacgaaaaag gcgaagaaaa tcgttaaact tatatgaata taaagggact cattcaacct ctaatcgaac aacaagagtt gcaggaacta aacttaaaag gtggaagaga ggcgaaagaa caatgttatt attatggcta ct tacttagc atatgcgcaa tggactgcaa aagctgttat agaagagaaa attgaagatg ctttagatgc tactggagcg ggtatcttac cttctggcca catgaacgtt ctgcagaaga t tacagt ctt tgactcttta aatatttaag gatggtcaag ctgaagttaa tgattacatt aggtgttgtt aggagagtat actacatctg acttgccaat agatcagggc cggtgctgat attgctaaaa aagaaagtat aatatacgtt :ggagtttta :tgcaattga agt tgagcct gatgataggg :cactaacaa cgctacacaa *gttaaaagtg atgaagaaga *caaacaatga ttttatgaaa cctagagtat tatcaacgac *acttataacg atggtgtccc cgtttaaacc aataaggtag attggaagta tcaaaaaagt aaggatttat ggatacgcct aaacctatat gtaccttatg agtattgaag gtgattctgt catatggcaa aagttaagta aagagtgggt taaaaaaggt tgttgactta ggttttttag agtgtcggcg cagattatgc gtcgtgctac aaagattccg gccacagcca ttttggaacc tggttaaata 
tgtgggtatc ctaacattaa caaactagt tgttgtgggt gaatcaaacg atacatgtgt attcacagtt ttaacagacc tatagaaata tgatatagac aggtttacta gaaggagtgt attaaatggc cagtagaatc tttattatta agtacgaggc ggt aaaacgg gtgcctggag ataaaggaat tttatgagcg taataaacgt tgaaatgtca tttgatgatg ggtgctgaag ataacttgcc tcggcgaaaa agtcggaaca ccattctatg taaactaata gaatcagtgg aaatagatca gtgtaaacag tgcgaaagag gccaactgta atgagagcac Itgggagaaat ataagcgaaa gaageratag aaaagttaga ttgttgtaaa cgtaacagtt tgccaaaaat aatgatttca gctttttatt atgcactttt tcgaagaagt taagtctttt taagacggtg tctgttgatg gaattagatc gttatcattc aagaagtacg gaagcgtaat agaaaaacgc gacgcagagg aatttaccaa catccttagt ctttaaaaga aacctttgac tgttaaagaa tcacgaaatc atcaatatta gaaaataaag ttatcggatc ttaaaaacgg attcagcaat tatgaaatta aaaaggtgta ggcgcctact gcagaaatgg aagctaagaa aagatttctt taatgaggtt ctttggtact aaatcacct ggtaacgttg itacagatac aagagt taga t ccaaacgga taatgacaga ccattatttg gatgtatacg acaaaaagaa aaggtattga gtatgcaatt accagtatca ttatttgaac aaaccagaag cgttcgcaac gattaaggta aaaaaagaca gtcggttaca aagaggttaa tgaagatgtt aaacttttac atggaagttg acggaatact tcatttatcc ttactcaatc aaagaattta aagtcaagaa agtggagtat taaacaggta tgtgtttaac ccatacgacg ccaattatac aagagcgctt aacaactaaa atttcatcaa atctaaaaac aatttatttg ggacaacatg aaattaagtt agcatggttg ttgaattgga caacgacgaa gatttacaac tgactttaaa tatttcgatg gaatacggta ctggtatata aaggtgatga cggcgaatgg cgcaggacgc aagacattcg gaacttacaa atcaaatata tttttgacgt tgttcaagat cgaatctagc gcaacaatga tacgaggcta agctcatttt agtttcaatt tagccgtatc gatacggctt ttatttaagt tatttagcag ttgtacgtcc aagaaggtgg acatacgat t caatgcaatg tcagaatcat aaacacgctg tacaaacagg aacatcacgg aatgttggt aatcgaacta tcattaaaag tttgaagctg caggtgcgcc aaaagaaagc tagtgttgta gcgtaagctc cctatttttt acaacattaa aaattaatga gaaaagctoa aaaattctca ctttaacggt ttgctagaat aaaaacccac caactcgaga tgccgttatt acaaggggct ctggatgaca ttgaacaaag gaaatgatga aagagaatta actgacagct agatatttag gattggctta ttggtggtca acggcttagt acaagcttct aagagaacct ggtagctatg ttcaaagaag gtacaagaaa agattatggc caatattaga 
gataattacg agcatggcac gcaaaaaaac gctgacttac actattctcc tgaagcagtg gcagacggta aaagtgacaa gaaagagtgg ttcgagaac gatcaagggg gtatatattt tgactgaatt tasascaa aqtaratt aaaaggagca atgccaggat caattttggg aatgtgctac atgatttcat cactgaaaac tggttttttc aagagggaga gaggacaaag aaatgacgaa aaccttacac gattactcaa ttgttagcac taacacctgc gacaattatt aattgaacaa tcgtgacatt gagaaacaac t tagaagaag aaaaaagaag gttagccttt ggatactcat taatccaatt or~atttoacc tcaatgttat tgcttattta gaggatactt gtcgctcgta agcaggcata aaataaggca tgaatggcgt gcgcaagcta gttacgaaat acgtgaactc tttatggcaa WO 00/32825 PCT/I B99/02040 26041 26111 26181 26251 26321 26391 26461 26531 26601 26671 26741 26811 26881 26951 27021 27091 27161 27231 27301 27371 27441 27511 27581 27651 27721 27791 27861 27931 28001 28071 28141 28211 28281 28351 28421 28491 28561 28631 28701 28771 28841 28911 28981 29051 29121 29191 29261 29331 29401 29471 29541 29611 29681 29751 29821 29891 29961 30031 30101 30171 30241 30311 30381 30451 30591 30661 30731 30801 30871 30941 31011 31081 31151 31221 31291 gatttccaaa atgtaaaagc agagcattct gtagagcgat caactaaagc aatggcggt t agatggaaat gaagaattgg attgaaagat caacaggtca caataatggc tgtcgcaggt acattggcag atattgttaa gatgtctcaa aattgggtta tcggagattt gttgttcaaa gctggtcaga ttggtaatgt ttggttggtt gactttatca tagt tgctt t atttatcgct ggtgtatttt gcttattcag agcattaata attggtgtcc gtgttaaaac ccaatctacc gttttggtaa cgttccaagc tcagttgctt acgatttggc cactttctat gaacactgtt tttattatca tagctgatgg aagtttcttc tctgcgcaca taagtggaca agactttggc acttctatag atgtaacgat cggaagtttc gttcacagta aggcgcagga gaagaactta agtcagacgt taagtatgag caactttttg cgtttgaaac ccctgaaaaa tacaacacta ttgttgaaat aggaaatatc acctatagag ggaategatt agtagcctat aagtaaaaag ataggtgcta ttgtcatact tgatgacctt actgtcttta cgatgcaaag ggtgtgaatg gtgattttga attgataggt gcaatggagt atttcccaga cttagtgaga ttaggagact taaaatctac tataaataat gttacaactg gcaacgtaag agatatttca gttaaattat ttaaacaaca tactgtcgaa aaagggttta taggaaataa t cagattatg tacggcgaga aaggcttaac agtacttaat cttggagttg taacaaaaga 
agagaatcaa ttaaaaccat aacattccga attgaacgct tttgtgtctc atgctattaa gttcgctggt aaattaactt gttatgttca tggtactgca aaactattcg gggctttaat cgtcactgaa ggtgcattcg tcgtttattt ggcagtttct ttacaaccga taggcatcat gaaagctcaa agattccaaa tcatgaaaac taaagtagat gcttggagaa aagaagattt attcacaaaa agaaaagaag aaatggacgc gattgctagt gcggttggtg ttggctttgg agttcaaaac gcaagtatct tcttatccga aacagctaag gcaggacgat aaggactaca agcgt ttat t attggtaatt ctcaatttag agagaatggt atggctccta aaacacaccc ggctccgatt aagattttag gttcgatttc atggaaaaca ggtgcgattc taatgcctat tacaaacgtt cgattagcaa gagctttaca agatgagtat ttgaaactaa aacatgttgt aatagatctt gaattcaatg taaatggaac cttagctact atacaagcat tattaggtgg tgcaatggct tttagagatg agacgtctgt acgcgctaaa aaagcgaatt gtaacactga taagttggat agcaggcata aagtcgaagg aagtgatatt aaaatccgaa tgataccagt tggcgtttta attagcgctc cgagcgat ca cataggaaca gtgatatccg actggcgact tctcaggtgc aatacatgca atcagtttgg gtttggtaca agttggtcac acaagttggt tcagtcgagt caaaaggttc tgaatgggtt gtttaaaaga gttgtctcaa agtgatttct taaatgccgg aagtagtcag cgcggtaggc cggtggaggt agtagcttag agtgccttta ataaagagct acagacacat gactagcgat tagaaatgag ggcgaccttg aacttattat aagggaggtt tcgcgtcagt gacaatcctt tatcatcgta actattctga aaaaagtaga gcttaagata ccaagcacta tttgctggac catatattag atataccaaa taggactagt aagtgaagtt aaccgaacta ccatactttg tggtcggtac ctgatagatt actcaggaga agtttattat agagttagct gaagatgtta tcagttatta aggaagttga gttatttaaa tatagattct caagtctaat aaagtaatga gccaatttta ttaaaaagtc ctaaatgaag atagttcttt taactaaaat gtggacgatc tgataagtct actattggcg aacaattcta ggatttacca aaggaacggg ttataagtat actaaaacgt ttcatttgta cegataacgc caaaatacaa tggacaacag acttttgcag aaaagagaag cgccaccgct tattgataaa gaaaagtgtc agctaaccct aaaataggtg atagtcgaaa tcactacaca ttacaaggcg taatcgttat aaaatccgac ccatctaaag gaattggtta agcagaatga ctaatggtac gatcatgtac ttaatgcgat gtcagcaggt agtatctatg ctagttgaag aaagcgtttg aagcattgaa ttggcgacgg attagttaac gaacatgtct atagctttcc gactacacta ccactaactt taatgattgc ttttgcacaa agcatggtca gaacaagtag cctactatta tgcagttaat 
tagctagtaa attgttagac agctatagca caagttgctg gttgctataa gtagtgtact acttcgttag aacatcaagt agcacctatt ttagcagttg aacgagaact ttagaaatac aaggtgtagt cggctggtta attgcaagta ttaggacaaa atgaatatca tacaaggttt tagcagtcca aatcatagta ttgggagact attaaaacta gagtcaatta tcggcttttt agatatggag tacaatcact ggcttcgagt gtagctgaaa tctaacattt ggaatacagt atgtaggtga cggtatgagt agcggaatta atcggcaaag gatgcgattt catcagcttg gtaaaggttt agcggtatca atcctctact ttgacagata gttcaacata gcttaaaaga atttaattaa atcacgcatt tgttagttga tagcgcacga tcacttataa tcacttggaa tatagagggt attgatggta aggtataaag tacctaaaat gtttttattt aagggaatta agacaaacaa gcatttgagc tcttttgaca caacacaaac aaagtgtcgg ttatagtact gcctacaaac gaaggtgata aacggtgatg ttcctttaac aagctaatga taaggatgga tttaaaagcc ggagataaaa tttaataaaa ctttagaaca aacaaattac atttagacac tacagggtgt agggcacgct ggatctaact attatcgaga actcatgttg aaggtgaaga aaaaaataag gcttgatatc agagtataac gaaagtttta gtattacatc caaaagtaga ttaaaaaagg aettoaacat tgatgaatta tctaagtttg gaagatgcat ctaaatgtta aagcgggact acaaattgaa tgttgatgga cgtattaaaa actgcttcta tttccttaga atgttgttag agtggtggat tagagatgcg tacaataata aacaaagcag ttcatgatgc aacr aaaagc attaaacgca aaaaataaac gctaaagtcg gactttacta gtcaatcaag accaaacgaa attgaaacag tcaatggctc aacgatggcg tagaacgcgc taaagctcaa attaatggcc aaatataatg ttagatgcaa accccgctaa gttttgatat tgattccagc agcagttaaa cgttctttcg tggggtaaac ttaacaacc ctttcggtac tatcttcgcg gattgccgga ttagracctg ggtttagttg gcgcattctc ttaaaatggt tgaagatgga gttaaaaact acatggcgtg atcagaggeg ttacaagtgc caaacgcacg cgagtttgag tagcataggt ggcgcaatct attttcactc aattaatgcc aaaattgggc taatagtgta acctaagatt ggtcagatat aacagttcca acatttttga gacaatcaca agggtttaaa cggtaatatc gtaaaagcat tttatcacta atctagctgg gcgttatggg tattttaggc tacaaatgtg tttggtttga ttagttactg gagctacgga ttgcagtaat tggtgcattc tattactgaa gcgtggaacg actgaattgt ggggcaaaat tattcatgca agttttaggt gtggacttta attacaattg ggtttgttca ctgctttaat cggttaccaa tgtgcttgat aactggcgta atgaatcgaa aattttgtta gcagtatttg aaatggggca agcactaaac tacaagtttc 
gcgagtaaag gatgcacttg gtaagattaa tagctgaggg tgtagccaaa ggactctgta acttcattcg caagcaaaag taattgctac gtatagtaaa tcctgtaagt aaataataga cctattgtga gatgacatga acgctataga tatagaagta ataaggaatg gtagttgaat ataacgttac gatttcataa ttacgctaaa tgcttatgct tcacatttaa gctacaccag acaattcaat ttgattatgt tgatggacga atcaggggaa ttttctttgt gatcttgaaa gtaataacga agaggcgtca aatgacattt acagtttaat cagtttaatg ttcactttct atacagataa taatcttcga cggtaaacat accggtttta tatccaggct aaattatatt ttagataagg attaatgtta gtacaaaggt acgcgagtac gtttgacgca tgatttcaac gaatatgtaa aaagctaggc aaaaagaact caggcgttga gttcttcaat tgcatctaaa ttcgagggat tatcatctcq aatatqaata ccaattatta cattaaagct tacctttatt aaaggttatg ttcactcatc cattagcaca aagaagatag tttaaaaaaa ctttgtagcg ttacgtgaac tctgccatag gatataacga tcactaagca agatgtagta tgcaaattat gttaaaagcg aaagttaacg caagtttatc ataagatgaa tactaaaaca tataagaaac atcaaatcaa WO 00/32825 PCT/I B99/02 040 31361 31431 31501 31571 31641 31711 31781 31851 31921 31991 32061 32131 32201 32271 32341 32411 32481 32551 32621 32691 32761 32831 32901 32971 33041 33111 33181 33251 33321 33391 33461 33531 33601 33671 33741 33811 33881 33951 34021 34091 34161 34231 34301 34371 34441 34511 34581 34651 34721 34791 34861 34931 35001 35071 35141 35211 35281 35351 35421 35491ttggaacgat gaaattgaaa gaagcggtag ctgatgatga gttttacggt atgacagcta tagggttaac tgacgcatat ttacacccta actaaaggag tgctcaacat catcaaaaag atttaacgta aaaagacgcg tcaacactag cagaataccg agtaatgcaa cattacatgt acggtacaca caaaaacaac gtcatgccga tcagacgtga tgacgatatt acacaaccta ccaaccctaa tggcggtgtg gaaacaggac attctatcgg aggtggacgt tactatatct ggtggttctt aggtagaaat ttctgtccgc tggcgactct gctaaaacga aaaacagcat ctggttacac gccttttgtt caagacaatg acttgaggac cacacagatt acgaaaaagg gcaaccaatg gagtataact aagaaatata tttaaatagc cgtattgata atgctttcac attcgaacca tcattttggg tatctagatt caatgcgtat aagtttgtac atatatttaa atataaagct gataaaggta tgcaaggtat ctacttacaa aataataact gtaaagcact ccaaagaggt gttaaaccgt atacgcaaga ggatgtactg gtagctagag ctaatcttgc ttatagacaa 
ggttattggg ctgcaattga ccctatgagt tatgtaaacg actttaagcc tcacgaggtt gcttacggat atcgcttgti cgcacactca cgttttagca atacaggtta taaaaaggtt aaagagcaag tagaccctag gaagcccaac agatacattg gtttccaata cgacagatat tctgaaagac tagacaaagt cacttatgat ggt t tcgat a ttaaaggaga tttaataggg gttaaccaat taccaataca cacacaaaat cctggacact ggtcgcacgc aagaggtggc aaaaactaat ttcacagaaa tgttaggcaa gcaacaatgg caacagttcc aataggtaaa gcagagcaaa taagaggaga caggcgtacc gataggcact agttattaga aagaacaatc ggtacaacaa tacgccgtaa ctcaaatatt agcttgtagt atacaatcca gcttttagga attatgtacg agttaatcaa taattacaag tttacattca agatgaaggt atgagcaaac gcgaaacaaa ttaaatactt atatgattct aggccataac tggtcataag acattgcaag gagaaagetg tagatgaaca aaccggaatt tatcactgat aacgaaaatt atttatatga ggacaattta ttgatagatt atggagaatt atggatttat tagaactgga gaaataactt acgtcagcga tttataatcc aagctaagaa ttcattgaat attgtatcaa atggatatac gcaggtatct tatattggta taaaaacaaa agaattgtta cttccaagaa gctgagggtc gtaactattg gacctggtaa tcttaaaaaa cattgcacct gaacccagca tatctaagtg cctaatcata gataaaacgg cagattcaaa agacacggac gagttagatg aagcgagcat ggat tat tac atgacaggtc t tgagaaaat gaatgacagt ggcgacggta atcgtttgta ctataaagaa ttatcgccat cgcaagctcg gcttgttaaa tcagctgtat atggtaatga tatagaaaat ttcattgaag ctatggaata tacaggtgat tttaaacgac tagatatgta taacagacat caagtatcga atattacgga ttacaaggca atacaaaaac aatactagtg aaaaacaaac t accagtgtt ggaggacggc agtttttacg ggaaaatagt gtttatatac gttgaagatt tcaatgaagt tcatgattat tatcgagcga atacaaatgc tccaggtaat aacqgcggtc tggacagtaa aatgcaagat ttaatgattt taagaagtgc cacttcagat tcgaatacag gtatcgatat ttacgatcta cactcaattt tgactgattc agttggtcat atgcttaaat tcgaacqtgt aaaacgccgg ttattgggaa tcgttggttt agatttctat aggtattgca ggttggatat aataacttcc cgtctgcaca tattcgaagg aaaggtggtt caaacaacat gtgttttaaa tatcgtgtta gacgcaatta ctcaggcatt aaatgattta gaagcagtcg tgaatgatag tgcgacgcaa atttttgaac caaattggca tagcgtttta acggatatag cttatacatc gtacccagac aagaatgatg agtatgtaga aattcaagtt agagtgccgg caaacaagct acttgagcaa gtaggtacaa caggtggcgt cacaatataa ttttgcagta 
aaaaccgatg atgggcgttt ctttacacaa gttagtgggt gtaaagactt tgcgacaaaa aetagtgcaa gtgttaacga aaagtctaaa agcgcagtta gcacgttaga aagcaaatct gattcaaacg gaaacttaac tgataaattc aacttaaata atttaccagg atttaacttc cagtggacag caatcaatct tattgaggga atcactactg tagaagtaaa tcaattttta gaataatgat tccaattatt actaagaata attataacgt gcagtatgtg aacgggagta ttgatggtat taaccaatta ggcattcaac cacaagctgt agttgaacaa cttacagatg acacatctag gaagcctgga ggtgtgttag atgagtacac taagcaattt ggaacaacga atgcgcaagg tagtgttgaa acgccttata ttcctaatga aaccgaacca ttcggactaa gcattagatt tcccgttacc gaaagcgttt agagatgcag ataatggtgc tctaagacaa gtacttacca gaaacagcac cattgacatt ttcaataaga aaaacaacgg agcatggaat catatcccta agagtattac aaaattatca gatttaaaaa aagaatcaaa acgatttact gattttccta aagactttaa atcgaataca ccaggtaaca caacacaagt attaagacgt gttagaaact ttggtactgg tggcgttggt aaatggagtt agtagataat ttttcgaaag acgataactt aatcgagtta gacacaaaca tcagtttcta tgaatcagat agaggaactg acagaccgtt atctataagt tctgaacatg ttaaaacatc agatagaggc gcttatattt cagacgaatt aacgatagta ataccgaatg aatttttaaa acattcaggc aaggtgcatg ataatgttgt tgttgaacgt caatttagct tcaatattga aacaaagctt gtttatatca aatctattca agatactatc aagcaagata tggatgatac acaaacgtta atagcaaaag aaatcgaaat caagcaaaac gaagctatac aagctattac tacagctgaa gtcgataaaa tagttgaaaa agagcaagcg caaatcaatg gcgctgacct tgttaaaggt aattcaacaa attacggtaa agcaattgaa tcgtatgagc agtccataga gattattcat attactaatg caacagatgc gccagaaaag caagatggtg ttgatgacgg ttcttcgttc gatgaatcaa ttgtttatgt tgttgataat aatactgctc gtgcaacatg aaaatacaaa atctacggca catggtaccc gttttataaa gttgaagaaa cgtctaacaa cgctttaaat caagctaagc gctggcaaca acataagatg acagaggcga atggtcaatc cgatttggga tatttaactg ctggtaatta ctatgcaaca agttatgagg gttatttatc ggtattcgtt aaagacgata actctaaaaa gatttacaca cgatcaatca caaacggcag acataagtca acggtattgt tcgacggtgg agcaaatggt tacacaaact attctatttt attagtaagt ggaacttatc ccacattacc taatgcaatt caattaagta aagcgaatgt ttatgagtgt ttactatcca aaacaagtag cactacttta ggtaaaacat caggttctgg agcgaatgcc aacaaagtta gaaaatcaca 
gtaaatgata aaaatgaagt tatcggatac 35561 agttgactca gacggtaacg gtggcggtat 35631 agaatcgata acgatgtgta ctttgattta 35701 ctataactaa aattatggjgg tggaaataat 35771 gttaatactg gcggtttacq caatagttta 35841 aattcgaacc tagaaaartc gttttcacta 35911 cgtaccgaat gcatcaaacc aacaaagtgc 35981 agtatgcaaa tgcagatgac gcaagtgaac 36051 cacaacagtt gaccgaactg aaaactaaca 36121 ttatccaact tttaaagaca ttaaaaccttt 36191 tacgtagaca tgggtgtaat cgacaaagaa 36261 aagatgaaaa gtcacaggtg taatgcttga 36331 gatttaccaa acggcacgaa catgaatggc 36401 cactctcaat gagattaaat taggtcaaaa 36471 gatgctatcc agagggaaag acagatagac 36541 tgaaaatgtg gattctcggt ttgataggga 36611 ttttggtatt taaaggaggt gattaccatg gatgt agacg acggcgaaat gtcagattta atgttgacaa aaacaaatac ttatgtgtgg gaatatgcat ggctttttaa gaattagaag aactcaagag gaaaaaaata ctatctt cag cttaaaggga ataacaatgt taaatacaat agtgatgagg tgcaattgac tgagggggac ggttgctata tgatcactgg tttaacacaa attagaagag caagttaaca agaaagaaaa tacgattgtc ttttaggata gtctatcaaa agcaatttcg aacttcgcgg gcaacaaaac gtttaaatga aaaatgagca tgaaaaatat agtaggtggc aatgataaaa ttaaattaga cgacaaaaat atagctttac tagcttctgg ttcaaagaag aaaaagaaga aatggttgca gctatgttaa tgaagatgat aattaagtgg ccagaggcaa gtaatgtttg caatgcttag taaaacttta atacgegata taagaactat gcgtgcttct WO 00/32825 PCT/I B99/02040 36681 36751 36821 36891 36961 37031 37101 37171 37241 37311 37381 37451 37521 37591 37661 37731 37801 37871 37941 38011 38081 38151 38221 38291 38361 38431 38501 38571 38641 38711 38781 38851 38921 38991 39061 39131 39201 39271 39341 39411 39481 39551 39621 39691 39761 39831 39901 39971 40041 40111 40181 40251 40321 40391 40461 40531 40601 40671 40741 40811 ggtttggtaa atgtaaataa ggtgcataca tgggattacc agtcgaatat tggtaagagg tatttttaaa agatattggg aagggtttcc gattctatcg ctggcaacgg aataggttcg tagcgttgac caaaactggg tatgtaagtg ttacaggct caagttcagc atcaaaagcc taaaacagta aaatacactg gtaatgggtg atgaacgctc acgaactgta tacgcaaaga ggctacatgg cttgctagac tgtggtggtc aaacagatag ggttattgtc agggattgat tatgaaagat ttaattaatt aaaaagaaaa tgcttgagac 
caaaaacaac aataagaatt agacgaaaaa ccaagcatcg tctaggggta acccgaaaaa caatgaatgt taagcgaata aggcatttca gttagtgcgc ttcgcggaag cttgtaagai gatacggaac aagtaacttc caacgaccct gattatgcaa ggcggtgcta gcttcgtaac atcctaagaa tccagctacc cgctaagtta tataaacaaa gtgtaaatgt acaaaataaa gctgtcgatt ttacactgaa tatcgatcta aaagcacatg aatgagcccc ttattatcga aacacqctgg ttatgttcgc ttctttcaat atcgttgata gatgctatta cgagaatttt ataaagatgt catttcttac accgggacaa cctggtgcga actgtagtat caatcaatcc aacctgagtt attggacaaa aaaaatcaaa gatacaactc agcgcgcctg aatctatgaa gtgtattgca acagattggc tttatatgct ccaaaaaatc caagcggata atttatatca atgaatataa ctatatatat ggtaatttta aaaatgtaaa t ccaagtatg gaaaaagcct agaagaat ca acagcaaaat aatgtacgta gcggtataat tcaatagcgg agttgcaagt tgatgtttat tcaggatctc tattttggat aaaaggagca ttagtaaatc aattcttagc taatacttac tgttgttgct aaatcaaaag ctaaagaaat gaagtaatga cacctacgaa aaaaaccaag cagaaaaatg ttcagtgtta cgattacgca ttataatatt ccatttgata tttttaccgc aaaagttqa cagttaagag tcagtgcttc taacccaaag acragaaagc attaatatag ataattatcg gttttgtaac atggggcaat ttattcatct ggatttgtac gacggacaca ccgcaatagt ttaattctaa tagttggaca tgttaggcct ccatactcaa aatgactcaa caattactgg cttacagcaa tgttttagat agatatteaa ggattatata aataagttta taagcgatta caaccaattt tgatgacccg caaacgacaa ttcttattga aaaaacttat ctgaaacgac acgacttgat taagcaaggt atacattaaa cgagatatat agtgataaaa catcagttga ttactgaaac aagtccattc atctcataca tggggctggg tgggaaagta acacgcaatg ttaacaaaat acttaaagga aaacaacatt aacgaaattt gctagtggta gatacqgtgc tgacgtttgc taaaaataaa aaaggattac atcaataaag caccaatacg ctactgctat tcggcttaaa aggtatctac agatgttgaa acgagaataa gatgaaaata cagcatciat gcttaacacc tagattacat cgatgttgta aaagggttcc tgtaagctgt ttttagagaa gtggtattga atctgctgta aaaagataac gcgacagatt atcgaaaaga atgaaagtag aaggtgatac aggtaaaaaa tgacactaaa atgtggcaaa atcaatatcg caaatgttga tcaacgactc taaaacgtat tacattaaga gaattagcag tcaaaagtta gtacagaaga ataatcatga tgagcggtat actaaaaagc gcatctcaac caaaaagacc ctaatacact1 gaagtttatt tacgaaggtg :catcttttg taataaaacc ggacaataaa 
tggagttgaa gagggtttcg caaactaattt aaaaatatca actttagtaat caacattgac cattgaataat iacaaatgga tgcaaaagta i ;aacaaaggt attagcccga t :tatatacta cgtataaaga c ktaaagctga aaacaagtat a :atgaacgac acaaatgatt t ;tttgataat tcattaggga a Latatgtttt ttatgatagc a ggcacrggct ttttattttg ctacagctag tgaagtggtg gggcagtcaa tgttgggata gctaaggata tggctaatta cggaacctgg agacatcgca agtaggacca tctaataaaa ggttctccag gaagattagt aagatactag caaacctagt cgaagcgaag aaaccgcaat aaagaagagc acttcattga taaaagaatc aatgcatatg tgaaataccg catttatatg cgtcacccta attggctagt atcaaataca agcgttaata gttaaaggta gaccctaata ataccggata acgcaaagta tgacacgaga aaatataaaa cagtgcgtcc acacgaggcc acattccagc aagcactgga ctaatgcaac acgagcacaa ctatcaaatg cttaatttag aaaggaacgc t cgacggaca atttgatcgc gcacgctttc atataattac ttcggtattg ggttggacat ctccagcaaa gtcaaaacac attgtaccga agagtggtgc caacatcaag ttcacaaggg ataaatataa aaaatgatgg tgttgactta aagaataggt atcaatgaca ttgtttatgg aagatggctc ttacctacaa agaagaagaa gcaaaagaaa tattgagcaa atttaaaggt ggagaacaag ttgatggtaa agggttagaa acggattcaa aagcaataca ttttgaggaa gttttgtcat cgacggttaa ttacttaatt aataCctaaa aaaatacatg tcgatgttaa agactttaaa gcgaaaggtg gcgcacccgg agatacagat gataaattgc aaattgctga aaacaactct ttcaaacaaa ctcaagcttt aatttggaca aaggggtgat attgaaatga gagtgggcaa cacctaactt cagatatcct gtttggcacc gttattttta aagacaccct agtactgata t taaagaagt tcatatagtt cgttctgtag tcgatagaga tattgaagta cgtggtgttt tttggcgtag tgagcaagtt gaagt aacga ctactccatc tagacaaatg acgagctcgg gcaagtatca aggcaaagca ttagaaagtg gtgcattcga agcaatcatg attagatgga caagtacaat ataaagaggt ggtgacattg aacaaggtcg tatattcaaa aaggttatca tcgcaaactt attggtagat gagaaaatag ataaaggcga taaaaacg atcaaagcag aagaagttaa actagttgat atttcagaaa cactaaacga tactaaacaa ggaacagaaa ttttatggaa gctggaaata cggatagcat gacaatagaa atatcaggag tcactataag agtcgt ttt t gatcttagca atatcatcaa gtaaatgggc gccaattaaa gttgataaca ttttatggat gtttatacgc ctatgatagc catgttgaaa gttggacaaa tgacccaatg attaagcaag gttataacga aaatatcgct gacatgtatc aatcacaggg t a ataaagcaag 
gattgaaaaa tattgtegtt ttcccgtcaa aacactttca catcatatgg ggggtcctga aactgttaca 40881 ttgttgagag cgcaaattta 40951 tggcgttgcg caacctggtt 41021 tattttatta gattaaattt 41091 41161 41231 41301 41371 41441 41511 41581 41651 41721 41791 41861 41931 caactgccaa tcctggagca aagtatttaa aagatactgc gtatgacatt tcaagtcaat taagaggtgt tttatctgaa tctaaattaa cagctaaaaa aaaagagact tcaagaatta atagatggat aaagcaagca gtaggaaacg gacatgcagg atacggtgtt gttctagaga tcaatgegga aacacctcgt ttaggtttta tagctggtgc ccaaaaaaat ggtaattaca caggtgtatt tacttatatt cccagataaa gtaagtgttg gtaattaaac ctaaaaaaat gaacaaacga acgcgatttt tcatgaagtt gcattatatg aatgtaggaa ataataaaga ttcatttaga cgcagcagga tactattgat aaaagtatac aatgatttac tgaacgttaa ttactaataa aaaagatatg gattcatggt aagcctatag ccaccagtgc cagcaggtia cagttgccaa tgttaaaggt acctaataac gcaacaatca gctaatagtg gacaacgtcg t a 9 a 9 t a 9 t 9 a t 9 9 t a a
aagaatatac aaaagtatat ctacctaaa aataaatatc cctaataaaa gttatcaggt :aggttcaag tgatttagga :ccttcaggg atgttgtatg :tttaaacga ctaatttttt itaacaagat acatcgtatt :tccagtaga cgatgagact ~aatccaaca tctcaagaag igaaaagcaa cagggcaagc :agggtaggt gttgaccaat igcagttcaa tcctgatttg Iacaggcgaa aggttacaag :acgggcaaa taattaaaaa gtatggtgg cggagctgga caaaattgg aatggtaaag .gacatgttc attattacga agataaagc taaaagcgtt atgcttgta gccggtcatg .tccgtaaat atataacgcc 'tggctcaag tcaatcacaa tatggatta tattgggtta aaaatgcaa gtggtgggca agatgt tat taaaaataac gtatcagca gaaataaata attggatta agaagaatta tggtttggt agctggtaat acacttgat aagaataatg ataacgtaa gggacggct a atatgacgg cgcatattgc tatattgcg acaggagagg tgttattatc ttaggacaaa tcaattatcg tgacttgtat gttaaaacat tgccttataa ttcaactaat atcaatgggt tagataaagc WO 00/32825 PCT/I B99/02040 42001 42071 42141 42211 42281 4 235Si 42421 42491 42561 42631 42701 42771 42841 42911 42981 43051 43121 43191 43261 43331 43401 43471 43541 aggtaatagg cattaattat ttaacattac taatgtaatt gaggact tac gttgtttttt tatgcaaaaa ataccagttg atgtcagcaa tctatatata taaacgtgtt tttatggaag agaaacggga aaaacatctt acgaagaaga ataagtagtt agggaatctt tctcaagatt acattaccag ttgcgtaaag atgttatatt aaacgaaaaa agaggaggat ttgccatagc aattctaaca tttaggcaac agggataaaa tataaaattg tatcagatgc taaataaaag ttggtaagtt acagttatta taaatgtaga taaccaatct tagtaagaag ataaatgatc aagttcataa aaaaagtgt t gaaaacattg ctaaaatact gatataagta atgacagcaa ctaaaaattc cagatttaga gagccaaaaa tagcacgatt aataactatt taacaggcag ggcttaaaac ctgactgcat aaaccacacc aaagtattgc agaaaatttt aaaaaagacg atgaaaacaa aaagtgttgt t aaaagaaat eggattacca acgataataa tatgtttgtt tagtatttac ttagaataaa aattttgcta tggatggatg ttaatattcc tatacacttt gtactacggt acttgcctat ttttttgtta cacatttccg gtagccaatc cggctatgca atttaaacca cccatactag ttgctgggtg acctattaat ttaggagtgt ggttattttt atatcacgtt taaccgtgtt ataataaggt aaaactatag cagaaatcgc cttttataca ataagtaagt agacaagccc gaaagggctg tttacattat tttaatcatt cttatttgga tgcactgctt actactttac tgcttatcaa aattgaatca atagaaaagt tattcgaaaa 
tatcaaactg tgcaagattt aagaaatgga agttatacga gtatcaaaga tcgcttgaaa acaaaagaag aatttaaaac tttgaatgta cagatggaag acatgcaata tattgggtaa gtacccgcaa aaagcatacc caaaatatat cttaattgac gtggttattt tttaggtttg ttttctttct aaacttaatt gcttgtaaac tttatcccgc cgtctccata aaaatatgct caaatatgtg tcaagaaaat aattttctga agagtaggtg tctaaagtta tagatatatt aaagaagtat ttgaatcagg atgatagata cgtagtactt caaaagaaaa ttagtaagtt cgcgtgtcaa atacgtgtca cgcatagtta taggcttttc tggaaacctt gatttaatgg cacgttgacc ttgctctttt ataatggcct aatcttttgc taaaaacttt ataaaaatta gaccataaaa aaggcgattt aaataattag aaaaccacgt atttagttct atttctttag agctatatac caagataaga ggttttaatc tagcaagtgt ttatgttcat caagtaagtg taatatattc aatagg WO 00/32825 PCT/I B99/02040 199 Table Bacteriophage 96 ORFs list SID LAN FRA POS a.a.I RES sequence STA STO 100733 960RF001 1 25999. .29142 1047 1ccttgaatcgaaaggaggttagcct ttg taa 100734 960RF002 1 32008. .33906 632 tttttacgactaaaggaggcaacca atg taa 100735 960RF003 1 30109..31995 628 ttatattttagataaggagtagcct atg taa 100736 960RF004 1 36760. .38634 624 attttgattgaaatgaggtgcatac atg taa 100737 960RF005 3 33903. .35729 608 gtttattcgaaggaaaggtggttga ata taa 100738 960RF006 2 40589. .42043 484 aatgatttagggtaggtgttgacca atg tag 100739 960RF007 1 18652. 
.20091 479 tatacacacatactaaacctgaacg att tga 100740 960RF008 2 8960..10201 413 tggcagaatttgggggcgaraacga atg tga 100741 960RF009 2 17447..18670 407 gacgcaataacggaagtgatcgtca atg tga 100742 960RF010 1 38647..39819 390 taaatataaataaagaggtgtgtaa atg tga 100743 960RFOll -1 119..1195 358 gtagctcgcctacccttattarttt ttg tga 100744 960RF012 2 20045..21013 322 tttaargacaaattacctgacatag atg tga 100745 960RF013 3 29157..30098 313 acttattataagggaggtttgttag ttg taa 100746 960RF014 1 21925..22839 304 agaaaataaagtgaggraataaaac aig tag 100747 960RF015 1 5812..6591 259 atacacggtaaaggtgggagaatag atg taa 100748 960RF016 1 7852..8607 251 aataaaatgttgaaaggagagaaaa atg taa 100749 960RF017 3 3444..4190 248 aaatttaacarraatatcactttaa gtg taa 100750 960RF018 -3 28281..29000 239 taagctatgttgaacatcgctagtc atg tga 100751 960RF019 3 7188..7859 223 tttaccgttctaggacgtggtttaa atg taa 100752 960RF020 3 21324. .21908 194 gaagggcaaaaaggag tttgatat atg taa 100753 960RF021 3 6612..7175 187 attaaaaattaattaaaaggacggt ata tag 100754 960RF022 2 24536..25093 185 aaagaaaaacgaaggagtgtattaa atg taa 100755 960RF023 1 5275..5811 178 catgaaatggtaggaggtatgaaaa gtg tag 100756 960RF024 3 14481..15014 177 taaaacgataggagataacgaataa atg taa 100757 960RF025 2 25157. .25666 169 ataaaaaaattgaaaagaggtatat art taa 100758 960RF026 -3 15084..15590 168 tcattctraacatagcccttaattc atg tga 100759 960RF027 -1 1229..1732 167 aatagcaaataaaggagtgtaaaac atg taa 100760 960RF028 1 16960. .17454 164 aaggcgtotgatacagtgaaaacaa ttg taa 100761 960RF029 -1 1736..2227 163 tatgagaaaaggagtcatataaaag atg taa 100762 960RF030 1 25531..25995 154 ttttcaagagggagagtcgctcgta crg tag 100763 960RF031 2 23633..24097 154 tttagtattgaaggtgattctgtag ate tag 100764 960RF032 -2 2248..2706 152 ataagacaccaaag ggtttggcgc atg iga 100765 960RF033 -3 39147. .39605 152 agcatataaatcgtttagtgtttgt ttg taa 100766 960RF034 2 13181. .13615 144 tagaagtcgaaaaagtggaggcaat ata taa 100767 960RF035 2 10628. .11053 141 1 gagctaggatrgcaagcaacgatat ttg tga 100768 960RF036 2 24110. 
.24535 141 1 gtatttttcatagaggtggttaaat atg taa 100769 960RF037 1 12583. .12996 137 atgaggaacagaagcaaccaactr art tga 100770 960RF038 1 15628..16032 134 atgttaagaatgatgcctagtttaa ttg taa 100771 960RF039 3 39816. .40220 134 ctaatacacttacttaattaagj gtg taa 100772 960RF040 -3 27528..27932 134 tttccataaataaacgaggacacca atg tga 100773 960RF041 3 16206. .16607 133 gatgagggcggaggtgtcagagtag arg tga 100774 960RF042 2 35720..36106 128 aagttactataactaaaattatggg gtg taa 100775 960RF043 -2 35713..36081 122 ttaaacgtccccctcagtairrgt ttg taa 100776 960RF044 -2 9460..9828 122 agtatccatcagttgaagaat ct ata taa 100777 960RF045 -3 5139..5504 121 ttctttttgtattctrgaatattca art tga 100778 960RF046 2 11513..11872 119 aagtaaatgtatagaggtggaataa atg taa 100779 960RF047 2 22991..23350 119 gtcgtactacgtctgataagagcga gtg tag 100780 960RF048 3 8607..8963 118 tggaaaaagaattgagtgatgacta arg tga 100781 960RF049 1 23353. .23697 114 atccgtttaaaccaataaggtagag gtg raa 100782 960RF050 -2 2728..3072 114 tggtaaattagtattacattaagca ata taa 100783 960RF051 3 4692..5021 109 tcaaaaratacggaggtagtcaact atg tga 1 nn 7d nm-D W .1 lfQ I I 100785 960RF053 1 40252..40578 108 acgactaattttragrcgtttrrt att tag 100786 960RF054 1 4942..5262 106 aatataaaaccaaaaaacaaaattt atg tag 100787 960RF055 -2 4840..5151 103 ccgtcgcaatatatagrrcgcttaa ate taa 100788 960RF056 3 36324. 
.36623 99 aatttaacacaaagtaggtggcgra atg -aa 100789 960RF057 2 1394..1690 98 cttcagtggctctttagcatttaa ata taa 100790 960RF058 -3 26247..26537 96 tacrtcttttccataatctgacca art rga 100791 960RF059 -1 21485..21772 95 agactcaacgcctttttgaaeatac ttg tga 100792 960RF060 -3 22647..22931 94 cctctttgtaaccgacaagactgta ata taa 100793 960RF061 1 14023..14304 93 ttatcraattaagggggacgagtga gtg taa 100794 960RF062 -2 38281..38559 92 tatataacttagcgattgtacttc ttg taa WO 00/32825 PCT/I B99/02040 100795 1 960RF063 3 30786..31064 92 gtctcctaatactacatctt gccca gtg tga 100796 960RF064 -2 30205..30480 91 atgca ccac tttggatgtaacac ata tag 100797 960RF065 1 2617..2886 89 aaggtctaataaaaattcccctcc ttg taa 100798 960RF066 3 28056. .28325 89 aaggtgtagtcggctggttaactga att taa 100799 960RF067 -3 17142. .17411 89 ttccgttattgcgtcgtgaat ttg tga 100800 960RF068 2 12326. .12589 87 aatgcacgccttggtctgcctaa ttg tag 100801 960RF069 2 42734. .42997 87 ttttaggcaacgatataagtaaaa gtg taa 100802 960RF070 1 11869. .12129 86 aaatgttcaagaaatggagtgaagc ata taa 100803 960RF71 3 15396. .15656 86 aacaagctaacaaa raca an act taa 100804 960RF072 -3 37749. .38009 86 agacttttccgggctacccctaqac act taa 100805 960RF073 3 11244. .11501 85 acatgcatatatagaggtggaataa atg tag 100806 960RF074 -3 42936. .43193 85 aattatttaacttactaattttcc ttg taa 100807 960RF075 -3 26610. .26867 85 tactgccaatgttccacttcaacc att taa 100808 96ORF076 -1 11126. .11380 84 tttactaatacatttaagttaacc atc taa 100809 960RF077 -2 16537. .16791 84 cacccaccatataggcaggtagtag gg tag 100810 960RF078 -3 19521. .19775 1 84 aataactttgaattgatacctcaac ata tga 100811 96ORF079 3 13608. .13859 83 ttagggcaaaCggaggcagacacaa atg tag 100812 960RF080 -3 28029..28280 1 83 tgagaagtcgccagtaagcaactga act tga 100813 960RF081 3 20973. 
.21221 82 aatgaagttatcccattcatgactt atc tag 100814 960RF082 -1 8729..8974 81 cgattattgtgctttcaatttcaaa ttg tga 100815 960RF083 -3 3147..3392 81 tttagcctttatataatcaacctct gtg tga 100816 960RF084 3 1611..1853 80 tgctttatctttagtttctttcttt tg tga 100817 960RF085 -2 29470. .29709 79 ctcttatcaccttcgtttgtaggca atc caa 100818 960RF086 1 35188. .35424 78 gcgcaaggcgattgggatatttaa ctg tag 100819 96ORF087 -2 13039. .13275 78 ttttgattgagctctaaagtgtctt att tag 100820 960RF088 3 24930. .25163 77 gaaccatcaccaaaagttaaatgga ata tga 100821 960RF089 -3 22329. .22562 77 tccagtataagatagtggtaatccc ata taa 100822 960RF090 -3 16803. .17036 77 acctttagtcgaataccccgcgtca ata tag 100823 960RF091 -1 22559. .22789 76 aacgcttcggtttaacgtcatgt atg taa 100824 960RF092 1 3 18360. .18587 75 attgcaaaagacattgtaagtagat atg taa 100825 960RF093 -2 25384. .25608 74 catgatttccttgtaattctccttc atc taa 100826 960RF094 1 10417. .10638 73 aacacacat aaggagtgttaaaaa atg tag 100827 960RF095 3 12963. .13184 73 tactaaacgaagataaaacacgac att taa 100828 960RF096 1 42994. .43212 1 72 gaccgcttgaaaacgaagaagataa ata Iaa 100829 960RF097 -1 36047. .36265 72 tcaagcattacacccgtgacttccc atc taa 100830 960RF098 -2 36766. .36984 72 caggttccggtacaaatccagatga ata taa 100831 960RF099 -2 34765. .34983 72 tcattctttttataaaacgggcacc atg tag 100832 960RF100 1 10198. .10413 71 acaagaagactcagaggttttccac atg taa 100833 960RF101 1 15208. .15423 71 gagaaacaagttaagataaggagag atg tga 100834 96ORF102 3 4209..4424 71 attttaaaacgaaatataggagagg ctg tag 100835 960RF103 3 11673..11888 71 cacgcaccttatggtatgcgctcag ctg taa 100836 960RF104 3 12117..12332 71 tttacgtccaaagagccttcgaccc gtg taa 100837 960RF105 _L3 23892. .24107 71 gatggtgggttatceagtgccataa gtg taa 100838 960RF106 1 -3 34428. .34643 71 tagacrttcgccaatttgttgtga act taa 100839 960RF107 -3 24495. .24710 71 ggcacattaccaattgttaat ttaa atg taa 100840 960RF108 -1 23876. .24088 70 acatatttaaccacctctatgaaaa ata taa 100841 960RF109 -2 17317. 
.17529 70 acctgtacgctttgctccgtgacca act taa 100842 960RF110 -3 38931..39143 70 actttcatctttt cgatgtaagaa atg taa 100843 960RF111 -3 I21855..22067 70 aqcaaattttttcttttgtgctgtc act tga 100844 960RF112 1 3217..3426 69 aaatgtcaacgggaggtgatacgaa atg taa 100845 960RF113 -1 25469..25678 69 tcagggatatatcccaaacatcag ctg taa 100846 960RF114 -2 9838..10047 69 acaataaccatcacggtaaagtagc atc iga 100847 960RF115 1 13819..14022 67 gcagtaggggccacggcaggtcaag ttg tga 100848 960RF116 -1 41033..41236 67 caacttcatgacctgcatgtcttaa ata taa 100849 960RF117 -3 24711..24914 67 tctgctgtattccattcaacctcca atg taa 100850 96ORF118 -1 12374. .12574 66 cccatctcctcaaaataaagtccgg cg caa 100851 960RF119 -1 3980..4180 66 ctcctatatttcgttttaaaacccc act cga 100852 960RF120 -3 6033..6233 66 ttgtaattagaaaataacgacaa ata taa 100853 960RF121 -2 37939. .38136 1 65 ctgaaacgccccgatactgcccaa act tga 100854 960RF122 2 37892. .38086 j 64 acgacaaaaacaacaataagaacca gtg tga 100855 960RF123 -3 29193. .29387 64 ggacccgaccttaaatgcgaagc ata tga 100856 960RF124 1 4408. .4599 1 63 1 ttrat orrr rrr a t- r Arn I 100857 960RF125 -1 7787..7978 63 ttaaaaatccaagcttgccatcgt att tga 100858 960RF126 -3 27027. .27218 63 aaattgaacaacggcattaatcga gtg tga 100859 960RF127 T 3 15051. .15239 62 atcgagtcaaggagg tttggggaa gtg tga 100860 960RF128 -1 6914..7102 62 agcgaatgggttgattgttgaccc ata -t4a 100861 960RF129 -3 31332. .31520 62 tcctatttgctccgctgtctataa a"g tga 100862 960RF130 -3 30084. .30272 62 gaaaccatcccaccttcaacacga gt taa 100863 960RF131 3 11058..11243 61 agaaaaagagaaatgaagtgatcta acg taa 100864 960RF132 -1 36434. .36619 61 taagcatggtaatcacctccccaa ata tga 100865 960RF133 -1 35591..35776 61 ctaaaccatt cgaaaccgccagt act caa 100866 960RF134 -2 9250..9435 61 atccatgagcttataacccgccca I at Iga WO 00/32825 PCT/I B99/02040 100867 960RF135 1 29563. .29745 60 cgacaactttrgtaggactagtaa gtg tga 100868 960RF136 -3 12486. .12668 60 cactttactttcaacttgrtcagga ttg taa 100869 960RF137 -1 14501. 
.14680 59 caaactgaaagctaagtaatcagca I ate tga 100870 960RF138 -2 23326. .23505 59 cttgtgacatttgatgaaattttaq ttg iga 100871 960RF139 -3 42672. .42851 59 aat ccggaatttttagcaattttat atc taa 100872 960RF140 -3 31137. .31316 59 acttgattgactagtaaagtcgrac atg taa 100873 960RF141 -3 18969..19148 59 aacaaaaataacattatagggatct ata taa 100874 960RF142 -3 4740..4919 59 cataaattttgttttttagttttat att tga 100875 960RF143 2 36107. .36283 58 aacaaatactgagggggacgrttaa atg taa 100876 960RF144 3 16029..16205 58 tatacgaagtaaagaaggtagataa ata tag 100877 960RF145 -3 29013. .29189 58 tgtcactgacgcgatactgtgaacc att tga 100878 960RF146 -3 14883. .15059 58 aatctttgaatgttgtgactaagta ttg taa 100879 960RF147 -1 18251. .18424 57 tatcagcgttaattgcacgtaatct atg taa 100880 960RF148 -1 13583. .13756 57 aataccttctttaactgaatgttga ata taa 100881 960RF149 -2 10756. .10929 57 taaattcacatctctaractgtat ctg tag 100882 960RF150 2 14171. .14341 56 atttttaatgaagaagtgttattaa ctg tag 100883 960RF151 2 19217..19387 56 cctacatactcattgcgctactrt atg tga 100884 960RF152 -1 12614. .12784 56 atttctacagtaaaatatctttat ctg taa 100885 960RF153 -2 11836. .12006 56 ttgcattacctattgcgaatgctag ttg taa 100886 960RF154 -2 4165..4335 56 atataacgcttttgtcctcgaccaa ate tga 100887 960RF155 -3 40464. .40634 56 aaatcaggattgaacgcttcccta atg tga 100888 960RF156 3 423..590 55 tggtaattttgataatttagcttta ata taa 100889 96ORF157 -1 41879..42046 55 gtagcaaaatttttattctaagtaa ata taa 100890 960RF158 -2 36166..36333 55 cattcatgttcgtgccgttggtaa ate tag 100891 960RF159 -2 16228..16395 55 tttaacatctgagcataccttttat ttg taa 100892 960RF160 3 1038..1202 54 atctccaagcagtgttgagcagcg ttg taa 100893 960RF161 -1 19193. .19357 54 tctttgttgttaggtacaccaaaca atg tag 100894 960RF162 -1 18074. .18238 1 54 crcgtcctattaacacaatagatcc ata tga 100895 960RF163 -1 15386..15550 54 agccatcataggactgtaaaattca ctg taa 100896 960RF164 -1 10049. .10213 54 tacatcgatttcaataagcttttga att tag 100897 960RF165 -2 18514. 
.18678 54 gtgcttcaatatcatctatraact ata taa 100898 960RF166 -2 11104. .11268 54 ctagccatgattacccttaaattag ttg tag 100899 960RF167 -3 13764. .13928 54 agacagrttataatgtgatctcta ata tga 100900 960RF168 1 14305..14466 53 ttttgaatttttggaggacgagtaa atg tag 100901 960RF169 -1 17885..18046 53 gtgttgaagccttaatagactcttt ata tga 100902 960RF70 -1 10790..10951 53 taggcgctttacatatccacgttaa art taa 100903 960RF171 -3 12765. .12926 53 arcrrcgrrtagtatataaaacgct ctg taa 100904 960RF172 3 22836. .22994 52 cgttcgcaacgcttaaaccaactga ata tga 100905 960RF173 -1 15956. .16114 52 1 ctctacatcatcattagccgtcgtc ata taa 100906 960RF174 -1 10571. .10729 52 tagtgccattcatattactttctaa ata taa 100907 960RF175 -1 3440..3598 52 cagcctatcttcactatcaacatga ttg taa 100908 960RF176 -3 37170..37328 52 tttatctaaaacattgctgtaagca gtg taa 100909 960RF177 -3 6693. .6851 52 ttccraatctactaagtaactcgat ata taa 100910 960RF178 -3 5655. .5813 52 1 gacatcttgatagttttttcagtc atc tag 100911 960RF179 1 34564. .34719 51 1 gttacagctgaagccgataaaatag ttg tag 100912 960RF180 1 42661..42816 51 atataaattCtaacactaaaatact atg tga 100913 960RF181 -2 37741. .37896 51 jggacgcactgtcaactgatgtttt atc taa 100914 960RF182 -2 25039..25194 51 ttcgtaatctttttctccgtcatta art tga 100915 960RF183 -2 4534. .4689 51 tcagttttaatattttcagccatag ttg tga 100916 960RF184 1 6721..6873 50 ggagccggagaatttacagraaaag ttg tag 100917 960RF185 2 36548..36700 50 acaaaaatatacgcgatatqaaaat gtg taa 100918 960RF186 -1 40025. .40177 1 50 tggagarcctgaataaacatcactr ara tga 100919 960RF187 -1 34466. .34618 50 attacctttaacaaggccagcgcca ttg tga 100920 960RF188 -1 33842. .33994 50 agrrccrctatctgatcaragaaa crg taa 100921 960RF189 -1 24914. .25066 5 acaragaatggtcttccgtgtgtga atc taa 100922 960RF190 -2 20395. .20547 50 tatcttagagaaccctctccactc ata tga 100923 960RF191 3 24768. .24917 49 aaaggaartgaagcagrgaaacacg ctg taa 100924 960RF192 -1 16169. .16318 49 ttgtggtttcggcaacgttgcttgt atg tga 100925 960RP193 -2 39100. 
.39249 49 cagtaccgtttttaccgggtgcgcc ttg taa 100926 960RF194 -2 25921. .26070 49 1 ttggtacagacgtctttgctaatcg ttg taa 100927 960RF195 -2 17779. .17928 49 caacaat cggatggtaggg ttg tga lflflq2R 9Or 1 Qr 141A9 1AQ1 d I rrl r rerttCt.oCaltoC atC toa 100929 960RF197 -2 7609..7758 t rarcatcaaacgacttaacaccaa ttg tga 100930 960RF198 -2 1537..1686 49 ttatt aqcagtgcgttagtgttag g taa 100931 960RF199 -3 7719. .7868 49 taatacttgtatcggatagtcatct art taa 100932 960RF200 2 22271. .22417 48 ttcttaatgaggttaaacctctaa ttg tlag 100933 960RF201 2 30353. .30499 48 r rcractattggcgaaaaaataaggc ttg tag 100934 960RF202 2 32591. .32737 48 agattgaagcccaacggacaattca ttg taa 155M3 960RF203 2 39131. .39277 48 agcaaagactttaaagagaaaatag ata ta 100936 96ORF204 -2 36985. .37131 48 atcttcctggagaacctgtccaact art tga 100937 960RF205 -3 38721. .38867 48 aaggaacccttttacaacatcgrcg ata taa 100938 960RP206 3 35880. .36026 48 gtraacatagcgttttgttgcgtca art raa WO 00/32825 PCT/IB99/02040 100939 960RF207 -3 11550..11696 48 ttgcictctcgctccatqattttgg ata taa 100940 960RF208 2 37178..37321 47 agattag aagacaccctatgtaa gtg taa 100941 960RF209 2 42341..42484 47 tgcatatttaaaccacccatactag ttg taa 100942 960RF210 3 41850..41993 47 aaaggtaataacgtaagggacggct act tag 100943 960RF211 -1 6662..6805 47 ttgttggaatggtgggacgaattgg ttg tga 100944 960RF212 -2 25213. .25356 47 agtagcacattcccaaaattgtaaa atc taa 100945 960RF213 -3 42219..42362 47 gtggcttgatcatttataatataac ata taa 100946 960RF214 3 27834. .27974 46 aaaagatttagacttcgttagaac atc tag 100947 960RF215 3 35811. .35951 46 ttacgcaatagtttagatgtagacg ata taa 100948 960RF216 -1 5402..5542 46 tttccotaaggtgtattcaacttga att tga 100949 960RF217 -2 24229..24369 46 tataagcctgttaagcacataacct atc taa 100950 960RF218 -2 6253. .6393 46 ttgtcattcttgctaacacgtcaga ttg taa 100951 960RF219 1 883. .1020 45 aaatcactcccgaaatattcgttaa ata taa 100952 960RF220 2 32936. .33073 45 gataaaggtatagacaaagtattgt atc taa 100953 960RF221 3 41703. 
.41840 45 ggtaagcctataggtggtttggtag ctg taa 100954 960RF222 -1 39860. .39997 45 acttttattaggttcaactccattt act taa 100955 960RF223 -1 24716. .24853 45 acattrcaaatgattctggaacaac ata taa 100956 960RF224 -2 26794. .26931 45 caatatcacgccatgtagtttttaa ctg taa 100957 960RF225 -2 19201. .19338 45 caaacaatggattgtaatcaaataa atg tga 100958 960RF226 -2 15709. .15846 45 cgacttgcttgttgtctaacacaat ata taa 100959 960RF227 -3 36711..36848 45 acattgactgccccgacaattatct ata tga 100960 96ORF228 3 2325..2459 44 tcgccatagtgagttccaataccgt ata taa 100961 960RF229 -1 38612. .38746 44 ttgtcattgatacctattcttatag atg tga 100962 960RF230 -1 31733. .31867 44 gcvgaartgtatggcctaaagtaat ctg tag 100963 960RF231 -2 12076. .12210 44 1gactcatagctttaacttgttcgt ctg taa 100964 960RF232 -3 31644. .31778 44 atagtcctcaagtgttaaccctagt tig taa 100965 960RF233 -3 23988. .24122 44 atttaatttgtaagttcaggctcaa ctg taa 100966 960RF234 -3 17529. .17663 44 agtacgtttttttgaatcgtaccta atg taa 100967 960RF235 1 7153. .7284 43 aatgctaatggtccaatagaaaca atg tag 100968 960RF236 2 2681..2812 43 ttctttcacttcaacttcacatttc ata tga 100969 960RF237 2 4496..4627 43 gtaccatgcttcacagtcttagcga ttg taa 100970 960RF238 -1 41720. .41851 43 cacctataatcctgaattagttga ata tga 100971 960RF239 -1 35324..35455 43 acttactaataaaatagaatagttt gtg taa 100972 960RF240 -1 8570..8701 43 atccccgttttgacttaatacatca atc tga 100973 960RF241 -2 33502. .33633 43 ataattttgtaatactcttagggat atg tag 100974 960RF242 -2 23662..23793 43 agctaatgctacagcagtgttgtaa atc tag 100975 960RF243 -3 32391. .32522 43 acctgacgagcttgcgtcatataa ata tag 100976 960RF244 -3 30273..30404 43 aaaaccttcgttatactcttggtaa atc tga 100977 960RF245 -3 5895. .6026 43 tgcaccaaaatgcttataattctta atc taa 100978 960RF246 -3 2679..2810 43 attcatcaagaaactatagccggtc atg tga 100979 960RF247 1 34891. .35019 42 acatcaagcaaatctggtgtgttag ttg taa 100980 960RF248 2 30668..30796 42 aaccattacattaaagctggtgtga atg tag 100981 960RF249 2 31838. .31966 42 caaatattagcttgtagtgagttag atg taa 100982 960RF250 2 33539. 
.33667 42 cttaccaqaaacagcacaggtagaa ata taa 100983 960RF251 -1 20486..20614 42 cttctqtacgagccacacgcaatga ttg tag 100984 960RF252 -1 15128. .15256 42 gatactccattactagctactacta ata tga 100985 960RF253 -2 41446. .41574 42 aaaacctaattcagataaacgataa ttg tga 100986 960RF254 -2 41005. .41133 42 gttaraaccatgaccggctacaagc ata taa 100987 960RF255 -2 23008. .23136 42 aggataaatgacttgaccatctttc ata taa 100988 960RF256 -2 14794. .14922 42 ttgtagcgtcaatgagt ggtcga ttg tag 100989 960RF257 -2 8503 8631 42 tacctaacttttttaataatttcta atg tga 100990 960RF258 -3 22143. .22271 42 aaacgctttgtaaaatgcctctgca att tga 100991 960RF259 -3 18639. .18767 42 cttgtacctattatagagattaacc att tag 100992 960RF260 -3 15624. .15752 42 gtttagtaactagccactgtatag eta taa 100993 960RF261 2 18746. .18871 41 catattgaggctctaetagegtcac eta taa 100994 960RF262 -1 13067. .13192 41 aattaattaettcttctcttgttgg ttg taa 100995 960RF263 -2 18742. .18867 41 taacagacecgtctaatcgccttac att tga 100996 960RF264 -2 18376. .18501 41 catattatcataaagaacaagtaac ttg taa 100997 960RF265 -2 367. .492 41 ctaaacgaaeaagagggtacaatac atc tga 100998 960RF266 -F 32802..32927 41 aggtaatccatttgatacaatact ttg taa 100999 960RF267 T -3 10194 10319 41 atctcgaaggcgetaactcgtta tt tga 101000 960RF266 1 1159..1281 40 ttattcttcctttttqtaattqtae atq taa 101001 960RF269 2 10373 10495 40 gacaaagttgaaaagaaaatcatga atg taa 101002 960RF270 2 15734..15856 40 ttattcggcgtaatcgcactgatgc ttg tag 101003 960RF271 -1 43451..43573 40 c c t*Jo shine-dalgerno act tg.
____sequence 101004 960RF272 -1 36959. .37081 40 acgccataaaaataacttttattag att tag 101005 960RF273 -1 35798. .35920 40 ctgacgcectttgttggtttgatgc att taa 101006 960RF274 -1 8147..8269 40 tctgtctctctatgtttgttagtct ctg tga 101007 960RF275 -2 43066. .43188 40 tttaacttactaattttcttttgat ate tga 101008 960RF276 -2 42535..42657 40 aaataatgtaaattgttttcatagt at tag 101009 960RF277 -2 30628..30750 40 tttgtagtcccgcttctqcaaaegt ctg taa WO 00/32825 PCT/IB99/02040 101010 960RF278 -2 13291..13413 40 rrcgrarcttccaagcaarrcarrr ttg tga 101011 960RF279 -2 3172..3294 40 cagactgtttagtaacgcctaattt arc taa 101012 960RF280 -3 18804. .18926 40 taaaraaccaacacgtgtatcaaca art tag 101013 960RF281 -3 15843..15965 40 atttaaaaagtgtattcraraacca atc tag 101014 960RF282 -3 8460..8582 40 rragrcarcacrcaartctttttcc atr raa 101015 960RF283 -3 7593..7715 40 gargrrgtcracacagrgcaacac atg raa 101016 960RF284 -3 6453..6575 40 aartaatttttaarraccattt ca art Iga 101017 960RF285 1 15082..15201 39 caaracttagrcacaacatrcaaag att raa 101018 960RF286 1 34444. .34563 39 acacaaacgraaracaaaagga J arg rag 101019 960RF287 2 27920. .28039 39 cctattttagcagttgrrgcagcaa trg rag 101020 960RF288 2 28415. .28534 39 atcggctttttaactggcgraarga arc tag 101021 960RF289 2 38147. .38266 39 ratcaaargcttaatttaggcaagr arc tga 101022 960RF290 3 40917. .41036 39 gcaaarrraaacacttrcacarcar atg taa 101023 960RF291 -2 38815. .38934 39 tctctaaaaacagcrracagcgaac ata taa 101024 960RF292 -2 32671. .32790 39 craraqgataraaarcgcrgacgr ara tga 101025 960RF293 -2 31216. .31335 39 ttgarrrgatgtttcttatactrga ttg taa 101026 960RF294 -2 21589..21708 39 gtatcttcatcagaarcgccraaaa atc taa 101027 960RF295 -2 18976. .19095 39 tatcaaratargcraaccragcacc ara taa 101028 960RF296 -2 11482..11601 39 gccaccrcgtactctttttgcaacc art taa 101029 960RF297 -3 12933. .13052 39 rcacgaaaraargrttctttaattt ara taa 101030 960RF298 -3 8262..8381 39 gaactgarctrgcttaaargattra art tag 101031 960RF299 -3 6993..7112 39 catragcarragcgaargggtttga ttg tga 101032 960RF300 2 23516. 
.23632 38 acracarcrgaacaacraaaattrc arc rag 101033 960RF301 2 25943..26059 38 agattagaagaagaaaaaagaagac grg taa 101034 960RF302 2 36929..37045 38 tattggggrrtgtaacarggggca atg rag 101035 960RF303 3 4476..4592 38 araaaagcraccragragcagrac arg tga 101036 960RF304 3 20586. .20702 38 tactcraagatagctaaagcaarac grg tga 101037 960RF305 3 28356. .28472 38 cggrtaccaatgtgctrgaracgar ttg taa 101038 960RF306 -1 24359..24475 38 actcaaaraaaagccgatcgrgcc arg taa 101039 960RF307 -1 20147. .20263 38 ttgtaccraracgagraacrccrr art tag 101040 960RF308 -2 38158..38274 38 rccgratccactttcraagaaagc gtg tga 101041 960RF309 -2 35149. .35265 38 agcttgrrtgtatcgctrraacga ata taa 101042 960RF310 -2 31423. .31539 38 graarargatraggtctcctctrar ttg raa 101043 960RF311 -2 10438. .10554 38 cgcctttaaarcgttttaggtcac arc taa 101044 960RF312 -2 1390..1506 38 gagaacaacacaaacattaacaaca arc taa 101045 960RF313 -3 33051..33167 38 acgrcctgtttctagatcgtaatac ata tag 101046 960RF314 -3 25194. .25310 38 agcaaaccgttaaagaraacattga arc taa 101047 960RF315 -3 6273. .6389 38 cattcttgctaacacgrcagarrga ctg tga 101048 960RF316 -3 4281..4397 38 ataatrcgrattcattaarcarraa att tag 101049 960RF317 1 2260..2373 37 atgacrccttttcrcatatttcttt ata raa 101050 960RF318 2 21230..21343 37 atttcacacttttttagtgtcrcr ttg raa 101051 960RF319 3 18018..18131 37 aractgagrcaccaarrraagcrcg atg tag 101052 960RF320 3 36972. .37085 37 attacagaratccraagggtttccg att taa 101053 960RF321 -1 36302. .36415 37 crcttgattrtttgaccraat a arc taa 101054 960RF322 -1 32606. .32719 37 ccaraagttatttcrccagttcrat att taa 101055 960RF323 -1 11453. .11566 37 ttaaaccgtrcttttttatcaac c art tga 101056 960RF324 -1 7268..7381 37 racrggrrcgcccagrgaagttct ata rga 101057 960RF325 -2 32347..32460 37 tracrgcartgrtatarggcgaraa arc tag 101058 960RF326 -2 24682. 
.24795 37 acgtttattacgcrcaraaagccar ata tag 101059 960RF327 -2 23905..24018 37 aaatggctgtggcgcttgaccatat grg taa 101060 960RF328 -2 21460..21573 37 agagcacraaracgtttttgttctt ctg tga 101061 960RF329 -2 21208..21321 37 gactraacttcttcgatattcatat arc tga 101062 960RF330 -2 18085..18198 37 ccagrcgacaccagcaaagattcr trg tag 101063 960RF331 -2 8170..8283 37 acrttgagacgrcgtctgtcctct arg tag 101064 960RF332 -2 5971..6084 37 caarttgttttccgttttctctrag rtg tag 101065 96ORF333 -3 37632..37745 37 accttgcttaarcaagtcgtaatta art tga 101066 960RF334 -3 29628..29741 37 crgagtragtgttgraaaatgtcar ttg tag 101067 960RF335 -3 7164 7277 37 ttagcggararccgrrtcragraa arc taa 101068 960RF336 1 22903. .23013 36 graaaaaaagacaarargacrarra ctg tga 101069 960RF337 1 43258. .43368 36 raattgacgrggrtattttttaggt ttg taa 101070 960RF338 2 12668 12778 36 gaactggggaarggqcatggaaca atc tag 101071 960RF339 2 28292. .28402 36 ttcact act t tr AarraAy -o.rr fo I- r 101072 960RF340 2 35396. .35506 36 rrccraargaacataagtcaacggr art rga 101073 960RF341 3 25428. .25538 36 actcggaacaarragaaaaacaa ttg tga 101074 960RF342 -1 40913. .41023 36 tatcrgggaaarttaatctaataaa ata tga 101075 960RF343 -1 39173. .39283 36 tgccacarttrragtgcaggatrga ttg raa 101076 960RF344 -1 37580. .37690 36 gggtcraccrrraacgrcgrrrcag ata taa 101077 960RF345 -1 31556. .31666 36 ggattartctttctaaraacttcaa ttg tga 101078 960RF346 -1 29972. .30082 36 ggctacrcctatctaaaatacaat tg taa 101079 960RF347 -1 28787. .28897 36 cgccaaagtctgragcaattac ttg tga 101080 96OR348 -1 21839. .21949 36 rraaaaccgaraaaaraacarrgc cr9 tga 101081 960RF349 -1 3647..3757 36 raaaacrrccgaagttacccagcgt ttg tga WO 00/32825 PCT/I 399/02040 101082 960RF350 -2 40801..40911 36 accattccaattttgcccatatgat gtg tag 101083 960RF351 -2 38953. .39063 36 tatcttttaaaattctcgtaatagc atc taa 101084 960RF352J -2 31585..31695 36 tagctgtcatcactagtatttttga atc taa 101085 960RF353 -2 24550. 
.24660 36 atagtccgttttaccgcctcgtact att tag 101086 960RF354 -2 20083..20193 36 arcatcattttgatatttctcaaac ata tga 101087 960RF355 -2 991..1101 36 gcaccttggcagtacgacgtaaaac atc tag 101088 96ORF356 -3 38148..38258 36 taagaaagcgtgcgcgatcaaataa att tga 101089 960RF3S7 -3 8790..8900 36 tgaagttatctagcgctatttttct ttg tag 101090 960RF358 -3 4458. .4568 36 ttcataaaagtattctttgtagtat atg tag 101091 960RF359 1 4666. .4773 35 ttatcaaaatatacaacttaattaa atc tag 101092 960RF360 1 11569..11676 1 .15 ataaacttaccgaacatgaaaatga ait tga 101093 960RF361 2 6122..6229 1 35 ggaaaacaaattgatgttgtagtga ttg taa 101094 960RF362 -1 40418..40525 1 35 ttcgtaggtgcattacttctttaa ttg tag 101095 960RF363 -1 34358..34465 35 gttttgcttgatttcgatttgttga atg tga 101096 960RF364 -1 20654. .20761 35 ctatttccactgattccccatctaa atg tga 101097 960RF365 -1 8423. .8530 35 tcttttttagagtacgaggtttca att tag 101098 960RF366 -1 2402. .2509 35 tgacgtatggcaacattttagatca atc taa 101099 960RF367 -2 36607..36714 35 aaaataaaaagccagtgccgaagca ctg tag 101100 960RF368 -2 27061..27168 35 caaatcgtcctgcagcgttcaataa atc tag 101101 960RF369 -2 26470..26577 35 atgagttgttaagtrtaccccaaat atc taa 101102 960RF370 -2 10327..10434 35 ccgtgccatcttctcggtataagta ata taa 101103 960RF371 -2 8650. .8757 35 gggtacgggttgttactgttgatat atc taa 101104 960RF372 -3 14382..14489 35 gttcttttaattgatctactgttaa att taa 101105 960RF373 -3 8151..8258 35 atgtttgttagtctctgtgtagtct aig taa 101106 960RF374 -3 5007..5114 35 aaacqatttaagtggaacattattc ata taa 101107 960RF375 2 30563..30667 34 cgattagaaatctttaaaaaaggac ttg tga 101108 960RF376 -1 19916. .20020 34 tctatgtcaggtaatttgtcattaa att taa 101109 960RF37 -1 9236..9340 34 cttttctgttagtaattgtttttaa atc taa 101110 960RF378 -1 9026..9130 34 actctttatctttagttgcttttaa ata tag 101111 960RF379 -2 28447. .28551 34 cttttgtgataataaagtttagtgc ttg tga 101112 960RF380 -3 40329. .40433 34 ccatttaccttcttgagatgttgga ttg tga 101113 960RF381 -3 39801. 
.39905 34 caaaagatgaaggctttttccatac ttg taa 101114 960RF382 -3 33831..33935 34 atgttgtttgtaactcgattaagtt atc tga 101115 960RF383 -3 33687..33791 34 gttattacgtcttaatacttgtgtt gtg tag 101116 960RF384 -3 13530. .13634 34 tatacgcactagtactgatcactga ttg taa 101117 960RF385 -3 3843. .3947 34 tttgattgattgttctagttaagaa att taa 101118 960RF386 1 12256. .12357 33 agtcataaagaagttagcaatgtga ttg tag 101119 960RF387 2 2207..2308 33 tccaagactctttaactgttaactt atc tag 101120 96ORF388 2 2519..2620 33 attgttgaatttcgattgatcaaa atg_ tga 101121 960RF389 2 22517..22618 33 agaagcaaaatgcgtaatgctttag atg tag 101122 960RF390 2 27302. .27403 33 ttccaaaattgggctaatagtgtag ctg taa 101123 960RF391 2 32384. .32485 33 ___________tgagaaagctgtag atg taa 101124 960RF392 2 39287. .39388 33 aaaaacggtactgtagtatcaatca atc tag 101125 960RF393 3 18153. .18254 33 gtagtatatgccgactttgatttga atg taa 101126 960RF394 3 24189. .24290 33 tcagaccctaacattaacaaactag ttg tga 101127 960RF395 -1 15266. .15367 33 tcgataatttgtatagcttgtttta atg tag 101128 960RF396 -2 32239. .32340 33 ttttagtgaaagcatctagtgttga ata tag 101129 960RF397 -2 16123. .16224 33 ttatgtgtgcctatcatattaacaa ttg tag 101130 960RF398 -2 13648. .13749 33 tctttaactgaatgttgaaagcat ttg tag 101131 960RF399 -2 10987. .11088 33 acttctgtaggtattcttatatcaa ttg tga 101132 960RF400 -2 3382..3483 33 cttactggtaattcttcaaaattaa atg taa 101133 960RF401 -3 40794. .40895 33 ccatatgatgtgaaagtgtttaaat ttg taa 101134 960RF402 -3 39978. .40079 33 1 atattcctaaatcacttgaacctaa att tga 101135 960RF403 -3 38607. .38708 33 atcttcagtgtaaaatcgacagcca atg tag 101136 960RF404 -3 21288. 
.21389 33 cagacaccgtcttaagtccctttag ata taa

Table 11 SEQUENCE INFORMATION FOR PHAGES MATCHING WITH TABLE 1

M32695 Bacteriophage PM2 nuclease cleavage site
gi|166145|gb|M32695|BM2NCS [166145]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)

M32693 Bacteriophage PM2 Hind III fragment 4
gi|166144|gb|M32693|BM24HIND3 [166144]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)

M32693 Bacteriophage PM2 Hind III fragment 4
gi|166144|gb|M32693|BM24HIND3 [166144]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)

M32694 Bacteriophage PM2 Hind III fragment 3
gi|166143|gb|M32694|BM23HIND3 [166143]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)

M26134 Bacteriophage PM2 structural protein gene containing purine/pyrimidine rich regions and anti-Z-DNA-IgG binding regions, complete cds
gi|289360|gb|M26134|BM2PROTIV [289360]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)

J02452 bacteriophage fi 3'-terminal region rna
gi|215409|gb|J02452|PFITR [215409]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)

AF020798 Bacteriophage Chp1 genome DNA, complete sequence
gi|217761|dbj|D00624|BCP1 [217761]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 12 protein links, or 1 genome link)

X72793 Clostridium botulinum C phage BONT/C1, ANTP-139, ANTP-33, ANTP-17, ANTP-70 genes and ORF-22
gi|516171|emb|X72793|CB3CBONT [516171]
(View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 4 nucleotide neighbors)
X51464 Clostridium botulinum D Phage C3 gene for exoenzyme C3 gi|14907|emb|X51464|CBDPE3 [14907] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
D90210 Bacteriophage c-st (from C. botulinum) C1-tox gene for botulinum C1 neurotoxin gi|217780|dbj|D90210|CSTC1TOX [217780] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
S49407 type D neurotoxin [bacteriophage d-16 phi, host botulinum, type D, CB16, Genomic, 4087 nt] gi|260238|gb|S49407|S49407 [260238] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X53370 Bacteriophage phi29 temperature sensitive mutant TS2(98) DNA polymerase gene gi|15733|emb|X53370|POTS298 [15733] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X53371 Bacteriophage phi29 temperature sensitive mutant TS2(24) DNA polymerase gene gi|15731|emb|X53371|POTS224 [15731] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X05973 Bacteriophage phi29 prohead RNA gi|15680|emb|X05973|POP29PRO [15680] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
V01155 Left end of bacteriophage phi-29 coding for 15 potential proteins. Among these are the terminal protein and the proteins encoded by the genes 1, 2 (sus), 3, and (probably) 4 gi|15659|emb|V01155|POP29B [15659] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 16 protein links, or 16 nucleotide neighbors)
X73097 Bacteriophage phi-29 left origin of replication gi|312194|emb|X73097|BP29ORIL [312194] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M14430 Bacteriophage phi-29 gene-17 gene, complete cds gi|215321|gb|M14430|P29G17A [215321] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 8 nucleotide neighbors)
M14431 Bacteriophage phi-29 gene-16 gene, complete cds gi|215319|gb|M14431|P29G16A [215319] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 7 nucleotide neighbors)
M20693 Bacteriophage phi-29 DNA, 3' end gi|215343|gb|M20693|P29REPIN [215343] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M21016 Bacteriophage phi-29 DNA, 5' end gi|215342|gb|M21016|P29REPINA [215342] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M12456 Bacteriophage phi-29 genes 9, 10 and 11 encoding p9 tail, incomplete, p10 connector, complete, and p11 lower collar, incomplete, respectively gi|215333|gb|M12456|P29P9 [215333] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)
M14782 Bacillus phage phi-29 head morphogenesis, major head protein, head fiber protein, tail protein, upper collar protein, lower collar protein, pre-neck appendage protein, morphogenesis(1 lysis, morphogenesis(1 encapsidation genes, complete cds gi|215323|gb|M14782|P29LATE2 [215323] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 11 nucleotide neighbors)
M26968 Bacteriophage phi-29 (from Bacillus subtilis) proteins p1 delta-1 genes, complete cds, and the sus1(629) mutation gi|341558|gb|M26968|P29Pl1Dl A [341558] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
J02448 Bacteriophage f1, complete genome gi|166201|gb|J02448|FlICCd [166201] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 205 nucleotide neighbors,
or 1 genome link)
M24832 Bacteriophage f2 coat protein gene, partial cds gi|166228|gb|M24832|F2CRNACA [166228] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
J02451 Bacteriophage fd, strain 478, complete genome gi|215394|gb|J02451|PFDCG [215394] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 10 protein links, 204 nucleotide neighbors, or 1 genome link)
M34834 Bacteriophage fr replicase gene, 5' end gi|166139|gb|M34834|BFREGRA [166139] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
M38325 Bacteriophage fr replicase gene, 5' end gi|166137|gb|M38325|BFRREGR [166137] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
M35063 Bacteriophage fr coat protein replicase cistron region) RNA gi|166134|gb|M35063|BFPy,RCRRA, [166134] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)
S66567 alpha-atrial natriuretic factor/coat protein fusion polypeptide [human, bacteriophage fr, expression vector nFANIS5, PlasmidSyntheticRecombinant, 510 nt] gi|435742|gb|S66567|S66567 [435742] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 15 nucleotide neighbors)
X15031 Bacteriophage fr RNA genome gi|15071|emb|X15031|LEBFRX [15071] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, 9 nucleotide neighbors, or 1 genome link)
U51233 Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable region light chain (IgM) mRNA, partial cds gi|1277150|gb|U51233|MMU51233 [1277150] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 protein link, or 1669 nucleotide neighbors)
U51232 Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable region heavy chain (IgM) mRNA, partial cds gi|1277148|gb|U51232|MMU51232 [1277148] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1073 nucleotide neighbors)
U02303 Bacteriophage If1, complete genome gi|3676280|gb|U02303|B32U02303 [3676280] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, or 1 genome link)
V00604 Phage M13 genome gi|14959|emb|V00604|INM13X [14959] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 205 nucleotide neighbors)
A32252 Synthetic bacteriophage M13 protein III probe gi|1567340|emb|A32252|A32252 [1567340] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
A32251 Synthetic bacteriophage M13 protein III probe gi|1567339|emb|A32251|A32251 [1567339] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M12465 Bacteriophage M13mp10 mutations in lac operon gi|215210|gb|M12465|Ml3LACNjU [215210] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 215 nucleotide neighbors)
M24177 Synthetic Bacteriophage M13 (clone M13.SV.B12) SV40 early promoter region DNA gi|209416|gb|M24177|SYNSVB12 [209416] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24176 Synthetic Bacteriophage M13 (clone M13.SV.B11) SV40 early promoter region DNA gi|209415|gb|M24176|SYNSVB11 [209415] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24175 Synthetic Bacteriophage M13 (clone M13.SV.8) SV40 early promoter region DNA gi|208806|gb|M24175|SYNM13SV8 [208806] (View GenBank report, FASTA report, ASN.1 report, Graphical view,
1 MEDLINE link, or 242 nucleotide neighbors)
M19979 Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 gi|207813|gb|M19979|SYN33M13M [207813] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 617 nucleotide neighbors)
M19565 Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 gi|207808|gb|M19565|SYN33M13H [207808] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 567 nucleotide neighbors)
M19564 Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 gi|207807|gb|M19564|SYN33M13G [207807] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 12 nucleotide neighbors)
M19563 Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 gi|207806|gb|M19563|SYN33M13F [207806] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 262 nucleotide neighbors)
M19561 Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 gi|207804|gb|M19561|SYN33M13D [207804] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 27 nucleotide neighbors)
M19560 Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 gi|207803|gb|M19560|SYN33M13C [207803] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M19559 Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 gi|207802|gb|M19559|SYN33M13 [207802] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 227 nucleotide neighbors)
M10568 Bacteriophage M13 replicative form II, replication origin, specific nick location gi|215220|gb|M10568|M3ORpB [215220] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 650 nucleotide neighbors)
M10910 Bacteriophage M13 gene 13 regulatory region and M13sjlI mutant gi|215209|gb|M10910|M 3IREG [215209] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 72 nucleotide neighbors)
M38295 Bacteriophage M13 HactIl restriction fragment DNA i *~vmA,)rkiIWAVrT11 M (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 67 nucleotide neighbors)
E02067 DNA encoding a part of Bacteriophage M13 tg 127 gi|2170311|dbj|E02067|E02067 [2170311] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
J02467 Bacteriophage MS2, complete genome gi|215232|gb|J02467|MS2CG [215232] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE links, 4 protein links, 20 nucleotide neighbors, or 1 genome link)
A1004950 Bacteriophage P1 ban gene gi|3688226|emb|AJ011592|BPI011592 [3688226] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
U88974 Bacteriophage P1 structural lytic transglycosylase (orf47), pep44b (orf44), pep44a (orf44a), and pep43 (orf43) genes, complete cds; and pep42 (orf42) gene, partial cds gi|2661099|gb|AF035607|AF035607 [2661099] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 1 nucleotide neighbor)
AJ000741 Bacteriophage P1 darA operon gi|2462938|emb|AJ000741|BPAJ764 1 [2462938] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 31 nucleotide neighbors)
X01828 Bacteriophage P1 recombinase gene cin gi|15133|emb|X01828|MYP1CIN [15133] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X98146 Bacteriophage P1 DNA sequence around the Op88 operator gi|1359513|emb|X98146|BP1OP88OP [1359513] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, or 1 nucleotide neighbor)
S61175 immI operon: icd=cell division repressor, ant1=antirepressor {promoters P51b} [bacteriophage P1, Genomic, 728 nt] gi|385908|gb|S61175|S61175 [385908] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
X87824 Bacteriophage P1 gene 26 gi|861164|emb|X87824|BP1G26 [861164] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
X15638 Phage P1 DNA for lytic replicon containing promoter P53 and two open reading frames gi|15735|emb|X15638|PP1LRP [15735] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 24 nucleotide neighbors)
X17512 Bacteriophage P1 DNA for immunity region immI gi|15479|emb|X17512|P1IMMY [15479] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
X16005 Bacteriophage P1 c1 gene for P1 c1 repressor protein gi|15477|emb|X16005|P1CI [15477] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X03453 Bacteriophage P1 cre gene for recombinase protein gi|15135|emb|X03453|MYP1CRE [15135] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)
X06561 Bacteriophage P1 c1 gene gi|15128|emb|X06561|MYP1CI [15128] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 6 nucleotide neighbors)
V01534 Bacteriophage P1 genome fragment (IS2 insertion spot). This region contains four unidentified reading frames and is known as insertion hot spot for IS2 insertion sequences gi|15118|emb|V01534|MYOVP1 [15118] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)
X56951 Bacteriophage P1 gene gi|406728|emb|X56951|BPP1GP10 [406728] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
K02380 Bacteriophage P1 replication region including repA, parA, and parB genes and incA, incB, and incC incompatibility determinants gi|215652|gb|K02380|PP1RP [215652] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 4 protein links, or 8 nucleotide neighbors)
X87674 Bacteriophage P1 lydA lydB genes gi|974763|emb|X87674|BACP1LYD [974763] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
X87673 Bacteriophage P1 gene 17 gi|974761|emb|X87673|BACP1 117 [974761] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M16618 Bacteriophage P1 c1 repressor binding sites gi|215600|gb|M16618|PP1CI [215600] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
SEGPP1CIN Bacteriophage P1 cin gene encoding recombinase, cixi recombination site, and end of C invertible element gi|215607|gb|SEGPP1CIN [215607] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
K03173 Bacteriophage P1 C invertible element, right end, and cixR recombination site gi|215606|gb|K03173|PP1CIN2 [215606] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
215605 Bacteriophage P1 cin gene encoding recombinase, cixL recombination site, and 5' end of C invertible element gi|215605|lcl|X01828 [215605] (View GenBank report, FASTA report, ASN.
1 report, or Graphical view)
M25470 Bacteriophage P1 tail fiber protein gene, complete cds gi|341349|gb|M25470|PP1TFPR [341349] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
M34382 Bacteriophage P1 sim region proteins, complete cds gi|215661|gb|M34382|PPSLM [215661] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M81956 Bacteriophage P1 R protein gene, complete cds gi|215658|gb|M81956|PP1R? [215658] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 4 nucleotide neighbors)
M37080 Bacteriophage P1 mini-P1 plasmid origin of replication gi|215657|gb|M37080|PP1REPOR [215657] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 46 nucleotide neighbors)
M27041 Bacteriophage P1 ref gene, complete cds gi|215650|gb|M27041|p~ P E [215650] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
L01408 Bacteriophage P1 partition protein (parB) gene, 3' end
gi|215642|gb|L01408|PP1PARB [215642] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 41 nucleotide neighbors)
SEGPP1PAR Bacteriophage miniplasmid P1 parA gene, 5' end gi|215639|gb||SEGPP1PAR [215639] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 48 nucleotide neighbors)
M36425 Bacteriophage miniplasmid P1 parB gene, 3' end gi|215638|gb|M36425|PP1PAR2 [215638] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M36424 Bacteriophage miniplasmid P1 parA gene, 5' end gi|215637|gb|M36424|PP1PAR1 [215637] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11129 Bacteriophage P1 miniplasmid origin of replication region gi|215632|gb|M11129|PP1ORIM [215632] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 43 nucleotide neighbors)
M25414 Bacteriophage P1 c1 repressor binding site, operator 88 (Op88) gi|215631|gb|M25414|PP1OP88A [215631] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M25413 Bacteriophage P1 c1 repressor binding site, operator 68 (Op68) gi|215630|gb|M25413|PP1OP68A [215630] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25412 Bacteriophage P1 c1 repressor binding site, operator 21 (Op21) gi|215629|gb|M25412|PP1OP21A [215629] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10510 Bacteriophage P1 recombination site loxR gi|215628|gb|M10510|PP1LOXR [215628] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10287 Bacteriophage P1 loxP x loxP recombination site gi|215627|gb|M10287|PP1LOXPX [215627] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
M10494 Bacteriophage P1 recombination site loxP gi|215626|gb|M10494|PP1LOXP [215626] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 134 nucleotide neighbors)
M10511 Bacteriophage P1 recombination site loxL gi|215625|gb|M10511|PP1LOXL [215625] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10512 Bacteriophage P1 recombination site loxB gi|215624|gb|M10512|PP1LOXB [215624] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M10145 Bacteriophage P1 genome fragment with recombination site loxP gi|215623|gb|M10145|PP1CREX [215623] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 21 nucleotide neighbors)
M13327 Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI326 gi|215622|gb|M13327|PP1CN26IV [215622] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13325 Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI326 gi|215621|gb|M13325|PP1CN26II [215621] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1401 nucleotide neighbors)
M13323 Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI325 gi|215620|gb|M13323|PP1CN25IV [215620] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13321 Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI325 gi|215619|gb|M13321|PP1CN25II [215619] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 1058 nucleotide neighbors)
M13324 Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI326 gi|215618|gb|M13324|PP1CIN26I [215618] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13319 Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI327 gi|215617|gb|M13319|PP1CIN27R [215617] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13320 Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI325 gi|215616|gb|M13320|PP1CIN25I [215616] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13318 Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI324 gi|215615|gb|M13318|PP1CIN24L [215615] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1370 nucleotide neighbors)
M13317 Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI323 gi|215614|gb|M13317|PP1CIN23R [215614] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1055 nucleotide neighbors)
M13316 Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI323 gi|215613|gb|M13316|PP1CIN23L [215613] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13315 Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI322 gi|215612|gb|M13315|PP1CN22R [215612] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1
MEDLINE link, or nucleotide neighbors)
M13314 Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI322 gi|215611|gb|M13314|PP1CN22L [215611] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1401 nucleotide neighbors)
M13313 Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI321 gi|215610|gb|M13313|PP1CIN21R [215610] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13312 Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI321 gi|215609|gb|M13312|PP1CIN21L [215609] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1058 nucleotide neighbors)
M16568 Bacteriophage P1 c4 repressor gene, complete cds gi|215603|gb|M16568|PP1C4 [215603] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M13326 Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI326 gi|215602|gb|M13326|PP1C26III [215602] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1192 nucleotide neighbors)
M13322 Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI325 gi|215601|gb|M13322|PP1C25III [215601] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1231 nucleotide neighbors)
J05651 Bacteriophage P1 modulator protein (bof) gene, complete cds gi|215598|gb|J05651|PP1BOFY1 [215598] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M33224 Bacteriophage P1 regulatory protein (bof) gene, complete cds gi|215596|gb|M33224|PP1BOFFO [215596] (View GenBank report, FASTA report, ASN.1 report, Graphical view,
1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M10288 E.coli/bacteriophage P1 loxR recombination site gi|146647|gb|M10288|ECOLOXR [146647] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M10289 E.coli/bacteriophage P1 loxL recombination site gi|146646|gb|M10289|ECOLOXL [146646] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10290 E.coli loxB site, which can recombine with bacteriophage P1 loxP site gi|146645|gb|M10290|ECOLOXB [146645] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10287 Bacteriophage P1 loxP x loxP recombination site gi|215627|gb|M10287|PP1LOXPX [215627] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
M74046 Bacteriophage P1 pacA and pacB genes, complete cds gi|215634|gb|M74046|PP1PACAB [215634] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M95666 Bacteriophage P1 gene 10, doc and phd genes, complete cds gi|463276|gb|M95666|PP1PHDDOC [463276] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 1 nucleotide neighbor)
M25604 Bacteriophage Q-beta mutated autonomously replicating sequence MDV-1 RNA fragment gi|556359|gb|M25604|PQBARSMUT [556359] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
V00643 first half of the phage Q-beta gene for coat protein gi|15088|emb|V00643|LEQBET [15088] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M25167 Bacteriophage Q-beta RNA fragment recovered from replicase binding complex gi|556362|gb|M25167|PQBREPLICB [556362] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24876 Bacteriophage Q-beta replicase RNA, 5' end gi|556360|gb|M24876|PQBREPLICA [556360] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M25444 Synthetic bacteriophage Q-beta DNA gi|209118|gb|M25444|SYNPQBmpjR{ [209118] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M25463 Bacteriophage Q-beta self-replicating microvariant (+) RNA gi|532489|gb|M25463|PQBMVSRRNA [532489] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25014 Bacteriophage Q-beta RNA replicase gene, 5' end, and maturation protein gene, 3' end gi|294316|gb|M25014|PQBRPpLC [294316] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M25065 Bacteriophage Q-beta RNA sequence with putative stem loop (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M10265 Bacteriophage Q-beta RNA molecule with the ability to replicate extracellularly gi|215726|gb|M10265|PQBRNA [215726] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M24815 Bacteriophage Q-beta specified replicase subunit RNA, gi|215725|gb|M24815|QBREPL [215725] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M25461 Bacteriophage Q-beta plus-strand RNA, 5' terminus gi|215724|gb|M25461|PQBPS5E [215724] (View GenBank report, FASTA report, ASN.
1 report, or Graphical view)
M25462 Bacteriophage Q-beta plus-strand RNA, 3' terminus gi|215723|gb|M25462|PQBPS3E [215723] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 8 nucleotide neighbors)
M24871 Bacteriophage Q-beta nanovariant WSIII RNA gi|215722|gb|M24871|PQBNVWSIC [215722] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24870 Bacteriophage Q-beta nanovariant WSII RNA gi|215721|gb|M24870|PQBNVWSIB [215721] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24869 Bacteriophage Q-beta nanovariant WSI RNA gi|215720|gb|M24869|PQBNVWSIA [215720] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10495 Coliphage Q-beta MDV-1 RNA gi|215719|gb|M10495|PQBMDV1A [215719] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
J02484 bacteriophage qbeta coat protein cistron first half gi|215717|gb|J02484|PQBCP5 [215717] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M57754 Bacteriophage Q-beta minus strand RNA, 5' terminus gi|215716|gb|M57754|PQBMS5E [215716] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 8 nucleotide neighbors)
M24297 Bacteriophage Q-beta 5'-terminal region of the minus strand (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M10695 Bacteriophage Q-beta, MDV-1 RNA gi|215714|gb|M10695|PQB IIR [215714] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 12 nucleotide neighbors)
M24827 Bacteriophage R17 A protein gene, Yend gi|216078|gb|M24827|Rl7RNAClS [216078] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M24829 Bacteriophage R17 coat protein gene, gi|216075|gb|M24829|R17CP5 [216075] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
J02488 bacteriophage r17 rna synthetase initiation site gi|216080|gb|J02488|R17RNASYN [216080] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 6 nucleotide neighbors)
J02487 bacteriophage r17 coat protein initiation site gi|216073|gb|J02487|R17COATP [216073] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
J02486 bacteriophage r17 a protein initiation site gi|216071|gb|J02486|R17APROT [216071] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M24826 Bacteriophage R17 coat protein RNA fragment gi|216077|gb|M24826|R17CPRAA [216077] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M24296 Bacteriophage R17 3'-terminal fragment A RNA gi|216070|gb|M24296|R173TFA [216070] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 9 nucleotide neighbors)
1TFN structure refinement for a 24-nucleotide rna hairpin, nmr, minimized average structure ribonucleic acid, hairpin, bacteriophage r17 mol id: 1; molecule: r17c; chain: null; engineered: yes gi|1942336|pdb|1TFN| [1942336] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 structure link)
1RHT rna (5'-r(gpgpgpapcpupgpapcpgpapupcpapcpgpcpapgpupcpupapu)-3') (24-mer rna hairpin coat protein binding site for bacteriophage r17) (nmr, minimized average structure) gi|1421020|pdb|1RHT| [1421020] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 structure link)
M14428 Bacteriophage S13 circular DNA, complete genome gi|216089|gb|M14428|S13CG [216089] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 12 protein links, 26 nucleotide neighbors,
or 1 genome link)
J05393 Bacteriophage T1 DNA N-6-adenine-methyltransferase gene, complete cds gi|166163|gb|J05393|BT1NAMTA [166163] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
L46845 Bacteriophage T2 H3d, frd.2 genes, complete cds gi|951387|gb|L46845|PT2FRD32G [951387] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 17 nucleotide neighbors)
L43611 Bacteriophage T2 fibritin (wac) gene, complete cds gi|903869|gb|L43611|PT2WAC [903869] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
M24812 Bacteriophage T2 secondary structure RNA sequence gi|215796|gb|M24812|ITRNA [215796] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M22342 Bacteriophage T2 DNA-(adenine-N6)methyltransferase (dam) gene, complete cds gi|215792|gb|M22342|PTDAM [215792] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
S57515 orf 61.2 {intergenic region between 41 and 61} [bacteriophage T2, Genomic, 323 nt] gi|298524|gb|S57515|S57515 [298524] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X05312 Bacteriophage T2 gene 38 for receptor recognizing protein gi|15197|emb|X05312|MYT2G38 [15197] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X04442 Bacteriophage T2 gene 37 for receptor recognizing protein gi|15195|emb|X04442|MYT2G37 [15195] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X12460 Bacteriophage T2 gene 32 mRNA for single-stranded DNA binding protein gi|15192|emb|X12460|MYT2G32 [15192] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 14 nucleotide neighbors)
X57797 Bacteriophage T2 gene for gp12 gi|14875|emb|X56555|BT2GP12 [14875] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 2 nucleotide neighbors)
X01755 Bacteriophage T2 tail fiber gene 36 gi|15189|emb|X01755|MYT2F36 [15189] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
M14784 Bacteriophage T3 strain amNG220B right end, tail fiber protein, lysis protein and DNA packaging proteins, complete cds gi|215810|gb|M14784|PT3RE [215810] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 10 nucleotide neighbors)
SEGPT3RNAPOL Bacteriophage T3 RNA polymerase III gene, 5' end gi|710559|gb|SEGPT3RNAPOL [710559] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M22610 Bacteriophage T3 RNA polymerase III gene, 3' end gi|340722|gb|M22610|PT3RNAPOL2 [340722] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M22609 Bacteriophage T3 RNA polymerase III gene, 5' end gi|340721|gb|M22609|PT3RNAPOL1 [340721] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
X05031 Bacteriophage T3 gene region 1-2.5 with primary origin of replication gi|15719|emb|X05031|POT3ORI [15719] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 5 nucleotide neighbors)
X03964 Bacteriophage T3 early control region pos. 308-810 from genome left end gi|15718|emb|X03964|POT3EP [15718] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 2 MEDLINE links, or 20 nucleotide neighbors)
X17255 Bacteriophage T3 gene 1 to gene 11 gi|15682|emb|X17255|POT3111G [15682] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE links, 36 protein links, 17 nucleotide neighbors, or 1 genome link)
X15840 Phage T3 gene gi|15625|emb|X15840|PODT3G10 [15625] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
X02981 Bacteriophage T3 gene 1 for RNA polymerase gi|15561|emb|X02981|PODOT3P [15561] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
J02503 bacteriophage T3 5' end, terminally redundant sequence (trs) gi|215816|gb|J02503|PTTRS [215816] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
SEGPT3TRS bacteriophage T3 5' end, terminally redundant sequence (trs) gi|215818|gb||SEGPT3TRS [215818] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
J02504 bacteriophage T3 3' end, terminally redundant sequence (trs) gi|215817|gb|J02504|PT3TRS2 [215817] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
http://www.rs.noda.sut.ac.jp/-kunisawa Bacteriophage T4 genomic database compiled by Arisaka et al.
X95646 Bacteriophage T5 DNA for region 60.5%-71 of the T5 genome gi|2791557|emb|AJ001191|BTJ001191 [2791557] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 7 MEDLINE links, 12 protein links, or 6 nucleotide neighbors)
X56847 Bacteriophage T5 genomic region encoding early genes D gi|15407|emb|X12930|MYT5D10 [15407] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 5 protein links, or 4 nucleotide neighbors)
AF039886 Bacteriophage T5 subclone T5.5.3r5.18r, single pass sequence, genomic survey sequence gi|2811154|gb|AF039886|AF039886 [2811154] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039885 Bacteriophage T5 subclone T5.40f,41f, single pass sequence, genomic survey sequence gi|2811153|gb|AF039885|AF039885 [2811153] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039884 Bacteriophage T5 subclone T5.26.fr, single pass sequence, genomic survey sequence gi|2811152|gb|AF039884|AF039884 [2811152] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039883 Bacteriophage T5 subclone 10-T5.5.7F, single pass sequence, genomic survey sequence gi|2811151|gb|AF039883|AF039883 [2811151] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039882 Bacteriophage T5 subclone 41-T5.5.4BF, single pass sequence, genomic survey sequence gi|2811150|gb|AF039882|AF039882 [2811150] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039881 Bacteriophage T5 subclone 39-T5.5.4aF, single pass sequence, genomic survey sequence gi|2811149|gb|AF039881|AF039881 [2811149] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
AF039880 Bacteriophage T5 subclone 19-T5.7.2r, single pass sequence, genomic survey sequence gi|2811148|gb|AF039880|AF039880 [2811148] (View GenBank report, FASTA report, ASN.
1 report, or Graphical view)
AF039879 Bacteriophage T5 subclone 18-T5.7.2F, single pass sequence, genomic survey sequence gi|2811147|gb|AF039879|AF039879 [2811147] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039878 Bacteriophage T5 subclone 11-T5.5.7R, single pass sequence, genomic survey sequence gi|2811146|gb|AF039878|AF039878 [2811146] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 nucleotide neighbors)
AF039877 Bacteriophage T5 subclone T5.4FR, single pass sequence, genomic survey sequence gi|2811145|gb|AF039877|AF039877 [2811145] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039876 Bacteriophage T5 subclone 22-T5.16R, single pass sequence, genomic survey sequence gi|2811144|gb|AF039876|AF039876 [2811144] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039875 Bacteriophage T5 subclone 21-T5.16R, single pass sequence, genomic survey sequence gi|2811143|gb|AF039875|AF039875 [2811143] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039874 Bacteriophage T5 subclone 21-T5.16F, single pass sequence, genomic survey sequence gi|2811142|gb|AF039874|AF039874 [2811142] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039873 Bacteriophage T5 subclone 09-T5.6F, single pass sequence, genomic survey sequence gi|2811141|gb|AF039873|AF039873 [2811141] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039872 Bacteriophage T5 subclone 09-T5.6R, single pass sequence, genomic survey sequence gi|2811140|gb|AF039872|AF039872 [2811140] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 nucleotide neighbors)
AF039871 Bacteriophage T5 subclone 04-T5.26.R, single pass sequence, genomic survey sequence gi|2811139|gb|AF039871|AF039871 [2811139] (View GenBank report, FASTA report, ASN.
1 report, or Graphical view)
AF039870 Bacteriophage T5 subclone 13-T5.42F, single pass sequence, genomic survey sequence gi|2811138|gb|AF039870|AF039870 [2811138] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
X69460 Bacteriophage T5 ltf gene for L-shaped tail fibers gi|15415|emb|X69460|MYT5LTF [15415] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 1 protein link, or 4 nucleotide neighbors)
X03402 Bacteriophage T5 D15 gene for 5' exonuclease gi|15413|emb|X03402|MYT5EXOG [15413] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
Z11972 Bacteriophage T5 tRNA-Tyr, tRNA-Glu, tRNA-Trp, tRNA-Phe, tRNA-Cys and tRNA-Asn genes, and ORFs 91aa, 90aa, 42aa and 172aa gi|15795|emb|Z11972|T56TRNAG [15795] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)
X03898 Bacteriophage T5 genes for tRNA-His, -Ser and -Leu gi|15786|emb|X03898|STT5RN1 [15786] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 MEDLINE links)
X04177 Bacteriophage T5 gene for transfer RNA-Gln gi|15421|emb|X04177|MYT5TRNQ [15421] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
X03899 Bacteriophage T5 genes for tRNA-Val, -Lys, -fMet, -Pro and -Ile3 gi|15787|emb|X03899|STT5RN2 [15787] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X03798 Bacteriophage T5 gene for tRNA-Asp (GUC) gi|15472|emb|X03798|NCTTRDG [15472] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
Y00364 Bacteriophage T5 tRNA gene cluster (27.8%-22.4%) gi|15420|emb|Y00364|MYT5TRN [15420] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
X03140 Bacteriophage T5 DNA with rho-dependent transcription terminator (Hind III-P fragment) gi|15417|emb|X03140|MYT5RHO [15417] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
Z35070 Bacteriophage T6 DNA gi|535228|emb|Z35070|MER'EGBQT6 [535228] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF060870 Coliphage T6 small subunit distal tail fiber (gene 36) gene, partial cds; and large subunit distal tail fiber (gene 37) and tail fiber adhesin (gene 38) genes, complete cds gi|3676458|gb|AF052605|AF052605 [3676458] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 protein links, or 2 nucleotide neighbors)
Z35072 Bacteriophage T6 DNA encoding ORF19.1 gene and g19 gene gi|535232|emb|Z35072|MYTAILT6 [535232] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X12488 Bacteriophage T6 gene 32 mRNA for single-stranded DNA binding protein gi|15843|emb|X12488|MYT6G32 [15843] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 14 nucleotide neighbors)
Z78095 Bacteriophage T6 DNA (1506 bp) gi|1488562|emb|Z78095|BPHZ78095 [1488562] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
Z35079 Bacteriophage T6 DNA for Ip5, Ip6 gi|535215|emb|Z35079|MfYS7BT6 [535215] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
X68725 E.coli bacteriophage T6 gene for beta-glucosyl-HMC-alpha-glucosyl-transferase gi|296439|emb|X68725|FECT6 [296439] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
X69894 Bacteriophage T6 alt gene for ADP-ribosyltransferase gi|15422|emb|X69894|MYT6ADP [15422] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
L46846 Bacteriophage T6 frd.3, frd.2 genes, complete cds gi|951390|gb|L46846|PT6FRD32G [951390] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
M27738 Bacteriophage T6 translational repressor protein (regA), complete cds gi|215993|gb|M27738|PT6REGA [215993] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 5 nucleotide neighbors)
M38465 Bacteriophage T6 DNA ligase gene, complete cds gi|215991|gb|M38465|PT6LIGSS [215991] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
V01146 Genome of bacteriophage T7 gi|431187|emb|V01146|T7CG [431187] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 13 MEDLINE links, 60 protein links, 105 nucleotide neighbors, or 1 genome link)
X60322 Bacteriophage alpha3 genes A, B, K, C, D, E, J, F, G, H gi|14775|emb|X60322|BACALPHA [14775] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 22 nucleotide neighbors, or 1 genome link)
X13332 Bacteriophage alpha3 DNA for origin of replication gi|15093|emb|X13332|MIA3ORPL [15093] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X12611 Bacteriophage alpha3 gene for protein A part., finger domain gi|15092|emb|X12611|MIA3AFIN [15092] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 6 nucleotide neighbors)
X15721 Bacteriophage alpha3 deletion mutation DNA for the origin region (-ori) of replication gi|14774|emb|X15721|BA3DMOR9 [14774] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15720 Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication gi|14773|emb|X15720|BA3DMOR8 [14773] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
X15719 Bacteriophage alpha3 insertion mutant DNA for the origin region (-ori) of replication gi|14772|emb|X15719|BA3DMOR7 [14772] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
X15718 Bacteriophage alpha3 deletion mutation DNA for origin region (-ori) of replication gi|14771|emb|X15718|BA3DMOR6 [14771] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15717 Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication gi|14770|emb|X15717|BA3DMOR5 [14770] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 9 nucleotide neighbors)
X15716 Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication gi|14769|emb|X15716|BA3DMOR4 [14769] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
X15715 Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication gi|14768|emb|X15715|BA3DMOR3 [14768] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15714 Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication gi|14767|emb|X15714|BA3DMOR2 [14767] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15713 Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication gi|14766|emb|X15713|BA3DMOR1 [14766] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X62059 Bacteriophage alpha3 origin of cDNA synthesis (oriGA) gi|14763|emb|X62059|AL3ORIGA [14763] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
X62058 Bacteriophage alpha3 origin of cDNA synthesis (oriAA) gi|14762|emb|X62058|AL3ORIAA [14762] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
J02444 Bacteriophage alpha3 origin of DNA replication gi|166103|gb|J02444|AL3ORI [166103] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)
M25640 Bacteriophage alpha-3 H protein gene, complete cds gi|166101|gb|M25640|AL3H [166101] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 13 nucleotide neighbors)
M10631 Bacteriophage alpha-3 cleavage site for phage phi-X174 gene A protein gi|166099|gb|M10631|AL3CSA [166099] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X00774 Bacteriophage alpha-3 gene J sequence gi|15431|emb|X00774|NCBAJJ [15431] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)
M25640 Bacteriophage alpha-3 H protein gene, complete cds gi|166101|gb|M25640|AL3H [166101] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 13 nucleotide neighbors)
M10631 Bacteriophage alpha-3 cleavage site for phage phi-X174 gene A protein (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
J02459 Bacteriophage lambda, complete genome gi|215104|gb|J02459|LAMCG [215104] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 87 MEDLINE links, 67 protein links, 190 nucleotide neighbors, or 1 genome link)
J02482 Bacteriophage phi-X174, complete genome gi|216019|gb|J02482|PX1CG [216019] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 23 MEDLINE links, 11 protein links, 26 nucleotide neighbors, or 1 genome link)
J02454 Bacteriophage G4, complete genome gi|215415|gb|J02454|PG4CG [215415] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 11 protein links, 20 nucleotide neighbors,
or 1 genome link)
X60323 Bacteriophage phiK complete genome gi|1478118|emb|X60323|BPHIKCG [1478118] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, 18 nucleotide neighbors, or 1 genome link)
L42820 Bacteriophage BF23 tail protein (hrs) gene, complete cds gi|1048680|gb|L42820|BBFHRS [1048680] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X54455 Bacteriophage BF23 gene 17 and gene 18 gi|14797|emb|X54455|BF231718G [14797] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
M37097 Bacteriophage BF23 DNA, right end of terminal repetition gi|166115|gb|M37097|BBFRIGH [166115] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M37096 Bacteriophage BF23 DNA, left end of terminal repetition gi|166114|gb|M37096|BBFLEFT [166114] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M37095 Bacteriophage BF23 A.2-A2 gene, complete cds, and A1 gene, 5' end gi|166110|gb|M37095|BBFA1A3 [166110] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
AF056281 Bacteriophage BF23 clone bf23.mac5l6.1, genomic survey sequence gi|3090930|gb|AF056281|AF056281 [3090930] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056280 Bacteriophage BF23 clone bf23.mac3, genomic survey sequence gi|3090929|gb|AF056280|AF056280 [3090929] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056279 Bacteriophage BF23 clone bf23.macl18/21.34, genomic survey sequence gi|3090928|gb|AF056279|AF056279 [3090928] (View GenBank report, FASTA report, ASN.
1 report, or Graphical view)
AF056278 Bacteriophage BF23 clone bf23.macl16/19.33, genomic survey sequence gi|3090927|gb|AF056278|AF056278 [3090927] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056277 Bacteriophage BF23 clone bf23.mac 16/19-33, genomic survey sequence gi|3090926|gb|AF056277|AF056277 [3090926] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056276 Bacteriophage BF23 clone bf23.macl2Ig-9, genomic survey sequence gi|3090925|gb|AF056276|AF056276 [3090925] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056275 Bacteriophage BF23 clone bf23.nucl 1/14-24, genomic survey sequence gi|3090924|gb|AF056275|AF056275 [3090924] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056274 Bacteriophage BF23 clone bf23.57r64r, genomic survey sequence gi|3090923|gb|AF056274|AF056274 [3090923] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 3 nucleotide neighbors)
AF056273 Bacteriophage BF23 clone bf23.54fr, genomic survey sequence gi|3090922|gb|AF056273|AF056273 [3090922] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056272 Bacteriophage BF23 clone bf23.47fr.:naclo7, genomic survey sequence gi|3090921|gb|AF056272|AF056272 [3090921] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056271 Bacteriophage BF23 clone bf23.23.66r, genomic survey sequence gi|3090920|gb|AF056271|AF056271 [3090920] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056270 Bacteriophage BF23 clone bf23.23.64f, genomic survey sequence gi|3090919|gb|AF056270|AF056270 [3090919] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056269 Bacteriophage BF23 clone bf23.23.60r, genomic survey sequence gi|3090918|gb|AF056269|AF056269 [3090918] (View GenBank report, FASTA report, ASN.
1 report, or Graphical view)
AF056268 Bacteriophage BF23 clone bf23.23.60f, genomic survey sequence gi|3090917|gb|AF056268|AF056268 [3090917] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
AF056267 Bacteriophage BF23 clone bf23.23.59r, genomic survey sequence gi|3090916|gb|AF056267|AF056267 [3090916] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056266 Bacteriophage BF23 clone bf23.23.59f, genomic survey sequence gi|3090915|gb|AF056266|AF056266 [3090915] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056265 Bacteriophage BF23 clone bf23.23.56r, genomic survey sequence gi|3090914|gb|AF056265|AF056265 [3090914] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056264 Bacteriophage BF23 clone bf23.23.56f, genomic survey sequence gi|3090913|gb|AF056264|AF056264 [3090913] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056263 Bacteriophage BF23 clone bf23.23.68f35r, genomic survey sequence gi|3090912|gb|AF056263|AF056263 [3090912] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056262 Bacteriophage BF23 clone bf23.23.43fr.66f, genomic survey sequence gi|3090911|gb|AF056262|AF056262 [3090911] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056261 Bacteriophage BF23 clone bf23.23.2fr, genomic survey sequence gi|3090910|gb|AF056261|AF056261 [3090910] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056260 Bacteriophage BF23 clone bf23.23.55.f, genomic survey sequence gi|3090909|gb|AF056260|AF056260 [3090909] (View GenBank report, FASTA report, ASN.
1 report, or Graphical view)
AF056259 Bacteriophage BF23 clone bf23.23.53.r, genomic survey sequence gi|3090908|gb|AF056259|AF056259 [3090908] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056258 Bacteriophage BF23 clone bf23.23.53.f, genomic survey sequence gi|3090907|gb|AF056258|AF056258 [3090907] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056257 Bacteriophage BF23 clone bf23.23.52.r, genomic survey sequence gi|3090906|gb|AF056257|AF056257 [3090906] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056256 Bacteriophage BF23 clone bf23.23.52.f, genomic survey sequence gi|3090905|gb|AF056256|AF056256 [3090905] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056255 Bacteriophage BF23 clone bf23.23.49.r, genomic survey sequence gi|3090904|gb|AF056255|AF056255 [3090904] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056254 Bacteriophage BF23 clone bf23.23.49.f, genomic survey sequence gi|3090903|gb|AF056254|AF056254 [3090903] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056253 Bacteriophage BF23 clone bf23.23.48.r, genomic survey sequence gi|3090902|gb|AF056253|AF056253 [3090902] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056252 Bacteriophage BF23 clone bf23.23.48.f, genomic survey sequence gi|3090901|gb|AF056252|AF056252 [3090901] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056251 Bacteriophage BF23 clone bf23.23.44.r, genomic survey sequence gi|3090900|gb|AF056251|AF056251 [3090900] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056250 Bacteriophage BF23 clone bf23.23.41.f, genomic survey sequence gi|3090899|gb|AF056250|AF056250 [3090899] (View GenBank report, FASTA report, ASN.
1 report, or Graphical view)
AF056249 Bacteriophage BF23 clone bf23.23.22.a.r, genomic survey sequence gi|3090898|gb|AF056249|AF056249 [3090898] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056248 Bacteriophage BF23 clone bf23.23.22.a.f, genomic survey sequence gi|3090897|gb|AF056248|AF056248 [3090897] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056247 Bacteriophage BF23 clone bf23.23.68.r, genomic survey sequence gi|3090896|gb|AF056247|AF056247 [3090896] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
Z50114 Bacteriophage BF23 DNA for putative tail protein gene gi|2464952|emb|Z50114|BF23LATE [2464952] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
D12824 Bacteriophage BF23 genes for minor tail protein gp24 and major tail protein gp25, complete cds gi|520578|dbj|D12824|BFTl [520578] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
Z34953 Bacteriophage K3 ip9, ip7 and ip8 genes gi|535261|emb|Z34953|N{'YJ~l978 [535261] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
Z35075 Bacteriophage K3 DNA for Ip3 and Ip4 gi|535229|emb|Z35075|MYEOpRp64K [535229] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X05560 Bacteriophage K3 gene 38 for receptor recognizing protein gi|15112|emb|X05560|MYK3G38 [15112] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X04747 Bacteriophage K3 gene 37 for receptor recognizing protein gi|15110|emb|X04747|MYK3G37 [15110] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
X01754 Bacteriophage K3 tail fiber gene 36 gi|15108|emb|X01754|MYK3F36 [15108] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M16812 Bacteriophage K3 t lysis gene, complete cds gi|215503|gb|M16812|PK3LYST [215503] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
L46833 Bacteriophage K3 frd.3, frd.2 genes, complete cds gi|951377|gb|L46833|PK3FRD32G [951377] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
L43613 Bacteriophage K3 fibritin (wac) gene, complete cds gi|903861|gb|L43613|PK3WAC [903861] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
X01753 Bacteriophage Ox2 tail fiber gene 36 gi|15122|emb|X01753|MYOX2F36 [15122] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
L43612 Bacteriophage Ox2 fibritin (wac) gene, complete cds gi|903848|gb|L43612|OX2WAC [903848] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
Z46880 Bacteriophage Ox2 stp gene gi|599663|emb|Z46880|BPOX2STP [599663] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
X05675 Bacteriophage Ox2 gene 38 for receptor-recognizing protein and flanking regions gi|15124|emb|X05675|MYOXG38 [15124] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
M33533 Bacteriophage RB18 translational repressor protein (regA) and Orf43.1, complete cds gi|216083|gb|M33533|RB18REGA [216083] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
AF033329 Bacteriophage RB18 single-stranded binding protein (gene 32) gene, partial cds, and 5' region gi|2645788|gb|AF033329|AF033329 [2645788] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 11 nucleotide neighbors)
M86231 Bacteriophage RB69 gene 62, 3' end; RegA (regA) gene, complete cds gi|215354|gb|M86231|P6962REGA [215354] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
AF033332 Bacteriophage RB69 single-stranded binding protein (gene 32) gene, partial cds, and 5' region gi|2645794|gb|AF033332|AF033332 [2645794] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 12 nucleotide neighbors)
U34036 Bacteriophage RB69 DNA polymerase (43) gene, complete cds gi|1237125|gb|U34036|BRU34036 [1237125] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
V01145 Bacteriophage H1 genome fragment. Each thymine given in this sequence represents a HMU-residue (HMU = 5-hydroxymethyluracil) gi|15557|emb|V01145|PODOH1 [15557] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, or 1 MEDLINE link)
X05676 Bacteriophage M1 gene 38 for receptor recognizing protein and flanking regions gi|15114|emb|X05676|MYM1G38 [15114] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
AF034575 Bacteriophage M1 putative integrase (int) gene, complete cds, and attP region, complete sequence gi|2662472|gb|AF034575|AF034575 [2662472] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF033321 Bacteriophage M1 single-stranded binding protein (gene 32) gene, partial cds, and 5' region gi|2645772|gb|AF033321|AF033321 [2645772] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 17 nucleotide neighbors)
X55190 Bacteriophage TuIa 37 and 38 genes for receptor-recognizing proteins 37 and 38 (respectively), partial cds gi|14860|emb|X55190|BPTUIA [14860] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
AF033334 Bacteriophage TuIb single-stranded binding protein (gene 32) gene, partial cds, and 5' region gi|2645798|gb|AF033334|AF033334 [2645798] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 5 nucleotide neighbors)
X55191 Bacteriophage TuIb 37 gene for receptor-recognizing protein 37 (partial cds), 38 gene for receptor-recognizing protein 38, and t gene (partial cds) gi|14863|emb|X55191|BPTUIB [14863] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
X13065 Bacteriophage phi80 early region gi|14800|emb|X13065|BP80ER [14800] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 8 protein links, or 6 nucleotide neighbors)
D00360 Bacteriophage phi80 cor gene gi|217782|dbj|D00360|PB080COR [217782] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
X01639 Bacteriophage phi 80 DNA-fragment with replication origin gi|15828|emb|X01639|XYHI8 [15828] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 25 nucleotide neighbors)
X04051 Lambdoid bacteriophage phi 80 int-xis region (integrase-excisionase region) gi|15770|emb|X04051|STPHI80X [15770] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
X06751 Phage Phi80 DNA for major coat protein gi|15768|emb|X06751|STPHI80C [15768] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 11 nucleotide neighbors)
X75949 Bacteriophage phi80 DNA for ORF x171.8 and ORF x171.28' gi|458811|emb|X75949|ECORF171B [458811] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 28 nucleotide neighbors)
L40418 Bacteriophage phi-80 gene, complete cds gi|1019107|gb|L40418|P80A [1019107] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
M24831 Bacteriophage phi-80 Tyr-tRNA gene, 3' end gi|215363|gb|M24831|P80TGY [215363] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 43 nucleotide neighbors)
M10670 Bacteriophage phi-80 replication origin gi|215361|gb|M10670|P80OOP [215361] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M24825 Bacteriophage phi-80 RNA fragment gi|215360|gb|M24825|P80M3A [215360] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M11919 Bacteriophage phi-80 cI immunity region encoding the N gene gi|215358|gb|M11919|P80CI [215358] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
M10891 Bacteriophage phi-80 attP site DNA gi|215357|gb|M10891|P80ATTP [215357] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M19473 Bacteriophage 933J (from E.coli) proviral Shiga-like toxin type 1 subunits A and B genes, complete cds gi|215072|gb|M19473|J93SLTI [215072] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 2 protein links, or 20 nucleotide neighbors)
Y10775 Bacteriophage 933W ileX, stx2A and stx2B genes gi|1938206|emb|Y10775|BP933ILEX [1938206] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 36 nucleotide neighbors)
X83722 Bacteriophage 933W slt-IIB gene gi|1490229|emb|X83722|B933WSLT [1490229] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 20 nucleotide neighbors)
X07865 Bacteriophage 933W slt-II gene for Shiga-Like toxin type II subunit A and B gi|14892|emb|X07865|EWSLTII [14892] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 29 nucleotide neighbors)
M16625 Bacteriophage H19B (from E.coli) slt-IA and slt-IB genes encoding Shiga-like toxin I subunits A and B, complete cds gi|215043|gb|M16625|H19BSLT [215043] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 24 nucleotide neighbors)
M17358 Bacteriophage H19B shiga-like toxin-1 (SLT-1) A and B subunit DNA, complete cds gi|215046|gb|M17358|H19BSLTA [215046] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 20 nucleotide neighbors)
U29728 Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds gi|939708|gb|U29728|BNU29728 [939708] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 1 protein link)
J02580 Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5' end; ORF2, outer membrane porin protein (lc) and ORF1 genes, complete cds gi|215366|gb|J02580|PA2LC [215366] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 4 nucleotide neighbors)
U32222 Bacteriophage 186, complete sequence gi|3337249|gb|U32222|B1U32222 [3337249] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 46 protein links, or 5 nucleotide neighbors)
X51522 Bacteriophage P4 complete DNA genome gi|450916|emb|X51522|MYP4CG [450916] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 13 protein links, 6 nucleotide neighbors, or 1 genome link)
X92588 Bacteriophage 82 orf33, orf151, orf56, orf96, rus, orf45, and Q genes gi|1051111|emb|X92588|BAC82HOLL [1051111] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 7 protein links, or 1 nucleotide neighbor)
J02803 Bacteriophage 82 antitermination protein gene, complete cds gi|215364|gb|J02803|P82Q [215364] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
U02466 Bacteriophage HK022 (cro), (cII) and genes, complete cds, gene, partial cds gi|407285|gb|U02466|BHU02466 [407285] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 5 protein links, or 1 nucleotide neighbor)
M26291 Bacteriophage D108 regulatory DNA-binding protein (ner) gene, complete cds gi|166194|gb|M26291|D18NER [166194] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M11272 Bacteriophage D108 left-end DNA gi|166193|gb|M11272|D18LEDNA [166193] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M18902 Bacteriophage D108 kil gene encoding a replication protein, 3' end; and containing three ORFs, complete cds gi|166191|gb|M18902|D18KIL [166191] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M10191 Bacteriophage D108, left end with Mu A protein binding sites L1 and L2 gi|166190|gb|M10191|D18B1SL [166190] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
J02447 bacteriophage d108 gene a 5' end gi|166189|gb|J02447|D18AAA [166189] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
V00865 Bacteriophage D108 fragment from genes A and ner (C-terminus of ner and N-terminus of A) gi|15437|emb|V00865|NCD108 [15437] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X01914 Bacteriophage IKe gene for DNA binding protein gi|14957|emb|X01914|INIKEDBP [14957] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
AF064539 Bacteriophage N15, complete genome gi|3192683|gb|AF064539|AF064539 [3192683] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 60 protein links, 26 nucleotide neighbors, or 1 genome link)
U02303 Bacteriophage If1, complete genome gi|3676280|gb|U02303|BIU02303 [3676280] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, or 1 genome link)
AF007792 Bacteriophage Mu late morphogenetic region gi|3551775|gb|AF007792|AF007792 [3551775] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
U24159 Bacteriophage HP1 strain HP1c1, complete genome gi|1046235|gb|U24159|BHU24159 [1046235] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 41 protein links, 8 nucleotide neighbors, or 1 genome link)
Z71579 Bacteriophage S2 type A 5.6 kb DNA fragment gi|1679806|emb|Z71579|BPHiSI1ADNA [1679806] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 9 protein links, or 9 nucleotide neighbors)
X53238 Klebsiella sp. bacteriophage K11 gene 1 for RNA polymerase gi|14984|emb|X53238|KSK11RPO [14984] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X85010 Bacteriophage A511 ply511 gene gi|853748|emb|X85010|BPA511PLY [853748] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
U29728 Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds gi|939708|gb|U29728|BNU29728 [939708] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 1 protein link)
J02445 bacteriophage bo1 3'-terminal region rna gi|166152|gb|J02445|BO1TR3 [166152] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
L06183 Bacteriophage L5 (from Leuconostoc oenos) genome gi|289353|gb|L06183|BL5GENM [289353] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 genome link)
AF074945 Mycoplasma arthritidis bacteriophage MAV1, complete genome gi|3511243|gb|AF074945|AF074945 [3511243] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 15 protein links, 3 nucleotide neighbors, or 1 genome link)
L13696 Bacteriophage L2 (from Mycoplasma), complete genome gi|289338|gb|L13696|BL2CG [289338] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 14 protein links, or 1 genome link)
X80191 Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase proteins gi|517237|emb|X80191|BPP7PR [517237] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 1 genome link)
M19377 Bacteriophage Pf3 from Pseudomonas aeruginosa (New York strain), complete genome gi|215380|gb|M19377|PF3COMNY [215380] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 5 nucleotide neighbors)
M11912 Bacteriophage Pf3 from Pseudomonas aeruginosa (Nijmegen strain), complete genome gi|215371|gb|M11912|PF3COMN [215371] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, 5 nucleotide neighbors, or 1 genome link)
V00605 Bacteriophage Pf1 gene encoding DNA binding protein gi|14970|emb|V00605|INOPF1 [14970] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1 nucleotide neighbor)
L05626 Bacteriophage PR4 capsid protein (P6) gene, complete cds gi|215735|gb|L05626|PR4P6NWjpA [215735] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
D13409 Bacteriophage phiCTX (isolated from Pseudomonas aeruginosa) cosR, attP, int genes gi|217776|dbj|D13409|BPHCOSR [217776] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
D13408 Bacteriophage phiCTX (isolated from Pseudomonas aeruginosa) cosL, ctx genes gi|217775|dbj|D13408|BPHCOSLCX [217775] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 3 nucleotide neighbors)
M24832 Bacteriophage f2 coat protein gene, partial cds gi|166228|gb|M24832|F2CRNACA [166228] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
S72011 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618967|gb|AF017629|AF017629 [2618967] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017628 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618964|gb|AF017628|AF017628 [2618964] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017627 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618961|gb|AF017627|AF017627 [2618961] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017626 Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds gi|2618958|gb|AF017626|AF017626 [2618958] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors)
AF017625 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618955|gb|AF017625|AF017625 [2618955] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017624 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618952|gb|AF017624|AF017624 [2618952] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017623 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618949|gb|AF017623|AF017623 [2618949] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017622 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618946|gb|AF017622|AF017622 [2618946] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017621 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618943|gb|AF017621|AF017621 [2618943] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
D26449 Bacteriophage PS17 FI gene for tail sheath protein (gpFI) and FII gene for tail tube protein (gpFII), complete cds gi|452162|dbj|D26449|BPSFEII [452162] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
X87627 Bacteriophage D3112 A and B genes gi|974768|emb|X87627|BPD3112AB [974768] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
U32623 Bacteriophage D3 transcriptional activator CII (cII) gene, complete cds gi|984852|gb|U32623|BDU32623 [984852] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1 nucleotide neighbor)
L34781 Bacteriophage phi 11 holin homologue (ORF3) gene, complete cds and peptidoglycan hydrolase (lytA) gene, partial cds gi|511838|gb|L34781|BPHHOLIN [511838] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors)
L14810 Bacteriophage P22 (gp10) gene, complete cds, and (gp26) gene, complete cds gi|294053|gb|L14810|P22GP1026X [294053] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
X87420 Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators gi|1143407|emb|X87420|BPES18GEN [1143407] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 5 protein links, or 9 nucleotide neighbors)
L42820 Bacteriophage BF23 tail protein (hrs) gene, complete cds gi|1048680|gb|L42820|BBFRS [1048680] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X14980 Bacteriophage PRD1 XV gene for protein P15 (lytic enzyme) gi|15802|emb|X14980|TEPRD1XV [15802] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
X06321 Bacteriophage PRD1 gene 8 for DNA terminal protein gi|15800|emb|X06321|TEPRD18 [15800] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 10 nucleotide neighbors)
X14336 Filamentous Bacteriophage I2-2 genome gi|14920|emb|X14336|INI22 [14920] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 1 genome link)
L05001 Bacteriophage X glucosyl transferase gene, complete cds gi|216044|gb|L05001|PXyFcLUSYLT [216044] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
M29479 Bacteriophage p4 sid and psu genes partial cds, and delta gene, complete cds gi|215701|gb|M29479|PP4SDP [215701] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 protein links, or 4 nucleotide neighbors)
SEG_PP4PSUSID Bacteriophage P4 capsid size determination protein (sid) gene, 5' end gi|215698|gb||SEG_PP4PSUSID [215698] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
M29650 Bacteriophage P4 polarity suppression protein (psu) gene, complete cds gi|215697|gb|M29650|PP4PSUSID2 [215697] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M29651 Bacteriophage P4 capsid size determination protein (sid) gene, gi|215696|gb|M29651|PP4PSUSID1 [215696] (View GenBank report, FASTA report, ASN.
1 report, or Graphical view)
M27748 Bacteriophage P4 gop, beta, and cII genes, complete cds and int gene, 3' end gi|215691|gb|M27748|PP4GOPBC [215691] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 1 nucleotide neighbor)
K02750 Bacteriophage IKe, complete genome gi|215061|gb|K02750|IKECG [215061] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 4 nucleotide neighbors, or 1 genome link)
L40418 Bacteriophage phi-80 gene, complete cds gi|1019107|gb|L40418|P80A [1019107] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF032122 Bacteriophage SfII integrase (int) gene, partial cds; and bactoprenol glucosyl transferase (bgt), and glucosyl transferase II (gtrII) genes, complete cds gi|2465412|gb|AF021347|AF021347 [2465412] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors)
M35825 Bacteriophage SF6 fragment D lysozyme gene, complete cds gi|216105|gb|M35825|SF6LYZ [216105] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
Z35479 Dactci1p6 iyl. g~. gi|534936|emb|Z35479|BC16EP1 [534936] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
X12638 Bacteriophage 21 DNA for gene 2 gi|296141|emb|X12638|B21GENE2 [296141] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X02501 Bacteriophage 21 DNA for left end sequence with genes 1 and 2 gi|15825|emb|X02501|XXHA21 [15825] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
M65239 Bacteriophage 21 lysis genes S, R, and Rz, complete cds gi|215466|gb|M65239|PH2LYSGEN [215466] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
M58702 Bacteriophage 21 late gene regulatory region gi|215465|gb|M58702|PH2LATEGE [215465] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M81255 Bacteriophage 21 head gene operon gi|215454|gb|M81255|PH2HEADTL [215454] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 10 protein links, or 4 nucleotide neighbors)
M23775 Bacteriophage 21 glycoprotein 1 gene, complete cds, and glycoprotein gene, 5' end gi|215451|gb|M23775|PH2GPA [215451] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
M61965 Bacteriophage 21 excisionase (xis), integrase (int) and isocitrate dehydrogenase (icd), complete cds gi|215448|gb|M61865|PH22XISAA [215448] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 9 nucleotide neighbors)
S72011 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618967|gb|AF017629|AF017629 [2618967] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017628 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618964|gb|AF017628|AF017628 [2618964] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017627 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618961|gb|AF017627|AF017627 [2618961] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017626 Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds gi|2618958|gb|AF017626|AF017626 [2618958] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors)
AF017625 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618955|gb|AF017625|AF017625 [2618955] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017624 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618952|gb|AF017624|AF017624 [2618952] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017623 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618949|gb|AF017623|AF017623 [2618949] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017622 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618946|gb|AF017622|AF017622 [2618946] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017621 Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds gi|2618943|gb|AF017621|AF017621 [2618943] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
M57455 Bacteriophage 42D (clone pDB17) (from Staphylococcus aureus) staphylokinase gene, complete cds gi|215344|gb|M57455|P42STK [215344] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
Y12633 Bacteriophage 85 DNA, promoter sequence of unknown gene gi2525ebY23I8PO [2058285] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
X98146 Bacteriophage P1 DNA sequence around the Op88 operator gi|1359513|emb|X98146|BP1OP88OP [1359513] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
Y07739 Staphylococcus phage Twort holTW, plyTW genes gi2699ebY73jPWHL [2764979] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
L07580 Bacteriophage phi-11 rinA and rinB genes, required for the activation of Staphylococcal phage phi-11 int expression gi|166160|gb|L07580|BP1.AJ3& [166160] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M34832 Bacteriophage phi-11 integrase (int) and excisionase (xis) genes, complete cds gi|166157|gb|M34832|BPHJUJ-yJ [166157] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M20394 Bacteriophage phi-11 S.aureus attachment site (attP) gi|166156|gb|M20394|BPHATTP [166156] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
X82312 Bacteriophage phi-13 integrase gene gi|758228|emb|X82312|PHI13INT [758228] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)
X61719 S.aureus phi-13 lysogen right chromosome/bacteriophage DNA junction gi|46625|emb|X61719|SAP13RJNC [46625] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X61718 S.aureus phi-13 lysogen left chromosomal/bacteriophage DNA junction gi|46624|emb|X61718|SAP13LJNC [46624] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X61717 Bacteriophage phi-13 core sequence for attachment gi|14799|emb|X61717|BP13ATT?~ [14799] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 3 nucleotide neighbors)
U01875 Bacteriophage phi-13 putative regulatory region and integrase (int) gene, partial cds gi|437118|gb|U01875|U01875 [437118] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, or 4 nucleotide neighbors)
X67739 S.aureus Bacteriophage phi-42 attP gene gi|14809|emb|X67739|BPATTPA [14809] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
U01872 Bacteriophage phi-42 integrase (int) gene, complete cds gi|437115|gb|U01872|U01872 [437115] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 3 nucleotide neighbors)
X94423 Staphylococcus aureus bacteriophage phi-42 DNA with ORFs (restriction modification system) gi|1771597|emb|X94423|SARMvS [1771597] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 1 nucleotide neighbor)
M27965 Bacteriophage L54a (from S.aureus) int and xis genes, complete cds gi|215096|gb|M27965|L54INTXIS [215096] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
U72397 Bacteriophage 80 alpha holin and amidase genes, complete cds gi|1763241|gb|U72397|BBU72397 [1763241] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
AB009866 Bacteriophage phi PVL proviral DNA, complete sequence gi|3341907|dbj|AB009866|AB009866 [3341907] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein links, or 1 nucleotide neighbor)
Z47794 Bacteriophage Cp-1 DNA, complete genome gi|2288892|emb|Z47794|BPCP1XX [2288892] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 28 protein links, 1 nucleotide neighbor, or 1 genome link)
SEG_CP7RSIT Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat gi|166186|gb||SEG_CP7RSIT [166186] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M11635 Bacteriophage Cp-7 (S.pneumoniae) DNA, 3' inverted terminal repeat gi|166185|gb|M11635|CP7RSIT2 [166185] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11636 Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat gi|166184|gb|M11636|CP7RSIT1 [166184] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
SEG_CP5RSIT Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat gi|166181|gb||SEG_CP5RSIT [166181] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M11633 Bacteriophage Cp-5 (S.pneumoniae) 3' inverted terminal repeat gi|166180|gb|M11633|CP5RSIT2 [166180] (View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11634 Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat gi|166179|gb|M11634|CP5RSIT1 [166179] (View GenBank report, FASTA report, ASN.
1 report, or Graphical view)
M34780 Bacteriophage Cp-9 muramidase (cpl9) gene gi|166187|gb|M34780|CP9CPL [166187] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M34652 Bacteriophage HB-3 amidase (hbl) gene, complete cds gi|215055|gb|M34652|HB3HBLA [215055] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
U64984 Streptococcus pyogenes phage T12 repressor, excisionase, integrase (int) and erythrogenic toxin A precursor (speA) genes, complete cds gi|1877426|gb|U40453|SPU40453 [1877426] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 22 nucleotide neighbors)
X12375 Phage CP-T1 (Vibrio cholerae) DNA for packaging signal (pac site) gi|15435|emb|X12375|NCCPPAC [15435] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF087814 Vibrio cholerae filamentous bacteriophage fs-2 DNA, complete genome sequence gi|3702207|dbj|AB002632|AB002632 [3702207] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 1 genome link)
D83518 Bacteriophage KVP40 gene for major capsid protein precursor, complete cds gi|3046858|dbj|D83518|D83518 [3046858] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF033322 Bacteriophage PST single-stranded binding protein (gene 32) gene, partial cds, and 5' region gi|2645774|gb|AF033322|AF033322 [2645774] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 17 nucleotide neighbors)
X94331 Bacteriophage L cro, 24, c2, and c1 genes gi|1469213|emb|X94331|BLCRO24C [1469213] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 4 protein links)
U82619 Shigella flexneri bacteriophage V glucosyl transferase (gtr), integrase (int) and excisionase (xis) genes, complete cds gi|2465470|gb|U82619|SFU82619 [2465470] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 8 protein links, or 1 nucleotide neighbor)

Table 12
NCBI Entrez Nucleotide QUERY
Key words: bacteriophage and lysis
56 citations found (all selected)

AJ011581 Bacteriophage PS119 lysis genes 13, 19, 15, and packaging gene 3, complete cds gi|3676084|emb|AJ011581|BPS011581 [3676084] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein links, or 1 nucleotide neighbor)
AJ011580 Bacteriophage PS34 lysis genes 13, 19, 15, antiterminator gene 23, and packaging gene 3, complete cds gi|3676078|emb|AJ011580|BPS011580 [3676078] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 2 nucleotide neighbors)
AJ011579 Bacteriophage PS3 lysis genes 13, 19, 15, and packaging gene 3 gi|3676073|emb|AJ011579|BPS011579 [3676073] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein links, or 1 nucleotide neighbor)
AF034975 Bacteriophage H-19B essential recombination function protein (erf), kil protein (kil), regulatory protein cIII (cIII), protein gp17, N protein, cI protein, cro protein (cro), cII protein (cII), O protein, P protein, ren protein (ren), Roi (roi), Q protein, Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin protein genes, complete cds gi|2668751|gb|AF034975| [2668751] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 20 protein links, or 30 nucleotide neighbors)
U37314 Bacteriophage lambda Rz1 protein precursor (Rz1) gene, complete cds gi|1017780|gb|U37314|BLU37314 [1017780] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 1 protein link, or 9 nucleotide neighbors)
U00005 E. coli hflA locus encoding the hflX, hflK and hflC genes, hfq gene, complete cds; miaA gene, partial cds gi|436153|gb|U00005|ECOHFLA [436153] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE links, 5 protein links, or 8 nucleotide neighbors)
U32222 Bacteriophage 186, complete sequence gi|3337249|gb|U32222|B1U32222 [3337249] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 46 protein links, or 5 nucleotide neighbors)
AF064539 Bacteriophage N15, complete genome gi|3192683|gb|AF064539|AF064539 [3192683] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 60 protein links, 26 nucleotide neighbors, or 1 genome link)
AF063097 Bacteriophage P2, complete genome gi|3139086|gb|AF063097|AF063097 [3139086] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 21 MEDLINE links, 42 protein links, 3 nucleotide neighbors, or 1 genome link)
Z97974 Bacteriophage phiadh lys, hol, intG, rad, and tec genes gi|2707950|emb|Z97974|BPHIADH [2707950] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 9 protein links, or 1 nucleotide neighbor)
AF059243 Bacteriophage NL95, complete genome gi|3088545|gb|AF059243|AF059243 [3088545] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, 3 nucleotide neighbors, or 1 genome link)
AF052431 Bacteriophage M11 A-protein, coat protein, A1-protein, and replicase genes, complete cds gi|2981208|gb|AF052431| [2981208] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 8 nucleotide neighbors)
Y07739 Staphylococcus phage Twort holTW, plyTW genes (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
X94331 Bacteriophage L cro, 24, c2, and c1 genes gi|1469213|emb|X94331|BLCRO24C [1469213] (View GenBank report, FASTA report, ASN.
1 report, Graphical view, 1 MEDLINE link, or 4 protein links)
X78410 Bacteriophage phiadh holin and lysin genes gi|793848|emb|X78410|LGHOLLYS [793848] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
X99260 Bacteriophage B103 genomic sequence gi|1429229|emb|X99260|BB103G [1429229] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 17 protein links, or 12 nucleotide neighbors)
AJ000741 Bacteriophage P1 darA operon gi|2462938|emb|AJ000741|BPMJ7641 [2462938] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 31 nucleotide neighbors)
X87420 Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators gi|1143407|emb|X87420|BPES18GEN [1143407] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 9 nucleotide neighbors)
L35561 Bacteriophage phi-105 ORFs 1-3 gi|532218|gb|L35561|PH5ORFHTR [532218] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 protein links)
D10027 Group II RNA coliphage GA genome gi|217784|dbj|D10027|PGAXX [217784] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, 5 nucleotide neighbors, or 1 genome link)
V01128 Bacteriophage phi-X174 (cs70 mutation) complete genome gi|15535|emb|V01128|PHXC174 [15535] (View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE links, 11 protein links, or 26 nucleotide neighbors)
S81763 coat gene replicase gene [bacteriophage KU1, host=Escherichia coli, group II RNA phage, Genomic RNA, 3 genes, 120 nt] gi|1438766|gb|S81763|S81763 [1438766] (View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
U38906 Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and lysin genes, complete cds gi|1353517|gb|U38906|BRU38906 [1353517] (View GenBank report, FASTA report, ASN.
I report,Graphical view,2 MEDLINE links, 50 protein links, or 3 nucleotide neighbors) X91 149 Bacteriophage phi-Ol1 DNA cos region gillI 107473lemblX91 I49IAPHIC3 IC [1107473] (View GenBank report.FASTA reportASN. I reportGraphicaI view, I MEDUNE link, 6 protein links, or I nucleotide neighbor) V00642 phage MS2 genorne ail 1508 1emnNVOO642LEMS2X [15081] (View GenBaik report,FASTA reportASN. I reporL.Grapbical view,8 MIEDLINE links. 4 protein links, or 20 nucleotide neighbors) V01 146 Geome of bacteriophage 7 g11431 IB7lembl VOl 1461T7CG [4311871 (View GenLBankc rcportFASTA reportASN.1 reportGrapbical view,13 MEDLINE links, 60 protein links, 105 nucleotide neighbors, or 1 genome link) X78401 Bacteriophage P22 right operon, orf 48, replication genes 18 and 12, niD region genes, ninG phosphatase, late control gene 23, orf 60. complete.
cds, late control region, start of lysis gene 13 gil512343lernblX784011POP22NIN [5123431 (View Genflank reporFASTA reporl.ASN. I repon.Graphical view,2 NMLINE links, 13 protein links, or 4 nucleotide neighbors) Y00408 Bacteriophage T4 gene t for lysis protein (View GenBank report.,FASTA repon.A SN. 1 reportGraphicai view,I MIEDLINE link, I protein link, or 3 nucleotide neighbors) Z26590 WO 00/32825 WO 0032825PCT/l 899/02040 250 Bactenophage mv4 lysA and lysB genes gi141IO0cmb1Z26590ONfV4LYSAB [410500] (View Gen.Bank report,FASTA reportASN. I rcport,Graphical view, or 4 protein links) X07809 Pbage phiX 174 Jysis gene upstream region gil l5094lembIX078091MIPHlJCE [15094] (View GenBank report,FASTA report,.ASN.lI report,Graphical view,1I MEDLIN'E link, 2 protein links, or 4 nucleotide neighbors) Z34528 L-actococcal bacteriophage c2 lysin gene oiI506455IembIZ34528JLBC2LYSIN [506455) (View GenBank reportFASTA report.ASN. 1 report,Graphical view, I MEDLINE link. 1 protein link, or 4 nucleotide neighbors) X 15031 Bacteriophage fr RNA genome giIl5O7I~emIXl503lILEBFRX [1-5071] (View GenBank report,FASTA reportASN.1I reportGraphical view.1 MEDLINE link, 4 protein links, 9 nucleotide neighbors. or I gcnome link) X80191 Bacteriophage PP7 mR.NA for maturation, coat, lysis and replicase proteins giIl7237IernbIX80l9IlBPP7PR [5172371 (View GenBaak reportFASTA reportASN.1I reportGraphical view,1I MEDLINE link. 4 protein links, or 1 genome link) X5010 Bacteriophage A511I ply~li gene gil8537481cmblX85010lBPASlIIPLY [8537481 (View GenBanik reportFASTA reportASN. IrepomtGrapbical -iew. I MEDLINE link, 3 protein links, or 1 nucleotide neighbor) X85009 Bacteriophage A500 hoI5OO and ply 5 00 genes giI853744kmbX8OO91BPA500PLY [853744] (View GenBank report,FASTA report.ASN. I reportGraphical view, IMNEDLINE link., 3 protein links, or 4 nucleotide neighbors) XSS500 Bacteriophage Al118 holl 18 and plyll8 genes giI8S37401emb1X&5008IBPA I I8PLY [853740] (View GenBanc reportFASTA report.ASN. 
I reportGrphical view, IMNEDLINE link, 3 protein links, or 1 nucleofide neighbor WO 00/32825 WO 0032825PCT/I B99/02040 Z35638251 Bacteriophage phi-X 174 genes for lysis protein and beza-lactamase 0oiI52O996IembIZ35638IBPLYSPR [520996) (View GenBank reportFASTA reportASN. I report,Graphical view, I MEDLINE link. 2 protein links, or 516 nucleotide neighbors) J02459 Bacteriophage lambda. complete oenome giI2I51O4IgbIJO2459ILAMCG [215104] (View GenBank report.FASTA report,ASN.1I report,Graphical view,87 MEDLINE links, 67 protein links. 190 nucleotide neighbors, or 1 genome link) X87674 Bacteriophage P1 lydA lydB genes giI9r74763IembIX87674lBACPlLYD [974763] (View GenBank reportFASTA reportASN. I report.Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors) X87673 Bacteriophage P1 gene 17 gil97476 1 embIX876731BACPI 17 [9747611 (View GenBank report,FASTA reportASN. I report,Graphical view, 1 MIEDLINE link. 1 protein link, or 1 nucleotide neighbor) M14784 Bacteriophage n3 strain amNG22OB right end, tail fiber protein, lysis protein and DNA packaging proteins, complete cds g'iI2l581C)IgblM14784Fr3 RE [215810) (View GenBank rcport,FASTA reportASN. I report.Graphical view,1I MEDLINE link. 9 protein links, or 10 nucleotide neighbors) M 11813 Bacteriophage PZA~ (from B.subtilis), complete genome g'il2l6O46lgbIM I 18131PZACG [216046] (View GenBank report.FASTA reportASN. I reportGraphical view,3 MIEDLINE links, 27 protein links, 17 nucleotide neighbors, or 1 genome link) M16812 Bacteriophage K3 T lysis gene, complete cds gil215503IgblM168l21PK3LYST [215503] (View GenBankc report,FASTA reportASN. I report.Grapbhical view,1 IMEDLINE link, 1 protein link, or 4 nucleoude neighbors) J04356 Bacteriophage P22 proteins 15 (complete cds), and 19 (3'end) genes gil2l52651ablJ04356lP2215P [215265] WO 00/32825 WO 0032825PCT/l B99/02040 252 (View GenBank report,FASTA report.ASN. I report,Graphical view, 1 NMDLINE link, 3 protein links. 
or 2 nucleotide neighbors)

J04343
Bacteriophage JP34 coat and lysis protein genes, complete cds, and replicase protein gene,
gi|215076|gb|J04343|JP3COLY [215076]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)

J02482
Bacteriophage phi-X174, complete genome
gi|216019|gb|J02482|PX1CG [216019]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 23 MEDLINE links, 11 protein links, 26 nucleotide neighbors, or 1 genome link)

M99441
Bacteriophage T4 anti-sigma 70 protein (asiA) gene, complete cds and lysis protein, 3' end
gi|215820|gb|M99441|PT4ASIA [215820]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 2 nucleotide neighbors)

M65239
Bacteriophage 21 lysis genes S, R, and Rz, complete cds
gi|215466|gb|M65239|PH2LYSGEN [215466]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)

M10637
Phage G4 D/E overlapping gene system, encoding D (morphogenetic) and E (lysis) proteins
gi|215427|gb|M10637|PG4DE [215427]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)

J02454
Bacteriophage G4, complete genome
gi|215415|gb|J02454|PG4CG [215415]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 11 protein links, 20 nucleotide neighbors, or 1 genome link)

J02580
Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5' end; ORF2, outer membrane porin protein (lc) and ORF1 genes, complete cds
gi|215366|gb|J02580|PA2LC [215366]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 4 nucleotide neighbors)

M14782
Bacillus phage phi-29 head morphogenesis, major head protein, head fiber protein, tail protein, upper collar protein, lower collar protein, pre-neck appendage protein, morphogenesis (13), lysis, encapsidation genes, complete cds
gi|215323|gb|M14782|P29LATE2 [215323]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 11 nucleotide neighbors)

M10997
Bacteriophage P22 lysis genes 13 and 19, complete cds
gi|215262|gb|M10997|P221319 [215262]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)

J02467
Bacteriophage MS2, complete genome
gi|215232|gb|J02467|MS2CG [215232]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE links, 4 protein links, 20 nucleotide neighbors, or 1 genome link)

M14035
Bacteriophage lambda lysis S gene with mutations leading to nonlethality of S in the plasmid pRG1
gi|215180|gb|M14035|LAMLYS [215180]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 14 nucleotide neighbors)

U04309
Bacteriophage phi-LC3 putative holin (lysA) gene and putative murein hydrolase (lysB) gene, complete cds
gi|530796|gb|U04309|BPU04309 [530796]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)

Table 13

NCBI Entrez Nucleotide QUERY
Key word: holin
51 citations found (all selected)

AF034975
Bacteriophage H-19B essential recombination function protein (erf), kil protein (kil), regulatory protein cIII (cIII), protein gp17, N protein, cI protein, cro protein (cro), cII protein (cII), O protein, P protein, ren protein (ren), Roi (roi), Q protein, Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin protein genes, complete cds
gi|2668751|gb|AF034975| [2668751]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 20 protein links, or 30 nucleotide neighbors)

U52961
Staphylococcus aureus holin-like protein LrgA (lrgA) and LrgB (lrgB) genes, complete cds
gi|1841516|gb|U52961|SAU52961 [1841516]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)

U28154
Haemophilus somnus cryptic prophage genes, capsid scaffolding protein gene, partial cds, major capsid protein precursor, endonuclease, capsid completion protein, tail synthesis proteins, holin, and lysozyme genes, complete cds
gi|1765928|gb|U28154|HSU28154 [1765928]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 protein links)

AF032122
Streptococcus thermophilus bacteriophage Sfi19 central region of genome
gi|2935682|gb|AF032122| [2935682]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 14 protein links, or 2 nucleotide neighbors)

AF032121
Streptococcus thermophilus bacteriophage Sfi21 central region of genome
gi|2935667|gb|AF032121|AF032121 [2935667]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 14 protein links, or 2 nucleotide neighbors)

AF021803
Bacillus subtilis 168 prophage SPbeta N-acetylmuramoyl-L-alanine amidase (blyA), holin-like protein (bhlA), holin-like protein and yolK genes, complete cds; and yolJ gene, partial cds
gi|2997594|gb|AF021803|AF021803 [2997594]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 5 protein links, or 1 nucleotide neighbor)

AF057033
Streptococcus thermophilus bacteriophage Sfi11 gp502 (orf502), gp284 (orf284), gp129 (orf129), gp193 (orf193), gp119 (orf119), gp348 (orf348), gp53 (orf53), gp113 (orf113), gp104 (orf104), gp114 (orf114), gp128 (orf128), gp168 (orf168), gp117 (orf117), gp105 (orf105), putative minor tail protein (orf1510), putative minor structural protein (orf512), putative minor structural protein (orf1000), gp373 (orf373), gp57 (orf57), putative anti-receptor (orf695), putative minor structural protein (orf669), gp149 (orf149), putative holin (orf14[...]), putative holin (orf87), and lysin (orf288) genes, complete cds
gi|3320432|gb|AF057033|AF057033 [3320432]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 25 protein links, or 1 nucleotide neighbor)

U32222
Bacteriophage 186, complete sequence
gi|3337249|gb|U32222|B1U32222 [3337249]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 46 protein links, or 5 nucleotide neighbors)

AB009866
Bacteriophage phi PVL proviral DNA, complete sequence
gi|3341907|dbj|AB009866|AB009866 [3341907]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein links, or 1 nucleotide neighbor)

AF009630
Bacteriophage bIL170, complete genome
gi|3282260|gb|AF009630|AF009630 [3282260]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein links, 3 nucleotide neighbors, or 1 genome link)

AF064539
Bacteriophage N15, complete genome
gi|3192683|gb|AF064539|AF064539 [3192683]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 60 protein links, 26 nucleotide neighbors, or 1 genome link)

AF063097
Bacteriophage P2, complete genome
gi|3139086|gb|AF063097|AF063097 [3139086]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 21 MEDLINE links, 42 protein links, 3 nucleotide neighbors, or 1 genome link)

Z97974
Bacteriophage phiadh lys, hol, intG, rad, and tec genes
gi|2707950|emb|Z97974|BPHIADH [2707950]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 9 protein links, or 1 nucleotide neighbor)

X95646
Streptococcus thermophilus bacteriophage Sfi21 DNA; lysogeny module, 8141 bp
gi|2292747|emb|X95646|BS12 I LYS [2292747]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 19 protein links, or 3 nucleotide neighbors)
SEG_LLHLYSIN0
Bacteriophage LL-H structural protein gene, partial cds; minor structural protein gp61 (g57), unknown protein, unknown protein, structural protein (g20), unknown protein, unknown protein, major capsid protein (g34), main tail protein gp19 (g17), holin (hol), muramidase (mur), unknown protein, unknown protein, unknown protein, unknown protein, unknown protein, and unknown protein genes, complete cds; unknown protein gene, partial cds; and unknown protein, unknown protein, unknown protein, unknown protein, unknown protein, minor structural protein gp75 (g70), minor structural protein gp89 (g88), minor structural protein gp58 (g71), unknown protein, unknown protein, unknown protein, and unknown protein genes, complete cds
gi|1004337|gb||SEG_LLHLYSIN0 [1004337]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE links, 31 protein links, or 1 nucleotide neighbor)

M96254
Bacteriophage LL-H holin (hol), muramidase (mur), and unknown protein genes, complete cds
gi|1004336|gb|M96254|LLHLYSIN03 [1004336]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

Y07740
Staphylococcus phage 187 ply187 and hol187 genes
gi|2764982|emb|Y07740|BP187PLYH [2764982]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)

U88974
Streptococcus thermophilus bacteriophage O1205 DNA sequence
gi|2444080|gb|U88974| [2444080]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 57 protein links, or 6 nucleotide neighbors)

Z99117
Bacillus subtilis complete genome (section 14 of 21): from 2599451 to 2812870
gi|2634966|emb|Z99117|BSUB0014 [2634966]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 233 protein links, 51 nucleotide neighbors, or 1 genome link)

Z99115
Bacillus subtilis complete genome (section 12 of 21): from 2195541 to 2409220
gi|2634478|emb|Z99115|BSUB0012 [2634478]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 244 protein links, 64
nucleotide neighbors, or 1 genome link)

Z99110
Bacillus subtilis complete genome (section 7 of 21): from 1194391 to 1411140
gi|2633472|emb|Z99110|BSUB0007 [2633472]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 226 protein links, 31 nucleotide neighbors, or 1 genome link)

X78410
Bacteriophage phiadh holin and lysin genes
gi|793848|emb|X78410|LGHOLLYS [793848]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)

Z93946
Bacteriophage Dp-1 dph and pal genes and 5 open reading frames
gi|1934760|emb|Z93946|BPDP1ORFS [1934760]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 6 protein links)

AF011378
Bacteriophage sk1 complete genome
gi|2392824|gb|AF011378|AF011378 [2392824]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 54 protein links, 2 nucleotide neighbors, or 1 genome link)

Z47794
Bacteriophage Cp-1 DNA, complete genome
gi|2288892|emb|Z47794|BPCP1XX [2288892]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 28 protein links, 1 nucleotide neighbor, or 1 genome link)

L35561
Bacteriophage phi-105 ORFs 1-3
gi|532218|gb|L35561|PH5ORFHTR [532218]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 protein links)

D49712
Bacillus licheniformis DNA for ORFs, xpaL2 homologous protein and xpaL1 homologous protein, complete and partial cds
gi|1514423|dbj|D49712|D49712 [1514423]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 protein links)

X90511
Lactobacillus bacteriophage phig1e DNA for Rorf162, Holin, Lysin, and Rorf175 genes
gi|1926386|emb|X90511|LBPFHHOL [1926386]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein links, or 1 nucleotide neighbor)

X98106
Lactobacillus bacteriophage phig1e complete genomic DNA
gi|1926320|emb|X98106|LBPHIG1E [1926320]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 50 protein links, or 4 nucleotide neighbors)

U72397
Bacteriophage 80 alpha holin and amidase genes, complete cds
gi|1763241|gb|U72397|B8U72397 [1763241]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)

U38906
Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and lysin genes, complete cds
gi|1353517|gb|U38906|BRU38906 [1353517]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 50 protein links, or 3 nucleotide neighbors)

X91149
Bacteriophage phi-C31 DNA cos region
gi|1107473|emb|X91149|APHIC31C [1107473]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 1 nucleotide neighbor)

U24159
Bacteriophage HP1 strain HP1c1, complete genome
gi|1046235|gb|U24159|BHU24159 [1046235]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 41 protein links, 8 nucleotide neighbors, or 1 genome link)

Z26590
Bacteriophage mv4 lysA and lysB genes
gi|410500|emb|Z26590|MV4LYSAB [410500]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 4 protein links)

Z70177
B.subtilis DNA (28 kb PBSX/skin element region)
gi|1225934|emb|Z70177|BSPBSXSE [1225934]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 32 protein links, or 4 nucleotide neighbors)

Z36941
B.subtilis defective prophage PBSX xhlA, xhlB, and xylA genes
gi|535793|emb|Z36941|BSPBSXXHL [535793]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein links, or 5 nucleotide neighbors)

X89234
L.innocua DNA for phagelysin and holin gene
gi|1134844|emb|X89234|LICPLYHOL [1134844]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 4 nucleotide neighbors)

X85010
Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)

X85009
Bacteriophage A500 hol500 and ply500 genes
gi|853744|emb|X85009|BPA500PLY [853744]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 4 nucleotide neighbors)

X85008
Bacteriophage A118 hol118 and ply118 genes
gi|853740|emb|X85008|BPA118PLY [853740]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)

L34781
Bacteriophage phi 11 holin homologue (ORF3) gene, complete cds and peptidoglycan hydrolase (lytA) gene, partial cds
gi|511838|gb|L34781|BPHHOLIN [511838]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors)

U11698
Serratia marcescens SM6 extracellular secretory protein (nucE), putative phage lysozyme (nucD), and transcriptional activator (nucC) genes, complete cds
gi|509550|gb|U11698|SMU11698 [509550]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)

U31763
Serratia marcescens phage-holin analog protein (regA), putative phage lysozyme (regB), and transcriptional activator (regC) genes, complete cds
gi|965068|gb|U31763|SMU31763 [965068]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)

X87674
Bacteriophage P1 lydA lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)

L48605
Bacteriophage c2 complete genome
gi|1146276|gb|L48605|C2PVCG [1146276]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 39 protein links, 3 nucleotide neighbors, or 1 genome link)

L33769
Bacteriophage bIL67 DNA polymerase subunit (ORF3-5), essential recombination protein (ORF13), lysin (ORF24), minor tail protein (ORF31), terminase subunit (ORF32), holin (ORF37), unknown protein (ORF1-2,6-12,14-23,25-30,33-36), complete genome
gi|522252|gb|L33769|L67CG [522252]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 37 protein links, 2 nucleotide neighbors, or 1 genome link)

L31348
Bacteriophage Tuc2009 integrase (int) gene, complete cds; lysin (lys) gene, 3' end
gi|508612|gb|L31348|TU2INT [508612]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 3 nucleotide neighbors)

L31364
Bacteriophage Tuc2009 holin gene, complete cds; lysin (lys) gene, complete cds
gi|496281|gb|L31364|TU2SLYS [496281]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)

L31366
Bacteriophage Tuc2009 structural protein (mp2) gene, complete cds
gi|496278|gb|L31366|TU2MP2A [496278]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)

L31365
Bacteriophage Tuc2009 structural protein (mp1) gene, complete cds
gi|496276|gb|L31365|TU2MP1A [496276]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)

U04309
Bacteriophage phi-LC3 putative holin (lysA) gene and putative murein hydrolase (lysB) gene, complete cds
gi|530796|gb|U04309|BPU04309 [530796]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)

Table 14

NCBI Entrez Nucleotide QUERY
Key word: bacteriophage and kil
citations found (all selected)

AF034975
Bacteriophage H-19B essential recombination function protein (erf), kil protein (kil), regulatory protein cIII (cIII), protein gp17, N protein, cI protein, cro protein (cro), cII protein (cII), O protein, P protein, ren protein (ren), Roi (roi), Q protein, Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin protein genes, complete cds
gi|2668751|gb|AF034975| [2668751]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 20 protein links, or 30 nucleotide neighbors)

X15637
Bacteriophage P22 P(L) operon encompassing ral, 17, kil and arf genes
gi|15646|emb|X15637|POP22PL [15646]
(View GenBank report, FASTA report, ASN.
I report,Graphical view, 1 MEDLINE link, 7 protein links, or 2 nucleotide neighbors) J02459 Bacteriophage lambda, complete genome gil2l51044bIJ02459lLAMCG [215104] (View Gen.Bank reportFASTA reportASN.l report,Graphical view,87 MEDLINE links, 67 protein links, 190 nucleotide neighbors, or I genome link) M64097 Bacteriophage Mu left end gil215543IgbIM64O97IPM1JLEFTEN [2155431 (View Gen.Bank repoMtFASTA reportASN.1 report,Graphical view,2 MIEDUINE links, 39 protein links, or 15 nucleotide neighbors) M 18902 Bacterioph age D108 kil gene encoding a replication protein, 3' end; and giII661911cgbIM189O2iDl8KAL [166191) (View Gen.Bank reportFASTA report.ASN.1 report,Graphical view,I MBEDUNE link, I protein link, or 3 nucleotide neighbors) WO 00/32825 WO 0032825PCT/I B99/02040 264 Table U77328 V01282 U U11787 U93688 A47599 D21 131 U76864 U38428 AF 151117 AF 121672 1U11786 U93687 A47598 D30690 U76863 U66665 AF151218 AF072726 U11785 AJ224764 A47597 D14711 U76862 U66664 AF146368 AF115379 U11784 JAF064774 3A47596 D90119 U76861 U66663 AF144661 AF034153 U11783 jAF064773 IA47595 D00730 U76860 X87104 AF132117 AF029244 U11782 IY14370 A47594 D83357 U76859 X87105 Y15477 U67965 U11781 AF065394 A44534 D83356 U76858 X89233 Y09928 U96610 U11780 AF062376 A44533 D83355 U76857 M28521 Y09594 U96609 U11779 AF062375 A44529 D83354 U76855 U54636 AF134905 U73027 U11778 AF062374 A44528 D83353 U76854 U46541 AB019536 U73026 U11777 AF062373 A44527 D12572 U76853 L14017 AJ23 7696 U73025 U311776 AB007500 A44526 D86727 U76852 U60589 AF106851 AF068904 U11775 Y09924 A44525 D86240 U76851 X48003 AF106850 U60050 U311774 1U63529 fA39696 D67075 U76850 M37889 AF106849 D10907 U311773 IAF033191 TAF001783 D67074 1376849 V01281 M26321 D10906 AF053772 Y15856 ]AF001782 1397062 U76848 X97985 AF060191 AF053140 AF053771 AB000439 L77194 1396620 U76847 X00127 AF060190 AB013298 AF029731 AF041467 AF003593 U96619 Y09929 X03286 AF060189 Y 16431 AF027155 Y14051 IAF003592 Z84573 Y09570 X62282 AF060188 AF076684 AF024571 U82085 
X73889 AB00 1896 X95848 X01645 AF060187 AF076683 1387144 AF026122 X74219 Y07645 Y09428 X16471 AF060186 Y13225 AF086644 AiF026121 Y 10419 1392441 S76611 X52734 AF060185 AF094826 AJ223781 AF026120 M63177 U91741 S76213 X13290 AF0601 84 AJ223480 ,AF076030 AB009635 E08773 U29454 S75707 X66088 AF036324 AF093548 AF044951 AB006796 E07163 1329478 S75706 Z30588 AF036323 AJ005352 AF044906 U39769 E07 162 U77374 S75705 X16457 AF053568 AF051916 AF044905 D00184 E07161 L42945 S76270 X00342 AJ132841 Y09927 AF044904 X56628 E07160 U38429 S72497 V01287 Y13766 AF051917 AF044903 AF033018 E07159 U381980 S72488 X61307 AF101234 S77058 AF044902 AF034076 E07158 X55185 S74031 Y00356 AJ133520 S65052 AF044901 D82063 E07157 V01278 S67449 X06603 AJ133495 AF009671 AF044900 D76414 E07156 1331979 U75367 Z93205 AJ132803 U81973 AF044899 _U57060 E07155 X91786 U75368 X64172 AB016487 U77308 AF044898 D89066 E03836 1336912 U31175 X72700 AB0 16431 1320869 AF044897 1385095 E03835 U3691 1 X53096 X60827 AB015981 U89396 AF044075 1385097 E03526 U36910 X53951 X64389 AB0 15195 U394706 AF044074 U85096 E02873 U364885 X53952 X62288 AF107307 U41072 A1F044073 D42078 E01690 1376872 X03408 X55798 AF079518 U52961 AF044072 AF015929 E00876 1376871 U50629 X58434 AJ223806 U21636 AF044071 D10369 E00203 U76870 U38656 X06627 Y18018 U65000 AF044070 A48955 D83951 U76869 U58139 X12831 T- O A I OA A27 Y17795 U48826 I At044069 A4850i D 173 0 0 u /uouo AJ005647 1320503 jAF044068 A48500 D42144 U76867 L42943 X02529 AJ005646 U11789 jAF044067 A48499 D42143 1376866 U51474 Y00688 AJ005645 U311788 AF044066 A47600 D10489 U76865 U50077 X04121 X59477 X54338 A 12915 1151133 M63176 M10500 L01055 M63917 X59478 X51661 {A12913 1151132 L11998 M 10499 M83994 M58515 X63598 X05815 A12906 IX02588 L05004 AI-000934 J03947 L10909 X52593 X15574 A12905 X61716 L42764 IM10498 IJ03479 I M15067 WO 00/32825PC/B9000 PCT/IB99/02040 X76490 Y07536 A 12904 JX61719 M32103 M M10497 M64724 M92376 X81586 X02166 A 12903 JX61718 JU1 0927 M 18264 M14372 M62650 X72014 Z49245 
A12902 X67743 AH003057 J01786 M14371 M32312 X72013 X 16298 A A12901 X67742 1M73535 M33833 M14374 M20393 X71437 Z18852 A 12900 X67741 M73536 M32470 M15215 M90536 X62992 X68417 IA12899 X67740 U20782 M20270 M36694 M21854 X52594 X68425 A 12898 X67738 L37598 J03323 M37915 M36771 X14827 X17679 A12897 JU029 10 L37597 M33479 M12715 L14020 X13404 X63072 A12896 1AH003349 L36472 M94061 J04151 M81736 X17301 X02872 A09523 MI 1118 L25288 M37888 L22566 U 11702 X17688 V01277 A04518 JM18086 L25893 M76714 L13379 119300 X03097 X52543 A04517 jU19459 K02687 M 17123 L13378 L25372 Z16422 A 19943 A04512 U35773 L23109 M97169 L13377 L22565 Z33409 A19942 L41499 U26702 L07778 M81346 L13376 M58516 Z33408 A 19941 U 19770 U21221 M90056 M90693 L13375 U06462 Z33407 A 19940 X53818 U36379 J02615 M25257 L13374 L19298 Z33406 A19939 M20129 U06451 M18970 JM25256 M17348 M80252 Z33405 A19938 L43098 U35036 K02985 M25255 M17357 L11530 Z33404 A19937 L43082 fU20794 M21136 M25254 M17347 X75439 A19936 X03216 L25426 M10501 M25253 M28364 X62587 A17958 X70648 M86227 A.H000935 1M25252 M21319 WO 00/32825 PCT/I B99/02040 266 Table 16 Phage 44AHJD complete geriome sequence. 16668 nucleotides.
1 71 141 211 281 351 421 491 561 631 701 771 841 911 981 1051 1121 1191 1261 1331 1401 1471 1541 1611 1681 1751 1821 1891 1961 2031 2101 2171 2241 2311 2381 2451 2521 2591 2661 2731 2801 2871 2941 3011 3081 3151 3221 3291 3361 3431 3501 3571 3641 tccatttctt atcaaataca tttacttat gatgaaatcg aagaaacttc agaaacaatc gaccctgttg tgcaacaacc ttcagatgaa catgcgttct atgtcaattg ttacaggtga agtagcaaag gacaaatggg agactttatt acaatatagt ttaatgttga aattcacatt aggtgtatac tactaaactt aaaaatgctg aaatttatgt atctattgac taaatattct atgatataat taaaaccaga cgacaaagaa agaatcaact gaagaatcaa gaagaagaaa atgaaaacaa tactagaaca acgtattgct acaacaagta caacaaacac gaactagttg ataagttaga atgatgggta catcatatga atacaaataa aagtgaagat cgttgaggag gaataataaa tcagctaaat cagcgttaca ataattcaaa tacaatgttc aatcgatatt gcattaggta gaagaatacg tgattatgga aacgtaatta tccacgtatg aaacaacaat gatacacgtt aaaaagaaaa tttctgatat tgcaacaact ttttattcaa tagttataaa gaaacti cag ctgaagaacc attagaacct tcattagaac aatcagatgt tttagattag agattcaaga agttatggtg ttatggcaca agattttaat gaaacatttg atcgttttaa cacagtacca gcaactaagt tcaatttcca taatgtatta taaccaactt atctaaccta aattatgatt tcaacatata atattcggag gtgtataaat aatcaactga agaaaattta aactgaagaa tcaactgaag actacaacag atgaagatag aacaagtgac tactttttta aacagaatca aacaaagaag gaggaattta aacatgtatg ttaaataaac gaacagaatt tacaaattca ttcactttca acaatctaca aaaaatgaaa catgattatt caaaatcttg taaataaata tttattccct ttggttagct aaagagcaag attaacatgg acttatctaa tatatggtaa cggaattgtg aacattagca gacgcaacta gaagaaaaag aaatgcgtgc aagcaacatc aaaagaagat taaatataat gaagtacatc gatattgtga ttttaacaac agattgcagg cattgatttc ttacatattc ataaaattaa gacagaattt gaatcaactg ataaaacagt ttcgaaattt tcttcacaaa ataacgacta agggaaacaa aaatgaaaac aaacaatcat ctgcactttt gacatttggc aagattaatg attttattgg aaatgaggaa aagaaacaaa attacgcttt aatgttagtt ttagcaagca gtgcatcagg agattcatta acagatcacg gattactcat tgaatcaatt atccgaaaca aagtttttga agcaatccta aacttacaaa tggtgcaatt ggacaatata caactgtatc aaatcttatc ttttagatac taagattgca ttattagttt tgacgactta 
ggtggcgtgt aattgacttt ttacgtgcgt atggagatta tttacttatg atgtatctaa acttaaagag tatatgcgtt tattttggat attaattcaa attccataac cctgaatttg atgaagttac ttctttaata aaattttaat tactqaccaa aatgtacgta acaacagtgc aaaattaaaa aacacattcc ttaaagtaac tcaatcacaa tttactggca ttaaatataa aaaagaattt aagttacaaa accaagattc ttaggagata caattccagt tggtgctgta acgttgaaga aattaaacca aaatcagatt acgttacaca aggagcgt aa gttgttgaac tcaactatac taatgtagtt aatcaagcac attttattgt gagtggtaac tatactgatg tatttaaatc tgttaaaatg gcattaactg cattagccgt aaacagtaat aaaccgtatt gtgatataaa tcaatgaatt gaagttcgac ttatcagatt ttgaagcatt atcaagtgaa actgatgaaa cttatgtgtc taatacgatt caaaatcaaa aatatgaaca atcgcaacag caatcgtgat gttggtgaag caaatttc acgataaaag atttaaacgt acggttggta ctagaaataa atcaaacttt acctgactat ttaatacctg tttgttgtca tgcaaaataa aattagcaga agttgcttta agaaattaat gacgagtcaa tcacctatgt ttaatgcaga aaatgaaacg ggaatatcaa tgataaagaa agcggtgttt atctatttaa aaggtcgtga acgatgatga aacaacgtct tggctagata cacaatgact tgtaaatgat aataaattaa aaagaegttt tagctatcgt tactttttaa aaaatcattt tggcatgcaa gttgaaaaat catcgaatca attaccacaa gataacggta aaggtaatgc 3711 cgtaegattt aagaaagaaa 3781 gttaaataat ggcatataat 3851 ttataaaacg agagaacgtt 352i. 
ZLtatqatt atatt---- 3991 atgaattaaa aaaacgtttc 4061 atttagaggt tggttaaacg 4131 ggattaacat cggcatttgc 4201 ttaaagactt aattaaagat 4271 tgtgatgggc tttggtggta 4341 atgtactcta cacaatccga 4411 acttaatttc aagcatgcgt 4481 tggtgaaacg aaaatctggt 4551 tatgtattag atttagaaga 4621 catttacacc gttaattgat 4691 tegttcaaga gcagacgtaa gtgattactg acttacaatc aaatgctaca agtgaggt ta aaactgtgaa aaaaggtaca attttaaatg gaaaacgat atacaccgtt aaaaattggg acggtacgat tttatttaaa attgaccgtt ttcgcaacgc ttctcaaaaa attgtacagg tacatcacga ggctaaaggt gaagcaaatg aaaatcacat acaetggatt cattactat gatgtaaatc caaaaccaga aggtttaaac gttgagttat cttatgttta atcgttattt tagattttat tcagttagaa gcaaattatg attcttggt aatttccaat ttcaaaaacg atgattgtct acaaattcat accaattcaa tataatagtg tctcgctttt ctttaatcat tcaatcaact tgtgtccgaa tgacgatatc attgatttaa aacaaaatta gtgaattaag cagacgaaga ggcaaaaagt accaattacg tttttatcaa aaaatatcaa tggtagacac ttatacgatt tcaitaaatc cgttttatga tgatgaattt taatgaaaaa gtatttaaag acgattcatt ttttagatag tatgtattac acatgaggat acaaggcttc acagaacaca tctttagaca attcaactgg acattgatgt tgataataca taaatcgagt aacgaaagta caattcacta agcagtattt aatttgataa aaaatgtttt ttaaatattt tgatgacatt ttacgatgat agagcagatt acgacttaat gaaagcattt tgacagtatt attcatgacg gttactgaaa tgaaacaaat tcgttaatgg gtttgaatta agttaaccaa cctattaata cctgaaggtt tttggataaa gtggtcatgg tacaacaatc tggtgttgca aaactgttac ttaacagatt atacaccaca acaaactcat tttaagattc -gataatgta gaaaaagaaa agaattgatt caattcatgc gtttttcatt agaaatcaac tatttaaatg atgaagatac catgactgca acgttacgat atcaaaacgc aattgataat ttacaaattt cgtccatttt ataatactaa etatttao ccagagcaag agtttaaaaa gaatgacttt aatgagcttg ttgataaaga taaattaaca ggattagaac aagtcgcata gtcactttta ggtgacggaa tgacaattga aaaggtatgt taaaaccacc catttaaagc cattagtcca ggaagaatta caagaataaa caaaggaaat cagcaaaaga ggaattttta ccgctactaa tcagctttaa gacaaaacat atgtaaataa cacttacttt attaactaaa gaagatatat aagctatatg ataactgtat atatagaaat tatagaacat gcaagcaaaa tttagcaaga atatataacg gtgcaccatt caagtaatag cgtaatccca taactattta ggcattaatt 
aatcgtggat ttaccacatc agcgttatgg tttagatatt actttttaaa gatgaaagca aaaaaaggtt aaaaaatgct gaaagatgaa agacaaacag tggtttattc aacaagtaac aacagaaacg tcgctgataa aaaacgtaat attgataaag ggtagaggtg tagacgaaat ttcaaaatca gactatgaca cgaaagactt atatagcgca aaatcagaag aaccaaagt aacaaatcic cctagtggtg gtcaatccaa taaagataat aacaaacaca caatacaggt taattcagaa WO 00/32825 WO 0032825PCT/I B99/02040 4761 4831 4901 4971 5041 5111 5181 5251 5321 5391 5461 5531 5601 5671 5741 5811 5881 5951 6021 6091 6161 6231 6301 6371 6441 6511 6581 6651 6721 6791 6861 6931 7001 7071 aacaatgata gttcagttaa atttaagtta atttatacaa gtttccataa ttcacaaaac aatgacttaa tgaatggtag tacggttcaa gacgtattta atttggaagc taaccaaatg agtggtaacg gtaatgttaa taaatgaaat gtgcatatgg catgtggggt ccgagcttta tgttaagtgg ggaaaaagca agtaatagca catattatag aaaattatca tcagatggtt atggaaaaac tttttctttg gtttaagaca ttttctataa ttatttttta agaatggaaa attttgagga atattttgtc attgaacgtc tgttaaaagt atcgttggat gcaaggcatt ttcacatgtt caaatcggta tcatatcaag acggtattaa atccaaaaac aaaacgtaaa tttatatggt ttcttccaac tataaattaa caaaagacga cgcaagctgg tttttattat caagcgtata attgacgctg gaattaacac gtttctcaac cgttaaaatg ggattatgga aagtcaatac aataactgga gaattattta gagacgcgcc cagtcggtga agtaagacaa tgcggaaaca aaacatcgtg cacaacaaca agcaaaagaa atttcaatgt atggacttat aatgctaaag acgcgataaa aacctcaatt aggggacgtt aaatcttgat tattatacat accattagaa cacattatta aagcattaga aacatcaaaa aaatgaaaat ggtacattta gaacctaatg gctattggtt acgtatggat tggttataac aggtaatagt tacagtgttg aatttagttg gaaaagatac agttgaaaca attgaacgtt cacaccgttt acagactatc aatggtcgtc attttaaatc tcaatgttga tatgcagtgg tagaagatat tacgcttttg attgatacca ttatgacgta aacatttatc aaaacgcacg atcaaataaa aactatgttt gctgatttat caaagaaatt acaatatcac atcaccagtc gagtgcctat ccatggatta aaagacttag aggacgttaa caaaagaatg gagtctaaaa agatgaattt aaacatatga atgttactcg acgctggtaa atcataatga agttcgagtaI aaataaagaa atattgattg gtaccaatat taatcaataa gtcaattaat tacaaatcgt :gtgagtgta gcaagtaatt1 aaacaacaac aagctgaata acgcattcca aattgcgaat :acattttta 
caaaaatatt kttaacagta tgactgtttg aatatggttg ggtagcaaaa tggatataca agcatgaggg cagttgctti taatgacttt gctgtatata gct tagaaca tgacggtgta gtaaatacat catgtggttt ccaaccaaac tggcaaggca gtattccttg aaataagagg tat tggagga aaaacacgat gttagactat catgacgcac taaaccaaat tacacaaggg tataactata ataaccaaat tggtactaaa aacttatacg cgcaaaactt aaccagtgaa gatttatcat tacgtaatga gatttcacaa tatccagtag atacgggttc tggtatctta attgataatg :aagtccaac taaagattta agcattaacg atatgttgtt caattattta 7141 ccagtcaagc 7211 acgatttatg 7281 tggataaaat 7351 tattaataca 7421 ggtggtaaat 7491 tatctaaaaa 7561 tggaaatacg 7631 attattggtt 7701 tactcgctaa 7771 ttttgcacaa 7841 aatgcagaaa 7911 tttatgacgc 7981 taatttctac 8051 gaaatgggca 8121 ctaaagaaat 8191 tattgaacca 8261 atcgaccccat 8331 gttcaggtaat 8401 taaaattcag a
C
a gctgttgatg aatattcatt tttcccacgt t egt tat tac ttggtgagta cggt cgtgca attgacgggg gttgtttcat aggtcgtaaa ttatggacaa ttgcttatgt agaagaaatt acattagagg gtgatgatt aacaacaggt gataacttta ttgctatgac tgaacacttt ttatctattc gtactgcaga taatgtatac atggttaaaa caatcaaaac aacaacagct aaaaaagtgg ctaatatatc tgtgtattac aaaggtttag caaatggaca aaactggtta actcacttta t tggaaaatg tttaccaata ggttatacac cacgttatta 999ggtgttc tgtaaacaat aaaaaatgag tcattttaat tcaaaacaac aaggtattaa cgaatacgtg aatgtattag tgttaccaat gcaacaatat aaagagccaa ttatggaata tcaaaaggtt atactggtta agtggtaaca caaaagattt atgattatcc aagagcctga gggtatttgc aaacggcggt ggtggaaaac gaagcattac gcgcaagagg cagaccatat cgacgattta aaaacttaag aatatgccaa cctacaacac aaacattagg tggtgcgtgg tatgacttta tgacgcacca tatcaagaat ggtgagtatt acattacagg gtgcatggtt acgtgtgtca ggaatataaa gaattcttca catcaaaaat aggagtgata ggcaggtgtt gactttgatg attactgacg gtaaagttcg cgacggtgta taaaaataca atatggacat attcaatgtg ggcggcggtt ttgacggttg ttagacctaa attttcaggt gaaacgaaac caatacggca tttgcacgtg tcggtagtcc catataacga agtttgttta tttaccagtg cgccaatgga tcataatggg tattttagcc ggctgataga atcgtaagjaa aaagttaacg aattttaagt agtaataaag aacgtgatga cgtataattt tatacgtgat ctacatgacg tttttatcag aatgacgttg tggttaaaat agcaactctc aaacgtcaat gttacgtaat aatgatgatg ttggaaaatt tagtattatt acttagatac gtcaaaaggt tggtgacttt attaacttta caaatgttac ctaaagactt :gttaatgga :ccaatgtta tttacagac acaattaaaa gcaattttag 8471 8541 8611 8681 8751 8821 8891 8961 9031 9101 9171 9311 9381 9451 9521 9591 9661 9731 9801 9871 9941 10011 ctttttattt gtattaatgt tggtcaaaaa aatcaatgag tcattgacca gattttacaa gggactttct attgtagaaa agagtcatta aaaatgatac cattcctata ttaatgatga gctttaatgt atggcttggt tgtttcacct actctttttg catacgtctg cacgtggtga aacaagctgg acaaccgtct aaaaggtaat gcagattttg gcs ngt ,-t-nooranr'c caaaatccat tcagaagcgt tcgttgatat aggattttct ttaaaaggtg attgtgcaga tgaaaagagt ttttggttta taaataacaa ttcacatgtt tattacaggt aaaaaattat gtctactcat aatggacgta gataacaatg acgatagacc gtaaaatcag aaaaaatgag gatgctaaat caggtaacat 
gaaagcaggt ttaaattta ttctagcttg tggtgacagt aaaattacag gattaaaaac attaaaacag taagtttctc aaatcttcaa gagatgatgt gtatatgaca attgaatttt atgactggaa aaaactggtg ttaagttacg tacaaaatca attataacag tgctgaaaac gacagaccaa attcttaaat acaaatataa catttaatag ggacaatcac aacaagccaa ccgacaaaaa tattaaatgg tagcgacccg aaatcacgct tgctttattt ggtaagttta atgaagaata gccttacaac caccttctgt aactgaatca gtttaacgat gaaaattagt gtaccgtcac tggttttgaa gtgaatgact ataattcatt aaatgtacag gtacgtatac tatacgtgac aatctggtgt aagattttgg cataatgacg atttagagag ggggtataat atgaacgaag tatatacgct ggggatttaa aattactcta atttcaaaag caattaaaaa taataactta tgatattctg tattatcatt ttagcaaaca gattacaata ttttattata ttgcaaatga ttagtaccag aacaaattaa agataaatta aacgatcaag agaagataga taaatttacg agttagtccc taaagtaaaa ggatatgggc acaagtatta aaagcagtaa aagagatagg ggttttagtt ctggacttgg ttggttaaac :catagcaag aaagttagta tcacaatcaa :gtccacttt gtaccacaag acgtacaaag acaattggac gtgcatatat tccattaaca aacatcata taacaaagta caaaactatg Iggtggtaaa ttagacggta aaggtggatc ~gtagttcac tactcgcttt agcaaaacaa iatgggacgt tcatagtatt ggtagtgata :aacaacaca tatcatatta aaatgacgat..
:aagtagata gtgggagtag tagttctmat aatcagtcaa gccaaatgga aaaagtggtc iaaatataaa aaagcaattg gtgtaccttt :ctcaaacgg gtaatgcagg acaatgtaca ;acaacctac cgacgacggt caaataacaa Laaaacaaca cataatccaa cagtaggtta atggtattg gtcacacagg tgttgttgta ggtaatgaaa ccgtatatga ttatttaaca tggtatgacg caaaaaatat gcgaatacta gtggtagtag gtaatccatt acctagtgat gccatgcaag aattttttag tggtttactt cctactgatg gtgtgattgg attcaaaaaa gaattaacat acggtcagcg tggtttctct tttagacggt tcgtctgaca aattattaaa taatgattat gattcattaa atgacggaga tggtaactgg gaatacttat gggcgtatat tgtatggtac agtaaaccac aaaaatacaa gacgcattac tttacattag aaaaaacatt aaaaactgat tgatagcgtt ccataaacca attagtggta acatatgcac agttaccaga acaaaccagg taacatattt gtcacaacta catggtaaaa gtctataaaa agttaggtgc catacttaca agcaactgca WO 00/32825 PCT/I B99/02040 10081 gcagtttttg aagatggttc gtttttagtt gcaaactat 10151 tggtattgta tacactcatt aatggcgtac caaataatg 10221 tgcttaatta actatgctat aatgaacaca tgctagtaat 10291 ttttcgtaca catttttcat gttatctcaa aaagaaaagc 10361 atttcatcat gttcacgttt taatatatgc aaaccagatt 10431 agtcgttaag tgaaaatgaa ccgatgccac tttcaatata 10501 attttctcta gcgtctttta atataaattc acgtttcata 10571 acattaccac atacaatttc agttttagac ggatatac 10641 tattgttttc aataatggca ccgtcaaaga attgttcacg 10711 aaaggcgttt ttcggtatac cagcagaagc aattttaatc 10781 tgattcagta caaacatctt atctatctgt tcgttttcaa 10851 ataaactggg gttcaataag ggtttaacaa cggatttcat 10921 attgtcgtca atttcacttt cegttaagta ttggaaagga 10991 acaaatgtag agaataatat attacgttca gtgtttttgc 11061 11131 11201 11271 11341 11411 11481 11551 11621 11691 11761 11831 11901 11971 12041 12111 12181 12251 12321 12391 12461 12531 12601 12671 12741 12811 12881 12951 13021 13091 13161 13231 13301 13371 13441 13511 13581 13651 13721 tatcatctaa tttagataac ggtgatgtca cacgtgcatg cgtaatgtct tttacaatca caatcttata ttcgtaaaag atagaaaaac agaatgattt ttgatattgg aatgttaatt tcacgtcatt ggcttcacta tctgttaaat tatcaataat tttttctaaa tcttctgcag aaaaatgatt atcacgtctt atttcataac gtccgttaac atattgcatg tttttcgtat tcatttcatt cacgtaattt 
tctttataaa catattgttt atatttatca gccaaattat tgttacgttt attcatttta gcattaccta catatattgt tataataata ateacaccaa ttgatttaaa atgacgtgat aacgattttc acggaataag ataacctcct tattgatttt aaagtattca tgaatcattc tttgacgtaa taatgaaaaa tataaccatg aaggctcatc aatatagtca ttgagtaact tgttatagtc atgaatgtat tcattcatat aaccaccatc attaaattta attacatttt attttaaata atcgtattta ttcacatacg cattaaacca tttattgtac cattctagta ttaaaatgtg catttgagta gttttttaac catcatata ttaatgtatt tacacgtgat taattgtcat ttggaatttt aattagtttg taaaaattca gaaaacgtgt aaaatttgga gtaagttgtt cattatctct aagtaatttc aaaccarttta gatttttggc tatattttcg ttacagttat tataaaatgc accaattgct gctaatgttt atcccataat aacgtaatgc tgggtgttcg ttaccttgtg taacaaacga tgtattgata ttaattttaa catctaaaaa ttcatgatac tttatgtatt tatcatggaa catttcatta aatatatcac caggtgtgag atcaaaaatc ttacctaatg aaaacattgt taaaatagta cgtgtaatat gtgcaatcat gtcataaaaa acagaataag ctatatccca cttacacacc tatatttttt atgtaccacc atatgttgca ccatcacgtg *tggtgataat attgtattct ttagtggtat gctagtaaat aaaatacaaa acataatcaa ;agactgttat tttaacagtt gccttttttt tgttatgtac tgaacgttca actggaaata aagaatatca tcaaattgac tatggtcgaa Itraagttcat cagtaaaata ttcatcatat atattgtacc ttgctcatta tagatacttt tacaaaggtt tcaaaatcga cgcttgtatc tttccattca cttcatatgC atatttCtta tatcccattt acctaaggct atcgggtcga atacaaacta tcagtatcgc aataaataaa accaataagt tatacaatga acgtgatgtg aaccgttaat gatattgtat agttcattgt aggtatgcca tataatccat ttaaaacgac ttgatatcat cagtaatgtg atagtcgtaa ttttaataaa atagttttga aaaataatat attaacacgt atatgcatgc aatcaatacc ttaacgtaat cattatcatt atiatagtat ttaataaatc atcgttaaat acatctttat agtagggatt aacgttggtt ctgaatagtq atcacataag gataactcga attgatgtca tggtgttata catatttaaa ccaccacgat atgataatgt gtataagata ttttaatatc ttcaagtaag attccataat attcaatgaa tataatgaat atggcacata cctaatataa ttttgcaaaa catttcacag catagtcata gtataattaa aatctgtttt aagttgtgat ttgcaattga tgtattggtt ttcataaagt taaatctaaa ttgattgaag atttaacacg gccctctca tttttaatgt gtgttcattt tarcaaaata acgcatggtg tctttaagta gataatatct gtttttgatt 
ttgtgattgt gattcgaaac tcggaaatac ttcaacatca taacgttttt atatttggtt ggtttttttc gtataaaatc attcgacgtt catgtttatg ttttataagc catattgttt cattagatac cttttctttc aaactcactc atatttttct atctaaatat aaatatctat cattatcaac cttggatigt agtaataacg ttccatgttt tgtctatatc cctaatttta gtacaaaatt aaaagtagtt acattataca tgactttaat tgattgatgt gatttcttaa attatcatcc tagcgtcgtc atttgaatta aacgcacgtg taaaaaaatg tttttgtaaa gtcttgatgt ttggataata taggacttga aaagttgact caacgttacc atggttacgg tcgattgatt atcatcttca agtgctaaaa actcatcata tttaaatcag tggcactatt caaatctgta agtccctagc acttctgaac gtgacacgtt aaaatcacgg taagcgtcac gtaatgtata qctaaataaa taaaaaatga aacatagttg tctctatata gttatcttcq ttttatatga aattttataa ttttattcat ttagagtaag cattgtcaaa atgtaaattg acatatcatc acgtaaatag gtaacatgat tgtatatggt acaacgataa tatttgtcat tcagttttga tatagaagaa atcaccgttt attcgttaaa ttcaaattct ccagttgtca ttcattcacg taatcgtttc gtcgcattc tgcttttgta ataaattgta tatatttaaa gtaaaaacat tttagggaat ccaatataat ttttaactta tcccactcat caattaaata ggatagtgtt ttaaaaagtt agaatgatat tttctttatc ttgatagata aiagctaaat tagtggattt tcatctatga tttcttcaat aataaagtaa attttatatc aagtttaata catcagaacg gtttgaaata gatatataat aatctatatc atcattcata agttcatcaa gacatgattg acagcatttt gataatctct 13791 ctaattctat ttgattatac 13861 taccatgtct aaacgatttt 13931 gttaaattta ttcgtcaaat 14 001 cactggtcta tctgatttac 14071 gaattattat ttttagcttg 14141 catcattacg gttatatatt 14211 agcatttaca tatgatacgt 14281 atgtacgcct cttgtaaatc 14351 ctttttcttg ctcttttcta 14421 tgttatcaac ctccatataa 14491 ataaaacacc aaatgacacg 14561 Qaaaacaata Qctatqttta 14631 ttgataaata atcattaatt 14701 tttaataaaa tttctcttat 14771 catgtttttt aatatcaatg 14841 tttgtcaaac tcaatataca 14911 ttttcaaatt tatcattata 14981 tagttttaat tttcattttt 15051 ttatattcat aatatgaata 15121 aatgtccttt aatctcatcg 15191 aaaatagggg ataagtatcc 15261 tttttgacct tttttgtttt 1S331 acgtcctcat ctgctctcat gttttaccat gttgcataat caatttcttt taaattatat tcaagaatat ttctttcttt tttatcatca gcttctcttt agcataaata gtttttgctc ataaaaagtt atataaagta 
aaagcgtatt aatttaattt aggaaaatag aatttaaaat cttttttcca accattaaaa cttgtaatag atacctcaca tcgatttctt tatgcttttt tttatgtctt ggcagatgtg taacattact tctatccatt agataatata tttatgatgt tcttcatcat taatatattt agcgttttta gaacctctta caaaatgttt aaaatcattt tcagacgtat gaatataatc gtaattaaaa ttatcgtata cgtttacagt caatatttgt atacattgaa ttattagaat ccaatgatac ccatgtgtca aaaaataaat gtctaaccaa aatgtagtga atgctaaaag tgactttgtt ttatctggga tttccttct cgtatatgca gtgtcataag ataqiqttatt catggtcaat cgctttcaag aggtttatca taattcatta cttaaataat tcgatatcta acgtaaataa catcacccat atttattttt ataatctcta tttgttaaaa atatctcctt aatgtattct tacaacttta gcgtcatata caatataata ataccgtttt cctatgaaat tgtattaaaa tatgataagt cactttccca aaatttgata atacgtcgtc aattgtaacg ttctataata aaatacaagt atattaaaaa ctccttcca atttcaaaat catcatattg actatacatt ttttattaga tgaagtaaat ggtaataaat taaattattt aatctaaaag atgatatacg cgtatttttt agtgagibagg aatcttcaaa cattgagatt tgatgtggaa gtatttacgt tccatttaaa cacctcataa tgatacttga ccaaaattga ttgagtaacc tgaacttctc cagcattgat aatgtcaaca agcggttcgg taagtttacg ttatagtcat catattcata ttct t taat t aattgtgaat aattctgtta WO 00/32825 PCT/ I B99/02040 15401 15471 15541 15611 15681 15151 15821 15891 15961 16031 16101 16171 16241 16311 16381 16451 16521 16591 16661 cagtgacgat ttttttcata aatctcgcta t taaat tat t tgcgtgtagt ctcgtgaagt acacgtaagg cgtttcataa gattctggtt atagttgttg tttattgaat attattatca aacaccttgt ccgtattttt atgtgttttg tgctttctgc ggacaatagt ggtaaaaact taacaatgtc aatcctttat tagtttcgtt gcaagccgat agttgcaaca cttcctaata tcaccgaatt ctactaattc gtgtcttgat aattgcgatt ttacatgtgt cctcaatgta gtcaactttc gcatattcca gtttagttca aataagttaa t tt cagtata aagttgaaat tgaaatttca ttcaactcaa caaacgttta agaagaaata ttgattcttt tgatagtttg aaaatatctt tgtagtaaat ctggtaataa ttcattatca attattatat ttgttctatt tcatttaaga ttgcattgtc ggcttgtcct tacgcgtaaa ttttctgttc tgttttttgt aaagttttgt agattttaag gtttgtgaat ataaattctc ttacgtttgt cattgtaata ttcttttgct ccatctaagt CactcCtttc gggtcatcac attgaacaac aaatgtataa ttttcttctg acagaattat 
tatcaaataa ttcattttcc caatagtttt ttcaaatcat aatgctctaa cgatatactc tttctttttc ctcaaattca cattttaitt ctcctcttat aacttgaatt gttttcgttg tgtgttttgg ttaaatgata aatgaagtat ataacctttg taaaaaacgt aaacgttata cagcaatata agacaatatt agaactatta tagtttaata gctggattcc attgaatcag gtgcattatc aacattaacc gatttaaatc ttcaatttca cgctatacat aaacttccat atttttgtta ctccttgttt tcttaaaaaa gtttaaataa aattttgaat aaaagtcaat tcaatgtcaa catcataaaa tcttaaaacg aaaaacatgc tgattacata cttagtatag ttttaaaact actatttaat agatacataa attttgtatt ttttaagttt agtaaagaaa ggtggggt tgatgaatat gtaataggtt agataagttg gttaagttgt tgcacagtat tgataagtaa atttataagt tttgatttgt ataatcgttt attttaaacc

Table 17
Phage 44AHJD ORFs list

nb  Name          Frame  Position       Size (a.a.)  Key words
1   44AHJDORF001  -1     10342..12627   761          DNA polymerase
2   44AHJDORF002  3      3789..5732     647          Teichoic acid; Staph
3   44AHJDORF003  2      6626..8389     587          Tail
4   44AHJDORF004  1      8764..10227    487          Serine protease motif
5   44AHJDORF005  -1     12643..13890   415
6   44AHJDORF006  2      803..2029      408
7   44AHJDORF007  1      2044..3027     327          Upper collar
8   44AHJDORF008  2      3020..3775     251          Lower collar
9   44AHJDORF009  2      5744..6496     250          Amidase; Staph
10  44AHJDORF010  -2     13938..14420   160
11  44AHJDORF012  3      8391..8813     140          Holin
12  44AHJDORF013  -2     14586..14996   136
13  44AHJDORF113  1      199..600       133
14  44AHJDORF011  -2     15225..15593   122
15  44AHJDORF114  -2     15870..16172   100
16  44AHJDORF014  3      6243..6521     92
17  44AHJDORF015  1      15403..15645   80
18  44AHJDORF016  -1     15616..15852   78
19  44AHJDORF017  -2     10536..10757   73
20  44AHJDORF018  -1     886..1098      70
21  44AHJDORF019  -2     9630..9836     68
22  44AHJDORF121  -1     16165..16362   65
23  44AHJDORF020  2      13865..14053   62
24  44AHJDORF123  2      614..796       60
25  44AHJDORF021  -2     5634..5816     60
26  44AHJDORF023  -2     6315..6494     59
27  44AHJDORF024  1      14275..14451   58
28  44AHJDORF025  -3     14999..15175   58
29  44AHJDORF026  -3     14426..14593   55
30  44AHJDORF027  1      12916..13080   54
31  44AHJDORF029  -1     15019..15183   54
32  44AHJDORF028  -3     9071..9235     54
33  44AHJDORF030  3      14487..14648   53
34  44AHJDORF031  2      11039..11191   50
35  44AHJDORF135  3      693..842       49
36  44AHJDORF033  -1     3646..3795     49
37  44AHJDORF032  -2     9306..9455     49
38  44AHJDORF034  -3     14000..14146   48
39  44AHJDORF035  -3     13811..13957   48
40  44AHJDORF036  -3     10019..10165   48
41  44AHJDORF022  -3     8468..8611     47
42  44AHJDORF037  1      14788..14931   47
43  44AHJDORF038  -2     3528..3671     47
44  44AHJDORF039  3      1743..1883     46
45  44AHJDORF040  2      9740..9877     45
46  44AHJDORF041  2      15836..15973   45
47  44AHJDORF042  -1     5014..5151     45
48  44AHJDORF043  -1     4402..4539     45
49  44AHJDORF044  -2     12783..12917   44
50  44AHJDORF149  -2     639..770       43
51  44AHJDORF046  1      4891..5019     42
52  44AHJDORF047  1      11911..12039   42
53  [illegible]   2      10655..10783   42
54  44AHJDORF048  -3     15212..15340   42
55  44AHJDORF049  3      5784..5909     41
56  44AHJDORF050  3      13158..13283   41
57  44AHJDORF051  -2     10944..11066   40
58  44AHJDORF052  -3     14216..14338   40
59  44AHJDORF053  3      3348..3467     39
60  44AHJDORF054  3      7551..7670     39
61  44AHJDORF055  3      15705..15821   38
62  44AHJDORF056  1      5512..5625     37
63  44AHJDORF057  2      10121..10231   36
64  44AHJDORF058  3      10767..10877   36
65  44AHJDORF164  -1     592..702       36
66  44AHJDORF059  -2     8250..8360     36
67  44AHJDORF060  -2     6147..6257     36
68  44AHJDORF061  2      15551..15658   35
69  44AHJDORF062  1      4285..4389     34
70  44AHJDORF063  -3     9383..9487     34
71  44AHJDORF065  1      5029..5130     33
72  44AHJDORF064  2      2609..2710     33
73  44AHJDORF066  -2     10380..10481   33

Table 18
Predicted amino acid sequences

44AHJDORF001
12627 atgggattactagaatgcatgcaatatcataaacatgaacgtcgaatgattttatactgggatatagaaacattagcgtacaat 1 M G L L E C M Q Y H K H E RR M I L Y W D I ET L AY N 12543 aagtagagaaaccacattaacta tt~ga~atgtgttagtagat 29 K V N G ft K K P T K Y K N V T Y S V A I G W F N G Y E I 12459 gatgttgaagtatttccgagtttcgaatctttttatgacgcattttatacgtatgtgaaaagacgtgatacaatcacaaaatca 57 D V E V F P S F E S F Y D A F Y T Y V K ft R D T I T K S 12375 aaaaaattagtgaaacgatatcgtactttataaaact~tatta K T D I I M I A H N C N K Y D N H F L L K D T M R Y F D 12291 aaatcccaatttttaacgaagaataccctaaagaggcatttac 113 N I T R Et N I Y L K S A E Et N E H T L K M K ft A T I L A 12207 a~aajcaaaatgtaattttagaaaaacgtgttaaatcttcaatcaatttagatttaacaatgtttttaaatggttttaaattt 141 K N Q N V I L E K R V K S S I N L D L T M F L N G F K F 12123 ~aattattgataactttatgaaaaccaatacatcaattgcaacattag9taagaaattacttgatggtggttatttaacagaa 169 N I I D N F M K T N T S I A T L G K K L L D G G Y L T E 12039 tcacaacftaaaacagattttaattatacgatttttgataaagataatgatatgaatgatagtgaagcctatgactatgctgtg 197 S Q L K T D F N Y T I F D K D N D M N D S E A Y D Y A V 11955 ~aatgtttgcaaaactcacaccgaacaacttacatacattcataatacgtgattatattaggtattgccatattcattat 225 K C F A K L T P ft Q L T Y I H N D V I I L G M C H I H Y 11871 agtgatatatttccaaattttgactataacaaattaacattttcattgaatattatggaatcttacttgaataatgaaatgaca 253 S D I F P N F D Y N K L T F S L N I M E S Y L N N E M T 11787 cgttcagttactcaaccaatatcaagatattaaaatatcttatacacattatcatttccatgatatgaatttttatgactat 281 Rt F Q L L N Q Y Q 0 I K I S Y T H Y H F H D M N F Y D Y 11703 attaaatcattctatcgtggtggtttaaatatgtataacaccaaatacataaacaaactaattgatgagccttgtttttctatt 309 I K S F Y ft G G L N M Y N T K Y I N K L I D E P C F S I 11619 gaacatggttctttagactaaaatcaagtaattagaattcgac 337 D I N S S Y P Y V M Y H Et K I P T W
L Y F Y E H Y S E P 11535 acgttaatcctacttttttagatgatacaattatttttcatatataaggtaaagatgtatttaacgatgatttatta 365 T L I P T F L D D D N Y F S L Y K I D K 0 V F N D D L L 11451 ataataacctttagcatattaatcaatagtagtagtaa~aaaa 393 1 K I K S R V L ft Q M I V K Y Y N N D N D Y V N I N T N 11367 acattaagaatgattcaagacattacgggtattgattgcatgcatatacgtgttaattcgtttgttatatatgaatgtgaatac 421 T L R M I Q D I T G I D C M H I ft V N S F V I Y Et C E Y 11283 tttcatgcacgtgatattatttttcaaaactattttattaaaacacaaggtaagttaaaaaacaaaatCaatatgacatcacct 449 F H A ft D I I F Q N Y F I K T Q G K L K N K I N M T S P 11199 tacgact atcacattactgatgatatcaacgaacacccatactcaaatgaggaggttatgttatctaaagtcgttttaaatgga 477 Y D Y H I T D D I N ft H P Y S N Et E V M L S K V V L N G 11115 ttatatggcatacctgcattacgttcacattttaacttattccgtttagatgataacaatgaactatacaatatcattaacggt 505 L Y G I P A L R S H F N L F ft L D 0 N N E L Y N I I N G 11031 tacaaaaacactgaacgtaatatattatCctacatttgtcacatcacgttcattgtataacttattggttcctttccaatac 533 Y K N T ft ft N I L F S T F V T S ft S L Y N L L V P F Q Y 10947 ttaacggaaagtgaaattgacgacaattttatttattgcgatactgatagtttgtatatgaaatccgttgttaaacccttattg 561 L T ft S ft I D D N F I Y C D T D S L Y H K S V V K P L L 10863 aaccccagtttattcgacccgatagccttaggtaaatgggatattgaaaacgaacagatagataagatgtttgtactgaatcat 589 N P S L F 10 P I A L G K W 0 I ft N ft Q I D K M F V L N H 10779 aagaaatatgcatatgaagtgaatggaaagattaaaattgcttctgctggtataccgaaaaacgcctttgatacaagcgtcgat 617 K K Y A Y ft V N G K I K I A S A G I P K N A F D T S V 0 10695 tttgaaacctttgtacgtgaacaattctttgacggtgccattattgaaaacaataaaagtatctataatgagcaaggtacaata 645 F ft T F V ft ft Q F F D G A I I ft N N K S I Y N ft Q G T I 10611 tcgatatatccgtctaaaactgaaattgtatgtggtaatgtatatgatgaatattttactgatgaacttaatatgaaacgtgaa 673 S I Y P S K T ft I V C G N V Y D Et Y F T D ft L N M K R ft 10527 tttatattaaaagacgctagagaaaatttcgaccatagtcaatttgatgataftcttatattgaaagtgacatcggttca t 701 F I L K D A ft ft N F 0 H S Q F D) D I L Y I ft S D I G S F 10443 
tcacttaacgacttatttccagttgaacgttcagtacataacaaatctgatttgcatatattaaaacgtgaacatgatgaaata 729 S L N D L F P V ft ft S V H N K S D L H I L K ft E H D ft I 10359 aaaaaaggcaactgttaa 10342 757 K K G N C 44AHJDORFOO2 3789 aiggcatataatgaaaacgattttaaatattttgatgacattcgtccattt ttagacgaaatttataaaacgagagaac9gttat 1 MA Y N EN D F K Y F D D I ftP F L DElI Y K T ft ft ft Y 3873 acaccgttttacgatgatagagcagattataatactaattcaaaatcatattatgattatattcaagattatcaaaactaa~t 29 T P F Y D D R A D Y N T N S K S Y Y D)Y I S ft L S fI 39S7 gaagtattagcacgtcgtatttgggactatgacaatgaattaaaaaaacgtttcaaaaattgggacgacttaatgaaagcattt 57 ft V L A R ft I W D V D N ft L K K ft F K N W 0 D L M K A F 4041 ccagagcaagcgaaagacttatttagaggttggttaaacgacggtacgattgacagtattattcatgacgagtttaaaaaatat P ft Q A K 0 L F ft G W L N D C T I D S I I H D ft F K K Y 4125 agcgcaggat taacatcggcatttgctttatttaaagttactgaaatgaaacaaatgaatgactttaaatcagaagttaaagac 113 S A G L T S A F A L F K V T ft M K Q M N D F K S ft V K D 4209 ttaattaaagatattgaccgtttcgttaatgggtttgaattaaatgagcttgaaccaaagtttgtgatgggctttggtggtat WO 00/32825 PCT/I B99fO2040 273 141 L I K D I D R F V N G F E L N E L E P K F V M G F G G I 4293 cgcaacgcagttaaccaatctattaatattgataaagaaacaaatcacatgtactctacacaatccgattctcaaaaacctgaa 169 R N A V N Q S I N I D K E T N H M Y S T Q S D S Q K P E 4377 ggtttttggataaataaattaacacctagtggtgacttaatttcaagcatgcgtattgtacagggtggtcatggtacaacaatc 197 G F W I N K L T P S G D L I S S M R I V Q G G H G T T I 4461 ggattagaacgtcaatccaatggtgaaatgaaaatctggttacatcacgatggtgttgcaaaactgt tacaagtcgcatataaa 225 G L E R Q S N G E M K I W L H H D G V A K L L Q V A Y K 4545 gataattatgtattagatttagaagaggctaaaggtttaacagattatacaccacagtcacttttaaacaaacacacatttaca 253 D N Y V L 0 L E E A K G L T D Y T P Q S L L N K H T F T 4629 ccgttaattgatgaagcaaatgacaaactcattttaagattcggtgacggaacaatacaggttcgttcaagagcagacgtaaaa 281 P L I D E A N D K L I L R F G D G T I 0 V R S R A D V K 4713 aatcacattgataatgtagaaaaagaaatgacaattgataattcagaaaacaatgataatcgttggatgcaaggcattgctgtt 309 N H I 0 N V E K 
E M T I D N S E N N D N R W M Q G I A V 4797 gatggtgatgat ttatactggt taagtggtaacagt tcagt taattcacatgttcaaat cggt aaat at tcatt aacaacaggt 337 D G D D L Y W L S G N S S V N S H V Q I G K Y S L T T G 4881 caaaagatttatgattatccatttaagttatcatatcaagacggtattaatttcccacgtgataactttaaagagcctgagggt 365 Q K I Y D Y P F K L S Y Q D C I N F P R. 0 N F K E P E C 4965 atttgcatttatacaaatccaaaaacaaaacgtaaatcgttattacttgctatgacaaacggcggtggtggaaaacgtttccat 393 I C I Y T N P K T K R. K S L L L A N T N G C C C K R. F H 5049 aatttatatggtttcttccaacttggtgagtatgaacactttgaagcattacgcgcaagaggttcacaaaactataaattaaca 421 N L Y G F F Q L C E Y E H F E A L R. A R C S Q N Y K L T 5133 aaagacgacggtcgtgcattatctat tccagaccatatcgacgatttaaatgacttaacgcaagctggtttttattatattgac 449 K D D G R A L S I P D H I D D L N D L T Q A C F Y Y I D 5217 gggggtactgcagaaaaacttaagaatatgccaatgaatggtagcaagcgtataattgacgctggttgtttcattaatgtatac 477 C C T A E K L K N N P N N C S K R I I D A C C F I N V Y 5301 cct acaacacaaacatt aggtacggtt caagaattaacacgt ttct caacaggt cgtaaaatggt taaaatggtgcgtggtatg SOS P T T Q T L C T V 0 E L T R F S T C R K N V K N V R C N 5385 actttagacgtatttacgttaaaatgggattatggattatggacaacaatcaaaactgacgcaccatatcaagaatatttggaa 533 T L D V F T L K W D Y C L W T T I K T D A P Y Q E Y L E 5469 gcaagtcaatacaataactggattgcttatgtaacaacagctggtgagtattacattacaggtaaccaaatggaattatttaga 561 A S Q Y N N W I A Y V T T A C E Y Y I T C N Q M E L F R.
5553 gacgcgccagaagaaattaaaaaagtgggtgcatggttacgtgtgtcaagtggtaacgcagtcggtgaagtaagacaaacatta 589 D A P E E I K K V G A W L R V S S G N A V G E V R Q T L 5637 gaggctaatatatcggaatataaagaattcttcagtaatgttaatgcggaaacaaaacatcgtgaatatggttgggtagcaaaa 617 E A N I S E Y K E F F S N V N A E T K H R E Y G W V A K 5721 catcaaaaatag 5732 645 H Q K
44AHJDORF003
6626 atgagaaagttaacgaattttaagtttttctataacacaccgtttacagactatcaaaacacgattcattttaatagtaataaa 1 M R K L T N F K F F Y N T P F T D Y Q N T I H F N S N K 6710 gaacgtgatgattattttttaaatggtcgtcattttaaatcgttagactattcaaaacaaccgtataattttatacgtgataga 29 E R D D Y F L N G R H F K S L D Y S K Q P Y N F I R D R
6794 atggaaatcaatgttgatatgcagtggcatgacgcacaaggtattaactacatgacgtttttatcagattttgaggatagaaga 57 N E I N V D N Q W H D A Q C I N Y N T F L S D F E D R R 6878 tattacgcttttgtaaaccaaatcgaatacgtgaatgacgttgtggttaaaatatattttgtcattgataccattatgacgtat Y Y A F V N Q I E Y V N D V V V K I Y F V I D T I N T Y 6962 acacaagggaatgtattagagcaactctcaaacgtcaatattgaacgtcaacatttatcaaaacgcacgtataactatatgtta 113 T 0 C N V L E Q L S N V N I E R Q H L S K R. T Y N Y M L 7046 ccaatgttacgtaataatgatgatgtgttaaaagtatcaaataaaaactatgtttataaccaaatgcaacaatatttggaaaat 141 P M L R N NODD V L K V S N K N Y V Y N Q N Q Q Y L EN 7130 ttagtatiattccagtcaagcgctgatttatcaaagaaatttggtactaaaaaagagccaaacttagatacgtcaaaaggtacg 169 L V L F QS S AODL S K K F C T K K E P N LODT S K C T 7214 at t tatgacaat at cacatcaccagt caact tatacgttatggaatatggtgact ttattaact t tatggataaaatgagtgcc 197 I Y 0 N I T S P V N L Y V N E Y C D F I N F N D K N S A 7298 tatccatggatacgcaaaactttcaaaaggtcaaatgttacctaaagacttataatacaaaagacttagaggacgttaaa 225 Y P W I T Q N F Q K V Q N L P K 0 F I N T K D L E 0 V K 7382 accagtgaaaaaattacaggattaaaaacattaaaacagggsggtaaatcaaaagaatggagtctaaaagatttatcattaagt 253 T S E K I T C L K T L K 0 C C K S K E W S L K 0 L S L S 7466 t t ct caaatct tcaagagatgatgt tat ctaaaaaagatgaat ttaaacatatgatacgt aatgagt atatgacaattgaarttt 281 F S N L Q E N M L S K K 0 E F K H M I R N E Y N T I E F 7550 t atgact ggaatggaaatacgatgt tact cgacgct qgtaagat t tcacaaaaaactggtgt taagt tacgtacaaaatcaa~tt 309 Y 0 W N C N T N L L 0 A C K I S 0 K T C V K L R T K S 1 7634 at tggt tat cat aatgaagt tcgagt at atccagtagat tat aacagtgctgaaaacgacagaccaatactcgc taaaaat aaa 337 I C Y H N E V R. 
V Y P V 0 Y N S A E N D R P I L A K N K 7718 gaaacattgattgatacgggtcattctaaatacaaatataacatttaatagttttgcacaagtaccaarataaccaacaac 365 E I L I 0 T C S F L N T N I T F N S F A Q V P I L I N N 7802 ggtatcttaggacaatcacaacaagccaaccgacaaaaaaatgcagaaagtcaattaattacaaatcgtattgataatgtatta 393 C I L C Q S Q Q A N R Q K N A E S Q L I T N R I D N V L 7886 aatggtagcgacccgaaatcacgcttttatgacgctgtgagtgtagcaagtaatttaagtccaactgctetattggtagttt 421 N C S 0 P K S R F Y 0 A V S V A S N L S P 7 A L P-C K F 7970 aatgaagaatataatttctacaaacaacaacaagctgaatataaagatttagccttacaaccaccttctgtaactgaatcagaa 449 N E E Y N F Y K Q Q Q A E Y K D L A L 0 P P S V 7 E S E 8054 atgggcaacgcattccaaattgcgaatagcattaacggttaacgatgaaaattagtgtaccgtcacctaaagaaattacattt 477 N C N A F Q I A N S I N C L T N K I S V P S P K E I T F 8138 ttacaaaaatattatatgttgtttggttttgaagtgaatgactataattcatttattgaaccaattaacagtatgactgtttgc

[Page 274 is illegible in this copy. It carries the end of 44AHJDORF003 (from position 8138) and the sequences of 44AHJDORF004 (8764..10227) and 44AHJDORF005 (12643..13890).]

44AHJDORF006
803 atggcacaacaatctacaaaaaatgaaactgcacttcttagtagcaaagtcagctaaatcagcgttacaagattttaatcatgat 1 M AO Q S T K N ET A L L V A K S A KS A L Q DEF NH D 887 tattcaaaatcttggacatttggcgacaaatgggataattcaaatacaatgttcaaacatttgtaaataaatatttattccct 29 Y S K S W4 T F C D K W4 D N S N T M F E T F V N K Y L F P 971 aagattaatgagactttattaatcgatattgcattaggtaatcgttttaattggttagctaaagagcaagattttattggacaa 57 K I N E T L L I D I A L C N R F N W L A K E Q D F I C Q 1055 tatagtgaagaatacgtgattatggacacagtaccaattaacatggacttatctaaaaatgaggaattaatgttgaaacgtaat Y S E E Y V I M D T V P I N M D L S K N E E L M L K R N 1139 tatccacgtatggcaactaagttatatggtaacggaattgtgaagaaacaaaaattcacattaaacaacaatgatacacgtttc 113 Y P R M A T K L Y C N C I V K K Q K F T L N N N D T R F 1223 aatttccaaacattagcagacgcaactaattacgctttaggtgtatacaaaaagaaaatttctgatattaatgtattagaagaa 141 N F 0 T L A D A T N V A L C V Y K K K I S D I N V L E E 1307 aaagaaatgcgtgcaatgttagttgattactcattgaatcaattatccgaaacaaatgtacgtaaagcaacatcaaaagaagat 169 K E M R A M L V D Y S L N Q L S E T N V R K A T S K E D 1391 ttagcaagcaaagtttttgaagcaatcctaaacttacaaaacaacagtgctaaatataatgaagtacatcgtgcatcaggtggt 197 L A S K V F E A I L N L Q N N S A K Y N E V N R A S C C 1475 gcaattggacaatatacaactgtatcaaaattaaaagatattgtgattttaacaacagattcattaaaatcttatcttttagat 225 A I C Q Y T T V S K L K D I V I L T T D S L K S Y L L D 1559 actaagattgcaaacacattccagattgcaggcattgatttcacagatcacgttattagttttgacgacttaggtggcgtgttt 253 T K I A N T F Q I A C I 0 F T D H V I S F D D L G C V F 1643 aaagtaacaaaagaatttaagttacaaaaccaagattcaattgactttttacgtgcgtatggagattatcaatcacaattagga 281 K V T K E F K L Q N Q D S I D F L R A Y C D Y Q S Q L G 1727 gatacaattccagttggtgctgtatttacttatgatgtatctaaacttaaagagtttactggcaacgttgaagaaattaaacca 309 D T I P V G A V F T V D V S K L K E F T C N V E E I K P 1811 aaarcagatttatatgcgtttattttggatattaatttaattaaatataaacgttacacaaaaggtatgttaaaaccaccattc 337 K S D L V A F I L D I N S I K Y K R V T K C M L K P P F 1895
cataaccctgaatttgatgaagttaCaCaCtggattcattactattCatttaaagCCattagtccattctttaataaaatttta 365 H N P E F D E V T H W4 I H V V S F K A I S P F F N K I L 1979 acracsgaccaagatgcaaatccaaaaccagaggaagaattacaagaataa 2029 393 I T 0 0 D V N P K P E S E L Q E 44AHJDORFOO7 2044 atgaacaacgataaaagaggtttaaacgttgagttatcaaaggaaatcagcaaaagagttgttgaacatcgCaaCagatttaaa 1 M N N D K R G L N V E L S K E5I5 K R V VSE H R N R F K 2128 cgtcttatgtttaatcgttatttggaatttttaccgctactaatcaactataCcaatCgtgatacggttggtatagattttatt 29 R L M F N R Y L E F L P L L I N Y V N R D T V C I D F I 2212 cagttagaatcagctttaagacaaaacattaatgtagttgttggtgaagctagaaataagcaaattatgattCttggttatgta 57 Q L E S A L R Q N I N V V V C E A R N K 0 I M I L C V V 2296 aataacacttactttaatcaagcaccaaatttttcatcaaactttaatttccaatttcaaaaacgattaactaaagaagatata N N T Y F N Q A P N F S S N F N F 0 F Q K R L T K E D I 2380 tattttattgtacctgactatttaatacctgatgattgtctacaaattcataagctatatgataactgtatgagtggtaacttt 113 V F I V P D V L I P D D C L 0 I H K L Y D N C M S C N F 2464 gttgtcatgcaaaataaaccaattcaatataatagtgatatagaaattatagaaCattataCtgatgaattagcagaagttgCt 141 V V M Q N K P I 0 Y N S D I E I I E H Y T D E L A E V A 2548 ttatctcgcttttctttaatcatgcaagcaaaatttagcaagatatttaaatcagaaattaatgaCgagtcaatcaatCaactt 169 L S R F S L I M 0 A K F S K I F K S E I N D E S I N 0 L 2632 gtgtccgaaatatataacggtgcaccatttgttaaaatgtcacctatgtttaatgcagatgacgatatCattgatttaacaagt 197 V S E I V N C A P F V K M S P M F N A D D D I I D L T S 2716 aatagcgtaatcccagcattaactgaaatgaaaCgggaatatcaaaaCaaaat tagtgaattaagtaactatttaggcattaat 225 N S V I P A L T E M K R E V 0 N K I S E L S N V L C I N 2800 tcattagccgttgataaagaaagcggtgtttcagacgaagaggcaaaaagtaatcgtggatttaccacatcaaacagtaatatc 253 S L A V D K E S C V S D EE A K S N R C F T T SN S N I 2884 tatttaaaaggtcgtgaaccaattacgtttttatcaaagcgt tatggtttagatattaaaccgtat tacgatgatgaaacaacg 281 Y L K G R E P I T F L S K R V G L D I K P Y V D D E T T 2968 tctaaaatatcaatggtagacacactttttaaagatgaaagCagtgatataaatggCtag 3027 309 S K I S M V D T L F K D E 
S S 0 I N CG 44AH1JDORFOOS 3020 atggctagatacacaatgactttatacgatttcattaaatcagaattgattaaaaaaggtttcaatgaatttgtaaatgataat 1 M AR Y T M T L V DOF I1K S E L I K K C F N E F V N D N 3104 aaattaacgttttatgatgatgaatttcaattcatgcaaaaaatgCtgaagttCgaCaaagaCgtt ttagctatcgttaargaa 29 K L T F V D 0 E F 0 F M Q K M L K F D K 0 V L A I V N E 3188 aaagtatttaaaggtttttcattgaaagatgaattatcagatttaCtttttaaaaaatcatttacgattCattttttagataga 57 K V F K G F S L K D E L 5 D L L F K K S F V I H F L D R 3272 gaaatcaacagacaaacagttgaagcatttggcatgcaagtgattaCtgtatgtattacacatgaggattatttaaatgtggtt E I N R Q T V E A F C M 0 V I T V C I T H E D V L N V V 3356 tattcatcaagtgaagttgaaaaatacttacaatCacaaggCttCaCagaacaCaatgaagatacaaCaagtaaCaCtgatgaa ii S S S E V E K Y L 0 S Q0 3 F H. N E D T TS 3440 acatcgaatcaaaatgctacatctttagacaattCaactggCatgactgcaaacagaaaCgCttatgtgtCattaccacaaagt 141 T S N Q N A T S L D N S T C N V A N R N A V V S L P 0 S 3524 gaggttaacattgatgttgataatacaacgttacgattcgctgataataataCgattgataacggtaaaactgtgaataaatcg 169 E V N I D V 0 N T T L R FA D N N T I D N GOK ftV N-2K'S 3608 agtaacgaaagtaatcaaaacgcaaaacgtaatCaaaatcaaaaaggtaatgCaaaaggtaCaCaattCaCtalgcagtattta 197 S N E S N Q N A K R N 0 N 0 K C N A K C T Q F V K 0 Y L 3692 attgataatattgataaagcgtacgatttaagaaagaaaattttaaatgaatttgataaaaaatgt tttttacaaatt tggtag 3775 225 I D N I 0 K A V D L R K K I L N E F 0 K K C F L Q I W4 44AHJDORFOO9 WO 00/32825 PCT/I B99/02040 276 5744 atgaaatcacaacaacaagcaaaagaatggatatataagcatgagggggcaggtgttgactttgatggtgcatatggatttcaa 1 N K S Q Q Q A K E W I Y K H E G AG V D FODG A Y G F Q 5828B tgtatggacttat cagttgcttatgtgtattacattactgacggtaaagtt cgcatgtggggtaatgctaaagacgcgataaat 29 C M D L S V A Y V V Y I T D G K V R N W G N A K D A I N 5912 aatgact ttaaaggt ttagcgacggtgtataaaaatacaccgagctttaaacctcaattaggggacgttgctgtatatacaaat 57 N D F K G L A T V Y K N T P S F K P Q L G D V A V V T N 5996 ggacaat atggacatat tcaatgtgtgt taagtggaaat ct tgat tattatacatgct tagaacaaaactggttaggcggcggt G Q V G H I Q C V L S G N L D Y Y T C L E Q N W 
L G G G 6080 t ttgacggt tgggaaaaagcaaccattagaacacattat tatgacggtgtaact cact ttat tagacctaaat t ttcaggtagt 113 F D G W E K A T I R T H V Y D G V T H F I R P K F S G S 6164 aat ag caaagcat tagaaacat Caaaagt aaat acat t tggaaaat ggaaacgaaac caat a cgca cat at tat agaaatgaa 141 N S K A L E T S K V N T F G K W K R N Q V G T V V R N E 6248 aatggtacatttacatgtggttttttaccaatatttgcacgtgtcggtagtccaaaattatcagaacctaatggctattggttc 169 N G T F T C G F L P I F A R V G S P K L S E P N G V W F 6332 caaccaaacggttatacaccatataacgaagtttgtttatcagatggttacgtatggattggttataactggcaaggcacacgt 197 Q P NHG YVT P Y N E V C L S D G V V W IG Y N W Q G T R 6416 tat tat t taccagtgcgccaatggaatggaaaaacaggt aat agt tacagtgt tggt attccttggggggtgttctcataa 6496 225 V V L P V R Q W N G K T G N S Y S V G I P W G V F S 44AHJTDORFO1O 14420 ttggttagacatacgtctgaaatggatagatggaaaaaagaaagagaagctagaaaagagcaagaaaaagatttatttttaaat 1 L V R H T S ENMD R. W K K E R E A R. K E Q E K D L F L N 14336 gattttagtaatgttaattttaaatttgatgataaagatttacaagaggcgtacattgacacatggaaacattttgcacatctg 29 D F S N V N F K F D D K D L Q E A Y I D T W K H F A H L 14252 ccctatttccaaagaaagaaacgtacaatgtaaatgctgtatcattggtaagaggttcaagacataaaaaattaaattat 57 P V F P K E R N V S V V N A V S L V P. G S R H K K L N V 14168 attcttgaaatatataaccgtaatgatgattctaataataaaaacgctaaaaagcataaatacgctttatataatttacaagct 1 L E I V N R N D D S N N K N A K K H K Y A L Y N L Q A 14084 aaaaacaataattcttcaatgtataaatatattaaagaaatcgatactttatataaagaaattggtaaatcagatagaccagtg 113 K N N N S S N Y K V I K E I D T L V K E I G K S D R. P V 14000 acaaatattgatgatgaagatgtgaggtataactttttatattatgcaacatttgacgaataa 13938 141 T N I D DE D V R Y N F LVY Y A T F D E 44AHJDORFOll 15593 atgacaaacgtaaaagatattttatcaagacaccaaaacacattagcgagatttgaatttgaggaaaaagaaagagaatttatc 1 M T N V K DOI LS R H Q N T L A R. FE F E E KE R EF I 15509 aaactatcagaattagtagaaaaatacggtatgaaaaaagagtatat cgttagagcattattcacaaacaaagaatcaaaattc 29 K L S E L V E K V G N K K EY I V R. 
AL F T N K E S K F 15425 ggtgaacaaggtgt tat cgtcactgatgactataacgtaaact taccgaaccactt aacagaat taattaaagaaatgagagca 57 G E Q G V I V T D D V N V N L P N H L T E L I K E N R A 15341 gatgaggacgt tgt tgacat tat caat gctggagaagt tcaat tc acaat t tatgaat atgaaaac aaaaaaggt caaaaaggt 0 E 0 V V D I I N A G E V Q F T I V E V E N K K G Q K G 15257 tactcaatcaattttggtcaagtatcattttaa 15225 113 V S I N F G Q V S F 44AHJDORF012 8391 atgaacgaagtaaaat tcagatt tacagact cagaagcgtt tcacatgtt tat atacgctggggat ttaaaat tactctacttt 1 N N E V K F R F T D SE A F H N F IVY AGOD L K L LVY F 8475 ttatttgtattaatgttcgttgatattattacaggtatttcaaaagcaattaaaaataataacttatggtcaaaaaaatcaatg 29 L F V L N F V D I I T G I S K A I K N N N L W S K K S N 8559 agaggat tt ctaaaaaattattgatat tctgtatt atcatt t tagcaaacat cat tgaccagat tttacaatt aaaaggtggt 57 P. G F S K K L L I F C I I I L A N I I D Q I L Q L K G G 8643 ctactcatgattacaatattttattatattgcaaatgagggactttctattgtagaaaattgtgcagaaatggacgtattagta L L N I T I F V V I A N E G L S I V E N C A E N D V L V 8727 ccagaacaaat taaagat aaat taagagt cat taaaaatgat ac tgaaaagagtgat aacaat gaacgat caagagaagat aga 113 P E Q I K D K L P. V I K N D T E K S 0 N N E R S R. E D P.
8811 taa 8813 141 44AHJDORF013 14996 atgaaaattaaaactacttttagattaaataatttaatttattaccttttaacaaatagagattattataatgataaatttgaa 1 N K I K T T F R L N N L IV V L L T N R D V V N DOK F E 14912 aaatttacttcatctaataaaaaatgtatagtaaaaataaatatgggtgatgtgtatattgagtttgacaaacaatatgatgat 29 K F T S S N K K C IV K I N N G D V VI E F D K Q V DOD 14828 tttgaaattgaaaaagagttatttacgttagatatcgacattgatattaaaaaacatgtttttaatatacttgtattttattat 57 F E I E K E L F T L D I 0 I 0 I K K H V F N I L V F V V 14744 agaaat tatttaagtaatgaattaataagagaaattttattaaacgttacaattgacgacgtattatcaaattttgataaacct R N V L S N E L I R E I L L N V T I 0 D V L S N F D K P 14660 ct tgaaagcgaat taa tgat tat ttatcaaaacaaagt cat at acgat aat gggaaagtgat tgaccat gaat aa 14586 ii3 L, E S E. L. M 1 Q N K V; 1 7 D N G K V 1 D H E 44AHJDORF113 199 atgacagaatttgatgaaatcgtaaaaccagacgacaaagaagaaacttcagaatcaactgaagaaaatttagaatcaactgaa 1 N T E F D E I V K P0D0DK E E T S E S T E EN L E S T E-- 283 gaaacttcagaatcaactgaagaatcaactgaagaatcaactgaagaatcaactgaagataaaacagta~faaacaar.zgggaa 29 E T S E S T E E S T E E S T E E S T E D K T V E V-I E E 367 gaaaatgaaaacaaattagaacctactacaacagatgaagatagttcgaaatttgaccctgttgtattagaacaacgtattgct 57 E N E N K L E P T T T D E D S S K F 0 P V V L E Q R I A 451 t cat tagaacaacaagtgact act t t ttatc t tcacaaat gcaac aaccacaacaagt acaacaaacacaat cagat gt aac a S L E Q Q V T T F L S S Q N Q Q P Q Q V Q Q T Q S D V T 535 gaatcaaacaaagaagataacgactattcagatgaagaactagttgataagttagatttagattag 600
WO 00/32825 PCT/I B99/02040 277 113 E S N K E D N D Y S D E E L V D K L D L 0 44AHiJDORF114 16172 atggttaatgttgataatgcaccagaagaaaaaggacaagcctatactgaaatgttgcaactatt caataaactgattcaatgg 1 M V N V D NA P EE K G Q A Y T E M L Q L F N K L I Q W 16088 aatccagcttatacatttgacaatgcaattaacttattatcggcttgccaacaactattattaaactataatagttctgttgtt 29 N P A Y T F D N A I N L L S A C Q Q L L L N Y N S S V V 16004 caattctt aaatgatgaactaaacaacgaaactaaaccagaat caatattgt ctt atat tgctggtgatgacccaatagaacaa 57 Q F L N D E L N N E T K P E S I L S Y I A G D D P I E Q 15920 tggaatatgcataaaggattttatgaaacgtataacgtttacgttttttag 15870 W N M H K G F Y E T Y N V Y V F 44AHJDORF014 6243 atgaaaatggtacatt tacatgtggt t ttttaccaat att tgcacgtgtcggt agt ccaaaat tat cagaacctaatggctartt 1 M K M V N L N V V F Y Q Y L HV S V V Q N Y Q N L M A I 6327 ggttccaaccaaacggttatacaccatataacgaagtttgtttatcagatggttacgtatggattggttataactggcaaggca 29 G S N Q T V I H H I T K F V Y Q M V T Y G L V I T G K A 6411 cacgttattatttaccagtgcgccaatggaatggaaaaacaggtaatagttacagtgttggtattccttggggggtgttctcat 57 H V I I Y Q C A N G M E K Q V I V T V L V F L G G C S H 6495 aatgggtattttagcctttttctttga 6521 N G Y F S L F L 44AHJDORF015 15403 gtgacgataacaccttgttcaccgaattttgattctttgtttgtgaataatgctctaacgatatactcttttttcataccgtat 1 V T I T P C S P N F D S L F V N N A L T I Y S FF I P Y 15487 ttttctactaattctgatagtttgataaattctctttctttttcctcaaattcaaatctcgctaatgtgttttggtgtcttgat 29 F S T N S D S L I N S L S F S S N S N L A N V F W C L D 15571 aaaatatcttttacgttgtcattttatttctcctcttatttaaattatttgctttctgcaattgcgatttgtag 15645 57 K I S F T F V I L F L L L F K L F A F C N C D L 44AHJDORF016 15852 atgaaagttgacgacattgttaccttacgtgtcaaaggttatatacttcattacttagatgatgataatgaatacattgaggaa 1 M K V D D IV T L R V K G Y I L N Y L D DODN E Y I E E 15768 tttttaccacttcacgagtatcatttaaccaaaacacaagcaaaagaattattaccagacacatgtaaactattgtccactaca 29 F L P L H E Y H L T K T Q A K E L L P D T C K L L S T T 15684 cgcacaacgaaaacaattcaagtttattacaatgatttactacaaatcgcaattgcagaaagcaaataa 
15616 57 R T T K T I Q V Y Y N D L L Q I A I A E S K 44AHJDORF017 10757 atggaaagattaaaattgcttctgctggtataccgaaaaacgcctttgatacaagcgtcgattttgaaacctttgtacgtgaac 1 M E R L K L L L L V Y R K T P L I Q A S I LK PL Y V N 10673 aatcttgacggtgccattatgaaaacaataaaagtatctataatgagcaaggtacaatatcgatatatccgtctaaaactg 29 N S L T V P L L K T I K V S I M S K V Q Y R Y I R L K L 10589 aaattgtatgtggtaatgtatatgatgaatattttactgatgaacttaatatga 10536 57 K L Y V V M Y M M N I L L M N L I 44AHJDORFOIB 1098 atgttaattggtactgtgtccataatcacgtattcttcactatattgtccaataaaatcttgctctttagctaaccaattaaaa 1 M L I G T V SII T Y S S L Y C P IK S C S LA N Q L K 1014 cgattacctaatgcaatatcgattaataaagtctcattaatcttagggaataaatatttatttacaaatgtttcgaacattgta 29 R L P N A I S I N K V S L I L G N K Y L F T N V S N I V 930 tttgaattatcccatttgtcgccaaatgtccaagattttgaataa 886 57 F E L S H L S P N V 0 D F E 44AHJDORF019 9836 atgttacctggtttgtataagtattcttttttgaataaaggtacaccaattgcttttttatatttttctggtaactgtgcatat 1 M L P G L Y K Y S F L N K G T P I A FL Y F S G N CA Y 9752 gtccagttaccaccaatcacacgaccactttttCCatttggcttgactgatttaccactaattggtttatggtctccgtcatca 29 V Q L P P I T R P L F P F G L T 0 L P L I G L W S P S S 9668 tcagtaggattagaactactactcccactatctacttga 9630 57 S V G L E L L L P L S T 4 4ANJDORF12 1 16362 atggaaaatgaaacaaaaaacattgagttgaagcatgtttttcgttttaagaatggaagtttatgtatagcgttar ttgataga 1 M EN E T K N I E LK H V F R F K N G S L CI A L FODR 16278 acagaaaatgaaatttcattttatgatgttgacattgatgaaattgaagatttaaatcataattctgttttacgcgtaatttca 29 T E N E I S F Y 0 V D I D E I E D L N H N S V L R V I S 16194 actttattaggaagtgataataacggttaa 16165 57 T L L G S 0 N N G 44AHJDORF020 13865 atgtctaaacgattgttaccatgtttttgctccttgtaatagtttatgatgtcgtttacagtgttaaatttattcgtcaa 1 M S K R F C F T M F L L LV I V Y D V V Y S V K F IR Q 13949 aigrcraa-aagra-craac--acacad~gccgcczcacacatcta 29 N L H N I K S Y T S H L H H Q Y L S L V Y L I Y Q F L Y 14033 ataaagtatcgatttctttaa 14053 57 I K Y R F L 44AHJDORF123 614 
atgtatgagggaaacaacatgcgttctatgatgggtacatcatatgaagattcaagattaaataaacgaacagaettaaatgaa 1 MNY E G N N M RS NM CGT S Y E D0S R L N K R T EL N E 698 aacatgtcaattgatacaaataaaagtgaagatagttatggtgtacaaattcattcactttcaaaacaatcatttacaggtgac 29 N M S I D T N K S E D S Y G V Q I H S L S K Q S F T G D 782 gttgaggaggaataa 796 57 V E E E WO 00/32825 PCT/I B99/02040 278 44AHJDORF021 5816 atgcaccatcaaagtcaacacctgccccctcatgcttatatatccattcttttgcttgttgttgtgatttcatttatatcactc 1 N H H Q S Q H L P P H AY I S I LL L V V V I SF1I S L 5732 ctattttgatgttttgctacccaaccatattcacgatgttttgtccgcattaacattactgaagaattctttatattccga 29 L F L M F C Y P T I F T N F C F R I N I T E E F F I F R 5648 tatattagcctctaa 5634 57 Y I S L 44AH{JDORF022 8611 atgtttgctaaaatgataatacagaatatcaataattttttagaaaatcctctcattgatttttttgaccataagttattattt 1 MHF A K M I I Q N I N N FL E N P LI D F F D H K L L F 8527 ttaattgcttttgaaatacctgtaataatatcaacgaacattaatacaaataaaaagtag 8468 29 L I A F E I P V I I S T N I N T H K K 44AHJDORF023 6494 atgagaacaccccccaaggaataccaacactgtaac tat tacctgt tt t tccat tccattggcgcactggtaaataataacgtg 1 M R T P P KE Y Q H C N Y Y L F F H S I GA L V N N N V 6410 tgccttgccagttataaccaatccatacgtaaccatctgataaacaaacttcgttatatggtgtataaccgtttggttggaacc 29 C L A S Y N Q S I R N H L I N K L R Y H V Y N R L V G T 6326 aatagccattag 6315 57 N S H 44AHJDORF024 14275 gtgtcaatgtacgcctcttgtaaatctttatcatcaaatttaaaattaacattactaaaatcatttaaaaataaatctttttct 1 V SHMY A S C K S L S S N L K L T L L K S F K N K S F S 14359 tgct ct ttt ctagct tct ctt tct tt t ttccat ctat ccat t tcagacgtatgt ct aaccaatgt tatcaacct ccatataaag 29 C S F L A S L S F F H L S I S D V C L T N V I N L H I K 14443 cataaataa 14451 57 H K 44AHJDORF025 15175 atggaacgtaaatacaaaacggtatt attatattgcgatgagattaaaggacattttccacatcaaatctcaatgtttgaagat 1 M E R K Y K T V L L Y C D ElI K G H F P H Q IS M FE D 15091 ttatatgacgctaaagttgtatattcatattatgaatataacctgttcactaaaaaatacgcgtatatcatagaatacattaag 29 L Y D A K V V Y S Y Y E Y N L F T K K Y A Y I I E Y I K 15007 gagatataa 
14999 57 E I 44AHIORF026 14593 acgaataacctattaaacatagccattgttttccttttagcatttttaattacacttatcatacttatgacactgcatatacgc 1 M N N L L N I A I V F L L A F L IT L II L M T L H I R 14509 gtgtcatttggtgtt ttatt cac tacat tgat tatat tctatat tat ctt tttaatggttatttatgctt tatatggaggttga 14426 29 V S F G V L F T T L I I F Y I I F L H V I Y A L Y G G 4 4AHJDORF02 7 12916 atgattgtctatatccctaatttagtacaaaatcatattgttttgtatatggtacaacgataatatttgtcataaaagtagt 1 M I V YI P N F S TK F I L F C I W Y N D N I C H K S S 13000 t acattatacatgactttaatatat tatcat cagt t ttgaatagaagaaat caccqt tttgat tgatgtgat ttcttaa 13080 29 Y I I H D F N I F I I S F D I E E I T V L I D V I S 44AHJDlORF029 15183 gtgtttaaatggaacgtaaatacaaaacggtattattatattgcgatgagattaaaggacaitttccacatcaaatctcaatgt 1 V F K W N V N T K R Y Y Y I A M R L K D I F H I K S Q C 15099 ttgaagatttatatgacgctaaagttgtatattcatattatgaatataacctgttcactaaaaaatacgcgtatatcatag 15019 29 L K I Y H T L K L Y I H I H N I T C S L K N T R I S 44AHJDORF028 9235 atggaatatatgcacgtccaattgtacctgctttcatatttrttgcaaaatctgcattaccttttcrttgtacgtcttgtggta 1 HE Y M H V Q L Y L L S Y F L Q N L H Y L F F V R L V V 9151 caaagtggacgatgttacctgcgtcataccaagacggttgtccagcttgttttgattgtgatactaactttcttgctatga 9071 29 Q S G R C Y L R H T K T V V Q L V L I V I L T F L L 44AH4JDORF030 14487 gtgaataaaacaccaaatgacacgcgtatatgcagtgtcataagtacgataagtgtaattaaaaatgctaaaaggaaaacaatg 1 V N K T P N D T R I C S V I S H I SV I1K N A K R K T H 14571 gctatgt tt aataggttat tcatggt caatcactt tcccat tat cgtaatgact gtt ttgataaataat cat taa 14648 29 A H F N R L F H V N H F P I I V Y D F V L I N N H 44AHJDORF031 11039 atgatattgtatagttcattgttatcatctaaacggaataagttaaaatgtgaacgtaatgcaggtatgccatataatccattt I Ml L Y 5 SLL S S KR;K LRKC: 2 RN P1A.1G 11123 aaaacgactttagataacataacctcctcatttgagtatgggtgttcgttgatatcatcagtaatgtga 11191 29 K T T L D N I T S S F E Y G C S L I S S V HM 44AHJDORF135 693 atgaaaacatgt caat tga acaaataaaagtgaagat agt tatggtgtacaaatcat tcact tt caaacaat ot t-acag 1 H K T C Q L I Q I K V K I V 
M V Y K F I H F Q N N-H L Q 777 gtgacgttgaggaggaataataaattatggcacaacaatctacaaaaaatgaaactgcacttttag 842 29 V T L R R N N K L W H N N L Q K H K L H F 44AHJDORF033 3795 atgccattatttaaccacctctaccaaatttgtaaaaaacattttttatcaaattcatttaaaattttctttcttaaatcgtac 1 H P L F N H L Y Q0I C K K H F L S N S F K I1FF L K S Y WO 00/32825 PCT/I B99/02040 279 3711 gctttatcaatattatcaattaaatactgcttagtgaagtgtacctrttgcattacctttttga 3646 29 A L S I L S I K Y C L V N C V P F A L P F 4 4A}{JDORFO3 2 9455 atggcttgttttgctaaagcgagtagtgaactaccactgtcaccactactaccactgtcagacgaatcactaggtgatccacct 1 M A CF A K A S S E L P L S P L L P L S D ES LCGD P P 9371 ttaccgtctaatttaccaccccaagctagaatagtattcgcaccgtctaaaaatggattaccatag 9306 29 L P S N L P P Q A R I V F A P S K N G L P 4 4AHJDORF03 4 14146 atgatgatctaataataaaaacgctaaaaagcaaaatacgctttatataatttacaagctaaaaataataattcttcaatgt 1 M M I L I IK T L K S IN T LVYI I Y K L K I I I L Q C 14062 ataaatatattaaagaaatcgatactttatataaagaaattggtaaatcagatagaccagtga 14000 29 I N I L K K S I L Y I K K L V N Q I D Q 44AHJIDORF035 13957 atgcaacat ttgacgaat aaat t taaca c tgt aaacgac at cat aaac tat tacaaggagcaaaaa cat ggt aaaacaaaat cg 1 M Q H L T N K F N T V N D II N Y Y K E Q K H G K T K S 13873 tttagacatggtaagagattatcaaaatgctgtcaatcatgtcagaaaaaaaatcccagataa 13811 29 F R H C K R L S K C C Q S C Q K K N P R 44AHJDORF036 10165 gtgtatacaataccacacgtgatggtgcaacatatggtggtacattatagtttgcaactaaaaacgaaccatcttcaaaaactg 1 V V T IP M V M V Q H M V V H Y S L Q L K T N H L Q K L 10081 ctacaacaacacctgtgtgaccaacaccatatgcagttgcttgtaagtatggtggtttactag 10019 29 L Q Q H LC D Q V H M Q L L V S M V V Y 44AHJDORF037 14788 atgtcgatatctaacgtaaataactctt tttcaatttcaaaatcatcatattgtttgtcaaactcaatacacacatcacccata 1 M S I S N V N N S FS IS K S S Y C L S N S IY T SP I 14872 tttatttttactatacattttttattagatgaagtaaatttttcaaatttatcattataa 14931 29 F I F T I H F L L D E V N F S N L S L 44AHJDORF038 3671 gtgtacct tt tgcat taccttt ttgat t ttgattacgt t ttgcgtt ttgat tactt tcgttact cgat ttat tcacagttrttac 1 V Y L L H 
Y L F D F D Y V L R F D Y F RY SlIY S Q F Y 3587 cgttatcaatcgtattattatcagcgaatcgtaacgttgtattatcaacatcaatgttaa 3528 29 R V Q S V Y Y Q R I V T L Y Y Q H Q C 44AHJDORF039 1743 gtgctgt at t t acttatgatgtatc taaact taaagagtt tactggcaacgttgaagaaat taaaccaaaat cagatt tatatg 1 V L Y L L M M Y L N L K S L L A T L K K L NOQN Q I Y M 1827 cgtttattttggatattaattcaattaaatataaacgttacacaaaaggtatgttaa 1883 29 R L F W I L I Q L N I N V T Q K V C 44AHJDORF040 9740 gtggtaactggacatatgcacagttaccagaaaaatataaaaaagcaattggtgtacctttattcaaaaaagaatacttataca 1 V V T G H M H S Y 0 K N IK KQ L V Y L V S K K N T Y T 9824 aaccaggtaacatatttcctcaaacgggtaatgcaggacaatgtacagaattaa 9877 29 N Q V T Y F L K R V M Q D N V Q N 4 4AHJDORF04 1 15836 atgt cgtcaact ttcattatt atat cactcct ttct aaaaaacgtaaacgt tatacgttt cataaaat cct ttatgcatat tcc 1 M S S T F I IIS L L S K K R K R V T F H K I L V A Y S 15920 attgttctattgggtcatcaccagcaatataagacaatattgattctggtttag 15973 29 1 V L L C H H Q Q Y K T I L I L V 4 4AHJDORF04 2 5151 atgcacgaccgtcgtcttttgttaatttatagttttgtgaacctcttgcgcgtaatgcttcaaagtgttcatactcaccaagtt 1 MNH D R R L L LIVY S F V N L L R V M L Q S V H T H Q V 5067 ggaagaaaccatataaattatggaaacgttttccaccaccgccgtttgtcatag 5014 29 G R N H I N V C N V F H H R R L S 4 4AH4JDORF04 3 4539 atgcgacttgtaacagttttgcaacaccatcgtgatgtaaccagattttcatttcaccattggattgacgttctaatccgattg 1 M R L V T V L Q H R D V T R F S F H HNW I D V L I R L 4455 ttgtaccatgaccaccctgtacaatacgcatgcttgaaattaagtcaccactag 4402 29 L Y H D H P V Q V A C L K L S H NH 44AHiJDORF044 12917 atgttacctatttacgtgatgatatgttttataaagaaaacatggaacgttattactacaatccaagcaatttacattttgaca 1 M L P IY V M I C F I K K T W N V I TT I Q Al V I L T 12833 atgcttactctaaaaattacgtggttgataatgatagatatttatatttag 12783 29 M L T L K I T W L I M I D I V I 4 4AHJDORF1 49 770 atgattgttttgaaagtgaatgaatgtacaccataactatcttcactttatttgtatcaattgacatgttttcatttaatt 1 M I V L K V N E F V H H N Y L li F Y L, Y iL1t 686 ctgttcgtttatttaatcttgaatcttcatatgatgtacccatcatag 639 29 L F V V L I L N L H M M Y P S 
44AHJDORF046 4891 atgat tatccat t taagt tat catat caagacggtataatt tcccacgtgat acttaaagagcctgagggtatr-tgt t t 1 M I I H L S Y H I K T V L I S H V I T L K S L R V-F A F 4975 atacaaatccaaaaacaaaacgtaaatcgttattacttgctatga 5019 29 I 0 I 0 K Q N V N R V Y L L 4 4AJDOfRF04 7 11911 atgaatgtatgtaagttgttcaggtgtgagttttgcaaaacatttcacagcatagtcataggcttcactatcattcatatcatt 1 M N V C K L F R C E F C K T F H S I V I G F T I I H I I WO 00/32825 PCT/I B99/02040 280 11995 atctttatcaaaaatcgtataattaaaatctgttttaasttgtga 12039 29 I F I K N R I I K I C F K L 44AHJDORF045S 10655 atggcaccgt caaagaat tgt tcacgt acaaaggttt caaaat cgacgct tgtat caaaggcgtt t tt cggt ataccagcagaa 1 M A P S K N C S R 7 K V S K S T L V S K A F F G P A E 10739 gcaattttaatctttccattcacttcatatgcatatttcttatga 10783 29 A I L I F P F T S Y A Y F L 44AIIJDORF04 8 15340 atgaggacgttgttgacattatcaatgctggagaagttcaattcacaatttatgaatatgaaaacaaaaaaggtcaaaaaggtt 1 M R T L L T L S M L E K F N S Q F M N M K T K K V K K V 15256 actcaatcaattttggtcaagtatcattttaatacaatttcatag 15212 29 T Q S I L V K Y H F N T I S 44AHJDOR-FO4 9 5784 atgagggggcaggtgttgactttgatggtgcatatggatttcaatgtatggacttatcagttgct tatgtgtattacattactg 1 M R G Q V L T L M V H M D F N V W T Y Q L L MC I T L L 5868 acggtaaagttcgcatgtggggtaatgctaaagacgcgataa 5909 29 T V K F A C G V M L K T R 44AHJDORF050 13158 gtgtgt tacgt tt tcat tcacgtaat cgt tt cgtcgcatt ctaaaaaaatgtt t ttgtaaagt cttgatgtat tcatt ttat 1 V C Y V F H S R N R F V A F L K K C F C K V L M Y S F Y 13242 gcttttgtaataaattgiatatatttaaattggataatatag 13283 29 A F V IN CI Y L N W I I' 4 4AHJDORF0 51 11066 atgataacaatgaactatacaatatcattaacggttacaaaaactgaacgtaatatattattctctacatttgtcacatcac 1 M IT M N VT I S L T V T K T L N V I Y Y S L H L S H N 10982 gttcattgtataacttattggttcctttccaatacttaa 10944 29 V H C IT Y W FL S N T 44AHJDORF052 14338 atgatttagt aatgtt aatttaaa ttgatgataaagat ttacaagaggcgtacat tgacacatggaaacat tttgcacat c 1 M I L V M L I L N L M I K I Y K R R T L TNH GN I L H I 14254 
tgccctattttcctaaagaaagaaacgtatcatatgtaa 14216 29 C P I F L K K E T Y HNM 44AHJDORF053 3348 atgtggtttattcatcaagtgaagttgaaaaatacttacaatcacaaggcttcacagaacacaatgaagatacaacaagtaaca I M W FINH Q V K L K N T Y N H K A S Q N T M K 10 Q V T 3432 ctgatgaaacatcgaatcaaaatgctacatctttag 3467 29 L M K H R I KM L HML 44AHJDORF054 7551 atgactggaatggaaatacgatgttactcgacgctggtaagatttcacaaaaaactggtgttaagttacgtacaaaatcaatta 1 M T G M E I R C Y ST L V R FMH K K L V L S Y V QN Q L 7635 ttggttatcataatgaagttcgagtatatccagtag 7670 29 L V II M K F EY IQ 44AHJDORF055 15705 atgtgtctggtaataattcttttgcttgtgttttggttaaatgatactcgtgaagtggtaaaaattcctcaatgtattcattat 1 M CL V II L L L V F W L N D T R EV V K I P Q C I M Y 15789 catcatctaagtaatgaagtatataacctttga 15821 29 H H L S N E V Y N L 44AHJDORF056 5512 gtgagtattacattacaggtaaccaaatggaattatttagagacgcgccagaagaaattaaaaaagtgggtgcatggttacgtg 1 V SI T L Q V T K W N Y L ET R Q K K L K K W V M G Y V 5596 tgtcaagtggtaacgcagtcggtgaagtaa 5625 29 C Q V V T 0 S V K 44AHJDORF057 10121 atgtaccaccatatgttgcaccatcacgtgtggtattgtatacactcattaatggcgtaccaaataatgctggtgataatattg 1 M Y H N M L MMMHV W Y C INHS L MA Y Q INM L V II L 10205 tattctttagtggtattgcttaattaa 10231 29 Y S L V V L L N 4 4AHJDORFOS58 10767 atgcatatttcttatgattcagtacaaacatcttatctatctgttcgttttcaatatcccatttacctaaggctatcgggtcga 1 MNH I S Y D S V Q TIS Y L S V R F Q Y P I Y L R L S CR 10851 ataaactggggttcaataagggtttaa 10877 29 I N W G S I R V 44AHJDORF164 702 atgttttcatttaattctgttcgtttatttaatcttgaatcttcatatgatgtacccatcatagaacgcatgttgtttccctca 1 M F S F N S V R L F N L E S S Y D V P I I E R M L F P S 618 tacatgttaaatcctcctaatctaa 592 29 Y M F K F L L I 44AHJDORF059 8360 atggatttgt aacattggatactgaaccgt catatgccaaaatct tacaccagatt ctaaaatgcttt taat~ttcca 1 M D F V T L D Y L N RNH Y AK I L HQ IL K L L i I V P 8276 ttaacatggggtcgatgtcacgtatag 8250 29 L T W G R C H V 44AHJDORF060 6257 atgtaccattttcatttctataatatgtgccgtattggtttcgtttccattttccaaatgtatttacttttgatgtttctaatg 1 M Y N F H F Y N M C R IG F V S I F Q M Y L L L M 
F L H WO 00/32825 PCT/I B99/02040 281 6173 ctttgctattactacctgaaaatttag 6147 29 L C Y Y Y L K I 44AHJDORF061 15551 atgtgttttggtgtctgataaatatcttttacgtttgtcattttatttctcctcttatttaaattatttgctttctgcaatt 1 M C F G V L I K Y L L R L S F Y F S S Y L N Y L L S AlI 15635 gcgatttgtagtaaatcattgtaa 15658 29 A I C S K S L 44AHJDORF062 4285 gtggtattcgcaacgcagitaaccaatctattaatattgataaagaaacaaatcacatgtactctacacaatccgattctcaaa 1 V V FA T Q L T N L L I L I K K Q I T C T L H N P I L K 4369 aacctgaaggtttttggataa 4389 29 N L K V F G 44AHJDORF063 9487 atgcgtcttgtattttttttaataattcttgcatggcttgttttgctaaagcgagtagtgaactaccactgtcaccactactac 1 M R. L V F F L II L AW L V L L K R V V N Y H C H H Y Y 9403 cactgtcagacgaatcactag 9383 29 H C Q T N H 44AH{JDORF065 5029 gtggtggaaaacgtttccataatttatatggtttcttccaacttggtgagtatgaacactttgaagcattacgcgcaagaggtt 1 V V E N V SI I Y M V S S N L V S M N T L K HNY A Q0EV 5113 cacaaaactataaattaa 5130 29 H K T I N 44AHJDORF064 2609 atgacgagt caatcaat caacttgtgtccgaaat at ataacggtgcaccatt tgt taaaatgt cacctatgt ttaatgcagatg 1 M T S Q SI N L C P K Y I T V NH l L L K C H L C L MQ M 2693 acgatatcattgatttaa 2710 29 T I S L I 44AHJDORF066 10481 atgatattctttatattgaaagtgacatcggttcattttcacttaacgacttatttccagttgaacgttcagtacataacaaat 1 M I FF IL K V TF S V HF H L T T Y F Q L N V Q Y I T N 10397 ctgatttgcatatattaa 10380 29 L I C I Y* WO 00/32825 PCT/I B99/02040 282 Table 19 Sequence similarities between ORFs 44AH-JD and public databases Phage: 44AHJD Database: nr Query= sidjll.87ljlanI44AHJDORFO0l Phage 44AHJD ORF110342-126271-1 (761 letters) giIll88481spIP19894IDPOLBPM2 DNA POLYMERASE >gil768961pirljJQ0 55 le-06 gi 11072656 1pir I I S51275 DNA polymerase phage CP-l >gi1836593 1e. 53 6e-06 gill429230lemb1CAA676491 (X99260) DNA polymerase (Bacteriophage 49 le-04 gill572479IembICAA657l21 (X96987) DNA polymerase (Bacteriophage 46 0.001 giIIlS8511spIP69SOIDPOL.BPPZA DNA POLYNERASE (EARLY PROTEIN GP. 
45 0.002
gi|2435429 (AF012250) unassigned reading frame (possible DNA po  45  0.002
gi|1084487|pir||S41618 DNA polymerase slime mold (Physarum po  45  0.002
gi|4877819|gb|AAD31446.1| (AF133505) DNA polymerase [Neurospora  44  0.004
gi|461962|sp|P3353?|DPOM_NEUCR PROBABLE DNA POLYMERASE >gi|2833.  44  0.004
gi|2499511|sp|Q12471|6P22_YEAST 6-PHOSPHOFRUCTO-2-KINASE 2 (PHO.  41  0.041
gi|2258375|gb|AAD11909.1| (AF007261) transcription initiation f.  40  0.070
gi|15734|emb|CAA37450| (X53370) DNA polymerase (AA 1-575) [Bact.  39  0.092

Query= sid|110872|lan|44AHJDORF002 Phage 44AHJD ORF|3789-5732|3 (647 letters)
gi|135273|sp|P27622|TAGC_BACSU TEICHOIC ACID BIOSYNTHESIS PROTE.  112  7e-24
gi|142847 (M64050) DNase inhibitor [Bacillus subtilis]  52
gi|4038407 (AF103943) factor C protein precursor [Streptomyces  39  0.10

Query= sid|110873|lan|44AHJDORF003 Phage 44AHJD ORF|6626-8389|2 (587 letters)
gi|138123|sp|P04331|VG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9)  92  8e-18
gi|138124|sp|P07534|VG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9)  82  1e-14
gi|1429238|emb|CAA67657| (X99260) tail protein [Bacteriophage B  78  2e-13
gi|215339 (M12456) p9 tail protein [Bacteriophage phi-29] >gi|2  71  2e-11
gi|1181968|emb|CAA87738.1| (Z47794) tail protein [Bacteriophage  54  3e-06
gi|1181970|emb|CAA87740.1| (Z47794) tail protein [Bacteriophage  42  0.010

Query= sid|110875|lan|44AHJDORF005 Phage 44AHJD ORF|12643-13890|-1 (415 letters)
gi|3845203 (AE001399) GAF domain protein (cyclic nt signal  52  6e-06
gi|3758843|emb|CAB11128.1| (Z98551) predicted using hexExon;  49
gi|3845297 (AE001421) hypothetical protein [Plasmodium falciparum]  48  1e-04
gi|4493936|emb|CAB38972.1| (AL034556) predicted using hexExon;  47  2e-04
gi|3845165 (AE001390) hypothetical protein [Plasmodium falciparum]  46  6e-04

Query= sid|110877|lan|44AHJDORF007 Phage 44AHJD ORF|2044-3027|1 (327 letters)
gi|1181960|emb|CAA87731.1| (Z47794) connector protein [Bacterio.  46  5e-04
gi|1429239|emb|CAA67658| (X99260) upper collar protein [Bacteri.  45  8e-04
gi|137915|sp|P07535|VG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR  44  0.002
gi|137914|sp|P04332|VG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR  41  0.009

Query= sid|110878|lan|44AHJDORF008 Phage 44AHJD ORF|3020-3775|2 (251 letters)
gijArligbAADl~.l.' 3O*.,/Ajr' 1-.A ).inase S- P0
gi|1730077|sp|P18160|KYK1_DICDI NON-RECEPTOR TYROSINE KINASE SP  46  2e-04
gi|3758855|emb|CAB11140.1| (Z98551) predicted using hexExon; MA  46  2e-04
gi|585795|sp|P21538|REB1_YEAST DNA-BINDING PROTEIN REB1 (QBP)  46  3e-04
gi|172372 (M58728) DNA-binding protein [Saccharomyces cerevisiae]  46  3e-04
gi|2952545 (AF051898) coronin binding protein [Dictyostelium di  45  6e-04
gi|535260|emb|CAA82996| (Z30339) STARP antigen [Plasmodium reic.  45  7e-04
gi|1429240|emb|CAA67659| (X99260) lower collar protein [Bacteri.  44  0.001

Query= sid|110879|lan|44AHJDORF009 Phage 44AHJD ORF|5744-6496|2 (250 letters)
gi1276498lembICAA69021.lI (Y07739) N-acetylmuramoyl-L-alanine  180  1e-44
gi|113675|sp|P24556|ALYS_STAAU AUTOLYSIN (N-ACETYLMURAMOYL-L-AL  118  6e-26
gi|1763243 (U772397) amidase [bacteriophage 80 alpha]  118  6e-26
gi|4574237|gb|AAD23962.1|AF106851_1 (AF106851) LytN [Staphyloco.  84  9e-16
gi|3767593|dbj|BAA33856.1| (AB015195) LytN [Staphylococcus aureus]  84  9e-16
gi|2764983|emb|CAA69022.1| (Y07740) cell wall hydrolase Ply187  77  2e-13
gi|3287732|sp|O05156|ALE1_STACP GLYCYL-GLYCINE ENDOPEPTIDASE AL  73  2e-12
gi 179926I1pir IIA25881 lysostaphin precursor Staphylococcus aiM  69  3e-11
gi|126496|sp|P10548|LSTP_STAST LYSOSTAPHIN PRECURSOR (GLYCYL-GL  69  3e-11
gi|3287967|sp|P10547|LSTP_STASI LYSOSTAPHIN PRECURSOR (GLYCYL-G.  69  3e-11
gi|3341932|dbj|BAA31898.1| (AB009866) amidase (peptidoglycan hy  68  6e-11

Query= sid|110882|lan|44AHJDORF012 Phage 44AHJD ORF|8391-8813|3 (140 letters)
gi|140528|sp|P24811|YOXH_BACSU HYPOTHETICAL 15.7 KD PROTEIN IN  80  6e-15
giJ412663ldbj18AA3665l.11 (AB016282) ORF45 [bacteriophage  76  1e-13
gi|141088|sp|P26835|YNGD_CLOPE HYPOTHETICAL 14.9 KD PROTEIN IN  61  4e-09
gi|2293160 (AF008220) YtkC [Bacillus subtilis] >gi|2635548|emb|  36  0.099
gi|1181973|emb|CAA87743.1| (Z47794) holin protein [Bacteriophag.
31 3.3 WO 00/32825 PCT/I 899/02040 284 Table Honiolgies between phage 44 AHJD ORFs and proteins in public databases Query- ptI11O87l 44AHJDORFOOl Phage 44AHJD ORF 110342-126271-i 1 (761 letters) >gi 111884 8 1sp IP19894 DP0LBPM2 DNA POLYMERASE >giI176896S1pirI IJQ0161 DNA-directed DNA polymerase (EC 2.7.7.7) phage M2 >giI2lSSO9 (M33144) DNA polyrnerase [Bacteriophage M2] Length 572 Score 55.4 bits (131), Expect ic-OS Identities 96/426 Positives 159/426 Gaps 88/426 Query: 229 KLTPEQLTYIHNDVI ILCMCHIHYSDI FPNFDYNKLTFSLNIMESYLNNEMTR--FQ 283 *.TPE+ YI ND+ I+ DI T+ F Sbjct: 154 EITPEEYEYIKNDIEIIARA LDIQFKQGLDRTAGSDSLKGFKDILSTKKFNKVFP 209 Query: 284 LLNQYQDIKISYTNYHFHDMNFYDYIKSFYRGCLNMYNTKYINKLIDEPCFSIDINSSYP 343 L. D0+1 YRGG N KY KIIE D+NS YP Sbj Ct: 210 RKAYRGCFTWLNDKYKEKEIGEGMV- FDVNSLYP 252 Query: 344 YVNYHEKIPTWLYFYEHYSEPTLIPTFLDDDNYFSLYKIDKDVFNDDLLIKIKSRVLRQM 403 MY +2 Y 2 D LY I+ F +L K Sbjct: 253 SQMYSRPLP--------- YGAPIVFQCKYEKDEQYPLY-IQRIRFEFEL KEGYIPTI 299 Query: 404 XXYYYYYYYYYYYYYYLRMI-DITGIDCMHIRVNSFVIYECEYFPHkJWIIFQNYFIK 462 +T +D I+ EY F Sbjct: 300 QIKICNPFFKGNEYLKNSGVEPVELYLTNVDLELIQEH-YELYNVEYIDGFK FRE 352 Query: 463 TQGKLKNKINMTSPYDYNITDDINEHPYSNEEVMLSKVVLNGLYG------------ IPAL 511 C K+ I+ N L+K++LN LYG +P L Sbjct: 353 KTGLFIWFIDKWI'YVKTH EECAKKQLAKLMLNSLYCKFASNPDVTCKVPYL 403 Query: 512 RSMFNL-FRLDDNNELYNIINCYflflERNILFSTFVFSRSLYNLLVPFQYLTESEIDDNF 570 +L FR+ D YK+ F+T+ Q D Sbjct: 404 KDDGSLCFRVCDEE YKDPV'flPM-GVFITAWARFTTITAAQACY--DRI 449 Query: 571 IYCDTDSLYNKSVVKPLLNPSLFDPIALGKWDIENEQIDKMFVLNHKK--YAYEVNC 625 IYCDTDS+++ 2 +0DP LOW E+ L K Y EV+C Sbj ct: 450 IYCDTDSINLTCTEVPEII1101VDPKKLGYWA14ES -TFKRAKYLRQKTYIQDIYVKEVDG 508 Query: 626 KIKIAS 631 K+K S Sbjct: 509 KLKECS 514 >sillO72SSS1pirI 551275 DNA polymerase phage CP-1 >siL83SS93IesbICAAB772S.1I (Z47794) DNA polymerase [Bacteriophage CP-11 Length 568 Score 53.5 bits (126). 
Expect 6e-O6 Identities 104/464 Positives 169/464 Caps 66/464 (14%) Query: 230 LTPEQLTYIHNOVIIL--GMCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTRFQLLNQ 287 PE +YIIIDV IL G+ +F Y +L F Sbjct: 152 IKPEWIDYINVDVAILARCIFAMYYEENFTK- -YTSASEALTEFKRIFRKSKRKFRDFFP 209 Query: 288 YQDI KINSXTklnnm IDI~ D K+ D+ C K+ DINS YP M Sbjct: 210 ILDEKVD DFCRKHIVGACRLPTLKHRCRTLNQLIDIYDINSMYPATML 257 Query: 348 HEKIPTWLYFYEHYSEPTLI PTFLD)DDNYFSLY-KIDKDVFNDDL- LIKIKSRVLRQMXX 405- +2 Y P K DD+ L I+IK Sbjct: 258 QNALPICIP--KRYKGK PKEIKEDHYYIYIKADFLKRGYLPTIQIKCKLDALRIC 312 Query: 406 XYYYYYYYYYYYYYYYLRMIQDITCIDCWIIRVNSFVIYECEYFHARDIIFQNYFIKTQG 465 L+ N E F +F +Y Sbjct: 313 VRTSDYVTTSIO4EVIDLYLTNFDLDLFLKHYDATIMYVETLE-FQTESDLFDDYI--366 WO 00/32825 PCT/I B99/02040 285 Query: 466 KLINKINMTSPYD)YHITDDINEHPYSNEEVNLSKVVLGLYGIPALR SHFNLFRLDDN 523 YY E+ S E +K++LN LYG S L LDD Sbjct: 367-----------rrYRYK--KENAQSPAEKQKAKIMLNSLYGKFGAXIISVUCLAYLDDK 412 Query: 524 NELYNIINGYKNTERNIL FSTFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDS 577 L +KN FVTS Q E DNF+Y DTDS Sbjct: 413 GXLR FIOIDEEEVQPVYAPVALFVrSIARHFIISNAQ--ENYDNFLYADTDS 462 Query: 578 LYMKSVVKPLLNPSLFDPIALGKWDIENEQIDKMFLNKKnAYEVNGKIKIASAGIPKN 637 L+ t+L DP 0KW E K L K Y E+ +K Sbjct: 463 LHLFHSDSLVLD IDPSEFGKWAHEGAVKAYLRSKLYIELQEDGlH.EV-KG 517 Query: 638 AFDTSVDFETFVREQFFDGAI IENNKSIYNEQGTISIYPSKTEI 681 A T E ESF GA E +4 +G IYt+ +I Sbj ct: 518 AGMTPEIKEKITFENFVIGATFEGKjjSKIKGGTLIYTrFKI 561 >giI1429230e1b~ICAA67649I (X99260) DNA polymerase [Bacteriophage B103] Length 572 Score 49.2 bits (115). 
Expect le-04 identities 93/422 Positives =155/422 Gaps 88/422 Query: 229 KLTPEQLTYIHNDVIILGMCHIHYSDIFPNFDYNKLTFSINIMESYINNETR FQ 283 ++TPE+ YI ND+ I+ DI T T+ F Sbjct: 154 EITPEEYEYInUDIEIIARA LDIQFKQGLDRJITAGSDSLKGFKDILSTKKFNKVFP 209 Query: 284 LLNQYQDIKISYTHYMFIIDMNFYDYIKSFYRGGLNNYNTKYINCLIDEPCFSIDINSSYP 343 L+ D +I YhOG N KY K I5 D+NS YP Sbjct: 210 KLSLPMDKEI-----------------RPRAYRGGFTWLNDKYKEKSIGEGMV-FDVNSLYP 252 Query: 344 YVMHKIPTWLYFYEHYSEPTLIPTFLDDDNYFSLYKIDKFDLLIKIKSRVRQM 403 M4Y +P Y P D LY I+ F +L K Sbjct: 253 SQMYSRPLP--------- YGAPIVFQGKEKDEQYPLYIQRIRFFEL KEGYIPTI 299 Query: 404 XXXYYYXXflflYXflYYLRl4IQ. DITGIDCMHIRVNSFVIYE-CEYFHARDI IFQNYFIK 462 +D I+ .Y EY F Sbjct: 300 QI53QqPFFKGNEYLKNSGAEPVELYLTNVLELIQEHYEM'mVYIDGFK FRE 352 Query: 463 TQGKLKNKINMTSPYDYHITDDINEHPYSNEEVMLSKVVLNGLYG------------ IPAL 511 G K I+ H LK++ +LYG +P L Sbjct: 353 KTGLFKEPIDCWTYVKTH EKGAKKQLAKLMFDSLYGKFASNPDVTGKVPYL 403 Query: 512 RSHFNL- FRLDDNNSLYNI INGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNF 570 +L FR+ D YK+ F+T+ Q D Sbjct: 404 KEDGSLGFRVGDEE YKDPVYTPM-GVFITAWARFTTITAAQACY--DRI 449 Query: 571 IYCDTDSLYNKSVVKPLLNPSLFDPIALGKWDI ENEQIDKNFVLNKX YAYEVNG 625 IYCDTDS.+ p DP LO W E+ L K YASZV4G Sbjct: 450 IYCDTDSIHLTGTEVPEIIKDIVDPKKGYWAHESTFKRAYLRQKrYIQDIYmV 508 Query: 626 SI 627
K+
Sbjct: 509 St 510 >gi11572479embCAA657121 (X96987) DNA polymnerase (Bacteriophage GA-11 Length 578 Score 46.1 bits (107), Expect 0.001 loentities 60/376 P~ie 146/376 Gaps 54/376 (14%) Query: 234 QLTYIHNDVI ILGMCHIHYSDIFPNFDYKLTFSLNIMESYNNEMTRFQLLNQYQDIKI 293 +Ye +D+eIe- +F N D+ 4 .Y EM +Y Sbjct: 162 SISYLSHDLLIVALA- .LRSMFDN-DFTSMTVGSDALNTY--SEMLGVKQWEKYFPVL- 214-- Query: 294 SYTHYNFHDMNFYDYIKSFYRGGLNMYNTKYINKLIDEPCFSIDINSSYPVMHEKI PT 353 I+ Y+GG NSKY D+NS YP+M P Sbjct: 215----------SLKVNSEIRKAYSGGFTWVNPKYQGETVYGG4XhFDV4SMYPAMMKNlKLLP- 264 Query: 354 WLYFYSHYSEPTLI PTFLDDDNYFSLYSIDIWVRNDDLLIKISSRVLRQMXL XXXXX 413 Y EP LY F KI 4+ WO 00/32825 PCT/I B99/02040 286 Sbjct:- YGEPVMFKGEYKNEYPLYIQQVRCFFELKKD~KIPCIQIKGNARFGQNEYLS 317 Query: 414 XXXXXLMQITICHIRVNSFVIYECEY RADIIFQN'FIKTQGLNINM 4-73 L +T +D I. ItE E+
I
Sbjct: 318 TSGDEYDLY--vTNVDWELI'n-IFEEEFIGG-FMFKGF------------ ICE 359 Query: 474 TSPYDYHITDDINEHPYSNEEVLSKWVLGLYGI PALRSNFN- -LFRLDDNNELYNI IN 531 Y N S E .ttK++LN LYG A LD.N L Sbj ct: 360 FDEYIDRFMEIKNSPDSSAEQSLQ KLLNSLYGKFATNPDITGKVPYLDENGVLKFRCG 419 Query: 532 GY NTERNILFST FVTSRSLYNLLVPFQYLTESEIDDNrIYCDTDSLYhKSVVKPLL 588 K ER+ F+T+ N+L Q L FlY DTDS Sbjct: 420 ELK- -ERfPVYTPMGCFITAYARENILSNAQKLYP--RFIYADTDSIHVEGLGEVDA 472 Query: 589 NPSLFDPIALGKWDIE 604 DP L0WD E Sbjct: 473 IKDVIDPKKLGYWDHE 488 >giI118851IspIPO695OIDPOL BPPZA DNA POLYMERASE (EARLY PROTEIN GP2) >giI7S8l2IPirI IERBP2Z DNA-directed DNA polyinerase (EC 2.7.7.7) pbage PZk ,giJ216O51 (M11813) gene 2 product (Bacteriophage PEA] ,gi12247411Prf111112171E ORE 2 (Bacteriophage PZA] Length 572 Score 45.3 bits (105) Expect 0.002 identities 98/461 Positives 166/461 Gaps 110/461 (23%) Query: 198 QLKTDFNYnIFDKDNDMNI3SEAYDYAVKCFALTPEQLfIHNDVI ILG0MCHIHYSDI EP 257 ++DF T+ D D+ Y YIIND+ I+ I Sbjct: 129 KIAXDFKLTVLKGDIDY4KERPVGY EITPDEYAYIKNDIQIIAEALL IQF 178 Query: 258 NFDYNKLTFSLNIMESYLNNEMTR--- FLLNQYDIKISYTHYHFHDMNFYDYIKSF 312 T T+ F L+ D Y Sbjct: 179 KQGLDRTAGSDDLKGFKDIITTKKFKVFPTLSLODKVYA----------------- 222 Query: 313 YRGGLNMNTKYINKLIDEPCFSIDINSSYPYVMYHEKIPTWLYFYERYSEPTLI PT- -F 370 YRGG N K IE D+NS YP MY +P YEP Sbjct: 223 YRGGFTWLNDRFKEKEIGEGKV-FDVNSLYPAQ?4YSRLLP YGEPIVFEGKYV 273 Query: 371 LDDDNYFSLYKID--KDVFNDDLLIKIKSRVLRQXJUZJ'UiUJUL:XJJDLRMI 425 D+oD I Ke.+ IK +SR Sbjct: 274 WDEDYP{HIQHIRCEFELKEGYI PTIQIK-RSRFYKGNEYLKSSGGEIALW---------- 324 Query: 426 QDITGIDCMHIRVNSFVIYECEYFHADI IFQNYFIKTQGKLKNKINMTSPYDYHITDDI 485 D+ Y ZY F TOG Kt I+ I Sbjct: 325 VSNVD-LELMKEHYDLYNVEYISGLK FKATTGLFKDFIDKWTHIK'flSEGAI 375 Query: 486 NEHPYSNEEVNLSKVVLNGLYG------------ IPALRSHFNLFRLDDNNELYNIINGY 533 L+KroLN LYG +oP L+ L FRL G Sbjct: 376 LAKLMLNSLYGKFASNPDVIGKVPYLKENGALGFRL------------ GE 415 Query: 534 KTERNIL-FSFSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYMSWVKLANPS 591 F+T+ +Y Q D IYCDTDS.. 
P Sbjct: 416 EETKDPVYTPMGVFITAWARYTTITAAQACF--DRIIYCDTDSIHLTGTEIPDVIKD 470 Query: 592 LFDPIALGKWDIENEQIDKMFVLNMKKYAY--EVNGKI 627 DP LO W E+ L K Y EV+GK+ Sbjct: 471 IVDPKKLGYWAHES-TFKRAKYLRQKTYIQDI'1KEDKL 510 >gil2435429 (AF012250) unassigned reading frame (possible DNA polymerase) (Physarun polycephalumi Length 54- Score 44.9 bits (104), Expect 0.002 Identities 118/545 Positives 206/545 Gaps 104/545 (19%) Query: 179 TSIATLGKKLLDGGYLTESQLKTDFNYTIFDKDNDNSEYDYAVKCFAKLTPEQLTYI 238 T L K LD T Q F N M Y CF L I Sbjct: 62 TQLFNLLKSLQDSSFYTFKQ FTYQNIM--YSLEISCN--LYPKKKILI 105 Query: 239 HNDVIIL4GMCHIHYSDIFPNFD--YNKL- -TFSLNIMESY-LNNEMTRFQLLNQYQD 290 D+ +I Y+D. YNo.. r*+NI Y L+ Sbjct: 106 -KfLYNFFSENIIYNDVVDYKLLAILYNEIQTAYNINIRKYILSTASLSLRIFKKSFP 164 WO 00/32825 PCT/I B99/02040 287 Query: 291 IKISYTY4FHDMNFYDYIKsFYRGGLNMYNTKYINKLIDEPCFSIDINSSYPYVMYHEK 350 K D0+ +YI+ Y GG N I D.NS YPY+H EK Sbjct: 165 EKYRLIPNLTRDED--NYIRKSYIGGRNE--IFENVAQRNYFYDVNSLYPYIHKKEK 217 Query: 351 IPTWLYFYEHYSEPTLIPTFLDD-DNYFS- LYKIDKDVFNDDLL---IKIKSRVLRQ 402 +P Y F L I+K N IK.-V Sbjct: 218 MPIGI PEYRDKEYIKFEKNISNFFGFIDVLITIEK TNNNIPVIJPYRMGIIK 'V-EV 273 Query: 403 MXYYYYYYXXXYYYXYLRIIQDITGIDCMHIRVNSFVIYECEYFHARDI IFQNYFIK 462 L. 
Q I+ IY F+ V Sbjct: 274 GIIVAKGTLRGIYFSEEIKLALKQGVKIIE----------- IYSAVEVKEKEVVFEEYVEQ 323 Query: 463 TQGK- LKNKINMTSPYDVMITDDINEHPVSNEEVMLSKVVLNGLVG I PALRS 513 LKK D +D L K LN LG I Sbjct: 324 MYNRRLKAK LVKLLNTLGRFGLVYEQIDIISP 363 Query: 514 I4FNLFRLD0&NELYNI INGVKNTERNILFSTFVI'SRSLYNLLVPFQVLTESEIODNFIYC 573 L +014 N F YT I Sbjct: 364 EKEL--ITDNTVISHDTTEFI0ITANTCYNNIAITSAITSYARIFMYNII.YNLHVIVI 421 Query: 574 OTDSLVMKSVVKPLLNPSLFDP IALG0KWDI ENEQIDKMFVLNHKKVAV -EVNGKI KIASA 632 DTD P+ L +GK+ F+N KVY 1 N I Sbjct: 422 DTDGLFLKN--- I PDIALTTSKEMGKFRLESINAEAHFIAN- KFlIVAPINS PI IYKFK 477 Query: 633 GIPK--NAFDTSVDFETFR---EFFDGAIIENKSINEQT ISIYPSK 678 GIP N 0 +F +1 NN V+ Q+ I Vt+ Sbj ct: 478 GIPLQKPIFNIRDI ITQMKKILZITLGNHYFTFSIRLNNNQTYSFQASRKRKLIPNYKfl 537 Query: 679 TEIVC 683 I +C Sbjct: 538 PWIIC 542 ,9i110844871pir1 1S41618 DNA polymerase slime mold (Physarum polycephalun) >gij50972ljdbjIBAA06l2l.1j (029637) DNA polymerase (Physarum polycephalum] Length 547 Score a 44.9 bits (104). Expect 0.002 identities 118/545 Positives 206/545 Gaps a 104/545 (19%) Query: 179 TSIATLGKXLLDGGVLTESQLKTDFNYTI FDKDNDMNDSEAYDVAVCFAKLTPEQLTVI 238 T L KL0D T Q F N M V CF L P+4 I Sbjct: 65 TQLFNLLKSLQDSSFYTFKQ FTYQNIM SLEISCF--LYPKKKILI 108 Query: 239 HDVIILGMCHIHYSDIFPNFD- V-NKL--TFSLNIMESV-LNNEMTRFQLLNQVQD 290 D+ +1 V.D+ VN.. 
V L+ Sbj Ct: 109 KDLVNFFSENI IYNDVVKDVKLLAILVNEIQTIANININRKVILSTASLSLRI FKKSFP 167 Query: 291 IKISYTHV}IFHDMNFYDVIKSFYRGGLNMYNTKINKLIOEPCFSIDINSSPYVMYHEK 350 K D0+ YVGGN I1+44 D+NS YPV+M 2K Sbjct: 168 EKVRLIPHLTRDED--NVIRKSVIGGRNE--IFEHVAQRNYFVDVNSLVPVIMKKEK 220 Query: 351 IPTWLYFYEHVSEPTLIPTFLDD-DNVFS- -LVKI0IVFNDDLL- IKIKSRVLRQ 402 +P V F +N.F L I+K N IK+ V Sbjct: 221 MPIGI PEYRflKEVMKKFENIENFFGFIDVLITIEKTNNIPVLPNRGIKNNV-EV 276 Query: 403 MVYYYYYYYYYYYYYYYYLRIDITGIDCHIRVNSFVIVECEYFHARDI I FNYFIK 462 L+ Q I+ IV V Sbjct: 277 GIIVAKGTLRGIVFSEEIKLALKQGVKIIE----------- ISAYEVKEKEVVFEEVVEQ 326 Query: 463 TQGK-LKNKINMTSPVDYMIT0DIJEHPSNEEVMLSKVVLNGLVG IPALRS 513 LKK 0 +0D L K+LNLVG I Sbjct: 327 MVNRRLKAK LYKKLLNTLGRFGLVEQIIISP 366 Query: 514 HFNLFRLDNNELYNIINGYKNTERNILFSTFVTSRSLVNLLVPFQVLTESEID0NFIVC 573 L ON 4 N 44 +4 F V T IV Y Sbj ct: 367 EKEL- -ITDNTVISHDTTEFIDITANTCVNNIAITSAITSVARIFMYNTILNYNLHVIYI 424 Query: 574 DTDSLYNKSVVKPLLNPSLFDPIALGKWOIENEQIDKMFVLNKKYAY-EVNGKIKIASA 632 0T0 .L .GK+ Fi.N KVV +N I Sbjct: 425 0TDGLFLKN- PIPDIALTTSKEMGKFRLESINAEAHFIAN-KFYIVAPINSPIIVKPK 480 Query: 633 GIPK--NAFTSVDFETFVR- EQFFDGAIIENNKSIVNEQGT--ISIVPSK 678 WO 00/32825 PCT/I B99/02040 288 C19 N D .F +I NN Y+ Q+ I Y.+ Sbjct: 481 GIPLQKPIFNIHDIITQHKXILNITL4GNHYFTFSIRLNNNQTYSFQASRKRKLIPNYTf 540 Query: 679 TEIVC 683 I+e Sbjct: 541 PWIIC 545 ,giJ48778191gbIAAD31446.11 (AF133505) DNA polymerase (Neurospora crassal Length 1035 Score 44.1 bits (102), Expect 0.004 Identities 36/172 Positives 82/172 Gaps 14/172 Query: 521 DDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYh 580 N EL K+ I +4 +4 S Y DTDS.++ Sbjct: 817 EKNYELLSYLDGEKDDGFIINSTSIAAATASWSRILMYKHIINSA YTDTDSIFV 870 Query: 581 KSVVKPLLNPSLFDPIAI.GKWDIENEQIDKNFVLNNKKYAYEVNGKIKIASAGIPKNAFD 640 KPL,+ .eK Y GK++I GI KN Sbjct: 871 E KPLDSAFIGEGCGKFKAEYNCQLIKRAIFISGKLYLLDFGGKLEIKCKGIT01KDN 927 Query: 641 TSVDFETFVREQFFDG -AIIENNKSIYNEQGTISIYPSKTEIVCGNVYDE 689 T1. 
E +i.G E K G YD+ Sbj ct: 928 TTHNLDINDFEALYNGESRVLFQERWGRSLELGTVTVKYQKYNLISG- -YDK 977 >gi 14619621 spl P335371DPN_-NEUCR PROBABLE DNA POLYMERASE >gil2B33SllpirlI1S26985 probable DNA-directed DNA polymerase (EC 2.7.7.7) Neurospora crassa mitochondrion plasnid naranbar (SGC3) >gi15781561embiCAA39046l (X55361) putative DNA polynerase [Neurospora crassa] Length 1021 Score 44.1 bits (102), Expect 0.004 Identities 36/172 Positives 82/172 Gaps -14/172 Query: 521 DDNNELYNIINGYKNTERNILFSTFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYM 580 N EL +4G K+ I 44 44 S Y DTDS..+ Sbjct: 815 EKNYELLSYLDGEKDDGFIINSTSIAAATASWSRILMYKMIINSA YTDTDSIFV 868 Query: 581 KSVVKPLLNPSLFVPIASCKWDI ENEQID3O4FVLNHKKYAYEVNGKI KIASAGI PIOAFD 640 KPL K I +K Y GK++I GIKN +t Sbj ct: 869 E- KPLDSAFIGEGCGKFKAEYNGQLIKRAIFISGKLYLLDFGGKJEIKCKGITKNKDN 925 Query: 641 TSVDFETFVREQFFDG -AlIENNKSIYNEQGTISIYPSKTEIVCGNVYDE 689 4. E E K G YD+ Sbjct: 926 TTHNLDINDFEALYNGESRVLFQERWGRSLELGTVTVKYQKYNLISG- -YDK 975 >.giI2499511IspIQ12471I6P22 -YEAST 6-PHOSPHOFRUCTO-2-KINASE 2 (PEOSPHOFRUCTOKINASE 2 11) (6PF-2-K 2) >gi121311621pir1 1S61066 6-phosphofructo-2-kinase (EC 2.7.1.105) yeast (Saccbaromyces cerevisiae) >gil2l3llE3IpirliS171026 6-phosphofrlucto-2-kinase (EC 2.7.1.105) yeast (Saccharomyces cerevisiae) >gill085116jembICAA623711 (X90861) 6-phosphofructo-2-kinase (Saccharoniyces cerevisiaei >gill420028IembfCAA99lS7I (Z74878) ORF YOL136c [Saccharomyces cerevisiaei >gi116284391emb10AA647331 (95465) 6-phosphofructo-2-kinase [Saccharomyces cerevisiae] Length 397 Score =40.6 bits Expect 0.041 identities 48/208 Positives 92/208 Gaps 29/208 (13%6) Query: 175 MKTNTSIATL0KXLLDGGYLTESQLKTDFNYTI FDKDNDMNDSE-AYDYAVKCFAKLTPEQ 234- +4 S AT+ KLL L+ FN K+ND Sbjct: 139 IRRQISCATISKPLL LSNTSSEDLFN---PKNNDKKET YARITLQK 181 Query: 235 LTY- IHNDVI ILGMCHIHYSDIFPNFDYNKLTFSLNIMESYLNNEMTRFQLLN- -QYQD 290 L I.ND 5 I 4 F S+ F L+ Q Sbjct: 182 LFHEINNDECDVGIFDATNSTI ERRRIFEEVCSFNTDELSSFNLVPIILQVSC 235 WO 00/32825 PCT/I B99/02040 289 Query: 291 
IKISYTHYHFHDMNFY-DYIKSFYRGCLNNYNTKYINKLIDEPCFSID- INSSYPYVMYH 348 St Yt H+ F DYt Y FS+D N Yt H Sbj ct: 236 FNRSFIKYNIIUIKSFNEDYLDKPYELAIIOFAKRLYYMSQFTPFSLDEFNQIHRYISQH 295 Query: 349 EKIPTWLYFYEHYSEPTLIPTFLDDDNY 376 E+I T LtF P L+ *Y Sbjct: 296 EEIDTSLFFFNVINAGVVEPHSLNQSHY 323 >giJ2258375jgbIAAD11909.11 (AF007261) transcription initiation factor sigma [Reclinomonas americanaj Length 532 Score 39.9 bits Expect 0.070 Identities 49/205 Positives 84/205 Gaps 14/205 Query: 100 NHFLLKDTMRYFflNITRENIYLKSAEENEHTLKMKATILAlNQNVIL EKRVKSSIN 156 N+t+ F tHE L K NVI+r K +K N Sbj ct: 177 NTYLVKNSYLNLFKTVPHDSIYMNYSYIQTPLNILKEYLQLIKI INVl ILQINKNIKXXNN 236 Query: 157 LDLTE4FLNGFKFNIIDNFM- KTNTSIATLGKKLLDGGYLTESQLKTDFNYTIFDKDND 213 Ltttt.FL F Nt. K +t K L Y+T L T Y K Sbj ct: 237 LNISLFLYKF'YQELKWNYIFINKISRNJTQKINIKTLKNSYITFYNLITFIQYYTTKKQRL 296 Query: 214 MNDSEAYDYAVKCFAK- -LTPEQLTYIHNDVIILGMCHIHYSDIFPNFDYN-KLTFSLNI 270 D +K FK Pt +N +I Gt HI+ N K.T I Sbjct: 297 KKDIFYKQIFIKTFLKQHKIPKINKIKNNSLIKYGLTHIYDMILISILRE-NIKVTLKNRI 356 Query: 271 MESYLNNEMTRFQLLNQYQDIKISY 295 T QY +KI Y Sbjct: 357 IFNYNPYITT ISKQY--VKIGY 376 >giJl57341embjCAA374SOI (X53370) DNA polymerase (AA 1-575) [Bacteriophage phi-29] Length 575 Score 39.5 bits Expect 0.092 Identities 41/150 Positives 64/150 Gaps 36/150 (24%) Query: 497 IPALRSHFNL-FRLDDNNELYNIINCYKNTERNIL- -F 542 L.Kt.LN LYG +P Ltt +LFRL, G+ Tt Sbjct: 381 LAXLHLNSLYGKFASNPDVTGKVPYLKENGALGFRL------------ GEEETKDPVYTPN 429 Query: 543 STFVTSRSLYNLLVPFQYLTESEIDDNFIYCDTDSLYMKSVVKPLLNPSLFDPIALGKWD 602 FtT+ +tY Q D IYCDTDS... P DP LO W Sbjct: 430 GVFITAWARYTTITAAQACY--DRIIYCDTDSIHLTGTEIPDVIKDIVDPKKLGYWA 484 Query: 603 IENEQIDKMFVLNHKKYAY--EVNGKI 627 S+ t. L K Y EV.GK.
Sbjct: 485 HES-TFKRVKYLRQKTYIQDIYNKEVDGKL 513

Query= pt1110872 44AHJDORF002 Phage 44AHJD ORF 13789-573213 1 (647 letters)

>giJ13S2731spIP276221TAGCBACSU TEICHOIC ACID BIOSYNTHESIS PROTEIN C
>gil4781261pirl jD49757 techoic acid biosynthesis protein tagC Bacillus subtilis (strain 168)
>gi1143727 (M457497) putative [Bacillus subtilis]
>gil2636103jenbICABl5594.1f (Z99122) alternate gene name: dinC [Bacillus subtilis]
Length = 442
Score = 112 bits (278), Expect = 7e-24
Identities = 91/314, Positives = 147/314, Gaps = 58/314 (18%)

Query: 152 FELNELEPKFVMGFGGIRNAVNQSINIDKETUHMYSTQSDS- QKPEGFWINKLTPSG 207--
F. +tPK VQS N D+ +ttYtT0 S +t+lt Lt G
Sbjct: 7 FDFTNITPKLFTELRVADKTVLOS FNFDEIOJHQIYTTOVASGLGKDNTQSYRITRLSLEG 66

Query: 208 DLISSMRIVQGGHGrrIGLERQSNGENKIWLHHD--GVAKLLQVAYKDNYVLDLEEA 262
SM GGHGT IG.EtNGt+IW +D ttLt YK LD Et
Sbjct: 67 LQLDSMLLKHGHGTNIGIEFNR-NCTIYIWSLYDKPNETDKSELVCFPYKAGATLD- ENS 124

Query: 263 KCLTDYTPQSLLNG{TFTPLIDEANDKLILRFDCTIQVRSRADVO4HIDNVEKEMTI0N 322
K L HI TP +D N +L +R +0D104+ N TI N
Sbjct: 125 KEQFSMF -DRTADKRL QYDTIO4N--NNKOWVTIFN 170

Query: 323 SE NNDN RWMQGIAV0CDDLYWLSCNSSVNSHVOICKYSLTTCQKI 367
N .4N QG +D LYW
Sbjct: 171 LDDAIANKNNPLYTINI PDELHYLQCFFLDCYLYWYTCOTNSCSYPNL ITIV 222

Query: 368 YDYPFKLSYQDGINFPRD---NFKEPECI CIYTNPKTKRKSLLLANTNCGCCKRFH 420
+D Q I +0 NF+EPECIC+YTNP+T KSL++ +T.C C R
Sbjct: 223 FDSDNKIVLQKEITVCKDLSTRYENNFREPECICMYTNPETGAXSLMVCITSCKECNRIS 282

Query: 421 NLYCFFQLGEYEHF 434
YE+F
Sbjct: 283 RIYAYH SYENF 293

>gi1142847 (M64050) DNase inhibitor [Bacillus subtilis]
Length = 125
Score = 51.9 bits (122), Expect = 1e-05
Identities = 35/116, Positives = 55/116, Gaps = 10/116

Query: 152 FELNELEPKFVNCFCCIRNAVNQSINIDKETNHMYSTQSDS -QKPECFWINKLTPSC 207
F+ PK V QS ND++ ++Y+TQ S +I C
Sbjct: 7 FDFTNITPKLFTELRVAnKTVLQSFNFDEKNNQIYTTQVASCLGKONTQSYRITRLSLEC 66

Query: 208 DLISSMRIVQCCHCTTILERQSNEMKIWLMHD C-VAKLLQVAYKDNYVLD 258
SM+ CGHT I+E +NC +IW +D '1K LD
Sbjct: 67 LQLDSMLLKHCCHCTNICMENR-NCTIYIWSLYDKPNETDKSELVCFPYKACATLD 121

>gij4038407 (AF103943) factor C protein precursor [Streptomyces griseus]
Length = 324
Score = 39.1 bits, Expect = 0.10
Identities = 61/269, Positives = 102/269, Gaps = 33/269 (12%)

Query: 172 VNQSINIDKETNHNYSTQS0SQKPEC---FWINKLTPSCDLISSNRIVQGCECTTICLER 228
V QS D Q S P+ I +L SC+ H GNC +IC.+
Sbjct: 66 VQQSFTFDIVNRRLFVAQLKSCSPD0SCDLCITQLDFSCNKLG3HMYLLCFGHCVSICAQ- 124

Query: 229 QSNCEMKIWLHHDCVAXLLQVAYKDNYVLDLEEAKCLTDYTPQSLLNKHTFTP 281
+W 0 GCT S LKH P
Sbjct: 125 PVCAflTYLNTEVD--VNSNARCTRLARFKWNNCATLSRTSSALAKHQPVPCATEMTC 179

Query: 282 LIDEANDKLILRFC0CTIQVRSRAVKNHIDNVEKE4TIDNSENNDNRWMQCIAVDCDDL 341
ID +V V D QG A+CG
Sbjct: 180 AIDPVNNRMAIRYLTASCRRYCIYNVADIAACVYDKPLSDVPHPTCLGTFQYALYGSYV 239

Query: 342 YWLSN SSVNSHVQICKYSLTTCQKIYDYPFKLSYQ0CINFPRDNFKEPECIC 394
Y L+CN NS+V TC +CG F+EPEC.
Sbjct: 240 YQLTCNPYCPDNPNPCNSYVS- -SVBVNTCALVQ -RAFTRACSTL- -TFREPECMC 290 Query: 395 IYTNPKTKRKSLLLA4TNGGCCKRFHNLY 423 lY L +G C R NL+ Sbjct: 291 IYRTAACEVR-LFLCFASCVACDRRSNLF 318 Query= pt1110873 44AMJDORFOO3 Phage 44AMJD ORF 16626-838912 1 (587 letters) 2I'no',3,I,..,Inn jrVl9iyflQ 0034 fTT. PROTEIN (LATE PROTEIN CP9) >gij7S8S0jpirlIWMBPT9 gene 9 protein phage phi-29 >giJ215327 (M14782) tail protein (Bacteriophage phi-29] >gij2253641prfjI 3012700 gene 9 (Bacillus sp.] Length 599 Score 92.4 bits (226). Expect Be-18 Identities 126/618 Positives n 251/618 Caps =71/618 (11%) Query: 5 TNFKFFYNTPFT-0YQNTINFNSNKERflDYFLNGRHFKSL0YSKQPY-NFIRDRNEINVD 62 TN. +PF. DY.NT F S+ R SK F ++V Sbjct: 9 TNVRILADVPFSNDYKNTRWFTSSSNQYNWF- -NRKSRVYEMSKVTFMCFRENKPYVSVS 66 WO 00/32825 PCT/I B99/02040 291 Query: 63 MQWHDAQGINYMTFLS -DFEDRRYYAFVNQI EYVNDVVVKIYFVIDITIM'FYTQGNVLEQL 121 F Di. 4+ +YAFV N V ID Ti. Sbjct: 67 LPIDKLYSASYIFNA3YGNKWFYAF"VTELEFKNSAVYVMFEIDVLQTWMFDMKFQES 126 Query: 122 SNVNI ERQHLSKRTYNYMLPMLRNNDDVLKVSNIWYVVYNQMQQYLENLVLFQSSADLSKK 181 I R+H+ K P D+ L S Sbjcr: 127 F IVREHV-KLWNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDDMNFLVIISKSIM 182 Query: 182 FGT--KKEPNLDTSKGTIYDNITSPNLYMEYGDFINFMDKMSAYPWITQNFQK V 235 CT Li. P+ Y+ D +I N V Sbj ct: 183 HGTPGEEESRLNDINASL-NGMPQPLCYYINPF--YKDGKVPKTYIGDNNANLSPIV 236 Query: 236 QMLPKDFINKDLEDVICTSEKITGLKTLKQGGKSKEWSLK-DLSL SFSNLQ 285 ML F +0 D .T LK K~i+ LK N+i Sbj ct: 237 NMLTNI FSQKSAVNDI -VNNYVTDYIGLKLDYIONGDKELKLDKDMFEQAGIAflDKHiGNVD 295 Query: 286 KnEFKHNIRNEYNTIEFYDWNGNTMLLDAGKISQK 326 ++KKD+ Y E D+ GN ML I+ Sbj ct: 296 TIFVKCIPDYEALEIDTGDKWGGFTKDQESKLMMYPYCVTEITDFKGNHNNLKTEYINNS 355 Query: 327 TGVKLRTKSI IGYIWEVRVYPVDYNSAENDRPI LAFO4KEILIDTGSFLNTNITFNSFAQV 386 +G N+V DYN+ D Ni. 
S+N N Sbjct: 356 K-LKIQVRGSLGVSNKVAYSVQDYNA DSALSGGNRLTASLDSSLINNNPN 404 Query: 387 PILINNGILGQSQQANRQ--KNAESQLITNRIDNVLNG SDPKSRFYDAVSVASNLSP 441 I I N L Q N+ +N+S I +0 A+ +AS++ Sbjct: 405 DIAILNDYLSAYLQGNKNSLENQKSSILFNGIMGMIGGGISAGASAAGGSALGMASSV-- 462 Query: 442 TALFGKFNEEYNFYKQQQAEYKDLALQPPSVTESEMGNAFQIANSINGLTMKISVPSPKE 501 QA+ OtA PP-i.T+ AF N Sbj ct: 463 TGMTSTAGNAVLQMQANQAKQADIANI PPQLTKMGGNTAFDYGNGYRGVWVIKKQLKAEY 522 Query: 502 ITFLQKYYNLFGFEVNDYNSFIEPINSMTVCNYLKCTGTYTIRnIDPMLMEQLKAILESG 561 L NY++ DI1. I ++G Sbjct: 523 RRSLSSFFHKYGYKINRVKK--PNLRTRKAFNYVQTKDCFISGDINNNDLQEIRTIFDNG 580 Query: 562 VRFWMNflGSGNPMLQNPL 579 WH D ON L Sbjct: 581 ITLWWTDNIGNYSVENEL 598 >gij138124jspIP07534jVG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) ,gij75849jpirI IwmDPSz gene 9 protein phage PEA >gij216058 (M11813) tail protein [Bacteriophage PEA] Length 599 Score 81.9 bits (199). Expect le-14 identities 127/618 Positives 248/618 Gaps 71/618 (11%) Query: 5 TNFKFFYNTPFT -DYQNTIHFNSNKERDDYFLNGRHFKSLDYSKQPYNFIRDRNE -INVD 62 TN PF+ DY+NT F S+ SK R+ I-iV Sbjct: 9 TNVRILADVPFSNDYKNTRWFTSSSNQYNNF--NSKTRVYEMSKVTFQGFRENKSYISVS 66 Query: 63 MQWNDAQGINYMTFLS -DFEDRRYYAFVNQIEYVNDVVVKIYFVIDTIMTYTQGNVLEQL 121 YAFV ++EY N -i+FID N+ Q Sbjct: 67 LRLDLLYNASYIMFQNAflYGNKWFYAFVTELEYKNVGTFYVHFEIDVLQTW-MFNIKFQE 125 Query: 122 SNVNIERQHLSKRTYNYMLPMLRNNDDVLKVSNKNYVYNh -QMQQYLENLVLFQSSADLS 179 S I RiHi.K P D+ L +Y +L S Sbj ct: 126 SF- -IVREHV- KLWNDDGTPTINTIDEGLNYGSEYDIVSVENHRPYDDMMFLVVI SKSIM 182 Query: 180 KXFGTKKEPNLDTSKGTIYDNITSPVNLYVMEY GD---FINFMDK 221 +tE L+ P. 
Y+ GO +N Shi Ct: 183 HGTAGE-AESRLNDINASL-NGMPQPLCYYIHPFYKD)GKVPKTFIGDNNANLSPIVNNLTN 241 Query: 222 MSAYPWITQNFQKVQMLPKDFINTK DLEDVKTSEKITGLKTLKQGGKSKEWS 273 N VNM D+I K K G+ K G Sbjct: 242 IFSQKSAVNNI- -VNMYVTDYIGLKLDYKNGDKELKLDKDMFEQAGIAflDKHGNVDTIFV 299 Query: 274 LKDL -SLSFSNLQEMMLSKXDEFKEMIRNEYMTIEFYDWNGNTMLLDAGKISQKTGVK 330 K +L 1(0+ Y E D+ GN ML I +K Sbj ct: 300 KKIPDYETLEIDTGDKWGGFTKDQESKLMMYPYCVTEwVTDFKGNHMNLKTEYIDNNK-LK 358 Query: 331 LRTKSI IGY}INEVRVYPVDYNSAENDRPILAKNKEILIDTSFLNTNITFNSFAQVPILI 390 +G N+V OYN+ L+ N-i I+ WO 00/32825 PCT/I 899/02040 292 Sbjct: 359 IQVRGSIGVSNKVAYSIQDYNAGGS LSGGDRLTAS LDTSLINNPNlIAII- 409 Query: 391 NNGILGQSQQANRQ--IOJAESQLIThRIDNVLNGSDPKSRPYDAVSVASNLSP 441 N L Q N+ .N I .L G A +A SP Sbjct: 410 -NDYLSAYLQGNKNSLENQKSSILFNGIVGNLGGG VSAGASAVGRSPPGLASSV 462 Query: 442 TALFGKFNEEYNPYKQQQAEYKDLALQPPSVTESEMGNAQIANSINCLTNKISVPSPKE 501 QA. D.A PP AF N Sbj Ct: 463 TGM4TSTAGNAVLDMQAIJQAXQADIANI PPQLTKNGGNTAPDYGNGYRGVYVI KKQLKAEY 522 Query: 502 ITPLQKYYNLPGPEVNDYNS PIE PINSMTVCUYLKCTGTYTI RfIDPNLNEQLKAILESG 561 L .G N NY+, DI I +.G Sbj ct: 523 RRSLSSPPHKYGYKINRVKK- -PNLRTRKAYNYIQTKDCPISGDINNND3LEIRTIDNG 580 Query: 562 VRPWMNDGSGNPMLQNPL 579 WH D GN N L Sbjct: 581 ITLNHTDIGNYSVENEL 598 >9iI1429238IemlbICAA67657I (X99260) tail protein [Bacteriophage 8103] Length 598 Score 77.6 bits (188), Expect 2e-13 Identities 130/623 Positives 240/623 Gaps 86/623 (13%) Query: 5 TNPKPPYNTPPT -DYQNTIHPNSNKERflDYFLNGRMPKSLDYSKQPYNFI RDRNEIN T..P FN P.DY..T YF +iK NP. I Sbj ct: 9 TDVRIPSNVPFSNDYKSTRWPTNAflAQYSYP- -NAKPRVHVINECNPVGLKEGTPHIR 64 Query: 61 VDMQWHDAQGINYMTFLS -DFEDRRYYAPVNQI EYVNDVVVKIYFVIDTIMTYTQGNVLE 119 D YMHF. 
YVEYVN V tYP ID IT+ Sbjct: 65 VNKRIDDLYNACYMIPRNTQYSNKWPYCPVTRLEYVNSGVTNLYPEIDVIQTW-HPDPKP 123 Query: 120 QLSNVNIERQHLSKRTYNY4LPMLRNNDDVLKVSNKNYVYNQMQQYLENLVLFQSSADLS 179 QS.+ EQ P+ D+ L V Q P S Sbjct: 124 QPSYIVREHQEHWDANNE PLTNTIDEGLNYGTEYDVVAVEQYKPYCDLMPNVCISCS 180 Query: 180 KKPGTKIEPNLDTSKGTIYDNITS PVNLYVHEYGDFINFMDKHSAYPWITQNPQKVQ 236 KC T E G I NI P++.YV D S P +T .VQ Sbjct: 181 KMHATAET FKACEIAANINGAPQPLSYYVHP YEDGSS--PCVTIGSNEVQ 230 Query: 237 ML-PICDPINTKDLEDVKTSEKITCLKT LKQGGKSKEWSLKDLSLSPSNL--284 P DFs T tiK SL.D Sbj ct: 231 VSKPTDPLKNMPTQEHAVNNIVSLYVTDYIGLNIHYDESAXTNSLRDTHFEHAQIADDKH 290 Query: 4 SKKDEPKHIRNEYMTIEPY--------- DWNGNTMLLDAGK 322 +E +F NE V D. GN Sbj ct: 291 PNVNTIYLKEVKEYEEKTIDTGYKPASFANNEQSKLLMYPYCVTTITDPKGNQIDIIGJEY 350 Query: 323 ISQKTCVKLRTKSIIGYHNEVRVPVDYNS AENDRPILAKNKEILIDTGSFLNTNIT 379 4+ .G N+y DYN. D. A NT++ Sbjct: 351 VNG-SNLKIQVRGSLCVSNKVTYSVQDYNAflTTLSGDQNLTAS CITSLI 398 Query: 380 PNSPAQVPILINNGILGQSQQANRQ- -KNAESQLITNRIDNVLN- GSDPKSRPYDAVS 434 N. V I+ N L QN+ .N 0+ AV Sbjct: 399 NNNPNDVAII NDYLSAVLQGNKNSLENQKDSILFNGVMSMLGNGIGAVGSAATGSAVG 456 Query: 435 VASNLS PTALPGKPNEEVNPYKQQQAEVKDLALLQPPSVTESEMGNAPQIANS INCLTMKI 494 VAS ST.+ QA+ D.A PP.. A. N Sbj ct: 457 VAS -SATGMVSSAGNAVLQIQGMQAXQADIANTPPQLVKMGGNTAVDYGNGYRGVYVI K 514 Query: 495 SVPSPKEITPLQKYYMLPGPEVNDVNSFIEPINSTVCNYLKCTGTYTIRDIDPLMEQL 554 L N NY.. I Sbj ct: 515 KQIKEEYRNILSDPSRKYGVKTNLVK--MPNLRTRESVNYVQTKDCNIIGNLNNEDLQKI 572 Query: 555 KAILESGVRFWHNDCSGNPHLQN 577 I +SG+ WH D G. 
L N Sbjct: 573 RTIPDSCITLWHADPVGDVTLNN 595 )giJ215339 (M12456) p9 tail protein (Bacteriophage phi-29] >giJ224163jprfJ 11011232C protein p9,tail [Bacteriophage phi -29] Length 335 WO 00/32825 PCT/I B99/02040 293 score 71.0 bits (171), Expect 2e-11 Identities 64/293 Positives 123/293 Caps 20/293 (6W) Query: 292 KDEFKIIMIRNEYMTI EF'YDWNCNTMLLDAGKISQKTCVKLRTKS IICYHNEVRVYPVDYN 351 KD+ Y E D+CGNM L I+ tK++ G N+y DYN Sbj ct: 57 KDQESKLMIIYPYCVTEITDFCCNHMNLKTEYINNSC- LKIQVRCSLGVSNKVAYSVQDYN 115 Query: 352 SAENDRPILAXNKEILIDTGSFLNTNITFNSFAQVPILINNGILQSQQARRQ- -KNAES 409 +s D N+ S +NN I I L Q Nt +N S Sbj ct: 116 A- -DSALSGCNRLTASLDSSLINNNPN---DIAILNDYLSAYLQNCNSLENQKS 165 Query: 410 QLITNRIDNVLNC- SDPKSRFYDAVSVASNLSPTALFCKFWEEYNFYKQQQAEYCDLA 466 NI A+ T+ Ok.. D+A Sbjct: 166 SILFNCIMCMICCGISACSAAGSALMASSV--TMTSTANAVLQMQAQAKQADIA 223 Query: 467 LQPPSVTESEMGNAFOIANSINCLTMKISVPSPKEITFLQKYYMLFGFEVNDYNSFIEPI 526 PP AF N C+ L +G +N Sbjct: 224 NIPPQLTKMCCNTAFDYCNCYRCVYVIKXQLKAEYRRSLSSFFHKYCYKINRVKK- -PNL 281 Query: 527 NSNTVCNYLICCTCTYTIRDIDPMLMEQLKAILESCVRFWHNflCSGNPMLQNPL 579 DI4 I WH D CM L Shict: 282 RTRKAFNYVQTKDCFISCDINNNDLQEIRTIFDNCITLWHTDNICNYSVENEL 334 >gi111819681emb1CAA87738.11 (Z47794) tail protein (Bacteriophage cP-1J Length 230 Score 53.9 bits (127), Expect 3e-06 Identities 29/113 Positives 54/113 Caps 3/113 Query: 1 MRKLTNFKFFYNTPP -TDYQNTIHFWSWKERDDYFLNCRNFKSLDYSKQPYNFIRDHMEI 59 T .PF DY NI+F .D4.F Y I Sbj ct: 1 NQESTKIWLYAKSPFKNDYANVINFETRESMEDFFTKXNPHIEIVYEYDKFQYTQRNCSI Query: 60 NVDMQWHDAQCINYMTFLSDFEDRRYYAFVNQI EYVNDVVVKIYFVIDTIMTY 112 V YMF... 
R YYAFV Y.N+ +I TY Sbj ct: 61 VVSCRVEKYENVrYMRFINN-- CRTYYAFVFDVLYINEDATRI IYEVDVWNTY 111 >giJ11819701embCAA87740.11 (Z47794) tail protein (Bacteriophage cp-1J Length 586 Score 42.2 bits Expect 0.010 Identities 79/381 Positives 139/381 Caps -92/381 (24%) Query: 277 LSLSFSNLEM4JLSK- -KDEFK- -HMIRNEYNTIEFYDWNCNTMLLDAC- KISQKT 327 L +QE +S KD+ IE YD CM.+ 1+ Sbjct: 187 LKIAYI3OIQECLRSYMCKDDLEIEVQLLNSEFTEIELYDIYCNSYVYQPQYLPRTIDEAH 246 Query: 328 CVKLRTJSIICYHNEVRVYPVDYNSAEN DRPI 360 +G N.V .+YN.A N D+ IL Sbj ct: 247 KYKVIVSCSLCDSN0VHINFLEYNNANNVSYADKNILDSLESDWAEHNPEHFKYCLNnV 306 Query: 361 -AKNKEILIDT-CSFLNTNITFNSFAVPILINNILOS0ANRQG4AESQLITNRIDN 418 K+ IL D 0 N +L QS Sbj ct: 307 TCKSVAILNDAEASYIQSHOJQMEHTQLTFKENRflMLKQSVDLSNKQVATANSQASYNAQ 366 Query: 419 VLNCSDPKSRFYflAVSVASNLSPTALFCKF------------------ NEEYNFYKQQQ-- 459 S S LCP N.+YN 00 Sbj ct: 367 FAVDSANINQWTECASC ILNVACNLLTCNFCCALCCLASGCMKVFNANRDYNDKVVQQCF 426 Query: 468 A DL QP SV AFQ N Sbjct: 427 TSENNALKSQSNALAMKSKIALDQSIRAYNATMADLQNQPISVQOICNDLAFQSCNRLT 486 Query: 489 CLTMKISVPSPKEITFLOKYYNLFCFEVNDY-NSFIEPINSN'rVCNYLKCTCTY--TIRD 545 K+S+ +Y .C VN.+N +5S NY+K T.R- Sbj ct: 487 DVYWKVSLAQKEIMCRANEYIKCYCVLVNWFTNDALSVNRSRKRFNYI KNINVNLCTLR- 545 Query: 546 IDPHLMEQLKAILESCVRFWH 566 M .+AI .SCVR N.
Sbjct: 546 ANQSH!OZAIQAIFQSCVRIWN 566 WO 00/32825 PCT/I 899/02040 294 Query= Pt 1110875 44AHJDORFOOS Phage 44AHJD ORF 112643-138901-1 1 (415 letters) >gi13845203 (AE001399) GAF' domain protein (cyclic nt signal tranaduct.) (Plasmodium falciparumi Length 1245 Score 52.3 bits (123), Expect 6e-06 Identities 59/246 Positives 105/246 Gaps 27/246 Query: 174 ESIDRNMGNVDYIGFPIQ4FLLGNAVNFSSPILSNLNIYNLLQKHKO4TSRLYKG'IFLEMR 233 +S DN+ N+ N.V FS+ N IY++L N *YK E+ Sbjct: 854 DSSDNNNNNNNNNNNNNNYNNNNSVIFST NEKIYDML--NRflNIYKCVKKEIF 904 Query: 234 RNDYVNEKRNTRAFNSNDDANTTGEPEFNEYNILADDNLRNHINQNGDFFYIKTDDKYI 291 D+ N N N+ NNGD Y KY Sbjct: 905 EGDSIIKTENKPNLTNKNYM4NNDNIDNNNNNNNNNNIDNNNNNNGDNIYNDDLKKYYLN 964 Query: 292 KVMYNVTTFMTNI IVVPYTKQYEFCTXIR -DIDNHVTYLRDDMFYKENNERYYYNPSNLN 350 +N K E KC+ I L +F+K NM +L+ Sbjct: 965 TSIFNKDLYVXMFVDIIt'ThKSLEEIIK?.2NVISERINSL LFHKGNN LNDVTKLY 1018 Query: 351 FDNAYSKNYWVDNDRYLYLD?NKIIKFHIIO4EMKNSEFERKEKIYEDN YIENTK 406 NAY N K IF EKX+M F+ +KIY+ N N K Sbjct: 1019 MSNAYGEKCFFFN--FPQIKEIIFVNEYEKKMDNKYFKNLKKIYKYNSNKIFSNNYK 1073 Query: 407 KYLMKQ 412 Sbjct: 1074 FFIIKK 1079 gij3758843embCABh1128.1I (198551) predicted using hexExon; MAL3P6.23 (PFCO820w) Hypothetical protein, len: 4982 aa (Plasmodium falciparum] Length 4981 Score 49.2 bits (115). Expect Se-OS Identities 67/287 Positives 110/287 Gaps 60/287 Query: 127 ITDLNSATDLKYHSNFL)G4YPI IIYDEFLALEDDYLIDEWDKLKT- IYESIDRNHGN 182 I D+N Di. 
I Yfl IY N Sbj ct: 3619 IMDINKSKDISKNNEIVQN- NKYDKIRNJMDAIYMAIDKDMDN 3664 Query: 183 VDYIGFPKHFLLGNAVNFSS P1LSNLNIYNL- LQIO(KNNTSRLYKNI FLEMRRNDYV 238 I FL N S +N YNL K N R YN F +D Sbj ct: 3665 IGI INCMRYFNLY!OJYNNLSNECNNRE -YNLNELYMEDI KRNMKR -YDNNFNINHYDDNN 3722 Query: 239 NEKRN4TRAFNSNDDANTTGEFEFNEYNLAflDNLRNHINQNGDFFYIKTDDKYIKVMYNVT 298 N N N N N+ NNG F+ D Sbjct: 3723 NlN fl4 Nft N N N 3771 Query: 299 TFMTNI IVVPYTKQYEFCTKIRDIDNVTYLRDDMFYKENNERYYYNPSNLHFDNAYSO4 358 K FCTK +tJ+E N N N Y+ N Sbjct: KDLFFCTK-------------IWNIFPCO4IEVCO4EYNKKIYNNYTCN 3807 Query: 359 YVVDNflRYLYLDMNKIIKFHIKNEF4KKNSEFERKEK-IYEDNYIEN 404 V+N ++IK N E+ +EK +Y EN Sbj ct: 3808 I SVNNTLNCLNI I KEIKLNNNKKXI LNYYEYHKVEKLLYYRRSFEN 3854 Score 35.6 bits Expect 0.70 Identities 62/290 Positives .121/290 Gaps 65/290 (22%) Query: 2 VKQNRLDMVRflYQNAVN- -HVRKXI PDKYNQIELVDELMNDDIDYYIS±SNKSD~UKSENY) S +K+N +N +N DK N I SN 4SF Sbjct: 4445 IKRNNINKSNIKRNNINKSNVKRSNTDKSNVIS----------- DFHIT-SNNNITRSFT- 4492 Query: 60 VSFFIYLAIKLDIKFTLLSRRYTLRnAYRDFIEEI IDENPLFKSKRVTFRSARDYLAIIY 119 A D F LS TL +iY +F I Sbjct: ATLTDSIFNTLSE--TLNYSYDNFFSN4DN---------------------- IKI 4523 Query: 120 QDKEIGVITDLNSATDLKYHSNFLKHYPIIIYDEFL--ALEDDYLIDEWDKLKTIYE 174 El ITD++ +YN N+LK +D DE E Sbj ct: 4524 KKNEINNITDVDYGNKKEYHENYLKVKQNKVNEEYIEETFKSDICDCSIKDE-ACTIRTLSE 4583 WO 00/32825 PCT/I B99/02040 295 Query: 175 S--IDRNGNVDYIGFPKFLL4AVNFSSPILS4L4IYLLQ4KIIN--TSRLYIO4IFL 230 S I N1 N+D P N +N R+IGJ Sbjct: 4584 SCNISE4IS1ID MDDEDHISFPNGRNVINNYMKOJHVNYDKRVGG4KIP 4634 Query: 231 EMRRNDYVNEKRNTRAFNSNDDANI'TGEFEFNEY4LALDD4LRNHINQNGD 280 D +44 s.D E L N4G+ Sbjct: 4635 SFTNFDKILDEKKKK SDKDSSSKWLEREEHIKEIKLEKNEYMN1G4 4680 Score 34.0 bits Expect Identities 47/211 Positives 84/211 Gaps 32/211 Query: 210 IYN4LQKKNCTSRLYKNI FLEMRRNDYVNEKRNTRAF4SNDDAt4TTGEFEFNEYNLADD 269 I++LLQK LY+N+ R N4+ T E 44 Sbjct: 918 IFSLLQKDSSPLLVLYENVHI------------ RSGEKYGRNE--ATDNEVDYKKGDIIKH 964 Query: 270 NLRNHINQ1400FFYIKTD DKYIKVYNVTTFNIIVVPYTKQYEFCTKIRDIDN4V 
326 N4+ N D+ D+ KNMY V E K D+ N+ Sbjct: 965 1VTNEHGNHSDSYPYCNSLNLDRKPKl4YE-DIYKEKGVKSDCSNIEI-KiDMINfl 1021 Query: 327 TYLRfDNMFYKENNERYYYNPS14LHFDNAYSIOYWDNDRYLYLDMNKII KFHIKNE 382 Y Y+ WV++ +YL +N4 F +K04 Sbj ct: 1022 VYKKNE- FYEDSRINMIYDEDEIKTflFLIPHKYVIN- I IYLFLNILLTDESNFKLKNK 1077 Query: 383 NKKNNSEFERKEKIYEDN--YIENTKKY 408 E K IYEDN s+1N KKY SbjCt: 1078 KYGYVVNEETKGTIYEDNNGLQEILKNGKKY 1108 Score 33.6 bits Expect 2.7 Identities 42/198 Positives 77/198 Gaps 42/198 (21%) Query: 222 SRLYIOIFIEMR~ -RNDYVNEKRNTRAF NSNDDA4'1TGEFEFNEYNLA 267 S LY +N4 K+14T TT E SbjCt: 411 SVLYSIIYMNKKYKKJCNFIITN4XKNTNVYFENDVIQLSVENTSEDTFTTNTRESSLNSGM 470 Query: 268 ODNLRNNINQNGDFFYI KTI3DKYI KVNYNVTTFNTNI IVVPYTKQYEFCTKIRDIONHVT 327 R +N4 D +00K N YTK E Sbj ct: 471 HNDHRYSVNNYADEKVYHSO0KSDHLIYGIVHDEKNKYDENYTKTKE--------------- 517 Query: 328 YLRDDMFYKENNERYYYNPSNLHFD4AYSKNYVVDNDRYLYLDM14KI IKFHIK1CMIKNH 387 +YK N+ N4 K LD+ KI H+KN+ +N Sbj ct: 518 -NENIIYKS14IVO)KKTCDISSEMVNGK0K------------ LDVEKYIGSHVXND- ENNK 563 Query: 388 SEFERK-EKIYED4YIE4 404 4 YIs.1 Sbjct: 564 EKLKICKID14VNKKEYI014 581 9 i13845297 CAE001421) hypothetical protein (Plasmnodium falciparum] Length 2380 Score 48.0 bits (112), Expect le-04 Identities 87/390 Positives 160/390 Gaps =65/390 (16%) Query: 20 VRKKIPDKYNQIELVELNDDIDYYISIS4RSGKSF4YVSFF--IYLAIKLDIKF 74 +4K +K +N4D R K+NbY++ +YL I DI Sbjct: 1049 LQRIO0DKCSIORNRNRYINKDS14IHL!OUIRIKFKNLNYIONNSFEIELYLKINNDIFL 1108 Query: 75 TLLSRHYTLRDAYR DFIEEIIEN-PLFKSKRVTFRSARflYLAIIYQ0KEIGVI 127 +Y Y EN 44 Y +K(4 SW) ct: 1109 OF14KIIIYNVoNFYNFSITLINIMSKYYSE4FYAYNLEKIVYKFLL14NKNFEYIEKQYSSK 1168 Query: 128 TDLNSATDLKYHS4FLIGIYPI I IEFLA- LEDDYLIDEWOKLKTIYESIDRNHG4V 183 0+14 D+ II EFL L+ D I KLKT Sbjct: 1169 EDMNEL-DILVNTYI3MKYDKII EFLKNNGYLKIDRYIYFYPKLKT DI 1214-- Query: 184 0YIGFPKNFLLGNAVNFSSPILS4LNIYNLLQKHKMNTSRLY KF--LEMRRN 235 F ++FL N4+ L NI K Y K IF M+ Sbj ct: 1215 ILFFFCEIFLND14ILKIDRKFLKK-14ITIMIEVLKEIFFKEYVKRCITKCVIFFPVX4Kfl 1273 Query: 236 
NTRAFNSNDDANVTGEFEFNEYNLADONLRNHINQ14GDFFYI KYD 287 DtY K N4+ F14+ 0 N YN D+ 14+ N1N +Y K WO 00/32825 PCT/I B99/02040 296 Sbj ct: 1274 DHVMNINrCNNQYNNSNMFNTRGDHNNNNQTNDNHYNHHYDDTHNflNNNNSKYY -KNIC 1332 Query: 288 DKYIKVMYNVTTFMTNIIV- -VPYTKQYEFCTKIRDIDN{VTYLRDDMFYKEN ME 340 tIK K+M4Y V K K I N Sbj Ct: 1333 NKN-KIMYEKERKSSSLFISNNVQDVPIKHfYLKYSSIYKNFIYIISEIKNFNNKITKIN 1391 Query: 341 RY-YYNPSNLHFD)NAYSKNYVVDNDRYLYL 369 RY YYN NL+ D. ND YL+L Sbjct: 1392 RYNYYNYMNLNIDDL NDAYLFL 1413 Score 32.5 bits Expect Identities 46/183 Positives 73/183 Gaps 26/183 (14%) Query: 225 YKNI FLE?RRNDYVNEKRNTRAFNSNDDAZ4TTGEFEFIEYNLADDNLRNHINQNGDFFYI 284 tKIe s+.eN NSN N Ni. .N N IN+ I Sbj Ct: 27 HIWINIOIIKNKKFINID)NSlNCNNSNSNNSNSNNNNNNNNNIVRNN-NNFINADKKONI Query: 285 KTDDKYI KVMYNVTTFMTNI IVVPYTKQYEFCTKI RDI DNHVTYLRDDMFYKENMERYYrY 344 +D 1K V NI Y t D+ N+ ++KE ER Sbjct: 86 IMEDDDIKNKELVDESFVNIFF--YENYFKNLFNLNDVSNNKVI NIXEQCEODER- 138 Query: 345 NPSNLHFDNAYSKNYVVDNDRYLYLDMNKIIKFHIKNEZ4KKNMSEFERKEKIYEDNYIEN 404 N N N tKN VDN +14K IKN Y Sbjct: 139 NADN NLCNKNIVRDN---INK--IIGJ--TRNVNEILIYNNKYIINFLND 180 Query: 405 TICK 407 T K Sbjct: 181 TTK 183 >gi144939361emb1CAB38972.1I (AL034556) predicted using hexExon; MAL3PS.6 (PFCO600w), Hypothetical protein, len: 250 aa (Plasmodium falciparum) Length 249 Score -47.3 bits (110), Expect 2e-04 Identities 53/215 Positives 87/215 Gaps 30/215 (13%) Query: 209 NIYNLLQKHKNNTSRLYIOIIFLEIRRNDYVNEKRNTRAFNSNDDANTTGEFEF -NEYNL 266 NIYN L++i YIW N tN NtN EFE N YN Sbj ct: 13 NIYNKLEEK YIOFLKLKNMNSHMGASQNHNV-NNNYTMNELEEFEKINNNYNN 64 Query: 267 ADDNLRNHINQNGDFVYI KTD--- DIYI KVHYNVTTFNTNI IVXTPYTKQYEFCTKIRI) 321 N+IN D+ 1K tK YN +1I T Sbj ct: 65 NNNNINNNINNYDYMNIKVSQSVQHNKRLQDFYNNKNSPQHYI KKLKTCRFDADDI RNL 124 Query: 322 IDNHVTYLRDDMFYK--- ENMERYYYNPSNLHFDNAYSKNYVVDNI3RYLYLDO*KIIK 376 tt YRD+ K EN N N+ SNY DN+ LY K Sbjct: 125 LEKRLAYERflNTLIKNIQEEENKKGIGINGNFGSESNSSSSNY- -DNNYLLYRKINRI.NK 182 Query: 377 FNIKNEIJCJNMSEFERKEKIYEDNYIENTKKYLMK 411 KI KICY*+K Sbjct: 183 
TNTNKSKNRSRKRKRINSKI DICKYIIK 209 >giJ3845165 (AE001390) hypothetical protein [Plasmiodiumi falciparum] Length 1247 Score 45.7 bits (106), Expect 6e-04 Identities 52/239 Positives 94/239 Gaps 38/239 Query: 206 SNLNIYNLLQKHKMNTSRLYKNI FLEMRRNDYVNEKRNTRAFNSNDDAI4TTGEPEFNEYN 265 +N NtN tK K Rt I +N E N N Sbj ct: 474 NNTNKWNEI KKRKJCIFKREKNKI INNSFQNQEAEDWNNNNNDNNNDNHNDNNNENNNEN 533 Query: 266 LADDNLRNHINQNGDFFYI-KTDDKYIK- VMYNVITFNTIIVVPYTKQYEPCTKIR 320 D+N N+ N D I D+ Y tYN T' ri YTK-+r+ Sbj ct: 534 NNDNNNENNNDINNDINNIHNNDNNYYNNDNINLYNERCKKKCNLDNSYTKYFFYI FTLr 592 Query: 321 DIDN4VTYLRDHMFYKENME--------- PSNU4FDNAYS 356 FYi-rN re-YYN N Sbj ct: 593 -DNLPSIKFETFYEQ4TDHO4FNENYKFYYNTDDDTDIINAIKKKNVKNKXKNGNIVI 649 Query: 357 KNYVVDNIDRYLYLOMNKI IKFH IKINEHKNNSEFER KEKIYEDNYI ENTICKYLNIC 411 KNYt N+ YYL+ N+ *1 I +E KC+I+ .Y E KC K WO 00/3282 5 PCT/I 899/02040 297 Sbj ct: 650 KNYINHNE -YSYLEYNENIOYEINKKEKILTEYEDhYIKNIHYNYSEGDGKQTKK 707 Score 41.0 bits Expect 0.016 Identities 58/245 Positives 96/245 Gaps 43/245 (17%) Query: 207 NLNIYNLLQGKKMNTSRLY)WI FLEMRRNDWVNEKRmTRAENSNDDANTGEFENEYhL 266 N.N4YN +KK Y F D+ N D E YN Sbj ct: 564 N INLYNEMTKKXCMLDNSYTKYFFYI FTLDMLPS IKFETFYEIONTVH1OFNENYKFYYNT 623 Query: 267 ADD NRHNNDF--YKDKYKMNTTMNIVYK 312 DD N..N +NG+ VI V Y+N N T Sbj Ct: 624 DDDTDI INAIKKONVK KXNGNIVIKNYINHNE -VSVLEYNENflCLEINKXEKLLTEN 681 Query: 313 YEFCTKIRDIDNHVTYLRDDFKENERYNPSNLFDNAYSK NYV- -VO 362 YE. 1.0 V D .YN +N +NVYK
VD
Sbjct: 682 YEYDMKYIKNIHYNDYSEGDGKQTKKSFLNN 44NKYKXEDNKTQIISVMDHVD 738

Query: 363 NDR---------VYLYLDMNKIIKFHIK-NEM ~KXNMSEFERKEKIVEDNYIENTKKY 408
N+ F +K N+M K+ F +E I .EN K+
Sbjct: 739 NEGKLKNFUSOYFVDDMKERSNVEFNNKEEK 798

Query: 409 L&4KQY 413
L KY
Sbjct: 799 LKKHY 803

Query= ptj110877 44AHJDORF007 Phage 44AHJD ORF 12044-302711 1 (327 letters)

>gil1181960lemfbICAA87731.1I (Z47794) Connector protein [Bacteriophage CP-1]
Length = 337
Score = 45.7 bits (106), Expect = 5e-04
Identities = 44/184, Positives = 84/184, Gaps = 13/184 (7%)

Query: 127 QIHKLYDNCMSGNFVVMQNKPIQYNSDIEIIEHYTDELA VALSRFSLIMQAKFSK IF 184
IIK N V I +E L. L A. IF
Sbjct: 125 ELHKDNPDKIKRPCIVIPNNFVYEPYIGLLFCELADILTIQLNRAQITPYFIF 182

Query: 185 KSEINDESINQLVSEIYNGAPFVKNSPMFNAD DDIIDITSNSVIPALTEMKR 236
N S. N P V~i D I +1L
Sbjct: 183 ADTVSKININEVYNQDQGDFQSYQFTAFLKH 242

Query: 237 EVQNKISELSNYLGINSLAVDKESGVSDEEASNGnTSNSNIYLKGREP-ITFLSKRY 295
E L ++GIN+ DK. EA SNO G +lN K R +.KVY
Sbjct: 243 EKLRVMNQLLTFIGINNNPSDKKERLVVSEAISNNGVISAIVGWKSRKVLINKCV 302

Query: 296 GLDI 299
GL. I
Sbjct: 303 GLEI 306

>giI1429239Iem~ICAA67658I (X99260) upper collar protein [Bacteriophage B103]
Length = 308
Score = 44.9 bits (104), Expect = 8e-04
Identities = 40/159, Positives = 73/159, Gaps = 11/159

Query: 150 YNSDIEI--IEHYTDELAEVALSRFSLIQAFSKIFKSIDESINQLVSEIYNG 203
YN.D.. i+E +LAE+s Q I N S.
Sbjct: 121 YNNWLKCSTLPALEMFAQDL.AELKEIIAVNQNAQKTPVEIAANNNQLSLIINQYEGN 180

Query: 204 APF"KSPMFNADD-DIIDLTSNSVIPALTEMKEYQNKISELSNYWINiLAVKSG-o 262
APt+. D+ V. L K N E+ YLGI.+
Sbjct: 181 APVI FVHESLDLDNLKVFKTDAPYV flKLNAQKNAVWN EVMTVL4GIKNANLEKCERN 237

Query: 263 SDEEAKSNRGFlTSNSNIYLKGREPITFLSKRVGLDIK 300
E SN S. NIVLK R E VGL..K
Sbjct: 238 VTSEVDSNDEQIESSGNIVLKARQEACNKISELGLNLK 276

>gijl379l5SIpiPO7S3SIVGlO_BPPZA UPPER COLLAR PROTEIN (CONNECTOR PROTEIN) (LATE PROTEIN GP10)
>gij7S8S1IpirjI BP1 gene protein phage PZA
>gi1216059 (M11813) upper collar protein [Bacteriophage PZA]
Length = 309
Score = 43.8 bits (101), Expect = 0.002
Identities = 38/160, Positives = 75/160, Gaps = 13/160

Query: 150 YNSDIEI--IEHYTDELAEVALSRFSLIMQAKFSKIF--KSEINDESINQLVSEIYN 202
YN+D+ +E ELAEt. S+ A. tN S+ Q+
Sbjct: 122 YNNDMSFPTTPTLELFAAELAELK-EIISVNQNAQKTPVLIRANDNNQLSLKQVYNQYEG 180

Query: 203 GAPFVKHSPMFNADD-DIIDLTSNSVIPALTEMKREYQNKISELSNYLGINSLAVDKESG 261
AP L K N E+ LGI.+
Sbjct: 181 NAPVI FAHEALDSDS IEVFKTI3APYVVDKLNAQKNAVWN EMNTFLG IKNANLEKKER 237

Query: 262 VSDEEAKSNRGFTTSNSNIYLKGR-EPITFLSKRYGLDIK 300
.E SN *+LK R E YGLD+K
Sbjct: 238 MVTDEVSSNDEQIESSGTVFLKSREE.ACEKINELYGLDVK 277

>giI137914IspIP04332VC10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR PROTEIN) (LATE PROTEIN GP10)
>gij7S8S2jpirjjWNBPC9 gene protein phage phi-29
>giJ215328 (M14782) upper collar protein [Bacteriophage phi-29]
>giJ215340 (M12456) p10 connector protein [Bacteriophage phi-29]
>giJ224161jprf I11011232A protein p10,connector [Bacteriophage phi-29]
>giJ22536SiprfII1301270E gene [Bacteriophage phi-29]
Length = 309
Score = 41.4 bits, Expect = 0.009
Identities = 37/160, Positives = 75/160, Gaps = 13/160

Query: 150 YNSDIEI--IEKYTDELAEVALSRFSLIMQAKFSKIF--KSEINDESINQLVSEIYN 202
YN+D+ +E ELAE+ S+ A+ N S+ Q+
Sbjct: 122 YNNflMAFPTTPTLELFAAELAELK- ElISVNQNAQKTPVLIRANDNNQLSLKQVYNQYEG 180

Query: 203 GAPFVKI4SPMFNADD-DIIDLTSNSVIPALTEMKREYQNKISELSNYLGINSLAVDKESG 261
APt+ L K N E+ +LGI
Sbjct: 181 NAPVIPAHEALDSDSIEVKTDAPYVVDKLNAQKNAVWN ENMTFLGI IOANLEKKER 237

Query: 262 VSDEEAKSNRGFTTSNSNIYLKGR-EPITPLSKRYGLDIK 300
+E SN S+ +*LK R E YGL++K
Sbjct: 238 MVTDEVSSNDEQIESSGTVFLKSREEACEKINELYGLNVK 277

Query= pt1110878 44AHJDORF008 Phage 44AHJD ORF 13020-377512 1 (251 letters)

>giJ4982468igb1AAD30963.21 (AFli8i5i) SNF1/AMP-activated kinase [Dictyostelium discoideum]
Length = 718
Score = 52.3 bits (123), Expect = 3e-06
Identities = 28/118, Positives = 56/118, Gaps = 5/118

Query: 121 YLQSQGPTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYV- SLPQSEVNIDVDN 176
GF N ++SN +N N N+ T NN +N ++N
Sbjct: 382 FTTTTGFNPTNSNSISNNNNNNNNNNNNTTNNNNN'ITNNNNSIINNNNINNNNINNNNNN 441

Query: 177
TTLRFADNNTIDNGKTVNKSSNESNQNAKRNQNOKGNAKGTQFTKQYLID-NIDKAYD 233 +NN I+N N +N N N N N+ T++I +Y+ Sbjct; 442 NNNNINN NIINN 4141 4141 RJNNNNNNNNNNSSISGGTEVFSISPNLNNSYN 499 Score 37.5 bits ExpcLuL 0.054 Identities 17/111 Positives 45/111 Query: 130 HNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRPADNNTID1N 189 +N +N +N N P N+ Sbjct: 456 NNNf NNNNNIJNNNNNNNSSISGGTEVFSISPNLNNSYNSNSSGNSNGSNSNNNS 515 Query: 190 GKTVNKSSNESNQNAKRNQNQKGNAKCTQFTKQYLIDNIDKAYDLRKKILN 240 N +N +N N N N N N Sbj ct: 516 NNNTNNDNNNNNNNNNNNNNNCIDSVNNS JNEN1VNN 566 WO 00/32825 PCT/I B99/02040 299 Score 32.8 bits Expect 2.4 Identities 31/140 Positives 57/140 Gaps 14/140 Query: 109 LNVVYSSSEVEKYLQSQGFTEHNEDTTS NTDSTSNQNATSLDNSTGMTANRNAYVSL 165 LN S N +T N+ +N N N N Sbjct: 494 LNNSYNSNSSGNSNGSNSNNNSNN TNNDNINlf NNf~iNN CIDS 553 Query: 166 POSEVN--IDVDNflLRADNNTIDNKTVNKSS--------- NESNQNAKRNQNQKGNAX 215 +N DV.N.. dN D.G N N N N ON Sbj ct: 554 VNNSLNNENDVNNSNINNNNNNNSDDGSNNNSYEGGGDVLLLSDLNGNNQLGGNDNGNVV 613 Query: 216 GTQFTKQYLIDNIDKAYDLR 235 o L tD D++ Sbjct: 614 NLNNNFQ-LLNSLDLNSDIQ 632 Score 31.7 bits Expect 5.4 Identities 25/115 Positives 48/115 Gaps 10/115 Query: 130 HNEDTTSNTDETSNQNATSLDNST- -GMTAN-RNAYVSLPQSEVNIDVDNTTLRFADNN 185 +N N1+ *N N T N N.Y S S N. +N Sbjct: 462 NNNNNNNNNNNNNNNNNSSISGGTEVFSISPNLNNSYNS--NSSGNSNGSNSNNNSNNNr 519 Query: 186 TIDNGKTVNKSSNESNQNAKRNQNQKGNAKGTQFTKQYLIDNIDKAYDLRKKILN 240 flN N NHN N D+ +N Sbjct: 520 NNDN NNNNNNNNNNNfl flNNNNNNNNCIDSVNNSLNNENDVNNSNIN 570 Score 31.7 bits Expect 5.4 Identities 15/104 Positives 43/104 Query: 110 NVVYSSSEVEKYLQSQGFTEHNEDTTSNTrDETSNQNATSLDNSTGNTANRNAYVSLPQSE 169 N. 4N .N 4N N V Sbjct: 434 NINNNJIINNINNNIIN JNNN2OJNNNNNNNNNNNSSISGGTEVFSISPN 493 Query: 170 VNIDVDNTTLRFADNNTIDNGKTVNKSSNESNQNAKRNQNQKGN 213 +N .N N 4N N N N N Sbjct: 494 LNNSYNSNSSGNSNGSNSNNNSNNNTNNDNNNNNNNNNNNNNNN 537 Score 30.9 bits Expect 9.2 Identities -16/84 Positives 34/84 Query: 130 NNEIDrSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFADNNTIDN 189 4N *N +N N S. 
N +N Sbjct: 455 NRINNNNNNNNNNNNNNNNNNNNNSSISGGTEVFSISPNLNNSYNSNSSGNSNGSNSNNN 514 Query: 190 GKTVNKSSNESNQNAJCRNQNQKGN 213 N NHN N Sbjct: 515 SNNNTI flNn NNNNNNNNflJNNN 538 sgil17300771spIP181601KYK1_DICDI NON-RECEPTOR TYROSINE KINASE SPORE LYSIS A (TYROSINE-PROTEIN KINASE 1) >giI974334 (U32174) non-receptor tyrosine kinase [Dictyostelium discoideum] Length 1584 Score 46.5 bits (108), Expect 2e-04 Identities 29/106 Positives 48/106 Gaps 4/106 Query: 130 HNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID VDNTTLRFAnN -N 185 +NED+SN +N N N N N. .+NTT NN Sbjct: 442 NNEDISSNNINNNJNNNNNNNNNSNSSNTNNNNINNTTNNNNSNSN 501 Query: 186 lcfm._W5fUY~lfvwavlQ~~LD~y 221 .N N.+SN+N N NHN N TiC. I1+ D+ Sbjct: 502 NNNNNNNSNSNSNSNNNNINNNNNNNNNNNNIYLTKKPSIOSTDES 547 Score 34.0 bits Expect 1.1 Identities 20/117 Positives 46/117 (39%) Query: 87 NRQTVEAFGMQVITVCITNEDYLNVWYSSSEVEKYLQSQG-FTEHNEDrTSNTDETSNONA 146 N G IT T N N WO 00/32825 PCT/I B99/02040 300 Sbjct: 415 NNNNNIIGNGKIT2TTTTSTSPSSINNEDISSNNNN4NNNNNNNNNNNNNNNNNNN 474 Query: 147 TSLDNSTGI.rAX4RNAYVSLPQSEV14IDVDNTTLRFAl1NTIDNGKTV4KSS4ESNQN 203 N N N4+ +N N4+ +N4 N4++N4+N41N Sbjct: 475 NNNSNSSNTN1NINTTNNSNSNNW#14NSNS4SNSNNNINNNNNNNNN 531 Score 33.2 bits Expect 1.8 Identities 18/88 Positives 35/88 (39%) Query: 130 HNEDTTSNTDETS14Q1ATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFAD41TID01 189 +N +N41N T T S+ +E +N4 +NN1 +N4 Sbjct: 405 NNN14NSNNNINNNNINNIIG1GKITTTTTTSTSPSSINNNEDISSNNNNNNNNN41NN14N 464 Query: 190 GKTVN1CSSNES14Q1AKRNQNQKGNAKGT 217 N +N4N+ N T Sbjct: 465 NN14NNNNNNN1NNNS1SSNTNNNNINNT 492 Score 32.5 bits Expect 3.1 Identities 18/94 Positives 37/94 (39%) Query: 120 KYLQSQGFTEH14EDTTS14TDETS1QNATSLD1STGMTANRNAYVSLPQSEVNIDVD4flL 179 K +S N4+ +N41N +T S N D+ Sbjct: 392 KNV1STSILVPNGNN1NNNSNNNNNNNNNNIIGNGKITTTTTTSTSPSSINNEDISSNNN 451 Query: 180 RFAD14NTID14GKTVNKSS14ESNQNAKRNQ4QKGN 213 +NN +N N 14 N 14 +N N4 N Sbjct: 452 NN~l 444)NIIJN1NJSNSSNT4 485 Score 32.5 bits Expect 3.1 Identities 24/110 Positives 44/110 Gaps a 10/110 Query: 138 
TDETSNQ14ATSLD1STGMTANRNAYVSLPQSEVNIDVDNTTLRFADNN1TIDNGK 191 T .*4.N14N N +t1N +141 N Sbjct: 429 TTTTI'STSPSSINNNEDISS4N4NNNN1N41NN1NNNN NSS4TN1NN 488 Query: 192 TVNKSS14ES14Q1AKRNQ1QKG1AKGTQFTKQYLID1IDKAYDLRKK 237 T N+S411 141 N1NN 4+ +N LK Sbjct: 489 INNTTNNNNS14SIJ4)NNNNSNSNSNS14I41NINNNNNNNNNNNNIYLTKK 538 >siI37SB8S5texnbiCABlll4O.1I (Z98551) predicted using hexExon; MAL3PE.11 (PFCO76Oc), Hypothetical protein, len: 3395 aa (Plasmnodium falciparum) Length 3394 Score 46.5 bits (108), Expect a 2e-04 Identities 52/202 Positives 96/202 Gaps -32/202 Query: 21 F1EFV1D1KLTFYDDEQFMQKMLKFD -KDVLAIVNEKVFKGPSLKDELSDL- -LFKXSF 77 F +KT D+ Mi.K K D DV +I4EK++ L KK Sbj ct: 665 FEKYCSNIK14TLIRDD- -MKXFRKPDISDVHILH4EKIYLEKLL4EKL14YIKDIEKKLD 721 Query: 78 TIHPLDREINRQTVEAFGMQV--ITVCITHEDYLNVVYSSSEVEKYLQSQGFTEHNE 132 .H4+ IN4+ ++QV I V DY S +K +N4 Sbjct: 722 ELMGV 1IOKEKDIYILQVEKTLIKVISSVYDYTKE-SENIFKM4TTNICLNNV 777 Query: 133 DTTS14TDETSNQNIATSLD14STCMTA14R14AYVSLPQSEVNIDVD14TTLRFADNINTIDIIGKT 192 +814 D +14Q1 14.1 N N N4 +14 Sbjct: 778 HJ4SSNKDY-14NQ14NQNIENNQ4IENQ4----------NQNIE4---- NQ141ENN1QNN 820 Query: 193 VNKSS14ESNQ14AKRNQ14QKGNA 214 N t+14+Q14 14Q1 NA -bc:52--4 Score 33.6 bits Expect 1.4 Identities 46/221 Positives 89/221 Gaps 37/221 Query: 10 DPIKSELIKKGF4EFV14I31KLTFYDDEFQPQKILKFDKDVLAIVNEKVFKGFSLKDELS 69 D +K E K N4 Y M.K K VK SL Sbj ct: 367 DSLKIEYNKSKTNIQQL4EQLVNYK14FIKEHEKKYK QLVVKNNSLFSITH 416 Query: 70 D)LLFKKSFTIHFLDREINRQTVEAFGMQVITVCITH- EDYLNVVYSSSEVEKYLQSQG 126 WO 00/32825 PCT/I B99/02040 301 D K. 
I R I H ±De.L..V+Y L Sbjct: 417 DFIWLIWSNIIIIRRTSDMKI FKYNLDIEWFNEQDHLSVIY IyEILYNrN 468 Query: 127 FTEHNEDTTSWTDETSNQWATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLR'ADNNT 186 4W D +ND +N N +N4 N N N N Sbj Ct: 469 DIRNNNDNDNUNfNDNNNNNNNDNNNUNNNNNN---------WNNNYNN 1MM M 512 Query: 187 IDNGKTVNKSSUESNQNAKRNQNQKGNAXGTQFTKQYLIDN 227 I.W N 44 N N N N 4 1.N Sbj ct: 513 IEN4NSGNI4PWSNNLHNYRHNTNDENNLSSLKTSFRYKIN 553 Score 32.8 bits Expect 2.4 Identities 28/122 Positives 53/122 Gaps 2/122 Query: 119 EKYLQSQGFTERNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID -VDNT 177 E Y S N+ .DN. N N 4D +N Sbj ct: 2838 ENYPVSTHYDN DDIN1DNINNDNNNDNINDDNNNDNIWNDNNNDNINNDNINNDNINUfl 2897 Query: 178 TLRFADNNTIDNGKTVNKSSNESNQNAKRNQWQKGWAXGTQFTKQYLIDNIDKAYDLRKK 237 +NG SSN NK N.G YD K Sbj Ct: 2898 NNNfNNNDNSNNGFVCELSSNINfFNNILNVN-KDNFQGINKSNNFSTNLSEYNYDAYVK 2956 Query: 238 IL 239
I+
Sbjct: 2957 IV 2958 Score 32.5 bits Expect 3.1 Identities 46/249 Positives 101/249 Gaps 31/249 (12%) Query: 9 YDPIKSELIKKGFNEFVNDNKLTFYDDEFQPMQKNLKPDKDVLAIVNEKVFKGFSLKDEL 68 Y..K N N NK E Q+K+ K L++ Sbjct: 2150 YNYVK VQNATNREDNKNK ERNLSQEIYKYINENIDLTSELEKKNDMLENYK 2200 Query: 69 SDL LFKXSFTIHFLDREIRQTVEAFGMQVITVCITHEDYLNVVYSSSEVEKYL 122 +K I L M+ N E+ +L Sbj ct: 2201 NELKEKNEEIYKLNNDIDMLSNNCKKLKESINNMEKYKI IN--NNIQEKDEI IEWL 2255 Query: 123 QSQGPTEHNEDTTSNTDETSNQNATSLDNSTGMTAN RNAYVSLPQSE -VNIDV 174 +D +N N +S Sbj ct: 2256 KNK-YNNKLDDLINNYSVVDKS IVSCFEDSWIMS PSCNDILNVFNNLSKSNKKVCTNNDI 2314 Query: 175 DNTTLRFADNNTIDNGKTVNKSSNESNQNAKRNQWQKGNAKGTQFTKQYLIDNIDKAYDL 234 N +144 +N +N +NN NW N NK YL.+N+ D Sbj ct: 2315 CNENMDSI SSINNVWNINNVNNINNVNNINNVNNINNVKNIVDINNYLVNNLQLNKDN 2372 Query: 235 RKKILUEFD 243 I+ +F+ Sbjct: 2373 DNIIIIKPN 2381 Score 32.1 bits Expect 4.1 Identities 20/103 Positives 48/103 Gaps 2/103 Query: 115 SSEVEKYLQSQGFTERNEDr1'SNTDSTSNQN ATSLDNSTGMTANRNAYVSLPQSEVNI 172 +SKY EW N D +N.W L S E+ Sbj ct: 3264 NNDEEKYSCNDDKNEHTNNDLLNIDHDNNKNNITDELYSTYNVSVSHNKDPSNKENEIQN 3323 Query: 173 DVDNTTLRFADNNTIDNGKTVNKSSNESNQNAKRNQWQKGNAK 215 4 0 N +4 N K Sbjct: 3324 LISIDSSNENDENDENDEnDENENDEWDENflENDENDSK 3366 Score 30.9 bits Expect 9.2 identities 7/1 (22i), Positives 53/116+(44%), Gaps 15/118 (±2%i Query: 104 THSDYLNVVYSSSEV EKYLQSQGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANR 159 Tt DLN. 
E Y HNND +E QN S+D+S N Sbjct: 3280 TNNDLIIJIDKDNNKNNITDELYSTYNVSVSHNN)PSWKENEI--QNLISIDSSEWDEND 3337- Query: 160 NAYVSLPQSEVNIDVDWTTLRFAlNNTIDNGKVNKSSNESNQNAKRNQQKGNAKGT 217 N+ ODN N +E N N .GT Sbjct: 3338 EN- -DENDENDEWOEW ENDEWDENDENDEKDENDENDENDENFDNNNEGT 3386 WO 00/32825 PCT/I 899/02040 302 )9iI585795pIP21538REB..YE-AST DNA-BINDING PROTEIN REBi (QEP) >gi16261391pirlIS45907 DNA-binding protein REBl yeast (Saccharomyces cerevisiae) >gi15362801ernb1CA.A849921 (Z35918) ORF YBRO49C [Saccharonyces cerevisise) >si15599441ernb1CAA8639l1 (Z46260) REBi DNA-binding protein (Saccharomyces cerevisiae) Length 810 Score 45.7 bits (106), Expect 3e-04 identities 34/158 Positives 72/158 Gaps 14/158 Query: 83 DREINRQTVEAFGMQVITVCITHEDYLNVVYSSSEVEKYLQSQGFTEINEDTTSNTDETS 142 D. Ne...VE V Hi..s. K+ Q E D N S Sbjct: 7 DIGNANQESVEEAVLKYVGVGLDNQNHDPQLNTKDLENKHSKKQNIVESSSDVDVNNNnDS 66 Query: 143 NQNATSLDNSTGNTAURNAYVSLPQSEVN4IDVDNTTLRFADNNTID -NGKTVNKSSNE 199 U.N +.DS L -rE VD+ N +D N. *E Sbj ct: 67 NRNEDNNDDSENISA---LNANESSSNVDHANSNEQHNAVZ'DWYIJRQTAINQQDDE 119 Query: 200 SNQNAKRNQNQKCNAKGTQFTKQYLIDNIDKAYDLRKK 237 *i-N N GN +D D KK Sbjct: 120 DDEN--NNNTDNGNDSNNHFSQSDIV--VDDDDDONKK 153 ,gi1172372 (M458728) DNA-binding protein (Saccharomyces cerevisiae] Length 809 Score 45.7 bits (106), Expect 3e-04 identities 34/158 Positives 72/i58 Gaps 14/158 Query: 83 DREINRQTVEAFGMQVITVCITHEDYLNVVYSSSEVEKYLQSQGFTEHNEDTTSUTDETS 142 D. N VE +V H K+ Q E D N S Sbj ct: 7 DKNANQESVEEAVLKYVGVGLD)HQNHDPQLMTKDLENKHSKKQNIVESSNDVDVNNNDDS 66 Query: 143 NQNATSLDNST MTANRNAYrVSLPQSEVNIDVDNTTLRFAflNNTID NGKTVNKSSNE 199 U.N +A L +E VD+ N +D N. 
+E Sbj Ct: 67 NRNEDNNDDSENISA---LNANESSSNVDHANSNEQHNAVNDNYIJRQTAHNQQDDE 119 Query: 200 SNQNAKRNQNQKGNAKGTQFTKQYLIDUIIDKAYDLRKX 237 N GN +D D KK Sbjct: 120 DDE-N--NNNTDNGNDSNNNFSQSDIV--VDDDDDOIKK 153 >giJ2952545 (AF051898) coronin binding protein (Dictyosteliun discoideum] Length -560 Score 44.9 bits (104), Expect 6e-04 Identities 26/83 Positives 39/83 Gaps 5/83 Query: 131 NEDflSNTDETSNQNATSLDUSTGNTANRNAYVSLPQSEVNIDVDNTTLRFADNNTIDNG 190 N +N +N N.S NS +N4N+ P N D N T .NN'T N Sbj Ct: 404 NNNNNNNI INNNNSNSNSNNNSNN -NSNNNSNRNS PNIU4NNGDNDNNT NNNTNNNN 458 Query: 191 KTVNKSSNESUQNAKRNQNQKGN 213 N.+N+N N N41N N Sbjct: 459 NNNNNNNNNNNNNNNNNNNNNNN 481 Score 41.4 bits Expect 0.006 Identities 22/88 Positives 43/88 Gaps 6/88 Query: 130 HNEDTTSNTDETSUQNATSLDN STGF4TANRNAYVSLPQSEVNIDVDNTTLRFAflNNT 186 SN N, G P 4N ON .141 Sbjct: 337 NRNNSNNNSNNNSNNNSNNSNNRNITNGSNANKS- -NSPNNNLNTNNDNIGJNNSNNNNN 393 Query: 187 IDNGKTVNKSSNESNQNAKRNQNQKGNA 214 .1N SN N N MNN N.
Sbjct: 394 SNNNSNNGNSNNNNNNNIINNNNSNSNS 421 Score 40.6 bits Expect 0.011 Identities 24/101 Positives 41/101 Gaps =2/101 Query: 115 SSEVEKYLQSQGflEHNEDTTSNDrETSNQNATSLDNSTMTANRNAYVSLPQSEVNIDV 174 S. L N1 S NUN S U.+ WO 00/32825 PCTIB99/02040 303 Sbj ct: 370 SNSPNNNLNTNNDNKNNNSNNNNNSNNNSNNGNSNNNNNNNI INNNNSNSNSNNNSNNNS 429 Query: 175 DNTTLRFAflN--NTIDNGKTVNKSSNESNQNAKRNQNQKGN 213 tN +R +N N DN N +N N N N N Sbj tt: 430 NNNSNRNSPNHNNNGDNDNNTNNNTNNNNNNNNNNNNNNJN 470 Score 40.2 bits Expect 0.014 Identities 21/80 Positives 39/80 Gaps 9/80 (11%) Query: 130 HNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNIDVDNTTLRFANNrIDN 189 +N D +NT+ +N N N N N +AJDN+ Sbj ct: 442 NNGDNDNNTNNNTNNNNNNNNNNNNNNNNNNN---------- NNNNNNNNNNYADNSNNNS 492 Query: 190 GKTVNKSSNESNQJAKRNQN 209 N +SW +N N +N.N Sbjct: 493 SNSNNNNSNSNNNDNO4EN 512 Score 39.5 bits (90) Expect 0.024 identities 26/111 Positives 44/111 Gaps 20/111 (18%) Query: 112 VYSSSEVEKYLQSQ -GFrEHNEDTTSNTDETSNQNATSLDNSTGN'rANRNAYVSLPQSE 169 VY.+ K+ G +N SN N N N Sbj ct: 296 VYCTNHHTKFYETHRNGLLNNNNNSNNNSNSNSNNNNNGINNRNNSNNNSN----------- 346 Query: 170 VNIDVDNTITLRFADNNTID)NGKTVNKSS NESNQNAKRNQNQKGWA 214 N I NG NKS+ N +N N NWN N+ Sbj ct: 347 NNSNNNSNNSNNRNITNGSNANKSNSPNNNLNTNNDNKNNNSNNNNNS 394 Score a 37.5 bits (85) Expect 0.094 Identities 24/96 Positives 41/96 Gaps 1/96 Query: 124 SQGFTEHNEDTTSNTDETSNQNATSLDNSTGH- TANRNAYVSLPQSEVNIDVDNrTLRPA 182 S +N SN t* N DN+T T NN N +t+N Sbj ct: 421 SNNNSNNNSNNNSNRIS PNNNNNGDNDNNNNNNNNNNNNNNNNN 480 Query: 183 DNNTIDNGKTVNKSSNESNQNAKRRQNQKGNAKGTQ 218 +NN DN +SN +N N N +K Q Sbjct: 481 NNNYAD)NSNNNSSNSNNNNSNSNNNNDNKWENSDNQ 516 Score 35.6 bits (80) Expect 0.36 Identities 25/99 Positives 42/99 Gaps 18/99 (18%) Query: 130 IOEDTTSNTDETSNQWATSLDNST GTANRNAYVSLPQSEVNIDVDNT1TLRFADNNTID 188 SN +N N TG AN++ P ++DN +NN+ Sbj Ct: 339 NNSNNNSNNNSNNNSNNSNNRNITNGSNANKS NS PNNNLNTNNDNICNNNSNNNNNSN 395 Query: 189 NKSSNESNQNAKRNQNQKGN 213 N N SN N+ NWN N Sbjct: 396 NNSNNGNSNNNNNNNIINNNNSNSNSNNNSNNNSNNNSN 434 Score 35.2 bits 
Expect 0.47 Identities 21/94 Positives 42/94 Gaps 5/94 Query: 124 SQGFTEHNEDTTSNTDETSNQWATSLDNSTGI4TANRNAYVSLPQSEVNIDVDNTTLRFAD 183 +G T+N N N+ NWN+ N +N Sbj ct: 362 TNGSNANKSNSPNNNLNTNNflNIONNSNN--NNWSNNNSNNGNSNNNNNNNIINNNN 416 Query: 184 NNTIDNGKTVNKSSNESNQWAKRNQNQKGWAKGT 217 N N S+SN+N N N T' Sbjct: 417 SNSNSNNNSNNNSNNNSNRNSPNHNNNGDNDNNT 450 Score 35.2 bits Expect 0.47 Identities 29/118 Positives 53/118 Gaps 12/118 Query: 115 SSEVEKYLQS -QGFTEHNEDTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVNID 173 SS+ E +GF T+N D S+G V+ P.S+N Sbjct: 114 SSDSEADIEDDKGFQD- -KPITTNNSGSWNPLKNLKDYSSGSSGSSRSGVNQPRSNINNS 171 Query: 174 VDNTTLRFADNNT IDNGKTVNKSSNESWQNAKRNQNQKGNAKGTQFTKQ 222 D I1+ T NQN +NQWQ N Q +Q WO 00/32825 PCT/I 899/02040 304 Sbjct: 172 NDKYKSKSSSSNSNSSSSGGSLISSLLTGGNTYQNQNQNQNQNQNQNNNQSQLQQQQQ 229 Score 34.4 bits Expect 0.81 Identities 24/94 Positives 38/94 Gaps a 12/94 (12%) Query: 131 NEDTTSNTflETSNQNATSLDNSTGMTANRNAYVS LPQSEVNIDVONrTIJRFADNNTIDNG 190 N +T +N N N N S N N +NN+ N Sbjct: 451 NTNNNNNNNNNNNNNNYANSNNNSSNSN NNNSNSNN 504 Query: 191 KTVNKSSNESNQNAKR NQNQKGNAKGTQ 218 NK+ N NQ+ R ++NQK Q Sbj ct: 505 NNDNKNENSDNQSVLRSNEKFTDENQKNGSDDQQ 538 Score 33.6 bits Expect 1.4 Identities 22/90 Positives 35/90 (38%) Query: 124 SQGFTEHNlEDTTSNTDETSNQNATSLDNSTGMTANR24AYVSLPQSEVNIDVDNTTLRFAD 183 S N N+ N N N ++N Sbj ct: 353 SNNSNNRNII'NGSNANKSNSPNNNLNTNNDNIO(NNSNNNNNSNNNSNNGNSNNNNNNNI I 412 Query: 184 NNTII3NGKTVNKSSNESNQNAKRNQNQKGN 213 NNi N .NS+N SNN+ RN N Sbjct: 413 NNNNSNSNSNNNSNNNSNNNSNRNSPNIOIN 442 .giI535260IeibICAA82996i (Z30339) STARP antigen (Plasmodium reichenowi) Length 655 Score 44.5 bits (103), Expect 7e-04 Identities 31/114 Positives 47/114 Gaps 14/114 (12%) Query: 128 TEIOJEOTTSNTDETSNQNATSLDNSTGMTANRNAYVSLPQSEVN IDVDNTTLRF 181 T++N T TD A N N D +NT Sbj ct: 433 TDNNNTNTKATDSNNTNTKATDNNNTN'rKATDNNNTNTKATDNNNTNTKATnNNNTNTKA 492 Query: 182 ADNNTI--DNGKTVNKSSNESNQNAKRNQNQKGNAKGT QFTKQYLIDW 227 I3NN ON T +N NK N N KT T QY+ N Shi ct: 493 
TDNNNTNTKATDNNNTNTKATDNNNTNTKATDNNNTNTKATDNNNNTNQYVPAN 546 Score 44.5 bits (103) Expect a 7e-04 Identities 30/103 Positives 44/103 Gaps 13/103 (12%) Query: 128 TEHNEDTTSNTOETSNQNATSLDNS- -TGMTANRNAYVSLPQSEVN -IDVDNTTL 179 T++N T N +014+ T T NN S D +NT Sbj ct: 401 TDNNNTDTKATDKSNNTDTKATDNNNNTDTKATDNNNTNTKATD)SNNTNTKATDNNNTNT 460 Query: 180 RFADNNTI--DNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 OIN ONT NK N N KT Sbj ct: 461 KATDNNNTNTKATDNNNTNTKAT3NNNTNTKATDNNNTNTKAT 503 Score a 42.6 bits Expect 0.003 identities a 27/96 Positives a 43/96 Gaps 10/96 Query: 128 TEHNEDTITSNTDETSNQNATSLD -NSTGMTANRNAYVSLPQSEVNIDVDNTTLRFAlNNT 186 T++N +T +NW +O0N+T A N r+N NTr DNN4 Sbj ct: 422 TONNNNTDTKATDNNNTNTKATDSNWTNTKATDNNNTNTKATDNN WTNTKATDNNN 477 Query: 187 1--DNGKTVNKSSWESNQNAKRNQWQKGNAKGT 217 ON T' N NK N N K T Sbj ct: 478 TNTKATDNNNTNTKA'rDNNNTNTKATDNNNTNTKAT 513 Score 41.8 bits Expect =0.005 Identities a 35/150 Positives 59/150 Gaps a 9/150 Query: 85 EIWRQTVEAFGMQVITVCITHEDYIJNVVYSSSEVEKYLQSQGFTEmIEDTTSNTDETSNQ 144 E N+ G T+ N E +Q T +N TT+ N Sbj Ct: 118 ETNKTNIKLTGNNSTTINTNLTENTNA -TKKLTENVITNQILTGNNNTTTNTSSTEmJN 175 Query: 145 S ATSLDNSTGMTANRNAYVSLPQSEVNIDVONrrLRFANNTIDNGKTVNKSSWESNQNA 204 N NSTG T4 NI +N L +N T+ T +N41N+ WO 00/32825 PCT/I199/02040 305 Sbjcr: 176 NINTNTNSTGTSTTKCLTE NI-ITNQILTGNNNTTTNTSSTENlNINTNTNS 228 Query: 205 KRNQNQKGNAKGTQFTKQYLIDNIDKAYDL 234 N N N T DNI+ +L Sbjct: 229 TDNSNTNTNLTDITTTTKKNTDNINTTQNL 258 Score 41.8 bits Expect 0.005 Identities 30/101 Positives 43/101 Gaps 13/101 (12%) Query: 130 HNEDTSNTfETSNQNATSLDNS -TGMTANRNAYVSLPQSEVNIDV DNTTLRFA 182 +N DT S 4+ AT DN+ T TNN ND +NT Sbjct: 363 NNTDTISTDNDNTDTKATDNDNTDTKATDNNNNTDTKATDNNNTDTKATDKSNNTDTKAT 422 Query: 183 DNN------TIDNGK1VNKSSNESNQNAKPNQNQKGNAKGT 217 DNN SN T +N N K N N K T Sbjct: 423 DNNNNTDTKATDNNNTKATDSNNTN''KATDNNNTNTKAT 463 Score 40.6 bits Expect 0.011 Identities 31/121 Positives 47/121 Gaps 31/121 Query: 128 TEHNEDTTSNTDETSNQNAT SLDNSTGMTANRNAYVSLPQSEVN-------------171 TEHN +NT+ TN T 4+ +T N 
N .E N Sbj ct: 171 TEIS4NNINTNTNSTGNTSTTKKLTENI ITNQILTCNNNTTTNTSSTEKNNNINTNTNSTD 230 Query: NTIDNGKTVNKSSNESNQNAKRNQNQKGNAXG 216 D+ TT DN T N TV+ +N +N N K N N K Sbjct: 231 NSNTNTNLTDITTITKXWTDNINTTQNLTTSTNTTTVSTDNNNNNINTKPTDNNNTNIKS 290 Query: 217 T 217
T
Sbjct: 291 T 291 Score 38.3 bits Expect 0.055 Identities 28/98 Positives 41/98 Gaps 10/98 Query: 128 TENNEDTTSNTSETSNQNATSLDNSTGMTANRNAYVSLPQSEVNISVD-NTTLRFADNNT 186 TEHN+ +NT+ S NT+T N+ NTT DNN SbjCt: 216 TEHNNNINTNTN--STDNSNTNTNLTDITTTTKKWTDNINTTQNLTTSTNTTIVSTDNNN 273 Query: 187---------IDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 DN T KS++ N K N+ K T Sbjct: 274 NNINTKPTDNNNTNIKSTDNYNTGTKETDWKNTDIKAT 311 Score 37.5 bits Expect 0.094 Identities 31/106 Positives 45/106 Gaps 18/106 (16%) Query: 128 TENNEDTTSNTDETSNQN- -ATSLDNSTGMTANRNAYVSLPQSEVN IDVDN 176 T+.N +T +T T N N AT N+T A N N D +N Sbjct: 390 TDNNNNT- -DTKATDNNNTDTKATDKSNNTDTKATDNNNNTDTKATDNNNTNTKA1DSNN 447 Query: 177 TTLRFANN-----TIDNGKTVNKSSNESNQNAKRNQNQKGNAKGT 217 T DNN DN T N K N N K T Sbj ct: 448 TNTKATDNNNTNTKATDNNNTNTKATDNNNTNTKATDNNNTNTKA 493 Score 35.2 bits Expect 0.47 Identities 24/109 Positives 46/109 Gaps 6/109 Query: 128 TEHNEOTTSTDETSNQNATSLDNSTGTANRNAYVSLPQSEV IDVDNTTLRF 181 TL++N T TV A N I N DnI;-.
Sbjct: 473 TDNNNTNTKATDNNNTNTKATDNNNTNTKATDNNNTNTKATDNNNTWKATDNNNTNTKA 532 Query: 182 ADNNTIDNGKTVNKSSNESWQNAKIRNQNQKGNAKGTQPTKQYLIDNIDK 230 DNN N K K +DK Sbjct: 533 TDNNNNTNQYVFANNYDSTTSDDKLNKDSCDNSEEKEWIKSMINAYLDK 581 Score 34.4 bits Expect 0.81 Identities 26/126 Positives 46/126 Gaps 7/126 WO 00/32825 PCT/I B99/02040 306 Query: 99 ITVCITHEDYLNVVYSSSEVEKYLQSOGFTEHNEDTTSNTDETSNQNATSLDNSTGNTAN 158 IT T Sty S T' TIN N Sbjct: 318 ITTDNTNTNVISTDNSKTNVISKDNSNTHTISTDNSKTNVISTDNNNTDTISTDNlNTDT 377 Query: 159 RNAYVSLPQSEVNIDVDNTTLRFAnNNTID---NGKTVNKSSNESNONAKRNQNQK 211 +NT DNN D N N N +K N Sbj ct: 378 KATDUDNTDTKATDNNNNTDTKATDNNNTDTKATDKSNNTDTKATDNNNNTDTKATDNNN 437 Query: 212 GNAKGT 217 N K T Sbjct: 438 TNTKAT 443 Score 34.4 bits Expect 0.81 Identities 30/100 Positives 44/100 Gaps 14/100 (14%) Query: 131 NEDTTSNTDETSNQNATSLDNS-TGNTANRNAY- -VSLPQSEVNI- -DVDNTTLRFAD 183 N +T TD TN N S DNS T Nt 43 S+ t+ D NT D Sbjct: 313 NNNITITTDNT-NTNVISTDNSKTNVISKDNSNTHTISTDNSKTNVISTDNNNTDTISTD 371 Query: 184 NNTIDNGKTVNKSS NESNQNAKRNQNQKGNAXGT 217 N+ D T N N+N +IK N KT Sbj Ct: 372 NDNTDTKATDNDNTDTKATDNNNNTDTKATDNNNTflTKAT 411 Score 34.4 bits Expect 0.81 Identities 28/101 Positives 41/101 Gaps 15/101 (14%) Query: 131 NSDTTSNTDETSNQNATSLDNSTGMTA NRNAYVSLPQSEVNIDV DNTTLRFA 182 N DT. 
AT +N4.2 A NN N D +NT Sbj ct: 374 NTDTKATDNDNTDTKATDNNNNTDTKATDNNNTDTKATDKSNNTDTKATDNNNNTDTKAT 433 Query: 183 DNNTIDNGK TVNKSSNESNQNAKRNQNQCGNAKGT 217 DNN N K T NK N NK T Sbj ct: 434 DNNN- TNTKATDSNNTNTKATDNNNTNKATDNNNTNTKAT 473 Score 32.5 bits Expect 3.1 Identities 30/110 Positives 40/110 Gaps 23/110 Query: 131 NEDTTSNTDETSNQNATSLDNS--TGMTANRNAYVSLPQS- EVNIDVDNTTLRF 181 N14.flN S DNt T T NN D NT Sbj ct: 251 NINTTQNLV1'STNTTTVSTDNNNNNINTKPTDNNNTNI KSTDNYNTGTKETDNKNTDI KA 310 Query: 182 DNGKTVNKSSNESNQNAKRNQNQKGNAKGT 211 DNN I DN KT S S4+ N KN T' Sbjct: 311 2DNNNITITTNTNTNVIS2DNSKTNVISKDNSNTHTISTDNSKTNVIST 360 >.gi114292401emb1CAA676591 (X99260) lower collar protein [Bacteriophage B1031 Length 293 Score 43.8 bits (101), Expect 0.001 Identities 53/204 Positives 79/204 Gaps 42/204 Query: 56 EKVFKG FSLKELSDLLFKCSF'IHFLD REINRQTVEAFGMQVITVCITHED 107 EK. KG F K F HF. REI +T F T 1+ Sbj ct: 26 EKIEKGRPKLFDFQYPIFDESYRKVFETHFIRNFYMREIGFETEGLFKFNLETWLI IN?4P Query: 108 YLNVVYSSSEVEKY---LQSQGFTEN NED2T- SNTDETSNQNA 146 Y S E.KY L +G N DTT 5142 NA Sbjct: 86 YFNKLFES-ELIKYDPLENTRLT'GNKKNDTERNDNRDTTGSMKAlGKSNTKTSDKTNA 144 Query: 147 TSLDNSTGN'A NRNAYVSLPQSEVNIDVDN- -TTLRFADNN2'IDNGKTVNKS 196 Sbjct: 145 TGSSKEDGKTTGSVTDDNFNRKIDSDQPDSRLNLTTN3GQGTLEYA- -SAIEENN2'NNKR 202 Query: 197 SNESNQNAKRNQNQKGNAKGTQFT 220 N G2' Sbjct: 203 NTTGTNNVTSSAESESTGSG'SDT 226 Query= pt1110879 44AHJDORFOO9 Phage 44AH4JD ORF 15744-649612 1 (250 letters) WO 00/32825 PCT/I B99/02040 307 >gil27649811ernb1CAA69021.1I (Y07739) N-acetylrnuramoyl-L-alanine arnidase [Staphylococcus phage Twort) Length -467 Score 180 bits (452), Expect le-44 Identities 89/157 Positives 109/157 Gaps 8/157 Query: I. 
MKSQQQAXEWIYKJIEGAGVDFDGAYGFQCMDLSVAYVY-YITDGKVRNWGNAIWAINNDFK MK* +QA+ +I G DFDG YC+QC~ML+V Y+Y4-+TDGK+RJ4WGNAKDAINN F Sbj ct: 1 MKTLKQAESYIKSKVNTGTDFDGLYGYQCMDLAVDYIYHVTDGKIRNWGNAKDAINNS EG Query: 61 GLATVYICNTPSFKPQLGDVAVYTNGQ- YGHIQCVLS GNLDYYTCLEQNWLGGGF 113 G ATVYKN P+F.P+ GDV V.TG YGHI V G+L Y TLEQNW GG Sbj ct: 61 GTATVYKJYPAFRPKYGDVVVNTTGNFATYGMIAIVTNPDPYGDLQYVYLEQWNGNGI 120 Query: 114 DGWEKATIRTHYYflGVTMFIRPKFSGSNS-KALETSK 149 E ATIRT) Y G+THFIRP F+ +S K +T K Sbjct: 121 YKTELATIRTHD)YTGITHFIRPNFATESSVKXKDTKK 157 Score 61.7 bits (147), Expect 6e-09 Identities 41/125 Positives 57/125 Gaps 8/125 Query: 125 YYDGVTHFIRPKFSGSNSKALETSKVNTFGKWKRNQYGTYYRNENGTFTC -GFLPIFARV 183 YY+G T P +K +T GW N YGTYY+tE+ TF C I R Sbj ct: 346 YYEGKTPV- -PTVVNQKAKTKPVKQSSTS WNVNNYGTYYKSESATFKCTARQGIVTRY 402 Query: 184 OS PKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNWQGTR -YYLPVRQWNGKTGNSYSV 242 P +tP Y+ VC DGYVWI 0 ++PVR W+ N+ Sbj ct: 403 TGPFTTCPQAGVLYYGQSVrYDTVCKQDGYVWISWTTNGGQDVWPVRTWD flflDIM 459 Query: 243 GIPWG 247 G WO Sbjct: 460 GQLWG 464 >gi 11136751 I P245561 ALYS_STAAU AUTOLYSIN (N-ACETYLMURARIOYL-L-ALANINE At4IDASE) >gil 798871pir1 jJQ1147 N-acetylmurarnoyl-L-alanine arnidase (EC 3.5.1.28) Staphylococcus aureus >gi1153067 (M76714) peptidoglycan hydrolase [Staphylococcus aureus] Length 481 Score 118 bits (292), Expect 6e-26 Identities 56/117 Positives 68/117 Gaps 1/117 Query: 135 PKFSGSNSKALETSKVNTFCK- WKRNQYGTYYRNENGTFTCGFLPI FARVGSPKLSEPNG 193 P SN WKRN+YGTYY E+ FT G PI R P LSPCG Ebjct: 365 PVATVSNESSASSN'TVKPVASAWKRNKYGTYYMEESARFTNCNQPITVRKVGPFLSCPVG 424 Query: 194 YWFQPNGYTPYNEVCLSDGYVWIGYNWQGTRYYLPVRQWNGKTGNSYSVGI PWGVFS 250 Y FQP GY Y EV LDG+VW+GY W+G RYYLP+R WNG +G WG S Sbjct: 425 YQFQPGGYCDrTEVML4QDGHVWVGYTWEGQRYYLPIRTWNGSAPPNQILGDLWGEIS 481 Score 78.0 bits (189), Expect 7e-14 Identities 48/109 Positives 62/109 Gaps 6/109 Query: 15 EGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRWGNAKDA -INNDFKGLATVYKNTPSFK 73 EQ D YGFQC D +A 0 AXD N+F GLATVY+WrP F Sti ct: 18 EGKQFNVDLWYGFQCFDYANAG 
-WKVLFGLLLKGLGAKDIPFANNFDGLATVYQNTPDFL 76 Query: 74 PQLGDVAVYTNGQ---YGHIQCVLSGNLDYYTCLEQNWLGGGF-DGWEK 118 Q GD+ YGH+ V+ LDY EQNWLGGG+ DG E+ Sbjct: 77 AQPGDMVVFGSNYGAGYGHVAWVIEATLDYIIVYEQNWLGGGWTDGIEQ 125 >gi|1763243 (U72397) amidase [bacteriophage 80 alpha] Length 481 Score 118 bits (292), Expect 6e-26 Identities 56/117 Positives 68/117 Gaps 1/117 Query: 135 PKFSGSNSKALETSKVNTFGK-WKRNQYGTYYRNENGTFTCGFLPIFARVGSPKLSEPNG 193 P SN WKRN+YGTYY E+ FT G PI R P LS P G Sbjct: 365 PVATVSNESSASSNTVKPVASAWKRNKYGTYYMEESARFTNGNQPITVRKVGPFLSCPVG 424 Query: 194 YWFQPNGYTPYNEVCLSDGYVWIGYNWQGTRYYLPVRQWNGKTGNSYSVGIPWGVFS 250 Y FQP GY Y EV L DG+VW+GY W+G RYYLP+R WNG +G WG S Sbjct: 425 YQFQPGGYCDYTEVMLQDGHVWVGYTWEGQRYYLPIRTWNGSAPPNQILGDLWGEIS 481 Score 83.5 bits (203), Expect 2e-15 Identities 50/115 Positives 65/115 Gaps 6/115 Query: 9 EWIYKHEGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRNWGNAKDA-INNDFKGLATVYK 67 EW+ EG D YGFQC D +A G AKD N+F GLATVY+
Sbjct: 12 EWLKTSEGKQFNVDLWYGFQCFDYANAG-WKVLFGLLLKGLGAKDIPFANNFDGLATVYQ 70 Query: 68 NTPSFKPQLGDVAVYTNGQ---YGHIQCVLSGNLDYYTCLEQNWLGGGF-DGWEK 118 NTP F Q GD+ V+ YGH+ V+ LDY EQNWLGGG+ DG E+ Sbjct: 71 NTPDFLAQPGDMVVFGSNYGAGYGHVAWVIEATLDYIIVYEQNWLGGGWTDGIEQ 125 >gi|4574237|gb|AAD23962.1|AF106851_1 (AF106851) LytN [Staphylococcus aureus] Length 383 Score 84.3 bits (205), Expect 9e-16 Identities 48/128 Positives 68/128 Gaps 7/128 Query: 15 EGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDAINNDFKGLATVYKNTPSFKP 74 E G DFDG+YG+QC DL Y +G N+F A +Y NTP+FK Sbjct: 252 ENRGWDFDGSYGWQCFDLVNVYWNHLYGHGLKGYGAKDIPYANNFNSEAKIYHNTPTFKA 311 Query: 75 QLGDVAVYT- -NGQYGHIQCVLSGNLD- -YYTCLEQNWLGGGFDGWEKATIRTHYYD 127 GD+ G YGH VL+G+ D L+QNW GG+ E A H Y+ Sbjct: 312 EPGDLVVFSGRFGGGYGHTAIVLNGDYDGKLKFQSLDQNWNNGGWRKAEVAHCVVHNYE 371 Query: 128 GVTHFIRP 135
FIRP
Sbjct: 372 NDMIFIRP 379 >gi|3767593|dbj|BAA33856.1| (ABOl5iSS) LytN [Staphylococcus aureus] Length 383 Score 84.3 bits (205), Expect 9e-16 Identities 48/128 Positives 68/128 Gaps 7/128 Query: 15 EGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDAINNDFKGLATVYKNTPSFKP 74 E G DFDG+YG+QC DL Y +G N+F A +Y NTP+FK Sbjct: 252 ENRGWDFDGSYGWQCFDLVNVYWNHLYGHGLKGYGAKDIPYANNFNSEAKIYHNTPTFKA 311 Query: 75 QLGDVAVYT- -NGQYGHIQCVLSGNLD -YYTCLEQNWLGGGFDGWEKATIRTHYYD 127 GD+ G YGH VL+G+ D L+QNW GG+ E A H Y+ Sbjct: 312 EPGDLVVFSGRFGGGYGHTAIVLNGDYDGKLKFQSLDQNWNNGGWRKAEVAHKVVHNYE 371 Query: 128 GVTHFIRP 135
FIRP
Sbjct: 372 NDMIFIRP 379 >gi127649831emb10AA69022.lI (Y07740) cell wall hydrolase Ply187 (Staphylococcus phage 187] Length 628 Score 76.9 bits (186), Expect a 2e-13 Identities 50/144 Positives 68/144 Gaps 18/144 (12%) Query: 5 QQAKEWIYKHEGACVDFDCAYCFQCMDLSVAYVYYITDGKVRMW C--NAKDAINNDF 59 Sbj Ct: 12 KQVVDWAINLICSGVDVDGYYCRQCWDLP -NYI FN RYWNFKTPCNARDMAWYRY 64 Query: 60 KCLATVYICNTPSFKPQLGDVAVYTNCQY -C-HIQCVLS-GNLDYYTCLEQNWLGCCF 113-- V++NT F Pi- CD.AV.T C Y GH V+ Y+ 4+QNW Sbjct: 65 PECFKVFRNTSDFVPKPCDIAVNTCNYNWNTWCHTGIVVCPSTKSYFYSVDQNWNNSNS 124 Query: 114 DCWEICATIRTHYYDCVTHFIRPCF 137 A H YCVTHF+RP Sbjct: 125 YVCSPAAKIKJ4SYFCVTHFVRPAY 148 WO 00/32825 PCT/I B99/02040 309 sgil32877321 splOsiss IALELSTACP GLYCYL-GLYCINE ENDOPEPTIDASE ALE-i PRECURSOR >gill890068ldbjlBAAl3O69l (D86328) ALE-i [Staphylococcus capitis) Length a 362 Score 73.4 bits Expect 2e-12 identities 47/117 Positives 61/117 Gaps 10/117 Query: 132 FIRPKFSGSNSKALETSKVNTFGKWKRNQYGTYYRNENGTFTCGFLPIFARVGSPKLSEP 191 05145 TS N4 0 +K N+YGT +FT I R+ P S P Sbjct: 252 FLKSAGYGSNS TSSSNNNG-YKTNCYGTLYKSESASFTAN-TDIITRLTGPFRS4P 305 Query: 192 NGYWFQPNGYTPYNEVCLSDGYVWIGYNW -QGThYYLPVRQWNGKTGNSYS VGlPWG 247 Y+EV DG.VW.GYN 0 R YLPVR WN TO +G 140 Sbjct: 306 QSGVL.RKGLTIKYDEVMKQDGHVWVGYNTNSGKRVYLPVRTWNESTG- -ELGPLWG 359 ,gil7992S1pirlI1A25881 lysostaphin precursor Staphylococcus simulans >gif153047 (M15686) lysostaphin (ttg start codon) (Staphylococcus simulans] Length 389 Score 69.5 bits (167). 
Expect 3e-11 Identities 48/133 Positives 62/133 Gaps 20/133 Query: 131 HFIRPKFSGSNSKALETS WKRNQYGTYYRNENGTFTCG 175 HF R S SNS A K +0K WK N+YGT +FT Sbj ct: 258 HFQRI4VNSFSNSTAQDPMPFLKSAGYGKAGGTVTPTPNTGWKTNKYGTLYKSESASFTPN 317 Query: 176 FLPIFARVGSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNG 234 I R P S p Y+EV DG+VW+GY 0 R YLPVR WN1 Sbjct: 318 -TDIITRTTGPFRSMPQSGVLKAGQTIHYDEVMKQ)GHVWVGYTGNSGQRIYLPVRTWNK 376 Query: 235 KTGNSYSVGIPWG 247 T WG Sbjct: 377 STN TLGVLWG 386 >gil 1264961 spI P10548; LSTP_-STAST LYSOSTAPHIN PRECURSOR (GLYCYL-GLYCINE ENDOPEPTIDASE) >gilI799271pirl S01079 lysoataphin precursor Staphylococcus simulans by.
staphylolyticus >gi|581744|emb|CAA29494| (X06121) lysostaphin (AA 1-480) [Staphylococcus simulans bv.
staphylolyticus] Length 480 Score 69.5 bits (167), Expect 3e-11 Identities 48/133 Positives 62/133 Gaps 20/133 Query: 131 HFIRPKFSGSNSKALETS WKRNQYGTYYRNENGTFTCG 175 HF R S SNS A+ K +GK WK N+YGT +FT Sbjct: 349 HFQRMVNSFSNSTAQDPMPFLKSAGYGKAGGTVTPTPNTGWKTNKYGTLYKSESASFTPN 408 Query: 176 FLPIFARVGSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNG 234 I R P S P Y+EV DG+VW+GY G R YLPVR WN Sbjct: 409 -TDIITRTTGPFRSMPQSGVLKAGQTIHYDEVMKQDGHVWVGYTGNSGQRIYLPVRTWNK 467 Query: 235 KTGNSYSVGIPWG 247 T WG Sbjct: 468 STN---TLGVLWG 477 >gi|3287967|sp|P10547|LSTP_STASI LYSOSTAPHIN PRECURSOR (GLYCYL-GLYCINE ENDOPEPTIDASE) >gi|2072411 (U66883) lysostaphin [Staphylococcus simulans] Length 493 Score 69.5 bits (167), Expect 3e-11 Identities 48/133 Positives 62/133 Gaps 20/133 Query: 131 HFIRPKFSGSNSKALETS WKRNQYGTYYRNENGTFTCG 175 HF R S SNS A K +GK WK N+YGT +FT Sbjct: 362 HFQRMVNSFSNSTAQDPMPFLKSAGYGKAGGTVTPTPNTGWKTNKYGTLYKSESASFTPN 421 Query: 176 FLPIFARVGSPKLSEPNGYWFQPNGYTPYNEVCLSDGYVWIGYNW-QGTRYYLPVRQWNG 234
I R P S P Y+EV DG+VW+GY G R YLPVR WN Sbjct: 422 -TDIITRTTGPFRSMPQSGVLKAGQTIHYDEVMKQDGHVWVGYTGNSGQRIYLPVRTWNK 480 Query: 235 KTGNSYSVGIPWG 247 T WG Sbjct: 481 STN---TLGVLWG 490 >gi|3341932|dbj|BAA31898.1| (AB009866) amidase (peptidoglycan hydrolase) [bacteriophage phi PVL] Length 484 Score 68.3 bits (164), Expect 6e-11 Identities 52/150 Positives 71/150 Gaps 17/150 (11%) Query: 3 SQQQAKEWIYKHEGAGVDFDGAYGFQCMDLSVAYVYYITDGKVRMWGNAKDAINNDFKGL 62 QA++W G D YGFQCOD+ +4+1 G+ R+ G I D K Sbjct: 4 TIO4QAEKWFDNSLGKQFNPDLFYGFQCYDYASMF-FMIATGE-RLQGLYAYNIPFDNKAR 61 Query: 63 ATVY- -KNTPSFKPQLGDVAVYTN- -GQYGHIQCVLSGNLDYYTCLEQNWLGGGF- 113 Y 104 SF PQ Dt V+ G GH+ V SHML+T QNW GG+ Sbjct: 62 IEKYCQI IKNOYDSFLPQKLDIVVFPSKYGGGAGHVEIVESANLNTFTSFGQNWNGKGWTN 121 Query: 114 -DGW--EKATIRTHYYDGVTHFIRPKF 137 GW E T HYYD +FIR F Sbjct: 122 GVAQPGWGPETVTRHVHYYDDPMYFIRLNF 151 Query= pt|110882 44AHJDORF012 Phage 44AHJD ORF|8391-8813|3 1 (140 letters) >gi|140528|sp|P24811|YQXH_BACSU HYPOTHETICAL 15.7 KD PROTEIN IN SPOIIIC-CWLA INTERGENIC REGION (ORF2) >gi|322189|pir||B44816 orf2 5' of autolytic amidase Bacillus subtilis >gi|142801 (M59232) open reading frame 2 [Bacillus subtilis] >gi|1217874|dbj|BAA06959| (D32216) ORF121 [Bacillus subtilis] >gi|1303767|dbj|BAA12423| (D84432) YqdD [Bacillus subtilis] >gi|2635036|emb|CAB14532| (Z99117) alternate gene name: yqdD; similar to holin [Bacillus subtilis] Length 140 Score 80.4 bits (195), Expect 6e-15 Identities 45/130 Positives 67/130 (50%) Gaps 3/130 Query: 4 VKFRFTDSEAFHHFIYAGOLKLLYFLFVLNFVDI ITGISKAIKNNNLWSKKSMRGFSKKX 63 0 F G +K L L VL +0++TGt KA K L S-s+ G+ +K Sbjct: 8 INFETL3LARVYLF -GGVKYLDLLLVLS II DVLTGVI KAWKFKXLRSRSAWFGYVRKI, 64 Query: 64 YYYXXXXYYYYYYYYYXKGGLLITIFYYIANEGLS IVENCAEMDVLVPEOIKDKLRVI 123 G L T+ *YIANEGLSI EN V +P I D*L+ I Sbjct: 65 LNFFAVILANVIDTVLNLNGVLTFGTVLFYIANELSITENLAQIGVKIPSSITDR.LQTI 124 Query: 124 IO4OTEKSDNN 133 +N4+ E+S NN Sbjct: 125 ENEKEQSIO4N 134 >gi|4126631|dbj|BAA36651.1|
(AB016282) ORF45 [bacteriophage phi-105]
Length 135
Score 76.1 bits (184), Expect 1e-13
Identities 44/115 Positives 61/115 Gaps 4/115

Query: 21 GDLKLLYFLFVLHFVDI ITGISKAIONNLWSKKSMRGFSOCYYYYYYYYYYYYYYYYYY
G+.K L VL +DIITG. KA K L G+ +K
Sbjct: 17 GEVKYLDLNLVLNIIDI ITGVI KAWKPKELRSRSAWFGYVRKILSFLVVIVANAIDTIMD 76

Query: 81 XKGGLLtITIFYYIANEGLSIVENCAEMDVLVpEQIKDKLRVIKND- TEKSD 131
G L T+ .YIANEGLSI EN V +P I D+L VI++D TEXK D
Sbjct: 77 LNGVLTFATVLFYIANEGLSITE-NLAQIGVKI PAVITORLHVIESDN3QKTEKDD 131

>giJ1410s8jspP268351YNGcrLOPE HYPOTHETICAL 14.9 KD PROTEIN IN NAGH 3'REGION (ORFO)
>gijl0759671pirlIS43905 hypothetical protein D Clostridium perfringens
>giJ455154 (M81878) ORF D [Clostridium perfringens]
Length 132
Score 60.9 bits (145) Expect 4e-09
Identities 38/127 Positives 63/127 Gaps 3/127

Query: 1 tOJEVKFRFTDSE-AFHNFIY-AGDLKLLYFLFVLNFVDI ITGISKAIJNNLWSKKSMRGF 59
+N A D+ L+ L V +F+0 +TG+ K K+ L S +RG
Sbjct: 5 INYIKWGIVSLGTLFTWIFGAWDIPLITLL-VFIFLYLTGVIKGCKSKELCSNIGLRGI 63

Query: 60 SI'KrYYYYYYYYYYYYYYYYYYKGGLLMITI -FYYIANEGLSIVENCAEMDVLVPEQIKD 118
+KK I ++YI NEG+SI+ENCA V +PE+±K
Sbjct: 64 TKXGLILVVLLVAVNLDRLLDNGTWFRTLIAYFYI'SEOISILENCAALGVPIPEKLKQ 123

Query: 119 KLRVIIQ4 125
L+ N
Sbjct: 124 ALKQLNN 130

>gil2293160 (AF008220) YtkC [Bacillus subtilis]
>giJ2635S481entICAB1S0421 (Z99119) similar to autolytic amidase [Bacillus subtilis]
Length 134
Score 36.4 bits Expect 0.099
Identities 25/109 Positives 41/109 (36%)

Query: 17 FIYAGOLKLLYFLFVLMFVDIITGISKAIKNNNLWSKKSMRGFSWWYYYYYYYYYYYYYY 76
F G L LM4 I+ K L KK K
Sbjct: 20 FFFGGFQYSFLILLSLMAIEFISTTLKETIIHKLSFKKVFARLVKKLVTLALISVCHFFD 79

Query: 77 XXXXXKCGLLITI FYYIANEGLS IVENCAEMDVLVPEQIKDKLRVIKNQ 125
+G I +YI E +IV L .e1G
Sbjct: 80 QLLNTQGSIRDLaAIMFYILYESVQIVVTASSLGIPVPQLVDLLETLGJ 128

>gilll8l9731emb1CAA87743.11 (Z47794) holin protein [Bacteriophage Cp-1]
Length 134
Score 31.3 bits Expect 3.3
Identities 27/88 Positives 36/88 Gaps 5/88

Query: 29
LFVLlFVDIITGISKAIKNNNLWSKKSMRGFSWVYYYYYYYYYYYYYYYYYNYK- GGLL 86
LF L+ D ITG KA K S ++eG K G +L
Sbjct: 18 LFALILFOFITGFLKAWK'JIVTDSWTGLKGVIKHTLTFIFYYFVAVFLTYIAMAVGQI L 77

Query: 87 MITIFYYIANEGLSIVENCAEMDVLVPE 114
I Y A LSI+EN A MV+P+
Sbjct: 78 LVIINLYYA LSIMENLAVNGVFIPK 102

Table 21 Phage 182 complete genome sequence. 17833 nucleotides.
1 71 141 211
281 351 421 491 561 631 701 771 841 911 981 1051 1121 1191 1261 1331 1401 1471 1541 1611 1681 1751 1821 1891 1961 2031 2101 2171 2241 2311 2381 2451 2521 2591 2661 2731 2801 2871 2941 3011 3081 3151 3221 3291 3361 3431 3501 tagaatattg tcataaaaca caaacataat aatgcatatt aatatatttg taagttaaag gaggtgacaa aagaacaaat tggaggaaaa ataatgaaat attcactaca acaaatagat ttaaaaaggc atgaactaga ggaattggtg gacgaagtaa atcttttatc gttttattac acagaagaag aacgtttgtt ttacaacgaa aagatcacaa atctgaaatc ggaaatcata aaataattac acaaaaagct ttacaaatat aacacatcat aaatacctta cttcacacct caatcattct tatcaaaata taatgcaacg aaacgtaaca tcaactaaag tagaattctc aattgtacca tgcgaaccag ttgtctraac aggaaaactt Cgtaaaaacc ctgataaaaa cgtagttgta acaaatgttt tcgataaatt tatcgagtta gcagacaaat caacacaagc ggagattata atcatggaaa tcgtaaaaag cacatttgac ttcaatgcca caaacggggc ttcaattccg ttacgtaacg ttctagttta ctcagacgaa gtttctggtt ttggtggagc cttcacagaa gatggtaaaa cttatgcggg tgtatcagca gatatgatga ctgctaaccc tgacatcaaa ccaaaaattt aaaaatttgt aaatctacaa gtggtttcac tgtagcacaa aatctcgcta ggtggttttt attatgtttc tacattgagg aatgatagag ccaagttaga gaaaatctac ggtaaatcta gacaaaaagg agttgaggaa aggcaacttc caactgttcc aaaateaaca aatatgagtc gtagtgattt taacaagatg tacaacgaga attacatttt tgagatcaac aagcgaaatg cgcaaattaa aacagagcaa gctcaaaaag cgaaagaaga taagaagccc acagaaaaca caattgtcac accaactatt caagcaatac cagattitaa tattgacgct ttcacttctc taggaaaaca agacgaacaa tattttgacg aaagagacca gtttactatt ttcaattcag acgctgacga tattgttcgt atgaaaacat atgttagtaa cttcttagac atgaaccttg agaaaaaaga acaagtttac agtaagattg caaaagtgat atataacccc acgaagaaca tcacaattaa ttcaqaaaca attgtttaca cataaatgcz gaaattaaat acgatattgc tgaaattccc tcactcgaaa gttatactaa caaaaggagg agaagttatc t cagaagaaa cacatgaaac ctaataaaaa acacaaacac caattggcga cgaaccatca gtagcaacaa cttttgtcga aaatacagga tgtgtagaat acaaagctcg aacatcaaag ttagacgagt ttgcaatctc acactacaaa ttaacagagt cagaaggagt actttattac ttacttgact actacattta cgagt ctgaa ggagaagaat atggtcgtgg tttgagtggt tgctttcatg actcatatca :caggtaaaa tgaaacaaat 
tattggttac ttaaaaattc attggctaaa taaagactta ;gtgacggca ;aacacctct agtaagattc Lacgaatatc :tgacctaga aatatgtaat ttagaaattg *caacaattt taaagatccg tctgcaagat *aaagattaca aagagtagta gaaaataatg ttcgtgatat caaaaactat cagaactaga gaggaaagat taatagatta aaaactagta agggaacgga ggtcgaaaac gcgactttga aacaacaact cgttgacaat atgacgttcg gacatttatt tccacaacga aatggtgtaa agaagcaaaa gatctcaacg attgtcgtgt gtttagaaat cgattctttt aaaattgac ggagagttta gaagatcgaa cattctccac tttggaaatt acaataattt ctataaaaaa ggagtattta cgaatgacta ctttcaaaca cttcacttgg ttgtatccct aaccgaacaa tccaaccatt ttaggagttg atattttaga cgataaatgg agcttgtatg ttgttcgatt ggcttggggt tgttgggaag ttaattacac atgatagcct taaaaaatat aggcgaaata gattatacaa aagaacgaca ttcagattat gaggaagega cgctttaggc atggtttcct attttgtctt gtaaacaaag tttttcaagg ctcaaatgta cgtaagacct aacaacaaaa ccttttccag aagaaagacc1 ggcgatggca gattacaaag tagggtttga gaaagaaata ttaccatatg S gtacaagatg gagcgccaac aagctttaic agcgatcaaa agcgctttac acaatgccag caaaactaaa acaaaacaga cagaaggaat gttacaagta agtactagaa ttgaaagata caagcagaac tagtcgcttt aatcagctaa aaacctaatt aggaaaatca aacggtggac atctagtaag ccacttagcg tgaccgtaag aatatcaaag taaaaaatac aatcgtttaa aaaagactta ttgactacgt tggtagat tgcacaacct aagagcgcaa atcaaagaag gagcttaaca aagttgaagt taggtgctga ctiacctttt tcagtcttat ttagaaaata gacaatttca gacaagcgat caatggggct tgatctattt tgacgaagca gaagtacaac acaggtggag aagtcccctc tatgattaag aaaiatactg ggcgtatgcg atatagacaa gtaaaatgca aggcagcaca gttattcaaa aatggtttca aatatgggtc aatggtatgc cgaaaaaaga gaaatctcga tgcagaagct tttaattttc aaaccaacaa aagatgaatg aattcgatca aggactaact agctacacat ggaaaatcaa cgtaaagcat acaaaggcgg ttgtctttga tgtcaactct attctacgaa ggagaataca cgtttaaagg agggttatat ttgaatcaag tgtaaacaag attatttttt gaacactacg gatatgttca aaggctggat acgccaaagg tatgttaaat ttacatgggc gaggacggca ccgcttgcta gttttgtgac gcattattta ttgtgataca ggttgatcct aaaaaacttg aaaacatacg tagaagaaat agattgtaac ttttgacaat aggtggcgtg gtattagtag gcaatgttta tcgtacgtga :acatqact t tgaacaaata aaatttaaaa aaataaatta 
agtcaaattt ggcgagattg taaacgtttt gaccgtcaac :gtgtttcac ag~attatgt :tatggaaat taaagaacac tcaaagatt gtagagbatc ;caaatagca cacttgaaaa ;agaacgagc gatcgtagaa ;ttaggaatt taaggaggaa rtagaaacac cagtacaatt tatgccaaa tgcagataac cgactatccg caagttaagc acgaattaat gatacattac atcgaagtaa ctgtacattc aaaatatcaa aaagttcatt attcattcaa cgatcttact cttacaaatg acttacggat atatgttcaa agaacaccac cgaaggggct gaaagttcgg gacactagga agatatacta 351 gatagcattc atctagtagg 3641 gttattgggg gcatgaaagc 3711 tgatggcgaa ttaaatgtaa 3781 tttgaagttg gtttttcaag 3851 acacaargtt tacaatcaaa 3921 tgaaggtact attgacggtt :92 tatggaaaag aaacacgtga 4061 tttacatcct ttgcaaagta 4131 ggcgaatgta cacgtgaaat 4201 caatcgcaaa aaccttttag 4271 ttcacgtgaa acagttttta 4341 gaatcaattt taaatggtat 4411 ttgaagcatt gcgagaagac 4481 gttaaagaaa gataacgaag 4551 ccagcagaaa ataacgaacc 4621 aaaacatggc tgacaaaatc 4691 aatgactgct atttataata aacaaaccct gaagaagaat ccat tacaac aacagaagtt acatttcaac agtgtgctgg ctatggaaag taaggaggac acgatactga aattgaagca tggtaaaata atcgtgcgct gagtagccct tggtataata tcttgaaagt tacggagcaa taagagar cc cgctcaaaaa ccagaagcaa gagcaaaatt tatgccagat ttgctaccta taataatgga gacattacag agcttcttgt agaaaagcta gaaaagtgcc tgtttatgtt tgttttgatc tcgatcactt cattcggcag Cgaataaaag aaagaacaca actatataaa gatatttctt taaaaacagg agttgacaag tatggacaca tactcttttt gaggtggaga t acactatgta gtaacattag ttcttgtgat cccgttaagt taaatgtggc gaatcaaaag gtcacagacg caactgaaqc cgttggttat ttcaaactca agaaacagac cagaatatta acagaacaag atgttcttcg gtgaagcaag i tttgacatca g aaattgttcc g cactagacga t tgccacaaat g gttcatcatc tctttttcag gcgaacgtac c WO 00/32825 PCT/I B99/02040 4761 4831 4901 4971 5041 5111 5181 5251 5321 5391 5461 5531 5601 5671 5741 5811 5881 5951 6021 6091 6161 6231 6301 6371 6441 6511 6581 6651 6721 6791 6861 6931 7001 7071 7141 7211 7281 7351 7421 7491 7561 7631 7701 7771 7841 7911 7981 8051 8121 8191 8261 8331 8401 8471 8541 8611 8681 8751 8821 8891 8961 9031 9101 9171 9241 9321 9381 9451 9521 9591 9661 9731 9801 9871 9941 10011 atcgaagcgg 
ttggtgcagg accgtattgg taaagtagtt catgccttta gagtctgtta aaggttacta tagtttcgtt ttattaatag atgcaaaaga cgctcaagga accattgacg ttattgatga atggtttatg tattggttgc caacaaaacc tatcgcattg ttggttaagg gtcaatcatt ctaaggagga cctatacaca taacgagaat aaagacgcct atgcctt tgt acaaacttat tcgaatggaa atgtaacaac tggagataag cctatcaatt cgtttcttac accattcatt ccaacctacg cattcgtacc tgttaaggaa atgactttaa ctaataaagt caagatgtta ggaaacaaaa gtgcaatgag catcatggga ggtaaagtgg caggaaactt t cgttacttc tggaatttca aacaaatttt ggicgaacga caggggtatt caaacaaacg gctggtgtaa caaactacca atttatccgt gttaaaacat ttgacgtt gtttcctaaa atctacgaca accaccacca tgtcacaaaa acattacac caaccgtaaa agtaacattc caattatggc cacaagatgg agagattgtt tatatgcttg tactgatatt cgtttcgata tacctttcat ttttcatcct gaagat aaat caagtgggga aacgaaagaa gtggatcacg ctagtgatcc taaaagaatt tcaaaactat gacctgaata gatgatcgag atcgataatg actccttgat tacaggagga gcaggacaac cagatatcga tcaaaactat tcaatgtatg ttaaattaaa tagtgcaggc gatcacacgt ttagacgtag atccgataca aatcttggcg ttgaagaaat ttttgttgac taaacaggaa gttcccgatg atccaagaag catggttaga tgaacgcttt atacacaggt agaaaaagag ctattcaaag aagatcaaat caacctctaa ctacctcaaa atctgatcaa agcagcggca ttcaatatga aaagaaggcg aagaatcgtc aattgtacaa aacaacaagt actatattct acttctcaat gttgcttttg caagtgcaac cagtagaagc aacaaaccaa acaaacagca ggtaaagcga acagctatcg gaggtcaaca aagaaggtat acaaatgtaa tttaaaactc aacaggaaca cttatcaaag ggatacacaa taactatctc atctttaaaa gaatcaaga atgacaacac ttggtatacg agaaagtttc taatacaatt gaagagtcgc aacgatggag tcaattttct caggaggatc aatagtaggt ggtatacaaa ccaaatgggg ccttttttaa ataagatagt cgaacaaaac ggtaaggtat aacaggaaca atgaaaacat gatcttgtag ggaacgtgta ttatgtatcc ctattgttta tcttacaggt ggtaaattga ccgattgatt atgatgtaag atcctaacga tgtaggagtt tgctcaagag caaaacattc gcgatctttt cagccttagc aagtaaacaa ctatgtttct aaatattcca gataatgtaa tatcaattgc gcttcaaaca gcacaaagag caatcgagta agaaccaaat attgtaggca gttacgcttt ggcatacgaa taagatgagt agacgaaaag ccaaatgaac cctattcaag tccttacgtt tcagttgttt tttacacact aatggttatc gaagatggtc aaatcgatca 
gatatcctgt tttaagatat gaaagttcct acgttaccaa gtgaatcgaa gagcgcaaaa aagcttataa ccaaattgac ttttaatgta tggcaaacaa aatgaagtgt taacttttct aagtcttatc taacaatgaa cgatcgtgta aatcgtgtct gttcgacaat tacaactggc cttaaacgtt atattgaaag gccgaaaaca attgtttgat atttatcaat cacttttact gaatatttaa atctaaacat taaaaaacga attatttca actttagttg taaccctttg aaaatgttta aaaaaggaaa attgcacagg aacataagtt caaccctgac taaaaacatt gttccacgaa attaatcgtg aaaagcactt acttcatggg ataatttcaa gacgaagtaa gcgaatttga atacacgaaa agatcgaaat tggcgaaatt actgaatcaa caaattagaa tttatgagtt ccgcttacaa tacgttatta ttgacgccga cacagacgca gtaaaactga ctttgtagga cacaaaatcg aaatattgtg gcagttattg tagatagtga ctatacaacc ctgaagggtt atattggaat tcgggaacgc tgttgctttt gttaaatcag aactagtgtt gttaaaggat catctaaaga caaggagaag ttgtttcatc agcaccagca ctgccgtaac cgtagaaggc ttagaagtcg agcaacggtt cttgttacgg ttacttctga aattgttggc taacgtgcct tttgataaca ggaatcgtac tttaattcgt ttcctgttct ctcgggggag tttttagagt agataaacac acgaagaaac ttatcctagt aaatggcagt aagtttcgtt acctttgaaa ttgatgttct attgcaaaag aacaccctca actttattat ttgattacgg tagagaatac acaacaacaa tgttattcta acaagtgaag caatgccagt ggcccatctc ctttttccta ttatttactt caggcaatgc taattttgga gagtacatgg cgggatgtat gtaacgtcgt atacaggtat aatgcaggag gttcttataa gatcatgctt tcgctttctt ttgtgtaaaa gaagcaagaa taactacttt agagaagctt ttccgtttaa atagaaatta cagaiacaaa aggacatgta gtgtatatgt aaaaggttcg ttaggaattt taactcaacc attattacca atttaagtga aaatctgact atgcttctgc attcatgcaa gcaatacttt cagacatggt atgggaaaca aagtaacaac ccttttgttg gtttgactaa gaaaaagaaa acggtttgaa cctcttggca cacagcttgg atcaaactta tctttcacaa aattaaatat gagtatgcaa caagacttga gctacaccaa acttacaaac aagaaaagca caatgagtaa cgatgtatta acacgtgtga tgatgttttg aattataacc aagacaacgg gtgcaggact tgctagaaat aaccgttata tgatgtagaa gaaatcagct actatgaaca gaatgggaaa atttgccaaa atcaattgac ttggtttctt taaagaccct acacttgggt ttatcacaac cctattttct ttacagcaaa gatgatgatg atgataaatc aaaatgtatc gtttacatcg ttttgcttta gatatggcgg agatgtatag gaaggaggaa cagcaaaaag cagaccttat ttatcgtaga 
caactcacgc cctcgttatt tagaaattgc tcatggtttg cgcaggggca cgaagcaatg tatcacaaga atgttgtata ataatgactt acataaacca gatatcacga gaaatacttc tcattgctac gatatggagt ttgacgaatc cagaattgaa cgaagtatgg tgcacgtgta aaatcaagaa tgaagtttag tggagggttg aaaagaacgt cgagcagaat catttaagtt tcttgaagag attgatacaa acagaaacaa tcaaaaagat cacgaagcaa tacgattaca agtgttttga gagggaggta gaacgcttta taattgccta acattttgtt gaaaatttaa ctgaaacacg attttggtat caaacatcag aagagttttg aacagacgcc ccaagtgcta attgaagtts ttgaaacaaa taatcttgac tttccgattt acatcaaagc aaatacacgt ttgagaattg tgcttctgaa cgatataaag gaattgagaa gcaacaatgg gcaaatatcc tctgaatgaa gagaagttag tcaatgatac aacacctgta gaaaataatc atgctccata aggtatcaac cagattgaaa ttggcgatga ggcaggtcaa tttcacttat tttgattatc tgagagagat gccctattgg attgatgaga accaaacgaa tttctcaagg ggtgtaatca t tgatgacat gaatcgtgat gacacaggaa ccagcaatgg aaagaaacaa gtaaaaaggg aatgatcttt tagattttaa tcatactgaa gttggtgctt aagagatcac tgtttttgca ggactacacc gaat cgaaga caaccgattc agatggaaca attattcaaa aggctgtttt tgtagtagat aatgctaacg gttcaggtaa acttgacgga tcaaaaaaag taccaacctg cgttttatga aggctcagaa aataaaatgt aacagaaatt gcaagtagat aacacttata attatgcaac Cg.aLcaaac gatcaaaatc tactcgaaaa cttatttctc gtttacccgc agatgaagaa tatttaaatt ggttgtctga aagattacaa gatgttgctg ttgacacgtt ctgatgaaaa tgtggataaa aaact acgat tagataagac catcttgtta aagattgacg accagatgag aattatctcg cgaaacaaaa acgatgggat tcctatcaaa gttaaatgag caaacagaca cagacacccc aaatatcaca aaaccaaaga atatcgtaga cttgtttatg tgtattcaaa gtaicggctt actttatcga tggtacgtta atcttggttg atgatcgaac acgtattgtt agaacacaga aaacactgat agagaaatga ccccgacaag tacagatatg tagttaatga aaatgacaca aattatatca ttttgacaaa tgattatgga cattaataaa tatgctgact acaaggaagg cggtttgacg aattactatt tatgagtggt ctcaaaaaat aagaaatcaa aaataaaccg gccgatccta tgctaacagt gtgaatattc aagattcaac gcgacaatac WO 00/32825 PCT/I B99/02040 10081 10151 10221 10291 10361 10431 10501 10571 10641 10711 10781 10851 10921 10991 11061 11131 11201 11271 11341 11411 11481 11551 11621 11691 11761 11831 11901 11971 12041 
12111 12181 12251 12321 12391 12461 12531 12601 12671 12741 12811 12881 12951 13021 13091 13161 13231 13301 13371 13441 13511 13581 13651 13721 13791 13861 13931 14001 14071 14141 14211 14281 14351 14421 14491 14561 1 4rl 14701 14771 14841 14911 14981 15051 15121 15191 15261 15331 gcaatcaata gtgtataatg aaagccgtta acgaaacaat attaaatgta ggtattcgtt ttaatgcctg taagaggtaa agaaattgac gaacaagcga cacaaaaaag aagttagtgg gcagacatta atattatgac gaatacctca aatgttggta atgtagaggt ttacaattgc cgctagtgaa acagtaacat aaacaacagc tgcaactgat ctggaatacc gaacacaact taatagcggt gttcaaaatg aactaaccaa gtaatatggc taaaaatgtt gctaaaacac caaccgcaaa aaacagt at c ctagctgttc aatggcaaat aacttgttag aaacaaatgc gacaaaccct gcaaaatat t aagtaactgt gctacaggag atatttatct taacattaaa ggaacggagg aacaagtgaa gatggatcag acaatttatt tccaatttca acgaatgtag aaggagaatt gggtacactc aaacaaaatg ctgtagttac tgccaatcaa gcaaaagatt ctgtagctga attaccatag gaggaaaaat atccaagtgt tcgagcagaa attatatgca gctggtgata ataaagttta ctgaaagttt caaaacgtatc ggttgtttc atcaatcaca cctgatggca aattgcgttt aaagaagaaa cattgcatgg tgatgtaagt gaagaaaaaa ctaacaccat gtttgaaaca ccacaaggcg atttgtttgt att tgaaaat atgtggcaag ttcatttttg ttctggtaac aacaaccatt ttgatgtaga tgttgtttac acgttgtatg accggtttat aacaaaagta gtgctaacag acaaatagct ggagttgtcq gatatgggtt tgctaaagct gataatacac atattgatgt acttgatctt ggaaccccaa caggaaacgg aatgattgca ataaatgatg ttggcgacaa taaaaaagat tttttagggc atcttattta agaatatatc gcaatgatta ttaaagctaa tccctctaaa ataaggaggt atcaaaagaa ttacctattg aaatgaaaat caggagagtt gcttatgggg ttatcgccga gcgaaatggg tatcactttt tgaattgaaa cgtgatgtac atgacggcat ttaatgtaaa ctcaaagtgg taaaggaatt tcgtaactgt actttaaatg tgtctattct ctaatatctc ggaacgatat caatactagg tacagcgatc attatcgaca acaatcgaag gtggcgtaag ttctagcata cggaaataga tgtttattgt cgtaactcac ggacattacc gaaacttaaa tcgaataaca agataaaaat ctaacaggcg t tgcacaaca tcaagcgttg agcgcagttg gagtaagcag aaaaatattc atttgaccag aatctcttat gtgatcacaa acacaccaaa aaatgacaat tcatatggaa caccctgaaa t caacaattt cggtgtaaca tattttcctc 
ctgtagttac catcgaaagt ttctttaatg aaggaacgta tcaagcaatt ggtacaggtt atgacgatga ttattatcga aacgctttgt aagtcgaggg gattacaggt acattagaga gtacagtggc taatcttgat attcaaatgc aggataaaga tcataatcgt actctaatct tgaattagtt aatgctgaaa agctaaagaa actgctgctg gtttgtcaac acgaaggctg gtacagcaca acaaaccgca caacggcagc taaaaacaca gctgattcag tttagaggac acagcaatac aatatactgt aaatgaagga tagcaatgac aataatttat tcgtgctgaa tiaacaatga caaattgtca ctcggtgcag taggtatgct cgaaggtatg cgctaccaga aggttttaga ccaataagaa tccaacagat acaaaagaaa tggtttatgt gtaggtaaaa tcgaatatct atccctagat gaacgaattg atattcaaat gaacaagatg cgaacccgaa acaagttgtt tttgatgaaa tgttgacaca agaaaaatga caactacaat gattgtacac caatattaaa taaattactt cttgtgaacg tgattcatat tatcgctttg tttcttagga tcgggagaaa cgacattaaa ttcaatattg atggttttgc attatggttg atactcgcaa ttacaatcgt tttgactttg tgttgttgtt gctagaggta gaggggttac atcaaaacag cttttcccga tgtaaatggt ttagaggttt ctttgtgaaa aacaaccgta ttatcagaat gtaattaatt tctgtgaaat ggatatgcgc ataacttgca tgtccaaaac ttgagtttca agatgtggat caagcttata aatgaatagt acagctattt cacgtttaat aaattatatc gttgtcaagg acatgttatc tggcacaaga agcacctttg acggacggtt tgggtttgtt gttcgtggtt tgtctaattc gcggtggcgt tcaaacggct aatacaccaa gtccaaatgc agaggacaaa ccaatctcca aggtcaatgg gaaacgttgg ctgtttcttc tgctacaatt gcacaagctt tcaagaatac cagaccaaat tgttgcgatg gtacttacaa agttaagaac tttatgactg aatgttttgg tctattgcta acacaatggt tcgattttgt agcaggtatc taatttctat gacaatcgag tgatctataa aagtaatgta caaccgctaa ttgggaatcg acgcctaaaa aaggtcaagc tgcaggttat aatt t tatgt atagtaagca aaatcttgat aatttccatg gaacagtaac tatcgtttat ggacaagttt cgttaggatc agatggagat tccgatgcct tggcagatga gttaggtttt attgttaagg gtattacaat ttcagatate aaggacgata tccatgcgat tttcaacaag gtacacctca tcttgacttt aaaacaggcg caaaaggcga gccacctaaa aacaggat ca tacaacaggt ctcgatcata tgtgacttga ttaactatga agcacctcag tataactagg actgttttct ataatcgtag aatcgatcat aggatatgag atggcaactc ttacaaatga aatactttca aaatatggct ataataaaaa ttcacaagta gctggtttga acccgaacag caatgaatat ggtggaggcg gcaatcttta tcgccaagca 
caaatttgtg ggttgtctaa agagatcatc gctcaagggg ataaaacagg tcaatggatg actaaccctc agaccctttc agcatttaaa caatctgcaa gtcactggga acgccctggt aaacttcata tcgaagaaag tattgacggt agcggtggcg gtggcgtaaa acgttgctat cctaaaagtt tcatgagtgg acaacttttt ggcacgcatg atggtttgga ctttggttca attgatcacc ctggcaatga acatgttgga acaatgggag cattaagagc gtattttgtg caagaattta gttataacca gtcaaatata aaggtaaaag gcgcaatacg tgacgcggat catttacatt taggttttac ttctttcata gatgatggaa catgggaaga ccctttgaag actggcggag ataatgacga taacaataag gataaaaatg tgaatggttg gaaattttaa taaggagaaa aaggtatgat taatcatctt gtttatggtt tgattatatg gttaatggtt acaattgcca aatttaacaa ggaaatcgae tttagtagtt tggcagaaat ggttttagtg gtttacttta ttcctgtagc gtatataaca atgttggttg gtttgatttt atcagaaatt gatgatgata ataattggac tgattatgtt aagaagtttt ttaaatgatg aatggtattg atatctctag ttatcaaaca tttgtaaata ttaaagcaac aggcggaaca ggttatgtaa ctttgtcttt aggtaaaaag attggtgtgt atcattttgc acaagaagcg caattctttt tagataatat taagggttac gaagggtcaa atcagaaaga tgtaaattgg gcgaaagcat ttaaagcatg gttttatacg tatacagcaa acctcaatac ttatggttta tgggttgctg aatatggatc aaatcaacca acaaataatt ttccaattgt tcctgtttt cagtttacaa attgat qaatgZLLC LaiLqycyacq gcaacacacg aattgttcct cctgaaaata aaatatttga cgccacaagt agcacaagcg tgttattt tgacggagaa acgatctttg ttagaggaac atacaatcat gttcatggaa aagaaatccc tatttactta aaaatgtatg aaaagaaacc agjtatataaaqcgttaaaet tgaagagaaa aacttatact ataaccctaa gtttgtaata ggcgcacgtg gtataggtaa aacttatggt aaacacggcg aacaatttat ttatttaaga agattcaaaa agtaaaattc ggtgcagtag tatagtatac taggacatat tagacggaac actcaacaga ggaattgatc tttcaaaagt accctgattg tgaccgagca gcatgagagg ggtttagaag attggtaaag ctgttcttat ttcttgatta tgtitataat aactgatttc tctagtattg caaggctact ctcaaccagc ggatctgtat gtaggtaaaa gatgagttta ttttcactct aattgtctga tccaacacaa atcaatggtg tggacacctg taggagtgta tagtatgaca caatgcttta ggttttaatt tataaaaaat ttgttgttaa cagaacttaa aaagattcct aaaaggaaaa gaattctatt gaaaaatcta atgaatatcc aacaatttga aatagcttag gcctaatgtt tcgctttatt caatttttca gtgatgataa cgaagttcgt aaacaatggc 
gaaagaattt attaatgggt tgggctgttc acaattttgt ttgatgagtt cctgatcata aacttgaagt cacttagtac gtggggaat tttaattgag aaaccaaaaa WO 00/32825 PCT/I B99/02040 15401 15471 15541 15611 15681 15751 15821 15891 15961 16031 16101 16171 16241 16311 16381 16451 16521 16591 16661 16731 16801 16871 16941 17011 17081 17151 17221 17291 17361 17431 17501 17571 17641 17711 17781 tcacttarttt tacaagatgt ccagatttga actt tgcaga tat caacaat Egcgccattg gttatgatta gctgatgaaa cggtttgata attttagtag qcgatagttt ggtgtgttaa aaatgtagct cggctatatt tataaaatac ccttttggta cctgacaata ttcggtgata tctgaaaggt caatcatttt tgttcaacgc atttatcatg tgttatttct attaatgata ataagattgg atccatatct tcaataagat gaagtagaga aatttgatat attgagaaag caaattctaa tgaataaatt atcatgtatt gatcctttct accaaacgaa gttatgttga ataagcgttt agtgaagaga gagtttgtca cttttgaagg tcaaccaaat aattggcgaa acattgttat agctaccacg tgttttggtt tgtagacgaa ataggacgtc ttaatgcttt tgtgatatcg tttgtaacgc cttttcaaga tttatttccg tacgtttaca aattcctcct ttttcattga tgttaacacg gacttgatag aattgttaat tagcattgta aattccttta aatgtttatt tacctctcct tgataccacc tccagttatc atagaggaat tctgtgtata tacatatatg ttattacatc gctgaagcct gtaatgcaac taatctatat gaaacacctt atgatagtga gaaaatcttt acaaatcatt ataattatta taagaattta attagttcta ctttggcgtt atctttttc catttctttc igttaaggtg tatattggtt taactgatag atgttaaatt gaacgtcgaa gt agaaacgt atttgtccgt tttcgttatt aactcttttg acgctaaact catgtaaaac tcgaattaat gttcttcaaa gttttcggta ttttcagcta aatcaaatgt atcaaatgaa ttactaagtt cgatcggttc tcaatcattt tatattatat tattgaacat gatggaaacg gttttccgaa tagtgtagtg aacccttatt tcttgtattt caagatcgag gtatattgat tgaattgtgt ttggtagatt gattcgtgga acagaatacg tacgtttatt gaaaagagaa gtaaaaatag gggtattgga tagacgctga aacaggttgt tttatgcaat gactacgaaa gaccatgaag tctttcaaca gtggcgaaag cattcaagaa gacgtacaaa caatctgcag gattcaaaag aagattttag tagtttctta gtctatgtga aaaatagatt tagttatctg cattatgatt tgtttaataa ttacaatgat gaatagtaga agtgattttt gctaacgcct atagttcttt ctccttatac tattctaacg caattcacta agaggttcgg ttttgtgtat ccttgtagaa tgtagccatt cgagaaccaa 
cttttacgta gactcgattc gggtaatagc atcttgtaaa gtcccctcta aaccattcaa ttagttcgcg aatttgttta tatccgtcat gcgatattaa tgcaatggct taacgtaatc aatgtataaa atcgttgtca tctttagtta actcctttta tattaatttg atgttatttc tgtagttttc agataacaaa caatattcct tctatgatat gataattcat ttaatgattt attgttcata gattggtagc attgtattaa attgttttat tttcaagtaa tatcctcatc tctaaaaatt attcatgttt atcatccttt aattcattta ttttaatgat catgtatgat tgtatttgtc tcaaaaaaag ataatattct gatgaaaatc tggtaaccct taacatagta attgtagtct ttttgtttgc ttttggatcg agttttaata attccctgta tatccatttc taggtatata caaaacctcc caaccatcta attccacctc ctttaaatag tgaagttact aatttcattg gttgaatgag ttaacaaaag tgatctctat tttttcattg gtgttctttg aatgttcgtg gtttcaattg ttccgcatag atcaagataa acatagttat attaattgtt ttcctccttg gttgatttaa accctctaaa atattgatac caccaatcga catgaatact cggaaataag catcgcctac ctcatcaata atcccactca ttaaaggggt tgaaacactc cttttatatt attaatattc tggataattt ctttttagcc tcatccacct ttcatacata ccacgttatt ctttattaca tatatagtat ttatttgatt gtttttttat aacaattaaa ttcatataaa at t tgtagtttgg ggtcagttac atttgtgtta

Table 22 Phage 182 ORFs list

nb | Name | Frame | Position | Size | Key words
1 | 182ORF001 | 2 | 5966..7780 | 604 | Tail protein;
2 | 182ORF002 | 1 | 2152..3873 | 573 | DNA polymerase;
3 | 182ORF003 | 1 | 11305..12639 | 444 |
4 | 182ORF004 | 3 | 4626..5954 | 442 | Major head protein;
5 | 182ORF005 | 3 | 12651..13700 | 349 | Glycyl-Glycine endopeptidase; Lysostaphin precursor;
6 | 182ORF006 | 1 | 14995..16026 | 343 | Encapsidation protein; ATP/GTP-binding site motif A;
8 | 182ORF008 | 2 | 14105..14983 | 292 | Lysozyme; Muramidase;
9 | 182ORF010 | 2 | 1310..2155 | 281 | Terminal protein;
10 | 182ORF009 | 1___ | 875.961 | __28 | Lower collar protein;
11 | 182ORF011 | 1 | 9607..10158 | 183 | Pre-neck appendage protein;
12 | 182ORF012 | 3 | 10872..11294 | 140 |
13 | 182ORF013 | 1 | 10456..10860 | 134 |
14 | 182ORF014 | 3 | 13716..14108 | 130 | Lysis protein;
15 | 182ORF015 | 2 | 854..1225 | 123 | Early protein;
16 | 182ORF018 | -2 | 16429..16737 | 102 |
17 | 182ORF020 | 3 | 10158..10454 | 98
| Leucine-zipper motif;
18 | 182ORF019 | 3 | 4323..4613 | 96 | Head protein;
19 | 182ORF016 | -3 | 16749..17033 | 94 |
20 | 182ORF022 | 1 | 12868..13149 | 93 |
21 | 182ORF023 | -2 | 11914..12189 | 91 |
22 | 182ORF017 | 1 | 154..426 | 90 |
23 | 182ORF024 | 3 | 6174..6446 | 90 |
24 | 182ORF025 | 2 | 548..814 | 88 | Early protein;
25 | 182ORF026 | -3 | 12999..13259 | 86 | Early protein;
26 | 182ORF027 | -1 | 14642..14896 | 84 |
27 | 182ORF028 | 3 | 14430..14672 | 80 |
28 | 182ORF021 | -3 | 17106..17339 | 77 |
29 | 182ORF030 | -1 | 16199..16429 | 76 |
30 | 182ORF031 | -3 | 8379..8603 | 74 |
31 | 182ORF032 | -1 | 11195..11413 | 72 |
32 | 182ORF033 | -1 | 4727..4942 | 71 |
33 | 182ORF034 | -1 | 5951..6160 | 69 |
34 | 182ORF029 | -3 | 17412..17606 | 64 |
35 | 182ORF035 | -3 | 15570..15758 | 62 |
36 | 182ORF036 | -3 | 2127..2315 | 62 |
37 | 182ORF037 | -1 | 12095..12280 | 61 |
38 | 182ORF038 | 3 | 14769..14951 | 60 |
39 | 182ORF039 | 2 | 9992..10171 | 59 |
40 | 182ORF040 | 3 | 16029..16202 | 57 |
41 | 182ORF041 | 1 | 3886..4056 | 56 | Early protein;
42 | 182ORF042 | -3 | 10671..10832 | 53 |
43 | 182ORF043 | | 10491..10652 | 53 |
44 | 182ORF044 | -1 | 6299..6457 | 52 |
45 | 182ORF045 | -2 | 6571..6729 | 52 |
46 | 182ORF046 | 2 | 2372..2527 | 51 |
47 | 182ORF047 | -2 | 13201..13353 | 50 |
48 | 182ORF048 | 3 | 3243..3395 | 50 |
49 | 182ORF049 | 3 | 1578..1724 | 48 |
50 | 182ORF050 | 2 | 8012..8155 | 47 |
51 | 182ORF051 | 3 | 9390..9530 | 46 |
52 | 182ORF052 | 1 | 4096..4233 | 45 |
53 | 182ORF053 | | 15656..15793 | 45 |
54 | 182ORF054 | -2 | 8002..8136 | 44 |
55 | 182ORF055 | 2 | 8324..8455 | 43 |
56 | 182ORF056 | 3 | 6549..6680 | 43 |
57 | 182ORF057 | -3 | 8133..8264 | 43 |
58 | 182ORF058 | -1 | 5048..5176 | 42 |
59 | 182ORF059 | -2 | 15748..15876 | 42 |
60 | 182ORF060 | | 15276..15404 | 42 |
61 | 182ORF061 | | 1974..2102 | 42 |
62 | 182ORF062 | -2 | 1867..1992 | 41 |
63 | 182ORF063 | -3 | 14181..14306 | 41 |
64 | 182ORF064 | -2 | 7234..7356 | 40 |
65 | 182ORF065 | -2 | 3465..3582 | |
66 | 182ORF066 | 1 | 4234..4353 | 39 |
67 | 182ORF067 | -1 | 13763..13882 | 39 |
68 | 182ORF068 | -1 | 7148..7267 | 39 |
69 | 182ORF069 | -3 | 4908..5027 | 39 |
70 | 182ORF070 | -3 | 912..1031 | 39 |
71 | 182ORF071 | 2 | 11741..11857 | 38
72 | 182ORF072 | -3 | 11610..11723 | 37 |
73 | 182ORF073 | -3 | 2763..2876 | 37 |
74 | 182ORF074 | -1 | 8813..8923 | 36 |
75 | 182ORF075 | -3 | 7353..7463 | 36 |
76 | 182ORF076 | -3 | 2316..2426 | 36 |
77 | 182ORF077 | 2 | 11858..11965 | 35 |
78 | 182ORF078 | -2 | 7564..7671 | 35 |
79 | 182ORF079 | -2 | 7381..7488 | 35 |
80 | 182ORF080 | -2 | 4372..4473 | 33 |

Table 23 Predicted amino acid sequences of ORFs from phage 182

182ORF001
5966 atggcaagaaggtatacaaatgtaaaattgttggctaacgtgccttttgataacacctatacacacacaagatggtttaaaact
1 M A R R Y T N V K L L A N V P F D N T Y T H T R W F K T
6050 caacaggaacaggaatcgtactttaattcgtttcctgttcttaacgagaatagagattgttcttatcaaagggatacacaactc
29 Q Q E Q E S Y F N S F P V L N E N R D C S Y Q R D T Q L
6134 gggggagtttttagagtagataaacacaaagacgccttatatgcttgtaactatctcatctttaaaaacgaagaaacttatcct
57 G G V F R V D K H K D A L Y A C N Y L I F K N E E T Y P
6218 agtaaatggcagtatgcctttgttactgatattgaatataagaatgacaacacaagtttcgttacctttgaaattgatgtttta
85 S K W Q Y A F V T D I E Y K N D N T S F V T F E I D V L
6302 caaacttatcgtttcgatattggtatacgagaaagtttcattgcaaaagaacaccctcaactttattattcgaatggaatacct
113 Q T Y R F D I G I R E S F I A K E H P Q L Y Y S N G I P
6386 ttcattaatacaattgaagagtcgcttgattacggtagagaatacacaacaacaaatgtaacaacttttcatcctaacgatgga
141 F I N T I E E S L
D Y G R E Y T T T N V T T F H P N D G
6470 gtcaatttcttgttattctaacaagtgaagcaatgccagttggagataaggaagataaatcaggaggatcaatagtaggtggc
169 V N F L V I L T S E A M P V G D K E D K S G G S I V G G
6554 ccatctcctttttcctattatttacttcctatcaattcaagtgggaggtatacaaaccaaatggggcaggcaatgctaatttt
197 P S P F S Y Y L L P I N S S G E V Y K P N G A G N A N F
6638 ggagagtacatggcgtttcttacaacgaaagaaccttttttaaataagatagtcgggatgtatgtaacgtcgtatacaggtata
225 G E Y M A F L T T K E P F L N K I V G M Y V T S Y T G I
6722 ccattcattgtggatcacgcgaacaaaacggtaaggtataatgcaggaggttcttataagatcatgcttccaacctacgctagt
253 P F I V D H A N K T V R Y N A G G S Y K I M L P T Y A S
6806 gatccaacaggaacaatgaaaacattcgctttcttttgtgtaaaagaagcaagaacattcgtacctaaaagaattgatcttgta
281 D P T G T M K T F A F F C V K E A R T F V P K R I D L V
6890 gggaacgtgtataactactttagagaagcttttccgtttaatgttaaggaatcaaaactatttatgtatccctattgtttaata
309 G N V Y N Y F R E A F P F N V K E S K L F M Y P Y C L I
6974 gaaattacagatacaaaaggacatgtaatgactttaagacctgaatatcttacaggtggtaaattgagtgtatatgtaaaaggt
337 E I T D T K G H V M T L R P E Y L T G G K L S V Y V K G
7058 tcgttaggaatttcaataaagtgatgatcgagccgattgattatgatgtaagtaactcaaccattattaccaatttaagtgac
365 S L G I S N K V M I E P I D Y D V S N S T I I T N L S D
7142 aagatgttaatcgataatgatcctaacgatgtaggagttaaatctgactatgcttctgcattcatgcaaggaaacaaaaactcc
393 K M L I D N D P N D V G V K S D Y A S A F M Q G N K N S
7226 ttgatgctcaagagcaaaacatcgcaatacttcagacatggtatgggaaacagtgcaagagacaggaggagcgatcttt
421 L I A Q E Q N I R N T F R H G M G N S A M S T G G A I F
7310 tcagccttagcaagtaacaacccttttgttggtttgactaacatcatgggagcaggacaacaagtaaacaactatgtttctgaa
449 S A L A S N N P F V G L T N I M G A G Q Q V N N Y V S E
7394 aaagaaaacggtttgaacctcttggcaggtaaagtggcagatatcgaaaatattccagataatgtaacacagcttggatcaaac
477 K E N G L N L L A G K V A D I E N I P D N V T Q L G S N
7478 ttatcttcacaacaggaaactttcaaaactattatcaattgcgcttcaaacaaattaaatatgagtatgcaacaagacttgat
505 L S F T T G N F Q N Y Y Q L R F
K Q I K Y E Y A T R L D
7562 cgttacttctcaatgtatggcacaaagagcaatcgagtagctacaccaaacttacaaacaagaaaagcatggaatttcattaaa
533 R Y F S M Y G T K S N R V A T P N L Q T R K A W N F I K
7646 ttaaaagaaccaaatattgtaggcacaatgagtaacgatgtactLaacacgtgtgaaacaaatttttagtgcaggcgttacgctt
561 L K E P N I V G T M S N D V L T R V K Q I F S A G V T L
7730 tggcatacgaatgatgttttgaattataaccaagacaacggagatgtatag 7780
589 W H T N D V L N Y N Q D N G D V

182ORF002
2152 atgattaagaaatatactggcgacttgaaacaacaactgatctcaacgattgtcgtgtatggtcgtggggcgtatgcgatata
1 M I K K Y T G D F E T T T D L N D C R V W S W G V C D I
2236 gacaacgttgacaatatgacgttcggtttagaaatcgattctttttttgagtggtgtaaaatgcaaggcagcacagacatttat
29 D N V D N M T F G L E I D S F F E W C K M Q G S T D I Y
2320 ttccacaacgaaaaatttgacggagagtttatgctttcatggttattcaaaaatggtttcaaatggtgtaaagaagcaaaagaa
57 F H N E K F D G E F M L S W L F K N G F K W C K E A K E
2404 gatcgaacattctccacactcatatcaaatatgggtcaatggtatgctttggaaatttgttgggaagttaattacacaacaaca
85 D R T F S T L I S N M G Q W Y A L E I C W E V N Y T T T
2488 aaatcaggtaaaacgaaaaaagagaaatctcgaacaataatttatgatagccttaaaaaatatccttttccagtgaaacaaatt
113 K S G K T K K E K S R T I I Y D S L K K Y P F P V K Q I
2572 gcagaagcttttaattttcctataaaaaaaggcgaaatagattatacaaaagaaagacctattggttacaaaccaacaaaagat
141 A E A F N F P I K K G E I D Y T K E R P I G Y K P T K D
2656 gaatgggagtatttaaagaacgacattcagattatggcgatggcattaaaaattcaattcgatcaaggactaactcegaatgact
169 E W E Y L K N D I Q I M A M A L K I Q F D Q G L T R M T
2740 agaggaagcgacgctttaggcgattacaaagattggctaaaagctacacatggaaaatcaactttcaaacaatggrtttcctatt
197 R G S D A L G D Y K D W L K A T H G K S T F K Q W F P I
2824 ttgtctttagggttgataaagacttacgtaaagcaacaaaggcggctcactgggaaacaaagrtttttcaaggaaagaa
225 L S L G F D K D L R K A Y K G G F T W V N K V F Q G K E
2908 ataggtgacggcattgtctttgatgtcaactctttgtatccctctcaaatgtacgtaagacctttaccatatggaacacctcta
253 I G D G I V F D V N S L Y P S Q M Y V R P L P Y G T P L
2992
ttctacgaaggagaatacaaaccgaacaacgactatccgctgtacattcaaaatatcaaagtaagattccgtttaaaggagggt
281 F Y E G E Y K P N N D Y P L Y I Q N I K V R F R L K E G
3076 tatattccaaccatcaagttaagcaaagttcattattcatcaaaacgaatatcttgaatcaagtgtaaacaagttaggagtt
309 Y I P T I Q V K Q S S L F I Q H E Y L E S S V N K L G V
3160 gacgaattaatcgatcttactcttacaaatgttgacctagaattattttttgaacactacgatattttagagatacattacact
337 D E L I D L T L T N V D L E L F F E H Y D I L E I H Y T
3244 tacggatatatgttcaaagcttcttgtgatatgttcaaaggctggatcgataaatggatcgaagtaaagaacaccaccgaaggg
365 Y G Y M F K A S C D M F K G W I D K W I E V K N T T E G
3328 gctagaaaagctaacgccaaaggtatgttaaatagctgtatggaaagttcggaacaaaccctgacattacaggaaaagtgcct
393 A R K A N A K G M L N S L Y G K F G T N P D I T G K V P
3412 tacatgggcgaggacggcattgttcgattgacactaggagaagaagaattaagagatcctgtttatgttccgcttgctagrtttt
421 Y M G E D G I V R L T L G E E E L R D P V Y V P L A S F
3496 gtgacggcttggggtagatatactaccattacaaccgctcaaaaatgttttgatcgcattatttattgtgatacagatagcatt
449 V T A W G R Y T T I T T A Q K C F D R I I Y C D T D S I
3580 catctagtaggaacagaagttccagaagcaatcgatcacttggttgatcctaaaaaacttggttattgggggcatgaaagcaca
477 H L V G T E V P E A I D H L V D P K K L G Y W G H E S T
3664 tttcaacgagcaaaactcattcggcagaaaacatacgtagaagaaattgatggcgaattaaatgtaaagtgtgctggtatgcca
505 F Q R A K F I R Q K T Y V E E I D G E L N V K C A G M P
3748 gatcgaataaaagagattgtaacttttgacaattttgaagttggttttttcaagctatggaaagttgctacctaaaagaacacaa
533 D R I K E I V T F D N F E V G F S S Y G K L L P K R T Q
3832 ggtggcgtggtattagtagacacaatgtttacaatcaaataa 3873
561 G G V V L V D T M F T I K

182ORF003
11305 atggaagaacgaattgatattcaaatgaacaagatgaaagaagaaaatcaaaagaattacctattgcaccctgaaacgaacccg
1 M E E R I D I Q M N K M K E E N Q K N Y L L H P E T N P
11389 aaacaagttgtttttgatgaaacattgcatggaaatgaaaatcaggagagtttcaacaattttgttgacacaagaaaaatgaca
29 K Q V V F D E T L H G N E N Q E S F N N F V D T R K M T
11473 actacaat
tgatgtaagtgctt atggggt tat cgctgacggtgtaacagat tgtacaccaatat taaataaat tact tgaagaa 57 T T I D V S A Y G V I A 0 G V T D C T P I L N K L L E E 11557 aaaagegaaatgggtatcactttttattttcctccttgtgaacgtgattcatattatcgctttgctaacaccattgaattgaaa K S E 14 C I T F Y F P P C E R D S Y Y R F A N T I E L K 11641 cgtgatgtacctgtagttactttcttaggatcgggagaaacgacattaaagtttgaaacaatgacggcatttaatgtaaacatc 113 R D V P V V T F L G S G E T T L K F E T M T A F H V N 1 11725 gaaagtt tcaatattgatggt t t tgcat tatggt tgccacaaggcgct caaagtggtaaaggaat tt tctttaatgatact cgc 141 E S F N I D G F A L W L P Q C A Q S G K C I F F N D T R 11809 aat tacaa cgtt ttgactt tgat t tgt t tgt tcgtaactgtact taaatgaaggaacgtatgt tgttgttgcagaggtaga 169 N Y H R F O F D L F V RHN C T L N E C T Y V VV A R G R 11893 ggggttacatttgaaaattgtctattctctaaacctcaagcaattatcaaaacagcttttcccgatgtaaatggtatgtgg 197 G V T F E N C L F S N I S Q A I I K T A F P 0 V N G M4 W 11977 caagggaacgaatcaatactaggggtacaggttttagaggtttctttgtgaaaaacaaccgtattcattttgtacagcgatc 225 QG N D I N T R C T G F R C F F V K NHN RI H F C T AlI 12061 at tatcgacaatgacgatgat tat cagaatgtaat taatttctgtgaaart tctggtaacacaat cgaaggtggcgtaagt tat 253 I I D H D3 D D Y Q N V I N F C E I S C N T I E C C V S Y 12145 tat cgaggatatgcgcataact tgcatgt ccaaaacaacaaccatt tctagcatacggaaat agaaacgctt tgt ttgagttrt 281 Y R C Y A H H L H V Q N N N H F L A Y C N R H A L F E F 12229 caagatgtggatcaagct tatat tgatgtagatgt ttattgtcgtaact cacaagt cgagggaatgaatagtacagctatttca 309 Q D V D Q A Y I D V D V Y C R H S Q V E C NH S T A I S 12313 cgtt taattgttgt ttacggacat taccgaaacttaaagattacaggtaaat tat at cgt tgt caaggacatgttat cacgttg 337 R L I V V Y C H Y R H L K I T C K L Y R C Q C H V I T L 12397 tatggcggtggcgttraatttct at tgtgact tgatggcacaagaagcacct ttgacggacggt taccggt ttatt caaacggct 365 Y C C C V H F Y C D L M4 A Q E A P L T D C Y R F I Q T A 12481 gacaatcgagaactatgatgggtttgtgtcgtggtttgtctaattcaacaaaagtaaaacaccaatgatctataaagca 393 D N R V N Y D G F V V R G L S H S T K V N T P M I Y K A 
12565 cctcagactgttttctataatcgtagaatcgatcatgtgctaacaggtccaaatgcaagtaatgtatataactag 12639
421 P Q T V F Y N R R I D H V L T G P N A S N V Y N
1820RF004
4626 atggctgacaaaatcacagaacaagatgttcttcgtgccacaaatgtagaaacaccagtacaattaatgactgctatttataat
1 M A D K I T E Q D V L R A T N V E T P V Q L M T A I Y N
4710 agttcatcatctctttttcaggcgaacgtacctatgccaaatgcagataacatcgaagcggttggtgcagggatcacacgttta
29 S S S S L F Q A N V P M P N A D N I E A V G A G I T R L
4794 gacgtagtaaaaaacgaatttatttcaactttagtggaccgtattggaaaagtagttatccgatacaaatcttggcgtaaccct
57 D V V K N E F I S T L V D R I G K V V I R Y K S W R N P
4878 ttgaaaatgtttaaaaaaggaaacatgcctttaggtcgaacgattgaagaaatttttgttgacattgcacaggaacataagttc
L K M F K K G N M P L G R T I E E I F V D I A Q E H K F
4962 aaccctgacgagtctgttacaggggtatttaaacaggaagttcccgatgtaaaaacattgttccacgaaattaatcgtgaaggt
113 N P D E S V T G V F K Q E V P D V K T L F H E I N R E G
5046 tactacaaacaaacgatccaagaagcatggttagaaaaagcatttacttcatgggataatttcaatagtttcgttgctggtgta
141 Y Y K Q T I Q E A W L E K A F T S W D N F N S F V A G V
5130 atgaacgctttatacacaggtgacgaagtaagcgaatttgaatacacgaaattattaatagcaaactaccaagaaaaagagcta
169 M N A L Y T G D E V S E F E Y T K L L I A N Y Q E K E L
5214 ttcaaagagatcgaaattggcgaaattactgaatcaaatgcaaaagaatttatccgtaagatcaaaccaacctctaacaaatta
197 F K E I E I G E I T E S N A K E F I R K I K P T S N K L
5298 gaatttatgagttccgcttacaacgctcaaggagttaaaacatctacctcaaaatctgatcaatacgttatcattgacgccgac
225 E F M S S A Y N A Q G V K T S T S K S D Q Y V I I D A D
5382 acagacgcaaccattgacgttgacgttttagcagcggcattcaatatgagtaaaactgactttgtaggacacaaaatcgttatt
253 T D A T I D V D V L A A A F N M S K T D F V G H K I V I
5466 gatgagtttcctaaaaaagaaggcgaagaatcgtcaaatattgtggcagttattgtagatagtgaatggtatgatctacgac
[Illegible in the source scan: the rest of 1820RF004 and the rows that follow it, up to the translation row below.]
G F M V C A G A E D G Q I D H Y H N P I F F T A N E A M
8131 tatcacaagagatatcctgttttaagatatgatgatgatgatgataaatcaaaatgtatcatgttgtataataatgacttgaaa
113 Y H K R Y P V L R Y D D D D D K S K C I M L Y N N D L K
8215 gttcctacgttaccaagtttacatcgttttgctttagatatggcggacataaaccagatatcacgagtgaatcgaagagcgcaa
141 V P T L P S L H R F A L D M A D I N Q I S R V N R R A Q
8299 aaaacacctgtaattatccaaactgatgaaaagaaatacttctcattgctacaagcttataaccaaattgacgaaaataatcag
169 K T P V I I Q T D E K K Y F S L L Q A Y N Q I D E N N Q
8383 gctgtttttgtggataaagatatggagtttgacgaatcttttaatgtatggcaaacaaatgctccatatgtagtagataaacta
197 A V F V D K D M E F D E S F N V W Q T N A P Y V V D K L
8467 cgatcagaattgaacgaagtatggaatgaagtgttaacttttctaggtatcaacaatgctaacgtagataagactgcacgtgta
225 R S E L N E V W N E V L T F L G I N N A N V D K T A R V
8551
caaacatcagaagtcttatctaacaatgaacagattgaaagttcaggtaacatcttgttaaaatcaagaaaagagttttgcgat
253 Q T S E V L S N N E Q I E S S G N I L L K S R K E F C D
8635 cgtgtaaatcgtgtctttggcgatgaacttgacggaaagattgacgtgaagtttagaacagacgccgttcgacaattacaactg
281 R V N R V F G D E L D G K I D V K F R T D A V R Q L Q L
8719 gcggcaggtcaatcaaaaaaagaccagatgagtggagggttgccaagtgctacttaa 8775
309 A A G Q S K K D Q M S G G L P S A T
1820RF008
14105 atgatgaatggtattgatatctctagttatcaaacaggaattgatctttcaaaagttccatgcgattttgtaaatattaaagca
1 M M N G I D I S S Y Q T G I D L S K V P C D F V N I K A
14189 acaggcggaacaggttatgtaaaccctgattgtgaccgagcatttcaacaagctttgtctttaggtaaaaagattggtgtgtat
29 T G G T G Y V N P D C D R A F Q Q A L S L G K K I G V Y
14273 cattttgcgcatgagaggggtttagaaggtacacctcaacaagaagcgcaattctttttagataatattaagggttacattggt
57 H F A H E R G L E G T P Q Q E A Q F F L D N I K G Y I G
14357 aaagctgttcttattcttgactttgaagggtcaaatcagaaagatgtaaattgggcgaaagcatttcttgattatgtttataat
K A V L I L D F E G S N Q K D V N W A K A F L D Y V Y N
14441 aaaacaggcgttaaagcatggttttatacgtatacagcaaacctcaatacaactgatttttctagtattgcaaaaggcgattat
113 K T G V K A W F Y T Y T A N L N T T D F S S I A K G D Y
14525 ggtttatgggttgctgaatatggatcaaatcaaccacaaggctactctcaaccagcgccacctaaaacaaataattttccaatt
141 G L W V A E Y G S N Q P Q G Y S Q P A P P K T N N F P I
14609 gttgcctgttttcagtttacaagtaaaggacgtttaccaggatacaacggcaatcttgatttgaatgttttctatggcgatggt
169 V A C F Q F T S K G R L P G Y N G N L D L N V F Y G D G
14693 aatacatgggatctgtatgtaggtaaaaaacaggatcaaattgttcctcctgaaaataaaatatttgacgccacaagtgatgag
197 N T W D L Y V G K K Q D Q I V P P E N K I F D A T S D E
14777 tttattttcactcttacaacaggtagcacaagcgtgttttattttgacggagaaacgatctttgaattgtctgatccaacacaa
225 F I F T L T T G S T S V F Y F D G E T I F E L S D P T Q
14861 ctcgatcatattagaggaacatacaatcatgttcatggaaaagaaatcccatcaatggtgtggacacctgaacaatttgatatt
253 L D H I R G T Y N H V H G K E I P S M V W T P E Q F D I
14945
taertaaaaatgtatgaaaagaaaccagtatataaatag 14983 281 Y L K M Y E K K P V Y K 1820RF009 8765 gtgctacttaaacgttatattgaaagtttcacttattaccaacctgaattatctcgaaaagaacgtattgaagttggccgaaaa 1 V L L K R Y I E S F T Y Y Q P E L S R K E RI E V G R K 8849 caattgtttgattttgattatccgttttatgacgaaacaaaacgagcagaatttgaaacaaaat ttatcaatcacttttacttg 29 Q L F 0 F 0 Y P F Y D E T K R A E F E T K F I N H F Y L 8933 agagagataggctcagaaacgatgggatcatttaagtttaatcttgacgaatatttaaatctaaacatgccctattggaataaa 57 R E I G S E T M C S F K F N L 0 E Y L N L N M P Y W N K 9017 atgttcctatcaaatcttgaagagtttccgatttttgatgacatggactacaccattgatgagaaacagaaattgttaaatgag M F L S N L E E F P I F D D M D Y T I 0 E K Q K L L H E 9101 attgatacaaacatcaaagcgaatcgtgatgaatcgaagaaccaaacgaagcaagtagatcaaacagacaacagaaacaaaaat 113 I 0 T N I K A N R 0 E S K N Q T K Q V D Q T D N nN K N 9185 acacgtgacacaggaacaaccgattctttctcaaggaacacttatacagacacccctcaaaaagatttgagaattgccagcaat 141 T R D T G T T D S F S R N T Y T D T P Q K D L R I A S N 9269 ggagatggaacagggtaatcaattacgcaacaaatatcacagaagatttgagtaaagaaacaacaagctccacaggcgttgaa 169 G D G T G V I N Y A T N I T E D L S K E T T S S T G V E 9353 acaaacaacgacaaaacaaatcaaaatacacgaagcaacgcttctgaaaaagaaacaaagaacacagacattaataaagatcaa 197 T N N D K T N Q N T R S N A S E K E T K N T D I N K D Q 9437 aatcaaaccaaagatacgattacacgatataaaggtaaaaagggaaacactgatatgctgacttactcgaaaaatatcgtaga 225 N Q T K 0 T I T R Y K G K K C N T 0 Y A D L L E K Y R R 9521 agtgttttgagaattgagaaaatgatctttagagaaatgaacaaggaaggcttatttctcct tgtttatggagggaggtag 9601 253 S V L R I E K M I F R E M N K E C L F L L V Y G G R 1820RF010 1310 ttacg aaattaagagtaacaqttaaaaa tcqtatcaaaar aaa ce 1 L T V R IS K NODR A K L E K I Y G K S N K A R K K Y N 1394 cgt ttaagacaaaaaggagt tgaggaaaggcaact tccaactgt tccaacatcaaagaaaagact tattgactacgtaaaatca 29 R L R Q K G V E E R Q L P T V P T S K K R L I 0 Y V K S 1478 acaaatatgagtcgtagtgattt taacaagatgt tagacgagc tggtagat tt tgcacaacct t acaacgagaat taaatti et 57 T N M S R S 0 F N K M L D E L V 0 F A Q P Y N E N 
I F 1562 gagatcaacaagcgaaatgttgcaatctcaagagcgcaaatcaaagaagcgcaaattaaaacagagcaagc'Caaaaagcgaaa E I N K R N V A I S R A Q I K E A Q I K TI E Q A Q K A K 1646 gaagaacactacaaagagctaacaaagttgaagttaagaagcccacagaaaacacaattgtcacaccaactattttaacagag 113 E E H Y K E L N K V E V K K P T E N T I V T P T I L T E 1730 ttaggtgctgact tacct tt caagcaataccagat t ttaat attgacgctt tcactt ct ccagaaggagt tcagt ct tatt ta WO 00/32825 PCT/I B99/02040 322 141 L G A D L P F Q A I P D F N I D A F T S P E G V Q S Y L 18 14 gaaaatataggaaaacaagacgaacaatattttgacgaaagagaccaactttattacgacaat ttcagaCaagcgatgtttact 169 E N I C K Q 0 E Q Y F 0 E R D Q L Y Y D N F R Q A M F T 1898 attcaattcagacgctgacgatattgttcgtttacttgactcaatggggcttgatctatttatgaaaacatatgttagtaac 197 I F N S 0 A D D I V R L L D S M G L 0 L F N K T Y V S N 1982 ttcttagacatgaaccttgactacatttatgacgaagcagaagtacaacagaaaaaagaacaagt ttacagtaagattgcaaaa 225 F L 0) N N L D Y I Y D E A E V Q Q K K E Q V Y S K I A K 2066 gtgatcgagtctgaaacaggtggagaagtccctcatataaccccacgaagaacatcacaattaattcagaaacaggagaagaa 253 V I E S E T G G E V P S Y N P T K N I T I N S E T G E E 2150 ttatga 2155 281 L 18 20R70ll 9607 atggtagattttaaccccgacaagcggtttgacggtttacccgctgtattcaaagaacgctttagcaaatatcctcatactgaa 1 N V O F N P0DK R FODG L P A V F K E R F SK Y P HNT E 9691 t acagatatgaattacta agatgaagaagt at cggct taattgcctat ctaatgaagt tggtgctt tagt taatgatatg 29 Y R Y E L L L 0 E E V S A L I A Y L N E V G A L V N D N 9775 agtggttatttaaattactttatcgaacattttgttgagaagttagaagagatcacaaatgacacactcaaaaaatggttgtct 57 S G Y L N Y F I E N F V E K L E E I T N D T L K K W L S 9859 gatggtacgttagaaaatttaatcaatgatactgtttttgcaaattatatcaaagaaatcaaaagattacaaatcttggttgct 0 G T L E N L I N 0 T V F A N Y I K E I K R L Q I L V A 9943 gaaacacgtgctaacagtgtgaatattcttttgacaaaaaataaaccggatgttgctgatgatcgaacattttggtataagatt 113 E TR A NS V N I LL T K N K P D V A DOD R T F W Y K I 10027 caacgcgacaatactgattatggagccgatcctattgacacgttacgtattgttgcaatcaataaagttagtggctggaatacc 141 Q R D N T D Y G A 0 P I D T L R I 
V A I N K V S G W N T 10111 gctacaggagatatttatcttaacattaaaggaacggagggtgtataa 10158 169 A T G D I Y L N I K G T E G V 1820RF012 10872 atggcaaat aaaaat at tcaaatgaaggat agcaat gacaat aat ttat at ccaagtgt tcgagcagaaaac ttgt tagat tg 1 MA N K N I Q M K O S N D N N L Y P S V R AE N L L D L 10956 accagtcgtgctgaattaacaatgacaaattgtcaattatatgcagctggtgataaaacaaatgcaatctcttatctcggtgca 29 T S R A E L T M T N C Q L Y A A C 0 K T N A I S Y L G A 11040 gt aggtatgct cgaaggt atga taaagt ttact gaaagt t tgacaaacc ctgt gat cacaacgc taccagaaggtt t tagacca 57 V G M L E G N I K F T E S L T N P V I T T L P E G F R P 11124 ataagaacaaaacgtattggttgtttcgcaaaatattacacaccaaatccaacagatacaaaagaaatggtttatgtatcaatc I R T K R I G C F A K Y Y T P N P T D T K E N V Y V S I 11208 acacctgatggcaaagtaactgtaaatgacaatgtaggtaaaatcgaatatctatccctagataattgcgrtttrccctctaaaa 113 T P D G K V T V N 0 N V G K I E Y L S L 0 N C V F P L K 11292 taa 11294 141 1820RP013 10456 atggcagataaaaatattcaaatgcaggataaagatcataatcgtttaatgcctgttacaattgctaaaaatgttctaacaggc 1 MAOD K N I Q N Q O K O H N R L N P V T I A K N V L T G 10540 gactctaatcttgaattagttaatgctgaaataagaggtaacgctagtgaagctaaaacacttgcacaacaagctaaagaaact 29 D0S N L E L V N A E I R G N A S EA K T L A 00Q A K S T 10624 gctgctggtttgtcaacagaaattgacacagtaacatcaaccgcaaatcaagcgttgacgaaggctggtacagcacaacaaacc 57 A A G L S T E I 0 T V T S T A N Q A L T K A G T' A Q Q T 10708 gcagaacaagcgaaaacaacagcaaacagtatcagcgcagttgcaacggcagctaaaaacacagctgattcagcacaaaaaagt A E Q A K T T A N S I S A V A T A A K N T A D S A Q K S 10792 gcaactgatctagctgttcgagtaagcagtttagaggacacagcaatacaatatactgtattaccatag 10860 113 A T 0 L A V R V S S L E 0 T A I Q Y T V L P 1820RP014 13716 atgatagaatatatcacacaatggt tggcagatgataat cat ct tgt ttatggtt gatat atggttaatggt tgcaatgatt 1 N I E Y IT Q W L AO O N N L V Y GCL II W L N V A MNI 13800 atcgattttgtgttaggttttacaattgccaaatttaacaaggaaatcgactttagtagtttaaagctaaagcaggtatcatt 29 1 D F V L G F T I A K F N K S I 0 F S S F K A K A G I 1 13884 
gttaaggtggcagaaatggttttagtggtactttattcctgtagcagtaaaattcggtgcagtaggtattacaatgtatata 57 V X V A E N V L V V Y F I P V A V K F C A V G I T N Y I 13968 acaatgttggttggtttgattttatcagaaatttatagtatactaggacatatttcagatatcgagagaaataatggact T N L V G L I L S E I Y S I L G H I S 0 I 0 D D N N W T 14052 gattatgttaagaagtttttagacggaacactcaacagaaaggacgatattaaatga 14108 113 D Y V K K F L 0 G T L N R K 0 D I K 18 2ORFOI1S 854 atggaaatcgtaaaaagcacat t gacacacaaacaccagaaggaatgttacaagtattcaatgccacaaacggggcttcaatt 1 MS IrV K S T F 0 T Q T P EG N L Q V F N A T N G AS I 938 ccgttacgtaacgcaat tggcgaagtactagaat tgaaagatattct agtt tact cagacgaagt ttctggt tt tggtggagcc 29 P L R N A T r. P. V T. R T. K' D T L. VYSE V S G G C A 1022 gaaccatcacaagcagaactags cgctttcttcacagaagatggtaaaacttatgcgggtgtat cagcagtagcaacaaaatca 57 E P S ASE LV A F F T ESO G K T Y A G V S A V AT K S 1106 gctaaaaacctaattgatatgatgactgctaaccctgacatcaaaccaaaaatttcttttgtcgaaggaaaatcaaacggtgg.a A K N L IO N T AN POX K P K I S F V E G-K S _N G 1190 caaaaatttgtaaatctacaagtggtttcactgtag 1225 113 Q K F V N L Q V V S L 1820RP016 17033 atgattaacaatttatcattaattttagagggtttaaatcaactaactaaagatgacaacgatagtttagcgt ctatcaagtca 1 N I NN L S L I L EG L N Q L T K DODN O S L AS I1K S 16949 gaaataacacaaggaggaaaacaattaattttatacattgattacgttacaaaagagttcgtgttaacacatgataaatataac WO 00/32825 PCT/I 899/02040 323 29 E I T Q G G K Q L I L Y I D V V T K E F V L T H D K V N 16865 tatgtttatcttgatagccattgcattaatatcgcaataacgaaatcaatgaaaagcgttgaacactatgcggaacaattgaaa 57 Y V V L D S H C I N I A I T K S M K S V E H V A E Q L K 16781 catgacggatataaacaaattacggacaaatag 16749 H D G V K Q I T D K 1820RP017 154 atgaaat at tcactacaacaaat agatgaaat taaatc aacaatt tt cagaat tagat taaaaaggcatgaactagaggaat tg 1 M K V S L. 
Q Q I DE I K S T I F R I R L K R H E L E EL 238 gtggacgaagtaaacgatattgtaaagatccggaggaaagaacttatcgtttattacacagaagaagaacgtttgttt 29 V D E V N D I A K D P E E R Y L L S F Y V T E E E R L F 322 gaaat tccctctgcaagattaatagattattacaacgaaaagatcacaaat ctgaaatcggaaat catat cactcgaaaaaaga 57 E I P S A R L I D V V N E K I T N L K S E I I S L E K R 406 ttacaaaaactagtaaaataa 426 L Q K L V K* 1820RP018 16737 atgattcacgaacattcaaagaacaccgcgaactaattgaatggttacgtttctactgtaaacgtaacctttcagacaatgaa 1 H I A R T F KE H R E LI E W L R F V C K R N L S D N E 16653 aaaatagagatcatagaggggactttacaagatttcgacgttccggaaataaatatcaccgaacttttgttaactcattcaacg 29 K I E I I E G T L Q D F D V P E I N I T E L L L T H S T 16569 ctat tacccgaatcgagt caat ttaacatitcttgaaaagtat tgt caggcaatgaaattagt aacttcat acgtaaaagttggt 57 L L P E S S Q F N I L E K V C Q A M K L V T S Y V K V G 16485 tctcgctatcagttagcgttacaaataccaaaaggctatttaaaggaggtggaataa 16429 S R V Q L A L Q I P K G V L K E V E 1820RF019 4323 atggaaattaaagaacatgaatcaattttaaatggtattcttgaaagtgtcacagacggtgaagcaagatcaaagattgtagaa 1 M E I KE H E S I L N G IL E S V T D GE AR S K I V E 4407 catctigaagcattgcgagaagactacggagcaacaactgaagctttgacatcagcaaatagcacacttgaaaagttaaagaaa 29 N L E A L R E D V G A T T E A L T' S A N S T L E K L K K 4491 gat aacgaagcgt tggt tatt c aaact caaaat tgt tc cgagaaegagcgat cgt agaac cagcagaaaat aacgaaccagaa 57 D N E A L V I S N S K L F R E R A I V E P A E N N E P E 4575 acagaccagaatattacactagacgatttaggaatttaa 4613 T D N IT L D D L GI 1820RP020 10158 atggcagacattagaacacaactaacaagtgaagatggatcagacaatttatttccaatttcaaaagccgttaatattatgact 1 MA D I R T Q L T S E D G S D N L F P I S K A V N I M T 10242 aatagcggtacgaatgtagaaggagaattgggtacactcaaacaaaatgacgaaacaatgaaiacctcagttcaaaatgctgta 29 N S G T N V E G E L G T L K Q N D E T M N T S V Q N A V 10326 gttactgccaatcaagcaaaagattctgtagctgaattaaatgtaaatgttggtaaactaaccaatcgaataacaacattagag 57 V T A N Q A K D S V A E L N V N V G K L T N R I T T L E 10410 agtacagtggctaatcttgatggtattcgttatgtagaggtgtaa 
10454 S T V A N L D G I R V V E V 1820RF021 17339 atgaacaataaatcattaatagctgaaaaaggagaggtatctctacttcacccctttaatgagtgggatatgaattatcatatc 1 M N N K S L I A E K G E V S L L H P F NE W D M N Y H I 17255 atagataccgaaaacaataaacattatcttattgatattgatgaggtaggcgatgaggaatattgtttgttatcttttgaagaa 29 I D I' E N N K H V L I D I D E V G D E E V C L L S F E E 17171 ctaaaggaattagatatggatcttatttccgagtattcatggaaaactacagaaataacatattaa 17106 57 L K E L D M D L I S E V S W K T T E I T V 1820RP022 12868 gtgggttgtctaatgctaaagctgaaacgttggaaggtcaagcagagatcatcgctcaaggggataaaacaggtcaatggatgg 1 V G C L M L K L K R W K V K Q R S S L K G I K Q V N G W 12952 ataatacacctgtttcttctgcaggttatactaaccctcagacccttt cagcatttaaacaatctgcaaatattgatgttgcta 29 I I N L F L L Q V I L T L R P F Q H L N N L Q I L M L L 13036 caattaattttatgtgtcactgggaacgccctggtaaacttcatatcgaagaaagacttgat cttgcacaagcttatagtaagc 57 Q L I L C V T G N A L V N F I S K K D L I L H K L I V S 13120 atattgacggtagcggtggcggtggcgtaa 13149 I L T V A V A V A 1820RF023 12189 atggttgttgt tttggacatgCaagt tatgcgcatat cctcgataataacttacgccacct tcgattgtgt taccagaaatt tc I M V V V L D M Q V M R IS S II T V A T F D CV T R N F 12105 acagaaattaattacattctgataatcatcgtcattgtcgataatgatcgctgtacaaaaatgaatacggttgtttttcacaaa 29 T E I N Y I L I I I V I V D N D R C T K M N T V V F H K 12021 gaaacctctaaaacctgtacccctagtattgatatcgtt cccttgccacataccatttacat cgggaaaagctgttttgataat 57 E T S K T C T P S I D I V P L P H T I V I G K S C F D N 11937 cgcccgagagatactagagaacag 11914 C L R D I R E 1820RF024 6174 atgcttgtaactatctcatctttaaaaacgaagaaacttatcctagtaaatggcagtatgcctttgttactgatattgaatata 1 M L V T I S S L K T K K L I L V N G S M P L L-L I_ -L-N I 6258 agaatgacaacacaagtttcgttaccrtttgaaattgatgtttcacaaacttatcgtttcgatattggtatacgagaaagtttca 29 R M T T Q V S L P L K L H F V K L I V S I L V V E K V S 6342 ttgcaaaagaacaccctcaacttattattcgaatggaatacctttcattaatacaattgaagagtcgcttgattacggtagag 57 L Q K N T L N F I I R M E V L S L I Q L K S R L I T' V E 6426 aatacacaacaacaaatgtaa 6446 N T 
Q Q Q M WO 00/32825 PCT/I B99/02040 324 1820RF025 548 atgggtcgaaaactaatgcaacgaaacgraacatcaactaaagtagaattctcagaagttatcgracaagatggagcgccaaca I M G R K L M Q R N V T S T K V E F S E V I V Q D G AP T 632 attgcaccatgcgaaccagttgtcttaacaggaaaactttcagaagaaaaagctttatcagcgatcaaacgtaaaaaccctgat 29 1 V P C E P V V L T G K L S E E K A L S A I K R K N P D 716 aaaaacgtagttgtaacaaatgtttcacatgaaacagcgctttacacaatgccagtcgataaatttatcgagttagcagacaaa 57 K N V V V T N V S H E T A L Y T M P V D K F I E L A D K 800 tcaacacaagcctaa 614 S T Q A 1820RF02 6 13259 atggaaattatttggtctgccgtttcctgcatgcgtgceaaaaagttgtccactcatgaaacttttaggatcaagatttgtatt 1 M E I IW S A V S C M R A K K L S TH E T FR I K I C I 13175 cttgattggggttccatagcaacgttttacgccaccgccaccgctaccgtcaatatgcttactataagcttgtgcaagatcaag 29 L D W G S I A T F Y A T A T A T V N M L T I S L C K I K 13091 tctttcttcgatatgaagtttaccagggcgttcccagtgacacataaaattaattgtagcaacatcaatatttgcagattgrtt 57 S F F D M K F T R A F P V T H K I N C S N I N I C R L F 13007 aaatgctga 12999 K C 1820RP'027 14896 atgaacatgattgtatgttcctctaatatgatcgagttgtgttggatcagacaattcaaagatcgtttctccgtcaaaataaaa 1 M N M I V C S S N M I E L C W I R Q F K D R F S V K I K 14812 cacgcttgtgctacctgttgtaagagtgaaaataaactcatcacttgtggcgtcaaatattttattttcaggaggaacaatttg 29 H A C A T C C K S E N K L I T C G V K Y F I F R R N N L 14728 atcctgtrtrtacctacatacagatcccatgtattaccatcgccatagaaaacattcaaatcaagattgccgttgtatcctgg 57 1 L F F T Y I Q I P C I T I A I E N I Q I K I A V V S W 14644 taa 14642 1020RF028 14430 atgttataataaaacaggcgttaaagcatggttttatacgtatacagcaaacctcaatacaactgatttttctagtattgcaa 1 M F I IK Q A L K H G F I R I Q Q T S I Q L I F L V L Q 14514 aaggcgat tatggt ttatgggt tgctgaat atggat caaat caaccacaaggctact ct caaccagcgccacct aaaacaaata 29 K A I M V Y G L L N M D Q I N H K A T L N Q R H L K Q I 14598 attttccaattgttgcctgttttcagtttacaagtaaaggacgtttaccaggatacaacggcaatcttgattga 14672 57 I F Q L L P V F S L Q V K D V Y Q D T T A I L I 1820RF029 17606 
atgaatgaaccgatcgtatacacagaaatttattcaaataacgtggtatgtatgaaaatttttagagatgaggataaacttagt 1 M N E P1I V Y T E IY S N N V V C M K I F R D E D K L S 17522 aaattcctctatttagaatttgaggtggatgaggctaaaaagttacttgaaaataaaacaatttcatttgatgataactggact 29 K F L Y L E F E V D E A K K L L E N K T I S F D D N W T 17438 ttctcaataaattatccagaatattaa 17412 57 F S I N Y P E Y 1820RF03 0 16429 atggctacattctacaaggaaccaatatacgatatcacagtattttatatagatggt tgggaggttttgatacacaaaaccgaa 1 MA T F Y K E P IY D I T V F Y I D G W E V L I H K T E 16345 cctctcacrtaacaaaagcattaaaatatagccgtatatacctagaaatggatatagtgaattgcgttagaatagaaagaaat 29 P L T L T K A L K Y S R I Y L E M D I V N C V R I E R N 16261 ggacgtcctatagctacattttacagggaattattaaaactgtataaggagaaagaactatga 16199 57 G R P I A T F Y R E L L K L Y K E K E L 1820RF031 8603 atgtcacctgaacttcaatctgtcattgttagataagacttctgatgtttgtacacgtgcagtcttatctacgttagcattg I M L P E L S IC S L L D K T S D V C T R A V L S T L A L 8519 ttgatacctagaaaagttaacacttcattccatacttcgttcaattctgatcgtagtttatctactacatatggagcasttgtt 29 L I P R K V N T S F H T S F N S D R S L S T T Y G A F V 8435 tgccatacattaaaagattcgtcaaactccatatctttatccacaaaaacagcctga 8379 57 C H T L K D S S N S I S L S T K T A 1820RF032 11413 atgtt cat caaaaacaact tgt t tcgggt tcgtt tcagggtgcaat aggt aattct t tgasst ct tct t tcat cttgt tca 1 M F H Q K Q L V S G S F Q G A I G N S F D F L L S S C S 11329 tttgaatatcaattcgttcttccatatgaacctccttattttagagggaaaacgcaattatctagggatagatattcgatttta 29 F E Y Q F V L P Y E P P Y F R G K T Q L S R D R Y S I L 11245 cctacattgtcatttacagttactttgccatcaggtgtgattgatacataa 11195 57 P T L S F T V T L P S G V I D T 1820RF033 342 tgzaaca&aattctatg cacaagagt ctt LLU LL Lac-- 1 M S T K I SS I V R P K G M F P F L N I F K G L R Q D L 4858 tatcggataact actt accaatacggt caact aaagt tgaaataaat tcgst tttact acgt ctaaacgtgtgat ccctgca 29 Y R I T T L P1I R S T K V ElI N S FF T T S K R VI P A 4774 ccaaccgcttcgatgttatctgcaittggcataggtacgttcgcctga 4727 57 P T A S M L S A F C I C T F A* 1820RF034 6160 
gtgtttatctactctaaaaactcccccgagttgtgtatccctttgataagaacaatctctattctcgttaagaacaggaaacga 1 V F IY S K N S P E L C I P L I R T I S I LV K N R K R 6076 attaaagtacgattcctgttcctgtcgagttttaaaccatcttgtgtgtgtataggtgttatcaaaaggcacgttagccaacaa 29 I K V R F L F L L S F K P S C V C I G V I K R H V SQQ 5992 ttacatttgtataccttcttgccataattgtcctccttag 5951 WO 00/32825 PCT/I B99/02 040 325 57 F Y I C I P S C H N C P P 1820RF035 15758 atggcgcataagaaactaCtatttttacttctCttttcaataaaCgtatcactatcattgacaaactcattgttgatactaaaa 1 M AX K K L, L F L L L F S IN V S L S L T N S L L I L K 15674 ccttcgtattctgttCC8CgaatcaatctacCaaaaggtgtttctCtcttcacttcrgcaaagtctttgaatcacacaattca 29 S S Y S V P R I N L P K C V S L F T S A K S F E S H N S 15590 arcaacatacctcgatcttga 15570 57 I N I P R S 1820RF036 2315 acgtctgtgctgccttgcattttacaccactcaaaaaaagaatcgatttctaaaccgaacgcatattgtcaacgttgtctaa 1 M S V L PC I L H H S K K E S IS K P N V I L S T L S I 2231 tcgcacacgccccacgaccatacacgacaatcgttgagatcagttgttgtttcaaagtcgccagatatttctaacataatt 29 S H T P H D H T R Q S L R S V V V S K S P V Y F L I I I 2147 ctcctcctgtctctgaattaa 2127 57 L L L F L N 1820RF037 12280 gtgagteacgacaacaaacarctacatcaatataagcttgatccacatcttgaaactcaaacaaagcgttcattccgtatg 1 V S Y O N K H L H Q Y K L 0 P H LE T Q T K R F Y F R M 12196 ctagaaaatggttgttgttttggaCatgcaagttatgcgCatatcctcgataataacttacgccaccttcgattggttaccag 29 L EN GCCC F C HA S Y A H I L O N N L R H L R L C YQ 12112 aaattccacagaaattaa 12095 57 K F H R N* 1820RF038 14769 gtgatgagtttattttcactcttacaacaggtascacaagcgtgttttattttgacggagaaacgatcttgaattgtctgatc 1 V MS L F S L L Q Q VA Q A C F I L T E K R S L N C LI 14853 caacacaactcgatcatattagaggaacatacaatcatgttcatggaaaagaaatcccatcaatggtgtggacacctgaacaat 29 Q H N S I I L E E H T I N F N E K K S H Q W C C H L N N 14937 ttgatatttacttaa 14951 57 L I F T 1820RF039 9992 atgttgctgatgatcgaacattttggtataagattcaacgcgacaatactgat tatggagccgat cctattgacacgttacgta 1 M L L M I E H F GCI R F N A T I LI ME P I LL T RY V 10076 
ttgtrgcaatcaataaagttagtggctggaataccgctacaggagatattratctcaacattaaaggaacggagggtgtataat 29 L L Q S I K L V A C I P L Q E I F I L T L K E R R V Y N 10160 ggcagacattag 10171 57 C R H 18 20RF04 0 16202 atgagaaaagatttcgtctacattaacacacccgatccaaaagcaaacaaaaaggcgttagcaaaaatcacaacgccaaagaa 1 M R K DOF V Y IN T P D P K A N K K A LA KI T N A K E 16118 ccaaaacaaaactatcgcagactacaattactatgttatctactatccatcattgraaragaactaatcgggtagctctacta 29 P K Q N Y R R L Q L L C Y L L F I I V I E L I V V A L L 16034 aaatag 16029 57 K* 1820RF041 3886 atggaactatataaagcaatgtttatcgtacgtgatgaaggtactattgacggttacgatactgaacacagtagatatttct 1 M EL Y K A M F I V RODE GCT I D C Y D T E H Y V l D 3970 ttacatgacttrgaagaaatatatggaaaagaaacacgtgaaactgaagcagtaacattagaaaaacagaaatttaaaaaaa 29 L H 0 F E E I Y G K E T R E I E A V T L V K T C N L K K 4054 taa 4056 57 1820RF042 10832 gtgtCCtctaaactgcttactcgaacagctagatcagttgcacttttttgtgctgaatcagctgtgttttagctgccgtgca 1 V S SK L L T R T A RS V A L F C A E S A V F L A A V A 10748 actgcgctgatactgtttgctgttgttttcgcttgttctscggtttgttgtgctgtaccagccttcgcaacgctga 10671 29 T A L I L F A V V F A C S A V C C A V P A F V H A* 1820R704 3 10652 gtgtcaatttctgttgacaaaccagcagcagtttctttagcttgttgtgcaagtgttttagcttcactagcgttacctcttatt 1 V S IS V O K P A A V S L A C C A S V L AS L A L P L I 10568 tcagcattaactaattcaagattagagtcgcctgttagaacatttttagcaattgtaacaggcataaacgattatga 10491 29 S A L T N S R L E S P V R T F L A I V T C I K R L* 1820R17044 6457 atgaaaagttgttacatttgttgttgtgtattctctaccgtaatcaagcgactcttcaatrgtattaatgaaaggtattccatt 1 M K S C Y I CC C V F S T V I1K R1 L F N C I N E R Y SI1 ta713 625 29 R I I K L R V F F C N E T F S Y T N I E T I S L* 1820R17045 6729 atgaatggtataccrgtatacgacgttacatacatcccgaccatcttartttaaaaaaggttctttcgttgtaagaaacgcctg 1 M N GI P V Y 0 V T Y IP T I L F K K C S F V-V R. 
Wj M 6645 tactctccaaaatcagcatrgccrgccccatttggrttgtatacctccccacctgaattgacaggaagcaaacaa 6571 29 Y S P K L A L P A P F C L Y T S P L E L I C S K 1820RF7046 2372 atggtttcaaatggtgtaaagaagcaaaagaagatcgaacartctccacactcataccaaaratgggcaatggagctttgg 1 M V S N C V K K Q K K I E H S P H S Y Q IW V NCG M L W 2456 aaatttgttgggaagtraactacacaacaacaaaatcaggtaaaacgaaaaaagagaaatctcgaacaaaa 2527 29 K F V C K L I T Q Q Q N Q V K R K K R1 N L E Q WO 00/32825 PCT/I B99/02040 326 1820RF047 13353 atgctcccattgttccaacatgtgttactgttccatcgcaacatgcaatcatttcattgccagggtgatcaattgaaccaaagt 1 M L P L F Q H V L L F H R N M Q S F N C Q G D Q L N Q S 13269 ccaaaccatcatggaaattatttggtctgccgtttcctgcatgcgtgccaaaaagttgtccactcatga 13201 29 P N H H G N.1Y L V C R F L H A C Q K V V HS~ 1820RF048 3395 attagtttcgattctcacatacaactgctactttgcctgtggt 1 M S G F V P N F P Y K L F N I P L A L A F LAP S V V F 3311 tttacttcgatccatttatcgatccagcctttgaacatatcacaagaagctttgaacatatatccgtaa 3243 29 F T S I H L S I Q P L N I S Q E A L N I Y P 1820RF049 1578 atgttgcaatctcaagagcgcaaatcaaagaagcgcaaattaaaacagagcaagctcaaaaagcgaaagaagaacactacaaag 1 M L Q S Q E R K S K K R K L K Q S K L K K R K K N T T K 1662 agcttaacaaagttgaagttaagaagcccacagaaaacacaattgtcacaccaactattttaa 1724 29 S L TF K L K L R S P Q K T Q L S H Q L F 1820RF050 8012 atggttatcttggtttctttaaagaccctacacttgggttcatggtttgcgcaggggcagaagatggtcaaatcgatcattatc 1 M V I L V S L K TF L HXL G S W FA Q G Q K M V K S5I1I1 8096 acaaccctattttctttacagcaaacgaagcaatgtatcacaagagatatcctgttttaa 8155 29 T T L F S L Q Q T K Q C I T R D I L F 1820RPOSI 9390 atgcttctgaaaaagaaacaaagaacacagacattaataaagatcaaaatcaaaccaaagatacgattacacgatataaaggta 1 M L L K K K Q R T Q T L I K I1K I1K P K I R L N D I KV 9474 aaaagggaaacactgattatgctgacttactcgaaaaatatcgtagaagtgttttga 9530 29 K R E T L I M L T Y S K N I V E V F 1820RF052 4096 gtgatagttgacaagagtcaaatttggcgagattgggcgaatgtacacgtgaaatatcgtgcgctcccgttaagttatggacac 1 V I V D K S Q I1W R D W AN V H V K Y R AL P L S Y G H 4180 
ataaacgttttgaccgtcaaccaatcgcaaaaaccttttaggagtagcccttaa 4233 29 I N V L T V N Q S Q K P F R S S P 1820RF053 15656 gtggaacagaatacgaagat tttagtatcaacaatgagtttgtcaatgatagtgatacgtttattgaaaagagaagtaaaaata 1 V E Q N T K I L V S T M S L S M I V I R L L K R EV K I 15740 gtagtttcttatgcgccattgcttttgaagggaaaatctttgggcattggatag 15793 29 V V S Y A P L L L K G K S L G I G 1820RF054 8136 gtgatacattgcttcgtttgctgtaaagaaaatagggttgtgataatgatcgatttgaccatcttctgcccctgcgcaaaccat 1 V I HC F V C C KE N R V V I MI D L T I F C P C A N H 8052 gaacccaagtgtagggtctttaaagaaaccaagataaccattagtgtgtaa 8002 29 E P K C R V F K E TF K I T I S V 1820RF055 8324 atgaaaagaaat actt ctcat tgct acaagct tat aaccaaat tgacgaaaat aat caggctgtt t ttgtggat aaagatatgg 1 M K R N T S H C Y K L IT K L T K I I R L F L W I K IW 8408 agtttgacgaatcttttaatgtatggcaaacaaatgctccatatgtag 8455 29 S L T N L L M Y G K Q M L H M 1820RFOS6 6549 gtggcccat ctcctttttCctattatttacttcctatcaattcaagtggggaggtataCaaaCcaaatggggcaggcaatgcta 1 VANH L L F P I1 Y F L S I Q V G R Y T N Q M G Q A M L 6633 attttggagagtacatggcgtttcttacaacgaaagaaccttttttaa 6680 29 I L E S T W R F L Q R K N L F 18 20RY0 57 8264 atgtccgccatatctaaagcaaaacgatgtaaacttggtaacgtaggaactttcaagt cattat tatacaacatgatacatttt 1 M S A I S K A K R C K L G N V G T F K S L L Y N MIXH F 8180 gatttatcatcatcatcatcatatcttaaaacaggatatctcttgtga 8133 29 D L S S S S S Y L K T G Y L L 1820RP058 5176 gtgt att caaat tcgct tact tcgt cacctgtgt ataaagcgt tcat tacaccagcaacgaaact at tgaaat tat cccatgaa 1 V Y S N S L T S S P V Y K A F IT PA T K L L K L SH E 5092 gtaaatgctttttctaaccatgcttcttggatcgtttgtttgtag 5048 29 V N A F S N H A S W I V C L* 182ORF05 9 15876 atggtctttcgtagtcattgcataaaaatgatttgtatttggttgataatcataactcacatagacacaacctgtttcagcgtc 1 M V F R S H C I KM I C IW L I II TNH I D T T C F S V 15792 tatccaatacccaaagattttcccttcaaaagcaatggcgcataa 15748 29 Y P I P K 0 F P F K S N G A 1820RP060 15404 gtgatttttgatttctcaattaaaaactcatcaaacaaaattgtacgaacttcgggatattcattagatttttcaattccccac 1 V I F D F S IK N S 
S N K I V R T S G Y S L D F S IP H 15320 gtactaagtggaacagcccaacccattaatttatcatcacaatag 15276 29 V L S G T A Q P I N L S S Q 18 20RF06 1 2102 atgaggggacttctccacctgrtttcagactcgatcacttttgcaatcttactgtaaacttgttCttttttctgttgtacttctg WO 00/32825 PCT/I B99/02040 327 M MR G L L H L F Q T R S L L Q S Y C K L V L F S V V L L 2018 cttcgtcataaatgtagtcaaggttcatgtctaagaagttactaa 1974 29 L R H K C S Q G S C L R S Y 1820RF062 1992 atgtctaagaagttactaacatatgttttcataaatagatcaagccccattgagtcaagtaaacgaacaaatcQtcagcgtct 1 M S K K L L T Y V F I N R S5S P 1 E S S K R T I S S A S 1908 gaattgaaaatagtaaacatcgcttgtctgaaattgtcgtaa 1867 29 E LK I V N I A C L K L S 1820RF063 14306 gtgC acct tct aaacccct ct catgcgcaaaatgat acacaccaat ct ttt tacctaaagacaaagct tgt tgaaatgCt Cggt 1 V Y L LWN P S H A Q N D T H Q S F Y L K T K L V E M L G 14222 cacaatcagggtttacataacctgttccgcctgttgCtttaa 14181 29 H N Q G L H N L F R L L L 1820RP064 7356 atgatgt tagt caaaccaacaaaagggt tgt tacttgct aaggctgaaaagatc -ct cct cctgt actcattgcactgt tt ccc 1 M ML V K P T K G L L L A K A E K I A P P V L I A L F P 7272 ataccatgtctgaaagtattgcgaatgttttgctcttga 7234 29 I P C L K V L R M F C S 1820RF065 3582 atgaatgct atctgtat cacaataaat aatgcgat caaaacatt t ttgagcggt tgtaatggt agtatat ct accccaagccgt 1 M N A I C I T I NN A I K T F L S G C NG S I S T P S R 3498 cacaaaactagcaagcggaacataaacaggatctcttaa 3460 29 H K T S K R N I N R I S 1820RF066 4234 atgtggctactcttttttgtgtttcacagaattatgtttcacgtgaaacagtttttatggtataatagaatcaaaaggaggtgg 1 M W L L F F V F H RI M F H V K Q F L W Y N R I K R R W 4318 agattatggaaattaaagaacatgaatcaattttaa 4353 29 R L W K L K N M N Q F 1820RF067 13882 atgatacctgctttagctttaaaactactaaagtcgatttccttgttaaatttggcaattgtaaaacctaacacaaaatcgata 1 M I P AL A L K L L K S IS L L N L A I V K P N T K S I 13798 atcattgcaaccattaaccatataatcaaaccataa 13763 29 1I1 A T IN H I IK P 1820RF068 7267 atgt ctgaaagt at tgcgaatgt t ttgct ct tgagcaat caaggagtt tttgt tt cct tgcatgaatgcagaagcatagtcaga 1 M S E S IA N V L L L S N Q G V F V S L H E 
C RS I V R 7183 tttaactcctacatcgttaggatcattatcgattaa 7148 29 F N S Y I V R I I I0 1820RF069 5027 gtggaacaatgtttttacatcgggaacttcctgtttaaatacccctgtaacagactcgtcagggttgaacttatgttcctgtgc 1 V E Q C F Y I1G N F L F K Y P C N R L V R V E LM F L C 4943 aatgtcaacaaaaatttcttcaatcgttcgacctaa 4908 29 N V N K N F F N R S T 1820RF070 1031 gtgatggttcggctccaccaaaaccagaaacttcgtctgagtaaactagaatatctttcaattctagtacttcgccaattgcgt 1 V M V R L H QN Q K L R L S K L E Y L S I L V L R Q L R 947 tacgtaacggaattgaagecccgtttgtggcattga 912 29 Y V TE L K P R L W H 1820RF071 11741 atggttt tgcat tatggttgccacaaggcgct caaagtggtaaaggaat tttct t aatgat actcgcaattacaat cgttt tg 1 M V L H Y C CMH K A L K V V K E F S L M I L A I T I V L 11825 actttgatttgtttgttcgtaactgtactttaa 11857 29 T L I C L F V T V L 1820RF072 11723 atgtttacattaaatgccgtcattgtttcaaactttaatgtcgtttctcccgatcctaagaaagtaactacaggtacatcacgt 1 M F T L N A V IV S N F N V V S P D P K K V T T G TS R 11639 ttcaattcaatggtgttagcaaagcgataa 11610 29 F N S M V L A K R 1820RF073 2876 gtgaagccgcct ttgtatgctttacgtaagtctttatcaaaccctaaagacaaaataggaaaccattgtttgaaagt tgatttt 1 V K P P L Y A L R K S L S N P K D K I G N H C L K V D F 2792 ccatgtgtagcttttagccaatctttgtaa 2763 29 P C V A F S Q S L 8923 gtgattgataaattttgtttcaaattctgctcgttttgtttcgtcataaaacggataatcaaaatcaaacaattgttttcggcc 1 V I D K F C F K F C S F C F V I K RI I K I1K Q L F S A 8839 aactcaatacgttctttcgagataa 8813 29 N F N T F F S R 18 20RP075S 7463 gtgttacattatctggaatattttcgatatctgccactttacctgccaagaggttcaaaccgttttctttttcagaaacatagt 1 V L H Y L E Y FR Y L PL Y L P R G S N R F L F QKMH S 7379 tgtttacttgttgtcctgctcccatga 73S3 29 C L L V V L L P 1820RF07 6 2426 atgagtgtggagaatgttcgatcttcttttgcttctttacaccatttgaaaccatttttgaataaccatgaaagcataaactct WO 00/32825 PCT/I B99/02040 M MS V E N V R S S F A S L H H L K P F L N N H E S I NS 2342 ccgtcaaatttttcgttgtggaaataa 2316 29 p S N F S L W K 1820RF077 11858 atgaaggaacgtatgttgttgttgctagaggtagaggggttacatttgaaaattgtctattctctaatatctctcaagcaatta I M K E R M L L L L E V 
E G L H L K I V Y S L I S L K Q L 11942 tcaaaacagcttttcccgatgtaa 11965 29 S K Q L F P M 1820RP078 7671 gtgcctacaatatttggttcttttaatttaatgaaattccatgcttttcttgtttgtaagtttggtgtagctactcgattgctc 1 V P T I F G S F N L M K F H A F L V C K F G VA T R L L 7587 tttgtgccatacattgagaagtaa 7564 29 F V P Y I E K 1820RP079 7488 gtgaaagataagtttgatccaagctgtgttacattatctggaatattttcgatatctgccactttacctgccaagaggttcaaa 1 V K D K F D P S C V T L S G I FS IS A T L P A K R F K 7404 ccgttttctttttcagaaacatag 7381 29 P F S F S E T 182ORF080 4473 gtgrgctatttgctgatgtcaaagcttcagttgttgctccgtagtcttctcgcaatgcttcaagatgttctacaatctttgatc 1 V CVY L L M S K L Q L L L R S L LA M L Q D V L Q S L I 4389 ttgcttcaccgtctgtga 4372 29 L L H R L WO 00/32825 PCT/I B99/02040 Table 24 Sequence similarities phage 182 and public databases Phage: 182 Database: nr Query. sid111015611an1l82ORF0o1 Phage 182 ORF15966-778012 (604 letters) gi11381241pIP07S34VG9_.BPPZA TAIL PROTEIN (LATE PROTEIN GP9) 384 e-105 gi~l38l23Isp1P04331IVG9_BPPH2 TAIL PROTEIN (LATE PROTEIN GP9) 374 e-103 gi114292381gnlPIOlell?3412 (X99260) tail protein (Bacteriophag 346 3e-94 gi1215339 (M12456) p9 tail protein [Bacteriophage phi-291 >gij2. 208 8e-53 gijl18l9?01gnl1PID1e22l269 (Z47794) tail protein (Bacteriophage 62 8e-09 gi1ll8l9681gnl1PID1e22l26? (Z47794) tail protein [Bacteriophage 56 6e-07 gil25000301spIQ599681CARA-SULSO CARBAMOYL-PHOSPHATE SYNTHASE SM 49 8e-05 Query= sid~ll0l57Ilan~l82ORFOO2 Phage 182 ORF12152-387311 (573 letters) giJ1188481spIP19894IDPOL_BPM2 DNA POLYMERASE giJ768961pirI JQO. 665 0.0 gi114292301gallPIDlell73404 (X99260) DNA polymerase [Bacterioph. 657 0.0 gi1ll68491spIP036801DPOL BPPH2 DNA POLYMERASE (EARLY PROTEIN GP 654 0.0 gi~llaasl1sp1PO69so1DPOL7BPPZA DNA POLYMERASE (EARLY PROTEIN GP 654 0.0 gi115732 (X53371) DNA polymerase (AA 1-575) [Bacteriophage phi-29] 651 0.0 giJ15734 (XS3370) DNA polymerase (AA 1-575) [Bacteriophage phi-29] 651 0.0 gi1l5724791gnl1PI01e24230l (X96987) DNA polymerase (Bacteriopha. 
565 e-160 gill726561pirlI SS1275 DNA polymerase phage CP-1 >gi18365931g. 301 gilIS18471spjP22374IDPOM -ASCIM PROBABLE DNA POLYMERASE >gij8385 71 3e-11 gil4619621spIP33S37jDPOM-NEJCR PROBABLE DNA POLYMERASE >gil2833 65 le-09 gil4619631spjP33538IDPOM NEUIN PROBABLE DNA POLYMERASE >giI1018 62 le-OB gil0844871pirlI S41618 DNA polymerase slime mold (Physarum 61 3e-08 gi12435429 (AF012250) unassigned reading frame (possible DNA po 61 3e-08 gij578l571gnl1PID1e246743 (X52106) DNA polymerase [Neurospora i. 59 le-07 gi121479691pir1 1572369 probable DNA-polymerase Gelasinospora 58 2e-07 gi12l479681pirl 562752 probable DNA-polymerase Gelasinospora 58 2e-07 gi13511140 (AF061244) B type DNA polymerase (Agrocybe aegerital 57 3e-07 giIll885O1spIP1O479IDPOLBPPRD DNA POLYMERASE (PROTEIN P1) >gil 56 6e-07 giJ578144 (X63909) putative DNA-polymerase, B-type (Morchella 47 3e-04 gif2320131sp1P30322DPOM.1AGABT PROBABLE DNA POLYMERASE >gil3208 46 6e-04 Query= sidJll01S91lan1l82ORFOO4 Phage 182 ORF14626-595413 (442 letters) gill381l71spI1l38491VGB_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN 309 2e-83 gill38ll81sp1P0753l1V08_BPPZA MAJOR HEAD PROTEIN (LATE PROTEIN 305 3e-82 gi1l4292361gnl1PID1ell734l0 (X99260) major head protein [Bacter. 300 le-8O gi1ll819581gnl1PID1e221257 (Z47794) major head protein [Bacteri. 152 6e-36 Query= sid111016O1lan~l82ORF'005 Phage 182 ORF112651-1370013 (349 letters) giJ1379321spIP151321VG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE PR 52 Be-06 gi114292421gnl1PID~e1173416 (X99260) morphogenesis protein (Bac. 48 7e-05 gill379331spIP075381VG13_BPPZA MORPHOGENESIS PROTEIN 1 (LATE PR 47 2e-04 Query= sid~ll0l6l1lan1182ORFOO6 Phage 182 ORF114995-1602611 gi11379441spJP1l0l41VG16_BPPH2 ENCAPSIDATION PROTEIN (LATE PROT 402 e-111 gii1379451sp1P075411V016_BPPZA ENCAPSIDATION PROTEIN (LATE PROT. 402 e-111 gi1l4292451gnl1PI01e11734l9 (X99260) encapsidation protein [Bac. 381 e-105 giJ11819721gnlIPI01e221271 (Z47794) encapsidation protein [Bact. 
159 2e-38 Query- sid111016211an~l82ORF007 Phage 182 ORF17795-877511 (326 letters) WO 00/32825 PCT/I B99/02040 330 gijl4292391gn11P1Djl73413 (X99260) upper collar protein (Bact. 271 5e-72 gi 11379151 sp1IP075351IVGl0_BPPZA UPPER COLLAR PROTEIN (CONNECTOR 256 le-67 gi11379l41sp1P043321VGl0_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR 256 2e-67 sijll8lSGOjgnl(PIDje221259 (Z47794) connector protein (Bacterio. 148 6e-35 Query= sidllOS3lanIlS2ORP008 Phage 182 ORFI14lO5-1498312 (292 letters) gij42l0750jgnljP1Dje1374037 (AJ132604) LyaL protein (Lactococcu. 129 2e-32 9iI462559IspIP34O2OILYC_CLOAB AUTOLYTIC LYSOZYME (1,4-BETA-N-AC 75 8e-13 gij2327014 (U82823) putative lysozyme (Saccharopolyspora erycbr 64 2e-09 giI1266521spIP2S310ILYCM_STRGL LYSOZYME Ml PRECURSOR 60 2e-08 gil1277891spIP19386ILYCA_-BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE. 60 2e-08 giI6776l1pirIIMUBPCP N-acetylmuramoyl-L-alanine amidase (EC 59 3e-08 gi14105636 (AP049087( lys [Leuconostoc oenos bacteriophage 1OMC( 59 3e-08 gi1623084 (L,02496) muranidase; muramidase [Bacteriophage LL-N] 57 le-07 SiIl27787IspIPlSOS7ILYCA_BPCP1 LYSOZYME (ENDOLYSIN) (MURANIDASE. 57 2e-07 giI126S97IspIPOO721ILYCIL-CHASP N,O-DIACETYLMURAMIDASE (LYSOZYME 57 2e-07 9iI127788IspIP19385ILYCAB1PCP7 LYSOZYME (ENDOLYSIN) (MURAI4IDASE 57 2e-07 SiI67762IpirIIMUBPC7 N-acetylmuramoyl-L-alanine anidase (EC 3.5 56 3e-07 9 iI3025168IspIP76421IYEGX -ECOLI HYPOTHETICAL 32.0 CD, PROTEIN IN 53 2e-06 gi14204413 (AF047001( Lys44 [QenococcuB oeni temperate bacterio. 53 3e-06 gij21l6978jgnlIPIDjd1O20940 (088151) cortical fragment-lytic en 52 5e-06 9 i12392844 (AF011378( lysin [Bacteriophage ski] 48 Se-OS Query= sid1ll0l64Ilan1l82ORFOO9 Phage 182 ORF18765-960112 (278 letters) gi11429240ignllPI01ell73414 (X99260) lower collar protein (Bact. 180 le-44 gil13792l1sp1P043331VG11_BPPH2 LOWER COLLAR PROTEIN (LATE PROTE. 
171 Se-42 9i1215341 (M12456) p11 lower collar protein (Bacteriophage phi-29] 98 9e-20 gij224lS2jprfI 1011232B protein pl~lower collar (Bacteriophage 97 le-iS giJ535260 (Z30339) STARP antigen (Plasmodium reichenowi) 50 le-OS gij4049753 (AF063866) ORE' MSV230 hypothetical protein (Melanopl 49 4e-05 giJ2l3lSS71pirj 1570306 hypothetical protein YEL077c yeast (Sa 48 Se-OS gi11317821spIP127531RA50_YEAST DNA REPAIR PROTEIN RAD5O (153 10 48 7e-05 giI2l3l3O9jpirI 1S70305 hypothetical protein YBL113c yeast (Sa 47 2e-04 gij499325 (Z26314) STARP antigen [Plasmodium falciparim] 46 3e-04 giJ3845171 (AE001391) ribosome releasing factor (00, TP)([Plasm.. 46 3e-04 gil7319O31spjP4O434jYIR7_YEAST HYPOTHETICAL 197.5 10 PROTEIN IN 45 5e-O4 gijl6328291gnlIPID~e276379 (YO8924) AARP2 protein [Plasmodium f. 45 5e-04 giI117649O1spIP4O891YJWS_-YEAST HYPOTHETICAL 197.6 10 PROTEIN I 45 5e-04 gill773001pirlI1S51848 hypothetical protein HRD1054 yeast (Sa. 45 5e-04 gi12425143 (AF020407( WimA (Dictyostelium discoideun] 45 6e-04 giJll8l96ljgnlIPIDIe22l26O (Z47794) collar protein (Bacteriopha. 45 6e-04 gil21326571pirlIS648l9 probable membrane protein YLLO67c yeas 45 8e-04 gi)21330411pirlI585341 probable membrane protein YPR2O4w yeas 45 8e-04 giJ73O2751spIP397931PBPA_BACSU PENICILLIN -BINDING PROTEINS lA/1i... 45 Be-04 Query= sidJ1l16s1lanJl82ORF0lO Phage 182 ORF11310-215512 (281 letters) gi11356041spIP068121TERM_-BPNF DNA TERMINAL PROTEIN 9 il758151pi. 69 3e-11 gijl572478jgnljPIDfe242334 (X96987) terminal protein [Hacteriop. 65 3e-1O gijl42923ljgnljPIDjell734OS (X99260) terminal protein (Bacterio. 
64 le-OS Query= sid110l66Ilan1l82ORF0ll Phage 182 ORF19607-1015811 (183 letters) gij1379281spIP07537jVGl2_BPPZA PRE-NECK APPENDAGE PROTEIN (LATE 51 6e-06 gijl429241IgnlIPIDjel1734l5 (X99260) pre-neck appendage protein 51 6e-06 gi1l379271sp1P203451V012_BPPH2 PRE-NECK APPENDAGE PROTEIN (LATE SO le-OS Query= sid1110l691lan1l820RF014 Phage 182 0RF113716-1410813 (130 letters) gill379361spI11188tVGl4_BPPH2 LYSIS PROTEIN (LATE PROTEIN GPl4. 97 6e-20 gi1l379381spIP075391VG14_BPPZA LYSIS PROTEIN (LATE PROTEIN GP14. 96 8e-20gijl429243jgn1IPIDell73417 (X99260) lysis protein [Bacteriopha. 96 8e-20 siI21S332 (M14782) lysis protein [Bacteriophage phi-29] 94 5e-19 Query. sid11l017O11an1l820RF015 Phage 182 ORF1854-122512 (123 letters) WO 00/32825 PCT/I 099/02040 331 giJ15670 (VOliSS) reading frame 10 (may be gene 4) [Bacteriopha. 70 5e-12 gi11380721spIP06953[VGSA_BPPZA EARLY PROTEIN GPSA >gil758361pir. 69 ?e-12 Query= sid1110l741lan1lB2ORF019 Phage 182 ORF14323-461313 (96 letters) gij1429235fgnljPIDje1173409 (X99260) head morphogenesis protein 61 2e-09 gijl38llljspIPl3B48jVG7_BPPH2 HEAD MORPHOGENESIS PROTEIN (LATE 57 3e-08 gijl381l21spIPO7S331VG7_BPPZA HEAD MORPHOGENESIS PROTEIN (LATE 54 le-07 Query= sid1l0Il0an1182ORFO2S Phage 182 ORF1548-81412 (88 letters) gill380991spIP069551VG6_BPPZA EARLY PROTEIN GP6 >gil758411piril 55 7e-08 gij138098jspjP03685jVG6_BPPH2 EARLY PROTEIN GP6 >gil75B401pirll 54 2e-07 gi114292341gnljPIDje1173408 (X99260) gene 6 product [Bacterioph. 
54 2e-07 WO 00/32825 PCT/I B99/02040 332 Table Homologies between 182 ORFs and proteins in public databases Phage: 162 Database: Swissprot Query= sidIll0l561lanIl82ORFOOl Phage 182 ORF15966-778012 (604 letters) giI138124jsPIP07534jVG9_BPPZA TAIL PROTEIN (LATE PROTEIN GP9) 384 e-106 gij138123I5pIP04331jVG9_BPP12 TAIL PROTEIN (LATE PROTEIN GP9) 374 e-103 gil25000301spIQ599681CARA_SULSO CARBAM'OYL-PHOSPNATE SYNTHASE 49 2e-05 Query= sid1ll01S7Ilan1lB2ORFOO2 Phage 182 ORF12152-387311 (573 letters) gill8481spIP19894IDPOL_-BPM2 DNA POLYMERASE 665 0.0 gij118849jspfP0368OIDPOL_BPPH2 DNA POLYMERASE (EARLY PROTEIN GP2) 654 0.0 giII1885l1spIPO6950IDPOL_-BPPZA DNA POLYMERASE (EARLY PROTEIN GP2) 654 0.0 gilIS8471spIP22374IDPOM_ASCIM PROBABLE DNA POLYNERASE 71 7e-12 gi1461962jspIP33S3?IDPOM_NEUC!R PROBABLE DNA POLYMERASE 65 3e-10 gil4619631spjP33538IDPOMNEUIN PROBABLE DNA POLYMERASE 62 3e-09 gijl18850jSpjP10479jDPOL_BPPRD DNA POLYMERASE (PROTEIN P1) 56 2e-07 gil232013IspIP30322IDPOM_AGABT PROBABLE DNA POLYMERASE 46 2e-04 giI1l88B7IspIP1o582IDPOMMAIZE DNA POLYMERASE (S-1 DNA ORF 3) 46 2e-04 Query= sid1ll0159Ilan1182ORFOO4 Phage 182 ORF14626-595413 (442 letters) gijl38117IspjP13849jVG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN 309 6e-84 gij138118jspjP07531jVG8_BPPZA MAJOR HEAD PROTEIN (LATE PROTEIN 305 7e-83 Query= sidIll0l6O1lanII82ORFOO5 Phage 182 ORF112651-1370013 (349 letters) gij137932jspjPlS132jVG13_BPPH2 MORPHOGENESIS PROTEIN 1 (LATE PR 52 2e-06 gij137933jspjP07538jVG13_BPPZA MORPHOGENESIS PROTEIN 1 (LATE 47 6e-05 Qu~ery= sid1ll0l6l1lan1l82ORFOO6 Phage 182 ORF114995-1602611 (343 letters) gijl37945jspjP0754ljVGl6_BPPZA ENCAPSIDATION PROTEIN (LATE PROT. 402 e-112 giI137944IspIP11014jVG16_BPPN2 ENCAPSIDATION PROTEIN (LATE PROT. 402 e-112 Query- sid1Ol062Ilan1182ORFOO7 Phage 182 ORF17795-877511.
(326 letters) giI137915jspIP07S35IVG10_BPPZA UPPER COLLAR PROTEIN (CONNECTOR 256 3e-68 gij13?914jspjP04332jVG10_BPPH2 UPPER COLLAR PROTEIN (CONNECTOR 256 Se-68 Query= sid1110163Ilan1l82ORFOO8 Phage 182 ORF114105-1498312 (292 letters) giI462559jsp1P34020ILYC!_CLOAB AUTOLYTIC LYSOZYME (1,4-BETA-N-AC 75 2e-13 gi11266521spIP25310ILYCM_-STRGL LYSOZYME MI PRECURSOR 60 5e-09 gij127789jSpjP19386jLYCA_~BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE. 60 Se-09 gill277871spIP1SOS7ILYCABPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE. 57 4e-08 giI1265971spJP00721)LYCNCHASP NO-DIACETYLMURAM~IDASE (LYSOZYNE. 57 4e-08 gi11277881spIP19385ILYCABPCP7 LYSOZYME (ENDOLYSIN) (MURAMIDASE. 57 Se-OB giJ3O251681spjP764211YEGX-ECOLI HYPOTHETICAL 32.0 KD PROTEIN IN 53 5e-07 Query= sid11l0164Ilan1lB2ORFOO9 Phage 182 ORF18765-960112- (278 letters) gij137921jspjP04333IVG11_BPPH2 LOWER COLLAR PROTEIN (LATE PROTE. 171 le-42 gij1317821spjP12753jRASO3EAST DNA REPAIR PROTEIN R.AD50 (153 1W 48 2e-05 gij117649OjspIP40889jYJW5_YE.AST HYPOTHETICAL 197.6 1W PROTEIN I 45 1C-04 gil7319031spIP404341YIR7_YEAST HYPOTHETICAL 197.5 KW PROTEIN IN 45 le-04 giI73027S5pIP39793IPBPA_-BACSU PENICILLIN-BINDING PROTEINS 1A/i 45 2e-04 giI1168610IspIP41696IAZF1_YEAST ASPARAGINE-RICH ZINC FINGER PRO 44 3e-04 WO 00/32825 PCT/I B99/0 2040 giI7315871spIP38900IYH19_YEAST HYPOTHETICAL 70.1 (CD PROTEIN IN 44 3e-04 Query- sidIllOI65Ilanhl82ORF0lO Phage 182 ORF11310-215512 (281 letters) gi11356041spIP068121TERM4_BPNF DNA TERINAL PROTEIN 69 8e-12 Query- sidJ110l66Ilanhl82ORF011 Phage 182 0RF1960?-1015811 (183 letters) gi11379281spIP07537VG12_BPPZA PRE-NECK APPENDAGE PROTEIN (LATE 51 2e-06 gi113792?lspIP203451VG12_BPPH2 PRE-NECK APPENDAGE PROTEIN (LATE.. 
50 3e-06 Quer-y= sidJ11016911anJ182ORF014 Phage 182 ORF113716-1410813 (130 letters) gijl37936jsp1P111881VGl4_BPPH2 LYSIS PROTEIN (LATE PROTEIN GP14( 97 2e-20 gill37938IspIPO7539IVG14_BPPZA LYSIS PROTEIN (LATE PROTEIN GP14( 96 2e-20 Query= sid111017011anh182ORF015 Phage 182 ORF1854-122512 (123 letters) giI1380721spIP06953IVGSA_BPPZA EARLY PROTEIN GP5A 69 2e-12 Query= sid111017411an11820RF019 Phage 182 ORF14323-461313 (96 letters) 9iI138l11I5pIP138481VG7_BPPH2 HEAD MORPHOGENESIS PROTEIN (LATE 57 9e-09 giI138112IsplP07533IVG7_BPPZA HEAD MORPHOGENESIS PROTEIN (LATE 54 4e-08 Query= sidJ11O18011anJ1820RF025 Phage 182 ORF'1548-81412 (88 letters) giI138o99IspIP06955IVG6_BPPZA EARLY PROTEIN GP6 55 2e-08 gi11380981sp1P036851VG6_BPPH2 EARLY PROTEIN GP6 54 5e-08 WO 00/32825 PCT/I B99/02040 334 DLASTP 2.0.8 (Jan-05-1999) Query= sidtll0lSE1lanJlS2ORFO0l Phage 182 ORF15966-778012 (604 letters) ,.gi11381241spIP075341V09_EPPZA TAIL PROTEIN (LATE PROTEIN GP9) >gij7SB49jpirI IWMBP9Z gene 9 protein phage PZA >gi1216058 (M111813) tail protein (Bacteriophage PZA) Length 599 Score 384 bits (975), Expect e-105 Identities =231/610 Positives 344/610 Gaps 36/610 Query: 6 TNVKLLANVPFDNTYTHTRWFKTQQEQESYFNS FPVLNENRDCSYQRDTQLGGVFRVDKH TNV++LA+VPF N Y .*TRWF Q ++FNS E V Sbj ct: 9 TNVRILADVPFSNDYIQJTRWFTSSSNQYNWFNSKTRVYEMSKVTFQGFRENKSYISVSLR 68 Query: 66 KDALYACNYLIFKNEETYPSKWQYAFVTDIEYKNDNTSFVTFEIDVLQTYRFDIGIRESF 125 o LY +YnP.+N Y +KW YAFVT++EYKN T+tV FEIDVLQT+ F+I +ESF Sbj ct: 69 LDLLYNASYIMFQNAD- YGNKWFYAFVTELEYIOVGTTYVHFEIDVLQThNFNIKFQESF 127 Query: 126 IAKEHPQLYYSNGIPFINTIEESLDYGREYTTTNVTTFHPNDGVNFLVILTSEAN- -pvc 183 I i-El i-G P INTI+E L+YG EY +V P D FLV+++ 11 G Sbj Ct: 128 IVREHVKLWNDDGTPTINTIDEGLNYGSEYDIVSVENMRPYDDMMFLVVI SKS IMHGTAG 187 Query: 184 DKEDKSG- -GSIVGGPSPFSYYLLPINSSGEVYKPN-GAGNANFGEYMAFLT- -TKEP 236 S+ GP P YY+P 0-iNK G NAN LT Shi ct: 188 EAESRLNDINASLNGMPQPLCYYIHPFYKDGKVPKTFIGDNNANLSPIVNMLTNIFSQKS 247 Query: 237 FLNKIVGMYVTSYTGIPFIVDHANKTVRYNA0GSYKIMLPTYASDPTGTMKTFAFFCVKE 
296 +N IV MYVT Y G+ A D T VK+ Sbjct: 248 AVNNIVNMYVTDYIGLKLDYKNDKELKIDKDMFEQAGI ADDICIGNVDTIF VKK 301 Query: 297 ARTFVPKRIDLVGNVYNYFREAFPFNVKESKLFM4YPYCLI EITDTKGNVMTLRPEYLTGG 356 ID G+ F +ESKL MYPYC+ E+TD KG+ M1 L+ EYi- Sbjct: 302 IPDYETLEID-TGDKWGGFTKD--QESKLMMYPYCVTEVTDFKCNHNNLKTEYIDNN 355 Query: 357 KLSVYVKGSLGISNKVMIEPIDYDVSNSTI ITNLSDK?4LIDNDPNDVGVKSDYASA 412 KL V+GSLG.SNKV DY+ S +T D LI+N+PND+ DY SA Sbj ct: 356 KLKIQVRGSLGVSNKVAYSIQDYNAGOSLSGGDRLTASLDTSLINNNPNDIAI INDYLSA 415 Query: 413 FMQGNKNSLIAQEQNIRNTFRRGMGNSANSTG0AIFSAAASNNPFVGLTNIMGA000VNN 472 .+QGNKNSL Q+ +I GM i-S G i--i *PF G N Sbjct: 416 YLQGNKNSLENQKSSILFNGIVGMLGGGVSAG ASAVORSPFGLASSVTGMTSTAGN 471 Query: 473 YVSEKENGLNLLAGKVADIENI PDNVTQL0SNLSFTTGN- FQNYYQLRFKQI KYEYATRL 531 V+ L K ADI NIP N +F GN Y KQ+K EY L Sbjcc: 472 AVLD MQALQAKQA.DIANIPPQLTKGGNTAFDYGNGYRGVYVIK-KQLKAEYRRSL 526 Query: 532 DRYFSM4Y0TKSNRVATPNLQTRKAWNFIKIJKEPNIVGTHSNDVLTRVKQIFSACVTLWHT 591 *F YG K NRV PNL+TRKA+N+I- K+ I 0 L IF G+TLWHT Sbjct: 527 SSFFIKYGYKINRVKKPNLRTRKAYNYIQTKDCFIS0DINNNDLQEIRTXFDNCITLWHT 586 Query: 592 NDVLNYNQDN 601 NY+ +N Sbjct: 587 DDIGNYSVEN 596 Query- sid1ll0l57Ilanhl82ORFOO2 Phage 182 ORF12152-387311 (573 letters) >giI1188481spIP19894IDPOL_-BPM2 DNA POLYNERASE >gij76B96jpirj 3Q0161 DNA-directed DNA polymerase (EC 2.7.7.7) phage M12 )gi;215509 (M33144) DNA polymerase (Bacteriophage 1121 Length 572 Score 665 bits (1697). Expect 0.0 Identities 327/589 Positives 420/589 Gaps 38/589 Query: 3 KXYTGDFEVI'TDLNDCRVWSW0VCD IDNVDNNTF0LEIDSFFEWCKMQGSTD IYFHNEKF 62 K ++DFETTT L+DCRV4+iG +I N+DN G +D F +W Mi- D+YFHN KF WO 00/32825 PCT/I B99/02040 Sbj Ct: 4 KMFSCDFETTTKLDDCRVWAYGYMEIGNLDNYKIGNSLDEFMQWW MEIQADLYFHNLKF 62 Query: 63 DGEFMLSWLFIGIGFKVCKEAKEDRTFSTLISNNGQWYALEI CWEVWIXYYYYYXNYYYNY 122 DG F.*+WL .,GFKIJ E T++T.IS MGQWY ++IC.
Shi ct: 63 OGAFIVNWLEQNGFKWSNEGLPN -TYNTI ISKNGQWYMIDICFGYK--------- GKRKL 112 Query: 123 XXIIYDSLKKYPFPVKQIAEAFNFPIKKGEIDYTKERPIGYKPTKDEWYLO4DIQIMA4 182 .IYDSLKK PFPVK.IA+ F Pt KG+IDY ERP+Gt+ T +E.EY+104D1.I+A Sbj ct: 113 HTVIYDSLKKLPFPVKKIAXDFQLPLLKGDIDYHTERPVGHEITPEEYEYIKNDI ElIAR 272 Query: 183 ALKIQFDQGLTRNTRGSDALGDYKDWLKATI4GKSTFKQWFPILSLGFDKDLRKAYKGGFT 242 AL IQF QGL RI4T GSD+L .KD L F FP LSL DK+.RKAY+GGFT Sbjct: 173 ALDIQFKQGLDRITAGSDSLKGFKDILST KFNKVFPKLSLPMDKEIRKAYRGGFT 228 Query: 243 WVNKVFQCKEIGDGIVFDVNSLYPSQMYVRPLPYGTPLFYEGEYKPNNDYPLYIQNIKVR 302 WNM+ KEIG+G.VFDVNSLYPSQM4Y RPLPYG Pt YPLYIQ It Sbj ct: 229 WLNDKYKEKEIGEGMVFOVNSLYPSQMYSRPLPYGAPIVFQGKYEKDEQYPLYIQRIRFE 288 Query: 303 FRLKEGYIPTIQVKQSSLFIQNEYLESSVNKLGVDELIDLTLTNVDLELFFEHYDILEIH 362 F LKEGYIPTIQ.K*. F NEYL+rS CV E .tL LTNVDLEL SHY.. Sbjct: 289 FSLKEGYIPTIQIKQJPFFKGNSYLJS- -GV-EPVELYLTNVDLELIQSHYSLYNVE 343 Query: 363 YTYGYNFKASCDMFKGWIDKWI SVIKfTEGARKANAKGMLNSLYGKFGTNPDITGKVPYM 422 Y G+ Ft .FK +10KW VK EGA+K AK MLNSLYGKF .NPD.TGKVPY.
Sbj ct: 344 YIDGFKFREKTGLFIWFIDKNTYVKTHEEGAKKQLAKLMLNSLYGKFASNPDVTGKVPYL 403 Query: 423 GSDGIVRLTLGEESLRDPVYVPLASFVTAWGRYTTITTAQKCFDRIIYCnTDSIHLVGTE 482 .00 +G*5S .DPVY Pt F.TAW R+TTIT AQ CtDRIIYCDTDSIHL GTE Sbj ct: 404 KDDGSLGFRVGDSSYKDPVYTPMGVFITAWARFTTITAAQACYDRIIYCDTDSIHLTGTE 463 Query; 483 VPEAIDHLVDPKKLCYWGHSSTFQRAKFIRQKT--YVESIDGSL--------------- 524 VPE I .VDPKKLGYW HSSTF.RAXt.RQKT YV.E.DG.L Sbj ct: 464 VPEIIKDIVDPKK GYWAHSTFKRAXYLRQKTYIQDIYVKEVDGKLKSCSPDEATTTKF 523 Query: 525 NVKCAG14PDRI KEIVTFDNFSVGFSSYGKLLPKRTQGGVVLVDTMFTIK 573 *VKCAGt4 D IK+ VTFDNF VGFSS GK P GGVVLVDt*FTIK Sbjct: 524 SVKCAGMTDTIKKKVTFDNFAVGFSSMGKPKPVQVNGGVVLVDSVFTIK 572 Query- sid1ll0lS9Ilan1l82ORFOO4 Phage 182 0RF14626-595413 (442 letters) >gi1l381171spIP138491VG8_BPPH2 MAJOR HEAD PROTEIN (LATE PROTEIN GP8) >siI7SB4S1pirIWMBP89 gene 8 protein phage phi-29 >qi1215325 (M14782) major head protein [Bacteriophage phi-29] ->gil22S3621prf I 11301270B gene 8 [Bacillus sp.] Length 448 Score 309 bits (783). Expect 2e-83 Identities 176/440 Positives 250/440 Gaps 27/440 Query: 4 KITSQDVLRAT&VSTPVQLMTAIYNSSSSLFQANVPMPNADNIS-AVGAGITRLDVVKNEF 63 +IT DV AI MS Ft. VP. A+Nt VGAGI V.N.F Sbj ct: 2 RITFNDVKTSLGITSSYDIVNAIRNSGDNFKSYVPLATANNVAEVGAGI LINQTVQNDF 61 Query: 64 ISTLVDRIGKVVIRYKSWRNPLKNFKXGNPLRTISSIFVDIAQEHKFNPDESVTGVFK 123 It.LVDRIG VVIR S NPLK FKKG .PL4GRTISEIt DI +E tEt VF.
Sbj Ct: 62 ITSLVDRIGLWIRQVSLNNPLKKFKXGQI PLGRTIEEIYTDITSKQYDAEEAEQKVFS 121 Query: 124 QSVPDVKTLFHEINREGYYKQTIQE-AWLEKAFTSWDNFNSFVAGVMNALYTGDEVSSFEY 183 .E.P+VKTLFHS NR.GtY Q'rIQ. AF SW NF SFV. NA+Y EV EtEY Sbj ct: 122 REMPNVITLFHSRNRGFYHTIQDDSLKTAFVSWGNFESFVSSI INAIYNSAEVDSYSY 181 Query: 184 TKLLIANYQEKELFKEIEIGSITSSNA--KEFIRKIKSTSNKLEFM--SSAYNAQGVKTS 239 KLLt NY K IS £7T S SF+tK T+s KL S tNt V.7 Sbjc:: 1,02 .wL;;Du*,rSKLTVKIDEPYSS~t3AL~rFVKKRATARKLTLPQGSRDWNSMAVRTR 241 Query: 240 TSKSDQXYVYYYYYYYYYYXXYYYYFNMSKTDFVGHKIVIDEFPKKEGEESSNIVAVIV 299 D FNM..TDFtG. VID F S. AV.V Sbj ct: 242 SYMEDLMLIIDADLEAELDVDVLAICAFNMNRTDFLGNVTVIDGF ASTGLSAVLV 295 Query: 300 DSEWFMIYDKLYKTTSLYNPEGLYWNYWLHMHQLYSTSQFGNAVAFVKSATKPVTKVAFA 359 D .WFM.YD L+K NP GLYWNY. H Q S S.F NAVAFV VT.V Sbj ct: 296 DIWWFHVYDNLHKJ4ETVRNPRGLYWNYYYHVWQTLSVSRFANAVAFVSGDVPAVTQVIVS 355 Query: 360 SATTSVVKGSSKDIALTFTPVATNQQG;EVVSSAPALVKATVKQTAGKATAVTVSGLEVG 419 WO 00/32825 PCT/I B99/02040 3,36 .Vi-G V ATN V V 0.7T G Sbjct: 356 PNIAAVKQGGQQQFT AYVRATNAKDMKV----------- VWSVEGGSTGTAI TG 398 Query; 420 QSLVTFTAIGCQQATVLVTV 439 Li-i-+ Q TV TV Sbjct: 399 DGLLSVSGNEDNQLTVKATV 418 Query- sidJ11016011anJ1820RF005 Phage 182 ORF112651-1370013 (349 letters) 9 iI137932IspIP15132IVG13_EPPH2 MORPHOGENESIS PROTEIN 1 (LATE PROTEIN GP13) gsij758S8tpirIjIWMBP23 gene 13 protein phage phi-29 >gi1215331 (M14782) morphogenesis protein (Bacteriophage phi-29) >giJ22S368jprf~j1301270H gene 13 [Bacteriophage phi-29) Length 365 score 51.5 bits (121), Expect 8e-06 Identities 44/166 Positives .70/166 Gaps 14/166 Query: 6 NEQIARGQTIAKILSKYGYNQ4SQVCVVANLRWESA- -GLNPNSNEXXXXXXXXX-QWT 61 +E QI1 LS C+ K. G++Ni- ES GL N +E QWT Sbj ct: 12 SEMKVNAQYILNYLSSNGWTKQAICGMLGNMQSESTINPGLWQNLDEGNTSLCFGLVQWT '71 Query: 62 PKSNLYRQAQICGLSNAKAETLEGQAE IIAQGDKTGQWMDNTPVSSAGYTNPQTLSAFKQ 121 P SN A CL II QN.. 
Y K Sbjct: 72 PASNYINWANSQCLPYKDMDS- -ELKRIIWEVNNNAQWINLRflMTFKEY--------- IKS 121 Query: 122 SANIDVATINFMCHWERPGKLHIEERLDLAQAYSKMIOGSCGGGVK 167 .ERP ER DA+ KGGGG--s Sbjct: 122 TKTPRELANIFLASYERPANPNQPERDQAEYWYQNLSGGGGCGLQ 167 Query= sidJll0l61anB2ORFOS Phage 182 ORF114995-1602611 (343 letters) >gijl3794SjspIPO754ljVGl6_BPPZA ENCAPSIDATION PROTEIN (LATE PROTEIN GP16) >giI7SB6ljpirIIWMBPl6 gene 16 protein phage PEA >giJ216065 (M11813) iorphogenesis protein C (Bacteriophage PEA) Length 332 Score 402 bits (1023), Expect e-ill Identities 186/332 Positives 244/332 Gaps 2/332 Query: 11 EKNLYYNPNNALCFNCLMLFVIGARGIGKTYGYKKFVVNRFI KHGEQF IYLRRFKTELKX .K.L-YNP L FVIGARGIGKi-Y K -sNRFIK.GEQFIY.RR.K EL K Sbj ct: 2 DKSLFYNPQKI4LSYDRILNFVICARGICKSYAIIKVYPINRFI KYCEQFIYVRRYKPELAK 61 Query: 71 I PQFFKTMAKEFPDHKLEVKGKEFYCDDKLMGWAVPLSTWGI EKSNEYPEVRTI LFDEFL 130 i-F .Ai-EFPDHi-L VKG+ FY D (CL GWA.PLS W EKSN YP V TI.FDEFi- Sbjct: 62 VSNYFNI3VAQEFPDHELVVKGRRFYIDGKLAGWAIPLSVWQSEKSNAYPNVSTIVFDEFI 121 Query: 131 I EKSKITYLPNEAEALLNMMETVFRRRTNTRCVMLSNATSVVNPYFLYFNLQPDLNKRFN 190 ER Yi-PNE ALLN-sMi-TVFR R RC+ LSNA SVVNPYFL.FNL PD.NKRFN Sbi ct: 122 REKDNSNYI PNEVSALLNLMDTVFRNRERVRCICLSNAVSVVNPYFLFFNLVPDVNKRFN 181 Query: 191 LYQDRILIELCDSKDFAEVKRETPFGRLIRGTEYEDFSINNEFVNI3SDTFIEKRSKNSS 250 i-Y D LIE. DS DF+ .R.T FGRLI 07EV Si-iNi-F. DS FIEKRSK.S Sbjct: 182 1/TO- -ALIEIPDSLDFSSERRKTRFGRLIDCTEGFt4SLDNQFIGDSHVFIEKRSKDSK 239 Query: 251 FLCAIAFEGKI FGYWIDAETGCVYVSYDVQPNTNHFYA4TTKDHEENRLLKNWRNNYYL 310 F. 
C G W.D C .VV P.7 V +TT D EN i-L+ N+.NNY.L Sbj ct: 240 Fy75 IVYNGFTLCVWVDVNQGLMYVDTAHDPSTKNVYTLTTDDLNENMNLITNYKNNYHL 299 Query: 311 STVAKAFIO4SYLRFDNIVIKNLHYDLFNKMKI 342 AF N YLRFDN VItNi- ViLF 104.1 Sbjct: 300 RKLASAFMNGVLRFDNQVIRNIAVELFRKMRI 331 Query- sid1ll0l62lan1B2ORFO7 Phage 182 0RF17795-877511 (326 letters) >gi114292391emb1CAA676581 (X99260) upper collar protein (Bacteriophage 8103) WO 00/32825 PCT/I B99102040 Length =308 Score =271 bits (685), Expect =6e-72 Identities 131/275 Positives 187/275 Gaps =5/275 Query: 36 +Y HY L L .QLFEWE LP StOP YLE GYtOFtEOP C GA C Sbj ct: 22 WYYNYYQYLCSLAYQLFEWERLPPSVDPSYLEKS IIQFCYVCFYKDPRICYIACQCALSC 81 Query: 96 QIDHYHNPI FFTANEA)4YNKRYPVLRYDDDDDKSKCIMLYNNDLKVPTLPSLHRFALDMA 155 tOI{Y+ P F A+ Y Y D tYNNDLK TLP+L FA D+A Sbjct: 82 TVDHYNLPDRFI4ASSVCYQNTFKLYNYSDNKEKONhGVAIYNNVLKCSTLPALEMFAQDLA 141 Query: 156 DINQISRVNRRAQKTPVIIQTDEKKYFSLLQAYNQIDENNQAVFVDKDMEFDESFNVWQT 215 +4 +I VN+ AQKTPV+I SL YNQ N tFV D V++T Sbjct: 142 ELKEIIAVNQNAQKTPVLIAANDN14QLSLKNIYHIQYECNAPVIFVHESLDLD4JLKVFKT 200 Query: 216 )APYVVDKLRSELNEVWNEVLTFLCINNANVDKTARVQTSEVLSNNEQIESSCNILLKSR 275 +APYVVDKL N VWNEV+TtLGI NAN+-+K R. TSEV SNtEQIESSCNI LK+R Sbj ct: 201 OAPYVVDKLNAQKNAVWNEVTYLGIKNANLEKKERbVTSEVDSNDEQIESSCNIYLKAR 260 Query: 276 KEFCDRVNRVFCDELDCKIDVCFRTDAVRQLQLAA 310 4E +40 L VKFR D V Q+tL A Sbjct: 261 QEACNKISELYCLNL KVKFRYDIVEQMRLNA 291 Query. sid1016311an1820RF008 Phage 182 ORF114105-1498312 (292 letters) >gil4210750jebICAA107101 (A3132604) LysL protein [Lactococcus lactis] Length 235 Score 139 bits (347), Expect 2e-32 Identities 85/210 Positives 114/210 Caps 14/210 Query: 2 MNGI101SSYQTCIDLSKVPCDFVNI KATCCTCYVNPDCDRAFQQALSLCKKIGVYHFAHE 61 ?O4CIDISSYQ VP DFV MEAT CT Y+NP Q K +G YHFA Sbjct: 1 MNCIDISSYQAELNACIVPSDPVIIKATECTNYINPTWEEQACQVIQTNKLL CFYHFAS- 59 Query: 62 RCLECTPQQEAQFFLDNIKGYICKAVLILDFECS NQKDVNWAKAFLDYVYNKTCVKAW 119 G P EA FF+ +K YICKAVL+LDFE N At FL+ V ETC.
Sbjct; 60 VGNPIAADFFISVVO4YIEAVLVLDFAAINAWCNVGARQFLNRVKEETGINP4 116 Query: 120 FYTYTANLNTTDFSSIAKGDYCLWVAEYCSNQPQCYSQPAPPKTNN--FPIVACFQF 174 Y rtSi-I+ LWVA+Y S P CY P Tt A Q+ Sbjct: 117 IYl4SSflVTRQFlWSTISSTN-PLWVAQYASMNPTCYQ-SEPWTDGKYCAWSSAAIHQY 173 Query: 175 TSECRLPCYNCNLDLNVFYCDGNTWDLYVC 204 .SCGL t+CNLD+N+ Yt+NW C Sbjct: 174 SSACSLSNWSCNLDINLAYINANQWESLAC 203 Query- sid1ll0l64Ilanhl82ORFOO9 Phage 182 ORF18765-960112 (278 letters) >giJ14292401embjCAA.676591 (X99260) lower collar protein (Bacteriophage 9103] Length 293 Score 180 bits (451). Expect le-44 ldczit-ao- 1C5/2bC U22t) ,nPositivec 1621!296 (53S. Gapn 3!76 (11%1 Query: 3 LRYIESFTYYQPELSRERIEVRKQLFDFDYPFYD)ETERAEFSTKPINMFYLREICSE 62 L, YIE Y+ LS EtIE CR +LFDF YP +DE+ R FET FT .*FY+REIG E Sbjct: 8 LSTYIEMWSQYETCLSMAEIEEGRPELFDFQYPIFDESYREVFETMFIRNFYMREICFE 67 Query: 63 TMCSFKFNLDEYLNLNNPYWNKNFLSNLESF -PT FDDMDYTIDEEQELLNEIDTNIEANR 121 TCGFEFNL +L .NMPY+NK+ S L T K+ OT NR Sbjct: 68 TECLFKFNLETWLIINIPYFNKLFESELIKYDPLENTRLNTTNKOJ--OTERNDNR 122 Query: 122 D---ESIQNQTEQVDQTDNRN1O4TRflTGTT--DSFSRNTYTD)TPQEDLRIASNC 169 D0K t~TK 0+T. 
D TT D+F.*R +D P L +N WO 00/32825 PCT/I 899/02040 338 Sb) ct: 123 DTTGSMKADGKSNTKTSDKTNATGSSKEDGKrrGSVTDDNFNRKIDSDQPDSRLNLTTN- 181 Query: 170 DCTGVINYATNITEDLSKETTSSTGVETNNDKTNQNTRSNAS EKETKNTD 219 DG G.,YA+ I E+t+ ++TG TNN S T N Sbjct: 182 DGOGTLEYASAIEENNTNNKRNTTG--TNNVTSSAESESTGSGTSDTVTTDNANTTTNDK 229 Query: 220 INKDQNQTKDTITRYKGKKGNTDYAD)LLEKYRRSVLRIEKHIFREMNKEGLFLLWY 275 +N N +D I GK G YA L+t+ YR ++LRXEK IF EM LP+LVY Sbjct: 240 UNSQINJVEDYIESKIGKSGTQSYASLVQDYRAALLRIEKRIFDEMQE--LFMLWY 293 Quory= sidIllO16S1lall2ORF0l0 Phage 182 ORF11310-215512 (281 letters) .gijl35604IspP068l2TER_BPNF DNA TERMIINAL PROTEIN ,SiI7S8lSjpirIERBPNP terminal protein phage NF >giI579177lembICAA684401 (Y00363) gene E product (AA 1-267) [Bacteriophage NF] Length 266 Score 74.9 bits (181) Expect 6e-13 identities 73/275 Positives 129/275 Gaps 37/275 (13%) Query: 3 VRIS)ODRAKLSKIYCKSNKARKICNRLRQK-GVE- ERQLPTVPTSKXRLIDYVKSTN 58 -tRI+ ND+A KC. K+ KA K+R E Sbjct: 7 IRITNNDKALYAKLV-)OTKA--KISRTKXXYCIDLSNEIELPPLESFQ------------- 52 Query: 59 MSRSDFNKNLDELVDFAQPYNENYI FEIIJKRNVAI SRAQIKEAQI KTEQAQKAKEEMYKE 118 +R +FNK F N+NY F NK S+A+I E T++AQ+ +E +E Sbj ct: 53 -TREEFNKWKOKQESFTNRANQNYQFVIOJKYGIVASKAXINEIAKNTKEAQRIVDEQREE 111 Query: 119 L NKVEVKPTENTIVrPTILTELGiADLPFQAIPDFNIDAFTSPEGVQSYLEN 170 K +T C P DFN D S E Sbjct: 112 IEDKPFISGGKQQGTVGQR4QILSPSQVT- -GISRP- SDFNFDDVRSYARLRTLEEG 165 Query: 171 IC -KQDEQYFDERDQLYYDNFRQAMFTI FNSD- -ADDIVRILDSMCLDLFNKTYVSNFLD 227 K Y+D R +NF7+ FNSD L D F Ft+ Sbj ct: 166 MAEKASPDYYDRRITQMHQNFIEIVEKSFNSOWLSDELVERLKKI PPDDFFELYLM-FDE 224 Query: 228 MNLDYIYDEAEVQOKKEQVYSKIAXVIESETCGEV 262 +Y 22E E ++KI G.V Sbjct: 225 ISFEYFDSECEDVEASEAIILNKINSYLDRYERGDV 259 Query- sidJll0l6ShlanJIS2ORF0ll Phage 182 0RF19607-10158Il (183 letters) >gi11429241lembiCAA676601 (X99260) pre-neck appendage protein [Bacteriophage B103] Length 860 Score 50.8 bits (119), Expect 6e-06 Identities 29/105 Positives 56/105 G aps -6/105 Query: 8 
KRFDGLPAVFKERFSKYPHTEYRYELLLDEEVSALIAYLNEVCALVNDMSCYLNYFI EMF 67 +RF+ L T +I YLN++G LND+ N +2 Sbjct: 7 RRFEKLG3EMM4VQVYERYLPTAFDESMTLLEKMNKIIEYLNQIGRLTNDVVEEWNCVEW1 66 Query: 68 V- EKLEEITNDTLKXWLSDGTLENLINflTVFANYIKEIKRLQILV 111 LE+ .TL+KW +G I E+K+ V Sbjct: 67 LNDCLEDYVKETLEKWYEEGKFADLV--IQVIDELKQFCVSV 106 Query. sidIllOl69Ilanhl82ORF0l4 Phage 182 ORF113716-1410813 (130 letters) >gil1379361spIP111881VG14_PP42 LYSIS PROTEIN (LATE PROTEIN GP14) >giI7SSGOjpirI IwMBP2S gene 14 protein phage phi-29 >gil15678jembiCAA28631I (X04962) gene 14 product (AA WO 00/32825 PCT/I B99/02040 33 9 1-393) (Bacteriophage phi-291 >gii22S3ES1prfI 11301210J gene 14 (Bacteriophage phi-29] Length 131 Score 96.7 bits (237). Expect 6e-20 Identities 53/131 Positives =81/131 Gaps 3/131 Query: 1 MIEYITQWL-ADDNHLVYGLIIWLMVAIIDFVLGFTIAKFNKEIDFSSFKAKAGIIVKV 59 MI1 +L D+ L+Y L .LMV Mt+D VLG AK N I FSSFK KG +KV Sbj ct: 3 MIANMQNFLETDETKLIYWLT- FLM4VCMVVnTVLGVLFAKLNPNIKFSSFKIKTGVLIKV 61 Query: 60 AEMVLVVYFIPVAVKFGAVGITMYITMLVGLILSEIYSILGHISDIDDDNNWT)YVKKFL 119 +EM.L IP AVt F A G+ T+ L +SEIYSI GH+ +00 F Sbj ct: 62 SEMILALLAI PFAVPFPA-GLPLLYTVYTALCVSEIYSI FGHLRLVDDKSDFLEILENFF 120 Query: 120 DGTLNRKDDIK 130 T +4K Sbjct: 121 KRTSGKNKEEK 131 Query- sidJ110170I1anI182ORF01S Phage 182 ORFIBS4-122512 (123 letters) >gij15670lembICAA244831 (VOliSS) reading frame 10 (may be gene 4) [Bacteriophage phi-29] Length 124 Score 69.9 bits (168), Expect 6e-12 Identities 39/119 Positives 64/119 Gaps 3/119 Query: 3 IVICSTFDTQTPEGMLQVFNATNGASIPLRNAI-GEVLELIOILVYSDEVSGFGGAEPSQA 61- IVK.TFDT+T EQ +FNA G +11 G I Y 40 A+ Sbj ct: 6 IVKATFDTETLEGQIKIFNAQTGGGQSFIOLPDGTIIEANAIAQYKQVSDTYGDAK- -EE 63 Query: 62 ELVAFFTEDGKTYAGVSAVATKSAKNLIDMMTANPDIKPKISFVEGKSNGGQKFVNLQV 120 F DG Y+ +S +LID++T KC+ VeG S+ G F .LQ+ Sbj ct: 64 TVrTIFAADGSLYSAISKTVAEAASDLIDLVTRKLETFKVKVVQGTSSKGNVFFSLQL 122 Query. 
sid|110174|lan|182ORF019 Phage 182 ORF|4323-4613|3 (96 letters)

>gi|1429235|emb|CAA61654| (X99260) head morphogenesis protein [Bacteriophage B103]
Length 101
Score 60.9 bits (145), Expect le-OS Identities 34/96 Positives 53/96 Gaps 5/96

Query: 1 MEIKEHES ILUG! LESVTDGEARSKIVEHLEALREDYGATTEALTSANSTLEKLKK3NEA
         ME HE ILN L+ LR DYG+ S EKL+ -N
Sbjct: 3 MERDSHEEILNKLNDPELEHSERTEL LQQLRAlYGSVLSEFSELTSATEKLRAENS0 59

Query: 61 LVISNSKLFRERAIVEPAE-N--NEPETDQNITLDDL 94
          L++SNSKLFR. I E E IT+.DL
Sbjct: 60 LIVSNSKLFRQVGITKEKEEEIKQEELSETITIEDL

Query= sid|110180|lan|182ORF025 Phage 182 ORF|548-814|2 (88 letters)

>gi|138099|sp|P069S5|VG6_BPPZA EARLY PROTEIN GP6
>gi|7SB4l|pir||ERBP6Z gene 6 protein phage PZA
>gi|216047 (111813) gene 6 product [Bacteriophage PZA]
>gi|224746|prf||11112171K ORF 6 [Bacteriophage PZA]
Length 96
Score 55.0 bits (130), Expect 8e-08 Identities 28/79 Positives 45/79 (56%)

Query: 4 KLMQRNVTSTKVEFSEVIVQDGAPTIVPCEPVVLTGKLSEEKALSAI KRKNPDIONVTr 63
         K+MQR +T TV G LS E+A +KRK VV
Sbjct: 3 KMMQREITKTTVNVAKHVMVDGEVQVEQLPSETFV~rILSMEQAQWRNKRKYKGEPVQVVS 62

Query: 64 VSIISTALYTMPVDKFIELA 82
          V T +Y +PV*KF+E+A
Sbjct: 63 VEPNTEVYELPVEKFLEVA 81

Table 26

Secondary structure prediction for ORF 182ORF008

  1 MMNGIDISSY QTGIDLSKVP CDFVNIKATG GTGYVNPDCD RAFQQALSLG KKIGVYHFA{
    CCCCCCCCCC CCCCCCCCCC CCEEEEEECC CCCCCCCCCC HHHHHHHHHC CCCCEEEEEE
 61 ERGLEGTPQQ EAQFFLDNIK GYIGKAVLIL DFEGSNQDV  NWAKAFLDYV YNKTGVKAWF
    CCCCCCCCHH HHHHHHHHHC CCCCEEEEEE CCCCCCCHHH HHHHHHHHHH HCCCCCEEEE
121 YTYTANLNTT DFSSIAKGDY GLWVAEYGSN OPOGYSQPAP PKTNNFPIVA CFQFTSKGRL
    EEECCCCCCC CCCEECCCCC CEEEEECCCC CCCCCCCCCC CCCCCCCEEE EEEECCCCCC
181 PGYNGNLDLN VFYGDGNTWD LYVGKKQDQI VPPENKIFDA TSDEFIFTLT TGSTSVFYFD
    CCCCCCCCEE EEECCCCCCE EEECCCCCCC CCCCCCCCCC CCCEEEEEEC CCCCEEEECC
241 GETIFELSDP TQLDHIRGTY NHVHGKEIPS MVWTPEQFDI YL1K1YEKKPV YK
    CCEEEECCCC CCHHHHCCEE CCCCCCEECC CCCCCCCHHH HHHHHCCCCE  EC
Secondary structure prediction for ORF 182ORF014

  1 MIEYITQWLA DDNHLVYGLI IWLMVAMIID FVLGFTIAKF NKEIDFSSFK AKAGIIVKVA
    CCCCEECCCC CCCCHHHHHH HHHHHHHHHH HHHHHHHHHC CCCCCHHHHH HHHCEEEEEE
 61 EMVLVVYFIP VAVKFGAVGI TMYITMLVGL ILSEIYSILG HISDIDDDNN WTDYVKKFLD
    EEEEEEEECC CEEECCCEEE EEEEEEEEEE EEEEEEEECC CCCCCCCCCC CEEEEEEECC
121 GTLNRKDDIK
    CCCCCCCEEC
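The prediction strings in Table 26 can be summarized numerically. The following is a minimal sketch, not part of the original disclosure, that computes the fraction of residues assigned to each state. It assumes the usual three-state convention (H = helix, E = strand, C = coil), which the table itself does not spell out; the function name `ss_composition` is hypothetical.

```python
from collections import Counter

# Assumed three-state alphabet: H = helix, E = strand, C = coil.
STATES = ("H", "E", "C")


def ss_composition(prediction: str) -> dict[str, float]:
    """Return the fraction of residues assigned to each state.

    Whitespace (the 10-residue block separators) is ignored and the
    string is upper-cased, so 'cccc' and 'CCCC' are treated alike.
    """
    cleaned = "".join(prediction.split()).upper()
    if not cleaned:
        raise ValueError("empty prediction string")
    unknown = set(cleaned) - set(STATES)
    if unknown:
        # Garbled symbols are rejected rather than silently counted.
        raise ValueError(f"unexpected state symbols: {sorted(unknown)}")
    counts = Counter(cleaned)
    return {state: counts[state] / len(cleaned) for state in STATES}


# Last row of the 182ORF014 prediction (residues 121-130).
example = ss_composition("CCCCCCCEEC")
print(example)  # {'H': 0.0, 'E': 0.2, 'C': 0.8}
```

Rejecting unknown symbols is a deliberate choice here: any character outside H, E and C in a transcribed table is more likely a transcription error than a real state.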
WO 00/32825 WO 0032825PCT/I B99/02040 342 Table 27 Enterococcus accession numbers 242/242 gil28957511gbAF044978.1jAF044978 [2895751] gil4803755IdbjIAB026843.11AB026843 [4803755] giJ4769001 jgbAF 140549. 1 AF 140549 [4769001 gil47609O1 jgb1AF099088. I11AF099088 [4760901 gil47O47051gblAFl 21254.1 IAFi 21254 [4704705] gil3342 1171gblAF076604. 1 AF076604 [33421 17] gi146888241emb1AJ 132470.1 IESP 132470 [4688824] gil47320851gbAF1 25553.1 1AF125553 [4732085] gil47320821gblAF125552.1IAF125552 [4732082] gil47320791gb1AF 125551.1 IAF12SSS 1 [4732079] gil47320761gbjAF125550.1IAF125550 [4732076] gil47320731gblAF 125548.1 IAFi 25548 [4732073] gij47320701gblAF125547. 11AF125547 [4732070] gij47320671gblAF125546.1IAF125546 [4732067] giI4732O641gbjAF 125545.1 IAF 125545 [4732064] gij47320611gbIAF 125544.1 IAFi 25544 [4732061] gij4704653jgbjAF1 14715.1 AFI 14715 [4704653] gil47045641gbjAF102550.11AF102550 [4704564] gil4688827lemblA1238249. 1 EFA238249 [4688827] gil4680606jgblAF125198.1fAF125198 [4680606] gil46332791gblAF1 17609.1 JAFI 117609 [4633279] gij4633l241gbjAFl 10130.11AF1 10130 [4633124] gil4590399igbIAF 124258.1 IAFI 24258 [4590399] gil45903361gblAF1 08380.1 jAF108380 [4590336] gil45903351gbjAF108379.11AF108379 [4590335] gi140191671gbU21300. IICXUt21300 [4019167] gi14545122jgbjAF077816.11AF077816 [4545122] gil4433610fgblAF106614.11AF106614 [4433610] [4468838] gi14468121 emb1AJ132958.1 BPHI 32958 [4468121] gil44S6l04lembjY 17302.1 1EH117302 [4456104] giJ443361 I gbIAF 106615. 1 AF 106615 [4433611] gil44336071gb1AF10661 1.11AF10661 1 [4433607] giJ40982671gbIU76614. I11BLU76614 [4098267] gij470 I9IembYOO1 16.1ISFAMB 1 47019] gij4 1581 79jembjAL035206. 1jSC9B35 [4158179] gil4l 65458!emb1X79343. 
I EF16SSPA [4165458] giJ4 I654571cmbIX79342.1I EFTRN1~ALA [4165457] gil4165456lembIX7934 1.1 IEF23SRNA [4165456] gil4l1 O978IemblI 4027.1 EFY 14027 [4150978] gil4 127803 IembIAJ223 161.1 IEFAJ3 161 [4127803] gil29566851embY164 13.1 IEFENTIJO [2956685] gij2665346emblY 13922.1 EHY 13922 [2665346] gil43246751gblAF109375. IIAF 109375 [4324675] gil42346271gblAF0610 13.1 I1AF06 1013 [4234627] gil42346261gbAF061012. 1!IAF061012 [4234626] gil42346251gblAF061 011,.1IAFO61O 11 [4234625] gil42346241gbIAF06 1010.1 IAFO6 1010 [4234624] giJ42346231gblAF06 1009.1 AF06 1009 [4234623] gil42346221gbIAF061008.1IAFO6 1008 [4234622] gi142346 2 1 IgblAF061007.1 IAF061 007 [4234621] gil42346201gblAFO61006.1 jAFO6 1006 [4234620] gij42346 l91gbjAF06 1005.1 IAF061 005 [4234619] gil42346 1 8gblAFO6l 004.1 IAF06 1004 [4234618] gil4234617jgblAF061 003.1 AFO6 1003 [4234617] gil42346161gblAF061002.11AF061002 [4234616] gil4234615jgblAF061001.11AFO6OO1 [4234615] gil42346 14gblAFO6 1000.1 IAF06 1000 [4234614] gij3l138990IgblAF06024 1.1 1AF06024 1 [31389901 giJ3 1389861gbAF060240. 1 IAF060240 [3138986] gi142045351gbjAF094803. 1 AF094803 [4204535] gil42045341gblAF094802. I AF094802 [4204534] gil42045331gb1AF094801 .1 AF094801 142045331 gil42045321gbIAF094800. 1 JAF094800 [4204532] gil420453 11gblAF094799. 1IAFQ94799 [4204534]gil4204530IgblAF094798.11 A-F94798 [4204530] gil42045291gb1AF094797. 1 AF094797 [4204529] gil42045281gbjAF094796. I 1AF094796 [4204528] gil42045271gblAF094795.1I1AF094795 [4204527] WO 00/32825 WO 0032825PCT/I B99/02040 gil42045261gblAF094794. I 1AF094794 [4204526] gil42045251gb1AF094793. 1 AF094793 [4204525] gil42045241gblAF094792. 11AF094792 [4204524) gij42045231gblAF09479 1.1 AF09479 1 [4204523] gil42045221gblAF094790. 1 IAF094790 [4204522] gil420452 1 gblAF094789.I1 AF094789 [4204521] gi142045201gblAF094788. 11AF094788 [4204520] gil42O45 191gbjAF094787. 
1 1AF094787 [4204519] gil4204518igbAF094786.11AF094786 [4204518] gil42045 171gbAF094785.11AF094785 [4204517] gij42045 1 6gblAF094784. 11AF094784 [4204516] gil4204515jgbjAF094783.1lAF094783 [4204515] gil42045141gblAF094782.1IAF094782 [4204514] gil42045 131gblAF09478 1.1 AF09478 1 [4204513] gil42045 121gblAF094780. 11AF094780 [4204512] gil3873 1861gblAF034779. I AF034779 [3873186] giI4 15 3671gbjAF093508. 1 AF093508 [4151367] gil28281I361gblAF039903. 1 AF039903 [2828136] gil2828 1351gbIAF039902. 11AF039902 [2828135] gil28281341gblAF039901.1jAF039901 [2828134] gil28281331gblAF039900.11AF039900 [2828133] gil2828 1321gbjAF039899.1I1AF039899 [2828132] gif 2828131 IgblAF039898. I IAF039898 [2828131 gij4l038661gblAF028812.11AF028812 [4103866] gil4103864jgblAF02881 1.1jAF02881 1 [4103864] gil26059251gblAF029727. 1 AF029727 [2605925] gi1 14027501gbIU60038.11 EFU60038 [1402750] gi118357801gbIU86375.11EFU86375 [1835780] gil383 15551gblAF047608.1I1AF047608 [38315551 gil3 79061 7gblAFO974 14.1 1AF0974 14 [3790617] gil3767587ldbjlAB005036. 1 AB005036 [3767587] gil3757810jgblAF042288.11AF042288 [3757810] gi137470391gblAF093509. 1 IAF093509 [3747039] gil366OSS9IdbjlABO 17811.1 IABO 17811 [3660559] gil I147743jgb1U4221 1. 1 EfU42211 [1147743] gil3676412igblAF051917.11AF051917 [3676412] gil 3676164lemb1A301 11 13.1IEFAOI 1113 [3676164] gif26128691gblAF005726.11AF005726 [2612869] gil23537621gblAF0 16233.1 IAF01 6233 [2353762] gi121498991gbIU94707 1 1EFU94707 [21498991 gij2 149 1491gbIU82366. 1 ILSU82366 [2149149] gi1l14694631gbIU495 12.11IEFU495 12 [1469463] gi11244503igbIU35366.1fEFU35366 [1244503] gi18338541gbjU26268.1fEFU26268 [833854] gif 841 200IgbIU 18931.1 CPU 18931 [841200] gij460079igbIU00457. I lU00457 [460079] gil460077igb]UO0456. 1 JU00456 [460077] giJ53566 1 jgbIL34675. 1 JINSTRANSPO [535661] giJ302304 1 igblAF007787. I AF007787 [302304 1] gif 431 1241gbL15633.IjTRN916ENT [431124] gi13881I06igbIL23802. 
1 ENEEBSA [388106] gil36083871gbjAF07 1085.1 IAFO7 1085 [3608387] giJ3SS 1851 I gblAF076027. I IAF076027 [3551851] gil3551I7731gbIU94770. 1 ISPU94770 [3551773] gif35517431gbIU57498.11EGU57498 [355 1743] gil3243 1781gblAF0630 10.1 AF0630 10 [3243178] gil3 1363 161gblAFO63900. I1AF063900 [3136316] gil35402561gbjAF052459. 11AF052459 [3540256] gil7552 1 SgbIU 17696.1 ILLUl 7696 [755215] gil3421I4371gb AF082295.1 1AF082295 [3421437] gil342 1436f gbjAF082294. I AF082294 [3421436] gil342 14351gblAF082293.1 1AF082293 [3421435] gil342 14341gbf AF082292. 1 IAF082292 [3421434] gil334l43OlemblI 7797.1 IEFYI 7797 [3341430] gi133 196471embX69092.1IEHPBP3RA [3319647] gil3292886lemblAJ007584.11EFA7584 [3292886] gi13261 536JembjALO2 1958.1 MTV04 1 [3261536] giI325O7O8lembIZ9S 150.1 MTCY 164 [3250708] gil32496881gblAF070678. 11AF070678 [3249688] gil32496871gblAF070677. 1 IAF070677 [3249687] gil32496861gblAF070676. 1 AF070676 [3249686] gil32l9IS8ldbjABOlS233.11AB015233 [3219158] gil2765275lembiYI2924.11SPY12924 r2765275] gil3lI83687lemblY 1162 1. 1 lEA I 6SRRN [3183687] gil2765274JembY 12923.1 EFY 12923 [2765,2741 gil 27652731emblY 12922.1 ESY 12922 [2765273] gil2765272lembfY 12921.1 ESY 12921 [2765272] gil276527 1lemblY 12920.11EDY 12920 [2765271] gil2765270emblY 12919. 1 JESY 12919 [2765270] WO 00/32825 WO 0032825PCT/1 B99/02040 gil2765269IemblY 12918. 1 IECY 12918 [2765269] gil2765268lemblI 2917.1 IECY 12917 [2765268] gil2765267$embI 2916.1 EPYI 2916 [2765267] gil2765266lemblY 12915. 1 JESY 12915 [2765266] gil2765265lemblY 12914.1 IERY 12914 [2765265] gii27652641emb1Y 12913.1 IEMY 12913 [2765264] gil2765263lembiY1 2912.1 EHY 12912 [2765263] gi127652621emb1Y 12911.1 EMY 12911 [2765262] gil2 7 6 52 6 1lIernblY 129 10. 1 IEGY 12910 [276526 1] gil2765260lembY 12909. 1 EDY 12909 [2765260] gil2765259lembI2908.1IECY12 9 OS [27652591 gi12765258lemblYI2907. 1IEAY 12907 [2765258] gil27652571emb1Y12906.11EFY1290 6 [2765257] gil2765256lembIY12905. 
IIEFYI29OS [2765256] gil28945411emblAJ223332.11EFAJ3332 [2894541] gil2894539emblA122333 1.1 IEFAJ333 1 [2894539] gil3 180581gbjAF06O88 1.1 1AF06088 1 [3108058] gil3087776lemblAJ223633.1 EFAJ3633 [3087776] gil30807541gblAF01 6483.1 IAF0 16483 [3080754] gi]2197l 19igblAF00392 1. 1IAF003921 [2197119] gil2982722ldbjlAB01 2213.1 IABO 12213 [2982722] gi1298272lidbjlAB012212.1IAB012212 [2982721] giJ20587801gb1B07890. 1 IB07890 [2058780] gil2058779igbIB07889.1I1BO7889 [2058779] gil2058778igbIB07888.11BO7888 [2058778] gil20587771gbIB07887. 11BO7887 [2058777] gil20587761gbIB07886. 11BO7886 [2058776] gil2058775igbIB07885.11B07885 [2058775] gil20587741gbIB07884. 11BO7884 [2058774] gil20587731gbIB07873. 11B07873 [2058773] gil20587721gbIB07872.11BO7872 [2058772] gil205877ligbIB07871.11B07871 [2058771] gil2058770gbIB07870. 1 IB07870 [2058770] gij2058769gb~IjBO789.jBG7S69 L05807169] gil20587681gbIB07868.1 IB07868 [2058768] gil20587671gbIB07867.11BO7867 [2058767] gil20587661gbIB07866.11BO7866 [2058766] gil2058765igbIB07865.11BO7865 [20587651 gil20587641gbIB07864.11B07864 [2058764] gil20587631gbIB07883.11BO7883 [2058763] gil20587621gbIB07882. 1 IB07882 [2058762] gi[20587611gbIB0788 1.1 1B0788 1 [20587611 giJ20587601gbIB07880.1 IB07880 [2058760] gil20587591gbIB07879.11BO7879 [2058759] gil20587581gbIB07878.1 IB07878 [2058758] gil2058757igbIB07877.11BO7877 [2058757] gil20587561gbIB07876.1 IB07876 [2058756] gil20587551gbIB07875.11BO7875 [2058755] gil20587541gbIB07874.11BO7874 [2058754] gil2058753[gbIB07863.11B07863 [2058753] gil2058752igbIB07862.11B07862 [2058752] gil2O587511gbIB07861.11B07861 [2058751] gil20587501gbIB07860.11 B07860 [20587501 gil2058749igbIB07859.11B07859 [2058749] gil20587481gbIB07858.11 B07858 [2058748] gil20587471gbIB07857.1 IB07857 [2058747] gil20587461gbIB07856. 
I1B07856 [2058746] gi120587451gbIB07855.11BO7855 [2058745] gil20587441gbIB07854.11 BO7854 [2058744] gil2058743igbB07853.11BO7853 [2058743] gil2058742igbIB07852.11 BO7852 [2058742] gij2058 7 4 11gbIB0785 1.1 1B0785 1 [2058741] gil2058741gbIB307850. I B07850 [2058740] gil29475271gbT25933.l1 T25933 [2947527] gi129243021embIX8 1655.11EHERMAM [2924302] gil2664256lembY 12234. 1 EFAS48C [2664256] gil2879906ldbjjD85752.1IID85752 [2879906] gil27462 1 6gblAF028836. 11AF028836 [2746216] gil27458251gblAF039 139.1 1AF039 139 [2745825] gil269601I9ldbjlAB007844. 1 JAB007844 [2696019] gil48999embIX62280. I EHPBP5G [48999] gil2654477igbIU899 14.1 I BFU899 14 [2654477] gil43347embX68646. IIEHPSR.AA [43347] g;11 1flAAnI hA P00564,1_SEGE FDDH4RR [2613034] gil2613033IlgblAF029775.1IEDDH4RR2 [261aQ33] gi126130321gblAF029774.IIEDDH4RRI [2613032] giJ2613031IjgbIAH005623.I ISEG_EDDHIRR [261303 11 gil2613030igblAF029773. I EDDHnRR2 [2613030] WO 00/32825PT/B9004 PCT/I B99/02040 gil26 13029lgblAF029772. 1IEDDHIRR 1 [26130291 gil26l3O281gblAH0O622.IISEG_EDH19RR [2613028] gil26 13027 lgblAF02977 1.1 IEDH 19RR2 [2613027] gil26130261gblAF029770.IIEDH19RRI [2613026] gil26l3O25jgbIAH0OS62l.1ISEG_EDISRR [26 13025] gil26130241gblAF029769.11EDISRR2 [2613024] gil26 130231gblAF029768. 1IEDISRR1 [2613023] gil188I226ldbjABOOl488.11AB001488 [1881226] gil2547 16OlgblAFO23 104.1 1AF023 104 [2547160) gi12547 1 S9gblAFO23 103.1 1AF023 103 [2547159] gil2547 l581gbIAFO23 102.1 1AF023 102 [2547158] gi125471571gblAF023101.IIAF023101 [2547157] gil24153831gblAF015775.1lAF015775 [2415383] gil23886361gbIU94356.11EFU94356 [23886361 gil23886341gbIU94355.11ECU94355 [2388634] gij2340825ldbjlD26045. I D2604S [2340825] gil2226 1471emblY14080.I IBSY 14080 [22261471 gij23270261gbIU87997. IIEFU87997 [2327026] gil23180581gbjAF012532.1IAF012532 [2318058] gil 1848 175lembIX871 89.1 EM23S5SSP [1848175] gi11848174lembIX87187.1IEM16S23SS [1848174] gill1848173lembIX87188.1IIEM 16S23SP [1848173] gillI 8481 72lemblX87 185. 
I EH23S5SSP 1848172] gillI848171lembIX87l84.1IIEH16S23SS [1848171] gillI 8481 7OlemblX87 18 1.1 IEF23S5SSP 1848170] gill1848169lemblX87183.1IIEF23S5SPA [1848169] gill 848 168lemblX87 191. 1 IEF23S5SAC 1848168] gill 8481 67lembIX871 80.1 IEF1 6S23SS [1848167] gill18481 66lemblX871 82.1 IEF1 6S23SP [1848166] gill 1848 165lemblX871 90.1 I EF16S23SC [1848165] gill18481 64lembIX87 186.1 IEF16S23SA [1848164] gillI 848 lS6lemblX87l 79.1 I ED23S5SSP [1848156] gi11848155lembIX87178.IIED16S23SS [1848155] gill184815S4lembIX871 77.1 ED 16S23SA [1848154] gil2274942lemblAJ000346. I EHNAPBC [2274942] gil2274939lemblAJ000042.1 I EFGLS24B [2274939] gil4 145751gbjL127 10.1 IENEAAC [414575] gil2245603lgblAF006008. I AF006008 [2245603] gil2231I9921gbIU94530. 1 JEFU94530 [2231992] gil2231990igbIU94529. I11EFU94529 [2231990] gil22319881gbIU94528. 11EFU94528 [2231988] gil22319861gbIU94527.1 1EFU94527 [2231986] gil223 19841gbjU94526. 11EFU94526 [2231984] gil2231982jgbIU94525. I IECUJ94525 [2231982] gij2231980JgbIU94524.11ECU94524 (2231980] gil22319781gbIU94523. 11ECU94523 [2231978] gil22319761gbIU94522. I11ECU94522 [2231976] gil2231I9741gbIU9452 1.1 1ECU9452 1 [2231974] gil2 1966851gbjU25090. I 1EFU25090 [2196685] gil2197120IgblAF003922.1 1AF003922 [2197120] gil21966831gbIU25095.11EFU25095 [2196683] giJ2 196681 gblU25094. I 1EFU25094 [2196681 giJ21I966791gbIU25093.1 1EFU25093 [2196679] gil2l1966771gbIU25092. 11EFU25092 [2196677] gil2l1966751gbIU2509 1.1 IEFU25O9 1 [2196675] gil2lI966731gbjU24682. I EFU24682 [2196673] gil5325331gbjU09422.11EFU09422 [532533] giJ48727 I dbjlDl17462. 1 ENENTP [487271 gil468459ldbjlD28859. 1 ENEPPD 1 [468459] gil44Ol 35ldbjIDI 6334.1 IENBATPK [440135] gil39 l68OldbjlD 13 816.1 IENENAABS [391680] gi11402524idbjjD78257. I11D78257 [1402524] gil7O9995ldbjID30808.11 BACYCB20 [709995] gil2 I 92651gbjU91 527.1 IEFU91 527 [2109265] gil1041112ldbjID78016.1jENEPPD1A [1041112] gi1 1339880ldbjlD85392.IIENERPA [1339880] gil11339878ldbjID85393. 
11ENEGEIE [1339878] gil6629 18lemblZ46807. 1 IEHCOPAYZ [662918] gil769796lembjX86 176.1 IEFRPODDNE [769796] gill 8546381gbIU5 1479.1 EGU5 1479 [1854638] gil 1857221 IgbjU72706.1 I EFU72706 [185722 11 ,.1Q721OIT727() IEFUT72V704 rl8,472191 gill18572 171gbIU72705.11IECU72705 [1857217] gil 1272655lembIX96978. I lEFPPD1IGNS 11726551 gi11272652lembX96976.IIEFPLSEiilG [1272652] gill1279406lembIX96977.I1IEFPADI1ORF [1279406] gilIO07Ol49lemblX932l 1. 1IEFTNFO1I 1070149] WO 00/32825 PTIB90 4 PCT/1 B99/02040 gil 1065723lemb!X92947. 1 EFTETMGN [1065723) giI 101 96391gbIL38972. 1IPH4COINJN [1019639] gil 115115 1 gbIU43087. IJEFU43087 (1151151 gif 1098507lgb1U 17283.1 IBMUI 7283 [1098507] gil 14980721gbIU64887. 11EFU64887 [1498072] gi1 14980711gbIUJ64886. 1 EFU64886 [1498071] gi11469783igbIU58049.11EHU58049 [1469783] gill 7636661gbIU8 1452.11EFU8 1452 [1763666] gil624694jgbIL38973.1 IPH4SEQ [624694] gi11730458lembIZ83305.1IEFVANRES [1730458] gi11419498lembIX84796.11ECPFW4 [1419498] gi1 1419497 jemblX84795.11 ECPFW3 [14194971 gi 14 19496lemb1X84794. 1 ECPFWI [1419496] giJ2544001gbIS43266. I S43266 [254400] gil2390251gblS66277. 1 S66277 [239025] gil 105493 1 jgbJU3859O. 1 JEFU38590 [105493 1] gi112445731gbIU39788.I1 EHU39788 [1244573] gil 12445711lgbjU39789.I1 EGU39789 [1244571] gill12445691gbIU39790. 11EFU39790 [1244569] gil 1255020igbIU39777. 11ESU39777 [1255020] gi1 125501 81gbIU39775. 11EPU39775 [1255018] gi112550161gbIU39778.11EDU39778 (1255016] gil 12550 141gbIU39776.I1ECU39776 [1255014] gill 2550 121gbIU39774. I EAU39774 [1255012] gil 161 99221gbIU69267.I1 1VU69267 [1619922] gil790436lemblX8486 1.1 IEFEFMPBP5 [790436] gil790434lembIX84858. IIEFD63RPSR [790434] gil790432lembIX84862. 1 EF72 1PBP5 [790432] gil790430lembIX84860. IIEF63RPBP5 [790430] gil790428lemblX84859. 
1IEF366PBP5 [790428] gill1572800igbIU70854.11 CELF38A5 [1572800] giIO4l8l61gbIU17lS3.1EFU17153 [1041816] gi110865231gbIU39859.11EFU39859 [1086523] -IAA-2 AI.4 minT..1 917TAAl r40-35441 gill15154741gbIU66286.I1 EFU66286 [1515474] giIl5l3O68jgblUlSS54.IILMU15554 [1513068] gill 2965 2OlemblX94 181. 1 IEFENTAORF [1296520] gi114880691gbIU63997. 11EFU63997 [1488069] gi112095251gbIU35369.11EFU35369 [1209525] gill 146934 1 igbIU3093 1. 1 JESU3093 1 [146934 1] gi148833 1 IgblM77276. 1 ISYNGIP2 122 [488331] gill 0461 771gblU39733. II [1046177] gill123661 31gbIU49939. I CVU49939 [1236613] gi147491IembjX55766.1ISS16SR5G [47491] gil47490lemblX55767.1lSSl6SR3G [47490] gil4706 1lembIX56353. 1ISFTET91 6 [47061] gil49022!embIX62755.1 ISFNPRG [49022] gil47O47lembIX172l4.1ISFPASA1 [47047] gil47044lembIX68847. I ISFNOXAA [47044] gil47033lembIV01547.1ISFKANR [47033] gil47Ol8lembIX02O27.11SFSSRNA (47018] gilJi 1044lemblX75752.1IIMP1I6SRNAO [511044] gi15 110431embIX7575 1. 1 IMP 1 6SR243 [511043] gil886481IlembIX82819.1I ESPLPAM [886481] gil 1 7387lemblX76 177.1 ES 16SRR [517387] gil4729 16lemblX769 13.1 IEI{NTPOP [472916] giJ4335 1lemblXSS 133. lIES 16SRRN [43351] gillI 143442lemblX92687.1IIEFPBP5G 1143442] gil963032lembjZ50854. 1 IEHARPQTOU [963032] gil886479lemblX84818.IIEHDNAPSR [886479] giISS 1437lemblX8 1654. 1 JEHIS 1216 [551437] gil467805lembIX78425. IIEFPBP5 [467805] gi129672 1 lemblX5596 1.1 IEFPD78 [29672 1] gil287946lemblZ19137.1IEFPTSHGN [287946] gi149042lembIX63285. I EH-NAKA [49042] gil49019lembIX62658.1IEFSEA1 [49019] gil43337lembIZ1 2296.1 JEFSPREG [43337] giJ433351emblX56895.IIEFPVANAG [43335] gil43333lembIX 16421.1 IEFPF54 [43333] gil4333 1 lemblX62657. 1 IEFORF3 [43331] gil 1065721 lemblX92945. 1 IEFCAT501 [106572 1] giJ80655 1 lemblZ49243. I IEF4 I1 OSOD [80655 1] ;10O6A YAW.~ I7A A I rr CC 'f t\c- roAfcCAnl gil5O5530lembIX79542.11EFAS48 [505530] gil43323 lemblX62656. IIEFASP 1 [433231. gil40840lembIX56422. I JEC 1 6SRN G [40840] gij48189lembIX04388. 
IITN1IS45TR [48189] gil92881I41gbIL4O84 1.1 JENETRANSPO [928814] gil 141856igbIL01794. IIADIREPABC [141856] WO 00/32825 WO 0032825PCT/l B 99/02040 gil 1 491 25igbIM90647. 1 lIP8VANY 149125) gi1141862igbIM87836.IIADITRAEI [141862] gi114186O1gbIM84374.1IADITRAA [141860] gi11418531gbIM62888. IIADIPADI [141853] gil 1101 637ldbjID31674.1IjEVM 16RNA7 [1101637] gil 1101 636ldbjlD31675.1I ENE 16RNA8 [1101636] gil497792ldbjID3 1676. I ENG I 6RNA9 [497792] gi110227291gbIU36195.11EFU36195 [1022729] gi14883381gbIM77279. 11SYNGIP3 124 [488338] gil4883351gblM77278. IISYNGIP2563 [488335] gil4883331gbIM77277. 1 ISYNGIP2 124 [488333] gil4883291gblM77275.1 ISYNGIP2I2 1 [488329] gil388267igblL19532.IIAD1TRAC [388267] gij4930161gbIU03756.11EFU03756 [493016] gij4535361gbIL28754.1 INSTRAN [453536] gill 536581gbIM58002. 1 ISTRHYDROLA [153658] gi14754271gbIU0068 1.1 EFUOO68 1 [475427] gij8 187041gbjU24692.1I1EFU24692 [818704] gi1l1550361gbIM97297.I1ITRNVAN [155036] gillI 505521gbIM64978. 1 IPCFPRGAB 150552] gil7862741gbIU2254 1.1 1EHU2254 1 [786274] gil7862731gblU22540.1I1EHU22540 [786273] gil5598581gbIL371 10.1 IAD ICLYL [559858] giJ6436 141gb U 16659.1 JECU 16659 [643614] gil6436 l2lgblU 16658.1 ECU 16658 [643612] gij29064lgblL13292. lIENECOPPUMP [290641] gil62470 11gbIL29639. 1IENEVANCRF [624701] gil6246991gbIL29638. 11ENEVANCR [624699] gil6246921gblL2964 1.1 IENEDDLA [624692] gil6246901gbIL29640. I I ENEDDL [624690] 01i4930941011-328 13.1 IENERRD [493094] gil 153852igbIAH000939. IISEG_STRTN916 [153852] gi11538511gbIM22645.IISTRTN9162 [153851] gill 53850lgbIM20864. 1ISTRTN9 161 [153850] gi1153660igbIM36878.1ISTRIF2BA [153660] gil 1535851gbIM 1377 1. 1ISTRBRP 153585] gill153575 lgbIM64265.I1ISTRATPEFHA [153575] gil 1535651gbIM90060.1IISTRATPASEA 153565] gill1529691gbIM92376.]IISTABLAIA [152969] gil3O966OIgbIL 14285.1 IPCFPRGWZY [309660] gil4337141gbIL12033.1IENESATA [433714] gij2906451gblL 15304.1 ENEVANB2A [290645] gil 14833 1 lgbIM84 146. 
I ENEVANR 14833 1] gi11483291gbIM64304.IIENEVANE1 [148329] gi1148326igbIM68910.1IENEVANCRES [148326] gil1483241gbjM75132.11ENEVANC [148324] gi1 1483231gbIL061 38.1 ENEVANB [148323] gil 14832 1 lgbIM85225. 1 IENETETM 148321 gil 1483201gbIL00925.1I ENERTRNA 148320] gil 1483 191gbIL00924. 1 IENERRNA 148319] gi114831I71gbIM8 1466.1 ENERECA [148317] gill14831ijgbIM8l96I1. 1 ENENAPA 148315] gil 1483 12igbIM38386.I1IENEMSPDPS [148312] gi1148310lgbIM37185.1IENEGELE [148310] gi11483071gbIL07892.1IENEBLACREG [148307] gi11483051gbIM60253.IIENEBELAA [148305] gi11483031gbIM77639.1IENEB14NAM [148303] gil29O6441gbILl65 15. 1IENERGTG [290644] gi1154954igbIM37184.11TRN916 [154954] gil 148301 IgbIM6922 1.1 IENEAAD9A 148301] gil 1483081gbIM38052. 1IENECYLB [148308]
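Each entry in Table 27 appears to follow the NCBI identifier layout of a GI number, a source database tag, an accession with version, and a locus name, followed by the GI number in brackets. As an illustration only, the sketch below parses one entry. It assumes the entry has first been restored to the pipe-delimited form `gi|<number>|<db>|<accession.version>|<locus>`; the names `Entry`, `ENTRY_RE` and `parse_entry` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    gi: int          # GI number
    database: str    # source database tag: gb, emb or dbj
    accession: str   # accession, with version suffix if present
    locus: str       # locus name (may be empty)


# Assumed clean form: gi|<number>|<db>|<accession.version>|<locus>
ENTRY_RE = re.compile(r"gi\|(\d+)\|(gb|emb|dbj)\|([A-Z]+\d+(?:\.\d+)?)\|(\S*)")


def parse_entry(text: str) -> Optional[Entry]:
    """Parse the first identifier found in text, or return None."""
    match = ENTRY_RE.search(text)
    if match is None:
        return None
    gi, database, accession, locus = match.groups()
    return Entry(int(gi), database, accession, locus)


# Second entry of Table 27, with the separators restored by assumption.
print(parse_entry("gi|4803755|dbj|AB026843.1|AB026843 [4803755]"))
```

The bracketed number that follows each entry repeats the GI number, so comparing the two gives a simple consistency check on a transcribed entry.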
Table 28 Phage Dp-1 complete genome sequence. 56506 nucleotides.
PCT/I B99/02040 1 71 141 211 281 351 421 491 561 631 701 771 841 911 981 1051 1121 1191 1261 1331 1401 1471 1541 1611 1681 1751 1821 1891 1961 2031 2101 2171 2241 2311 2381 2451 2521 2591 2661 2731 2801 2871 2941 3011 3081 3151 3221 3291 3361 3431 3501 3571 3641 3711 3781 3851 3921 3991 4061 4131 4201 4271 4341 4411 4481 4551 4621 4691 ataataaaaa taagtgaagc acactacacg tgactataat gacggatatg acattccaaa tgacttccac gaagtcatgg aatttgcaac tcttggacga gccacttgtt gacaaaagca tcttgaaatt catggaaaat gactaatgct cgcagacgat atggaatatg ttaaacgggg ttgtggaact cattataagg cattttggaa atgaccacgg tgacagactt aagcgagttc agcttatgtg tacttactac gaaaagatta agaggcaaga ttcaaggtga caactggtgt gctagtcgaa gaggaaatcc gtttggtCtc cctaaaccgc atgaaaacct gtttaaaact gaaggaaaaa cttcaacaa tgaaaattga ggatgaaatt gcagaacgtg atcgagaaga agacattcca attcctaagg aagtacaaga ggtcatcgta acttcaacta agtaggaggc atgtgacatt gattcttcta ttttcacctc tgaaagttgc tgcgctggaa gcgattatcg taggagcgaa aggaaagagg gcataagaaa aaaggggctg tcgccgaact acaaatttct tatgaagcag agttttgtaa cactgacgct ccagacgtcc aattcacttc cgaaaagatg gaagattgca gcctatt cac tcctgaactt gctattcgat tagcaattga tgaagcagaa gactcgaaaa atattgggtt aattattgct acctgacatc ctgctaaata gaattgacag gcgttactct ttgaaacttt ccctaacaaa cctttgccct aaaacaggac gttgaatcta aatcattgaa tgaacattat tttgaatgac tcctcgtggt ggaatttcaa gaacagcttc aacttcaacg aggaggctgg aatgaaatca agttgacaag tggggttcta cttgaaaatg ctgctaatgt tctactcaag ctctagctct cttacgctga aatcctagca tccacaggct gcggcttatg gcggctggag gtgcttaccc gaactggagg caaggtaacc aattgattta gatgttcctt tgcgcaactt gtatcgaccg agaattgata tgagagtttc aatgcgcaaa tttgcacggg ttcgagtcaa gggatggttg gaccacgctg ttcttcttca tatttggatt tagaactacg gaagcatgct cgtatcgact gagattttca cagaagacga ctgtccgcga aattttagag ttcgaatcaa tgttcgcgac aggaatggtt ataggtcaaa gactcagcct ttacctggaa tcttgaaact agctttcaat tgccttaatc aacgagccta gaaactcaag gaactcgatt cttcaagtgg aatgagaact tgactggtca tttaaaatcg ttcgaaggca agttacgtcc tcagtgatag gcttcttgaa tgttcgacct ttaccgcaac gcatctagat aaaatcggta 
gtaaccttgg acaatactga acgggttgca agatactcca ccctaaactt catctcgaaa ttcaattctt tatgtgagca ataagattac aggtctttca gcgcttgact caacaaatcg gaggctgagc atacttgcat tgcgaggtct tttccaagat ggaaaatgaa taaaagtgca gaccgttgca ttttctgcta cctttatgga accatagatg ttggactgat tgacgtttta taagatggca agtcctctat cttcgaatag tttactcttt aaatgggatt taatctatac agccaatcgc ctattcaatc accaatccaa gcactactCc aagttgacat tgacgcctat cgataaaatt cctggtgtat tgggataatt atctatacat gagaaggaag ct tatt cggt tgattgcact cttgtcgctc atttcttgac caaaaaggca taaaacctta catacttaca ttgactttta agggaatgaa gctgagaata ctatcaaact gattgaaatg caggagcagg cctgagaaaa agactatttt cggtactact gataaaggtg tggctaagat ccaagaatgg aatatgaaaa ttatctttga agtgaactac aagttgggat ttcatacact acgtattagg ggcagccgtt ttccgttttg aaacattcga t cat t tagct aaattcggtc ctgacgctat gagcggacgc gacgcatctg accttttggc ttagttatgg gactccgggg ttcggttcac attcacttat taacaaaatg aaaat aaagg tttaggaaac catcctgaaa agcctgactt attgtactta ttgtatgaat tttacccatt caaattgaac gtagttttat aaaatgttca tgcaatgttc tccttattac tagttgacac tggagcttct cctgagttct ctctacttac tcgttcatgt ttcgaagaaa acattcgacg aagtcgaaat tcacgtcaag ccaatcgctt tgtcaagatt atgggaaact ttcaagaacg ataatggtta tgcctatcat cattcgaact gagccggaat aacagatttg gatttcgatt ttcaaagaag ttcttgaagc cgaaaatgac ctttcagttg ggctttggga tgittatgat gagagagaac caaagacttt ttaaagcact cgtcgaccat ccgttcgtag gagtggttga tcaggaagtt ggtattaaga ctcgagcaga ttgttcgaac acctattcaa attgtattag ttgctacctt ctgtccagtt caccgaattt gtgtataata aggctcgaac atgagtcaaa caagacacca aatacgatta ataattacct agtaacattt cgcgaatgtt ttcattagtt ttcagtttcc gtaaccacgg tgatggaacc taagtacact cgtcaacaaa gtgaatcctc ttccttggaa atgttcaagg tatccggcgg agtcgactca tgctatagca ttcaattacg tacggagcca agttcaccat aaggaaaagg cgaaatttca ctatgttcca tttagaaatg tacgtcgtat atggtgctca ataattcaat gtcaaatgca tctaaccaag gcgcaagtcg tatgaaagtg acgctgaaag atggaatgac tgaccctatt cagctcatca actagttgga ttcattagca ggcggaactt aaaatcgcag gtacattcat tagcaaatgc agttgacacc ccttacctgg actctcacgg cctacaggtt gcgcagaatg 
taacctttat cgacaaagac atcaatacaa tcagcctgaa ggaaattttc ggtcctacaa ggtggatgcg actatcattg atatcacagg caaagaagct taaccacgtg acattgactg ctaaaagaac atggattcaa taagcgatat cactattagt tattgtagat agaatgaatg ctagcttatg cgcgtgatat ggaatgcaaa cgcatacgaa taaagtgtat gaagacccag aataaaagag gagtataaaa ggatgggctt cccttaagcc ttggtctatt aggcgaggac cgctgaacat accgtagggt gaagaccttg ttcttgtgaa ggaaggtgca tattgcatac aggatacgct aaacgacttc ctaaatcctc aagcagttgc agcacggggc aacgacagtg attgcttcag ttgattaaaa agctcttatt gcggctctat tttagagtca gtgaagcctt gaacaattat tgcaaacttc ccttggagta gtggcaatgg cttgctaarg cttaccttat tctatgtagg aattagtgaa caatcatttt agaacactga tagcactgac gattatttga attggcaaaa ggtggattga gcgcacattc tgctcacacc cgtgggaatg tttgactgta gaacagcttt tggaagcacc aagacaagct ct tacctatt attcgaaggc ggaaagcac.a gacaagtgga tggaa-agagt ttgggatgac agttactagc cacaggagcg atgggaaaca gatgetgtcc gtaggctgcc attttagcct agagcaatta ttgggcagag aactatgaat cgctcttatt ttttttaaaa ttccatatgg gagaagactt ttccttacat tggaatttca attcgaagtt attcgaaaca caattagagc gtcacccatt ttatgacgtc aaaaggatta aaaaccggtt caagttgaaa gttgaggact ataaacttcg tcaagggaat taaaaatcgt taaatggctc ccagccaatg gttctaatcc ctatagcgcc gttgacttgt ttgaatccat agcattgttc caacgtcgac acctttttgg gaatctgtca atttccacgc tggcgaagaa ttcgcaggag gtcacgctat aactgtacga aagaaacggg aaaacractc gtcgactcta accgaatacg tgaatgataa ttagacagcc taagacacgt gcgcgagcga atggttgaga aacttgatgc tcgaaactac actcgactac gaagcataaa agacgttaag actcacgcat gactctactt ctgtactgct cacagaagaa tggaggaatt tatcgaagaa actggagcgc aatgttcaat acatgctgaa tattttagat aagagctttt WO 00/32825 WO 0032825PCT/I B99/02040 4761 4831 4901 4971 5041 5111 5181 5251 5321 5391 5461 5531 5601 5671 5741 5811 5881 5951 6021 6091 6161 6231 6301 aaaaatgaac aacgaataag actttctaaa tggcgctcat tgaaaaaagt tcgagaagga gaggaacctc atatgcttaa atatcaaaaa cagctcaata aatgcgtcat tgacgtgatt gttcctgaag aagagtaccc gctgttctac ggcttcctgt aaaagggact gtacttctgg atggaaggta cccctacagc cgtcgaattc tacgcatctg aaattgtatc tttttataca aggtaaataa 
agatgttgag agagatgagc tcaaacttat aaaaraggag ctttcattgt aagatttaaa aaggaggctc agttgaagcc gtttacagcg gtgaaagcag aatcttcgct tacattcgac ggaatcgcca taaaaggcgg agaaatgctc caaattgacg tggaagatta agaaatcctg ttattettga ctggcaagaa aaccgtcacc aaaacgcttg actctattca aatgacagca gttcaacaag tacagtgaca acttagagca acgatatgaa aaraacttca caagaatatc taaaactagt tcgatgaagc ggttattcaa actcaaaatg caagaggctg attatttaga aacggcttta atattatgag cattaagttc tagcaagitg ctagaaatca tatgatggct caaacttcct agcagtttgg aaaacttgta aaaagttatt gggaatggtg cacttgctcg aagacgtgag atatcaacga ttctgcggta aaaagcaatt actacagaca attccttaca acctaatgag atactactgt ctatatttca tgaagacgtt tcacagcttg agcgtattag accgccttgt aagaccgact tcgaattaaa agtttcgaag aaagaattca gaagaaaact tcactgtctc ctcattatcg tacaatcata ttaagttcta cttagaagaa agcaattatg aaagatattc tacgaagtat tatagagagg tgagttcaaa cgtcaacttt ttattcacct tctatagtct ccgtgaacgg gacttatgaa caaactcgcg ataattcgtg aaaaccgaag caaactattg tcgatgcatt gaaaagacca agtacaatat tgaagaaaat tctaaatcag tcattcgcgt tattttagca tcggcttcag actcaattga actattcact acttctacta cttgccacct ttatggaagc gaagaataat tgcttcatct tattataagc tagataacgc gtacatggtt aatccagttg aacttcttca taagacigaa tgctacggaa 6371 atcgaatggt gtcgtttact tcctagcact 6441 gaattgcaaa gatggttaga gCaggaaaca 6511 ggttattgaa cgaactcagc ctgaatataa 6581 attcgaaaaa tgtatttcga aagaatcggt 6651 tgggcgaagc tggaacattt aggcacgaag 6721 ggactttgaa tggttgaatg tagcagagtt 6791 cgtttcaaga aaaacgatta tgiaaacgaag 6861 gactagttcg atataaaggc aagctctaca 6931 acatactgag ccctatgaag aacacaagat 7001 gtcattttcc tctatgaaaa tcgagataac 7071 tgaaaaatca agtccttgga aaaattatga 7141 ctattgctct tcagcctatt gcccatattg 7211 tgttcgagga agactttttc gaaggtgcaa 7281 taccactaat Sgatttcgag gagttgcaaa 7351 tttattgaac tgaaaactac taaagaagct 7421 agctatcacg cgcagatgga tgcaaattta 7491 gattatatgg tatccaattt caagccttga 7561 ttcatcgatg cagggtatga agtttcttac 7631 ttctagatgc agttgagctt cattacaagg 7701 ttcgagaaga agaaatacga gatgctcaag 7771 cgacgaaatt gttgaagcag cttgcggttc 7841 caaaatcctg 
tcattatgga agaccttaac 7911 cagatagggc ggaaatggtg ggaatacaaa 7981 tctatacatt ttagccgccg ggaaaactat 8051 gaagaagtca tcgaaaatgc ttacaagcga 8121 aggtattagc atcttaaaa cgaattcaaa 8191 aaaaggagta ttattaaatg caaaaagacg 8261 atacaeaggt gattgggttg atgtacgaat 8331 tcaagatgtc gaaaagtgct tcaaaaggct 8401 cacacggatt tgctcttgaa cttcctaagg 8471 gaaaactggt ctaatcttcg tttctagcgg 8541 ttctcagttt ggtatgctac tcgtgacgca 8611 aggaaaagca acctgetatc aagttcaatt 8681 aagtacaggt gatttctaat gaaattggaa 8751 tagcagttca aggacttgaa cgtgaagcgc 8821 aacctacggc gggctccctc gaaaaagggc 8891 teagctctcg acattgtcaa gaatgcgcaa 8961 tcaaggaaaa gctggaaaat gcgcgtgcat 9031 actcgatagt cttcaagagc ctcttaagat 9101 gctaaaaaga ttggagtcga tgttgacaat 9171 tacttcaata tgttttagac atttcgaaa tcaagagccg gtgaaggtcc tccttcgaca gagtctatta ttctccaaga cttgaaagaa tgtaagaacg ttttagagat gcaagcaact aactttcaaa gcatattttt atcgacagcg cggccgcaac tgatattgtt gctctcact gagcagag atgtatcaac agtattcctg tcgaaattta gtttgaagat t cagcct ttg gcagttatga taacagctta gaaaccgcaa ggccaagt cc tttgtcaact ccagcggggt agat tct aac aaaatggccg aaggaactat actttcattc accatgttca tgtgtctagg gtttcacac aaaggcgaaa tgtgagctat ttcacggtcc caactaactt cactgataat tatttccaaa ttaaaagcgt ttcgaaaaga aagcctacac cctgcgaaga gtatgtagag tagaaaggaa aagactttga tccctgcgat tctttgagct ttctcgccgg aaaaattaaa aagaagcgt c agaaaagcaa acgtagcaga acttgaCCag tactacattg tggattcaag tcctgacaag gcctacaaga cctggcaact tagacgtgaa tagttctatc caagtatatt gatatgaagc agtgattgac gatatcttct tcgtagaatc cagttgatga ttccaagaat agttgaattc atggtatttg ccaaagctag tgtatatctt atttggatag caggtgaagt cttactcgct cctattcaac ggtcgaaatc gaaagatgct tatatagccg ttaataacat aattttagtg cggtctggag atataaacaa gCcggcgctc ttaaatggaa ggtaaggcta Ctttaaatct caacaatac gcaaaactca tatattatat aattgtttct ggtgacggcq atgttgaaat cgtcacatta acagaagatg tgaaaagctc aatttatacc cctatcaagg atgagaagat tggaaaattg gatgcggcta acaaaggaac agacatcatg ctCttgaagg ttaagatttc aatttaacta cgctgacccg tggtggatgt ctaattgcaa aaatcgatga cgtcgacgag ttgtgtgacg agttcactaa 
agtcgatgat acagacgaga gtCCtaaaat actggaaaaa gtctatatga tgggaccttg caatggttcc agcatgaaaa caacccaaac ttccaaaatg caaattgata ataaagtagt agttgtaagc tatttcgccg aatacgaiaa tgtcatgaat caggccgata caaataattc accgattaaa cgccgatgtc attaaaattg gtctttttaa tgatgaatgg tttagaattc gaggccatgg aaagctcttg tgaattatca gaaaactact actgaagaac ttgaaatgca cactgagtgg gctgaagaaa ccttgcctta ascscctttn at caatcaaa agcatgct tg ccgtactgct gacagaaaattcgctgjcga aattatgaca gaggatgact gcaaaaatct cctaaaccta gttgcatcta atattctctt gaactagatt gaccattcct tggcaagacc taagttacct ctcgtatggt gcgattatca gcaatggaag aaattcaaat gctatcttcc cactcttctt ttctgctatc aggaaagaaa caagcagaaa ctcgaaaact aagttcaact aaagctagaa agcagagtta gaaactcagc aatgattgac cctaaactcg actaaaattg acgccgacag cagtggcggc aggtgaatgc aatcttgcat cctcgttcca gaaggttaca aaggtgacac acgaccaaag aattgcccaa tttaggaaat gcggctcgtg aggactggaa taaggattcg ccctttttCt gcgccttcta ttcggtcctg agtcaagtgg agcaggaatg ggaacagaag caagactgct gtcaaggaac gaccttgaga atacattaga ttcgccctga aatgaacagc tggcctagta gttctagatt -agcascgatt acaatgcaat attcctaggc tccaggcgga aagatgtgga gaaaacggtg catcattgac agaccaaagc atttaagccq tgaaaatgac cttgtagacg gtcgaccttg aaactggaga atctagttcg acgcttcaag cactcgagaa gaaggctaat agaaaagaac ggttceaaaa gcgtcaagtg cttgttcata gacaaatgga gccacgaact 9311 9381 9451 9521 9591 9661 9731 9801 9871 9941 10011 actgaattta gtcgaaaggt ttcgagaaga tatgaatagt tgcagttcga cttaaattta cgaaaccctg cagggaatgt tagtttccta tacgctttcc atttggagtc attcaaaagg gatgaagacg aagaaccatt acttattcga catggtgatg ctatttggac ctaagctagt aaatcgatga gcaagtggtt ttattatttt aatgactcaa tactcctctt cagtacaatg gaaaaggtga agtagagtca ctaccttgac ttcgtcgaga tatcatgatq gaattcaaat caggggcatg gttcagtatc gaagttccaa ggcaaggcaa actgcggttc acgaaattat gcctgCtagt tcaaggcgca gagcttatga accgcagaga ttatagcaga cgggcagtat WO 00/32825 WO 0032825PCT/I B99/02040 10081 10151 10221 10291 10361 10431 10501 10571 10641 10711 10781 10851 10921 10991 11061 11131 11201 11271 11341 11411 11481 11551 11621 116 91 11761 11831 11901 
11971 12041 12111 atagtttcgc ctggaatggg ttagcttcta tgaaccaatt cactactgct tctaataatg tcaaagttta agaagagccc ctcagtcgag ttatcgaaag acttgcaaat gacatggaag ttgccaacta attagtgact atcactcaac tattgtggat aattgaaacc catttcgaaa ttaatccgtt gaaacat cat aatgtcagct gttgatagca ctaatgaaga gattgtagac gagctatttg ttactaattg tttaaattgg gaccttttag gcgtgaacga ccaataatct accctgatga tcttccatac aataccgtcc acaaaatggc cgaattttcg gggtagaaaa catcattgac tcatcgggaa ttcaacggtt tgaaaatgaa ggaggaatgc ccgtttctaa tgacggctca cgaaacttta ttcctgctca gctagaagaa aaacttcttt tggaaacaac atattgcttc tcaggaigca cttaactcgc tcaataatgc gaaaatgcag tattgcaatc aaaaggttac gctcaaattt tcgacagttg tgagggaagc at ttat cagg aaagccgttc gtttcgacag actgttctct ataacgagtt taaacagtct gactgtcagt ttgctgtaag ggtcgcagaa aggcttttaa tcaaactttc gaggaagtgg tagctcaaga atatgtcaaa gctatcaaac acggctatct attctgtggt ggcgctggaa cgaaggatgt gaacaaagga cttggctctc ctattgaaat tgttcgaaac attattgaag actctagata caagtctatg gaggttcata tgctttcaac cggagcattt aatgcgctgt ccgtgttcat tctatgtact actgaccctc aaaagattcc tgactttact cgaattgata atgacgacat cgttaatcaa gaaggagctg gttatagtta tgagcgtgac gccctttcgt gtgacagcat cacaaggctc gaaaaagtcc ttgattatag tgcactagga gttccggact acgaaacatt cgctrcactt aagrgtttag aaattgtaaa tgacttccac tactcaggaa cagacttcct tttagaggtt tgtaagtatt ggctagttcg ttttgaaagt aagctagagc aattctgtga ggcttttcaa atgaatgaac ttgctggagt tgttaaatgg gagcctaatg tgatgagcaa ggaggagtga catgattgga cagggacttg ttccaaaata tataatcgtc gaaggtgaag taggttcagg gaaatttgac gctgattcta ttgtagtagg aacgagtgta cagactattt tcaaggcgag aatctacgtg atagacggaa ttttgaagat agcggaagag ccacctttaa actgtcatat tttacctacg cttgcaagta gagcaaaagt tctaaccatg tttgtcaagt cctacaagaa ggtagatact tcaggaattg ttgccagcaa tcttcaaatg cttgaagaca tattagaata aacattttat gacttaatat gggaggcaag tgctagcaat aaggaaactg atgaaggaaa aattgagcct aaacttttcc tcatcaggaa gcactatgta gaaatgtctt tcgaagaact atctaggtgt ttgcgaaagg tatctaaaaa gggctcaaat agggtcaaac aagttgagtg atttagtatc atttcaaaaa tatatcttgt acggcgaaga aattggtctt 
atgaatgttt aaacttcggt ttcaacagtc tggaaaaccc tcactcaaaa tgttcgagat gataaggagt ttctgtctaa tgagtcgagg acacttgttt tgatggttac taaaattgac aagcgaagca ttgagtttga gaaaatgact gacgcgcagt tgaaaaggca cgacatgatt gacatggtta tccagttctg tctaaacgat ctgtcgcgat tgaaaaaggt tgacgcatca gtagttgaat ttttcagcct agttgatgat gtattggaat ataggccgga agccaaagga gaaagtccta ttggattgct taccttgctt ctaggagccg atgagcctaa agaagccaat ctaggcatta actttcaata cgagctggac tcagcctttg aaggcatggc gaatggtcgc tatacagaaa gttcagtggt ctatatttct ataagctgaa atcgtgtat attacagtat aagcaaagga agcccgcaaa aggtgagagt agttatggtc gggaatattg gaacagaaac ttccatcagt tatattatag aaaatgaaag ttcaagaaag cagtaacaga aacaatcaat cgtgacggta ataccaattt cttctcaagt ccagcaagca ctcgatacca ctcattaaac gtgttcaatc aactactttt cgaaatggat tacccaatgg aaacagttgc aatcgtagca ctatttcacg ctgaaaaatg gcgcaagaac agcgacggtg aatgggaaag acttacaatg ggacatggtg caaaatctaa tttccttctt 12181 aatgggaaat gtagttcgag 12251 tctaatcatc gaatattcgc 12321 ttccggatgt tagatatggg 12391 ggcctttcct gataattgtg 12461 aaatactcga ctattgatag 12531 ttgacaatga attggacaag 12601 gcacaagacc gaaattgaca 12671 atgaaagtga ctgaactttt 12741 ttaataacgc ttgtcttgtg 12811 aatcaataag attgtctata 12881 caagctatcg agggcataaa 12951 ttttttcact tacttaacaa 13021 tgacagaagt tgcggtaaat 13091 atatttaaaa aggaagtacg 13161 tgacagactt taaaaaacgc 13231 tatggattgg ctcgaaaatg 13301 gaaggtggac ttgtcgagca 13371 gcaaaggctg ggaagacatt 13441 agttggtcag tatcgtgaaa 13511 tatgaatacg accctgagca 13581 ttcaactcac gccagttgaa 13651 aaatttgaat ggatgtggag 13721 gccgcaactt atgtagtcga 13791 ttgaagaagt agttgaagaa 13861 tgaagaggct gaagaaaaac 13931 gtagaagagc ctaaagaaga 14001 tcgaagaggt agaaagcgca 14071 tggatatgtt cgagatgtct 14141 gagcctgacg atgacagcga 14211 aagaagactt cttctacgaa gacggaaata gaaaatgaat gaaattcttt ctggc aaaac tgatgctgct gacagcgagc tgaaaacatt tgacactatt cttcaattta ttattgggaa tcatcacgtt gt tgaagct a aagacttgaa agatatttca tatcctactc ctaaaccgat ttaaatctac acggaagacc gatgacattc atagcctgtc agccatgact ctaccttata acgaccgagc tggcgcagaa t 
cgctaaagg tcaactgtct tgaggcccat gcgcgtgtct gacattcgaa atctcaatca agggctcgtt tggaaaaggc agttgctaaa ttttgtgtct tactctagaa ccattgtcaa gcaggcaatt tatcaaaatt agcagttctt tattttaggt ttgtataaaa ggacagccta aat t tctcga gggtctaata tcgagaacct tggaagctat accatggtag acctttgcaa ctatttagca caacgtttca gtccttatgc cgcagatatg gaggctgagg aagaaaaagt agaggaagag actgaaaagg tggtacctgc agatgtcgac cctgtattag tgccggaaga acctaaagca ggttgaaata gtggctatgt gtagaaaag gactaggca tcgagaatga caaccttgca :ttatgctaa aqaggacatc aatcctgaca acgcaacttg cgaat cgt cg accacggcga tgaaagccaa attggatacc atggggataa gctcaagcaa ttttctggca cagccttcga aactaatcca aaatgaaaac ttcgaatact aaacctaaga gttcaactcg caaaagctgg aatcactcga gcctaagaaa gcatcttcta gacgagccga aagttgaaga actacttcta cagtgaagtc cattcttgta gacgaagaag cttgacggca aggttcacaa 14281 14351 14421 14491 14561 14631 14701 14771 14841 14911 14981 15051 15121 15191 15261 15331 atacgacgaa gaaacttggg gttgcaaaac ctactcgaaa aaatgtgtga aaattgtcaa cgacgcctca ttcacttaca aaagaccgtg acagcctttt agagacittg tattgcaaat aaaggctgaa gatttaaagg ttgcaattag ttgaatcagg cggcactcat ttagaagtag caggggtata ttgatacccc tcaatggtga cgctattgct tgaaactatt aaatacgagg aacttcgcga agactatcaa agaactcgaa aaccttgaag cgaaatgtct tgaggctatg actatacatg ggttcaccaa aacctatcac aactccagcg aacgaaacat aggagattcg agtcgctaca tctcgattgg acgttatctt agcattataa cagctttgtt tgacctttat actgacat tt agcctattgc cgtgcaagag cctttgtggg tgaagcagaa ccttctcgtc tcaatactag cgacaccgca gtcaggqcc aagcatttcg aggt cttatc tggaacgaat cgataccgtt aat caaagga tactacgact atgaacaatc gtcgaataaa atacattgac tatgggagcc tatgatatta cttgcattct taatccatcg ctcaaggtcc agttgaacaa taagaaacct gcgcctaagg cgtcgcaaac ctgcgccaaa aaattcgaat gcetaaaaag agcagaggac gacaatgtgg gctgacgttt actacaagaa agtacatgga cgcaatgtgt attagcaaaa ggtgaacgct tacatcaagc gaacagaaaa gccctcgccc ttaaaagaaa aattttcaat gaagatgaaa gcagctatta gcaatcgagc ctcccgttc tcacgcagaa tgaagctgtt caagaggctc gacgttgaca aaaaaattgg aaagacgcta tttcacgtga gatgattatg atgacgttat gcattagaat ggcgccttac 
agatgatatt atctacgtcg agcgaaagca aatgaacaaa cttccttctt gctgtaaagg aatctagtcg aatgttttcc tcactaatgc ggccgctgaa aacacttgaa gaaattttag tgtattagat gaccttccag cttcgagaca aagcagttga WO 00/32825 PCT/I B99/02040 15401 15471 15541 15611 15681 15751 15821 15891 15961 16031 16101 16171 16241 16311 16381 16451 16521 16591 16661 16731 16801 16871 16941 17011 17081 17151 17221 17291 17361 17431 17501 17571 17641 17711 17781 17851 17921 17991 18061 18131 18201 18271 18341 18411 18481 18551 18621 18691 18761 18831 18901 18971 19041 19111 19181 19251 19321 19391 19461 19531 19601 19671 19741 19811 19951 20021 20091 20161 20231 20301 20371 20441 20511 20581 20651 cattattcgc tctaaacacg agttagtgtc tttgtactgt gaggaccacc gtagtaccgt tctttatcct cgcctatctg aaacaagaga aggcagctgc acaaaggtga cgtcgtaaca cagcttgaaa aaggaatggt atcgctttcg actttggtga tagaggatag aaatgataac ccatgcaact gtacgcagac tgaccctatt gttcgagaaa acaaacctcg accagaatga actacctaac caagctacaa ttgataaatt ccagcaattt ccttgttcat ttcttgcttt caattctagc atcaacttcc tactgcaatg tcaagttcgc gtgaccttat attgtttctc cgtctttcca atctgctgta ttgacgcttg ttttatttat aagttgaact tttttaaata aaaattaagt tcatcttcat atttcgtgga ctcctttttt caatcctttc gagtcgcttt aatagtttga atggcttcaa caggaaagca aagcgttcca ttagaatatc tttgtagtca ccaaatcttc gtcctcgtca gtttcgaact cgaatgctaa tcccactcta aatcgtcgta ttgccatttt agtttcctcc tgaacttaac ttggtcgacc ttggccgttt tcgttgataa ttcattttac tacctccact aagacgttct aggcttaccc acaactttca ttcctacttg tatagtatta ttatacgata tttttttttc aaaaaaataa cctcatagcc tttacgacgt tactttaaag tcatccgcct ttggaaaact cacctatatt ctaaaaagtt gtccaaggtt aaggctgaca atttcactgt ggctctgctc cgctatctag gggcgtctgc acgcgcaacc aattccttca aaatagctct aggtcgaaat atacttgaat cttttacatt tacttttttc ctttttgttc tttgccatgc tgcttctcgc gatgcaatag ccagttatgg tggcgtcaat atactagcct tttataatag ttgtagacga taaggagttc catgagtttt gaaaatggat caatccataa ttgaaaaggc tgaaagcgcg attaggtcat caaaagtaag cgacatttcc atagtcgcgc agaataaact ttgcattctc gccatgaaac tccttccttc tttaaatttc agtgaatatt cttccacctg 
tcttgtagga aggttcgcga tattttagac actaattcag tcacgctgat taatacaaaa ttgagcaagt gcgatattat tcactcgggt tgtcatttgc tagtcacttt ctatcatatt gtccaagcgc gacaagtgtc tacatttttc aatatctact tagagccttt tcataacttt ttaccaagat tatcaaaatc tacataaata gaagcagttt tttaaaactg tcgcttcagc cgcctaagac ttcagcttgg tgaaaatttc attttatttt gaatcgaaat cagccttttg cgcccttgta ccatgacatc taagcagttg gactcaatgc tcctaaaaaa cggaggcgaa cttatrtaaa cttattccta acgtacttga tgtcgacgat agtcaacaaa gacgagcgca aattctttcg atgtcgcgag tctttctaat agtttctttt agaiaaccga attatgatta aagcaagaat aagttcgtcg tcattttgtg aaaagtccgt gctagrgatt atatcagct tcgttttcat ggact tccat gtcgaagata ttatgcgata gtttcttcca tttcgtacca tt t tcgt cca atttacgacc caaatcttta atgagtgaat cgagccgaag gctacctttc tggcatagtc agcacaacgc ataggaaggt ccttaaatag tacatcgcca tggagctcct tgtccgggtc ttcatctgta gagagatttg tagtatctcc tttcgagaat taagtaacca ccatttcctg ctggaacttc aactttccat ttatcttctc ctaggctgtc aactttctct taaggagaaa catgacttgc tttggcgctc acgcgcatac ggaggaaaag ttatagaatg tgaacaggaa cagtacatag ataaacagrg tacaagagga gctcatttca ttcccacagc agcaaaataa atcttctagc ttaaggcgtt taagtgtgac aactgagcct acaggaatgc aataaagtgt tacgataata aaaataaaaa ctgtccgtac atagtacagt tatcaattgt tattgaaact tgaatttgag cagtatgatt agcaggcgat gtcctcaaca gttacaagac tatagtttga tgtattcgcc ccattcatca ttagtgattc ttgacggtca acttttacca aaagtcaagt ctacgttatt cagctttaga gagcaggagc aaaacaagtg cctttggaaa ttcaccgtct accgtgtgac taacagtcat aatagtgcct ttaggcagcc tagggataag atttctgttg atgcctgttc tctattgact cgcgtgtttc gaacaggagc ttattttcca tataaggccg tagctcgagt agtgcttcac cttgatgaat tatatggtaa aagtcattct tcaatggttt atttggttac aggtaagcaa tcgtaagctc tgcggcgttc tattcgacac aaaccaattc ccacgcgcag agctagtgct tacagcctaa ttcaggagcc actgactact caagacagtt atgaagccac aaagttcagt aggttcgctc aaaaactcga ctattctgct caatgtctat aagtcagttc aagcgaatat aaggaacagt tactccaatt aaagggtcag cgatatacag ttcgttgata taactggact cggagccgtg taggagtttc aaaatatggt acgccaaaga agaagcgctc gactttgcta atagacctat ttctaggtct atttttatta gcagatacta 
ggtggcggct ttcttgttta cgattcttgt agttaatttc ttgatgattt tccagtttca gcgacaggac atgctttgaa aggtctaagt acaagttagg attgattcca tttcatagtq gaaagtgtag ttcttgtgac tgtttccata attgacctct ttctgcgtcc aaggaataaa gtcaagcact ttttacaaaa gccctaataa tagagctttt agtttagcag tggtaagaaa tagctgattc aatatccggc tacaatgacc tattcttgac tgaagttcct tttcgagtct aggtgagtga aggaacttgc cctttataag aaagctcatt ccgtgtatag ggttaggaga gtttcgataa gctacaaaat gttgataaat accttcattt tataaccctt aacttcaacc cactcgtcgt cctcaccttc tcttcgaatc cttcattagg tgcatatcct gtccgtcaaa ttttactgtt tcctttactg taatttgaga ttcgatgtca ccatagttga catgtcttcg attcttccgt cttgaatcat ccgaattgtt tgattgcttc tttaactgtt gttatcatag aaccgaatac gtccatcact gttactttaa attcagtacc ttttgcattt ttttatatga ctcctttatt tgtttttctt gtttttgtaa acttttttaa attttttaat tatttatctg ctcaagggct tgttgaattg gccgggtgaa aagtcccaaa cagtttcgtc tggacagctt tttgccattt ccgccaattc ctctagtatg ctggctagac ataatgaact ctcataaggc tctttgacat cgtatttgaa ttatacataa taccttgaac aatttcagta aataggcttt aagaactgca aaaaaacctg ccaaggctga ggtttcttac aaacaatcct aacattgtca gcctgttttt atttatataa acttaacagt gacttttcta taagcgattg cattttcctt ttgacattta ctttttttcg gtcttgcttt ttagctctgt tcagttcagc ataggctcac aatattccgc caaagatttg ccttaccata aaatacaaaa tcgtcttggc aattttaact aagctcattt tcacccaaac ctcctttttt catcgtctac ttgtttaata tagtttcacc ttattccatg tacccgtcaa tgataatttt agtccagttc ccactacatt tcgattacaa ggttgccagt atcaatttca gatacctatc atatgtcgcc tcttcgtcaa cgccttccaa aatttcatcg ggcataatct aagattgaag tcatgttgag gtctgtcaat ttttttgttt gctcgctagg taggaccata aggctgacaa aaagcctttg aggtatgaat gtaacgaaga taaagcaaag cagcctcata aaaattattc gacttttatt caagcgcata gaatatgacc aagttcacgt tcccaccaaa gccataagaa gggcagtttg cccctcttcg ttttttcaat aattttttcg ttattcatat atgtcgtcta cttcaattgt cttgtcataa acatcttttc attatggtcg aa~ctttcigagtatcaaca tttcgaagcg ataaaaaggc gctgaaggcc tcaatccttc agctagaatt ttagttactt ccttacatat ctagagtcac atagettcct cttcgctgag tttttcgagt aaccgttgag aatgttttcg atatttcctg gtattcatta gtaagtgctt 
tagcaaagtt atactattat tatacaataa tgattgaata tcgaatttca ttttagttac cgcccttcaa tatacgcttc gaaatgtgtc ctgaagcgca ctttttaaat cgaatggcta gtaggaagtc ggtcaatacg cgtcttgttt ttcgccgaag gcacctaaaa tccac tagct taatcgaata ctcgagcttt gaaatgaaat tcaagttcga ctgctaggta agtggcgtga tatcttccaa tacaacatta tcattgttca cccttattt tcagtcgcga caacttttga agatttttaa cgaaaagtca aggctacaaa gaacgacaat aataacccca taaagcttca gtcctactca gcaaagttcg ctaccattag gttccccct WO 00/32825 PCT/I 899/02040 20721 20791 20861 20931 2 100 1 21071 21141 21211 21281 21351 21421 21491 21561 21631 21701 21771 21841 21911 21981 22051 22121 22191 22261 22331 22401 22471 22541 22611 22681 22751 22821 22891 22961 23031 23101 23171 23241 23311 23381 23451 23521 23591 23661 23731 23801 23871 23941 24011 24081 24151 24221 24291 24361 24431 24501 24571 24641 24711 24781 24851 24921 24991 25061 25131 25271 25341 25411 25481 25551 25621 25691 25761 25831 25901 25971 aagtaaagca tgggtggaaa ggacttaata aaattagggg tttagcaggc aagcact tcc acttctaaac ctgctcgaaa ttttttataa atttaggcga agtaaataaa cgacataaaa ttttacicaaa acaaacaagt ttttcgaata acctcaaaac aaaagttgaa ctttttttac caatttatac ccattttcaa aaagcgccct gaaaatacct gtcgagcact atcctaac ctttcgaaca gcatgaaaaa tctcaaaatg ctctcaaaaa atcgagccta atttagaggt actcgaaaag tcgagcatag cctcaatcct tcgaaaagtc ggaattaagg acataccagt attcagtcaa ttgtaagtca aaactcaacc ggttcgaaaa aactcaacca ttcgagagta ttctcattat aggactataa aatataaagg agggtcaata ggccaaaaga caaaaaagga taaaggcatc attcaggaaa gggtgaaggc tgcagggttt atgtgtttag tcctttcgaa tcgatataga gaaggcgagg gtgacgcaac agttcaactt tcgaactttc ttttcccttt ctcgacgaaa aaatgaaaat ggaggaaaag tggcactgat gttctcttgt cgtatcttat agaaaatgta ttattcaatg aagatgacaa ttgcagcccg taagaaagag aaagtctcag ataatcctga ccttatagat ctacaaattg ttaaagataa aggttcctac cgtccctttt gatatcgtca gggccatgac tgttattcag ctacttatta atgagtcgag gcctgaatcg aacccggcca at ccccact t tgctggagc ttcaatgaag aatggcgaaa at caaagcaa ttaaggaacg cc atggat ta ttcgagaatg tcaatattaa cagacattca agagctcaaa agcgcgagaa 
cttcgtagaa attaaaggag agtaagaaac tagcagacgg ggaattcaac cctcaaagc ttagacact tcaagacct tttgtcaacc atcctaatca tacaatggac tccaggaaaa agcagcatgt tgcaacaacc gctactggac atgcgcgtgc tgatgctaga aatgaaaagc aaaaggctc acatcacgcc tttactaaag aaggacaaga aattacatcg gcactagata tgactgataa aactccagct ttccattcgt ggacaaaact aaatgctcac tagaaatggc atacaggggg aagcgacagg ctacttcaag gacaacccta gatttatct caaaaagcc ttcggcctcc aatttcga actatttaaa cctcatttat aaacaatca acaaatccca cagtccgcaa actaccagcc tcatatacaa gcatacaacc agaggaacag ccgtaaaatt agtaagcccg cgaaaaaccc aatttctcga aaaggggtcg aaaagtcgag gaaccatccg aaaagttcaa tcaacctttt tagcttcaaa cgcgcaaatt tgttacaatg caaaagttcg aagaggaaaa caacaaagac cagttcgcag atgaaatcgg aatttattag gcgttggcga aatacgggct tgttactaat gatacattga gcctgggaga aactaatgaa aaggaaacac tagatggggc aagagcgaat ggaatgctcc ctccgggcca aaacgggcga aagcagctca agccgtttgg tgacaataag cctgagaaat cctaacatgg tggacaaagg tcaggaaaaa cagtatcgat ttgccacg tggtaagaca aagtcgcggg tatgaaatc gaagaaattg tcaactacct caacattagc aggtatctcc gcgctgttcc gtaacaggcc aagaactgga ttgacaaaca aattataaaa atatagaaaa gtaaaaacaa tcgacacaga attatagaaa aacttttcga aaagtcgaac aatgctcgaa aagttcgaaa atcactcttt taaacgataa actcctccac agtatgacta aggcatgact cgcgataaat ctcaaatgta catcgtcgaa gcattagatg cggaagaggt ccaggaaatt caagaattta aaactaccta gctcacctt ggctctttca attcact cag gagatgttcg ctatatattt tgtgatgagg cgaaaatgtg ggtcgaaaag aggcgctatg atggtctagt attcgtagca aagcgctacc atgttaattc cgatatgata cttcaaaagc gcatttcatt tgatgaatac catgaccact tcgaaataca atataagtag tcagtcgttt tatcaatacc aagactcagt acgctttatg gagaacaagc tagtcgacgc ccagtttt at gcttgacgga cggcctttgg aatatagaat atggattctc agagcaacta ggcgactttg gtatctataa atctaattga gicatactac gaatattcaa tttagttcag cgaggaaagc aaatcgaaca atccttatat agctagaaag cactcagggc gcgaggcgga ttctacaaaa gactactaaa gagtacgcaa tataatccc gacccgtcg cttctgctat aatatcccta tcattcctgc tcacgctgaa tatgcctaca gcatggatag aatattatcc gaggat t tta gacagcccct tcgacggacc cgactacaaa atgggtaaag gtattctacc cacagcaggt 
tcaatgccc atcaggacga agctggaaca gccttaacgg aggctaaact cgccccaacc gggacaagcg attcacttcg gaagattgcg aagcaagctc gcgctaagaa aattgcaatg gctattcaat ctcaagacac cgatgaactt attgaayaaL t tgacgaaac tgaaacgagt gtcgacccag atttcgaaac ccacttatgt gcaggaactt gctgggcata aatgggtatc aagctgtact atacttgcct gaatactccc ctcttggctg agaacagact gctgggacag taaagcgagc gaacagatat gcctgcca ggaaaaggcg ccagaaacta aacatggcta aaaaatcaaa tggcaaagaa tcaaaagttc tgatgtcacg gacccgaaca cccactcgag aaattcgaaa agccagagct tatctttagt cgactctatt tcaaggaact aagcgggtac cgatgacagc agttcaccta tacagttgac gcgtacaaaa ggaacgagca agtcaatcag gaatcgcaac acggagagcc gaaccaaatc tgcagaaatt gaagacaacc ggcccgaagc aaattccccg acgccaaaga ccttatcaca ataaaatg ttcgagcagc ccaaacgctt tggtcgacct aagccacttc catttcagga agccatgtat gaactaacgg cagttcttat tctacgacct ggctcaccca aacgctggaa tcaatcaagt taccaaacac tctgctaagc aacttgcgct LuagL4agaa qqaaaycy tacactcgac caaacgggag ctgacgctct actaaacact agctatccca aagaaagagc ctgactatgg ctatattcga ccagttcaag gtaaattctg agtagcaaat cctagaaacc ctgaaaaaca agcttagaa tatatgacag taggaaacaa tcgagttatc gtagcagata ctgttatcat taagagtgac aacttcaact accagccaat aatttctcga gaaattttag tcacgacaat cactgaagtt yauaaiyya.L tatcagcaaa caaagaaaaa ccattgtgag tggagatct accttattcg tcaaaacact aagctaaaga attcatgaaa gaactgattt cgcttacgac gtagatggcc tcgaaatgac cctagcaaca aagatagagt aatcaacgat tttatagaaa cacacagacg ttcaggaagt gcaaacatgg caacttatca acaatcaaga gagcaaaagt gccgctcaac tagcattaaa gtaacagcag tcgaccgttt gaaacgtcat actgaggcgg atgatttagt gattgtcgaa qtgacgcttg cgcacgacat cattaaagag gacttcggct ttagtgcata aactgattag tgaaaagcat aagattgacg aaaagcaatc tgaacaggct acaaatgcat aaattgacgc cttccagctt gctttctgct gacattgttt actttggcac cattatagat atgaaatgaa gacattgaag aacaatgttg gctcacttat aaagtggcca aactactatc aaggagacag gattgaagtt cctttaaaag ttcaagaatc cttaatgaac cattgactaa tgacatatac acttgaacaa aactaccagt gacttacgag tgatggctct tctaagtcaa ttcaaggaat cctacttcct caatcggcgg tactggaggc cccttccagc ggctgaatat tatttagagg gcctgaaaag gtacaggagg 
cgccatcagg cgtgacggaa aatggaccga gtgggatgac caacagtgaa cgttgacttg ggaaatattc gactatcgaa caccactatc caattcctag caaactaatg tacgcagcca ccaatcttac yygaacgcar cccggaagaa cctgctcagc cgaattaaac gaacaagagg agcctcaaga gaacaaactg aacaaccaac cgaagaagga cataagtccg accttacatc taaacagatt tcgaatgtcc taagtgccat titcggt.ca atttagaaat acttgtcgag ctaaaatgaa taccattcat atcgaattca ggatgagcaa aggagataaa cattgaaaaa cgagaaaaag ataatattcg tctaaccctt gaacaggaac aattgcaaag attaaaaagc ctcgaaacgg ctgcacgctg ttcttgttca actaatgact aggtagttca tattctacaa gtattagcag ctcagctgga gaagaagacg acgttcaagg atatcgaaaa aggaaacaag caaaaaggag aaatctcagg tatatagctg gaaagccttc aaaagaatct ataaggcggc gcattgcctg aagttgatga ttaattgtga gggcgtagag gaaattgaaa ctgctgctaa gcttatggcg ggaaaaggct agcgatgcag ataaagacta aatgaacgca WO 00/32825 PCT/I B99/02 040 26041 26111 26181 26251 26321 26391 26461 26531 26601 26671 26741 26811 26881 26951 27021 27091 27161 27231 27301 27371 27441 27511 27581 27651 27721 27791 27861 27931 28001 28071 28141 28211 28281 28351 28421 28491 28561 28631 28701 28771 28841 28911 28981 29051 29121 29191 29261 29331 29401 29471 29541 29611 29681 29751 29821 29891 29961 30031 30101 30171 30241 30311 30381 30451 30521 30591 30661 30731 30801 30871 30941 31011 31081 31151 31221 31291 aagatggaaa agcagccgag tcagccgcac cagatatggc agctgagaag cgaactacca aagttcaatg atttcccatc tcactcgaag ggtacgacga tcggttcaat atattcagt tgttccaatt tcaattagaa acttcgaaag tcgatgaagt gctttctaaa aagcagtctc ctccagcagc aggtcttgat gcagaacaag grggtgttcc agcacaagaa actgat rta ttccagctac tactacattt tccgctgacc aatatgcagc ataggaggaa cttccttcaa tttcatggct tcttcgtgaa ggtgaaaaag ctcaactcta attgcttcaa atggatgcta acatt ttagc aaacacttat tgggaaaact ctgtctactc gttcaacttg actactccag ctaccgttac atcattcgaa aagctcttag gcttgcttcg gaaactgttc cggctaaaac aaaaagcgaa tgaattagtc ctteagaca tggtaaccct tgaccaatat agggctgatg tcctactaaa gcgaatgaag attctactga aatcattcaa aatcttctcg agtacggaag acagtctgcg cagggaatgc atcacatcac gctgtagaag taaacgaacg ctcaggag atagaatgcc 
aaactcaatt gtaacctatg gaattccaca gagcttgacc cagaactcga cgtgcgtact atggccctaa C tgcacaagg aattgtcaac tgctacggct gcagttgtca gcgcaggaaa taaaatgctc ctagggaaac ttagccattc gcattctgtt gaagaatatc aaatcgctga tttaagttca accgagtctt atgttataat aaataaaaac gacttgttaa aactcgatgc tgttcaacag caggtcaaag aacttgcaaa agacattett gaagagttga aggctcaagc cgaacct cgt caatcatcat atcaaaccac cgcagcaact gaaggacgca aagaagtgtt ccttcgaaaa ttatagatga gaaaagttcg aaggrgaaat aaaaggaagt cagggctgca ctataaggac ggcaaaaacc tgatgticaa caaatagtca gagaaatata tcgaccctaa ctgctgctca taaatatcaa cgccacagct ggagtgagac cacgctccag gtcgaacgtg ctttcgacca tcctaatgga tgagttgaga ggctgggtag ggaaaagttg agaaatacag tttgtctata aattgtctaa ataagttgaa aaggaacctt agcagattca gccggagggc aaggtctaga tgaaccaact taaaattttc attgacggcg cgcgatgcag ctaacggctc ataacggtga tgcgcagacc aggcgctgtg attacttcag ggatttatga accttgacaa aagctgttcg tgagtctcgt taagtcgcca gccgggactg gaaatcggct cttttggtaa tctttaaata ataggaggaa tcgaagcatt gtcgcaattc caagtaggga acaagaaata aaactggact cgaagtagta tgaaggtgaa gaaaaagtaa gttggcgatg ctgtgcctga atatttatga ttatatcaac cacaaggcct ggttcgaaaa aatctcgaat aatggggcaa C caagcgtgt atgtgctacc acggagaacc cgacctcgac tttcgagaac gtcgccttaa ggaaaactca accaaacagg acggtcaaca aattaattct actatccaaa ctcttcatcc cattacggtc aaatacttat gaaatttagg gcaacttgct ctaactatgc ctgaccacta cattcttgcc tctaccggtg ccgtgacagt atctaaaaac gcaggggaga tccctaatgc ccagccatct atggcattct aaagttcagc tctttagtat tttcacgaag tctcgaaacg tgtttggtct agcaagtgga atgtctgcta gattgggacr ttgataagat acaatgccct tcgacttgct ggttaatcct tatgctcgaa atcgatttag aiggtgaagt aaactgtatg gtacgaaaac taatgatgta ttagacgaat tttgttaaaa gttattaggc cttcgaaaag tagtaaaatg tgactcgaaa ttggtttcac ggaggaaaat aaatggctta tgaaggaaat tatttcgaaa ttttgtacct cacgcacgtt tataaagaac aagtcgcgac accttcaaga gcaactcgac gctgattagt gactccattg gaaagtgacg gtaaagttaa tcaaagaagt cgaagttccc aaatccaggt cgtgtcggtg gctgctcaac aaacggcagg ctaatgtgcg agttaagaaa cgttgctttg gctgctcaaa ggaacttgcg tgaaaaatgc aacaattcga cggagttatc 
attagttcac ggattcgtca gcaatgattc ttgtcgttaa ttgetagcta cattcaagca tcaacaaaca gggacagaca aactacgacg cgaaagcaag tccgtgagtc tatgcgactt tcttgcccaa ccacttatca acgctcttca ataccttgga ccaactcttt caagggtgca aataatttgc cagtaactat cgtgctggat ttagcaaaca agctactgag accgtcaaaa cttgcaaatg ctattgaacc taatgatact aagaaccttg tagacggtgt tacggtaaat tcactgtcaa atcaactaac agcaacaata tgcagtcact aagaaatgga agcaatggat gacatcgaaa atcgtacagg aaccaaatga ctaagagtga ctctatcaag tcttgcttct tgcaagtgac gctgagaaat taagaaaatt gctcagttcg ctgacgctga attgacgacg geaaagtggt attgcttcca aagcattcga cttggcttca ggcggaacag aacttatctt gaaaaacatc ctgtcaacat ggaattgact atgtaggagt tctcacaact caccttaatc gtttccggag cagtagtgca tctttaattg aacgcaattt tgcgttcgag ctcaaacaat tgaatcagtt gaagaaattg cgttcctgag cccgttgaat tagcaagagc tatatcgacg ctttaattaa gtacgaacta aaaatcaata tcgataacga taattctccg agcataaatc tgtcgcctat gttagttata tggacctatc agtctaiaaag gtgacgcaga aagcaagaac agcttgaaac tgatgaaaag ggacatgagt tatgacgtga attatgttaa atcaaggtac ttcgaaactc ttgggtcagt tcgtagcaga cgaccttgtt tgtttagttg cgcgggaaaa atttttgccc aaaatggagt cgagccgata ctatcgaaat taaaaactca agcaagacat tttgatagaa cttaaattgg cctgaagaat ttgttagtaa ctgtgaacgg aagtcgctgc tactaagatg gaagaatacg tcgacagaaa ctcaaaggag aagctgcttg atggactacg ggttttggct agaactagct acaatgtcga agaacttttt agagcgttga acaatgatgg acagattgaa ggaaattctt ttgaatttga cgagcaagat acagataggc cagcgcaaca aatagcctag gaagttttgc attggtatcg acgaatatag cagaaaggtt cagaaactgg tgactacttc gacacaatgc aggaggaaac taataatgag taaagacatt cattgactca gttgccaaaa gtcggcggag agccgtgacc tcggagggaa ctgaagatgt ccagaccttt tatacggtta tgacttaaca ttgaaggtgg tacagtacgt caacaaggcg tgcttctaat atgaaaccat ttagaatgaa tacgtgaaaa tcactttgaa taactgtacc tgaagcgcaa agcgaggctc ctaacccagc tgttcgccct aaagctcttg tcatcgctga caaacttcct cctgacgcag acgctcaagt tgcaacagtt aattaggagg ttcagggtcg attaaggcgg acgaagttga taatggaatt gcagaataca aatacactta tgaaagtgac actcgaatgg caattggtgt aaaaacaggt ggagagtaaa tcaccaatga tgatttgctt ctactggaaa tcgaacgctg 
gaatcaaget gatggatatg ataattcaac gaaaattttc ggaagacggt aggtgaacga tatcgaaaca caaagacgca ggtaagcgca catggt cgaa gaaggttatt cctacatttc cggatgacta ttattggaaa cgaaacatta tttctagata ctttacggaa ctaactttgt gaaacgcaat ttcaaggaca gaactattgc catctatgtg ggtaaagctc gacgttggta ttggtcacac tcaagttctt gtatcagctg C cgctatatg gtattttctt ctgaagatgg acaaatgcgc gacatttctt atggcagctc ctgaccaaag aatgaagacc caaatggcgc gttcgacaat cgtagagcca gaggaaagaa tgttcctgac attctatatg acagggtagt ctaatgtctc agtttcaagt tgctatttgg gaccaaatca aatacaaaat agactaggag agctctcgcc cattgttctt gttcaaatct tcaaggacat ccgactagaa tcaagctcgt cgtagatacg gacacgcgca acacgtttga tggatacgac ccaaactatg cagggctttc tgcgtatgca cgactacaac cctatcgctg tcttgaaccg tcaaggttct cttcaaatcg acattcgtca ttggtacggt tcaggcggac ttatgattcc gctacattga gccctgaagc agaaacggta gaagagtatg caatttctcg aaacggacat tatctcagct cggaatgacg aattctatta cttaatgaaa ttgaaaccgc aaaggataaa ct C Ctagcca atgaaggcaa agaaacccac agcctgaatt cgctgtcata acagaccgta tgatagctgt tctcgaacag tgaacatgac tgctcctatg cgatatagtc acgtccattc gggctacgaa atcgaatarta gcaaatcgag gcagaaacag ttcttgctat ccctgaaatc accccaatgc taggtgactc aatcgggaaa WO 00/32825 PCT/I B99/02040 31361 31431 31501 31571 31641 3 1711 31781 31851 31921 31991 32061 32131 32201 32271 32341 32411 32481 32551 32621 32691 32761 32831 32901 32971 33041 33111 33181 33251 33321 33391 33461 33531 33601 33671 33741 33811 33881 33951 34021 34091 34161 34231 34301 34371 34441 34511 34581 34651 34721 34791 34861 34931 35001 35071 35141 35211 35281 35351 35421 35491 35561 35631 35701 35771 35841 35911 35981 36051 36121 36191 36261 36331 36401 36471 36541 36611 gagttctacg ctcctgagtt tggactatgt qgcacaactt cgccgacgca gttcgagttg aaggctttca aaggctggaa accgagaCgt caaactcgta atcacagctg agcagtttaa aacctatcca tgttaaaatt gcttttaggt aaagtgacag tcaattactg accaacagaa tggctgaact tcttcgagta tatgacagat gagcaactta cgtacagacg aaggaaatgt gtcgggatgc aaactgattt aggaagacaa gactcctagg actattttca gtcgctCCtc aaatgacttt ggatatctca actagagtct tcgaagtcct 
gttacccttc ctcttatggg cccgtgttca agctattgca tggtgctaaa actgctttta caggtaaatg aaatcatgga ccgcgagctc cgaggccatg ggctgacgta tttgctcgag tacgtcgcac ccgttgctca ccgacgccgg tattaagggc tacgaaagcg atggtcaaat ccactaagag aacaaatcgc accttgttac cttgtatggc attggataag atgaccaatg gacaaccttg ctagtaaaat tccttgagcc tgcacttgct acctatcggt caaaagatgg gcaggaatgg tgatgacaac gaacgatggg aaccattgca cacaaaatcg gagagattta gcgttggaat ggctacttcc aagagttcgg tcagtctgta ggcaggaggc tcgattggtc ggaggagtca tcaattgc cactcgggat tgctattagt agacggaatt actcaagtat taccttccag tctttgtcga ttcctcaagt agttgaagtg tcaattagtc gaagcaggaa atcattcaag cagctgttca ttcaagcagg tcttcaaatt agcagctgtt caaattatca gcgatgcaga ttataatggg ttcaaattct aatggcttta aatcattact icactattag cttttatcac ttcttcaagg tggcacttct taaagcagtt caacatcaag gcacgtgaag ccagcggttc ttcgtcgcgt aagcaggtaa gaagatttct agttgaagga gaatcaacta gcacaatttg catagaaatt gcaacttgca tttcaaatca cgagcagcag gtgtcatgaa aactgtttgg agaaacttcg gaaagaagcg ctcgaccgat ttcgcagaag cttcaatggt tgacaatctc cagtgcaatg ctaatgtcat agcagtCgCt aggcaaatac tgcttcgacg tatcctggtg acgaaaagaa tttttgtata tagaaaggaa aacttcacaa gtcaattaaa ttcaaattgg ttctgcttta atttgcagcc gcctctatta ggagcgacag cggaagagct gtgcaaaaga ggcggctcaa cgctatgcca ggggtacttg gctagttcac cagcagctga ctctatgggc tcgcaagccg caatgcagga tcaactgaaa caaaactcgt ctctcgtgaa cgagcaaatg aaaat cgtgg ttgtcatatt tattgtcaag ggagttatag gaaactttat attgattcaa ggtattgctt gttagcaaga ttgctagctt gtggtattgg gtcaatgatt ggttactgga ttcgctggac agttccatgg taagttctgc gattcttagg tattcactct aaatggtatt ggtaacatga gctctcagcg acgtgaagat agatggctga ccaacttcct gcctcgagtg gacttgttca caaggcgagc aaaccgttgt cgagaggatt gtataataga aaatagatgg ctagcagaca tagaaiatgt aggactcact agtattagat tctccgtcta accgaaaagc aagttaatca tttcgacact tgaagaccct tgtagacgtt caagccttta gagtacagcg actcaactgt acccaggaag acctactcga aattggcgaa aaaagttcag attattattc taaatcttgg ttagatacat taaacgaggc agccgatgac gcagcagctt acgactgaaa gggtctaaag agttcattgg tgtttcactt ctgttagttt tcgaaaactt aaaaggaact 
atttcacaag ttaagatact aattatcact ttgtcagctc tgtcgcttgt tctagtcaac atcgagggac aagcaatctt gttgctaaat atcgacttcg cacttctcgg tgtgggacag ggttcagctg aaatggtaag ggtaagtgcg ccttcacgtg ttcgaactac ggatattcaa gaaactcttc atacaggaag caacattgga agtaaagaaa gacgctattg ttcgcaggat atgctatgtc aaaatacagg ggatactatc aagatacttc tcgaaaggtt ttcgagcctt tacgaacgca ttgagccttg gaaccacgct attaggagtt acagctactg tgtcaggtat ctcggacgga ggaggagct t gagcaatcac cgcaggaatg ttaagaattg caatattcta caacagtctt gagttaggag tgt caaaact aaatgttctc gtaacaaaat catttttgac gacaaacaca caaattttag tcattgaaaa cgaagcgctt gctttattca tcataaacgg tcaagcacta gcactgattg ttattcaagt gtcgaacctt atgcttcctc tccctaaact ctcactttta atggtttcag tctctaaaat cgcaggggtc gcggctaata tcatggagca acgtgacaag gaaaatggag cagctcctga tgacaaccct acaatcgtag ctctatcagg gtcgacggaa ttaaggactc cgctcttact cagttcaaac gaacgggaaa ccttgtagtt tataagtttc caaccaaagc aggtttgcca gttaagtcaa gacattcoat ttgaacggtg gaacaggaac ccaaaaccag ttgaccctac cttaacaggt tttgggactt cgacaaccac atgatgcctg tagaaagaag ggtctgttat gactaatatt tcgcacttcc aggattttca aaaggtagtg cctaatcgct aacgggaaaa tccctaatac acagtcacta aagacaatgc tagtctagca tgaacaaaac cgataccggt attcaagaca agagcctact tacgctgaag tcggcgagta tacggtgaag tgactcaagc tgaaaccttt actgaatttc atattagacc tagcgaggtg cagcagccgt tgcttatatt agatatitgc aaatccagga ttgcaaatgc tiatggagtg attacatgga ttttgggtca attgcagcaa tcttgctcaa agtcaagcgc aacggctcgc acaggattag ggaaaggact tacgactgcg aagtagggaa tgaattccaa gctcaaatgt tggtagaaig aagactcaag caatcgacct ggtatggaaa atctagcttc agccggtttc acctggctgc cgtatctgga ggagatgtgg tggattagag gcaaaccagg cgggtcacgt gaaactagcg acatggcaga ggcgatgaaa aagaaacggc tgcgtctatt gggattatgg tagaggcgct ctctcgcgta ttgccaaacc tcgttctacg acgcgaacgg aaacatgatt caggactaac acaagaggaa cgaaatcgtc gcttgcacta ttagacgcag gtcctgagaa gctgctaagg aaatggcaga aactatgcag tcgagtctgt tgctattatt gttcaacaaa aaaagttctc gaagcattcg taaatatgtc gttgcagccc ttggaccact gcttctaatt ctattcagtt tttaggtcca gcatttatgg tgctctggtc gccgtgttca tgatagccta 
gcgcctgcta ttaaagctgg gtttggagga aatggttaca gaaggcaggc gagaaggcga gctcgaacag tttggaataa gtatcggtca gaaaggctag gaggcgcatt tggaaaagta tcggtctcgc atttctaggg attacaggac agcttgggct agaacaggtg agttcaacgc attcagtcga cggctgattt catctctcaa ttaagattat tgaaggaatt gcatctgctg tattgtgatg acaatttcga cagttatgcc ataaatggtc ttgttcaatc tcttcctact atggtcttgt tcaggcactt cctacgctta actagttcaa gcgcttccgg caattattca attgaaaact tgcctatgat aatcgaagca aaaatatagg acctatctta gaagcaggga gcttcctgaa ctaattacag cagcgattca cctcaacttc tagaagccgg agttaaattg aactaattgc aggggctttg caaatcatga tcttcaagca ggtgttcaac ttcttaaggc tcgacagctg gaaacatgct ttcatcatta gaggtgcgaa cctgattcga aacttcatta tggcagCatg ggaacttCaa ttgtttctaa aaccttgttc gaggatttat caatggtatc tggciagcag tgcattaaat gccgttaagg gatgggtatc tatacgggtc aagggttcgt gctaaagaaa tggctgaaac tgttactgaa ttatagaaaa ggttaaatca gtttacgaaa tttcgaagat gttcgtaaag cagccggttc aaccaacctc agtcacaatc taaaaacaat ttcgaaacaa tgacgacgtt gacaaactgt gtttggtaac attgtaacac cgtaaaggag aggatttaaa aaccctgaag gcatagacgg ggaagcgtga ccttaatgtt ccacggagaa aatttattcg ctcgaagtca ctttggagaa atttttagga gaaaccgagc aaggaaaAcptaaattaggga ttcagticaa agatgcttac aacccgcttt gggaggcgat agcttaccta aactacttct caaatcaaag gatattttcg actaattcag tattgatgga aagtggctcg ttagcagtgc aaatcaagcg actaacttat tggaaattca acaattacca ttgaataccg caagttgaac tgtttctaaa tccgtcttac caatttagag tagaaataag gacagtttgt tgagttcggt aacttttgaa cttattaaaa gcattcttca agattcctaa ggacctctac tcttcccgct WO 00/32825 PCT/IB199/02040 36681 36751 36821 36891 36961 37031 37101 37171 37241 37311 37381 37451 37521 37591 37661 37731 37801 37871 37941 38011 38081 38151 38221 38291 38361 38431 38501 38571 38641 38711 38781 38851 38921 38991 39061 39131 39201 39271 39341 39411 39481 39551 39621 39691 39761 39831 39901 39971 40041 40111 40181 40251 40321 40391 40461 40531 40601 40671 40741 40811 40881 40951 41021 41091 41231 41301 41371 41441 4e151i 41581 41651 41721 41791 41861 41931 tattagaaag atatgaccaa gtgactcgag taaaggttga tgtcaaaggg 
ttgaaacacg gactagtttg ttggcatctt caagaggtta t tgt agttCga gttgacaggt tatctcattg acgaacattt actaattgga gtcgacgacc caaactctac agtcctttca aattttaatg ctgacgactt tgatggagtc tccgtttccg tcc tgtggac agggcaagac gtcatgtatg tcccaggtgg ttcagtttca ggaatagggt gggcttcaca ttcaactacc gctggtaagg cgcctactc ttggaactat ggagttcaag cgcaatatac agcatacgtc aaatggaagg tccatatagc tatgggt Cat cttgccaatg gctattctag acgtcaacgg aaactgacct gtatcagttt cgtat C Cac aattgtaccg tcgaacttgg agccgatcaa ctgaaagcta atgaagaagc agaacttggc attatcggta ggaatgaagt agtcggccga ggagaataac agtcaggacg gaacgtggac cccagactac gacgggacaa tctctactaa tctaggatct gttttcggta ggaatatatg aacttcaatc cc cgaggaaa aaacatiatc cttaccaagt ttgcCtcttc tcctcctgac cgatatcttg gaattgttca agagaatttg aaaaaggaag atgtttcgtg tagaattaaa tatgaggctt attatgatgt C at catttt c ggggaaactg cagaatctgg tacatggatt gacggtgtac gaacgcaaga taaaacatt ggaaattccg caagttcgcc tcagtattta agaatgggcg tgaagtcaac agttccttct gaaacgggct attgacaata taattggagc agaaactt cC cagtatggag ttacctgcta tgtaggcgct ggtgctaaca caaagcaata aaccgttgta aaatatgtca aaggcagtca gtttactaca gaaaacctga cagcggtcct tatcgagcgg caagaccctc taaatgagtc gaaatacatt cgact agaag ctggaaagag gcctgaaaat tggagatata gaaaagacgg atctgctact tggactcgaa agcagggtcc ttcagtttct ttaatcaaag atcaaaaaac atttacctat aagtgatgaa acttttgaaa gaagatggtt cgcattatgg gtcgcgctag aacaagcttcg caatttagaa tttctcCagc gagtccaatt cctggcgaaa ttgttcaagt atctttagca agcattacga agacgaaatt gtattgaaac ctcatctatc tatcaacact tcgaattaaa tatgctcagg acgtagaaga tatgaaccag cagaaggctt gcctaggaag atattaccaa agacgcaggt gaatgggtc aagcataaca gccgcagaaa attcaatgct Ctgacatttg gttatgaaga aattatcaag cttatgccga gtctaaagta gactttcctc ctaggcagga agattctcga aacctgtgta cggcttacaa agagccttta acgtttgctt ctatcaacaa tggaagtgaa cgccacatga agcctcgata tattgctaaa tctaaaagcg tgagtgctgc gcgtgcttat cttgacatct acagtcgccc ccataacaag gttcctgact tgcatcatac tcaactaatt cgaaagatat ctgctcgaaa aattgactac gacgaccttt gaaaagactt gatggaCttg ctaaatgagg acggcgaagg ccaagttgtt attagatacg 
cagatgacat tttagggact ~ggcctta atactaataa gaaaccgagc gaattagttc gtcccaaagg tgacgcaggt ttaccgggag ctcctgggcg cggagtaggg atagcagata cagctatcac Ctatgctgta ggatggagcg aacaagttcc tgaactcata aaaggtcgat ~cgacggctc acatgaaact ggatactccg ttgcctatat ~acgcaggt aaggacggag taggtatagc cgcaactgaa gaagctccag ctggtggatg gtctacgcaa gttcctaccg caagatggcg ctacactgac caaactgatg aaattggata taaaggtgac gcaggtcgtg acggtattgc aggaaagaac tatggaatta gtcccactga ttctgcgatt cctggagtat gtcaatatct ttggactcga actatttgga cctataccga ctacattcca aaagacggga atgacggtaa aaatggaatc acgaccatta cctacgcagg ctcaacctca ggaacagttg caaatgtcca accgggattc ttcttgtgga cgaaaactgt atggggtagg aattaagtct aaattggact tctgctattc actgatgaca ctagcgaaac gtcttcaagg tcctcaaggg tcacctcgct ttctctaata ggtcagtatc aagatttcaa ggaatgacgg agctcaaggg ttacgcttca agtgcagacg tactccgatt atgagcaagc ttcaagcggg aggtcgaaac ttacaatcta atggacggac ttcaaaggtg ctaacccttt tcttcagg aggagatacg ctgggctaag gcctctagga Scaaccttaa ccgatcaatg ctgaagcaat Cttccatgta taatatctct actcccccca aagctaacta accaacagtt aggctacaat ggagcagtta tatcaaaaaa Ccggaagccg gggctacggg aactgaagaa agaacgacgg tagctctacc tatgtacctt acgcaagggt tttagaacgg aacaatactc atgacaaaat ttatcaactc taacgaacaa ctcctcgcga ttatggaaat actagtaacc gacacgtccg gcgaagaggt agacaatgtc cgtttgggct tCacacttta gacagtattc ttacatacgg ttatctttaa gcgactggat agatttaggt aggttactca gtttccaaga cttcaaggaa ttcctggacc gtccaaacgg tgagggattt tcccgtccat tcaaaagacc taggtgaaac tgcaggagct agtcatactg ctgcagccta aggtcctaga gacggacgtt acagcggacg tacatggacg actaattatt atcaacaata gtCtgaccgc ttaaaacctc atacccggga agccaggcgc agacggtaag gatcacgtga gttcagtCtg gaagataata agatagcagg gatcgaacta agtatcgatg gagtcccca attcCCtatt tgaatttggt aagatcaaac gcaaggacag atatctgcta acgacttgac tcaacatgga acggtaaacc cgattaggta ctccaaccga gtggtctaat acggagtgag cttagctgca cggccgggtt gaagttctac gattttaaat tctttgacaa ttcactcaaa gttgttcagt gtggctcaat gtgaagcaga ggaagacctt aaatatcgaa gacggcactc acggaaaagg ctcaactaca agtaacttag aaaaggctta tgaaggtaga acccaatctt 
agcggcaagt cgaattgaag gttcgtcgac agttacatga gctcccctaa atcaaggtat caagtgaccg aatttctatg Ccattcacat cgataacggg atctttaccc gtttaatcca gacatgaacg tgattcggta atacggccct cttcacttga accttacgt gttagttggc gagctactgt cgaccgcgat ttccgtatg gttaaatggt tcaagtgttc aacgctcgca agtggagaag Cgactgttcc tcgtttgacc ctaataacgg cgccacgga caaggtctac acagatttct agttttgagg ccgaaaagtg aactcCtta cgcatcaagt aagaaccata ctactagcgt atcctttacg ccggaacaat ggacatctgt attcgaacct cggatggagg ttcaacatcc ccgatccagt tcagcggttc gacagatttt aacagggaac tatcaacgaa aacggcggca aattgggtat acagacacgc gaggaaaaca atcgaacgc ctatcaattt ctccgttcaa cgtactcgtc cgcacctata acggtaggag gtcaacagaa actactaatt Ccacagaaga tagaggctcg cgtccgcgaa cttagctggt aactacgggc gttcacttcg actgaattta gtgctacggt cgacttggag ttggtaaggt tgtagaacaa ~cggaggtcg acaagttcaa cagtttcagc cgatgtttgg aataagcgtg aaacagagt actcgaggtg aatggggact atttcaaaat ctattgacga gcagaaccaa ttagaaggtc atcgtagtaa agttaattca catattaaaa Ctgactcaaa tgacgcagaa atgaaagcta ctactatcca tgaaggtcta ttctccgcag aatccattca Cgtaggataa cgaacaagt C ggagcttatc atagcagtca tcacaatagt aatatcacta gaaatcgaaa Ctggtaccga ccgtcactgg ataacggaac acgtcctact aacCtcctcc LaaqcaLL gatgaacct caagacgtat aaaatcctgc aiacatcartg gcgtceggga cggacaagtc agctaccgaa gggaaggcag Ccactgataa Cacatggcga ttctggttag acttagcaag gtacttacct tacgcaaaCC ggtagtgacg Cttcgggca ttCCCCCCC& tcacgctgag ctcgtaggta aatggctccg ctaccgtaag ctatcaatgt tatagaatac aaCtatccaa gctcttcgaa caaattacct tCCccC9gC9 cgttcactac Catcccta ttacatagtt aaggctaaaa tcagtagttc Ctaactatga ggtcaattga Cgcagcaggt taatggagca Ctgaacaggg agtaacaaat acgaggacaa aaatcaagtt tctattcaaa agacacgacC aaaaccaagc agcatgggtt tatggaccgt atgctaaggC gccgttgaac atgactaact Cccaagacag caaggacggt gatatatatg gtcaatataa ccctacggga WO 00/32825 PCT/I B99/02040 42001 42071 42141 42211 42281 42351 42421 42491 42561 42631 42701 42771 42841 42911 42981 43051 43121 43191 43261 43331 43401 43471 43541 43611 43681 43751 43821 43891 43961 44031 44101 44171 44241 44311 44381 44451 44521 44591 
44661 44731 44801 44871 44941 45011 45081 45151 45221 45S291 45361 45431 45501 45571 45641 45711 45781 45851 45921 4S991 46061 46131 46201 46271 46341 46411 46481 46551 46621 46691 46761 46831 46 901 46971 47041 47111 47181 47251 atagctggaa aatggttcaa aaacagctgg agacctaaca aaacttgttc ttcaaagtgg acggcatagt atatttgaga tcctgaagga tttagaccga ctatgtatat acactgacgg atgtctcatt tcgtatttaa tgttgaacct tacaaaatcg tgtcaaaaca acgattgtga gacttgtatg ctgcgaaccg tcgaagatga aattctagCt tctatgccaa tgtggctaaa ctgtcctact aaataagtta aactcttagc actcttaaac gacgtcattc aagacggaac taacaggcta tacaactctc cggaaatggc gaagttgaag gaaactatct aacgaacaat ctaattacag gtcttggagC caacttttgc aggtactgtt tgaggtggaa taatgggagt cttatagcat ggactttcga ctcagccgga gcttcaagtg ggttatgaac taattagtga aaggtgctag cgcaggcgct ctacgcctac gacggaattt tacgtctatc gcttgactaa ctggtttctg gtacgctcga gtcttggttc tactttgacg tggtattggt tcgaccgtga tcaatcgcga tggttcaatg caacggcgac atgaaatcga cgtctggcag ataaacctca agaggaggaa gctcttttCt gtcgtatatt actctattta gttgatatga ccctttccgc qcttgacaac attcactcat cattatgtca aaaattaaat aagtttaaaa tcgtttcaat aacttcacct ttcagcttca agaagctgct aaacctgcta cccaaaccta aaaaagaagt cagttagtga gaaatctact tcttgaaagt cgaattgttg cgctctaaga agaacttcgt ggttgacaga agaccaaaag aatttttaaa ctcgtcaagg tcgctatgat tgaaatcgtt ggcaagcact gatgaagatg tcttctaata acttcgaact ttcacatctt cggcgaactt aagcaatgag cagttttcgt gagcatccat gtttcctttt ttagcaggaa ggcaagtttc aaaagaagta ggtattcatt ttagtgattg acggagtttc tccttcatta agcggaaaga gtggaaccat ggaaatgtgc aagtttcaac aagacttgtg tttgagctga cgccaaattg acattgatgc tcgagaactt gaacagtcaa cgacacagca ttcgaatgga agcaggt cga tagaaaaatt gaccattita ccttgtatga atgacgtagc gttgtatcaa ctaggagttt cgatattgaa gacggtcctg ctggatgggc aaatgctccg ggaggtcata ccgtcaacga cgcaaatgct gcaaacggaa accaaggcta cggatacatg gtaaccggtt atgcgtttat attcaccgta taatattgtt cttattcgaa cctacataat tat cgtataa tcgaaaacct caatgtcagg ggt tct at tt cactcaacct ataaaggact gtatcttcag gtgaaatcga aatcatgtta tggcagagtt aaacgcagta cgagctgacg 
agactgaaac gtcttgacga aatcgaataa cgggattgac caacgttacc gagagctctc aaaatacaag aaagaacgtg tttgacacta ctagccgaaa aaaggcgttg atagctatga agt caat act tgggatgcta cagggatgtt ccacgatgag caaccggctg cttatccaaa catgctcgct gctacgtcat ggattaagta ccgttataac gagccggacg tctcttaatc gatttcaatt ttgtggggcg tacaattata taaaaaaggc aagaatgttc aagcaagact atggcgacgc tatcgacaaa gctctcaata atgtagataa taatattttt cactattgga tcaaccgtct agcaaaaact agctctaaca cgattattac agccaagagc caaacgacag gt ct ttat ca tattttattc aaattaccaa gtaaccgtag ctgctatcac atcaggacag cgaacgatgg tcgaacagaa taattggcag attctattcg aaaactcttg gaggctacta ttgcagtact actcatatgg aaatgccatt ttcttggtta aatttagaca tagaaaggag gtgagaacta caaggagctg aaaagaaact ctgaaactct tcatgaccca tcgcgaaact cgttacgcaa gctgaacaag gaggcgtcaa agcgtgcagc ggagtgctta gttttagagg atatctctac tagcaatcaa tcaccaaaat cgacttaaaa agggaagtga gaaagttata agaaccttgg ttagggagga agatttagat tcgttccagc agcgattgca aggaaccatc gcacttcttg ctaccaaaag gaacaagaag ctcaaaacaa cgtggatgca ggcccgaaag ggtcgagtat tttagcagac gaaaagaaag actctcgaac gttggtacac aaaaggctgc tcctgcagtt ccttgaggaa gaaattcctg gttcgaaaac ctgctcctaa aagcctttcc tgcgtctact tactatcgaa qaaactcgaa ctgctcaagt gagtacatgc aacgaggcga cattgacagt cgttggtact agaagaaact agatgagttc gagaaatggt ggaaacggat ttacgataat gacggctggt ggctcattac ccgcaaggtt ataattaaat tttatttttt aaaataaata gatgttgtgc cagaccttga aatggaagat gctcgacctg aagttaagga aaaagaaagc cgaatcgtca aaggtgtttc tcctgcatct atggaattga gcgaggtcga atttccagct atgttataga cattgactct caccaggttg ggaaaactaa ttgcatatca ttaggaagtc ttggtggcca cttaaatgaa ccttcaagta aaacttcttg aagaagatat atagcacgtt cagttaaaat accttataag gataaagatg tcaagactac aggcgatgag aaacattgtt catctattgc tgacaccgca cgaaagctag ggcagaaaag tatttcaata ttcaagacta tccattcgcg caaggtttag taatcgtatg tctatgtact atgctctccg acgcatggct tattgaaaac catcttcatc tggggacgca gataacatca ttcactgcaa atgcaggtca accttactac tggctggcag aaagatgcta gagtatatcg aagaaaacaa tgaaacatac tgatggaaat tggcgagtca tggtactact tggtattatt gtgatgctac atctactatt 
accggacgga tgctaaagtt taaaatatag tcgaccctgc ggggttttgt agtcaacatg attcatgatt ataaaaattt tttacaaaat aagccgaaag gcgaggagga tacgagctaa atctcaaacg atcattagaa gacggaggtg gaaactgaac ctaaaaaaga ctcgaaaagg tagagtcgtt acagccggaa gaagttggtt gtgatggcga ttactaaggc ctcagtctta catcgcctat tattggagtt cgcgcaaaag tacgaatggg cgattgacgg ttgaagcttc tcacctttct accctattta ttgaaacatg tgcccaatgt agtcgagacg cgctctagat gaatgggagc cgaaaccgaa tagcttcttc aatgtttcga atacgcacaa acaggcaatt gatattgcag gggctcaaat ggaattgggc gagtcactaa agatgggaaa cgacgaattc ttccttatca ctgacaaaaa gcggagaaat agcaaggggc ttcaattcaa taacccaatc gatgtattca gagcgatact gtatcgtcga ttgtcaacga ctacatgctt agagtatgtc gacatgaact atagataaag tcaagctcat cttcgatttt aactactcaa LUccay~.Gd O:9.aLCCZ ctttcgaaga cagtcaaatg ttatgaatca cagaaaggct gaaagcggat acggttattt tgtcatagaa ttggcgca&a gtatagaaga ccttactgaa taaaattggt aaccttttcg taaaaggaag ccggacagga agtgagttag cagagctcct tcaacaaaga gcaatgctca aagttggaga gatatgtcaa cttcgctgaa gaaaataaca ctaacattga tggaatggtt aagctccaaa atgttatgaa ccagttcaat agaagaacga cgaaacagt c gcctaaccct gatgtcaagt gcgtgatatt caacctggta tctgttattt tcttagatag aagttctgtc cggaaaggag atatcCtgct acggaagaaa aactcttcgc attattgacg caaatgagtc agctcatatt taaacgggca gaagacttgc gatgctgtgt tcattaaata agattCaCaa gtgtaagaac gttattacaa gatgggaCtt gtggctaggg gcggaacatc caaatcactg gatatcgaaa aggaagaagt aaaaatctat ctagccgaaa cttgcaagtt tagcaattgg acaggagaaa taggaactat tccgtggaca actatctaca aattagcaga taaatagaat aactgcactc tttagacccg aaggaagttt cgattcgact cgaaagatgt gaaaaggtta ccgcagataa aggtgcgcta cgcgcagaaa cgcttgtcgc ttatatcgta gaaaactatc gaactagctc caaccagaat tgaaattcac tacgattaat acaggaactc ctctaatgaa atacactgac tcagttcaaa tctagctgaa cttcgcgagc cctgaaaaga ttcgagtcac tgactaaact tgttcaagaa tcgacaagcg actggaaatc atcgaaatug LcgaqqaaLq ttgaacctct tgctaagata gttcaacgaa attgaagaat ggaacaggat ttactttgac aggaccaagc cgaagatagg caaaggtact gttgacgaac gatggtaagc ctatgaaatc tccatattaa ggaaagacac aagagattga tatgtcacct tttaaaactc gacaaactgc 
tgaaggaaaa aattggtata gtttggtatg aagcaaaaga tcaaattcct gaaaggaegg ataatagaaa ggtatacaaa atgttgaacc gctctctatc WO 00/32825 PCT/I B99/02040 47321 47391 47461 4753]1 47601 47671 47741 47811 47881 47951 48021 48091 48161 48231 48301 48371 48441 48511 48581 48651 48721 48791 48861 48931 49001 49071 49141 49211 49281 49351 49421 49491 49561 49631 49701 49771 49841 49911 49981 50051 50121 50191 50261 50331 50401 50471 50541 50611 50681 50751 50821 50891 50961 51031 51101 51171 51241 51311 51381 51451 ttcacttccc gtttgttctt cgaaggcgtg aacaaactca actcttgtag ggaaaactga tggagaatta aatgaaattt tgctaccaaa ggcgacatgg aatgacattg aacctgctca acgaagaacg cttgaaagaa actttcaggg cttatcgaat gagattgacc aagaagcaat ctaaaattta gcgatatttt caggcaaccg ctgtctgcgt caaagaatag gcaattcagg gttctacctt attcaagaag tgaagcactt aacggaaaac gaatttattt tcaatattaa gaacttattt aaacattgag aaatgttcga aaaagattga ttggacgaac tcgaaggaaa attttttaaa atgtggttta cggtatatat acaccaataa gaaaatttag ctgatagaat agtatttcga acctcaagtg tcgagcaaat atagtcgaag gggaaaacta gctgggcggt ttgagaaagg aatgtttgta catgcaagaa tttctcgaac ggaggttcct taaccaaggc tgtcgactat ttatacgact tcgtatatat gatacttcag attgaatcat agatatagta gcggtgtcct attgtgcagg agaaagaaaa gtcagccgtc caaggaaagt cctctctaca agcttcaagt cttaaataaa agaatacttc acggattatt ccggacgacg aaactattct accttatcga caagctaaaa ggacattcaa gtagatagta tctaaattcg taggcggact gaaaccatga cggtgaaaga acttcctggt gaggatttga atgcttgcaa ctgcttggaa ttggtgctcg tatagatact ccatcagttc gaaaaatatg acgcccttta tgattggagg catctgtggt ggggattgac ccagtacgcc aacatcacca gtccaagcag ggcgttcggc atggagtagg tcaaaatgct atctgtcgtt aaaaaccgat acctatactc ttataggatt aagcaaaagc ctctaggtcg atgaaagtaa atggtcttca aagacgaagg aacattcatt tcatgcagga gggactgaaa gtgacggaag ctggaacggt atgtattagg tcgaaacgat cgaagtagtt aggcaaggcg aaaatcattc ctgaagagga cggacgagct catcgagatg cctgaaccta aacgatttag aagggaagca gaagatgaaa agaaacaagt aggtaagcac attatcgaaa acaagcctgt tcttccagca tggttctgcg taattttaga aaagcctaaa gacgtggcaa aattcgaacc gtgcatcgat tcgaacattg 
acctaagcga aacgggttca caaaatgacc tcgagaaata atggaagaaa ttagtcgaac aagttcgaaa tcgacttttg gtgtcagctc gtttcgagcg ctcttatcct aattatactg tggttctaga acatcacaac agtgcataat tacttgacag atgaaaaggg gttctcgaag tagacgagta gaacagacct ggacaaccta attcaagaag aacagtt cat caaaagtctt ttttctgcta aattagttga catcaatacg gttgtcattt acgtttttag aggttaatat aatgaaattg caggtctttc taatatgaag tataactggt aactttgtcg aaggttcata aaattcgaag tcaataggcg ataaattata aagttaaatg gaaaagccga cggt ct tagc caacgctatt aactattgac ccttaagact tatctgtatg acgatgaaat ttttcaggca tatttttctt gagcgagagt tcgatagttc acagcatgta aaaagagctt tcaatttatt cgaccatttt cctggattcg gaggagcatc tatataattc tgaccatcgt ggttctcgat tctgggatga atgcgcggtg acttggcatt ctacactcga atgctaaagc atttaaacgt gaacatggat cgctgcaatt gaagaagccg gtgaattaaa cgtgatgctc taaaagagta catgaaagaa ccttctacac gacagagcgc tcaactatgg cgaagccgag acggaagaaa tgtgtgaaaa aaacttctcg aggatatgat ttatcacggc ctgttacaga aggcattcgt tttggaaagg ggttagcaga atccaatcac accacttgcg tataccataa ggaggagata agtggcaagg aactaacatt caaagacaag cctaaaactc aaaagtcgag catgattatt ttcaaatagt caggtgtcat ctttctttat agttcagtat tcaacttttc gagcactatg aaaaatgttc atttttagct gaaagttttg ttaaatatga acggtctcga ctttcaaaag agcctggatt gattttatta gtttagtaga ctarctttag tataatttat caatcttgat tctttcgggc gtatcgaaaa tataaaaagg agaaaagttg accttttcga gagaagtggg ctacctcaaa caagg3aatgt tgggaatggc tagaagctgt attgttattg cttcgaatac tgtcgggaat tagcagaaac igcacttgac ggaagaattg tgagttcggc gactataatt attttcaaac tgtgagctat tagtcataga cgaaataggt acttggttaa ttatagggtt gacaataact tatrgacctt ttaggccaaa ggctttatag agcaatgtaa gaggattgga ggtaagcgaa tggcagattg tctttctttg tatttgctgc ctcaagataa ggtgattcaa agttataagc aggagcttgg ctaggaagtg ctccgggagc ggaaaattga aagaggtggg agagtgatac atccatttta gaaaataatg gaattgacca caagaacact tttegagata tggaagagtt aatttttcga aattggcgaa actgatgaat acttgttcca attttaacgg aagcggctga attccaaaac tagaagaact tttcaatcgc aacttcgact agactgggcg aatactatta tgaactattg gacgacgtgc ttggaggctt ggacaaggta agtcgtggac tattgataaa 
tatatagcgg ggaaatgagt gaaatgcaag caattcaatt accaaaggga tttggaacga actgaggctg aaaattccct tgtggtagtc ttttagatag catgatatct aaatatagac gtcttatcca agcagggagc agaagcgaat gctaaatatg gaattcctat tgtgcttaat gtatggaact agaacatata gcagaaagtg gcgtgacgaa aaatccggca tacttgaact atcgaacata tgtgggacgt tgaaactgga gaactgaaaa aggcgaaagc tctccattga ggttacaagg gaaggagttg aagcattttg aaataattga aaaactttcg agacaacttg tggaagcaac tatcaattct catgcccgtt agtagaaatc cttcttattc aggaagtaag acacttcagg actaactgaa ttcgtctcga gtggctgaaa aggaattttg gaacatctag agaaatggga gaactgaaaa agtcgagcat ttcatcctta tatgtatgaa cggaaattga actgcatgat tgcatcacct ttccagtacg gttcgttcta agtttcacca gtacggtgaa ttgtagcatt tcgagactat tttgaaaaac cttgactctt tggtcaatga agattccagc ttactaaaac gacttcetta tagaaatatt aggaaaaacc ccaccgaca,_ Lyaa a& tgataataag tgggatataa acgaccatcc atttattatc gtataataaa gttagaaaat ctagtttcta gctatgtagg attcgaatgc ctgatatgtc aattgcgtct gdttatcaXc atgcttatct cgacatgaca ttgaaagcat tcaaaccaag gggccaagtt ttcaacttac acattgcgat agacattgct cttggaatat ttgtcataat gaacgggcat at t ct ttcga aggaccatat aaagaacct t cagctttcac tggacctata taaaactgaa agcagagtta atggcgaaga caaagaggaa actgctcgt c aattgaagcg tt tagacgaa agcatccctc tcactgtttc ggagggttct tcagccctga acttgataaa t ttgatgt ag tgcgaatata cgaaatgcta cgacagggtt ggctcgacct gatgtccttc atgttagcat tcaagcaatg acccctgcaa tcatgagcga taagatttct ggcgctgaaa tcgctatgaa ccgaaaaatc ggcgaagaag ttcgaagtaa actcctgaac ctaagtcgct ttgtggcatg acttgcggct atggaaacca agcgtttcga C accggt C Ca gttatgacaa ccgtcgaagt caatatgagc ttatcaactg tcaaatcaat cagacagcgc aagagttcta gtagaaattc agcgaataga gaactagacc aaaggtttaa aacgttcaaa 51521 gaacctcaag ggcgaaacag 51591 gatgacccta aaacggaatt 51661 ctattagtca agtattcgtg 51731 agtcgctctt atgggagtag 51801 gttctagcac ttgaccctga 51871 gcaaggtcgt tagatttttg 51941 ggaattatta aattttaatg 52011 tttaaaaaga ggtcatatca 52081 tggactgacg aagaatgtat 52151 gttatttgg gatgctttat 52221 tgcattcgag actatttcaa 52291 cttacaagac tcttcaagaa 52361 attggtatgt agaagtgacg 52431 
gacagttggc Cattgtgaag 52501 aatacagagt atgcttatat 52571 gtgaaattgg agtaagcagg tattcttcaa tctttatggc actgagtctg gtggaggaaa taacgctggg aactacccta atttagtctt atatgaaaga caggaacttt tcctatgcaa aatgtttggc tagaatagtc ttcgatagcg ttagaatata ggtacctaaa tgcaccttcc atgaatcgaa tttcgacaaa tgaagaaggc gacgatttta gtatcctatc actacggaaa aattgaaatt gaagcaagtc ctcgtctgtc attcaaaacg gtccttcagt tctgctatta gtcagtctaa gaagtcacta ttgacttcat gacgctttct aagcgacgca gaaattgcgc aaaaataaat taaaagattt WO 00/32825 PCT/I B99/02040 52641 52711 52781 52851 52921 52991 53061 53131 53201 53271 53341 53411 53481 53551 53621 53691 53761 53831 53901 53971 54041 54111 54181 54251 54321 54392 54461 54531 54601 54671 54741 54811 54881 54951 5S021 55091 55161 55231 55301 55371 55441 55511 55581 55651 55721 55791 55861 55931 56001 56071 56141 56211 56281 56351 56421 56491 tatacaactg gtttacaaat aaaaacttca aaaatctttc aaaaatcagg aacatttagc aattgtcact ctattgtatg gttgacggtc gtcgacgcta attgtccatt gggaaaagtt ggaagcct t9 aattccttcc aactctaatt gagcgttctt aatcttcaca aggattctaa gact ct ttgg gaagcctgca ctttcaacta agtatgtaga tcacgatgag gttagcaata ggattgtaga actcggcgtc tcCCcacagct atgacttatt ccctttgcaa gaatataacc acatggaagt tatgaacgag caaactaat C gtcctactca atgccaaaaC gaaacatggg tgactcagcc agagcgtccg ttagacctcg caagtcgttC aggtcgaaca catgagggcg tgcaaagcct gttgaggtta ggattcttgg caaaatgatt ctggcaggag tgacgaagat ttcaggaatt aaaatgaatg tgaaaagtct taaaggaatt actttcgaac tggaaaaagt ctacggtgtc gcigagcaag Cccagagcta attagcaatt cacgtgaatt aaccatcaaa tcagggtcta atgacccgga tatcaattgc ggattccctc accgaggccg tCCtgaaatt gaagacagcg acgaagacca aaattcaCgt gctgaaagaa cgagccctct cgttccagca cttacatttc acacgttctt gaagacggaa tctgcttgta gcgaattaag cctgtcatct agccagcgtg Ccactctaaa cctttagtt tctatgaatt ctcatgggtt acaacgagtt aggcgaagac aatgctattg gtattgaaaa t t cCCatgC attcgttcag ctactcttga aatgtttgac agaggagcat ctccttcagt ttattattga agaagaacga aggaaacgct gatagacttg taggttCat ctcacctagt aatcaaattt atcataattc ggatacatat tatgttagga taatcctCCC tcaagaacaa 
cttcataata tttcacactc atggattat gcgaagacgg actatttctt caaaagattg gagctaaagg agattttcca gtggttgacg ctcCtgcgcc tagtcgaaga ttaagaaagg tgctcagtta ctaaaggacg agttaatcac tgacgtagaa caaaaaggaa ctcctgagtt gaaatttgac ttagccgcaa acgaagaaaa tgatgt tgcc tacttgactc ttgagatgcc tcgtgtatat tatatacgaa aggacaaact ttgaaacctt aacttataaa ggagaatcga tatgggaaaa gtatcaattc gacttagacc aagataagct ggcagaaart agtttcaaca gcttgtcagc gaatggcagc tcaaaaactc gaaatggatg caagaggtcg ctgttttatg atatCatggg attgaaaagt gctgaccacg gtgacagcgc tcgtagtcca cgaagcagac ggaaacagtc catcctgata caactttaca accaigatac ttacatttat caataaatat tgaccaacga actacttatg gaaaagagcg aacttcttgg gcaagttcac tcttcaagaa Cagacgaggt tccggtcgag actcctccaa cacgaggtcg gaaaataatg gcacaaaaag cttgctcaac ggaaaaacag cagttgctag agctcgtact Cgaggaagca aaactcgagc actgatggac tcgatactat tcCatgctcc tgtcaatcat caigaagaaa atgcttcaac atgaaatcga tCtattggcg CgctCtcaaa Cgaaaacgag cgcagaggCC gcaaaattta tatatgtatg cggcctatga caggaactga acaatgtgaa tctaattaaa gCtctcttcg agagaacagt ttactgccaa ctgaaattga agaacttcga agcgacggta agcatttcca cctgaaaggg ataaacctag cacttttgaa atatagaaaa gcctgacaat cgaattcaca cctaacttac agaatatccc ggcattacat tattggtagt aagtatgcga catgcttacg ccctatgaag agtgtttaga ctgtcaagtc cgttctttta atctgtcaaa gaagcgaata ataCtcgttc aacagcaggc ctgatatgag tcttcctgaa taactttgac gcagaccaac ctagatagag ccggggatt ttaaggataa Cggaggcaag agccgacatg actaagtacg ttaatgattc cagttcacga ggttgacaga agttatgaCC agtagaaaga Cggtatggtg tattgcgaac cttgtgacaa tggtttatgg gattcacttt ctttgatatg ggtagggtta cgtggcttca ggagttgcat aaattagact cgaagattgc tttcattagg actgagtcct ccaagttcta gttcagttta ctgattac Cgcagtcaat Ctacaaaaac gcttgaccttt aggaacaggc gaaagtattg tatgcaaaat tagtttcgac ctacattcaa acagtacgga ttctcgcggt gactactctc aacaaaacct gttctatccc ggtcttatgt aggttattga gcaggacttg tacgagttcg agatggacga C aagaagaag atagctgatg caatgattaa tgagttacta gaagcagcca aagaaattga tttatttcga cctgcttata ttccttacct tttggataag aaatgctg cttggaattg Ccttgcagc tgcttcgaga attcattcat gagggtgcag aacaagaacc 
ggacctatat gacggaacga acggccgcgg agatttcttc ggatatgttcC agtatatcga tactgttcct caagaaatta ctcagcgcca ggtacacaat ggtgaggttc aggacattat aatctaaaat acctttaaat aatctaataa cgttgatttg tcaaaaagca tctagcaaca tcgagcattt ctatacaaca gctaagacag tagttcgaca tcgttcattg CcagCtatcg ctaacaagga ggctaactca accgagtt~c aaacagctac cgctagcaag gaacatatta aagaccaggc atgtttgaac gacgctgaat ctatcaagaa tagtcttcca tgataacgat cttgaccaac ggcgtatgtc aatctttgca gcggaattaa gttcgaaact aggaaaactt atcgctgagc ctaaagtggc cggtcgaaga aacgaagatt tcgaaaaata aaaagccgaa tcagttattc tgaaagaatt cgcaaaacgg atgaaatgtg atctcaaaag accttgcaaa aagtgagaat gccagtgaag gtggcgacga ttatggtgtt cgaagaaatt agatgaatgt agactatatc agaaggcttc tcgacccctt ttgggcccag ggaattctta aaggaacggc aggattccat ggagcagaaa accccagtat ttctagtagt tccaagcagt CCtgcaggtt aatcgcttgt ctattcagtC gcatatataa gtgaaaggaa ttttaattcc gcaagtacga gaagccaaaa ctttatgcaa aacctaccac agtgtcCCta tattcgacaa gCcccgaat tcggttctat tatagacgca accatatgga gaacggttgc atatatagaC attccgtcag ccgtactagg aattgcttcg agatatttga aaaagtagtc aggaaaattc Catttgaaaa agtagtcagg aaaattcctg attaCCCttC taCtaC WO 00/32825 WO 0032825PCT/I B99/02040 359 Table 29 Phage dpi ORFs list nb I Name Frame I Position JSizeKewod I a.a.)-Kyos 1 I dp1ORFOOI 2 136698..40390 1230 Putative tail: 2 dplORFOO2 1 32386..35835 1149 -Tail; 3 dpIORFOO3 3 53538..55877 1 779 -DNA polymerase I 4 dpiORFOO4 3 40401..42440 679 Minor structural: dpIORFOO5 1 23674..25434 L586 1 6 dp1ORFOO6 2 45296..46987 L563 1SWI/SNF Helicase: 7 dp1ORFOO7 3 22230..23621 I463 1 Terminase; 8 dp1ORF0O8 1_ 49624..50961 44 NAb Helicase: 9 dplORFOO9 2 13160..14404 414 dpIORF010 2 8699..9859 386 jRecA: 11 fdplQORFO 11 3 28017..29096 359 IMaior head: 12 dpIORF012 3 5346..6419 357 DNA pol. I II beta: 13 dpIORF013 3 10215-.11240 j 341 DNA o. 
II gamma and tau;
14 dp1ORF014 3 50961..51974 337 DNA primase;
15 dp1ORF015 1 3793..4728 311
16 dp1ORF016 3 43413..44303 296 Amidase;
17 dp1ORF017 1 11242..12081 279
18 dp1ORF018 3 35847..36686 279
19 dp1ORF019 2 12161..12967 268
20 dp1ORF020 1 1864..2658 264 exsD; Coenzyme PQQ;
21 dp1ORF021 2 2504..3295 263 GTP cyclohydrolase;
22 dp1ORF022 30896..31675 259
23 dp1ORF023 6419..7195 258
24 dp1ORF025 -1 18026..18778 250
25 dp1ORF024 3 25992..26738 248
26 dp1ORF026 21512..22252 246
27 dp1ORF027 52762..53490 242
28 dp1ORF028 44595..45299 234
29 dp1ORF029 2 662..1348 228 exsB;
30 dp1ORF031 3 26943..27611 222
31 dp1ORF030 -2 19423..20088 221
32 dp1ORF032 52033..52647 204
33 dp1ORF033 7670..8239 189
34 dp1ORF035 -1 16859..17425 188
35 dp1ORF036 1 48808..49362 184 DNAc replication;
36 dp1ORF037 1 55855..56388 177
37 dp1ORF034 131..652 173
38 dp1ORF038 1350..1871 173 exsC; 6-pyruvoyltetrahydropterin;
39 dp1ORF039 3 3306..3803 165 Citrulline biosynthesis;
40 dp1ORF040 1 7192..7683 163
41 dp1ORF041 3 8208..8699 163 dUTPase;
42 dp1ORF042 1 48082..48561 159
43 dp1ORF043 1 31699..32154 151
44 dp1ORF044 -1 25211..25666 151
45 dp1ORF045 2 25340..25777 145
46 dp1ORF046 3 42774..43202 142
47 dp1ORF047 1 47542..47961 139
48 dp1ORF048 -3 16308..16709 133
49 dp1ORF049 -3 43620..44018 132
50 dp1ORF050 3 15081..15476 131
51 dp1ORF051 2 29765..30154 129
52 dp1ORF053 49817..50200 127
53 dp1ORF052 3 30516..30893 125
54 dp1ORF054 2 14423..14800 125
55 dp1ORF055 3 27627..28004 125
56 dp1ORF056 -3 18780..19151 123
57 dp1ORF057 1 9859..10218 119
58 dp1ORF058 3 15633..15989 118
59 dp1ORF059 1 30154..30507 117
60 dp1ORF060 -2 37717..38070 117
61 dp1ORF062 -3 44940..45284 114
62 dp1ORF063 1 47200..47541 113
63 dp1ORF064 2 29108249
64 dp1ORF066 -3 28566..28898 110
65 dp1ORF067 -1 44735..45061 108
66 dp1ORF068 3 29451..29768 105
67 dp1ORF069 -3 20094..20411 105
68 dp1ORF061 -3 19161..19475 104
69 dp1ORF070 1 15973..16284 103
70 dp1ORF071 3 38904..39209 101
71 dp1ORF072 -2 50749..51045 98
72 dp1ORF073 3 14262..14555 97
73 dp1ORF074 3 32298..32591 97
74 dp1ORF075 -1 22154..22447 97
75 dp1ORF076 -1 5435..5728 97
76 dp1ORF077 1 14800..15084 94
77 dp1ORF079 -3 35007..35288 93
78 dp1ORF081 -3 55188..55466 92
79 dp1ORF103 2 49352..49627 91
80 dp1ORF080 1 42490..42759 89
81 dp1ORF082 1 44728..44994 88
82 dp1ORF083 -1 35720..35974 84
83 dp1ORF065 -3 51246..51497 83
84 dp1ORF085 -3 10602..10847 81
85 dp1ORF087 -2 29794..30036 80
86 dp1ORF088 3 5040..5279 79
87 dp1ORF089 -2 12256..12495 79
88 dp1ORF273 3 56256..56486 76
89 dp1ORF078 -3 17280..17507 75
90 dp1ORF090 1 27037..27261 74
91 dp1ORF091 1 43189..43413 74 Holin;
92 dp1ORF092 3 46989..47213 74
93 dp1ORF093 -2 45538..45756 72
94 dp1ORF095 3 8877..9089 70
95 dp1ORF096 -1 46469..46681 70
96 dp1ORF097 -1 38888..39100 70
97 dp1ORF098 1 43627..43836 69
98 dp1ORF099 3 38298..38507 69
99 dp1ORF100 1 1597..1803 68
100 dp1ORF101 2 19220..19426 68
101 dp1ORF094 1 8281..8484 67
102 dp1ORF102 2 4034..4237 67
103 dp1ORF104 -1 21224..21427 67
104 dp1ORF105 -2 1828..2028 66
105 dp1ORF106 -3 10329..10529 66
106 dp1ORF108 -1 49250..49447 65
107 dp1ORF109 -2 31435..31632 65
108 dp1ORF110 1 16444..16638 64
109 dp1ORF111 1 28657..28851 64
110 dp1ORF113 -2 17521..17715 64
111 dp1ORF084 1 15445..15636 63
112 dp1ORF114 2 52952..53143 63
113 dp1ORF115 -3 5151..5342 63
114 dp1ORF116 -1 20474..20662 62
115 dp1ORF117 -3 24492..24680 62
116 dp1ORF118 2 15023..15208 61
117 dp1ORF119 2 41054..41239 61
118 dp1ORF120 1 28387..28569 60
119 dp1ORF121 3 39222..39404 60
120 dp1ORF122 -1 40220..40402 60
121 dp1ORF123 -2 21145..21327 60
122 dp1ORF124 -3 17712..17891 59
123 dp1ORF125 -3 49740..49916 58
124 dp1ORF126 -3 15960..16136 58
125 dp1ORF127 -3 13335..13511 58
126 dp1ORF128 1 4852..5025 57
127 dp1ORF129 2 25133..25306 57
128 dp1ORF130 -1 16619..16789 56
129 dp1ORF131 1 43846..44013 55
130 dp1ORF132 -1 15137..15304 55
131 dp1ORF133 -2 7900..8061 53
132 dp1ORF135 3 780..938 52
133 dp1ORF136 -1 55094..55252 52
134 dp1ORF137 -2 36988..37146 52
135 dp1ORF138 -3 30504..30662 52
136 dp1ORF139 -3 11934..12092 52
137 dp1ORF140 3 20562..20717 51
138 dp1ORF141 -1 42767..42922 51
139 dp1ORF142 -3 31743..31898 51
140 dp1ORF143 -3 7410..7565 51
141 dp1ORF144 1 36517..36669 50
142 dp1ORF145 1 42067..42219 50
143 dp1ORF146 1 51484..51636 50
144 dp1ORF147 1 55207..55359 50
145 dp1ORF148 -1 28484..28636 50
146 dp1ORF150 -3 15033..15185 50
147 dp1ORF134 -2 349..498 49
148 dp1ORF151 1 28027..28176 49
149 dp1ORF152 1 42235..42384 49
150 dp1ORF153 2 22307..22456 49
151 dp1ORF086 2 52760..52906 48
152 dp1ORF154 2 18446..18592 48
153 dp1ORF155 3 13512..13658 48
154 dp1ORF156 3 18777..18923 48
155 dp1ORF157 -2 13135..13281 48
156 dp1ORF158 -3 40581..40727 48
157 dp1ORF159 -3 30225..30371 48
158 dp1ORF149 -3 26331..26474 47
159 dp1ORF160 2 41324..41467 47
160 dp1ORF161 2 52175..52318 47
161 dp1ORF162 13020..13163 47
162 dp1ORF163 40224..40367 47
163 dp1ORF164 -2 6553..6696 47
164 dp1ORF165 -3 50361..50504 47
165 dp1ORF166 -3 23376..23519 47
166 dp1ORF167 3 1008..1148 46
167 dp1ORF168 -2 54205..54345 46
168 dp1ORF169 -2 45814..45954 46
169 dp1ORF170 -2 27460..27600 46
170 dp1ORF171 -3 47538..47678 46
171 dp1ORF172 -1 10325..10462 45
172 dp1ORF173 -2 32023..32160 45
173 dp1ORF174 -2 29629..29766 45
174 dp1ORF175 -2 15511..15648 45
175 dp1ORF176 -3 42894..43031 45
176 dp1ORF177 -3 19800..19937 45
177 dp1ORF178 -3 11787..11924 45
178 dp1ORF112 2 32207..32341 44
179 dp1ORF179 3 56058..56192 44
180 dp1ORF180 -1 41042..41176 44
181 dp1ORF181 -1 12992..13126 44
182 dp1ORF182 -2 45235..45369 44
183 dp1ORF183 -2 13762..13896 44
184 dp1ORF184 -3 53196..53330 44
185 dp1ORF185 1 22522..22653 43
186 dp1ORF186 2 21272..21403 43
187 dp1ORF187 2 34415..34546 43
188 dp1ORF188 2 35609..35740 43
189 dp1ORF189 2 42587..42718 43
190 dp1ORF190 3 39786..39917 43
191 dp1ORF191 -1 40865..40996 43
192 dp1ORF192 -1 2789..2920 43
193 dp1ORF193 -2 42325..42456 43
194 dp1ORF194 -2 40153..40284 43
196 dp1ORF196 -3 11142..11273 43
197 dp1ORF107 1 10750..10878 42
198 dp1ORF197 2 7484..7612 42
199 dp1ORF198 2 24119..24247 42
200 dp1ORF199 -1 15614..15742 42
201 dp1ORF200 -3 47715..47843 42
202 dp1ORF201 1 38569..38694 41
203 dp1ORF202 2 44483..44608 41
204 dp1ORF203 -3 22656..22781 41
205 dp1ORF204 1 14113 40
206 dp1ORF205 1 8524..8646 40
207 dp1ORF206 1 19855..19977 40
208 dp1ORF207 1 27502..27624 40
209 dp1ORF208 2 47279..47401 40
210 dp1ORF209 3 29784..29906 40
211 dp1ORF210 -1 52955..53077 40
212 dp1ORF211 -1 20837..20959 40
213 dp1ORF212 52861..52983 40
214 dp1ORF213 -2 30169..30291 40
215 dp1ORF214 -2 24151..24273 40
216 dp1ORF215 -3 35700..35822 40
217 dp1ORF216 -3 32727..32849 40
218 dp1ORF217 1 23443..23562 39
219 dp1ORF218 3 22029..22148 39
220 dp1ORF219 -1 51269..51388 39
221 dp1ORF220 -1 6215..6334 39
222 dp1ORF221 1 43507..43623 38
223 dp1ORF222 3 13212..13328 38
224 dp1ORF223 3 14055..14171 38
225 dp1ORF224 -1 13505..13621 38
226 dp1ORF225 -2 32875..32991 38
227 dp1ORF226 -2 25075..25191 38
228 dp1ORF227 -2 22999..23115 38
229 dp1ORF228 1 10450..10563 37
230 dp1ORF229 1 27634..27747 37
231 dp1ORF230 2 50723..50836 37
232 dp1ORF231 -2 30958..31071 37
233 dp1ORF232 -2 29272..29385 37
234 dp1ORF233 -3 52779..52892 37
235 dp1ORF234 1 36253..36363 36
236 dp1ORF235 2 32768..32878 36
237 dp1ORF236 -1 37418..37528 36
238 dp1ORF237 -1 1568..1678 36
239 dp1ORF238 -3 1191..1301 36
240 dp1ORF239 1 26521..26628 35
241 dp1ORF240 1 41893..42000 35
242 dp1ORF241 -1 46913..47020 35
243 dp1ORF242 -1 41231..41338 35
244 dp1ORF243 -2 51199..51306 35
245 dp1ORF244 -3 26976..27083 35
246 dp1ORF245 -3 6171..6278 35
247 dp1ORF246 -3 2724..2831 35
248 dp1ORF247 1 29641..29745 34
249 dp1ORF248 1 53560..53664 34
250 dp1ORF249 2 2012..2116 34
251 dp1ORF250 2 23837..23941 34
252 dp1ORF251 -1 39101..39205 34
253 dp1ORF252 -2 54667..54771 34
254 dp1ORF253 -3 56151..56255 34
255 dp1ORF254 -3 48375..48479 34
256 dp1ORF255 -3 9468..9572 34
257 dp1ORF256 1 15289..15390 33
258 dp1ORF257 1 28216..28317 33
259 dp1ORF258 1 44023..44124 33
260 dp1ORF259 2 4298..4399 33
261 dp1ORF260 2 24746..24847 33
262 dp1ORF261 3 288..389 33
263 dp1ORF262 3 9408..9509 33
264 dp1ORF263 -1 26951..27052 33
265 dp1ORF264 -1 6038..6139 33
267 dp1ORF266 -2 50119..50220 33
268 dp1ORF267 -2 47266..47367 33
269 dp1ORF268 -2 12520..12621 33
270 dp1ORF269 -3 53733..53834 33
271 dp1ORF270 -3 50691..50792 33
272 dp1ORF271 -3 19638..19739 33
273 dp1ORF272 -3 1455..1556 33

Table Predicted Dp-1 amino acid sequences

dp1ORF001
36698 atgattgacaataatttacctatgagtccaattcctggcgaaattgttcaagtatatgaccaaaacttcaatctaattggagca
1 M I D N N L P M S P I P G E I V Q V Y D Q N F N L I G A
36782 agtgatgaaatctttagcaagcattacgaagacgaaattgtgactcgagctcgaggaaaagaaactttcacttttgaaagtatt
29 S D E I F S K H Y E D E I V T R A R G K E T F T F E S I
36866 gaaacctcatctatctatcaacacttaaaggttgaaaacattatccagtatggaggaagatggtttcgaattaaatatgctcag
57 E T S S I Y Q H L K V E N I I Q Y G G R W F R I K Y A Q
36950 gacgtagaagatgtcaaagggcttaccaagtttacctgctacgcattatggtatgaactagcagaaggcttgcctaggaagttg
85 D V E D V K G L T K F T C Y A L W Y E L A E G L P R K L
37034 aaacacgttgcttcttctgtaggcgctgtcgcgctagatattatcaaagacgcaggtgaatgggttcgactagtttgtcctcct
113 K H V A S S V G A V A L D I I K D A G E W V R L V C P P
37118 gacggtgctaacaaacaagttcgaagcataacagccgcagaaaattcaatgctttggcatcttcgatatcttgcaaagcaatac
141 D G A N K Q V R S I T A A E N S M L W H L R Y L A K Q Y
37202 aatttagaattgacatttggttatgaagaaattatcaagcaagaggttagaattgttcaaaccgttgtatttcttcagccttat
169 N L E L T F G Y E E I I K Q E V R I V Q T V V F L Q P Y
37286 gtcgagtctaaagtagactttcctcttgtagttgaagagaatttgaaatatgtcactaggcaggaagattctcgaaacctgtgt
197 V E S K V D F P L V V E E N L K Y V T R Q E D S R N L C
37370 acggcttacaagttgacaggtaaaaaggaagaaggcagtcaagagcctttaacgtttgcttctatcaacaatggaagtgaatat
225 T A Y K L T G K K E E G S Q E P L T F A S I N N G S E Y
37454 ctcattgatgtttcgtggtttactacacgccacatgaagcctcgatatattgctaaatctaaaagcgacgaacattttagaatt
253 L I D V S W F T T R H M K P R Y I A K S K S D E H F R I
37538 aaagaaaatttgatgagtgctgcgcgtgcttatcttgacatctacagtcgcccactaattggatatgaggcttcagcggtcctt
281 K E N L M S A A R A Y L D I Y S R P L I G Y E A S A V L
37622 tataacaaggttcctgacttgcatcatactcaactaattgtcgacgaccattatgatgttatcgagtggcgaaagatatctgct
309 Y N K V P D L H H T Q L I V D D H Y D V I E W R K I S A
37706 cgaaaaattgactacgacgacctttcaaactctactatcattttccaagaccctcgaaaagacttgatggacttgctaaatgag
337 R K I D Y D D L S N S T I I F Q D P R K D L M D L L N E
37790 gacggcgaaggagtcctttcaggggaaactgtaaatgagtcccaagttgttattagatacgcagatgacattttagggactaat
365 D G E G V L S G E T V N E S Q V V I R Y A D D I L G T N
37874 tttaatgcagaatctgggaaatacattggtgtccttaatactaataagaaaccgagcgaattagttcctgacgactttacatgg
393 F N A E S G K Y I G V L N T N K K P S E L V P D D F T W
37958 attcgactagaaggtcctaaaggtgacgcaggtttaccgggagctcctgggcgtgatggagtcgacggtgtacctggaaagagc
421 I R L E G P K G D A G L P G A P G R D G V D G V P G K S
38042 ggagtagggatagcagatacagctatcacttatgctgtatccgtttccggaacgcaagagcctgaaaatggatggagcgaacaa
449 G V G I A D T A I T Y A V S V S G T Q E P E N G W S E Q
38126 gttcctgaactcataaaaggtcgattcttgtggactaaaacattttggagatatactgacggctcacatgaaactggatactcc
477 V P E L I K G R F L W T K T F
W R Y T 0 G S H E T G Y S 38210 gttgcctatatagggcaagacggaaattccggaaaagacggaatcgcaggtaaggacggagtaggtatagccgcaactgaagtc 505 V A Y I G Q D G N S G K 0 G I A G K 0 G V G I A A T E V 38294 atgtatgcaagttcgccatctgctactgaagctccagctggtggatggtctacgcaagttcctaccgtcccaggtggtcagtat 533 M Y A S S P S A T E A P A G G W S T Q V P T V P G G Q Y 38378 t tatggact cgaacaagatggcgct acactgaccaaactgatgaaat tggatatt cagt tt caagaat gggcgagcagggtcct 561 L W4 T R T R W4 R Y T D Q T D E I G Y S V S R M G E Q G P 38462 aaaggtgacgcaggtcgtgacggtattgcaggaaagaacggaatagggttgaagtcaacttcagtttcttatggaattagtccc 589 K G D A G R D G I A G K N G I G L K S T S V S Y G I S P 38546 actgattctgcgattcctggagtatgggcttcacaagttccttctttaatcaaaggtcaatatctttggactcgaactatttgg 617 T D S A I P G V W4 A S Q V P S L I K G Q Y L W4 T R T I W4 38630 acctataccgattcaactaccgaaacgggctatcaaaaaacctacattccaaaagacgggaatgacggtaaaaatggaattgct 645 T Y T 0 S T T E T G Y Q K T Y I P K D G N 0 G K N G I A 38714 ggtaaggatggggtaggaattaagtctacgaccattacctacgcaggctcaacctcaggaacagttgcgcctacttcaaattgg 673 G K D G V G I K S T T I T Y A G S T S G T V A P T S N W4 38798 acttctgctattccaaatgttcaaccgggattcttcttgtggacgaaaactgtcttggaactatactgatgacactagcgaaaca 701 T S A I P N V 0 P G F F L W4 T K T V W N Y T D D T S E T 38882 ggttactcagtttccaagataggtgaaacaggtcctagaggagttcaaggtcttcaaggtcctcaagggcttcaaggaattcci 729 G Y S V S K I G E T G P R G V Q G L Q G P Q G L Q G I P 38966 ggacctgcaggagctgacggacgct cgcaatatactcacctcgctttctctaatagtccaaacggtgagggatt tagtcatact 757 G P A G A D G R S Q Y T H L A F S N S P N G E G F S H T -co rorrg ar oat ttcaatcccat ccat tcaaaaqaccctqcaqcct at acatggacgaaa 785 D S G R A Y V G Q Y Q 0 F N P V H S K D P A A Y T W4 T K 39134 tggaaggggaatgacggagctcaagggatacccgggaagccaggcgcagacggtaagactaattatttccatatagcttacgct 813 W K G N D G A Q G I P G K P G A D G K T N Y F H I A Y A 39218 t caagtgcagaeggatcacgtgagtt cagt ttggaagat aat aatcaacaatat atggt tat tact ccgat tatgagcaagca 841 S S A D G S R EF S LE D N N Q Q Y M G Y Y S D Yt-FQ A 39302 
gatagcagggatcgaactaagtatcgatggtttgaccgccttgccaatgttcaagtgggaggtcgaaacgagttccttaattct 869 0 S R D R T K Y R W F 0 R L A N V Q V G G R N E F L N S 39386 ttatttgaatttggtttaaaacctcgctattctagttacaatctaatggacgacaagatcaaacgcaaggacagatatctgct 897 L F E F G L K P R Y S S Y N L M D G Q D Q T Q G Q I S A 39470 actattgacgaacgtcaacggttcaaaggtgctaactctttacgacttgactcaacatggaacggtaaaccgcagaaccaaaaa 925 T I D E R Q R F K G A N S L R L 0 S T W N G K P Q N Q K 39554 ctgaccttttctttaggaggagatacgcgattaggtactccaaccgagtggtctaatttagaaggtcgtatcagtttctgggct 953 L T F S L G G D T R L G T P T E W S N L E G R I S F W A 39638 aaggcctctaggaacggagtgagcttagctgcacggccggttatcgtagtaacgtatttaccgcaaccttaaccgatcaatgg 981 K A S R N G V S L A A R P G Y R S N V F T A T L T D Q W 39722 aagttctacgattttaaattctttgacaaagttaattcaaattgtaccgctgaagcaattttccatgtattcactcaaagttgt 1009 K F Y D F K F F D K V N S N C T A E A I F H V F T Q S C 39806 tcgggccacttaatgatgtaacttccttatagaagaactattg 1037 S V W L N H I K I E L G N I S T P F S E A E E D L K Y R 39890 attgactcaaaagccgatcaaaagctaactaaccaacagttgacggcactcacggaaaaggctcaactacatgacgcagaactg 1065 I D S K A D Q K L T N Q Q L T A L. 
T E K A Q L H D A E L 39974 aaagctaaggctacaatggagcagttaagtaact tagaaaaggcttatgaaggtagaatgaaagctaatgaagaagctatcaaa 1093 K A K A T M E Q L S N L E K A Y E G R M K A N E E A I K 40058 aaatcggaagccgacctaatcttagcggcaagtcgaattgaagctactatccaagaact tggcgggctacgggaactgaagaag 1121 K S E A D L I L A A S R I E A T I Q E L G G L R E L K K 40142 ttcgtcgacagttacatgagctcttctaatgaaggtctaattatcggtaagaacgacggtagctctaccattaaggtatcaagt 1149 F V D S Y H S S S N E G L I I G K N 0 G S S T I K V S S 40226 gaccgaatttctatgttctccgcagggaatgaagttatgtaccttacgcaagggttcattcacatcgataacgggatctttacc 1177 D R I S M F S A G N E V H Y L T Q G F I H I D N G I F T 40310 caatccattcaagtcggccgatttagaacggaacaatactcgtttaatccagacatgaacgtgattcggtatgtaggataa 40390 1205 Q S I Q V G R F R T E Q Y S F N P D M N V I R Y V G dplORFOO2 32386 atggattttgggtcaattgcagcaaaaatgactttggatatctcaaacttcacaagtcaattaaatcttgctcaaagtcaagcg 1 M D F G S I A A K M T L D I S N F T SQ L N LA Q S Q A 32470 caacggctcgcactagagtcttcgaagtcctttcaaattggttctgctttaacaggattaggaaaggactacgactgcggtt 29 Q R L A L E S S K S F Q I G S A L T G L G K G L T T A V 32554 accct tcctct t atgggatt tgcagccgc ctct att aaagt agggaatgaatt ccaagct caaatgt cc cgtgt tcaagctatt 57 T L P L M G F A A A S I K V G N E F Q A Q M S R V Q A 1 32638 gcaggagcgacagcggaagagcttggtagaatgaagactcaagcaatcgaccttggtgctaaaactgcttttagtgcaaaagag A G A T A EE L G RHM K T Q AI D L G A K T A F S AK E 32722 gcggctcaaggtatggaaaatctagcttcagccggtttccaggtaaatgaaatcatggacgctatgccaggggtacttgacctg 113 A A Q G M E N L A S A G F Q V N E I M 0 A M P G V L D L 32806 gctgccgtatctggaggagatgtggccgcgagctccgaggccatggctagt tcacttcgagcctttggattagaggcaaaccag 141 A A V S G G D V A A S S E A M A S S L R A F G L E A N Q 32890 gcgggtcacgtggctgacgtatttgctcgagcagcagctgatacgaacgcagaaactagcgacatggcagaggcgatgaaatac 169 A G H V A D V F A R A A A D T N A E T S D M A E A M K Y 32974 gccgcacccgttgctcactctatgggcttgagccttgaagaaacggctgcgtctattgggattatggccgacgccggtattaag 197 V A P V A H S H G L S L E E T A A S I G I M A 0 A G I K 
33058 ggctcgcaagccggaaccacgcttagaggcgctctctcgcgtattgccaaacctacgaaagcgatggtcaaatcaatgcaggaa 225 G S Q A G T T L R G A L S R I A K P T K A M V K S H Q E 33142 ttaggagtttcgttctacgacgcgaacggaaacatgattccactaagagaacaaatcgctcaactgaaaacagctaetgcagga 253 L G V S F Y D A N G N M I P L R E Q I A Q L K T A T A G 33226 ctaacacaagaggaacgaaatcgtcacct tgttaccttgtatggccaaaactcgttgtcaggtatgcttgcactattagacgca 281 L T Q E E R N R H L V T L Y C Q N S L S G H L A L L D A 33310 ggtcctgagaaattggaiaagatgaccaatgctctcgtgaactcggacggagctgctaaggaaatggcagaaactatgcaggac 309 G P E K L D K H T N A L V N S 0 G A A K E H A E T H 0 0 33394 aaccttgctagtaaaatcgagcaaatgggaggagctt tcgagtctgttgctattattgttcaacaaatccttgagcctgcactt 337 N L A S K I E Q H G G A F E S V A I I V Q Q I L E P A L 33478 gctaaaatcgtggagcaatcacaaaagtccgaagcattcgtaaatatgtcacctatcggtcaaaagaggttgtcatattc 365 A K I V G A I T K V L E A F V N M S P I G Q K H V V I F 33562 gcaggaatggt tgcagcccttggaccactgcttctaattgcaggaatggtgatgacaactattgtcaagttaagaat tgctart 393 A G H V A A L G P L L L I A G H V H T T I V K L R I A I 33646 cagt ttt taggtccagcat ttatgggaacgatgggaaccat tgcaggagt tatagcaatattct atgctctggt cgccgtgt tc 421 Q F L G P A F H G T H G T I A G V I A I F Y A L V A V F 33730 atgatagcctacacaaaatcggagagatttagaaactttatcaacagtcttgcgcctgctattaaagctgggtttggaggagcg 449 M I A Y T K S E R F R N F I N S L A P A I K A G F G G A 33814 ttggaatggctacttccacgactgaaagagttaggagaatggttacagaaggcaggcgagaaggcgaaagagttcggtcagtct 477 L E W L L P R L K E L G E W L Q K A G E K A K E F G Q S 33898 gtagggtctaaagtgtcaaaactgctcgaacagtttggaataagtatcggtcaggcaggaggctcgattggtcagttcattgga 505 V G S K V S K L L E Q F G I S I G Q A G G S I G Q F I G 33982 aatgttctcgaaaggctaggaggcgcatttggaaaagtaggaggagtcatttcaattgctgtttcacttgtaacaaaattcggt 533 N V L E R L G G A F G K V G G V I S I A V S L V T K F G 34066 ctcgcatttctagggattacaggaccactcgggattgctattagtctgttagtttcatttttgacagcttgggctagaacaggt 561 L A F L G I T G P L G I A I S L L V S F L T A W A R T G 34150n ~getnaa aga 
-~aattaccaatttcqaaaac qacaaacacaat tcagt acgctgattt cat ctctcaat ac 589 E F N A D G I T Q V F E N L T N T I Q S T A D F I S Q Y 34234 cttccagtctttgtcgaaaaaggaactcaaattttagttaagattattgaaggaattgcatctgctgttcctcaagtagttgaa 617 L P V F V E K G T Q I L V K I I E G I A S A V P Q V V E 34318 gtgat t tcacaagt cat tgaaaat at tgtgatgacaat t tcgacagttatgcct caat tagt cgaagcaggaat taag~tatt 645 V I S Q V I E N I V H T I S T V H P Q L V E A G 1-K I L 34402 gaagcgctt ataaatggtcttgtt caat ct cttcctact at cat tcaagcagctgt tcaaat tatcactgci-t att caatggt 673 E A L I N G L V Q S L P T I I Q A A V Q I I T A L F N G 34486 cttgttcaggcacttcctacgcttattcaagcaggtcttcaaattttgtcagctctcataaacggactattcaagcgcttccg 701 L V Q A L P T L I Q A G L Q I L S A L I N G L V Q A L P 34570 gcaattattcaagcagctgttcaaat tatcatgtcgct tgttcaagcactaattgaaaacttgcctatgataatcgaagcagcg 729 A I I Q A A V Q I I H S L V Q A L I E N L P H I I E A A 34654 a tgc agat t aaatgggt ctagt caacgcactgat tgaaaat ataggacct at c ttagaagcagggat tcaaat tctaatggct 757 M Q I I M C L V N A L I E N I G P I L E A G I Q I L M A 34738 ctaatcgagggacttatcaagtgcttcctgaactaattacagcagcgattcaaatcattacttcactattagaagcaatcttg 785 L I E G L I Q V L P E L I T A A I Q I I T S L L E A I L 34822 t cgaac ctcct caactt ctaaagccggagt aaa tgctt ttatcact tct t caagggt tgct aaat atgct tcct caact a 813 S N L P Q L L E A G V K L L L S L L Q G L L N M L P Q L 34906 attgcaggggctttgcaaatcatgatggcacttcttaaagcagttatcgactcgtccctaaacttcttcaagcaggtgttcaa 841 1 A G A L Q I M M A L L K A V I D F V P K L L Q A G V Q 34990 cccttaaggcattgattcaagtattgcttcacttctcggctcactttatcgacagctggaaacatgctttcatcattagtt 869 L L K A L I Q G I A S L L G S L L S T A G N M L S S L V 35074 agcaagattgctagctttgtgggacagatggtttcaggaggtgcgaacctgattcgaaacttcattagtggtattgggtcaatg 897 S K I A S F V G Q M V S G G A N L I R N F I S G I G S M 35158 at cggt tcagctgt ctct aaaat tggcagcatgggaact tCaat tgt t tct aaggt tactggat tcgctggacaaatggtaagc 925 I G S A V S K I G S M G T S I 
V S K V T G F A C Q M V S 35242 gcaggggt caacct tgt t caggat ttat caatggt at cagtt ccatggt aagtt ctgcggtaagtgcggcggct aat atggct 953 A G V N L V R G F I N G I S S M V S S A V S A A A N M A 35326 agcagtgcattaaatgccgttaagggattcttaggtattcactct~cttcacgtgtcatggagcagatgggtatctatacgggt 981 S S A L N A V K G F L G I H S P S R V M E Q M G I Y T G 35410 caagggt tcgtaaatggtat tggtaacatgatt cgaact acacgtgacaaggctaaagaaatggctgaaactgt tactgaagct 1009 Q G F V N G I G N M I R T T R D K A K E M A E T V T E A 35494 ct cagcgacgtgaagatggatatt caagaaaatggagtt atagaaaaggtt aaatcagt t.tacgaaaagatggCtgaCCaaCt t 1037 L S D V K M D I Q E N G V I E K V K S V Y E K M A D Q L 35578 cctgaaact ct tccagct cctgat tt cgaagatgtt cgt aaagcagccggt tcgcct cgagtggaCt tgtt Caatacaggaagt 1065 P E T L P A P D F E D V R K A A G S P R V D L F N T G S 35662 gacaaccctaaccaacctcagtcacaatctaaaaacaatcaaggcgagcaaaccgttgtcaacattggaacaatcgtagttcga 1093 D N P N Q P Q S Q S K N N Q G E Q T V V N I G T I V V R 35746 aacaatgacgacgttgacaaactgt cgagaggat tgt ataatagaagt aaagaaact ctat cagggt ttggtaacattgtaaca 1121 N N D D V D K L S R G L Y N R S K E T L S G F G N I V T 35830 ccgtaa 35835 1149 P dplORFOO3 53538 atggcacaaaaaggactct t tggtgcaaagcct cgtt agcaagaagaacgatgct cagt tact tgctcaacggaaaaacagg 1 M A Q K G L F G A K P R S SK K N D A Q L L A Q R K N R 53622 aagcctgcagttgaggttacttacattcaggaaacgctctaaaggacgcagttgctagagctcgtactctttcaactaggatt 29 K P A V E V T Y I S G N A L K D A V A R A R T L S T R I 53706 cttggacacgttcttgatagacttgagttaatcactgaggaagcaaaactcgagcagtatgtagacaaaatgattgaagacgga 57 L G H V L D R L EL I TE E A K L E Q Y V D) K M I E D G 53790 ataggttctattgacgtagaaactgatggactcgatactattcacgatgagctggcaggagtctgcttgtactcacctagtcaa I G S I D V E T D G L D T I H 0 E L A C V C L Y S P S Q 53674 aaaggaat ct atgctcctgtcaatcatgt tagcaat atgacgaagatgcgaattaagaat caaatt tct cctgagt tcatgaag 113 K G I Y A P V N H V S N N T K N R I K N Q I S P E F M K 53958 aaaatgctt caacggat tgtagat tcaggaat tcctgt cat ctatcataatt cgaaat t 
tgacatgaaatcgat t attggcga 141 K M L Q R I V D S G I P V I Y H N S K F D M K S I Y W R 54042 ctcggcgtcaaaatgaatgagccagcgtgggatacatatttagccgcaatgcttttaaatgaaaacgagtctcacagcttgaaa 169 L G V K M N E P A W D T Y L A A N L L N E N E S H S L K 54126 agtcttcactctaaatatgttaggaacgaagaaaacgcagaggttgcaaaatttaatgacttatttaaaggaat tccttttagt 197 S L H S K Y V R N 2 E N A E V A K F N D L F K C I P F S 54210 ttaattcctcctgatgttgcctatatgtatgcggcctatgaccctttgcaaactttcgaactctatgaatttcaagaacaatac 225 L I P P D V A Y M Y A A Y D P L Q T F E L Y E F Q E Q Y 54294 t tgact ccaggaactgaacaatgtgaagaatataacctggaaaaagtCt catgggt tct.t cat aatat tgagatgcct ctaat t 253 L T P G T E Q C E E Y N L E K V S W V L H N I E M P L I 54378 aaagt tct ct tcgacatggaagt ctacggtgt cgact tagaccaagataagctggcagaaat tagagaacagttt actgccaat 281 K V L F D M E V Y G V 0 L D Q 0 K L A E I R E Q F T A N 54462 atgaacgaggctgagcaagagtttcaacagcttgt cagcgaatggcagcctgaaat tgaagaacttcgacaaactaatttccag 309 M N E A E Q E F Q Q L V S 2 W Q P E I E E L R Q T N F Q 54546 agctat caaaaact cgaaat ggatgcaagaggtcgagtgacggtaagcat ttccagt cctact caat tagcaattctgt tt tat 337 S Y Q K L E N D A R G R V T V S I S S P T Q L A I L F Y 54630 gatatcatgggattgaaaagtcctgaaagggataaacctagaggaacaggcgaaagtattgt cgagcat tttgataacgatatc 365 D I N G L K S P E R D K P R G T G E S I V E H F D N 0 I 54714 tcaaaagcacttttgaaatatagaaaatatgcaaaattagtttcgacctatacaacacttgaccaacaccttgcaaagcctgac 393 S K A L L K Y R K Y A K L V S T Y T T L D Q H L A K P D 54798 aat cgaat tcacactacatt caaacagt acggagctaagacagggcgtatgt caagtgagaatcctaacttacagaatat cct 421 N R I H T T F K Q Y G A K T C R N S S E N P N L Q N I P 54882 tctcgcggtgagggtgcagtagttcgacaaatctttgcagccagtgaagggcattacattattggtagtgactactctcaacaa 49 S a G 2 G A V 1. R 1. I F A A S5 v r. 
Y T I G S D Y S 0 0 54966 gaacctcgttcattggcggaattaagtggcgacgaaagtatgcgacatgcttacgaacaaaacctggacctatattcagttatc 477 E P R S L A E L S G D E S M R H A Y E Q N L D L Y S V I 55050 ggt tcgaaact t tatggtgtt ccccatgaagagtgt ttagagttct at cccgacggaacgactaacaaggaagg aaaacttcga 505 G S K L Y C V P Y E E C L EF Y P D C T T N K-E G K -L RZ 55134 agaaar ctgt caagt ccgt tct t tagg~ttatgt acgCgccgggggct aact caat cgctgagcagat~aatgtat ctgt c 533 R N S V K S V L L C L N Y C R. C A N S I A E Q N N V S V 55218 aaagaagcgaataaggt tat tgaagat t t c tcaccgagt tccctaaagtggcagactatat catattcgttcaacagcaggcg 561 K E A N K V I E D F F T E F P K V A D Y I I F V Q Q Q A 55302 caggacttgggatatgt tcaaacagctaccggt cgaagaagaaggctt cctgatatgagtct t cctgaatacgagt tcgagtat 589 Q D L C Y V Q T A T G R R R R. L P D N S L P E Y E F E Y 55386 atcgacgctagcaagaacgaagatttcgacccctttaaCtttgacgcagaccaacagatggacgatactgttcctgaacatatt 617 I D A S K N Et D F D P F N F D A D Q Q M 0 D T V P E H I 55470 atcgaaaaatattgggcccagctagatagagcctggggatttaagaagaagcaagaaattaaagaccagcaaaagccaagga 645 1 Et K Y W A Q L D R A W G F K K K Q E I K D Q A K A E G 55554 atcttagaacggcaaacgtccgccatttaccgttcagagcgca 673 I L I K D N G G K I A D A Q R Q C L N S V 1 Q G T A A D 55638 atgactaagtacgcaatgattaaggtacacaatgacgctgaattgaaagaattaggattccatttaatgattccagttcacgat 701 M T K Y A M I K V H N D A E L K E L G F H L M I P V H D 55722 gagtt actaggtgaggttcctatcaagaacgcaaaacggggagcagaaaggttgacagaagttatattgaagcagccaaggac 729 E L L G Et V P I K N A K Rt G A E Rt L T E V M I E A A K D 55806 attattag~cttccaatgaaatgtgaccccagtatagtagaaagatggtatggtgaagaaattgaaatctaa 55877 757 1 1 S L P M K C D P S I V E ft W Y G E E I E I dp1ORF004 40401 atgacaaaatttatcaactcatacggccctcttcacttgaacctttacgtcgaacaagttagtcaggacgtaacgaacaactcc 1 M T K F I NS Y G P L H L N L Y V E Q V S Q D V T N N S 40485 tcgcgagtagttggcgagctactgtcgaccgcatggagcttatcgaacgtggacttatggaaatattagtaacctttccgta 29 S R V S W Rt A T V 0 R D G A Y ft T W T Y G N I S N L S 
V 40569 tggttaaatggttcaagtgttcatagcagtcacccagactacgacacgtccggcgaagaggtaacgctccaagtgagaatg 57 W L N G S S V H S S H P D Y D T S G E E V T L A S G E V 40653 acgtcccaataggcagcagcgttgctgtgccatagctccgatt T V P H N S D G T K T M S V W A S F D P N N G V H G N I 4 0737 acaccatatcctaaattcagttaaaatcattgggatgattgac 113 T I S T N Y T L D S I P R S T Q I S S F E G N Rt N L G S 40821 ttacatacggttatctttaaccgaaaagtgaactcttttacgcatcaagtttggtaccgagttttcgtacgactggatagat 141 L H T V I F N R K V N S F T H Q V W Y ft V F G S D W I D 40905 ttaggtaagaaccatactactagcgtatcctttacgccgtcactggacttagcaaggtacttacctaaatcaattccggaaca 169 L G K N H T T S V S F T P S L D L A R Y L P K S S S G T 40989 atggacatctgtattcgaacctataacggaactacgcaaattggtagtgacgtcta~tcaaacggatggaggttcaacatcccc 197 M D I C I ft T Y N G T T Q I G S D V Y S N G W ft F N I P 41073 gattcagtacgtcctacttcgggcatttctttatagacacacttcagcgttcgacagattttaacaggaacaacttc 225 0 S V R P T F S G I S L V D T T S A V R Q T L T G N N F 41157 ccccaaatcatgtcgaacattcaagtcaacttcaacaatctccggcgcttacggatccactatccaagcatttcacgctgag 253 L Q I M S N I Q V N F N N A S G A Y G S T I Q A F H A ft 41241 ctcgtaggtaaaaaccaagctatcaacgaaaacggcggcaaattggtatgatgaactttaatggctccgctacctaaagca 281 L V G K N Q A I N E N G G K L G M M N F N G S A T V R A 41325 tgggttacagacacgcgaggaaaacaatcgaacgtccaagacgtatctatcaatgttaiagaatactatgaccgtctatcaat 309 W V T D T R G K Q S N V 0 D V S I N V I ft Y Y G P S I N 41409 tttctcagatgcaactcattcagttcaagtagcccttagtgag 337 F S V Q Rt T R Q N P A I I Q A L ft N A K V A P I T V G G 41493 caacagaaaaacatcatgcaaattaccttctccgtggcgccgttgaacactactaatttcacagaagataaggttcggcgtca 365 Q Q K N I M Q I T F S V A P L N T T N F T E D R G S-A S 41577 gggacgttcactactatttccctaatgactaactcgtccgcgaacttagctggtaactacgggccggacaatcttacatagtt 393 G T F T T I S L M T N S S A N L A C N Y G P D K S Y I V 41661 aaggctaaaatccaagacaggt tcacttcgactgaatttagtgctacggtagctaccgaatcagtagtt cttaactatgacaag 421 K A K I Q D R F T S T Et F S A T V A T Et S V V L N Y D K 41745 
gacggtcgactftggagttggtaaggttgtagaacaagggaaggcagggtcaattgatgcagcaggtgatatatatctgagt 449 D G ft L G V G K V V ft Q G K A G S I D A A G D I Y A G G 41829 cgacaagttcaacagtttcagctcactgataataatggagcattgaacaggggtcaatataacgatgtttggaataagctgaa 477 R Q V Q0 OF Q LT D N N G AL N R G Q Y N D V W N K R E 41913 acagagtttacatggcgaagtaacaaatacgaggacaaccctacgggaactcgaggtgaatggggactatttcaaaatttctgg 505 T Et F T W ft S N K Y E D N P T G T Rt G E W G L F Q N F W 41997 ttagatagctggaaaatggttcaatccttcattacaatgtcaggaagaatgttcatcaggacagcgaacgatggaaacagctgg 533 L 0 S W K M V Q S F I T M S G ft M F I R T A N D G N S W 42081 agacctaacaagtggaaagaggttctatttaagcaagactt cgaacagaataattggcagaaacttgttCttcaaagtgggg 561 Rt P N K W K E V L F K Q D F E 0 N N W Q K L V L Q S G W 42165 aactatacttggcctcatgaatctagctgaatgggaaggaaag 589 N H H S T Y G D A F Y S K T L D G I V Y L ft G N V H K G 42249 ctacaaaagtcatcgatctagattgcgagtcagactagttata 617 L I D K Et A T I A V L P ft G F R P K V S M Y L Q A L N N 42333 tcttgatcattttttccgcgaacttggatgagaaatctgtatt 645 S Y G N A I L C I Y T 0 G ft L V V K S N V D N S W4 L N L 42417 gacaatgtctcatttcgtatttaa 42440 673 D N V S F Rt I dplORFOOS 23674 atqqctaaaaaatcaaaagctatctcacacacagacgaactgattagtcagtcgtttgacagccccttggcaaaaatcaaaag 1 M A K K S K A I S N T D E 11 1 5Q 5 F Z FPL A K N Q K 23758 ttcaagaaagagcttcaggaagttgaaaagtattatcaatact tcgacggatttgatgtcacggacttgaatactgactat9g 29 F K K ft L Q E V E K Y Y 0 Y F 0 G F D V T D L N T 0 Y G 23842 caaacatggaagattgacgaagactcagtcgactataaacctactcgagaaattcgaaactatattcgacaacttatcaaaaag 57 Q T W K ID E D S V D Y K P T R E I ft N Y I R-Q LI-K K 23926 caatcacgctttatgatgggtaaagagccagagcttatctttagtccagttcaagacaatcaagatgaacaggctgagaacaag Q S R F M M G K Et P ft L I F S P V Q D N Q D E Q A ft N K 24010 cgatttcattttagatgaattgacaatcatctatgccaatgta 113 ft I L F D S I L ft N C K F H S K S T N A L V D A T V G K 24094 cgggtattgatgacagtagtagcaaatgccgctcaacaaattgacgtccagttttattcaatgcctcagttcacctatacagtt 141 Rt V L M T V V A N A A Q Q I D V Q F Y S M P 
Q F T Y T V 24178 gaccctagaaacccttccagcttgctttctgttgacat tgtttatcaggacgagcgcacaaaaggaatgagcactgaaaaacaa 169 D P R N P S S L L S V D I v Y Q D E R T K G M S T E K Q 24262 ct tggcat cat tat agat atgaaat gaaagctggaacaagt caat caggaattgcaacagct t tagaagacat tgaagaacaa 197 L W H H Y R Y E M K A G T S Q S G I A T A L E 0 I E E Q 24346 tgttggctcacttatgccttaacggatggagagtcgaaccaaatctatatgacagaaagtggccaaactactatcaaggagaca 225 C W L T Y A L T 0 G E S N Q I Y M T E S G Q T T I K E T 24430 gaggctaaacttgtagaaattgaagacaacctaggaaacaagattgaagttcctttaaaagttcaagaatccgccccaaccggc 253 E A K L V E I E 0 N L G N K I E V P L K V Q E S A P T G 24514 ttgaagcaaattccttgtcgagttattcttaatgaaccattgactaatgacatatacgggacaagcgatgtcaaagaccttatc 281 L K Q I P C R V I L H E P L T N D I Y G T S D V K 0 L 1 24598 acagtagcagataacttgaacaaaactattagtgacttacgagattcacttcgatttaaaatgttcgagcagcctgttatcatt 309 T V A 0 N L N K T I S D L R D S L. R F K M F E Q P V I I 24682 gatggctcttctaagtcaattcaaggaatgaagattgcgccaaacgctttggtcgaccttaagagtgaccctacttcctcaatc 337 D G S S K S I Q G M K I A P N A L V 0 L K S 0 P T S S I 24766 ggcggtactggaggcaagcaagctcaagtcacttccatttcaggaaacttcaacttccttccagcggctgaatattatttagag 365 G G T G G K Q A Q V T S I S G N F N F L P A A E Y Y L E 24850 ggcgctaagaaagccatgtatgaactaatggaccagccaatgcctgaaaaggtacaggaggcgccatcaggaattgcaatgcag 393 G A K K A M Y E L N 0 Q P M P E K V Q E A P S G I A M Q 24934 t tct tat tctacgacct aat t t Cgatgtgacggaaaat ggattgagtgggatgatgct att caatggct catt caaatgctg 421 F L F Y D L I S R C D G K W I E W 0 D A I Q W L I Q M L 25018 gaagaaattttagcaacagtgaatgttgact tgggaaatattcctcaagatattcaatcaagttatcaaacacttacgacaatg 449 E E I L A T V N V 0 L G N I P Q D I Q S S Y Q T L T T N 25102 actatcgaacaccactatccaattcctagcgatgaacttctgctaagcaacttgcgctcactgaagttcaaactaatgtacgc 477 T I E H H Y P I P S 0 E L S A K Q L A L T E V Q T N V R 25186 agccaccaatcttacattgaagaattcagtaagaaggaaaaggcggacaaggaatgggaacgcattttggaagaacttgctcag 505 S H 0 S Y I E E F S K K E 
K A 0 K E W E R I L E E L A Q 25270 cttgacgaaatctcagctggagcattgcctgtattagcaaacgaattaaacgaacaagaggagcctcaagatgaaacgagtgaa 533 L D E I S A G A L P V L A N E L N E Q E E P Q D E T S E 25354 gaagacgaagttgatgacaaagaaaaagaacaaactgaacaaccaaccgaagaaggagtcgacccagacgttcaaggttaa 25434 561 E 0 E V 0 0 K E K E Q T E Q P T E E G V D P D V Q G dplORFO06 45296 atgat tgaaat cgttat agcacgtt cgaaagctaggcgaggt cgaaccctatt tat tgaaacatgggcaagcactgatgaagat 1 M I E I V I A R S K A R R G R T L F ISE T WA ST 0 E D 45380 gcagttaaaatggcagaaaagatttccagcttgcccaatgtagtcgagacgtcttctaataacttcgaactaccttataagtat 29 A V K M A E K I S S L P N V V E T S S N N F E L P Y K Y 45464 t tcaat aatgt tatagacgctCt agatgaat gggagcttcacatctt cggcgaacttgat aaagatgttcaagact acattgac 57 F N N V I D A L 0 E W S L H I F G E L D K D V Q D Y I D 45548 t ct cgaaaccgaatagct tctt Caagcaatgagcagtt tt cgtt caagactact ccattcgcgcaccaggt tgaatgt tt cgaa S R N R I A S S S N E Q F S F K T T P F A H Q V E C F E 45632 tacgcacaagagcatccatgttt cctttaggcgatgagcaaggtttagggaaaactaaacaggcaattgatattgcagttagc 113 Y A Q E H P C F L L G D E Q G L G K T K Q A I D I A V S 45716 aggaaggcaagtttcaaacattgttaatcgtatgttgcatatcagggctcaaatggaattgggcaaaagaagtaggtattcat 141 R K A S F K H C L I V C C I S G L K W N W A K E V G I H 45800 tcaaatgagtcagctcatat ttt aggaagtcgagtcactaaagatgggaaattagtgattgacggagtttctaaacgggcagaa 169 S N E S A H I L G S R V T K D G K L V I 0 G V S K R A E 45884 gact tgct tggtggccacgacgaattct tccttatcactaacat tgaaactcttcgcgatgctgtgt tcatiaaatact taaat 197 D L L G G H 0 E F F L I T N I E T L R D A V F I K Y L N 45968 gaactgacaaaaagcggagaaattggaatggtattattgacgagattcacaagtgtaagaacccttcaagtaagcaaggggct 225 E L T K S G E I G N V I I D E I H K C K N P S S K Q G A 46052 tcaat tcaaaagctccaaagttattacaagatgggacttacaggaactcctctaatgaataacccaatcgatgtattcaatgtt 253 S I Q K L Q S Y Y K N G L T G T P L M N N P I D V F N V 46136 atgaagtggctaggggcggaacatcatacactgactcagttcaaagagcgatactgtatcgtcgaccagttcaatcaaatcact 281 M K W L G A E H H T L T Q F K 
E R Y C I V D 0 F N Q I T 46220 ggatatcgaaatctagctgaacs tcgcgagcttgtcaacgactacatgcttagaagaacgaaggaagaagttttagacctgcct 309 G Y R N L A E L R E L V N 0 Y M L R R T K E E V L 0 L P 46304 gaaaagattcgagtcacagagtatgtcgacatgaactcgaaacagtcaaaaatctataaggaagttttgactaaacttgttcaa 337 E K I R V T E Y V D M N S K Q S K I Y K E V L T K L V Q 46388 gaaatagataaagt caagct catgcctaaccctct agccgaaacgatt cgact tcgacaagcgactggaaat cct tcgat t tta 365 E I D K V K L N P N P L A E T I R L R Q A T G N P S I L 46472 actactcaagatgtcaagtcttgcaagttcgaaagatgtatcgaaattgtcgaggaatgtatccagcaaggaaagtcctgcgtg 393 T T Q 0 V K S C K F E R C I E I V E E C I Q Q G K S C V 46556 atatttagcaattgggaaaaggttattgaacct cttgctaagatactttcgaagacagtcaaatgcaacctggtaacaggagaa 421 1 F S N W E K V I E P L A K I L S K T V K C N L V T G E 46640 accgcagaaagcaacgaaagaagaatagaL:,aaaggc Z. gta 449 T A D K F N E I E E F N N H R K A S V I L G T I G A L G 46724 acaggat t act t gacgaaagcggatacggtt at t t tct tagat agtccgtggacacgcgcagaaaaggaccaagccgaagat 477 T G F T L T K A 0 T V I F L D S P W T R A E K 0 Q A E 0.
46808 aggtgt catagaat tggcgcaaaaagtt ctgt cactat ctacacgct tgt cgccaaaggt actgt tgacgaacqtastagaagac 505 R C H R I G A K S S V T I Y T L V A K C T V 0 E _R I E 0 46892 ctatgaacggaaaggagaattagcagattatatcgtagatggtaagcctatgaaatctaaaattggtaaccttttcgatatc 533 L I E R K C E L A 0 Y I V D G K P N K S K I G N L F D I 46976 ctgcttaaatag 46987 561 L L K dp1ORF007 22230 atgacaataagcctgagaaataaactacctaagt tcaact tcgt ccct tttagt aagaaacaact ccagctcctaacatggtgg 1 M T I S L R N K L P K F N F V P F S K K Q L Q L L T W W 22314 acaaagggctcaccttttcgaactttcgatatcgtcatagcagacggttccattcgttcaggaaaaacagtatcgatggctctt 29 T K G S P F R T F D I V I A D G S I R S G K T V S M A L 22398 tcattttccctttgggccatgacggaattcaacggacaaaactttgccatctgtggtaagacaattcactcagctcgacgaaat 57 S F S L W A M T E F N G Q N F A I C G K T I H S A R R N 22482 gttattcagcctctaaagcaaatgctcacaagtcgcgggtatgaaattcgagatgttcgaaatgaaaatctacttattattaga V I Q P L K Q M L T S R G Y E I R D V R N E N L L I I R 22566 cactttagaaatggcgaagaaattgtcaactacttctatatatttggaggaaaagatgagtcgagtcaagaccttatacagggg 113 H F R N G E E I V N Y F Y I F G G K D E S S Q D L I Q G 22650 gtaacattagcaggtatcttCtgtgatgaggtggcactgatgcctgaatcgtttgt caaccaagcgacagggcgctgttccgta 141 V T L A G I F C D E V A L M P E S F V N Q A T G R C S V 22734 acaggtccgaaaatgtggttctcttgtaacccggccaatcctaatcactacttcaagaagaactggattgacaaacaggtcgaa 169 T G S K M W F S C N P A N P N H Y F K K N W I 0 K Q V E 22818 aagcgtatcttatatcttcactttacaatggacgacaaccctagcttgacggatagcattaaaaggcgctatgagaaaatgtat 197 K R I L Y L H F T M 0 0 N P S L T D S I K R R Y E K M Y 22902 gctggagtcttcaggaaaagatttattctcggcctttgggtaacagcagatggtctagtttattcaatgttcaatgaagagcag 225 A C V F R K R F I L G L W V T A D G L V Y S M F N E E Q 22986 catgtcaaaaagctcaatatagaattcgaccgtttattegtagcaggcgactttggtatctataatgcaacaaccttcggcctt 253 Hi V K K L N I E F D R L F V A G D F G I Y N A T T F C L 23070 tatggattctcgaaacgtcataagcgctaccatctaattgagtcatactaccactcagggcgcgaggcggaagagcaactaact 281 Y G F 
S K R H K R Y H L I E S Y Y H S G R E A E E Q L T 23154 gaggcggatgttaattcgaatat tcaatttagttcagttctacaaaagactactaaagagtacgcaaatgatttagtcgatatg 309 E A D V N S N I Q F S S V L Q K T T K E Y A N 0 L V D M 23238 atacgaggaaagcaaatcgaatatataattctcgacccgtctgcttctgctatgattgttgaacttcaaaagcatccttatata 337 1 R G K Q I E Y I I L D P S A S A M I V E L Q K H P Y I 23322 gctagaaagaatatccctatcattcctgctcgaaatgacgtgacgcttggcatttcatttcacgctgaactcttggctgagaat 365 A R K N I P I I P A R N D V T L C I S F H A E L L A E N 23406 agatttacactcgaccctagcaacacgcacgacattgatgaatactatgcttacagctgggacagcaaagcgagccaaacggga 393 R F T L D P S N T H D I D E Y Y A Y S W 0 S K A S Q T G 23490 gaagatagagtcattaaagagcatgaccactgcatggataggaacagatatgcctgtctcactgacgctctaatcaacatgac 421 E D R V I K E H D H C M D R N R Y A C L T 0 A L I N D D 23574 ttcggtttcgaaatacaaatattatccggaaaaggcgctagaaactaa 23621 449 F G F E I Q I L S G K G A R N dplORFOO8 49624 gtgatacagcttcaagtcttaaataaagttctcgaagaaaagagcttatccattttagaaaataatggaattgaccaagaatac 1 VI 1Q L Q V L N K V L EE KS L S I L EN NCG I D Q E 49708 ttcacggattatttagacgagtatcaatttattcaagaacacstttcgagatatggaagagttccggacgacgaaactattctc 29 F T 0 Y L D E Y Q F I Q E H F S R Y G R V P D D E T I L 49792 gaccatttt cctggattcgaatttttcgaaattggcgaaactgatgaataccttatcgacaagctaaaagaggagcatctatat 57 D H F P G F E F F E I G E T D E Y L I D K L K E E H L Y 49876 aattcacttgttccaattttaacggaagcggctgaggacattcaagtagatagtaacattgcgattgcgaatataattccaaaa N S L V P I L T E A A E D I Q V D S N I A I A N I I P K 49960 ctagaagaacttttcaatcgctctaaattcgtaggcggactagacattgctcgaaatgctaaacttcgactagactgggcgaat 113 L E E L~ F N R S K F V G C L D I A R N A K L R L D W A N 50044 actattagaaaccatgacggtgaaagacttggaatatcgacagggtttgaactattggacgacgtgcttggaggcttacttcct 141 T I R N H D C E R L C I S T C F E L L D 0 V L C C L L P 50128 ggtgaggatttgattgtcataatggctcgacctggacaaggtaagtcgtggactat tgataaaatgcttgcaactgcttggaag 169 C E D L I V I M A R P C Q C K S W T I 0 K M L A T A W K 50212 
aacgggcatgatgtccttctatatagcggggaaatgagtgaaatgcaagttggtgctcgt atagatactattctttcgaatgtt 197 N C H D V L L Y S C E M S E M Q V C A R I D T I L S N V 50296 agcatcaattcaattaccaaagggatttggaacgaccatcagttcgaaaaatatgaggaccatattcaagcaatgactgaggct 225 S I N S I T K C I W N D H Q F E K Y E D H I Q A M T E A 50380 gaaaattcccttgtggtagtcacgccctttatgattggaggaaagaaccttacccctgcaattttagatagcatgatatctaaa 253 E N S L V V V T P F N I C C K N L T P A I L 0 S N I S K 50464 tatagaccas ctgtggtggggattgaccagctttcactcatgagcgagtcttatccaagcagggagcagaagcgaatccagtac 281 Y R P S V V C I D Q L S L M S E S Y P S R E Q K R I Q Y 50548 gccaacatcaccatggacctatataagatttctgctaaatatggaattcctattgtgcttaatgtccaagcagggcgttcggct 309 A N I T N D L V K I S A K Y C I P I V L N V Q A C R S A 50632 aaaactgaaggcgctgaaagtatggaactagaacatatagcagaaagtgatggagtaggtcaaaatgctagcagagttatcgct 337 K T E C A E S M E L E H I A E S D C V C Q N A S R V I A 50716 atgaagcgtgacgaaaaatccggcatacttgaactatctgtcgttaaaaaccgatatggcgaagaccgaaaaatcatcgaatat 365 M K R D E K S C I L E L S V V K N R Y C E D R K I I E Y 50800 atgtgggacgttgaaactggaacctatactcttataggattcaaagaggaaggcgaagaaggaactgaaaaaggcgaaagctct 393 M W D V E T C T Y T L I C F K E E C E E C T E K C E S S 0 6 i 509Z 421 P L K A K A S R S T A R L R S K V T R E C V E A F dp1ORF009 13160 atgacagactttaaaaaacgcttcaagaaagcagtaacagaaacaatcaatcgtgacggtatcgagaaccttatggattggpcc 1 M T D F K K R F K K A V T E T I N R D C I E N-L D-W L 13244 gaaaatgataccaatttcttctcaagtccagcaagcactcgataccatggaagctatgaaggtggacttgtgagcactcatta 29 E N D T N F F S S P A S T R Y H C S Y E C C L V E H S L 13328 aacgtgttcaatcaactacttscgaaatggaaccatggaggcaaaggctgggaagacatttacccaatggaaacagttgca 57 N V F N Q L L F E N D T N V C K C W E D I Y P N E T V A 13412 atcgtagcactatttcacgacctttgcaaagttggtcagtatcgtgaaactgaaaaatggcgcaagaacagcgacggtgaatgg I V A L F H D L C K V C Q Y R E T E K W R K N S D C E W 13496 gaaagctatttacttatcactacataatgactggaattatctctactt 113 E S Y L A Y E Y D P E Q L T M G H G A K 
S N F L L Q R F 13580 attcaactcacgccagttgaagctcaagcaattttctggcatatgggagcctatgatattagtccttatgcaaatttgaatgga 141 1 Q L T P V E A Q A I F W H M G A Y D I S P Y A U L N G 13664 tgtggagcagccttcgaaactaatccacttgcattcttaatccatcgcgcagatatggccgcaactcatgtagtcgaaaatgaa 169 C G A A F E T N P L A F L I H R A D MA A T Y V V E N E 13748 aacttcgaatactctcaaggtccagttgaacaagaggctgaggttgaagaagtagttgaagaaaaacctaagagttcaactcgt 197 N F E Y S Q G P V E Q E A E V E E V V E E K P K S S T R 13832 aagaaacctgcgcctaaggaagaaaaagttgaagaggctgaagaaaaaccaaaagctggaatcactcgacgtcgcaaacctgcg 225 K K P A P K E E K V E E A E E K P K A G I T R R R K P A 13 916 ccaaaagaggaagaggtagaagagcctaaagaagagcctaagaaagcatctt ctaaaattcgaatgcctaaaaagactgaaaag 253 P K E E E V E E P K E E P K K A S S K I R M P K K T E K 14000 gtcgaagaggtagaaagcgcagacgagccgaaagttgaagaagcagaggacgacaatgtggtggtacctgctggatatgttcga 281 V E E V E S A D E P K V E E A E D D N V V V P A G Y V R 14084 gatgtctactactt ctacagtgaagtcgctgacgtttactacaagaaagatgtcgacgagcctgacgatgacagcgacattctt 309 D V Y Y F Y S E V A D V Y Y K K D V 0 E P D D 0 S D I L 14168 gt agacgaagaagagt acatggacgcaatgtgt cctgtatt agaagaagact t Ctt acgaact tgacggcaaggt tcacaaa 337 V D E E E Y M D A M C P V L E E D F F Y E L D G K V H K 14252 ttagcaaaaggtgaacgcttgccggaagaatacgacgaagaaacttgggaacctatcacigaagcagaatacatcaagcgaaca 365 L A K G E R L P E E Y D E E T W E P I T E A E Y I K R T 14336 gaaaaacctaaagcagttgcaaaacctacCcgaaaaactccgcgccttctcgtcgccctcgcccttaa 14404 393 E K P K A V A K P T R K T P A P S R R P R P dp1ORF010 8699 atgaaattggaacagttgatgaaggactggaataaggattcgaaagctcttgtagcagttcaaggacttgaacgtgaagcgctt 1 M K L E Q L M K D WN KOD S K A L V A V Q G L E R E A L 8783 ccaagaatccctttttctgcgccttctatgaattatcaaacctacggcgggctccctcgaaaaagggtagttgaattcttcggt 29 P R I P F S A P S M N Y Q T Y G G L P R K R V V E F F G 8867 cctgagtcaagtgggaaaactacttcagctctcgacattgtcaagaatgcgcaaatggtatttgagcaggaatgggaacagaag 57 P E S S G K T T S A L D I V K N A Q M V F E Q E W E 0 K 8951 
actgaagaaCtcaaggaaaagCtggaaaatgcgcgtgcatccaaagctagcaagactgctgtcaaggaacttgaaatgcaactc T E E L K E K L E N A R A S K A S K T A V K E L E M Q L 9035 gatagt ct tcaagagcct ct taagat tgtatat cttgacctt gagaatacat tagacactgagtgggctaaaaagattggagt c 113 D S L Q E P L K I V Y L D L E N T L D T E W A K K I G V 9119 gatgt tgacaatatt tggatagttrcgccctgaaatgaacagcgctgaagaaatact tcaat atgt tt tagacattttcgaaaca 141 D V D N I W I V R P E M N S A E E I L Q Y V L D I F E T 9203 ggtgaagttggcctagtagttctagattccttgccttacatggtcagtcaaaaccttattgatgaagagttgactaaaaaggcc 169 G E V G L V V L D S L P Y M V S Q N L I D E E L T K K A 9287 tatgcaggaatctcagcgcctgactgaatttagtcgaaaggttactcctcttcttactcgctacaatgcaatattcctaggc 197 Y A G I S A P L T E F S R K V T P L L T R Y N A I F L G 9371 atcaatcaaattcgagaagatatgaatagtcagtacaatgcctattcaactccaggcggaaagatgtggaagcatgcttgtgca 225 1 N Q I R E D M N S Q Y N A Y S T P G G K H W K H A C A 9455 gttcgacttaaatttagaaaaggtgactaccttgacgaaaaCggtgCatcattgacccgtactgctcgaaaccctgcagggaat 253 V R L K F R K G D Y L D EN GA S L T R T A RN P A G N 9539 gtagtagagtcattcgtcgagaagaccaaagcatttaagccggacagaaaattagtttcctatacgctttcctatcatgatgga 281 V V E S F V E K T K A F K P D R K L V S Y T L S Y H D G 9623 attcaaattgaaaatgaccttgtagatgtcgctgtcgaatttggagtcattcaaaaggcaggggcatggttcagtatcgtcgac 309 1 Q I E N D L V 0 V A V E F G V I Q K A G A W F S I V D 9707 cttgaaactggagaaattatgacagatgaagacgaagaaccattgaagttccaaggcaaggcaaatctagttcgacgcttcaag 337 L E T G E I H T 0 E D E E P L K F Q G K A N L V R R F K 9791 gaggatgactacttattcgacatggtgatgactgcggttcacgaaattatcactcgagaagaaggctaa 9859 365 E D D Y L F 0 M V H T A V H E I I T R E E G* dplORFOll 28017 atgaatatttatgattatatcaacgcaggggagattgctagctacattcaagcacttccttcaaacgctcttcaataccttgga 1 M N I Y D Y I N A GE I A S Y I Q A LP S N A L Q Y L G 28101 ccaact ct tttccctaatgct caacaaacagggacagacatttcatggct caagggtgcaaataat ttgccagt aact atccag 29 P T L F P N A Q Q T G T D I S W L K G A N N L P V T I Q 28185 
ccatctaactacgacgcgaaagcaagtcttcgtgaacgtgctggatttagcaaacaagctactgagatggcattcttccgtgag 57 P S N Y 0 A K A S L R E R A G F S K Q A T E M A F F R E 28269 tctatgcgacttggtgaaaaagaccgtcaaaacttgcaaatgctattgaaccaaagttcagctcttgcccaaccacttatcact S M R L G E K 0 R Q N L Q M L L N Q S S A L A Q P L I T 28353 caactctataatgatactaagaaccttgtagacggtgttgaagcgcaagcagaatacatgcgtatgcaattgcttcaatacggt 113 Q L Y N D T K N L V D G V E A Q A E Y M R M Q L L Q Y G 28437 aaattcactgtcaaatcaactaacagcgaggctcaatacacttacgactacaacatggatgctaagcaacaatatgcagtcact 141 K F T V K S T N S E A Q Y T Y D Y N N D A K Q Q Y A V T 169 K K W T N P A E S D P I A 0 I L A A M D D I E N R T G V 28605 cgccctact cgaatggt ct tgaacegaaacacttataaccaaatgac taagagtgact ctat caagaaagctct tgcaat tggt 197 R P T R M V L N R N T Y N Q M T K S D S I K K A L A I G 28689 gt tcaaggt tct tgggaaaact tct tgctt ct tgcaagtgacgctgagaaat tcat cgc tgaaaaaacaggt ct tcat ccl 225 V Q G S W E N F L L L AS 0 A E K F 1 A E K T GJ, Q I A 28773 gtctactctaagaaaattgctcagttcgctgacgctgacaaacttcctgacgttggtaacattcgtcagttcaacttgattgac 253 V Y S K K I A Q F A 0 A 0 K L P 0 V G N I R Q F N L I 0 28857 gacggtaaagtggtattgcttccacctgacgcagttggtcacacttggtacggtactactccagaagcattcgacttggcttca 281 0 G K V V L L P P 0 A V G H T W Y G T T P E A F 0 L A S 28941 ggcggaacagacgctcaagttcaagttctttcaggcggacctaccgttacaacttatcttgaaaaacatcctgtcaacattgca 309 G 0 T 0 A Q V Q V L S G G P T V T T Y L E K H P V N I A WO 00/32825 PCT/I B99/02040 370 29025 acagttgtatcagctgttatgattccatcattcgaaggaattgactatgtaggagttctCacaactaattag 29096 337 T V V S A V M I P S F E G I D Y V G V L T T N dplORF0l2 5346 atgagtattaagttcaaaaccgaagaactttcaaaaattgtttctcagctcaataagttgaagcctaacaagttgctagaaatc 1 M S I1K F K T ELEL S KI V S Q L N K L K P S K L L ELI 5430 acaaactattggcatatttt9tgacggcgaatgcgtcatgttacagcatgagctcaaaacttccttcgatcattatc 29 T N Y W H I F G D G E C V M F T A Y D G S N F L R C I I 5514 gacagcgatgttgaaattgacgtgattgtgaaagcagagcagtttggaaaacttgtagaaaagaccacggccgcaaccgtcaca 57 D S D V E I D V 
I V K A E Q F G K L V E K T T A A T V T 5598 t tagttcctgaagaatcttcgctaaaagttattgggaatggtgagtacaatattgatat tgttacagaagatgaagagtaccct L V P E E S S L K V I G N G E Y N I D I V T E D E E Y P 5682 acattcgaccacttgctcgaagacgtgagtgaagaaaatgcttcCactttgaaaagctcgctgttctacggaatcgccaatatc 113 T F D H L L E 0 V S E E N A L T L K S S L F Y G I A N 1 5766 aacgattctgcggtatctaaatcaggagcagatggaatttataccggcttcctgttaaaaggcggaaaagcaattactacagac 141 N D S A V S K S G A D G I Y T G F L. L K G G K A I T T D 5850 atcattcgcgtatgtatcaaccctatcaaggaaaagggactagaaatgctcatt ccttacaacctaatgagtattttagcaagt 169 1 1 R V C I N P I K E K G L E M L I P Y N L M S I L A S 5934 attcctgatgagaagatgtactt ctggcaaattgacgatactactgtctatatttcatcggcttcagtcgaaatttatggaaaa 197 I P 0) E K M Y F W Q I D 0) T T V Y I S S A S V E I Y G K 6018 ttgatggaaggtatggaagatt atgaagacgtttcacagct tgacicaattgagtttgaagatgatgcggctatccctacagca 225 L M E G M E D Y E D V S Q L D S I E F E D D A A I P T A 6102 gaaatcctgagcgtattagaccgccttgtactattcacttcagcctttgacaaaggaaccgt cgaattcttattcttgaaagac 253 E I L S V L 0 R L V L F T S A F 0 K G T V E F L F L K D 6186 cgacttcgaattaaaacttctactagcagttatgaagacatcatgtacgcatctgctggcaagaaagtttcgaagaaagaat tc 281 R L R I K T S T S S Y E 0 I M Y A S A G K K V S K K E F 6270 acttgccaccttaacagcttactcttgaaggaaattgtatcaaccgtcaccgaagaaaacttcactgtctcttatggaagcgaa 309 T C H L N S L L L K E I V S T V T E E N F T V S Y G S E 6354 accgcaattaagatttcatcgaatggtgtcgtttacttcctagcacttcaagagccggaagaataa 6419 337 T A I K I S S N G V V Y F L A L Q E P E E dplORF0l3 10215 atgaatttagcttctaaataccgtcctcaaactttcgaggaagtggtagctcaagaatatgtcaaagaaattcttttgaatcaa 1 M N LA S K Y R P Q T F EE V V AOL Y V K L I L L N Q 10299 ttacaaaatggcgctatcaaacacggctatctattctgtggtggcgctggaactggtaaaaccactactgctcgaattttcgcg 29 L Q N G A I K H G Y L F C G G A G T G K T T T A R I F A 10383 aaggatgtgaacaaaggacttggctctcctattgaaattgatgctgcttctaataatggggtagaaaatgttcgaaacattatt 57 K D V N K G L G SF L EI D A A S N N G V E N V R N I I 10467 
gaagattctagatacaagtctatggacagcgagttcaaagtttacatcattgacgaggttcatatgctttcaaccggagcattt E D S R Y K S M D S E F K V Y I I 0 E V H M L S T G A F 10551 aatgcgctgttgaaaacattagaagagccctcatcgggaaccgtgttcattctatgtactactgaccctcaaaagattcctgac 113 N A L L K T L E E P S S G T V F I L C T T D P Q K I P 0 10635 actattctcagtcgagttcaacggtttgactttactcgaattgataatgacgacatcgttaatcaacttcaatttattatcgaa 141 T I L S R V Q R F D F T R I D N D 0 I V N Q L Q F I I E 10719 agtgaaaatgaagaaggagctggttatagttatgagcgtgacgccctttcgtttattgggaaacttgcaaatggaggaatgcgt 169 S L N E E G A G Y S Y E R 0 A L S F I G K L A N G G M R 10803 gacagtatcacaaggctcgaaaaagtccttgattatagt catcacgttgacatggaagccgtttctaatgcactaggagttccg 197 0 S I T R L E K V L D Y S H H V 0 M E A V S N A L G V P 10887 gactacgaaacattcgcttcacttgttgaagctattgccaactatgacggctcaaagtgtttagaaattgtaaatgaCttccac 225 0 Y E T F A S L V E A I A N Y 0 G S K C L L I V N 0 F H 10971 tactcaggaaaagacttgaaattagtgactcgaaactttacagacttccttttagaggtttgtaagtattggctagttcgagat 253 Y S C K D L K L V T R N F T D F L L E V C K Y W L V R 0 11055 atttcaatcactcaacttcctgCtcattttgaaagtaagctagagcaattCtgtgaggCtt ttcaatatcctactctattgtgg 281 I S I T Q L P A H F E S K L E Q F C E A F Q Y P T L L W 11139 atgctagaagaaatgaatgaacttgctggagttgttaaatgggagcctaatgctaaaccgataattgaaaccaaacttcttttg 309 M L E E M N E L A G V V K W E P N A K P I I L T K L L L 11223 atgagcaaggaggagtga 11240 337 M S K E E dplORP014 50961 atgaaagtaaatggtcttcaaattgaagcgactcctgaacaaataattgaaaaactttcgagacaacttgaagacgaaggaaca I M K V N G L Q IE A T PL E lI1E K L S R Q L L DL G T 51045 ttcatttttagacgaactaagtcgcttggaagcaactatcaattctcatgcccgtttcatgcaggagggatgaaaagcatccc 29 F I F R R T K S L G S N Y Q F S C P F H A G G T E K H P 51129 tctgtggcatgagtagaaatccttcttattcaggaagtaaggtgacggaagctggaacggttcactgtttcacttgcggctac 57 S C G M S R N P S Y S G S K V T E A G T V H C F T C G Y 51213 acttcaggactaactgaattcgtctcgaatgtattaggtcgaaacgatggagggttctatggaaaccagtggctgaaaaggaat 25 T S~ G L T E F V J S 7 T. 
V D G Gl F~ Y~ G V 1 Wj L~ K R N 51297 tttggaacatctagcgaagtagttaggcaaggcgtcagccctgaagcgtttcgaagaaatgggagaactgaaaaagtcgagcat 113 F G T S S EV V R Q G V S P L A F R R N G R T E K V E H 51381 aaaatcattcctgaagaggaacttgataaataccggtttattcatccttatatgtatgaacggaaattgacggacgagctcatc 141 K I1 LP E EE L D K Y R FINH P Y M Y E R K L-T D E--LI 51465 gagatgtttgatgtaggttatgacaaactgcatgattgcatcacctttccagtacggaacctcaagggcgaaacagtattcttc 169 E N F D V G V 0 K L H D C I T F P V R N L K G L T V F F 51549 aacegtcgaagtgttcgttctaagtttcaccagtacggtgaagatgaccctaaaacggaatttctttatggccaatatgagctt 197 N R R S V R S K F H Q V C E D D P K T E F L Y G Q V E L 51633 gtagcatttcgagactattttgaaaaacctattagtcaagtattcgtgactgagtctgtiatcaactgcttgactctttggtca 225 V A F R D Y F E K P I S Q V F V T L S V I N C L T L W S 51717 atgaagattccagcagtcgctcttatgggagtaggtgaggaatcaaatcaatttactaaaacgacttcttatagaaatatt WO 00/32825 PCT/I B99102040 253 M K I P A V A L M G V G G G N Q I N L L K R L P Y R N I 51801 gttctagcact tgaccctgataacgctgggcagacagcgcaggaaaaactct accgacagttaaagcgaagcaaggt cgttaga 281 V L A L D P D N A G Q T A Q E K L Y1 R Q L K R S K V V R 51885 ttrttgaactaccctaaagagttctatgataataagtgggatataaacgaccatccggaattattaaactttaatgatttagtc 309 F L N Y P K E F Y D N K W D I N D H P E L L N F N D L V 51969 ttgtag 51974 337 L 3793 atgggatttaatctatacttcgcaggaggtcacgctattagcactgacgattatttgaaggaaagaggagccaatcgcctat oc 1 M G F N L Y FA G G H A I ST 0 D Y L K E RG A N R L F 3877 aatcaactgtacgaaagaaacgggattggcaaaaggtggattgagcataagaaaaccaatccaagcactacttcaaaactatt C 29 N Q L Y E R N G I G K R W I E H K K T N P S T T S K L F 3961 gtcgactctagtgcatatctgctcataccaaaggggctgaagttgacattgacgcctatatcgaatacgtgaatgataacgtg 57 V D S S A Y S A H T K G A E V D I D A Y I E Y V N D N V 4045 ggaatgtttgactgtatcgccgaactcgataaaattcctggtgtatttagacagcctaagacacgtgaacagcttttggaagca G M F D C I A E L D K I P G V F R Q P K T R E Q L L E A 4129 ccacaaatttcttgggataattatctatacatgcgcgagcgaatggttgagaaagacaagctcttacctattttccatatggga 113 P Q I S W 0 N Y L Y M R 
E R M V E K D K L L P I F H M G 4213 gaagactttaaatggctcaacttgatgctcgaaactacattcgaaggcggaaagcatattccttacattggaatttcaccagc 141 E D F K W L N L M L E T T F E G G K H I P Y I G I S P A 4297 aatgactcgactacgaagcataaagacaagtggatggaaagagtattcgaagttattcgaaacagttctaatccagacgttaag 169 N D S T T K H K D K W M E R V F E V I R N S S N P D V K 4381 actcacgcatttgggatgacagttactagccaattagagcgtcacccattctatagcgccgactctacttctgtactgctcaca 197 T H A F G M T V T S Q L E R H P F Y S A D S T S V L L T 4465 ggagcgatgggaaacattatgacgtcaaaaggattagttgacttgtcacagaagaatggaggaattgatgctgtccgtaggctg 225 G A M G N I M T S K G L V D L S Q K N G G I 0 A V R R L 4549 ccaaaaccggttcaagttgaaattgaatccattatcgaagaaactggagcgcattttagcctagagcaattagttgaggactat 253 P K P V Q V E I E S I I E E T G A H F S L E Q L V E D Y 4633 aaacttcgagcattgttcaatgttcaatacatgctgaattgggcagagaactatgaattcaagggaattaaaaatcgtcaacgt 281 K L R A L F N V Q Y M L N W A E N Y E F K G I K N R Q R 4717 cgactattttag 4728 309 R L F dplORF016 43413 atgggagtcgatattgaaaaaggcgttgcgtggatgcaggcccgaaagggtcgagtatcttatagcatggactttcgagacggt 1 M G V D IE K G V A W M Q A R K G R V S Y S M D F R D G 43497 cctgatagctatgactgctcaagttctatgtactatgctctccgctcagccggagcttcaagtgctggatgggcagtcaatact 29 P D S Y D C S S S M Y Y A L R S A G A S S A G W A V N T 43581 gagtacatgcacgcatggcttattgaaaacggttatgaactaattagtgaaaatgctccgtgggatgctaaacgaggcgacatc 57 E Y M H A W L I E N G Y E L I S E N A P W D A K R G D I 43665 ttcatctggggacgcaaaggtgctagcgcaggcgctggaggtcatacagggatgttcattgacagtgataacatcattcactgc F I W G R K G A S A G A G G H T G M F I D S D N I I H C 43749 aactacgcctacgacggaatttccgtcaacgaccacgatgagcgttggtactatgcaggtcaaccttactactaegtctatcgc 113 N Y A Y D G I S V N D N D E R W Y Y A G Q P Y Y Y V Y R 43833 ttgactaacgcaaatgctcaaccggctgagaagaaacttggctggcagaaagatgctactggtttctggtacgctcgagcaaac 141 L T N A N A 0 P A E K K L. 
G W Q K D A T C F W Y A R A N 43917 ggaacttatccaaaagatgagttcgagtatatcgaagaaaacaagtcttggttctactttgacgaccaaggctacatgctcgct 169 G T Y P K D E F E Y I E E N K S W F Y F D D Q G Y M L A 44001 gagaaatggttgaaacatactgatggaaattggtattggttcgaccgtgacggatacatggctacgtcatggaaacggattggc 197 E K W L K H T D G N W Y W F D R D G Y M A T S W K R I G 44085 gagtcatggtactacttcaatcgcgatggttcaatggtaaccggttggattaagtattacgataattggtattattgtgatgct 225 E S W Y Y F N R 0 G S M V T G W I K Y Y D N W Y Y C D A 44169 accaacggcgacatgaaatcgaatgcgtttatccgttataacgacggctggtatctactattaccggacggacgtctggcagat 253 T N G D M K S N A F I R Y N D G W Y L L L P D G R L A D 44253 aaacctcaattcaccgtagagccggacgggctcattactgctaaagtttaa 44303 281 K P Q F T V E P D G L I T A K V dplORP017 11242 atgattggacagggacttgttaaatctaccatttcgaaatggaaacaacttccaaaatatataatcgtcgaaggtgaagtaggt 1 M I G Q C LV K S T I SK W K Q L P K Y I I V E5G EV G 11326 tcaggacggaagacct taatccgttatattgcttcgaaatttgacgctgattctattgtagtaggaacgagtgtagatgacatt 29 S G R K T L I R Y I A S K F D A D S I V V G T S V D D I 11410 cgaaacateattcaggatgcacagactattttcaaggcgagaatctacgtgatagacggaaatagcctgtCaatgtcagctctt 57 R N I I Q D A Q T I F K A R I V V I D G N S L S M S A L 11494 aactcgcttttgaagatagcggaagagccaccttaaactgtcatatagccatgactgttgatagcatcaataatgctttacct 0: N S L L X I A E E P P L. M C H 1 A M T V nl q T N A T. 
P 11578 acgcttgcaagtagagcaaaagttctaaccatgctaccttatactaatgaagagaaaatgcagtttgtcaagtcctacaagaag 113 T L A S R A K V L T M L P V T N E E K M Q F V K S Y K K 11662 gtagatacttcaggaattgacgaccgagcgattgtagactattgcaatcttgccagcaatcttcaaatgcttga agacatatta 141 V D V S G I D D R A I V DVY C N L AS N L Q M-L E D4 -1L 11746 gaatatggcgcagaagagctatttgaaaaggttacaacattttatgacttaatatgggaggcaagtgctag~at;:cgcaaag 169 E V G A E E L F E K V T T F Y D L I N E A S A S N S L K 11830 gttactaattggctcaaatttaaggaaactgatgaaggaaaaattgagcctaaacttttcctcaactgtcttttaaattggtcg 197 V T N N L K F K E T D E G K I E P K L F L N C L L N W S 11914 acagttgtcatcaggaagcactatgtagaaatgtctttcgaagaacttgaggcccatgaccttttagtgagggaagcatctagg 225 T V V I R K H V V E M S F E E L E A H D L L V R E A S R WO 00/32825 PCT/I B99/02040 11998 tgtttgcgaaaggtatctaaaaagggctcaaatgcgcgtgtctgcgtgaacgaatttatcaggagggcaaacaagttgagtga 12081 253 C L R K V S K K G S N A R V C V N E F I R R V K Q V E dplORF018 35847 atggctagcagacagacgc tat tggt cgacggaat tgacct tgt cgacaaaggtgcaaccgtgctagaaatgt aggactcact 1 M A S R Q T L L V D G I D L V D K G A T V L E Y V G L T 35931 ttcgcaggatttaaggactcaggattaaaaaccctgaaggcatagacggagtattagattctccgtctaatgcatgtccgct 29 F A G F K D S G F K N P E G I D G V L D S P S N A M S A 36015 cttactggaagcgtgaccttaatgttccacggagaaaccgaaaagcaagttaatcaaaaatacaggcagttcaaacaatttatt 57 L T G S V T L M F H G E T E K Q V N O K Y R Q F K Q F I 36099 cgctcgaagtcattttggagaatttcgacact tgaagaccctggatactatcgaacgggaaaat ttttaggagaaaccgagcaa R S K S F W R I1ST L ED PGC Y Y R T G K F L G ET E Q 36183 ggaaaact tgtagacgt tcaagccttaaaga act tccct tgt agrtaaat tagggat tcagt tcaaagatgcttacgagtac 113 G K L V D V Q A F K D T S L V V K L G I Q F K 0 A Y E Y 36267 agcgactcaactgttcgaaaggtttataagtttcaacccgctttgggaggcgatagcttacctaacccaggaagaccactcga 141 S D S T V R K V Y K F Q P A L G G D S L P N P G R P T R 36351 caatttagagtagaaataagaactacttctcaaatcaaaggatattttcgaattggcgaaaaaagttcaggacagtttgttgag 169 Q F R V E I R T T S Q I K G Y F R I G E K S S G 
Q F V E 36435 ttcggtactaattcagtattgatggaaagtggctcgattattattctaaatcttggaacttttgaacttattaaaatagcagt 197 F G T N S V L M E S G S I I I L N L G T F E L I K I S S 36519 gcaaatcaagcgactaacttatttagatacattaaacgaggcgcattcttcaagattcctaatgaaattcaacaataccatt 225 A N Q A T N L F R Y I K R G A F F K I P N G N S T I T I 36603 gaataccgagccgatgacgcagcagcttggacctctactcttcccgctcaagttgaactgtttctaaaccgtctacatag 36686 253 E Y R A D D A A A W T S T L P A Q V E L F L N P S Y Y dplORF019 12161 atgaatgtttatctcaatcaaatgggaaatgtagttcgagaaacttcggtttcaacagtctggaaaaccctcaccaaaaaggg I M N V Y L N QM G N V V RE T S V S T V W K T L T Q K G 12245 ctcgtttctaatcatcgaatattcgctgttcgagatgataaggagtttctgtctaatgagtcgaggtggaaaaggcttccggat 29 L V S N H R I F A V R D D K E F L S N E S R W K R L P D 12329 gtt agatatgggacact tgt t ttgatggt tactaaaat tgacaagcgaagcaagt tgctaaaggcctt tcctgaaattgtgt t 57 V R Y G T L V L M V T K I D K R S K L L K A F P D N C V 12413 gagtttgagaaaatgactgacgcgcagttgaaaaggcattttgtgtctaaatactcgactattgatagcgacatgattgacatg E F E K H T D A Q L K R H F V S K Y S T I D S D M I D H 12497 gttatccagttctgtctaaacgattactctagaattgacaatgaattggacaagctgtcgcgattgaaaaaggttgacgcatca 113 V I Q F C L N D Y S R I D N E L D K L S R L K K V D A S 12581 gtagttgaatccattgtcaagcacaagaccgaaattgacattttcagcctagttgatgatgtattggaatataggccggagcag 141 V V E S I V K H K T E I D I F S L V 0 D V L E Y R P E Q 12665 gcaattatgaaagtgactgaacttttagccaaaggagaaagtcctattggattgcttaccttgctttatcaaaatttaataac 169 A I M K V T E L L A K G E S P I G L L T L L Y Q N F N N 12749 gcttgtcttgtgctaggagccgatgagectaaagaagccaatctaggcattaagcagttcttaatcaataagattgtctataac 197 A C L V L G A D E P K E A N L G I K Q F L I N K I V Y N 12833 t t tcaat acgagc tggact cagc c tttgaaggcat qgct att t taggt caagct at cgagggcat aaagaatggt cgc tat aca 225 F Q Y E L D S A F E G M A I L G Q A I E G I K N G R Y T 12917 gaaagttcagtggtctatatttCtttgtataaaattttttcacttacttaa 12967 253 E S S V V Y I S L Y K I F S L T dp1ORF02 0 1864 
atggttaatcaatacaatcagcctgaaagaggcaagattcgaatcaatgttcgcgaccctgagaaaatgctatcatggaaatt 1 M V N Q Y N O P E RG K I R I N V R D P E K H P I M E I 1948 ttcggtcctacaattcaaggtgaaggaatggttataggtcaaaagactattttcattcgaactggtggatgcgactatcattgc 29 F G P T I Q G E G M V I G Q K T I F I R T GCG C D Y N C 2032 aactggtgtgactcagcctttacctggaacggtactactgagccggaatatatcacaggcaaagaagctgctagtcgaatcttg 57 N W C D S A F TI W N G T T E P E Y I T G K E A A S R I L 2116 aaactagctttcaatgataaaggtgaacagatttgtaaccacgtgacattgactggaggaaatcctgcctaatcaacgagcct K L A F N D K G E Q I C N H V T L T G G N P A L I N E P 2200 atggctaagatgatttcgattctaaaagaacatggattcaagtttggtctcgaaactcaaggaactcgatccaagaatggttc 113 M A K M IS I LK ENH G F K F G L ET Q G T R F Q ESW F 2284 aaagaagtaagcgatatcactattagtcctaaaccgccttcaagtggaatgagaactaatatgaaaattctgaagctattgta 141 K E V S D I T I S P K P P S S G H R T N H K I L E A I V 2368 gaiagaatgaatgatgaaaaect tgactggtcat ttaaaatcgttatctttgacgaaaatgacctagcttatgcgcgtgatatq 169 D R H N D E N L D W S F K I V I F D E N D L A Y A R D M 2452 tttaaaactttcgaaggcaagttacgtccagtgaactacctttcagttgggaatgcaaacgcatacgaagaaggaaaaatcagt 197 F K T F E G K L R P V N Y L S V G N A N A YE S G K I S 2536 gat aggct tcttgaaaagt tgggatggcttrtgggat aaagtgtatgaagacccagct t tcaacaatgttcgacctttacccaa 225 DR L L E LCW LW.,D K 'YE P AF N N vRPT.P Q 2620 cttcatacacttgtttatgataataaaagaggagtataa 2658 253 L N T L V Y D N K R G V dplORP021 2504 atgcaaacgcatacgaagaaggaaaaatcagtgataggcttcttgaaaagttgggatggctttgggataaagtgatgaagcc 1 H Q TNHT K K E K S V I G F L K S W 0 G F C I K M K T 2588 cagctttcaacaatgttcgaccettaccgcaacttcatacacttgtttatgataataaaagaggagtataaaatgaaaattgag 29 Q L S T H F D L Y R N F I H L F M I I K E E Y K H K I E 2672 cat ctagat aaaat cggt aacgt at tagggagagagaacggatgggc t tccc t taagccggatgaaat tgt aacct tggacaat 57 H L D K I G N V L G R E N G W A S L K P D E I V T L D N 2756 actgaggcagccgttcaaagactttttggtctattaggcgaggacgcagaacgtgacgggttgcaagatactccatccgttt T E A A V Q) R L F G L L G E D A E R D G L 0 
D T P F R F WO 00/32825 PCT/I B99/02040 373 2840 gttaaagcactcgctgaacataccgtagggtatcgagaagaccctaaacttcatctcgaaaaaacattcgacgtcgaccatgaa 113 V K A L A E H T V G Y R E D P K L H L E K TF F D V D H E 2924 gaccttgttcttgtgaaagacattccattcaattctttatgtgagcatcatttagctccgttcgtagggaaggtgcatattgca 141 D L V L V K D I P F N S L C E H H L A P F V G K V H I A 3008 tacattcctaaggataagattacaggtctttcaaaattcggtcgagtggttgaaggatacgctaaacgacttcaagtacaagag 169 Y I P K D K I T G L S K F G R V V E G Y A K R L Q V Q E 3092 cgct tgact caacaaat cgctgacgct at tcaggaagt tct aaat cct caagcagt tgcggt cat cgtagaggct gagcat act 197 R L T Q Q I A D A I Q E V L N P Q A V A V I V E A E H TF 3176 tgcatgagcggacgcggtat t aagaagcacggggcaacgacagtgactt caact atgcgaggtct t t tccaagatgacgcat ct 225 C M S G R G I K K H G A TF T V TF S TF M R G L F Q D 0 A S 3260 gctcgagcagaattgcttcagttgattaaaaagtag 3295 253 A RA E L L Q L I K K dplORFO22 30896 atgagtaaagacatt ctttacggaatcaagctCgtgcaaatcgaggagcttgacccattgactcagttgccaaaagtcggcgga 1 M S K D IL Y G 1 K L V Q IXE E L D P L T Q L P K V G G 30980 gctaactttgtcgtagatacggcagaaacagcagaactcgaagccgtgacctcggagggaactgaagatgtgaaacgcaatgac 29 A N F V V 0 TF A E T A E L E A V TF S E G TF E D V K R N D 31064 acgcgcat tct tgctatcgtgcgt act ccagacct tt tatacggt tatgact taacat tcaaggacaacacgt t tgaccctgaa 57 T R I L A I V R T P D L L Y G Y D L TF F K D N T F D P E 31148 atcatggccctaattgaaggtggtacagtacgtcaacaaggcggaactattgctggatacgacaccccaatgcttgcacaaggt 1 M A L I E C G T V R Q Q G G TF I A G Y D TF P M L A Q C 31232 gct tctaat atgaaaccat t tagaatgaacatct atgtgccaaact atgt aggtgactcaattgt caact acgtgaaaatcact 113 A S N M K P F R M N I Y V P N Y V G D S I V N Y V K I TF 31316 ttgaataactgt accggtaaagctccagggct t tcaatcgggaaagagt tctacgct cctgagt tcaacat caaggcacgtgaa 141 L N N C T G K A P G L S I G K E F Y A P E F N I K A R E 31400 gcaaccaaagcaggtttgccagttaagtcaatggactatgtggcacaacttccagcggttcttcgtcgcgtgacattcgatttg 169 A T K A C L P V K S M D Y V A Q L P A V L R R V TF F D L 31484 
aacggtggaacaggaaccgccgacgcagttcgagttgaagcaggtaagaagatttctccaaaaccagttgaccctaccttaaca 197 N G G T C T A D A V R V E A C K K I S P K P V D P T L TF 31568 ggtaaggct t tcaaaggctggaaagt tgaaggagaatcaactatt tgggact tcgacaaccacatgatgcctgaccgagacgt c 225 G K A F K C W K V E G E S TF I W D F D N H M M P D R D V 31652 aaactcgtagcacaatttgcatag 31675 253 K L V A Q F A dplORFO23 6419 atggccaagt ccaat t taactagaat tgcaaagatggt tagagcaggaaacagtgaaggt cctgcttcat ct tttgt caat tcg 1 M A K S N L T R I A K M V R A G N SECG P A SS F V N S 6503 ctgacccgggttattgaacgaactcagcctgaatataatccttcgacatattataagcccagcggggttggtggattattcga 29 L TF R V I E R TF Q P E Y N P S T Y Y K P S C V C C C I R 6587 aaaatgt at ttcgaaagaat cggtgagt ctat tatagat aacgcagat t ctaacctaattgcaatgggcgaagctggaacat tt 57 K M Y F E R I C E S I I D N A D S N L I A M C E A G TF F 6671 aggcacgaagttctccaagagtacatggttaaaatggctgaaatcgatgaggactttgaatggttgaatgtagcagagttcttg R H E V L Q E Y M V K M A E I D E D F E W L N V A E F L 6755 aaagaaaatccagttgaaggaactatcgtcgacgagcgtttcaagaaaaacgattatgaaacgaagtgtaagaacgaacttctt 113 K E N P V E C T I V D E R F K K N D Y E TF K C K N E L L 6839 caact t tcat ct Cgtgtgacggactagt tcgatat aaaggcaagctctacat tttagagattaagactgaaaccatgt tcaag 141 0 L S F L C D C L V R Y K C K L Y I L E I K TF E TF M F K 6923 t tcact aaac atact gagc cct at gaagaacacaagat gcaagcaact tgctacggaatgtgt ctaggagt cgatgatgt cat t 169 F TF K H T E P Y E E H K M Q A T C Y C M C L C V D D V 1 7007 t t cct t tatgaaaat cgagataac t tcgaaaagaaagcctacacgt ttcacat cacagacgagatgaaaaatcaagt ccttgga 197 F L Y E N R 0 N F E K K A Y TF F H I TF D E M K N Q V L C 7091 aaaattatgacctgcgaagagtatgtagagaaaggcgaaagtcctaaaatctattgctcttcagcctattgcccatattgtaga 225 K I M T C E E Y V E K C E S P K I Y C S S A Y C P Y C R 7175 aaggaaggtcgaaatctgtga 7195 253 K E C R N L dplORF024 25992 atgaacgcagt agatggccaggtagt tcatattctacaagtattagcagaagatggaaatgctacggctgaaaagt tcgaaaag I M N A V DCG Q V V H IXL Q V L A E D C N A T A E K F E K 26076 
gaagtcagggctgcatctttagtattttcacgaagagcagccgaggcagttgtcaaaggtgaaatctataaggacggcaaaaac 29 E V R A A S L V F S R R A A E A V V K C E I Y K D C K N 26160 ctctcgaaacgtgtttggtcttcagccgcacgcgcaggaaatgatgttcaacaaatagtcacacaaggcctagcaagtggaatg 57 L S K R V W S S A A R A C N D V Q Q I V T Q G L A S G M 26244 tctgctacagatatggctaaaatgctcgagaaatatat cgaccctaaggttcgaaaagattgggactttgataagatagctgag S A T D M A K M L E K Y I 0 P K V R K D W D F D K I A E 26328 aagctagggaaacctgctgctcataaatatcaaaatctcgaatacaatgcccttcgacttgctcgaactaccattagccattcc 113 K L G K P A A H K Y 0 N L E Y N A L R L A R TF T I S H S 26412 gccacagctggagtgagacaatggggcaaggttaatccttatgctcgaaaagttcaatggcattctgttcacgci ccaggrcga 141 A TF A C V R Q W C K V N P Y A R. K V Q W H S V H A P C R 26496 acgt gt caagcgtgtat cgat t tagatggtgaagtat t tcctat cgaagaatgt cctttcgaccat cctaatggaatgtgct ac 169 'F C Q A C I D L 0 C E V F P X E E C P F 0 H PN C-M C Y.
26580 caaactgt atogt acgaaaact cact cgaagaaat cgctgatgagt tgagaggctgggagacggagacct aagtl~rat ta 197 Q T V W Y E NS L EE I A DE L R C W V D C E P-W D V L 26664 gacgaatggtacgacgat t taagtt caggaaaagt tgagaaatacagcgacct cgacrtt gt taaaagt tat tag 26738 225 D E W Y 0 D L S S C K V E K Y S D L. D F V K S Y dplORFO2S 18778 atggcaaagaacaaaaagcgaaaaaaagtaaatgtcaaaaggaaaatgcttatccctacaaatctctcgaaaaaagtaaatgta 1 M A K N K K R K K V N V K R K M L I P T N L S K K V N V WO 00/32825 PCT/I B99/02040 374 18694 aaagcaatcgcttatagaaaagtcactgttaagtgctgcctaatacagatgaaattcaagtatatttcgacctttatataaat 29 K A I A Y R K V T V K W L P N T D E I Q V Y F D L Y I N 18610 aaaaacaggctgacaatgttaggcactattgacccggacaagagctattttgaaggaattaggattgtttgtaagaaacctcag 57 K N R L T M L G T I D P D K S Y F E G I R I V C K K P Q 18526 ccttggatgactgttaaggagctccaggtgcgcgtgcagacgccccaggtttttttgcagttcttaaagcctattgtcacacg P W M T V K E L Q V A R A D A P G F F A V L K A Y C H T 18442 gttggcgatgtactagatagcggagcagagcctactgaaattgttcaaggtattatgtataaagacggtgaactatttaaggac 113 V G D V L D S G A E P T E I V Q G I M Y K D G E L F K D 18358 agtgaaattgtcagccttttcaaatacgatgtcaaagagccttatgagtttccaaaggaccttcctataaccttggacaacttt 141 S E I V S L F K Y D V K E P Y E F P K D L P I T L D N F 18274 ttagagttcattatgtctagccagcatactagagcacttgttttgcgttgtgctaatataggtgagttttCCaagaattggCgg 169 L E F I M S S Q H T R A L V L R C A N I G E F S K N W R 18190 aaatggcaaaaagctatccagctcctgctcgactatgccaaggcggatgactttaaagtagacgaaactgtttgggacttttca 197 K W Q K A I Q L L L D Y A K A D D F K V D E T V W 0 F S 18106 cccggctctaaagctggaaaggtagcacgtcgtaaaggctatgaggcaattcaacaagcccttgagcagataaataaataa 18026 225 P G S K A G K V A R R K G Y E A I Q Q A L E Q I N K dplORFO26 21512 atggcgaaagctactggaccaaaagtcgaagaggaaaaactcctCcacggccaaaagacaaaaaaggaatcaaagcaaatgcg 1 M A K A T G P K V R R G KT P P R P K D K K G I1K A N A 21596 cgtgtcaataaagaccagttcgtagagtatgactataaaggcatcaagatgacaattaaggaacgtgatgctagaatgaaattg 29 R V N K D Q F V E Y D Y K C I K H T I K E R D A R M K L 
21680 gaatttattagaggcatgactattcaggaaattgcageCCgctatggattaaatgaaaagcgtgttggcgaaatacgggctcgc 57 E F I R C H T I Q E I A A R Y G L N E K R V C E I R A R 21764 gataaatgggtgaaggctaagaaagagttcgagaatgaaaaggctcttgttact aatgatacattgactcaaatgtatgcaggg D K W V K A K K E F E N E K A L V T N D T L T Q M Y A G 21848 tttaaagtctcagtcaatattaaatatcacgccgcctgggagaaactaatgaacatcgtcgaaatgtgtttagataatcctgac 113 F K V S V N I K Y N A A W E K L M N I V E M C L D N P D 21932 agatatttatttactaaagaaggaaatattagatggggcgcattagatgtcctttcgaaccttatagatagagctcaaaaagga 141 R Y L F T K E C N I R W G A L D V L S N L I D R A Q K G 22016 caagaaagagcgaatggaatgcttccggaagaggttcgatatagactacaaattgagcgcgagaaaattacattgctCCgggCC 169 Q E R A N G M L P E E V R Y R L Q I E R E K I T L L R A 22100 aaaatgggcgaccaggaaaitgaaggcgaggttaaagataacttcgtagaagcactagataaagcagct caagccgtttggcaa 197 K M G 0 Q E I E C E V K D N F V E A L D K A A Q A V W 22184 gaatttagtgacgcaacaggttcctacattaaaggagtgactgataatgacaataagcctgagaaataa 22252 225 E F S D A T G S Y I K G V T D N D N K P E K dplORFO27 52762 atgggaaaagtatcaattcaaaaatcaggaacatttagct cagggtctaataacgagtttttcacactcgctgaccacggtgac 1 M G K V S I Q K S CT F S S G S N N E F F T L AD H G 0 52846 agcgcaattgtcactctattgtatgatgacccggaaggcgaagacatggattatttcgtagtccacgaagcagacgttgacggt 29 S A I V T L L Y D D P E C E D M D Y F V V H E A D V D G 52930 cgtcgacgctatatcaattgcaatgctattggcgaagacggggaaacagtccatcctgataattgtccattatgccaaaacgga 57 R R R Y I N C N A I G E D G E T V H P D N C P L C Q N G 53014 ttccctcgtattgaaaaactatttcttcaactttacaaccatgatacgggaaaagttgaaacatgggaccgaggccgttcttat F P R I E K L F L Q L Y N H D T G K V E T W D R G R S Y 53098 gt tcaaaagat tgt tacat ttat caat aaat atggaagcct tgtgact cagccttt tgaaatt attcgtt caggagct aaaggt 113 V Q K I V T F I N K Y G S L V T Q P F E I I R S G A K G 53182 gaccaacgaactacttatgaattccttccagagcgtccggaagacagtgctactcttgaagattttccagaaaagagcgaactt 141 D Q R T T Y E F L PE R P E D S AT LE D F P E K S E L 53266 
cttggaactctaaitttagacctcgacgaagaccaaatgtttgacgtggt tgacggcaagttcactcttcaagaagagcgttct 169 L C T L I L 0 L D E D Q M F D V V D G K F T L Q E E R S 53350 tcaagtcgttcaaattcacgtagaggagcatctcctgcgcctagacgaggttccggtcgagaatcttcacaaggtcgaacagct 197 S S R S N S R R G A S P A P R R G S G R E S S Q G R T A 53434 gaaagaactccttcagttagtcgaagaactcctccaacacgaggtcgaggattctaa 53490 225 E R T P S V S R R T P P T R C R G F dplORFO28 44595 atgtcaaaaat taaat tcgaaaacct taaaaaaggcgatgttgtgctacgagctaaatct caaacg3aagtt taaaat cgt tt ca I M SK I K F E N L K K G D V V L R A K S Q T K F K I V S 44679 att t tagcagacgaaaagaaagcagacct tgaatcattagaagacggaggtgaacttcacc~ttt.cagctt caactct cgaacgt 29 1 L A 0 E K K A D L E S L E D G G E L Hi L S A S T L E R 44763 tggt acacaatggaagatgaaac tgaacctaaaaaagaagaagctgctaaacctgctaaaaaggctgct cctgcagttgct cga 57 W Y T M E D E T E P K K E E A A K P A K K A A P A V A R 44847 cctgctcgaaaaggtagagtcgttcccaaacctaaaaaagaagtccttgaggaagaaattcctgaagttaaggaacagccggaa P A R K C R V V P K P K K E V L E E E I P E V K E Q P E 44931 gaagttggttcagttagtgagaaatccacrgc acd, 113 E V C S V S E K S T V R K P A P K K E S V M A I T K A L 45015 gaaagtcgaat tgt tgaagcct t tcctgcgt ct actcgaat cgt cact cagtct tacatcgc ctatcgct ctaagaagaact tc 141 E S R I V E A F P A S T R I V T Q S Y I A Y R S K K NF 45099 gt tactat cgaagaaact cgaaaaggtgt ttct at tggagt t cgccaaaagggt tgacagaagaccaaaagaaactM ct tgca 169 V T I E E T R K G V S I G V R A K G L T 2 D Q K..K L L A 45183 K ctattgctcctgcatcttacgaatgggcgattgacggaatttttaaactcgtcaaggaagaagatattgacaccgcaatggaa 197 S I A P A S Y E W A I 0 G I F K L V K E E D I 0 T A M E 45267 ttgattgaagcttctcacctttcttcgctatga 45299 225 L I E A S H L S S L dplORPO29 V~ VX S VA a o va u 33 Hi a 1 0 d q x d u v w EEOAIIO~dp Ia a x 'I x N x L6T q1 S 7 5 0 S I V S 'd .S A D 1 3 H V I 3 v a S A S d D N 691 Be6 ee2o o 3 6e 21v o5 15 z 56eoB e IB 5 2apvvS26 o eB S o e orpuIB 15 w o v S LES 0 1 A S S 1 A V A 3 1 N S I1 I. W A a 1 S V 3 1 3 1 M D A II'! 
U A2 .J L S 'I 1 A 0a Ga 3 3 N 1 S A S a a I A 3 A LII A M4 N H N W S d V N rI A ld A 3 '1 A I H N X A 'I 'd I 'I A 1 99 s a x v E 0 N S x s 1 v '1 3 x s i 1. 3 A v i s a i a H d LS S ri x A x v V A S A w D s A H Ni x V S v I S w a d a 6Z '3 A N H I 33 aG. 14 M 3 A D A A SS A rl Hd N V 3 X W T x A a s s 0 a 0 v D v I. 0 v v V 0 x oaA s a 1 L61 TT9LZ I LSL 1 H d a d A D a a A H a d N D qI N a 1. a V d S X V 0 V 3 0 691 3 V d A 3 A 3 X A '1 AX H S 3 lH A V Xr '3 3 a 'i a X A XT 1I a aS 3 A I I N a ri N w4 a D'1 I a V V d V I S a S I '1 d ET! H I1 V S I. I A V a x v '1 0 s 0 x a ri 0 3 0 '1 N 0 1 1 .L 0se 56 ILZ V aI a N a X A 0 X S '1 1 V A 03 X A S N I s a N v v a s LS 00A A 3 a a H V H d A A N 0 a a a I A I X V a 'i a X 6z S I X S I1 3 X A O X i I d a a ri a x ri 'i a s '10 A v w4 T TEOAHdO~dp N a a 3 '1 1 H al d W I X V N 3 a) N A H a 3 I A V 3 L61 '1N a a N 0 d H a i a 3 x x q1 N H a L, a 0 '1 v N x J. 0 3 69T S d I1 A W4 1 5 A 3 3 A 0 X X A H I V '1 V S '1 A a X '1 A, S TITI N X A S d 3 If 1 d 1 Hf a I If A H 'I A 'I I1 V V 3 A I X S A EII ,r 3 v a b x 3 oa a AI I H S X N 'I Hd W a Hd 0 N I A H sB A N I1 V '1 I N a 'I 3 H 3 M4 14 A 1 Hd a V 3 X a A V 0 9 1 3 LS a 0X d 3 a A 0 3 A 0) V 'I V I N N X 1 W4 rI X 0 V W4 'I 'I V 6Z 3. a 3 a S d Na N V '1 01 '1 N H I X 3 11 X 3N N W T I oBeeoBD 5DeoriS~oovo~u ef-1 eo L a er Let5DeIL 2eBr ~veaBea 8800E 0EOLHOTdP N 3 A A szz OVEL laeeBeeDe VLLI N I d a i w a N 3 a A v x x H a i v i D 0 5 3 V a L6T S 3 A Z) S H I 1 I AL d A a rI a I a M X A A 0 V X I. 'I I. '1 691 vS 6 3 5 eDDD 9911 'I d V A q1 I A X a a 1 A 3 W4 V N S W S N A A 3 d 1. 0 a tI v oa 6 55113gbe j6uoSS ovS iv2p,6ee 3oaB ea rel o 6D oe e IeB oeoB6 I 801B d A v a a v v a a V H V a A A A A S V a A S A V A V V V LII BD5B 55E BBED D5D B35v 5EB~~ B866 S rl W11 'I D N H A4 d A A 1. 
a A A 3 )f 3 V I'1 3 V A S X a se N S 1 3 0 X a 0 q1 S S S S S S A I X S a 1 3 1 1 4 AX LS ADaA A 4V A NV V N 3r13 VH3NHX09aAN AV I VHNA 6Z 55 B 53 6~ De B E 55 BeDD5 9 t'L N X S a Mx a A 3 1 V '1 01 V S a A a a0 S 1 '1 A A S X W4 I I upe 551Beo 1v eoeaII vot v0I650500iIiI1 e~ 99 Ot'OZO/669 I/1JDd
29 IIN K V V D E I V E A A C G S L D 0 A M E E I Q I V V 7838 agccaaaat cctgt cat tatggaagacct taactactacattggct atct tcccactct t c tttat t t cgccgcaatagggcg 57 S Q N P V I M E D L N Y Y I G Y L P T L L Y F A A D R A 7922 gaaatggtgggaatacaaatggat tcaagt tctgctatcaggaaagaaaaatacgataatctatacattttagccgccgggaaa E M V G I Q M D S S S A 1 R K E K Y D N L Y I L A A C K 8006 actattcctgacaagcaagcagaaactcgaaaacttgtcatgaatgaagaagtcatcgaaaatgcttacaagcgagcctacaag 113 T I P D K Q A E T R K L V M N E E V I E N A Y K R A Y K 8090 aaagttcaattaaagctagaacaggccgataaggtattagcatctttaaaacgaattcaaacctggcaactagcagagttagaa 141 K VO0L K L E Q A D K V LA S L K R I Q TW Q L A E L E 8174 actcagtcaaataattcaaaaggagtattattaaatgcaaaaagacgtagacgtgaaaatgattga 8239 169 T Q S N N S K G V L L N A K R R R R E N 0 dplORF034 131 atatcaaatccccgccga gcgcttc tttgaacaaacatcatta 1 M S Q N T T R TOD A E L T G V T L L G N Q D T K Y D Y D 215 tataatccagacgtccttgaaactttccctaacaaacatcctgaaaataattacctagtaacatttgacggatatgaattcact 29 Y N P D V L E T F P N K H P E N N Y L. 
V T F D G Y E F T 299 t cc gctaaagcg tattccattt tct g aa cacaagtg ga ta 57 S L C P K T G Q P D F A N V F I S Y I P N E K M V E S K 383 t cat tgaaattgtacttattcag tt ccgtaaccacggtgactccacgaagattgcatgaacat tattttgaatgacttgt at S L K L Y L F S F R N H G D F N E 0 C M N I I L N 0 L Y 467 gaattgatggaacctaagtacatigaagtcatgggcctattcactcct cgtggtggaatttcaatttacccattcgtcaacaaa 113 E L M E P K Y I E V M G L F T P R G G I S I Y P F V N K 551 gtgaatcct Caatt tgcaact cctgaacttgaacagct tcaact tcaacgcaaattgaacttc ct tg9aaatgttcaagt ctt 141 V N P Q F A T P E L E Q L Q L Q R K L N F L G N V 0 G L 635 ggacgagctattcgatag 652 169 G R A I R 17425 atgcac c taa tgaagga tt cgaagat gt tgaggac at ggaagtcctt ag catt cgagtt cgaaacgaaggt gaggacgacgagt 1 M H L M K D SK M L R T W K S L A F E F E T K V R T T S 17341 gggt tgaagt tat cgcctgct atgaaaacgatgacgaggacgaagat ttggaagggttataaaatgaaggtat ttat caacaat 29 G L K L S P A M K T M T R T K I W K G Y K M K V F I N N 17257 catactgaagctgatat tgactacaaagat att ctaaattttgtagct tatcgaaactctcct aaccct caaattcaaat cact 57 N T E A D I 0 Y K D I L N F V A Y R N S P N P Q I Q I T 17173 agctggaacgct ttgctttcctgctatacacggaatgagctttcttataaaggagt ttcaat aacggact ttt ttgaagccat t S W N A L L S C Y T R N E L S Y K G V S I T 0 F F E A 1 17089 caaactattgcaagttccttcactcacctagactcgaaaacaattgatacacaaaatgaaaagcgactcgaaaggattgaggaa 113 Q T I A S S F T H L D S K T I D T 0 N E K R L E R I E E 17005 cttcagtcaagaataggtcattgtaactgtactatcgacgaacttaaaaaaggagtccacgaaatgccggatattgaatcagct 141 L 0 S R I G H C N C T I D E L K K G V H E M P 0 I E S A 16921 atttcttaccagtacggacagattcttgcttatgaagatgaacttaattttctgctaaactaa 16859 169 I S Y Q Y G Q I L A Y E D E L N F L L N dplORFO36 48808 gtgttagtcgaacgaaaagccgacaaggaatgttgggaatggctagaagctgttcgagcaaatatagtcgaagaagttcaaac 1 V L V E R K A D K E C W E W L EA V R AN I V E E V R N 48892 ggctgattatctgaattggaggaaca~gggtcattgaccatac 29 G L S I V IA S N T V G N G K T S W A V R L L Q R Y L A 48976 
gaaactgcacttgacggaagaattgttgagaaaggaatgtttgtagtgtcagctcaactattgactgagttcggcgactataat 57 E T A L D G R I V E K G M F V V S A Q L L T E F G D Y N 49060 tattaactcaatttgagtcaccctaatgggttataaagatgtg Y F Q T M Q E F L E R F E R L K T C E L L V I D E I G G 49144 ggttccttaaccaaggcctcttatccttatctgtatgacttggttaattatagggttgacaataactttcgactatttatacg 113 G S L T K A S Y P Y L Y D L V N Y R V D N N L S T I Y T 49228 acataatagraattgcttagcaagtttgcttttaatcggtcaa 141 T N Y T D D E I I D L L G Q R L Y S R I Y 0) T S V V L D 49312 tttcaggcaagcaatgtaagaggattggaggtaagcgaaattgaatcatag 49362 169 F Q A S N V R G L E V S E I E S dplORF037 55855 atggtgaagaaattgaaatctaaaatctattcagttgcatatataattctagtagttattgcgaaccttgtgacaatttatttc 1 M V K K L K SK I Y S V A Y I I L V V I A N L V T I Y F 55939 gaacctttaaatgtgaaaggaattttaattcctccaagcagttggtttatgggattcactttCCtgcttataaatctaataagc 29 E P L N V K G I L I P P S S W F M G F V F L L I N L I S 56023 aagt acgagaagccaaaat ttgcaggt tctt tgat atgggta9ggttatt ccttacct cgt tgat ttgctt tatgcaaaacct a 57 K Y E K P K F A G S L I W V G L F L T S L I C F M Q N L 56107 ccacaatcgcttgtcgtggcttcaggagttgcattttggataagtcaaaaagcaagtgtctttatattcgacaagctctcgaat R, P 0 S L V V A S C V A F W I S Q K A S V F I F D K L S N 56191 aaat tagactcgaagattgcaaatgt tcgtcact gc atCa agaugg9 C aYacat t, 113 K L 0 S K I A N A L S S N I C S I I D A T I W I S L G L 56275 agtcctcttggaattggaacggttgcatatatagatattccgtcagccgtactaggccaagttctagttcagtttatcttgcag 141 S P L G I G T V A Y I D I P S A V L G Q V L V Q _F I L_0 56359 tcaactgcttcgagatatttgaaaaagtag 56388 169 S I A S R Y L K K dplORP03S 1350 atgagagt t t ctaaaac ct taacatt cgacgcagct catCaact agt tgga cat t ttggaaaat gcgcaaat ttgcacgggcat 1 M R V S K T L T F D A A NOQ L V G N F G K CA N L N C N 1434 act tacaaagtcgaaat t tcat tagcaggcggaact tatgaccacggt tcgagt caagggatggttgt tgactt ttatcacgt c 29 T Y K V E I S L A G G V Y D H C S S Q C M V V D F Y N V WO 00/32825 PCT/I B99/02040 377 1518 
aagaaaatcgcaggtacattcattgacagacttgaccacgctgittEcttctaagggaatgaaccaatcgctttagcaaatgca 57 K K I A G T F I 0 R L 0 H A V L L Q 0 N E P I A L A N A 1602 gttgacaccaagcgagttctatttggatttagaactacggctgagaatatgtcaagattccttacctggactctcacggagett V D T K R V L F G F R T T A E N M S R F L T W T L T E L 1686 ar.gtggaagcatgctcgtatcgactctatcaaactatgggaaactcctacaggttgcgcagaatgtacttactacgagattttc 113 M W K H A R I D S I K L W E T P T G C A E C T Y Y E I F 1770 acagaagacgagat tgaaatgt tcaagaacgtaacct ttat cgacaaagacgaaaagattact gt ccgcgaaat tttagagcag 141 T E D E I E M F K N V T F I D K D E K I T V R E I L E Q 1854 gagcaggataatggttaa 1871 169 E Q D N G dplORFO3 9 3306 atgaataaaagtgcaaccttttggcttgttcgaacagctcttattgcggctctatatgtgacattgaccgttgcattttCtgct 1 M N K SA T F W L V R T A L I A A L Y V T L T V A F S A 3390 attagttatggacctattcaatttagagtcagtgaagccttgattcttctacctttatggaaccatagatggactccgggatt 29 I S Y G P I Q F R V S E A L I L L P L W N H R W T P G I 3474 gtattaggaacaattattgcaaacttctttcacctcttggactgattgacgttttattcggttcacttgctaccttccttgga 57 V L 0 T I I A N F F S P L G L I D V L F G S L A T F L G 3558 gtagtggcaatggtgaaagt tgccaagatggcaagt cct Ctatat tcactt at ctgtccagt tct tgct aatgcttaccttat t V V A M V K V A K M A S P L Y S L I C P V L A N A Y L I 3642 gcgctggaact tcgaatagtt tact ct ttacct tt ttgggaat ctgt catct atgtaggaat tagtgaagcgattat cgtttt a 113 A L E L R I V Y S L P F W E S V I Y V G I S E A I I V L 3726 atttcatacttccttatttccacgctggcgaagaacaatcattttagaacactgataggagcgaaaaatgggatttaa 3803 141 I S Y F L I S T L A K N N H F R T L I G A K N G I* dplORFO4 0 7192 gtgagctatactggaaaaatgttcgaggaagactttttcgaaggtgcaaaagactttgagaaagatgctttcacggtccgtcta 1 V S Y T G K M F E ED F F E O AK O F E K D AF T V R L 7276 tatgataccactaatggatttcgaggagttgcaaatccctgcgattatatagccgcaactaactttgggaccttgtttattgaa 29 Y 0 T T N G F R G V A N P C D Y I A A T N F G T L F I E 7360 ctgaaaactactaaagaagcttctttgagctttaataacatcactgataatcaatggttccagctatcacgcgcagatggatgc 57 L K T T K E A S L S F N N I T 0 N Q W F Q L 
S R A D G C 7444 aaat ttattct cgccggaatt t tagtgtattt ccaaaagcatgaaaagat tat atggt at ccaatt tcaagccttgaaaaaatt 8S K F I L A G I L V Y F Q K H E K I I W Y P I S S L E K I 7528 aaacggtctggagttaaaagcgtcaacccaaacttcatcgatgcagggtatgaagtttcttacaagaagcgtcgaactagattg 113 K R S G V K S V N P N F I D A G Y E V S Y K K R R T R L 7612 accattcctttccaaaatgttctagatgcagttgagcttcattacaaggagaaaagcaatggcaagacctaa 7683 141 T I P F Q N V L D A V E L H Y K E K S N G K T* dplORP04l 8208 at gc aaaaagacgt agacgt gaaaatgat tgacc ct aaac ttgaccgat taaaat acacagg tgat tgggt tgatgtacgaat t 1 M Q K D V D V K MI D P K L DOR L K Y T G D W V D V R I 8292 agttctatcactaaaattgacgccgacagcgccgatgtctcaagatgtcgaaaagtgcttcaaaaggctcaagtatattcagtg 29 S S I T K I D A D S A D V S R C R K V L Q K A Q V Y S V 8376 gcggcaggtgaatgcattaaaattgcacacggatttgctcttgaacttcctaagggatatgaagcaatcttgcatcctcgttcc 57 A A G E C I K I A H G F A L E L P K G Y E A I L H P R S 8460 agtctttt Eaagaaaactggtctaatcttcgtttctagcggagtgattgacgaaggttacaaaggtgacactgatgaatggttc S L F K K T G L I F V S S G V I 0 E G Y K G D T D E W F 8544 tcagtttggtatgctactcgtgacgcagatatcttctacgaccaaagaattgcccaatttagaattcaggaaaagcaacctgct 113 S V W Y A T R 0 A D I F Y D Q R I A 0 F R I Q E K Q P A 8628 atcaagttcaatttcgtagaatctttaggaaatgcggctcgtggaggccatggaagtacaggtgatttctaa 8699 141 I K F N F V E S L G N A A R G G H G S T G D F dplORFO42 48082 gtggcaaggcaaagaataggcaattcaggaaagcctaaaaatgaaattgaactaacattcaaagacaagcctaaaactcgttct 1 V AR Q RI G N S G K P K N E I E L T F K D K P K T R S 48166 acct tat tcaagaaggacgtggcaacaggt cttt caaaagt cgagcatgat tatt ttcaaat agt tgaagcacttaacggaaaa 29 T L F K K D V A T G L S K V E H D Y F Q I V E A L N G K 48250 caattcgaacctaatatgaagcaggtgtcatctttctttatagttcagtatgaatttattttcaatattaagtgcatcgattat 57 Q F E P N M K Q V S S F F I V Q Y E F I F N I K C I D Y 48334 aactggttcaacttttcgagcactatgaaaaatgttcgaacttatttaaacattgagtcgaacattgaactttgtcgattttta N W F N F S S T M K N V R T Y L N I E S N I E L C R F L 48418 
gctgaaagttttgttaaatatgaaaatgttcgaaaaagattgaacctaagcgaaaggttcataacggtctcgactttcaaaaga 113 A E S F V K Y E N V R K R L N L S E R F I T V S T F K R 48502 gcctggattggacgaactcgaaggaaaaacgggttcaaaattcgaaggattttatag 48561 141 A W I L D E L E G K T 0 S K F E G F Y dplORF043 i69 aQ roactaatat tatcacaqictqagcagt ttaagcaact tgca~tt caaatctcgcact t ccaggat ttt caaaaggt agtgaa 1 M T N I I T A E Q F K Q L AF Q I1 AL 1, F G S EZ 31783 cc tat ccatgt taaaat cgagcagcaggtgt catgaacctaatcgct aacgggaaaat ccct aatacgct tt taggtaaagtg 29 P I N V K I R A A 0 V M N L I A N G K I P N T L L 0 K V 3186? acagaactgtttggagaaacttcgacagtcactaaagacaatgctagtctagcatcaattactgaccaacagaagaaaga~gcg 57 T E LF G ET S T V T K DN A S L A S IT D QQ K-LK-E A 31951 ct cgaccgattgaacaaaaccgat accggtatt caagacatggctgaact tctt cgagtat t cgcagaaget tcaatggtagag L 0 R L N K T D T G I Q D M A E L L R V F A E A S M V E 32035 cctacttacgctgaagtcggcgagtatatgacagatgagcaacttatgacaatcttcagtgcaatgtacggtgaagtgactcaa 113 P T Y A E V 0 E Y M T D E Q L M T I F S A M Y G E V T Q 32119 gctgaaacctttcgtacagacgaaggaaatgtctaa 32154 141 A E T F R TODE O N V WO 00/32825 PCT/I B99/02040 378 dplORFO4 4 25666 atggtaagtgt tttgattagcagcagct cct tttgaagtt cctgct tcatt ttagct cgacaagtat tt ctaaat cgaataag 1 M V S V L I S S S S F L K F L L H F S S T S I S K S N K 25562 gttttcaatttccttgtttcctacataagtggtgaaccgataatggcacttaggacattcgaagaatctccactctacgccctt 29 V F N F L V S Y I S G E P I M A L R T F E E S P L Y A L 2S498 ttcgatatgtttcgaaacaatctgtttagatgtaaggtcgaacttatgctcacaatggtcacaattaaccttgaacgtctgggt 57 F D M F R N N L F Rt C K V E L M L T M V T I N L E R L G 25414 cgactccttcttcggttggttgttcagtttgtctttttcttcatcaacttcgtcttcttcactcgtttcatctaggc R L L L R L V V Q F V L F L C H Q L R L L N S F H L E A 25330 cctcttgttcgtttaattcgtttgctaatacaggcaatgctccagctgagatttcgtcaagctgagcaagttcttccaaaatgc 113 P L V Rt L I R L L I Q A M L Q L ft F Rt Q A E Q V L P K C 25246 gttcccattccttgtccgccttttccttcttactga 25211 141 V P I P C P P F P S f Y 25340 
atgaaacgagtgaagaagacgaagttgatgacaaagaaaaagaacaaactgaacaaccaaccgaagaaggagtcgacccagacg I M K R V K K T K L M T K K K N K L N N Q P K K E S T Q T 25424 tt caaggttaattgtgaccattgtgagcataagt tcgaccttacat ctaaacagat tattt cgaaacatat cgaaaagggcgt a 29 F K V N C D H C E H K F D L T S K Q I I S K H I E K G V 25508 gagtggagattct tcgaatgtcctaagtgccattatcggt tcaccacttatgtaggaaacaaggaaattgaaaaccttattcga 57 E W R F F E C P K C H Y R F T T Y V G N K E I E N L I R 25592 tttagaaatactgtcgagctaaaatgaagcaggaacttcaaaaaggagctgctgctaatcaaaacacttaccattcatatcga F R N*T C R A K M K Q E L Q K G A A A N Q N T Y H S Y Rt 25676 at tcaggatgagcaagctgggcataaaat ct cagggc t tatggcgaagctaaagaaggagat aaa cat tgaaaaacgagaaaaa 113 I Q D E Q A G H K I S G L M A K L K K E I N I E K R E K 25760 gaatgggtatctatatag 25777 141 E W V S I dplORFO4 6 42774 a tgc caat gtggct aaacgacacagcagt ct tgacgacgat tat t acag9c agcggagtgct t actgt cctact aaat a a 1 M P M W L N D T A V L T T I I T A C S GCV L T V L L N K 42858 ttattcgaatggaaatcgaataaagccaagagcgttttagaggatatctctacaactcttagcactcttaaacagcaggtcgac 29 L F E W K S N K A K S V L E D I S T T L S T L K Q Q V D 42942 gggattgaccaaacgacagtagcaatcaatcaccaaaatgacgtcattcaagacggaactagaaaaattcaacgttaccgtctt 57 C I D Q T T V A r N H Q N D V I Q D G T ft K I Q R Y Rt L 43026 tatcacgacttaaaaagggaagtgataacaggctatacaactctcgaccattttagagagctctctattttattcgaaagttat Y H D L K R E V I T G Y T T L D H F Rt E L S I L F E S Y 43110 aagaaccttggcggaaatggtgaagttgaagccttgtatgaaaaatacaagaaattaccaattagggaggaagatttagatgaa 113 K N L C G N C E V E A L Y E K Y K K L P I Rt E E D L D E 43194 actatctaa 43202 141 T I dp10ftF047 47542 atgaaatttgaagatgaaaaacagttcatcgctgcaattgaagaagccggtgaattaaatgctaccaaagggacatggagaaa 1 M K FE D E K Q F I A AlI E E A C E L NA T K C D M E K 47626 caagtcaaaagtcttcgtgatgctctaaaagagtacatgaaagaaaatgacattgaatctgctcaaggtaagcacttttctgct 29 Q V K S L R D A L K E Y M K E N 0 I E S A Q C K H F S A 47710 
accttctacacgacagagcgctcaactatggacgaagaacgcttgaaagaaattatcgaaaaattagttgacgaagccgagacg 57 T F Y T T E R S T M D E E R L K E I I E K L V D E A E T 47794 gaagaaatgtgtgaaaaactt tcagggct tat cgaatacaagcctgt catcaatacgaaactt ct caggatatgatt tatcac E E M C E K L S C L I E Y K P V I N T K L L E D M I Y H 47878 ggcgagattgaccaagaagcaattcttccagcagttgtcatttctgttacagaaggcattcgttttggaaaggctaaaatttag 47961 113 C E I D Q E A I L P A V V I S V T E C I ft F C K A K I dplORF04 B 16709 atggaaacaacactttatttcggttatcttacagcagattggaaagacggtcacaagaactacactttccactatgaaagcatt 1 ME T T L Y F C Y L T AD W K DCG H K N Y T F H Y ES I 16625 cctgtaaaagaaactgagaaacaatataaggtcactggaatcaatcctaacttgtacttagacctaggctcagttattagaaag 29 P V K E T E K Q Y K V T C I N P N L Y L D L C S V I RK 16541 agcgaact tgacat tgc agtat tcaaagcatgt cctgtcg ct gaa actggagt cacact tact cgcgacatggaagt tgatgct 57 S E L D I A V F K A C P V A E T C V T L T ft D H E V D A 16457 agaattgaaatcatcaagaaattaactacaagaatcgaacgccttaacgaaagaattaaagcaagaaatgaacaaggaaacaa Rt I E I I K K L T T R I E ft L N E R I K A ft N E Q C K Q 16373 gaaagccgccacctagtatctgcgctagaagattgcgctcgtcaaattgctggaatttatcaataa 16308 113 E S R H L V S A L E D C A R Q I A G I Y Q dplORF04 9 44018 atgtt tcaaccat t tctcagcgagcatgtagcct tggt cgt caaagtagaaccaagact tgt tttct t cgat at actcgaact c 1 H F Q P F L S E H V A L V V K V E P R L V F F DI L E L 43934 atc ttgtag cg tccactacgacag ga c cqca~~q tt tC-Cgt 29 I F W I S S V C S S V P E T S S I F L P A K F L L S R L 43850 agcat ttgcgt agt caagcgat agacgtagt agt aaggt tgac ctgcat agtaccaacgct cat cgtggt cgt tgacgaaat 57 S IC V S Q AI D V V V ft L T C IV P T LI V V V D CN 43766 t ccgtcgtaggcgtagttgcagtgaat gatgttatcact gt caatgaacatccctgtatgacct ccagcgcctqqg"agcacc S V V C V V A V N D V I T V N E H P C M T S S A-C A S T 43682 tttgcgtccccagatgaagatgtcgcctcgtttagcatcccacggagcattttcactaattag 43620 113 F A S P D E D V A S F S I P ft S I F T N 15081 
atgaacaatcagcgaaagcaaatgaacaaacgaatcgtcgaacttcgcgaagactatcaacgtgcaagaggtcgaataaacttc 1 M N N Q R K Q M N K R I V E L R E D Y Q R A R G R I N F
[Sequence listing page 379: remaining sequence data not recoverable from the scanned source.]
MUM Om WO 00/32825 PCT/I B99/02040 380 9859 atgcaaaaatctctatttggacctaagctagtgcctgctagt tcaaggcgcaagaaaagaacggttccaaaacctaaacctaaa I M Q K S L F G P K L V P AS S R R K K R T V P K P K P K 9943 atcgatgagcaagtggttgagcttatgaaccgcagagagcgtcaagtgcttgttcatagttgcatctattattattttaatgac 29 1 D E Q V V E L M N R R E R Q V L. V H S C 1 Y Y Y F N D 10027 tcaattatagcagacgggcagtatgacaaatggagccacgaactatattctcttatagtttcgcaccctgatgagtttcgacag 57 S I I A D G Q Y 0 K VI S H E L Y S L I V S H P D E F R Q 10111 actgttctctataacgagttaaacagtttgacggaaatactggaatgggtcttccatacgactgtcagtttgctgtaagggtc T V L Y N E F K Q F D G N T G M G L P Y D C Q F A V R V 10195 gcagaaaggcttttaagaaaatga 10218 113 A E R L L R K dplORFO58 15633 atgacat cacgcgcatacaaaccaat tcccacgcgcagagct agtgctaaacaagagaaggcagt tgct aagcagt tgggagga 1 M T S R A Y K P I P T R R A S A K Q E K A V A K Q L G G 15717 aaagtacagcctaattcaggagccactgactactacaaaggtgacgtcgtaacagactcaatgcttatagaatgcaagacagtt 29 K V Q P N S G A T D Y Y K G 0 V V T D S M L I E C K T V 15801 atgaagccacaaagttcagtcagct Sgaaaaaggaatggttcctaaaaaatgaacaggaaaggttcgctcaaaaactcgactat 57 M K P Q S S V S L K K E W F L K N E Q E R F A Q K L D Y 15885 t ctgctatcgct tt cgactt tggtgacggaggcgaacagtat atagcaatgt ctataagt cagt t caagcgaat attagaggat S A I A F D F G D G G E Q Y I A M S I S Q F K R I L E D 15969 agaaatgataaccttatttaa 15989 113 R N D N L I dplORFOS9 30154 atgtctcagcctgaattagtatggaagcctgaagaatttgttagtaactgtgaacggtatcgaaacaagtttcaagtcgctgtc 1 M S Q P E L V W K P EE F V S N C E R Y R N K F Q V A V 30238 ataacagt ctgcgaagt cgctgct act aagatggaagaat acgcaaagacgcatgctatttggacagaccgt acagggaatgct 29 I T V C E V A A T K M E E Y A K T H A I W T 0 R T G N A 30322 cgacagaaactcaaaggagaagctgctigggtaagcgcagaccaaatcatgatagctgtatcacatcacatggactacgggttt 57 R Q K L K G E A A W V S A 0 Q I M I A V S H H M D Y G F 30406 tggctagaactagctcatggtcgaaaatacaaaattctcgaacaggctgtagaagacaatgtcgaagaactttttagagcgttg W L E L A H G R K Y K I L E Q A V E 0 N V E E L F R A L 30490 agaaggttattagactag 
30507 113 R R L L D dp1ORFO6O 38070 gtgat agctgtatctgctatccc tact ccgct ct ttccaggt acaccgtcgact ccatcacgcccaggagct cccggtaaacct 1 V IA V S A I P T P L F P C T P S T P S R P G A P GK P 37986 gcgtcacct ttaggacct t ctagt cgaat ccatgtaaagt cgt caggaactaatt cgct cggt ttc tat tagtattaaggaca 29 A S P L G P S S R I H V K S S G T N S L G F L L V L R T 37902 ccaatgt at ttcccagat tctgcattaaaat tagt ccctaaaatgt catc tgcgtatctaataacaact tgggactcat ttaca 57 P M Y F PODS A L K L V P K M S S A Y L I TT W D S F T 37818 gt ttcccctgaaaggactcct tcgccgt cctcat ttagcaagt ccat caagt ctt t tcgagggt ct tggaaaatgat agtagag V S P E R T P S P S S F S K S I K S F R G S W K M I V E 37734 tttgaaaggtcgtcgtag 37717 113 F E R S S dplORFO61 19475 atggcgagaatgcaaagat tatgcccgatgaaat t ttggaaggcggtaactaaaatgaaat tcgaagt ttatt ctgcqcgacta 1 MA R M Q R L C P M K F W K A V T K M K F E V Y S A R L 19391 tttgacgaagaggcgacatatgataggtatcgtgaagcactagagaaagttggaaatgtcgcttacttttgtgaaattgatact 29 F D E E A T Y D R Y R E A L E K V G N V A Y F C E I D T 19307 ggcaaccttgtaatcgaactcgagctagacagcctagatgacctaatcgcgcttt caaatgtagtgggaactggactaaaatta 57 G N L V I E L E L D S L 0 D L I A L, S N V V G T G L K L 19223 tcacggccttatagagaagataagccttttcaattatggattgttgacgggtacatggaataa 19161 S R P Y R E D K P F Q L W I V D G Y M E dplORFO62 45284 gtgagaagcttcaatcaattccattgcggtgtcaatatcttcttccttgacgagtttaaaaattccgtcaatcgcccattcgta 1 V R S F N Q F H C GV N I F F L D E F K N S V N R P F V 45200 agatgcaggagcaatagatgcaagaagt ttct tttggtct tctgt caaccct tttgcgcgaact ccaatagaaacacct tt tcg 29 R C R S N R C K K F L L V F C Q P F C A N S N R N T F S 45116 agtttcttcgatagtaacgaagttcttcttagagcgataggcgatgtaagactgagtgacgattcgagtagacgcaggaaaggc 57 S F F D S N E V L L R A I C D V R L S D D S S R R R K G 45032 tteaacaattcgactttcaagagccttagtaatcgccatcacgctttcttttttaggagcaggttttcgaacagtagatttctc F N N S T F K S L S N R H H A F F F R S R F S N S R F L 44948 actaactga 44940 113 T N dplORPO63 1 M K F T E C K N W Y K V G E I C Q M L N R S L S T I N V 
47284 tggtatgaagcaaaagact tcgctgaagaaaataacat tcact tcccgt tgt tct tcctgaacc tagaacagacct tgaccat 29 W Y E A K D F A E E N N I H F P F V L P E P R T D L D H 47368 cgtggtc tcgat c ggga tgacgaaggcgtgaacaaact caaacgat t tagggaca acc taa tgcgcggt gactvgcat tc 57 R G S R F W D D E G V N K L K R F R D N L M R G D L A F 47452 tacact cgaact cttgtagggaaaactgaaagggaagcaat tcaagaagatgc taaagcat t taaacgtgaacatggat tggag 85 Y T R T L V G K T E R E A I Q E D A K A F K R E H G L E 47536 aattaa 47541 113 N dplORF064
[Sequence listing page 381: sequence data not recoverable from the scanned source.]
Ot'0Z0/668 I/Jd SSVOOt MUM om WO 00/32825 PCT/I B99/02040 382 1 M F L R L Q V V S K V F Q L F V Q E S L Q F E O H L L S 50961 tcaaaatgcttcaactccttcccttgtaaccttacttcgaagacgagcagtcgacctagaggcttttgctttcaatggagagct 29 S K C F N S F P C N L T S K T S S R P R G F C F Q W R A 50877 ttcgcctttttcagttccttcttcgccttcctctttgaatcctataagagtataggttccagtttcaacgtcccacatatattc 57 F A F F S S F F A F L F E S Y K S I G S S F N V P H I F 50793 gatgatttttcggrcttcgccatatcggtitttaacgacagatag 50749 D D F S V F A I S V F N D R dplORP073 14262 gtgaacgcttgccggaagaatacgacgaagaaacttgggaacctatcactgaagcagaatacatcaagcgaacagaaaaaccta 1 V N A C R K N T T K K L C N L S L K Q N T S SE Q K N L 14346 aagcagttgcaaaacctactcgaaaaactccagcgccttctcgtcgccctcgcccttaaaagaaaggttgaaataaaatgtgtg 29 K Q L Q N L L E K L Q R L L V A L A L K R K V E I K C V 14430 aaaat tgt caaaacgaaacat tcaat act agaat tt tcaatgaagatgaaagtgtatgt cgacgcctcaittcactt acaagg 57 K I V K T K H S I L E F S M K M K V A M S T P H S L T R 14514 agattcgcgacaccgcagcagctattagcaatcgagcggtag 14555 R FA T P Q Q L L A I E R dplORFO74 32298 gtgacgaaaagaaaaatccaggattgcaaatgcttatggagtgactattttcagtcgctcctctttttgtatatagaaaggaaa 1 V T K R K I Q D C K C L W SODY F Q S L. 
L FL Y I E R K 32382 ttacatggatttrgggtcaattgcagcaaaaatgactttggatatctcaaacttcacaagtcaattaaatcttgctcaaagtca 29 L H G F W V N C S K N D F G Y L K L H K S I K S C S K S 32466 agcgcaacggctcgcactagagtcttcgaagtcctttcaaattggttctgctttaacaggattagggaaaggacttacgactgc 57 S A T A R T R V F E V L S N W F C F N R I R E R T Y D C 32S50 ggttacccttcctcttatgggattgcagccgcctctattaa 32591 G YP S S Y G I C S R L Y dplORP075 22447 atggcaaagttttgtccgttgaattccgtcatggcccaaagggaaaatgaaagagccatcgatactgtttttcctgaacgaatg 1 M A K F C P L N S V M A Q R E N E RA I D T V F P ER M 22363 gaaccgtctgctatgacgatatcgaaagttcgaaaaggtgagccctttgtccaccatgttaggagctggagttgtttcttacta 29 E P S A M T I S K V R K G E P F V H H V R S W S C F L L 22279 aaagggacgaagttgaacttaggtagtttatttctcaggcttattgtcattatcagtcactcctttaatgtaggaacctgttgc 57 K G T K L N L G S L F L R L IV I IS H S F N V G T C C 22195 gtcactaaattcttgccaaacggcttgagctgct ttat ctag 22154 V T K F L P N G L S C F I dplORFO76 5728B gtgagagcattttcttcactcacgtcttcgagcaagtggtcgaatgtagggtactcttcatcttctgtaacaatatcaatattg 1 V R A F S S L T S S SK W S NyVC Y S S SS V T I SI L 5644 tactcaccattcccaataacttttagcgaagattcttcaggaactaatgtgacggttgcggccgtggtcttttctacaagtttt 29 Y S P F P I T F S E D S S C T N V T V A A V V F S T S F 5560 ccaaactgctctgctttcacaatcacgtcaatttcaacatcgctgtcgataatgcatcgaaggaagtttgagccatcatacgct 57 P N C S A F T I T S I S T S L S I M H R R K F E P S Y A 5476 gtaaacatgacgcattcgccgtcaccaaaaatatgccaatag 5435 V N M T H S P S P K I C Q dplORF077 14800 atggaacgaataaagacgctatttcacgtgatttatgctaacggcactcattagaagtagcagctttgttcgataccgttgat 1 M E R I K T L F H V I Y A NG T H L E VA A L F D T V 0 14884 gattatgatgacgttatagaggacatccaggggtatattgatacccctgacctttataatcaaaggagcattagaatggcgcct 29 D Y D D V I E D I Q G V I D T P D L Y N Q R S I R M A P 14968 tacaatcctgacatcaatggtgacgctattgctactgacattttactacgactagatgatattatctacgtcgacgcaacttgt 57 Y N P D I N G D A I A T D I L L R L D 0 I I Y V 0 A T C 15052 gaaactattaaatacgaggagcctattgcatga 15084 E T I K Y E E P I A dplORP078 
17507 atggcaacagtaaaggaaacagtaaaatttgacggacgtcttgtaactatcttcgactacgacgatt tagagtgggaaggatat 1 MA T V K E T V K FODG R L V T I F D Y DODL E W E G Y 17423 gcacctaatgaaggattcgaagatgttgaggacatggaagtccttagcattcgagttcgaaacgaaggtgaggacgacgagtgg 29 A P N E G F E 0 V E D M E V L S I R V R N E G E D 0 E W 17339 gttgaagttatcgcctgctatgaaaacgatgacgaggacgaagatttggaagggttataa 17280 57 V E V I A C Y E N D 0 E 0 E 0 L E G L dplORFO7 9 35288 atggaactgataccattgataaatcctcgaacaaggttgacccctgcgcttaccatttgtccagcgaatccagtaaccttagaa 1 ME L I P L I N P R T R L T P AL TI C P A N P V T L E 35204 acaattgaagttcccatgctgccaattttagagacagctgaaccaatcattgacccaataccactaatgaagtttcgaatcagg 29 T I E V P M L P I L E T A E P I I 0 P I P L M K F R I R 35120 ttcgcacctcctgaaaccatctgtcccacaaagctagcaatcttgctaactaatgatgaaagcatgtttccagctgtcgataaa 57 F A P P E T I C P T K L A I L L T N 0 E S M F P A V 0 K 35036 agtgagccgagaagtgaagcaataccttga 35007 S E P R S E A I P dplORF08O 42490 atgttgaaccttacaaaatcgcgccaaattgtggcagagtcactattggacaaggagcgaaaagaiacttgtcal.acaacg 1 M L N L T K S R Q I V A E F T I GOQ GA E K K L-V K T T 42574 attgtgaacattgatgcaaacgcagtatcaaccgtctctgaaactcttcatgacccagacttgtatgctgcgaaccgtcgagaa 29 I V N I 0 A N A V S T V S E T L H 0 P D L Y A A N R R E 42658 ettcgagctgacgagcaaaaacttcgcgaaactcgttacgcaatcgaagatgaaat tctagctgaacagtcaaagactgaaaca 57 L R A 0 E Q K L R E T R Y A I E 0 E I L A E Q S K T F T 42742 gctctaacagctgaataa 42759 WO 00/32825 PCT/I B99/02040 383 A L T A E dplORPOB1 55466 atgttcaggaacagtatcgtccatctgttggtctgcgtcaaagttaaaggggtcgaaatcttcgttcttgctagcgtcgatata 1 M F R N S I V H L L V C V K V K G V ElI F V L AS V D 55382 ct cgaact cgtat tcaggaagact cat atcaggaagc ct t cttct tcgaccggt agctgt ttgaacatat cccaagt cctgcgc 29 L E L V F R K T H I R K P S S S T G S C L N' I S Q V L R 55298 ctcgtacattaaattcattggaccggaaactatacttccttta 57 L L L N E Y D I V C H F R E L G E E I F N N L I R F F D 55214 agatacattcatctgctcagcgattga 55188 R Y I H L L S 0 dplORFOS2 44728 
gtgaacttcacctttcagcttcaactctcgaacgttggtacacaatggaagatgaaactgaacctaaaaaagaagaagctgcta 1 V N F T F Q L Q L S N V G T Q W K M K L N L K K K K L L 44812 aacctgctaaaaaggctgctcctgcagttgctcgacctgctcgaaaaggtagagtcgttcccaaacctaaaaaagaagtccttg 29 N L L K R L L L Q L L D L L E K V E S F P N L K K K S L 44896 aggaagaaattcctgaagttaaggaacagccggaagaagttggttcagttagtgagaaatctactgttcgaaaacctgctccta 57 R K K F L K L R N S R K K L V Q L V R N L L F E N L L L 44980 aaaaagaaagcgtga 44994 K K K A dplORFO83 35974 atgccttcagggtttttaaatcctgagtccttaaatcctgcgaaagtgagtcctacatattctagcacggttgcacctttgtcg 1 M P S G F L N P E S L N P A K V S P T Y S S T V A P L S 35890 acaaggtcaattccgtcgaccaatagcgtctgtctgctagccatctatttctcctttacggtgttacaatgttaccaaaccctg 29 T R S I P S T N S V C L L A I Y F S F T V L Q C Y Q T L 35806 atagagttt ctt tact tctat tatacaat cct ct cgacagt gtcaacgt cgtcat tgtt tcgaactacgat tgt ccaatgt 57 I E F L Y F Y Y T I L S T V C Q R R H C F E L R L F Q C 35722 tga 35720 dplORP084 15445 atgaattatatggtaaaagtcatt ctagt tagtgt ct ttgt actgtcagcct tttgcatgact tgct caatggtt tat t ggt t 1 M N Y M V K V I L V S V F V L S A F C M T C S M V Y L V 15529 acaggtaagcaagaggaccaccgtagtaccgtcgcccttgtatttggcgctctcgtaagctctgcggcgttctattcgacactc 29 T G K Q E D H R S T V A L V F G A L V S S A A F Y S T L 15613 tttatcctcgcctatctgccatga 15636 57 F I L A Y L P dplORPOBS 10847 gtgatgactataatcaaggacttrtttcgagccttgtgatactgtcacgcattcctccatttgcaagtttcccaataaacgaaag 1 V M T I I1K D F F E P C D T V T H SS I C K F P N K R K 10763 ggcgtcacgctcataactataaccagctccttcttcattttcactttcgataataaattgaagttgattaacgatgtcgtcatt 29 G V T L I T I T S S F F I F T F D N K L K L I N D V V I 10679 atcaattcgagtaaagtcaaaccgttgaactcgactgagaatagtgtcaggaatcttttgagggtcagtagtacatag 10602 57 I N S S K V K P L N S T E N S V R N L L R V S S T dplORFO86 52760 atatgggaaaagtatcaat t caaaaat caggaacatt tagctcagggt ctaataacgagtt t ttcacact cgctgaccacggtg 1 1 W E K Y Q F K N Q E H L A Q GL I T S FS H S L T T V 52844 
acagcgcaattgtcactctattgtatgatgacccggaaggcgaagacatggattatttcgtag 52906 29 T A Q L. S L Y C M M T R K A K T W I I S* dplORFO87 30036 atgattttgcct tcat catatagaatgaaaat ttt cact ccart ttgggcaaaaatt t ttc ccgcgt cagt agaattggct aaa 1 N IL P SS Y R M K I F T P F W A K I F P A S V E L A K 29952 aggt caggaacagttgaattat caact aaacaaacaaggt cgt ctgct acgact tcat tcgct ttat ccrtt tt tct ttcct cca 29 R S C T V E L S T K Q T R S S A T T S F A L S F F F P P 29868 tat ccat cactgacccaagagt t tcgaagtacct tgatt t tagt aggagcggt tt caatggct ctacgaact tga 29794 57 V P S L T Q E F R S T L I L V C A V S N A L R T dplORPO88 5040 atgaaaaaagttcaaacttatcaagaatatctaaaactagttgagttcaaacgtcaactttctttaaatcttcgagaaggaaaa 1 M K K V Q TVY Q E Y L K L V E F K R Q L S L N L R ECGK 5124 ataggagtcgatgaagcggttattcaattattcaccttctatagtttcaacaatatcgaggaacctcctttcattgtactcaaa 29 I C V D E A V I Q L F T F V S F N N I E E P P F I V L K 5208 atgcaagaggctgccgtgaacgggacttatgaagcaaaactcaatatgcttaaaagatttaaaattatttag 5279 57 M Q E A A V N C T Y E A K L N M L K R F K I I dplORFOS9 12495 attatagccacaatggataaaaaagcttacgggcgcttccacc 1 M S I M S L S I V E Y LODT K C L F N C A S V IF S N S 12411 acacaat tat caggaaaggcc t tagcaact tgc ttcgct tgtcaat t ttagtaaccat caaaacaagtgtcccat at ctaaca 29 T Q L S C K A F S N L L R L S I L V T I K T S V P V L T 12327 tccggaagccttttccacctcgactcattagacagaaactccttatcatctcgaacagcgaatattcgatga 12256 57 SCG S L F H LODS L O R N S L S S R TA N I R 27037 atgctaaaattttcattgacggcgacggtcaacattttgtacctcacgcacgtttcgatgaagttgttcaavagcgcgatgcag 1 M L K F S L TA T V N IL Y L T H V S M K L F N S AM Q 27121 ctaacggctcaattaattcttataaagaacaagtcgcgacgctttctaaacaggtcaaagataacggtgatgcgcagaccacta 29 L T A Q L I L I K N K S R R F L N R S K I T V M R R P L 27205 tccaaaaccttcaagagcaactcgacaagcagtctcaacttgcaaaaggcgctgtga 27261 57 S K T F K S N S T S S L N L Q K A L WO 00/32825 PCT/I B99/02040 384 dplORFO91 43189 atgaaactatctaacgaacaatatgacgtagcaaagaacgtggtaaccgtagtcgttccagcagcgattgcactaat tacaggt 1 M K L S N E 
Q Y D V A K N V V T V V V P A A I A L IT G 43273 cttggagcgttgtatcaatttgacactactgctatcacaggaaccattgcacttcttgcaacttttgcaggtactgttctagga 29 L G A L Y Q F D T T A I T G T I A L L A T F A C T V L G 43357 gtttctagccgaaactaccaaaaggaacaagaagctcaaaacaatgaggtggaataa 43413 57 V S S R N Y Q K E Q E A Q N N E V E dplORFO92 46989 atgaaaactatctccatattaaggaaagacactaaaaggaagccggacaggaacggaagaaaaactgcactcgaactagctcaa 1 M K T I S I L R K D T K PR K P D R N C R K T A LE L A Q 47073 gagattgatatgtcacctagtgagttagcagagctccttcaaattcctgaaaggacggcaaccagaattttaaaactcgacaaa 29 E I D M. S P S E L A E L L Q I P E R T A T R. I L K L D K 47157 ctgctcaacaaagagcaatgctcaataatagaaaggtatataaatgaaattcactga 47213 57 L L N K E 0 C S I I E R Y I N E I H dplORFO93 45756 atgcaacatacgattaaacaatgtttgaaacttgccttcctgctaactgcaatatcaattgcctgtttagttttccctaaacct 1 M Q H T I K Q C L K L A F L L T A I S I A C L V F P K P 45672 tgctcatcgcctaaaaggaaacatggatgctcttgtgcgtattcgaaacattcaacctggtgcgcgaatggagtagtcttgaac 29 C S S P K R K H G C S C A Y S K H S T W C A N G V V L N 45588 gaaaaetgctcattgcttgaagaagctattcggtttcgagagtcaatgtag 45538 57 E N C S L L E E A I R F R E S M* dplORFO94 8281 atgtacgaattagttctatcactaaaattgacgccgacagcgccgatgtctcaagatgtcgaaaagtgcttcaaaaggctcaag 1 M Y E L V L S L K L T PT A P M S Q D V E K C F K R L K 8365 tatattcagtggcggcaggtgaatgcattaaaattgcacacggatttgctcttgaacttcctaagggatatgaagcaatcttgc 29 Y I Q W R Q V N A L K L H T D L L L N F L R D M K Q S C 8449 atcctcgttccagtctttttaagaaaactggtctaa 8484 57 I L V P V F L R K L V dplORP095 8877 gtgggaaaactacttcagctctcgacattgtcaagaatgcgcaaatggtatttgagcaggaatgggaacagaagactgaagaac 1 VCG K L L Q L S T L S R M R K W Y L S R N G N R R. L K N 8961 tcaaggaaaagctggaaaatgcgcgtgcatccaaagctagcaagactgCtgtCaaggaacttgaaatgcaactcgatagtcttc 29 S R K S W K M R V H P K L A R L L S R N L K C N S I V F 9045 aagagcctcttaagattgtatatcttgaccttgagaatacattag 9089 57 K S L L R. L Y I L T L R. 
I H dplORFO96 46681 gtgat tcataaattrctt caat t tcgttgaacttatctgcggt t tct cctgtt accaggttgcat ttgactgt ctt cgaaagtat 1 VIMH K F F N F V EL I C G F S C Y Q V A F D CL R K Y 46597 cttagcaagaggttcaataacccccaattgctaaatatcacgcaggactttccttgctggatacattcctcgacaatttc 29 L S K R F N N L F P I A K Y H A C L S L L D T F L D N F 46513 gatacatctttcgaacttgcaagacttgacatcttgagtagttaa 46469 57 D T S F E L A R. L D I L S S dplORFO97 39100 atggacgggat tgaaat cttgatactgaccgacgt atgct cgtccgctgtcagtatgact aaat ccct caccgt ttggact att 1 M D G I EI L I LT D V C S S A V S M T K S L T V W T I 39016 agagaaagcgaggtgagt atat tgcgaacgtccgt cagct cctgcaggt ccaggaat tcct tgaagccct tgaggacct tgaag 29 R E S E V S I L R T S V S S C R S R. N S L K P L R T L K 38932 accttgaaetcctctaggacctgtttcacctatcttggaaactga 38888 57 T L N S S R T C F T Y L G N dplORFO98 43627 gtgaaaatgctccgtgggatgctaaacgaggcgacatct tcatctggggacgcaaaggtgctagcgcaggcgctggaggtcata 1 V K M L P. GM L N E A TS S S GD A K V LA Q A L E VI 43711 cagggatgttcattgacagtgataacatcattcactgcaactacgcctacgacggaatttccgtcaacgaccacgatgagcgtt 29 Q G C S L T V I T S F T A T T P T T E F P S T T T M S V 43795 ggtactatgcaggtcaaccttactactacgtCtatcgcttga 43836 57 G T M Q V N L T T T S I A* dplORFO99 38298 atgcaagt tcgccat ctgctactgaagct ccagctggtggatggtctacgcaagtt cct accgt cccaggtggt cagtatt tat 1 M Q V R H L L L K L Q L V DCG L R K F L P S Q V V S I Y 38382 ggactcgaacaagatggegctacactgaccaaactgatgaaattggatattcagtttcaagaatgggcgagcagggtcctaaag 29 G L E Q D G A T L T K L M K L D I Q F Q E W A S R V L K 38466 gtgacgcaggtcgtgacggtattgcaggaaagaacggaatag 38507 57 V T Q V V T V L Q E R T E dplORFIOO agca gcaccaagjL a t tya gacagogatttcagttc.t c.gc c a-: 1 M Q L T P S E F Y L D L E L R L R I C Q D S L PCG L S R 1681 agcttatgtggaagcatgCtCgtatcgactctatcaaactatgggaaactcctacaggttgcgcagaatgtacttactacgaga 29 S L C C S M L V S T L S N Y G K L L Q VA Q N V L T T R 1765 ttttcacagaagacgagattgaaatgttcaagaacgtaa 1803 57 F S Q K T R L K C S R T dplORFl1 19220 gtgataattttagt 
ccagttcccactacatttgaaagcgcgattaggtcatctaggctgtctagctcgagttcgattacaaggt 1 V I I L V Q F P L H L K A R L G H L G C L A R V R L Q G 19304 tgccagtatcaatttcacaaaagtaagcgacatttccaactttctctagtgcttcacgatacctatcatatgtcgcctcttcgt 29 C Q Y Q F H K S K R H F Q L S L V L H D T Y H M S P L R
[Page 385 of the sequence listing is illegible in the source scan (the page was extracted upside down). It apparently covers the remainder of dplORF101 through the start of dplORF112; the sequence data for this page could not be recovered.]
Ot'OZO/669 I/.LJI szgrvoo OM WO 00/32825 PCT/I B99/02040 386 29 Y P G D E K K N P G L Q M L M E dplORFll3 17715 atgaaaacagttaaagaagcaatcaaacaattcggtgatgaatggtggtacgaaattatcaacgaaaacggccaaatgattcaa 1 M K T V K E A I K Q F G D E W W Y ElII N E N G Q M I Q 17631 gacggaagaatcgaagacatgggcgaatacatggaagaaacggtcgaccaagt taagttcatcaactatggtgacatcgaat ct 29 D G R I E D M C E Y M E E T V D Q V K F I N Y G D I E S 17547 caaattatcaaactatatatcgcataa 17521 57 Q I I K L Y I A dplORF1l4 52952 atgctattggcgaagacggggaaacagtccatcctgataattgtccattatgccaaaacggattccctcgtattgaaaaactat 1 M L L A K T G K Q S I L II V H Y AK T D S L V L K N Y 53036 ttcttcaact ttacaaccatgatacgggaaaagttgaaacatgggaccgaggccgttcttatgttCaaaagattgttacattta 29 F F N F T T M I R E K L K H G T E A V L M F K R L L H L 53120 tcaataaatatggaagccttgtga 53143 57 S I N M E A L dplORF1lS 5342 atgagcctcctttttttgatatatataatatacacgaattatcgcgagtttgtaaagccgtttctaaataattttaaatctttt 1 M S L L F L I Y I I Y T N Y RE F V K P F L N N F K S F 5258 aagcatattgagttttgcttcacaagtcccgttcacggcagcctcttgcattttgagtacaatgaaaggaggttcctcgatatt 29 K H I E F C F I S P V H G S L L H F E Y N E R R F L D I 5174 gttgaaactatagaaggtgaataa 5151 57 V E T I E G E dplORF1l6 20662 atgaaattttcaaactttgctaaagcacttactaatgaatacctaatggtagtgaacaatgaccaagctgaagtcttaggcgca 1 M K F S N F A K A L T N E Y L M V V N N D Q A E V L G A 20578 ggaaatatcgaaaacattctcaacggtcgaacttgctaagttgtagctgaagcgacagttttaaaactcgaaaaactcagc 29 G N I E N I L N G S N F A N V V A E A T V L K L E K L S 20494 gaagaggaagctattgagtag 20474 57 E E E A I E dplORPll7 24680 atgataacaggctgctcgaacattttaaatcgaagtgaatctcgtaagtcactaatagttttgttcaagttatctgctactgtg 1 M IT G C S NI L N R S E S R K S LI V L F K L S AT V 24596 ataaggtctttgacatcgcttgtcccgtatatgtcattagtcaatggttcattaagaataactcgacaaggaatttgcttcaag 29 I R S L T S L V P Y M S L V N G S L R I T R Q G I C F K 24512 ccggttggggcggattcttga 24492 57 P V G A D S dplORP1l8 15023 atgatattatctacgtcgacgcaacttgtgaaactattaaatacgaggagcctattgcatgaacaatcagcgaaagcaaatgaa 1 M 
I L S T S T Q L V K L L N T R S L L H EQ S A K AN E 15107 caaacgaatcgtcgaacttcgcgaagactatcaacgtgcaagaggtcgaataaacttccttcttgctgtaaaggaccacgggCa 29 Q T N R R T S R R L S T C K R S N K L P S C C K G P R R 15191 agaactcgaaaaccttga 15208 57 R T R K P dplORF1l9 41054 atggaggttcaacatccccgattcagtacgtcctactttttcgggcatttctttagtagacacgacttcagcggttcgacagat 1 M E V Q H P R F S T S Y F F G H F F S R H D F S G S T D 41138 tttaacagggaacaacttcctccaaatcatgtcgaacattcaagtcaacttcaacaatgcttccggcgcttacggatccactat 29 F N R E Q L P P N H V E H S S Q L Q Q C F R R L R I H Y 41222 ccaagcatttcacgctga 41239 57 P S I S R dplORPl2 0 28387 gtgttgaagcgcaagcagaatacatgcgtatgcaattgcttcaatacggtaaattcactgtcaaatcaactaacagcgaggctc 1 V L K R K Q N T C V C N C F N T V N S L S N Q L T AR L 28471 aatacacttacgactacaacatggatgctaagcaacaatatgcagtcactaagaaatggactaacccagctgaaagtgacccta 29 N T L T T T T W N L S N N M Q S L R N G L T Q L K V T L 28555 tcgctgacattttag 28569 57 S L T F dplORPl21 39222 gtgcagacggatcacgtgagttcagtttggaagataataatcaacaatatatgggttattact ccgattatgagcaagcagata 1 V Q T D HNV S S V W K IlIIl N HI W V I T P I M S K Q I 39306 gcagggatcgaactaagtatcgatggtttgaccgccttgccaatgttcaagtgggaggtcgaaacgagttccttaattctttat 29 A G I E L S I D G L T A L P M F K W E V E T S S L I L Y 39390 ttgaatttggtttaa 39404 57 L N L V dplORPl22 40402 atgttattctccttatcctacataccgaatcacgttcatgtctggattaaacgagtattgttccgttctaaatcggccgacttg 1 M L F S L S YI P N H V H V W I K R V L F R S K S AD L 40318 aatggattgggtaaagatcccgE tatcgatgtgaatgaacccttgcgtaaggtacataacttcattccctgcggagaacataga 29 N G L G K D P V I D V N E P L R K 'V H N F I P- C G__E--HB 40234 aattcggtcacttga 40220 57 N S V T dplORPl23 21327 atggttcgacttttcgaaggattgaggttttcgaaccggttgagtttttcgagcattctcgacttttcgacccctttctatgct WO 00/32825 PCT/I B99/02040 387 M V R L F E G L R F S N R L SF5S S ILOD F S T P F Y A 21243 cgact t ttcgagtgt t ttgaggtt tcgagcaggt tcgact ttt cgagaaat tgagt ttt tcgac ctct aaat taggctcgat 29 R L F E C F E V F E Q V R L F E K L S F S T S K L G S I 21159 
attcgaaaagtttag 21145 57 I R K V dploRP124 17891 atggtaaaagt taaagat ttgcaagtaggaatgaaagt tgtaaatgcaaaaggt actgaat t taaagtaactgaccgt caaggt 1 M V K V K D L Q V G M K V V N A K G T E F K V T D R Q G 17807 cgtaaatgggtaagcctagaacgtcttagtgatggacgtattcggttctatgataacgaatcactaatggacgaaaaagtggag 29 R K W V S L E R L S D G R I R F Y 0 N E S L M D E K V E 17723 gtagtaaaatga 17712 57 V V K dploRP125 49916 atgtcct cagccgcttccgt taaaat tggaacaagtgaat tatatagatgct cct ctt ttagct tgt cgataaggt at tcat ca 1 M S S A A S V K I G T S E L Y R C S SF S L SI R Y S S 49832 gtt tcgccaat tt cgaaaaat tcgaat ccaggaaaatggt cgagaatagt tt cgt cgtccggaact ct tccatat ctcgaaaag 29 V S P I S K N S N P G K W S R I V S S S G T L P Y L E K 49748 tgttcttga 49740 57 C S dplORFl26 16136 atgagctcaagtacgttttctcgaacaatagggtcaagtccagttatatcaacgaactgtatatcgtcctcttgtataggaata 1 M S S S T F S R T I G S S P V I ST N C I S SS C I G I 16052 aggtctgcgtacagttgcatggctgaccctttaattgagaactgttccttcactgtttattttaaataaggttatcatttct 29 R S A Y S C M A D P L I G V T V P S L F I L N K V I I S 15968 atcctctaa 15960 57 I L dplORF127 13511 atgctaaatagctttcccattcaccgtcgctgttcttgcgccatttttcagtttcacgatactgaccaactttgcaaaggtcgt 1 M L N SF PI H R R C S C A I F QOF HODT D Q L C K G R 13427 gaaatagtgctacgattgcaactgtttccattgggtaaatgtcttcccagcctttgcctaccatggtatccatttcgaaaagta 29 E I V L R L Q L F P L G K C L P S L C L P W Y P F R K V 13343 gttgattga 13335 57 V D dplORF128 4852 atgacagcagttcaacaagttaagtt ctacttagaagaagccggcgctcactttctaaaagatgttgagtacagtgacaactta 1 M T A V 00Q V K F Y L EE AG A H F L. 
K 0 V E Y SODN L 4936 gagcaagcaattatgaaagatattcttaaatggaatggcgctcatagagatgagcacgacatgaaaataacttcatacgaagta 29 E Q A I M K 0 I L K W N G A H R 0 E H 0 M K I T S Y E V 5020 ttatag 5025 57 L dplORF129 25133 atgaactttctgctaagcaacttgcgctcactgaagttcaaactaatgtacgcagccaccaatcttacattgaagaattcagta 1 M N F L L S N L R S L K F K L M Y A A T N L T L K N S V 25217 agaaggaaaaggcggacaaggaatgggaacgcattttggaagaacttgctcagcttgacgaaatctcagctggagcattgcctg 29 R R K R R T R N G N A F W K N L L S L T K S 0 L E H C L 25301 tattag 25306 57 Y dplORFl3O 16789 gtgcttgact ttattcct ttat tat cgtat aat cat aatataaataaaacaagcgt caaggacgcagaaagaggt caat tatgg 1 V L O F I P L L S Y N H N IN K T S V K D A E RG Q L W 16705 aaacaacactt t attt cggt tat ct tacagcagat tggaaagacggt cacaagaactacact t tccactatgaaagcat tcctg 29 K 0 H F I S V I L Q 0 I G K T V T R T T L S T M K A F L 16621 taa 16619 57 dplORF13l 43846 atgctcaaceggctgagaagaaacttggctggcagaaagatgctactggtttctggtacgctcgagcaaacggaacttatccaa 1 M L N R L R R N LA G R K M L L V S G T LEOQ T EL 43930 aagatgagt tcgagtatatcgaagaaaacaagt ct tggtt ctactttgacgaccaaggctacatgct cgctgagaaatggt tga 44013 29 K M S S S I S K K T S L G S T L T T K A T C S L R N C* dplORFl32 15304 gtgactggaaggt cat ctaatacacat agcctcaagacatt tcgt tggct t tcaggaaaacat tcgactagat tgt caatgt at 1 V T C R S S N T H S L K T F R W L S G K H S T R L S M Y 15220 cccacaaaggcttcaaggttttcgagttcttcgccgtggtcctttacagcaagaaggaagtttattcgacctcttgcacgttga 15137 29 P T K A S R F S S S S P W S F T A R R K F I R P L A R dplORFl3 3 8061 atgaett cttcattcatgacaagtt t tcgagt t t ccttgct tgt caggaatagt t ttcccggcggctaaaatgtatagal;ta 1 M T S SF M T S F R V SA C L S G I V F P A A-K M-YR L 7977 tcgtatttttctttcctgatagcagaacttgaatccatttgtattcccaccatttccgccctatctgcggcgaaataa 7900 29 S Y F S F L I A E L E S I C I P T I S A L S A A K dplORF13 4 498 atgacttcaatgtacttaggttccatcaattcatacaagtcattcaaaataatgttcatgcaatcttcgtggaagtcaccgtgg 1 M T S M Y L C S IN S Y K S F K IM F M Q S SW K S P W 414 
ttacggaaactgaataagtacaatttcaatgattagattcaaccatcttttcgtttggaatgtaa 349 WO 00/32825 PCT/I B99/02040 388 29 L R K L N K Y N F N D L D S T I F S F G N dplORll3 780 atgaagcagaact tgaaaatgctgctaacgt tgcaat gt ct acggagt caagt tcacca ttct tgaaat tgact cgaaaat ct 1 M K Q N L K M L L M L Q C S T ES S S P F L K L T R K S 864 actcaagctctagct cttcct tat tacaaggaaaaggcgaaat t tcacatggaaaat ct tacgctgaaat cct ag 938 29 T Q A L A L P Y Y K E K A K F H M E N L T L K S dplORF13 6 55252 gtgaagaaatcttcaataaccttattcgcttctttgacagatacattcatctgctcagcgattgagttagccccgcggccgtac I V K K S SI T L F A S LT D T F I C S AlI E L AP R P Y 55168 ataagacctaaaagaacggacttgacagaattcttcgaagttttccttccttgttagtcgttccgcgggatag 55094 29 1 R P K R T D L T E F L R S F P S L L V V P S G dplORFl37 37146 atgcttcgaacttgtttgttagcaccgtcaggaggacaaactagtcgaacccattcacctgcgt ctttgataatatctagcgcg 1 M L R T C L L A PS G G Q T SR TH S PA S L I I SS A 37062 acagcgcctacagaagaagcaacgtgtttcaacttcctaggcaagccttctgctagttcataccataatcgtag 36988 29 T A P T E E A T C F N F L G K P S A S S Y H N A dplORFl38 30662 atgact atategaagaacaatgtagt cat ccggcct atctgtat cttgct cgtcaaat tcaact cctggaagcataggagcagg 1 M T I SK N N V V I R P IC I L L V K F N S W K N R S R 30578 cgagagctgaaatgtaggaagaat t cct tcaat ctgtccatcat tgt cgtcgttagt catgttcactcctag 30504 29 R E L K C R K N F L Q S V N H C R S F S N V N S dp1ORP13 9 12092 atgatactaaatcactcaacttgtttgaccctcctgataaattcgttcacgcagacacgcgcat ttgagccctttttagatacc 1 MI L N N S T C L T L L I NS F T Q T R A F E P F L D T 12008 tttcgcaaacacctagatgcttccctcactaaaaggtcatgggcctcaagttcttcgaaagacatttctacatag 11934 29 F R K N L D A S L T K R S W A S S S S K D I S T dplORF14 0 20562 atgt tt tcgatat tt cctgcgcct aagactt cagct tggt cattgt tcactaccat taggt at tcat tagtaagtgct ttagca 1 M F S I F PA P K T S A WS L F T T I R Y S L V S AL A 20646 aagtgaaaatttcartttatttccctttatttgtttcttaactattattatacaataatgatga 20717 29 K F E N F I L F SL Y L F F F I LL L Y NN D dplORF14 1 42922 
gtgctaagagttgtagagatatcctctaaaacgctcttggctttattcgatttccattcgaataacttattagtaggacagta 1 V L R V V ElI SS K T L L A L F D F H S N N L F S P. T V 42838 agcactccgctgcacgctgtaataatcgtcgtcaagactgctgtgtcgtttagccacattggcatagattga 42767 29 S T P L H A V I I V V K T A V S F S H I G I D dplORP142 31898 gtgactgtcgaagttcccaaacagtcgcacttacctaaaagcgatagggatttcccgttagcgattaggttcatg 1 V T V E V S P N S S V T L P K S V L G I F P L A IR F M 31814 acaccigctgctcgaattttaacatggataggttcactaccttttgaaaatcctggaagtgcgatgatttga 31743 29 T P A A R. I L T W I G S L P F E N P G S A N I dplORF143 7565 atgaagtttgggttgacgcttttaactccagaccgtttaasstttttcaaggcttgaaattggataccatataatcttttcatgc I M K F G L T L L T P D R L I F S R L El I Y HNII F S C 7481 ttttggaaatacactaaaattccggcgagaataaatttgcatccatctgcgcgtgatagctggaaccattga 7410 29 F W K Y T K I P A R. I N L H P S A R D S W N N dplORY144 36517 gtgcaaatcaagcgactaacttatttagatacattaaacgaggcgcattcttcaagattcctaatggaaattcaacaattacca 1 V Q IK R. L T Y LODT L N E ANH SS P. F L N E I0 Q L P 36601 ttgaataccgagccgatgacgcagcagcttggacctctactcttcccgctcaagttgaactgtttctaa 36669 29 L N T E P N T Q 0 L G P L L F P L K L N C F dplORF145 42067 at ggaaacagc tggagacc taaeaagtggaa agaggt tc tart taagcaagac t cgaacagaat aat tggcagaa act gt tc 1 M ET A G D L T SG K R F Y L S K T S N R I ICGR N L F 42151 ttcaaagtgggtggaaccatcactcaacctatggcgacgcattctattcgaaaactcttgacggcatag 42219 29 F K V CG T I T 0 P M A T N S I R K L L T A dplORF146 51484 atgacaaactgcatgattgcatcacct ttccagtacggaacctcaagggcgaaacagtattcttcaaccgtcgaagtgttcgtt 1 M T N C MI AS P F Q Y G T S R A K Q Y S S T V E V F V 51568 ctaagtttcaccagtacggtgaagatgaccctaaaacggaatttctttatggccaatatgagcttgtag 51636 29 L S F T S T V K M T L K R N F F M A N M S L dplORP147 55207 atgtatctgtcaaagaagcgaataaggttattgaagatttcttcaccgagttccctaaagtggcagactatatcatattcgttc 1 M Y L S K K R IP. 
L L K IS S P SS L K W Q T I S Y S F 55291 aacagcaggcgcaggacttgggatatgttcaaacagctaccggtcgaagaagaaggcttcctgatatga 55359 29 N S R R R T W D M F K Q L P V E E E G F L I* dplORFl4 B 28636 gtgtttcggttcaagacca£ tcgagtagggcgaacacctgtacgattttcgatgtcatccattgctgctaaaatgtcagcgata 1 V F R F K T I R V C R T P V R F S M S S I A A-K M-S I I 28552 gggtcactcagctgggttagccatcttagtgactgcatatgttgcttagcatccatgttgtag 2&484 29 G S L S A G L V N F L V T A Y C C L A S N L dplOP149 26474 atgccattgaactttt cgagcataaggattaaccttgccccattgtctcactccagctgtggcggaatggctaatggtagttcg 1 N P L N F S SI R I N LA P L S H S S CG GM A NG S S 26390 agcaagtcgaagggcattgtattcgagattttgatartttatgagcagcaggtttccctag 26331 WO 00/32825 PCT/I B99102040 389 29 S K S K G I V F E I L I F M S S R F P dplORP150 15185 gtggtcctttacagcaagaaggaagtttattcgacctcttgcacgttgatagtcttcgcgaagttcgacgattcgtttgttcat 1 V V L Y S K K E V Y S T S C T L IV F A K F DODS F V H 15101 ttgctttcgctgattgttcatgcaataggctcctcgtatttaatagtttcacaagttgcgtcgacgtag 15033 29 L L S L I V H A I G S S Y L I V S Q V A S T 1 28027 atgattatatcaacgcaggggagattgctagctacattcaagcactccttcaaacgctcttcaataccttggaccaactcttt 1 M I IS T Q G R L L A T F KMH F L Q T L F N T L D Q L F 28111 tccctaatgctcaacaaacagggacagacatttcatggctcaagggtgcaaataatttgccagtaa 28176 29 S L M L N K Q G Q T F H G S R V Q I I C Q 2 42235 atgtgcataaaggact tat cgacaaagaggctactattgcagt act t cctaaggatt tagac cgaaagtttcaatgtatct tc 1 M C I K D L S T K R L L L Q Y F L K DL D R K F Q C I F 42319 aggctctcaataactcatatggaaatgccattctatgtatatacactgacggaagacttgtggga 42384 29 R L S I T H M E M P F Y V Y T L T E D L W dplORP153 22307 atggtggacaaagggct cacct t ttcgaactttcgat atcgtcat agcagacggtt ccatt cgtt caggaaaaacagtat cgat 1 M V DK G L T F S N F R Y R H S R R F H S F R K N S ID 22391 ggctctttcattttccctttgggccatgacggaattcaacggacaaaactttgccatctgtggtaa 22456 29 G S F I F P L G H 0 G I 0 R T K L C H L W dplORF154 18446 gtgacaataggctttaagaactgcaaaaaaacctggggcgtctgcacgcgcaacctggagctccttaacagtcatccaaggctg 1 V T I G F K N C 
K K TNW G V C T R N L E L L N SMH P R L 18530 aggtttcttacaaacaatcctaattccttcaaaatagctcttgtccggtcaatagtgcctaa 18592 29 R F L T N N P N S F K I A L V R V N S A dp1ORP155 13512 atgaatacgaccctgagcaacttacaatgggacatggtgcaaaatctaatttccttcttcaacgtttcattcaactcacgccag 1 M N T T L S N L Q W D M V Q N L I S FF N V S F N S R Q 13596 ttgaagctcaagcaattttctggcatatgggagcctatgatattagtccttatgcaaatttga 13658 29 L K L K Q F S G I N E P M I L V L M Q I dplORF156 18777 atgctagtatctccatttctgttggtcttgctttttagctctgttcagttcagctgcttctcgcgatgcaatagtttcgagaat 1 M L V S P F L L V L L F S S V Q F S C F S R C N SF E N 18861 atgcctgt tcat aggctcacaatat tccgccaaagat ttgccagt tatggtggcgt caat taa 18923 29 M P V H R L T I F R Q R F A S Y G G V N dplORPl57 13281 gtgctgctggacttgagaagaaattggtatcattttcgagccaatccataaggttctcgataccgtcacgattgattgttt ct 1 V LA G L E K K L V S FS S Q SI R F S I PS R L IV S 13197 gttactgctttcttgaagcgttttttaaagtctgtcatattagacccctttcattttctataa 13135 29 V T A F L K R F L K S V I L D P F N F L dpl0R1158 40727 gtgaacgccgttattagggtcaaacgaagcccaaacggacattgtctttgtcccgtcactattgtgaggaacagtcacttctcc I V N A V I R V K R S P N GMH C L C P V T I V R N S H F S 40643 acttgcgagcgttacctcttcgccggacgtgtcgtagtctgggtgactgctatgaacacttga 40581 29 T C E R Y L F A G R V V V N V T A M N T dplORP1S9 30371 atgatttggtctgcgcttacccaagcagcttctcctttgagtttctgtcgagcattccctgtacggtctgtccaaatagcatgc 1 M I W S AL T Q A AS P L S F C R A F P V R S V Q I A C 30287 gtctttgcgtattcttccatcttagtagcagcgacttcgcagactgttatgacagcgacttga 30225 29 V F A Y S S I L V A A T S Q T V M T A T dplORF160 41324 aigggt t aagacacg cgaggaaaacaat cgaacgt ccaagacgt atct at caatgt tat agaat act atggac cgtct at caa 1 MGC Y R H A R K T I ER P R R I Y Q C YR IL N T V Y Q 41408 tttctccgttcaacgtactcgtcaaaatcctgcaattatccaagctcttcgaaatgctaa 41467 29 F L R S T Y S S K S C N Y P S S S K C dplORP161 52175 atgcaaaaaggtttaaatgcttatctcgacatgacattgaaagcaitgcattcgagactatt tcaaaatgtttggcaacgttca 1 M Q K G L NA Y L 0 M T L K A LMHS R L F Q N V NOQR S 52259 
aatcaaaccaaggggccaagttttcaacttaccttacaagactcttcaagaatagaatag 52318 29 N Q T K G P S F Q L T L Q 0 S S R I E dplORF162 13020 atgacagaagttgcggtaaatagcccgcaaaaggtgagagtagttatggtcgggaatattgaatttctcgaatatttaaaaagg 1 M T E VA VHNS P QK V R V V M V G N I E F L E Y L K R 13104 aagtacggaacagaaacttccatcagttatattatagaaaatgaaaggggtctaatatga 13163 29 K Y G T E T S I S Y I i E Ni Z R G L dplORF163 40224 gtgaccgaat t tctatgt t ct ccgcagggaatgaagttatgtacct tacgcaagggtt catt cacat cgataacgggatct t t a 1 V T E FL C S P QG M K L C T L R KGC S F T S IT G S L 40308 cccaatccattcaagtcggccgacttragaacggaacaatactcgtttaatccagacatga 40367 29 P N P F K S A 0 L E R N N T R L I Q T dplORP164 6696 atgtactcttggagaacttcgtgcctaaatgttccagcttcgcccattgcaattaggttagaat ctgcgttatctataatagac 1 M Y S W R T S C L N V P A S PI AlI R L E S A L S5I10 6612 tcaccgattctttcgaaatacatttttcgaatacatccaccaaccccgctgggcttataa 6553 29 S P I L S K Y I F R I H P P T P L G L WO 00/32825 PCT/I B99102040 390 50504 atgagtgaaagctggtcaatccccaccacagatggtctatatttagatatcatgctatctaaaat tgcaggggtaaggttcttt 1 M S E S W S I P T T D G L Y L DI M L S KI AGC V R F F 50420 cctccaatcataaagggcgtgactaccacaagggaattttcagcctcagtcattgcttga 50361 29 P P I I K G V T T T R E F S A S V I A dplORF16 6 23519 gtggtcatgctctttaatgactctatcttctcccgtttggctcgctttactgtcccagctgtaagcatagtattcatcaatgtc 1 V V M L F N D S I F S R L A R F T V P A V S I V F IN V 23435 gtgcgtgttgctagggtcgagtgtaaatctattctcagccaagagttcagcgtgaaatga 23376 29 V R V A R V E C K S I L S Q E F S V K dplORF167 1008 atgct tattcggttggagcttcttacgtcgtataiggtgctcacgcagacgatgcggctggaggtgcttaccctgattgcactc 1 M L I R L E L L T S Y M V L T Q T M R L E V L T L I A L 1092 ctgagttctataattcaatgtcaaatgcaatggaatatggaactggaggcaaggtaa 1148 29 L S S I I Q C Q M Q W N M E L E A R dplORFl68 54345 atgagactttttccaggtatattcttcacattgttcagttcctggagtcaagtattgttcttgaaattcatagagttcgaaag 1 M R L F P G Y I L H I V Q F L E SS I V L ElI H R V R K 54261 tttgcaaagggtcataggccgcatacatataggcaacatcaggaggaattaaactaa 54205 29 F A K G 
H R P H T Y R Q H Q E E L N dplORPl69 45954 atgaacacagcatcgcgaagagtttcaatgttagtgataaggaagaattcgtcgtggccaccaagcaagtcttctgcccgttta 1 M N T A S R R V S M LV I R K N S S W P P S K S S A R L 45870 gaaactccgtcaatcactaatttcccatctttagtgactcgacttcctaaaatatga 45814 29 E T P S I T N F P S L V T R L P K I dpl0R7170 27600 atgatgattgttcttgtgctcctgccgtttgttgagcagcagcaagttgcttaccaaaagagccgatttcacgaggttcgggaa 1 M M IV L V L L P F V E Q Q Q V A Y Q K S R F H L V R E 27516 caccaccaccgacacgacctggatttcctaaatttccagtcccggctggcgacttag 27460 29 H H H R H D L D F L N F Q S R L A T dplORF17l 47678 atgtcatttctttcatgtactctttagagcatcacgaagacttttgacttgtttctccatgtcgcctttggtagcatttaat 1 M S F S F M Y S F R A S R R L L T C F S M S P L V A F N 47594 tcaccggcttcttcaattgcagcgatgaactgttttcatcttcaaatttcatttaa 47538 29 S P AS S I A A M N C F S SS N F I dpl0RF172 10462 atgtttcgaacattttctaccccattattagaagcagcatcaatttcaataggagagccaagtcctttgttcacatccttcgcg 1 M F R T F S T P L L EA A S IS I G E P S P L F T S F A 10378 aaaattcgagcagtagtggttttaccagttccagcgccaccacagaatagatag 10325 29 K I R A V V V L P V P A P P Q N R dplORF173 32160 atgacattagacatttCCttcgtctgtacgaaaggtttcagcttgagtcacttcaccgtacattgcactgaagattgtcataag 1 M T L D IS F V C T K G F S L S H F T V M C TE D C H K 32076 ttgctcatctgtcatatactcgccgacttcagcgtaagtaggctctaccattga 32023 29 L L I C H I L A D F S V S R L Y H dplORFl74 29766 atgtcccatcagcccttttcattaagattgtcgaaccagcgttcgacttttcatcagtttcaagctgttcttgcttatattggt 1 M S H Q P F S L R L S N Q R S T F H Q F Q A V LA Y I G 29682 cataatagaattgcgccatttgtttccagtagtctgcgtcaccttttagactga 29629 29 H N R I A P F V S S S L R H L L D dpl0RF17 15648 atgcgcgtgatgtcatggcagataggcgaggataaagagtgtcgaatagaacgccgcagagcttacgagagcgccaaatacaag 1 M R VM S W Q I G E D K E C R I ER R R AY E S A K Y K 15564 ggcgacggtactacggtggtcctcttgcttacctgtaaccaaataaaccattga 15511 29 G D G T T V V L L L T C N Q I N H dpl0R717 6 43031 gtgataaagacggtaacgttgaatttttctagtt ccgtcttgaatgacgtcattttggtgattgattgctactgtcgtttggtc 1 V I K T V T L N F S SS V L 
N D VI L VI D C Y C R L V 42947 aatcccgtcgacctgctgtttaagagtgctaagagttgtagagatatcctctaa 42894 29 N P V D L L F K S A K S C R D I L dplORF177 19937 atgaacctaaacagttcgagacttctcaagctgttgggaaagaagcaggtcgaatattttggtgggaacgtgaacttggtcata 1 M N L N S S R L L K L L G K K Q V E Y F G G N V N LVI1 19853 ttctcgcgactaattttaggtgcttttgtattaatcagcgtgatargcgccag4 156cC 29 F S R L I L G A F V L I S V I C A dplORP17B 11924 atgacaactgtcgaccaatttaaaagacagttgaggaaaagtttaggCtcaatttttccttcatcagtt tccttaaattta(3c 1 M T T V D Q F K R Q L R K S L G S I F P S S V-S L_ N-L S 11840 caattagtaacctttagcgaattgctagcacttgcctcccatattaagtcataa 1178? 29 Q L V T F S E L L A L A S H I K S* dplORF179 56058 atgggtagggttattccttacctcgttgatttgctttatgcaaaacctaccacaatcgcttgtcgtggcttcaggagttgcatt 1 M G R VI P Y L V D L L Y AK P T T I A C R G F R S C I 56142 ttggataagtcaaaaagcaagtgtctttatattcgacaagctctcgaataa 56192 WO 00/32825 PCT/IB199102040 391 29 L D K S K S K C L Y I R Q A L E dplORF18 0 41176 atgttcgacatgatttggaggaagttgttccctgttaaaatctgtcgaaccgctgaagtcgtgtctactaaagaaatgcccgaa 1 M F DM1I W R K L F P V K I CR T A E V V S T K E M P E 41092 aaagtaggacgtactgaatcggggatgttgaacctccatccgtttgaatag 41042 29 K V G R T E S G M L N L H P F E dplORP181 13126 atggaagt tt ctgt tccgtact t ccttt ttaaat at t cgaaaattcaat att cccgaccataactact ct caccrtt t tcgg 1 M E V S V P Y F L F K Y S R N S I F P T I TT L T F C G 13042 ctatttaccgcaacttctgtcataggctgtcctcctttgcttatactgtaa 12992 29 L F T A T S V I G C P P L L I L dplORF182 45369 gtgcttgcccatgtttcaataaatagggttcgacctcgcctagctttcgaacgtgctataacgatttcaatcatagcgaagaaa 1 V L A H V S I H R V R P R L A F E R A I T I S II A K K 45285 ggtgagaagcttcaatcaattccattgcggtgtcaatatcttcttccttga 45235 29 G E K L Q S I P L R C Q Y L L P dplORF183 13896 gtgattccagcttttggtttttcttcagcctcttcaactttttcttccttaggcgcaggttcttacgagttgaactcttaggt 1 V I P A F G F S S A SS T F S S L G AG F L R V E L L G 13812 ttttcatcttcaccgccttcatgctg 13762 29 F S S T T S S T S A S C S T G P dplORF184 53330 
gtactcgcacctaaatgttctgagcaataatcagatccctttg 1 V N L P S T T S N I1W S SS R SK I R V P R S S L F S G 53246 aaatcttcaagagtagcactgtcttccggacgctctggaaggaattcataa 53196 29 K S S R V A L S S G R S G R N S dplORF185 22522 atgaaattcgagatgttcgaaatgaaaatctacttattattagacactttagaaatggcaagaaattgtcaactacttctata 1 M K F E M F E M K I Y L L L D T L E M A K K L S T T S I 22606 tatttggaggaaaagatgagtcgagtcaagaccttatacag9gggaa 22653 29 Y L E E K M S R V K T L Y R G dpl0RF186 21272 atgctcgaaaaactcaaccggttcgaaaacctcaatccttcgaaaagtcgaaccattcgaaaagttcaaaagttcgaaaaactc 1 M L E K L N R F E N L N P SK S R T I R K V Q K F E K L 21356 aaccattcgagagtaggaattaaggacataccagttcaacctttttag 21403 29 N H S R V G I K D I P V Q P F dplOR?187 34415 atgcttcacctcattatagacgtcataccgtttcagttgtaga 1 M V L F N L F L L S F K Q L F K L S L L Y S M V L F R H 34499 ttcctacgcttattcaagcaggtcttcaaattttgtcagctctcataa 34546 29 F L R L F K Q V F K F C Q L S dplORFl8 8 35609 atgttcgtaaagcagccggttcgcctcgagtggacttgttcaatacaggaagtgacaaccctaaccaacctcagtcacaatcta 1 M F V K Q P V R L E W T C SI Q E V T T L T N L SM HN L 35693 aaaacaatcaaggcgagcaaaccgttgtcaacattggaacaatcgtag 35740 29 K T I K A S K P L S T L E Q S dplORFl89 42587 atgcaaacgcagtatcaaccgtctctgaaactCttcatgacccagacttgtatgctgcgaaccgtcgagaacttcgagctgacg 1 M Q T Q Y QOP S L K L F M T Q T CM L R T V E N F EL T 42671 agcaaaaacttcgcgaaactcgttacgcaatcgaagatgaaattctag 42718 29 S K N F A K L V T Q S K M K F dplORP19O 39786 atgtattcactcaaagttgttcagtgtggctcaatcatattaaaatcgaacttggtaatatctctact ccttttagtgaagcag 1 M Y S L K V V Q C G S II L K S N LV I S L L L L V KQ 39870 aggaagaccttaaatatcgaattgactcaaaagccgatcaaaagctaa 39917 29 R K T L N I E L T Q K P I K S dplORF191 40996 attctgtcgatgttgtatctgtagcatagctagaagtgattgt 1 M S I V P E L D L G K Y L A K S S D G V K 0 T L V V W F 40912 ttacctaaatctatccagtcgctaccgaaaactcggtaccaaacttga 40865 29 L P K S I Q S L P K T R Y Q T dpl0RF192 2920 atggtcgacgtcgaatgtt ttttcgagatgaagtttagggtcttctcgataccctacggtatgttcagcgagtgctttaacaaa 1 MV DV EC F F km 
&FR V F SI P Y7G S^ C F .K 2836 acggaatggagtatcttgcaacccgtcacgttctgcgtcctcgcctaa 2789 29 T E W S I L Q P V T F C V L A dplORFl93 42456 atgat tcagct caaattaaat acgaaatgagacat tgtctaaattt aaccaagaattat ctacatt cgatt tcaecaagtc 1 M IS A Q I K Y E M R H C L N L T K N Y L H S I-.S P Q V 42372 ttccgtcagtgtatatacatagaatggcatttccatatgagttattga 42325 29 F R Q C I Y I E W H F H M S Y dplORF194 40284 atgaacccttgcgtaaggtacataacttcattccctgcggagacatagaaattcggtcacttgataccttaatggtagagcta 1 M N P C V R Y I T S F P A E N I ElI R S LODT L M V E L WO 00/32825 PCT/I 899/02040 392 40200 ccgtcgttcttaccgataattagaccttcattagaagagctcatgtaa 40153 29 P S F L P I I R P S L E E L M dplORP195 42584 attccactgttaagtcttacctttcaataccgccatgccattt 1 M F T I V V L T S F F SA P C P I V N S AT I W R D F V 42500 aggttcaacatagttctcacctcctttctaaaaaatattataacatga 42453 29 R F N I V L T S F L K N I I T dplORP196 11273 atgaatacatcttcacttatccctcctaagattgtcattgttg 1 M V D L TS P C P I M S L L L A H Q K K F G F N Y R F S 11189 attaggctcccatttaacaactccagcaagttcattcatttcttctag 11142 29 1 R L P F N N S S K F I H F F dplORP197 7484 ataagtaagttcatcactgaaataagttggtaactaccactac 1 M K R L Y G I Q F Q A L K K L N G L EL K A S T Q T S S 7568 atgcagggtatgaagttcttacaagaagcgtcgaactagattga 7612 29 M Q G M K F L T R S V E L D dplORF198 24119 atcgtacatgctcgtttcagccatccttcgtactgacctcgtg 1 M P L N K L T S SF1I Q C L S S P I Q L T LE T L P A C 24203 ~tctgttgacattgtttatcaggacgagcgtacaaaaggaatga 24247 29 F L L T L F I R T S V Q K E dplORP19 9 15742 gtgtcgatgcgattctcacgtagacgctttgtaccaccggggg I V A P E L G C T F P P N C L A T A F S C L A L A L R V G 15658 attggtttgtatgcgcgtgatgtcatggcagacaggcaggataa 15614 29 I G L Y A R D V M A D R R G dplORP200 47843 ataagtgatgtaccgagttcccatctcttgctgcatattcaat 1 M T G L Y S IS P E S F SNH I S S V S A SS T N F S II 47759 tctttcaagcgttcttcgtccatagttgagcgctctgtcgtgtag 47715 29 S F K R S S S I V E R S V V dplORF2Ol 38569 atgctaagtctttacagtattttgatgatttgcttcgtcatcg 1 M G F T SS F F N Q R S IS LODS N Y L D 
L Y R F N Y R 38653 aacgggctatcaaaaaacctacattccaaaagacggaatga 38694 29 N GCL S K N L H S K R R E dplORP202 44483 gtgggtattttaatttaaatctgcaatatattgaatcattaat 1 V G R L F F I1K I F Y K M L D N I H S L S Y N T I I1K I 44567 aataaagccgaaaggcgaggaggacattatgtcaaaaattaa 44608 29 N K A E RR G G HNY V K N dplORF203 22781 gtatgatgcgtaaggaccttcaacgtcgaacccgcctgtaaag 1 V I RI G R V T RE P H F R T C Y G T AP C R L V DOK R 22697 ttcaggcatcagtgccacctcatcacagaagatacctgctaa 22656 29 F R H Q C H L IT E DT C dp10RF204 1471 ataccgtggcagagtgtgctttagcaaatgagacatt cattgacagacttgacc 1 M T T V R V K G W L L T F IT S R K S Q V HS L T D L T 1555 acgctgttcttcttcaagggaatgaaccaatcgctttag 1593 29 T L F F F K G M N Q S L 8524 gtaatagagtccgtgttcatggagaaacttcacagatccataa 1 V T L M N G S Q F G M L L V TO I1S S T T K EL P N L E 8608 ttcaggaaaagcaacctgctatcaagttcaatttcgtag 8646 29 F R K S N L L SS S IS dploRF2O6 19855 atacatcctcccaattcacgtcttcacgtggattgatttgtct 1 M T K F T F P P K Y S T C F F P N S L R S L EL F R F I 19939 aaattgttcaacttgagcaagtgcgatattattctttag 19977 29 K L F N L S K C DI I L d1olRF2O7 27502 gtgtcggtggtggtgttcccgaacctcgtgaaatcgg~tc tc~ggttg.CZgtcgtacacgagg 1 V S V V V F P N L V K S A L L V S N L L L L N K R Q ENH 27586 aagaacaatcatcattctttaaataataggaggaactaa 27624 29 K N N H H S L N N R R N dplORF2 08 47279 atttgagacaaatccgaaataatcctcgtgttctacr~-aaact 1 M F G M K 0 K T S L K KI T F TS R L F F L N L E Q T L 47363 accatcgtggttctcgattctgggatgacgaaggcgtga 47401 29 T I V V LODS G M T K A dplORF2O9 29784 attagacatctggctgacgtcacaatagtctgacctgtataga WO 00/32825 PCT/I B99/02040 1 29868 29 dplOftF2lO 53077 1 52993 29 dplORP211 20959 1 20875 29 dplORP212 52983 1 52899 29 dplORtF2l3 30291 1 30207 29 dpl0RP214 24273 1 24189 29 dplORF215 35822 1 35738 29 dpl0R7216 32849 1 32765 29 dplORF217 23443 1 23527 29 dplORF218 22029 1 22113 29 dp1ORF2 19 51388 1 51304 29 dp1ORF22 0 6334 1 6250 29 dplORF22l 43507 1 43591 29 dplORP222 13212 1 13296 29 dp10RF223 14055 1 14139 29 M L R I K F V E PFL K F L L L K 
S R Y F E T L G S V MND atggaggaaagaaaaaggataaagcgaatgaagtcgtag 29906 M E E R K R I K R M KS atgtttcaacttttcccgtatcatggttgtaaagttgaagaaatagtttttcaatacgagggaatccgttttggcataatggac M F Q L F P Y H G C K V E ElI V F Q Y E G I R F G INO aattatcaggatggactgtttccccgtcttcgccaatag 52955 N Y Q D G L F P R L R Q gtgctcgacttttatgtcgcccctaatttttgtttttacttacggactatgggatttgtaggtattttcagggcgttttttat V L D F Y V A F N F C F Y L R T M G F V G I F R A L F Y ttacttattaagtccttttciatattagattgtttataa 20837 L L I K S F S ILOD C L atggactgtttccccgtcttcgccaatagcattgcaattgatatagcgtcgacgaccgtcaacgtctgcttcgtggactacgaa M DC F P V F A NS I A I101 A S T T V N V C F V DY E ataatccatgtcttcgccttccgggtcatcatacaatag 52861 I I N V F A F R V II Q atgcgtctttgcgtattcttccatcttagtagcagcgacttcgcagactgttatgacagcgacttgaaacttgtttcgataccg M R L C V F F H L S S S ODF A D C Y D S D L K L V SI P ttcacagttactaacaaattcttcaggcttccatactaa 30169 F T V T N K F F R L P Y atgatgccaaagt tgttttcagtgctcattccttttgtacgctcgtcctgataaacaatgtcaacagaaagcaagctggaagg M M P K L F F S ANHS F C T L V L I N N V N R K Q A G R gtttctagggtcaactgtataggtgaactgaggcattga 24151 V S R V N C I G EL R H atgttaccaaaccctgatagagtttctttacttctattatacaatcctctcgacagtttgtcaacgtcgtcattgtttcgaact M L F N FODR V S L L L L Y N PLOD S L S T S S L F R T acgattgttccaatgttgacaacggtttgctcgccttga 35700 T I V P M L T T V C S P* atggcctcggagctcgcggccacatctcctccagatacggcagccaggtcaagtacccctggcatagcgtccatgatttcattt M A S E L A A T SP PODT A A R S S T P G I A S M IS F acctggaaaccggctgaagctagattttccataccttga 32727 T W K P A EA R F atgaatactatgcttacagctgggacagtaaagcgagccaaacgggagaagatagagtcattaaagagcatgaccactgcatgg M N T M L T A G T V K R A K R E K I E S L K SNM T T A W ataggaacagatatgcctgtctcactgacgctctaa 23562 I G TODM P V S L T L atggaatgcttccggaagaggt tcgatatagactacaaattgagcgcgagaaaattacattgctccgggccaaaatgggcgacc M E C F R K R F D ID Y K L S A R K L HNC S G F K W A T aggaaattgaaggcgaggttaaagataacttcgtag 22148 R K L K A RL K ITS 
atgattttatgctcgactttttcagttctcccatttcttcgaaacgcttcagggctgacgccttgcctaactacttcgctagat M I L C S T F S V L P F L R N A S G L T P C L T T S L D gttccaaaattccttttcagccactggtttccatag 51269 V P K F L F S HW F P gtgaagttttcttcggtgacggttgatacaatttccttcaagagtaagctgttaaggtggcaagtgaattctttcttcgaaact V K F S S V T V O T IS F K S K L L R W Q V N S FF E T ttcttgccagcagatgcgtacatgatgtcttcataa 6215 F L PA D A Y MNM S S atgactgctcaagtt ctatgtactatgctctecgctcagccggagcttcaagtgctggatggcagtcaatactgagtacatgc N T A 0 V L C T M L S A Q F E L Q V LODG Q S I L S T C acgcatggcttattgaaaacggttatgaactaa 43623 T H G L L K T V M N gtgacggtatcgagaaccttatggattggct cgaaaatgataccaatttcttctcaagtccagcaagcactcgataccatggaa V T V S R T L W I G S KM I P I S Q V Q Q A LODT N E gctatgaaggtggacttgtcgagcactcattaa 13328 A M K VODL S S T H atgtggtggacctgctggatatgtcgagatgtctactacttctacagtgaagtcgctgacgtttactacaagaaagatgtcg M W W Y L L ONM F E MS T T S T V K S L T F T T R K M S acgagcctgacgatgacagcgacattcttgtag 14171 T S L TNMT A T F L WO 00/32825 PCT/I B99/02040 394 dpl0RP224 13621 atgccagaaaattgcttgagcttcaactggcgtgagttgaatgaaacgttgaagaaggaaattagattttgcaccatgtcccat 1 M P EN C L S F N W R E L NE T L K K E I R F C T M S H 13537 tgtaagttgctcagggtcgtattcatatgctaa 13505 29 C K L L R V V F I C dpl0RP225 32991 gtgagcaacgggtgcgacgt atttcatcgcct ctgccatgt cgctagt t tetgcgt tcgtat cagctgctgctcgagcaaat ac 1 V S N G CD V F H R L C H V A S F C V R IS C C S SK Y 32907 gtcagccacgtgacccgcctggtttgcctctaa 32875 29 V S H V T R L V C L dplORP226 25191 gtggctgcgtacattagtttgaacttcagtgagcgcaagttgcttagcagaaagttcaicgctaggaattggatagtggtgttc 1 V A A Y I S L N F S E R K L L S R K F I A R N W I V V F 25107 gatagtcattgtcgtaagtgtttgataacttga 25075 29 D S H C R K C L I T dplORP227 23115 atgactcaattagatggtagcgcttatgacgtttcgagaatccataaaggccgaaggttgttgcattatagataccaaagtcgc 1 M T Q L D GS A Y D V S RI H K G R R L L H Y R Y Q S R 23031 ctgctacgaataaacggtcgaattctatattga 22999 29 L L R I N G R I L Y dplORF228 10450 
atgttcgaaacattattgaagattctagatacaagtctatggacagcgagttcaaagtttacatcattgacgaggttcatatgc 1 M. F E T L L K I L D T S L W T A S SK F T S L T R F I C 10534 tttcaa~cgagcatttaatgcgctgttga 10563 29 F Q P E H L M R C dp10R7229 27634 atgtgcgagttaagaaaactgattttaatcaaaccactcgaagcattgtcgcaattcctgaccactacgttgctttggctgctc 1 M C E L R K LIiL I K P LE A L SQ0 F L T T T L L W L L 27718 aaattccagctaccgcagcaactcaagtag 27747 29 K F Q L P Q Q L K dpl0R123 0 50723 gtgacgaaaaatccggcatacttgaactatctgtcgttaaaaaccgatatggcgaagaccgaaaaatcatcgaatatatgtggg 1 V T K N PA Y L N Y L S L K T D MA K T E K S S N I C G 50807 acgttgaaactggaacctatactcttatag 50836 29 T L K L E P I L L dp101723 1 31071 atgcgcgtgtcattgcgtttcacatcttcagttccctccgaggtcacggcttcgagttctgctgtttctgccgtatctacgaca 1 M R V S L R F T S S V P S EV T A S SS A V S A V S T T 30987 aagttagctccgccgacttttggcaactga 30958 29 K L A P P T F G N dplORF232 29385 atgtcaattccattagctcttgctaattcaacgagctcaggaacggttttagccgcatactcttcgcgcatttgttcaacttcg 1 M S I P L A L A N S T S S G T V L A A Y S S R I C S T S 29301 tcaatttcttcaactgattcaattgtttga 29272 29 S I S S T D S I V dplORF233 52892 atgtcttcgcct tccgggtcat catacaatagagtgacaattgcgctgtcaccgtggtcagcgagtgtgaaaaactcgttatta 1 M S S P S G S S Y H R V T I A L S P W S A S V K N SiL L 52808 gaccctgagctaaatgttcctgatttttga 52779 29 D P E L N V P D F dp10RP234 36253 atgct tacgagtacagcgact caactgt t cgaaaggt t ataagt ttcaacccgctt tgggaggcgatagcttacctaacccag 1 M iLT S T AT Q LF E R F IS F N P L W E A I A Y L T Q 36337 gaagacctactcgacaatttagagtag 36363 29 E D L L D N L E dplORF23 32768 atgaaat catggacgct atgccaggggt act tgacctggctgcct at ctggaggagatgt gccgcgagct ccgaggc cat gg 1 M K S W T L C Q G Y LiT W L P Y LE E M W P R AP R P W 32852 ctagttcacttcgagcctttggattag 32878 29 L V H F E P L D dp10R7236 37528 atgttcgtcgcttttagatttagcaatatatcgaggcttcatgtggcgtgtagtaaaccacgaaacatcaatgagatattcact 1 M F V A F R F S N I S R L H V A C S K PR NI NE I F T 37444 tccattgttgatagaagcaaacgttaa 37418 29 S I V D R S K R dplORF237 1678 
gtgagagtccaggtaaggaatcttgacatattctcagccgtagttctaaatccaaatagaactcgcttggtgt.4actgcattt I V R V Q V R N L D I F S A V ViL N PHN R T RiL V-.S T A F 1594 gctaaagcgattggttcattcccttga 1568 29 A K A I G S F P dp10RP238 1301 atgcctttttgcggtcgatacaagttgcgcaagttccacaactttcagcgtcactttcataacatgaacgagtcaagaaataag 1 M P F C G R Y K L R K F H H F Q R H F H N M N E S R N K WO 00/32825 PCT/I B99/02040 395 1217 gaacatctaaatcaattccccatttaa 1191 29 E H L N Q F P I dplORF23 9 26521 atggatttcacagag c gcactagattc caatttg caac a 1 M V K Y F L S K N V L S T I L M E C AT K L Y G T K T H 26605 tcgaagaaatcgctgatgagttga 26628 29 S K K S L M S dp10R7240 41893 atttgaagggacggtaagcagaaaaaggaaccaggatggtatg 1 H F G I S V K Q S L H G E V T NT R T T L R E L E V NOG 41977 gactatttcaaaatttctggttag 42000 29 D Y F K I S G dp1ORF241 47020 gttttctaagaaattctcatagcgaacaagtacattgttaagt 1 V S F L N M E I V F I L F K Q D IE K V T N F R F H R L 46936 accatctacgatataatctgctaa 46913 29 T I Y D I I C dpl0RF242 41338 gtgt ctgtaacccatgctct tacggtagcggagc cattaaagt tcatcatacccaat ttgccgccgt tttcgttgat agct tg 1 V S V T H AL T V A E P L K F I IP N L P P F S L I A W 41254 tttttacctacgagctcagcgtga 41231 29 F L P T S S A dplORF24 3 51306 attcaatcttcgccgttctgacccacttgctaaatggcatcgt 1 H F Q N S F SA T G F H R T L H R F DL I H S R R I Q L 51222 gtcctgaagtgtagccgcaagtga 51199 29 V L K C S R K dplORF244 27083 gtgaggtacaaaatgttgaccgtcgccgtcaatgaaaattttagcatcgagttctttcgaagttttcgaaataatttccttcac 1 V R Y K M L T V A V N E N F S I E F F R S F R N N FL H 26999 ctgtttgatagttggttcatctag 26976 29 L F D S W F I dplORP245 6278 gtgaggattttcaatttc gcgaatctctag tta atcatgat ta 1 V AS E F F L R N F LA S R C V H D V FI T A S R S F N 6194 tcgaagtcggtctttcaagaataa 6171 29 S K S V F Q E dplORP246 2831 atggactcacgcctcggctgcaaaacaagctgagcgccgattc 1 MHEY L A T R H V L R P R LI D Q K V F E R L P Q Y C P 2747 aggttacaatttcatccggcttaa 2724 29 R L Q F H P A dp10R7247 29641 
gtgacgcagactactggaaacaaatggcgcaattctattatgaccaatataagcaagaacagcttgaaactgatgaaaagtcga 1 V T Q T T G N K W R N S I M T N IS K N S L K L H K S R 29725 acgctggttcgacaatcttaa 29745 29 T L V R Q S* dplORF248 53560 gtcagc gttgagaacagtcgtat gtacgaacgagcgattagtc 1 V Q S L V L A R R T M L S Y L L H G K T G S L Q L R L L 53644 acatttcaggaaacgctctaa 53664 29 T F Q E T L dp10R7249 2012 gtgtcattcttcatg ta gcc ccgacg c cggccga ta aagaa 1 V D AT I I A TOG V TOQP L POGT V L L S R H IS Q A K 2096 aagctgctagtcgaatcttga 2116 29 K L L V E S dp 10RF250 23837 atgcaca gaa gcagc atca a actatcaaattcaata ttca atttc 1 HOG K H O R L T K T Q S T I N L L E K F E T I F D N L S 23921 aaaagcaatcacgctttatga 23941 29 K S N H A L dvorsl 39205 atga~atatt cg tcctgt cgg acc ggtcgct cctcLzLgtuL 1 H E IIS L T V C A WL P G Y P L S S VI P L P F R P C 39121 ataggctgcagggtcttttga 39101 29 I G C R V F dplORF252 54771 gtttttgtgacattga t tcaattcaatc ggttcttteaagtcaa 1 V L Y R S K L I L H I F Y I S KV L L R Y R Y Q H A RQ 54687 tactttcgcctgttcctctag 54667 29 Y F R L F L dplORP253 56255 atggttgcgtctataatagaaccgatgttgctagacaaagcatttgcaatcttcgagtctaatttattcgagagcttgtcgaat WO 00/32825 PCT/I B99102040 396 1 V VA S I IE P M L L D K A F A I F E S N L F E S L S N 56171 ataaagacacttgctttttga 56151 29 I K T L A F dpl0R7254 48479 atgaacct ttcgct taggt tcaatct t t tcaacattt tcatatttaacaaaact t t cagctaaaatcacaaattcaatg 1 M N L S L R F N L F R T F S Y L T K L S A K N R Q S5 S 48395 ttcgactcaatgtttaaataa 48375 29 F D S M F K dplORF255 9572 atgctttggt ct tct cgacgaatgactctactacatt ccctgcagggt tt cgagcagtacgggt caatgatgcaccgtt ttcgt I M L W S5S R R M T L L H S L Q G F EQ0 Y G S MM H R F R 9488 caaggtagtcaccttttctaa 9468 29 0 G S H L F 6 15289 atgaccttccagtcactaatgcggccgctgaaattggataccactatacatgggttcaccaacttcgagacaaagcagttgaaa 1 M T F Q S L M R P L K L D T T I H G F T N F E T K Q L K 15373 cacttgaagaaattttag 15390 29 H L K K F dplORF257 28216 gtgaacgtgctggatttagcaaacaagctactgagatggcattcttccgtgagtctatgcgacttggtgaaaaagaccgtcaaa 1 V 
N V L D L A N K L L R W H S S V S L CD L V K K T V K 28300 acttgcaaatgctattga. 28317 29 T C K C Y dplOR?258 44023 atggaaattggtattggttcgaccgtgacggatacatggctacgtcatggaaacggattggcgagtcatggtactacttcaatc 1 M E I G I GS T V T D T W L R H G N G L AS N G T T S I 44107 gcgatggttcaatggtaa 44124 29 A M V Q W dplORF2S9 4298 atgactcgactacgaagcataaagacaagtggatggaaagagtattcgaagttattcgaaacagttctaatccagacgttaaga 1 M T R L R S IK T S GW K EY S K L F E T V L I Q T L R 4382 ctcacgcatttgggatga 4399 29 L T H L G dplORP26O 24746 gtgaccctacttcctcaat cggcggtactggaggcaagcaagct caagtcactt ccat tt caggaaacttcaacttccttccag 1 V T L L P QS A V L EA S K L K S L P F Q E T S T S F Q 24830 cggctgaatattatttag 24847 29 R L N I I* dplORP26l 288 atgaat tcact t ccct t tgccct aaaacaggacagcc tgact tcgcgaat gtt tt cat tagt tacatt ccaaacgaaaagatgg 1 M N S L P F A L K Q D S L T SR M F S L V T F Q T K R W 372 ttgaatctaaatcattga 389 29 L N L N H dp10R7262 9408 atgcctattcaactccaggcggaaagatgtggaagcatgcttgtgcagttcgacttaaatttagaaaaggtgactaccttgacg 1 M P I L Q A E R C G S M L V Q F D L N L E K V T T L T 9492 aaaacggtgcatcattga 9509 29 K T V H H dplORF263 27052 atgaaaat tt tagcat cgagtt ct t tcgaagttttcgaaataattt ccttcacctgtt tgatagttggtt cat ctagacctttt 1 M K I L A SS S F E V F EH I1S F T C L IV C S S R P F 26968 aacaagtcttctaattga 26951 29 N K S S N dplORF264 6139 gtgaatagtacaaggcggtctaatacgctcaggatttctgctgtagggatagccgcatcatcttcaaactcaattgagtcaagc 1 V N S T R R S N T i R IS A VG I A A SS S N S I E S S 6055 tgtgaaacgtcttcataa 6038 29 C E T S S dp10R7265 4801 gtgaataaagt caagcgt t tt taaaaaagttcat ttttt t ttaaaaaaaat aagagcgaaaagctc tatct aaaatagtc 1 V N K V K R F C I K S S FF F K K N K S E K L L S KI V 4717 gacgttgacgatttttaa 4700 29 D V D D F dp1ORF~bb 50220 atgcccgt tcttccaagcagt tgcaagcattt tat caat agt ccacgacttacct tgtccaggt cgagccattatgacaatcaa 1 M P V L P55 C K H F I NS P R iT L S R S S H Y DMN 50136 aicctcaccaggaagtaa 50119 29 1 L T R K dp10RF267 47367 
atggtcaaggtctgttctaggttcaggaagaacaaacgggaagtgaatgttattttcttcagcgaagtcttttgcttcatacca 1 M V K V C S R F R K N K RE V N V I1FF SE V F C F I P 47283 aacattaatcgtagatag 47266 29 N I N R R dplORF268 WO 00/32825 PCF/I B99/02040 397 12621 atgtcaatttcggtcttgtgcttgacaatggattcaactactgatgcgtcaaccrttttcaatcgcgacagcttgtccaattca 1 M S IS V L C L T M D S T T D A S T F F N R D S L S N S 12537 ttgtcaattctagagtaa 12520 29 L S I L E dplORF269 53834 gtgaatagtatcgagtccatcagttctacgtcaatagaacctattccgtcttcaatcattttgtctacatactgctcgagttt 1 V N S I E S IS F Y V N R T Y S V F N H F V Y I LL E F 53750 tgcttcctcagtgattaa 53733 29 C F L S D dpl0R7270 50792 atgatttttcggtcttcgccatatcggr ttttaacgacagatagttcaagtatgccggatttttcgtcacgcttcatagcgata 1 M I F R S S P Y R F L T T D S S S M P D F S S R F I A I 50708 actctgctagcattttga 50691 29 T L L A F dplORF27l 19739 atgaggctgctttgctttatcttcgttaccgtattgaccgacttcctactcgcgaaccttcctacaagaattcatacctcaaag 1 M R L L C F I F V T V L TD F L LA N L P T R I HNT S K 19655 gctttttgtcagccttag 19638 29 A F C Q P dp10R7272 1556 gtggtcaagtctgtcaatgaatgtacctgcgattttcttgacgtgataaaagtcaacaaccatcccttgactcgaacgtggtc 1 V V K S V N E C T C D F L DV I K V N N H P L T R T V V 1472 ataagttccgcctgctaa 1455 29 I S S A C dplORP273 56256 atggatttcattaggactgagtcctcttggaattggaacggttgcatatatagarattccgtcagccgtactaggccaagttct 1 M D F IR T E SS W N W NG C I Y R YVS V S R T R 56340 agttcagtttatcttgcagtcaattgcttcgagatatttgaaaaagtagtcaggaaaattcctgattatcttgcagtcaattgc 29 S S V Y L A V N C F E I F E K V V R K I P D V L A V N C 56424 ttcgagatatttgaaaaagtagtcaggaaaattcctgattatttttttacaaaaacgcttga 56486 57 F E I F S K V V R K I P D Y F F Y K N A WO 00/32825 PCT/I 899/02040 398 Table 31 Query= sidI114822IlanjdplORFOOl Phage dpi ORF136698-4039012 (1230 letters) ,eiI928828 (1,44593) 0R71904; putative [Lactococcus lactis phage Length 1904 Score 427 bits (1086). 
Expect e-118 Identities 226/475 Positives 281/475 Gaps 45/475 Query: 395 AESGKYIGVLNTNKJCPSSLVPDDFTWIRLSGPKGDAGLPGAPGRlGVDGVPGKSGVG lAD 454 At YIG P D+TW G GA G+DGV GK GVGI Sbj ct: 820 ADYPSYIGQYTDFIQYDSAKPSDYTWSLI- -RGNDGKDGATGKDGV- -AGKDGVGIKT 873 Query: 455 TAITYAVSVSGTQEPENGWSEQVPELIKGRFLNTKTFWRYTOGSHETGYSVAYIGQDG3NS 514 T ITYAtS SGT +P GW* QVP L+KG++LWTKT W YTD S STGYSV YI +0GW-i Sbj ct: 874 TVITYALSSSGTDKPNTGWTSQVPTLVKGQYLNTKTVWI'YTDSSSSTGYSVTYIAG)GNN 933 Query: 515 GKDGIAGKDGVGIAATEVMYASSPSATSAPAGGWSTQVPTVPGGQYLWTRTRWRYTDQTD 574 G DGIAGJDGVGI T YA S T APA GW-*+QVP VP GQ LWT+T W noD T Sbj ct: 934 GNDGIAGIWGVG IKKTTITYAVGTSGTTAPASGWNSQVPNVPAGQLWTKTVWTYTDNTS 993 Query: 575 EIGYSVSRMGEQGPKGDAGR- DGIAGKNGIGLKSTSVSYGISPTDSAIP-GVWASQVP 630 E GYSV+ MG +G KGtD G -*GIAGK+G SP P G VP SbjcC: 994 ETGYSVAMI4GVKGDKGDPGNNGTNGIAGKDGKGI KATAITYQAS PNGTTAPTGTWSASVP 1053 Query: 631 SLIKGQYLWTRTIWTYTDSTTSTGYQKTYIPWGNDGKNGIAGKDGVGI KSTTITYAGST 690 KG +LWTRTIWTY~T+TTETGY Y+ .GW+G .G GKDG GIK+TTITYAGST Sbj ct: 1054 PVAKGSFLWTRTIWrYTDNTIETGYAVAYMGTWGNNGHDGFPGKDGTGIKTTTITYAGST 1113 Query: 691 SGTVAPTSWWTSAIPNVQPGFFLWTKTVWNYT0DTSETGYSVSKIGETXXXXXXXXXJOC 750 SGT P lETS +iP V a -tLWTKTVW YTD+TSETGYSV+ +G Sbjct: 1114 SGTTPPNNGNTSTVPTVAEGWYLNTKTVWTYTDNTSETYSVAMMG VKGDKGDP 1167 Query: 751 )XICOCCCCXADGRS -QYTHLAFSNSPNGEGFSHTDSGRAYVGQYQDFNPVMSKDPAAYT 809 DG+ .T SPZTO A GC+ P tIK +T Sbjct: 1168 GNNGTNGIAGKDGKGIKATAITYQASPNGT TAPTCTWSASVPPVAKGSFLWT 1219 Query: 810 KGNVCAQGIPCKPGADGKTNYFI4IAYASSADGS 846 T7W GN+C G PGKC G KT I YA S G+ Sbjct: 1220 RTIWTYTDNTTETGYAVAYMGTNNNGHDGFPGKDGTGIITT--TITYAGS7SGT 1272 Score 396 bits (1007), Expect e-109 Identities 208/449 Positives 260/449 Gaps 42/449 Query: 421 IRLEGPKGDAGLPGAPGRDGVDGVPGKSCVCIADTAITYAVSVSCTQEPENGWSEQVPEL 480 t+CGKGDCG PG +G+G.CGKCGGI TAITY S +GT P WS5 VP Sbjct: 1155 VAJ'I4VKDKG PGNNTNGIAKDGKIKATAITYQASPNCTTAPTGTWSASVPPV 1211 Query: 481 IKGRFLWTKTPWRYTDCSIETYSVAYICQDGNSCKDGIAGKDGVGIAATEVMYASSPSA 540 KG P1147+7 W4 no ETCYtVAY+C +CN+G DG 
aCoaG GI T YA S S Sbjct: 1212 AKCSFLWTRTIWTYTDNrI'ETGYAVAYNCTNGNNGMDFPGKDTIKTT7ITYAGSTSC 1271 Query: 541 TEAPAGGWSTQVPTVPCGQYLWTRTRWRYTDQTDEIGYSVSRJGEQGPKGDAGR DGI 597 T P GW++ VPTV G YLWTtT WYTD TECYSV+ MG--G KGD G tGI Sbj ct: 1272 TTPPNNG14TSTVPTVAEGWYLWTKTVWTYTDNTSETGYSVAbU4GVKCDKGDPGNNGTNGI 1331 Query: 598 ACKNGIGLKSTSVSYGISPTDSAIP-GVWASQVPSLIKQYLNTRTIWTYTDSTTETGYQ 656 AGK+G G+K+Tt...Y SP P G W4++i VP KG tLWTRTIWTYTD)+TTETGY Eblt 22A-r(K-.. kTI.-Q-P--TP---SVPPVAKGc.SrTSJWrT1TTWJ~nTTF-TCYA 1391 Query: 657 KTYI PKDGNDCGcNGIAGI CVGIKSTITYASTSGVAPTSNWTSAIPVQPGFFLWTK 716 Y+ taNta +a axca aIK+TTITYAaSTSCT P 1475 -iP V G tLWTK Sbj ct: 1392 VAYNCTWGNNCHDCFPGKDGTCI KTTTITYAGSTSGTTPPNNGWTSTVPTVAECNYLWTK 1451 Query: 717 TVWNYTDDTSETGYSVSKIGETYYYYYYYYYYYYYYYYYYYYYYAnCRS -QYTHLAFSNS 775 TVW YTD-iTSETGYSV+ DG+ T S Sbjct: 1452 TVWTYTDNTSETGYSVA4MG VKGDKCDPCNNGTNCIAGKDGKGIKATAITYQAS 1505 WO 00/32825 PCT/I 899/02040 399 Query: 776 PNGEGFSHTDSGRAYVGQYQDFNPVHSKDPAAVTWTKW KGND 817 PNG A G P +T TW ON.
Sbj ct: 1506 PNGT TAPTGTWSASVPPVAKGSFLWTRTIWTVTDNTTETGYAVAYMGTNGNN 1557 Query: 818 GAQGI PGKPGADGKTNYFMIAYASSADGS 846 G G PGK G KT I YAS G+ Sbjct: 1558 GHDGFPGKDGTGIKTT--TITYAGSTSGT 1584 Score 384 bits (977) Expect e-105 Identities 179/322 Positives 222/322 Gaps 7/322 Query: 421 IRLEGPKGDAGLPGAPGRDGVDGVPGKSGVGIAD)TAITYAVSVSGTQEPENGWSEQVPEL 480 GKGD G PG +G +G+GCKG GI TAITY S +Gr P WS VP.+ Sbjct: 1311 VANMGVKGDKG DPGNNGTNGIAGKDGKGIKATAITYQASPNGTTAPTGTWSASVPPV 1367 Query: 481 IKGRFLNTKTFWRYTDGSHETGYSVAYIGQDGNSGKDGIAGKDGVGIAATEVMYASSPSA 540 KG FLNT.T W4 YTD ETGY.VAY.G .GN.G DG GIOJO GI T YA S S Sbjct: 1368 AKGSFLWTRTIWTYTDNTTETGYAVAYMGTrNGUNGIDGFPGKDGTGIKTTTITYAGSTSG 1427 Query: 541 TEAPAGGWSTQVPTVPGGQYLWTRTRWRYTDQTDEIGYSVSRM.GEQGPKGDAGR DGI 597 T P VPTV G YLWTT WYTD T EGYSV. MG +GKGD G .GI Sbj ct: 1428 nPPNNGWTSVPTVAEGNYLWTKTVWTYTDNTSETGYSVAOIGVKGDKGDPGNNGTNGI 1487 Query: 598 AGIO(GIGLKSTSVSYGISPTDSAI P-GVWASQVPSL.IKGQYLNTRTIWTYTDSTTETGYQ 656 AGK.G GK.T Y SP P G W4*. VP KG +LWrRTIWrYTD+TTETGY Sbjct: 1488 AGIOGKGIKATAITYQASPNGTTAPTGTWSASVPPVAXGSFLWTRTIWTI'DNTTETGYA 1547 Query: 657 KTYI PKDGNDGKNGIAGIOGVGI KSTTITYAGSTSGTVAPTSNWTSAI PNVQPGFFLWTK 716 Y. .014.0 .0 0100 GIK.TTITYAGSTSGT P '4TS .P V G .LWrK Sbj ct: 1548 VAYMGTNGNNGHDGFPGKDGTGI KTTITYAGSTSGTTPPNGWTSTVPVAEGNYLWTK 1607 Query: 717 TVWNYTDDTSETGYSVSKIGET 738 TVW YTD.. 
ETGYSV K.G T Sbjct: 1608 TVWAYTDNSFETGYSVGKIGNT 1629 Score 201 bits (507), Expect 2e-50 Identities 121/297 Positives a 156/297 Gaps 19/297 Query: 421 IRLEGPKGDAGLPGAPGRDGVDGVPGKSGVGIADTAITYAVSVSGTQEPENGWSEQVPEL 480 +G KGD G PG +G +GGK GGI TAITY S tOT P W4S VP Sbjct: 1467 VA!M1lGVKGDKG-DPNNGTGIAGKDGKGIKATAITYQASPNGTTAPTGTWSASVPPV 1523 Query: 481 I K0RFL'TKTFWRYTDGSHETGYSVAYIGQDGNSGICDGIAGKDGVGIAATEVMYA.SS PSA 540 KG FLNT.T WYTD.+ETGY.VAY.C.+GN.G GGKD10G0 T.+YAS S Sbjct: 1524 AKGSFLNTRTIWTYTDNTTETGYAVAYMGTNGNNGHIJGFPGKDGTGIKTTTITYAGSTSG 1583 Query: 541 TEAPAGGWSTQVPTVPGGQYLW4TRTRWRYTDQTDEIGYSVSRMGEQGPKGDAGRDGIAGK 600 T P GW4-*VPTV G YLNTT WYTD +EGYSV +MG OP AG +G GK Sbj ct: 1584 TTPPNNGWTSTVPTVAEGNYLWTKTVWAYTDNSFETGYSVGKI4GNTGP" AGSNGNPGK 1640 Query: 601 NGIGLKSTSVSYGISPTDSAIPVWASQVPSLIKG-QYLWTRTIWTYTDSTTE--TGYQK 657 G.Y W '4 G Sbj ct: 1641 VVSDTEPTTKFKGLTWKYSGVVDHPLGNGTKI LAGTEYYWNGNNWALYEINAHNINGDNL 1700 Query: 658 TYIPKDGNDGK-NGIAGKDGVGIKSTTITYAGS--TSGTVAPTSNWTSAIPNVQ 708 DGK I G.GV +.TT GS .S5+ TNJT AIN Q Sbjct: 1701 SVTNGTFKDGKIESIWGSNGV NGTTIEGSHLQIMSSDSTTNTEN-TLAIDNRQ 1753 Query= sidIll4823IlanldplORFOO2 Phage dpi 0RF132386-3583511 (1149 letters) sdibiIBAA3l888I (AB009866) orf 15 [bacteriophage phi PVL] Length 694 Score 280 bits (709) Expect a 3e-74 Identities 157/465 Positives 257/465 Gaps =28/465 Query: 40 QIGSALTGLG0KGLTTAVTLPLMGFAAASI KVGNEFQAQMSRVOAIAGATAEELGRMKTQA 99 +G0..T VT A.+K GEF MV.A +GAT EE .K.+A Sbjct: 151 EIGNSNKNVGRNMTI4YVTAPVVAGFAVAAKXGIEFDDSMRKVKATSATGEEFEALCKKA 210 WO 00/32825 PCT/I B99/02040 400 Query: 100 IDoATFAEAGENAAFVEMAP CLXXXXXXXASL 159 .+GA T PSA +A AG. M. GVDL L Sbjct: 211 REMGATTKFSASDSAEALNYMALAGWDSKQMMEGLSGVMDLAAASGEELGAVSDITGL 270 Query: 160 RAFGLEANQAGHVADVFARAAAnTNAETSDMAAMKYVAPVASMGLSLEETASIGIMA 219 AFGL.A ,GHAflV A. N EA KYVAPVA .IG.M+ Sbj ct: 271 TAFGLKAKDSGHLAfVLAQTSSKANTDVRGfLGEAFKYVAPVAGALGYTIEDTSIAIGM 330 Query: 220 DAGIKGSQAGTTLRGALSRIAKPTKANVKSMQELGVSFYDANGNMI PLREQIAQLKTATA 279 tAGIKO .AGT LR PT+AM M. 
LGS D+NG MIP.R. QE..
Sbjct: 331 NAGIKGEKAGTALRTMnrNLSSPTRAMGNEMERLISITDSNGIPRKLDQREKK 390 Query: 280 GLTQEERNRMLVTLYGQNSLSGMLALLDAGPEKLDO TNALVNSDGAA MATMQDNLA 339 L SG LA A E K.T .S GA.K MA.TM. L Sbj ct: 391 HLSKDQQASSAATIFGKE.ASGALAI INASDEDYQKLTKSIDSSTGASKRMADTMESGLG 450 Query: 340 SKI EQMGGAFESVAI IVQQ! LEPALAKIVGAITKVLEAF NMS PIGQKHVVI FAGMVAAL 399 K. .EPAL IV A+KV. Q VV F VA L Sbjct: 451 GKLRTLRSQLEELALTIYDRIEPALKIIVSAFSKVVTWVTKLPTSIQLAVVGFGLFVAVL 510 Query: 400 GPLLLIAGM---VMTTIVKLRIAIQFLGPAFMGTMGTIAGVIAIF------------- 441 GPL..+G. MT.+ LIt+ F F Sbj ct: 511 GPLVFMFGLFISVMGNAMTVPLLINVNKASGLFAFLRTKIASLVLFPIfL4VSISSLT 570 Query: 442--------YALVAV -FMIAYTKSERFRNFINSLaAPAIKAGFGGA 476 AIX.+ F AY.+SE FRN.+N FPA Sbj Ct: 571 LPITLIVGALVGIGIAFYQAYKRSETFRNIVNQAISGVANAFKAA 615 Query= sidjll4824IlanjdplORFOO3 Phage dpi ORF153538-5587713 (779 letters) >sP IP43741 IDPO1_HAEIN DNA POLYMERASE I (POL I) >gi 11074025 1pir11E64098 DNA polymerase I (polA) homolog Haemophilus influenzae (strain Rd 10120) >gij1573871 (032767) DNA polyrnerase I (polA) (Maemophilus influenzae Rd] Length 930 Score 191 bits (481), Expect le-47 Identities 148/553 Positives 262/553 Gaps 60/553 Query: 63 RLELITEEAXLEQYVDnMIEDGIGSIDVETDGLDTIHDELAGVCLYSPSQKGIYAPVNHV 122 E. A ETD LD. LG+ Y P+ Sbj ct: 333 KYETLLTQAflLTRWI EKLNAAJCLIAVDTETDSLDYMSANLVGI SFALENGEAAYLPLQLD 392 Query: 123 SNNTKHRIKNQISPEFMKKZILQRIVDSGI PVIYHNSKFDMKSIYWRLGVKMNEPAWDTYL 182 .K I I N KFD .51. R .DT L Sbjct: 393 YLDAPKTLEKSTALAAIKPILE-- NPNIHKIGQNIKVDESIFARHGIEQGVEFJTML 448 Query: 183 AAMLLNENESHSLKSLHSKYVRNEENAEVAKFNDLFKGI PFSLIPPDVAYNYAAYDPLQT 242 LN L A YAA D T Sbj ct: 449 LSYTLNSTGRRNMDDLAKRYLGHETIAFESLAGKGKSQLTFNQI PLEQATEYAAEDADVT 508 Query: 243 FELYEFQEQYLTPGTEQCEEYiNLEKVSWVLHNIEMPLI KVLFDMEVYGVDLDQDKLAEIR 302 L E Y .EPL. VL ME GV+D D L Sbj ct: 509 MKLQQALWLKLQEEPTLVELYK---------- TMELPLIJHVLSRNERTGVIJIDSDALFMQS 559 Query: 303 EQFTANNNEAEQEFQQLVSEWQPEIEELRQTNFQSYQKLEMDARGRVTVSISSPTQLAIL 362 Es.. L *rs.sS QE. 
Sbjct: 560 PFNLASTKQLQEI 592 Query: 363 FYDIMGLKSPERDKPRG- TGES IVEH FDNDI 5XXXXXXXXX(3XXV5TYT1> LDQHL 416 P.G T E..E STY' LQ.+ Sbj ct: 593 LFDKLELPVLOKT -PKGAPSTNEEVLEELSYSHELPKILVKHRGLSKLKSTYTDKLPQMV 651 Query: 417 AKPDNRIHTTFKOYGAKTGRI4SSE-NPNLQNIPSRGE-GAVVRQIFAASEGHYIIGSDYSQ 475 R.HT.. Q TGR.SS .PNLQNIP R E G .RQ F A EG. I+ ,DYSQ Sbjct: 652 NSQTGRVHTSYHQAVTATGRLSSSDPNLQNIPIRNEEGRIRQAFIREGYSIVAYS0 711-- Query: 476 O)EPRSLAELSGDESMRIIAYEQNLDLYSVIGSKLYGVPYEECLEFYPDGTTNKEGKLRRNS 535 E R +A LSGD 0 GV +E R Sbjct: 712 IELRIMAHLSGDQGLINAFSQGIOIHRSTAAEIFGVSLDE VTSEQ RRN 759 Query: 536 VKSVLWGLMYGRGARSIAEQ~MVSVCEANKVI EDFFT6FPKVADYI IFVQQQAQDLGYVQ 595 GL.YG A Q. .5 +A K .F .P V A. GYV.
Sbjct: 760 AKAINFGLIYGMSAFGLSRQLGISRAlAQKYMDLYFQRYPSVQQFMTDIREKAAQGYVE 819
Query: 596 TATGRRRRLPDMS 608
T GRR LPD++
Sbjct: 820 TLFGRRLYLPDIN 832
Score 46.9 bits (109), Expect 5e-04 Identities 34/123 Positives 66/123 Gaps 16/123 (13%)
Query: 663 El LI K1NGGKIADAQRQCLNSVIQGTAADMTKYAM~IKV 709
N A+R +QGTAAl+ K AZ4IK+
Sbjct: 807 DIREKAKAQGYVETLFGRRLYLPDINSSNAMRRKGAERVAINAPMQGTAALDIIKRAMIKL 866
Query: 710 HNDAELKELGFHLMI PVHDELLGEVPIIQNAKRGAERLTEVMIEAAKDI ISLPMKCDPSIV 769
VHDEL+ EV M EAA
Sbjct: 867 *DEVIRHDPDIEMIMQVHDELVFEVRSEKVAFFREQIKQHM-EAAAELV-VPLIVEVGVG 923
Query: 770 ERW 772
Sbjct: 924 QNW 926

Query= sid|114825|lan|dp1ORF004 Phage dp1 ORF|40401-42440|3 (679 letters)
>emb|CAB07981| (Z93946) hypothetical protein [bacteriophage Dp-1]
Length 532
Score 1011 bits (2585), Expect 0.0 Identities 497/499 Positives 498/499 (99%)
Query: 1 MTKFINSYGPLHLNLYVEQVSQDVTNNSSRVSWRATVDRDGAYRTWTYGNISNLSVWLNG 60
MTKFINSYGPLHLNLYVEQVSQDVTNNSSRVSWRATVDRDGAYRTWTYGNISNLSVWLNG
Sbjct: 1 MTKFINSYGPLHLNLYVEQVSQDVTNNSSRVSWRATVDRDGAYRTWTYGNISNLSVWLNG 60
Query: 61 SSVHSSHPDYDTSGEEVTLASGEVTVPHNSDGTKTMSVWASFDPNNGVHGNITISTNYTL 120
SSVHSSHPDYDTSGEEVTLASGEVTVPHNSDGTKTMSVWASFDPNNGVHGNITISTNYTL
Sbjct: 61 SSVHSSHPDYDTSGEEVTLASGEVTVPHNSDGTKTMSVWASFDPNNGVHGNITISTNYTL 120
Query: 121 DSIPRSTQISSFEGNRNLGSLHTVIFNRKVNSFTHQVWYRVFGSDWIDLGKNHTTSVSFT 180
DSIPRSTQISSFEGNRNLGSLHTVIFNRKVNSFTHQVWYRVFGSDWIDLGKNHTTSVSFT
Sbjct: 121 DSIPRSTQISSFEGNRNLGSLHTVIFNRKVNSFTHQVWYRVFGSDWIDLGKNHTTSVSFT 180
Query: 181 PSLDLARYLPKSSSGTMDICIRTYNGTTQIGSDVYSNGWRFNIPDSVRPTFSGISLVDTT 240
PSLDLARYLPKSSSGTMDICIRTYNGTTQIGSDVYSNGWRFNIPDSVRPTFSGISLVDTT
Sbjct: 181 PSLDLARYLPKSSSGTMDICIRTYNGTTQIGSDVYSNGWRFNIPDSVRPTFSGISLVDTT 240
Query: 241 SAVRQILTGNNFLQIMSNIQVNFNNASGAYGSTIQAFHAELVGKNQAINENGGKLGMMNF 300
SAVRQILTGNNFLQIMSNIQVNFNNASGAYGSTIQAFHAELVGKNQAINENGGKLGMMNF
Sbjct: 241 SAVRQILTGNNFLQIMSNIQVNFNNASGAYGSTIQAFHAELVGKNQAINENGGKLGMMNF 300
Query: 301 NGSATVRAWVTDTRGKQSNVQDVSINVIEYYGPSINFSVQRTRQNPAIIQALRNAKVAPI 360
NGSATVRAWVTDTRGKQSNVQDVSINVIEYYGPSINFSVQRTRQNPAIIQALRNAKVAPI
Sbjct: 301 NGSATVRAWVTDTRGKQSNVQDVSINVIEYYGPSINFSVQRTRQNPAIIQALRNAKVAPI 360
Query: 361 TVGGQQKNIMQITFSVAPLNTTNFTEDRGSASGTFTTISLMTNSSANLAGNYGPDKSYIV 420
TVGGQQKNIMQITFSVAPLNTTNFTEDRGSASGTFTTISL+TNSSANLAGNYGPDKSYIV
Sbjct: 361 TVGGQQKNIMQITFSVAPLNTTNFTEDRGSASGTFTTISLLTNSSANLAGNYGPDKSYIV 420
Query: 421 KAKIQDRFTSTEFSATVATESVVLNYDKDGRLGVGKVVEQGKAGSIDAAGDIYAGGRQVQ 480
KAKIQDRFTSTEFSATV TESVVLNYDKDGRLGVGKVVEQGKAGSIDAAGDIYAGGRQVQ
Sbjct: 421 KAKIQDRFTSTEFSATVPTESVVLNYDKDGRLGVGKVVEQGKAGSIDAAGDIYAGGRQVQ 480
Query: 481 QFQLTDNNGALNRGQYNDV 499
QFQLTDNNGALNRGQYNDV
Sbict: 481 OFQLTDNNGALNRGQYNDV 499 Query- sidI114827j1anjdplORF006 Phage dpi ORF145296-4698712 (563 letters) >gblAADl89871 (AE001666) SWI/SNF family helicase_2 [Chiamydia pneumeniae] Length 1166 Score 171 bits (429), Expect le-41 Identities 150/522 Positives 254/522 Gaps 55/522 WO 00/32825 PCT/I B99/02040 402 Query: 46 SSNNFE -LPYKYFNNVIDALDEWELHI FGELD)KDVQDYIDSRNRIASSSNEQPSFKTTPF 104 S +FE LP I GE++D QD T Sbjct: 659 SLDQFEALPVNF--SMSERLIEIQKQIRGEIEFDFQD---VPQQIQATLRSYQTEG 709 Query: 105 AIQVECFEYAQEHPCFLLGDEQGLG0KTKQAIDIAVSRKASFKH- -CLIVCCISGLKWNWA 162 H +E +1 L D+CGLGKT QAII!AV++ K C C +L NW Sbjct: 710 VHWLE- RLRI@0LNGILADDMGLGKrLOAI -IAVTQSKLEKGSGCSLIVCPTSLVYNWK 766 Query: 163 KEVGIMSNESAHiILGSRVI'KDGKLVIDGV- SKRAEDLLGGHDEFFLITIETLRDAVFIK 221 +t E LVIDGV SR+ L D IT+. L+ V Sbjct: 767 EEFRKF'NPEFR TLVIDGVPSQRRKQLTALADRDVAITSYNLLQIWV--- 812 Query: 222 YLNELTKSGEIGMVIIDEIHKCIG4PSSKQGASIQKLQSYYKMGLTGTPLHNNPIDVFNVM 281 EL KS Vi-tOS H KN iQS LTCTP+ Sbjct: 813 ELYKSPRPDYVVLDSAHHIKNRTTRNAKSVK?4IQSDHRLILTCTPIENSLEELWSLF 869 Query: 282 KWLGAEHHTLTQFKERYCIVDQFNQITGYR--NLAELRELVNDYMLRRTKEEVL -DL 335 iL L i-Ri Vi-ti- y Ni. Vt-i-LRR KEVL DL Sbjct: 870 DFLMPG LLSSYDRF--VGKYIRTGNYMGNKADNMVALKKKVSPFILRRNKEDVLKDL 924 Query: 336 PEKIRVTEYVDMNSKQSKIY---KEVLTKLVQEIDKVKLMPNPLAETIRLRQATGN 388 P Q Ki- L+i-LV*. 
LA RLi-Q Sbjct: 925 PPVSEILYHCHLTESQKELYQSYAASAKQELSRLVKQEGFERIHIHVLATLTRLKQICCH 984 Query: 389 PSILTTQDVK SCKFERCIEIVEECIQQGKSCVIFSNWEKVIEPLAKIL-SKTVKONL 444 P+iI S Kt G V+FS K L Si- Sbj ct: 985 PAIFAXDAPEPGDSAKYDMLMDLLSSLVDSGHKTVVFSOYTKLIIKKDLESRGIPFVY 1044 Query: 445 VTGETADKFNEI EEFMN2HRKAS VI LGTIGALGTGFTLTKADTVIFLDS PWI'RAEKDQAED 504 GT F V L i+AGTG L ADTVI D W Ai-i-QA D Sbj ct: 1045 LDGSTIG4RLDLVNQFNEDPSLLVFLI SLKAGGTGLNLVGADTVIHYDMWWNPAVENQATD 1104 Query: 505 RCHRIGAKSSVTIYTLVAICGTVDERIEDLIERKGELADYIVD 546 R HRIG SV- Y LV T+i-Ei- L RK L Sbjct: 1105 RVHRIGQSRSVSSYKLVTLNTIEEKILTLQNRKKSLVKCVIN 1146 Query= aidjll4828IlanjdploRFOO7 Phage dpi ORF122230-2362113 (463 letters) >gil2444105 (U88974) ORF26 (Streptococcus thermophilus temperate bacteriophage 012051 Length 411 Score 88.9 bits (217), Expect 7e-17 Identities 80/315 Positives 133/315 Gaps -48/315 Query: 139 QGVTLAGI FCDEVALMPESFVNQATGRCSVTCSKMWFSCNPANPNMYFKKNWIDKQVEKR 198 -iGCT G +tE +L E RCS Gi-ti--+ NPIJPNHt Sbj ct: 121 RGFTAFGAYVNEASLANELVFKEI ISRCSCDGARVVWDSNPDNPNHWLNRDYIGKN-DCK 179 Query: 199 ILYLHFTMDDNPSLT DSI KRRYEKMYAGVFRKRFIL4GLWVTAflCLVYSMFNEEQHV 254 F +DDN Li. DSIK K GCF R ILGLW AtC-iYttti-+HV Sbjct: 180 IIDFSFKLDDNTFLSKRYIDSIKAATPK GKFYDRDILGLWrVAEGAIYADYDSKIHV 236 Query: 255 KKLNIEFDRLFVACDFCIYNATTFCLYGFSKRRKRYHLIESYYHSCREA.EEQLTEADVNS 314 E R F Di-C. G- i+tL-i -E tiA Sbjct: 237 VDELPEMKRYFCCIDWCYTHYGSIVIVG-ECVDNNFYLVDCVAAQFKEIDWWVEQA---- 291 Query: 315 NIQFSSVLKTTKEYAWLVDD-IRCKQIEYI ILDPSASAI4IVELQKHPYIAR- 41 P1 371 tK T Y N i-tAR I Sbjct: IPFYAJDSARPEHVARFE-NECFDI 323 Query: 372 I PARNDVTLCISFHAELLAENRFTLDPSNT -HDIDEYYAYSWDSKASQTCEDRVI KEHDH 430 V GI Ai-L E DE YY W t. 
i-D KE D sibj Ct: 324 MUANSV±AuIL±AKLk'KEKKLYVKRGF-VPRFtIJEI YQYKWKk25'i- Utk'Lr.bkUU 360 Query: 431 CMDRNRYACLTDALI 445 i-D RYA +D +I Sbjct: 381 VLDSVRYAIYSDYVI 395 Query= sidlll4829IlanidplORFOO8 Phage dpi ORF149624-5096111 (445 letters) >gbIAAfl199OlI (AF100420( DnaB replication fork helicase (Thermus aquaticus] WO 00/32825 PCT/I B99/02040 403 Length 444 Score 67.5 bits (162), Expect 2e-10 Identities 69/248 Positives 111/248 Gaps 14/248 Query: 147 GERLGISTGFEYYYYYYYYYYYYYXIVIMARPGQGKS -WTIDKI4LATAWIO40HDVLLYS 205 GE G+ TGF+ IhIARP GK+ A KG V tYS Shjct: 178 GEVAGVRTGF'KELDQLIGTLGPGSLNI-IAARPAMGKTAFALTIAQNAALKE0VGVGIYS 236 Query: 206 0EMSEMQVGARIDTILSNVSINSITKGIWN)H4QFEKYEDHIQAI4TEAENSLVVVTPFMI0 265 EM Q+ R+ 0 D F D ++EA TP Sbj ct: 237 LEMPAAQLTLRNNCSEARIDMNRVRLG0QLTDRDFSRLVDVASRLSEAP- IYIDDTPDLTL 295 Query: 266 GKNLTPAILDSMISKYRPSVVGIDQLSLMS -ESYPSREQKRIQYANITMDLYKISAKYG 323 A +5S+ ID L LMS S S E A I+ L G Sbjct: 296 ME--VRARARRLVSQNQVGLIIIDYLQLMSGP0SGKSGENRQQEIAAISRGLKALAARELG 353 Query: 324 IPIVLNVQAGRSAKTEGAESMELEHIAESDGVGQNASRVIANKRD---EKSGILEL 376 IPI+ Q Rt.. 
L ES Q+A V..+RDl EK+GI E+ Sbj ct: 354 I P1IALSQLSRAVEARPNKRPMLSDLRESGSIEQDADLVMFIYRDEYYNPHSEKAGIA.EI 413 Query: 377 SVVKNRYG 384 V KR G SbjCt: 414 IVGKQRNG 421 Query= sidlll4S3l1lanidplORF0lO Phage dpi 0RF18699-985912 (386 letters) >gi12760912 (AF037258) RecA protein (Chiorobium tepidumi Length 346 Score 133 bits (331) Expect 2e-30 Identities 99/340 Positives 164/340 Gaps 66/340 (19%) Query: 44 GGLPRKRVEFFGPESS0KSALDIVO4AQMVFXzXflzzuzutnzXrz::ARASKASKT 103 00LPR RV E +GPESSGKfl AL AQ Sbjct: 67 GGLPR0RVTEIYGPESSGKTTLALHAIAE-AQ----------------------------1010 100 Query: 104 AVKELEMQLDSLQEPLKIVYLDLEI(TLDTEWAKXI0VDVDNIWIVRPEMNSAEEI LQYVL 163 L +D E+ D .A+KGVD++ PE S E+ L V Sbjct: 101 VDAEHAFDPTYARKLGVDINALLVSQPE--SGEQALSIVE 143 Query: 164 DIFET0EVGLVVLDSLPYMVSQNLIDEELTKKAYAGISAPLTEFSRKVTPLLTRYNAIFL 223 +0 V ++VtDS+ +V Q E+ RK.T i L Sbjct: 144 TLVRSGAVDIIVIDSVAALVPQAEIJEGEMGDSVVGLQARLM4SQALRKLTGAISKSSSVCL 203 Query: 224 GINQIREDMNSQYNA-YSTPGGKMWKHACAVRLKFRKGDYLDENGASLTRTARNPAGNVV 282 INQ+iR+ Y. TGGK K .VRL RK +4+0G L ON Sbjct: 204 FINQLRflKIGVMYGSPETTGGKALKFYSSVRLDIRKIAQI-KDGEELV---GNRT 255 Query: 283 ESFVEKTKAPKPDRKLVSYTLSYHDGIQIENDLVDVAVEFGVIQKAGAWFS IVDLETGEI 342 V K K P K Y +GI +L+D.AVEFG+I+K+GAWFS 0 Sbjct: 256 KVIVVKNKV-APPFKTAEFDILYGEGISVLGELIDLAVEFGIIKKSGAWFSYGTEKLG-- 312 Query: 343 RTDEDEEPLKFQGKANLVRRFKSDDYLFDMVMTAVEIIT 382 QG. KED. L T Sbjct: QGRENVKKLLKEDETLR.NTIRQQVRDMLT 341 Query= sid11l4832I1anldplORFOII Phage dpi 0RF128017-2909613 (3S9 letters) >gij 2444110 (088974) ORF31 (Streptococcus thermophilus temperate bacteriophage UL4Ufji Length 348 Score 187 bits (469). 
Expect le-46 Identities 118/358 Positives 187/358 Gape 21/358 Query: 3 IYDYINAEIASYIQALPSNALQYLGPTLFPNAQQTGTDISWLKGANNLPVTIQPSNYDA 62 IYD A IA Y AL N LG +.iFP +Q GT .S.+KGA* +D Sbj ct: 4 IYDKVTASNIAGYFNALQENVSSTLGES IFPARKQLG0TKLSYI KGASGQSVAIJKAAAFDT 63 Query: 63 KASLRERAGFSKQATEM4AFFRESMRLGEKDRQNLQMLLNQSSA-LAQPLITQLYNDTOJL 121 WO 00/32825 PCT/I 899/02040 404 .14R FF.E.M E DRQ L +A L 44ND L Shi ct: 64 WVTIRDRVSAENOIDEOMPFFKEAMLVKENURQQLNLVKDSGNAVLVNTIVAGI FNDNLTL 123 Query: 122 VDGVEAQAEY?4RMQLLOYGKFTVSTNSEAQYTYDYNIIDAKQQYAVTKKWTNPAESDPIA 181 V.,G A+ EMRHQi-L GK S Y D K.Q V+K W P PA Sbj ct: 124 VNGARARLEA24RIQVLATGKIAFTSDGVNIDIDYCVKPDHKXQ -VSKS WAPG -ATPLA 180 Query: 182 DI LAAMDDI ENRTGVRPTRMVLNRNTYNQMTKSDS IKXAL -AIGVQGSWENFLLLASDAE 240 D+ G+ P R VN TN- K+S K. O S i-E Sbj ct: 181 DLEDAI -ETARELG;LNPERAVMNAKTFOLIRKAASTVKVIKPLAODGS---- -AVTKAELE 235 Query: 241 KPIAEKTOLQIAVYSKKIAQFADADKLPDVONI RQFNLIDDOKVVLLPPDAVGHTWYGTT 300 +IAs- G0I DOG +F DG +L+P .G.T .OTT Sbjct: 236 NYIADNFGVSIVLENOTYRN--------- DKOEVSKF--YPDOHLTLIPNOPLGNTVFGTT 285 Query: 301 PEAFD1LASOOT -DAQVQVLSOOPTVrTYLEKHPVNIATVVSAVMI PSFEGIDYVOVLT 357 PE DL T.AV.. 0 VTT PVN T VS V PSFE-i-D V LT Shi ct: 286 PEESDIJFADNTVNAEVEIVDNOIAVrrTKTTDPVNVQTKVSMVALPSFERLDDVYMLT 343 Query= sidjll4834jlanjdplORFO13 Phage dpi ORF110215-1124013 (341 letters) .spIP09122IDP3XBACSU DNA POLYHERASE III SUBUNITS GAM-MA AND TAU) Length 563 Score 182 bits (458). Expect 2e-45 Identities 118/353 Positives 176/353 Gaps 31/353 Query: 7 YRPQTPEEVVAQEYVKEILLNQLQNGAI KHOYLFCXXXXXXXXGXXRIFAKDVN .RPQ FEi-VV L N L H YLF .IFAK VN Sbj ct: 10 FRPQRPEDWGCQEHITKTLQNALLQKCFSHAYLFSPRTGKTSAAKIACAVNCEHAPV 69 Query: 61 SPIEIDAASNNOVENVRNI IEDSRYKSMDSEFKVYIIDEVH 105 KG. IEIDAASNNOV. 
i-R+I i-KVYIIDEVH Sbj ct: 70 DEPCNECAACKIThOSISDVI EIDAASNNOVDEIRDIRDKVKFAPSAVTYKVYI IDEVH 129 Query: 106 MLSTGAFNALLKTLEEPSSOTVFI LCTTDPQKI PDTILSRVQRFDFTRIDNDDIVNQLQF 165 GAFNALLKTLEEP i-FIL TTi.P KIP TI.SR QRPDP RI IV Sbjct: 130 HLSIGAFNALLKTLESPPEHCIFILATTEPHKIPLTIISRCQRFDFKRITSQAIVRMNK 189 Query: 166 I IESENEEOAGYSYERDALSFIOKLANOOI4RDSITRLEKVLDYSHHVDMEAVSNAL -G 222 IS L I A.GI4RD... i-S Di- V -iAL C Sbj Ct: 190 IVDAEQ--LQVEEOSLEI IASAAHGGMRDALSLLDQAISFSO DILKVEDALLITG 242 Query: 223 VPDYETFASLVEAIANYDOSKCLEIVNDFHYSOKDLKLVrRNPTDFLLEVCKYWLVRDI 5 282 LE 010 Sbj ct: 243 AVSQLYIOKLAKSLHDKNVSDALETLNELLQQGKDPAXLIEDMIFYPRDMLLYKTAPOLE 302 Query: 283 ITQLPAHPESKLEQFCEAFQYPTLLWMLEEMNELAOVVKWSPNAKPI IETKLL 335 E L M-1i-i -aNi- +KW E ti- Sbj ct: 303 OVLEKVKVDETPRELSEQIPAQALYEMIDILNKSMQEMKWTNNPRIFFEVAvv 355 Query= sid1ll4a3S1lanIdplORP0l4 Phage dpi 0RF150961-5197413 (337 letters) >sP IP474 92I1PRIMLHYCOE DNA PRIMASE >siI113614 96I1pirIIF64227 DNA primase (dnaE) homolog MG0250 Mycoplasma genitalium (SOC3) >gij 3844848 (U39704) DNA primase (dnas) (Mycoplasma genitaliun] Length 607 Score 57.0 bits (135), Expect 2e-07 Identities a 53/190 Positives 89/190 Gaps 17/190 Query: 146 RRLDKY~rTI4---- VwVrPVT.rnrT.TRM~rn VnV-- T-unrT'rVrnVMT.fl'VTrWt- E Y FI+P K +PD K +1I Pa-+ 0 V F Sbjct: 170 ESMERYPPINPKIKPSELYLFS-KTNQQGLGPFDFNTKKATPQNQIMIPIHDFNGNPVOF 228 Query: 197 NRRSVRSKFHQYGEDDPKTEFLYOYELVAFRDYFEKPISQVFVTSVINCLTLWSMcIP 256 RSV i-a EL+ K-i--iQ+iFi-E TL +Ksbj ct: 229 SARSVDNIHKLKYIOISADHSF -FKKOSLLFNFHRLNKNLNQLFIVEOYFDVPTLTNSKFS 287 Query: 257 AVALMGVOGN-QINLLKR- -LPYRNIVLALDPDNAOQTAQSKLYRQLKRSK-VVRFLNY 312 AVALMO. +*QI iK +i-VL.ALD Di-GQ A L +L i-V++ Sbjct: 288 AVALHOLALNDVQIKAIKAJIFKELQTLVLAILDNDASOQNAVFSLISKLNNNNFIVEIVQW 347 WO 00/32825 PCT/I B99/02040 405 Query: 313 PKEFYDNKWD 322 D
WD
Sbjct: 348 EHNYKD--_WD 355 Query= sidlll4837IlanIdplORF0l6 Phage dpi ORF143413-4430313 (296 letters) >embiCAB07986 I (293946) N-acetylmuramoyl-L-alanine amidase [bacteriophage Dp-l] Length 296 Score 661 bits (1686), Expect 0.0 Identities 296/296 Positives 296/296 (100%) Query: 1 MGVDI EKGVAWMQARKGRVSYSMDFRDGPDSYDCSSSYYALRSAASSAGWAVNTEY4H MGVDI EKGVAWMQARKGRVSYSMDFRflGPDSYDCSSSMY-YALRSAGASSAGWAVNTEYMH Sbj ct: 1 MGVDIEKGVAWNQARKRVSYS'WFRflGPDSYDCSSSMYYALRSAGASSAGWAVNTEYMH Query: 61 AWLIENGYELISENAPWDAKRGDIFIWGRKGASAGAGGHTGMFIDSDNIIHCNYAYDGIS 120 AWLIENGYELI SENAPWDAKRGDI FlWGRKGASAGAGGHTGMFIDSDNI IHCNYAYD)GIS Sbjct: 61 AWLIENGYELISENAPWDAKRGDIFIWGRKGASAGAGGHTGMFIDSDNIIHCNYAYDGIS 120 Query: 121 VNDHDERWYYAGQPYYYVYRLTNANAQPAEKKLGWQKDATGFWYARANGTYPKDEFEYI E 180 VNflRDERWYYAGQPYYYVYRLTNANAQPAEKXLGWOKDATGFWYARANGTYPKDEFEYIE Sbjct: 121 VNDKDERWYYAGQPYYYVYRLTNANAQPAEKKLGWQKDATCFWYARANGYPKWEFEYIE 180 Query: 181 ENKSWFYFDDQGYMLAEKWLKMTDGNWYW'FDRflGYMATSWKRIGESWYYFNRDGSMVTGW 240 E-NKSWFYFDDQGYMLAEKWLO{TDGNWYWFDRflGYMATSWKRIGESWY'YFNRDGSMVTGW Sbj ct: 181 ENKSWFYFDDQGYMLAEKWLKHTDGNWYWFDRDGYMATSWKRIGESWYYFNRDGS4VTGW 240 Query: 241 I KYYDNNYYCDATNGDMKSNAFIRYNDGWYLLLPDGRLADKPQFTVEPDGLITAKV 296 I KYYDNWYYCDATNGDMKSNAFI RYNDGWYLLLPDGRLADKPQFTVEPDGLITAKV Sbj ct: 241 IKYYDNWYYCDATNGDMKSNAFIRYNflGWYLLLPDGRLADKPQFTVEPDGLITAKV 296 Query= sid11l48411landplORF020 Phege dpi ORF11864-265811 (264 letters) >emb1CAB132471 (Z99111) similar to coenzyme PQQ synthesis [Bacillus subtilis] Length 243 Score 211 bits (548). 
Expect Se-SE Identities 117/248 Positives 163/248 Gaps 15/248 (6%4) Query: 23 MPIMEIFGPTIQGEGMVIGQKTIFIRTGGCDYHCNWCDSAFTWNGTTEPE YITGKEA.A 4P4+4EIFGPTIQGEGMVICQKT+F+RT GCDY C+WCDSAFTW+G+ +E Sbjct: 5 IPVLEIFGPTIQGEMVIGQKTFVRTAGCDYSCSWCDSAFTWDGSAKKDIRWM4TAEEIF 64 Query: 81 SRILKLAFNDKGEQICNNVTLTGGNPALINEPMAKMISILKEHGFKFGLETQGTRFQEWF 140 D G +HV'r++GGNPAI,+ I tLKEt LETQGT *Q+WF Sbjct: 6S AEL CDIGGDAFSHVTISGGNPALLKQ-LDAFIELLKENNIRAALETQGTVYQDWF 118 Query: 141 KEVSDITISPKPPSSGMRTNMKILEAIVDRN--NDENLDWSFKIVIFDENDLAYARflMFK 198 +D+TISPKPPSS M TN L+ I+ ND S K+VIF++ DL K Sbjct: 119 TLIDDLTISPKPPSSKVNFQKLDHILTSLQENDRQHAVSLKVVIFNDEDLEFAKTVHK 178 Query: 199 TFEGKLRPVNYLSVGNANAY- -EEGKISDRLLEKLGWLWDKVYEDPAFNNVRPLPQLHTL 256 +C YL VGN LLIK L DKV D N VR LPQLHTL Sbj ct: 179 RYPO -I PFYLQVGNDDVHTTDDQSLIAHILLGKYEALVDKVAVDAELNLVRVLPQLHTL 235 Query: 257 VYDNKRCV 264
NKRGV
Sbjct: 236 LWGNKRGV 243 WO 00/32825 PCT/I B99/02040 406 Query= sid11l484211anldplORF021 Phage dpi 0RF12504-329512 (263 letters) >spIP19465iGCH1_BACSU GTP CYCLOHYDROLASE I (GTP-CH-I) >giI9S4ll1pirlI1A38256 GTP cyclohydrolase I (EC 3.5.4.16) Bacillus subtilis >giJ 143231 (M37320) regulatory protein (Bacillus subtilis] >gi1143799 (M80245) MtrA (Bacillus subtilis] >gil2634696jeibICAl141941 (Z99115) GTP cyclohydrolase I (Bacillus subtilis] Length 190 Score 208 bits (523), Expect 4e-53 Identities 103/185 Positives 133/185 (71%k) Gaps 1/185 Query: 80 VTLIDNTEAAVQRLFGLLGEDAERflGLQDTPFRFVKAIJAEHTVGYREDPKLHLEKTFDVDH 139 V E +GED R+GL DTP R KC AE G EDPK H+ F +H Sbjct: 4 VNKEQIEQAVRQILEAIGEDPNREGLLDTPKRVAKMYAEVFSGLNEDPKEHFQTIFGENH 63 Query: 140 EDLVLVKDIPFNSLCEHHLAPFVGKVHIAYIPKD- KITGLSKFGRVVEGYAKRLQVQERL 198 E+LVLVKDI F+S+CEHRL Pr GKC H+AYIP+ K+TGLSK R VS AKR Q.QER+ Sbj ct: 64 EELVLVKDIAFHSMCEHHLVPFYGKAHVAYI PRGGKVTGLSKLARAVEAVAKRPQLQERI 123 Query: 199 TQQIADAIQEVLNPQAVAVIVEAEHTCMSGRGIKGIGATTVTSTMRGLFQDDASARAELL 258 T IA++I E L+P V V+VEAEH CM+ RG++K GA 'rVTS +RG+F+DDA+ARAE+L Sbj ct: 124 TSTIAESIVETLDPHGVMVVVAEHMCMTMRGVRKPGAKTVTSAVRGVFKDDAAARAEVL 183 Query: 259 QLIKK 263
IK+
Sbjct: 184 SHIKR 188 Query= sidjll4843IlanjdplORFO22 Phage dpi 0RF130896-3167512 (259 letters) >gi12347102 (U77367) internalin (Listeria nonocytogenes] Length 821 Score 55.0 bits (130), Expect 5e-07 Identities 44/149 Positives 63/149 Gaps 13/149 Query: 119 FRMNIYVPNYVG -DSIVNYVKITLNNCTGKAPGLS IGKEFYAPSFNI KAREATKAGLPV 176 F VPN D NN T AP L Y PE *K Sbjct: 383 FSKTLSVPNNITSIDGTLIAPSTISNNGTYflAPNLKWSLPNYLPS--VICYTFSQKIPIGT 440 Query: 177 KSMDYVAQLPAVLR--RVTFDLNGGTGTADAVRVEAGKKISPKPVDPTLTGKAFKGW 231 +Y L+ +VTF+ G T V E P+P PT G F GW Sbjct: 441 GTSNYSGFITQPLKELLDYKVTFNVEGNTSEVETVTS NLIPEPTSPTKQGYTFDGW 497 Query: 232 -KVEGSSTIWDFDNHMMPDRDVKLVAQFA 259 S T WDF MP D+ LA F+ Sbjct: 498 YDAETGGTKWDF7TGQMPANDLTLYAI4FS 526 Score 43.4 bits (100), Expect 0.002 Identities 47/195 Positives 73/195 Gaps 12/195 Query: 72 YDLTFKDNTFDPSIMAII GGTVRQQGGTIAGYDT -PMLAQGASNMIPFRNNIYVPNY- 128 YD T +G GG T M A F +N Y N1+ Sbj ct: 547 YDALLNEPTTPTCQGYTFDGWYDAETGGNCWDFCTMIC1PANDVAFYAHiFTINNYQANFDI 606 Query: 129 VGDSIVNYVICITLNNCTGICAPGLS IICEFYAPEFNIKIAREATICAGLPVICSMDYVAQL 185 V Y T G +tA K TIC +P A Sbj ct: 607 DGSVICNSTIAYDTLLNEPTTPTCQGYTFDGWYDAETGGTCWDFCTE- MPANflVTLYAHF 665 Query: 186 PAVLRRVTFDLNGGTGTAflAVRVEAGKXI SPCP VDPTLTGCAFCGW- )VEGESTIWDFDN 244 FD++G T +V +A +P+2 F+ TU +GW E T WDF Sbjct: 666 TINNYQANFDIDGAV-TEEVVNYDA LIPPTSPSTGFTLGWYDAVGGTCWDFKT 721 Query: 245 HMNPDRDVICLVAQFA 259 MP D+ L AF4 Sbjct: 722 MIQIPANDITLYANFS 736 Score 38.3 bits Expect 0.057 Identities 42/169 Positives 59/169 Gaps 10/169 WOOO0/32825 PCT/I 899/02040 407 Query: 96 QQGGTIAGYDT PMLaAQGASNMKPFRNNI YVPNYVGOS IVNYVKIT LNNCTGKAPG 150 +GGT T M A P +N Y N+ Dt+V LN T Sbj ct: 501 £TGGTKWDFTTGQIPANDLTLYAMPSVNSYQANFDIDGVVTNEAVVYDALLNEPrI'PTKQ 560 Query: 151 LS IGKEFYAPEFNIKARATKAGLPVKSMDYVAQLPAVLRRVTFDLNGGTGTADAVRVEA 210 E +P +tA FD+a-G A Sbj ct: 561 GYTFDGWYDAETGGNKWDFKTMKMPANDVAFYAFTINNYQANFDIDGEVKNETI A 616 Query: 211 GKKISPKPVDPTLTGKAPKGW- KVEGESTIWDFDNI{MPDRDVKLVAQF 258 +P PT G F GW E T WDF MP DV L AF Sbj Ct: 617 
YDTLLNEPTTPTKQGTFDGWYAETGGTCWDFKTKEMPANDVTLYAHF 665 Query= sidI14850I1anldp1ORF029 Phage dpi 0RF1662-134812 (228 letters) >gi12650185 (AE001074) succinoglycan biosynthesis regulator (exaB) [Archaeoglobus fulgidus] Length =239 Score -119 bits (295), Expect 2e-26 Identities 19/224 Positives 113/224 Gaps 11/224 Query: 1 MKSVVLLSGGVDSATCLAIEVDK4GSKNVHAIAFNYGQKHEAELENAANVA4FYGVKFTI MK+V+LLSGG+DStT L .D G VHA. F YCQIGI EtEtA VA V+ Sbj ct: 1 MKAVMLLSGGIDSSTLLYYLLD--GCYEVHALTFFYGQKHSKEIESAEKVAKAAXVRHLK 58 Query: 61 LEIDSKIYXXXXXXLLQCKGEI SMGKSYAEILIAEICEVVDTYVPFRNGLMLSQXXXXXXXX 120 ++tISI+ L G+ E Y+E T VP RN-s.LS Sbj ct: 59 VD STIHOLISYGALTGEEEVPKA- FYSEEVQRR- -TIVPNRNMILLS -IAAGYAV 1-10 Query: 121 I:C:OEXflUOXrnSJOCXtPCTPEFYNSMSNAEYT-GGKVTLVAPLLTLTKAQVVKW 179 PDC EF At V +AP TKA.V+ Sbj ct: 111 ICIGAKEVHYAAHLSOYSIYPOCRKEFVKALDTAVYLANIWTPVEVRAPFVDMTKAflIVRL 170 Query: 180 GIOLOVPYFLTRSCYESOAESCGTCATCIDRKKAFEENCMTOPI 223 G+ L VPY LTf SCYE C +C TC+tR tAP NG. OP.
Sbj ct: 171 CLKLCVPYSLTWSCYECCORPCLSCCTCLERTE-AFLANCVIWPL 214 Query- sidIll485SjlanjdplORFO34 Phage dpl ORF1131-65212 (173 letters) >eslbiCABl3248I (Z99111) similar to hypothetical proteins [Bacillus subtilis] Length 165 Score 220 bits (556) Expect 4e-57 identities 103/139 Positives 117/139 (84%) Query: 5 TTRTOAELTCVTLLCNQOTKYOYOYNPOVLETFPNKHPENNYLVTFOCYEPTSLCPKTCQ 64 TRr +.EL CVTLLCNQ T Y POVLE.PPNKH i-Y V P. EPTSLCPKTCQ Sbj ct: 2 TTRKESELECVTLLCNQCTNYLPEYAPDVLESPPNK4VNRDYFVKFNCPEPTSLCPKTCQ 61 Query: 65 PDPANVFISYIPNEUIVESKSLKLYLSRNCOPHEOCMNIILNLYELMEPKYIEVMC 124 POPA .+ISYX Pi-EIUVESKSLKLYLPSPRNHCDPHEOCMNII tHOL ELM.P.YIEV C Shi ct: 62 POPATIYISYIPEKVESKSLKLYLSFRNHDHECMMI IMNflLIELMDPRYIEVWC 121 Query: 125 LPTPRCCISIYPPVNKVNP 143 PTPRCCISI P. N P Sbjct: 122 KPTPRCCISIOPYTNYCKP 140 Query= sidIll487landplORO36 Phage dpi 0RF148808-4936211 (184 letters) >gil1353529 (U38906) 0RF12 [Bacteriophage rnt] Length 296 Score 53.5 bits (126), Expect le-06 Identities 42/149 Positives 70/149 Caps 9/149 Query: 34 IASNTVCNCKTSWAVRLLQRYLAETALOCRIVEKCMPVVSAQLLTEPCDYNYPQTMQEPL 93 +S CCK. A+ L T L ti- V Sbjct: 155 VVSCPACTGSHLASILKCLQHTDLT--VIFASWSEVLHLIKlSPONIDSFYSTEYFM 212 WO 00/32825 PCT/I B99/02040 408 Query: 94 ERFERLKTCELLVIDEIGGGSLTK.ASYPYLYD)LVNYRVDNNLSTIYTTNYTDDEIIDLL4G 153 E F .LLVID+IG S L R TI TTN DEI Sbjct: 213 EVF RRTDLLVIDDIGSEKITEWSMSLLTEVLDART KTIITTNLKSDEIRKKYH 265 Query: 154 QRLYSRIYDTSVVLDFQASNVRGLEVSEI 182 R YSR++ F VS++ Sbjct: 266 NRTYSRLFRGIGKKAFNFENIKDKRVSQL 294 Query= iidJll4SS9IlanIdplORFO3B Phage dpi ORF11350-187113 (173 letters) >BPI P44123 IYB90J4AEIN HfYPOTHETICAL PROTEIN H11190 >9gi110746751Pir1 IF64021 hypothetical protein HI11190 Haemophilus influenzae (strain Rd 14W20) >giI1574117 (U32798) 6-pyr-uvoyl tetrahydrobiopterin synthase, putative (Haemophilus influenzae Rd) Length 141 Score 100 bits (247). 
Expect 6e-21 Identities 59/143 Positives 83/143 Gaps 10/143 Query: 2 RVSKTLTFDAAHQLVGHFGKCANLHGHTYKVEISLAGGTYDHGSSQGI4VVDFYHVKKIA- ++SK +FD AH L OH GKC NLHGHTYK+ G Y G+ H4V+DF +K4 I Sbj ct: 3 141SKEFSFDMARLLDGHDGKCQNLHGHTYKLQVEI SGDLYKSGAKKAMVIDFSDLCS IVK 62 Query: 61 GTFIDRLDHAVLL-QGNEP- IALANAVDTKRVLFGFRTTAENHSRFLTWTLTELWK 115 +D +OHA +0 NE L K FRTTAE L Sbj ct: 63 KVILDPMDHAFIYDQTNERESQIATLLQKLNSKTFGVPFRTTAEEIARFIFNRLKM- -DE 120 Query: 116 HARIDSIKLWETPTCCAECTYYE 138 1SILWETPT CY E Sbjct: 121 QLSISSIRLWETPT--SFCEYQE 141 Query= sidIll46O1lanIdplORFO39 Phage dpi ORF13306-380313 (165 letters) >embtCAA6B244I (X99978) ORF7; hydophobic protein [Lactobacillus plantarum] Length 168 Score 64.4 bits (154), Expect Identities 49/156 ,Positives 84/156 Gaps 9/156 Query: 8 WLVRTALIAALYVTLTVAFSAISY- -GPIQFRVSEALLILLPLWNHRWTPGIVLGTIIANF AL+AA+YV L+ +A S G IQFRVSE L L GIV G I++ Sbj ct: 9 WI IN-ALVAAI4YVVLCLGPAAFSLASGAIQFRVSEGLNHLAVFNRKYI 1401VAGVI LFDA 67 Query: 66 FSP-LGLIDVLFGSLATFLGXXXXXXX000CSPLYSLICPVLA -NAYLIALELRIVY 120 FPP L++VLFG L +A ++IAL Sbj ct: 68 FGPGASLLNVLFGGGQSLLALLLVLTWLAPKLKTVWQRNLLNIALFTVSMFMIALHITMNS 127 Query: 121 S-LPFWESVIYVGISEAIIVLISYFLISTLAKNNRF 155 S FW .*SE II+ I+ +L HF Sbjct: 128 SGVAFWPTYLTTALSELIIMSITAPIMYSLDRVLHF 163 Query. 
sidIll4862lanIdplORFO4l Phage dpi ORF18208-869913 (163 letters) )gi12522313 (AF012906) dtITPase homolog [Bacillus subtilis] >gi126343941embiCA9138931 (Z99114) similar to deoxyuridine 5' -triphosphate nucleotidohydrolase [Bacillus subtilis] >gi13025643 (AF020713) putative dtiTPase [Bacteriophage SPBc2] Length 142 Score 106 bits (267), ixpecc =2c1 Identities 65/160 Positives a 83/160 Gaps 25/160 Query: S VDVKMIDPKLDRLKYT GDWVDVRIS55ITKIDADSA2VSRCRKVLQKAQVYSVAAGECI 62 +K +D at GDWD.R.+ I D SbjCt: 3 IKIKYLDETQTRINCNEQGDWIDLRAAEDVAICKDEFKL----------------------- 41 Query: 63 KIAHGFALELPKGYEAI LHPRSSLFKKTGLI PVSS -GVIDEGYKGDTDEWFSVWYATRDA 121 G A+ELP.GYEA PaSS +K4 G+I +S GVIDE YKGD D WF YA RD Sbj ct: 42 -VPLGVANELPEGYEAHVVPRSSTYKNPGVIQTNSNGVIDESYKGDNDFWFFPAYALRDT 100 WO 00/32825 PCT/I B99102040 409 Query: 122 DIYQIORQKPIKNVSGARGGT 161 1 RI QFRI +K PA+ V+ LGN RGGHGSTG Sbjct: 101 KIKKrCDRICQFRIMKKMPADLIEDRLNGDRGGHGSTG 140 Query= sidII14867I1anldpi0RF0 4 6 Phage dpi ORF142774-432021 3 (142 letters) ,embjCABO7984I (Z93946) hypothetical protein [bacteriophage Dp-1] Length 142 Score 287 bits (728), Expect 2e-77 Identities 142/142 (100%) Positives 142/142 (100%) Query: 1 MPMWLNDTAVLTTI ITACSGVLTVLLNKLFEKNKA LDI STTLSTI4CQQVDGIDQ MP?4WLNDTAVLTTI
ITACSGVLTVLLNKLFEWKSNKAKVLEOISTILSTLKQQVDGIDQ
Sbj ct: 1 MPMWLNDTAVLTTI ITACSGVLTVLLNKLF WKNKAXSVLEISTTLSTLKQGDIDQ Query: 61 T VAINHQNDVIQDGTRKIQRYRL''DLREITGYTLDHFRELS ILFESYKN~LiGGNGE 120 TTAUQDIDTKQYLHLRVIGTLHPESLEYN4GG Sbj ct: 61 TVAINQNDVIQDGTRKIQRYRLYHDLKREITGYTITLDHFREIS ILFESYKqLGGNGE 120 Query: 121 VE.ALYEKYKLPIREEDDETI 142
VE.ALYEKYKKLPIREEDLDETI
Sbjct: 121 VE.ALYEKYXKLPIREEDLDTI 142
Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1 (89 letters)
>emb|CAB07983| (Z93946) hypothetical protein [bacteriophage Dp-1] Length 124 Score 147 bits (367), Expect Identities 75/75 Positives 75/75 (100%)
Query: 1 MLLKRIAFIQAKLKTVIAASVELiPLANRLA
Sbjct: 1 MLNLTKSRQIVAEFTIGQGAEKKLVKT DNASVSTHDDYANRRL
Query: 61 EQKLRETRYAIEDEI
EQKLRETRYAIEDEI
Sbjct: 61 EQKLRETRYAIEDEI
Query= sid|114912|lan|dp1ORF091 Phage dp1 ORF|43189-43413|1 (74 letters)
>emb|CAB07985| (Z93946) holin [bacteriophage Dp-1] Length 74 Score 63.2 bits (151), Expect 2e-10 Identities 34/74 Positives 34/74
Query: 1 MKSNZYXX XAIUJ'JUUJ.JLJU- XYQF XXXXXXXXXXLVS MKLSNEQYD YQFD
VLGVSSR
Sbj ct; 1 MKLSNEQYDVAKNVV VVPA PTLAAIQFITTYAITGTILTFAGTVLVSS Query: 61 NYQKEQEAQNNEVE 74
NYQKEQEAQNNEVE
Sbjct: 61 NYQKEQEAQNNEVE 74
Condensed listing of homology information from above
Phage: dp1 Database: nr Program: Blastp
Query= sid|114822|lan|dp1ORF001 Phage dp1 ORF|36698-40390|2 (1230 letters)
gi|2444124 (U88974) ORF45 [Streptococcus thermophilus temperate 467 e-130
gi|928828 (L44593) ORF1904; putative [Lactococcus lactis phage B 427 e-118
gi|2935676 (AF032121) unknown [Streptococcus thermophilus bacter... 309 1e-82
gi|2935691 (AF032122) unknown [Streptococcus thermophilus bacter... 306 7e-82
gi|3540289 (AF057033) putative anti-receptor [Streptococcus ther... 279 6e-74
gi|4530154|gb|AAD21894.1| (AF085222) putative tail-host specific... 220 3e-56
gi|930045|emb|CAA33387| (X15332) alpha-1 (III) collagen [Homo sa... 58 4e-07
gi|1070603|pir||CGHU7L collagen alpha 1(III) chain precursor h... 58 4e-07
gi|4502951|ref|NP_000081.1|PCOL3A1| collagen, type III, alpha 1 58 4e-07
gi|115290|sp|P04258|CA13_BOVIN COLLAGEN ALPHA 1(III) CHAIN >gi|7 58 4e-07
gi|575322|emb|CAA36279| (X52046) type III collagen [Mus musculus] 57 8e-07
gi|2119163|pir||S59856 collagen alpha 1(III) chain precursor 57 8e-07
gi|543912|sp|P13941|CA13_RAT COLLAGEN ALPHA 1(III) CHAIN >gi|543... 57 1e-06
gi|3171998|emb|CAA06510| (AJ005395) collagen alpha 1 (III) [Ratt... 57 1e-06
gi|3947565|emb|CAA90250| (Z49967) similar to collagen; cDNA EST 54 7e-06
gi|423403|pir||A46053 bullous pemphigoid antigen, BPAG2, type XV 53 9e-06
gi|115410|sp|P12114|CCS1_CAEEL CUTICLE COLLAGEN SQT-1 >gi|844371 53 9e-06
gi|387380|emb|CAA90084| (Z49907) cuticle collagen SQT-1; cDNA E... 53 9e-06
Query= sid|114823|lan|dp1ORF002 Phage dp1 ORF|32386-35835|1 (1149 letters)
gi|3341922|dbj|BAA31888| (AB009866) orf 15 [bacteriophage phi PVL] 280 3e-74
gi|4126622|dbj|BAA36642.1| (AB016282) ORF36 [bacteriophage phi-105] 232 1e-59
gi|1369948|emb|CAA59194| (X84706) host interacting protein [Bact...
201 3e-50 gi139i12 (AF063097) gpT [Bacteriophage P2] 188 2e-46 giJ333727 2 (U32222) G protein [Bacteriophage 186] 161 3e-38 giJ4O63799ldbjIBAA362531 (ABOOBSSO) orf25; similar to T gene of 159 8e-38 giJ31722 7 4 (AF022214) minor tail subunit; putative tape-measure 123 6e-27 gil46Sl271spjQ0S2331VG26_BPMLS MINOR TAIL PROTEIN GP26 >giJ41904. 108 2e-22 gi13540284 (AF057033) putative minor tail protein (streptococcus.. 99 2e-i9 giJ2444ii9 (188974) ORF40 [Streptococcus thermophilus temperate 90 6e-i7 gil26345SSlemfbiCAB14OS31 (Z99115) yomI [Bacillus subtilis) >giJ3. 66 le-09 gil 2392838 (AF0i1378) unknown (Bacteriophage ski] 64 5e-09 gi(2764873lemlbICAA66557I (X97918) gene 18.1 [Bacteriophage SPPlI 62 3e-08 gill353559 (U38906) ORF42 [Bacteriophage rit] 61 6e-08 gil63084l1pirllS39079 puff C-8 protein fungus gnat (Rhynchosci 55 2e-06 gill73086S1spIP5173iiYO2 7 _BPHPi HYPOTHETICAL 72.8 KD PROTEIN IN 53 8e-06 giJ224288Iprfi 11101273J ORT 7 [Bacteriophage HP1] S3 le-OS Query= sid1114824IlarlldpiORFOO3 Phage dpl ORF153538-5587713 (779 letters) giII188251spIP00582DP01_.ECOLI DNA POLYMERASE I (POL I) >giJ6705. 193 3e-48 gil29821021pdbllKFSIA Chain A, All-Oxygen Dna Complexed To The 3 193 3e-48 gil2298891pdbIlDPII DNA Polyinerase I (Kienow Fragment) (E.C.2 193 3e-48 gijli69402jspIP43741(DPO1_HAEIN DNA POLYMERASE I (POL I) >giJlO7. 191 le-47 giJ2688462 (AEOO1lS6) DNA polymerase I (polA) [Borrelia burgdorf. 190 3e-47 giJ80glB01pdbIiKLNIA Escherichia coli 190 3e-47 gi1193934lembICAA729971 (Y12328) DNA-directed DNA polymerase I 189 Be-47 giJ4090935 (AF028719) DNA polymercab Zyjpz 1 f "hdcthe!, cp T 175 le-42 gil473157l1gblAAD28505.11AF121 7 8O_-1 (AF121780) DNA polymerase I 174 2e-42 giJ1633576 (1157757) similar to proofreading 31-5 exonuclease an 173 4e-42 giJ3322368 [AEO0ii95) DNA polymerase I (polA) [Treponema pallidum] 172 9e-42 giJl006S95idbjIBAAlO748l (D64005) DNA polymerase I [Synechocysti. 
171 2e-41 giIS50621spIQ07700lDPO1_MYCTU DNA POLYMERASE 1 (POL 1) >gi14161 163 5e-39 giJ43769081gblAAD187Sll (AE001645) DNA Polymerase I [Chiainydia p 157 2e-37 giJi169403lspIP4683SIDPOI_MYCLE DNA POLYMERASE I (POL IV >gi1107. 1S2 7e-36 gi12l458391pirlI S72949 DNA polymerase I Mycobacterium, leprae 152 7e-36 gi11405438lemnbICAA671841 (X98575) DNA-dependent DNA polymerase 152 9e-36 gil2S063651spIP80194IDPOl_-THECA DNA POLYMERASE I, THERMOSTABLE 147 2e-34 giJ3328929 (AE001322) DNA Polymerase I [Chlamydia trachomatis] 147 3e-34 WO 00/32825 PCT/IB199/02040 .411 9 iI3913510I5pIO52225IDPO1_THEFI DNA POLYMEPASE I. THERlIOSTABLE 146 7e-34 gi11205984 (U33536) DNA polymerase I [Bacillus stearothermophilus] 146 7e-34 gilil88271sp13252IDP0i _STRPN DNA POLYMERASE I (POL V) 9 i19802. 145 9e-34 giI19422O2IpdbtlJXEI Stoffel Fragment of Taq Dna Polymerase 1 145 le-33 giI194352OipdbllKTOI Dna Polymerase 145 le-33 gil10840221pirIIJX03S9 DNA-directed DNA polymerase (EC 2.7.7.7) 145 le-33 gij507891IdbjjBAAO6775j (D32013) DNA Polymerase (Thermus aquaticus] 145 1e-33 giI118828jspIP19821IDPOl_THZAQ DNA POLYMERASE 1. THERMOSTABLE 145 ie-33 gill7O65021spIP52028IDPO_THET'H DNA POLYMERASE 1, THERJ4OSTABLE 144 2e-33 giJ10972l11prfJ1113329A DNA polymerase [Thermus aquaticus therm 144 2e-33 giJ2O982891pdblTAJA Chain A, Structure Of Dna Polymerase 143 3e-33 Query- sidjll4825jianjdplORFoO4 Phage dpi ORF140401-4244013 (679 letters) gill934761IernbiCAB079811 (Z93946) hypothetical protein (bacterio. 1011 0.0 gi13540290 (AF057033) putative minor structural protein [Strepto. 346 2e-94 gi12444125 (U88974) ORF46 [Streptococcus thermophilus temperate 339 3e-92 gil1934762lembICABO79821 (Z93946) hypothetical protein (bacterio 300 2e-80 gj145301551gb1AAD21895.11 (AF085222) unknown [Streptococcus ther. 276 4e-73 gi12935677 (AF032121) unknown [Streptococcus thermophilus bacter. 
250 3e-65 gi12935692 (AF032122) unknown (Streptococcus thermophilus bacter 250 3e-65 giJ1136289 (U42597) histidine kinase A [Dictyostelium discoideum] S0 7e-05 Query= sidjll4827IlanjdplORFOOS Phage dpi ORF145296-4698712 (563 letters) gil43771651gblAAD189871 (AE001666) SWI/SNF family helicase_2 171 le-41 giIl7699471embiCAA670951 (X98455) SNF (Bacillus cereus] 160 3e-38 gi13329163 (AE001341) SWF/SNF family helicase [Chiamydia trachom 159 6e-38 gil4377149IgbIAAD189731 (AE001664) SWI/SNF family helicase_1 [Ch. 157 2e-37 gij3328995 (AE001326) SWI/SNF family helicase (Chlamydia trachom 153 2e-36 giI2493354IspjP75093IY01B_MYCPN HYPOTHETICAL HELICASE MG018/MGO1. 146 4e-34 giJ16537481dbjIBAA186591 (D90916) helicase of the snf2/rad54 fain... 143 3e-33 giI1763712lemb1CAB059391 (Z83337) member of the SNF2 helicase fa. 143 4e-33 gil2636153lembICAB15645.lI (Z99122) similar to SNF2 helicase [Ba 143 4e-33 gil2909552lembiCAAl72841 (AL021924) helZ (Mycobacterium. tubercul. 140 2e-32 9 iJ3844627 (U39681) ATP-dependent RNA helicase, putative [Mycopi 136 3e-31 giJ13514631spIP472641Y018_MYCGE HYPOTHETICAL HELICASE MG018 136 4e-31 gi12660669 (AC002342) human Mi-2 autoantigen- like protein [Arabi. 131 2e-29 gi1136153?lpirI1I64201 helicase (moti) homolog Mycoplasma geni 129 4e-29 gi134829771emb1CAA20533.1I (AL031369) putative protein [Arabidop. 128 9e-29 giJ3298562 (U91543) zinc-finger helicase (Homo sapiens] 120 2e-26 gil3875971jembICAB0249l1 (Z80344) similar to helicase; cONA EST 120 2e-26 gij4SS74Sljref INP_001263.11PCHD31 chromodomain helicase DNA bind 120 2e-26 gi12645435 (AF007780) CHD3 (Drosophila melanogasterl 118 gi13875l65lemblCAA917981 (Z67881) Similarity to mouse Chromodoma. 
118 Query= sid111482811anidplORFOO7 Phage dpi ORF122230-2362113 (463 letters) gi12444105 (U88974) ORF26 (Streptococcus thermophilus temperate 89 7e-17 giJ3318666 (U19754) BBA31 hoinolog (Borrelia burgdorferi] 59 7e-08 giJ2690260 (AE000790) conserved hypothetical protein [Borrelia b 56 5e-07 Query- sid111482911anldplORFOO8 Phage dpi ORF149624-5096111 (445 letters) giJ44062101gbIAAl199011 (AF100420) DnaB replication fork helicas. 68 2e-10 giJ3l2l983IsplO25916IDNAB_HELPY REPLICATIVE DNA HELICASE 9 iJ231. 67 2e-10 giJ4416322igbIAAl203141 (AF106032) replicative helicase; DnaB (B 65 9e-10 giJ4155895 (AEOO1S5l) REPLICATIVE DNA HELICASE (Helicobacter 60 4e-08 giJ3322317 (AEOO1i91) replicative DNA helicase (dnaB) [Treponema. 58 le-07 gili380311PP045301VG41_BPT4 PRIMASE-HELICASE (PROTEIN GP41) >g 53 3e-06 IJj53Z £YJO AZ004 raflflylv ~j I-A Query. sidlll483ltlamldplORF0lO Phage dpi ORF18699-985912 (386 letters) giJ2760912 (AF037258) RecA protein (Chlorobium tepidumi 133 2e-30 gil32198511SPIP946661RECA_CLOPE RECA PROTEIN >gi11698591 (U61497. 129 3e-29 giI135o566IspIP482951RECASTRVL RECA PROTEIN >gilS08860 (U04837) 128 7e-29 gii7441631prfIl20i4250A recA-like protein [Streptomyces violaceus) 126 3e-28 giI73O487IspIP41OS4IRECA-STRA4 RECA PROTEIN >giI~lll33IembICAA82. 125 4e-28 gil26873341embiCAAi587SI (AL020958) RecA protein [Streptomyces c. 125 6e-28 giI13S0565IspIP482941RECA_STRLI RECA PROTEIN >gij481482jpirj1538 125 6e-28 WO 00/32825 PCT/I B99/02040 -(12 giI464S99IspIP33S421RECA_-AQUPY RECA PROTEIN >gillO86l67jpirlIlASS 123 2e-27 9 iI4176361sp1P327251RECA_-RHOSH RECA PROTEIN >giIS413071pirI 5415 123 2e-27 gi1298434 8 (AE000775) recombination protein ReCA [Aquifex aeolicus) 123 2e-27 gil3219854I~pIP9S8461RECA_STRRM RECA PROTEIN >gill729800IembICAA. 122 4e-27 giI2SOOO86IspIQ5956OIRECA_MYCSM RECA PROTEIN >gi11430892lembICAA. 122 4e-27 gili3SO5671spiP48296IRECA_-THEAQ RECA PROTEIN >gijI0?2963IpirjlA5... 
122 6e-27 gij625663jpirIIJXO292 recA protein Thermus aquaticus (strain HB8) 121 le-26 giJ11728801spIP4244OIRECA_CA4JE RECA PROTEIN >gil2ll99911pirII4 120 2e-26 giJ41546S4 (AEO0i453) RECA PROTEIN. (Helicobacter pylori J991 120 2e-26 gi110729681pir1 1C55020 recA protein Thermus sp >giI4SB472ldbjI 120 2e-26 9 iI32198521spIP95469IRECA_-PARDE RECA PROTEIN >giI1825468 (U59631. 119 3e-26 9 il2507284I5P1P424451RECAN ELPY RECA PROTEIN >gij231323SjgbIAAD0. 119 4e-26 gil11728901spIQ023SOIRECA_-STAAU RECA PROTEIN >gil463285 (L25893) 116 Se-26 gi144162091gblAAD202611 (AF094756) RecA protein (Bifidobacterium. 116 Se-26 gil25000841spIQ5918OIRECA_BORBU RECA PROTEIN >gil1276443 (U23457. 118 5e-26 Query- sidlll4832IlanldplORFOll Phage dpi ORF128017-2909613 (359 letters) giJ2444110 (U88974) ORF31 (Streptococcus thermophilun temperate 187 le-46 gi13320438 (AF0S7033) gp348 [Streptococcus thermophilus bacteria 179 2e-44 gi14795141pirj IS34244 hypothetical protein p38 actinophage VWB 62 8e-09 Query= sidli4834IlaflidplORF0l3 Phage dpi ORF110215-1124013 (341 letters) gilS808S55embICAA299S81 (X06803) dnaZX-like ORF put. DNA polymer.. 182 2e-45 9 iI118807IspIP09122IDP3X_BACSU DNA POLYMERASE III SUBUNITS GAMMA... 182 2e-45 gil982921pirlI113786 DNA-directed DNA polymerase (EC 2.7.7.7) II 182 2e-45 giJ1527142 (U66040) DNA polymerase III gamma subunit [Salmonella 172 4e-42 gi124941971spIP748761DP3X_SALTY DNA POLYMERASE III SUBUNITS GAMM. 172 4e-42 giJ1188081spIP0671OIDP3X_ECOLI DNA POLYMERASE III SUBUNITS GAMMA 170 le-41 giJ4155207 (AE001497) DNA POLYMERASE III SUBUNITS GAMMA AND TAU 169 2e-41 gil23138411gblAADO7767.11 (AE000584) DNA polymerase III gamma 168 4e-41 giJ2583049 (AF025391) DNA polymerase III holoenzyme tau subunit 166 3e-40 giJ2984127 (AE000759) DNA polymerase III gamma subunit [Aquifex 166 3e-40 giJ38613901embICAA152891 (AJ235273) DNA POLYMERASE III SUBUNITS 165 5e-40 giJ11693971spIP43746IDP3XIIHARIN DNA POLYMERASE III SUBUNITS GAM.. 
156 2e-37 gi11293572 (U49738) DNA polymerase III tau homolog Dna.X [Cauloba... 151 Se-36 giJ3328753 (AE001306) DNA Po1 III Gamma and Tau [Chlamydia trach 148 4e-35 gil43762941gbIAAD181931 (AE001589) DNA Polymerase III Gamma and 148 5e-35 giIS8i255iembICALA28175I (X04487) alternate dnazx protein (AA 1-6 146 3e-34 giJ2688379 (AE001151) DNA polymerase III, subunits gamma and tau. 140 2e-32 giJ3323329 (AE001268) DNA polymerase III, subunits gamma and tau. 137 le-31 Query= sidJli483S1lanjdpIORF0l4 Phage dpi ORF150961-5197413 (33? letters) gi]i3467961spIP474921PRIM_MYCGE DNA PRIMASE >giJ13614961pirlI1F64. 57 2e-07 gil7400081prfIl2004290A primase (Naemophilus influenzae] 51 ic-OS gi~II726191spIQ083461PRIM HAEIN DNA PRIMASE >gill0740331pirlI1A64. 51 ic-OS gill7O97691spIQ0455PRIMLACLA DNA PRIMASE >gij1075726jpir1 3C2. 51 ic-OS gil639846ldbjiBAAO35161 (D14690) DNA primase (Lactococcus lactis] Si ia-OS Query- sidII14837IlanidplORFO16 Phage dpi ORF143413-4430313 (296 letters) gill934766IembiCAB079861 (Z93946) N-acetylmuramoyl-L-alalie ami. 661 0.0 giJ1136761spIPO66S31ALYSSTRPN AUTOLYSIN (N -ACETYLMURAMOYL- L- ALA 221 4e-57 giI2823261pirlIA42935 N-acetylmuramfoyl-L-alafline amidase (EC 219 3e-56 giJ4166181epIP327621ALYS_BPHB3 LYTIC AMIDASE (N-ACETYLMURAMOYL-L 212 2e-54 gil2852731pirlIA42936 N-acetylmuramoyl-L-aianine amidase (EC 3.5 212 2e-54 gili277871spIPi5O57ILYCA -BPCP1 LYSOZYME (ENDOLYSIN) (MURAMIDASE) 162 4e-39 giI677611pirtIMUBPCP N-acetylmuramoyl-L-alaline amidase (EC 3.5 162 4e-39 giJi277891spjPi9386fLYCA_BPCP9 LYSOZYME (ENDOLYSIN) (MURAMIDASE) 160 ie-38 giJ928832 (L44bV4 i URi25; pu.L&Li~ve 119 2e-26 giJ2Sii7O51embICAA717831 CY1O818) sigA binding protein (Streptoc. 111 9e-24 gi14097980 (U72655) surface protein C [Streptococcus pneumoniaej 107 le-22 giJ2351768 (U89711) PspA [Streptococcus pneumozliael 105 4e-22 giJ2425l09 (AF019904) choline binding protein A (Streptococcus-p 104 6e-22 gil2823351pir[IA4i971 surface protein pspA precursor Streptoco. 
104 le-2i gil257633ilembiCAA05i581 (AJ002054) SpsA protein [Streptococcus 103 2e-21 gi121272951pir1 ISS7962 cspC protein Clostridium acetobutylicum. 85 6e-16 gil2576333lembiCAA051591 (AJOO2OSS) SpsA protein (Streptococcus 84 ic-iS giJ41065221gbIAAlO2874.i1 (AF097909) excreted protein FibB (Pept. 83 3e-15 gi 11361406 1pir I I S57714 cspB protein Clostridium acetobutylicum. 82 4e-15 giJ19148721embICAB047581 (Z82001) PCPA [Streptococcus pneumoniae] 81 9e-15 WO 00/3282 5 PCT/I B99/02 040 413 gij3168594jdbjIBAA286l3I (AB012763) SpaP. (Erysipelothrix rhusiop 81 le-14 gil2292750lembICA.A64942I (X95646) homology to orf259 of lactococ 80 3e-14 giJ2935696 (AF032122) putative lysin (Streptococcus thermophilus 80 3e-14 gil458691ldbjIBAA76S40.1I (AB017447) protective antigen SpaA.l 80 3e-14 si]3540294 (AF057033) lysin [Streptococcus thermophilus bacterio 79 Be-14 Query= sidjl1484ljlanjdplORF02O Phage dpi ORF11864-265811 (264 letters) gil263374SlefbiCAB13247I (Z99111) similar to coenzymse PQQ synthe 217 5e-56 gi12808502letmbiCAA125321 (P.225561) ExSD protein [Sinorhizobium 163 le-39 giJ386ll5l1e'bICAAl5051I (P.235272) unknown [Rickettsia prowazekiil 82 gill652793ldbjlBAA177l2I (D90908) hypothetical protein (Synechoc 76 3e-13 9 iI1723815IspIPS5139IYGCFECOLI HYPOTHETICAL 25.0 KD PROTEIN IN 70 2e-11 gi12984272 (AE000769) hypothetical protein (Acruifex aeolicus] 66 4e-10 giJ4155435 (AE001516) putative (Helicobacter pylori J99] S7 le-07 gil2i278331pin1 1C64505 coenzyme P00 synthesis protein III homolo 55 5e-07 gi1262 2 3 38 (AE000890) coenzyme P00 synthesis protein III (Methan 54 9e-07 gil3257042ldbj1BAA297251 (APOOOOO3) 2S4aa long hypothetical Prot 53 2e-06 gil23l4068jgblAAD07976.lI (AE000602) conserved hypothetical prot 52 6e-06 giJ17238161SpjP450971YGCF_HAEIN HYPOTHETICAL PROTEIN H11189 >gil S0 2e-05 Query= sidj114842jlanjdplORF02l Phage dpl ORF12504-329512 (263 letters) 9 iI127481IspIP19465IGCH1_BACSU GTP CYCLOHYDROLASE I (GTP-C(-I) 208 4e-53 gil324231l5ersbICAA04237I (P.1000685) 
GTP cyclohydrolase (Streptoc 191 4e-48 giJ24946951splQS47691GCH1_SYNP7 GTP CYCLOHYDROLASE I (GTP-CH-I) 189 2e-47 giJ2550611bbsJ112832 (S44049) GTP cyclohydrolase I (clone hGCH-1 187 7e-47 gil45039491refINP_000152.1IPGCHlI GTP cyclohydrolase 1 (dopa-res... 187 7e-47 giJ2ll39671embICAB08935I (Z95557) folE (Mycobacterium. tuberculosis] 187 7e-47 gill7302401spIP5Ol411CCH1_CHICK GTP CYCLOHYDROLASE I (GTP-CH-I) 185 3e-46 giI2494696I5pIQ55759IGCH1_SYNY3 GTP CYCLOHYDROLASE I (GTP-CH-I) 184 5e-46 giJ12106l1spIP222881GCH1_RAT GTP CYCLOHYDROLASE I PRECURSOR (GTP 184 6e-46 gi1830141sp!0137741GCHI_SCHPO GTP CYCLOHYDROLASE I (GTP-CH-I) 184 6e-46 giJ30972241emb1CAAlB7951 (AL023093) GTP cyclohydrolase I (Mycoba. 182 2e-45 giJ24946971spIQ199801GCH1_CAEEL PROBABLE GTP CYCLOHYDROLASE I (G 182 2e-45 giI462167IspIQ0S9l5IGCH1_MOUSE GTP CYCLOHYDROLASE I PRECURSOR (G 180 7e-45 gi1i669664lembICAA898081 (Z49706) GTP cyclohydrolase I [Dictyost 180 le-44 gi12981082 (AF052048) GTP-cyclohydrolase [Ostertagia ostertagil 178 3e-44 giI31954IembICAA789O81 (116418) GTP cyclohydrolase I (Homo sapi. 177 8e-44 giJ5513441b5150280 (S71373) GTP cyclohydrolase I [mice. Peptid... 174 5e-43 gijl7302471spIP5i6011GCHl_YEAST GTP CYCLOHYDROLASE I (GTP-CH-I) 174 7e-43 gill24691lembICAA87397I (Z47201) GTP cyclohydrolase 1 fSaccharo... 172 2e-42 gill7302461spIP5159SIGCH1_STRPN GTP CYCLOHYDROLASE I (GTP-CH-I) 168 3e-41 giJ2982951 (AE000680) GTP cyclohydrolase I (Aquifex aeolicus) 164 6e-40 Query= sidIl1484311afldpi0RF022 Phage dpi ORF130896-3167512 (259 letters) giJ2347102 (U77367) internalin [Listeria monocytogenes] 55 5e-07 gil31232261spIP251461INLA_-LISMO INTERX4ALIN A PRECURSOR >gil48705. 52 4e-06 9 i1149674 (M67471) internalin (Listeria monocytogenes] 52 4e-06 Query. sidI1l48S0jlanjdp1ORFO29 Phage dpi ORF1662-134812 (228 letters) giJ2650185 (AE001074) succinoglycan biosynthesis regulator (exsB. 
119 2e-26 gil386l23ilembiCAAl5l3lI (P.235272) unknown (Rickettsia prowazekii] 117 8e-26 gi12622210 (AE000881) conserved protein [Methanobacterium thermo 108 4e-23 giJ2983380 (AE000709) trans-regulatory protein ExsB (Aquifex aeo. 88 6e-17 giJlOO1327IdbjIBAA1O814I (064006) ExsB [Synechocystis sp.] 88 6e-17 giJ1280551pir11864468 hypothetical protein homolog MJ1347 met 83 le-iS giJ4155143 (AE001491) putative (Helicobacter pyloni J991 82 4e-15 g~jj.L160gbjAZ3'7C;!; t'EC057) "n~rved hypothetical Prot. 80 2e-14 giJ21208141pir1 560183 protein ExsB Rhizobium meliloti >giJ114. 76 3e-13 gil2633743lembICAB132451 (Z99111) similar to hypothetical protei. 75 5e-13 giJ11755431SpIP441241YBAX.HEIN HYPOTHETICAL PROTEIN HI1191 >gij 74 ie-12 gi124955371sp1P777561YP.X_ECOLI HYPOTHETICAL 25.5 1(0 PROTEIN.IN 71 5e-12 gil32564711dbjlSAA29154.1I (APOOOO0i) 269aa long hypothetical pr. 67 le-lO giJ2921156 (AF022216) aluminum resistance protein [AnthrobaCter 54 le-06- Query= sidI1l485Sjlanjdp1ORF034 Phage dpi ORF1131-65212 (173 letters) gi12633746lembIC.3132481 (Z99111) similar to hypothetical protei 20.e5 220 4e-57 WO 00/32825 PCT/I B99/02040 -41-4 gi155926 (AE001554) putative [Helicobacter pylori J99] 162 le-39 gi1231458S1gblAAD08456.lI (AE000642) conserved hypothetical prot 161 3e-39 gi12983458 (AE000714) hypothetical protein [Aguifex aeoiicus] 103 9e-22 giJlOO6604ldbjIBA.Al07571 (D64005) hypothetical protein (Synechoc. 
87 6e-17
gi|2967529 (U11045) unknown [Buchnera aphidicola] 79 2e-14
gi|2495654|sp|Q46920|YQCD_ECOLI HYPOTHETICAL 32.6 KD PROTEIN IN 69 2e-11
gi|1175604|sp|P44153|YQCD_HAEIN HYPOTHETICAL PROTEIN HI1291 >gi| 63 1e-09
gi|3860642|emb|CAA14543| (AJ235270) unknown [Rickettsia prowazekii] 56 1e-07
Query= sid|114857|lan|dp1ORF036 Phage dp1 ORF|48808-49362|1 (184 letters)
gi|1353529 (U38906) ORF12 [Bacteriophage r1t] 53 1e-06
Query= sid|114859|lan|dp1ORF038 Phage dp1 ORF|1350-1871|3 (173 letters)
gi|1175542|sp|P44123|YB90_HAEIN HYPOTHETICAL PROTEIN HI1190 >gi| 100 6e-21
gi|2982977 (AE000681) hypothetical protein [Aquifex aeolicus] 67 7e-11
gi|3860744|emb|CAA14645| (AJ235270) unknown [Rickettsia prowazekii] 65 3e-10
gi|2650193 (AE001074) conserved hypothetical protein [Archaeoglo... 58 4e-08
gi|3258383|dbj|BAA31066.1| (AP000007) 157aa long hypothetical pr... 55 2e-07
gi|100171|dbj|BAA10550| (D64004) hypothetical protein [Synechoc... 50 8e-06
gi|4155434 (AE001516) putative [Helicobacter pylori J99] 50 1e-05
Query= sid|114860|lan|dp1ORF039 Phage dp1 ORF|3306-3803|3 (165 letters)
gi|1922884|emb|CAA68244| (X99978) ORF7; hydophobic protein [Lact 64 5e-10
Query= sid|114862|lan|dp1ORF041 Phage dp1 ORF|8208-8699|3 (163 letters)
gi|2522313 (AF012906) dUTPase homolog [Bacillus subtilis] >gi|26 108 2e-23
gi|2634150|emb|CAB13650| (Z99113) similar to deoxyuridine 5'-tri 108 3e-23
gi|3913546|sp|O54134|DUT_STRCO DEOXYURIDINE 5'-TRIPHOSPHATE NUCL... 56 2e-07
gi|3913542|sp|O48500|DUT_BPT5 DEOXYURIDINE 5'-TRIPHOSPHATE NUCLE... 52 3e-06
gi|3913548|sp|O68992|DUT_CHLTE DEOXYURIDINE 5'-TRIPHOSPHATE NUCL... 50 1e-05
Query= sid|114867|lan|dp1ORF046 Phage dp1 ORF|42774-43202|3 (142 letters)
gi|1934764|emb|CAB07984| (Z93946) hypothetical protein [bacterio... 287 2e-77
Query= sid|114901|lan|dp1ORF080 Phage dp1 ORF|42490-42759|1 (89 letters)
gi|1934763|emb|CAB07983| (Z93946) hypothetical protein [bacterio...
147 Query= sid|114912|lan|dp1ORF091 Phage dp1 ORF|43189-43413|1 (74 letters) gi|1934765|emb|CAB07985| (Z93946) holin [bacteriophage Dp-1] 63 2e-10
Table 32. Sequence of Dp-1 published by Sheehan et al. 4731 nucleotides.
1 tttaaatttt ttgacaaagt 71 gtccagtgtg gctcaatcat 141 agaccttaaa tatcgaattg 211 gaaaaggctc aactacatga 281 aggcttatga aggtagaatg 351 ggcaagtcga attgaagcta 421 tgcatgagct cttctaatca 491 gtgaccgaat ttctatgttc 561 Laacgggatc tttacccaat 631 atgaacgtga ttcggtatgt 701 cacttgaacc tttacgtcga 771 ctactgtcga ccgcgatgga 841 aaaiggttca agtgttcata 911 ggagaagtga ctgttcctca 981 ataacggcgt tcacggaaat 1051 gatttctagt tttgagggaa 1121 tcttttacgc atcaagtttg 1191 ctagcgtatc ctttacgccg 1261 catctgtatt cgaacctata 1331 aacatccccg attcagtacg 1401 agattttaac agggaacaac 1471 cgcttacgga tccactatcc 1541 ggcggcaaat tgggtatgat 1611 gaaaacaatc gaacgrccaa 1681 cgttcaacgt actcgtcaaa 1751 gtaggaggtc aacagaaaaa 1821 cagaagatag aggttcggcg 1891 agctggtaac tacgggccgg 1961 gaatttagtg ctacggtacc 2031 gtaaggttgt agaacaaggg 2101 agttcaacag tttcagctca 2171 agcgtgaaac agagtttaca 2241 gggactattt caaaatttct 2311 atgttcatca ggacagcgaa 2381 aagacttcga acagaataat 2451 cgacgcattc tattcgaaaa 2521 gacaaagagg ctactattgc 2591 tcaataactc atatggaaat 2661 agataattct tggttaaatt 2731 attttttaga aaggaggtga 2801 attggacaag gagctgaaaa 2871 ccgtctctga aactcttcat 2941 aaaacttcgc gaaactcgtt 3011 cccggctcta acaggctgaa 3081 acgacgatta ttacagcgtg 3151 ataaagccaa gagcgtttta 3221 tgaccaaacg acagtagcaa 3291 taccgtctrtt atcaegactt 3361 tctctatttt attcgaaagt 3431 caagaaatta ccaattaggg 3501 cgtggtaacc gtagtcgttc 3571 actactgcta tcacaggaac 3641 gaaactacca aaaggaacaa 3711 gttgcgtgga tgcaggcccg 3781 atgactgctc aagttctatg 3851 tactgagtac atgcacgcat 3921 gctaaacgag gcgacatctt taattcaaat tgtaccgctg aagcaatttt ccatgtattc acccaaagtt attaaaatcg aacttggtaa tatctctact ccttttagtg aagcagagga actcaaaagc cgatcaaaag ctaactaacc aacagttgac ggcactcacg cgcagaaccg aaagctaagg ctacaatgga gcagitaagt aacttagaaa aaaactaatg aagaagctat caacaaatcg gaacccgacc taatcttagc ctatccaaga acttggcggg ctacgggaac tgaagaagtt cgtcgacagt aggtctaatt atcggtaaga acgacggtag ctctaccatt aaggtatcaa tccgcaggga atgaagttat gtaccttacg caagggttca ttcacatcga ccattcaagt cggccgattt agaacggaac 
aatactcgtt taatccagac aggaraagga gaataacatg acaaaattta tcaactcata cggccctctt acaagttagt caggacgtaa cgaacaactc ctcgcgagtt agttggcgag gcttatcgaa cgtggactta tggaaatatt agtaaccttt ccgtatggtt gcagtcaccc agactacgac acgtccggcg aagaggtaac gctcgcaagt caatagtgac gggacaaaga caatgtccgt ttgggcttcg tttgacccta atcactatct ctactaaita cactttagac agtattccaa ggtctacaca atcgaaatct aggatcttta catacggtta tctaaccg aaaagtgaac gtaccgagtt ttcggtagcg actggataga tttaggtaag aaccatacta tcactggact tagcaaggta cttacctaaa tcaagttccg gaacaatgga acggaactac gcaaattggt agtgacgtct attcaaacgg acggaggttc tcctactttt tcgggcattt ctttagtaga cacgacttca gcggttcgac ttcctccaaa tcatgtcgaa cattcaagtc aacttcaaca atgcttccgg aagcatttca cgctgagctc gtaggtaaaa accaagctat caacgaaaac gaactttaat ggctccgcta ccgtaagagc atgggttaca gacacgcgag gacgtatcta tcaatgttat agaatactat ggaccgtcta tcaatttctc atcctg3caat tatccaagct cttcgaaatg ctaaggtcgc acctataacg catcatgcaa attaccttct ccgtggcgcc gttgaacact actaatttca tcagggacgt tcactactat ttccctactg actaactcgt ccgcgaactt acaagtctta catagttaag gctaaaatcc aagacaggtt cacttcgact taccgaatca gtagttctia actatgacaa ggacggtcga cttggagttg aaggcagggt caattgatgc agcaggtgat atatatgctg gaggtcgaca ctgataataa tggagcattg aacaggggtc aatataacga tgttggaata tggcgaagta acaaatacga ggacaaccct acgggaactc gaggtgaatg ggttagatag ctggaaaatg gttcaatcct tcattacaat gtcaggaaga cgatggaaac agctggagac ctaacaagtg gaaagaggtt ctatttaagc tggcagaaac ttgttcttca aagtgggtgg aaccatcact caacctatgg ctcttgacgg catagtatat ttgagaggaa atgtgcataa aggacttatc agtacttcct gaaggattta gaccgaaagt ttcaatgtat cttcaggctc gccacttat gtatatacac tgacggaaga cttgtggtga aatcgaatgt tagacaatgt ctcatttcgt atttaatttg agctgaaaic atgttataat gaactatgtt gaaccttaca aaatcgcgcc aaattgtggc agagttcact gaaacttgtc aaaacaacga ttgtgaacat tgatgcaaac gcagtatcaa gacccagact tgtatgctgc gaaccgtcga gaacttcgag ctgacgagca acgcaatcga agatgaaatt aatagctgga gcgggggaaa aaagggggag taaggaggcg tcaatctatg ccaatgtggc taaacgacac cgcagtcttg cagcggagtg cttactgtcc tactaaataa gttattcgaa tggaaatcga 
gaggatatct ctacaactct tagcactctt aaacagcagg tcgacgggat tcaatcacca aaatgacgtc attcaagacg gaactagaaa aattcaacgt aaaaagggaa gtgataacag gctatacaac tctcgaccat tttagagagc tataagaacc ttggcggaaa tggtgaagtt gaagccttgt atgaaaaata aggaagattt agatgaaact atctaacgaa caacatgacg tagcaaagaa cagcagcgat tgcactaatt acaggtcttg gagcgttgta tcaatttgac cattgcactt cttgcaactt ttgcaggtac tgttctagga gtttctagcc gaagctcaaa acaatgaggt ggaataatgg gagtcgatat tgaaaaaggc aaagggtcga gtatcttata gcatggactt tcgagacggt cctgatagct tactatgctc tccgctcagc cggagcttca agtgctggat gggcagtcaa ggcttatcga aaacggttat gaactaatta gtgaaaatgc tccgtgggat catctgggga cgcaaaggtg ctagcgcagg cgctggaggt catacaggga atcttac rar-gae attcac acaca gtcaacctta ctactacgtc tatcgcttga ctaacgeaaa tgctcaaccg gcagaaagat gctactggtt tctggtacgc tcgagcaaac ggaacttatc atcgaagaaa acaagtcttg gttctacttt gacgaccaag gctacatgct atactgatgg aaactggtat tggttcgac-c gtgacggata catggctacg.
gtcatggtac tacttcaatc gcgatggttc aatggtaacc ggttggattgtattgtgatg ctaccaacgg cgacatgaaa tcgaatgcgt ttatccFta tattaccgga cggacgtctg gcagataaac ctcaattcac cgtagagccg agtttaaaat atagagagga ggaagctctt ttcttaatat tgtttctctt ctgcggggtt tatgtgtcgt gaattactct atttacttat tcgaagattt 4061 4131 4201 4271 4341 4411 4481 4551 4621 tgagcgttgg gctgagaaga caaaagatga cgctgagaaa tcatggaaac agtattacga taacgacggc gacgggctca aatcccgcaa tactatgcag aacttggctg gttcgagtat tggttgaaac ggattggcga taattggtat tggtatctac ttactgctaa ggtttcgacc 4691 caattataat taaataatca acgagattca taattggagg aatg
Table 33. Streptococcus accession numbers
gi|5776553|gb|AF026471.2|AF026471 [5776553] gi|5410470|gb|AF139890.1|AF139890 [5410470] gi|5410468|gb|AF139889.1|AF139889 [5410468] gi|5410466|gb|AF139888.1|AF139888 [5410466] gi|5410464|gb|AF139887.1|AF139887 [5410464] gi|5410462|gb|AF139886.1|AF139886 [5410462] gi|5410460|gb|AF139885.1|AF139885 [5410460] gi|5410458|gb|AF139884.1|AF139884 [5410458] gi|5410456|gb|AF139883.1|AF139883 [5410456] gi|3093394|emb|AJ005697.1|SPN5697 [3093394] gi|5759208|gb|AF171873.1|AF171873 [5759208] gi|5758311|gb|AF162664.1|AF162664 [5758311] gi|5739313|gb|AF161701.1|AF161701 [5739313] gi|5739310|gb|AF161700.1|AF161700 [5739310] gi|5726354|gb|AF159448.1|AF159448 [5726354] gi|5726290|gb|AF127143.1|AF127143 [5726290] gi|5712666|gb|AF140784.1|AF140784 [5712666] gi|4218525|emb|AJ009639.1|SPAJ9639 [4218525] gi|5616524|gb|AF169483.1|AF169483 [5616524] gi|5579395|gb|AF162656.1|AF162656 [5579395] gi|5579393|gb|AF162655.1|AF162655 [5579393] gi|5578890|emb|AJ131985.1|SPN131985 [5578890] gi|5566442|gb|AF167442.1|AF167442 [5566442] gi|5459332|emb|AJ243540.1|EVE243540 [5459332] gi|5305398|gb|AF072811.1|AF072811 [5305398] gi|5295921|emb|AJ242698.1|SPN242698 [5295921] gi|5295920|emb|AJ242697.1|SPN242697 [5295920] gi|5295919|emb|AJ242696.
1|SPN242696 [5295919] gi|5295918|emb|AJ242695.1|SPN242695 [5295918] gi|4583522|gb|AF140356.1|AF140356 [4583522] gi|5231206|gb|AF157826.1|AF157826 [5231206] gi|5231203|gb|AF157825.1|AF157825 [5231203] gi|5231200|gb|AF157824.1|AF157824 [5231200] gi|5231197|gb|AF157823.1|AF157823 [5231197] gi|5231194|gb|AF157822.1|AF157822 [5231194] gi|5231191|gb|AF157821.1|AF157821 [5231191] gi|5231188|gb|AF157820.1|AF157820 [5231188] gi|5231185|gb|AF157819.1|AF157819 [5231185] gi|5231182|gb|AF157818.1|AF157818 [5231182] gi|5231179|gb|AF157817.1|AF157817 [5231179] gi|4336851|gb|AF106138.1|AF106138 [4336851] gi|4336848|gb|AF106137.1|AF106137 [4336848] gi|4336845|gb|AF106136.1|AF106136 [4336845] gi|4336842|gb|AF106135.1|AF106135 [4336842] gi|4336839|gb|AF106134.1|AF106134 [4336839] gi|4336836|gb|AF106133.1|AF106133 [4336836] gi|4336833|gb|AF106132.1|AF106132 [4336833] gi|3907597|gb|AF094575.1|AF094575 [3907597] gi|5030425|gb|AF061748.2|AF061748 [5030425] gi|4902881|emb|AJ239004.1|SPN239004 [4902881] giJ5SOOIlOjgbjAFl 112358.1IAF1 112358 [50017 gi|5001690|gb|AF106539.1|AF106539 [5001690] gi|4973271|gb|AF144420.1|AF144420 [4973271] gi|4973269|gb|AF144419.1|AF144419 [4973269] gi|4973267|gb|AF144418.1|AF144418 [4973267] gi|4928190|gb|AF129757.1|AF129757 [4928190] gi|4927743|gb|AF126061.1|AF126061 [4927743] gi|4927742|gb|AF126060.1|AF126060 [4927742] gi|4927741|gb|AF126059.1|AF126059 [4927741] gi|4495247|emb|AJ240675.1|SPN240675 [4495247] gi|4495245|emb|AJ240670.1|SPN240670 [4495245] gi|4495243|emb|AJ240669.1|SPN240669 [4495243] gi|4495241|emb|AJ240668.1|SPN240668 [4495241] gi|4495239|emb|AJ240667.1|SPN240667 [4495239] gi|4495237|emb|AJ240666.1|SPN240666 [4495237] gi|4495235|emb|AJ240665.1|SPN240665 [4495235] gi|4495233|emb|AJ240664.1|SPN240664 [4495233] gi|4495231|emb|AJ240663.1|SPN240663 [4495231] gi|4495229|emb|AJ240662.1|SPN240662 [4495229] gi|4495227|emb|AJ240661.1|SPN240661 [4495227] gi|4495225|emb|AJ240660.1|SPN240660 [4495225] gi|4495223|emb|AJ240659.1|SPN240659 [4495223] gi|4495221|emb|AJ240658.1|SPN240658 [4495221] gi|4495219|emb|AJ240657.1|SPN240657 [4495219] gi|4495217|emb|AJ240656.1|SPN240656 [4495217] gi|4495215|emb|AJ240655.1|SPN240655 [4495215] gi|4495213|emb|AJ240654.1|SPN240654 [4495213] gi|4495211|emb|AJ240653.1|SPN240653 [4495211] gi|4495209|emb|AJ240652.1|SPN240652 [4495209] gi|4495207|emb|AJ240651.1|SPN240651 [4495207] gi|4495205|emb|AJ240650.
1 ISPN240650 [4495205] gi14495203 lemblAJ24064 9 1 ISPN240649 [4495203] giJ4495201 lemb!AJ2406 48 I ISPN240648 (4495201] gi144951991emblAJ240647.1 ISPN240647 [4495199] gii4493 i 97jcui'uj~j240 A I IC- TiTA AA [4495197] gil44951951emblAJ240643.1 ISPN240643 [44951951 gil4495 1931emblAJ240642.11SPN240642 [4495193] gi14495191 lemblAJ24064 1.1 ISPN240641 [4495191] gil4 4 951 89lemblAJ240640.1 ISPN240640 [4495189 gi[44951 871emblAJ240639.1 ISPN240639 [4495187] gij44951 851embAJ240638.1 ISPN240638 [4495185] gi144951 831emblAJ240637.1 !SPN240637 [4495183] giJ4495 1811emblAJ240636. 1 ISPN240636 [4495181] gi14495 1791emblAJ240635.1 ISPN240635 [4495179] gi144951 771emb1AJ240634.1 SPN240634 [4495177] gi14 4 95 1 75lemb1AJ240633.1 ISPN240633 [4495175] gi144951 731embAJ240630.1 ISPN240630 [4495173] gi14495171 lembAJ240629. 1 ISPN240629 [4495171] gil449 5 169lembAJ240628. 1 ISPN240628 [4495169] gil4 4 951 67embAJ240627. 1 ISPN240627 [4495167] giJ4495165 lembAJ240626. 1 ISPN240626 [4495165] gi144951 631emblAJ240625.1 ISPN240625 [4495163] giJ4 4 95161 lemblAJ240624. I ISPN240624 [4495161] gi1449 5 1 591embAJ240623.1 ISPN240623 [4495159] gi144951 57 emblAJ240622. 1 ISPN240622 [4495157] gi14495 1551emblAJ24062 1. 1SPN240621 [4495155] gi14495153 1emblAJ240620. 1 ISPN240620 [4495153] gi1449515 lembAJ240619. 1SPN240619 [4495151] vilAd4Q5 1491prnhI A 1240616.1 1SPN240616 [4495149] gi144951 47IembIAJ240615.I1 SPN240615 [4495147] gi14495 1451emb1AJ240614.1 ISPN240614 [4495145] gil4 4 95 143 embIAJ240613.1 1SPN240613 [4495143] WO 00/32825 WO 0032825PCT/I B99/02040 giJ449514 I embIAJ24O6 12.1 ISPN2406 12 (4495141] gil4495139lemblAJ24061 1. 1IjSPN24061 1 (4495139] gil4495 137iemblAJ2406 10. 1 ISPN2406 10 [4495137] gil44951I351cmb1AJ240609. 1 ISPN240609 [4495135] gil4495 133lembiA1240608. I ISPN240608 [4495133] giJ449513 I emblAJ240607. 1 ISPN240607 [4495131] gil44 9 5 1291embA1240606. I SPN240606 [4495129] gil48836981gbjAF079807. 1 AF079807 [4883698] gil48385621gb1AF145055. 
I AF145055 [4838562] gil4063727igbIL29324. I1jSTRINTE [40637271 giJ3093401 lembIAJOO56 19.1 IfSPAJ5619 [3093401] gil4 I 3889igblAF0293 68.1 AF029368 [4103889] gil2897689]dbjID638O5. 1 D63805 [2897689] giJ4566771I1gb1AF1 1774 1. 1IAF 17741 [4566771 gil4566768igb1AF 117740. 1 IAF 117740 [45 66768 gil4538836lemblA1240793. 1 ISPN240793 [4538836] gil4538832lemblAJ240792. 1 ISPN240792 [4538832] gil4538828lemblAJ24079 1. 1 ISPN24079 1 [4538828] gil45388241emb1A1240790. I1SPN240790 [4538824] giJ453882 1 lemb1AJ240789. 1 ISPN240789 [4538821] gil45388 1 8emb1AJ240788. 1 ISPN240788 [4538818] gil45388 1 5emb1AJ240787. 1 ISPN240787 [4538815] gil45388 121embjAJ240786. 1 ISPN240786 [45388121 ei145388091emblA1240785. I ISPN240785 [4538809] gil4538806lembIA3240784. I ISPN240784 [4538806] gij45388031emblAJ240783 .1 ISPN240783 [4538803] gil45388001emblAJ240782. I ISPN240782 [4538800] gij4538797lembiAJ24078 1.1 :SPN24078 1 [4538797] gil4538794lemb1AJ240780. 1 ISPN240780 [4538794] giJ453879 1 lemb[A1240779. I SPN240779 [4538791] gil45387881embiAJ240778. 1 ISPN240778 [4538788] gi145387851embjAJ240777. 1 ISPN240777 [4538785] gij4538782lembIAJ240776. I1SPN240776 [4538782] gil45387791emblAJ240775. 1 ISPN240775 [4538779] gil4538776IembIAJ240774. 1 ISPN240774 [4538776] gil4538773 1cmblAJ240773.1 I1SPN240773 [45387731 gil45387701emblAJ240772. 1 ISPN240772 [4538770] gil4538767lemblAJ24077 1. 1 ISPN24077 1 [4538767] gij4538764lemblAJ240770. 1 ISPN240770 [4538764] giJ4538761 I emb1AJ240769. I ISPN240769 [4538761] gil45387581emb1A1240768. 1 ISPN240768 [4538758] gij4538755lemb1AJ240767. I1ISPN240767 [4538755] gil4538752lemb]AJ240766. 1 ISPN240766 [4538752] gil45387491embfAJ240765. 1 ISPN240765 [4538749] gil4538746lembjA124076 1. 1 ISPN240761I [4538746] gij4538743lemblA1240760. I SPN240760 [4538743] gil45387401emb1AJ240759. 1 ISPN240759 [45387401 oil~45387171pmhlA24075R. I ISPN240758 [4538737] gil4538734lemblAJ240757. 1 ISPN240757 [4538734]giJ453873 1 IemblAJ240756. 
I ISPN240756 [45387311 gil4538728lemb1AJ240755. 1 ISPN240755 [4538728] WO 00/32825 WO 0032825PCT/I B99/02040 g i145 38725 lemb1AJ240754. I ISPN240754 [4538725] gil4538722lemblAJ2407S3. 1 ISPN240753 [4538722] gil45387 1 9emb1A1240752. I ISPN240752 [4538719] gil45387 1 6emb1AJ24075 1.1 ISPN240751I [45387161 gil453871I3IembIAJ24O75O. 1 ISPN240750 [4538713] giJ45387 I 01emblAJ240749. 1 ISPN240749 [45387101 gil4538707lembIAJ240748. 1 ISPN240748 [4538707] gil4538704IemblAJ240747. 1 ISPN240747 [4538704] giJ453870 1 lemblAJ240746. 1 ISPN240746 [45387011 gi145386981emblAJ240745. I ISPN240745 [45386981 gi14538695jemblA1240744. 1 ISPN240744 [4538695] gil45386921emb1A1240743.1 I1SPN240743 [45386921 gi14538689lemblAJ240742. I ISPN240742 [4538689] gi145386861emblAJ24074 1. 1 ISPN24074 1 [4538686] giI4538683lemblAJ240740. 1 ISPN240740 [4538683] gil4538680lemblAJ240739. 1 ISPN240739 [4538680] gil4538677IembIAJ240738.1 ISPN24073 8 [4538677] gi[4530444igblAFl 18229. 1 AF1 18229 [4530444] gil4519253ldbjlAB015852.1IAB015852 [4519253] gi451921ldbjABO15851.1AB015851 [4519251] 192491dbjAB0 15850.1 IABOl 5850 [4519249] gil4S 192471dbj1AB01 5849.1 ABO 15849 [4519247] gi45i9245djj~5 iiAB~ 'j'4 ~45,925 19243ldbj1AB0 15847.1 IABOI 5847 [4519243] gil4519241IdbjlAB015846.1IAB015846 [4519241] gil4519239ldbjIABOl 1210.LIABOI 1210 [4519239] gil4519237jdbjlABO1 1209.IIABO1 1209 [4519237] giJ45l9235ldbjIABOl 1208.1 IABO1 11208 [4519235] gil4519233ldbjlABO1 1207.11ABOI 1207 [45192331 gil45l923lidbjlABO1 1206.11AB01 1206 [4519231] gil4S l9229IdbjIABOl 1205. 1 JABOI 1205 [4519229] gil45l9227ldbjlABOl 1204.II ABO1 1204 [4519227] gil4519225ldbjlAB01 1203.1IIAB0I 1203 [4519225] gil4519223ldbjjABO1 1202.11ABOI 1202 [4519223] gil4519221ldbjAB011201.11AB011201 [4519221] giJ45l921ldbjIABOI 1200.1ABOII1200 [4519219] gil45l921ldbjlABOI 1199.11AB01 1199 [4519217] gil4519215IdbjAB011I198.1AB01 1198 [4519215] gi144951I27lemblAJ240605.1 ISPN240605 [4495127] giJ4468 03 1 lembIAJ 132957. 
1 ISPN 132957 [4468031] gil4468O29lemblAJl 32956. 1 ISPNI132956 [4468029] gil42185321emb1AJO 10312.1 ISPNO10312 [4218532] gil4456852jemblAJ236792. 1 ISPN236792 [4456852] gil4456850jemblAJ23679 1. 1 ISPN23679 1 [4456850] gij4456848lemblAJ236790. 1 ISPN236790 [4456848] gi14456846lemb1A1236789. 1 ISPN236789 [4456846] gi13550644jemblAJ006987.1!SPAJ6987 [3550644] gil3550625lemblAJ006986.11SPAJ6986 [3550625] gil44165181gb1AF014458.21AF014458 [4416518] gil440626OjgblAFlO101116. 1IAF 105116 [4406260] gil44062571gbIAF 105115.1 AF 105115 [4406257] gil44O62541gblAFI105114.I1IAF 105 114 [4406254] gil44062461gblAF 105113.1 IAF 105113 [4406246] gil44062431gblAF1051 12.1IAF1051 12 [4406243] gil4138533jembIAJ005815.11SPN5815 [4138533] gil3821I7261emb1AJ232433. 1 ISPN232433 [3821726] gil3821 7241emblAJ232432. 1 ISPN232432 [3821724] gil38217221ernb1AJ232431.1ISPN232431-- [3821722] gi13821I7201emb1AJ232430. 1 ISPN232430 [3821720] WO 00/32825 WO 0032825PCT/I B99/02040 gil382 171 81emblAJ232429. 1 ISPN232429 [3821718] gij382 171 61emb1AJ232428. 1 ISPN232428 [3821716] gi1382 171 41emblAJ232427. 1 ISPN232427 [3821714] gil382 171 2lemblAJ232426. 1 ISPN232426 [3821712] giJ382 17 10lemblA1232425. I SPN232425 [3821710] gil3821I708lemblAJ232424.1 I1SPN232424 [3821708] gil382 1706lemblAJ232423. 1 ISPN232423 [38217061 gil3821I7041emblAJ232422. I ISPN232422 [3821704] gil382 1702lembIAJ23242 1. 1 ISPN23242 1 [3821702] gil3821700jemblAJ232420. 11SPN232420 [3821700] gil3821I6981emblA32324 19.1 ISPN2324 19 [3821698] gil382 16961 emb1AJ2324 18. 
1 ISPN2324 18 [3821696] gi13821I6941emblAJ2324 17.1 ISPN2324 17 [3821694] gil3821 692lemblA12324 16.1 ISPN2324 16 [3821692] gi1382 1690lemb1AJ2324 15.1 ISPN2324 15 [3821690] gi13821I6881emb1AJ2324 14.1 SPN2324 14 [3821688] gi13821I6861emblAJ2324 13.1 ISPN2324 13 [3821686] gij3821 6841embIAJ2324 12.1 ISPN23241 2 [3821684] gil38216821emb1AJ23241 1.1]SPN23241 1 [3821682] gil3821I6801emb1A32324 10.1 ISPN2324 10 [3821680] g i i3 8 16 7 8 biAI2 409 0 [3821678] gil382 16761emblAJ232408. 1 ISPN232408 [3821676] gil3821674lemblAJ232407. I1ISPN232407 [3821674] gil38216721emblA1232406. I ISPN232406 [3821672) gil382 16701emb1AJ232405. 1 ISPN232405 [3821670] gil3821668lembIAJ232404.1 ISPN232404 [38216681 gi1382 1666lemb1AJ232403.1 I SPN232403 [3821666] gij3821 6641emb1AJ232402. 1 ISPN232402 [3821664] giI3821662lemblAJ232401.1 ISPN232401 [3821662] gi1382 1660lemblAJ232399. 1 ISPN232399 [3821660] gil38216581ernblAJ232398.1 ISPN232398 [3821658] gi1382 16561emb1A1232397.1 I1SPN232397 [3821656] gij3821 654lemblAJ232396. I ISPN232396 [3821654] gil382 1652lemblAJ232395.1 ISPN232395 [3821652] gil3 821 6501emblA1232394.1 I SPN232394 [3821650] gil3821648lemblAJ232393.1 ISPN232393 [3821648] gij382 1646lemblAJ232392. 1 ISPN232392 [3821646] giJ3 821 644 jembIAJ23239 1. 1 ISPN23239 1 [3821644] gil3821I6421emblAJ232390. 1 ISPN232390 (3821642] gil3821I640lemb1AJ2323 89.1 SPN232389 [3821640] gij38216381embIA1232388. 1 ISPN232388 [3821638] gil382 16361emb1AJ232387. I ISPN232387 [3821636] gil382 16341emb1A1232386. 1 ISPN232386 [3821634] gij3821632lemblA1232385. I ISPN232385 [3821632] gX191 I -l I -IATI)%Rd1_qN*_?IR [382 1630] gil3821 628lemblAJ232383. I ISPN232383 [3821628] gi[382 16261embiAJ232382. 1 ISPN232382 [3821626] gi1382 16241emb1AJ23238 1. 1 ISPN232381I [3821624) WO 00/32825 WO 0032825PCT/I B99/02040 gil3821622lembAJ232380.1I1SPN232380 [3821622] gil382 16201emb1A3232379. 1 ISPN232379 [3821620] gil3821I618lemb1AJ232378. 1 ISPN232378 [3821618] gil382 161 61emb1A1232377. 
1 ISPN232377 [3821616] gil3821614lemblAJ232376. 1 ISPN232376 [3821614] gi13821 612lembIAJ232375. I ISPN232375 [3821612] giJ38 2 16 10lemblAJ232373. 1 ISPN232373 [3821610] gil3821608lembIAJ232372. 1 ISPN232372 [3821608] gil3821I6061emblAJ23237 1. 1 ISPN232371I [3821606] gil3821I604lemb1AJ232370. 1 ISPN232370 [38216041 gi13821I6021emb1AJ232369. I SPN232369 [3821602] gil3821I600lemb1AJ232368. 1 ISPN232368 [3821600] gil3821I5981emblAJ232367. 1 ISPN232367 [3821598] gil38 2 15961embjAJ232366. I SPN232366 [3821596] gil382l 5941emb1AJ232365. 1 ISPN232365 [3821594] gil3820454lembAJ007367.1ISPN736 7 [3820454] gil382 15921emb1A1232364. 1 ISPN232364 [3821592] gi13821I590lemblAJ232363.1 ISPN232363 [3821590] gil382 15881emblA32323 62. 1 ISPN232362 [38215881 gil382 1586lembIAJ23236 1.1 ISPN23236 1 [3821586] gil3821584lemblAJ232360. 1 ISPN232360 [3821584] gil38215821emblAJ232359.1 ISPN232359 [3821582] gil382 1580lembIAJ23235 8.1 I SPN232358 [3821580] gil3821I5781emb1A1232357. 1 ISPN232357 [3821578] gil3821I5761emb!Ai232356. 1 ISPN232356 [3821576] gi13821 5741emblAJ232355. 1 ISPN232355 [3821574] gil382 15721emb!AJ232353. 1 ISPN232353 [3821572] gij3821570jemblAJ232352.1 ISPN232352 [3821570] gil3821 5681emb1A323235 1. 1 ISPN23235 1 [38215681 gil3821 5661emb1AJ232350. 1 ISPN232350 [3821566] gil3821564lemblAJ232349.1 ISPN232349 [3821564] gil3821562lemblAJ232348.1 ISPN232348 [3821562] gil3821I5601emb1AJ232347. 1 ISPN232347 [3821560] gil3821 5581embI1232346. 1 ISPN232346 [3821558] gil382 15561emblA1232345.1 ISPN232345 [3821556] gil382 15541emblA3232344. 1 ISPN232344 [3821554] gil3821 5521emb1AJ232343. 1 ISPN232343 [3821552] gij3821I5501emblAJ232342. 1 ISPN232342 [3821550] gij382 15481cmblA323234 1.1 ISPN232341I [3821548] gi1382 1546 jembf AJ232340. 1 ISPN232340 [3821546] gi13821 5441emblAJ232339. 1 ISPN232339 [382 1544] gil3821 542jemblAJ232338. 1 ISPN232338 [3821542] gil3821I5401emblAJ232337. 1 ISPN232337 [3821540) gi138 2 1 5381emblA1232336. 
1 ISPN232336 [38215381 08215 a~uILLUJ335 1iPT11 [3821536] gil382 1534lemb1AJ232334. 1 ISPN232334 [3821534] giJ3 82 15 3 2emb1A123 23 33. 11 SPN2 3233 3 [3821532] gi1382 15301emblAJ232332. 1 ISPN232332 [3821530] WO 00/32825 WO 0032825PCT/I B99/02040 gil3821528lemblAJ23233 1. 1 ISPN23233 1 [38215283 gi1382 15261emb!AJ232330. 1 ISPN232330 [3821526] gi13821I5241embiA1232329. I ISPN232329 [3821524] gil382 15221emblA1232328. I ISPN232328 [382 1522] gi13821 S2OlemblA3232327. I ISPN232327 [3821520] gil382 151 81emb1AJ232326. I ISPN232326 [3821518] gil382 151 61emb1AJ232325. 1 ISPN232325 [38215161 gif 382151 41emb1A1232324. I ISPN232324 [3821514] gil3 8215 12!emb1AJ232322.1 ISPN232322 [3821512] giJ38215 10jemblAJ23232 1. 1 ISPN23232 1 [3821510] gil3821I508lemb1AJ232320. 1 ISPN232320 [3821508] gil3821506lemb1AJ2323 19.1IjSPN232319 [3821506] gil3821504lembjAJ2323 18.1I1SPN2323 18 [3821504] gil3821502lemb1AJ2323 17.1I1SPN2323 17 [3821502] gil382 15001emb1AJ2323 16.1 ISPN2323 16 [3821500] gil3821498lemblA12323 15.1 SPN2323 15 [3821498] gil382 1496f embiAJ2323 14.1 I SPN2323 14 [3821496] gil3821494lemblAJ232313.1 ISPN2323 13 [3821494] gil38214921emb1A1232312.1 ISPN2323 12 [3821492] gil3821 4901emb1AJ2323 11.1 ISPN2323 1 [3821490] gil3821488jembIAJ2323 I1U.1I1SPN2323 i0 [382 1488] gi1382 14861emb1AJ232309. 1 ISPN232309 [3821486] gil382 14841emb1AJ232308.1 I SPN232308 [3821484] gij3821482lemblAJ232307.1 ISPN232307 [3821482] gil382 14801emb1A1232306. 1 ISPN232306 [3821480] gil3821478lemblAJ232305. I1ISPN232305 [3821478] gil3821476lemblAJ232304.1 ISPN232304 [3821476] gil382 1474lemb1A1232303. 1 ISPN232303 [3821474] gil3821472lembjAJ232302. 1 ISPN232302 [3821472] gil3821470lemblAJ232301. I1ISPN232301 [3821470] gij382 14681emb1AJ232300. 1 ISPN232300 [3821468] gij382 14661emb1AJ232299. 1 ISPN232299 [3821466] gil382 1464lemb1AJ232298. 1 ISPN232298 [3821464] gil382 14621emb1AJ232297. 1 ISPN232297 [3821462] gil3 821 4601emb1AJ232295.1 ISPN232295 [3821460] gil382 1458lembIAJ232294. 
1 ISPN232294 [3821458] gil382 1456lemb1AJ232293.1 I SPN232293 [3821456] gil382 1454lemb1AJ232292. 1 ISPN232292 [3821454] gil382 1452lembrAJ23229 1. 1 ISPN23229 1 [3821452] gil3821450lemblAJ232290. 1 ISPN232290 [3821450] gil382 14481emblAJ232289. 1 ISPN232289 [3821448] gil382 14461emb1AJ232288. 1 ISPN232288 [3821446] gil382 14441emb1AJ232287. I ISPN232287 [3821444] gil3821442lemblAJ232286. I ISPN232286 [3821442] n,.382 .t V 'lfl I IC')1.T40- [3821440] gi1382 14381emb1AJ232284. 1 ISPN232284 [3821438])gij3821436jemblAJ232283. I ISPN232283 [3821436] gi138214341emb1232282. 1 ISPN232282 [3821434] WO 00/32825 WO 0032825PCT/l B99/02040 gi13 8 2 1432lemb1A123228 1.1 ISPN23228 1 [3821432] gil3821430iembIAJ232280. 1 ISPN232280 [3821430] gil3821I428lemblA3232279. 1 ISPN232279 [3821428] agil3821426lemblAJ232278.1 ISPN232278 [3821426] gil3821424lemblAJ232276. 1 ISPN232276 [3821424] gil382 1422iemblAJ232275. 1 ISPN232275 [3821422] gil382 14201emb1AJ232274. I SPN232274 [3821420] gil382 14 18lemb1AJ232273. 1 ISPN232273 [3821418] gi13821416lemblAJ232272.1 ISPN232272 [3821416] gil3821414lemblAJ23227 1. 1 ISPN23227 1 [3821414] gil38 2 1412lemblAJ232270. I SPN232270 [3821412] giJ3821 4 1lembIA.J232269. 1 ISPN232269 [38214101 gil382 1408lemblAJ232268. 1 ISPN232268 [38214081 gij3821406lemblAJ232267. IS1PN232267 [3821406] gij38 2 1404lemb1A1232266. 1 ISPN232266 [3821404] gil382 14021emblAJ232265. I ISPN232265 [3821402] gil382 1400lemblAJ232264. 1 ISPN232264 [3 82 1400] gil3821I398lemb1AJ232263. 1 ISPN232263 [3821398] gil382 1396lemblA3232262. 1 ISPN232262 [3821396] gil382 1394lemblA123226 1. 1 ISPN23226 1 [3821394] gil3821392lembIA123226U. i iSr1'232260 [382 1392] gil3821 3901emblAJ232259. 1 ISPN232259 [3821390] gi13821388lemblAJ232258. I ISPN232258 [3821388) gil382 1386lemb1AJ232257. I1SPN232257 [3821386] gi1382 13841emb1AJ232256.1 SPN232256 [3821384] gi13821382lemblAJ232255.1 ISPN232255 [3821382] gil3821I3801emblAi232254. 1 ISPN232254 [3821380] gij3821I378lemblA1232253. 
1 ISPN232253 [3821378] gil3821376jemblAJ232252. I1ISPN232252 [38213761 gil3821374lemblAJ23225 1.1 ISPN232251 [3821374] gil3821 372lemb1AJ232250. 1 ISPN232250 [3821372] gij3821I370lemblAJ232249. 1 ISPN232249 [3821370] gil3821I3671emblA1232248. I SPN232248 [3821367] gil3 821 3651emblA3232247. 1 ISPN232247 [3821365] gil382 1363lemb1AJ232246. 1 ISPN232246 [3821363] giJ3821361 lemblAJ232245.1 ISPN232245 [3821361) gi1382 13591emblAJ232244. 1 ISPN232244 [382 1359] gi1382 1357lemblAJ232243.1 ISPN232243 [3821357] gil382 1355lemblA123224 1. 1 ISPN232241I [3821355] gil292 18421gblAF0473 85.1 1AF0473 85 [2921842] gil29098631gblAF047696. 1 AF047696 [2909863] gil4 1933531gblAF055088. 1 AF055088 [4193353] gij4 1852421gbIAH007276. I ISEGSPTNJUJNC [4185242] giJ4 185241 IgblAF066797. I ISPTNJUJNG2 [4185241] gil4l185240lgblAF066796. I ISPTNJUTNC 1 [4185240] gil40979791gbIU72655.11!SPU72655 [4097979] gil40637201gbIL29323. 1 ISTRMTR [4063720] gil I657605igbIU66846.1 ISPU66846 [1657605]1 gill 657602 jgbjU66845. 1 1SPU668454 1 637602] gil40094851gblAF068903.1 AY068903 [4009485] gil40094771gblAF068902. 1 IAF068902 [4009477] gil4009462igblAF06890 1.1 1AF068901I [4009462] WO 00/32825 WO 0032825PCT/I B99/02040 -424 gil3947767lembIAJ2 33 8 9 6 IlSPN233896 [39477671 gi13947765lemlblAJ2338 9 5.1 ISPN233895 [3947765] gil3947763IembIAJ233 8 94 I SPN233894 [39477631 gil39 47 7 6 1lemblAJ233893. 1 ISPN233893 (3947761] gil39477591emblAJ2338 92 .l IlSPN233892 [39477591 gil3947757lembIAJ2338 9 1.1 ISPN23389 1 [39477571 gil3947755lemblAJ23 3 8 90 .1 ISPN233890 [3947755] gil3947753lemblAJ233 8 89 1 ISPN233889 [3947753] gil39 4 7 7 5 lI lemblA323388 8 1 ISPN233888 [3947751] gil39477491cmblAJ233 88 7 .l IlSPN233887 [3947749] gil3947730lerflAJ23388 6 .l ISPN233886 [3947730] gil3758891lemblZ71552.1ISPADCA [3758891] giI381I84791gblAF057294. 
1|AF057294 [3818479] gi|2351767|gb|U89711.1|SPU89711 [2351767] gi|3395661|dbj|AB006879.1|AB006879 [3395661] gi|3395659|dbj|AB006878.1|AB006878 [3395659] gi|3395657|dbj|AB006877.1|AB006877 [3395657] gi|3395655|dbj|AB006876.1|AB006876 [3395655] gi|3395653|dbj|AB006875.1|AB006875 [3395653] gi|3395651|dbj|AB006874.1|AB006874 [3395651] gi|3395649|dbj|AB006873.1|AB006873 [3395649] gi|3395647|dbj|AB006872.1|AB006872 [3395647] gi|3395645|dbj|AB006871.1|AB006871 [3395645] gi|3395643|dbj|AB006870.1|AB006870 [3395643] gi|3395641|dbj|AB006869.1|AB006869 [3395641] gi|3395639|dbj|AB006868.1|AB006868 [3395639] gi|2315992|gb|U87092.1|SPU87092 [2315992] gi|2209338|gb|U93576.1|SPU93576 [2209338] gi|2109442|gb|AF000658.1|SPDNAARG [2109442] gi|1881538|gb|U09239.1|SPU09239 [1881538] IllI I lT T7,CYlIo r i AfOA gi|1498294|gb|U41735.1|SPU41735 [1498294] gi|1213493|gb|U47687.1|SPU47687 [1213493] gi|1163109|gb|U43526.1|SPU43526 [1163109] gi|556001|gb|U15171.1|SPU15171 [556001] gi|455063|gb|U02920.1|SPU02920 [455063] gi|784896|gb|L36923.1|STRSTRH [784896] gi|3320386|gb|AF030373.1|AF030373 [3320386] gi|2804772|gb|AF030374.1|AF030374 [2804772] gi|2804762|gb|AF030372.1|AF030372 [2804762] gi|2804756|gb|AF030371.1|AF030371 [2804756] gi|2804750|gb|AF030370.1|AF030370 [2804750] gi|2804745|gb|AF030369.1|AF030369 [2804745] gi|2804739|gb|AF030368.1|AF030368 [2804739] gi|2804732|gb|AF030367.1|AF030367 [2804732] gi|2804726|gb|AF030366.1|AF030366 [2804726] gi|2804720|gb|AF030365.1|AF030365 [2804720] gi|2804713|gb|AF030364.1|AF030364 [2804713] gi|2804707|gb|AF030363.1|AF030363 [2804707] gi|2804701|gb|AF030362.1|AF030362 [2804701] gi|2804694|gb|AF030361.1|AF030361 [2804694] gi|2804688|gb|AF030360.1|AF030360 [2804688] gi|2804682|gb|AF030359.1|AF030359 [2804682] gi|3550979|dbj|AB010387.1|AB010387 [3550979] gi|2275100|emb|AJ000336.1|SPR6LDH [2275100] gi|3551853|gb|AF076029.1|AF076029 [3551853] gi|3551773|gb|U94770.1|SPU94770 [3551773] gi|3550617|emb|AJ004869.1|SPAJ4869 [3550617] gi|3513563|gb|AF055727.1|AF055727 [3513563] gi|3513561|gb|AF055726.1|AF055726 [3513561] gi|3513559|gb|AF055725.1|AF055725 [3513559] gi|3513557|gb|AF055724.1|AF055724 [3513557] gi|3513555|gb|AF055723.1|AF055723 [3513555] gi|3513553|gb|AF055722.1|AF055722 [3513553] gi|3513549|gb|AF055721.1|AF055721 [3513549] gi|3513545|gb|AF055720.1|AF055720 [3513545] gi|1914869|emb|Z82001.1|SPZ82001 [1914869] gi|2911421|gb|AF046238.1|AF046238 [2911421] gi|2911419|gb|AF046237.1|AF046237 [2911419] gi|2911417|gb|AF046236.1|AF046236 [2911417] gi|2911413|gb|AF046234.1|AF046234 [2911413] gi|2911411|gb|AF046233.1|AF046233 [2911411] gi|2911409|gb|AF046232.1|AF046232 [2911409] gi|2911407|gb|AF046231.1|AF046231 [2911407] gi|2911405|gb|AF046230.1|AF046230 [2911405] gi|3258601|gb|U40786.1|SPU40786 [3258601] gi|3211756|gb|AF052209.1|AF052209 [3211756] gi|3211752|gb|AF052208.1|AF052208 [3211752] gi|3211747|gb|AF052207.1|AF052207 [3211747] gi|3220194|gb|AF053121.1|AF053121 [3220194] gi|2766052|emb|Z99863.1|SPZ99863 [2766052] gi|2766050|emb|Z99862.1|SPZ99862 [2766050] gi|2766048|emb|Z99861.1|SPZ99861 [2766048] gi|2766046|emb|Z99860.1|SPZ99860 [2766046] gi|2766044|emb|Z99859.1|SPZ99859 [2766044] gi|2766042|emb|Z99858.1|SPZ99858 [2766042] gi|2766040|emb|Z99857.1|SPZ99857 [2766040] gi|2766038|emb|Z99856.1|SPZ99856 [2766038] gi|2766036|emb|Z99855.1|SPZ99855 [2766036] gi|2766034|emb|Z99854.1|SPZ99854 [2766034] gi|2766032|emb|Z99853.1|SPZ99853 [2766032] gi|2766030|emb|Z99852.1|SPZ99852 [2766030] gi|2766028|emb|Z99851.1|SPZ99851 [2766028] gi|2766026|emb|Z99850.1|SPZ99850 [2766026] gi|2766024|emb|Z99849.1|SPZ99849 [2766024] gi|2766022|emb|Z99848.1|SPZ99848 [2766022] gi|2766020|emb|Z99847.1|SPZ99847 [2766020] gi|2766018|emb|Z99846.1|SPZ99846 [2766018] gi|2766016|emb|Z99845.
11SPZ99845 [2766016] gil27660 141cmb1Z99844. 1 SPZ99844 [2766014] gil27660 1 2embIZ99843. 1 SPZ99843 [2766012] gi127660 101emblZ99842. 1 SPZ99842 [2766010] gil2766008lembIZ9984 1.1 1SPZ9984 1 [2766008] gij2766GujinujZ84G. Ir iL"Z^^4 '271006 gij2766004lembIZ99839. I1ISPZ99839 [2766004] gi12766002lembIZ99838. 1 SPZ99838 [2766002] gil2766000IembIZ99837. 11SPZ99837 [2766000] gil2765998lembIZ99828. 1 ISPZ99828 [2765998] gil2765996lembIZ99827.1I1SPZ99827 [27659961 gil2765994lemb1Z99826. 1 SPZ99826 [2765994] gil2765992lembIZ99825.11SPZ99825 [2765992] gi12765990lemb[Z99824.1 I1SPZ99824 [2765990] gi12765988lembIZ99823.IISPZ99823 [2765988] gil2765986IembIZ99822. I SPZ99822 [2765986] gil2765984lembIZ99821.11SPZ99821 [2765984] gi12765982iemb1Z99820. 1 ISPZ99820 [2765982] gil27659801embIZ998 19.1 1SPZ998 19 [2765980] gij2765978lembIZ99818.11SPZ99818 [2765978] gil2765976IembIZ998 17.1 SPZ998 17 [2765976] gil27659741embIZ998 16.1 1SPZ998 16 [2765974] gi12765972lemb1Z998 15.1 1SPZ998 15 [2765972] gil2765970jembIZ998 14.1 1SPZ998 14 [2765970] gi12765968lembIZ998 13.1 1SPZ998 13 [2765968] gil2765966IembIZ99812.1ISPZ99812 [2765966] gi12765964lembIZ998 11.1 1SPZ998 11 [2765964] gil27659621emb1Z998 10. 1 ISPZ998 10 [2765962] gil2765960lemb1Z99809. I ISPZ99809 [2765960] gil2765958IembIZ99808. 11SPZ99808 [2765958] gij2765956lemb1Z99807. 11SPZ99807 [2765956] gil2765954lembjZ99806. 11SPZ99806 [2765954] gi127659521embIZ99805. 1 SPZ99805 [2765952] gil2765950lembIZ99804. 1 ISPZ99804 [2765950] gil27659481emb1Z99803. 11SPZ99803 [2765948] gi1289 4 1041emb1X77249. 1 ISPR6CIARH [2894104] gil31538971gblAF067128.1IAF067128 [3153897] gil31527121gblAF065153.11AF065153 [3152712] gil3l52710~gbAF065152.1jAF065152 [3152710] gi13l527081gblAF065 151.1 1AF065151 [3152708] gil3l1164261gbIU84387. 1 ISPU84387 [3116426] gi!23854031emb1AJ01247.1 1SP7465RR3 [2385403] giI234254OlemblAJOO 1250. 1ISP7978RR5 [2342540] giJ2342539lemblAJ00 125 1. 
1 ISP7978R1R3 L- gil2342538lemblAJ01248.1 SP7466RR5 [2342538] giI23142537iembAJOO 1249.1 SP7466R.R3-- [2342537] gil3065896igb]AF058920.1I1AF058920 [3065896] gil29826471emblAJ002294.1 !SPAJ2294 [2982647] WO 00/32825 WO 0032825PCT/I B99/02040 gil2982645lemblAJ00229 3 .l 1SPAJ2293 [2982645] gil2982643lemblAJ002292.1ISPAJ 2 2 92 [2982643] gii298 2 64 IlemblAJ00229 1.1 jSPAJ229 1 [2982641] gillI 620466lemblX99400. 1 ISPDACAO 1620466] gil2 196665lemblZ8438 1.1 1HSZ8438 1 [2196665] gi12196663lemblZ84380.1I1HSZ 8 4 38 0 121966631 gii2 196661 lemblZ84379. I IHSZ84379 [2196661 gi12l96659lembIZ84378. I HSZ8 4 37 8 [2196659] gil625l751gblL36 131.1 ISTREXP IQA [625175] gil30049451gblAF036624.I1 AF036624 [3004945] 0 49 43 1gblAF036623.I1 AF036623 [3004943] gil3 00 49 4 1 gblAF036622.I1 AF036622 [3004941] gil30049391gblAF03662 1.1 1AF03662 1 [3004939] gil30049371gblAF036620. 11AF036620 [3004937] gil30049351gblAF036619.1IAF03 6 6 1 9 [3004935] gil2370572lernblZ86 112.1 1SPZ861 12 [2370572] gi]2765946lemblZ99802. 1 SPZ99802 [2765946] gi12398824lembIZ34303.l I SPC11.NRC [2398824] gil28 94 5 12lemblAJ22349 1.1 ISPPPR3 [2894512] gil2198539lemblX85787.1IISPCPSlI4E [2198539] gil2 76 6 156lemblZ999 15.1 1SPZ999 15 [2766156] gil276 6 154lembIZ999 14.1 1SPZ999 14 [2766154] gil2766 152 jembI Z999 13.1 1SPZ999 13 [2766152] gil27661I SlembIZ999 12.1 jSPZ999 12 [2766150] gil 2 76 6 148lembIZ999 11.1 1SPZ999 11 [27661481 gil2 7 66 146lemblZ999 10.1 1SPZ999 10 [2766146] gi12 76 6 1441emblZ99909. I SPZ99909 [2766144] gij 27 66 142lembIZ99908. I SPZ99908 [2766142] gil27661 40 jembjZ99907. 11SPZ99907 [2766140] gil276 6 138lemb1Z99906. 1 SPZ99906 [2766138] gil27 6 6 136lembIZ99905.1I1SPZ99905 [2766136] gil27661I34lemblZ99904. 1 SPZ99904 [2766134] gi1276 6 132lemblZ99903. I SPZ99903 [2766132] gi127 66 130lemblZ99902. I SPZ99902 127661 30] gil2766 128lemblZ9990 1.1 1SPZ9990 1 [2766128] gil2766126!embIZ99900. 11SPZ99900 [2766126] gil27661I24lemblZ99899. 1 ISPZ99899 [2766124] gil 2 7 66 122lemblZ99898. 
1 SPZ99898 [2766122] gil2766 120lemb1Z99897. 1 SPZ99897 [2766120] gil27 66 11I8lemb1Z99896. 1 ISPZ99896 [2766118] gi12766l 16lemblZ99895.1I1SPZ99895 [2766116] gil276 6 I 141emblZ99894. 11SPZ99894 [2766114] gil2766 11 2lemblZ99893.1 1SPZ99893 [2766112] gil276 6 l 10lemblZ99892.11SPZ99892 [2766110] gil2766 108lemblZ9989 1.1 1SPZ9989 1 [2766108] gil2766 106lemblZ99890. 1 iSPZ99890 [2766106] gil276 6 104leniblZ99889. I ISPZ99889 [2766104] gil2766102lembjZ99888.1ISPZ99888 [2766102] gi1276 6 1I00lemblZ99887.I1 SPZ99887 [2766100] gil27660981embIZ99886. 11SPZ99886 [2766098] gil2766096lemblZ99885.1 1SPZ99885 [2766096] gil2766094lemblZ99884. I SPZ99884 [2766094] gil2766092lemblZ99883. 11SPZ99883 [2766092] gi12766090lemblZ99882. 1 SPZ99882 [2766090] gil2766088lembIZ9988 1.1 1SPZ9988 1 [2766088] gil27660861embIZ99880. 1 SPZ99880 [2766086] gil2766084lembIZ99879. 11SPZ99879 [2766084] gil2766082lembIZ99878. I SPZ99878 [2766082] gil2766080lemblZ99877. 11SPZ99877 [2766080] gil2766078lemblZ99876.1I SPZ99876 [2766078] gil2766076lembIZ99875.1 1SPZ99875 [2766076] gil2766074lenmblZ99874. 1 ISPZ99874 [2766074] gil2766072lemblZ99873.1 1SPZ99873 [2766072] gil2766070lemblZ99872. 11SPZ99872 [2766070] gil2766068lemblZ99871 .11SPZ99871 [2766068] gil2766066IembIZ99 870.1 1SPZ99870 [2766066] gil2766064lembIZ99869. 1 ISPZ99869 [2766064] gil2766062lembIZ99868. 1 SPZ99868 [2766062] gil2766060lemblZ99867. 1 ISPZ99867 [2766060] gil2766058lemblZ99866. 1 SPZ99866 [2766058] gil2766056lemb!Z99865.1 1SPZ99865 [2766056] gil2766054lembIZ99864. 11SPZ99864 [2766054] gil2765906lemb1Z99206. I SPZ99206 [2765906] gi~ziOJ ienl1OZ9Y2h.i I F1 55904.j gil2765902lembIZ99204. 1 SPZ99204 [2765902] gil2765900IembIZ99203.1I1SPZ99203 [27659001 gil2765898lembIZ99202. 1 SPZ99202 [2765898] gil27658961embIZ9920 1.1 1SPZ99201 [2765896] gil2765894lemblZ99200. I ISPZ99200 [2765894] gil27086 3 1 IgblAFO3695 1. 1 IAF03695 1 [27086311 WO 00/32825 PCT/IB99/02040 gil886956lembZ49097. 1 ISPCS 1 11 2X [886956] gil26560931gbIL21856. 
1ISTRMALR [26560931 gil25763321emblAJ02055.11SPSPSA47 [2576332] gil2576330lemblAJ002054.1lSPSPSA2 [2576330] gil2511704lemblY0818.1ISPY1O818 [2511704] gi119446191embZ83335.1 SPZ83335 [19446191 gil24251081gblAF019904.1lAFO1990 4 [2425108] gil2385404lemblAJOO1246. 1SP7465RR5 [2385404) gi[438213lembIZ1 6082.1 IPNALIB [438213] gil2149613lgbU9072 1.1 ISPU90721 [2149613] gi1493 9 1lembIZ21841.1ISPPBP2BB [49391] gil22092071gblAF004325. 1 1AF004325 [2209207] giJ2 2 9 30 6 l lemblZ959 14. 1SPZ95914 [22930611 gi122763931gbIU16156.1lSPU16156 [2276393] gil21833141gblAF003930.1 AF003930 [2183314] gil2182093lemblX957 17.1ISPPARECGN [2182093] gil984230lemblZ49095. 1 SPCS 111LA [984230] gil886954lembZ49096. I ISPCS I092X [886954] gil 1181613ldbjID82873. 1STRPBP2BE [1181613] gill 181612ldbjlD82871.1 ISTRPBP2BCZ [1181612] gil1181611ldbjlD82870.1lSTRPBP2BB2 [1181611] gill 181579ldbjlD82869. ISTRPBP2BA1 [1181579] gi1 181192ldbjlD82872.1STRPBP2BD [1181192] gil575595ldbjID42075.1ISTRPBP2B2 [575595] gi1133997l ldblD42074. 1ISTRPBP2BI [1339971] gil21083291emblY 11463.1 ISPDNAGCPO [2108329] gil19441 15ldbjlABOO2522.1 1AB002522 [1944115] gi16666691emblZ77727. 1ISPIS1381C [1666669] gil 1666668lembIZ77726.I ISPIS1381B [1666668] gill 666667lemblZ77725.1 ISPIS 1381A [1666667] oil 9148731emblZ82002.ISPZ82002 [19148731 gil143 1584lemblZ74778. 1ISPDHFR [1431584] gil47452lemblZl5120.1ISPSTRG [47452] giI581717embIZ12159.1lSPCP131G [581717] gil47342lemblX17337.1ISPAMILOC [47342] gill 80030lgbU83667. I ISPU83667 [1800300] gill 532066lemblY07780.I ISPTETOGEN [1532066] gil I1612691gbL39074. 1 ISTRSPXB [1161269] gi11460093 lemblX94909.1ISPIGA 1 PRT [1460093 gi11750263lgbIU72720. 11SPU72720 [1750263] gil2986491gbS56948. 1lS56948 [298649] gil2545371gbS4351 1. 1 IS435 11 [254537] gil2452271gbS8l051 .1S81051 [245227] gil2452261gbS81045. 1S81045 [245226] gil24S2251gbS81043.1IS81043 [245225] gil I150618lemblZ49988. 
I SPMMSAGEN [1150618] gil47456lemblXO 1138.1ISPTN917A [47456] gi11658316lemblZ472 10.1ISPDEXCAP [1658316] gil 550802lembX953 85.1 ISPCOMCGEN [1550802] gil47457lemblX01 137.1ISPTN917B [47457] gil975714lembX90941 .ISPTRJ5251 [975714] gil975713lembX90940. 1 ISPTLJ5251 [975713] gil975709lemblX90939.I ISPDNATETM [975709] gil 1524346lembZ7969 1.1 ISOORFS [1524346] gi11553054lemblX98364.1ISPPBPHU9 [1553054] gi11553052lembX98367.1ISPPBPHU13 [1553052] gi11553050lembX98366.1ISPPBPHU12 [1553050] gi11553048lemblX98365.IISPPBPHU11 [1553048] gi115750291gbU53509.l1SPU53509 [1575029] gil 542968igbU49088. I1SPU49088 [1542968] gil I5429661gbU49087. I ISPU49087 [1542966] gil 1536961 lemblY07845. 1 ISPGYRA [1536961] gil47391 lembIX16367.1ISPPBPX [473911 gi11490398lembZ67739.IISPPARCETP [1490398] gil1490395lemblZ67740.1 ISPGYRBORF [1490395] gil 14315 89lembZ74777.1 ISPTMRDHFR [1431589] gil408l451embIZ21702.1 ISPUNGMUTX [408145] gil47461 lemblX61025. I SPXISINT [47461] gil47459lembX55651.1ISPUNGG [47459] gil47454lembIX52632.1ISPT1545E [47454] gi14742 1 lembiZi 7307. 1 ISPRECA [42421] gil47419lembX67873.1 ISPPONA8 [47419] gil47417lembIX67872. I ISPPONA7 [47417] gil47415 lembX6787 1.1 ISPPONA6 [47415] WO 00/32825 WO 0032825PCT/1 B99/02040 gi14741I3lembIX67870.11ISPPONA5 [47413] gi14741 I lemblX67869. IISPPONA4 [47411] gil47409jemblX67867. 1 SPPONA2 [47409] gil47407lembIX67866. 1ISPPONA 1 [47407] gi[47405jembIX67868. 1 SPPNA3 [47405] gil474O3lembIX52474. 1 ISPPLY [474031 gil984232lemblX 16022.1 ISPPENA [984232] gij5171I90jembjX7821 5.1ISPPBPXG [517190] gij295840lembIZ22230. 1 SPPBP2BBA [295840] gil28898 1 lemblZ22 185.1I ISPPBP2BAC [28898 1] gi]2889791embIZ22 184.1 ISPPBP2BAB [2889791 gi12884661emblZ2 1981.1 ISPPBP2BAA [288466] gij49390jembiZ21 813.1 ISPPBP2XD [49390] gil49389lemblZ21812.l ISPPBP2XC [49389] gil49387jembIZ2181 1. 
1ISPPBP2BJ [493871 gij49385jembjZ2 1810.1 ISPPBP2BI [49385] gil49382lembZ21808.1lSPPBP2BH [49382] gij49380IemblZ21807.1lSPPBP2BG [49380] gil49379lembIZ2 1806.1 ISPPBP2BF [493791 gil49377lembIZ2 1805.1 ISPPBP2BE [49377] gil49376lemblZ2 1804.1 ISPPBP2XB [49376] gil49375lemblZ2 1803.1 ISPPBP2XA [49375] gij49374IemblZ2 1802.1 ISPPBP2BD [49374] gij49372lemblZ2 1801.1 ISPPBP2BC [49372] gij493691emb1Z2 1799.1 ISPPBP2BA [49369] gil47399lembIX 13137.1 ISPPENASE [47399] gil47397lemblX 13136.1 SPPENARE [47397] gil 1052802lembIX83917.1IlSPGYRBG 1052802] gil587550lemblX72967.1 ISPNANA [587550] gil49384lemblZ2 1809.1 SPPBP lAB [49384] gil49371lembZ21800.11SPPBP1AA [49371] gil984228lemblZ49094. IISPCS 1091 A [984228] gii47372lembIX54225. 1 SPENDA [47372] OCAI..~LI7A~A 1 D~7QChrqAAaA
giJ407 1721embIZ2685 1.1 ISPATPAS2 [407172] giJ407 1661emblZ26850. 1 SPATPAS 1 [407166] gil47353lembIX63602. IISPBOX [47353] gil47348lembIX05577.1I SPAPHA3 [47348] gil47337jembjX65 132.1ISP824PBPX [47337] U-e A I(fLYllVrA'71, Cl giJ4733 1 IemblX6S 133.1I ISP577PBPX [47331 gil559527lembIX65136.1lISP1 1OPBPX [559527] gil3ll 415lembIZ22807. 1ISP16SRNAA [3114151 gii47329lemblX65 135. I1ISP53 I1PBPX [47329] gij473071embIX6 13 1. 1 ISP290PBPX [47307] gil47295lemblX583 12.1 ISP 16SRNA [47295] gil854614lembZ49109.11SPGADAGN [854614] gil5564281gbIL36660.1ISTRORFI [556428] giJ511062IembIZ35135.1IlSPALIAG [511062] gill 208737igbIU47625.1 ISPU47625 [1208737] giJ5300621gbIU12567.11SPU12567 [530062] giI1S3656lgbIM29686.11STRHEXB [153656] gill 536541gbIM 18729.1 ISTRHEXA [153654] gij153608jgbjM14339.1lSTRDPN2A [153608] gi11536051gbIM14340.1lSTRDPNIA [153605] gil6435431gbIU20084. 1 SPU20084 [643543] giJ64354 1 IgbIU20083.1I ISPU20083 [64354 11 gil643539lgblU20082. 11SPU20082 [643539] gil643537igblU2008 1.1 ISPU2008 1 [643537] gil6435351gbIU20080. 1 ISPU20080 [643535] gil6435331gbIU20079.1I1SPU20079 [643533] gil6435311gbIU20078.1lSPU20078 [643531] gil6435291gbIU20077. 1 ISPU20077 [643529] gil6435271gbIU20076. I1SPU20076 [6435271 gil6435251gblU20075.1 1SPU20075 [643525] gil6435231gbIU20074. 11SPU20074 [643523] giJ643521IlgbIU20073.1 ISPU20073 [643521 gil6435 191gbIU20072. 1 SPU20072 [643519] gil64351I7lgbIU2007 1.1 ISPU20071 [6435171 gil6435 1 SlgbIU20070. 1 ISPU20070 [6435151 gij6435l31gbIU20069.1 ISPU20069 [6435131 giJ6435 1 lgbIU20068. 1 ISPU20068 [6435111 gij6435091gblU20067. I SPU20067 [643509] oill Al 7R)OhAIT 117SA6 I IPT 137560 1017802] gil 6632771gbIM361 80.11ISTRCOMAA [663277] gi14377041gblL20670. 1 ISTRHYALURO [432704] gillI 538491gbIL0775 1. 
1 ITRNTN525-2R 153849] gill 538551gbIM255 19.1 ISTRVAI1 [153855] gill1538531gbIjM802 15.1 ISTRUVS402A [153853] WO 00/32825 WO 0032825PCT/I B99/02040 .4 2 9 gill 53840lgbIM74 122.1I ISTRSURPROA 153840] gil 1537961gbIM60763.1IjSTRRRNAA 153796] gi115379ligbIM31296.1ISTRRECP [153791] guS I 66391gbIL20556. 1 ISTRPLPA [516639] gi11537831gbIM28679.11STRPROMB [153783] gil 1537821gbIM28678. IISTRPROMIA 153782] gil 1537661gbIM90527.1IlSTRPONA 153766] gil 1537641gbIJ04479.1IlSTRPOLA 153764] gillI 537521gbIM255 15.1 STRNG4369 [153752] gil 1 537221gbIL0861 1.1 ISTRMLTODX [153722] gil I53702IgbIJ0 1796. 1 STRMALMXP [1537021 gil 15370 1 gbIJ01 795. 1 ISTRMALMX 153701 gi11536931gbM13812.IISTRLYTPN [153693] gi11536911gbjM17717.1lSTRLYS [153691] gi1l536671gbIM25525.IISTRKAG73 [153667] gij398 102jgblL20564. 1jSTREXP9B [398102] gil3981001gblL20563.IISTREXP9A [398100] gil398098lgbIL20562. 1 ISTREYP8A [398098] gij398096lgbjL20561 .1ISTR.EXP7A [398096] gil3980941gblL20560. 11STREXP6A [398094] gi13980921gbIL20559.I1ISTREXP5A [398092] gi1398090lgblL20558. 1 STREXP4A [398090] gill 536261gbIJ04234. 1 STR.EXOA [153626] gil153612igbIM11226.1lSTRDPNM [153612] gil I536031gblM2552 1. 1ISTRDN87669 [153603] gill153601 1gbM25526. 11STRDN87577 [153601] gi1153599igbIM25522.11STRDN]79 [153599] gi1153594igbIM37688.1lSTRDACA [153594] gi11535821gbL07752.IISTRATT'B [153582] gil4665141gbIL31413.1ISTRIRRA [466514] gill 15355 1 lgbIM2552O. 1 ISTR8249 [15355 11 giI153549igblM25524.11STR53 13972 [153549] gi 1 535471gbIM25517.1ISTR29044 [153547] gill jS '.07 P53L giI153541jgbM25518.1ISTR,121 [153541] gij 1535391gbIM255 16.1 ISTRI 1 0K70 [153539] gil506632igbIU04047.1 ISPU04047 [506632] gil393267jgbIL 19055.1 ISTRPAPA [393267] gi14420661gbIS62272.1 IS 62272 [442066] gil295l9l1lgbILl 1190.1 ISTRPUJRISYN [29519 1]

Claims (68)

1. A method for identifying at least one bacterial target for antibacterial agents, comprising: contacting a bacterial protein with a bacteriophage polypeptide that inhibits bacterial growth, wherein said bacteriophage polypeptide is a polypeptide encoded by said bacteriophage or a variant of said encoded polypeptide; determining whether said bacteriophage polypeptide binds to said bacterial protein; and identifying any said bacterial protein bound by said bacteriophage polypeptide, wherein binding of said bacteriophage polypeptide to said bacterial protein is indicative that said bacterial protein is a target.
2. The method of claim 1, wherein said variant comprises a fragment of a bacteriophage protein.
3. The method of claim 1 or 2, wherein said determining comprises detecting a protein:protein interaction between the bacteriophage polypeptide and the bacterial protein.
4. The method of claim 3, wherein said protein:protein interaction is detected by using a technique selected from the group consisting of affinity chromatography, immunoprecipitation, crosslinking, and yeast two hybrid.
5. The method of any one of claims 1 to 4, further comprising confirming essentiality of said target for bacterial replication.
6. The method of any one of claims 1 to 5, further comprising the step of screening for a small molecule that binds to or reduces the level of activity of the bacterial target.
7. The method of claim 6, wherein said antibacterial agent inhibits bacterial growth.
8. The method of any one of claims 1 to 7, further comprising the step of identifying a bacterial homolog or a bacterial protein fragment of said target.
9. The method of any one of claims 1 to 8, wherein a plurality of bacterial proteins in a bacterial extract are contacted with said bacteriophage polypeptide.
10. The method of any one of claims 1 to 9, wherein said bacteriophage is selected from the group consisting of the bacteriophages listed in Table 1.
11. The method of claim 10, wherein said bacteriophage has a host selected from the group consisting of Staphylococcus aureus, Streptococcus pneumoniae and Pseudomonas aeruginosa.
12. The method of any one of claims 1 to 11, wherein said bacteriophage polypeptide is an isolated or purified polypeptide.
13. The method of any one of claims 1 to 11, wherein said bacteriophage polypeptide is a recombinant bacteriophage polypeptide.
14. The method of any one of claims 1 to 13, further comprising identifying said bacteriophage polypeptide encoded by said bacteriophage, wherein identifying said bacteriophage polypeptide comprises sequencing the genome of said bacteriophage and identifying open reading frames in said genome.
15. The method of any one of claims 1 to 14, further comprising determining the binding site of said bacteriophage polypeptide onto said bacterial protein.
16. A method for identifying at least one bacterial target for antibacterial agents, comprising: contacting at least one homolog of a bacterial protein that binds with a bacteriophage polypeptide that inhibits bacterial growth; and determining whether said bacteriophage polypeptide binds to said homolog; wherein binding of said homolog by said bacteriophage polypeptide is indicative that said homolog is a target for antibacterial agents.
17. The method of claim 16, wherein said determining comprises detecting a protein:protein interaction between said homolog and the bacteriophage polypeptide.
18. The method of claim 17, wherein said protein:protein interaction is detected by using a technique selected from the group consisting of affinity chromatography, immunoprecipitation, crosslinking, and yeast two hybrid.
19. The method of any one of claims 16 to 18, wherein said homolog is in a bacterial extract.
20. The method of any one of claims 16 to 19, further comprising screening for a small molecule that binds to or reduces the level of activity of the bacterial target.
21. The method of claim 20, wherein said small molecule inhibits bacterial growth.
22. The method of any one of claims 16 to 21, further comprising identifying a bacteriophage polypeptide variant binding to said homolog.
23. The method of any one of claims 16 to 22, further comprising determining the binding site of said bacteriophage polypeptide onto said homolog.
24. A method for identifying a bacterial target for screening antibacterial agents, comprising: identifying a full-length bacteriophage protein or a bacteriophage polypeptide variant which inhibits bacterial growth when introduced into a bacteria; contacting said bacteriophage protein or variant with a full-length bacterial protein, or a bacterial protein fragment or a homolog thereof; and determining whether said bacteriophage polypeptide or variant binds to said full-length bacterial protein, or said bacterial protein fragment or homolog thereof; wherein binding of said bacteriophage protein or variant with said bacterial protein, fragment or homolog is indicative that said bacterial protein, fragment or homolog is a target for screening antibacterial agents.
25. The method of claim 24, wherein the bacteriophage protein or variant is contacted with a bacterial extract comprising a plurality of bacterial proteins.
26. The method of claim 24 or 25, wherein said bacteriophage polypeptide variant comprises a fragment of a full-length bacteriophage protein.
27. The method of any one of claims 24 to 26, wherein said determining comprises detecting a protein:protein interaction between the bacterial protein fragment or homolog and the bacteriophage protein or variant.
28. The method of any one of claims 24 to 27, further comprising confirming essentiality of said target for bacterial replication.
29. The method of any one of claims 24 to 28, further comprising screening for a small molecule that binds to or that reduces the level of activity of the bacterial target.
30. The method of claim 29, wherein said small molecule inhibits bacterial growth.
31. The method of any one of claims 24 to 30, further comprising determining the binding site of said bacteriophage polypeptide or variant onto said full-length bacterial protein, bacterial protein fragment or homolog thereof.
32. A method for identifying a bacterial target for screening antibacterial agents, comprising: providing a bait capable of inhibiting bacterial growth when introduced into a bacteria, said bait being selected from the group consisting of full-length bacteriophage proteins and bacteriophage polypeptide variants; providing a prey, selected from the group consisting of full-length bacterial proteins, bacterial protein fragments and homologs thereof; contacting said bait with said prey under conditions suitable for allowing formation of specific bait:prey complex(es); and identifying a prey forming any said bait:prey complex(es); wherein formation of bait:prey complexes is indicative that said prey is a bacterial target against which antibacterial agents may be screened.
33. The method of claim 32, wherein the full-length bacterial proteins or bacterial protein fragments are in a bacterial extract.
34. The method of claim 32 or 33, wherein said bait is immobilized on a solid phase matrix.
35. The method of any one of claims 32 to 34, further comprising confirming essentiality of said target for bacterial replication.
36. The method of any one of claims 32 to 35, further comprising screening for a small molecule that binds to or that reduces the level of activity of the bacterial target.
37. The method of claim 36, wherein said small molecule inhibits bacterial growth.
38. The method of any one of claims 32 to 37, wherein said bacteriophage is selected from the group consisting of the bacteriophages listed in Table 1.
39. The method of claim 38, wherein said bacteriophage has a host selected from the group consisting of Staphylococcus aureus, Streptococcus pneumoniae and Pseudomonas aeruginosa.
40. The method of any one of claims 32 to 39, wherein said bacteriophage protein or bacteriophage polypeptide variant has been isolated or purified.
41. The method of any one of claims 32 to 39, wherein said bacteriophage polypeptide is a recombinant bacteriophage polypeptide.
42. The method of any one of claims 32 to 41, further comprising sequencing the genome of said bacteriophage and identifying open reading frames in said genome for identifying full-length bacteriophage proteins or bacteriophage polypeptide variants encoded by said bacteriophage.
43. The method of any one of claims 32 to 42, further comprising determining site(s) of formation of said bait:prey complex(es).
44. A method for identifying at least one target for antibacterial agents, comprising: contacting a bacterial protein with a bacteriophage polypeptide that inhibits bacterial growth; determining whether said bacteriophage polypeptide binds to said bacterial protein; and identifying any said bacterial protein bound by said bacteriophage polypeptide, wherein binding of said bacteriophage polypeptide to said bacterial protein is indicative that said bacterial protein is a said target.
45. The method of claim 44, wherein said determining comprises identifying at least one bacterial protein which binds to said bacteriophage polypeptide using affinity chromatography on a solid matrix.
46. The method of claim 44 or 45, wherein said method further comprises identifying a bacterial nucleic acid sequence encoding said target of said bacteriophage polypeptide.
47. The method of any one of claims 44 to 46, wherein said determining is performed for a plurality of bacteriophage polypeptides that inhibit bacterial growth.
48. The method of claim 47, wherein said determining is performed using bacteriophage polypeptides that inhibit bacterial growth from a plurality of different bacteriophages.
49. The method of claim 48, wherein said plurality of different bacteriophage is at least 3 different bacteriophages.
50. The method of claim 49, wherein said plurality of different bacteriophage is at least 5 different bacteriophages.
51. The method of claim 50, wherein said plurality of different bacteriophage is at least 10 different bacteriophages.
52. The method of any one of claims 44 to 51, wherein said at least one target is a plurality of targets.
53. The method of claim 52, wherein said plurality of targets is from a plurality of different bacteria.
54. The method of any one of claims 44 to 53, further comprising determining the binding site of bacteriophage polypeptide to said bacterial protein.
55. A method for identifying an antibacterial agent, comprising: identifying at least one target for antibacterial agents according to any one of claims 1 to 5, 16 to 19, 24 to 28, 32 to 35, 44 to 54; and screening for at least one compound that binds to or reduces the level of activity of said at least one target; wherein identification of a compound that binds to or reduces the level of activity of said at least one target is indicative that said compound is an antibacterial agent.
56. The method of claim 55, wherein said compound is a small molecule.
57. The method of claim 55 or 56, wherein said compound is a fragment or variant of a bacteriophage inhibitor protein.
58. The method of any one of claims 55 to 57, wherein said screening comprises at least one step carried out in vitro.
59. The method of any one of claims 55 to 58, wherein said screening comprises at least one step carried out in vivo in a non-human animal.
60. The method of any one of claims 55 to 59, further comprising determining the binding site of said compound onto said target.
61. A method of making an antibacterial agent, comprising the steps of: identifying an antibacterial agent according to any one of claims 55 to 60; and synthesizing said antibacterial agent in an amount sufficient to inhibit bacterial growth.
62. The method of claim 61, wherein said antibacterial agent is synthesized in an amount sufficient to provide a therapeutic effect when administered to an organism infected by a bacterium naturally producing said target.
63. A method for inhibiting a bacterium, comprising the steps of making an antibacterial agent according to claim 61 or 62; and contacting said bacterium with said antibacterial agent.
64. The method of claim 63, wherein said contacting is performed in vitro.
65. The method of claim 63, wherein said contacting is performed in vivo in a non-human animal.
66. A method for treating a bacterial infection in a non-human animal suffering from an infection, comprising: making an antibacterial agent according to claim 61 or 62; and administering to said non-human animal a therapeutically effective amount of said antibacterial agent.
67. An isolated, purified, or enriched nucleic acid molecule at least 15 nucleotides in length, wherein said molecule comprises at least a portion of a bacteriophage sequence, and wherein said bacteriophage is selected from the group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1, with the proviso that said bacteriophage sequence is other than the nucleotide sequence shown in Table 32.
68. The nucleic acid sequence of claim 67, wherein said sequence comprises at least 50 nucleotides.
69. The nucleic acid sequence of claim 67 or 68, wherein said nucleic acid sequence corresponds to at least a portion of a nucleic acid sequence which encodes a product which provides a bacteria-inhibiting function.
70. The nucleic acid sequence of any of claims 67 to 69, wherein said nucleic acid sequence encodes a polypeptide having a bacteria-inhibiting activity.
71. The nucleic acid sequence of any of claims 67 to 70, wherein said nucleic acid sequence is transcriptionally linked with regulatory sequences enabling induction of expression of said sequence.
72. A recombinant vector comprising at least one nucleic acid sequence according to any one of claims 67 to 71.
73. The vector of claim 72, wherein said vector is an expression vector.
74. A recombinant cell comprising a vector according to claim 72 or 73.
75. An isolated, purified, or enriched polypeptide comprising at least a portion of an antimicrobial protein, wherein said polypeptide is encoded by a bacteriophage selected from the group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1, with the proviso that said polypeptide is encoded by a bacteriophage sequence other than the nucleotide sequence shown in Table 32.
76. The polypeptide of claim 75, wherein said polypeptide comprises at least 10 contiguous amino acid residues of said antimicrobial protein.
77. A computer readable device when used in the method according to any one of claims 1 to 66, said device having recorded therein a nucleotide sequence of a portion of at least one bacteriophage genome of Staphylococcus aureus bacteriophage 77, bacteriophage 3A, or bacteriophage 96, a nucleotide sequence at least 95% identical to a said nucleotide sequence, a ribonucleic acid equivalent, a degenerate equivalent, a homologous sequence, or at least one amino acid sequence encoded by said nucleotide sequence; and a nucleotide sequence or amino acid sequence analysis program, wherein said program can perform at least one sequence analysis on said nucleotide or amino acid sequence.
78. The device of claim 77, wherein said at least a portion of at least one bacteriophage genome comprises at least one ORF.
79. The device of claim 77 or 78, wherein said device comprises a medium selected from the group consisting of floppy disk, computer hard drive, optical disk, computer random access memory, and magnetic tape wherein said nucleotide or amino acid sequence or said program or both are recorded on said medium.
80. A method for identifying at least one bacterial target for antibacterial agents, substantially as hereinbefore described with reference to any one of the examples.
81. An isolated, purified, or enriched nucleic acid sequence at least 15 nucleotides in length, wherein said sequence corresponds to at least a portion of a bacteriophage sequence, and wherein said bacteriophage is selected from the group consisting of Staphylococcus aureus bacteriophage 77, 3A, 96, and 44AHJD, Enterococcus bacteriophage 182, and Streptococcus pneumoniae bacteriophage Dp-1, substantially as hereinbefore described with reference to any one of the examples.
82. A recombinant vector comprising at least one nucleic acid sequence according to claim 81.
83. A recombinant cell comprising a vector according to claim 82.
Dated 19 January, 2004 Phagetech, Inc. Patent Attorneys for the Applicant/Nominated Person SPRUSON & FERGUSON
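Claims 14 and 42 recite sequencing the bacteriophage genome and identifying open reading frames in it. The following is only a minimal sketch of what such an ORF scan could look like, not the procedure used by the applicants. It assumes the standard genetic code, ATG as the only start codon, and a hypothetical function name (find_orfs) and length threshold (min_codons) chosen purely for illustration.

```python
# Illustrative sketch only: a naive six-frame ORF scan.
# Assumptions (not from the patent): ATG-only starts, standard stop codons,
# and the names find_orfs / min_codons are hypothetical.
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement; unknown bases become 'N'."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))


def find_orfs(seq: str, min_codons: int = 30) -> List[Tuple[str, int, int]]:
    """Scan all six reading frames for ATG...stop stretches.

    Returns (strand, start, end) tuples. Coordinates are 0-based, half-open,
    and relative to the strand that was scanned; `end` includes the stop codon.
    min_codons counts codons from the start codon up to, but excluding, the stop.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i  # first in-frame ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None  # resume looking for the next start codon
    return orfs


if __name__ == "__main__":
    # Toy sequence, not a phage sequence from the patent.
    print(find_orfs("CCATGAAATTTGGGTAGCC", min_codons=3))
```

In practice a candidate ORF found this way would still need to be expressed and tested for growth inhibition, as the claims describe; the scan only narrows down which stretches of the genome might encode a polypeptide.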
AU15815/00A 1998-12-03 1999-12-03 Development of novel anti-microbial agents based on bacteriophage genomics Ceased AU774841B2 (en)

Applications Claiming Priority (13)

Application Number Priority Date Filing Date Title
US11099298P 1998-12-03 1998-12-03
US60/110992 1998-12-03
US32614499A 1999-06-03 1999-06-03
US09/326144 1999-06-03
US09/407,804 US6982153B1 (en) 1998-12-03 1999-09-28 DNA sequences from staphylococcus aureus bacteriophage 77 that encode anti-microbial polypeptides
US09/407804 1999-09-28
US15721899P 1999-09-30 1999-09-30
US60/157218 1999-09-30
US16877799P 1999-12-01 1999-12-01
US60/168777 1999-12-01
US09/454252 1999-12-02
US09/454,252 US6783930B1 (en) 1998-12-03 1999-12-02 Development of novel anti-microbial agents based on bacteriophage genomics
PCT/IB1999/002040 WO2000032825A2 (en) 1998-12-03 1999-12-03 Development of anti-microbial agents based on bacteriophage genomics

Publications (2)

Publication Number Publication Date
AU1581500A AU1581500A (en) 2000-06-19
AU774841B2 true AU774841B2 (en) 2004-07-08

Family

ID=27557794

Family Applications (1)

Application Number Title Priority Date Filing Date
AU15815/00A Ceased AU774841B2 (en) 1998-12-03 1999-12-03 Development of novel anti-microbial agents based on bacteriophage genomics

Country Status (5)

Country Link
EP (1) EP1135535A2 (en)
JP (1) JP2002531107A (en)
AU (1) AU774841B2 (en)
CA (1) CA2353563A1 (en)
WO (1) WO2000032825A2 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7101969B1 (en) 1998-12-03 2006-09-05 Targanta Therapeutics Compositions and methods involving an essential Staphylococcus aureus gene and its encoded protein
JP2003517833A (en) * 1999-12-22 2003-06-03 ファゲテック,インコーポレイティド Compositions and methods related to Staphylococcus aureus essential genes and proteins encoded by them
AU2002220422B2 (en) * 2000-12-01 2007-06-28 Targanta Therapeutics Inc. S.aureus protein STAAU R2, gene encoding it and uses thereof
AU2621202A (en) 2000-12-19 2002-07-01 Phagetech Inc Compositions and methods involving an essential staphylococcus aureus gene and its encoded protein staau r9
AU2002224692B2 (en) * 2000-12-20 2007-05-24 Targanta Therapeutics Inc. Compositions and methods involving an essential staphylococcus aureus gene and its encoded protein STAAU_R4
AU2002336752A1 (en) * 2001-09-21 2003-04-01 New Horizons Diagnostics Corporation Composition for treating streptococcus pneumoniae
US6759229B2 (en) 2001-12-18 2004-07-06 President & Fellows Of Harvard College Toxin-phage bacteriocide antibiotic and uses thereof
WO2005085288A2 (en) 2004-03-01 2005-09-15 The Cbr Institute For Biomedical Research Natural igm antibodies and inhibitors thereof
US7569223B2 (en) 2004-03-22 2009-08-04 The Rockefeller University Phage-associated lytic enzymes for treatment of Streptococcus pneumoniae and related conditions
GB201119167D0 (en) * 2011-11-07 2011-12-21 Novolytics Ltd Novel bachteriophages
US9409977B2 (en) 2013-03-12 2016-08-09 Decimmune Therapeutics, Inc. Humanized, anti-N2 antibodies
CN111316999B (en) * 2020-03-04 2022-02-08 苏州十一方生物科技有限公司 Spray type environmental disinfectant containing bacteriophage and preparation method and application thereof
CN111296493A (en) * 2020-03-09 2020-06-19 苏州十一方生物科技有限公司 Phage disinfectant and preparation method thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0072925A2 (en) * 1981-08-17 1983-03-02 Rutgers Research and Educational Foundation T4 DNA fragment as a stabilizer for proteins expressed by cloned DNA
WO1989000199A1 (en) * 1987-07-06 1989-01-12 Louisiana State University Agricultural And Mechan Therapeutic antimicrobial polypeptides, their use and methods for preparation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0755441A4 (en) * 1994-04-05 2000-01-26 Exponential Biotherapies Inc Antibacterial therapy with genotypically modified bacteriophage
DE69531539T2 (en) * 1995-06-16 2004-04-01 Société des Produits Nestlé S.A. Phage resistant streptococcus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MOL. MICROBIOL. (1997) 25, PP 717-25 *

Also Published As

Publication number Publication date
CA2353563A1 (en) 2000-06-08
JP2002531107A (en) 2002-09-24
WO2000032825A3 (en) 2001-01-18
EP1135535A2 (en) 2001-09-26
WO2000032825A2 (en) 2000-06-08
AU1581500A (en) 2000-06-19

Similar Documents

Publication Publication Date Title
US6783930B1 (en) Development of novel anti-microbial agents based on bacteriophage genomics
AU774841B2 (en) Development of novel anti-microbial agents based on bacteriophage genomics
US6638718B1 (en) Methods of screening for compounds active on staphylococcus aureus target genes
KR102003770B1 (en) Novel Staphylococcus specific bacteriophage SA3 and antibacterial composition comprising the same
KR101592177B1 (en) Method for prevention and treatment of Escherichia coli infection using a bacteriophage with broad antibacterial spectrum against Escherichia coli
CN107208068B (en) Novel Shiga toxin F18-producing Escherichia coli bacteriophage Esc-COP-1 and application thereof in inhibiting proliferation of Shiga toxin F18-producing Escherichia coli
CN109082414B (en) Staphylococcus aureus bacteriophage and application thereof
CN108359643A (en) Novel staphylococcus aureus bacteriophage and combinations thereof and application
CN110545670B (en) Phage therapy
KR102073095B1 (en) Escherichia coli bacteriophage Esc-COP-14 and its use for preventing proliferation of pathogenic Escherichia coli
KR102432624B1 (en) Novel Staphylococcus specific bacteriophage OPT-SC01 and antibacterial composition comprising the same
KR101993123B1 (en) Novel pathogenic Escherichia coli specific bacteriophage ECO5 and antibacterial composition comprising the same
CN110719784A (en) Therapeutic phage compositions
US6376652B1 (en) Compositions and methods involving an essential Staphylococcus aureus gene and its encoded protein
KR20210143974A (en) Jumbo bacteriophage PALS2 and its endolysins LysPALS21 and LysPALS22 from Staphylococcus aureus
AU778782B2 (en) Compositions and methods involving an essential staphylococcus aureus gene and its encoded protein
KR102418861B1 (en) Bacteriophage with growth inhibition activity against Staphylococcus sp.
KR102203675B1 (en) Novel Yersinia specific bacteriophage YE12 and antibacterial composition comprising the same
KR20180074578A (en) Novel Enterococcus faecalis specific bacteriophage EF1 and antibacterial composition comprising the same
KR102334893B1 (en) Novel Campylobacter specific bacteriophage OPT-CJ1 and antibacterial composition comprising the same
US20030138771A1 (en) DNA sequences from S. pneumoniae bacteriophage DP1 that encode anti-microbal polypeptides
US20070059709A1 (en) Staphylococcus aureus antibacterial target genes
KR101993125B1 (en) Novel ESBL producing Escherichia coli specific bacteriophage ECO4 and antibacterial composition comprising the same
KR101992013B1 (en) Novel bacteriophage having bacteriocidal activity against pathogenic enterobacteria and uses thereof
KR102066898B1 (en) Novel Enterococcus faecalis specific bacteriophage EF5 and antibacterial composition comprising the same

Legal Events

Date Code Title Description
MK6 Application lapsed section 142(2)(f)/reg. 8.3(3) - pct applic. not entering national phase
TH Corrigenda

Free format text: IN VOL 14, NO 37, PAGE(S) 6637-6641 UNDER THE HEADING APPLICATIONS LAPSED, REFUSED OR WITHDRAWN PLEASE DELETE ALL REFERENCE TO APPLICATION NO. 15815/00

FGA Letters patent sealed or granted (standard patent)