AU696768B2 - Recombinant xylanases - Google Patents

Info

Publication number
AU696768B2
Authority
AU
Australia
Prior art keywords
gly
thr
ala
ser
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU43479/93A
Other versions
AU4347993A (en
Inventor
Harry John Gilbert
Geoffrey Peter Hazlewood
Gang Ping Xue
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Commonwealth Scientific and Industrial Research Organization CSIRO
Newcastle University of Upon Tyne
Babraham Institute
Original Assignee
Commonwealth Scientific and Industrial Research Organization CSIRO
University of Newcastle, The
Babraham Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
Application filed by Commonwealth Scientific and Industrial Research Organization CSIRO, University of Newcastle, The, Babraham Institute filed Critical Commonwealth Scientific and Industrial Research Organization CSIRO
Priority to AU43479/93A priority Critical patent/AU696768B2/en
Priority claimed from PCT/GB1993/001283 external-priority patent/WO1993025693A1/en
Publication of AU4347993A publication Critical patent/AU4347993A/en
Assigned to BIOTECHNOLOGY AND BIOLOGICAL SCIENCES RESEARCH COUNCIL, THE, UNIVERSITY OF NEWCASTLE-UPON-TYNE reassignment BIOTECHNOLOGY AND BIOLOGICAL SCIENCES RESEARCH COUNCIL, THE Amend patent request/document other than specification (104) Assignors: AGRICULTURAL AND FOOD RESEARCH COUNCIL, THE, UNIVERSITY OF NEWCASTLE-UPON-TYNE
Application granted granted Critical
Publication of AU696768B2 publication Critical patent/AU696768B2/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Links

Landscapes

  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Catalysts (AREA)

Description

Appln. ID: 43479/93
PCT Number: PCT/GB93/01283
(51) International Patent Classification: C12N 15/56, 9/24, A23K 1/16, A21D 8/04, D21C 9/10
(11) International Publication Number: WO 93/25693
(43) International Publication Date: 23 December 1993 (23.12.93)
(21) International Application Number: PCT/GB93/01283
(22) International Filing Date: 17 June 1993 (17.06.93)
Priority data: PL 2985, 17 June 1992 (17.06.92)
(81) Designated States: AU, [...], SK, UA, US, VN; European patent (AT, BE, CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE); OAPI patent ([...]).
Applicants: COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH ORGANISATION, Limestone Avenue, Campbell, Australian Capital Territory, Australia; THE BABRAHAM INSTITUTE, Babraham Hall, Babraham, Cambridge, United Kingdom (GB).
Applicants and Inventors: HAZLEWOOD, Geoffrey, Peter, Newmarket, Suffolk (GB); GILBERT, Harry, John, 16 Kells Gardens, Low Fell, Gateshead, Tyne and Wear NE9 5XS (GB).
(74) Agents: SHEARD, Andrew, Gregory et al; Kilburn & Strode, 30 John Street, London WC1N 2DD (GB).
Published with international search report.
(54) Title: RECOMBINANT XYLANASES
[Figure: restriction maps of the pNX plasmid series (pNX2, pNX3, pNX4 and further constructs) aligned with the xynA coding region, with the xylanase activity of each construct.]
(57) Abstract
Recombinant xylanases are derived from anaerobic fungi, particularly Neocallimastix patriciarum. The enzymes are highly specific for xylans and have industrial value, particularly in the pulp and paper industries. Certain truncated forms of the enzymes, and enzymes encoded by truncated DNA sequences, are preferred for their high expression levels.
This invention relates to recombinant xylanases derivable from an anaerobic fungus.
Xylan, a major component of plant hemicelluloses, consists of a polymer of 1,4-linked β-D-xylopyranose units substituted with mainly acetyl, arabinosyl and glucuronosyl residues. Hardwood xylan is typically O-acetyl-4-O-methylglucuronoxylan with approximately ten percent of xylose units α-1,2-linked to a 4-O-methylglucuronic acid side chain, and seventy percent of xylose residues acetylated at the C-2 or C-3 positions. Softwood xylans are commonly arabino-4-O-methylglucuronoxylans in which more than ten percent of xylose units are substituted with α-1,3-linked arabinofuranose residues. A repertoire of microbial enzymes act co-operatively to convert xylan to its constituent simple sugars. These include endo-β-1,4-xylanases (EC 3.2.1.8), β-xylosidase (EC 3.2.1.37) and a series of enzymes which cleave side-chain sugars (glycosidases) or remove acetyl groups from the xylan backbone (Dekker and Richards, Adv. Carbohydr. Chem. Biochem. 32: 277-352 (1976); Biely, Trends Biotechnol. 3: 286-290 (1985); Poutanen et al, "Accessory Enzymes Involved in the Hydrolysis of Xylans" In: Enzymes in Biomass Conversion. ACS Symposium Series 460. pp426-436. Ed. G.F. Leatham. (1991)). Xylanolytic micro-organisms generally express isoenzymic forms of xylanases which are encoded by multiple genes (Hazlewood et al, FEMS Microbiol. Lett. 51: 231-236 (1988); Gilbert et al, J. Gen. Microbiol. 134: 3239-3247 (1988); Clarke et al, FEMS Microbiol. Lett. 83: 305-310 (1991)).
Some xylanases hydrolyse only xylan (Hall et al, Mol. Microbiol. 3: 1211-1219 (1989); Wong et al, Microbiol. Rev. 52: 305-317 (1988)). Many microorganisms that hydrolyse xylan also degrade cellulose. In view of the similarity of the bond cleaved (β-1,4-glycosidic linkages), and the cross-specificity sometimes observed between cellulases and xylanases, the phylogenetic relationships of these enzymes are an interesting question. Recently, sequence alignment and hydrophobic cluster analysis have been utilised to assign plant cell wall hydrolases to eight enzyme families (Henrissat et al, Gene 81: 83-95 (1989); Gilkes et al, Microbiol. Rev. 55: 303-315 (1991)). Xylanases showed no convincing sequence identity with cellulases, suggesting that the two enzyme species evolved from distinct ancestral genes.
Many plant cell wall hydrolases consist of two distinct domains: a catalytic domain (CD) linked by hydroxyamino acid/proline-rich linker sequences to a non-catalytic cellulose binding domain (CBD; Gilkes et al, Microbiol. Rev. 55: 303-315 (1991); Kellett et al, Biochem. J. 272: 369-376 (1990); Gilbert et al, Mol. Microbiol. 4: 759-767 (1990)). The precise role of the CBD is the subject of much debate; in aerobic fungal cellulases the CBD plays a critical role in the enzymes' hydrolysis of crystalline cellulose (Tomme et al, Eur. J. Biochem. 170: 575-581 (1988)).
The role of this domain in prokaryotic cellulases and xylanases is less certain (Ferreira et al, Biochem. J. 269: 261-264 (1990)). In addition to their modular structure, cellulases often contain extended repeated sequences (Gilkes et al, Microbiol. Rev. 55: 303-315 (1991)). The precise role of these tandem repeats is largely unresolved.
Many cellulolytic and hemicellulolytic prokaryotes reside in the rumen of cows and sheep. Recently, anaerobic rumen fungi have also been shown to degrade both cellulose and xylan efficiently (Orpin and Letcher, Curr. Microbiol. 3: 121-124 (1979); Lowe et al, Appl. Environ. Microbiol. 53: 1216-1223 (1987)) and similar fungi reside in the alimentary tracts of large herbivores (Orpin and Joblin, "Anaerobic fungi". In: The Rumen Microbial Ecosystem, P.N. Hobson (Ed), pp129-150, Elsevier, London (1988)). The cellulase complex of the rumen fungus Neocallimastix frontalis has been characterised by Wood et al, Biochemistry and Genetics of Cellulose Degradation: FEMS Symp. 43: 31-52 (1988). The lower eukaryote synthesises a large multienzyme complex, of Mr 1-2 million, which rapidly hydrolyses crystalline cellulose. The complex contains substantial endoglucanase, and some β-glucosidase activity. The fungus also synthesises an Avicelase, presumably a cellobiohydrolase. Another rumen fungus, Neocallimastix patriciarum, produces extracellular enzymes which hydrolyse filter paper cellulose, AVICEL (a trade mark for microcrystalline cellulose) and xylan (Williams and Orpin, Can. J. Microbiol. 33: 418-426 (1987)). None of these enzymes has been characterised. Limited information on Neocallimastix genes encoding plant cell wall hydrolases has been described (Reymond et al, FEMS Microbiol. Lett. 77: 107-112 (1991)).
Xylans are found, in association with lignin, in the primary and secondary cell walls of most plants. The association between xylan and lignin is the key to the commercial potential of xylanases in, among other things, paper pulp processing.
Sandoz Products Ltd in the USA have already conducted practical trials using a crude fungal xylanase to replace, at least partially, the amount of chlorine and chlorine-derived compounds normally used to bleach the objectionable brown lignin-derived residues in the treatment of wood pulp in the production of paper and other wood-derived products. The chlorine requirements of present day wood pulping plants are such that each plant may have its own chlorine dioxide production unit.
The advantages to the paper industry in avoiding the use of chlorine are clear: improvements in waste handling, operator safety and plant capital could be achieved if a suitable replacement for chlorine could be found. However, the paper industry is intensely competitive, and profit margins are slim, so any chlorine replacement must be capable of being produced reasonably economically and must also, of course, be sufficiently effective to persuade pulp and paper manufacturers of the benefits of its use.
The full length cDNA and protein sequence of a xylanase from Neocallimastix patriciarum were available from the EMBL databank in Heidelberg, Germany, as of 5 May 1992 under the accession number X65526. The xylanase was designated XYLA and the corresponding gene xynA.
It has now been found that modified xylanases derived from individual xylanases from anaerobic fungi, such as the XYLA enzyme from N. patriciarum, have properties which make them appropriate for industrial use, particularly in the manufacture of pulp and paper. Surprisingly, it appears that truncation can enhance the expression of the enzyme.
According to a first aspect of the present invention, there is provided a xylanase which has at least one catalytic domain which is substantially homologous with a xylanase of an anaerobic fungus and which is not a full length natural xylanase.
Preferred catalytic domains are identical to catalytic domains of natural xylanases from anaerobic fungi. However, for the purpose of the present invention, a first sequence is substantially homologous with a second sequence if, for example, it shares its biological activity and there is at least about 40% homology at the amino acid level; so a catalytic domain of a xylanase of this aspect of the invention has at least about 40% homology with a catalytic domain of a natural xylanase of an anaerobic fungus. In general, it may be preferred for there to be at least 70%, 80% or 90% homology (in increasing order of preference) between the two amino acid sequences being compared. Homology may alternatively or additionally be assessed at the nucleic acid level. DNA encoding a first amino acid sequence may be substantially homologous with and hybridise to DNA (which may be cDNA or genomic DNA) which encodes a second amino acid sequence or would so hybridise but for the degeneracy of the genetic code. Hybridisation conditions may be stringent, such as 65°C in a salt solution of approximately 0.9 molar.
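The percentage thresholds above can be illustrated with a short calculation. The following Python sketch is not part of the specification. It assumes that homology is approximated by simple percent identity over an ungapped, equal-length alignment, which is only one of several ways homology could be scored, and it uses the first 46 residues of CAT1 and CAT2 as set out later in this description. The function names `percent_identity` and `preference_tier` are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def preference_tier(identity: float) -> str:
    """Map a percent identity onto the thresholds named in the text (40, 70, 80, 90)."""
    for threshold in (90, 80, 70, 40):
        if identity >= threshold:
            return f"at least {threshold}%"
    return "below 40%"


# N-terminal 46 residues of CAT1 and CAT2 as given in this description.
CAT1_START = "RLTVGNGQTQHKGVADGYSYEIWLDNTGGSGSMTLGSGATFKAEWN"
CAT2_START = "KFTVGNGQNQHKGVNDGFSYEIWLDNTGGNGSMTLGSGATFKAEWN"

identity = percent_identity(CAT1_START, CAT2_START)
print(f"{identity:.1f}% identical -> {preference_tier(identity)}")
```

A real comparison would normally use a gapped alignment over the whole domain; the ungapped fragment is used here only because the two N-terminal stretches line up position for position.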
Examples of anaerobic fungi, which may be alimentary tract (particularly rumen) fungi, include: Neocallimastix spp., such as N. patriciarum, N. frontalis, N. hurleyensis and N. stanthorpensis; Sphaeromonas spp., such as S. communis; Caecomyces spp., such as C. equi; Piromyces spp., such as P. communis, P. equi, P. dumbonica, P. lethargicus and P. mai; Ruminomyces spp., such as R. elegans; Anaeromyces spp., such as A. mucronatus; and Orpinomyces spp., such as O. bovis and O. joyonii.
Caecomyces equi, Piromyces equi, Piromyces dumbonica and Piromyces mai are found in horses and not in the rumen of cattle like the other fungi listed above.
Neocallimastix spp. are preferred, particularly N. patriciarum.
Xylanases in accordance with the invention may have a high specific activity. The specific activity may be significantly higher than that of bacterially derived xylanases and may for example be at least 1000, 2000, 3000, 4000, 4500, 5000 or even 5500 U/mg protein, in increasing order of preference. (A unit of xylanase activity is defined as the quantity of enzyme releasing 1 µmole of product, measured as xylose equivalents, in 1 minute at [...].) More particularly, xylanases in accordance with this aspect of the invention may be significantly better expressed than natural XYLA is expressed by N. patriciarum; expression may be at least 10 fold improved or preferably at least 100 fold improved over the wild type enzyme.
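The unit definition and the graded activity thresholds can be expressed as a small calculation. The Python sketch below is illustrative only: the assay figures (12 µmol released in 10 minutes by 0.0002 mg of protein) are hypothetical and are not taken from the specification, while the threshold list and the figure of 5980 U/mg for purified XYLA are those stated in this description. All function names are assumptions.

```python
# Thresholds for specific activity (U/mg protein) listed in the text,
# in increasing order of preference.
ACTIVITY_THRESHOLDS = (1000, 2000, 3000, 4000, 4500, 5000, 5500)


def xylanase_units(umol_xylose_equivalents: float, minutes: float) -> float:
    """Units of activity: micromoles of product released per minute."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return umol_xylose_equivalents / minutes


def specific_activity(units: float, mg_protein: float) -> float:
    """Specific activity in U per mg of protein."""
    if mg_protein <= 0:
        raise ValueError("protein mass must be positive")
    return units / mg_protein


def highest_threshold_met(u_per_mg: float):
    """Largest listed threshold that the measured activity reaches, or None."""
    met = [t for t in ACTIVITY_THRESHOLDS if u_per_mg >= t]
    return max(met) if met else None


def fold_improvement(recombinant_level: float, wild_type_level: float) -> float:
    """Ratio of recombinant to wild type expression level."""
    if wild_type_level <= 0:
        raise ValueError("wild type level must be positive")
    return recombinant_level / wild_type_level


# Hypothetical assay: 12 umol released in 10 min by 0.0002 mg protein.
units = xylanase_units(12.0, 10.0)
activity = specific_activity(units, 0.0002)
print(f"{activity:.0f} U/mg, threshold met: {highest_threshold_met(activity)}")

# The figure quoted in this description for purified XYLA.
print(highest_threshold_met(5980))
```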
Xylanases in accordance with the invention may have the ability to degrade xylan at high efficiency. At least 0.1, and preferably at least 0.5 or even 0.75 g reducing sugar may be produced per g xylan substrate.
Xylanases in accordance with the invention may have no significant residual activity against cellulose, in contrast to many known xylanases. This property is particularly useful in the application of the invention to the pulp and paper industry, as the enzyme can remove xylan and dissociate lignin from plant fibre without damaging cellulose fibre.
Xylanases in accordance with the invention may have at least two catalytic domains. The arrangement of the catalytic domains may be as in a wild type xylanase enzyme, or they may be arranged in an artificial configuration to increase or otherwise improve the xylanolytic activity of the enzyme.
A particularly preferred xylanase as a source of catalytic domains for use in the invention is that derived from Neocallimastix patriciarum and designated XYLA; it has the following properties:
(i) a specific activity of 5980 U/mg protein for the purified enzyme when prepared by the following protocol: Host cells (E. coli XL1-Blue harbouring a plasmid expressing the enzyme) are harvested by centrifugation and resuspended in Tris-HCl buffer, pH 8.0, and the cytoplasmic fraction prepared as described by Clarke et al (FEMS Microbiol. Lett. 83: 305-310 (1991)). Xylanase, precipitated by the addition of ammonium sulphate (0.39 g/ml), is redissolved in 10 mM Tris-HCl buffer, pH [...]. After dialysis against three changes of the same buffer, the xylanase is substantially purified by anion-exchange chromatography on DEAE-Trisacryl M essentially as described by Hall et al (Mol. Microbiol. 3: 1211-1219 (1989));
(ii) the ability to degrade xylan at high efficiency, releasing 0.9 g of reducing sugar per g of the substrate;
(iii) no significant residual activity against cellulose (as determined by no detectable release of reducing sugar from carboxymethyl cellulose, barley β-glucan, laminarin or lichenan); and
(iv) two catalytic domains.
The structure of mature XYLA may be represented as follows (from the N-terminus to the C-terminus):
CAT1-LINK1-CAT2-LINK2-CTR1-CTR2
wherein: CAT1 represents a first catalytic domain having the sequence:
RLTVGN
GQTQHKGVADGYSYEIWLDNTGGSGSMTLGSGATFKAEWN
ASVNRGNFLARRGLDFGSQK KATDYSYIGLDYTAYRQTG SASGNSRLCVYGWFQNRGVQ GVPLVEYYIIDWVDWVPDA QGRMVTIDGAQYKFQMDHT GPTINGGSETFKQYFSVRQQ KRTSGHITVSDHFKEWAKQG WGIGNLYEVALNAEGWQSSG
IADVTKLDVYTTQKGSNPAP;
CAT2 represents a second catalytic domain having the sequence
K
FTVGNGQNQHKGVNDGFSYEIWLDNTGGNGSMTLGSGATF
KAEWNAAVNRGNFLARRGLDFGSQKKATDYDYIGLDYAAT
YKQTASASGNSRLCVYGWFQ NRGLNGVPLVEYYIIEDWVD WVPDAQGKMVTIDGAQYKIF QMDHTGPTINGGSETFKQYF SVRQQKRTSGHITVSDHFKE WAKQGWGIGNLYEVALNAEG WQSSGVADVTLLDVYTTPKG SSPA;
LINK1 represents a first linker having the sequence:
TSTGTVPSSSAGGSTANGK;
LINK2 represents a second linker having the sequence:
TSAAPRTTTRTITRTKSLPTNYNK;
CTR1 represents a first C-terminal repeat having the sequence:
CSARITAQGYKCCSDPNCVVYYTDEDGTWGVENNDWCGCG;
and CTR2 represents a second C-terminal repeat having the sequence:
VEQCSSKITSQGYKCCSDPNCVVFYTDDDGKWGVENNDWC
GCGF.
All these partial sequences can be seen in SEQ ID NO: 1 and SEQ ID NO: 2.
The structure of xylanases from other anaerobic fungi may be broadly similar, but of course the precise sequences of the components will generally be different, unless the source organism is very closely related to N. patriciarum. It may not be necessary for the entirety of the sequence of each region (particularly the catalytic domains) to be present for activity; in the present invention, although the entirety of a catalytic domain may be present, it is sufficient for the active portion of the catalytic domain to be present (that is to say, the catalytic domain must be functionally present).
The two catalytic domains can be seen to be very similar to each other but not identical. The difference between them gives an indication of the degree of homology to a natural sequence that is particularly preferred. The two C-terminal repeats can also be seen to be similar to each other (but less so than the two catalytic domains). The difference between them gives an indication of the degree of homology which is still highly preferred. The precise sequence of the two linker sequences may not be particularly important; all that is necessary is that the spatial arrangement of the catalytic domain(s) is such as to enable them to function effectively (and preferably optimally).
Preferred embodiments of the invention comprise a catalytic domain which is substantially homologous with at least one of CAT1 and CAT2 and are missing at least part of the amino acid sequence downstream (ie towards the C-terminus) of CAT2. At least part of CTR2 may be missing; alternatively or (preferably) additionally, at least part of CTR1 may be missing.
Particular embodiments of xylanases in accordance with the invention include those including (and preferably consisting essentially of) the following regions:
A. CAT1-LINK1-CAT2-LINK2-CTR1(truncated) (eg pNX3);
B. CAT1-LINK1-CAT2-LINK2(truncated) (eg pNX4);
C. LINK1(truncated)-CAT2-LINK2(truncated) (eg [...]);
D. CAT1-LINK1(truncated) (eg pNX6);
E. CAT1(truncated) (eg pNX7);
F. LINK1(truncated)-CAT2-LINK2-CTR1-CTR2 (eg pNX8);
G. LINK1(truncated)-CAT2-LINK2-CTR1(truncated) (eg pNX9);
H. LINK1(truncated)-CAT2(truncated) (eg [...])
(The plasmid designations in brackets refer to plasmids in the examples whose expression products are the xylanases shown.) Signal sequences may initially be present but will preferably be absent in the final molecule. Structures C, F, G and H are preferred and structures C, G and H are particularly preferred.
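The region structures A to H can also be written down as data and checked against the two conditions of the first aspect: each xylanase has at least one catalytic domain, and none is the full length natural enzyme. The Python sketch below is an illustration only; the `Region` type and the helper names are hypothetical, and a region marked "(truncated)" is simply flagged, with no attempt to say how much of it is missing.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    name: str
    truncated: bool = False


# Region order of mature, full length XYLA as given in the text.
FULL_LENGTH = ("CAT1", "LINK1", "CAT2", "LINK2", "CTR1", "CTR2")

# Structures A to H as listed in the text.
STRUCTURES = {
    "A": "CAT1-LINK1-CAT2-LINK2-CTR1(truncated)",
    "B": "CAT1-LINK1-CAT2-LINK2(truncated)",
    "C": "LINK1(truncated)-CAT2-LINK2(truncated)",
    "D": "CAT1-LINK1(truncated)",
    "E": "CAT1(truncated)",
    "F": "LINK1(truncated)-CAT2-LINK2-CTR1-CTR2",
    "G": "LINK1(truncated)-CAT2-LINK2-CTR1(truncated)",
    "H": "LINK1(truncated)-CAT2(truncated)",
}

SUFFIX = "(truncated)"


def parse(architecture: str) -> List[Region]:
    """Split 'CAT1-LINK1(truncated)' into Region records."""
    regions = []
    for token in architecture.split("-"):
        truncated = token.endswith(SUFFIX)
        name = token[: -len(SUFFIX)] if truncated else token
        regions.append(Region(name, truncated))
    return regions


def has_catalytic_domain(regions: List[Region]) -> bool:
    """True if CAT1 or CAT2 is present, whether whole or truncated."""
    return any(r.name in ("CAT1", "CAT2") for r in regions)


def is_full_length(regions: List[Region]) -> bool:
    """True only for the complete, untruncated natural arrangement."""
    names = tuple(r.name for r in regions)
    return names == FULL_LENGTH and not any(r.truncated for r in regions)


for label, architecture in STRUCTURES.items():
    regions = parse(architecture)
    print(label, has_catalytic_domain(regions), is_full_length(regions))
```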
Enzymes in accordance with the invention may comprise a single CAT1 domain, a single CAT2 domain, or have two or more catalytic domains, each of which independently may be chosen from CAT1 and CAT2. It may be that substantially only catalytic domains are present; and as indicated above it may be that not all of the natural catalytic domain sequences are essential for adequate activity.
On the immature protein a signal peptide may be present; the sequence of the natural signal peptide is:
MRTIKFFFAVAIATVAKAQWGGGGASAGQ.
This sequence again is shown in SEQ ID NO:1 and SEQ ID NO:2.
Xylanases in accordance with the invention may be prepared by any suitable means. While bulk fermentation of the source anaerobic fungus may be undertaken, and polypeptide synthesis by the techniques of organic chemistry may be attempted, the method of preparation of choice will generally involve recombinant DNA technology. A xylanase as described above will therefore for preference be the expression product of heterologous xylanase-encoding DNA in a host cell.
According to a second aspect of the invention, there is provided an isolated or recombinant DNA molecule encoding a xylanase which has a catalytic domain substantially homologous with a xylanase of an anaerobic fungus, provided that the DNA molecule does not comprise a full length copy of natural mRNA encoding the xylanase.
cDNA (apparently comprising a full length copy of mRNA) encoding a xylanase of Neocallimastix frontalis has been described by Reymond et al, FEMS Microbiol. Lett. 77: 107-112 (1991), but no expression was reported.
Although a full length copy of natural mRNA is not present in DNA in accordance with this aspect of the invention, it should be understood that the invention is not limited to truncated cDNAs. It is contemplated that some or all of the introns (if any) naturally present in the corresponding wild type gene may be present. However, at least some sequence that is present in the full length cDNA is absent in DNA in accordance with this aspect of the invention. It should also be understood that this aspect of the invention encompasses DNAs encoding full length xylanases; the absent portion of the DNA may be (and in some embodiments preferably is) in the 3' and/or 5' untranslated regions. Substantially full length or truncated xylanases may therefore be produced from DNA in accordance with this aspect of the invention which is substantially missing the 3' untranslated region, or is substantially missing the 5' untranslated region, or is substantially missing both the 3' and 5' untranslated regions.
A full length cDNA encoding a xylanase of an anaerobic fungus (taking the xynA gene of N. patriciarum as the prototype) may have the following structure:
5'utr-sig-cat1-link1-cat2-link2-ctr1-ctr2-3'utr,
wherein 5'utr represents a 5' untranslated region; sig encodes a signal peptide; cat1 encodes a first catalytic domain; link1 encodes a first linker sequence; cat2 encodes a second catalytic domain; link2 encodes a second linker sequence; ctr1 encodes a first C-terminal repeat; ctr2 encodes a second C-terminal repeat; and 3'utr represents a 3' untranslated region.
Genomic sequences may have one or more introns interspersed within the above structure. In the xynA gene encoding the XYLA enzyme of N. patriciarum, the various DNA segments have the following sequences:
5'utr:
TTrATAATCAATCTCTAAT TATT TTAGGAAAAAATAMAATAAATATAAT AAATATTAG;AGAGTAATATTAAAACAGAAAAAAAACTArTAG=trA-TTT TTTACTGTMAAAAAAAAATAAAAAACAAAATTAATAAAGATATTTTGAAAMhTATT
CAATTAGAAAAAAA;
sig:
ATGAGAACTATTAAATTCTTTTTCGCAGTAGCTATTGCAACTGTTGCTAAGGCCCAATGGGGTGGAGGTGGTGCCTCTGCTGGTCAA;
cat1:
AGA~TTAhCCGTCGGTAATO GTCAAACCCAACATAAGGTGTAGCTGATGGrTACAGTTATGAAATCTGTTAGATAMCA CCGGTG4GTGTCTATACTTCGTGTGTGCAACCTTCAGGCTGATGATG CATCTGTTAACCGTGGTAA CTTG~CCCGTCGTGGTCTTGACTTCG=TCTCAAAAGA
AGGCAAVCGATTACAGCTACATTG=TTGGATTATACTGCAACTTACAGACAAACTGGTA
GCGCAAGTcTAAr.'CCCGCrC'CTcTATACGTGGTCCAAAACCToAGT,'AAG GTGTCCATTCGTAGAATACTACATCATTWGTGCrGGTTACTGGGTCCGATGCAC AAGGTAGATC4GTAACCAATGGAGCTCATATAGATTTTCCAAATGGATCACACTG
GTCCAACTATCAATGGTGGTAGTGAAACCTTTAAGCAATACTCAGTC;TCCOTCAACAAA
ACTTCTG CATATTACTGTCTCIGATCACMGMTGGCCAAACAAGGTT GGGGTA'1 GGTAACCTTATGAAGTTGTTGAACGCCGAAGGTGGCAAGTATGGTA
TAGCTGATGTCACCAAGTTAGATQTTACACAACCCAAAAAGGTTCTAATCCTGCCCCT;
link1:
ACCTCCAO"TACTGTCCAAGCACTGCTGTGAAGTACTCCAATGGTAAA;
cat2:
AAGT
TTACTGTCGGTAMTGACAAACCAACATAAGGGTkGTCAACGATGGTTCAGTTATGAAA TCTGGAGATAACACGGTGTAACGTTATACTCTCGTAGTGGTGCAAT1T~iCA AGGAT TTGCGCT 'AAC GGTACCTTCCCCCGTCGTGG;T~rGACT TCGGTCTCAAAAAAGCACCGATTACGACTACATTGATTAATAGTCCGTACT1 ACAAACAAACTGCCAGTGCAAOTGGTAACTCCCOTCTCTC;TGTATACGQATGGTrTCCA ACCGrGGACTTAATGGCGTTCCTAGTAGAMTACTACATCATGAAGATTGGcGCT GG~rrCAGATGCACAAGGAAAAATGGTAACCATTATGAGCTCAATATAAGATTTTCC AAATGGATCACACTGGTCCAACTATCAATGGTGGTAGTGAAACCTTTAAGCAA6TAcTTCA
GTGTCCGTCAACAAAAGAGAACTTCTGGTCATATTACTGTCTCAGATCACTTAAGGAAT
GLrCCTITCACGCCGAAGGTT GGCAAAGTAGTGaTGTrCTrATGTCACCTTATTAGATGTTACACAACTCCAAAGGGTT CTAGTCCAG CC;
link2:
ACCTGCCGCC TACTACTACCCGTACTACTACGTACCAAGTCC"TCCAACC
AATTACAATAAG;
ctr1:
TGTGTGAT='-CAGTCAGGTTGGTCATZGTT
TACTACACTCATGAGGATGGTrACTGGCGTGTTGAAAACAACGACGGTGTGGTTCOTGT;
ctr2:
GTTGACAATGTTCTCCAAGATACTTCTCAAGGTTACAAGTGTTTAGCATCCAAAT
TGCGTT0~TiCTrwACACTGATACGATGTAATGGGGTGTGAA1ACAACGATGGT G=GaTGTc;
and 3'utr:
T=AAG~GTAAAATACTAATTAATAA
AMrAA.TTAAA~A~TTATTAAAnATTA GAAAAATrTA MA AAA AAAAArAAAAAI.TAAGTTTGAAAAM=-A
AAGAATTAAAAAAAMAATTGAAGMTATGAAAMTTA
AAATGTAAAAGTTTAAAA.TACAAATTTAGAATGATTAAAA
AAMATTATGAAAAACCCAAATGTAArMr
(Note that the first three nucleotides of the 3'utr segment constitute a stop codon, which will generally be present.) The use of (less than the totality of) these DNA segments, or sequences substantially homologous with them, is preferred in this aspect of the invention.
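As a consistency check on the segments listed above, the sig segment can be translated with the standard genetic code and compared with the signal peptide given earlier in this description. The Python sketch below is illustrative and makes one assumption that should be stated plainly: `SIG_DNA` is the sig segment as printed above with stray non-nucleotide characters from text extraction removed. The codon table is the standard one; the names `translate`, `SIG_DNA` and `SIGNAL_PEPTIDE` are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, reading frame 1, one letter per codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# sig segment, with extraction debris removed (an assumption; see lead-in).
SIG_DNA = (
    "ATGAGAACTATTAAATTCTTTTTCGCAGTAGCTATTGCAACTGTTGCT"
    "AAGGCCCAATGGGGTGGAGGTGGTGCCTCTGCTGGTCAA"
)

# Natural signal peptide as given in this description.
SIGNAL_PEPTIDE = "MRTIKFFFAVAIATVAKAQWGGGGASAGQ"

print(translate(SIG_DNA))
print(translate(SIG_DNA) == SIGNAL_PEPTIDE)
```

The same function could be applied to the other coding segments once a clean copy of their sequences is available, for example from SEQ ID NO: 1.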
Preferred embodiments correspond generally to the preferred embodiments of the xylanases per se in accordance with the first aspect of the invention, but with the added considerations that it may be preferred for a DNA sequence encoding a peptide signal sequence to be present and/or it may be preferred for one or both of the untranslated regions to be truncated or absent. Particular embodiments of this aspect of the invention include those including (and preferably consisting essentially of, apart from vector-derived sequences) the following segments:
a. 5'utr-sig-cat1-link1-cat2-link2-ctr1(truncated) (eg pNX3);
b. 5'utr-sig-cat1-link1-cat2-link2(truncated) (eg pNX4);
c. link1(truncated)-cat2-link2(truncated) (eg [...]);
d. 5'utr-sig-cat1-link1(truncated) (eg pNX6);
e. 5'utr-sig-cat1(truncated) (eg pNX7);
f. link1(truncated)-cat2-link2-ctr1-ctr2-3'utr (eg pNX8);
g. link1(truncated)-cat2-link2-ctr1(truncated) (eg pNX9);
h. link1(truncated)-cat2(truncated) (eg [...])
(The plasmid designations in brackets refer to plasmids in the examples including the DNA sequences shown.) Structures c, f, g and h are preferred and structures c, g and h are particularly preferred.
Recombinant DNA in accordance with the invention may be in the form of a vector. The vector may for example be a plasmid, cosmid or phage. Vectors will frequently include one or more selectable markers to enable selection of cells transfected (or transformed: the terms are used interchangeably in this specification) with them and, preferably, to enable selection of cells harbouring vectors incorporating heterologous DNA. Appropriate start and stop signals will generally be present. Additionally, if the vector is intended for expression, sufficient regulatory sequences to drive expression will be present. Vectors not including regulatory sequences are useful as cloning vectors; and, of course, expression vectors may also be useful as cloning vectors.
Cloning vectors can be introduced into E. coli or another suitable host which facilitates their manipulation. According to another aspect of the invention, there is therefore provided a host cell transfected or transformed with DNA as described above.
DNA in accordance with the invention can be prepared by any convenient method involving coupling together successive nucleotides, and/or ligating oligo- and/or poly-nucleotides, including in vitro processes, but recombinant DNA technology forms the method of choice.
Xylanase-encoding DNA may be cloned from a DNA library, which may be prepared from one of the above fungi. The library may be genomic, but a cDNA library may be easier to prepare and work with, particularly if steps are taken to enhance the likelihood of the presence of xylanase-encoding cDNA in the cDNA library.
Cultivation of a chosen fungus, such as N. patriciarum, may proceed anaerobically in an appropriate culture medium containing rumen fluid; the sole or predominant carbon source may be xylan so as to promote xylanase expression and, hence, to cause an increase in the amount of xylanase-encoding RNA. However, cultivation in the presence of xylan is not essential, and the carbon source may instead be a cellulose, such as the microcrystalline cellulose sold under the trade mark AVICEL.
After cultivation of the fungus, total RNA may be extracted in any suitable manner. Fungal cells may be harvested by filtration and subsequently lysed in appropriate cell lysis buffer by mechanical disruption. A suitable RNA preserving compound, such as guanidinium thiocyanate, may also be added to the fungal cells to reduce or prevent RNase-mediated digestion. Total RNA may subsequently be isolated from the resulting homogenate by any suitable technique such as by ultracentrifugation through a CsCl cushion or as described in Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989).
Another method for preparation of total fungal RNA in addition to that described above may be based on or adapted from the procedure described in Puissant and Houdebine, BioTechniques 148-149 (1990). In this method, total fungal RNA can be isolated from the above homogenate by extraction with phenol/chloroform at pH 4 to remove DNA and associated protein. The resulting crude RNA may be further purified by washing with lithium chloride-urea solution.
A suitable further technique for fungal RNA extraction is that of Teeri et al (Anal. Biochem. 164: 60-67 (1987)).
Once total RNA has been extracted, by whichever method, poly-A+ mRNA may then be isolated from the total RNA, for example by affinity chromatography on a compound containing multiple thymidine or uracil residues, to which the poly-A tail of the mRNA can bind. Examples of suitable compounds include oligo-dT cellulose and poly-U SEPHADEX. Poly-A+ mRNA can then be eluted by a suitable buffer.
A cDNA expression library may then be constructed using a standard technique based on conversion of the poly-A+ mRNA to cDNA by reverse transcriptase.
While it is possible to construct a genomic library, a cDNA library is preferred because it avoids any difficulties which may be caused by the presence of introns in the fungal genomic DNA. The first strand of cDNA may be synthesised using reverse transcriptase and the second strand may be synthesised using any suitable DNA-directed DNA polymerase such as Escherichia coli DNA polymerase I (E. coli pol I).
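The relationship between the mRNA and the two cDNA strands can be made concrete with a toy string model. The Python sketch below is only an illustration of base pairing: it ignores priming, enzyme kinetics and incomplete synthesis, and the example mRNA (the first six codons of the sig segment written as RNA, followed by a ten-residue poly-A tail) is a made-up fragment chosen for brevity. The function names are hypothetical.

```python
# Base-pairing rules for copying RNA into DNA and DNA into DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Reverse transcriptase step: DNA complementary to the mRNA, written 5' to 3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand_cdna(first_strand: str) -> str:
    """DNA polymerase step: DNA complementary to the first strand, written 5' to 3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(first_strand))


# Toy mRNA: six codons followed by a short poly-A tail.
mrna = "AUGAGAACUAUUAAAUUC" + "A" * 10

first = first_strand_cdna(mrna)
second = second_strand_cdna(first)

print(first)   # begins with T residues copied from the poly-A tail
print(second)  # same sequence as the mRNA, with T in place of U
```

In this model the second strand carries the coding sequence, which is why a cDNA clone can be expressed directly in a bacterial host without intron removal.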
The cDNA may subsequently be fractionated to a suitable size and may be ligated to a suitable vector which is preferably a phage vector such as λZAP, λZAPII or λgt11. Suitable kits for the purpose are available from Stratagene. Further or alternative guidance may be had from Reymond et al (FEMS Microbiol. Lett. 77: 107-112 (1991)), which details the preparation of a cDNA library from N. frontalis.
The resulting cDNA library may then be amplified after packaging in vitro, using any suitable host bacterial cell such as an appropriate strain of E. coli.
The screening of xylanase-positive recombinant clones may be carried out by any suitable technique, which may be based on hydrolysis of xylan. In this procedure the clones may be grown on culture media incorporating xylan, and hydrolysis may be detected by the presence of xylanase-positive plaques, suitably with the aid of a colour indicator. Methods for selecting xylanase clones are described in the literature. Two examples are Clarke et al. (FEMS Microbiol. Lett. 83 305-310 (1991)) and Teather and Wood (Appl. Environ. Microbiol. 43 777-800 (1982)).
Xylanase-positive recombinant clones may then be purified (that is to say, a plaque may be converted to a bacterial colony) by well established procedures. Suitable techniques can be found in Sambrook et al (1989) (loc. cit.), but it would be usual simply to follow the manufacturer's instructions in whichever kit was being used. The cDNA insert in the clones may then be excised into a vector of choice, such as pBLUESCRIPT SK(-), to name only one example. Other suitable plasmids can be used for subcloning; examples include the pUC plasmids and plasmids derived from them, as described in Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989). Expression vectors (particularly plasmids) in which the xylanase-encoding DNA is under the control of an appropriate promoter may also be formed by ligation and transformed or transfected into a suitable expression host. Examples of suitable expression vectors include the pUC series (which have the lacZp promoter), the pMTL series (which also have the lacZp promoter) and pBLUESCRIPT (which has both the lacZp promoter and the T7 promoter).
The nature of the promoter is not in general believed to be particularly critical and will depend on the expression host and the conditions under which expression is desired. As indicated above, a suitable example for a bacterial expression host such as E. coli is the lacZ promoter. Alternative promoters for bacterial hosts include the bacteriophage T7 promoter.
It may not be necessary to purify recombinant xylanases from their expression hosts. While E. coli as a host cell may be suitable for application of the xylanase of the invention in pulp manufacture, it will be appreciated that other host cells could be used such as gram positive bacteria inclusive of Bacillus subtilis, or lactic acid bacteria. Alternatively a eukaryotic expression host may be used; an example would be yeast (such as Saccharomyces cerevisiae).
Host cells expressing xylanases as described above and/or harbouring DNA sequences as described above (whether for expression or otherwise) themselves constitute a further aspect of the invention. Also included in the invention are methods of preparing a host cell, in which xylanase-encoding DNA is transformed or transfected into a cell, and methods of producing a xylanase, in which expression hosts are cultivated to express xylanase-encoding DNA.
Depending on the nature of the host cell, it may be preferred for recombinant DNA in accordance with the invention to include a signal sequence. Either a host-specific signal sequence may be included or, for expression in eukaryotes, the enzyme's own signal sequence may be used. A translational start site adapted for or preferred by the expression host may be provided; however, the protein's own translational start site may be adequate or even in some circumstances preferred.
Recombinant xylanase enzyme from an expression host may then be characterised. Principal features that have been ascertained for certain embodiments of the invention are as follows: (i) the cloned xylanase has a very high specific activity (5980 U/mg protein of the purified enzyme); this is in contrast to many cloned xylanases from bacteria which have been reported so far; (ii) the enzyme is able to degrade xylan at extraordinarily high efficiency, releasing 0.9g of reducing sugar per g of the substrate; (iii) the enzyme has no residual activity against cellulose, while many other xylanases possess some cellulase activity; and (iv) the enzyme contains two catalytic domains, which may have potential for construction of a highly efficient xylanase-producing clone by further genetic manipulation of the xylanase cDNA.
The high specific activity of the full length cloned xylanase (hereinafter referred to as xylanase A) (5980 U/mg protein of the purified enzyme) is an intrinsic property of this fungal xylanase. However, the expression level of the present construct of xynA cDNA in pBluescript vector (pNX1) is relatively low in E. coli, accounting for 0.3% of soluble protein synthesised by E. coli cells. Generally speaking, the expression of the cloned gene at the level of 10% of total cellular E. coli protein is attainable.
Truncated forms of xynA cDNA may be prepared by the use of restriction enzymes. Some truncated forms, including that in the plasmid designated produce several hundred-fold higher xylanase activity than pNX1. One explanation for this observation is that it is a result of the utilisation of the LacZ translation initiation sequence for the synthesis of the truncated xylanase A. Another explanation is that avoidance of AT-rich regions may result in higher expression levels; a theory is that the mRNA degrading activity of RNase E is the rate limiting step in protein synthesis, and that RNase E has a preference for AT-rich regions of mRNA. It is possible to increase the expression level in E. coli further by using a stronger promoter, such as the bacteriophage T7 promoter.
Recombinant xylanase A (XYLA), purified from Escherichia coli harbouring xynA, had an Mr of 53000 and hydrolysed oat spelt xylan to xylobiose and xylose. The enzyme did not hydrolyse any cellulosic substrates. The nucleotide sequence of xynA revealed a single open reading frame of 1821 bp coding for a protein of Mr 66192. The predicted primary structure of XYLA comprised an N-terminal signal peptide followed by a 225 amino acid repeated sequence, which was separated from a tandem 40 residue C-terminal repeat by a threonine/proline linker sequence. The large N-terminal reiterated regions consisted of distinct catalytic domains which displayed similar substrate specificities to the full length enzyme.
Xylanases in accordance with the invention have a number of applications in the food, feed, and pulp and paper industries. The use of xylanases described herein in these industries is included within the scope of the invention.
Dealing first with the food industry, certain properties of dough and its resultant baked products are dependent on the pentosan and starch content of the flour used.
These properties include the texture, volume and staling of bread. The use of xylanase could modify baked products to provide goods of potential commercial value. Among the properties that can be modified by xylanase treatment is the specific volume of bread. The increase in specific volume is enhanced further when amylase is added in combination with xylanase. One of the factors contributing to this effect is the water-binding capacity of carbohydrates. The invention provides dough including a xylanase as described herein.
In the animal feed industry, the use of enzyme supplementation to improve feed for chicks was reported as early as 1957. More recent results suggest that, in certain grains such as wheat, and particularly rye, it is the pentosans in the endosperm that are mainly responsible for poor nutrient uptake and sticky droppings from the chicks. Both problems appear to result from the high viscosity of the undigested pentosans. This hampers the diffusion of nutrients and binds water to make excreta watery. The problems can be alleviated using xylanase preparations. Xylanase action can improve both the weight gain of chicks and their feed conversion efficiency. It appears that xylanase supplementation could be used to improve the nutritional value of rye, so as to promote the use of this grain in chick feed. The effectiveness of this treatment may be dependent on the variety of rye. The invention provides the use of xylanase in chick feed and grain for these purposes.
In the pulp industry, dissolving pulps are purified celluloses used for making viscose rayons, cellulose esters and cellulose ethers. They are derived from prehydrolysed kraft pulps or acid sulphate pulps. Their processing is characterised by the derivatisation of the cellulose at one stage, the derivative being soluble in common solvents and thus permitting the formation of fibres, films and plastics.
Impurities in the cellulose hamper derivatisation and thus lead to insolubles that block orifices in sprayers or form defects in the final product. Furthermore, certain xylan impurities can lead to colour, haze and thermal instability in acetate products. Xylanases may thus have a role to play in removing impurities, and the use of xylanases described herein for this purpose is comprehended within the invention.
The prebleaching of kraft pulp using cellulase-free xylanase has been identified as one of the biotechnologies most likely to be accepted in the pulp and paper industry in the near future, but only if suitable xylanases become available. The kraft (also known as alkaline or sulphate) process has become the predominant pulping technology in Canada because it produces strong wood fibres and because the chemicals used are recovered and recycled. Kraft pulps, particularly those derived from softwoods, are relatively difficult to bleach. A sequence of stages using elemental chlorine and chlorine-containing compounds is traditionally required to bleach these pulps effectively to the desired full brightness of The bleaching process, particularly when using elemental chlorine, produces chloro-organics that have traditionally been discharged from the bleach plant with the waste water. However, both public demand and legislated regulations are presently pressurizing pulp mills to reduce or eliminate the emission of these pollutants. The pulp and paper industry is considering the implementation of various alternative technologies in order to reduce the environmental impact of its mills. These options include xylanase prebleaching of kraft pulp. Xylanases in accordance with the present invention are particularly well suited to this purpose.
It is believed that the xylanases of the present invention are particularly applicable to the paper and pulp industry. While it is appreciated that the use of enzymes will never replace chemicals completely, there is pressure being exerted by those concerned with the environment to reduce the use of chemicals. There are also practical reasons for reducing the use of chemicals in the paper and pulp industry.
Pulping plants usually generate their own supplies of chlorine and chlorine dioxide on site, and this can limit capacity as well as being potentially hazardous. Treating the paper pulp (e.g. kraft pulp) to remove lignin involves the use of chlorine, NaOH, H2O2 and chlorine dioxide. Sandoz in the USA have conducted practical trials using their CARTAZYME product, which is a crude fungal xylanase preparation, active at 30-55°C and pH 3 to 5 and containing 2 xylanases, and have found that a 25-33% reduction in chlorine is possible using 1U xylanase/gm pulp. Also, the product is brighter than when chemicals alone are used. Another advantage of the xylanase is that it is specific, whereas chemicals can attack the cellulose at low lignin contents, leading to reduced fibre strength and other undesirable physical characteristics. It is therefore clear that xylanases could become more important in pulp bleaching, and recombinant ones particularly so because of their specificity and high yield. It is believed that lignin is bonded to hemicellulose, and if the hemicellulose (xylan) is depolymerised the lignin may be partially disassociated from cellulose and subsequently washed out. At present, however, some chemical treatment may still be necessary. The main points about the xylanase of the present invention, with respect to commercial use, are (i) its very high specific activity and high level of expression, which would make it economical to produce on a large scale, and (ii) its lack of cellulase activity, which makes it particularly useful where it is necessary to remove xylan specifically, as in the paper making and textile industries.
It is also believed that the xylanase of the invention could find a valuable application in the sugar industry and in relation to the treatment of bagasse or other products containing xylan for more efficient disposal.
It was previously mentioned that the protein sequence of XYLA and the DNA sequence of xynA were made available on 5 May 1992 on the EMBL database under accession number X65526. This availability may not constitute effective prior art in the jurisdictions of all of the states designated in this application. For those jurisdictions where the EMBL database entry does not constitute effective prior art, notice is hereby given that the invention is and will be defined more broadly than as indicated above. In particular, the invention may then be seen to reside in the following further aspects: a xylanase which has at least one catalytic domain which is substantially homologous with a xylanase of an anaerobic fungus; the xylanase may be a full length natural xylanase of an anaerobic fungus; and an isolated or recombinant DNA molecule encoding a xylanase which has a catalytic domain substantially homologous with a xylanase of an anaerobic fungus, provided that if the DNA molecule is cDNA encoding a xylanase of Neocallimastix frontalis then the DNA molecule is operatively coupled to a promoter; the DNA molecule may comprise a full length copy of natural mRNA encoding the xylanase.
It will be apparent from the foregoing that the invention includes within its scope not only the recombinant xylanase described above but also xylanases derived from other anaerobic fungi as described above which may be prepared by the methods described herein. The invention also includes within its scope any mutant derived from N. patriciarum or strains derived from N. patriciarum by selection or gene transfer.
The invention also includes within its scope (i) DNA sequences derived from pNX1, pNX4, pNX5, pNX6, pNX8, pNX9 and pNX10 and DNA sequences capable of hybridising thereto; (ii) a DNA construct containing a DNA sequence as in (i) operably linked to regulatory regions capable of directing the expression or over-expression of a polypeptide having xylanase activity in a suitable expression host; (iii) a transformed microbial host capable of the expression or overexpression of a fungal xylanase containing an expression construct as in (ii); (iv) a polypeptide having xylanase activity produced by expression using a microbial host as in (iii); (v) the amino acid sequence as shown in Figure 4, including components A, B, C and D, and amino acid sequences derived from this xylanase; and (vi) plasmids described in Figure 1.
The invention also includes within its scope a method of preparation of a xylanase from E. coli harbouring the recombinant plasmids as shown in Figure 1.
Each preferred feature described above with reference to one aspect of the invention is equally preferred, mutatis mutandis, for each other aspect.
The invention will now be illustrated by the following examples. The examples refer to the accompanying drawings, in which:
FIGURE 1 is a restriction map of recombinant plasmids containing xynA. The positions of the cleavage sites of EcoRI SstI ScaI HpaI KpnI XhoI SmaI PvuII NaeI NruI (Nr), StuI (St) and HindI are shown. Restriction sites of multiple cloning regions or vectors in parentheses have been destroyed. Multiple cloning regions of vectors, designated by are derived from pSK(S), pMTL20(20) and pMTL22(2) respectively. The solid line with an arrow shows the extent and orientation of the xynA open reading frame. Construction of the deletion mutants of xynA is detailed below. The phenotypes of E. coli strains harbouring the recombinant plasmids are shown.
FIGURES 2A and 2B show the purification of XYLA. SDS/PAGE of XYLA purified from cell-free extract of E. coli XL1-Blue harbouring pNX1 (A) or pNX5 (B). Lane 1 contained XYLA purified by anion exchange chromatography, lane 2 contained cell-free extract from E. coli harbouring pNX1 or pNX5 and lane 3 (B only) contained cell-free extract from E. coli containing pBluescript SK. Gels depicted in A and B contained 10% (w/v) or 15% (w/v) polyacrylamide, respectively. Protein sizes are shown in kD, deduced from the marker proteins, which are high (Figure 2A) or low (Figure 2B) molecular weight markers from Sigma.
FIGURE 3 shows the effect of purified XYLA on the specific viscosity of soluble xylan in PC buffer, pH 6.5 at 37°C. Specific viscosity (N) and reducing sugars were measured as described below.
FIGURE 4 shows the primary structure of XYLA. The two homologous catalytic domains, designated A and B, together with the duplicated C-terminal sequences (C and D) are boxed.
FIGURE 5 shows the alignment of homologous regions of N. patriciarum XYLA and prokaryote xylanases. The enzymes compared were as follows: B. pumilus xylanase A (XYLAB; Fukusaki et al, FEBS Lett. 171: 197-201 (1984)), B. circulans xylanase (XYLBC; Yang et al, Nucl. Acids Res. 16: 7178 (1988)) and C. acetobutylicum xylanase B (XYLBCA; Zappe et al., Nucl. Acids Res. 18: 2179 (1990)). Residues which show identity or similarity in all primary sequences compared are boxed. The positions of the first and last residues of homologous regions, in their respective primary sequences, are shown.
FIGURE 6 shows the structure of plasmid pNX1.
FIGURE 7 shows the cloning and characterisation of Neocallimastix patriciarum xylanase A encoding cDNA.
EXAMPLE 1 Preparation of pNX1
1.1 Microbial strains, vectors and culture media
The anaerobic fungus Neocallimastix patriciarum (type species) was isolated from a sheep rumen by Orpin and Munn, Trans. Br. Mycol. Soc. 86: 178-181 (1986). Host strains for cDNA cloning were E. coli PLK-F' and XL1-Blue. E. coli strain JM83 was used for characterisation of the xylanase+ cDNA clones. The vectors were λZAPII, pBLUESCRIPT SK(-) (Stratagene), pMTL20, pMTL22 and pMTL23 (Chambers et al, Gene 68: 139-149 (1988)). N. patriciarum culture was maintained in a medium containing 10% rumen fluid as described by Kemp et al, J. Gen. Microbiol. 130: 27-37 (1984). E. coli strains were grown in L-broth (Sambrook et al, Molecular Cloning. A Laboratory Manual, 2nd edition. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989)). The recombinant phage were grown in E. coli strains using NZY medium according to Stratagene's instructions.
1.2 General recombinant DNA techniques
Agarose-gel electrophoresis, transformation of E. coli and modification of DNA using restriction enzymes and T4 DNA ligase were as described by Gilbert et al., J. Gen. Microbiol. 134: 3239-3247 (1988). Large amounts of plasmid DNA were extracted from E. coli by Brij lysis and subsequent CsCl density-gradient centrifugation (Clewell and Helinski, Proc. Natl. Acad. Sci. USA 62: 1159-1166 (1969)). The rapid boiling method of Holmes and Quigley, Anal. Biochem. 114: 193-197 (1981) and the alkaline lysis method of Birnboim and Doly, Nucl. Acids Res. 7: 1513-1521 (1979) were employed to isolate plasmid for rapid restriction analysis and nucleotide sequencing, respectively. Northern hybridisation was as described by Gilbert et al, J. Bacteriol. 161: 314-320 (1985).
1.3 Cultivation of rumen anaerobic fungus N. patriciarum
N. patriciarum was grown in a rumen fluid-containing medium (Kemp et al, J. Gen. Microbiol. 130: 27-37 (1984)) in the presence of 1% AVICEL at 39°C under anaerobic conditions for 48hr. (Alternative culture media, such as those described by Philips and Gordon, Appl. Environ. Microbiol. 55: 1695-1702 (1989) and Lowe et al, J. Gen. Microbiol. 131: 2225-2229 (1985), can be used.)
1.4 Total RNA isolation
The frozen mycelia were ground to a fine powder under liquid nitrogen with a mortar and pestle. 5-10 vol of guanidinium thiocyanate solution (4M guanidinium thiocyanate, 0.5% sodium laurylsarcosine, 25mM sodium citrate, pH 7.0, 1mM EDTA and 0.1M mercaptoethanol) was added to the frozen mycelial powder and the mixture was homogenised for 5 min with a mortar and pestle and for a further 2 min at full speed using a Polytron homogeniser. Total RNA was isolated from the homogenate by ultracentrifugation through a CsCl cushion (Sambrook et al, Molecular Cloning. A Laboratory Manual, 2nd edition. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989)). (An alternative method for preparation of total fungal RNA, such as an adaptation of the procedure described by Puissant and Houdebine, Bio-Techniques 148-149 (1990), can be used.)
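The composition of the guanidinium thiocyanate solution given above can be turned into weighed amounts for a chosen volume. The following is a minimal sketch only: the salt and hydrate forms, the molar masses attached to them, and the reading of "0.5%" as weight per volume are assumptions made for illustration, and the function names are hypothetical rather than taken from the specification.

```python
# Molar masses in g/mol. The specific salt/hydrate forms are ASSUMPTIONS;
# the specification names the components but not the forms used.
MOLAR_MASS = {
    "guanidinium thiocyanate": 118.16,
    "trisodium citrate dihydrate": 294.10,
    "disodium EDTA dihydrate": 372.24,
    "2-mercaptoethanol": 78.13,
}


def grams_for_molar(molarity, molar_mass, volume_ml):
    """Mass in grams needed for a given molarity (mol/l) in volume_ml."""
    return molarity * molar_mass * volume_ml / 1000.0


def grams_for_percent_wv(percent, volume_ml):
    """Mass in grams for a % (w/v) solution: grams per 100 ml."""
    return percent * volume_ml / 100.0


def lysis_solution_recipe(volume_ml):
    """Masses for the solution of section 1.4 (pH adjustment not modelled)."""
    m = MOLAR_MASS
    return {
        "guanidinium thiocyanate": grams_for_molar(
            4.0, m["guanidinium thiocyanate"], volume_ml),
        # 0.5% is ASSUMED to mean weight per volume.
        "sodium laurylsarcosine": grams_for_percent_wv(0.5, volume_ml),
        "sodium citrate": grams_for_molar(
            0.025, m["trisodium citrate dihydrate"], volume_ml),
        "EDTA": grams_for_molar(
            0.001, m["disodium EDTA dihydrate"], volume_ml),
        "mercaptoethanol": grams_for_molar(
            0.1, m["2-mercaptoethanol"], volume_ml),
    }


if __name__ == "__main__":
    for name, grams in lysis_solution_recipe(100).items():
        print(f"{name}: {grams:.3f} g per 100 ml")
```

Mercaptoethanol is a liquid and would normally be measured by volume; the mass returned here would have to be converted using its density.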
1.5 Poly A mRNA purification
Poly A mRNA was purified from the total RNA by oligo (dT) cellulose chromatography (Sambrook et al, Molecular Cloning. A Laboratory Manual, 2nd edition. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989)).
1.6 Construction of a cDNA expression library of N. patriciarum
The cDNA library was constructed using Stratagene's λZAP cDNA synthesis kit, basically according to the manufacturer's instructions.
The procedure is described briefly as follows: Poly A mRNA was converted to the first strand cDNA by reverse transcriptase, using XhoI linker oligo (dT) primer and 5-methyl dCTP. Double-stranded cDNA was synthesised from the first-strand cDNA by the action of RNase H and DNA polymerase I. After blunting the cDNA ends, the cDNA was ligated with EcoRI adaptor, phosphorylated and digested with XhoI to create cDNA with an EcoRI site at the 5' region and an XhoI site at the 3' region. The cDNA was size-fractionated by 1% low-melting point agarose gel electrophoresis, and cDNA of 1.2-8 kb was recovered by phenol extraction (Sambrook et al, Molecular Cloning. A Laboratory Manual, 2nd edition. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press (1989)). The size-fractionated cDNA was then ligated to the EcoRI/XhoI digested λZAPII vector (other expression vectors can be used).
The cDNA library was packaged in vitro and amplified using E. coli PLK-F' as plating cells.
1.7 Screening xylanase-positive recombinant bacteriophage clones
Recombinant phage were grown in E. coli XL1-Blue in 0.7% top agar containing 0.1% xylan and 10mM isopropyl-β-D-thiogalactopyranoside (IPTG, an inducer for LacZ promoter controlled gene expression). After overnight incubation at 37°C, Congo red solution was added over the top agar. After incubation at RT for min, the unbound dye was removed by washing with 1M NaCl. Xylanase-producing phage plaques were surrounded by yellow haloes against a red background.
The xylanase-positive recombinant phage were purified to homogeneity by replating and rescreening the phage as above for 2-3 times.
The cDNA inserts in xylanase-positive phage were excised into pBLUESCRIPT SK(-) using VCS-M13 helper phage.
1.8 Xylanase and related-enzyme assays
The enzyme extracts from E. coli harbouring xylanase-positive recombinant plasmids were prepared as described by Kellett et al, Biochem. J. 272: 369-376 (1990).
The enzymes were assayed for hydrolysis of xylan or other substrates at 37°C in 50mM potassium phosphate/12mM citric acid buffer, pH 6.5, and the reducing sugars released from xylan or other plant polysaccharides (carboxymethyl cellulose, barley β-glucan, laminarin, lichenan) were measured as described by Kellett et al, Biochem. J. 272: 369-376 (1990) and Hazlewood et al, J. Gen. Microbiol. 136: 2089-2097 (1990). Assays for activities against artificial substrates (methylumbelliferyl-β-D-cellobioside, methylumbelliferyl-β-D-glucoside, methylumbelliferyl-β-D-xyloside and p-nitrophenyl-β-D-xyloside) were described by Hazlewood et al, J. Gen. Microbiol. 136: 2089-2097 (1990).
1.9 DNA sequencing
Plasmid DNA, denatured by alkali, was neutralised and further purified by spin dialysis (Murphy and Kavanagh, Nucl. Acids Res. 16: 5198 (1988)).
Sequencing of the resultant DNA was based on the protocol recommended by the manufacturer of the Sequenase DNA sequencing kit (USA, Cleveland, OH).
Overlapping sequences were generated by cloning appropriate restriction fragments into pMTL-based vectors. Sequences were compiled and ordered using the computer programs described by Staden, Nucl. Acids Res. 16: 3673-3694 (1980). The complete sequence of the cDNA contained in the plasmid designated pNX1 was determined in both strands. The xylanase-encoding gene contained in the plasmid was designated xynA and the gene product, the xylanase enzyme itself, was designated XYLA.
EXAMPLE 2 Construction of pNX4, a Deletion Mutant of pNX1 (xynA)
pNX1 was linearised by XhoI and the 3' region of xynA cDNA was removed by Bal-31 digestion (Hall and Gilbert, Mol. Gen. Genet. 213: 112-117 (1988)). After blunt ending, the truncated cDNA was excised from pNX1 by EcoRI digestion and cloned into EcoRI/SmaI digested pMTL22 vector.
EXAMPLE 3 Construction of pNX5, a Deletion Mutant of pNX1 (xynA)
A 720bp ScaI/NruI fragment was excised from pNX4 and cloned into vector. This resulted in a highly expressing clone, in which the enzyme expression levels were some hundreds of times higher than for pNX1.
EXAMPLE 4 Construction of pNX6, a Deletion Mutant of pNX1 (xynA)
pNX6 was constructed by cleaving pNX1 with EcoRI/ScaI and cloning the resulting fragment into EcoRI/SmaI-cut pMTL22.
EXAMPLE 5 Construction of pNX8, a Deletion Mutant of pNX1 (xynA)
pNX1 was digested with ScaI and XhoI to obtain a 1.3kb fragment, which was cloned into pMTL20 so that the xynA sequence was in phase with the LacZ ATG contained in the vector. This resulted in a high expression clone in which the expression level was approximately fifteen times that of pNX1.
EXAMPLE 6 Construction of pNX9, a Deletion Mutant of pNX1 (xynA)
pNX8 was cut with KpnI (1 site in vector polylinker) and the insert fragment, after electroelution, was digested with RsaI (which cuts in the PT linker region of the gene) to produce a ~700bp fragment, which was cloned into pMTL20 which had been cut with KpnI and StuI. This resulted in a highly-expressing clone (much better than the clone containing pNX8) with the second catalytic domain in frame with the vector LacZ N-terminus.
EXAMPLE 7 Construction of pNX10, a Deletion Mutant of pNX1 (xynA)
pNX8 was digested with KpnI and the fragment (~850bp) was ligated into KpnI-cut This clone also expressed well, but the protein expressed contains some residues at the carboxy end which, when removed, allow for the high level expression observed for pNX9.
EXAMPLE 8 Purification and amino acid sequencing of the N-terminus of xylanase A
E. coli XL1-Blue harbouring pNX1 or pNX5 was cultured for 16 hours in LB broth containing ampicillin (100 µg/ml). Cells, harvested by centrifugation, were resuspended in 50mM Tris/HCl buffer, pH 8.0 and the cytoplasmic fraction prepared as described previously (Clarke et al, FEMS Microbiol. Lett. 83: 305-310 (1991)). Xylanase, precipitated by the addition of ammonium sulphate (0.39g/ml), was redissolved in 10mM Tris/HCl buffer, pH 8.0. After dialysing against 3 changes of the same buffer, the xylanase was substantially purified by anion exchange chromatography on DEAE-Trisacryl M essentially as described by Poole et al, Mol. Gen. Genet. 223: 217-223 (1990).
The xylanase (designated XYLA) purified from cell-free extract of E. coli XL1-Blue harbouring pNX1 was fractionated by SDS/PAGE and electroblotted onto PROBLOT membrane (Applied Biosystems Inc). The N-terminal sequence was determined by automated Edman sequencing using a 470 gas-phase sequenator equipped with a 120A on-line phenylthiohydantoin analyser (Applied Biosystems Inc; Hunkapiller et al, Methods Enzymol. 91: 399-413 (1983)).
EXAMPLE 9 Summary of Isolation of xynA
A cDNA library consisting of 10^6 clones was constructed using mRNA isolated from N. patriciarum cells grown with AVICEL as sole carbon source. Thirty one recombinant bacteriophages which hydrolysed xylan were identified after screening x 10^4 clones from the library, and 16 strongly xylanase-positive phage were isolated for further characterisation. Restriction mapping and hybridisation data indicated that all the xylanase-positive recombinants contained cDNA sequences derived from the same mRNA species. A restriction map of the largest cDNA sequence encoding a functional xylanase, designated xynA, is shown in Figure 1.
A nucleic acid probe consisting of 1.7kb of the 5' region of xynA hybridised to a single 2.5kb Neocallimastix RNA species. This suggests that the longest xynA cDNA isolated is almost full length.
EXAMPLE 10 Characterisation of xylanase A
The cDNA sequences encoding Neocallimastix xylanases were excised from λZAPII and rescued in E. coli XL1-Blue as recombinants of pBLUESCRIPT SK. Xylanase activity expressed by the recombinant strain harbouring the plasmid pNX1, which contained the longest form of xynA, was found predominantly in the cell-free extract, indicating that the enzyme was not efficiently secreted by E. coli. The xylanase, designated xylanase A (XYLA), was purified to near homogeneity pure). Purified XYLA had a specific activity of 5980 U/mg protein, compared to the cell-free extract value of 16 U/mg protein. This indicates that XYLA constitutes 0.3% of the soluble protein synthesised by E. coli cells harbouring pNX1. The purified enzyme had an Mr of 53000 (Figure 2) and an N-terminal sequence of IATVAKAQWGGGGAS. XYLA hydrolysed xylan but exhibited no activity against carboxymethyl cellulose, barley β-glucan, laminarin, lichenan or the artificial substrates 4-methylumbelliferyl-β-D-xyloside and p-nitrophenyl-β-D-xylopyranoside (Table 1).
TABLE 1 The enzyme activity of purified xylanase A from E. coli harbouring pNX1 (xynA cDNA) plasmid.

Substrate                                       Activity (1) (U/mg protein)
Barley β-glucan                                 0
Carboxymethylcellulose                          0
Xylan                                           5980
Xylobiose                                       0
p-Nitrophenyl β-D-xylobioside (PNX)             0
Methylumbelliferyl β-D-cellobioside (MUC)       0
4-Methylumbelliferyl β-D-glucoside (MUG)        0
4-Methylumbelliferyl β-D-xyloside (MUX)         0

(1) One unit of XYLA releases 1 µmole of product per minute.
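The 0.3% figure given in Example 10 follows from the ratio of the two specific activities, on the assumption that all xylanase activity in the cell-free extract is due to XYLA and that purification does not change the activity per mg of enzyme. A minimal sketch of that arithmetic is given below; the function names are hypothetical.

```python
def percent_of_soluble_protein(crude_u_per_mg, pure_u_per_mg):
    """Share of extract protein that is enzyme, from specific activities.

    ASSUMPTION: the crude activity is due entirely to the one enzyme, so the
    ratio of specific activities equals the enzyme's mass fraction.
    """
    return 100.0 * crude_u_per_mg / pure_u_per_mg


def purification_fold(crude_u_per_mg, pure_u_per_mg):
    """Fold enrichment in specific activity achieved by purification."""
    return pure_u_per_mg / crude_u_per_mg


if __name__ == "__main__":
    crude, pure = 16.0, 5980.0  # U/mg protein, values from Example 10
    print(f"{percent_of_soluble_protein(crude, pure):.2f} % of soluble protein")
    print(f"{purification_fold(crude, pure):.0f}-fold purification")
```

The same ratio read the other way round gives the fold purification required to reach the pure enzyme from the cell-free extract.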
The enzyme attacked soluble xylan in a manner typical of an endo-β-1,4-xylanase (EC promoting a rapid decline in viscosity (Figure 3) and releasing 893mg of reducing sugar per g of substrate. Analysis of the hydrolysis products by HPLC revealed that XYLA liberated approximately equal amounts of xylobiose and xylose. No disaccharides containing arabinose, the major side-chain sugar of oat spelt xylan, were detected among the reaction products, suggesting that the enzyme does not hydrolyse glycosidic linkages involving xylose units linked to side chain sugars.
EXAMPLE 11 Nucleotide sequence
The 2.3kb Neocallimastix cDNA derived from pNX1 was sequenced in both strands (Accession number X65526 in EMBL/GenBank/DDBJ Nucleotide Sequence Data Libraries). Translation of the nucleotide sequence revealed a single open reading frame (ORF) of 1821 bp encoding a polypeptide of Mr 66192. The deduced primary structure of the encoded protein is shown in Figure 4. The N-terminal 15 residues of recombinant XYLA, purified from E. coli, exhibited a perfect match with amino acids 12 to 26 of the translated sequence. The assignment of the proposed translation initiation codon was based on the following observations: (i) there are no ATG sequences upstream of the ORF; (ii) translational stop codons are in all 3 reading frames upstream of the putative translational start codon. Inspection of the nucleotide sequence in the vicinity of the putative ATG start codon did not reveal any alternative sequences which could act as a translational start codon in E. coli. It is likely, therefore, that translational initiation of xynA occurs at the same codon in the enteric bacterium and the anaerobic fungus. This is despite the fact that lower eukaryote mRNAs do not contain ribosome binding sequences which conform to the corresponding E. coli sequence. Presumably the sequence AGA, 7bp upstream of the ATG start codon, acts as a weak ribosome binding sequence in the bacterium. Transcription initiation of xynA in E. coli is presumably at the vector's lacZp, as subcloning of the xynA cDNA, on a 2.3 kb EcoRI-XhoI restriction fragment, into pMTL22 generated a recombinant plasmid (pNX2) which did not direct a functional xylanase. The vector's lacZp is at the 3' end of xynA in pNX2. Although XYLA is not secreted by E. coli, the deduced N-terminal region of the xylanase conforms to that of a signal peptide, comprising an N-terminal hydrophilic basic region followed by a sequence of 23 predominantly hydrophobic or neutral amino acids.
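The identification of a single open reading frame by translation in all three forward frames can be illustrated with a short scan for the longest ATG-to-stop stretch. This is a sketch under stated assumptions: it uses the standard stop codons, considers the forward strand only, and runs on a made-up toy sequence rather than on xynA; the function name `longest_orf` is hypothetical and this is not the software used in the Examples.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG...stop ORF, or None.

    Scans the three forward reading frames only; `end` is exclusive and
    includes the stop codon. An ORF with no in-frame stop is ignored.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG after the previous stop in this frame
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


if __name__ == "__main__":
    toy = "CCATGGCTACCTAAGG"  # illustrative sequence, not taken from xynA
    span = longest_orf(toy)
    print(span, toy[span[0]:span[1]])
```

Checking that stop codons occur in all three frames upstream of the chosen ATG, as described above, would be a further step applied to the sequence preceding `start`.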
The G+C content of the xynA ORF was 43.4%, compared to 10.7% for the 5' and 3' non-coding regions (excluding the 3' polyA tail). The overall G+C content of Neocallimastix DNA is approximately 15% (Billon-Grand et al, FEMS Microbiol. Lett. 82: 267-270 (1991)), indicating that non-protein coding regions of the genome are generally very A+T-rich. The bias in codon utilisation in xynA is evident from the absence of 14 of the 61 amino acid codons. There is a marked preference for T in the third position (~50% of all codons end in T) and an exclusion of G in the wobble position. Apart from ATG and TGG, which are the sole codons for Met and Trp respectively, only 3 codons contain G in the third position: AAG, GAG and TTG.
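The composition statistics quoted above (G+C percentage, unused sense codons, third-position preference) can be computed along the following lines. This is an illustrative sketch only: the function names are hypothetical, and the short input fragment is a toy example rather than the full xynA ORF, so the numbers it yields are not those reported in the text.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
# The 61 sense codons of the standard genetic code.
SENSE_CODONS = {"".join(c) for c in product("ACGT", repeat=3)} - STOP_CODONS


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def codon_usage(orf):
    """Count codons in an in-frame ORF (length must be a multiple of 3)."""
    orf = orf.upper()
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of 3")
    return Counter(orf[i:i + 3] for i in range(0, len(orf), 3))


def unused_sense_codons(usage):
    """Sense codons that never occur in the counted ORF."""
    return SENSE_CODONS - set(usage)


def third_position_fraction(usage, base):
    """Fraction of all counted codons whose third base is `base`."""
    total = sum(usage.values())
    return sum(n for codon, n in usage.items() if codon[2] == base) / total


# Toy 12-codon fragment, for illustration only.
toy_orf = "ATGAGAACTATTAAATTCTTTTTCGCAGTAGCTATT"
usage = codon_usage(toy_orf)
print(round(gc_percent(toy_orf), 1), len(unused_sense_codons(usage)))
```

Run on the complete ORF, `unused_sense_codons` would give the set whose size the text reports as 14, and `third_position_fraction(usage, "T")` the proportion of codons ending in T.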
Inspection of the deduced primary structure of mature XYLA revealed several interesting features. Between residues 255-265 and 491-519 are regions rich in proline and hydroxy amino acids. Many cellulases and xylanases consist of multiple domains which are linked by sequences rich in proline/hydroxy amino acids (Gilkes et al, Microbiol. Rev. 55: 303-315 (1991)). The presence of 2 such "linker sequences" in XYLA suggests that the enzyme consists of at least 3 distinct domains. The Neocallimastix xylanase, in addition to comprising linker regions, also contains a 225 amino acid repeated sequence at the N-terminus and a C-terminal 40 residue reiterated domain (see the accompanying Figure). There is no obvious sequence conservation between the large and small repeated regions. The two N-terminal repeated sequences exhibited 91.6% and 95.6% identity and similarity, respectively. The 40 amino acid reiterated region displayed 82.9% and 95.1% identity and similarity, respectively. DNA encoding the two repeated regions also showed sequence identity, with the 699 bp and 120 bp reiterated sequences exhibiting 92.7% and 90.8% identity, respectively.
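Percent identity and similarity figures such as those above are obtained by comparing two aligned repeats column by column. The sketch below shows one way to do this under stated assumptions: the inputs are already aligned and of equal length, gap columns are skipped, and "similar" means membership of the same conservative-substitution group. The grouping used here is an assumption chosen for illustration; the source does not state which similarity scheme was applied, so the sketch is not expected to reproduce the quoted percentages.

```python
# Conservative-substitution groups (an illustrative assumption, not the
# scheme used in the source).
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def identity_and_similarity(a, b):
    """Percent identity and similarity of two pre-aligned sequences.

    Columns with a gap ('-') in either sequence are skipped; identical
    residues also count as similar.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy aligned peptides (hypothetical).
ident, simil = identity_and_similarity("ACDKST", "ACEKTT")
print(round(ident, 1), round(simil, 1))
```

How gaps are counted changes the denominator, so percentages from different tools are only comparable when the gap convention is the same.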
EXAMPLE 12 Homology Studies
Hydrophobic cluster analysis has shown that cellulases and xylanases can be grouped into nine enzyme families. Proteins within a family are structurally related and have probably evolved from a common ancestral gene (Henrissat et al, Gene 81: 83-95 (1989)). Comparison of XYLA with sequences in the SWISS-PROT database revealed homology between the fungal enzyme and Bacillus pumilus xylanase A (Fukusaki et al, FEBS Lett. 171: 197-201 (1984)), Bacillus circulans xylanase (Yang et al, Nucl. Acids Res. 16: 7178 (1988)), Clostridium acetobutylicum xylanase B (Zappe et al, Nucl. Acids Res. 18: 2179 (1990)) and the N-terminal region of the multi-domain Ruminococcus flavefaciens xylanase (Zhang and Flint, Mol. Microbiol. 6: 1013-1019 (1992)). The degree of homology between these enzymes and N. patriciarum XYLA is shown in the accompanying Figure. It is interesting to note that only the large repeated sequence of XYLA exhibited homology with other hemicellulases; the C-terminal reiterated region showed no identity with proteins in the database. This suggests that XYLA has a modular structure in which the N-terminal region constitutes the catalytic domain.
EXAMPLE 13 Structure and function of XYLA
To investigate the assertion that the N-terminal repeated sequence constituted the catalytic domain of XYLA, 5' and 3' regions of xynA were deleted, or subcloned into appropriate vectors, and the capacity of the resultant xynA derivatives to express a functional xylanase was evaluated. A truncated form of xynA, in which 291 bp of the 3' region encoding the 40 amino acid C-terminal repeat had been deleted, still encoded a functional xylanase. The predicted Mr of the encoded enzyme was 53000. This is similar to the size of XYLA purified from E. coli harbouring pNX1. Thus, the recombinant xylanase synthesised from the full-length gene by the enteric bacterium could also lack the C-terminal repeated sequence. Support for this view is provided by the fact that several multidomain cellulases and xylanases are particularly sensitive to proteolytic cleavage within the linker sequences (Tomme et al, Eur. J. Biochem. 170: 575-581 (1988); Gilkes et al, J. Biol. Chem. 263: 10401-10407 (1988)), including a Pseudomonas xylanase, expressed by E. coli, which was substantially cleaved within the serine-rich linker sequences (Hall et al, Mol. Microbiol. 3: 1211-1219 (1989)). A more substantial 3' deletion (pNX6), extending for 1011 bp, did not affect the capacity of xynA to direct the synthesis of a functional xylanase. However, removal of 1324 bp from the 3' region of xynA resulted in the synthesis of an inactive derivative of XYLA.
These data suggest that the N-terminal 270 residues of the N. patriciarum xylanase fold into a catalytically active enzyme. To determine whether both N-terminal reiterated sequences fold into functional xylanases, the 720 bp ScaI-NruI restriction fragment (NruI cleaves in the multiple cloning region of pNX4) was cloned into pMTL20 to generate pNX5, in which truncated xynA was in phase with the vector's lacZ' translation initiation codon (see the accompanying Figure). E. coli harbouring this construct expressed 15 times more XYLA than a clone harbouring full-length xynA. This elevation in the expression of the fungal enzyme is presumably a result of the utilisation of an E. coli translation initiation sequence in xynA encoded by pNX5. XYLA purified from cell-free extract of E. coli containing pNX5 had an Mr of 26000 (Figure 2B). These data confirm that the reiterated N-terminal 225 residues constitute distinct catalytic domains. Interestingly, a further increase in xylanase activity was achieved by deletion of a few amino acid residues from the C-terminus of the second catalytic domain to generate pNX9.
To investigate the substrate specificities of the N- and C-terminal catalytic domains, the capacity of the xylanases encoded by pNX6 and pNX5 to cleave plant structural polysaccharides was assessed. The enzymes cleaved only xylan, releasing xylobiose and xylose in proportions similar to those released by full-length XYLA. Thus, both catalytic domains displayed the same substrate specificities as full-length XYLA.
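The relationship between the deletion constructs and the domains they retain can be tabulated from the feature coordinates given for SEQ ID NO: 1 in the sequence listing. The sketch below is an illustration only: the coordinates are transcribed from that listing, and where the printed values are damaged (for example CTR1 and the pNX4/pNX5 end points) the values used here are a best reading and should be treated as assumptions. The function name is hypothetical.

```python
# Feature coordinates (1-based, inclusive) as read from the SEQ ID NO: 1
# feature table; damaged values in the printed listing are best readings.
FEATURES = {
    "CAT1": (282, 959),    # 1st catalytic domain
    "CAT2": (1017, 1691),  # 2nd catalytic domain
    "CTR1": (1764, 1883),  # 1st C-terminal repeat
    "CTR2": (1884, 2015),  # 2nd C-terminal repeat
}

INSERTS = {
    "pNX1": (1, 2338),
    "pNX3": (1, 1847),
    "pNX4": (1, 1725),
    "pNX5": (1002, 1725),
    "pNX6": (1, 1001),
    "pNX7": (1, 690),
    "pNX8": (1002, 2338),
    "pNX9": (1002, 1847),
    "pNX10": (1002, 1709),
}


def domains_retained(insert, features=FEATURES):
    """Names of features lying entirely within the insert's coordinates."""
    start, end = insert
    return [
        name
        for name, (f_start, f_end) in features.items()
        if start <= f_start and f_end <= end
    ]


for plasmid, coords in INSERTS.items():
    print(plasmid, domains_retained(coords))
```

On these coordinates pNX6 retains only the first catalytic domain and pNX5 only the second, which matches the use of these two constructs to compare the domains, while pNX7 retains no complete catalytic domain.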
Although many cellulases and xylanases consist of multiple domains, CelB from Caldocellum saccharolyticum (Saul et al, Appl. Environ. Microbiol. 56: 3117-3124 (1990)) is the only previous example of an enzyme containing 2 distinct catalytic domains. This enzyme consists of an N-terminal exoglucanase and a C-terminal endoglucanase which belong to different enzyme families. Thus, the gene encoding CelB probably arose through the fusion of two discrete cellulase genes. This invention provides evidence that fungal xylanases can also consist of multiple catalytic domains. In contrast to the celB gene, xynA is clearly a result of tandem duplication of an ancestral gene. It is not apparent what selective advantage the gene duplication confers on the anaerobic fungus. Is it simply a mechanism for increasing the expression of XYLA catalytic domains? As this is the first description of an anaerobic fungal xylanase, it is unclear whether multiple catalytic domains are a common feature of lower eukaryote hemicellulases.
SEQUENCE LISTING

GENERAL INFORMATION:
  (i) APPLICANT:
    (A) NAME: Harry John GILBERT
    (B) STREET: ll Gardens, Low Fall,
    (C) CITY: .,txi aid
    (D) STATE: a ~nd Wear
    (E) COUNTRY: United Kingdom
    (F) POSTAL CODE: NE9 SXS

    (A) NAME: Geoffrey Peter HAZLEWOOD
    (B) STREET: 109A Duchess Drive
    (C) CITY: Newmarket
    (D) STATE: Suffolk
    (E) COUNTRY: United Kingdom
    (F) POSTAL CODE (ZIP): CB8 SAL
  (ii) TITLE OF INVENTION: Recombinant Xylanases
  (iii) NUMBER OF SEQUENCES: 18
  (iv) COMPUTER READABLE FORM: MS-DOS FLOPPY DISK CONTAINING ASCII FILE (93..01283.ASC)
  CURRENT APPLICATION DATA:
    APPLICATION NUMBER: WO PCT/GB93/01283

INFORMATION FOR SEQ ID NO: 1:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 2338 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: cDNA
  (iii) HYPOTHETICAL: NO
  (iii) ANTI-SENSE: NO
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: Neocallimastix patriciarum
    (B) STRAIN: (type species)
  (ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 195..2018
    (D) OTHER INFORMATION: /function= "Xylanolytic enzyme" /product= "XYLA" /standard_name= "Xylanase"
  (ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 195..281
  (ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 282..2018
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 282..959
    (D) OTHER INFORMATION: /label= CAT1 /note= "1st catalytic domain"
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1017..1691
    (D) OTHER INFORMATION: /label= CAT2 /note= "2nd catalytic domain"
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1764..1883
    (D) OTHER INFORMATION: /label= CTR1 /note= "1st C-terminal repeat"
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1884..2015
    (D) OTHER INFORMATION: /label= CTR2 /note= "2nd C-terminal repeat"
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1..2338
    (D) OTHER INFORMATION: /label= pNX1_insert
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1..2338
    (D) OTHER INFORMATION: /label= pNX2_insert /note= "pNX2 insert is in reverse orientation to pNX1 insert"
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1..1847
    (D) OTHER INFORMATION: /label= pNX3_insert
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1..1725
    (D) OTHER INFORMATION: /label= pNX4_insert
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1002..1725
    (D) OTHER INFORMATION: /label= pNX5_insert
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1..1001
    (D) OTHER INFORMATION: /label= pNX6_insert
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1..690
    (D) OTHER INFORMATION: /label= pNX7_insert
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1002..2338
    (D) OTHER INFORMATION: /label= pNX8_insert
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1002..1847
    (D) OTHER INFORMATION: /label= pNX9_insert
  (ix) FEATURE:
    (A) NAME/KEY: misc_feature
    (B) LOCATION: 1002..1709
    (D) OTHER INFORMATION: /label= pNX10_insert
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
TrTTTATTATA TCAATCTCTA ATTI'ATTT TTAGGAAAAA AATAAAAAAA TAAATATAAT
AAATATTAGA
?TTACTGGT
OAGTAATATr TAAAAACAAA GAAATTTAAA TAAAAAAAAA ATAAAAAACA AAATTAATAA AACOTTTATT TAGTTATTTT AGATATTT GAAAAATATT GAATTAGAAA AAAA ATG AGA ACT ATT AAA GCA ACT Ala Thr CAA AGA Gin Arg 1 GAT 00? Asp Giy TCT ATG Ser M4et TCT GTT Ser Val TCT CAA Ser Gin GCA ACT Ala Th~r s0 GTA TAC Val. Tyr GAA TAO Giu Tyr GGT AGA Gly Arg GAT CAC ASP His 145 TAC TTC Tyr Phe 160 Met Arg Thz Ile -29 AAG 0CC CAA TGO Lys Ala Gin Trp, 10 GTC GOT AAT GGT Val Gly Asn Gly
S
TAT GAA ATC TOO Tyr Glu Ile Trp GGT ACT GOT GCPA Gly Ser Gly Ala GOT AAC TTC CTT Gly Asn Phe Leu GCA ACC GAT TAC Ala Thr Asp Tyr 70 CAA ACT GGT AGC Gin Thr Oly Ser as TTC CAA AAC COT Phe Gin Ann Arg 100 ATT GAA OAT TG Ile Giu Asp Trp ACC AT? OAT GGA Thr Ile Asp Gly 135 CCA ACT ATC AAT Pro Thr Ile An 1S0 COT CAA CAA AAG Arg Gin Gin Lys TTC TTT TTC OCA GTA OCT AT? Phe Phe Phe Ala Val Ala Ile OGA GOT GOT 0CC TCT OCT GOT Gly Gly Oly Ala Ser Ala Gly ACC CAA CAT AAG GOT OTA GCT Thr Gin His Lys Giv Val Ala 10 is GAT AAC ACC GOT GOT AGT GOT Asp Asn Thr Gly Oly Ser Gly 25 TTC AAO OCT GATA TOO AAT GCA Phe Lys Ala Oiu Trp Ann Ala COT COT GOT CTT GAO TTC GOT Arg Arg Gly Leu Asp Phe Gly TAC ATT GGA TTG OAT TAT ACT Tyr Ile Gly Leu Asp Tyr Thr AOT GOT AAC TOC COT CTC TOT Ser Gly Ann Ser Arg Leu Cys 90 OTT CAA GOT OTT OCA TTO OTA Val. Gin Gly Val. Pro Leu Val.
105 110 GAC TGO OTT CCA GAT GCA CAA Asp Trp Val Pro Asp Ala Gin 125 CAA TAT AAO AT? TTC CAA ATO Gin Tyr Ly's Ile Phe Gin Met 140 GOT AGT GAA ACC TTT AAG CAA Gly Ser Oiu Thr Phe Lys Gin 155~ ACT TOT GOT CAT AT? ACT GTC Thr Ser Gly His Ile Thr Vai 170 175 120 180 230 278 326 374 422 470 518 566 614 662 710 758 806 T CA OAT CAC I TTT AAG OAA TOG 0CC AAA CAA GOT TOG GGT AT? G AAC WO 93/25693 WO 93/25693I'CBG193/0 1283 -43- Ser
CT?
Leu
GCT
Ala
CCT
Pro
AGT
Ser 240
CAT
His
ACT
Thr
GCT
Ala
GGT
Giy
GGA
Gly 320
AAC
Asn
GGC
Gly Val
AAG
Lys
GAA
Glu 400
GGT
Gly Asp His TAT GAA Tyr Glu CAT GTC Asp Val 210 GCC CC? Ala Pro 225 ACT GCC Thr Ala AAG GGT Lys Gly GCT GGT Giy Gly GAA TGG Giu Trp 290 CT? GAC Leu Asp 305 TTA GAT Leu Asp TCC CGT Ser Arg GTT CC? Val Pro CCA GAT Pro Asp 370 AT? TTC Ile Phe 385 ACC TTT Thr Phe CAT ATT His Ile Phe
GTT
Val 195
ACC
Thr
ACC
Tbz
AAT
Asn
GTC
Val
AAC
Asn 275
AAT
Asn
TTC
Phe
TAT
Tyr
CTC
Leu
TA
Leu 355
OCA
Ala
CAA
Gin
AAG
Lays
ACT
Thr Lys 180
OCT
Al a
AAG
Lys
TCC
Ser
GGT
Gly
AAC
Asn 260
GGT
Gly
GCA
Ala
GGT
G2.y
GCT
Ala
TOT
Cys 340
GTA
Val
CAA
Gin
ATG
Met
CAA
Gin
GTC
Val 420 Glu Trp TTG AAC Leu Asn TTA GAT Leu Asp ACT GGT Tbr Gly 230 AAA AAG Lys Lys 245 GAT COT Asp Gly TOT ATG '3er Met GCT GTT Ala Val TOT CAA Ser Gin 310 GOT ACT Ala Thr 325 GTA TAO Vai Tyr GAA TAO Glu Tyr GGA AAA Gly Lys GAT CAC Asp His 390 TAC TTO Tyr Phe 405 TCA GA? Ser Asp Ala Lys GCC GAA Ala GlU 200 GTT TAO Val Tyr 215 ACT GTT Thr Val TlT ACT Phi! Thx TTC AGT Phe Ser ACT CTC Thr Leu 280 AAC CGT Asn Arg 295 AAG AAG Lys Lys TAC AAA Tyr Lys GGA TGG Gly Trp TAC ATC Tyr Ile 360 ATG GTA Met Val 375 ACT GGT Thr Giy AGT GTC Ser Val CAC TTr His Phe Gin 185
GGT
Gly
ACA
Thr
CA
Pro
GTC
Val
TAT
Tyr 265
GT
Gly
GOT
Gly
GCA
Ala
CAA
Gin
TTO
Phe 345
ATT
Ile
ACC
Thr
OCA
Pro
CT
Arg
AAG
Lys 425 Gly TrM Gly Ile Gly Aen 190
TGG
Trp
ACC
Thr
AGO
Ser
GGT
Giy 250
GAA
Giu
AGT
Ser
AAC
Asn
ACC
Thr
ACT
Thr 330
CAA
Gin
GAA
Glu
AT?
Ile
ACT
Thr
CAA
Gin 410
GAA
GiU
AGT
Ser 205
GOT
Gly
OCT
Ala
CAA
Gin
TTA
Leu
ACT
Thr 285 G00 Ala
GAC
Asp
GCA
Ala
GGA
Gly
GTT
Val 365
GOT
Ala
GOT
Gly
AGA
Arg
AAA
Lays
GOT
Giy
TOT
Ser
GOT
Gly
AAC
GAT
Asp 270 Phe
COT
Arg
TAC
Tyr
AGT
Ser
CT?
Leu 350
GAO
Asp
CAA
Gin
GGT
Gly
ACT
Thr
CAA
Gin 430 902 950 998 1046 1094 1142 1190 1238 1286 1334 1382 1430 1478 1526 1574 1622 TGG GGT AT? Trp Gly Ile GGT AAC CT? TAT GAA GTT OCT TTG AAC GCC GAA GGT TG Gly 435 Asn Leu Tyr Glu Val Ala Leu Asn Ala GTh Gly Trp 440 445 WO 93/25693 WO 9325693PC1'/GB93/01283 CAA AGT AGT Gin Ser Ser 450 CCA AAG GGT Pro Lys Gly 465 COT ACT ACT Azg Thr Thr 480 TCT GCT AGA Ser Ala Arg TOT GTT GTT Cys Val. Val.
AAC GAC 'rGG Asa Asp Trp TCT CAA GGT Ser Gin Gly 545 ACT GAT GAC Thr Asp Asp 560 TOT GOT i-rC Cys Gly Phe -44- GOT GTT GCT OAT GTC ACC ?TA TI'A GAT OTT TAC ACA ACT Gly Val Ala Asp Val Thr Leu Leu Asp Val. TIyr Thr Thr 455 460 TCT AGT CCA GCC ACC TCT GCC OCT COT COT ACT ACT ACC Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr 470 475 ACT COT ACC AAG TOT OTT CCA ACC AAT TAC AAT AAG TGT Thr Arg Thr Lys Ser Leu Pro Th? Asa Tyr Asn Lys Cys 485 490 495 ATT ACT OCT CAA GOT TAC AAG TGT TOT AOC OAT CCA AAT le Thr Ala Gin Giy Tyr Lys Cys Cys Ser Asp Pro Asn 500 505 510 TAC TAC ACT OAT GAG OAT GOT ACC TOG GOT GTT GAA AAC Tyr Tyr Thr Asp Giu Asp Gly Tb? Trp Gly Val Giu Asn 515 520 525 TOT GOT TGT GOT GTT OAA CAA TOT TCT TCC AAG, ATC ACT Cys Gly Cys Gly Val Glu Gin Cys Ser Ser Lys Ile Thr 535 540 TAC AAG TGT TGT AGC OAT CCA AAT TGC GTT GTT TTC TAC Tyr Lys Cys Cys Ser Asp Pro Asn Cys Val. Val Phe Tyr 550 555 OAT GOT AAA TOG GOT OTT GAA AAC AAC OAC TOG TOT GOT Asp Gly Lys~ Trp Gly Val. 01u Asa Asn Asp Trp Cys Gly 565 570 575 TAAOCAGTAA AATACTAATT AATAAAAAAT TAAAGAATTA 1670 1718 1766 1814 1862 1910 1958 2006 2055 2115 2175 223S 2295 2338 TGAAAAATTT AAATTTAAAA ATTTAAAAGA ATTATGAAAA ATTTAAATTT AAAAATTTAA AAAAAACTAA. TTTAGTAAAA AATTAAAOAA TTATTGAAAA T1'TTAAATGT AAAAATTTAA AAAATACAAA 'TTTOTAAAAA AAAATGAAAG AATTATGAAA AATTAAAATG TAAAAOTIrTA AAAAATACAR ATTTGTAAGA AAAATAAAGA ATTATAAAAA AAATAAAGAA TTATGAAAAA CCCAAATOTA AAGAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA INFORMATION FOR SEQ ID NO: 2: Wi SEQUENCE CHARACTERISTICS: LENGTH: 607 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: Met Arg Thr Ile Lys Phe Phe Phe Ala Val. 
Ala Ile Ala Thr Val Ala -29 -25 -20 Lys Ala Gin Trp Gly Gly Gly Gly Ala Ser Ala Oly Gin Arg Leu Thr -5 1 7a1 Gly Asn Gly Gin Thr Gin His Lys Gly Val Ala Asp Gly Tyr Ser 10 is WO 93/25693 W Cr/193/O 283 Tyr Gly Gly Ala Gin Phe .00 Ile Thr Pro Arg Lys 180 Ala Lys Ser Gly Asn 260 Gly Ala Gly Ala Cys 340 Val Glu Ile Trp Leu Ser Asn Thr Thr Gln Glu Ile Thr Gin 165 Glu Leu Leu Thr Lys 245 Asp Ser Ala Ser Ala 325 Val Glu Gly Phe Asp Gly Asn Asp Asp Ile 150 Gin Trp Asn Asp Gly 230 Lys Gly Met Vai Gin 310 Thr Tyr Tyr Thr Ala Ser Ala Gly Vai 120 Ala Gly Arg Lys Glu 200 Tyr Val Thr Ser Leu 280 Arg Lys Lys T-p Ile 360 Asp 25 Phe Arg Tyr Ser Val 105 Asp Gin Gly Thr Gin 2.85 Gly Thr Pro Val Tyr 265 Gly Gly Ala Gln Phe 345 Ile Asn Thr Gly Gly Ser Lys Arg Ile Gly Gin Trp Ser Ser 170 Gly Trp Thr Ser Giy 250 GLu Ser Asn Thr Thr 330 Gin Glu Ala Gly Gly 75 Asn Gly Val Lye Glu .55 Gly Trp Gin Gin Ser 235 Asn Ile Gly Phe Asp 315 Ala Asn Asp Glu Leu Leu Ser Vai Pro Ile Thr His Gly Ser Lys 220 Ser Gly Trp Ala Leu 300 Tyr Ser Arg Trp Asn Phe Tyr Leu Leu 110 Ala Gln Lys Thr Gly 290 Gly Ser Gly Asn Asp 270 Phe Arg Tyr Ser Leu 350 Asp Gly Ser Met Thr Leu Ser Vai Ser Gin Ala Thr Val Tyr Glu Tyr Gly Arg Asp His 245 Tyr Phe 260 Ser Asp Leu Tyr Ala Asp Pro Ala 225 Ser Thr 240 His Lys Thr Gly Ala Glu Gly Leu 305 Gly Leu 320 Asn Ser Gly Val Val Pro Asn Lys Tyr Gly Tyr Met 230 Thr Ser His Glu Val 210 Pro Ala Giy Gly Trp 290 Asp Asp Arg Pro Asp 370 WO 93/25693 WO 9325693PCT/GB93/01 283 -465- Gin Gly Met Asp Gin Tyr Val Ser 420 Asn Leu Val Ala Ser Pro Arg Th: 485 Thr Ala 500 Tyr Thr Gly Cys Lys Cys Gly Lys 565 Lys Met 375 His Th: 390 Phe Ser Asp His Tyr Glu Asp Val 455 Ala Thr 470 Lys Ser Gin Gly Asp Glu Giy Val.
535 Cys Ser 550 Trp Gly Val Thr Gly Pro Val Arg Ile Thr Gin 410 Giu Leu Leu Ala Thr 490 Cys Tb: Cys Asn Asn 570 Ala Gin Tyr Lys Ile Phe Gin le Asn Gly Gly Ser 395 Gin Lys Arg Thr Se: 41i5 Trp Ala Lys Gin Gly 430 Asn Ala Giu Giy Trp 445 Asp Vai Tyr Thr Thr 460 Pro Arg Thr Thr Thr 475 Asn Tyr Asn Lys Cys 495 Cys Ser Asp Pro Asn 510 Trp Giy Val Giu Asn 525 Ser Se: Lys Ile Thr Cys Vai Val. Phe Tyr 555 Asn Asp Trp Cys Gly 575 INFORMATION FOR SEQ ID NO: 3: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 1847 base pairs TYPE: nuclic acid STRAIflEDNESS: double TOPOLOGY. linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 195. .1847 (ix) FEATURE: NAME/KEY: sigpeptide LOCATION: 195. .281 (ix) FEATURE: NAME/KEY: misc-feature LOCATION: 1847 OTHER INFORMATION: /label- pNX3_insert (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: WO 93/25693 WO 932569 rerC 1193/01283 -47- TTTTATTATA TCAATCTCTA ATTTATT TrAGGAAAAA AAATATTAGA GAGTAATATT TAAAAACAAA GAAATTTAAA 2ACTGGT TAAAAAAAAA ATAAAAAACA AAAT1'AATAA AATAAAAAAA TAAATATAAT AACG=z-ATT TAGT'ATr AGATATI'TTT GAAAAATATT GAATTAGAAA. AAAA ATG AGA ACT ATT Met Arg Thr Ile AAA TTC T1'T TTC GCA GTA GCT ATT Lys Phe Phe Phe Ala Val. Ala le
GCA
Ala
CAA
Gin
GAT
Asp
TCT
Ser
TCT
Ser
TCT
Ser
GCA
Ala
GTA
Val.
12S
GAA
Glu
GGT
Giy
GAT
Aso
TAC
Tyr
TCA
Ser 205 ACT GTT Thr Val.
is AGA TTA Arg Leu GGT TAC Gly Tyr ATG ACT Met Thr GTT AAC Val An CAA AAG Gin Lys ACT TAC Thr Tyr 110 TAC GGT Tyr Gly TAC TAC Tyr Tyr AGA ATG Arg Met CAC X~T His Thr 17S ITTC AGT Phe Ser 190 GAT CAC Asp His
CAA
Gin
AAT
An 35
ATC
Ile
GGT
Gly
TT-C
Phe
GAT
Asp
GGT
Giy 115
AAC
Asn
GAT
Asp
GAT
Asp
ATC
Ile
CAA
Gin 195
TGG
Trp GGA GGT Giy Gly ACC CAA Th~r Gin GAT AAC Asp An Trc AAG Phe Lys CGT CGT Arg Arg TAC ATT Tyr le AGT GGT Ser Gly GTT CAA Val Gin 135 GAC TGG Asp Trp iso CAA TAT Gin Tyr GGT AGT Gly Ser ACT TCT Thr Ser CAA GGT Gin Gly 215
GGT
Gly
CAT
His
ACC
Thr
GCT
Ala
GGT
Gly
GGA
Gly
AAC
An 120
GGT
Gly GT-r Val.
7AAG Lys
GAA
Giu
GGT
Gly 200
TGG
Trp CTT TAT GAA GTT OCT Leu Tyr Giu Va]. Ala TTG AAC GCC GAA GGT TGG CAA AGT AGT Leu Asn Ala Giu Gly Trp Gin Ser Ser 230 GGT ATA Giy Ile 235 WO 93/25693 WO 9325693PCT/G B93/01283 GCT GAT GTC Ala Asp Val CCT GCC CCT Pro Ala Pro 255 AGT ACT GCC Ser Thix Ala 270 CAT AAG GGT His Lys Gly 295 ACT GGT GGT Thr Gly Gly GCT GAA TGG Ala Giu T-p, GGT CTT GAC Gly Lou Asp 335 GGA TTA GAT Giy Lou Asp 350 AAC TCC CGT Asn Ser Arg 365 GGC GTT CCT Gly Val Pro GTT CCA CAT Val Pro Asp AAG ATT TTC Ly's Ile Phe 415 GAA ACC TTT Glu Thr Phe 430 GGT CAT ATT Gly His Ile 445 TGG GGT ATT Tr-p Gly Ile CAA AGT AGT Gin Ser Ser CCA AAG GGT Pro Lys Gly 495 ACC AAG Thr Lys 240 ACC TCC Thr Ser AAT GGT Asn Gly GTC AAC Val Asn AAC CGT ASn Gly 305 AAT GCA Asn Ala 320 TTC CGT Phe Giy TAT GC Tyr Ala CTC TGT Lo Cys TrA GTA Lo Val 38S GCA CAAL Ala Gin 400 CAA ATG Gin Met AAG CPAA Lys Gin ACT GTC Thr Val GGT AAC Gly Asn 465 GGT GTT Gly Val 480 TCT AGT
TTA
Leu
ACT
Thr
AA
Lys
GAT
Asp 290
TCT
Ser
GCT
Ala
TCT
Ser
GCT
Ala
GTA
Vai 370
GAP.
Glu
GGA
Giy
GAT
Asp
TAC
Tiyr
TCA
Ser 450
CTT
Leu
GCT
Ala
CCA
GAT GTT Asp Val GGT ACT Giy Thr 260 AAG TTT Lys Phe 275 GGT TTC Gly Phe ATG ACT Met Thr GTT AAC Vai Asn CAA AAG Gin Lys 340 ACT TAC Thr Tyr 355 TAC GGA Tyr Giy TAC TAC Tyr Tyr AAA ATG Lys Met CAC ACT His Thr 420 TTC AG? Phe Ser 435 GAT CAC Asp His TAT GAA Tyr Glb GAT GTC Asp Val GCC ACC -48-
ACA
Thr
CCA
Pro
GTC
Vai
TAT
Tiyr
GGT
Gly 310
GGT
Gly
GCA
Ala
CAA
Gin
TTC
Phe
ATT
Ile 390
ACC
Thr
CCA
Pro
CGT
Arg
AAG
Lys
GCT
Ala 470
TTA
Leo
ACC
Thr
AGC
Ser
GGT
Gly
GAP.
Glu 295
ACT
Ser
AAC
Asn
ACC
Thr
ACT
Thr
CAA.
Gin 375
GA.
Gio
AT?
Ile
ACT
Thr
CAA
Gin
GAP.
Glu 455
TTG
Lo
TTA
Lo CA. AAA Gin Lys ACT TCT Ser Ser 265 AAT GGA Asn Giy 280 ATC TGG Ile Trp COT GCA Gly Ala TTC CT? Phe Lou CAT TAC Asp Tyr 345 CCC AGT Ala Ser 360 AAC CGT Asn Arg GAT TGG Asp Trp GIV: GGP.
Asp Giy ATC AAT Ile Asn 425 CAA AAG Gin Lys 440 TGG GCC Trp Ala AAC GCC Asn Ala GA? GT? Asp Val
GGT
Giy 250
C
Ala
CAP.
Gln
TTA
Lo
ACT
Thr
CC
Ala 330
CAC
Asp
GCA
Ala
GGA
Giy
CT?
Vai
GCT
Ala 410
COT
Gly
AGA
Arg
AAA.
Lys
GAP.
Glu
TAC
Tyr 490
ACT
TCT AP.T Ser Asn GGT CGA Gly Gly AAC CAA Asn Gin CAT AAC Asp Asn 300 TTC AAG Phe Lys 315 CGT CGT Arg Arg TAC AT? Tyr Ile AG? GGT Ser Gly CT? AAT Leu Asn 380 GAC TG Asp Trp 395 CAA TAT Gin Tyr GGT ACT Giy Ser ACT TCT Thr Ser CAA. OCT Gin Gly 460 GGT TGO Gly Trp, 475 ACA ACT Thr Thr ACT ACC 950 998 1046 1094 1142 1190 1238 1286 1334 1382 1430 1479 1526 1574 1622 1670 1718 Ser Ser Pro Ala TCT CCC OCT CCT CC? Ser Ala A-la Pro Arg 505 Thr Thr Thr WO 93/25693 I'/C B93/01283 -49- CGT ACT ACT ACT CGT ACC AAG TCT CTT CCA ACC AAT TAC AAT AAG TGT Arg Thr Thr Thr Arg Thr Lys Ser Leu Pro Thr Asn Tyr Asn Lys Cys 510 515 520 TCT GCT AGA AT? ACT GCT CAA GGT TAC AAG TGT TGT AGC GAT CCA AAT Ser Ala Arg Ile Thr Ala Gin Gly Tyr Lys Cys Cys Ser Asp Pro Asn 525 530 535 540 TGT GTT GTT TAC TAC ACT GAT GAG GAT CGT ACC Cys Val Val Tyr Tyr Thr Asp Giu Asp Gly Thr 545 550 IFORMATION FOR SEQ ID NO: 4: SEQUENCE CHARACTERISTICS: LENGTH: 551 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 1766 1814 1847 Met 1 Arg Thr Ile Lys Phe Phe Phe Ala Val Ala Ile Ala Thr Val Ala Lys Ala Val Gly Tyr Glu so Gly Ser Gly Asn Ala Thr Gin Thr Phe Gin 130 Ile Glu 145 Thr Ile Pro Thr Arg Gin Lys Glu 210 Gin Asn Ile Gly Phe Asp Cly 115 Asn Asp Asp Ile Gin 195 Ala Ser Lys Gly Gly Gly Glu Trp Leu Asp Leu Asp 105 Ser Arg Val Pro Pro Asp Ile Phe 170 Thr Phe 185 His Ile Gin Asp Ser Ser Ser Ala Val 125 Glu Gly Asp Tyr Ser 205 Trp Ala Lys Gln Trp Gly Ile Gly Asn Leu Tyr Glii Vai r IWO 93/25693 PCIr/GB93/01283 Ala Leu Asn Ala Glu Trp Gin Ser Ser Ile Ala Asp Val Thr 225 Lys Ser Gly Asn Gly 305 Ala Gly Ala ys Val 385 Gin Met Gin Val Asn 465 Val Ser Arg Thr Tyr 545 Leu Thr Lys Asp 290 Ser Ala Ser Ala Val 370 Glu Gly Asp Tyr Ser 450 Leu Ala Pro Thr Ala 530 Thr Asp Gly Lys 275 Gly Met Val Gin Thr 355 Tyr Tyr Lys His Phe 435 Asp Tyr Asp Ala Lys 515 Gin Aso Thr Ser Gly Glu 295 Ser Asn Thr Th~r Gin 375 Glu Ile Thr Gin Glu 455 Leu Leu Ala Thr Cys 535 Thr Phe Leu Asp 
Tyr 345 Ala Ser 360 Asn Arg Asp Trp Asp Gly Ile Asn 425 Gin Lys 440 Trp Ala Asn Ala Asp Val Pro Arg 505 Asn Tyr 520 Cys Ser Gly 250 Ala Gin Leu Thr Ala 330 Asp Ala Gly Val Ala 410 Gly Arg Lys Glu Tyr 490 Thr Asn Asp Asn Gly Gin Asn 300 Lys Arg Ile Gly Asn Trp Tyr Ser Ser Gly 460 Trp Thr Thr Cys Asn 540 Ala Thr 270 Lys Gly Glu Leu Leu 350 Ser Vai Pro Ile Thr 430 His Gly Ser Lys Thr 510 Ala Val Pro 255 Ala Gly Gly Trp Asp 335 Asp Arg Pro Asp Phe 415 Phe Ile Ile Ser Gly 495 Thr Arg Val 240 Thr Asn Vai Asn Asn 320 Phe Tyr Leu Leu Ala 400 Gin Lys Thr Glv Gly 480 Ser Thr Ile Tyr INFORMATION FOR SEQ ID NO: i) SEQUENCE CHARACTERISTICS: LENGTH: 1725 base pairs WO 93/25693 PI'lrG B93/O 1283 -51- TYPE: nucleic acid STRANDmEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAM/KEY: CDS LOCATION: 1.95.. 1724 (ix) FEATURE: NAME/KEY: sigpeptide LOCATION: 1.95. .281.
(ix) FEATURE: NAME/KEY: misc-feature LOCATION: 1725 OwE INFORMATION: /label- pNX4 insert (xi) SEQUENCE DESCRIPTION: SEQ ID NO: TTTATTATA TCAATCTCTA AT1'TATTT TTAGGAAAAA AATAAAAAAA TAAATATAAT s0 AAATATTAGA GAGTAATATT TAAAAACAAA GAAATTrAAA AACGTATT TAGTTATTTT 1.20 TTTTACTGGT TAAAAAAAAA ATAAAAAACA AAATTAATAA AGATATITTTT GAAAAATATT 180 GAATTAGAAA AAMA ATG AGA ACT ATT Met Arg Thr Ile AAA TTC ?1T TTC GCA GTA GCT ATT Lys Phe Phe Phe Ala Val Ala Ile GCA ACT Ala Th~r CAA AGA Gin Arg CAT GGT Asp Gly TCT ATG Ser Met TCT GTT Ser Val.
TCT CAA Ser Gin GCA ACT Ala Thr 1.10 GTT GCT Val Ala TTA ACC Leu Thr TAC ACT Tyr Ser ACT CTC Thi Leu AAC CCT Asn Arg AAG AAG Lys Lys TAC AGA Tyr Arg GGT GGA Giy Gly CAA ACC Gin Thr TTA OAT Leu Asp ACC TTC Thr Phe CCC CCT Ala Arg 85 ACC TAC Ser Tyr GCA ACT Ala Ser GOT GOT GCC TCT GCT GOT Gly Giy Ala Ser Ala Giy CACAT AAG GCT GTA GCT G2.n His Lys Gly Val Ala AAC ACC GOT CGT AGT GOT Asn Thr Gly Gly Ser Gly 55 AAG CCT GAA TGG AAT CCA Lys Ala G2.u Trp, Asn Ala CCT GGT CTT CAC TTC GGT Arg Giy Leu Asp Phe Gly AT? GCA TTC CAT TAT ACT Ile Gly Leu Asp Tyr Thr 2.05 CC? AAC TCC CCT CTC TCT Giy Asn Ser Arg Leu Cys 2.20 CAA GCT GTT CCA TTG GTA Gln Gly Val Pro Leu Vai
CTA
Val 122 TAC GCT TCC 'ITC CAA AAC CCT CGA CTT Tyr G2.y Tro Phe Gin Asn Arg Gly Val WO 93/25693 WO 9325693PCT/G B193/01283 -52- GAA TAC TAC Glu Tyr Tyr GGT AGA ATG Gly Arg Met GAT CAC ACT Asp His Tbr 1.75 TAC TTC ACT Tyr Phe Ser 190 TCA GAT CAC Ser Asp His 205 cTr TAT GAA Leu Tyr C3.u GCT CAT CTC Ala Asp Val CCT CCC CCT Pro Ala Pro 255 ACT ACT GCC Ser Thr Ala 270 CAT AAG GGT His Lys Gly 285 ACT GGT GGT Th~r Cly Gly GCT GAA TCG Ala Glu Trp GGT CTT GAC Gly Leu Asp 335 CGA TTA CAT Gly Leu Asp 350 AAC TCC CGT Asn Ser Arg 365 GCC GTT CCT Gly Val Pro GTT CCA GAT Val Pro Asp
ATC
Ile
GTA
Val 160
OCT
Gly
GTC
Val
TTT
Phe
GTT
Val.
ACC
Thr 240
ACC
Thr
AAT
Asn
GTC
Val
AAC
Asn
AAT
Asn 320
TTC
Phe
TAT
Tyr
CTC
Leu
TTA
Leu
GCA
Ala 400 TGG GTT GAC TGG GTT CCA GAT GCA CAA Trp Val Asp Trp Val. Pro Asp Ala Gin 150 155 GGA GCT CAA TAT AAG ATT TTC CAA ATG Gly Ala Gin Tyr Lys Ile Phe Gin Met 165 1.70 AAT GGT GGT ACT CGAP ACC TTT AAG CAA Asn (Giy Gly Ser Giu Thr Phe Lys Gin 180 2.85 AAG AGA ACT TCT GC? CAT AT? ACT GTC Lys Arg Th~r Ser Gly His Ile Thr Val 200 GCC AAA CAA GCT TOG GGT AT? GGT AAC Ala Lys Gin Gly Trp Oly Ile Gly Asn 215 220 GCC GAA OCT TOG CAA AG? AG? COT ATA Ala Giu Oly Trp Gin Ser Ser Gly Ile 230 235 GTT TAC ACA ACC CAA AAA COT TCT AAT Val Tyr Thr Thx Gin Lys Giy Ser Asn 245 250 ACT OTT CCA AGC ACT TCT GCT GOT GGA Tbr Val Pro Ser Ser Ser Ala Giy Giy 260 265 TTT ACT GTC COT AAT GGA CAA AAC CAA Phe Thr Val Gly Asn Giy Gin Asn Gin 280 TTC AGT TAT GAA ATC TOG TTA OAT AAC Phe Ser Tyr Giu Ile Trp Leu Asp Asn 295 300 ACT CTC COT AGT GOT GCA ACT TTC AAG Thr Leu Gly Ser Gly Ala Thr Phe Lys 310 315 AAC CGT OCT AAC TTC CTT CCC CGT COT Asn Arg Giy Asn Phe Leu Ala Arg Arg 325 330 AAG AAC GCA ACC GAT TAC GAC TAC ATT Lys Lys Ala Thr Asp Tyr Asp Tyr Ile 340 345 TAC AAA CAA ACT GCC ACT GCA AGT GOT Tyr Lys Gin Th~r Ala Ser Ala Ser Gly 360 GGA TGG TTC CAA AAC CGT GGA CTT AAT C-iy Trp Phe Gin Asn Arg Oiy La..u Asn 375 380 TAC ATC ATT GIA CAT TGG GTT GAC TGG Tyr Ile Ile Glu Asp Trp Val Asp TrP 390 395 ATC GTA ACC AT? 
CAT GGA GCT CAA TAT Met Val Thr Ile Asp Gly Ala Gin Tyr 405 410 662 710 758 806 854 902 950 998 1046 1094 1.142 1190 1238 2.286 3.334 1382 1430 WO 93/25693 P'CF/GB193/01283 -53- AAG ATT TTC CAA ATG GAT CAC ACT GGT CCA ACT ATC AAT GGT GGT AGT 1478 Lys Xie Phe Gin Met Asp His Thr Giy Pro Thr Ile Asn Giy Gly Ser 415 420 425 GAA ACC "TT AAG CAA TAC TTC AGT GTC CGT CAA CAA AAG AGA ACT TCT 1526 Giu Thr Phe Lys Gin Tyr Phe Ser Val Arg Gin Gin Lys Arg Thx Ser 430 435 440 GGT CAT ATT ACT GTC TCA GAT CAC TTT AAG GAA TGG GCC AAA CAA GGT 1574 Giy His Ile Thr Vai Ser Asp His Phe Lys Giu Trp Ala Lys Gin Giy 445 450 455 460 TGG GGT ATT GGT AAC CTT TAT GAA GTT GCT TTG AAC GCC GAA GGT TGG 1622 Trp Gly Ie Gly Asn Leu Tyr Giu Vai Ala Leu Asn Ala Giu Gly Trp 465 470 475 CAA AGT AGT GGT GTT GCT GAT GTC ACC T"rA TTA GAT GTT TAC ACA ACT 1670 Gin Ser Ser Giy Vai Ala Asp Val Thr Leu Leu Asp Vai Tyr Thr Thr 480 485 490 CCA AAG GGT TCT ACT CCA GCC ACC TCT GCC GCT CCT CGT ACT ACT ACC 1718 Pro Lys GMy Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Th~r 495 500 505 CGT ACT A 1725 Arg Thr 510 INFORMATION FOR SEQ ID NO: 6: SEQUENCE CHARACTERISTICS: LENGTH: 510 amino acids TYPE: amino acid TOPOLOGY: iinear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: Met Arg Thr Ile Lys Phe Phe Phe Ala Val Ala le Ala Thr Val Ala 1 5 10 is Lys Ala Gin Trp Gly Giy Gly Gly Ala Ser Ala Gly Gin Arg Leu Thr 25 Vai Giy Asn Giy Gin Thr Gin His Lys Gly Vai Ala Asp Giy Tyr Ser 40 Tyr Giu Ile Trp Leu Asp Asn Thr Giy Gly Ser Gly Ser Met Thr Leu s0 55 Gly Ser Giy Ala Thr Phe Lys Ala Glu Trp Asn Ala Ser Val Asn Arg 70 75 Gly Asn Phe Leu Ala Arg Arg Gly Leu Asp Phe Gly Ser Gin Lys Lys 90 Ala Thr Asp Tyr Ser Tyr Ile Gly Leu Asp TV-- Thr Ala Thr Tyr Arg 100 1os 110 Gin Thr Gly Ser Ala Ser Gly Asn Ser Arg Leu Cys Val Tyr Gly Trp 115 120 125 Phe Gin Asn Arg Gly Val Gin Gly Val Pro Leu Val Giu Tyr Tyr Ile 130 135 140 WO 93/25693 i'CT/GB93/0 1283 -54- Ile 145 Thr Pro Arg Lya Ala 227 Lys Ser Giy Asn Gly 305 Ai 
Gly Ala ys Val 385 Gin Met Gin Val Asn 4rB Val Glu Ile Thr Gin Glu 210 Leu Leu Thr Lys Asp 290 Ser i.la Ser Ala Val 370 clu Gly Asp Tyr Ser 450 Leu Ala Asp Asp Ile Gin 195 Trp Asn Asp Gly Lys 275 Gly Met Val Gin Thr 355 Tyr Tyr Lys His Phe 435 Asp Tyr Asp Trp Gly A~n 180 Lys 2a Ala Vai Thr 260 Phe Phe Thr Asn Lys 340 Gly Tyr Met Thr 420 Ser His Ciu Val val Ala 165 Gly Axg Lye Giu TPyr 245 Val Thr Ser Leu Arg 325 Lys Lys Trp Ile Val 405 Gly Val Phe Val Thr 485 150 Gin Gly Thr Gin Gly 230 Thr Pro Val 1y1 Gly 310 Gly Ala Gln Phe I le 390 Thr Pro Arg Lys 470 Leu Trp Tyr Ser Ser Gly 215 Trp Thr Ser Gly Glu 295 Ser Asn Thr Thr Gin 375 GlU lie Thr din Glu 455 Leu Leu Val Lys Giu Gly 200 Trp Gin Gin Ser Asn 280 lie Gly Phe Asp Ala 360 Asn Asp Asp Ile Gin 440 Trp Asn Asp Pro Ile Thr 185 His Gly Ser Lye Se: 265 Gly Trp Ala Leu Tyr 345 Ser Arg Trp Gly Asn 425 Le Ala Ala Val Asp Phe 170 Phe lie Ile Ser Gly 250 Ala Gln Leu Thr Ala 330 Asp Ala Gly Val Ala 410 aly Arg Lye Glu Tyr 490 Ala Gin 155 Gin Met Lys Gin Thr Val Gly Asn 220 Cly Ile 235 Ser Asn Giy Giy Asn Gin Asp Asn 300 Phe Lye 315 Arg Arg Tyr Ile Ser Gly Leu Asn 380 Asp Trp 395 Gin Tyr Gly Se: Thr Se Gir Gly 460 Gly Trp 475 Thr Thr Gly Arg Asp His Tyr Phe 190 Se: Asp 205 Leu Tyr Ala Asp Pro Ala Ser Thr 270 His Lye 285 Thr Gly Ala Giu Gly Leu Gly Leu 350 Asn Ser 365 Gly Val Val Pro Lys Ile GlU Thr 430 Gly His 445 Trp Gly Gin Se: Pro Lys Met Thr 175 Se: His Glu Val Pro 255 Ala Gly Gly Trp Asp 335 Asp Arg Pro Asp Phe 415 Phe Ile Ile Se: Gly 495 Vai 160 Gly Val Phe Val Thr 240 Thr Asn Val Asn Asn 320 Phe Leu Leu Ala 400 Gin Lys Thr Gly 480 Ser .WO093/25693 PC17GB93/01283 Set Pro Ala Thr Ser Ala Ala Pro Arg Thr Thr Thr Arg Thr 500 SO5 510 INORMATION FOR SEQ ID NO: 7: SEQUENCE CHARACTERISTICS: LENGTH: 724 base pai.rs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: cDNP, (ix) FEATURE: NAME/KEY: CDS LOCATION: 1. .723 (iX) FEATURE: NAME/KEY: misc feature LOCATION: 1. 
.724 OTHER INORMATION:- /label- pNXS insert (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: ACT GCC AAT GGT Thr Ala Asa Gly 1
AAG
Lys
GOT
Gly
GAA
Glu
CTT
Leu
TTA
Leu
TCC
Ser
OTT
Val
CCA
Pro
ATT
Ile 14S GOT GTC Gly Val GOT AAC Gly Asn 35 TGG AAT Trp, Asn 50 GAC TTC Asp Phe GAT TAT Asp Tyr COT CWC Arg Leu CCT TTA PrQ Leu 115 OAT OCA Asp Ala 130 TTC CAA Phe Gin AAA AAG TTT ACT GTC GOT AAT Lys Lys Phe Thr Val. Gly Asn 5 GAT GOT TTC AOT TAT GAA ATC Asp Oly Phe Ser Tyr Glu Ile 25 TCT ATG ACT CTC GGT AGT GOT Ser Met Thr Leu Gly Ser Gly 40 OCT OTT AAC CGT GOT AAC TTC Ala Val. Asa Arg Giy Asn Phe 55 TCT CAM MO AAG GCA ACC OAT Ser Gin Lys Lys Ala Thr Asp 70 75 OCT A~CT TC AAA CAA ACT GCC Ala Thr Tyr Lys Gin Thr Ala OTA TAC OGA TOO TTC CAM MC Val Tyr Gly Trp Phe Gin Asn 105 GAA TAC TAC ATC ATT GMA OAT Giu Tyr Tyr Ile Ie Giu Asp 120 GGA MAA ATO OTA, ACC ATT OAT Gly Lys Met Val. Thr Ile Asp 135 OAT CAC ACT GOT CCA, ACT ATC Asp His Thr Gly Pro Thr Ile 150 155 OCR CMA MC Gly Gin An TOGG TTA OAT Trp Leu Asp OCA ACT TTC Ala Thr Phe CTT 0CC COT Leu Ala Arg TAC OAC TAC Tyr Asp Tyr AGT OCA AOT Ser Ala Ser CGT OCR CTT Arg Gly Leu 110 TOO OTT GAC Trp Val Asp 125 OGA OCT CAA Gly Ala Gin 140 MAT GOT GOT Ann Gly Gly
CAT
His
ACT
Thr
OCT
Ala
GOT
Gly
OCR
Oly s0
AAC
An
GOC
Gly
OTT
Val.
AAO
Lys
GMA
Oiu 160 WO 93/25693 PCr/c B93/O1 283 ACC TTT Thr Phe CAT ATT His Ile GGT ATT Gly Ile AGT AGT Ser Ser 210 AAG GGT Lys Gly 225 ACT A Thx
TTC
Phe
GAT
Asp
TAT
Tyr
GAT
Asp
GCC
Ala 230 -56- GTC COT CAA CAA AAG Val Arg Gin Gin Lys 170 TTT AAG GAA TOG GCC Phe Lys Glu Trp Ala 185 OTT GCT TTG AAC GCC Val Ala Leu Asn Ala 200 ACC TTA TTA GAT GTT Thr Leu Leu Asp Vai 220 TCT GCC GCT CCT COT Snr Ala Ala Pro Arg 235 AGA ACT TCT GGT Arg Thr Ser Gly 175 AAA CAA GGT TGG Lys Gin Gly Trp 190 GAA GGT TOG CAA Glu Gly Trp Gin 205 TAC ACA ACT CCA Tyr Thr Thr Pro ACT ACT ACC CGT Thr Thr Thr Arg 240 528 576 624 672 720 INFORMATION FOR SEQ ID NO: 8: i) SEQUENCE CHARACTERISTICS: LENGTH: 24. amino acids TYPE: amino acid TOPOLOGY. linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: Thr Ala Asn Gly Lys Asp Ser Ala Ser Ala Val Glu Gly Asp Lys Phe Thr Val Giy Met Val Gln Thr Tyr Tyr Lys His 150 Phe Thr Asn Lys Tyr Gly Tyr Me, Thr Oly Asn Gly Gin Asn Gin His 10 1 Glu Ile Trp Leu Asp Asn Thr Ser Gly Ala Thr Phe Lys Ala Asn Phe Leu Ala Arg Arg Gly Thr Asp Tyr Asp Tyr Ile Gly 75 Thr Ala Ser Ala Ser Gly Asn 90 Gin Ann Arg Gly Leu Asn Gly 110 Glu Asp Trp Val Asp Trp Val 125 Ile Asp Giy Ala Gin Tyr Lys 140 Thr Ile Asn Gly Giy Ser Glu WO 93/25693 WO 93/5693 CT/GB93/01283 Tbr Phe Lys His Ile Thr Gly Ile Gly 195 Ser Ser Gly 210 Lys Gly Ser 225 Gin Tyr Phe Ser Val Arg 165 Val Ser Asp His Phe Lys 180 185 Ann Leu Tyr Giu Val Ala Gl.a Gin Lys 170 Arg Thr Ser Gly 17 S Giu Trp Leu Ann 190 Ala Glu Giy Trp Gin Val Ala Asp Ser Pro Ala 230 Val 215 Thr Leu Leu Asp Val 220 Arg Tyr Thr Thr Pro Ser Ala Ala Pro 235 Thr Thr Thr INFORMATION FOR SEQ ID NO: 9: Wi SEQUENCE CHARACTERISTICS: LENGTH: 1001 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 195. .1001 (ix) FEATURE: NAME/KEY: sigpeptide LOCATION: 195. .281 (ix) FEATURE: NAME/KEY: misc-feature LOCATION: 1. 
.1001 O0710 114FORMATION: /label= pNX6_insert (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: TTTTATTATA TCAATCTCTA ATTTATTTTT TrAGGAAAAA AATAAAAAAA TAAATATAAT AAATATTAGA GAGTAATATT TAAAAACAAA GAAATTTAAA AACGTTTATT TAGTTAT~T TTTTACTGGT TAAAAAAAAA ATAAAAAACA AAATTAATAA AGATATTTTT GAAAAATATT GAATTAGAAA AAAA ATG AGA ACT ATT AAA T~C TTr TTC GCA GTA GCT ATT Met Arg Thr Ile Lyn Phe Phe Phe Ala Val Ala Ile 1 5 GCA ACT T GCT AAG GCC CAA TGG GGT GGA GGT GGT GCC TCT GCT GGT Ala Thr -1 Ala Lys Ala Gin Trp Gly Gly Gly Gly Ala Ser Ala Giy is 20 CAA AGA TTA ACC GTC GGT AAT GGT CAA ACC CAA CAT AAG GGT GTA GCT Gin Arg Leu Thr Val Gly Ann Gly Gin Thr Gin His Lys Gly Val Ala 35 GAT GGT TAC AGT TAT GAl ATC TGG TTA GAT AAC ACC GGT GGT AGT GGT Asp Gly Tyr Ser Tyr Glu Ile Trp Leu Asp Ann Thr Giy Gly Ser Gly 50 55 WO 93/25693 -58- TCT ATG ACT CTC GGT AGT GGT GCA ACC TrC AAG GCT GAA TGG AMT PCT/GB93/01283 Ser Met Tbx TCT GTT AMC Ser Val. Asn TCT CMA AAG Ser Gin Lys GCA ACT TAC Ala Thx Tyr 110 GTA TAC GGT Val. Tyr Giy 125 GAA TAC TAC Giu Tyr Tyr GGT AGA ATC, Gly Arg Met GAT CAC ACT Asp His Thr 17S TAC TTC AGT Tyr Phe Ser 190 TCA GAT CAC Ser Asp His 205 CTT TAT GMA Leu Tyr Giu GCT GAT GTC Ala Asp Val CCT GCC CCT Pro Ala Pro 255
ACT
Ser Giy Ser Gly GGT MAC TTC Gly Asn Phe GCA ACC CAT Ala Thr Asp CAA ACT OCT Gin Thr Gly 115 TTC CAA AAC Phe Gin Asn 130 ATT GAA GAT Ile Giu Asp 145 ACC ATT GAT Th Ile Asp CCA ACT ATC Pro Thr Ile CGT CAA CAA Arg Gin Gin 195 AAG GAA TGG Lys Giu Trp 210 GCT TTG MAC Ala Leu Asn, 225 AAG TI'A CAT Lys Leu Asp TCC ACT GOT Ser Tkir Gly Ala Leu
TAC
Tyr 100
AGC
Ser
CGT
Arg
TG
Trp
GGA
Giy
AAT
Asn 180
MAG
LYS
GCC
Ala
GCC
Ala
GTT
Val
ACT
Thr 260 Thr
GCC
Ala 85
AGC
Ser
GCA
Ala
GGA
Giy
OTT
Val
GCT
Ala 165
GT
Giy
AGA
Arg
AAA
Lys
GMA
Giu
TAC
Tyr 245
OTT
Val Lys Ala Giu Trp Asn Ala COT OCT CTT GAC 7'TC OCT Arg Gly Leu Asp Phe Gly ATT OGA TI'G CAT TAT ACT Ile Gly Leu Asp Tyr Thx 105 OCT MAC TCC COT CTC TGT Gly Asn Ser Arg Leu Cys 120 CMA GGT OTT CCA TTG GTA Gin Giy Val. Pro Leu Val 135 140 TOG OTT CCA CAT GCA CMA Trp Vai Pro Asp Ala Gin 155 TAT MAG ATT TTC CMA ATG Tyr Lys Ile Phe Gin Met 170 AGT GMA ACC TTT MAG CAA Ser Giu Thr Phe Lys Gin 185 TCT GOT CAT ATT ACT GTC Ser Cly His Ile Thr Val 200 GOT TOG GGT ATT OCT MAC Gly Trp Gly Ile Giy Asn 215 220 TOG CMA AGT AGT OCT ATA Trp Gin Ser Ser Oly le 235 ACC CMA AMA GGT TCT MAT Thr Gin Lys Cly Ser Asn 250 ACC AGT TCT OCT COT GGA Ser Ser Ser Ala Giy Gly 265 .902 1001 INFORM'ATION FOR SEQ ID NO: Wi SEQUENCE CHARACTERISTICS: LENGTH: 269 amino aci.ds TYPE: amino acid TOPOLOGY: iinear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: WO 93/25693 WO 93/25693 C/G B93/O 1283 .59- Met 1 Lys Val Tyr Gly Giy Ala Gin Phe Ile 2.45 Thr Pro Arg Lys Al a 225 Lys Ser Arg Al a Gly Giu s0 Ser Asn Thr Thr Gin 130 Giu Ile Thr Gin Giu 210 Leu Leu Thr Thr Gin Asn 3S Ile Giy Phe Asp Gly Asa Asp Asp Ile Gin 2.95 Trp Asn Asp Gly Ile Trp Gly Trp Ala Leu Tyr 100 Ser Arg Trp Gly Asn 180 Lys Ala Ala Val Th~r 260 Lys 5 Gly Gin Leu Thr Ala Ser Ala Gly Val Ala 1.65 Giy Arg Lys Glu Tyr 245 Val Phe Gly Thr Asp Phe 70 Arg Tyr Ser Val Asp 150 Gin Gly Thr Gin Gly 230 Thr Pro Phe Ala Gly Ala His Lys 40 Thr Giy Ala Glu Gly Leu Gly Leu 105 Asn Ser 120 Gly Val Val Pro Lys Ile Glu Thr 185 Gly His 200 Trp Gly Gin Ser Gin Lys Ser Ser 26S Ala Ile Ala Giv Val Ala Ser Gly Asn Ala Phe Gly Tyr Thr Leu Cys Leu Val 140 Ala Gin iss Gin Met Lys Gin Thr Val Gly Asn 220 Gly Ile 235 Ser Asa Gly Gly Ala Gin Asp Ser Ser Ser Ala Val 125 Giu Gly Asp Tyr Ser 205 Leu Ala Pro Ser Val is Leu Tyr Thr Asn Lys Tyr Giy Tyr Met Thr 175 Ser His Glu Val Pro 255 INFORMATION FOR SEQ ID NO: 11: SEQUENCE CHLARACTERISTICS: LENGTHi 690 base pairs TYPE: nucleic acid STRANflEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) 
FEATURE: NAME/KEY: CUS LOCATION: 195. .689 (ix) FEATURE: .WO 93/25693 PCr/G B93/O1283 NAME/KEY: sigjpeptide LOCATION: 195..281 (ix) FEATURE: MAME/KEY: misc feature LOCATION: 1..690 OTER INFORMATION: /label. pNX7_insert (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: TTTTATTATA TCAATCTCTA ATTTATTTTi TTAGGAAAAA AATAAAAAAA TAAATATAAT AAATATTAGA GAGTAATATT TAAAAACAAA GAAATTTAAA AACGTTTATT TAGTTATTT TTTTACTGGT TAAAAAAAAA ATAAAAAACA AAATTAATAA AGATATITT GAAAAATATT GAATTAGAAA AAAA ATG AGA ACT ATT AAA TTC Met Arg Thr Ile Lys Phe
GCA
Ala
CAA
Gin
GAT
Asp
TCT
Ser
TCT
Ser
TCT
Ser
GCA
Ala
GTA
Val 125
GAA
Giu
GOT
Gly ACT OTT Thr Vai AGA TTA Arg Leu GOT TAC Gly Tyr ATG ACT Met Thr OTT AAC Val Asn CAA AAG Gin Lys ACT TAC Th Tyr 110 TAC GGT Tyr Gly TAC TAC Tyr Tyr AGA ATO Arg Met
OCT
Ala
ACC
Thr
AGT
Ser
CTC
Leu
COT
Arg
AAG
Lys
AGA
Arg
TOO
Trp
ATC
Ile
GTA
Val 160 0CC Ala
GGT
Gly
GAA
Glu so
AGT
Ser
AAC
Asn
ACC
Thr
ACT
Thr
CAA
Gin 130
GAA
Glu
ATT
Ile
CAA
Gin
AAT
Asn 35
ATC
Ile
GOT
Gly
ITC
Phe
GAT
Asp
GOT
Gly 115
AAC
Asn
OAT
Asp
OAT
Asp GGT OGA Gly Oly CAA ACC Gin Thr TTA OAT Leu Asp ACC TTC Thr Phe 0CC COT Ala Arg 85 AGC TAC Ser Tyr OCA AGT Ala Ser OGA OTT Gly Val OTT GAC Val ksp 150 OCT C Ala TTT TTC OCA OTA GCT ATT Phe Phe Ala Vai Ala Ile GGT GOT 0CC TCT OCT GGT Oly Gly Ala Ser Ala Gly CAA CAT AAO GOT OTA OCT Gin His Lys Gly Val Ala AAC ACC GOT GOT AGT GOT Asn Thr Gly Gly Ser Gly 55 AAG OCT GAA TGO AAT OCA Lys Ala Glu Trp Asn Ala COT GOT CTT GAC TTC GOT Arg Gly Leu Asp Phe Oly ATT GOA TTG GAT TAT ACT Ile Giy Leu Asp Tyr Thr 105 GOT AAC TCC COT CTC TGT Oly Asn Ser Arg Leu Cys 120 CAA GOT OTT CCA TTG OTA Gin Oly Val Pro Leu Val 135 140 TOG OTT CCA OAT OCA CAA Trp Vai Pro Asp Ala Gin 155 INFORMATION FOR SEQ ID NO: 12: t 1) SEQUENCE ~UW~I~blY LENGTH: 165 amino acids TYPE: amino acid WO 93/25693 -61- TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: PCT/GB93/O1 283 Met Arg Thr Ile Lys Phe Phe Phe Ala Val Ala Ile Ala Thr Val Ala 1 5 10 Lys Val Tyr Gly Gly Ala Gin Phe Ile 145 Ala Gin Giy Asa Giu Ile Ser Gly Asn Phe Thr Asp Thr Giy Gin Asn 130 Glu Asp Trp Gly Trp Ala Leu Tyr 100 Ser Arg Trp Giy Gin Leu Thr Ala Ser Ala Gly Val Gly Thr Asp Phe Arg Tyr Ser Val Asp 150 Gly Gly Gin His 40 Asn Thr Lys Ala Arg Giy Ile Gly Gly Asn.
120 Gin Gly 135 Trp Val Ser Gly Gly Trp Asp Asp Arg Pro Asp Gin Asp Ser Ser Ser Ala Val 125 Giu Giy Thr Ser Leu Arg Lys Arg Trp Ile Vai Thr Ile Asp Gly Ala 165 INFORMATION FOR SEQ ID NO: 13: SEQUENCE CHARACTERISTICS: LENGTH: 1337 base pairs TYPE: nucleic acid STR.TINDEDNESS: doubie TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..1014 (ix) FEATURE: NAME/KEY: misc feature LOCATION: 1. .1337 OTHER INFORMATION: /label= pNXB insert (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: ACT GCC AAT GGT AAA AAG TTT ACT GTC GGT AAT GGA CAA AAC CAA CAT Thr Ala Asn Gly Lys Lys Phe Thr Vai Gly Asn Gly Gin Asn Gin His 3. 5 10 is WO 93/25693 WO 93/5693r/GB93/O 1283 -62- GAA AAG GC? GTC AAC GAT GGT TTC ACT TAT Lys Giy Val Asn Asp Gly Phe Ser GGT GGT AAC Cly Gly Asn GAA TGG AAT Giu ?rp, Asn GAC TTC Leu Asp Phie TIA GAT TAT Leu Asp Tyr TCC CCT CTC Ser Arg Leu GT? CC? TTA Val. Pro Leu CCA GAT GCA Pro Asp Ala 130 AT? TC CAA Ile Phe Gin 145 ACC T AAG Thx Phe Lys CAT AT? ACT His Ile Thr GGT AT? GC? Gly Ile Gly 195 AG? AG? GGT Ser Ser Gly 210 AAG GGT TC? Lys Giy Ser 225 ACT ACT ACT Thr Thr Thr GCT AGA AT? Ala Arg Ile GTT GT? ?AC Val Val Tyr 275
GGT
Gly
GCA
Ala
GG?
Giy
C
Ala
?GT
Cys 100
GTA
Val
CAA
Gln
ATG,
Met
CAA
Gin
CTC
Val 180
AAC
Asa
CT?
Val
AG?
Ser
CC?
Arg
ACT
Thr 260
ATC
Ser Met C? GT Ala Val TCT CAA Ser Gin 70 C? ACT Ala Thr GTA TAC Val. Tyr GAA TAC Giu Tyr GGA AAA Gly Lys GA? CAC Asp His I50 TAC TC Tyr Phe 165 TCA GA? Ser Asp
TAT
Leu Tyr C? GA? Ala Asp CCA CC Pro Ala 230 ACC AAG ?hr Lys 245 GC CAA Ala Gin ACT CC Thr Leu 40 AAC CC? Asn Arg 55 AMC AAC Lys Lys TAC AAA Tyr Lys GGA TCG Giy ?rp TAC ATC Tyr Ile 120 ATG GA Met Val 135 ACT CC? Thr Ciy ACT GTC Ser Val CAC TT? His Phe GAA CT? Ciu Val 200 CC ACC Vai ?hr 215 ACC TCT Thr Ser TC? CT? Ser Leu GGT TAC Cly Tyr Tyr Giu 25 CC? ACT Gly Ser CC? AAC Gly Asn GCA ACC Ala ?hr CAA AC? Gin ?hr 90 TTC CAA Phe Gin 105 AT? GMA Ile Giu ACC AT? Thr Ile CCA ACT Pro ?hr CC? CAA Arg Gin 170 AAC GAA Lys Glu 185 C? TC Ala Leu TA TTA Leu Leu CCC CC? Ala Ala CCA ACC Pro Thr 250 AAC ?GT Lys Cys 265 ATC TGC Ile Trp CC? CCA Cly Ala TC CT? Phe Leu GA? TAC Asp Tyr 75 CCC ACT Ala Ser AAC CC? Asn Arg GA? TCG Asp ?rp CAT GGA Asp Gly 140 ATC AAT Ile Asn 155 CAA AAG Gin Lys TGG GCC ?rp Ala AAC CC Asn Ala CAT CT? Asp Val 220 CC? CC? Pro Arg 235 AAT TAC Asn Tyr ?CT AC Cys Ser TTA GA? AAC Leu Asp Ann AC? TTC AAG Thr Phe Lys CCC CC? CC? Ala Arg Arg CAC ?AC AT? Asp Tyr Ile CCA ACT CC? Ala Ser Gly GGA CT? AA? Cly Leu Ann 110 CT? GAC TCG Val Asp Trp 125 C? CAA TAT Ala Gin Tyr CC? CW? ACT Cly Giy Ser AGA AC? TC? Arg Thr Ser 175 AAA CAA CC? Lys Gin Gly 190 CMA CC? CG Ciu Gly ?rp 205 TAC ACA AC? Tyr Thr ?hr ACT ACT ACC ?hr ?hr ?hr AAT AAC TCT Asa Lys Cys 255 GA? CCA AA? Asp Pro An 270
ACT
Thr
C?
Ala
CC?
Ciy
CGA
Gly
AAIC
Ann
GCC
Gly
CT?
Val
AAG
Lys
GAA
Giu 160
CC?
Gly
TGC
?rp
CAA
Gin
CCA
Pro
CC?
Arg 240 Ser
TC?
Cys 96 144 192 240 288 336 384 432 480 528 576 624 672 720 768 816 TAC AC? GAT GAG Tyr ?hr Asp Giu CC? ACC ?GG CC? Gly ?hr Trp Gly CAA AAC AAC Glu Asn Asn WO 93/25693 WO 9325693PCY/GB93/01283 -63- GAC TGG TGT GGT TGT GGT GTZ' GAA CAA TGT TCT T Asp Trp Cys Gly Cys Gly Val Glu Gin Cys Ser S 290 295 3 CAA GGT TAC AAG TGT TGT AGC CAT CC-A AAT TGC G Gin Gly Tyr Lys Cys Cys Ser Asp Pro Asa Cys V 305 310 315 GAT GAC GAT GOT AAA TGG GOT OTT GAA AAC AAC G.
Asp Asp Asp Gly Lys Trp Gly Val Glu Asn Asa A 325 330 GOT TTC TAAGCAGTAA, AATACTAATT AATAAAAAAT TA Gly Phe AAATTTAAAA ATTTAAAAGA ATTATGAA'A ATTTAAATTTA TTTAGTAAAA AATTAAAGAA TTATTGAAAA TTTTAAATGT TT1'GTAAAAA AAAATGAAAG AATTATGAAA AATTAAAATG T ATTrGTAAGA AAAATAAAGA ATTATAAAAA AAATAAAGAAT AAGAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA INFORMATION FOR SEQ ID NO: 14: Wi SEQUENCE CHARACTERISTICS: LENGTH: 338 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: CC AAG er Lys 00 TT OTT al Val AC TG ap Trp
GAATTA
ATC ACT TCT Ile Thr Ser TTC TAC ACT Phe Tyr Thr 320 TOT GOT TGT Cys Gly Cys 335
TGAAAAATTT
912 960 1008 1064 1124 1184 1244 1304 1337 AAAATTTAA AAAAAACTAA PAAATTTAA AAAATACAAA AAAAGTTTA AAAAATACAA rATGAAAAA CCC7,AATGTA Thr Ala Asa Gly Lys Lys Phe Th: Val 1 5 Gly 10 Asn Oly Gin Asa Gin His is Leu Thr Ala Asp Ala Gly Val 125 Ala Gly
ASP
Phe Arg Tyr Ser Leu 110 Asp Gin Gly WO 93/25693 WO 9325693PCr/C B93/01283 -64- Thr Phe Lys Gin Tyr Phe Ser Val Arg Gin Gin Lys Arg Thr Ser Gly His Gly S er Lys 225 Thr Al a Val Asp Gin 305 Asp Giy Ile Thr Ile Gly 195 Ser Giy 210 Giy Ser Th'r Thr Arg Ile Val. Tyr 27S Trp Cys 290 Gly Tyr Asp Asp Phe Vai Asn Vai Ser Arg 260 Gly Lys Gly Ser Asp Leu Tyr Ala Asp Pro Ala 230 Thr Lys 245 Ala Gin Thr Asp Cys Giy Cys Cys 310 Lys Trp 32S Phe Lys 185 Vai Ala 200 Thr Leu Ser Ala Leu Pro Tyr Lys 265 Asp Giy 280 Giu Gin Asp Pro Val. Glu 170 Giu Leu Leu Ala Thr 250 Cys Thz Cys Asn Asn 330 Ala Ala Val 220 Arg Tyr Ser Gly Ser 300 Val Asp Lys GiiU 205 Tyr Thr Asn Asp Vai 285 Lys Val Trp Gin 190 Gly Thr Thr Lys Pro 270 Giu, Ile Phe Cys 175 Gly Trp Thr Thr Cys 255 Asn Asn Thr
TY-
Giy 335 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 846 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 1. .846 (ix) FEATURE: NAME/ KEY misc feature LOCATION: 1. .846 OTHER INFORMATION: /label= pNX_insert (xi) SEQUENCE DESCRIPTION: SEQ ID NO: IS: ACT GCC AAT GOT AAA AAG TTT ACT OTC GGT AAT GGA CAA AAC CAA CAT Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gin Asn Gin His I. 5 10 1s AAG GOT GTC AAC OAT GOT TTC AGT TAT GP.A ATC TGG TTA GAT AAC ACT Lys Gly Val Asn Asp Giy Phe Ser Tyr Giu Ile Trp Leu Asp Asn Thr 25 WO 93/25693 WO 9325693PCT/G B93/O 1283
GT
Gly
GAA
Giu
CTT
Leu
TTA
Lett
TCC
Ser Val
CCA
Pro
ATT
Ile 145
ACC
Tbhr
CAT
His
GGT
Gly
AGT
Ser
P.AG
Lys 225
ACT
Thr
GCT
Al a
GTT
Val.
GGT
Gly
TGG
Trp
GAC
Asp
GAT
Asp
COT
Arg
CCT
Pro
OAT
Asp 130
TTC
Phe Phe
ATT
Ile
ATT
Ile
AGT
Ser 210
GOT
Gly
ACT
Thr
AGA
Arg
OTT
Val
AAC
Asn
AAT
Asn
TTC
Phe
TAT
Tyr
CTC
Leu
TTA
Leu 115 G CA Ala
CAA
Gin
AAG
Lays
ACT
Thr
GT
Gly 195
GOT
Gly
TCT
Ser
ACT
Thr
ATT
Ile
TAC
Tyr 275 GGT TCT Gly Ser GCA GCT Ala Ala GOT TCT Gly Ser OCT GCT Ala Ala 8s TGT GTA Cys Val 100 OTA GAA Vai Giu CAA GGA Gin Giy ATG CAT Met Asp CAA TAC Gin Tyr 165 OTC TCA Val Ser 180 AAC OTT Asa Leu OTT OCT Vai Ala AGT CCA Ser Pro CGT ACC Arg Thr 245 ACT OCT Thr Al a 260 TAC ACT Tyr Thr
ATG
Met
GTT
Val.
CAT4 Gin 70
ACT
Thr
TAC
Tyr
TAO
Tyr
AAA
Lys
CAC
His 1S0
TTC
Phe
OAT
Asp
TAT
Tyr
GAT
Asp 0CC Ala 230
AAG
Lys
CAA
Gin
CAT
Asp ACT CTC Thr Leu 40 AAC CGT A.sf Arg AAG AAG Lys Lye TAO AAA Tyr Lys OGA TG Gly Trp TAO ATC Tyr Ile 120 ATG OTA Met Val 135 ACT GOT Thr Gly AGT OTO Ser Vai CAC TTT His Phe GAA OTT Giu Val.
200 OTO ACC Vai Thr 215 ACC TCT Thr Ser TCT CTT Ser Leu GOT TAC Gly Tyr GAG OAT Giu Asp 280
GOT
Gly
GT
Gly
OCA
Ala
CAA
Gin
TTO
Phe 105
ATT
Ile
ACC
Thr
CCA
Pro
COT
Arg
AAG,
Lye 185
OCT
Ala
TTA
Leu 0CC Ala
CCA
Pro
AAO
Lys 265
GT
Gly AOT GOT OCA ACT TTC AAG OCT Ser
AAC
ACC
Thr
ACT
Thr 90
CAA
Gin
GAA
ATT
Ile
ACT
Thr
CAA
Gin 170
GAA
011.
TTO
Leu
TTA
Leu
GCT
Ala
ACC
Thr 250
TOT
Cys
ACC
Thr Gly Ala Thr Phe Lye Ala
TTC
Phe
GAT
AST)
75
GCC
Ala
AAC
Asn
GAT
Asp
OAT
Asp
ATC
Ile 155
CAA
Gin
TG
Trp
AAC
OAT
Asp
CCT
Pro 235
AAT
Asn
TOT
Cys
COT
Arg
TAC
Tyr
AOT
Ser
CTT
Leu 110
GAO
Asp
CAA
Gin
GT
Giy
ACT
Thr
CAA
Gin 190
GT
Gly
ACA
Thr
ACT
Thr
AAG
Lye
CCA
Pro 270
COT
Arg
ATT
Ile
GOT
Gly
AAT
Asn
TOG
Trp
TAT
Tyr
AGT
Ser TOT*11 S er 175
GOT
Gly
TOG
Trp
ACT
Thr
ACC
Thr
TOT
Cys 255
AAT
GOT
Gly
OGA
Oiy
AAC
Asn
GOC
Gly
OTT
Val.
AAO
Lye
OAA
Giu 160
GOT
Gly
TGO
Trp
CAA
Gin
CCA
Pro
COT
Arg 240
TOT
Ser
TOT
Cys mmmmmmr--.
I WO 93/25693 PCT/GB93/O1 283 -66- INFORMATION FOR SEQ ID NO: 16: SEQUENCE CHARACTERISTICS LENGTH: 282 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECIE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: Thr Ala Asn Gly Lys Lys Phe Thr Val Asn Gly Gin Asn Gin His Lys Gly Glu Leu Leu Ser Val Pro Ile 145 Thr His Gly Ser Lys 225 Thr Ala Gly Gly Trp Asp Asp Arg Pro Asp 130 Phe Phe Ile Ile Ser 210 Gly Thr Arg Val Asn Asn Phe Tyr Leu Leu 115 Ala Gin Lys Thr Gly 195 Gly Ser Thr lie Gly Met Val Gin 70 Thr Tyr Tyr Lys His 150 Phe Asp Tyr Asp Ala 230 Lys Gin Tyr Gly Gly Ala Gin Phe 105 Ile Thr Pro Arg Lys 185 Ala Leu Ala Pro Lys 265 Ile Gly Phe Asp Ala Asn Asp Asp Ile 155 Gln Trp Asn Asp Pro 235 As Cys Leu Thr Ala Asp Ala Gly Val 125 Ala Gly Arg Lys Glu 205 Tyr Thr Asn Asp Asp Phe Arg Tyr Ser Leu 110 Asp Gin Gly Thr Gin 190 Gly Thr Thr Lys Pro 270 Val Val Tyr Tyr Thr Asp Glu Asr Gly Thr JI I WO 93/25693 WO 9325693PCr/GB93/OI 283 -67- INFORMATION FOR SEQ ID NO: 17; SEQUENCE CHARACTERISTICS: LENGTH: 708 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 708 (ix) FEATURE.: NAME/KEY: misc-feature LOCATION: 708 OTHER INFORMATION: /label= pNX1O insert (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: ACT GCC AAT GGT AAA AAG TTT ACT Thr Ala Asn Gly Lys Lys Phe Thr 1
AAG
Lys
GGT
Gly
GAA
Glu
OTT
Leu Leu
TCC
Ser
GTT
Val
CCA
Pro
ATT
Ile 145
ACC
.hr 5 GTC AAC GAT Val Asa Asp 20 AAC GGT TCT Asn Gly Ser 35 AAT GCA GCT Asn. Ala Ala TTC GOT TOT Phe Gly Ser TAT GCT OCT Tyr Ala Ala CTC TOT GTA Leu Cys Val 100 TTA GTA GAA Leu Val Glu 115 GCA CAA GGA Ala Gin Gly CAA ATG CAT Gin Met Asp AAG CAA TAO Lys Gln Tyr 165
GOT
Gly
ATG
Met Val
CAA
Gln 70
ACT
Th~r
TAC
Tyr
TAC
Tyr
AAA
Lys
CAC
His 150
TTC
Phe GTC GOT Val Gly 10 TAT GAA Tyr Giu 25 GOT AGT Gly Ser GOT AAC Gly Asn GCA ACC Ala. Thr CAA ACT Gin Thr 90 TTC CAA Phe Gin 105 ATT GAA Ile Glu ACC AT1' Thr Ile CCA ACT Pro Thr CGT CAA Arg Gln 170 AAT OGA CAA .AAC Asn Gly Gln Asn ATC TOG TTA CAT Ile Trp Leu Asp WOT GCA ACT TTC Gly Ala Thx Phe TTC OTT GCC COT Phe Leu Ala Arg GAT TAC CAC TAC Asp Tyr Asp Tyr 75 GCC AGT GCA AGT Ala Ser Ala Ser AAC COT OCA OTT Asn Arg Gly Leu 110 CAT TOO GTT CAC Asp Trp Val Asp 125 CAT OCA OCT CAA Asp Gly Ala Gin 140 ATC AAT GOT GOT Ile Asn Gly Oly 155 CAA AAG AGA ACT Gin Lys Axg Thr CAA CAT Gln His is AAC ACT Asn Thr AAG OCT Lys Ala COT GGT Arg Gly ATT GGA L..j Gly s0 GOT AAC Gly Asn AAT G00 Asn Gly TGO OTT Trp, Val TAT AAG Tyr Lys AGT GA~A Ser Olu 160 TCT GOT Ser Oly 175 WO 93/25693 PCr/GB193/01283 -68- ATT ACT GTC TCA CAT CAC Trr AAG GAA TOG GCC AAA CAA GGT TGG Ile Thr Val Ser Asp His Phe Lys Giu Trp Ala Lys Gin Gly Trp 180 185 190 AT? GGT A.C CT? TAT GAA GTT GCT TG AAC GCC GAA GGT TGG CM Ile Gly Asn Leu Tyr Glu Val Ala Leu Asn Ala Giu Gly Trp Gin 195 200 205 AGT GGT OTT GCT GAT GTC ACC TTA TTA GAT GTT TAC ACA ACT CCA Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thx Pro 210 215 220 GGT TCT AG? CCA GCC ACC TCT GCC OCT CC? CGT Gly Ssr Ser Pro Ala Thr Ser Ala Ala Pro Arg 230 235 INFORMATION FOR SEQ ID NO: I8 SEQUENCE CHARACTERISTICS: LENGTH: 236 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION; SEQ ID NO: 18: Thr Ala Asn Gly Lys Lys Phe Thr Val Lye Cly Glu Leu Leu Ser Val Pro Ile 145 Thr His oly Met Val Gln 70 Thr Tyr Tyr Lys His
ISO
Phe Asp Phe Thr Asn Lye Tyr Gly Tyr Met 135 Thr Ser His Tyr Oly Gly Ala Gin Phe 105 lie Thr Pro Arg Lys 185 Gly Ciii Ser Ann Thr Thr Gln Glu Ile Thr Gin 170 Glu Leu Thr Ala Asp Ala Cly Val 125 Ala Oly Arg Lys Asp Asn Thr Phe Lys Ala Arg Arg Gly Tyr Ile Gly Ser Gly Asn Leu Ann Oly 110 Asp Trp Val Gin Tyr Lys Gly Ser Glu 160 Thr Ser Cly 175 Gin Gly Trp 190 Asn Gly Gin An Gin His Gly Ile Cly Asn Leu Tyr Clu 195 Val Ala Leu Asn Ala G1U Giy Trp Gln 200 205 Ser Ser Gly Val Ala Asp Val Thr Leu Leu Asp Val Tyr Thr Thr Pro 210 215 220 Lys Gly Ser Ser Pro Ala Thr Ser Ala Ala Pro Arg 225 230 235
SUMMARY OF SEQUENCE LISTINGS
SEQ ID NO: 1 pNX1 DNA and coding region
SEQ ID NO: 2 Protein sequence of SEQ ID NO: 1
SEQ ID NO: 3 pNX3 DNA and coding region
SEQ ID NO: 4 Protein sequence of SEQ ID NO: 3
SEQ ID NO: 5 pNX4 DNA and coding region
SEQ ID NO: 6 Protein sequence of SEQ ID NO: 5
SEQ ID NO: 7 pNX5 DNA and coding region
SEQ ID NO: 8 Protein sequence of SEQ ID NO: 7
SEQ ID NO: 9 pNX6 DNA and coding region
SEQ ID NO: 10 Protein sequence of SEQ ID NO: 9
SEQ ID NO: 11 pNX7 DNA and coding region
SEQ ID NO: 12 Protein sequence of SEQ ID NO: 11
SEQ ID NO: 13 pNX8 DNA and coding region
SEQ ID NO: 14 Protein sequence of SEQ ID NO: 13
SEQ ID NO: 15 pNX9 DNA and coding region
SEQ ID NO: 16 Protein sequence of SEQ ID NO: 15
SEQ ID NO: 17 pNX10 DNA and coding region
SEQ ID NO: 18 Protein sequence of SEQ ID NO: 17
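The summary pairs each DNA listing with the protein sequence read from its coding region. As an illustrative sketch only (it is not part of the specification), the following Python translates the first sixteen codons printed for SEQ ID NO: 13 using the standard genetic code, and checks that the CDS spans given in the feature tables agree with the stated protein lengths. The function names, and the choice of the standard codon table, are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter residues."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def cds_codons(start: int, end: int) -> int:
    """Number of whole codons in a 1-based, inclusive CDS span."""
    return (end - start + 1) // 3


# First sixteen codons of SEQ ID NO: 13 as printed in the listing.
SEQ13_START = "ACT GCC AAT GGT AAA AAG TTT ACT GTC GGT AAT GGA CAA AAC CAA CAT"

# DNA SEQ ID NO -> (CDS start, CDS end, stated length of the paired protein).
CDS_SPANS = {
    9: (195, 1001, 269),
    11: (195, 689, 165),
    13: (1, 1014, 338),
    15: (1, 846, 282),
}

if __name__ == "__main__":
    protein = translate(SEQ13_START)
    # Prints: Thr Ala Asn Gly Lys Lys Phe Thr Val Gly Asn Gly Gln Asn Gln His
    print(" ".join(THREE_LETTER[aa] for aa in protein))
    for seq_id, (start, end, stated) in CDS_SPANS.items():
        print(seq_id, cds_codons(start, end), stated)
```

The translated residues match those printed beneath the codons in the listing, and each CDS span divides into exactly the number of codons stated for the paired protein sequence.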

Claims (27)

1. A xylanase which has at least one catalytic domain which is substantially homologous with a xylanase of an anaerobic fungus of the genus Neocallimastix and which is (i) not a full length natural xylanase and (ii) capable of being expressed in a suitable host at a substantially higher level than a full length natural xylanase.
2. A xylanase as claimed in claim 1, wherein at least one catalytic domain is identical to a catalytic domain of a natural xylanase from an anaerobic fungus of the genus Neocallimastix.
3. A xylanase as claimed in claim 1 or 2, wherein the anaerobic fungus is a rumen fungus.
4. A xylanase as claimed in any one of the preceding claims, wherein the fungus is Neocallimastix patriciarum.
5. A xylanase as claimed in any one of claims 1 to 4, which is derived from a xylanase of the genus Neocallimastix having the structure (from the N-terminus to the C-terminus): CAT1-LINK1-CAT2-LINK2-CTR1-CTR2 wherein: CAT1 represents a first catalytic domain, CAT2 represents a second catalytic domain, LINK1 represents a first linker, LINK2 represents a second linker, CTR1 represents a first C-terminal repeat, and CTR2 represents a second C-terminal repeat.
6. A xylanase as claimed in claim 5, wherein CAT1 has a sequence which is identical or otherwise substantially homologous to the sequence: RLTVGN ASVNRGNrLARRCLDFGSQK IQ<.TDYSYIGLDYTATYRQTG SASGNSFLCVYGWFQNRGVQ CVPLVEYYIEDWVD:WVFDA QGIRMVT1DGAQYK1FQN1IDIJT GPTINGCS86'rr-KQY2FSVRQQ KRTSGHITVSDHFXEWAI(QG WGIGNLYEVALNAEGWQSSG a)VTMCDVYMlQ!CSNPAP.
7. A xylanase as claimed in claim 5 or 6, wherein CAT2 has a sequence which is identical or otherwise substantially homologous to the sequence: Ir~l~ I K VI'VGNGQNQI IKGNIG llsYIVKN1':;N;SM'r A KA EWNAAVNRCI NllARRGIA,I)F(CSQKKATI)YI)YI; IYA' YKXI'ASASGNSI(VYCVFQNRLNG VI)VlYYIIEI])W\) WVVJ'IAQG KNIV'1'11)(' AQYKIFQNIDI IT( PTINGGSF-TFKQYF SVRQQKRSG(;I IIITVSKI IFKHX \VAKQG\V(;IGNINVANAEG WQSS(W'VAI)VTI' 1,1 )VY'FI'PI)K(G SSPA.
8. A xylanase as claimed in claim 5, 6 or 7, wherein LINK1 has a sequence which is identical or otherwise substantially homologous to the sequence: TISr('VPSSSAt( CS'I I N. K
9. A xylanase as claimed in any one of claims 5 to 8, wherein LINK2 has a sequence which is identical or otherwise substantially homologous to the sequence: sequence which is identical or otherwise substantially homologous to the sequence: :.SAIZITP'A VPTR ](;'rW(;VlIN N WWC
10. A xylanase as claimed in any one of claims 5 to 11, wherein CTR has a sequence which is identical or otherwise substantially homologous to the sequence: vscSAI (YKCCSPINCV VYT) )T(WEV; 1NNDX\VCGCG.( 1:
12. A xylanase as claimed in any one of claims 5 to 11 comprising a catalytic domain which is substantially homologous with at least one of CAT1 and CAT2 and is missing at least part of the amino acid sequence downstream (towards the C-terminus) of CAT2.
13. A xylanase as claimed in claim 12, wherein at least part of CTR2 is missing.
14. A xylanase as claimed in claim 12 or 13, wherein at least part of CTR1 is missing.
15. A xylanase as claimed in any one of claims 5 to 14, which has the structure:
CAT1-LINK1-CAT2-LINK2-CTR1 (truncated);
CAT1-LINK1-CAT2-LINK2 (truncated);
LINK1 (truncated)-CAT2-LINK2 (truncated);
CAT1-LINK1 (truncated);
CAT1 (truncated);
LINK1 (truncated)-CAT2-LINK2-CTR1-CTR2;
LINK1 (truncated)-CAT2-LINK2-CTR1 (truncated); or
LINK1 (truncated)-CAT2 (truncated).
16. A xylanase as claimed in claim 14, which has the structure: LINK1 (truncated)-CAT2-LINK2 (truncated).
17. An isolated or recombinant DNA molecule encoding a xylanase which has a catalytic domain substantially homologous with a xylanase of an anaerobic fungus of the genus Neocallimastix, provided that the DNA molecule does not comprise a full length copy of natural mRNA encoding the xylanase and is capable of being expressed in a suitable host at a substantially higher level than a full length copy of natural mRNA encoding the xylanase.
18. A DNA molecule as claimed in claim 17, wherein the absent portion, or one of the absent portions, of the DNA corresponds to the 3' and/or 5' untranslated region of the mRNA.
19. A DNA molecule as claimed in claim 17 or 18, which is derived from a DNA molecule having the following structure: 5'utr-sig-cat1-link1-cat2-link2-ctr1-ctr2-3'utr, wherein: 5'utr represents a 5' untranslated region; sig encodes a signal peptide; cat1 encodes a first catalytic domain; link1 encodes a first linker sequence; cat2 encodes a second catalytic domain; link2 encodes a second linker sequence; ctr1 encodes a first C-terminal repeat; ctr2 encodes a second C-terminal repeat; and 3'utr represents a 3' untranslated region.
20. A DNA sequence as claimed in claim 19, wherein the 5'utr segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: =T TATVA TCAATCTCTA ATr-IArrr TrAGGAAAAA AATAAAAAA TAAATATA-AT AAATrAGA GAGIAATA'EV 'rAAAAACAAA GAAATAAA AACGTITTf TAGTrTA=In TTITACTGGT'TAAAAAAAA ATAAAAAACA AAATTAATAA AGATAfl=r GAAAATT GAVIIAGAAA AAAA.
21. A DNA sequence as claimed in claim 19 or 20, wherein the sig segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: ATCAGAACTA1TAAAT7CI-TITCGCAGTAGCTATTCAACTGTTG L ~-1 GTAA(I(ICG (IAAT((('I'(IIAGGT(TGGG AA.
22. A DNA sequence as claimed in claim 19, 20 or 21, wherein the cat1 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: AC.ITAAC(IT.,,'cCIGT,'AATCI, I'IA A 'P''IA Y (I I"'A(,'IAA(IA (CAA G GII' T' CA'( I'A(AAI'AC'I'A(;iATCA'f'1(AA(IA'1FI'( I( I'V'C IA('I'C ICI( IF1'( :ACATI'(A( *TGAAGATCAAT(' V.'
23. A DNA sequence as claimed in any one of claims 19 to 22, wherein the link1 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: AG TC
24. A DNA sequence as claimed in any one of claims 19 to 23, wherein the cat2 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: AACIT T(.G'T'AAT (A(,AAAAG(,AA(,AI'AA( ('1'ICACI'' ~FICCTTIA ACC(IT( iCAr[' AAG T(;GT((T C AA'FP, AT,A'A (IC AAT(IAAA(AAAF(TACECITII(C1CINIAACATW (,(,(,CAAA(CAA(,,3,i*(.(,('CI'A'r'('CGTAAC(,,rrrr(CAAGi'C'IrrrAA('CIA(II1 GTIACICICAGGG.
25. A DNA sequence as claimed in any one of claims 19 to 24, wherein the link2 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: AAITACANAAC
26. A DNA sequence as claimed in any one of claims 19 to 25, wherein the ctr1 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence:
27. A DNA sequence as claimed in any one of claims 19 to 26, wherein the ctr2 segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: GACAT AG'1'1'AA'r;AA(A(,AG-r(.AA(;'I'( ArTAGAT AA.A((A,''GG
28. A DNA sequence as claimed in any one of claims 19 to 27, wherein the 3'utr segment has a sequence which is identical to or otherwise substantially homologous with the following sequence: 20 1I'AAC;C.A.PAAAAI'i C''AVNrI'AA AAAAI'IAAA(UAIA'rGAAAAATTTiAAA'i'171AAAAA'"FPAAAAG.AAITAT](,AAAAAITI'"IA ,\A'I'1I"'AAAAA'11rAAAAAAAACI'AAII-I'A(P'AAAAAA''I'AAA(.AAI"'A'1'TcixjAAA'I"I-r-'A AA''GTIAAAAA'171''AAAAAA'IACAAA'rFFFC' TAAAAAAAAATC AAAMAATTATG AAAAIFA AAAT']'AAAA(,'I'II'AAAAAA'IA(,AAATrT'I'U1''AA(.AAAAA'riAAAAFPI'ArAAAAAAAAA 25 AAG AA'ITl\AT(,AAAAA(;(,(AAAT'r('PAAA('AAAAAAAAAAAAAAAAAAAAAiiAAAAAAA.
29. A DNA sequence as claimed in any one of claims 17 to 28 encoding a xylanase as claimed in any one of claims 1 to 16.
30. A DNA sequence as claimed in any one of claims 19 to 29 which comprises the following segments: 'r-igcI1-ll cI-ik-li(rnae) Wir-sig-c(I I -linki -cil2-Iin k2(trunicated); linki (trunicated)-c.aI2-liink2(truncated); i-hinki (truncated); 1 (truncated), liniki (truncated)-cat2-Jinik2-cti -ctir2-3'u Ii'; 1hiki (truilcated)-c.aI2-Iiik2-cIrl (truncated); linkl (truncated)-cul2(truncated).
31. A DNA molecule as claimed in any one of claims 17 to 30, which is in the form of a vector.
32. A DNA molecule as claimed in claim 31, wherein the vector is a plasmid.
33. A DNA molecule as claimed in claim 31 or 32, wherein the vector is an expression vector.
34. A DNA molecule which is, or comprises the insert of, plasmid pNX3, pNX4, pNX6, pNX7, pNX8, pNX9 or pNX10, as defined herein.
35. A DNA molecule which is, or comprises the insert of, plasmid pNX5, pNX9 or pNX10, as defined herein.
36. A host cell transfected or transformed with a DNA molecule as claimed in any one of claims 17 to
37. The use of a xylanase as claimed in any one of claims 1 to 16 in the modification of baked products.
38. The use of a xylanase as claimed in any one of claims 1 to 16 as an enzyme supplement for animal feed.
39. The use of a xylanase as claimed in any one of claims 1 to 16 as an impurity remover in pulp.
40. The use of a xylanase as claimed in any one of claims 1 to 16 in the prebleaching of kraft pulp.
Dated this first day of July 1998
UNIVERSITY OF NEWCASTLE-UPON-TYNE and BIOTECHNOLOGY AND BIOLOGICAL SCIENCES RESEARCH COUNCIL
Patent Attorneys for the Applicant: F B RICE & CO
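The truncated constructs recited in claim 15 are all contiguous stretches of the six-part architecture CAT1-LINK1-CAT2-LINK2-CTR1-CTR2 defined in claim 5, with truncation occurring only in a terminal segment. A minimal Python sketch of that relationship follows. The variant strings are transcribed from claim 15 as read here, and the helper names are hypothetical; nothing in this sketch is part of the claims.

```python
# Full domain order from claim 5, N-terminus to C-terminus.
FULL = ["CAT1", "LINK1", "CAT2", "LINK2", "CTR1", "CTR2"]
MARK = "(truncated)"


def parse(structure: str):
    """Split e.g. 'CAT1-LINK1 (truncated)' into [(name, is_truncated), ...]."""
    parts = []
    for token in structure.split("-"):
        token = token.strip()
        truncated = token.endswith(MARK)
        name = token[: -len(MARK)].strip() if truncated else token
        parts.append((name, truncated))
    return parts


def is_valid_fragment(structure: str) -> bool:
    """True if the structure is a contiguous run of FULL whose interior
    segments are intact (only the first or last segment may be truncated)."""
    parts = parse(structure)
    names = [name for name, _ in parts]
    if names[0] not in FULL:
        return False
    start = FULL.index(names[0])
    if FULL[start:start + len(names)] != names:
        return False
    return not any(truncated for _, truncated in parts[1:-1])


# Structures listed in claim 15, as read for this example.
VARIANTS = [
    "CAT1-LINK1-CAT2-LINK2-CTR1 (truncated)",
    "CAT1-LINK1-CAT2-LINK2 (truncated)",
    "LINK1 (truncated)-CAT2-LINK2 (truncated)",
    "CAT1-LINK1 (truncated)",
    "CAT1 (truncated)",
    "LINK1 (truncated)-CAT2-LINK2-CTR1-CTR2",
    "LINK1 (truncated)-CAT2-LINK2-CTR1 (truncated)",
    "LINK1 (truncated)-CAT2 (truncated)",
]

if __name__ == "__main__":
    for variant in VARIANTS:
        print(variant, is_valid_fragment(variant))
```

Every listed variant passes the check, whereas a structure that skips a segment (for example CAT1-CAT2) or truncates an interior segment does not.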
AU43479/93A 1992-06-17 1993-06-17 Recombinant xylanases Ceased AU696768B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU43479/93A AU696768B2 (en) 1992-06-17 1993-06-17 Recombinant xylanases

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
AUPL298592 1992-06-17
AUPL2985 1992-06-17
PCT/GB1993/001283 WO1993025693A1 (en) 1992-06-17 1993-06-17 Recombinant xylanases
AU43479/93A AU696768B2 (en) 1992-06-17 1993-06-17 Recombinant xylanases

Publications (2)

Publication Number Publication Date
AU4347993A AU4347993A (en) 1994-01-04
AU696768B2 true AU696768B2 (en) 1998-09-17

Family

ID=25626386

Family Applications (1)

Application Number Title Priority Date Filing Date
AU43479/93A Ceased AU696768B2 (en) 1992-06-17 1993-06-17 Recombinant xylanases

Country Status (1)

Country Link
AU (1) AU696768B2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1995011981A1 (en) * 1993-10-26 1995-05-04 Commonwealth Scientific And Industrial Research Organisation Cultivation process and constructs for use therein

Also Published As

Publication number Publication date
AU4347993A (en) 1994-01-04

Legal Events

Date Code Title Description
MK14 Patent ceased section 143(a) (annual fees not paid) or expired