AU727294B2 - Regulation of gene expression in plants

Publication number
AU727294B2
Authority
AU
Australia
Prior art keywords
sequence
leu
ser
gly
ala
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU89670/98A
Other versions
AU8967098A (en)
Inventor
Zhongyi Li
Matthew Morell
Sadequr Rahman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Commonwealth Scientific and Industrial Research Organization CSIRO
Biogemma SAS
Original Assignee
Commonwealth Scientific and Industrial Research Organization CSIRO
Goodman Fielder Pty Ltd
Groupe Limagrain Pacific Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AUPO9108A external-priority patent/AUPO910897A0/en
Priority claimed from AUPP2509A external-priority patent/AUPP250998A0/en
Application filed by Commonwealth Scientific and Industrial Research Organization CSIRO, Goodman Fielder Pty Ltd, Groupe Limagrain Pacific Pty Ltd filed Critical Commonwealth Scientific and Industrial Research Organization CSIRO
Priority to AU89670/98A priority Critical patent/AU727294B2/en
Priority claimed from PCT/AU1998/000743 external-priority patent/WO1999014314A1/en
Publication of AU8967098A publication Critical patent/AU8967098A/en
Application granted granted Critical
Publication of AU727294B2 publication Critical patent/AU727294B2/en
Assigned to COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH ORGANISATION, BIOGEMMA SAS, GOODMAN FIELDER PTY LIMITED reassignment COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH ORGANISATION Alteration of Name(s) in Register under S187 Assignors: COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH ORGANISATION, GOODMAN FIELDER LIMITED, GROUPE LIMAGRAIN PACIFIC PTY LIMITED
Assigned to COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH ORGANISATION, BIOGEMMA SAS reassignment COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH ORGANISATION Alteration of Name(s) in Register under S187 Assignors: BIOGEMMA SAS, COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH ORGANISATION, GOODMAN FIELDER PTY LIMITED
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Landscapes

  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)

Description

WO 99/14314 PCT/AU98/00743
REGULATION OF GENE EXPRESSION IN PLANTS
This invention relates to methods of modulating the expression of desired genes in plants, and to DNA sequences and genetic constructs for use in these methods.
In particular, the invention relates to methods and constructs for targeting of expression specifically to the endosperm of the seeds of cereal plants such as wheat, and for modulating the time of expression in the target tissue.
This is achieved by the use of promoter sequences from enzymes of the starch biosynthetic pathway. In a preferred embodiment of the invention, the sequences and/or promoters are those of starch branching enzyme I, starch branching enzyme II, soluble starch synthase I, and starch debranching enzyme, all derived from Triticum tauschii, the D genome donor of hexaploid bread wheat.
A further preferred embodiment relates to a method of identifying variations in the characteristics of plants.
BACKGROUND OF THE INVENTION
Starch is an important constituent of cereal grains and of flours, accounting for about 65-67% of the weight of the grain at maturity. It is produced in the amyloplast of the grain endosperm by the concerted action of a number of enzymes, including ADP-Glucose pyrophosphorylase (EC 2.7.7.27), starch synthases (EC 2.4.1.21), branching enzymes (EC 2.4.1.18) and debranching enzymes (EC 3.2.1.41 and EC 3.2.1.68) (Ball et al, 1996; Martin and Smith, 1995; Morell et al, 1995). Some of the proteins involved in the synthesis of starch can be recovered from the starch granule (Denyer et al, 1995; Rahman et al, 1995).
Most wheat cultivars normally produce starch containing 25% amylose and 75% amylopectin. Amylose is composed of large linear chains of α-1,4-linked α-D-glucopyranosyl residues, whereas amylopectin is a branching form of α-glucan linked by α-1,6 linkages. The ratio of amylose and amylopectin, the branch chain length and the number of branch chains of amylopectin are the major factors which determine the properties of wheat starch.
Starch with various properties has been widely used in industry, food science and medical science. High amylose wheat can be used for plastic substitutes and in paper manufacture to protect the environment; in health foods to reduce bowel cancer and heart disease; and in sports foods to improve the athletes' performance. High amylopectin wheat may be suitable for Japanese noodles, and is used as a thickener in the food industry.
Wheat contains three sets of chromosomes (A, B and D) in its very large genome of about 10^10 base pairs (bp).
The donor of the D genome to wheat is Triticum tauschii, and by using a suitable accession of this species the genes from the D genome can be studied separately (Lagudah et al, 1991).
There is comparatively little variation in starch structure found in wheat varieties, because the hexaploid nature of wheat prevents mutations from being readily identified. Dramatic alterations in starch structure are expected to require the combination of homozygous recessive alleles from each of the 3 wheat genomes, A, B and D. This requirement renders the probability of finding such mutants in natural or mutagenised populations of wheat very low.
Variation in wheat starch is desirable in order to enable better tailoring of wheat starches for processing and end-user requirements.
Key commercial targets for the manipulation of starch biosynthesis are:
1. "Waxy" wheats in which amylose content is decreased to insignificant levels. This outcome is expected to be obtained by eliminating granule-bound starch synthase activity.
2. High amylose wheats, expected to be obtained by suppressing starch branching enzyme-II activity.
3. Wheats which continue to synthesise starch at elevated temperatures, expected to be obtained by identifying or introducing a gene encoding a heat-stable soluble starch synthase.
4. "Sugary types" of wheat which contain increased amylose content and free sugars, expected to be obtained by manipulating an isoamylase-type debranching enzyme.
There are two general strategies which may be used to obtain wheats with altered starch structure: (a) using genetic engineering strategies to suppress the activity of a specific gene, or to introduce a novel gene into a wheat line; and (b) selecting among existing variation in wheat for missing ("null") or altered alleles of a gene in each of the genomes of wheat, and combining these by plant breeding.
However, in view of the complexity of the gene families, particularly starch branching enzyme I (SBE I), without the ability to target regions which are unique to genes expressed in endosperm, modification of wheat by combination of null alleles of several enzymes in general represents an almost impossible task.
Branching enzymes are involved in the production of glucose α-1,6 branches. Of the two main constituents of starch, amylose is essentially linear, but amylopectin is highly branched; thus branching enzymes are thought to be directly involved in the synthesis of amylopectin but not amylose. There are two types of branching enzymes in plants, starch branching enzyme I (SBE I) and starch branching enzyme II (SBE II), and both are about 85 kDa in size. At the nucleic acid level there is about 65% sequence identity between types I and II in the central portion of the molecules; the sequence identity between SBE I from different cereals is about 85% overall (Burton et al, 1995; Morell et al, 1995).
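By way of illustration only, the identity figures quoted above are ratios of matching positions to compared positions. The following is a minimal sketch of that arithmetic, assuming two sequences that have already been aligned to equal length with "-" marking gaps. The function name, the decision to exclude gap columns from the denominator, and the toy sequences are assumptions made for the example and are not taken from the SBE sequences or from the cited references.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gap columns are left out of the denominator
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not SBE data): 9 ungapped columns, 8 of them identical.
toy_a = "ATGGCC-TAA"
toy_b = "ATGACCGTAA"
toy_identity = percent_identity(toy_a, toy_b)
```

Other conventions (for example counting gap columns as mismatches) give lower figures for the same alignment, so any quoted percentage depends on the convention used.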
In cereals, SBE I genes have so far been reported only for rice (Kawasaki et al, 1991; Rahman et al, 1997). A cDNA sequence for wheat SBE I is available on the GenBank database (Accession No. Y12320; Repellin, Nair, Baga and Chibbar, Plant Gene Register PGR97-094, 1997).
As far as we are aware, no promoter sequence for wheat SBE I has been reported.
We have characterised an SBE I gene, designated wSBE I-D2, from Triticum tauschii, the donor of the D genome to wheat (Rahman et al, 1997). This gene encoded a protein sequence which had a deletion of approximately 65 amino acids at the C-terminal end, and appeared not to contain some of the conserved amino acid motifs characteristic of this class of enzyme (Svensson, 1994). Although wSBE I-D2 was expressed as mRNA, no corresponding protein has yet been found in our analysis of SBE I isoforms from the endosperm, and thus it is possible that this gene is a transcribed pseudogene.
Genes for SBE II are less well characterised; no genomic sequences are available, although SBE II cDNAs from rice (Mizuno et al, 1993; Accession No. D16201) and maize (Fisher et al, 1993; Accession No. L08065) have been reported. In addition, a cDNA sequence for SBE II from wheat is available on the GenBank database (Nair et al, 1997; Accession No. Y11282); although the sequences are very similar to those reported herein, there are differences near the N-terminal of the protein, which specifies its intracellular location. No promoter sequences have been reported, as far as we are aware.
Wheat granule-bound starch synthase (GBSS) is responsible for amylose synthesis, while wheat branching enzymes together with soluble starch synthases are considered to be directly involved in amylopectin biosynthesis. A number of isoforms of soluble and granule-bound starch synthases have been identified in developing wheat endosperm (Denyer et al, 1995). There are three distinct isoforms of starch synthases, 60 kDa, 75-77 kDa and 100-105 kDa, which exist in the starch granules (Denyer et al, 1995; Rahman et al, 1995). The 60 kDa GBSS is the product of the wx gene. The 75-77 kDa protein is a wheat soluble starch synthase I (SSSI) which is present in both the soluble fraction and the starch granule-bound fraction of the endosperm. However, the 100-105 kDa proteins, which are another type of soluble starch synthase, are located only in starch granules (Denyer et al, 1995; Rahman et al, 1995). To our knowledge there has been no report of any complete wheat SSS I sequence, either at the protein or the nucleotide level.
Both cDNA and genomic DNA encoding a soluble starch synthase I of rice have been cloned and analysed (Baba et al, 1993; Tanaka et al, 1995). The cDNAs encoding potato soluble starch synthase SSSII and SSSIII and pea soluble starch synthase SSSII have also been reported (Edwards et al, 1995; Marshall et al, 1996; Dry et al, 1992). However, corresponding full length cDNA sequences for wheat have hitherto not been available, although a partial cDNA sequence (Accession No. U48227) has been released to the GenBank database.
Approach (b) referred to above has been demonstrated for the gene for granule-bound starch synthase.
Null alleles on chromosomes 7A, 7D and 4A were identified by the analysis of GBSS protein bands by electrophoresis, and combined by plant breeding to produce a wheat line containing no GBSS, and no amylose (Nakamura et al, 1995).
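To give a sense of scale for a breeding scheme of this kind, the following worked calculation uses simple Mendelian assumptions: three unlinked loci, each heterozygous for a recessive null allele in the F1, and random segregation in the F2. These assumptions, the function names and the 95% confidence level are illustrative only and are not taken from Nakamura et al.

```python
import math
from fractions import Fraction


def all_null_fraction(n_loci: int = 3) -> Fraction:
    """Expected F2 fraction homozygous null at every one of n unlinked loci.

    Assumption: each locus independently yields 1/4 homozygous-null progeny.
    """
    return Fraction(1, 4) ** n_loci


def plants_to_screen(fraction: Fraction, confidence: float = 0.95) -> int:
    """Smallest n such that P(at least one all-null plant among n) >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - float(fraction)))


triple_null = all_null_fraction(3)          # Fraction(1, 64)
screen_size = plants_to_screen(triple_null)  # F2 plants needed at 95% confidence
```

Under these assumptions about one F2 plant in 64 carries no functional copy at any of the three loci, which is workable when the single nulls are already in hand but illustrates why such combinations are unlikely to arise unaided.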
Subsequently, PCR-based DNA markers have been identified, which also identify null alleles for the GBSS loci on each of the three wheat genomes. Despite the availability of a considerable amount of information in the prior art, major problems remain. Firstly, the presence of three separate sets of chromosomes in wheat makes genetic analysis in this species extraordinarily complex. This is further complicated by the fact that a number of enzymes are involved in starch synthesis, and each of these enzymes is itself present in a number of forms, and in a number of locations within the plant cell. Little, if any, information has been available as to which specific form of each enzyme is expressed in endosperm. For wheat, a limited amount of nucleic acid sequence information is available, but this is only cDNA sequence; no genomic sequence, and consequently no information regarding promoters and other control sequences, is available. Without being able to demonstrate that the endosperm-specific gene within a family has been isolated, such sequence information is of limited practical usefulness.
SUMMARY OF THE INVENTION
In this application we report the isolation and identification of novel genes from T. tauschii, the D-genome donor of wheat, that encode SBE I, SBE II, a 75 kDa SSS I, and an isoamylase-type debranching enzyme (DBE). Because of the very close relationship between T. tauschii and wheat, as discussed above, results obtained with T. tauschii can be directly applied to wheat with little if any modification.
Such modification as may be required represents routine trial and error experimentation. Sequences from these genes can be used as probes to identify null or altered alleles in wheat, which can then be used in plant breeding programmes to provide modifications of starch characteristics. The novel sequences of the invention can be used in genetic engineering strategies, either to introduce a desired gene into a host plant or to provide antisense sequences for suppression of one or more specific genes in a host plant, in order to modify the characteristics of starch produced by the plant.
By using T. tauschii, we have been able to examine a single genome, rather than three as in wheat, and to identify and isolate the forms of the starch synthesis genes which are expressed in endosperm. By addressing genomic sequences we have been able to isolate tissue-specific promoters for the relevant genes, which provides a mechanism for simultaneous manipulation of a number of genes in the endosperm. Because T. tauschii is so closely related to wheat, results obtained with this model system are directly applicable to wheat, and we have confirmed this experimentally. The genomic sequences which we have determined can also be used as probes for the identification and isolation of corresponding sequences, including promoter sequences, from other cereal plant species.
PCT/AU98/00743 Received 21 June 1999
In its most general aspect, the invention provides a nucleic acid sequence encoding an enzyme of the starch biosynthetic pathway in a cereal plant, said enzyme being selected from the group consisting of starch branching enzyme I, starch branching enzyme II, soluble starch synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize, and that starch branching enzyme II does not have the N-terminal amino acid sequence:
AASPGKVLVPDGEDDLASPA.
Preferably the nucleic acid sequence is a DNA sequence, and may be genomic DNA or cDNA. Preferably the sequence is one which is functional in wheat. More preferably the sequence is derived from a Triticum species, most preferably Triticum tauschii.
Where the sequence encodes soluble starch synthase, preferably the sequence encodes the 75 kDa soluble starch synthase of wheat.
Biologically-active untranslated control sequences of genomic DNA are also within the scope of the invention.
Thus the invention also provides the promoter of an enzyme as defined above.
In a preferred embodiment of this aspect of the invention, there is provided a nucleic acid construct comprising a nucleic acid sequence of the invention, a biologically-active fragment thereof, or a fragment thereof encoding a biologically-active fragment of an enzyme as defined above, operably linked to one or more nucleic acid sequences facilitating expression of said enzyme in a plant, preferably a cereal plant. The construct may be a plasmid or a vector, preferably one suitable for use in the transformation of a plant. A particularly suitable vector is a bacterium of the genus Agrobacterium, preferably Agrobacterium tumefaciens. Methods of transforming cereal plants using Agrobacterium tumefaciens are known; see for example Australian Patent No. 667939 by Japan Tobacco Inc., International Patent Application Number PCT/US97/10621 by Monsanto Company and Tingay et al (1997).
AMENDED SHEET (Article 34) (IPEA/AU)
In a second aspect, the invention provides a nucleic acid construct for targeting of a desired gene to endosperm of a cereal plant, and/or for modulating the time of expression of a desired gene in endosperm of a cereal plant, comprising one or more promoter sequences selected from SBE I promoter, SBE II promoter, SSS I promoter, and DBE promoter, operatively linked to a nucleic acid sequence encoding a desired protein, and optionally also operatively linked to one or more additional targeting sequences and/or one or more 3' untranslated sequences.
The nucleic acid encoding the desired protein may be in either the sense orientation or in the antisense orientation. Preferably the desired protein is an enzyme of the starch biosynthetic pathway. For example, the antisense sequences of GBSS, starch debranching enzyme, SBE II, low molecular weight glutenin, or grain softness protein I, may be used. Preferred sequences for use in sense orientation include those of bacterial isoamylase, bacterial glycogen synthase, or wheat high molecular weight glutenin Bxl7. It is contemplated that any desired protein which is encoded by a gene which is capable of being expressed in the endosperm of a cereal plant is suitable for use in the invention.
In a third aspect, the invention provides a method of modifying the characteristics of starch produced by a plant, comprising the step of: (a) introducing a gene encoding a desired enzyme of the starch biosynthetic pathway into a host plant, and/or (b) introducing an anti-sense nucleic acid sequence directed to a gene encoding an enzyme of the starch biosynthetic pathway into a host plant, wherein said enzymes are as defined above.
Where both steps (a) and (b) are used, the enzymes in the two steps are different.
Preferably the plant is a cereal plant, more preferably wheat or barley.
As is well known in the art, anti-sense sequences can be used to suppress expression of the protein to which the anti-sense sequence is complementary. It will be evident to the person skilled in the art that different combinations of sense and anti-sense sequences may be chosen so as to effect a variety of different modifications of the characteristics of the starch produced by the plant.
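For illustration, an anti-sense sequence is the reverse complement of the sense strand to which it is directed. The following minimal sketch shows that relationship for a DNA string; the function name and the six-base example are hypothetical and are not drawn from the sequences of the invention.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sense: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    invalid = set(sense) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {sorted(invalid)}")
    return sense.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base fragment, used only to show the orientation change.
example_sense = "ATGGCC"
example_antisense = reverse_complement(example_sense)
```

Applying the function twice returns the original sequence, which is a convenient check when preparing a construct in the anti-sense orientation.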
In a fourth aspect, the invention provides a method of targeting expression of a desired gene to the endosperm of a cereal plant, comprising the step of transforming the plant with a construct according to the invention.
According to a fifth aspect, the invention provides a method of modulating the time of expression of a desired gene in endosperm of a cereal plant, comprising the step of transforming the plant with a construct according to the second aspect of the invention.
Where expression at an early stage following anthesis is desired, the construct preferably comprises the SBE II, SSS I or DBE promoters. Where expression at a later stage following anthesis is desired, the construct preferably comprises the SBE I promoter.
While the invention is described in detail in relation to wheat, it will be clearly understood that it is also applicable to other cereal plants of the family Gramineae, such as maize, barley and rice.
Methods for transformation of monocotyledonous plants such as wheat, maize, barley and rice and for regeneration of plants from protoplasts or immature plant embryos are well known in the art. See for example Lazzeri et al, 1991; Jahne et al, 1991 and Wan and Lemaux, 1994 for barley; Wirtzens et al, 1997; Tingay et al, 1997; Canadian Patent Application No. 2092588 by Nehra; Australian Patent Application No. 61781/94 by National Research Council of Canada, Australian Patent No. 667939 by Japan Tobacco Co, and International Patent Application Number PCT/US97/10621 by Monsanto Company.
The sequences of ADP glucose pyrophosphorylase from barley (Australian Patent Application No. 65392/94), starch debranching enzyme and its promoter from rice (Japanese Patent Publication No. Kokai 6261787 and Japanese Patent Publication No. Kokai 5317057), and starch debranching enzyme from spinach and potato (Australian Patent Application No. 44333/96) are all known.
Detailed Description of the Drawings
The invention will be described in detail by reference only to the following non-limiting examples and to the figures.
Figure 1 shows the hybridisation of genomic clones isolated from T. tauschii.
DNA was extracted from the different clones, digested with BamHI and hybridised with the 5' end of the maize SBE I cDNA. Lanes 1, 2, 3 and 4 correspond to DNA from clones λE1, λE2, λE6 and λE7 respectively. Note that clones λE1 and λE2 give identical patterns, the SBE I gene in λE6 is a truncated form of that in λE1, and λE7 gives a clearly different pattern.
Figure 2 shows the hybridisation of DNA from T. tauschii.
DNA from T. tauschii was digested with BamHI and the hybridisation pattern compared with DNA from λE1 and λE7 digested with the same enzyme. Fragment E1.1 (see Figure 3) from λE1 was used as the probe; it contains some sequences that are over 80% identical to sequences in E7.8.
Approximately 25 µg of T. tauschii DNA was electrophoresed in lane 1, and 200 pg each of λE1 and λE7 in lanes 2 and 3, respectively.
Figure 3 shows the restriction maps of clones λE1 and λE7. The fragments obtained with EcoRI and BamHI are indicated. The fragments sequenced from λE1 are E1.1, E1.2, a part of E1.7 and a part of [...].
Figure 4 shows the comparison of deduced amino acid sequence of wSBE I-D4 cDNA with the deduced amino acid sequence of rice SBE I (RSBE I; Nakamura et al, 1992), maize SBE I (MSBE I; Baba et al, 1991), wSBE I-D2 type cDNA (D2 cDNA; Rahman et al, 1997), pea SBE II (PESBE II, homologous to maize SBE I; Burton et al, 1995), and potato SBE I (POSBE; Cangiano et al, 1993). The deduced amino acid sequence of the wSBE I-D4 cDNA is denoted by "D4cDNA".
Residues present in at least three of the sequences are identified in the consensus sequence in capitals.
Figure 5 shows the intron-exon structure of wSBE I-D4 compared to the corresponding structures of rice SBE I (Kawasaki et al, 1993) and wSBE I-D2 (Rahman et al, 1997). The intron-exon structure of wSBE I-D4 is deduced by comparison with the SBE I cDNA reported by Repellin et al (1997).
The dark rectangles correspond to exons and the light rectangles correspond to introns. The bars above the structures indicate the percentage identity in sequence between the indicated exons and introns of the relevant genes. Note that intron 2 shares no significant sequence identity and is not indicated.
Figure 6 shows the nucleotide sequence of part of wSBE I-D4, the amino acid sequence deduced from this nucleotide sequence, and the N-terminal amino acid sequence of the SBE I purified from the wheat endosperm (Morell et al, 1997).
Figure 7 shows the hybridisation of SBE I genomic clones with the following probes, A. wSBE I-D45 (derived from the 5' end of the gene and including sequence from fragments E1.1 and E1.7), and B. wSBE I-D43 (derived from the 3' end of the gene and containing sequences from fragment E1.5). For panel A, the tracks 1-13 correspond to clones λE1, λE2, λE6, λE7, λE9, λE14, λE22, λE27, molecular weight markers, λE29, λE30, λE31 and λE52. For panel B, tracks 1-12 correspond to clones λE1, λE2, λE6, λE7, λE9, λE14, λE22, λE27, λE29, λE31 and λE52. Note that clones λE7 and λE22 do not hybridise to either of the probes and are wSBE I-D2 type genes. Also note that clone λE30 contains a sequence unrelated to SBE I. The size of the molecular weight markers in kb is indicated. Clones λE7 and λE22 do hybridise with a probe from E1.1, which is highly conserved between wSBE I-D2 and wSBE I-D4.
Figure 8 shows the alignment of cDNA clones to obtain the sequence represented by wSBE I-D4 cDNA. BED4 and were obtained from screening the cDNA library with maize BEI (Baba et al, 1991). BED1, 2 and 3 were obtained by RT-PCR using defined primers.
Figure 9a shows the expression of Soluble Starch Synthase I (SSS), Starch Branching Enzyme I (BE I) and Starch Branching Enzyme II (BE II) mRNAs during endosperm development.
RNA was purified from leaves, florets prior to anthesis, and endosperm of wheat cultivar Rosella grown in a glasshouse, collected 5 to 8 days after anthesis, 10 to days after anthesis and 18 to 22 days after anthesis, and from the endosperm of wheat cultivar Rosella grown in the field and collected 12, 15 and 18 days after anthesis respectively. Equivalent amounts of RNA were electrophoresed in each lane. The probes were from the coding region of the SM2 SSS I cDNA (from nucleotide 1615 to 1919 of the SM2 cDNA sequence); wSBE I-D43C (see Table I), which corresponds to the untranslated 3' end of wSBE I-D4 cDNA (El and the 5' region of SBE9 (SBE9 corresponding to the region between nucleotides 743 to 1004 of Genbank sequence Y11282. No hybridisation to RNA extracted from leaves or preanthesis florets was detected.
Figure 9b shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Gabo" with the starch branching enzyme I gene. The probe, wSBEI-D43, is defined in Table 1.
Figure 9c shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Wyuna" with the starch branching enzyme II gene. The probe, wSBE II-D13, is defined in Table 2.
Figure 9d shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Gabo" with the SSS I gene. The probe spanned the region from nucleotides 2025 to 2497 of the SM2 cDNA sequence shown in SEQ ID No:11.
Figure 9e shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Gabo" with the DBE I gene. The probe, a DBE3' 3'PCR fragment, extends from nucleotide position 281 to 1072 of the cDNA sequence in SEQ ID No:16.
Figure 9f shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Gabo" with the wheat actin gene. The probe was a wheat actin DNA sequence generated by PCR from wheat endosperm cDNA using primers to conserved plant actin sequences.
Figure 9g shows the hybridisation of RNA from the endosperm of the hexaploid T. aestivum cultivar "Gabo" with a probe containing wheat ribosomal RNA 26S and 18S fragments (plasmid pta250.2 from Dr Bryan Clarke, CSIRO Plant Industry).
Figure 9h shows the hybridisation of RNA from the hexaploid wheat cultivar "Gabo" with the DBE I probe described in Figure 9e. Lane 1; leaf RNA; lane 2, preanthesis floret RNA; lane 3, RNA from endosperm harvested 12 days after anthesis.
Figure 10 shows the comparison of wSBE I-D4 (sr 427.res ck: 6,362,1 to 11,099) and rice SBE I genomic sequence (d10838.em_pl ck: 3,071,1 to 11,700) (Kawasaki et al, 1993; Accession Number D10838) using the programs Compare and DotPlot (Devereaux et al, 1984). The programs used a window of 21 bases with a stringency of 14 to register a dot.
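The window and stringency settings mentioned for Figure 10 describe a sliding-window comparison in which a dot is recorded wherever two windows agree at a minimum number of positions. The following is a minimal sketch of that idea, not the program used to produce the figure: the function name is hypothetical, the dot is recorded at the window start (where the original program places it is not modelled), and the nested loops are a naive implementation suited only to short sequences.

```python
from typing import List, Tuple


def dot_plot(seq_a: str, seq_b: str, window: int = 21, stringency: int = 14) -> List[Tuple[int, int]]:
    """Return (i, j) start positions whose windows share at least `stringency` identical bases."""
    if not 0 < stringency <= window:
        raise ValueError("stringency must be between 1 and the window size")
    a, b = seq_a.upper(), seq_b.upper()
    dots = []
    for i in range(len(a) - window + 1):
        window_a = a[i:i + window]
        for j in range(len(b) - window + 1):
            window_b = b[j:j + window]
            # Count positions at which the two windows carry the same base.
            matches = sum(1 for x, y in zip(window_a, window_b) if x == y)
            if matches >= stringency:
                dots.append((i, j))
    return dots


# Hypothetical repetitive toy sequence compared with itself.
toy = "ACGT" * 10
toy_dots = dot_plot(toy, toy)
```

Plotting the returned pairs with one sequence on each axis gives diagonal runs where the two sequences are similar; a stringency of 14 in a window of 21 tolerates up to seven mismatches per window.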
Figure 11 shows the hybridisation of wheat DNA from chromosome-engineered lines using the following probes: A. wSBE I-D45 (from the 5' end of the gene), B. wSBE I-D43 (from the 3' end of the gene), and C. wSBE I-D4R (repetitive sequence approximately 600 bp 3' to the end of wSBE I-D4 sequence).
N7AT7B, no 7A chromosome, four copies of 7B chromosome; N7BT7D, no 7B chromosome, four copies of 7D chromosome; N7DT7A, no 7D chromosome, four copies of 7A chromosome. The chromosomal origin of hybridising bands is indicated.
Figure 12 shows the hybridisation of genomic clones F1, F2, F3 and F4 with the entire SBE-9 sequence.
The DNA from the clones was purified and digested with either BamHI or EcoRI, separated on agarose, blotted onto nitrocellulose and hybridised with labelled SBE-9 (a SBE II type cDNA). The pattern of hybridising bands is different in the four isolates.
Figure 13a shows the N-terminal sequence of purified SBE II from wheat endosperm as in Morell et al, (1997).
Figure 13b shows the deduced amino acid sequence from part of wSBE II-D1 that encodes the N-terminal sequence as described in Morell et al, (1997).
Figure 14 shows the deduced exon-intron structure for a part of wSBE II-D1. The scale is marked in bases.
The dark rectangles are exons.
Figure 15 shows the hybridisation of DNA from chromosome engineered lines of wheat (cultivar Chinese Spring) with a probe from nucleotides 550-850 from SBE-9.
The band of approximately 2.2 kb is missing in the line in which chromosome 2D is absent.
T2BN2A: four copies of chromosome 2B, no copies of chromosome 2A; T2AN2B: four copies of chromosome 2A, no copies of chromosome 2B; T2AN2D: four copies of chromosome 2A, no copies of chromosome 2D.
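The logic of assigning a hybridising band to a chromosome with nullisomic-tetrasomic lines can be sketched as follows: a band that disappears only in the line lacking a given chromosome is attributed to that chromosome. The function names are hypothetical, and the band scores in the example are illustrative values chosen to be consistent with the statement above that the band is missing where chromosome 2D is absent; they are not gel data from the specification.

```python
import re
from typing import Dict, List


def nullisomic_chromosome(line_name: str) -> str:
    """Extract the missing chromosome from a name such as 'T2AN2D' or 'N7AT7B'."""
    match = re.search(r"N(\d[ABD])", line_name)
    if match is None:
        raise ValueError(f"cannot find a nullisomic chromosome in {line_name!r}")
    return match.group(1)


def locate_band(band_present: Dict[str, bool]) -> List[str]:
    """Chromosomes whose absence coincides with loss of the band."""
    return sorted(
        {nullisomic_chromosome(name) for name, present in band_present.items() if not present}
    )


# Illustrative scores for the approximately 2.2 kb band (True means the band is seen).
scores = {"T2BN2A": True, "T2AN2B": True, "T2AN2D": False}
band_location = locate_band(scores)
```

An empty result would mean the band is present in every line tested, so the locus cannot be placed on any of the chromosomes covered by that set of lines.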
Figure 16 shows the N-terminal sequence of SSS I protein isolated from starch granules (Rahman et al, 1995) and deduced amino acid sequence of part of Sm2.
Figure 17 shows the hybridisation of genomic clones sg1, 3, 4, 6 and 11 with the cDNA clone (sm2) for SSS I. DNA was purified from indicated genomic clones, digested with BamHI or SacI and hybridised to sm2. Note that the hybridisation patterns for sg1, 3 and 4 are clearly different from each other.
Figure 18 shows a comparison of the intron/exon structures of the wheat and rice soluble starch synthase genomic sequences. The dark rectangles indicate exons and the light rectangles represent introns.
Figure 19 shows the hybridisation of DNA from chromosome engineered lines of wheat (cultivar Chinese Spring) digested with PvuII, with the sm2 probe.
N7AT7B: no 7A chromosome, four copies of 7B chromosome; N7BT7D: no 7B chromosome, four copies of 7D chromosome; N7DT7A: no 7D chromosome, four copies of 7A chromosome.
A band is missing in the N7BT7A line.
Figure 20a shows the DNA sequence of a portion of the wheat debranching enzyme (WDBE-1) PCR product. The PCR product was generated from wheat genomic DNA (cultivar Rosella) using primers based on sequences conserved in debranching enzymes from maize and rice.
Figure 20b shows a comparison of the nucleotide sequence of wheat debranching enzyme I (WDBE-I) PCR fragment (WHEAT.DNA) with the maize Sugary-1 sequence (SUGARY.DNA).
Figure 20c shows a comparison between the intron/exon structures of wheat debranching enzyme gene and the maize sugary-1 debranching enzyme gene.
Figure 21a shows the results of Southern blotting of T. tauschii DNA with wheat DBE-I PCR product. DNA from T. tauschii was digested with BamHI, electrophoresed, blotted and hybridised to the wheat DBE-I PCR product described in Figure 20a. A band of approximately 2 kb hybridised.
Figure 21b shows Chinese Spring nullisomic/tetrasomic lines probed with probes from the DBE gene. Panel (I) shows hybridisation with a fragment spanning the region from nucleotide 270 to 465 of the cDNA sequence shown in SEQ ID No:16 from the central region of the DBE gene. Panel (II) shows hybridisation with a probe from the 3' region of the gene, from nucleotide 281 to 1072 of the cDNA sequence given in SEQ ID No:16.
Figures 22a to 22e show diagrammatic representations of the DNA vectors used for transient expression analysis. In each of the sequences the N-terminal methionine encoding ATG codon is shown in bold.
Figure 22a shows a DNA construct pwsssIpro1gfpNOT containing a 1042 base pair region of the wheat soluble starch synthase I promoter (wSSSIpro1, from -1042 to SEQ ID No:18) fused to the green fluorescent protein (GFP) reporter gene.
Figure 22b shows a DNA construct pwsssIpro2gfpNOT containing a 3914 base pair region of the wheat soluble starch synthase I promoter (wSSSIpro2, from -3914 to SEQ ID No:18) fused to the green fluorescent protein (GFP) reporter gene.
Figure 22c shows a DNA construct psbeIIpro1gfpNOT containing a 1203 base pair region of the wheat starch branching enzyme II promoter (sbeIIpro1, from 1 to 1203 of SEQ ID No:10) fused to the green fluorescent protein (GFP) reporter gene.
Figure 22d shows a DNA construct psbeIIpro2gfpNOT containing a 1353 base pair region of the wheat starch branching enzyme II promoter and transit peptide coding region (sbeIIpro2, regions 1 to 1203, 1204 to 1336 and 1664 to 1680 of SEQ ID No:10) fused to the green fluorescent protein (GFP) reporter gene.
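The size quoted for the insert of psbeIIpro2gfpNOT can be checked against the three regions of SEQ ID No:10 listed for it. The following is a minimal sketch, assuming that the coordinates are 1-based and inclusive; the helper name region_length is illustrative only.

```python
# Regions of SEQ ID No:10 listed for psbeIIpro2gfpNOT,
# taken as inclusive 1-based (start, end) coordinates (an assumption).
regions = [(1, 1203), (1204, 1336), (1664, 1680)]


def region_length(start, end):
    """Length in base pairs of an inclusive 1-based interval."""
    return end - start + 1


lengths = [region_length(start, end) for start, end in regions]
total = sum(lengths)

print("region lengths (bp):", lengths)
print("total insert length (bp):", total)
```

Under that assumption the three regions add up to the 1353 base pairs stated for the construct.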
Figure 22e shows a DNA construct pact_jsgfp_nos containing the plasmid backbone of pSP72 (Promega), the rice Act1 actin promoter (McElroy et al. 1991), the GFP gene (Sheen et al. 1995) and the Agrobacterium tumefaciens nopaline synthase (nos) terminator (Bevan et al. 1983).
Figure 23 shows T-DNA constructs for stable transformation of rice by Agrobacterium. The backbone for each plasmid is p35SH-iC (Wang et al 1997). The various promoter-GFP-Nos regions inserted are shown in (a), (b), (c) and (d) respectively, and are described in detail in Example 24. Each of these constructs was inserted into the NotI site of p35SH-iC using the NotI flanking sites at each end of the promoter-GFP-Nos regions. The constructs were named p35SH-iC-BEIIpro1_GFP_Nos, p35SH-iC-BEIIpro2_GFP_Nos, p35SH-iC-SSIpro1_GFP_Nos and SSIpro2_GFP_Nos.
Figure 24 illustrates the design of 15 intron-spanning BE II primer sets. Primers were based on the wSBE II-D1 sequence (SEQ ID No:10), and were designed such that intron sequences in the wSBE II-D1 sequence (deduced from Figure 13b and Nair et al, 1997; Accession No. Y11282) were amplified by PCR.
Figure 25 shows the results of amplification using the SBE II-Intron 5 primer set (primer set 6: sr913F and WBE2E6 R) on various diploid, tetraploid and hexaploid wheats.
(i) T. boeoticum (A genome diploid)
(ii) T. tauschii (D genome diploid)
(iii) T. aestivum cv. Chinese Spring ditelosomic line 2AS (lacking chromosome arm 2AL)
(iv) Crete 10 (AABB tetraploid)
(v) T. aestivum cv. Rosella (hexaploid)
The horizontal axis indicates the size of the product in base pairs, the vertical axis shows arbitrary fluorescence units. The various arrows indicate the products of different genomes: A, A genome, B, B genome, D, D genome, U, unassigned additional product.
Figure 26 shows the results obtained by amplification using the SBE II-Intron 10 primer set (primer set 11: da5.seq and WBE2E11R) on the wheat lines:
(i) T. aestivum cv. Chinese Spring ditelosomic line 2AS.
(ii) T. aestivum Chinese Spring nullisomic/tetrasomic line N2BT2A.
(iii) T. aestivum Chinese Spring nullisomic/tetrasomic line N2DT2B.
The horizontal axis indicates the size of the product in base pairs, the vertical axis shows arbitrary fluorescence units. The various arrows indicate the products of different genomes: A, A genome, B, B genome, D, D genome.
Figure 27 shows the results of transient expression assays typical of each promoter and target tissue. The photographs (40 x magnification) show representative tissue from the transient expression assays, viewed under a Leica microscope with blue light illumination. Photographs were taken 48 to 72 hours after tissue bombardment. The promoter constructs are listed as follows (with the panels showing endosperm, embryo and leaf expression listed in respective order): pact_jsgfp_nos (panels a, g and m); pwsssIpro1gfpNOT (panels b, h and n); pwsssIpro2gfpNOT (panels c, i and o); psbeIIpro1gfpNOT (panels d, j and p); psbeIIpro2gfpNOT (panels e, k and q); pZLgfpNOT (panels f, l and r).
Example 1 Identification of Gene Encoding SBE I
Construction of Genomic Library and Isolation of Clones
The genomic library used in this study was constructed from Triticum tauschii, var. strangulata, accession number CPI 100799. Of all the accessions of T. tauschii surveyed, the genome of CPI 100799 is the most closely related to the D genome of hexaploid wheat.
Triticum tauschii, var. strangulata (CPI accession number 110799) was kindly provided by Dr E Lagudah. Leaves were isolated from plants grown in the glasshouse.
DNA was extracted from leaves of Triticum tauschii using published methods (Lagudah et al, 1991), partially digested with Sau3A, size fractionated and ligated to the arms of lambda GEM 12 (Promega). The ligated products were used to transfect the methylation-tolerant strain PMC 103 (Doherty et al. 1992). A total of 2 x 10^6 primary plaques were obtained with an average insert size of about 15 kb. Thus the library contains approximately 6 genomes' worth of T. tauschii DNA. The library was amplified and stored at 4°C until required.
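The estimate of library coverage follows from the number of primary plaques and the average insert size. The sketch below illustrates the arithmetic. The genome size of 5 x 10^9 bp is an assumption, back-calculated from the statement that the library holds approximately 6 genomes; it is not given in the text. The second function applies the commonly used Clarke and Carbon estimate of the chance that a given sequence is represented, which is also not part of the original description.

```python
import math

primary_plaques = 2e6      # from the text
mean_insert_bp = 15_000    # from the text (about 15 kb)
genome_bp = 5e9            # ASSUMPTION: implied by "approximately 6 genomes"


def genome_equivalents(n_clones, insert_bp, genome_size_bp):
    """Total cloned DNA expressed as multiples of the genome size."""
    return n_clones * insert_bp / genome_size_bp


def probability_represented(n_clones, insert_bp, genome_size_bp):
    """Clarke and Carbon estimate: P = 1 - (1 - f)^N, with f = insert / genome."""
    f = insert_bp / genome_size_bp
    # log1p/expm1 keep the result accurate when f is very small
    return -math.expm1(n_clones * math.log1p(-f))


coverage = genome_equivalents(primary_plaques, mean_insert_bp, genome_bp)
p_any = probability_represented(primary_plaques, mean_insert_bp, genome_bp)

print(f"genome equivalents: {coverage:.1f}")
print(f"probability that a given sequence is represented: {p_any:.4f}")
```

Both figures depend directly on the assumed genome size, so they should be read as order-of-magnitude checks only.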
Positive plaques in the genomic library were selected as those hybridising with the 5' end of a maize starch branching enzyme I cDNA (Baba et al, 1991) using moderately stringent conditions as described in Rahman et al, (1997).
Preparation of Total RNA from Wheat
Total RNA was isolated from leaves, pre-anthesis pericarp and different developmental stages of wheat endosperm of the cultivars Hartog and Rosella. This material was collected from both the glasshouse and the field. The method used for RNA isolation was essentially the same as that described by Higgins et al (1976). RNA was then quantified by UV absorption and by separation in 1.4% agarose-formaldehyde gels which were then visualized under UV light after staining with ethidium bromide (Sambrook et al, 1989).
DNA and RNA analysis
DNA was isolated and analysed using established protocols (Sambrook et al, 1989). DNA was extracted from wheat (cv. Chinese Spring) using published methods (Lagudah et al, 1991). Southern analysis was performed essentially as described by Jolly et al (1996). Briefly, 20 µg of wheat DNA was digested, electrophoresed and transferred to a nylon membrane. Hybridisation was conducted at 42°C in 25% or formamide, 2 x SSC, 6% Dextran Sulphate for 16 h and the membrane was washed at 60°C in 2 x SSC for 3 x 1 h unless otherwise indicated. Hybridisation was detected by autoradiography using Fuji X-Omat film.
RNA analysis was performed as follows. 10 µg of total RNA was separated in a 1.4% agarose-formaldehyde gel, transferred to a nylon Hybond N membrane (Sambrook et al, 1989) and hybridized with cDNA probe at 42°C in Khandjian hybridizing buffer (Khandjian, 1989). The 3' part of wheat SBE I cDNA (designated wSBE I-D43, see Table 1) was labelled with the Rapid Multiprime DNA Probe Labelling Kit (Amersham) and used as probe. After washing at 60°C with 2 x SSC, 0.1% SDS three times, each time for about 1 to 2 hours, the membrane was visualized by overnight exposure at -80°C with X-ray film, Kodak MR.
Example 2 Frequency of Recovery of SBE I Type Clones from the Genomic Library
An estimated 2 x 10^6 plaques from the amplified library were screened using an EcoRI fragment that contained 1200 bp at the 5' end of maize SBE I (Baba et al, 1991) and twelve independent isolates were recovered and purified. This corresponds to the screening of somewhat fewer than the 2 x 10^6 primary plaques that exist in the original library (each of which has an average insert size of 15 kb) (Maniatis et al, 1982), because the amplification may lead to the representation of some sequences more than others. Assuming that the amplified library contains approximately three genomes of T. tauschii, the frequency with which SBE I-positive clones were recovered suggests the existence of about 5 copies of SBE I type genes within the T. tauschii genome.
Digestion of DNA from the twelve independent isolates by the restriction endonuclease BamHI followed by hybridisation with a maize SBE I clone suggested that the genomic clones could be separated into two broad classes (Figure 1). One class had 10 members and a representative from this class is the clone λE1 (Figure 1). λE6 (Figure 1, lane 3) is a member of this class, but is missing the 5' end of the E1-SBE I gene because the SBE I gene is at the extremity of the cloned DNA. Further hybridisation studies at high stringency with the extreme 5' and 3' regions of the SBE I gene contained in λE1 suggested that the other clones contained either identical or very closely related genes.
The second family had two members, and of these clone λE7 (Figure 1, lane 4) was arbitrarily selected for further study. These two members did not hybridise to probes from the extreme 5' and 3' regions of the SBE I gene that were contained in λE1, indicating that they were a distinct sub-class.
The DNA from T. tauschii and the lambda clones λE1 and λE7 was digested with BamHI and hybridised with fragment E1.1, as shown in Figure 2. This fragment contains sequences that are highly conserved (85% sequence identity over 0.3 kb between λE1 and λE7), corresponding to exons 3, 4 and 5 of the rice gene. The bands in the genomic DNA at 0.8 kb and 1.0 kb correspond to identical sized fragments from λE1 and λE7, as shown in Figure 2; these are fragments E1.1 and E7.8 of the λE1 and λE7 genomic clones respectively. Thus the arrangement of genes in the genomic clones is unlikely to be an artefact of the cloning procedure. There are also bands in the genomic DNA of approximately 2.5 kb, 4.8 kb and 8 kb in size which are not found from the digestion of λE1 or λE7; these could represent genes such as the 5' sequences of wSBE I-D1 or wSBE I-D3; see below.
Example 3 Tandem Arrangement of SBE I Type Genes in the T. tauschii Genome
Basic restriction endonuclease maps for λE1 and λE7 are shown in Figure 3. The maps were constructed by performing a series of hybridisations of EcoRI or BamHI digested DNA from λE1 or λE7. The probes used were the fragments generated from BamHI digestion of the relevant clone. Confirmation of the maps was obtained by PCR analysis, using primers both within the insert and also from the arms of lambda itself. PCR was performed in 10 µl volume using reagents supplied by Perkin-Elmer. The primers were used at a concentration of 20 µM. The program used was 94°C, 2 min, 1 cycle; then 94°C, 30 sec; 55°C, 30 sec; 72°C, 1 min for 36 cycles; and then 72°C, 5 min; 25°C, 1 min.
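The cycling program given above can be written out as a small data structure, which makes the number of steps and the total programmed hold time easy to check. This is a minimal sketch only; the stage labels are illustrative and are not taken from the text, and ramp times between temperatures are ignored.

```python
# Each stage: (label, number of cycles, [(temperature in deg C, seconds), ...]).
# Temperatures and times are from the text; the labels are illustrative.
program = [
    ("initial denaturation", 1, [(94, 120)]),
    ("amplification", 36, [(94, 30), (55, 30), (72, 60)]),
    ("final extension and hold", 1, [(72, 300), (25, 60)]),
]


def total_seconds(stages):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    return sum(
        cycles * sum(seconds for _, seconds in steps)
        for _, cycles, steps in stages
    )


run_seconds = total_seconds(program)
print(f"programmed hold time: {run_seconds} s ({run_seconds / 60:.0f} min)")
```

The real run time on a thermal cycler would be longer, since the instrument needs time to move between temperatures.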
Sequencing was performed on an ABI sequencer using the manufacturer's recommended protocols for both dye primer and dye terminator technologies. Deletions were carried out using the Erase-a-base kit from Promega.
Sequence analysis was carried out using the GCG version 7 package of computer programs (Devereaux et al, 1984).
The PCR products were also used as hybridisation probes. The positioning of the genes was derived from sequencing the ends of the BamHI subclones and also from sequencing PCR products generated from primers based on the insert and the lambda arms. The results indicate that there is only a single copy of a SBE I type gene within XEl.
However, it is clear that λE7 resulted from the cloning of a DNA fragment from within a tandem array of the SBE I type genes. Of the three genes in the clone, which are named wSBE I-D1, wSBE I-D2 and wSBE I-D3, only the central one (wSBE I-D2) is complete.
Example 4 Construction and Screening of cDNA Library
A wheat cDNA library was constructed from the cultivar Rosella using pooled RNA from endosperm at 8, 12, 18 and 20 days after anthesis.
The cDNA library was prepared from poly A+ RNA that was extracted from developing wheat grains (cv. Rosella, a hexaploid soft wheat cultivar) at 8, 12, 15, 18, 21 and 30 days after anthesis. The RNA was pooled and used to synthesise cDNA that was propagated in lambda ZapII (Stratagene).
The library was screened with a genomic fragment from λE7 encompassing exons 3, 4 and 5 (fragment E7.8 in Figure 3). A number of clones were isolated. Of these an apparently full-length clone appeared to encode an unusual type of cDNA for SBE I. This cDNA has been termed SBE I-D2 type cDNA. The putative protein product is compared with the maize SBE I and rice SBE I type deduced amino acid sequences in Figure 4. The main difference is that this putative protein product is shorter at the C-terminal end, with an estimated molecular size of approximately 74 kDa compared with 85 kDa for rice SBE I (Kawasaki et al, 1993).
Note that amino acids corresponding to exon 9 of rice are missing in SBE I-D2 type cDNA, but those corresponding to exon 10 are present. There are no amino acid residues corresponding to exons 11-14 of rice; furthermore, the sequence corresponding to the last 57 amino acids of SBE I-D2 type has no significant homology to the sequence of the rice gene.
We expressed SBE I-D2 type cDNA in E. coli in order to examine its function. The cDNA was expressed as a fusion protein with 22 N-terminal residues of β-galactosidase and two threonine residues followed by the SBE I-D2 cDNA sequence either in or out of frame. Although an expected product of about 75 kDa in size was produced from only the in-frame fusion, we could not detect any enzyme activity from crude extracts of E. coli protein.
Furthermore the in-frame construct could not complement an E. coli strain with a defined deletion in glycogen branching, although other putative branching enzyme cDNAs have been shown to be functional by this assay (data not shown). It is therefore unclear whether the wSBE I-D2 gene in λE7 codes for an active enzyme in vivo.
Example 5 Gene Structure in λE7
i. Sequence of wSBE I-D2
We sequenced 9.2 kb of DNA that contained wSBE I-D2. This corresponds to fragments 7.31, 7.8 and 7.18. Fragment 7.31 was sequenced in its entirety (4.1 kb), but the sequence of about 30 bases about 2 kb upstream of the start of the gene could not be obtained because it was composed entirely of Gs. Elevation of the temperature of sequencing did not overcome this problem. Fragments 7.8 (1 kb) and 7.18 (4 kb) were completely sequenced, and corresponded to 2 kb downstream of the last exon detected for this gene. It was clear that we had isolated a gene which was closely related (approximately 95% sequence identity) to the SBE I-D2 type cDNA referred to above, except that the last 200 bp at the 3' end of the cDNA are not present. The wSBE I-D2 gene includes sequences corresponding to rice exon 11 which are not in the cDNA clone. In addition it does not have exons 9, 12, 13 or 14; these are also absent from the SBE I-D2 type cDNA. The first two exons show lower identity to the corresponding exons from rice (approximately 60%) (Kawasaki et al, 1993) than the other exons do. A diagrammatic exon-intron structure of the wSBE I-D2 gene is indicated in Figure 5. The restriction map was confirmed by sequencing the PCR products that spanned fragments 7.18 and 7.8, and 7.8 and 7.31 (see Figure 3), respectively.
ii. Sequence of wSBE I-D3
This gene was not sequenced in detail, as the genomic clone did not extend far enough to include the end of the sequence. The sequence is of a SBE I type. The orientation of the gene is evident from sequencing of the relevant BamHI fragments, and was confirmed by sequence analysis of a PCR product generated using primers from the right arm of lambda and a primer from the middle of the gene. The sequence homology with wSBE I-D2 is about 80% over the regions examined. The 2 kb sequenced corresponded to exons 5 and 6 of the rice gene; these sequences were obtained by sequencing the ends of fragments 7.5, 7.4 and 7.14 respectively, although the sequences from the left end of fragment 7.14 did not show any homology to the rice sequences. The gene does not appear to share the 3' end of SBE I-D2 type cDNA, as a probe from 500 bp at the 3' end of the cDNA (including sequences corresponding to exons 8 and from rice) did not hybridise to fragment 7.14, although it hybridised to fragment 7.18.
iii. Sequence of wSBE I-D1
This gene was also not sequenced in detail, as it was clear that the genomic clone did not extend far enough to include the 5' sequences. Limited sequencing suggests that it is also a SBE I type gene. The orientation relative to the left arm of lambda was confirmed by sequencing a PCR product that used a primer from the left arm of lambda and one from the middle of the gene (as above). Its sequence homology with wSBE I-D2, D3 and D4 (see below) is about in the region sequenced corresponding to a part of exon 4 of the rice gene.
Starch branching enzymes are members of the α-amylase protein family, and in a recent survey Svensson (1994) identified eight residues in this family that are invariant, seven in the catalytic site and a glycine in a short turn. Of the seven catalytic residues, four are changed in SBE I-D2 type. However, additional variation in the 'conserved' residues may come to light when more plant cDNAs for branching enzyme I are available for analysis. In addition, although exons 9, 11, 12, 13 and 14 from rice are not present in the SBE I-D2 type cDNA, comparison of the maize and rice SBE I sequences indicates that the 3' region (from amino acid residue 730 of maize) is much more variable than the 5' and central regions. The active sites of rice and maize SBE I sequences, as indicated by Svensson (1994), are encoded by sequences that are in the central portion of the gene. When SBE II sequences from Arabidopsis were compared by Fisher et al (1996), variation was also found at the 3' and 5' ends. SBE I-D2 type cDNA may encode a novel type of branching enzyme whose activity is not adequately detected in the current assays for detecting branching enzyme activity; alternatively the cDNA may correspond to an endosperm mRNA that does not produce a functional protein.
Example 6 Cloning of the cDNA corresponding to the wSBE I-D4 gene
The first strand cDNAs were synthesized from 1 µg of total RNA, derived from endosperm 12 days after pollination, as described by Sambrook et al (1989), and then used as templates to amplify two specific cDNA regions of wheat SBE I by PCR.
Two pairs of primers were used to obtain the cDNA clones BED1 and BED3 (Table 1). Primers used for cloning of BED3 were the degenerate primer 5' GGC NAC NGC NGA G/AGA C/TGG 3' (SEQ ID NO.1), based on the N-terminal sequence of the purified wheat endosperm SBE I protein, in which the 5' end of the primer is at position 168 of wSBE I-D4 cDNA (see Table 1), and the primer NTS3' 5' TAC ATT TCC TTG TCC ATCA 3' (SEQ ID NO.2), in which the 5' end is at position 1590 of wSBE I-D4 cDNA (see Table 1), designed to anneal to the conserved regions of the nucleotide sequences of BED5 and the maize and rice SBE I cDNAs. For clone BED1, the primers used were 5' ATC ACG AGA GCT TGC TCA 3' (SEQ ID NO.3), in which the 5' end is at position 1 of wSBE I-D4 cDNA (see Table 1) and the sequence was based on the wSBE I-D4 gene, and BEC3' 5' CGG TAC ACA GTT GCG TCA TTT TC 3' (SEQ ID NO.4), in which the 5' end is at position 334 of wSBE I-D4 cDNA (see Table 1) and the sequence was based on BED3.
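The degenerate primer of SEQ ID NO.1 stands for a pool of related oligonucleotides. The sketch below counts and lists the members of that pool. It assumes the usual IUPAC reading of the notation, with G/A written as R, C/T written as Y and N meaning any base; the function names are illustrative only.

```python
from itertools import product

# IUPAC codes needed for this primer (standard meanings).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # G/A in the text
    "Y": "CT",    # C/T in the text
    "N": "ACGT",  # any base
}

# GGC NAC NGC NGA G/AGA C/TGG rewritten with IUPAC codes.
primer = "GGCNACNGCNGARGAYGG"


def degeneracy(seq):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for base in seq:
        count *= len(IUPAC[base])
    return count


def expand(seq):
    """List every non-degenerate sequence covered by the primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]


pool = expand(primer)
print("primer length:", len(primer))
print("degeneracy:", degeneracy(primer))
print("first members of the pool:", pool[:3])
```

Three N positions and two two-fold positions give 4 x 4 x 4 x 2 x 2 = 256 sequences in the pool.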
Example 7 Identification of the gene from the Triticum tauschii SBE I family which is expressed in the endosperm
We have isolated two classes of SBE I genomic clones from T. tauschii. One class contained two genomic clone isolates, and this class has been characterised in some detail (Rahman et al, 1997). The complete gene contained within this class of clones was termed wSBE I-D2; there were additional genes at either end of the clone, and these were designated wSBE I-D1 and wSBE I-D3. The other class contained nine genomic clone isolates. Of these λE1 was arbitrarily taken as a representative clone, and its restriction map is shown in Figure 3; the SBE I gene contained in this clone was called wSBE I-D4.
Fragments E1.1 (0.8 kb) and E1.2 (2.1 kb) and fragments E1.7 (4.8 kb) and E1.5 (3 kb) respectively were completely sequenced. Fragment E1.7 was found to encode the N-terminal of the SBE I which is found in the endosperm, as described in Morell et al (1997). This is shown in Figure 6. Using antibodies raised against the N-terminal sequence, Morell et al (1997) found that the D genome isoform was the most highly expressed in the cultivars Rosella and Chinese Spring. We have thus isolated from T. tauschii a gene, wSBE I-D4, whose homologue in the hexaploid wheat genome encodes the major isoform for SBE I that is found in the wheat endosperm.
Table 1 Location of structural features and probes within wSBE I-D4 sequence.

A. Location of exons by comparison with the cDNA sequence of Repellin et al. (1997), accession number Y12320.

Exon number    Start posn    End posn
1              4890          4987
2              5082          5149
3              5524          5731
4              5819          5888
5              6149          6318
6              6519          7424
7              7744          7860
8              8015          8077
9              8562          8670
10             9137          9237
11             9421          9488
12             9580          9661
13             9781          9897
14             9990          10480

B. Other features.

Name of feature                        wSBE I-D4 sequence    D4 cDNA sequence
Putative initiation of translation     4900                  11
Mature N-terminal sequence of SBE I    5550                  124
End of translated SBE I sequence       10225                 2431
End of D4 cDNA sequence                10461                 2687
wSBE I-D45                             4870,5860             1,354
wSBE I-D43                             10116,10435           2338,2657
E1.1                                   5680,6400             380,630
BED 1                                  -                     1,354
BED 2                                  -                     169,418
BED 3                                  -                     151,1601
BED 4                                  -                     867,2372
BED 5                                  -                     867,2687
Endosperm box like motif TGAAAAGT      4480,590              -
CAAAT motif                            4863                  -
TATAAA motif                           4833                  -

All nine genomic clones of the λE1 type isolated from T. tauschii appear to contain the wSBE I-D4 gene, or very similar genes, on the basis of PCR amplification and hybridisation experiments. However, the restriction patterns obtained for the clones differ with BamHI and EcoRI, among other enzymes, indicating that either the clones represent near-identical but distinct genes or they represent the same gene isolated in distinct products of the Sau3A digest used to generate the library.
Example 8 Investigation of other SBE I genomic clones isolated
All ten members of the λE1-like class of SBE I genomic clones were investigated by hybridisation with probes derived from fragment E1.7 (sequence wSBE I-D45, encoding the translation start signal and the first 100 amino acids from the N-terminal end and intron sequences; see Table 1) and from fragment E1.5 (sequence wSBE I-D43, corresponding largely to the 3' untranslated sequence and containing intron sequences; see Table 1). The results obtained were consistent with one type of gene being isolated in different fragments in the different clones, as shown in Figure 7. PCR products were obtained from the clones λE1, 2, 9, 14, 27, 31 and 52 using primers that amplify near the 5' end of the gene (positions 5590-6162 of wSBE I-D4). These hybridised to wSBE I-D45. Sequencing showed no differences in sequence of a 200 bp product.
Analysis of the promoter for wSBE I-D4 allows us to investigate the presence of motifs previously described for promoters that regulate gene expression in the endosperm. Forde et al (1985) compared prolamin promoters, and suggested that the presence of a motif approximately 300 bp upstream of the transcription start point, called the endosperm box, was responsible for endosperm-specific expression. The endosperm box was subsequently considered to consist of two different motifs: the endosperm motif (EM) (canonical sequence TGTAAAG) and the GCN4 motif (canonical sequence G/ATGAG/CTCAT). The GCN4 box is considered to regulate expression according to nitrogen availability (Muller and Knudsen, 1993). The wSBE I-D4 promoter contains a number of imperfect EM-like motifs at approximately -100, -300 and -400 as well as further upstream. However, no GCN4 motifs could be found, which lends support to the idea that this motif regulates response to nitrogen, as starch biosynthesis is not as directly dependent on the nitrogen status of the plant as storage protein synthesis. Comparison of the promoters for wSBE I-D4 and D2 (Rahman et al, 1997) indicates that although there are no extensive sequence homologies there is a region of about 100 bp immediately before the first encoded methionine where the homology is 61% between the two promoters. In particular there is an almost perfect match in the sequence over twenty base pairs, CTCGTTGCTTCC/TACTCCACT (positions 4723-4742 of the wSBE I sequence), but the significance of this is hard to gauge, as it does not occur in the rice promoter for SBE I. The availability of more promoters for starch biosynthetic enzymes may allow firmer conclusions to be drawn. There are putative CAAT and TATA motifs at positions 4870 and 4830 respectively of the wSBE I-D4 sequence. The putative start of translation of the mRNA is at position 4900 of wSBE I-D4.
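A search for the two endosperm box motifs described above can be sketched as follows. The GCN4 motif is matched exactly, with its two ambiguous positions written as character classes, and EM-like motifs are found by allowing a limited number of mismatches to the canonical TGTAAAG. The test sequence toy_promoter is hypothetical and is not taken from the wSBE I-D4 promoter; only the forward strand is scanned, and the mismatch threshold is an assumption.

```python
import re

EM_CANONICAL = "TGTAAAG"
# G/ATGAG/CTCAT with ambiguous positions as character classes;
# the lookahead lets overlapping matches be reported.
GCN4_PATTERN = re.compile(r"(?=([GA]TGA[GC]TCAT))")


def find_gcn4(seq):
    """1-based start positions of exact GCN4 motif matches (forward strand)."""
    return [m.start() + 1 for m in GCN4_PATTERN.finditer(seq.upper())]


def find_em_like(seq, max_mismatches=1):
    """(1-based position, matched text, mismatches) for EM-like motifs."""
    seq = seq.upper()
    width = len(EM_CANONICAL)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        mismatches = sum(a != b for a, b in zip(window, EM_CANONICAL))
        if mismatches <= max_mismatches:
            hits.append((i + 1, window, mismatches))
    return hits


# Hypothetical sequence for illustration only.
toy_promoter = "CCTGTAAAGTTTGAAAAGTCCATGAGTCATGG"

print("EM-like motifs:", find_em_like(toy_promoter))
print("GCN4 motifs:", find_gcn4(toy_promoter))
```

In the hypothetical sequence, the imperfect hit TGAAAAG differs from the canonical endosperm motif at one position, which is the kind of EM-like motif referred to in the text.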
Figure 5 shows the structure of the wSBE I-D4 gene, compared with the genes from rice and wheat (Kawasaki et al, 1993; Rahman et al, 1997). The rice SBE I has 14 exons compared with 13 for wSBE I-D4 and 10 for wSBE I-D2.
There is good conservation of exon-intron structure between the three genes, except at the extreme 5' end. In particular the sizes of intron 1 and intron 2 are very different between rice SBE I and wSBE I-D4.
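Exon and intron sizes for wSBE I-D4 follow directly from the exon coordinates listed in part A of Table 1. The sketch below derives them, assuming that the start and end positions are inclusive; the variable names are illustrative only.

```python
# Exon (start, end) positions in the wSBE I-D4 sequence, from Table 1, part A.
# Positions are treated as inclusive (an assumption).
exons = [
    (4890, 4987), (5082, 5149), (5524, 5731), (5819, 5888),
    (6149, 6318), (6519, 7424), (7744, 7860), (8015, 8077),
    (8562, 8670), (9137, 9237), (9421, 9488), (9580, 9661),
    (9781, 9897), (9990, 10480),
]

exon_lengths = [end - start + 1 for start, end in exons]

# An intron lies between the end of one exon and the start of the next.
intron_lengths = [
    next_start - prev_end - 1
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
]

gene_span = exons[-1][1] - exons[0][0] + 1

print("exon lengths:", exon_lengths)
print("intron lengths:", intron_lengths)
print("total exon length:", sum(exon_lengths))
print("span from first to last exon:", gene_span)
```

With these coordinates introns 1 and 2 come out at 94 bp and 374 bp, and the exons span about 5.6 kb of genomic sequence.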
Example 9 Isolation of cDNA for SBE I
Using the maize starch branching enzyme I cDNA as a probe (Baba et al, 1991), 10 positive plaques were recovered by screening approximately 10^5 plaques from a wheat endosperm cDNA library prepared from the cultivar Rosella, as described in Example 4. On purifying and sequencing these plaques it was clear that even the longest clone (BED5, 1822 bp) did not encode the N-terminal sequence obtained from protein analysis. Degenerate primers based on the wheat endosperm SBE I protein N-terminal sequence (Morell et al, 1997) and the sequence from BED5 were then used to amplify the 5' region: this produced a cDNA clone termed BED3 (Table 1 and Figure 8). This cDNA clone overlapped extensively and had 100% sequence identity with BED5 and BED4 (Figure 8). As almost the entire protein N-terminal sequence had been included in the primer sequence design, this did not provide independent evidence of the selection of a cDNA sequence in the endosperm that encoded the protein sequence of the main form of SBE I. Using BED3 to screen a second cDNA library produced BED2, which is shorter than BED3 but confirmed the BED3 sequence at 100% identity between positions 169 and 418 (Figure 8 and Table 1). In addition the entire cDNA sequence for BED3 could be detected at a 100% match in the genomic clone λE1.
Primers based on the putative transcription start point combined with a primer based on the incomplete cDNAs recovered were then used to obtain a PCR product from total endosperm RNA by reverse transcription. This led to the isolation of the cDNA clone, BED1, of 300 bp, whose location is shown in Figure 8. By analysing this product, a sequence was again obtained that could be found exactly in the genomic clone λE1, and which overlapped precisely with BED3.
The N-terminal of the protein matches that of SBE I isolated from wheat endosperm by Morell et al (1997), and thus the wSBE I-D4 cDNA represents the gene for the predominant SBE I isoform expressed in the endosperm. The encoded protein is 87 kDa; this is similar to proteins encoded by maize (Baba et al, 1991) and rice (Nakamura et al, 1992) cDNAs for SBE I and is distinct from the wSBE I-D2 cDNA described previously, in which the encoded protein was 74 kDa (Rahman et al, 1997).
Five cDNA clones were sequenced and their sequences were assembled into one contiguous sequence using a GCG program (Devereaux et al, 1984). The arrangement of these sequences is illustrated in Figure 8, the nucleotide sequence is shown in SEQ ID No:5, and the deduced amino acid sequence is shown in SEQ ID No:6. The intact cDNA sequence, wSBE I-D4 cDNA, is 2687 bp and contains one large open reading frame (ORF), which starts at nucleotides 11 to 13 and ends at nucleotides 2432 to 2434. It encodes a polypeptide of 807 amino acids with a molecular weight of 87 kDa. Comparison of the amino acid sequence encoded by wSBE I-D4 cDNA with that encoded by maize and rice SBE I cDNAs showed that there is 75-80% identity between any two of these sequences at the nucleotide level and almost at the amino acid level. Alignment of these three polypeptide sequences, as shown in Figure 4, along with the deduced sequences for pea, potato and wSBE I-D2 type cDNA, indicated that the sequences in the central region are highly conserved, and sequences at the 5' end (about 80 amino acids) and the 3' end (about 60 amino acids) are variable.
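The length of the encoded polypeptide can be checked from the ORF coordinates given above. In this sketch the ORF is taken to run from the first nucleotide of the start codon (11) to the last nucleotide of the stop codon (2434). The average residue mass of 110 Da is a common rule of thumb and is an assumption here, not a value from the text, so the mass is only a rough cross-check of the stated 87 kDa.

```python
orf_first_nt = 11     # first nucleotide of the start codon (from the text)
orf_last_nt = 2434    # last nucleotide of the stop codon (from the text)
AVERAGE_RESIDUE_DA = 110  # ASSUMPTION: rule-of-thumb average residue mass

orf_length_nt = orf_last_nt - orf_first_nt + 1
assert orf_length_nt % 3 == 0, "ORF length should be a whole number of codons"

codons = orf_length_nt // 3
residues = codons - 1  # the stop codon does not encode an amino acid
approx_mass_kda = residues * AVERAGE_RESIDUE_DA / 1000

print("ORF length (nt):", orf_length_nt)
print("codons including stop:", codons)
print("amino acid residues:", residues)
print(f"approximate mass (kDa): {approx_mass_kda:.1f}")
```

The residue count agrees with the 807 amino acids stated, and the rough mass estimate is close to the stated molecular weight.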
Svensson et al (1994) indicated that there were several invariant residues in sequences of the α-amylase super-family of proteins to which SBE I belongs. In the sequence of maize SBE I these are in motifs commencing at amino acid residue positions 341, 415, 472, 537 respectively; these are also encoded in the wSBE I-D4 sequence (SEQ ID No:9), further supporting the view that this gene encodes a functional enzyme. This is in contrast to the results with the wSBE I-D2 gene, where three of the conserved motifs appear not to be encoded (Rahman et al, 1997).
There is about 90% sequence identity in the deduced amino acid sequence between wSBE I-D4 cDNA and rice SBE I cDNA in the central portion of the molecule (between residues 160 and 740 for the deduced amino acid product from wSBE I-D4 cDNA). The sequence identity of the deduced amino acid sequence of the wSBE I-D4 cDNA to the deduced amino acid sequence of wSBE I-D2 is somewhat lower (85% for the most conserved region, between residues 285 and 390 for the deduced product of wSBE I-D4 cDNA). Surprisingly, however, wSBE I-D4 cDNA is missing the sequence that encodes amino acids at positions 30 to 58 in rice SBE I (see Figure 4).
This corresponds to residues within the transit peptide of rice SBE I. A corresponding sequence also occurs in the deduced amino acid sequence from maize SBE I (Baba et al, 1991) and wSBE I-D2 type cDNA (Rahman et al, 1997).
Consequently the transit sequence encoded by wSBE I-D4 cDNA is unusally short, containing only 38 amino acids, compared with 55-60 amino acids deduced for most starch biosynthetic enzymes in cereals (see for example Ainsworth, 1993; Nair et al, 1997). .The wSBE I-D4 gene does contain this sequence, but this does not appear to be transcribed into the major species of RNA from this gene, although it 'can be detected at low relative abundance. This raises the possibility of alternative splicing of the wSBE I-D4 transcript, and also the question of the relative efficiency of translation/transport of the two isoforms. The possibility of alternative splicing in both rice and wheat has been considered for soluble starch synthase (Baba et al,1993 Rahman et al, 1995). Alternative splicing of soluble starch synthase would give a transit sequence of 40 amino acids, which is the same length proposed for the product of wSBE I-D4 cDNA.
We have previously used probes based on exons 4, and 6 (E7.8 and El.l, see Rahman et al., 1997) of wSBE-D2 to probe wheat and T. tauschii genomic DNA cleaved with PvuII and BamHI respectively. This region is highly conserved within rice SBE I, wSBE I-D2 and wSBE I-D4 and produced ten bands with wheat DNA and five with T. tauschii DNA. Neither PvuII nor BamHI cleaved within the probe sequences, suggesting that each band represented a single type of SBE I gene. We have described four SBE I genes from T. tauschii: wSBE I-D1, wSBE I-D2, wSBE I-D3 and wSBE I-D4 (Rahman et al, WO 99/14314 PCT/AU98/00743 34 1997 and this specification), and so we may have accounted for most of the genes in T. tauschii and, by extension, the genes from the D genome of wheat. In wheat, at least two hybridising bands could be assigned to each of chromosomes 7A, 7B and 7D.
Example 10 Tissue specificity and expression during endosperm development
The 300 bp of 3' untranslated sequence of wSBE I-D4 cDNA does not show any homology with either the wSBE I-D2 type cDNA that we have described earlier (Rahman et al, 1997) or with BE-I from rice, as shown in Figure. We have called this sequence wSBE I-D43C (see SEQ ID No:9). It seemed likely that wSBE I-D43C would be a specific probe for this class of SBE-I, and thus it was used to investigate the tissue specificity. Hybridization of RNA from endosperm of hexaploid T. tauschii cultures with SBE I, SBE II, SSS I, DBE I, wheat actin, and wheat ribosomal RNA was examined.
RNA was purified at various numbers of days after anthesis from plants grown with a 16 h photoperiod at 13 °C (night) and 18 °C (day). The age of the endosperms from which RNA was extracted, in days after anthesis, is given above the lanes in the blot. Equivalent amounts of RNA were electrophoresed in each lane. The probes used are identified in Tables 1 and 2.
The results are shown in Figures 9a to 9g. An RNA species of about 2700 bases in size was found to hybridise.
This is very close to the size of the wSBE I-D4 cDNA sequence. RNA hybridising to wSBE I-D43C is most abundant at the mid-stage of endosperm development, as shown in Figure 9a, and in field grown material is relatively constant during the period 12-18 days, the time at which there is rapid starch and storage protein accumulation (Morell et al, 1995).
The sequence contained within the wSBE I-D4 gene appears to be expressed only in the endosperm (Figure 9a, Figure 9b). We could not detect any expression in the leaf.
This could be because another isoform is expressed in the leaf, and/or because the amount of SBE I present in the leaf is much less than what is required in the endosperm.
Isolation of SBE I clones from a leaf cDNA library would enable this question to be resolved.
Example 11 Intron-Exon Structure of SBE I
By comparison of the cDNA sequence of SBE I (Repellin et al, 1997) with that of wSBE I-D4 we can deduce the intron-exon structure of the gene for the major isoform of SBE I that is found in the endosperm. The structure contains 14 exons compared to 14 for rice (Kawasaki et al, 1993). These 14 exons are spread over 6 kb of sequence, a distance similar to that found in both rice SBE I and wSBE I-D2. A dotplot comparison of the wSBE I-D4 sequence and the rice SBE I sequence, depicted in Figure 10, shows good sequence identity over almost the entire gene starting from about position 5100 of wSBE I-D4; the identity is poor over the first 5 kb of sequence, corresponding largely to the promoter sequences. The sequence identity over introns (about 60%) is lower than over exons (about ).
Example 12 Repeated Sequences in SBE I
Sequencing of wSBE I-D4 revealed a repeated sequence of at least 300 bp contained in a 2 kb fragment about 600 bp after the 3' end of the gene. We have called this sequence wSBE I-D4R (SEQ ID NO: ). This repeated sequence is within fragment E1.5 (Figure 3 and Table 1) and is flanked by non-repetitive sequences from the genomic clone. We have previously shown that the restriction pattern obtained by digesting λE1 with the restriction enzyme BamHI is also obtained when T. tauschii DNA is digested. Thus wSBE I-D4R is unlikely to be a cloning artefact. A search of the GenBank Database revealed that wSBE I-D4R shared no significant homology with any sequence in the database. Hybridisation experiments with wSBE I-D4R showed that all of the other SBE I-D4 type genomic clones (except number 29) contained this repeated sequence (data not shown). The wSBE I-D4R sequence was not highly repeated and occurred in the wheat genome with a similar frequency to the wSBE I-D4 sequence.
When wSBE I-D4R was used as the probe on wheat DNA from the nulli-tetra lines, four bands were obtained; two of these bands could be assigned to chromosome 7A and the others to chromosomes 7B and 7D (Figure 11). One of the two BamHI fragments from wheat DNA which could be assigned to chromosome 7A was distinct from the single band from chromosome 7A detected using wSBE I-D43 as the probe; the other three bands coincided in the autoradiograph with bands obtained with wSBE I-D43, and are likely to represent the same fragments. However, one of these fragments was distinct from the BamHI fragment that hybridised to the wSBE I-D43 sequence. In wSBE I-D4 (see SEQ ID No:9), the wSBE I-D43 sequence is only 300 bp upstream of wSBE I-D4R, and occurs in the same BamHI fragment. These results suggest that the wSBE I-D4R sequence can occur independently of wSBE I-D4 in the wheat genome.
Example 13 Isolation of Genomic Clones Encoding SBE II
Screening of a cDNA library, prepared from the wheat endosperm as described in Example 4, with the maize BE I clone (Baba et al, 1991) at low stringency led to the isolation of two classes of positive plaques. One class was strongly hybridising, and led to the isolation of wheat SBE I-D2 type and SBE I-D4 type cDNA clones, as described in Example 5 and in Rahman et al (1997). The second class was weakly hybridising, and one member of this class was purified. This weakly hybridising clone was termed SBE-9, and on sequencing was found to contain a sequence that was distinct from that for SBE I. This sequence showed greatest homology to maize BE II sequences, and was considered to encode part of the wheat SBE II sequence.
The screening of approximately 5 x 10^5 plaques from a genomic library constructed from T. tauschii (see Example 1) with the SBE-9 sequence led to the isolation of four plaques that were positive. These were designated wSBE II-D1 to wSBE II-D4 respectively, and were purified and analysed by restriction mapping. Although they all had different hybridization patterns with SBE-9, as shown in Figure 12, the results were consistent with the isolation of the same gene in different-sized fragments.
Example 14 Identification of the N-terminal sequence of SBE II
Sequencing of the SBE II gene contained in clone 2, termed wSBE II-D1 (see SEQ ID No:10), showed that it coded for the N-terminal sequence of the major isoform of SBE II expressed in the wheat endosperm, as identified by Morell et al (1997). This is shown in Figure 13.
Example 15 Intron-Exon Structure of the SBE II Gene
In addition to encoding the N-terminal sequence of SBE II, as shown in Example 10, the cDNA sequence reported by Nair et al (1997) was also found to have 100% sequence identity with part of the sequence of wSBE II-D1. Thus the intron-exon structure can be deduced, and this is shown in Figure 14. The positions of exons and other major structural features of the SBE II gene are summarized in Table 2.
Example 16 Number of SBE II Genes in T. tauschii and Wheat
Hybridisation of the SBE II conserved region with T. tauschii DNA revealed the presence of three gene classes.
However, in our screening we only recovered one class.
Hybridisation to wheat DNA indicated that the locus for SBE II was on chromosome 2, with approximately 5 loci in wheat; most of these appear to be on chromosome 2D, as shown in Figure.
Table 2 Positions of structural features in wSBE II-D1.
A. Positions of exons.
Exon number   Genomic start   Genomic finish
 1             1058            1336
 2             1664            1761
 3             2038            2279
 4             2681            2779
 5             2949            2997
 6             3145            3204
 7             3540            3620
 8             3704            3825
 9             4110            4188
10             4818            4939
11             5115            5234
12             6209            6338
13             6427            6549
14             6739            6867
15             7447            7550
16             8392            8536
17             9556            9703
18             9839            9943
19            10120           10193
20            10395           10550
21            10928           11002
22            11092           11475
B. Other structural features within the sequence
Feature                                   Position in wSBE II-D1 DNA
Putative initiation of translation         1214
Mature N-terminal sequence of SBE II       1681
wSBE II-D13                                11116 to 11448
Endosperm box like motif TGAAAAGT          521
Endosperm box like motif TGAAAGT           565
Endosperm box like motif CGAAAAT           669
Endosperm box like motif TAAATGT           768
CAAAAT motif                               784
TCAATT motif                               1108
TATAAA motif                               799
AATTAA motif                               1110
Example 17 Expression of SBE II
Investigation of the pattern of expression of SBE II revealed that the gene was only expressed in the endosperm. However the timing of expression was quite distinct from that of SBE I, as illustrated in Figures 9a, 9b and 9c.
SBE I gene expression is only clearly detectable from the mid-stage of endosperm development (10 days after anthesis in Figure 9b), whereas SBE II gene expression is clearly seen much earlier, in endosperm tissue at 5-8 days after anthesis (Figures 9a and 9c), corresponding to an early stage of endosperm development. The hybridisation of wheat endosperm mRNA with the actin and ribosomal RNA genes is shown as controls (Figures 9f and 9g, respectively).
Example 18 Cloning of Wheat Soluble Starch Synthase cDNA
Primers for amplification of SSS I were synthesised from a conserved sequence region identified by comparison of the nucleotide sequences encoding soluble starch synthases of rice and pea. A 300 bp RT-PCR product was obtained by amplification of cDNA from wheat endosperm at 12 days post anthesis. The 300 bp RT-PCR product was then cloned, and its sequence analysed. The comparison of its sequence with rice SSS cDNA showed about 80% sequence homology. The 300 bp RT-PCR product was 100% homologous to the partial sequence of a wheat SSS I in the database produced by Block et al (1997).
The 300 bp cDNA fragment of wheat soluble starch synthase thus isolated was used as a probe for the screening of a wheat endosperm cDNA library (Rahman et al, 1997).
Eight cDNA clones were selected. One of the largest cDNA clones (sm2) was used for DNA sequencing analysis, and gave a 2662 bp nucleotide sequence, which is shown in SEQ ID NO:14. A large open reading frame of this cDNA encoded a 647 amino acid polypeptide, starting at nucleotides 247 to 250 and terminating at nucleotides 2198 to 2200. The deduced polypeptide was shown by protein sequence analysis to contain the N-terminal sequence of a 75 kDa granule-bound protein (Rahman et al, 1995). This is illustrated in Figure 16. The location of the 75 kDa protein was determined for both the soluble fraction and the starch granule-bound fraction by the method of Denyer et al (1995). Thus this cDNA clone encoded a polypeptide comprising a 41 amino acid transit peptide and a 606 amino acid mature peptide (SEQ ID NO:12). The cleavage site LRRL was located at amino acids 36 to 39 of the transit peptide of this deduced polypeptide.
Comparison of wheat SSS I with rice SSS and potato SSS showed 87.4% and 75.9% homology, respectively, at the amino acid level and 74.7% and 58.1% homology, respectively, at the nucleotide level. Some amino acids in the N-terminal sequences of the SSS I of wheat and rice were conserved.
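Percentage figures of the kind quoted above are usually obtained by counting identical positions in a pairwise alignment. The following is a minimal sketch of that calculation only. The two short aligned peptides are hypothetical, invented for illustration, and are not wheat, rice or potato SSS sequences. The choice of denominator (aligned columns without gaps) is an assumption, since the specification does not state how its values were computed.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical aligned peptides (illustration only).
toy_a = "MAATGV-GA"
toy_b = "MAASGVAGA"
toy_identity = percent_identity(toy_a, toy_b)
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so values from different programs are not always directly comparable.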
Major features of the SSS I gene are summarized in Table 3.
Example 19 Isolation of Genomic Clone of Wheat Soluble Starch Synthase
Seven genomic clones were obtained with a 300 bp cDNA probe by screening approximately 5 x 10^5 plaques from a genomic DNA library of Triticum tauschii, as described above. DNA was purified from 5 of these clones and digested with BamHI and SacI. Southern hybridization analysis using the 300 bp cDNA as probe showed that these clones could be classified into two classes, as shown in Figure 17. One genomic clone, sg3, contained a long insert, and was digested with BamHI or SacI and subcloned into pBluescript KS+ vector.
Table 3 Comparison of exons and introns of soluble starch synthase I genes of wheat and rice
Identity of exons of wheat and rice soluble starch synthase I genes
Exon   wSSI-D1 (bp)   rSSI (bp)   identity (%)   start site (wSSI-D1)   stop site (wSSI-D1)
1a     255            113         57.52          -253                   0
1b     316            298         58.92          1                      316
2      356            356         82.87          1473                   1828
3      78             78          92.31          2746                   2823
4      125            125         90.40          2906                   3028
5      82             82          89.02          4113                   4194
6      174            174         93.10          4286                   4459
7      82             82          93.90          4562                   4643
8      92             92          92.39          4743                   4835
9      63             63          90.48          4959                   5021
10     90             90          82.22          5103                   5192
11     125            125         88.80          8594                   8718
12     109            109         91.74          8807                   8915
13     53             53          81.13          8992                   9044
14     40             41          80.00          9160                   9199
15a    159            113         79.65          9499                   9657
15b    392            539         46.46          9658                   10098
Identity of introns of soluble starch synthase I genes of wheat and rice
Intron   wSSI-D1 (bp)   rSSI (bp)   identity (%)   start site (wSSI-D1)   stop site (wSSI-D1)
1        1156           907         41.05          317                    1472
2        917            851         41.65          1829                   2745
3        82             87          45.12          2824                   2905
4        1084           835         48.50          3029                   4112
5        91             96          57.78          4195                   4285
6        102            189         52.48          4460                   4561
7        99             96          52.08          4644                   4742
8        123            110         45.46          4836                   4958
9        81             78          58.97          5022                   5102
10       3401           663         37.56          5193                   8593
11       88             124         56.82          8719                   8806
12       76             81          48.68          8916                   8991
13       115            135         45.22          9045                   9159
14       299            830         45.80          9200                   9498
Note: Exon 1a: non-coding region of exon 1. Exon 1b: coding region of exon 1.
Exon 15a: coding region of exon 15. Exon 15b: non-coding region of exon 15.
wSSI-D1: wheat soluble starch synthase I gene.
rSSI: rice soluble starch synthase I gene.
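The wSSI-D1 intron lengths in Table 3 can be cross-checked against the tabulated start and stop sites, since an inclusive interval has length stop - start + 1. The sketch below does this for the intron rows only. It assumes the fourteen intron rows are numbered 1 to 14 consecutively and that coordinates are 1-based and inclusive; the variable and function names are illustrative and do not come from the specification.

```python
# Start and stop sites of wSSI-D1 introns, copied from Table 3
# (assumed 1-based, inclusive coordinates; rows assumed to be introns 1..14).
INTRON_COORDS = {
    1: (317, 1472), 2: (1829, 2745), 3: (2824, 2905), 4: (3029, 4112),
    5: (4195, 4285), 6: (4460, 4561), 7: (4644, 4742), 8: (4836, 4958),
    9: (5022, 5102), 10: (5193, 8593), 11: (8719, 8806), 12: (8916, 8991),
    13: (9045, 9159), 14: (9200, 9498),
}

# Intron lengths reported in the wSSI-D1 column of Table 3.
REPORTED_LENGTHS = {
    1: 1156, 2: 917, 3: 82, 4: 1084, 5: 91, 6: 102, 7: 99,
    8: 123, 9: 81, 10: 3401, 11: 88, 12: 76, 13: 115, 14: 299,
}


def span_length(start, stop):
    """Length of an inclusive coordinate interval."""
    return stop - start + 1


def find_mismatches(coords, reported):
    """Return {intron: (computed, reported)} for rows where the two lengths differ."""
    result = {}
    for number, (start, stop) in coords.items():
        computed = span_length(start, stop)
        if computed != reported[number]:
            result[number] = (computed, reported[number])
    return result


intron_mismatches = find_mismatches(INTRON_COORDS, REPORTED_LENGTHS)
```

The same check can be applied to the exon rows, where a disagreement would flag either a transcription error or a different coordinate convention rather than prove which value is correct.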
These subclones were analysed by sequencing. The intron/exon structure of the sg3 rice gene is shown in Figure 18. The SSS I gene from T. tauschii is shown in SEQ ID No:13, while the deduced amino acid sequence is shown in SEQ ID NO:14.
Example 20 Northern Hybridization Analysis of the Expression of Genes Encoding Soluble Starch Synthase
Total RNAs were purified from leaves, pre-anthesis material, and various stages of developing endosperm at 5-8, 10-15 and 18-22 days post anthesis. Northern hybridization analysis showed that mRNAs encoding wheat SSS I were specifically expressed in developing endosperm.
Expression of these mRNAs in the leaves and pre-anthesis materials could not be detected by northern hybridization analysis under these experimental conditions. Wheat SSS I mRNAs were expressed at high levels from an early stage of endosperm development, 5-8 days post anthesis, and the expression level in endosperm at 10-15 days post anthesis was reduced.
These results are summarized in Figure 9a and Figure 9d.
Example 21 Genomic Localisation of Wheat Soluble Starch Synthase
DNA from chromosome engineered lines was digested with the restriction enzyme BamHI and blotted onto supported nitrocellulose membranes. A probe prepared from the 3' end of the cDNA sequence, from positions 2345 to 2548, was used to hybridise to this DNA. The presence of a specific band was shown to be associated with the presence of chromosome 7A (Figure 19). These data demonstrate location of the SSS I gene on chromosome 7.
Example 22 Isolation of SSS I Promoter
We have isolated the promoter that drives this pattern of expression for SSS I. The pattern of expression for SSS I is very similar to that for SBE II: the SSS I gene transcript is detectable from an early stage of endosperm development until the endosperm matures. The sequence of this promoter is given in SEQ ID NO: .
Example 23 Isolation of the Gene Encoding Debranching Enzyme from Wheat
The sugary-1 mutation in maize results in mature dried kernels that have a glassy and translucent appearance; immature kernels accumulate sucrose and other simple sugars, as well as the water-soluble polysaccharide phytoglycogen (Black et al, 1966). Most data indicate that in sugary-1 mutants the concentration of amylose is increased relative to that of amylopectin. Analysis of a particular sugary-1 mutation (su1-Ref) by James et al (1995) led to the isolation of a cDNA that shared significant sequence identity with bacterial enzymes that hydrolyse the α-1,6-glucosyl linkages of starch, such as an isoamylase from Pseudomonas (Amemura et al, 1988), i.e. bacterial debranching enzymes.
We have now isolated a sequence amplified from wheat endosperm cDNA using the polymerase chain reaction (PCR). This sequence is highly homologous to the sequence for the sugary gene isolated by James et al, (1995). This sequence has been used to isolate homologous cDNA sequences from a wheat endosperm library and genomic sequences from Triticum tauschii.
Comparison of the deduced amino acid sequences of DBE from maize with spinach (Accession SOPULSPO, GenBank database), Pseudomonas (Amemura et al, 1988) and rice (Nakamura et al, 1997) enabled us to deduce sequences which could be useful in wheat. When these sequences were used as PCR amplification primers with wheat genomic DNA a product of 256 bp was produced. This was sequenced and was compared to the sequence of maize sugary isolated by James et al (1995). The results are shown in Figure 20a and Figure. This sequence has been termed wheat debranching enzyme sequence I (WDBE-I).
Use of WDBE-I to investigate a cDNA library constructed from wheat endosperm (Rahman et al, 1997) enabled us to isolate two cDNA clones which hybridise strongly to the WDBE-I probe. The nucleotide sequence of the DNA insert in the longest of these clones is given in SEQ ID No:16.
Use of WDBE-I to investigate a genomic library constructed from T. tauschii, as described above, has led to the isolation of four genomic clones, designated I1, I2, I3 and I4, respectively, which hybridised strongly to the WDBE-I sequence. These clones were shown to contain copies of a single debranching enzyme gene. The sequence of one of these clones, I2, is given in SEQ ID No:17. The intron/exon structure of the gene is shown in Figure 20c. Exons 1 to 4 were identified by comparison with the maize sugary-1 cDNA, while Exons 5 to 18 were identified by comparison with the cDNA sequence given in SEQ ID No:16. The major features of the DBE I gene are summarized in Table 4.
Hybridization of WDBE-I to DNA from T. tauschii indicates one hybridizing fragment (Figure 21a). The chromosomal location of the gene was shown to be on chromosome 7 through hybridisation to nullisomic/tetrasomic lines of the hexaploid wheat cultivar Chinese Spring (Figure 21b).
We have clearly isolated a sequence from the wheat genome that has high identity to the debranching enzyme cDNA of maize characterised by James et al (1995). The isolation of homologous cDNA sequences and genomic sequences enables further characterisation of the debranching enzyme cDNA and promoter sequences from wheat and T. tauschii. These sequences and the WDBE-I sequences shown herein are useful in the manipulation of wheat starch structure through genetic manipulation and in the screening for mutants at the equivalent sugary locus in wheat.
Figure 9e shows that the DBE I gene is expressed during endosperm development in wheat and that the timing of expression is similar to that of the SBE II and SSS I genes. Figure 9h shows that the full length mRNA for the gene (3.0 kb) is found only in the wheat endosperm.
Example 24 Transient assays of Promoter-GFP Fusions
DNA constructs
DNA constructs for transient expression assays were prepared by fusing sequences from the BEII and SSI promoters to the gene encoding the Green Fluorescent Protein (GFP). The GFP constructs contained the GFP gene described by Sheen et al. (1995). The nos 3' element (Bevan et al., 1983) was inserted 3' of the GFP gene. The plasmid vector (pWGEM_NZfp) was constructed by inserting the NotI to HindIII fragment from the following sequence:
5' GCGGCCGCTC CCTGGCCGAC TTGGCCGAAG CTTGCATGCC TGCAGGTCGA CTCTAGAGGA TCCCCGGGTA CCGAGCTCGA ATTCATCGAT GATATCAGAT CCGGGCCCTC TAGATGCGGC CGCATGCATA AGCTT 3'
into the NotI and HindIII sites of pGem-13Zf(-) vector (Promega). The sequences at the junctions of wSSSIpro1 and wSSSIpro2 with GFP were identical, and included the junction sequence:
5'....CGCGCGCCCA CACCCTGCAG GTCGACTCTA GAGGATCCAT GGTGAGCAAG 3'
The sequence at the junction of wsbeIIpro1 and GFP was:
5' GCGACTGGCT GACTCAATCA CTACGCGGGG ATCCATGGTG AGCAAGGGCG 3'
The sequence at the junction of wsbeIIpro2 and GFP was:
5' GGACTCCTCT CGCGCCGTCC TGAGCCGCGG ATCCATGGTG AGCAAGGGCG 3'
The structures of the constructs are shown in Figures 22a to 22f.
Table 4 Structural features of wDBEI-D1
A. Position of exons
Exon number   Start position   End position   Comments
 1            1890             2241           (deduced by comparison with maize)
 2            2342             2524           (deduced by comparison with maize)
 3            2615             2707           (deduced by comparison with maize)
 4            3016             3168           (deduced by comparison with maize)
 5            3360             3436
 6            4313             4454
 7            4526             4633
 8            4734             4819
 9            5058             5129
10            5202             5328
11            5558             5644
12            6575             6671
13            7507             7661
14            8450             8527
15            8739             8823
16            8902             8981
17            9114             9231
18            Still being sequenced
Note that following nucleotides 3330, 6330 and 8419 there may be short regions of DNA not yet sequenced.
B.
Feature                             Position
CAAAAT motif                        1833
TCAAT motif                         1838
ATAAATAA motif                      1804
Endosperm box like motif TAAAACG    1463
Preparation of target tissue
All explants used for transient assay were from the hexaploid wheat cultivar, Milliwang. Endosperm (10 to 12 days after anthesis), embryos (12 to 14 days after anthesis) and leaves (the second leaf from the top of plants containing 5 leaves) were used. Developing seed or leaves were collected, surface sterilized with 1.25% w/v sodium hypochlorite for 20 minutes and rinsed 8 times with sterile distilled water. Endosperms or embryos were carefully excised from seed in order to avoid contamination with surrounding tissues. Leaves were cut into 0.5 cm x 1 cm pieces. All tissues were aseptically transferred onto SD1SM medium, which is an MS based medium containing 1 mg/L 2,4-D, 150 mg/L L-asparagine, 0.5 mg/L thiamine, 10 g/L sucrose, 36 g/L sorbitol and 36 g/L mannitol. Each agar plate contained either 12 endosperms, 12 embryos or 2 leaf segments.
Preparation of gold particles and bombardment
Five µg of each plasmid was used for the preparation of gold particles, as described by Witrzens et al. (1998). Gold particle-DNA suspension in ethanol (10 µl) was used for each bombardment using a Bio-Rad helium-driven particle delivery system, PDS-1000.
GFP assay
The expression of GFP was observed after 36 to 72 hours incubation using a fluorescence microscope. Two plates were bombarded for each construct. The numbers of expressing regions were recorded for each target tissue, and are summarized in Table 5. The intensity of the expression of GFP from each of the promoters was estimated by visual comparison of the light intensity emitted, and is summarized in Table 6.
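The per-plate averages and standard deviations listed in Table 5 can be recomputed from the explant counts. The sketch below does so for two embryo plates whose counts are copied from Table 5 (plates 3 and 17). It assumes the reported S.D. is the sample standard deviation (n - 1 denominator), which is an inference from the numbers rather than something the specification states; the names used are illustrative.

```python
from statistics import mean, stdev

# Numbers of GFP-expressing regions per explant, copied from Table 5.
PLATE_COUNTS = {
    3: [97, 79, 77, 101, 121, 176, 89, 129, 139, 212, 131, 138],   # embryo, pact_jsgfg_nos
    17: [23, 67, 63, 4, 12, 14, 9, 8, 29, 19, 24, 51],              # embryo, psbeIIpro1gfpNOT
}


def summarise(counts):
    """Return (average, sample standard deviation), each rounded to one decimal place."""
    return round(mean(counts), 1), round(stdev(counts), 1)


plate_summaries = {plate: summarise(counts) for plate, counts in PLATE_COUNTS.items()}
```

The large standard deviations relative to the means reflect the uneven distribution of expressing regions between explants that is typical of particle bombardment.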
The DNA construct containing GFP without a promoter region (pZLGFPNot) gave no evidence of transient expression in embryo (panel l) or leaf (panel r), and extremely weak and sporadic expression in endosperm (panel f); this construct gave only very weak expression in endosperm with respect to the number (Table 5) and intensity (Table 6) of transient expression regions. The constructs pwsssIpro1gfpNOT (panels b, h and n), psbeIIpro1gfpNOT (panels d, j and p) and psbeIIpro2gfpNOT (panels e, k and q) yielded low numbers (Table 5) of strongly (Table 6) expressing regions in leaves, and there was a very uneven distribution of expressing regions between target leaf pieces (Table 5). pwsssIpro2gfpNOT (panels c, i and o) gave no evidence of transient expression in leaves (Table 5). These results show that each of the promoter constructs is able to drive the transient expression of GFP in the grain tissues, endosperm and embryo. The ability of the short SSI promoter (pwsssIpro1gfpNOT, containing 1042 bp 5' of the ATG translation start site) to drive expression in leaves (panel n) contrasts with the inability of the long SSI promoter (pwsssIpro2gfpNOT, containing a 3914 base pair region 5' of the ATG translation start site, panel o) to do so, suggesting that regions controlling tissue specificity are located between -3914 and -1042 of the SSI promoter region (SEQ ID NO: ).
Table 5 Transient Assay of GFP based constructs
                                     Explant Number
Tissue     Construct        Plate    1    2    3    4    5    6    7    8    9   10   11   12   Ave.  S.D.
Endosperm  pact_jsgfg_nos       1    0    0    1  158  152  148    0    2   12  159   95   64   65.9  71.6
Endosperm  pact_jsgfg_nos       2    3   13    2   83   18    9    6  188    0  102    5    3   36.0  58.6
Embryo     pact_jsgfg_nos       3   97   79   77  101  121  176   89  129  139  212  131  138  124.1  40.1
Embryo     pact_jsgfg_nos       4   18   39   89   82    7   52   94  147   19   66  106   85   67.0  41.6
Leaf       pact_jsgfg_nos       5    0    2    0    3    0    0                                  0.8   1.3
Leaf       pact_jsgfg_nos       6    0    0    0    1    0    0                                  0.2   0.4
Leaf       pact_jsgfg_nos       7    3    0    0    2    0    3                                  1.3
Endosperm  pZLGFPNot            8   13    0    4    0   14    0    0    0    0    0    0    1    2.7   5.2
Endosperm  pZLGFPNot            9    0    0    0    0   14    0    0    5    3    4    6    0    2.7   4.2
Embryo     pZLGFPNot           10    0    0    0    0    0    0    0    0    0    0    0    0    0.0   0.0
Embryo     pZLGFPNot           11    0    0    0    0    0    0    0    0    0    0    0    0    0.0   0.0
Leaf       pZLGFPNot           12    0    0    0    0    0    0                                  0.0   0.0
Leaf       pZLGFPNot           13    0    0    0    0    0    0                                  0.0   0.0
Leaf       pZLGFPNot           14    0    0    0    0    0    0                                  0.0   0.0
Endosperm  psbeIIpro1gfpNOT    15  111    0   77  142    0  127    7   35   39  191   95   34   71.5  62.3
Endosperm  psbeIIpro1gfpNOT    16   21  101    0    0   34  164  102    5   39  125  147  114   71.0  60.6
Embryo     psbeIIpro1gfpNOT    17   23   67   63    4   12   14    9    8   29   19   24   51   26.9  21.7
Embryo     psbeIIpro1gfpNOT    18   92  144   64   36   31   23  106   43   11    1    9    7   47.3  45.4
Leaf       psbeIIpro1gfpNOT    19    0    0    0    0    0    0                                  0.0   0.0
Leaf       psbeIIpro1gfpNOT    20    6    0    0    0    0    0                                  1.0   2.4
Leaf       psbeIIpro1gfpNOT    21    0    0    0    0    3    5                                  1.3   2.2
Endosperm  psbeIIpro2gfpNOT    22   12   18    3    0    0   21   13    0   10   11   10    0    8.2   7.4
Endosperm  psbeIIpro2gfpNOT    23   24   25   13   68   11    0    0    0    1    0    0    0   11.8  20.1
Embryo     psbeIIpro2gfpNOT    24    9   13    4    7    6   21    0    9    3    5    2    4    6.9   5.7
Embryo     psbeIIpro2gfpNOT    25    5    0    3    5   23    4    3    1    8   12    8   13    7.1   6.4
Leaf       psbeIIpro2gfpNOT    26    0    2    0    0    0    0                                  0.3   0.8
Leaf       psbeIIpro2gfpNOT    27    0    5    0    8    0    0                                  2.2
Leaf       psbeIIpro2gfpNOT    28    0    0    0    0    0    0                                  0.0   0.0
Endosperm  pwsssIpro1gfpNOT    29  121    0    0   28    0    4   81   23    0    2    0    2   21.8  39.2
Endosperm  pwsssIpro1gfpNOT    30    3    0    0   92   12    0    0  102    4  159   41   24   36.4  52.8
Embryo     pwsssIpro1gfpNOT    31  112  106   74   54   33   73   77   49   42   38   59   46   63.6  25.6
Embryo     pwsssIpro1gfpNOT    32   97   48  110   22  191  112   53    6    9  145    6   10   67.4  62.4
Leaf       pwsssIpro1gfpNOT    33    0    0    0    0    0    0                                  0.0   0.0
Leaf       pwsssIpro1gfpNOT    34    0    0    0    0    0    0                                  0.0   0.0
Leaf       pwsssIpro1gfpNOT    35   12    0    0    0    0    0                                  2.0   4.9
Endosperm  pwsssIpro2gfpNOT    36    0    0   18   81    0    0    0    6    0    0    1    0    8.8  23.3
Endosperm  pwsssIpro2gfpNOT    37    0   18   14    6   63    8    8   23   79    7   46   51   26.9  26.1
Embryo     pwsssIpro2gfpNOT    38   15    7   14   57    8    3   26   10   47   34   47    0   22.3  19.4
Embryo     pwsssIpro2gfpNOT    39    9   15   48  103   31   22  107   22   27   82   51   63   48.3  33.8
Leaf       pwsssIpro2gfpNOT    40    0    0    0    0    0    0                                  0.0   0.0
Leaf       pwsssIpro2gfpNOT    41    0    0    0    0    0    0                                  0.0   0.0
Leaf       pwsssIpro2gfpNOT    42    0    0    0    0    0    0                                  0.0   0.0
Table 6 Comparison of the Intensities of Transient Expression
Tissue     pact_jsgfg_nos  pwsssIpro1gfpNOT  pwsssIpro2gfpNOT  psbeIIpro1gfpNOT  psbeIIpro2gfpNOT  pZLGFPNot
Endosperm  10              4                 2.5               3.5               1.5
Embryo     10              5.5               5.5               1.5               1                 0
Leaf       10              20                0                 10                10                0
All intensities are relative to pact_jsgfg_nos transient expression in the target tissue. Relative intensities were independently scored by three researchers and averaged.
Example 25 Stable transformation of rice
Stable transformation of rice using Agrobacterium was carried out essentially as described by Wang et al. (1997). The plasmids containing the target DNA constructs containing the promoter-reporter gene fusions are shown in Figure 23. These plasmids were transformed into Agrobacterium tumefaciens AGL1 by electroporation. The transformed cells were cultured on selection plates of LB media containing rifampicin (50 mg/L) and spectinomycin (50 mg/L) for 2 to 3 days, and then gently suspended in 10 ml NB liquid medium containing 100 µM acetosyringone and mixed well. Embryogenic rice calli (2 to 3 months old) derived from mature seeds were immersed in the A. tumefaciens AGL1 suspension. After 3 to 10 minutes the A. tumefaciens AGL1 suspension medium was removed, and the rice calli were transferred to NB medium containing 100 µM acetosyringone for 48 h. The co-cultivated calli were washed 7 times with sterile Milli-Q H2O containing 150 mg/L timentin to remove all Agrobacterium, plated on to NB medium containing 150 mg/L timentin and 30 mg/L hygromycin, and cultured for 3 to 4 weeks. Newly-formed buds on the surface of rice calli were excised and plated onto NB Second Selection medium containing 150 mg/L timentin and 50 mg/L hygromycin. After 4 weeks of proliferation calli were plated onto NB Pre-Regeneration medium containing 150 mg/L timentin and 50 mg/L hygromycin, and cultured for 2 weeks. The calli were then transferred on to NB-Regeneration medium containing 150 mg/L timentin and 50 mg/L hygromycin for 3 to 4 weeks. Once shooting occurs, shoots are transferred onto rooting medium (MS based) containing 50 mg/L hygromycin. Once adequate root formation occurs, the seedlings are transferred to soil, grown in a misting chamber for 1-2 weeks, and grown to maturity in a containment glasshouse.
Example 26 Use of probes from SSS I, SBE I, SBE II and DBE sequences to identify null or altered alleles for use in breeding programmes
DNA primer sets were designed to enable amplification of the first 9 introns of the SBE II gene using PCR. The design of the primer sets is illustrated in Figure 24. Primers were based on the wSBE II-D1 sequence (deduced from Figure 13b and Nair et al, 1997; Accession No. Y11282) and were designed such that intron sequences in the wSBE II sequence were amplified by PCR. These primer sets individually amplify the first 9 introns of SBE II. One primer (sr913F) contained a fluorescent label at the 5' end.
Following amplification, the products were digested with the restriction enzyme DdeI and analysed using an ABI 377 DNA Sequencer with Genescan(TM) fragment analysis software. One primer set, for intron 5, was found to amplify products from each of chromosomes 2A, 2B and 2D of wheat. This is shown in Figure 25, which illustrates results obtained with various wheat lines, and demonstrates that products from each of the wheat genomes from diverse wheats were amplified, and that therefore lines lacking the wSBEII gene on a specific chromosome could be readily identified. Lane (iii) illustrates the identification of the absence of the A genome wSBEII gene from the hexaploid wheat cultivar Chinese Spring ditelosomic line 2AS.
Figure 26 compares results of amplification with an Intron 10 primer set for various nullisomic/tetrasomic lines of the hexaploid wheat Chinese Spring. Fluorescent dUTP deoxynucleotides were included in the amplification reaction. Following amplification, the products were digested with the restriction enzyme DdeI and analysed using an ABI 377 DNA Sequencer with Genescan(TM) fragment analysis software. In lane (i), Chinese Spring ditelosomic line 2AS, a 300 base product is absent; in lane (ii), N2BT2A, a 204 base product is absent; and in lane (iii), N2DT2B, a 191 base product is absent. These results demonstrate that the absence of specific wSBEII genes on each of the wheat chromosomes can be detected by this assay. Lines lacking wSBEII forms can be used as parental lines in breeding programmes for generation of new lines in which expression of SBE II is diminished or abolished, with consequent increase in amylose content of the wheat grain. Thus a high amylose wheat can be produced.
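The logic of the Figure 26 assay, in which each homoeologous chromosome contributes a product of characteristic size and a missing product identifies the missing gene copy, can be sketched as below. The assignment of the 300, 204 and 191 base products to chromosomes 2A, 2B and 2D is inferred from the lines in which each product is absent; the size tolerance and all names are illustrative assumptions and are not part of the described method.

```python
# Product size (bases) -> chromosome, inferred from the lines described for Figure 26:
# absent in ditelosomic 2AS -> 2A, absent in N2BT2A -> 2B, absent in N2DT2B -> 2D.
EXPECTED_PRODUCTS = {300: "2A", 204: "2B", 191: "2D"}


def missing_chromosomes(observed_sizes, tolerance=1):
    """List chromosomes whose diagnostic product is not among the observed fragment sizes.

    tolerance is an assumed allowance (in bases) for fragment sizing error.
    """
    missing = []
    for size, chromosome in EXPECTED_PRODUCTS.items():
        found = any(abs(size - observed) <= tolerance for observed in observed_sizes)
        if not found:
            missing.append(chromosome)
    return missing


# Fragment sizes as they might be called for the three lines described above.
ditelo_2as_call = missing_chromosomes([204, 191])
n2bt2a_call = missing_chromosomes([300, 191])
n2dt2b_call = missing_chromosomes([300, 204])
```

A line returning an empty list carries all three products, while a line returning more than one chromosome would be a candidate for combining null alleles.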
Table 7 shows example primer pairs for SBE I, SSS I and DBE I which can identify genes from individual wheat genomes and could therefore be used to identify lines containing null or altered alleles. Such tests could be used to enable the development of wheat lines carrying null mutations in each of the genomes for a specific gene (for example SBE I, SSI or DBE I) or combinations of null alleles for different genes.
Table 7 PCR Primers for Starch Biosynthesis Genes
Gene    Forward primer   Forward primer sequence           Reverse primer   Reverse primer sequence           Temp (°C)   Product (bp)
SBE I   ZLE1 5d          GGC GGC GGC AAT GTG CGG CTG AG    ZLBE1            CCA GAT CGT ATA TCG GAA GGT CG    57.3        A, 625; B, 600; D, 550
SSS I   sssE01F          GAA CTC GCG CCC GAC CTC CT        ZLSg7            AGC CAC GAT TAT GCT GTC GAT GG    55.0        A, 450; B, 450; 630
        sssE14F          TTC TCA CCG CTA ACC GTG GAC       ZLSm19           GTC TAC ATG ACG TAG GGT TGG TC    55.8        B, 400; D, 500; no A product
DBE I   DBEE17F          TGG TCT GAG AAT AGC CGA TTC       sr1536F          AAGGCCACATAGATCTCG                56.8        B, 190; D, 190; A, 160. Nonspecific product 220 bp
Temp: annealing temperature; bp: length of the product in base pairs.
It will be apparent to the person skilled in the art that while the invention has been described in some detail for the purposes of clarity and understanding, various modifications and alterations to the embodiments and methods described herein may be made without departing from the scope of the inventive concept disclosed in this specification.
References cited herein are listed on the following pages, and are incorporated herein by this reference.
REFERENCES
Ainsworth, Clark, J. and Balsdon, J.
Plant Molecular Biology, 1993 22 67-82
Amemura, Chakraborty, Fujita, Noumi, T. and Futai, M.
J. Biol. Chem., 1988 263 9271-9275
Baba, Kimura, Mizuno, Etoh, Ishida, Y., Shida, O. and Arai, Y.
Biochem. Biophys. Res. Commun., 1991 181 87-94
Baba,T.; Nishihara,M.; Mizuno,K.; Kawasaki,T.; Shimada,H.; Kobayashi,E.; Ohnishi,S.; Tanaka,K.; Arai,Y.
Plant Physiol, 1993, 103 565-573
Ball,S.; Guan,H.; James,M.; Myers,A.; Keeling,P.; Mouille,G.; Buléon,A.; Colonna,P.; Preiss,J.
Cell, 1996, 86 349-352
Bevan, Barnes, and Chilton, M.
Nucleic Acids Research, 1983, 11 369-385
Black, Loerch, McArdle, F.J. and Creech, R.G.
Genetics, 1966 53 661-668
Block, Loerz, Lutticke, S.
Genbank database Accession number U48227
Burton, Bewley, Smith, Bhattacharyya, M.K., Tatge, Ring, S., Bull, Hamilton, W.D.O. and Martin, C.
The Plant Journal, 1995 7 3-15
Cangiano, La Volpe, Paulsen, P. and Kreiberg, J.D.
Plant Physiology, 1993 102 1053-1054
Clarke, Mukai, Y. and Appels, R.
Chromosoma, 1996 105 269-275
Devereaux, Haeberli, P. and Smithies, O.
Nucleic Acids Res., 1984 12 387-395
Denyer, Hylton, Jenner, C.F. and Smith, A.M.
Planta, 1995 196 256-265
Doherty, Lindeman, Trent, Graham, M.W. and Woodcock, D.M.
Gene, 1992 124 113-120
Dry, Smith, Edwards, Bhattacharyya, Dunn, Martin, C.
Plant J, 1992 2 193-202
Edwards, Marshall, Sidebottom, Visser, R.G.F., Smith, Martin, C.
Plant J, 1995 8 283-294
Fisher, Boyer, C.D. and Hannah, L.C.
Plant Physiology, 1993 102 1045-1046
Forde, Heyworth, Pywell, J. and Forde, M.
Nucleic Acids Research, 1985 13 7327-7339
Gill, B.S. and Appels, R.
Plant Syst. Evol., 1988 160 77-90
Higgins, Zwar, Jacobsen, J.V.
Nature, 1976, 260 166-168
Khandjian, E.W.
Bio/Technology, 1987, 5 165-167
Jahne, Lazzeri, Jager-Gussen, M. and Lorz, H.
Theor. Appl. Genet., 1991 82 47-80
James, Robertson, D.S. and Myers, A.M.
Plant Cell, 1995 7 417-429
Jolly, Glenn, G.M. and Rahman, S.
Proc. Natl Acad. Sci., 1996 93 2408-2413
Kawasaki, Mizuno, Baba, T. and Shimada, H.
Molec. Gen. Genet., 1993 237 10-16
Lagudah, Appels, R. and McNeill, D.
Genome, 1991 34 387-395
Lazzeri, Brettschneider, Luhrs, R. and Lorz, H.
Theor. Appl. Genet., 1991 81 437-444
Maniatis, Fritsch, E.F. and Sambrook, J.
Molecular Cloning. A Laboratory Manual. New York. Cold Spring Harbor Laboratory, 1982
Marshall,J.; Sidebottom,C.; Debet,M.; Martin,C.; Smith,A.M.; Edwards,A.
The Plant Cell, 1996 8 1121-1135
Martin, C. and Smith, A.
The Plant Cell, 1995 7 971-985
McElroy, Blowers, Jenes, Wu R.
Mol. Gen. Genet., 1991 231 150-160
Mizuno, Kawasaki, Shimada, Satoh, Kobayashi, Okumura, Arai, Y. and Baba, T.
J. Biol. Chem., 1993 268 19084-19091
Muller,M.; Knudsen,S.
Plant J, 1993, 4 343-355
Morell, Blennow, Kosar-Hashemi, B. and Samuel, M.S.
Plant Physiol., 1997 113 201-208
Morell, Rahman, Abrahams, S.L. and Appels, R.
Aust. J. of Plant Physiol., 1995 22 647-660
Nair, Baga, Scoles, Kartha, K. and Chibbar, R.
Plant Science, 1997 122 153-163
Nakamura,Y.; Kubo,A.; Shimamune,T.; Matsuda,T.; Harada,K.; Satoh,H.
Plant J, 1997, 12 143-153
Nakamura, Yamamori, Hirano, Hidaka, S. and Nagamine, T.
Molecular and General Genetics, 1995 248 253-259
Nakamura, Takeichi, Kawaguchi, K. and Yamanouchi, H.
Physiologia Plantarum, 1992 84 329-335
Nakamura, Umemoto, T. and Sasaki, T.
Planta, 1996 199 209-214
Rahman, Kosar-Hashemi, Samuel, Hill, Abbott, Skerritt, Preiss, Appels, R. and Morell, M.
Aust. J. Plant Physiol., 1995 22 793-803
Rahman, Abrahams, Mukai, Abbott, Samuel, M., Morell, M. and Appels, R.
Genome, 1997 40 465-474
Repellin, Nair, Baga, M. and Chibbar, R.N.
Plant Gene Register PGR97-094 (1997)
Sambrook, Fritsch, E.F. and Maniatis, T.
Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 2nd ed 1989)
Sheen, Hwang, Niwa, Kobayashi, and Galbraith, D.W.
The Plant Journal, 1995 8 777-784
Svensson, B.
Plant Mol. Biol., 1994 25 141-157
Tanaka, Ohnishi, Kishimoto, Kawasaki, Baba, T.
Plant Physiol, 1995 108 677-683
Tingay, McElroy, Kalla, Fieg, Wang, M., Thornton, S. and Brettell, R.
The Plant Journal, 1997 11 1369-1376
Wan, Y. and Lemaux, P.G.
Plant Physiology, 1994 104 37-48
Wang, Upadhyaya, Brettell, and Waterhouse, P.M.
Journal of Genetics and Breeding, 1997 51 325-334
SEQUENCE LISTING
GENERAL INFORMATION:
APPLICANT:
NAME: COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH ORGANISATION
STREET: Limestone Avenue
CITY: Campbell
STATE: ACT
COUNTRY: AUSTRALIA
POSTAL CODE (ZIP): 2612
NAME: THE AUSTRALIAN NATIONAL UNIVERSITY
STREET: BRIAN LEWIS CRESCENT
CITY: ACTON
STATE: ACT
COUNTRY: AUSTRALIA
POSTAL CODE (ZIP): 2601
NAME: GOODMAN FIELDER LIMITED
STREET: LEVEL 42, GROSVENOR PLACE
CITY: SYDNEY
STATE: NSW
COUNTRY: AUSTRALIA
POSTAL CODE (ZIP): 2000
NAME: GROUPE LIMAGRAIN PACIFIC PTY LIMITED
STREET: LEVEL 31, 1 O'CONNELL STREET
CITY: SYDNEY
STATE: NSW
COUNTRY: AUSTRALIA
POSTAL CODE (ZIP): 2000
(ii) TITLE OF INVENTION: REGULATION OF GENE EXPRESSION IN PLANTS
(iii) NUMBER OF SEQUENCES: 17
(iv) COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30 (EPO)
INFORMATION FOR SEQ ID NO: 1:
SEQUENCE CHARACTERISTICS:
LENGTH: 17 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
DESCRIPTION: /desc "pcr primer based on the N-terminal sequence of wSBE I 5' end at position 168 of SEQ ID
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:
ORGANISM: triticum tauschii
TISSUE TYPE: Endosperm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GGCACGCGAG AGACTGG 17
INFORMATION FOR SEQ ID NO: 2:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
DESCRIPTION: /desc "pcr primer in which 5' end is at position 1590 of SEQ ID
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:
ORGANISM: triticum tauschii
TISSUE TYPE: Endosperm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
TACATTTCCT TGTCCATCA 19
INFORMATION FOR SEQ ID NO: 3:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
DESCRIPTION: /desc "pcr primer 5' end is at position 1 of SEQ ID
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:
ORGANISM: triticum tauschii
TISSUE TYPE: Endosperm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
ATCACGAGAG CTFTGCTCA 18
INFORMATION FOR SEQ ID NO: 4:
SEQUENCE CHARACTERISTICS:
LENGTH: 23 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
DESCRIPTION: /desc "pcr primer 5' end is at position 334 of SEQ ID
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:
ORGANISM: triticum tauschii
TISSUE TYPE: Endosperm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CGGTACACAG TTGCGTCATT TTC 23
INFORMATION FOR SEQ ID NO:
SEQUENCE CHARACTERISTICS:
LENGTH: 2687 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: triticum tauschii
TISSUE TYPE: Endosperm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:
ATCGACGAAG ATGCTCTGCC TCACCGCCCC CTCCTGCTCG CCATCTCTCC CGCCGCGCCC
CTCCCGTCCC GCTGCTGACC GGCCCGGACC GGGGATTTCG GCCAAGAGCA AGTTCTCTGT
TCCCGTGTCT GCGCCAAGAG ACTACACCAT GGCAACAGCT GAAGATGGTG TTGGCGACCT
TCCGATATAC GATCTGGATC CGAAGTTTGC CGGCTTCAAG GAACACTTCA GTTATAGGAT
GAAAAAGTAC CTTGACCAGA AACATTCGAT TGAGAAGCAC GAGGGAGGCC TTGAAGAGTT
CTCTAAAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA
ATGGGCCCCT
GCAGCAATGG
TGGGCACAGG ATGACAAAGG TGGGAAACCT GCCATCCCCC ACTATGGGTC
GATCGGGTTC
TGGAGCTCCA TATGACGGTG GCATCCTCGG CCTCGAAAGC TGGTGAGAGG CCTGAAGTAA AAACGCAAAC AACTACAACA CTTCTTTTGG TACCATGTGA GGACCTCAAA
TATCTTGTTG
TGTCCATAGC
CATGCGAGCA
AAACACACAG
GAGTCCTATT
TCGCCTGTTC
AACTATGCCA
TTGGATGGAC GAATTCATGT TAATCACCAT
GGTATCAATA
TACCGATGTA
GATGCAGTTG
GCCAGAAGCA
ACTGTTGTTG
TGATGAAGGT GGAGTAGGGT TGACTACTTG AAGAACAAAG GACCAACAGG
AGATATACGG
TGTTGGCGAC
AAGACTATGG
AGACTTGCAG
CCTGCTTCAC
CTTCATCACC
ATGGCCCTTG
CCACCCAGA\ TGGATTGACT ACGCCAGTGG AGCCTCTCAG TCAAGCAATG
AATGCGCTCG
CAGCGACATG AATGAGGAAA CTTCAATTTT CATCCCAGTA GAAGTACAAG GTAGCTCTGG CCAGTACAAC GATCACTTCA CAACAACCGC CCTAATTCAT TCGCGTCGAG
GAAAAAGCGG
TGCTCCTGGG TACATCGATG
ATGCACAACT
ATAATTATGG
ATAATTCCA
CTGCATGGAT
TTCACTGGGA
CTGACGCTCC
GCACATACAG
CAGTTCAGCT
CGAATTTCTT
ACAAGGCACA
GTAATATGAC
TCCATACAGG
ATTGGGAGGT
TTGACGGCTT
TGTCATTCGC
TTTACATGAT
CAGAAGATGT
TTGACTATCG
ATGACCTTGA
AAAAGTGCAT
CATTTCTCTT
CTACAATTGA
GAGGTGATGG
TTCCAAGAGA
ACATTGATCA
ACGACAAGTT
AGAAGATTAT
AAACTTATGA
ACTCCGATGC
CGTCACCTGA
TCAAAGTCCT
AAAAGCCTAA
TTGAAGCCAC
TATTGGTGAC
TGTTTGGTCA
GGTTAAATTT
TCGTTATGCA
TCCACCTTCT
ACGTATTTAC
AGAATTTGCA
GATGGCAATC
CGCAGTTAGC
TAGCTTAGGG
AGATGGTCTA
AGAAAGGGGT
CTTACGGTAT
CCGATTTGAT
TGGAAATTAC
GCTTGCGAAC
TTCAGGCATG
CCTTGCTATG
ATGGTCAATG
TGCATATGCT
GATGGACAAG
TCGTGGAATT
CTACTTGAAT
AGGCAACAAC
CCTACGATAC
TTCCTTCCTA
TGTATTTGAA
TGGTTACAAA
TCTGATGTTT
AGGAGTACCA
GTCTCCACCC
GGATGAAGGA
TCGTGTCAAA
TTCAACAACT
ATCAGGATTT
CGATTTCACC
ACTTTTGACG
GGTGAAAGGT
GAGGCTCATG
GACAATGTGT
ATGGAACATT
AGCAGATCAG
TTGCGTGTTC
AATGGCTATG
TATCATAAAC
CTTCTTTCTA
GGAGTA.ACAT
AAGGAATATT
CATTTAATGC
CCAGTGCTTT
GCTATTCCTG
AGTGCAATAG
GAGAGCCACG
GAAATGTATA
GCACTTCAAA
TTTATGGGTA
TGGAGTTATG
AAGTACATGA
TCGTCATCAA
CGTGGAGATC
GTCGGATGTG
GGTGGACATG
GGAGTACCTG
CGCACTTGTG
GCTGCTTCTT
GACGCAGCAG
GGAATGGCTC
CCCATGTCAA
GTGGAGATGG
CCTCTAAATT
ATGTGTTTAA
TGGGGATGAG
TACCGCGCAT
CCATATTATG
GAACACCAGA
TGATGGATGT
ATGTTGGACA
TGTGGGATAG
ATCTGAGATA
CCATGCTATA
TTGGTTTGGA
ACAAAATCTT
GTCGGTCAGT
ATAGATGGAT
CACATACTCT
ATCAGTCTAT
CTGGCATGTC
AGATGATTCA
ATGAGTTTGG
ATAAATGCAG
ACGCATTTGA
AGCAGATTGT
TGGTCTTCGT
ATTTGCCTGG
GAAGAGTGGC
AAACAAACTT
TGGCTTACTA
GGGGCAAAGC
ATGGTGAGGC
420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340
GACTTCTGGT TCCAAAAAGG CGTCTACAGG AGGTGACTCC AGCAAGAAGG GAATTAACTT 2400
TGTCTTCGGG TCACCTGACA AAGATAACAA ATAAGCACCA TATCAACGCT TGATCAGAAC 2460
CGTGTACCGA CGTCCTTGTA ATATTCCTGC TATTGCTAGT AGTAGCAATA CTGTCAAACT 2520
GTGCAGACTT GAGATTCTGG CTTGGACTTT GCTGAGGTTA CCTACTATAT AGAAAGATAA 2580
ATAAGAGGTG ATGGTGCGGG TCGAGTCCGG CTATATGTGC CAAATATGCG CCATCCCGAG 2640
TCCTCTGTCA TAAAGGAAGT TTCGGGCTTT CAGCCCAGAA TAAAAAA 2687
INFORMATION FOR SEQ ID NO: 6:
SEQUENCE CHARACTERISTICS:
LENGTH: 807 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: Protein
LOCATION: 1..807
OTHER INFORMATION: /label= sheI /note= "deduced amino acid sequence from SEQ ID
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
Met 1 Pro Leu Cys Leu Thr Ala Pro Ser Cys Ser Gly Pro Ser Leu Pro Pro Arg Ser Arg Pro 20 Lys Phe Ser Ala Ala Asp Arg Pro 25 Ala Pro Gly Ile Ser Val Pro Val Thr Ala Glu Ser 40 Asp Pro Arg Asp Tyr Asp Ser Ala Lys Thr Met Ala Leu Asp Pro Asp Gly Val Leu Pro Ile Lys Phe Ala Gly Phe Lys 70 Glu His Phe Ser Tyr Arg Met Lys Lys Leu Asp Gln Lys Ser Ile Glu Lys His Glu Gly Gly Leu Glu Glu Phe Ser Lys Thr Val Tyr 115 Gly 100 Tyr Leu Lys Phe Gly 105 Ala Ile Asn Thr Glu Asn Asp Ala 110 Gln Leu Ile Arg Glu Trp Ala Pro 120 Ala Met Asp Ala 125
Gly Asp Phe Asn Asn Trp Asn Gly Ser Gly His Arg Met Thr Lys Asp 130 135 140
Asn Tyr Gly Val Trp Ser Ile Arg Ile Ser His Val Asn Gly Lys Pro 145 150 155 160
Ala Ile Pro His Asn Ser Lys Val Lys Phe Arg Phe His Arg Gly Asp 165 170 175
Gly Leu Trp Val Asp Arg Val Pro Ala Trp Ile Arg Tyr Ala Thr Phe 180 185 190
Asp Ala Ser Lys Phe Gly Ala Pro Tyr Asp Gly Val His Trp Asp Pro 195 200 205
Pro Ser Gly Glu Arg Tyr Val Phe Lys His Pro Arg Pro Arg Lys Pro 210 215 220
Asp Ala Pro Arg Ile Tyr Glu Ala His Val Gly Met Ser Gly Glu Arg 225 230 235 240
Pro Glu Val Ser Thr Tyr Arg Glu Phe Ala Asp Asn Val Leu Pro Arg 245 250 255
Ile Lys Ala Asn Asn Tyr Asn Thr Val Gln Leu Met Ala Ile Met Glu 260 265 270
His Ser Ile Leu Cys Phe Phe Trp Tyr His Val Thr Asn Phe Phe Ala 275 280 285
Val Ser Ser Arg Ser Gly Thr Pro Glu Asp Leu Lys Tyr Leu Val Asp 290 295 300
Lys Ala His Ser Leu Gly Leu Arg Val Leu Met Asp Val Val His Ser 305 310 315 320
His Ala Ser Ser Asn Met Thr Asp Gly Leu Asn Gly Tyr Asp Val Gly 325 330 335
Gln Asn Thr Gln Glu Ser Tyr Phe His
Thr Gly Glu Arg Gly Tyr His 340 345 350
Lys Leu Trp Asp Ser Arg Leu Phe Asn Tyr Ala Asn Trp Glu Val Leu 355 360 365
Arg Tyr Leu Leu Ser Asn Leu Arg Tyr Trp Met Asp Glu Phe Met Phe 370 375 380
Asp Gly Phe Arg Phe Asp Gly Val Thr Ser Met Leu Tyr Asn His His 385 390 395 400
Gly Ile Asn Met Ser Phe Ala Gly Asn Tyr Lys Glu Tyr Phe Gly Leu 405 410 415
Asp Thr Asp Val Asp Ala Val Val Tyr Met Met Leu Ala Asn His Leu 420 425 430
Met His Lys Ile Leu Pro Glu Ala Thr Val Val Ala Glu Asp Val Ser 435 440 445
Gly Met Pro Val Leu Cys Arg Ser Val Asp Glu Gly Gly Val Gly Phe 450 455 460
Asp Tyr Arg Leu Ala Met Ala Ile Pro Asp Arg Trp Ile Asp Tyr Leu 465 470 475 480
Lys Asn Lys Asp Asp Leu Glu Trp Ser Met Ser Ala Ile Ala His Thr 485 490 495
Leu Thr Asn Arg Arg Tyr Thr Glu Lys Cys Ile Ala Tyr Ala Glu Ser 500 505 510
His Asp Gln Ser Ile Val Gly Asp Lys Thr Met Ala Phe Leu Leu Met 515 520 525
Asp Lys Glu Met Tyr Thr Gly Met Ser Asp Leu Gln Pro Ala Ser Pro 530 535 540
Thr Ile Asp Arg Gly Ile Ala Leu Gln Lys Met Ile His Phe Ile Thr 545 550 555 560
Met Ala Leu Gly Gly Asp Gly Tyr Leu Asn Phe Met Gly Asn Glu Phe 565 570 575
Gly His Pro Glu Trp Ile Asp Phe Pro Arg Glu Gly Asn Asn Trp Ser 580 585 590
Tyr Asp Lys Cys Arg Arg Gln Trp Ser Leu Ser Asp Ile Asp His Leu 595 600 605
Arg Tyr Lys Tyr Met Asn Ala Phe Asp Gln Ala Met Asn Ala Leu Asp 610 615 620
Asp Lys Phe Ser Phe Leu Ser Ser Ser Lys Gln Ile Val Ser Asp Met 625 630 635 640
Asn Glu Glu Lys Lys Ile Ile Val Phe Glu Arg Gly Asp Leu Val Phe 645 650 655
Val Phe Asn Phe His Pro Ser Lys Thr Tyr Asp Gly Tyr Lys Val Gly 660 665 670
Cys Asp Leu Pro Gly Lys Tyr Lys Val Ala Leu Asp Ser Asp Ala Leu 675 680 685
Met Phe Gly Gly His Gly Arg Val Ala Gln Tyr Asn Asp His Phe Thr 690 695 700
Ser Pro Glu Gly Val Pro Gly Val Pro Glu Thr Asn Phe Asn Asn Arg 705 710 715 720
Pro Asn Ser Phe Lys Val Leu Ser Pro Pro Arg Thr Cys Val Ala Tyr 725 730 735
Tyr Arg Val Glu Glu Lys Ala Glu Lys Pro Lys Asp Glu Gly Ala Ala 740 745 750
Ser Trp Gly Lys Ala Ala Pro Gly Tyr Ile Asp Val Glu Ala Thr Arg 755 760 765
Val Lys Asp Ala Ala Asp Gly Glu Ala Thr Ser Gly Ser Lys Lys Ala 770 775 780
Ser Thr Gly Gly Asp Ser Ser Lys Lys Gly Ile Asn Phe Val Phe Gly 785 790 795 800
Ser Pro Asp Lys Asp Asn Lys 805
INFORMATION FOR SEQ ID NO: 7:
SEQUENCE CHARACTERISTICS:
LENGTH: 319 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: misc_signal
LOCATION: 1..319
OTHER INFORMATION: /function= "untranslated region of wSBE I-D34 cDNA"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GCGACTTCTG GTTCCAAAAA GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC 60
TTTGTCTTCG GGTCACCTGA CAAAGATAAC AAATAAGCAC CATATCAACG CTTGATCAGA 120
ACCGTGTACC GACGTCCTTG TAATATTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA 180
CTGTGCAGAC TTGAGATTCT GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT 240
AAATAAGAGG TGATGGTGCG GGTCGAGTCC GGCTATATGT GCCAAATATG CGCCATCCCG 300
AGTCCTCTGT CATAAAGGA 319
INFORMATION FOR SEQ ID NO: 8:
SEQUENCE CHARACTERISTICS:
LENGTH: 4890 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: promoter
LOCATION: 1..4890
OTHER INFORMATION: /function= "promoter containing sequence of SBE I"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GGGTGGCGGG TCGGGCGGCA
CGGGCGGCAG
GGCTGCGGCT
GGTTGACTTT
GGACTCCAAA
AACATTTATT
AAAATCAAAT
TTCATTATTT
CGGAGAGAAA
TGAAAATAAC
ATCACAAATC
ATTGAAAATT
AATGTAGAAC
TCGTGAACTT
GAAATAACTT
AGTGCCTATG
CTAGTAATGC
TTTTCAAATG
ATGGAAAGTT
TGTCAAAAGA
CCTGATCCTG
SO CGGTAGATCT
GATTTATGTG
TTTGGACGCG
AATGCCATAG
GAAGAGTGAA
TGCTTCCCTT
AGCCCTTCAT
CGGCGGCTAG
TTAAAGGCCG
AAAAAATAAT
AATCCCGAAG
TGGGCCTAAA
AAAATCCAAA
GTGAAGAAGT
AATAATTAAA
GAAATCCCCA
AAAATGCAAT
TGGGATGTTA
TCTATTTTGT
GAATCAAACC
GTGTTACATA
GTAATATCAA
AATGCAATGC
TATGAGTTGC
CTTCTCTTTT
TAATACCTCA
AAATAAAATC
TGTGGTAGTA
AAGTACATCG
CCTTGTGGGA
CTAGCAAGTA
GTGCTTCAAC
GTCGTAATCC
TGGATTCCCC
AGGCGCGGGG
GGTTTCGCGG
GCCAGGCTGA
AATTCGGACA
TAAATTTTTC
ATGCAATTTT
TAAAATCAAA
CATTTTATCC
ACAAATGATC
ACTCTCTCCG
AAAATATGAT
CATATAACTC
TTTGAAATTG
TTTAAATAAA
GATTTATTAC
TAAATATCTT
ATGCTAAAAG
CAACAAGTGG
TACATGGTTT
TAATACAATT
AGAAGATTTG
CTCATGATGT
TACACCAGAC
GTGTAAAGTA
GGCCTAGTTA
TAAAGGTTAG
TGTGCATTTG
TGGATGTCTT
CGGCGGGGCG
CGGCGGCGAC
GGTGTCCGGG
TGCAAAAAkAG
CCCATTCTTA
GAAAAATGCG
TATTTGTTTT
CATCTCATAT
CTATTTTCAA
TGGGTCCTTG
ATGCATGATG
AAATTCTATA
TATTATTTTT
ACAAAGCATA
AATAGCGTTG
GATAGATGTT
AATAGAACCT
CATACTTGGC
AGATTCCAGC
CCACTAAAGT
GTGTCATCAT
AAAATTATCA
ATAGTTGACA
CTACCATGTA
AGGAAATTCT
ACCCACTTAA
TAGGTCCCTC
TTTGTTACAT
GCCGGGGCGG
TTGGGCTGAG
TCGGACACGG
TAAGAAAAGA
AAAATAAGCC
TATTTTTCCT
TAATATTTTT
ATTTTGATAT
AATTTGAGAA
AGTTGCGTGA
ATCTAATGTA
ATTATGAACA
TAGAATTAGT
AAAATGACAA
TATGTGTGTA
TCTACAATTC
TAGTTTCATT
ACTGTTTGTT
ATGTAGCCAC
CACCTAGCCC
CATGACAACA
AGAGGGAGAG
CATCGATTTT
TTAGAAGAGG
TCCTTAGATC
AAAATGTCAC
GGATCTGAGC
TTTATTGAAG
CGCGGCGGCG
GCGGGGCACG
CCCGTA-AGGC
AATAATAA.AC
GGACAAGATG
AATTCGGAAT
CCTCCAATAT
GAAATATTTT
AACCCA-AATA
AATTTCTAGG
TAACATTCCA
CAGAAATATT
CTAGAGCATT
ATTCACATAT
TGTGTGCGTG
ACGGGTCTAA
TAACTAACAA
TGTTCATTTT
AAAATATGAT
AAGTGACCGA
AATTATTAGG
AATGTATGGA
TTAAGATACA
TGAAATGAGA
CCCTTCTCCC
TTTGAATCTT
CCTTTCTCCA
TGAGAGTGAA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 TTATTATATG CCCATAGGAG GTGGGATATA AAGGCTGTTG GTATTCTGCA CCATACATGC 1740 TAGAGTAGGG AGGAGAGGCT GGTGCATGAT
TCCCCCACCC
TAGCCCATTC
TGACTTGAAT
CTCACAACAT
CTTTTATGCT
CCATGCGCTC
TCTATCCTCC
AGGCAAGTAA
GGCATTTTTG
ATTTAGGGAG
GTTGTTTCGC
CTTCCTCCTT
TGCACAGATA
TGCGCCCACT
CTCCGACTAT
GGCATCACCT
ACACAACTAC
TCCTCCCACG
CGAGGCTGAT
TCATACATGT
ATCCTGAGGA
AGTACGTCCT
CCCCTACGCC
GGCGAAGGTA
AAAATGTCGC
AGCAATATAA
CCATAAATTG
ATGAAAGCCG
TACTCCATAT
GCTTTTAGCG
TTCCCCTTTT
AAATTTATAT
TTTTTCTATT
ACTAACAAGT
TTCATCATGG
TTGACAATGT
TAACTTAA-AA
AAAAGTTATT
TCTCATGTTT
TTGTTTCTGG
GAGCAACTCT
GCCCAAAATG
CTCGCTCCAC
ACTCACTCCA
GCGCCAACGC
ATCACCGACG
CCCCTCAAAT
AAA.ATCTCAA
CCCCAAGACA
TGGTAAACCG
ATGGTAGGAT
ATGTCGGCTC
TAACGAGGTC
CCCGTTCGAT
CTAGGAGTTC
TTCCTCGACG
CTGCAGGTAG
TCAACGACTT
GTTTATCACA
AACATACAAA
CATATCGCGA
GACATAGGAC
CCTTTCGTGC
TTCATTTCTT
ACCATTTTTC
ATTAATTTGT
TTTTTTTATT
ACTTATTAAT
GCCTCATATA
AGGATTATTT
CAAACATAGA
ACTCTAGAAA
GAATGAGTCG
AGCAGAGTCG
GCACTTCAGA
AAACAAGCTT
TATCCAACCG
CGGGATTTTA
AGTGGGGTGA
TCATGAGGCA
CGGCATCACC
TCTCCTCCCC
CATACCCAAT
ATTCTCCTCC
CCATGATGGC
ATCCCCATTG
GTCGCAATGC
CGCCCCGCAA
ATCTCTCTTA
TAGAACATAG
TTGAAGTCGC
TTTGATAATG
TTTTTAGCAA
CTATGTGTTT
AACCATACAT
AGTGGTGCCC
TGAAATCTAT
TGTTTCTCGC
GTCTCTTATG
ACATGGTGGA
AGGTCTTCAT
CATGATTAGT
TGGCATGTGG
TTTTGGTGCA
TTTATAAACA
CCATATATCT
GGGAAGGTAA
CGATATGCCC
AGAGTCACCA
CGAGCCTCCA
CAAGCGCATG
CACAGCGCAT
CAAGAAGGAT
GCCATTTGGA
AAAACCATAG
TCTATGCCAC
CATGGTTTAC
TAGAATGGCG
GTGCATCATT
ATGTCGTTGG
GACTCTCCAA
CCATCTATAA
GGAGGACAAC
CAATGTCGAA
AAATAAAATG
ATTTGAACCG
ATGAAAAAAG
GAGCCGCAGC
ATGAAGACTC
ATAGGGAGTG
TTTATTTTTT
TATTTTTTGT
AGAAGTCCAG
CTAGCCCATA
CCTCTGATTT
TTCTTGGATT
GACTGATAGG
GTCGTAAAGA
AAGGATATCA
CTTTGTTGCA
TCTTAGGGAA
AATCGCCATA
TATCCCTTCG
AATATGGAGG
CATGAGGGAA
TACAGGTACA
AAGCACCCTC
TGGTCATCGC
CTGCCGCCTC
AATGTCATCA
CGGCAGTGCG
CGTGTGGCGC
GATTTGGCGC
TCCCCTTGCC
ACTCAAAGCT
GGAGGAGCAA
GGCTAGACGA
TGGCGACATT
TAGTGTGACT
GTGTGGTTCA
AAACAAGTAA
TGCCAAGTAC
TACTAGAGTT
AGGGTAGTTG
TTCTCTTTTG
TGTTATATTC
ACTTGCATAT
TATTTACCCC
GTTTTTCTGT
TTTGTTTACT
AAGATATATT
AAACTACTTT
CCATGCATGA
AAATATTTAA
GGTTAAAGTG
ATGCCAATAT
GATAGCCATA
CCATGGATTC
GTTTTAGCTT
TGAACCAGCA
CCATTAGTGG
GTGGCATAAG
CCCCTTCCTC
TTATGGAGAG
AACCCCACCT
TTCCTCCTCC
TTCGGGTCCA
CCCCAGTCGG
CACAATGAGG
CGATAGCTCT
CGGCGGCGGC
GCATATTTTG
ACTTTTGGCC
ACTAAATGTA
GACCACAAAT
ATATGAAGCG
CTCTAAGGCC
GACTGTTCGT
TAGGTTTCCC
TAGTTTCATA
GGAGGTGCAC
1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780
ACACAAACAT
ATCATCAAAA
CATTTAAAAT
CAAACCACAT
GAAACCGAAA
CGATGCTTAG
GGGGACATTG
GGCGAGGCGG
GAGGAGTAGC
TCCCCCACCC
CAAAAAGTAA
TAGAAAAAGC
TGTTCTTTTA
TTTCACAAAA
GGCCCAGCGG
CTCCCTCGCC
TTCCTGTCCA
GCCACACTCC
GATTCCCGTC
ATAAAGTATA
CCTGCCAATA
GTGAACAATT
GCTATACACT
GTAATGTTAG
ACGTGAGATG
CAGTTGACAC
ACGTCGGGCT
CTGCAAAACA
TGACAAGCAA
TTGTTCTTGC
AATCCTTTTA
GAGCAAATAT
CTGACGAAGG
CGCACGACCG
CCCGTTTCCC
AAGCGGCCAC
TCCTCCGGCC
CGCCGCCATG
AATACTAACT
TGAGATATAG
GTTTTTTTAG
TGCTCCATAT
CCGTTTTTCT
GGGATGACCA
GAGAGCGGTG
GGCAGGTAGG
TGGTACACCA
CAACCAACCA
TGGACAGCGC
TAGTTCTTTT
CTTCTTTTTT
CTGAAAGTGG
TCCACGTGCA
CCTCCCTCCC
GGACCGGAAA
GATATAAkAGC
GAGGAAGATG
TGAGAAGTAT
TTTTGAATAT
AAAAAATATA
GAAACCATGT
ATTCAAAGAA
CAACGTCCCT
AGGGGtTGCG
GGGGAGGGGG
GTTTTCTGCC
TCGCAGTCCC
AAAGAGTAAA
GTGAAAGTAA
TTTTAGGGAA
CGAGACAGTG
CCCCGGCCCT
TCTCGTTGCT
AAAATCACGC
GCGCGGGGCC
GTTTGCGTGG
ATCAATATGA
AGAAATAACT
TTGCTATTGG
GAAGGAGAGT
ACAGAGACCT
ATGCGTGTGC
AAGGACCGGG
CTACGAAAAC
ACATGTCCCT
CTTTTGTTAG
TGCTTTTATA
AAGAGCAAAT
AGGGCCCATA
CCCGGGCCCG
TCCACTCCAC
CTTTCCGTTG
TCAAAAAAAC
GCAACGCAAC
CCAACCCAGC
GCAGTTGCCT
CGAGGTGACG
CACCGGAGAT
GGCAACATGT
GGAGGAAGAA
CTCATTTCAT
CTGGTCTTTG
TTTTCATTTC
GTGATTGGGA
ATCTTCCACT
GCTTTCGTCC
CAGATCCGTT
TGTTCTCCTC
GGTCTCCGGC
3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 ACGGGCCCGG CGCAAAATGG 4860 4890
INFORMATION FOR SEQ ID NO: 9:
SEQUENCE CHARACTERISTICS:
LENGTH: 6228 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: misc_feature
LOCATION: I
OTHER INFORMATION: /product= "coding region of wSBE I-D4 gene"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
ACCGGCCCGG
CCGCCCCCTC
CCGGACCGGG
CCGGCTCCGT
ATGTGCGGCT
TGAGCCCTCT
ACGGAATCTG
GGGATTCGTC
GATCCGTACG
AAATGTGTAT
TCCTGTGTTG
TCATTTATCG
CATGGCAACA
TGCCGGCTTC
GATTGAGAAG
GTTTGAAACA
TTTTGTAGGC
ATGGGCCCCT
GGTTAACTTA
TCTAGCTAGT
GGTTGGCTGG
CATGTATTTA
TTCAACAACT
ATCAGGATTT
CGATTTCACC
ACTTTTGATG
GGTGAAAGGT
AACTTACATT
GCATCCTCGG
TGGTGAAAAG
AAAGGCAAAC
TTCTTTTGGG
ACCTCAATAT
CCATAGCCAT
CGCAAAATGG
CTGCTCGCCA
GATCTCGGTG
TCTGCCGGGG
GAGCGCGGTG
CCCCTGTCTA
ATCCACGGTG
CACTGAGGAA
CAGAATATCC
AATCTGTGCT
TGTCTCTACT
AAGGCCAAGA
GCTGAAGATG
AAGGAACACT
CACGAGGGAG
ATAGTTACAT
TATTTGAAGT
GCAGCAATGT
TGAAGTGCTG
AAAGAGTAGA
TATTCATTTC
CTTGTGAGTC
GGAATGGCTC
CCCATGTCAA
GTGGAGATGG
CCTCTAAATT
CTACTTTTAG
AATGTGGAGA
CCTCGAAAGC
CCTGAAGTAA
AACTACAA
TACCATGTGA
CTTGTTGACA
GCGAGCAGTA
GATTCCCGTC
TCTCTCCCGC
AGTCAGTCGG
TTTCCCTGAT
CCCGCGCCCT
CCCAGATTTG
GTTATTGGAA
CAAGTGGATG
CTCCTGCAGT
GAATGTATCA
ACTTGTTCAG
GCAAGTTCTC
GTGTTGGCGA
TCAGTTATAG
GCCTTGAAGA
CTTGTGGCGT
TTGGGATCAA
AAGTTCTAGT
ATGAAACTGT
TAAATATGAA
TTTTATGGCA
ATTACTTTAT
TGGGCACAGG
TGGGAAACCT
ACTATGGGTC
TGGAGCTCCA
TGGCTCGAGA
CATGATACTT
CTGACGCTCC
GCACATACAG
CAGTTCAGCT
CGAATTTCTT
AGGCACATAG
ATAAGACAGA
CGCCGCCATC
CGCGCCCCTC
GATCTTCATT
GCGATGCCGC
CTTCGCTCCG
CGACCGTGAT
ATAGTATATA
CGATTTCGAT
GTCTCAACCG
ACCAATAATT
TCCTGATCTG
TGTTCCCGTG
CCTTCCGATA
GATGAAAAAG
GTTCTCTAAA
CCGCAGCACA
CACAGAAAAT
GTTGTCACGC
CTTAAGAGTT
ATATGTTTTC
ATACTTGCTT
GGGTGTAGGG
ATGACAAAGG
GCCATCCCCC
GATCGGGTTC
TATGACGGTG
GCAAGAAATC
TTATTGCTCG
ACGTATTTAC
AGAATTTGCA
GATGGCAATC
CGCAGTTAGC
TTTACGGTTG
TGGTCTTAAT
GACGAAGATG
CCGTCCCCCT
TCTTTTCTTT
GCGCGCGCAG
CTGGTCGTGG
CCCCTGTTGT
CTACTAATAA
TGGATTTCTC
TATTACTGGA
GCTGCATTGT
CCGCTTATCC
TCTGCGCCAA
TACGATCTGG
TACCTTGACC
GGTTAGCTTT
AAAGACATAA
GACGCAACTG
AACTAATTGC
TATGGCTTGT
CCTTTTCTAG
CTAACTATCT
ATGCACAACT
ATAATTATGG
ATAATTCCAA
CTGCATGGAT
TTCACTGGGA
TAAGTAAAAC
TTTTGCAGGT
GAGGCTCATG
GACAATGTGT
ATGGAACATT
AGCAGATCAG
CGTGTTCTGA
GGCTATGATG
CTCTGCCTCA
GCTGACCGGC
TCTTTCGTTT
GGCGGCGGCA
CCGCGGAAGG
CGCCGGGCAA
ACTTGAGGCT
TGCTTTATGC
TGTACAACCC
GAAAACATAA
TAACTTTTGT
GAGACTACAC
ATCCGAAGTT
AGAAACATTC
TGTTTCATGT
TGCGACTCTG
TGTACCGGGA
AATGGTCGTT
CTTTTCTGAT
TTATGGTCAT
TTAGTAGATT
TATTGGTGAC
TGTTTGGTCA
GGTTAAATTT
TCGTTATGCA
TCCACCTTCT
CCACACAATT
ATGTGTTTAA
TGGGGATGAG
TACCGCGCAT
CATATTATGC
AACGCCAGAG
TGGATGTTGT
TTGGGCAAAA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040
CACACAGGAG
CCTGTTCAAC
ATGGACGAAT
CACCATGGTA
GATGTAGATG
GAAGCAACTG
GAAGGTGGAG
TACTTGAAGA
AACAGGAGAT
CCCTCCTTTG
CATACAGTTC
TTCGCTTGAT
CTAGTGATAG
GTTATATATC
AAATTTTCAG
GTATACTGGC
TCAAAAGGTT
TTGTAATGTT
CCTTTTGGTA
TGGCCCTTGG
TGTCAAAACT
AGGGCGAAAA
TGTTATCACG
TTTTAGCGTG
CACACTTATG
ATGATGCAAA
TCACGACAAG
GTTTATATCT
GAATGGATTG
TGGAGCCTCG
GTTTCTGGTC
GTAGCTTATT
TTCATATTPA
TCCTATTTCC
TATGCCAATT
TCATGTTTGA
TCAATATGTC
CAGTTGTTTA
TTGTTGCAGA
TAGGGTTTGA
ACAAAGATGA
ATACGGAAAA
TCGCTGTGCG
AAAGGTGAGA
GACTTTTAGT
TACCCACTAA
GTTGACTTTG
TCTATTGTTG
ATGTCAGACT
CGATTCGTTT
CGTTGTTACT
CATTTGGCTT
AGGTGATGGC
TATTTCTGAT
GTTTAAACAT
TATCATTTAG
GCAGTCTATT
AATATTCCCT
CATGATAGAG
CTTCTTGCAG
GTTTTCTAAC
ACTTTCCAGA
CAGACATTGA
TGGTAGCTCT
TACACTGTGT
GCCTTTCAAA
ACACAGGAGA
GGGAGTCTTA
TGGCTTCCGA
ATTCGCTGGA
CCTGATGCTT
AGATGTTTCA
CTATCGCCTG
CCTTGAATGG
GTGCATTGCA
TGAGTATGTG
CACTTTCTTT
TGCTTCACAA
CCAGCTATTA
TGTTCATCTA
GCGACAAGAC
TGCAGCCTGC
TAAGTATTCC
CAGAGTTCTG
ATTTTGTTAC
TACTTGAATT
CAATATGTTT
CTGTTTTCTA
CTGTGCCGGT
GTTGGATCCT
GTTTAAAAGA
ATGTTAGCAT
AAAATCAGCA
TCATACTGAC
AGAAGGCAAC
TCACCTACGA
CTTGGGATCT
TCCAACTTCT
CTAAACTAA).
AAGGGGCTAT
CGATTTCTTC
TTTGATGGGG
AGTTACAAGG
GCGAACCATT
GGCATGCCAG
GCTATGGCTA
TCAATGAGTG
TATGCTGAGA
TTCTTTTTTT
GCCTGGTAGA
GTTCGAATTA
CGGACCATGT
TTGAAACAAC
TATGGCATTT
TTCGCCTACA
TGAATTTGAT
CTTAGTCCTT
AAATATTTCA
TTATGGGTAA
CGGGATTCCC
TGATAGCCAA
AGTTAATCTT
CTTATTCCAA
TTTTTATTTT
GTCTTTCTTA
GTATATGGCA
GGTGCAATTT
AACTGGAGTT
TACAAGGTTA
TGACCTCACT
GTCTTGTGGA
TTGCTGATCT
CATAAACTGT
TTTCTAATCT
TAACATCCAT
AATATTTTGG
TAATGCACAA
TGCTTTGTCG
TTCCTGATAG
GAATAGCACA
GCCATGATCA
ATGGGGCACT
CAAATTTGAG
AGTTAGTTAT
AAGAATGTCC
TTAGTAGTTA
CTCTTGATGG
ATTGATCGTG
GTTCTAGTTC
GAAGATAATG
GATGATTCAC
TGAGGTAATA
TCGAAAAAAA
GTACTCCCCA
TATTCTAATT
TTACATATAT
ATACCAATGT
ACCTACTCAT
AATTGCTGCA
CCTTTTAGTT
ATGATAAATG
TGCCTATGTA
TAGTTCCTTC
TAAATTCTCC
ACTACTAGTT
GGGATAGCCG
GAGATATTGG
GCTATATAAT
TTTGGATACT
ACTCTTGCCA
GTCAGTTGAT
ATGGATCGAC
TACTCTGACC
GGTATGTTTT
GGTCTAAGAA
AAATAAACAT
ATTCTGATAA
GAAGACTGCA
ACTTTCACGC
ACAAGGAAAT
GAATTGCACT
CAGACGAGTA
TATTCCAGTC
TTCATCACCA
TCTGGTTATC
TCCTTTGGGC
GCTATTTCCA
CATTGTTGTT
GCCGACATCA
TTCTCCGTAA
GTTTTACATA
ACCTGACAAC
TGGCCACCCA
CAGACGCCAG
TATTTTTACA
ATCTCTGACT
CTTCTAACGT
GCTCAGTACG
2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 ATGACCAAAT CTTGCCTGTG TGCAGTGCAT ACATTATCCA CTGTCTTCTT TTGTTAACAG TCACGATTTT TCTCATATTT CTAACATATA TAATTTGAJ\C CGACAAATTT TCCTTCCTAT GAAGTAGTTA ACTATACAAT ACTCTTAAGA ATAGCAACTC ATCTGTTTGA TATGAACCAT TTCCAGATTA TTGTATTTGA AAACTTATGA TGGGTAACTG ATGACTAATG TGCTTAATCT GACTTGCCTG GGAAGTACAA GGAAGAGTAA GCAATGTTAA AGAAGGGGCC ATCAAGGCTG CCCAGGTGGC CCATGACAAC AAACAAACTT CAACAACCGC TGGTAATGCT AATTACTAGG TGCAGTACGA TCTCACAAAA GGAAAAGCCC AAGGATGAAG TGTTGAAGCC ACTGGCGTCA GGCGTCTACA GGAGGTGACT CAAAGACAAC AAATAAGCAC TAATACTCCT GCTATTGCTA GGCTTGGACT TTGCTGAGGT GGTCGAGTCC AGCTATATGT GTTTCGGGCT TCCATCCCAG CATAGTTACA TGATAATTGA ATAACTCCAG GGCCAAGAAA GGGAAGCTTC AGTCCTTGTT TAAGCCATCA TCTTATCAAG TTCCAGGTGT TGGTTCCTCC CACTGACCAT CGAAGCCACG TCAAAATATC ACAAACTGCC
GTAACCTAGT
TATAAATTGA
GAAGTTATTT
TATCCAACTT
AGTACATGAA
CATCATCAAA
GTTTAGTCAG
TGACTTGTGC
TGTTGTCTCA
ACGTGGAATC
ATCTCTTGCA
CGTTTCCACT
GGTAGCTCTG
TGATGTTCAA
CATCAGATAA
GATCACTTTA
CCTAACTCAT
AGGATTTAGT
TGCTCTCTTG
GAGCTGCTTT
AAGACGCAGC
CCAGCAAGAA
CATATCAACG
GTAGTAGCAA
TACCTACTAT
GCCAAATATG
AATAAAAACA
TGCATATTGC
GCCTAGATTG
TCCGTTCTCG
TCCCAAAATT
ACAACCAAAA
GTGGGCATGA
ATGGCATCTT
AATTTTCTTG
CATTGCAATT
TCTCTGCATC
TTCTGCATTC
CGCATTTGAT
GCAGATTGTC
GGCAGCTGTT
GTTTTATGTT
AAATGGGCTA
TGGTCTTCGT
AGCTTTGCCT
TTTAAAACAC
GACTCTGATG
GATCTGTTTT
TCTTATTTGC
CGTCACCTGA
TCAAAATCCT
AACAATAAAT
CCAGGCTTAC
CTTGGGGGA.A
AGATGGTGAG
GGGAATTAAC
CTTGATCAGG
TACTGTCAAA
ATAGAAAGAT
CGCCATCCCG
GTTGTCTGTT
TATAAGCCTG
TATCTTTTTT
AGACAAGGCG
CTCTGGTTGA
GGCGACCATC
AATGCGCATC
CTGCCAAAGG
ATTCTTACAC
TCCCAAATAT
TGATAAATAA
AAGCATTTTT
CAAGCAATGA
AGCGACATGA
GCATCATTTG
ACCAAATAAG
TGGACTCAAT
CTTCAATTTT
TTCAATATTT
GCAGTTACAA
CTCTGATGTT
GCAACACTAT
AGTGTTGATC
AGGAGTACCA
GTCTCCATCC
AAATAACAGC
TATCGCGTCG
ACTGCTCTCG
GCGACTTCTG
TTTGTCTTTC
ACCGTGTGCC
CTGTGCAGAC
AAATA.AGCGG
AGTCCTCTGT
TGCAATTTCT
GATTGCATCT
TGCTAATAAC
TCATGTTTGG
AAGAAACCAT
GTCGTCATCA
GCCCAAGACT
CTGCACTGCA
ATTAGTGATA
TATTTGAAGG
TAATAGCCTT
TGTTTCTCGC
ATGCGCTCGA
ATGAGGAAAA
ATTCACTCCT
TTGAAACCGT
CCAACTTCCT
CATCCCAGTA
CTTCTGCTTA
AGTCGGATGT
TGGTGGACAT
GTTCTTCTAT
TGTGCTGCAT
GGAGTACCTG
CGCACTTGTG
AAAAGATATC
AGGAGAAAGC
GGTACATCGA
GTTCCGAAAA
TGTCACCCGA
GACGTCCTTG
TTGAAATTCT
TGATGGTGCG
CATAAAGAAA
TTTTGTCTTG
TCTTTTGCTA
TGCAGTGCTG
CGCACAAAGG
CACTAACTTG
TCGCTCACAG
TGGGACCGTT
CCTTTGGCAT
4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 6000 6060
GAACAGAAGC AACAGGGGCT TGGAACTGAA CGCCGAAAAT AAAGTCAAAC CGGCTGGCC 6120
GGATTGAAAG GGGAAACGCC AAAATCCACT TAATTTGAAT GGAAGGAGGA ATGGTTCTTG 6180
CTGGTTTCAA CTCTGCAGGC TTCCCTCTGA ATTTCACACG GAGCCATT 6228
INFORMATION FOR SEQ ID NO:
SEQUENCE CHARACTERISTICS:
LENGTH: 11463 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: misc_feature
LOCATION: 1..11463
OTHER INFORMATION: /product= "complete sequence of the starch branching enzyme II gene"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:
AGAAACACCT
TTAGCGTCTA
CCAGTCCAAA
AGGCGCATTC
ACAGCGGACG
GACGCACATG
TTTCCCCTCT
ACATGCATTT
CCATGCACCG
GTCGAGTCGA
GTACA-AATAC
TTTTTACACG
ACGACGGACA
AAACAAAATA
ACGCGCTCCC
CCATTTTAGA
GTTTTCTTAA
ACGTCTGCTA
GAACTGGACA
TGAGTGCGTG
AACACCATGA
GGAAATTCAT
TCAAACA.AGG
ACGAGTCCAT
GAAGAGGATG
ATAATGTACC
AAAATGCCAT
ATCAGACACT
AATACTTATA
AGCCGTTGGT
TTTTTTTTTT GTTCTTTTCG
AAGAACAGGC
GGATCACCAG
GACGCTCACG
ACACATGGGG
TGATGCTATC
AGCTCACACT
AAAATTAATT
GCGAGGTGGA
ACACTGAAAG
CTACAATTTG
AGCTGGCCCG
CACCAACTGC
AACGAGGGTA
TTGCGATCTC
CATTTAGGCC
CTGCAAAGTT
CAGGAGCCCA
TCATCTATGG
AGGCCTGATG
TTTTTTTAAT
CTCAAACCAC
AACGAAGAAC
TATGCGTATT
TTTTTTGGAG
CATGCGTGCA
TTTTGTCTGG
CTAGAGGCCG
GTCCTCCCGC
GACGGTGGGT
CTGCTTTACA
AAGCGCGAGA
GCACCACAGG
GCGTCGGAGC
GAGGGAGCAA
GGAAGCAAGA
CATGACATGC
TGAAAATCAA
ACGATTTCAT
CAGAGTGGTG
GATCGGATGA
GACACAATAA
CTAACGGCAT
ACGCAGCGTC
CGTGGAGAGA
AAAGGCTCAA
CCACCAAAAC
CTTGAGCCTG
AAGGAAGAGA
CCATGCACCT
GTTGGCAAAC
AATTCTCAAA
CATCCCAGTT
TTACATACAT
TGGTCTTTTT
TCGGTCGGAG
ATGTTTTTGT
GGCCAGGTAA
GCCTCCACCG
TCCGTCCGTC
CACACACACT
CCTCTCCCCC
CACGTTGCTC
TGCATTTCGG
GGGATGGCGA
GTGGCGCGGG
AAGAAGGACT
CCTTCTCTCT
GAGTGAGAGA
CGGGGAAATG
TTTCATTCTG
TGTGGCGTTT
GCGGCCTCTC
CGCAACCTGA
TTCACTTACC
ATCAGCATTG
CTCTTGGGCC
TGCACCGTTT
CTGAAGATAT
TTCAATCTTC
GAGTTAAGGA
AGAAAATATA
AATGCCTACC
GGAACATCAA
AGAATATGCT
GAATAACTGT
GGGTTATAGA
TTGGGAAACT
ATTCGAATGA
TCGTGCTGCT
GCTTGGATTT
TAATTGCATA
CTGAAGGTAT
GCTGCCACCT
CACACACGGC
GCCCATCCCC
CCCCTTCTCA
CCGGCGGGTT
CGTTCGCGGT
CCGGCTCGGA
CCTCTCGTAC
CCTCTGCGCG
GATAGCTGGA
CGTTAGTGTC
ATATATATTT
TTTCACTATT
CAGGGAAGGT
AGAATTACAG
AAATGCCGGA
TGCAGTACTG
ACTGAAAAAA
GGGGTTTCGT
CGAGGAGCAA
AGAACCGACT
ACTAGTCGTG
CGAGATTGAC
CGCTGCTTTC
AGAGACAAAG
GGGAAGTAAA
CTCCGATCAT
TTTTACTTTG
TAGTTTCTTA
TTTTGGGTAT
ATTGACCAAC
ACCCGCAGGT
TCTTATAAGA
CGTCTAATTG
CTGCTGTGCG
ACACTCCCCG
ATGCACTGCA
TCGCTTCTCA
GAGTGAGATC
GTCCGGCGCG
GCGGAGGGGC
GCCTCGCTCT
CGCATGGCCT
TTAGGCGATC
ACCCAGGCCC
TCTCATTCTT
GTAGTCATCC
CCTGGTGCCT
GTACACACAC
TGAAACCAAC
CACTGCCTTG
TCAGATGGAT
CAGTCTGCTC
ACGGCGGAAG
CAGGGCATTG
GGGGAGAAAC
CCAACACTGA
GCTCATTTTG
ACTAGGGACC
TGTATAATTG
TACAATTAAA
CTAATTCCTC
TCTTTGTGGC
ACCTCGGTGG
ATGAAGGTGG
AAATTTAAAG
AAATTTATAA
CATATCTTAT
CGCGCACGAA
TGGGTCCCCT
CCGTACCCGC
ATTAATATCT
TGGGCGACTG
ACTCTCGGTG
GGGGCGGACT
CTCGAATCTC
GTTCGATGCT
GCGCTTCCTG
TGGTGTTACC
TTTCTTCCTG
TTGCATTTTG
GACGGCGAGA
TCGTGCCGGT
CACGGATGCG
TTCATTTTGT
GTGCATTCTA
TACAATTGCT
TGAACATGAC
TGGAAACAAT
CGCGAGTTGT
AAGATTTTCG
AATTAAGGTC
ACCATTTCAT
ATGGCTACAA
GAGTGGCAAA
TACCAAATTC
CTTTTTCTTT
ATTCAACAGA
ATTGGAAGCA
CTTTATTATT
TTCCTGTTTT
AAGAAAATTT
GGGAGGAAGA
TTCCGGCTTG
CAGCTTCCAC
CCATCACTCG
GCTGACTCAA
TGGCGCGGGC
TGCCGTCCCT
CCCCGTCTGG
GTTCCCCAAT
AACCTGTATT
ACGGCTTTGA
TTCTTGCTGT
CAGGCCCCGT
GGACGACTTG
AAATCTTCAT
TCAGGTTTCG
TAGCCTTGGC
GCAAGAACTT
ATTTTTCGTG
ACGGGGGACT
CACTGATGGT
CCCAAAACCA
GAGCCATCTT
CTTTCATCAT
ACAGATCCCT
TTTGCTCAAA
CTGATGAAAA
CTAGGGGGGA
TGGGGAAAAC
TACAGCGAAT
TTTTCTCGTG
ATGAAACGCC
CCCCTCTCTT
ATATTCCTGT
ACGAACGCCG
GCGTCTATCT
CCCCGCCGCA
GGTTCCGCGC
TCACTACGCG
CGGCGTCGGA
GCTCCTCAGG
CTTTGGCTCC
TGATCTCCAT
TTTTCCCCCG
TCATTCCTCG
AACTGCAAGT
CCTGAGCCGC
GCAAGTCCGG
ACAATCGTTA
AGCTTCTTCT
CCCGTGCTGG
CACAACATAA
CTGTAGATAC
GCAGAGAAAC
GTAACCAAAG
GGAGATGGGC
GACTACCGGT
GCAAATTTGG
TCGTGGTCTG
ATTGCAATAC
TGTGGTGGAT
AATCTACCAG
ACATTGCTAA
ACAAGAGAAT
GTTATGAAAA
TCCACTAGTC
TTTTCCAGTG
TTTCCCCTAT
960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940
TTTCCAGTGC
TTTTAAGTTC
AGCTTACTGG
CTGGCTTTTG
CAAATGCAGA
ATAACTACTG
CCTCAACTAC
ATTCTACTGA
ATACGTGCAA
CATGCTATCA
TGATTATGGT
TCATGGCTCA
AACTCTGCCT
TCCGGTGTGA
ATACCTTTCA
ATTATTAAAT
AGTCAAGACA
TAAGGGGCAA
TCTTTTATGT
AAGGATATTT
AGAGTCACTA
TATTTCACCT
TGTTAACATA
ATATTGGCAA
GATTTCCATT
AGAGCATAGT
ACTAGCTTAA
GCCATTTCCT
TCATTATTCT
TCTCCCATTA
GAGGCATCGC
GTGATCTTGC
CAAGAATTA\
TGAAGGTATC
CTTAACGAGA
ACTTACAAAT
CACCCTGTTA
TACTATGACC
ATACATCTAT
ATCATATCAA
ATTTAGTCCA
CACTCCCATC
GTGAAGGTTT
GTTTGGGAGA
CGTGTAAAGG
ACTAAGGGTC
AGGATTCAAT
ATGGCATATA
GAAATTTCCA
ATACTTTTGA
CCAACCTTGG
GTTCTCTGTT
ACATGCAAAT
AGGATTTATG
GTTTCTGGTC
TTACATGGTG
GTGCAAAACT
GCATTTGGAG
TATATGAATT
GATTTCCCAC
ACCTTATTAA
GCGAGCGATT
TGAAGAGGAT
TAATATATAC
ACAGGAACCG
AAGGCTTGGA
ACTTACCGAG
CACCTTCCAA
TAGCTTACTG
CAGTCTGCAG
AGAGTATGTC
TTGTATTTAT
AATGGTATAA
TCTTTTTGAG
TGCATTATGT
GCTCCTATTG
TTTTCCTCCC
TAAGCTGGCC
CCTTTTCCTC
TTCTGCTTGG
TTATGATCCA
GTGTTACAGT
ATTTGGAAGT
TGATGTGTGT
GGTTAGGATA
GCAGGAGAAG
AATCACACAT
TGATGGTTTA
CATTCACTTG
TTGCTTCCTC
GCAGTGGGCA
CCATTGTTGT
TTAGGATGTA
TGAGAGAGAG
CAAAAACTTC
ATAGTTAATT
TATCATCACA
AAGATAAATT
TACAATGCAG
AATGGGCTCC
TTTATTGTTA
AATACTGACC
CATTAGTAGG
TACAGCTTGG
TTAGCTGTTT
TTTGTCAGTG
ATTGAAAATG
GTGCTTTTCC
ATGCAGATAT
TAACAACGCT
AATTATTTAG
TCTGTTTTTT
ATCAAGTTCT
CCTGAAGAGG
TTTTTAATAC
GACATATGCA
ATGCTTGTGT
TTCCATTTTG
TATGTCTTCC
TGGAATGAGC
TTCTATGGAT
ACAACCTCGA
TTTGTCTGCT
TGTGAAAGTC
TGCAATAGCT
AGAAATATTG
ACAAGGGGGG
CATTGTTCTG
CTTTGTAACC
ATACTTAGAG
CATATGCTAA
TGCAGATAAT
CTGGAGCGCA
ATGGTCACTA
AGTTACTATA
TGACTTCAAC
CAATTTTCCA
GCACATTCCT
TCTTAAGCTT
AGTATATTAA
ATCTACAATG
TTGATATGGT
GATGGATCCT
TCGAGGATGT
AGATACGGAT
CTGTGCAGGC
TAAGTATCGA
CCACTTCTTA
TTAATTCACC
GTGACATAAG
GCCTTTTGTG
AACATCTCAA
AGCCCGGTAT
TTTCTAGTTC
TTTTATTTTC
TGTTCTTTTG
ATATCTATTT
CGGTATAATG
CATTGGAGCG
GGGGGGGGGG
AGGTGTACGT
TACTTGGAAA
GATGCATCTG
TTTTAGGGAT
GGCAATCCAG
TGTTATGTTC
TTCACCAACT
AATTTATGAT
AATTGGAATC
CCTTTGCTTC
TAAAGTTGAG
CAGCCCAAAG
GGATGAATGA
AGCATATTTC
CTTTTCAGGA
CAGCTATTCC
AGCATTTTCG
GGATACTCCA
TCCAGGTGAA
TCTACATTAC
CTGACATGTG
TTCTAAGGGC
ATCTTATAGC
ACCATTTACT
CTAAACGACC
GTCAATAAGT
TGTTATGTAC
TAATGTCTTC
TCTTCTGTAA
TTTTTTTGTC
TAACCATGTT
TCTCCAGCAA
GGGGTTCCCT
ACTGCAGGGA
CTTGAGTCTT
AAATTTTAGT
GAGGTGTTGC
GAGCATTCAT
3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920
ACTATGCAAG
TTGGAGCTAT
AGATATATAG
GTTCCAGGTA
ACTTAAAATC
TTCATAGGTA
GGGAAATTCA
AAAATCTAGA
CCATCCTAAA
GTGACTTCTT
TAGTAAACTG
GATGTGGATT
GGCACTACAT
AGGCCCCACT
TTACTTAAAG
GAAAATATAT
ATTATATTAG
AATATAGAGA
ACATCAAATA
ATATGCATCA
TAATCTACTT
ATGATTTTGT
CGATGGCACT
TTCTCGTCTA
TTGGCTAACT
TATTGAGATT
TTCGATTTGA
AAGTGGTTTC
CATGATCAGG
ACCTGATGAG
TCTTGTTCTA
TTGATGCGGT
CTGTATCCAT
AGTTTTATTT
CTTTGGGTAT
TACATCCTAA
TACAACTACA
CCATGTTACT
CTTGATCGAT
ATTAGTCCAA
GGCAATTATG
GTGGCATAAG
TGGCAGGGCC
TTTTCTCAGA
ACAGTTCCAT
GAGAAGTTCA
ATAGTTTGCA
TGCCAGCTTC
TTCTTCATTT
CAACATCTAC
CACATCTTTG
AGTTTGACTT
ATATAGATAG
GACCATCTGT
TTCCTTCTAC
GTACCCTGCA
GATACACATT
TTCAACTATG
GTTCCTGTTA
CTTACTGTCA
TGGGGTGACC
AGTAACTTTT
ACTTGTGCTA
ATCATGGAAG
GATGACATTT
AGTTTACTTG
TGGTGAAGAT
TGGGGATCAC
TCACACAATC
TGCTTCATGC
CTTAGTATTC
AATTTTTTTG
AGAGCACATG
TTTAATTTTA
ATACATTGTC
GAAAATTGGC
CTATCGCCGA
TGTATTAAAC
AGAATATCGT
GATGCTATCA
AGTTGGAAAA
ATACTAGATG
GTCCTAAGTC
AACACCAAAT
ATGTTGTAGA
AGGACAAATC
ATGTCAACAC
TTGCTTTAGC
TTGGTTTGGT
GTCATTCGTC
ACTTCCACGG
GGAGTTGGGA
ATCTGTTCTT
AACGCGAGAT
TCCATGATGT
TTAGGGCACT
CGGAGTCTTA
ATTGGAAGTG
ACTGGGAACT
ATGCTGGTCA
GTAAGTGCTT
TCTGTTACAC
CATTTTTTTC
ACATAAAATA
TGAAAAAGAT
CACCAAGTAG
AGCTTGGTTT
GCTGTTTTAC
AAAAGCTAAG
AAAAACTAGA
ATATTTTTCC
CAGTTGGACA
TTTGTAATGG
ATAGAATTAA
CTGACAGCAA
TTACTTCCCT
AAACTTCTTT
TACTTTGATC
TATCAGCACA
TAGAACTTCA
TTCAACAAAA
CACTTGCTTT
TGATTCTATT
AAATAATACC
TGGTCCACGC
AGTATGTAGC
ACACATGTTG
GGTGGCTTGA
ATACTCACCA
GAAACAATTG
GATAGTTCCC
ATTATTATTT
ATGGCGAATA
ACGATCTAAT
ACAGTATTTA
TTTTTGTTAG
TGTATACACT
TTTGGATATA
CATTTTATTG
CCGTTTTGGA
GCTTGTTCTT
TGTTTATCTG
AGTGGCGAAA
GTGGCAAAAA
ATTCTATATA
TGAAATGTAT
CAACACAATT
TCAACTGGCC
TACCTCACTG
GTTGAATTCA
AAGTTTGACC
AGATTAACAA
TTTTTCTATA
ATCAATTTGG
AAATCAGACC
CATATTTATG
TCAGTTGCAT
CTTGACGGTT
GGCCATCATT
TCTGACTTCT
ATATTCTATT
AGAATATAAG
TGGATTACAA
CTATGCATCA
TAGTATGCTT
ATTTTCTTTC
TTTTGGATTT
TCATGGACTT
TGATTTTTAA
GGGTAAAATC
CTTCACCCAT
ATCCTTTATT
TTGTTGGCTT
ACTCCAGAGG
ATGGATATTG
GTATTCTAAA
GTGAAATGTC
TAAAATTTTC
ATTGTGCTAC
TTGGTACATG
TGATGCCATA
ATGTACTCGT
ATAAGTGGCC
TTTGAACATA
AAGTCTATTG
TTTTTATTTT
GACTTGGTCA
ATCAGAGGGA
TTGTCACCAT
TGTTTGTACC
TGCTTCATCA
TGAATGGTTT
GGATGTGGGA
GTCACCATAT
CTTATGCAGG
TTTGATGGAT
GTAAGTCATC
TAACATGTAT
GTACAATTTT
TAAGTTTGTT
GCTACTGATG
TATCCTGATG
CTAGTTAAGT
TCTCTTTTCA
4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 6000 6060 6120 6180 6240 6300 6360 6420 6480 6540 6600 6660 6720 6780 6840 6900 6960
TAACAATGCT
GGGCATTCAA
TTTGGGTTTT
AAGAAATGGG
GGAAATGAGA
TTGTAATAGT
TGTACACATC
TCTGCCAGCA
TGGAATGCCT
GCATATGGCT
ATTACATGCG
GTGAATATCT
AAATAGCAAA
CTGTAGCAGG
TATATGAGAA
TAAGGATGGG
AAAGGAATAT
GGCTCGGTAA
CAAATACTTA
TATTTAGATT
TTTGGTGGAT
CTCTGCCGTT
TTAAGCATGT
AATATGCTGC
ATGGGCGATA
GCAGAAAGTC
AAGGTACTAG
CTTTGTAGAG
ACTCATTTTG
CGGAGGGAGT
TTGGCTGCCT
AACGTACCAT
TTCTTATTTA
AATTTATACC
GCTTACGAGC
TCAATAAGTG
CAACCTTGTC
GGCTATTCCC
GGTTTAACTT
CACAAACAAG
TTAACTGTTC
ACATTTTGCA
GTAGCAGATA
CACAATGATC
AGACATTTGC
TCTCGGAAAT
CCAGTCAACA
AGTTAGTATA
CAGTAGGTAA
ACAGGGTCAT
AAAAAACTTT
TTGCTACTAC
TAAATACTAA
ACCAGAATTT
ACAAAAGCTG
TTTTTGAAGC
AGTGTAATTT
TTGTGCACAC
ATGATCAAGC
CTGTTACTTT
ATTCCACTAT
CTTCGTATGT
ACATAATTGA
CACCCATCAC
GTGGTACTGT
TTTGATTGCT
TTGTATGATA
ATATTTTTTG
GGAGTGTGTG
AATTGCTTCA
AAGGACATGA
AAATTCCTGC
TAATCCTGAG
ACAGTTCTAA
TCCCTGTTCC
AATGGATTGA
TAGATTACAT
CTGTTATCAG
GTAATGGCTA
CAGTTAGCAA
TAAACTGTGG
TAAATTTAGC
GTACCATATC
ATGATGATCC
ACAGCTGCCA
CTCGATACAT
CTGCCCTCTT
TTTTCAGTTT
TGTGAGCTGT
AGCATTTCTT
CCTAACAAAT
ACTAGTTGGT
TGGACAAAAG
GGACCACATA
AGTCCATAGT
TTTGTCTCAT
CAGCTATTTC
GGCGGCTTGT
TATGTTACCG
ATGCATCACT
ATGGCTGTAA
ACTAATGTTG
GAAGGCTAAC
ATTATACTTC
ACTGCTATGC
CTTTCAACTC
TTTGTGTAAC
AGATGGTGGT
ACTCCTCAAG
TTTCTAAATG
CTTGAATACG
GTGTCTTTAT
TATTTTCAGA
TCATTAATTG
CAGATAAAAT
TAGTTGTAAT
AGATAGATAT
ATCTGTCATG
TGGCAATAAT
GTTAGTAATG
TTTGCATCAT
TGGTACTTAA
TAACACAGGC
AGAAGGTGGC
GACAAGACTA
AATTACTCCC
GTATATAGAT
GAAATCTCTA
CAGATTGCTA
CCAACTGTTA
GAACTTTGAC
TTCATTTGCT
TAGTAATTTG
TTTATTTGAT
TATTATTTAT
TTTGATTCCA
AGTGTGTTCT
AATCTCACTG
ATGAGAAAAT
TGTGAAATTG
GTTGGTTTTG
TAAGTGCAGG
GTAAAAAGGA
AGAAGTCAAA
GCTGGGCAGT
AACAATATTA
TGTTCACCTT
AAATCGTTAT
TAATGAAAAG
GCAGGAACGC
ATCTGTGTTC
AAACTTAACT
ATGTGCTCCC
TATTTTTGTG
TACATTCTTG
AAAGTGACGA
TTGAGAAGTG
TTGCATTCTG
TCCCGTTCCT
GCATTTTAGA
CAGAGACTTA
GTGTTTTCTT
CTTGAGCAGA
AGTTATGTTG
CATTCCTTTC
AAAAGTGCAA
AGTATGCTTG
TTAATTGCGG
TAAACGCTTT
GTACATGTAT
TATGTTGTAG
AGAGTCCGCT
TTCAGGTCAG
ACTACCGCCT
AATATTGGTG
AAATATGTAT
TACATGATTT
GTACATTGCG
TTTATATCCG
TTGTCCTGTT
TAGGTTTACA
GCTGACAAAA
GACTAAAGCT
TGCTTTGTGC
ATTCAACCAA
TGCTGCTGTT
TGTGAGTAGT
GAAGTGTCCA
ATCTTGGAAA
TGTAACTTAT
GTTGATGGAT
AAATATAAGT
GTGTAGATTC
TATTTAGGAA
GTGATAAAGA
ATTTGCTGAA
CAATTTTCTG
CGAGACCAGC
7020 7080 7140 7200 7260 7320 7380 7440 7500 7560 7620 7680 7740 7800 7860 7920 7980 8040 8100 8160 8220 8280 8340 8400 8460 8520 8580 8640 8700 8760 8820 8880 8940
CAAAGTCACG TGTTAGCTGT GTGATCTGTT
GCTAAAATCC
CTTTCTTAGA
CCGCCTAAAG
CATTTTGGAG
CGCTCCTTGA
CATTCAAAAG
TGTTCCAATT
TATAACTACA
CTGTGCTGTA
CAGGATATGT
TAGCATTACA
ACTTCATGGG
ATGATTGTGA
ATCTGTTGCT
CAAGAGGCCC
GATAAATGCC
TAGATCTTTA
CATTGCTTTT
AAAAATATCA
AGTTCGATCA
TTGTTGCATA
ATTAATTCCT
AAACTATTTT
AGTTTATGAC
TCCTCAAAAG
ACTACCGTGT
CACCCTTCAC
TTTGTTGGTC
CTCGAGCTGT
TTCATATTAT
ACTAGAACTA
GGACAATTGG
CTTGGACTCT
AACGAATTAT
TGATTACCAT
GAGTGATTTT
ATATGCTTAG
GGTTTTATTA
GAAACGGTCA
TTATGAGTTT
GTTGTTTTTA
AATTATTTAT
ATGATTTCAT
TAAAATGATC
AAATGAGTTT
TTTACTGTAA
TCCAAGGAGG
ACAAACTCTT
GCCGTAGATT
TTGGCCATTT
GTAGTTTTGT
TGATTTTTTG
GGCAATGCAG
ACAAGTCACA
GTAATGAGAT
CTTAAGTGCT
ATCTGAGCAC
AGGAGATTTG
TGGGTGTTCC
CAGTAGGGTT
GTGCAGCTAT
TGTAGCCATA
CTACTTAAGT
TTTTCCGAAT
CTGGGTTTTT
GACGATGCAC
TTGCTTGAAT
AGTGCCTGAA
TATTGGATAG
TAACAGCTCT
TGGCGCCATC
CATCATTCTA
TTGGGACTCC
TACCAGTGTA
CCGACATAGA
GGCTCTGGAT
AGGCTTGTCA
GGGCATCCTG
TTTGAACCAT
AAGTTAACTT
CCAACCGGCA
TGATCTTGTA
ATTTCTTGAT
AGACGTTAAC
CAGGGAGATG
CATCTTGAGG
GTTTAACGTC
GAAAACTGTG
TGTGTATTGA
CAGTATGTTT
GTATTTGTTT
AAGCCTGGGA
AGTGGGGGCT
CAATATAAAG
GGAAGGTTGT
GTTTGTTTCA
CTACCCTAAC
GTTAGTTGTG
TCTTTGGTGG
ATCTGAATCT
TTAAATATAC
GGCTGAAATA
ATTCCTGGCC
GGGAAGTTTG
TTTGTAACTA
ATCAGGACCA
AAAGGGAACA
GTTTTATTCC
ACAGCATGAA
AGGCTTCAAC
CCATGGGTTT
GTCAGTCTTT
GCTTTTCTTT
CTATTTACTT
AAGTTCTCCC
AGTTTTAGCT
GAAATCATAA
ATAAGTATGT
CAGATTTTCT
AAAAATATGG
AGTCTCTTCA
CAAAGGCGGA
TACATATACC
CACGGAAACA
TCAACTTCCA
AGTACAAGGT
TCTACAACTT
AATAGGGTAA
TCTTAACAGC
ATCTTTATGC
CATCCTAGCA
ACAGTTTCTG
ATTCAGCAGG
TGAGCAAATT
AGACGTATAG
GTTTTGGTGT
GAGTCTTCGT
GTCACAAGTC
GTGGCACCTG
CCATACTAAG
AAAGTGTCTC
AGGACAGTTG
CATATCAAGC
TCTTCGCATT
AGGTGGTGAA
ACAACATTAT
CACATTGTAT
GGCAGAATGG
CTGGAAATAA
GTGCTATTAC
TGTTTGTTAG
GTTGAGAGTT
TAGATATCGT
GGTATGTCAC
AGTGGTAAAA
GCTGGAATTG
AGCACTGACA
TGAGGAAGAT
CTGGAGCAAT
ATGCTTGCCT
TTAATTCCAC
TTTGTAAAGA
CCCGAAGCAC
TCAGTTGGAC
GTTTTAGAGC
CTATTTCTTA
CTTGATCATG
TTATTAATAG
TCACCTGGCT
TTCTTGGATG
TACAACATAA
TGCATCTACA
TAAGGAAACA
AGCAAGATTC
ATATTGTGCT
ATACTTGGTA
TCTCTTTGTG
GATCGTGGCA
GGCTATCTTA
TGCATTCTGC
GTATTATGTA
ATAGATTTTC
CAATAGTTAT
ATTCCCTCAC
GAAAGATCAA
GTTGATCATT
GGTATGCAAG
TGGTTTGTCT
AAAGTGTAGA
CTTTTCACCA
ATGTAACTGC
AAGGTGATCA
AGCTTTTTTG
TTTCATTGTC
ATGGATAGAG
AAAGAATTTG
ATACCATTCA
TCGGTCTAAT
AGCCCCATTT
ATCAGGTGGC
ATGTCGACTA
9000 9060 9120 9180 9240 9300 9360 9420 9480 9540 9600 9660 9720 9780 9840 9900 9960 10020 10080 10140 10200 10260 10320 10380 10440 10500 10560 10620 10680 10740 10800 10860 10920 10980
CTTCACAACC
AAATCTGAAT
CCGCGCTCTT
TAAGAACCAG
TGAGCGAAGC
GTGCCTCTTC
GAAAGAAAAT
CACATTCCCG
TCC
GTAAGTCTGG
CAACTTCCCA
TCTCGGTGTA
CAGCGGCTTG
GACGGGCAAC
CCCAGATGCC
GGACGGGCCT
GTTGTTTTTG
GCTCAAGCGT
ATTGCTGATG
CACTCCGAGC
TTACAAGGCA
GGCGCGAGGC
AGGAGGAGCA
GGGTGTTTGT
TACATATAAC
11463
CACTTGACTC
CCCTTGCAGG
AGAACTGCGG
AAGAGAGAAC
TGCTCCAAGC
GATGGATAGG
TGTGCTGCAC
TAATAATTGC
GTCTTGACTC
AACATCCGCA
TCGTGTATGC
TCCAGAGAGC
GCCATGACTG
TAGCTTGTTG
TGAACCCTCC
CCGTGCGCTC
AACTGCTTAC
TGACAACAGG
CCTTACAGAG
TCGTGGATCG
GGAGGGGATC
GTGAGCGCTC
TCCTATCTTG
AACGTGAAAA
11040 11100 11160 11220 11280 11340 11400 11460
INFORMATION FOR SEQ ID NO: 11:
SEQUENCE CHARACTERISTICS:
LENGTH: 2662 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
TISSUE TYPE: Endosperm
(ix) FEATURE:
NAME/KEY: misc-feature
LOCATION: 1..2651
OTHER INFORMATION:/product= "nucleotide sequence of cDNA wheat SSS I"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TCTCCCACTC TTCTCTCCCC
GGCCTCGGCC
GGAGCGCGCC
GCACCGAGCG
CCGCGCGCCC
CCGCCTGCGC
GCTCCGGCGC
GCGCCCCGCG
GCCGCCCGCG
GGGGGAACTC
ACCGGCAAAC
CCGCGGCAGC
GGGCGATCCA
ACACCCATGG
GCCGATCCGG
TTGGCGCGGG
CAGCAGCAGC
CCCGCCCAGT
GCGCCCGACC
GCGCACACCG
CCCCCGATCC
AGCAGCACCG
CCGTCCGTGC
CGGCGACGGG
CGACGGCGGC
GCCGCTACGT
AACTGGCCCC
CGCCGGCCCC
TCCTGCTCGA
AGTCGGCACC
GCTTTTGCAG
CAGTGGGAGA
GTCCGCACCT
CGTCGGCGCC
CCGGGCGTCC
TGCCGAGCTC
GCCGCTCGTG
GACGCAGCCG
AGGGATTGCT
GGCTCATCAC
GCAGCGCACT
GAGAGGCTTC
CCTCCGCCTC
GGGTGCCTCG
GCCTGCGTCG
AGCAGGGAGG
CCAGGCTTCC
CCCCTGCCGG
GAGGATTCCA
CCATCACCTC
AAAACCCCGG
GCCCCGGCCC
CTCCCCTGTC
CCCCCAGCGT
TCCGCGCGCG
GCCCCGCGGC
TCGCGCCGCC
ACGCCGGCGT
TCGACAGCAT
120 180 240 300 360 420 480 540 600
AATTGTGGCT
TAAAGTTACA
GGGGCTGGGA
GATGGTTGTA
ATACACTGGG
TCATGAGTAT
AGGAAGTTTA
CCTTTGCTAT
ACAGAATTGC
TGCAAAATAT
TTTAGCACAT
ATGGTATGGA
GGGTGAGGCA
CAGTCAGGGT
CTTAAGCTCC
GAACCCCACC
GGCCAAATGT
TCTGATTGGC
CATTCCAGAG
TTTTGAAGGC
TGGATTTAGT
ATCCAGGTTT
TGTAGTTCAT
AAAAGGAGAG
GGCATTGCGA
GAAGCGAGGC
CTTCGAATGG
AGCGCGGGTC
ACCCCTGTAC
CGCCGGTTCG
ACAGTTACAG
CCACTCAGAT
TCTTTAGCCT
GCAAGTGAGC
CGTAGCATCG
GATGTTTGTG
ATGCCAAGAT
AAGCACATTA
AGAGACAACG
TATGGAGATA
GCTGCATGCG
ATGTTTGTTG
AGACCATACG
CAGGGTCTGG
GCTTTAGAAT
GTTAACTTTT
TATTCATGGG
CGAAAAAGTG
ACAGACAAGT
AAAGCTGAAT
TTTATTGGAA
CTCATGAGGG
TGGATGAGAT
GTTCCAGTTT
GAACCTTGTG
GGAACTGGGG
GAGCGTACAG
ACCGCGATGT
ATGACGAAAG
GCCTTCGTGG
TCCTTGAGCT
ATTGCGTTGT
AGAGTAGATG
TTTTGGGGAA
GGCAGCCTCT
TAGCGATTGT
AGGATTCTGA
TGTTTGTGAC
GTTCGTTACC
ACTTGAATGG
AGATTCCATG
TCGATTGGGT
ATTTTGGTGC
AGGCCCCACT
TGAACGATTG
GTGTTTACAG
AGCCTGCAAG
GGGTATTTCC
TGAAAGGAGC
AGGTCACAAC
TATTGAATGG
GTCTCCCTCA
TGCAGAAGGA
GACTGGATTA
AGGACGTGCA
CTACCGAGTC
CCCACAGAAT
GTCTTAATCA
GCCTCCGAGA
GGTGGGCGTT
CGACATTCAG
ACCATACGTG
ACCAACCCTA
CTGAAGACAT
CCTGCTACAG
ACGGCTGTGC
TAAGGAAGGG
CTGTCCGTGT
GAAGTTTGTT
GATCATGGAT
TGGTGAAGCT
AATTGCTCTT
GTCCTCTGAT
CTTTGGGGGA
GTTTGTCGAT
TTTTGGTGAT
AATCCTTGAA
GCATGCCAGC
AGATTCCCGC
TACATATCCT
AGAATGGGCA
AGTCGTGACA
TGCTGAAGGT
AATTGTAAAT
TCATTATTCT
GCTGGGTTTA
CCAGAAAGGC
GTTTGTCATG
GAGTTACAAG
AACTGCAGGT
GCTATATGCT
CACAGTCGAG
CTCACCGCTA
GGAGCACAAG
GGACCATGCC
CGTCATGTAG
GTTCCTCATC
TAGAGTCGCA
TGCTGCGGCG
ATGTGCTGCA
TACAGCTGAA
GCATTCTGTG
GCGAATGAGC
GCTCCTTATG
GCTGCTCGTG
AAAAACTATG
TCACATGAAG
CATCCGTCAT
AATCAGTTCA
TTGGGAGGAT
CTTGTGCCAG
AGCACCCTTG
GATCTGGGAT
AGGAGGCATG
GCAGATCGAA
GGACAGGGCC
GGAATTGACA
GTCGATGACC
CCTGTAAGGG
ATTGATCTCA
CTTGGATCTG
GATAAATTCC
TGCGATATAT
ATGCAATATG
ACCTTCAACC
ACCGTGGACA
CCGTCCTGGG
GCCGAGCAGT
ACGGGGACTG
CTTCCGCGGC
ATGCGCCTGC
GTGACAGCTT
GGATGGTTAA
ATCAGAAACC
TATGTTGTCT
AACCTCAAGC
CAAAGTCAGG
GTCACCGTGT
CAAAGGCATT
TGACCTTTTT
ATCATAGACC
GATACACACT
ATATTTATGG
TCCTTCTTGC
TTATACATAA
TGCCACCTGA
CCCTTGACAA
TTGTGACCGT
TCAATGAGCT
TTAATGATTG
TCTCTGGAAA
AGGATGTTCC
TTAAAATGGC
GGGATCCAAT
GTGGATGGGT
TGTTAATGCC
GTACAGTTCC
CTTTTGGTGC
AGATGTTGTG
AGGGGCTCAT
ACGAGCAGAT
GGGAGGTCGA
CCGGAAGGAT
TTGCTTGGTC
CGGGTGGATG
CAGCAAAGCA
AACTGGTGAC
TGTCCTTAGC
660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580
TGACAAATAT TAGACCTGTT GGAGAATTTT ATTTATCTTT GCTGCTGTTG TTTTTGTTTT 2640
GTTAAAAAAA AAAAAAAAAA AA 2662
INFORMATION FOR SEQ ID NO: 12:
SEQUENCE CHARACTERISTICS:
LENGTH: 768 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
(ix) FEATURE:
NAME/KEY: Protein
LOCATION: 1..768
(ix) FEATURE:
NAME/KEY: Protein
LOCATION: 1..768
OTHER INFORMATION:/product= "deduced amino acid sequence SBE II"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
Met Ala Thr Phe Ala Val Ser Gly Ala Thr Leu Gly Val Ala Arg Pro
1 5 10
Pro Ala Ala Ala Gln Pro Glu Glu Leu Gln Ile Pro Glu Asp Ile Glu
20 25
Glu Gln Thr Ala Glu Val Asn Met Thr Gly Gly Thr Ala Glu Lys Leu
40
Glu Ser Ser Glu Pro Thr Gln Gly Ile Val Glu Thr Ile Thr Asp Gly
55
Val Thr Lys Gly Val Lys Glu Leu Val Val Gly Glu Lys Pro Arg Val
70 75
Val Pro Lys Pro Gly Asp Gly Gln Lys Ile Tyr Glu Ile Asp Pro Thr
90
Leu Lys Asp Phe Arg Ser His Leu Asp Tyr Arg Tyr Ser Glu Tyr Arg
100 105 110
Arg Ile Arg Ala Ala Ile Asp Gln His Glu Gly Gly Leu Glu Ala Phe
115 120 125
Ser Arg Gly Tyr Glu Lys Leu Gly Phe Thr Arg Ser Ala Glu Gly Ile
130 135 140
Thr Tyr Arg Glu Trp Ala Pro Gly Ala His Ser Ala Ala Leu Val Gly
145 150 155 160
Asp Tyr Ala Gly Pro 225 Glu Arg Asn Leu Tyr 305 Arg Giu Asn Tyr Leu 385 Ala Gly Thr Val1 Asp 465 Ile Phe Gly Ile Val1 210 Gly Lys Ile Ser Gly 290 Ala Phe Leu Asn Phe 370 Phe Arg Val Gly Val1 450 Ala Pro Asn Val1 Pro 195 Lys Giu Tyr Tyr Tyr 275 Tyr Ser Gly Gly Thr 355 His Asn Trp Thr Asn 435 Tyr Val1 Val1 Asn Trp 180 His Asp Ile Val Giu 260 Ala Asn Phe Thr Leu 340 Leu Gly Tyr Trp Ser 420 Tyr Leu Ser Pro Trp 165 Giu Gly Ser Pro Phe 245 Ser Asn Ala Gly Pro 325 Leu Asp Gly Gly Leu 405 Met Gly Met Ile Asp 
485 Asn Ile Ser Ile Phe 230 Gin His Phe Val1 Tyr 310 Giu Val1 Gly Pro Ser 390 Giu Met Giu Leu Gly 470 Gly Pro Phe Arg Ser 215 Asn His Ile Arg Gin 295 His Asp Leu Leu Arg 375 Trp, Glu Tyr Tyr Val1 455 Giu G ly Asn Leu Val1 200 Ala Gly Pro Gly Asp 280 Ile Val1 Leu Met Asn 360 Gly Giu Tyr Thr Phe 440 Asn Asp Val Ala Asp 170 Pro Asn 185 Lys Ile Trp Ile Ile Tyr Gin Pro 250 Met Ser 265 Giu Val Met Ala Thr Asn Lys Ser 330 Asp Ile 345 Gly Phe His His Val Leu Lys Phe 410 His His 425 Gly Phe Asp Leu Val Ser Gly Phe 490 Thr Met Asn Ala Arg Met Lys Phe 220 Tyr Asp 235 Lys Arg Ser Pro Leu Pro Ile Gin 300 Phe Phe 315 Leu Ile Val His Asp Gly Trp Met 380 Arg Phe 395 Asp Gly Gly Leu Ala Thr Ile His 460 Gly Met 475 Asp Tyr Thr Asp Asp 205 Ser Pro Pro Giu Arg 285 Giu Ala Asp Ser Thr 365 Trp Leu Phe Gin Asp 445 Gly Pro Arg Arg Asp 175 Gly Ser 190 Thr Pro Val Gin Pro Giu Giu Ser 255 Pro Lys 270 Ile Lys His Ser Pro Ser Arg Ala 335 His Ser 350 Asp Thr Asp Ser Leu Ser Arg Phe 415 Met Thr 430 Val Asp Leu His Thr Phe Leu His 495 Asp Pro Ser Ala Giu 240 Leu Ile Arg Tyr Ser 320 His Ser His Arg Asn 400 Asp Phe Ala Pro Cys 480 Met WO 99/14314 PCT/AU98/00743 85 Ala Val Ala Asp Lys Trp 500 Trp Glu Asp 545 Met Leu Tyr Phe Asn 625 Ala Gin Val Asp Tyr 705 Ser Asp Lys Lys 530 Lys Ala His Leu Pro 610 Asn Asp His Ser Leu 690 Arg Asp Tyr Met 515 Cys Thr Leu Lys Asn 595 Arg Asn Phe Leu Arg 675 Val Val Asp Phe Gly Val Ile Asp Met 580 Phe Gly Ser Leu Glu 660 Lys Phe Gly Ala Thr 740 Ile Tyr Phe 550 Pro Arg Gly Gin Asp 630 Tyr Lys Glu Phe Ser 710 Phe Glu Ile Val Ala 535 Trp Ser Leu Asn Thr 615 Lys His Tyr Glu Asn 695 Arg Gly His Glu His 520 Glu Leu Thr Val Glu 600 Leu Cys Gly Gly Asp 680 Phe Pro Gly Pro Thr 760 Leu 505 Thr Ser Met Pro Thr 585 Phe Pro Arg Met Phe 665 Lys His Gly Phe His 745 Ala Leu Leu His Asp Arg 570 Met Gly Thr Arg Gin 650 Met Val Trp Lys Ser 730 Asp Val Lys Thr Asp Lys 555 Ile Gly His Gly Arg 635 Glu Thr Ile Ser Tyr 715 Arg Asn Val Gin Asn Gin 540 Asp Asp Leu Pro Lys 620 Phe Phe Ser Ile 
Asn 700 Lys Leu Arg Tyr Ser Arg 525 Ala Met Arg Gly Glu 605 Val Asp Asp Glu Phe 685 Ser Val Asp Pro Ala 765 Asp 510 Arg Leu Tyr Gly Gly 590 Trp Leu Leu Gin His 670 Glu Phe Ala His Arg 750 Leu Glu Trp Val Asp Ile 575 Glu Ile Pro Gly Ala 655 Gin Arg Phe Leu Asp 735 Ser Thr Ser Leu Gly Phe 560 Ala Gly Asp Gly Asp 640 Met Tyr Gly Asp Asp 720 Val Phe Glu Ser Val Tyr Thr Pro Ser Arg
INFORMATION FOR SEQ ID NO: 13:
SEQUENCE CHARACTERISTICS:
LENGTH: 10550 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
ORGANISM: Triticum tauschii
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 1..316
OTHER INFORMATION:/product= "exon 1"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 1472..1828
OTHER INFORMATION:/product= "exon 2"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 2766..2823
OTHER INFORMATION:/product= "exon 3"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 2906..3028
OTHER INFORMATION:/product= "exon 4"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 4113..4194
OTHER INFORMATION:/product= "exon
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 4286..4459
OTHER INFORMATION:/product= "exon 6"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 4562..4643
OTHER INFORMATION:/product= "exon 7"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 4744..4855
OTHER INFORMATION:/product= "exon 8"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 4999..5021
OTHER INFORMATION:/product= "exon 9"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 5102..5192
OTHER INFORMATION:/product= "exon
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 8593..8718
OTHER INFORMATION:/product= "exon 11"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 8807..8915
OTHER INFORMATION:/product= "exon 12"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 8992..9104
OTHER INFORMATION:/product= "exon 13"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 9161..9199
OTHER INFORMATION:/product= "exon 14"
(ix) FEATURE:
NAME/KEY: exon
LOCATION: 9498..9713
OTHER INFORMATION:/product= 
"exon (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: ATGGCGGCGA CGGGCGTCGG CGCCGGGTGC CTCGCCCCCA GCGTCCGCCT GCGCGCCGAT CCGGCGACGG CGGCCCGGGC GTCCGCTI'GC GTCGTCCGCG 100 CGCGGCTCCG GCGCTFGGCG CGGGGCCGCT ACGTCGCCGA GCTCAGCAGG 150 GAGGGCCCCG CGGCGCGCCC CGCGCAGCAG CAGCAACTGG CCCCGCCGCT 200 CGTGCCAGGC TFTCCTCGCGC CGCCGCCGCC CGCGCCCGCC CAGTCGCCGG 250 CCCCGACGCA GCCGCCCCTG CCGGACGCCG GCGTGGGGGA ACTCGCGCCC 300 GACCTCCTGC TCGAAGGTAA AAAACAAGGC TGAATCCTCA GATCACTCCG 350 CGTCTJTCGTT 1TACCAAATA CGGTACTGCG AAGTGGTGCT GTATATGTGA 400 AG1TCTGTC GATF7CY17CC TGACGGATGT TCAGTCGAYI' CAG17GTATA 450 TATGTGATAC G'ITCGTTGTT CATCGATCGT ACAGA'T'AC CAGCACACTA 500 GATAGAAATC GAGACCGAC.G CGGGCAGATC AATAGATITI TCTAGACG'IT 550 'ITATTGGATC GTGAGATGAT TGATTrGGGGT GGCGTGTCGA TACGATAGCG 600 GTGCACCGCC GATGTATCGG GGCATGTGCA CGTGGTTGGG TCTCAGCAGA 650 CATATCACTA GACTGGTATC GTAATITACT AGTACTACTG GAAAGAGGAC 700 TAAAAAGGCT AGGCCAAGTG CACGCATGT T GGGAACGTI7G TI'AAATTGAT 750 GAGTGTCC TITGCTTGGG CTGGTAY17AT TACCAAAAAA TGGTG'TAGT 800 WO 99/14314 WO 9914314PCT/AU98/00743 88 CCCTGTACr-r ATFAATGGGA AAATCTFAAC ATGACACTGG GGTFATGAG 850 TCTCCAA1TG TATAFTCTCA GCACTCAACT GATIIACTG ATACTGTAGT 900 GGAAATGACA CGTGAGCACC CCCC'T7CAAG GAATGCAATG CTTICTITCTG 950 TMATAYFA CAGGAACTAG AAGGAGCTT'C CACCTITTGAG TACAGAAGTA 1000 CTCCCTCCGT TCCAAAATAG ATGACTCAAC 'ITGTACTAA TIIGTACTA 1050 TAGTITAGTAC AAAGTTrGAGT CATCTA=II AGAACGGAGG GAGTAGTATC 1100 GAAATrGAAG ACCCTrGTAT TACTGTCTrG FTITCAATG AAAATGGGAG 1150 GCCCATGCAG TAAGTCACAT GGGCACCTGG GAGGCTGGGA TCATGTGTGC 1200 MrIGCAGAGT ACTAGACCCA GCTCACCCTC TGTTAGA1TA CT1TGTrGGGC 1250 TGCTACMTG TGMIIGCTGT GCAGTATATC AGACATCCTG AAMTGGCAT 1300 CTAGCTGAGA ACAGAATGCA GGTITGCACCA TrCT7ATTAT TGCTAAACTG 1350 'ITGTCACGCA A'IMATAAAG AATGTGATCT TCTGAGTATr AAITrAATCAT 1400 G'17CTGCTAA TATCTGTCCT CGCTCTGGTG YrGACAAATA TACCATATGA 1450 ATA=IICCA TITGCAACC AGGGATI'GCT GAGGATTCCA TCGACAGCAT 1500 AATCGTGGCT GCAAGTGAGC AGGATTrCTGA GATCATGGAT GCGAATGAGC 1550 AACCTCAAGC TAAAGTrACA CGTAGCATCG TGTITGTGAC TGGTGAAGCT 1600 
GCTCCTrATG CAAAGTCAGG GGGGCTGGGA GATGT'TTGTG GT-rCGT-TACC 1650 AA1TTGCTcT-r GCTGCTCGTG GTCACCGTGT GATGGTTGTA ATGCCAAGAT 1700 ACT-rGAATGG GTCCTCTGAT AAAAACTATG CAAAGGCATFT ATACACTGCG 1750 AAGCACATTA AGATTCCATG CMIIGGGGGA TCACATGAAG TGACCT1T 1800 TCATGAGTAT AGAGACAACG TCGA17GGGT GGGTACACAA TCACCTTrC~r 1850 ATITCTCTGT-r GAAYrGTAGC AACTGMIIAT CCTFTG'FIAC ACTFTCYIIA 1900 GCCCTGCAAA GACATATGTG AMIICCATAC TYT=GT-rA MICCCTrGT 1950 ACTCT-rCCTC ATGAAGGTCA AAATATCATA TATCCATGGA AGTCATGCAT 2000 GTGCCTAGTA ITrTTGGTGT CGGTGCC1TI7 AACTFICAGG GAT-TAATACG 2050 TGGAATrTGA TAACTAAAGT T-rA=IIATT GAAAAAAAT-r GTAGGTrGG 2100 TGAGCCCACA.GCCACGCAGT GGCACCACTG C'rrGCACATG ATM~GCATT 2150 TCTGITIGCA CCGAGCAcTTr CATGTGAATA AGGTGTAAAA TCATAAAGTA 2200 WO 99/14314 WO 9914314PCTIAU98/00743 89 CCAATI=AT TCTGCCAATr GCACITAAGA GTATATACAT TFATCVFGGC 2250 CTCAATCATG GGAGTACTGT GCATTCAGTG CACCATCArr7 GFTCTAAGGA 2300 GAAAATGTGG GTGCAAGGAA GACACTFTIG TCCC1TAATA AAAGGCAGGC 2350 ACTCTGflGT CATATAGATA GAAAGCAACA AACTTfATIC AAAGAGCTAA 2400 CAATGGCAAA AGAACCAAAA AAAGCATGCT AAGGCGGTGA CACCAAAAGG 2450 TGAGGGGGGC C'ITGTGACTG ACAGCACCCC AAACTATrGC CATFFGTFFIA 2500 CTAAATGAAG ATCATITAG AAGCTCTCAG GAACTTCGAA AACAGTGGCT 2550 'TTCCGTCCAC AGATCGTCTG TrAATA1Tr TGTCCAGTGA TACTIFFTI 2600 GCTCCTI'ACA AGAGTGCCTA TGTrTGACATA TACATfGTFA AGI7GTrCAT 2650 AAG1TT'ACT7 CTTATrCTAA ACAGCAAGTG CCTAATGCTF7 GCAMIIAM 2700 TGGCTAnTTA T1TFATrCT CATITCAATC AACACTITG TTrCAGGTGTr 2750 TGTCGATCAT'CCGTCATATC ATAGACCAGG AAGTFTrATAT GGAGATAAT-r 2800 TFIGGTGCTIT TGGTGATAAT CAGGTACACT ACACTATACT AAGCTCCTAG 2850 TTGACTAAGT CGTAAGTFrGT ACCTCCTCGC TGACCGGCTG CTCTATGTCG 2900 TGCAGTITCAG ATACACACTC CTI17GCTATG CTGCATGCGA GGCCCCACTA 2950 ATCCTTrGAAT TGGGAGGATA TArTATGGA CAGAATrGCA TGTITTGTrrGT 3000 GAACGATTGG CATGCCAGCC TTGTGCCAGT GTACGTFTGT-r TGTGGATCTG 3050 AAAGTCCAAT CCTITrAT7CA TTCTCTGCTr TGCAGTGTGC CCATGTCTAC 3100 AT1C=f~A TGCT-TITIC ATGTCTGTTC TITATATrGCA TATATGCTrA 3150 TGGAGTCTAA AAGTTACCGG AGGGAATAAC TCTTAAGGAT TfCCTCAATC 3200 AATTATCTIT 
AGC'FIAGTT AACATIACT GTGGCAAACA TAATGTGTI 3250 TGAGA1TIAC AAGT7CAGAG ATrGCACTrC ACTAGTCGT AGCTAATCTG 3300 ATG'TTICCC CGAGAAAATG CCTAAAGcTr TGTGTCT-rGA TGCATTrGATA 3350 GAAAAAGAGT TfATGTACAC TCCCAAA GAG GGGACCCAAA AT7ACAACAC 3400 CACACCCCTG AGAACTAGGC GCTGCCGGAA GAAGCGATGC AAGCCCCACT 3450 GCCCCTGCCT TAGCTCAAAG CCGGGCGTCA GCTGATTGT GTCAAGTAAG 3500 CTAGCAGTGC TAGATTGCGC AAGGTCGATT CGTCGAAGAT GACAGTGTTG 3550 CGCTGCTFcC AAATCCACCA AACTATGAGC ATGATCACTG GAGAAGTACC 3600 WO 99/14314 WO 99/ 4314PCT/AU98/00743 YTTCTCGCG GCTGAGGGGG TGGACTGGTG GTCTGCTGCT GCCAG1TIC 3650 AGATAATCTG AAAAATGCAT G I IIrIGATGA IT=AGTATC TTGCGGACCC 3700 TGGGTACCAC CTAAGCMIC ACACAGTAAT T-rGCAGTFAC ACCTATAAAA 3750 GTAACGGTCA TGATATGCAT GTG'TITGGG TAGATCATGG TGCATGCATT 3800 TFIAGGAATIA GGACATGCCA GAACCACGTG AGGCTFATGG GGCAATITCAT 3850 TFFGTrCCA-IT ATACGAGTCA TGAATATGGT TCAGCATGTT1 TGGACGCTAC 3900 TI'GT-rTGGGG CAATITICAGA TGGTGAATITG TAGCTGCTT'G ATGTFFGGCTA 3950 GCTGGC-FrAT T1IGTACAAG TATCGATGTF AGATGCATAT TTCCT17GT 4000 TCTTGTGCTG TI=GCCATGT TGTATrCCCC FTTCTGTCG CCAGTGTTGC 4050 ATGTrFAAAT-T GG=TICA'rr ACATAATCAA CT17GTI'GCT GACATCAGTC 4100 ATFIATI'C AGCCTTFrG CTGCAAAATA TAGACCATAC GGTGTITACA 4150 GAGATFCCCG CAGCACCC~F GTTATACATA AI7rAGCACA TCAGGT17GG 4200 GTCTATCACC T17CATTATC CGTACATGGC TIGTAAGTC GGTrCACACG 4250 TATCGTCATA CTGTATGITA T-17CAATGTC ATrAGGGTGT GGAGCCTGCA 4300 AGTACATATC CTGATCTGGG ATT'GCCACCT GAATGGTATG GAGCT1TAGA 4350 ATGGGTAMII CCAGAATGGG CAAGGAGGCA TGCCCTrGAC AAGGGTGAGG 4400 CAG'17AACT-r 1TIGAAAGGA GCAGFTGTGA CAGCAGATCG.AAT-rGTGACC 4450 GTCAGTCAGG TGAAATACTC AATACTTCTC T-IT1TT7F GCGGGATGTT 4500 CFTCAGTTCA ATTGCCCTGT CTFTCACCCA ATrAAGAAAT GATI1AATCT 4550 1IG'ICTA GGGTTCA TGGGAGGTCA CAACTGCTGA AGGTGGACAG 4600 GGCCTCAATG AGCTCTTAAG CTCCCGAAAA AGTGTATI'GA ATGGTAACTA 4650 TATITGAATC CACTTATCT-r CTrCTGAAAC ATAFFIACAG AAATAGATGG 4700 ATGGGTI'GCA AGAATAAAYI' CAG=~GCTC TITCGGTATG AAGGAATTGT 4750 AAATGGAATT GACATTAATG ATI'GGAACCC CACCACAGAC AAGTGTCTCC 4800 CTCATCATrA TTCTGTCGAT GACCTCTCTG 
GAAAGGTGTG TGGATAGTAC 4850 CCTATATAAT AACATGTATA TCTGATCTAG TACT7CTI TTCTIITGCTA 4900 G1TI-rGCTTCC CATGATGTrC TCACTAACTA ATCCTATGTG GMFIGGCATA 4950 cTTrGTCAGGC CAAATGTAAA GCTGAATFIGC AGAAGGAGCT GGGTTIACCT 5000 WO 99/14314 WO 99/ 4314PCT/AU98/00743 91- GTAAGGGAGG ATGT-rCCTCT GG1TAGATAC AAACCCCTAA GATATATAIT 5050 TF=AAATCC CTAAAAAAAA CYFGCCGATC ATCTCATTAG CTTGATrTCAC 5100 AGATTGGCT-T TATFGGAAGA CTGGATrfACC AGAAAGGCAT TGATCTCATJT 5150 AAAATGGCCA TFICCAGAGCT CATGAGGGAG GACGTGCAGT T-rGTAAGTTC 5200 ATATTTCITI TCTrGAGACT AGAGTATAAA TCAAACATGT AGGTGTGGGG 5250 TGGTATAATA CAGACATAAG TTrCCAGCTAT TGCTTfCCATG AGAA=IIAA 5300 TGCTAFFCAG TAATATGCTA CTGCAAG'lM TGAAACAAAG TTGGAAGCAA 5350 TAAATATATG TGTAGCACTG ACCATGCAGT GCCACTATAG CTGGAATGTC 5400 CTGTAGTCTA TGTGATCTAA CACACTCAAC AACATGIT1 CGCATACAAA 5450 CACATGCGTG CGCGCAACAA ACATACTCTA CAATAAAATT GGCT-rGGTGA 5500 ACTGCAGACA TGCTCTFATC TCCAYFCCAA CAFIII T7GT TTCAACATrG 5550 GCTGAAGACT AAGAGAAGGG GGACCCAGGG TGATGTAGCC AACTAGATCC 5600 AGTAAGGAAG CTAGCCGAGC CTAGGAGGAT TCGCTTAGGT AGCTGGAACG 5650 TAGGGTCTCT GACAGGGAAG CTITCGGGAGC TAGTCGATGC AGTGGTGAGG 5700 AGAGGTGTTG ATATCCTTIG CGTCCAAGAA ACCAAATGTA GGGGACAGAA 5750 GGCGAAGGAG GTGGAGGATA CCGGCITrCAA GCTGTGGTAC ATGGGACGGC 5800 TGCAAACAGA AATGGCGTAG GCATCTITGAT CAACAAGAGC C'rrAAGTATG 5850 GAGTGGTAGA CGTCAAGAGA CGTGGGGACC GGATTATCCT CGTCAAGCTG 5900 GTAG1TGGGG ACTTAGTTCT CAATG'ITATC AGCGTGTATG CCCCGCAAGT 5950 AGGCCACAAT GAGAACGCCA AGAGGGAGTT CTGGGAAGGC CTGGAAGACA 6000 TGG~rAGGAG TGTACCGA'Tr GGCGAGAAGC TCTTCATAGG AGGAGACCTC 6050 AATGGCCACG TGGGTACATC TAACATAGGT T17GAAGGGG CACATGGGGG 6100 CTITGGCTAT GGCATCAAGA ATCAAGAAGA AGATGTCTrA CGCTFGCTC 6150 TAGCCTACGA CATGATTGTA GCTAACACCC TCTIMAGAAA GAGAGAATCA 6200 CATCTGGTGA C=flAGTAG TGGCCAACAC TAGCCAGATC GAMIICATCC 6250 TCTCGAGAAG AGAAGATAGG TGTGCGCGCC TAGACTGCAA GGTGATACCT 6300 TCGGATFrCGT GTCCAGCGGG ATAAGCGTGC CAAAGTCGCT AGAATGAAGT 6350 GGTGGAAGCT CAAGGGGGAG GTAGCTCAGG CGTT'CAAGGA GAGGGTCATFT 6400 WO 99/14314 WO 99/ 4314PCT/AU98/00743 92 AGGGAGGGCC 
C17GGGAGGA AGGAGGGGAT GCGGACAATG TGTGGATGAA 6450 GATGGCGACT TGCATTCGTA AGGTGGCCTC GGAGGAGTGT GGAGTGTCCA 6500 GGGGATGGAG AAGCGAAGAT AAGGATACCT GGTGGTGGAA TGATGATGTC 7000 CAGAAGGCAA TrAAAGAGAA GAAAGATTGC TTTAGACGCC TATACTFGGA 7050 TAGGAGTGCA GTCAACATAG AAAAGTACAA GATGGCGAAG AAGGCCGCAA 7100 AGCGAGCTGT CAGTGAAGCA AGGGGTCGGG CATATGAGGA TCTCTACCAA 7150 CGG1TAGGCA CGAAGGAAGG CGAAAGGGAC ATCTATAAGA TGGCCAAGAT 7200 CCGAGAGAGA GGAAGACGAG GGATATTGGC CAAGTCAAAT GCATCAAGGA 7250 TGGAGCAGAC CAACTCTFGG TGAAGGACGA GGAGATITAAG CATAGATGGC 7300 GGGAGTACTFF CGACAAGCTG TTCAATGGGG AGGATGAGAG TCCTACCA1T 7350 GAACTTGACG ACTCC1TrGA TGAGACCATC ATGCG=IJA TGCGGCGAAT 7400 CCAGGAGTCC GAGGTCAAGG AGGCT17AAA AAGGAGGCAA GGCGATGGGC 7450 CCTGATrGTA TCCCCATITGA GGTGTGGAAA GGCCTCGGGG ACATAGCGAT 7500 AGTATGGCTA ACCAAGCTAT TCAACCTCAT '1ICGGGCA AACAAGATGC 7550 CAGAAGAATG GAGACGAAGT ATATI7AGTAC CAATCATCAA ACAGGGGGGA 7600 TGTTCAGAGT TGTACTAATT ACCATGGAAT TAAGCTGATG AGCCATACAA 7650 TGAAGCTATG GGAGAGAATC ATFrGAGCACC GCITAAGAAG AATGACAAGC 7700 GTGACCAAAA ATCAGMIGG TICATGCCT GGGAGGTCGA CCATGGAAAC 7750 CATITITYG GTACGACAAC TrATGGAGAG ATACAGGGAG CAAAAGAAGG 7800 ACYI'GCATAT GGTGIrCATr GAC1TGAAGA AGGCCTATAA TAAGATACCG 7850 CGGAATGTCA TGTGGTGGGC CTITGGAGAAA CACAAAGTCC CAGCAAAGTA 7900 CATFrACCCTC ATCAAGGACA TGTACGATAA TGTrGTGACA AGTGTrCGAA 7950 CAAGTGATGT CGACACTAAT GACTrCCCGA 'ITAAGATAGG ACTGCATCAG 8000 GGGTCAGC'Tr TGAGCCCITA TCITITGCC TFGGTGATGG ATGAGGTCAC 8050 AAGGGATATA CAAGGAGATA TCCCATGGTG TATGCTCYI GTGGATGATT 8100 TGGTGCTAGT TGACGATAGT CGGGCGGGGG TAAATAACAA GTTAGAGT7IA 8150 TGGAGACAAA CCITGGAATC GAAAGGGMr AGGCTTAGTA GAACTAAAAC 8200 CGAGTACATG ATGTGCGGTT TCAGTACTAC TAGGTGTGAG GAGGAGGAGG 8250 WO 99/14314 WO 99/ 4314PCT/AU98/00743 93- 7TAGCCTFGA TGGGCAGGTG GTACCCCAGA AGGACACCT-r TCGATAMfG 8300 GGGTCAATGC TGCAGGAGGA TGGGGGTATr GATGAAGATG TGAACCATCG 8350 AATCAAAGCT GGATGGATGA AGTGGCGCCA AGCTTCTGGC ATTCThFGTG 8400 ACAAGAGAGT GCCACAAAAG CTAAGGCAAG TFCTACAGGA CGGCGGTTCG 8450 ACCCGCAATG TrGTATGGCG CTGAGTGTTG GCCGACTAAA 
AGGCGACATG 8500 TrCAACAGTr AGGTGTGGCG GAGATGCGTA TG17GAGATG GATGTGTGGC 8550 CACACGAGGA AGRATCGAGT CCGGAATGAT GATATACGAG ATAGAGTITGG 8600 GGTAGCACCA ATI'GAAGAGA AGCTFFGTCCA ACATCGTCTG AGATGGTTTG 8650 GGCATATFCA GCGCACGCCT CCGAAAACTC CAGTGCATAA CGGACGGCTA 8700 AAGCGTGCGG AGAATGTCAA GAGAGGGCGG GGTAGACCGA ATTGACATG 8750 GGAGGAGTCC G1TAAGAGAG ACCTGAAGGT TrGGAGTA'17 ACGAAAGAAC 8800 TAGCTATGGA CARGGGTGCG TGGAAGC'rTG TTATCCATGT GCCAGAGCCA 8850 TGAGTTGATC ACGAGATCT-r ATGGGTFF7CA CCTCTAGCCT ACCCCAACTF 8900 GTI1rGGGACT AAAGGC'TTrG T7GTTGTrGT TGTTGTTGTT GTTrGTAGCCA 8950 ACTAAATCCA GTITGATCAGT GGTFTTIACT CTrATITTA CAGGTCATGC 9000 TI7GGATCTGG GGATCCAAT7r TIGAAGGCT GGATGAGATC TACCGAGTCG 9050 AGTrACAAGG ATAAA-FrCCG TGGATGGG'Fr GGAYIAGTG 'TrCCAGTTITC 9100 CCACAGAATA ACTGCAGGGT ATGCCGAGAA CTFTCITAACA AGACCTFI'CGT 9150 TATCAGCTTG GATATAT7AT AATGT7CAAA ACATITATGT CTCTC I II 1 9200 GTGCAGTTGC GATATATTGT TAATGCCATC CAGG'FIGAA CCTTGTGGTC 9250 TITAATCAGCT ATATGCTATG CAATATGGTA CAGITCCTGT AGT-rCATGGA 9300 ACTGGGGGCC TCCGAGTAAG ACAACTGCCT TGAAAATrAT CGT-FATC1rrG 9350 GCTCCAACGC AAATGTflCTA AT-rGGCTCGT GTAITCAACA GGACACAGTC 9400 GAGACCTICA ACCC1TITGG TGCAAAAGGA GAGGAGGGTA CAGGGTACGC 9450 ACTGCTCAAT T1AGCTAAC MFCAGT-ITA TC1TFIGCA ATGTCTTTGGG 9500 GGTrCAT7GC GCCATAAATC AACTI'GTGAT AATrAACTGT TACTGTI'CTG 9550 TACTrGCAGG TGGGCGTITCT CACCGCTAAC CGTGGACAAG ATG1TGTGGG 9600 TAAG1TI=G CTGAGCTCTT GTCCGG'17AT AGGATCGACC TITGGCTGTAG 9650 WO 99/1 4314 PTA9/04 PCT/AU98/00743 94- CATGGTACCT TAGTGCCCCT TGTATATAGA CCTAACCTGA TGGACTCACT 9700 YIGTCTACAC TAATCATAGT AGTCGATTGC CCGGAGGCGT TITGCTTGGA 9750 17TCTGCTAAT TI'AATITFCA TGACGATAAC TCATACCATG UITGGTTCT 9800 CCGATGGGGG CCAGAATGGC GTCTAGTGTC TGCGATCTGT GTAACTAGCC 9850 AATGCCGGGT TG'TTCCAAGT GAAAATITFAC CTTGACCA TTGTGCAGGC 9900 ATTGCGAACC GCGATGTCGA CATFTCAGGGA GCACAAGCCG TCCTGGGAGG 9950 GGCTCATGAA GCGAGGCATG ACGAAAGACC ATACGTGGGA CCATGCCGCC 10000 GAGCAGTACG AGCAGATC1T CGAATGGGCC TTCGTGGACC AACCCTACGT 10050 CATGTAGACG GGGACTGGGG AGGTCGAAGC GCGGGTCTCC 
TTGAGCTCTG 10100 AAGACATGrI' CCTCATCCT-r CCGCGGCCCG GAAGGATACC CCTGTACATT 10150 GCGTFIGTCCT GCTACAGTAG AGTCGCAATG CGCCTGCTTG CTFTGGTCCGC 10200 CGGTrFCGAGA GTAGATGACG GCTGTGCTGC TGCGGCGGTG ACAGCTTCGG 10250 GTGGATGACA GTFACAGT'Ir TGGGGAATAA GGAAGGGATG TGCTGCAGGA 10300 TGGTITAACAG CAAAGCACCA CTCAGATGGC AGCCTCTCTG TCCGTGTIAC 10350 AGCTGAAATC AGAAACCAAC TGGTGACTCT TrAGccTITAG CGATTGTGAA 10400 GTITGTTGCA 'TTCTGTGTAT GTTGTCTrGT CCTITAGCTGA CAAATATrITG 10450 ACCTGTrFGGA TAATTCTATC TTTGCTGCTG, T7nTrCTri GGTCAAAAGA 10500 GGGGT-rCCCT CCGA'1CAT TAACGAAACC ACCAAAATAA CAGCACCCAG 10550 TGCAGGTCTC AGGTITCAGAT ATACTTAAGA CTACTAAATC TAACAGCAGC 10600 TAAAAAGCT-' AAAGATFCAG GCGACATAAC CGAACAAAAT CCACAACCGA 10650 AGGGACCAAA GCAGGACAAG TAAAAAGGCA GNCGACACAA AGCGCAGGTC 10700 GCTGAAAAGG CAAGCAGACA GAGGTCTGCA 'ITCTGTCAAC ACCACTITGTG 10750 AAAAATGAAG AGAAGATCGA GAA7TCCCGG GAATCCG 10787 INFORMATION FOR SEQ ID NO: 14: SEQUENCE CHARACTERISTICS: LENGTH: 647 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (iii) HYPOTHETICAL:
NO
WO 99/14314 PCT/AU98/00743 95 (vi) ORIGINAL SOURCE: ORGANISM: triticum tauschii TISSUE TYPE: Endosperm (ix) FEATURE: NAME/KEY: Protein LOCATION:1..647 OTHER INFORMATION:/product= "deduced amino acid sequence for SSS I" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: Met Ala Ala Thr Gly Val Gly Ala Gly Cys Leu Ala Pro Ser Val Arg 1 5 10 Leu Arg Ala Asp Pro Ala Thr Ala Ala Arg Ala Ser Ala Cys Val Val 25 Arg Ala Arg Leu Arg Arg Leu Ala Arg Gly Arg Tyr Val Ala Glu Leu 35 40 Ser Arg Glu Gly Pro Ala Ala Arg Pro Ala Gln Gln Gln Gln Leu Ala 55 Pro Pro Leu Val Pro Gly Phe Leu Ala Pro Pro Pro Pro Ala, Pro Ala 70 75 Gln Ser Pro Ala Pro Thr Gln Pro Pro Leu Pro Asp Ala Gly Val Gly 90 Glu Leu Ala Pro Asp Leu Leu Leu Glu Gly Ile Ala Glu Asp Ser Ile 100 105 110 Asp Ser Ile Ile Val Ala Ala Ser Glu Gln Asp Ser Glu Ile Met Asp 115 120 125 Ala Asn Glu Gln Pro Gln Ala Lys Val Thr Arg Ser Ile Val Phe Val 130 135 140 Thr Gly Glu Ala Ala Pro Tyr Ala Lys Ser Gly Gly Leu Gly Asp Val 145 150 155 160 Cys Gly Ser Leu Pro Ile Ala Leu Ala Ala Arg Gly His Arg Val Met 165 170 175 Val Val Met Pro Arg Tyr Leu Asn Gly Ser Ser Asp Lys Asn Tyr Ala 180 185 190 Lys Ala Leu Tyr Thr Gly Lys His Ile Lys Ile Pro Cys Phe Gly Gly 195 200 205 Ser His Glu Val Thr Phe Phe His Glu Tyr Arg Asp Asn Val Asp Trp 210 215 220 Val Phe Val Asp His Pro Ser Tyr His Arg Pro Gly Ser Leu Tyr Gly 225 230 235 240 Asp Asn Phe Gly Ala Phe Gly Asp Asn Gin Phe Arg Tyr Thr Leu Leu 0 245 250 255 Cys Tyr Ala Ala Cys Glu Ala Pro Leu Ile Leu Glu Leu Gly Gly Tyr 260 265 270 WO 99/14314 WO 9914314PCT/AU98/00743 96 Ile Tyr Leu Val 290 Arg Asp 305 Leu Giu Tyr Gly Leu Asp Ala Asp 370 Thr Ala 385 Ser Val Pro Thr Ser Gly Pro Val 450 Tyr Gin 465 Arg Giu Giu Gly Gly Trp Cys Asp 530 Gin Leu 545 Gly Gly Gly Giu Met Leu Gly Gin 275 Pro Val Ser Arg Pro Ala Ala Leu 340 Lys Gly 355 Arg Ile Glu Gly Leu Asn Thr Asp 420 Lys Ala 435 Arg Giu Lys Gly Asp Vai Trp Met 500 Vai1 Gly 515 Ile Leu Tyr Aia Leu Arg Giu Gly 580 Trp Ala 595 Asn Leu Ser Ser 325 Glu Glu Val1 
Gly Gly 405 Lys Lys Asp Ile Gin 485 Arg Phe Leu Met Asp 565 Thr Leu Cys Leu Thr 310 Thr Trp Ala Thr Gin 390 Ile Cys Cys Val1 Asp 470 Phe Ser Ser Met Gin 550 Thr Gly Arg Met Ala 295 Leu Tyr Val Val1 Vai 375 Gly Val Leu Lys Pro 455 Leu Val1 Thr Vai Pro 535 Tyr Val Trp Thr Phe 280 Ala Val Pro Phe Asn 360 Ser Leu Asn Pro Ala 440 Leu Ile Met Giu Pro 520 Ser Gly Giu Ala Ala 600 Val1 Lys Ile Asp Pro 345 Phe Gin Asn Giy His 425 Giu Ile Lys Leu Ser 505 Val1 Arg Thr Thr Phe 585 Met Val1 Tyr His Leu 330 Giu Leu G ly Giu Ile 410 His Leu Gly Met Gly 490 Ser Ser Phe Val1 Phe 570 Ser Ser Asn Arg Asn 315 Gly Trp Lys Tyr Leu 395 Asp Tyr Gin Phe Al a 475 Ser Tyr His Giu Pro 555 As n Pro Thr Asp Pro 300 Leu Leu Ala Gly Ser 380 Leu Ile Ser Lys Ile 460 Ile G ly Lys Arg Pro 540 Val1 Pro Leu Phe Trp 285 Tyr Ala Pro Arg Ala 365 Trp Ser Asn Val Giu 445 Giy Pro Asp Asp Ile 525 Cys Val1 Phe Thr Arg 605 His Gly His Pro Arg 350 Val1 Giu Ser Asp Asp 430 Leu Arg Giu Pro Lys 510 Thr Gly His Gly Val1 590 Giu Ala Val1 Gin Giu 335 His Val1 Vai Arg Trp 415 Asp Gly Leu Leu Ile 495 Phe Ala Leu Giy Ala 575 Asp His Ser Tyr Gly 320 Trp Aia Thr Thr Lys 400 Asn Leu Leu Asp Met 480 Phe Arg Giy Asn Thr 560 Lys Lys Lys WO 99/14314 PTA9/04 PCT/AU98/00743 97 Pro Ser Trp Giu Gly Leu Met Lys Arg Gly Met Thr Lys Asp His Thr 610 615 620 Trp Asp His Ala Ala Giu Gin Tyr Giu Gin Ile Phe Glu Trp Ala Phe 625 630 635 640 Val Asp Gin Pro Tyr Val Met 645 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 5072 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL:
NO
(vi) ORIGINAL SOURCE: ORGANISM: triticum tauschii TISSUE TYPE: Endosperm (ix) FEATURE: NAME/KEY: promoter LOCATION: 1..4993 OTHER INFORMATION: /function= "region containing promoter of SSS I" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
TCTAGATGCA TGCTGGATAG CGGTCGATGT GTGGAGTAAT AGTAGTAGAT GCAGAATCGT
TTCGGTCTAC TTGTCGCGGA CGTGATGCCT ATATACATGA TCATACCTAG ATATTCTCAT
AACTATGCTC AATTCTATCA ATTGCTCGAC AGTAATTCGT TTACCCACCG TAATACTTAT
GATCTTGAGA GAAGTCACTA GTGAAACCTA TGCCCCCCAG GTCTATTTTG CATCATATTA
ATCTTCCAAT ACTTAGTTAT TTCCATTGCC GTTTATTTTA CTTTGTATCT TTATTTCTTT
TTATTATAAA AAATACCAAA AATATTATCT TATCATATCT ATCAGATCTC ATTCTCGTAA
GTGACCGTGA AGGGATTGAC AACCCCTTTA TCGTGTTGGT TGCGAGGTTC TTGTTTGTTT
GTGTACGTGC GTGTGACTCG CACGTCTCCT ACTGGATTGA TACCTTGGGT TTTCAAAAAC
TGAGAAAAAT ACTTACGCTA CTTTACTGCA TAACCCTTTC CTCTTTAAAA AAAAAAACCA
ACGTAGTATT CAAGAGGTAG CACGCTACCA TCCTCTCCAA CAGGAGCGCG GAGATCTTTG
TCCGGCAGGT TGATGCGGGC CGGGGAAGAA CTCCAGCTGC CTTGGCCAGC TTGGTCGTGA
GCCGCCCCAG CGGCGTCTTG AACCTGTCCA CGTAGCGCTC CCTGACACGC GGCGTGAACT
GAGAAGGCTT GTCGATGAAC TCCAGCTGTT GTGCCAGCCT AGCTTGCGCC TTCTTCTGCT
GGGTCATGCC CTTCGAGAAA CCCACCTTGG CCACCCTTGT GCTTGAGCGG CGCGCCACCT
CAGCAGGCGG CGGCGTGGGG ATGAAGAGGG TGTCTGCTTC CGGAGCAGGC GGGTCGGCGT
TGAACTTGAA AGGCGGTGGC CCCATGATGG ATGGGGGGAG CATGCCAAAG ACTTGGTTGA 960
GGAAAGTGGT GTTGGCGTCC ACCTCCAGTG CCTGCAGTTT GGAAGCCAGA CGATTGGCGT 1020
CGATCTCTGG CTCCGGCTGG AAGGAGGCTC GACGCTCCGG TGTGCCAGAA CGCAAAGGGA 1080
GGAGCGGCAG CTCTGGCTGA GCAGACCCCG CGCCCATGTA CTCTGCATTG GGCCAAGGCT 1140
GCAGGCAA GCCACCCGGA TGGGGGCGCG AGGTGGACTG CGCACCGGAG GAAGGCCAAG 1200
CTCAACCTCG GTGAGGTTCG CCCCAGACCA GGGCGGCAGG CTCGGGTCCA CAAAGGGCCA 1260
AACCGCCTCG TCCGCCCCGA AACTGTCCAG GACAGACGGC GGACGACGGA AGGCCGTGTC 1320
GTCGAGCTCG AGCAGCAGAG GGTCCGTGCG GGTGATGTCT TGCCAAATGG ACTCCACCTC 1380
CAGCAGGAAG GGGGACTGGT CCATCGCCCC TGGCCAAGCC ACTGGTACGC CAAAGATGGC 1440
ATCAGCAGCG TTTGCACCAG GGGGAGCAGC CACACCTTGG AGGACAGGGA GGGTGCGGAC 1500
GTCGACGGCA GCAAAACGTG GCTGGAGCAA GTTGCCGTCG CGTGCCGGCC TCGGCGAGCG 1560
CGAGCGGCTG TAGGAGCGCT CGGTGCCCTC AGACTCGGAC AGTGCGCCAG TGGGAGAGCC 1620
ATGGCGACGC CGGCCACCAC TGGACGTGCC ATGGCGCTGG TCCTGACGGC GCCTGGATGG 1680
CCCGTCCTCG CGGGCAGCTC CACCTGAGCG GCACCCGAGG AGCACACCCC GCCAAGCTGG 1740
GCCAGGGCGG CTGCGGCGAC GGCGACGGCC GCGGTCGCGG TCTGCACCAT CATCTTCATC 1800
TTCGTCATCG TGGCGCCTCG GACAAGGATG CTCGCTGTCA CCGACGCGAG GGACGTGAGC 1860
CGGCTCAGCC CGCCCTTCCT CGACGTGGCG AGCCCTGCGG ATATGCTCCT CGAGCGGCCA 1920
TTGGGGGTCG TTGGCGCGCG GCATCTCGGG GTCGCGGTCA GCTATCGGGG TGTAGTCCTT 1980
TGTGGTGTCC AGGTGGATGA GCAGAGAGAA ATCCGGCCCC TCTAGCCCCT CGTCCCGGGG 2040
GCAGCCCTCC GGCAGCGTCT GGCGGCCCCT GGGGTCCAGG GGTCGATCGA TGATGGAGAA 2100
CCCCCTTTTG GTGGGGATGT CGTCCGGACT CCATGCCCAC ACCCAGGCAA AGAGGCAGGC 2160
CGTGTTGGAG AGGGAGGTCG TCTGCCGCTC CAACCAGTCG ACGTGGCATG TCTTCCCGAG 2220
CGCATCCTGC CCCGCCTCCT TGTTCCAGGA CTGCACCGGC ATGTTCTCGA CGGCGATGCG 2280
GCAGTAGTAC CGCCAGACAC GGCGGTGGCC GTGTGCCGAT GGTGACCAGG CCGACAGGGA 2340
GAGCGCGACG CCCCAGCAGG AGACGACCCC AGCGTCGAAA GCGATGTCCC GGTGCCTGAA 2400
GTGGACGAGC CCAGAGATGG CCAGGCGCAT TGACGCGGGG AAGGGGAAGG AGTTAGGATG 2460
GGCGACGCGG CCGGAGTGAA CCGCGGCGTG GTGGCCGACG GGGCTGGAGA GGCAGAGGCG 2520
GAGTCATCCG AGAGAGGTGT ATCAGTGGCT CTGCACAATA CCCAGTGTCG CCACATCATA 2580
TCCTGCTGA). TAACCACACA TGTGTACTGT CGTTAAATAA ATCATTGGTC ACGCGAACCC 2640
GGAAAAAGAC GGCGAAAAAT TCACGGACAC ACGACTAGTA GTACCCAATA TACTCGGCAA 2700
AAACAGTGAC ACGTCGTTTT GCGTTGTCGG CCGGTGTTGT CGAGTCATTG TACTATGTTT 2760
TGTCGTTTCT TTCTTTTCTC CAAATCGACA AACCGTTTCT CTTTGGTTA AAAACAGAAA 2820
CATACAAAAT CAAATGAATG CATTCAAGGG CCGGTAATCC AATTCTGAGC CCAGGCTCAG 2880
CTACACCCGC CCTTACAAAA AAATCAAAAT AAATACTAGA AAAATTCAAA AAATTCCAAT 2940
TTGTTTGTGC GTGGTAGATA ATTTGATGCG TGAGGTACGC TTCAATTTTC AAATTATTTG 3000
GACATCTGAG CAGCTCTCAG CAAAAAAGAC AAATTCGGGG TCTGTAAAAA TGTTTACTGT 3060
TCATGCACTG TTCTGACCCG ATTTGTCTTT TTTGCTGAGA GCTTCTCAGA AGTCCAAATG 3120
AGCTAAAATT TTGAGCGGAG CTTACGTGAT AAAATGTCTA TCATGCAAAA AAGGATTGGA 3180
ATTTTTTGAA TTTTTTTTAT TTTTTGTGAT TTGTTTCCTG GACGGGTGCA GATAAGCCTG 3240
GGCACCGAAA CGCCGCACTC AGGCTCATCC TTTTCTATAA AAGAAAAGAA ATACATACAA 3300
TTTCCCTCTG TTTTTTGAGC AAGGGGCACC ACCCACCAAA GAGTTTTCAA CTCACATGGT 3360
ATTAGAGGAT CTACAGCCGG GCGTCTCAAA CCAGCCTCAT ACGCTTGAGC GGGTCGCCTT 3420
GGTCACGATT TTTTGACCCA GACGGGCCCC TCAAACGGTC CTTAAACGCC CAGGCTGACC 3480
GACAACCCAC ATATCCAGCC CAAATATGGG GTGGATATGG GGGCGCCCGG GCACGCCAGC 3540
CCGCGGACAC CACACATCTT CAGTTTCTAA TTTGAGATAT CCGGATGTGG AATGCGTTTT 3600
TGAGGGGTGA CCGGTCCCTG TCCGTGGATG CGCCCGGACG TTTGAGGGGT TGGATTTGCC 3660
AAGTCTGATT AGAGATGCTC TTAGGTGTTC CACCCCCATC CCTTGATGGC TAGGGCAAAC 3720
TCTCCCCTCC AAACTTTGTC GGCGAGCCTG TGGATTCTTC TCTCCTCTGC CCGCTGCTCC 3780
GGCGGCTGAT GGCGGGGAGG AGAATCCCGG TGTCTTCGCT TGGTTAGTTG TTTAAGTTAC 3840
GTACTTTTTT AGTCCTCGCA GGTGCGGCGT TCGGACGTAT GGTCGTGCTT CTTTTTTGAG 3900
TTTGTCTTCC GGGCTCTGAT CCTCCTCGAG TTCGTCCATC TGGACGTACT CGACGGAGCT 3960
CCGGCATAGA TTCCTATCAT CGTCTTGGTG AGGTGAGGTT ATGGTTTCTT GTCATGTGGG 4020
CAGATTTGGT GCCAGATGCT TCATATCTAT TCAAGGGTTC AGCGGCAACA ACTGCGGCTC 4080
CAGAGCGATG GTCCTTAAGG GCACGTGCAC GAAGACTTCA CGGCTGTTAT CGACAAGGTC 4140
AAGCCGGCTC CGATAGGGGA GCAGCGACAG CGGCGCGTCA ACCGCTCGTT CTGGCGGCAG 4200
TAGTGGTCGT TCGGTGCTCT CGGAACCTCG ATGTAATTTT TATGATTTTA GAGATGCTTT 4260
GTACTTCCGA TCGATGAACT CTGATAATAG ATATCTCTTC TCTCGCAAAA AAAGAGAGTT 4320
TTCAACTGAA AACAAAAGAG TTTCACTAGT TCTTCTTTTA GAAACAGAGT TTCACTAGCA 4380
CTTTTTTTTG CGAGAAGTCG AGTTTCACTA AGTACTAAAC CCACGCAATT ATTCTCAAAA 4440
AAAAAACCCA CGCAACTGTC TGGATCCATC TTCGTTTTTT CCCCGAGAAT CGTCTGGATC 4500
CATTTTCGTG TGCGAGGCAT CCTCTCATTT TGCACGGCCC AGCTCTCTTC TCGCCGGCGT 4560
ACGCTGCTAC ATGTCGGCAC TCCACGCAAA CAAAAAGAAG CCCAACCGAA AACGCACGCG 4620
CCTTTCCAGG CTCACCACGG AAAAAAATAC CACGCGCCGC TCACGAGCAA ACCGTGACAA 4680
CAGCCAGCCA GATATGGCAA CGGAGGCACG GGCCGCACAC AGCCACTGAA AACCGCAGCT 4740
GCTCTTCCGT CCGTCCGTCC CTCCGCCCGT CCGCGCCACT CCACTCGCCT TGCCCCACTC 4800
CCACTCTTCT CTCCCCGCGC ACACCGAGTC GGCACCGGCT CATCACCCAT CACCTCGGCC 4860
TCGGCCACCG GCAAACCCCC CGATCCGCTT TTGCAGGCAG CGCACTAAAA CCCCGGGGAG 4920
CGCGCCCCGC GGCAGCAGCA GCACCGCAGT GGGAGAGAGA GGCTTCGCCC CGGCCCGCAC 4980
CGAGCGGGGC GATCCACCGT CCGTGCGTCC GCACCTCCTC CGCCTCCTCC CCTGTCCCGC 5040
GCGCCCACAC CCATGGCGGC GACGGGCGTC GG 5072
INFORMATION FOR SEQ ID NO: 16: SEQUENCE CHARACTERISTICS: LENGTH: 1706 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (vi) ORIGINAL SOURCE: ORGANISM: triticum tauschii TISSUE TYPE: Endosperm (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..1706 OTHER INFORMATION: /product= "partial cDNA for hexaploid wheat DBE" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GCT
Ala 1 GTG TCG AAG CTT GAC TAT TTG AAG Val Ser Lys Leu Asp Tyr Leu Lys 5
GAG
Glu 10 CTT GGA GTT AAT Leu Gly Val Asn TGT ATT Cys Ile GAA TTA ATG Glu Leu Met TCT TCC AAG Ser Ser Lys 35 TGC CAT GAG TTC Cys His Glu Phe GAG CTG GAG TAC Glu Leu Glu Tyr TCA ACC TCT Ser Thr Ser TTC TTT TCA Phe Phe Ser ATG AAC TTT TGG Met Asn Phe Trp
GGA
Gly 40 TAT TCT ACC ATA Tyr Ser Thr Ile
AAC
Asn CCA ATG Pro Met ACG AGA TAC ACA TCA GGC GGG ATA AAA Thr Arg Tyr Thr Ser Gly Gly Ile Lys TGT GGG CGT GAT Cys Gly Arg Asp ATA AAT GAG TTC Ile Asn Glu Phe ACT TTT GTA AGA Thr Phe Val Arg GCT CAC AAA CGG Ala His Lys Arg
GGA
Gly ATT GAG GTG ATC CTG GAT GTT GTC TTC Ile Glu Val Ile Leu Asp Val Val Phe
AAC
Asn 90 CAT ACA GCT GAG His Thr Ala Glu GGT AAT Gly Asn GAG AAT GGT Glu Asn Gly TAT ATG CTT Tyr Met Leu 115 ATA TTA TCA TTT Ile Leu Ser Phe
AGG
Arg 105 GGG GTC GAT AAT Gly Val Asp Asn ACT ACA TAC Thr Thr Tyr 110 GGC TGT GGG Gly Cys Gly GCA CCC AAG GGA Ala Pro Lys Gly
GAG
Glu 120 TTT TAT AAC TAT Phe Tyr Asn Tyr
TCT
Ser 125
AAT
Asn
TGT
Cys 145
GAT
Asp
AAC
Asn
CCT
Pro
CTT
Leu
TAT
Tyr 225
GGG
Gly
TTT
Phe
CAG
Gln
CAT
His
AAT
Asn 305
AGC
Ser
AGA
Arg
TCT
Ser
AAA
Lys
ACC
Thr 130
TTA
Leu
CTT
Leu
GTG
Val
CTT
Leu
GGA
Gly 210
CAA
Gln
AAG
Lys
GCT
Ala
GCA
Ala
GAT
Asp 290
TTA
Leu
TGG
Trp
TTG
Leu
CAA
Gln
GGG
Gly 370 TTC AAC TGT AAT CAT CCT GTG GTT CGT CAA TTC Phe
AGA
Arg
GCA
Ala
TAT
Tyr
GTT
Val 195
GGC
Gly
GTA
Val
TAC
Tyr
GGT
Gly
GGA
Gly 275
GGA
Gly
CCA
Pro
AAT
Asn
AGG
Arg
GGA
Gly 355
GGC
Gly Asn
TAC
Tyr
TCC
Ser
GGA
Gly 180
ACT
Thr
GTC
Val
GGT
Gly
CGG
Arg
GGT
Gly 260
GGA
Gly
TTT
Phe
AAT
Asn
TGT
Cys
AAG
Lys 340
GTT
Val
AAC
Asn Cys
TGG
Trp
ATA
Ile 165
GCT
Ala
CCA
Pro
AAG
Lys
CAA
Gln
GAC
Asp 245
TTT
Phe
AGG
Arg
ACA
Thr
GGG
Gly
GGG
Gly 325
AGG
Arg
CCA
Pro
AAC
Asn Asn
GTG
Val 150
ATG
Met
CCA
Pro
CCA
Pro
CTC
Leu
TTC
Phe 230
ATT
Ile
GCC
Ala
AAA
Lys
CTG
Leu
GAG
Glu 310
GAG
Glu
CAG
Gln
ATG
Met
AAT
Asn His 135
ATG
Met
ACC
Thr
ATA
Ile
CTT
Leu
ATT
Ile 215
CCT
Pro
GTG
Val
GAA
Glu
CCT
Pro
GGT
Gly 295
AAC
Asn
GAA
Glu
ATG
Met
TTT
Phe
ACA
Thr 375 Pro
GAA
Glu
AGA
Arg
GAA
Glu
ATT
Ile 200
GCT
Ala
CAC
His
CC
Arg
TGT
Cys
TGG
Trp 280
GAT
Asp
AAT
Asn
GGA
Gly
CGC
Arg
TAC
Tyr 360
TAC
Tyr Val
ATG
Met
GGT
Gly
GGT
Gly 185
GAC
Asp
GAA
Glu
TGG
Trp
CAA
Gln
CTT
Leu 265
CAC
His
TTG
Leu
AGA
Arg
GAA
Glu
AAT
Asn 345
ATG
Met
TGC
Cys Val
CAT
His
TCC
Ser 170
GAC
Asp
ATG
Met
GCA
Ala
AAT
Asn
TTC
Phe 250
TGT
Cys
AGT
Ser
GTA
Val
GAT
Asp
TTC
Phe 330
TTC
Phe
GGC
Gly
CAT
His Arg
GTT
Val 155
ACT
Ser
ATG
Met
ATC
Ile
TGG
Trp
GTT
Val 235
ATT
Ile
GGA
Gly
ATC
Ile
ACA
Thr
GGA
Gly 315
GCA
Ala
TTT
Phe
GAT
Asp
GAT
Asp Phe
GGT
Gly
TGG
Trp
ACA
Thr
AAT
Asn 205
GCA
Ala
TCT
Ser
GGC
Gly
CCA
Pro
TTT
Phe 285
AAT
Asn
AAT
Asn
TTG
Leu
TGT
Cys
TAT
Tyr 365
TAT
Tyr
ATT
Ile
TTT
Phe
GAT
Asp
ACA
Thr 190
GAC
Asp
GGA
Gly
GAG
Glu
ACT
Thr
CAC
His 270
GTA
Val
AAC
Asn
CAC
His
TCT
Ser
CTC
Leu 350
GGC
Gly
GTC
Val
GTA
Val
CGT
Arg
CCA
Pro 175
GGG
Gly
CCA
Pro
GGC
Gly
TGG
Trp
GAT
Asp 255
CTA
Leu
TGT
Cys
AAG
Lys
AAT
Asn
GTC
Val 335
ATG
Met
CAC
His
AAT
Asn
GAT
Asp
TTT
Phe 160
GTT
Val
ACA
Thr
ATT
Ile
CTC
Leu
AAT
Asn 240
GGA
Gly
TAC
Tyr
GCA
Ala
TAC
Tyr
CTT
Leu 320
AAA
Lys
GTT
Val
ACA
Thr
TAT
Tyr 432 480 528 576 624 672 720 768 816 864 912 960 1008 1056 1104 1152
TTT
Phe 385 CGC TGG GAT AAA Arg Trp Asp Lys
AAA
Lys 390 GAA CAA TAC TCT Glu Gln Tyr Ser
GAC
Asp 395 TTG CAC AGA TTC Leu His Arg Phe
TGC
Cys 400 TGC CTC ATG ACC Cys Leu Met Thr TTC CGC AAG GAG Phe Arg Lys Glu GAG GGT CTT GGC Glu Gly Leu Gly CTT GAG Leu Glu 415 GAC TTT CCA Asp Phe Pro AAG CCT GAT Lys Pro Asp 435
ACG
Thr 420 GCC GAA CGG CTG Ala Glu Arg Leu
CAG
Gln 425 TGG CAT GGT CAT Trp His Gly His CAG CCT GGG Gln Pro Gly 430 TCC ATG AAA Ser Met Lys TGG TCT GAG AAT Trp Ser Glu Asn CGA TTC GTT GCC Arg Phe Val Ala
TTT
Phe 445 GAT GAA AGA CAG GGC GAG Asp Glu Arg Gln Gly Glu
ATC
Ile 455 TAT GTG GCC TTC Tyr Val Ala Phe ACC AGC CAC TTA Thr Ser His Leu
CCG
Pro 465 GCC GTT GTT GAG Ala Val Val Glu CCA GAG CGC GCA Pro Glu Arg Ala
GGG
Gly 475 CGC CGG TGG GAA Arg Arg Trp Glu
CCG
Pro 480 1200 1248 1296 1344 1392 1440 1488 1536 1584 1632 1680 1706 GTG GTG GAC ACA Val Val Asp Thr
GGC
Gly 485 AAG CCA GCA CCA Lys Pro Ala Pro GAC TTC CTC Asp Phe Leu TTA CCT GAT Leu Pro Asp TCC AAC CTC Ser Asn Leu 515 CGC CCT GAT Arg Pro Asp 530 GCT CTC ACC ATA Ala Leu Thr Ile
CAC
His 505 CAG TTC TCT CAT Gln Phe Ser His ACC GAC GAC Thr Asp Asp 495 TTC CTC AAC Phe Leu Asn 510 CTA GTA TTG Leu Val Leu TAC CCC ATG CTC Tyr Pro Met Leu
AGC
Ser 520 TAC TCA TCG GTC Tyr Ser Ser Val GTT TGA GAG Val Glu
ACA
Thr 535 AAT ATA TAC AGT Asn Ile Tyr Ser TAA TAT GTC TAT Tyr Val Tyr
ATG
Met 545 TAG TCC TTT GGC Ser Phe Gly TTA TCA GTG TGC Leu Ser Val Cys
ACA
Thr 555 ATT GCT CTA TTG Ile Ala Leu Leu
CCA
Pro 560 GTG ATC TAT TCG Val Ile Tyr Ser
ATA
Ile 565 GCG GCC GCG AA Ala Ala Ala INFORMATION FOR SEQ ID NO: 17: SEQUENCE CHARACTERISTICS: LENGTH: 9289 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (vi) ORIGINAL SOURCE: ORGANISM: triticum tauschii TISSUE TYPE: Endosperm (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..9289 OTHER INFORMATION: /product= "genomic sequence of DBE" (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: CGG GAC Arg Asp 570 CGT CCC TTG GCA Arg Pro Leu Ala TGG GTT ACG TTG Trp Val Thr Leu CCT GAC GCT TCG Pro Asp Ala Ser
CTT
Leu 585 ATC CGG TGT GCC Ile Arg Cys Ala
CTG
Leu 590 AGA CGA GAT ATG Arg Arg Asp Met AGC TCC TAT CGG Ser Ser Tyr Arg 48 96 144 TGT CGG CAC ATT Cys Arg His Ile
CGG
Arg 605 CGG CTT TGC TGG Arg Leu Cys Trp TGT TTT ACC ATT Cys Phe Thr Ile GTC GAA Val Glu 615 ATG TCT TAT AAA Met Ser Tyr.Lys 620 AGG TTT ATC CTT Arg Phe Ile Leu 635 CCG GGA TTC CGA Pro Gly Phe Arg TGA TCG GGT CTT Ser Gly Leu CCC GGG AGA Pro Giy Arg 630 TAA GTT GGG Val Gly CGT TGA CCG Arg Pro
TGA
640 GAG CTT ATA ATG Glu Leu Ile Met
GGC
Gly 645 ACA CCC Thr Pro 650 CTG CAG GGT ATT Leu Gln Gly Ile
ATC
Ile 655 TTT CGA AAG CCG Phe Arg Lys Pro CCG CGG TTA TGA Pro Arg Leu
GGC
Gly 665 AGA TGG GAA TTT Arg Trp Glu Phe AAT GTC CGA TTG Asn Val Arg Leu AGA ACC TGT CAC Arg Thr Cys His
TTG
Leu 680 ACT TAA TTT AAA Thr Phe Lys
ATT
Ile 685 CAT CAA CCG TGT His Gln Pro Cys
GTG
Val 690 TAG CCG TGA TGG Pro Trp TCT CTT Ser Leu 695 TTC GGC GGA Phe Gly Gly AAG TAG TTT Lys Phe 715 CGG GAA GTG AAC Arg Glu Val Asn
ACG
Thr 705 GTT TGA GTT ATG Val Val Met CAT GAA CGT His Glu Arg 710 GCG ACC GTT Ala Thr Val CAG GAT CAC TCC Gln Asp His Ser ATC ACT TCT AGC Ile Thr Ser Ser
TCC
Ser 725 GCG TTG Ala Leu 730 TTT CTC TTC TCG Phe Leu Phe Ser
CTC
Leu 735 TCA TTT GCG TAT GTT AGC CAC CAT ATA Ser Phe Ala Tyr Val Ser His His Ile 740
TGC
Cys 745 TTA GTG TCT GCT Leu Val Ser Ala GCT CCA CCT CAT Ala Pro Pro His
TAC
Tyr 755 CCC TTC CTT TCC Pro Phe Leu Ser
TAT
Tyr 760 528 576 624 AAG CTT AAA TAG Lys Leu Lys
TCT
Ser 765 TGA TCT CGC GGG Ser Arg Gly GAG ATT GCT GAG Glu Ile Ala Glu TCC TCG Ser Ser 775 TGA CTT ACA Leu Thr
GAT
Asp 780 TCT ACC AAA ACA Ser Thr Lys Thr
GTT
Val 785 GCA GGT GTC GAC Ala Gly Val Asp GAT GCC AGT Asp Ala Ser 790 GCA GGT GAC Ala Gly Asp 795 CGT TAC TAT Arg Tyr Tyr 810 GCA ACC GAG CTC Ala Thr Glu Leu
AAG
Lys 800 TGG GAG TTC GAC GAG GAA CGT GGT Trp Glu Phe Asp Glu Glu Arg Gly 805 GTT TCT TTT Val Ser Phe
CCT
Pro 815 GAT GAT CAG TAG Asp Asp Gln
TGG
Trp 820 AGC CCA GTT GGG Ser Pro Val Gly
ACG
Thr 825 ATC GGG GAT CTA Ile Gly Asp Leu TTT GGG GTT ATC Phe Gly Val Ile
TTA
Leu 835 ATT TCT TTT AGA Ile Ser Phe Arg
TTT
Phe 840 GAC CGT AAT CGG Asp Arg Asn Arg
TCT
Ser 845 ATG TGT GGA TTT Met Cys Gly Phe ATG ATG TAT GAA Met Met Tyr Glu TTA TTT Leu Phe 855 ATG TAT TGT Met Tyr Cys AAG TGG CGA TTG Lys Trp Arg Leu
TAA
865 GCC AAC TCT CGT Ala Asn Ser Arg TAT CCC ATT Tyr Pro Ile 870 TGC GAC AAA Cys Asp Lys CTT GTT CAT Leu Val His 875 ACC ACA ATG Thr Thr Met 890 TAC ATG GGA TTG Tyr Met Gly Leu
TGT
Cys 880 GAA GAT GAC CCT Glu Asp Asp Pro CGG TTA TGC Arg Leu Cys TAA GTC GTG CCT Val Val Pro
CGA
Arg 900 CAC GTG GGA GAT His Val Gly Asp
ATA
Ile 905 GCC GCA TCG TGG Ala Ala Ser Trp
GCG
Ala 910 TTA CAC GCA AGT Leu His Ala Ser
CTT
Leu 915 CAT AGC AAC CAA His Ser Asn Gln
AAC
Asn 920 TCC TCT CCG CAT Ser Ser Pro His AAG CCA CCA ATC Lys Pro Pro Ile
GCA
Ala 930 GCC ACC ATG ACT Ala Thr Met Thr TTC TTC Phe Phe 935 ACC ACT GTC Thr Thr Val TCG GCA AGA Ser Ala Arg 955
AAT
Asn 940 GCC ATG AAA ATC Ala Met Lys Ile
TAT
Tyr 945 ATG TAG ACA TGT Met Thr Cys CCC ATT GCA Pro Ile Ala 950 GCC TCT CTG Ala Ser Leu 960 1008 1056 1104 1152 1200 1248 1296 1344 1392 AAG CGA AGC TTC Lys Arg Ser Phe
ACG
Thr 960 GCA CAC CTT CAT Ala His Leu His GCC GAA Ala Glu 970 GAC AAG GAT GCG Asp Lys Asp Ala
CCC
Pro 975 GAC CGG ATC AAT Asp Arg Ile Asn
TCC
Ser 980 TAT CTA GAT ACC Tyr Leu Asp Thr
TAG
985 985 TGG AGC CAT GCG Trp Ser His Ala ATA GCG GAG ATC Ile Ala Glu Ile
TCC
Ser 995 GAG AGG AAG ACC Glu Arg Lys Thr
GGA
Gly 1000 ACT CGT CGG ACG Thr Arg Arg Thr TCG GCG TCC AAA TCG Ser Ala Ser Lys Ser 1005 AGG AGG CCG GCA TGA Arg Arg Pro Ala 1010 AGC ACA Ser Thr 1015 TCG AGG ATG Ser Arg Met GTG ATC CCC ATA CGG Val Ile Pro Ile Arg 1020 GTA GAT CGG Val Asp Arg 1025 GTC GGC CGC CAT CTC Val Gly Arg His Leu 1030 WO 99/14314 PCT/AU98/00743 105 ACA CCG AGA TTA GGA TGC TTA AAA CGG TTT TTT TGG CAC TAG CAT TAT 1440 Thr Pro Arg Leu Gly Cys Leu Lys Arg Phe Phe Trp His His Tyr 1035 1040 1045 TTT GCA TCA TCC GTT GGA GAG AAC ATG AGA GAG CCC CAT TTC TTC CAC 1488 Phe Ala Ser Ser Val Gly Glu Asn Met Arg Glu Pro His Phe Phe His 1050 1055 1060 GGT TCT ACC TAT GGG ATC TTG TTC TGC TTG CAA CCG GGC CTC ACG GAA 1536 Gly Ser Thr Tyr Gly Ile Leu Phe Cys Leu Gin Pro Gly Leu Thr Glu 1065 1070 1075 1080 AAC CCG CGC CAG CGG ACC CAC CCC ATG CTA GCA GGG CAC GGC ACC CGC 1584 Asn Pro Arg Gin Arg Thr His Pro Met Leu Ala Gly His Gly Thr Arg 1085 1090 1095 AGC GGC CGG TCC AAA TGG ACG GTG AGA ACC GCA ACG CGA CAC GCC CGG 1632 Ser Gly Arg Ser Lys Trp Thr Val Arg Thr Ala Thr Arg His Ala Arg 1100 1105 1110 CAC TGT CAG CAA AGC GAG AGC GCG CGC ACG GCA CAC GCA CGC TCG GAC 1680 His Cys Gin Gin Ser Glu Ser Ala Arg Thr Ala His Ala Arg Ser Asp 1115 1120 1125 GAA CGG ACG GTG CGA TCG ATC CCT CCC CCC TCG CTC AAC CAC AGT AGT 1728 Glu Arg Thr Val Arg Ser Ile Pro Pro Pro Ser Leu Asn His Ser Ser 1130 1135 1140 ACC CTG CCA CAC TAT CAC GCA CGC ACT CGA GTC ACA CCT CCC ACG AAG 1776 Thr Leu Pro His Tyr His Ala Arg Thr Arg Val Thr Pro Pro Thr Lys 1145 1150 1155 1160 AAC CAA CAG GAG GCG CGG ATC CCA CCG ATA AAT AAC CCC GCC TCG CCG 1824 Asn Gin Gin Glu Ala Arg Ile Pro Pro Ile Asn Asn Pro Ala Ser Pro 1165 1170 1175 CTC CTC CCC AAA ATC AAT CAC CGA TCG CTC GGG GTT CCC GGC ATG ACG 1872 Leu Leu Pro Lys Ile Asn His Arg Ser Leu Gly Val Pro Gly Met Thr 1180 1185 1190 ATG ATG GCC ATG GCC AAG GCG CCC TGC CTC TGC GCG CGC CCG TCC CTC 1920 Met Met Ala Met Ala Lys Ala Pro Cys Leu Cys Ala Arg Pro Ser Leu 1195 1200 1205 GCC GCG CGC GCG AGG CGG CCG GGG CCG GGG CCG 
GCG CCG CGC CTG CGA 1968 Ala Ala Arg Ala Arg Arg Pro Gly Pro Gly Pro Ala Pro Arg Leu Arg 1210 1215 1220 CGG TGG CGA CCC AAT GCG ACG GCG GGG AAG GGG GTC GGC GAG GTG TGC 2016 Arg Trp Arg Pro Asn Ala Thr Ala Gly Lys Gly Val Gly Glu Val Cys 1225 1230 1235 1240 GCC GCG GTT GTC GAG GCG GCG ACG AAG GCC GAG GAT GAG GAC GAC GAC 2064 Ala Ala Val Val Glu Ala Ala Thr Lys Ala Glu Asp Glu Asp Asp Asp 1245 1250 1255 GAG GAG GAG GCG GTG GCG GAG GAC AGG TAC GCG CTC GGC GGC GCG TGC 2112 Glu Glu Glu Ala Val Ala Glu Asp Arg Tyr Ala Leu Gly Gly Ala Cys 1260 1265 1270 AGG GTG CTC GCC GGA ATG CCC GCG CCG CTG GGC GCC ACC GCG CTC GCC 2160 Arg Val Leu Ala Gly Met Pro Ala Pro Leu Gly Ala Thr Ala Leu Ala 1275 1280 1285 WO 99/14314 PCT/AU98/00743 106 GGC GGG GTC AAT TTC GCC GTC TAC TCC GGT GGA GCC ACC GCC GCG GCG 2208 Gly Gly Val Asn Phe Ala Val Tyr Ser Gly Gly Ala Thr Ala Ala Ala 1290 1295 1300 CTC TGC CTC TTC ACG CCA GAA GAT CTC AAG GCG GTG GGG TTG CCT CCC 2256 Leu Cys Leu Phe Thr Pro Glu Asp Leu Lys Ala Val Gly Leu Pro Pro 1305 1310 1315 1320 GAG TAG AGT TCA TCA GCT TTG CGT GCG CCG CGC GCC CCC TTT TCT GGC 2304 Glu Ser Ser Ser Ala Leu Arg Ala Pro Arg Ala Pro Phe Ser Gly 1325 1330 1335 CTG CGA TTT AAG TTT TGT ACT GGG GGA AAT GCT GCA GGA TAG GGT GAC 2352 Leu Arg Phe Lys Phe Cys Thr Gly Gly Asn Ala Ala Gly Gly Asp 1340 1345 1350 GGA GGA GGT TTC CCT TGA CCC CCT GAT GAA TCG GAC TGG GAA CGT GTG 2400 Gly Gly Gly Phe Pro Pro Pro Asp Glu Ser Asp Trp Glu Arg Val 1355 1360 1365 GCA TGT CTT CAT TGA AGG CGA GCT GCA CGA CAT GCT TTA CGG GTA CAG 2448 Ala Cys Leu His Arg Arg Ala Ala Arg His Ala Leu Arg Val Gin 1370 1375 1380 GTT CGA CGG CAC CTT TGC TCC TCA CTG CGG GCA CTA CCT TGA TAT TTC 2496 Val Arg Arg His Leu Cys Ser Ser Leu Arg Ala Leu Pro Tyr Phe 1385 1390 1395 1400 CAA TGT CGT GGT GGA TCC TTA TGC TAA GGT GAT CAT ACT TTA GCT TTA 2544 Gin Cys Arg Gly Gly Ser Leu Cys Gly Asp His Thr Leu Ala Leu 1405 1410 1415 CCT GCA TCT TGG TAT TTA CAG TAG AAA TTG TTA CGT GGA CCC TTA TTT 2592 Pro Ala Ser Trp Tyr Leu Gln 
Lys Leu Leu Arg Gly Pro Leu Phe 1420 1425 1430 GTT GCC TTT TGT GTT GCT CTA GGC AGT GAT AAG CCG AGG GGA GTA TGG 2640 Val Ala Phe Cys Val Ala Leu Gly Ser Asp Lys Pro Arg Gly Val Trp 1435 1440 1445 CGT TCC GGC GCG TGG TAA CAA TTG CTG GCC TCA GAT GGC TGG CAT GAT 2688 Arg Ser Gly Ala Trp Gin Leu Leu Ala Ser Asp Gly Trp His Asp 1450 1455 1460 CCC TCT TCC ATA TAG CAC GGT ATG CCT GAT TGC TGA AAA TAT TGG CTG 2736 Pro Ser Ser Ile His Gly Met Pro Asp Cys Lys Tyr Trp Leu 1465 1470 1475 1480 CAT TTG TTT CTC TCT TTT TCT CAT ATT TTT CTC CTG TCT TTC ACT TGT 2784 His Leu Phe Leu Ser Phe Ser His Ile Phe Leu Leu Ser Phe Thr Cys 1485 1490 1495 ACT ACA TTG CCT CAG ACA GTC ATG ATC AAA GAG AGC AGT GTC ATT AGA 2832 Thr Thr Leu Pro Gin Thr Val Met Ile Lys Glu Ser Ser Val Ile Arg 1500 1505 1510 CAT TTG TAG TTG TCT GCT GAC TTT GAC CAA AAC TTG TAA TTT ACT GTT 2880 His Leu Leu Ser Ala Asp Phe Asp Gin Asn Leu Phe Thr Val 1515 1520 1525 GTT AAA GGT CCT TGA ATC ATA TTT TTT TAT AAT ATT ATG TTT GCA AGT 2928 Val Lys Gly Pro Ile Ile Phe Phe Tyr Asn Ile Met Phe Ala Ser 1530 1535 1540 WO 99/14314 PCT/AU98/00743 107 GGA AGT AAA GTG AAA TTG CAT CTA GTA TTT GTT GTT GCT GTC TTA GTC 2976 Gly Ser Lys Val Lys Leu His Leu Val Phe Val Val Ala Val Leu Val 1545 1550 1555 1560 GTT TAA TTG GAC ATG CAG TAA AAA GGT TTG CAT CTG CAG TTT GAT TGG 3024 Val Leu Asp Met Gin Lys Gly Leu His Leu Gin Phe Asp Trp 1565 1570 1575 GAA GGC GAC CTA CCT CTA AGA TAT CCT CAA AAG GAC CTG GTA ATA TAT 3072 Glu Gly Asp Leu Pro Leu Arg Tyr Pro Gin Lys Asp Leu Val Ile Tyr 1580 1585 1590 GAG ATG CAC TTG CGT GGA TTC ACG AAG CAT GAT TCA AGC AAT GTA GAA 3120 Glu Met His Leu Arg Gly Phe Thr Lys His Asp Ser Ser Asn Val Glu 1595 1600 1605 CAT CCG GGT ACT TTC ATT GGA GCT GTG TCG AAG CTT GAC TAT TTG AAG 3168 His Pro Gly Thr Phe Ile Gly Ala Val Ser Lys Leu Asp Tyr Leu Lys 1610 1615 1620 GTA CAG CTG TAC TTG CTG ACT ACA TAG GAT AAT TTT TAA AGA AAG CTA 3216 Val Gin Leu Tyr Leu Leu Thr Thr Asp Asn Phe Arg Lys Leu 1625 1630 1635 1640 CAT ATT AGC CAG AAT 
TTG GGT TAT TAC AAA AAC TAC TGC ATA CTA TAG 3264 His Ile Ser Gin Asn Leu Gly Tyr Tyr Lys Asn Tyr Cys Ile Leu 1645 1650 1655 CAG TTA CAT GCT CAT TAT CGA GGA GAT GCT CAC ACG CAT CTT ATT TGG 3312 Gin Leu His Ala His Tyr Arg Gly Asp Ala His Thr His Leu Ile Trp 1660 1665 1670 ATT TAA TAC CCA ATT CTG TTT TGA TAT TGG ACT GTT CCC TCT ACA GGA 3360 Ile Tyr Pro Ile Leu Phe Tyr Trp Thr Val Pro Ser Thr Gly 1675 1680 1685 GCT TGG AGT TAA TTG TAT TGA ATT AAT GCC CTG CCA TGA GTT CAA CGA 3408 Ala Trp Ser Leu Tyr Ile Asn Ala Leu Pro Val Gin Arg 1690 1695 1700 GCT GGA GTA CTC AAC CTC TTC TTC CAA GTA AGG ACA TGA ATT TAG TAT 3456 Ala Gly Val Leu Asn Leu Phe Phe Gin Val Arg Thr Ile Tyr 1705 1710 1715 1720 TAG CCT GCC AGC ACT GTT TGA GTG AGA GTT CAT ACA CAT TTT GTG CCT 3504 Pro Ala Ser Thr Val Val Arg Val His Thr His Phe Val Pro 1725 1730 1735 GCA TAA CTG ATA TTT GTT CAA ACT ATT TTT TTT AGC AGT CAC TCA ACA 3552 Ala Leu Ile Phe Val Gin Thr Ile Phe Phe Ser Ser His Ser Thr 1740 1745 1750 GTT TTA CAT ATA TAT ATA ATA TAG ACT ATT CGT CAC CCT GGG TGA GGA 3600 Val Leu His Ile Tyr Ile Ile Thr Ile Arg His Pro Gly Gly 1755 1760 1765 ATA GTT ATT CTT CAC CCA CCT CTA TTT TAA CAT CTA TGC ACC GTA ATT 3648 Ile Val Ile Leu His Pro Pro Leu Phe His Leu Cys Thr Val Ile 1770 1775 1780 TTA CGT TTC GTA AAT TTG TCT TAT TTT AGA GAT AAA AAG AGA ACG TAA 3696 Leu Arg Phe Val Asn Leu Ser Tyr Phe Arg Asp Lys Lys Arg Thr 1785 1790 1795 1800 WO 99/14314 PCT/AU98/00743 108 GAA AAC CTA TAA TCG Glu Asn Leu Ser 1805 ATG TAA AAA CAT AGT Met Lys His Ser 1820 TAT TTT TTT TGT TAA Tyr Phe Phe Cys 1835 TCG TAA AAA AAA ATA TGT Ser Lys Lys Ile Cys 1810 GTA AAA TGT ACA TAA AAT Val Lys Cys Thr Asn 1825 TGC CAA ATT TTA TAC AGT Cys Gin Ile Leu Tyr Ser 1840 TAC GTA AAA Tyr Val Lys TTA CAA Leu Gin 1815 ACA TTT TTT GAC CTA Thr Phe Phe Asp Leu 1830 AAA TCA ATA TGA ATG Lys Ser Ile Met 1845 TAA CTA TTT Leu Phe 1850 GTA TTT CAA Val Phe Gin ATG TAA Met 1855 TTT ATT TAT Phe Ile Tyr GAA ATG Glu Met 1860 GTC GTA AGA Val Val Arg TTA CCT CGG GTG 
AAG AAT AAC TTA TTC TGC ACC CTG GGT GAT GAA Leu Pro Arg Val Lys Asn Asn Leu Phe Cys Thr Leu Gly Asp Glu
TAG
1880 1865 1870 1875 TAA CAC TAT ATA His Tyr Ile TAT ATA TAT ATA TAT Tyr Ile Tyr Ile Tyr 1885 ATA TAT Ile Tyr 1890 ATA TAT ATA Ile Tyr Ile CCG GCT Pro Ala 1895 GCT GCT AAT Ala Ala Asn GAT GTT Asp Val 1900 AAT ATT TCG Asn Ile Ser CAA GTA CCT Gin Val Pro 1905 AAG CTG GAT TTT TCT Lys Leu Asp Phe Ser 1910 CCA TGA GAC ATC AAT CCA TAA TTG AAA TTG GTC ACG ACA GTT GAA TAG Pro Asp Ile Asn Pro Leu Lys Leu Val Thr Thr Val Glu 1915 1920 1925 3744 3792 3840 3888 3936 3984 4032 4080 4128 4176 4224 4272 4320 4368 4416 4464 TTG ATA GCT Leu Ile Ala 1930 GAA AAT GAA ATC CAG Glu Asn Glu Ile Gin 1935 CAT GCT ACT GTC His Ala Thr Val 194C TTG CCA TCT CCA Leu Pro Ser Pro GAC TTG CTA ACA TGA Asp Leu Leu Thr 1945 ATT TTG Ile Leu 1950 TCT GCC TAC Ser Ala Tyr CTG TCA Leu Ser 1955 TTT GTA CCA Phe Val Pro
ACG
Thr 1960 TTC CCA ATT GCC CTC TCA TTA TTC GTG TGT ACC ATG CAT ATG TGT TTT Phe Pro Ile Ala' Leu Ser Leu Phe Val Cys Thr Met His Met Cys Phe 1965 1970 1975 AAC ATG ATT Asn Met Ile ATC ACC CGT Ile Thr Arg 1995 ATT GTT Ile Val 1980 GGC TAT ATT Gly Tyr Ile TCT CTT Ser Leu 1985 TGG AAA CAT Trp Lys His GAC TAA TTT Asp Phe 1990 TTT GTA TAA ACT Phe Val Thr GCT TGT Ala Cys 2000 TTT CAT ATC Phe His Ile AGG ATG AAC TTT Arg Met Asn Phe 2005 ACG AGA TAC ACA Thr Arg Tyr Thr 0 TGG GGA Trp Gly 2010 TAT TCT ACC ATA Tyr Ser Thr Ile AAC TTC Asn Phe 2015 TTT TCA CCA Phe Ser Pro
ATG
Met 202( TCA GGC GGG ATA AAA Ser Gly Gly Ile Lys 2025 AAC TGT GGG CGT GAT Asn Cys Gly Arg Asp 2030 GCC ATA Ala Ile 2035 AAT GAG TTC Asn Glu Phe
AAA
Lys 2040 ACT TTT GTA AGA Thr Phe Val Arg GAG GCT CAC AAA CGG Glu Ala His Lys Arg 2045 GGA ATT GAG GTA AGC Gly Ile Glu Val Ser 2050 AAG TCG Lys Ser 2055 WO 99/14314 WO 99/ 4314PCT/AU98/00743 109 TAC GAG TTA GTT GCT Tyr Giu Leu Val Ala 2060 CCT TTT GAA CTT ATC Pro Phe Glu Leu Ile 2065 AAT TTG ATG Asn Leu Met CGA AGA CAT Arg Arg H-is 2070 CAG CTG AGG Gin Leu Arg GTT ACT Vai Thr GCT AGG TGA TCC TGG Aia Arg Ser Trp 2075 ATG TTG Met Leu 2080 TCT TCA ACC Ser Ser Thr
ATA
Ile 2085 GTA ATG AGA Vai Met Arg 2090 ATG GTC CAA Met Val Gin TAT TAT Tyr Tyr 2095 CAT TTA GGG His Leu Giy GGG TCG ATA ATA CTA Gly Ser Ile Ile Leu 2100 CAT ACT ATA TGC TTG His Thr Ile Cys Leu 2105 CAC CCA His Pro 2110 AGG TGA CAG Arg Gin ATC TTT Ile Phe 2115 CTT GCT GCG Leu Ala Ala
TAA
2120 TTG TTC TTT CAT AGA TOT ATA GAG CAT AGA TGT OTT ATG TAG TAG TTC Leu Phe Phe His Arg Cys Ile Giu His Arg Cys Vai Met Phe 2125 2130 2135 TTT TTC AAG Phe Phe Lys GGG ATT Oly Ile 2140 ATG TTC ATG Met Phe Met CAG GGA GAG TTT TAT Gin Giy Giu Phe Tyr 2145 AAC TAT TCT Asn Tyr Ser 2150 GCC TGT GGG AAT Giy Cys Gly Asn 2155 ACC TTC AAC Thr Phe Asn TGT AAT CAT CCT GTG Cys Asn His Pro Vai 2160 GTT CGT Val Arg 2165 CAA TTC Gin Phe ATT GTA GAT Ile Val Asp 2170 TGT TTA AGG Cys Leu Arg TAC AGA Tyr Arg 2175 TAT ACA TTT Tyr Thr Phe TAC TTC TAG AAC TAC Tyr Phe Asn Tyr 2180 4512 4560 4608 4656 4704 4752 4800 4848 4896 4944 4992 5040 5088 5136 5184 5232 TTT TTC Phe Phe 2185 ATT TCT TTT Ile Ser Phe GCT GCT TGT CAT TTT Ala Ala Cys His Phe 2190 GAT ATG ATT AAT TTG Asp Met Ile Asn Leu 2195
CAA
Gin 2200 GCT TGT GGG GGT Ala Cys Gly Oly AAA TCT Lys Ser 2205 TTT GGT CAG Phe Gly Gin CAT ATT His Ile 2210 GTA TCT TTA Vai Ser Leu AAT GTC Asn Val 2215 ACA AAT ACT Thr Asn Thr AAT GTC Asn Val 2220 CTO GTG CTT Leu Val Leu ATT GAT Ile Asp 2225 TTG GCA TCT Leu Ala Ser TCA AAT TCT Ser Asn Ser 2230 AAC TAA TTT Asn Phe TCT CCA ATG AAA Ser Pro Met Lys 2235 AGG GAA AAA Arg Glu Lys TCT ACT Ser Thr 2240 GTA TGT CTC Val Cys Leu
GTC
Val1 2245 ACT TTT Thr Phe 2250 GTT TTG CAG ATA VJal Leu Gin Ile CTG GGT Leu Gly 2255 GAT GGA AAT Asp Gly Asn GCA TGT TGA TGG TTT Ala Cys Trp Phe 2260 TCG TTT TGA TCT TGC Ser Phe Ser Cys 2265
ATC
Ile 2270 CAT AAT GAC CAG His Asn Asp Gin TGT TGC CTT TTC Cys Cys Leu Phe 229( AGG TTC Arg Phe 2275 CAG GTA ATT Gin Val Ile
TOT
Cys 2280 ATT TAT TGT TTG Ile Tyr Cys Leu TGT TTC TTT TAC Cys Phe Phe Tyr 230( TTT GCG Phe Ala 2285 AAG TCT Lys Ser 0 AGA AGA TTC TTA Arg Arg Phe Leu AAA GAA Lys Glu 2295
GTG
Val1 GGA TCC AGT TAA CGT GTA Gly Ser Ser Arg Val 2305 TGG AGC TCC Trp Ser Ser 2310 WO 99/14314 PCT/AU98/00743 110 AAT AGA AGG TGA CAT GAT CAC AAC AGG GAC ACC TCT TGT TAC TCC ACC 5280 Asn Arg Arg His Asp His Asn Arg Asp Thr Ser Cys Tyr Ser Thr 2315 2320 2325 ACT TAT TGA CAT GAT CAG CAA TGA CCC AAT TCT TGG AGG CGT CAA GGT 5328 Thr Tyr His Asp Gin Gin Pro Asn Ser Trp Arg Arg Gin Gly 2330 2335 2340 ACT TGT TTC ATC CAA CAC CTG TTG TCT GTG TGC ATT CAA TTG TTT TAA 5376 Thr Cys Phe Ile Gin His Leu Leu Ser Val Cys Ile Gin Leu Phe 2345 2350 2355 2360 TAT GGT AAT GAT CAA TTT CCC AAT GTT GAT AAG GAA AAA AAA TGC AAG 5424 Tyr Gly Asn Asp Gin Phe Pro Asn Val Asp Lys Glu Lys Lys Cys Lys 2365 2370 2375 TAG CTC TCT TTA TCT GCT TCT TGT GAG TTA TGC TAA ACA TGT AGA TAC 5472 Leu Ser Leu Ser Ala Ser Cys Glu Leu Cys Thr Cys Arg Tyr 2380 2385 2390 TAC TAT ATT TCA ACT GTA TAT ACT TGA CAT ATT ATT GCT TCC TTG GGA 5520 Tyr Tyr Ile Ser Thr Val Tyr Thr His Ile Ile Ala Ser Leu Gly 2395 2400 2405 GGC TCT CTT ATT CCT TTC CCC CGT TGC AAT TAT AGC TCA TTG CTG AAG 5568 Gly Ser Leu Ile Pro Phe Pro Arg Cys Asn Tyr Ser Ser Leu Leu Lys 2410 2415 2420 CAT GGG ATG CAG GAG GCC TCT ATC AAG TAG GTC AAT TCC CTC ACT GGA 5616 His Gly Met Gin Glu Ala Ser Ile Lys Val Asn Ser Leu Thr Gly 2425 2430 2435 2440 ATG TTT GGT CTG AGT GGA ATG GGA AGG TAA GGT ACC TGT TAA AAG TTT 5664 Met Phe Gly Leu Ser Gly Met Gly Arg Gly Thr Cys Lys Phe 2445 2450 2455 GAA TGG CAA ATA CTG ATA GAA ATA TAA CTT ATA TTT GCG ACA TAT ATA 5712 Glu Trp Gin Ile Leu Ile Glu Ile Leu Ile Phe Ala Thr Tyr Ile 2460 2465 2470 GAT AAA GCA AAA TAA TAC GCA TTC CAC CTG AAC TTT AAA GGG GCA CGC 5760 Asp Lys Ala Lys Tyr Ala Phe His Leu Asn Phe Lys Gly Ala Arg 2475 2480 2485 AGA ATT ATC CCG CAT CTG TCT ACA AGA ATG ATA ACA CAT GTG CTG AAT 5808 Arg Ile Ile Pro His Leu Ser Thr Arg Met Ile Thr His Val Leu Asn 2490 2495 2500 AGT GAA GTA CTA CTT CTC AAA TGT CTG AAT GAA CGC ACT AAC TCT TGT 5856 Ser Glu Val Leu Leu Leu Lys Cys Leu Asn Glu Arg Thr Asn Ser Cys 2505 2510 2515 
2520 GAG TGT CAA CCG AGC AAG AAA TAT TTG AGT TTT CTG CAA GAA ATT GTT 5904 Glu Cys Gin Pro Ser Lys Lys Tyr Leu Ser Phe Leu Gin Glu Ile Val 2525 2530 2535 CAT GTT GTG CTG TAT TAT ACT CCC TCC GTC CGA AAT TAT TTG TCG GAG 5952 His Val Val Leu Tyr Tyr Thr Pro Ser Val Arg Asn Tyr LeA Ser Glu 2540 2545 2550 AAA TGG ATG TAT CTA GAC GTA TTT TAG TTC TAG ATA CAT CCA TTT TTA 6000 Lys Trp Met Tyr Leu Asp Val Phe Phe Ile His Pro Phe Leu 2555 2560 2565 WO 99/14314 PCT/AU98/00743 111 TCC ATT TCT Ser Ile Ser 2570 GCA ACA AGT Ala Thr Ser AGT TCC Ser Ser 2575 GGA CGG AGG Gly Arg Arg GAG TAT Glu Tyr 2580 CAT TTA ACA His Leu Thr AAT ATA Asn Ile 2585 TGC ATG TTC Cys Met Phe GAA GTA AAT CCC CAC Glu Val Asn Pro His 2590 GAA TAA Glu 2595 GCA TAT AAG Ala Tyr Lys
ACG
Thr 2600 ATA TTG CTT TTT Ile Leu Leu Phe GAC TTG Asp Leu 2605 CAA CAC CTA Gin His Leu AAC CTC Asn Leu 2610 ATT GTT TTC Ile Val Phe TCC TAG Ser 2615 GAT TTT GGG Asp Phe Gly TGT TCG AAG CAA GCA Cys Ser Lys Gin Ala 2620 GCT GGT Ala Gly 2625 GAT ATT TAA Asp Ile TTT ACC TTT Phe Thr Phe 2630 GCC TTT ATT TGT AGC TTG ATT TGA GGG TGC GGC AAA GGT TTT AGC TTA Ala Phe Ile Cys Ser Leu Ile Gly Cys Gly Lys Gly Phe Ser Leu 2635 2640 2645 GTA GTG TTT TGT AAA TTA TTA TAG TTT ATG TAT ATA CTC CTC ATT TGG Val Val Phe Cys Lys Leu Leu Phe Met Tyr Ile Leu Leu Ile Trp 2650 2655 2660
GCA
Ala 2665 CTT CCG TAC Leu Pro Tyr TGG TCC CAT Trp Ser His 2670 AGA AGA TAA Arg Arg AAA TGG Lys Trp 2675 AAT GAT GTC Asn Asp Val
TGG
Trp 2680 CCA ATA ATT GTT Pro Ile Ile Val GAC AAC Asp Asn 2685 ACT GTT GCG Thr Val Ala CAT TTG ATT TTT ATC His Leu Ile Phe Ile 2690 AGG GAA Arg Glu 2695 6048 6096 6144 6192 6240 6288 6336 6384 6432 6480 6528 6576 6624 6672 6720 6768 TGG AAA ATT Trp Lys Ile GAA ATC GGT AAG AAA Glu Ile Gly Lys Lys 2700 CAT TGC GAT ATT AAG His Cys Asp Ile Lys 2705 CTT GTA TAT Leu Val Tyr 2710 CGT GTG CAT Arg Val His i GCT AAT GCT GGT Ala Asn Ala Gly 2715 GGA TCT TTA Gly Ser Leu AGA GGG AAC ATA TGA Arg Gly Asn Ile 2720
TCT
Ser 2725 CCA TCT TCA Pro Ser Ser 2730 ACT AAA AAA Thr Lys Lys ATA TGT Ile Cys 2735 TGC ACA TCT Cys Thr Ser CCC ACG TCA CTT ACT Pro Thr Ser Leu Thr 2740 AGC TAT Ser Tyr 2745 TTC ATC CAA Phe Ile Gin GTA CTA Val Leu 2750 ACT TGT GTG Thr Cys Val GTT GTC Val Val 2755 TCC TCA GTA Ser Ser Val
CCG
Pro 2760 GGA CAT TGT GCG Gly His Cys Ala CCA ATT Pro Ile 2765 CAT TAA AGG His Arg CAC TGA TGG ATT TGC His Trp Ile Cys 2770 TGG TGG Trp Trp 2775 TTT TGC CGA 2 Phe Cys Arg TGG CAA TAC Trp Gin Tyr 2795 ATG TCT Met Ser 2780 TTG TGG AAG Leu Trp Lys TCC ACA Ser Thr 2785 CCT ATA CCA Pro Ile Pro GGT AAG TTG Gly Lys Leu 2790 ATT TTT TAT Ile Phe Tyr TTG GAA ATG GGT Leu Glu Met Gly TGA GTG AAT GTC ACA Val Asn Val Thr 2800
TGG
Trp 2805 ATA TAC CAC ATG ATG Ile Tyr His Met Met 2810 ATA CAC ATG TAA ATA TAT Ile His Met Ile Tyr 2815 AAC GAT TAT AGT GTA Asn Asp Tyr Ser Val 2820 WO 99/14314 PCT/AU98/00743 112 TGC ATA Cys Ile 2825 TGC ATT TGG CTA AGA AGT ACT Cys Ile Trp Leu Arg Ser Thr 2830 CCC TCC CTT Pro Ser Leu 2835 AGT AAA AGT Ser Lys Ser
TAG
2840 TAC AAA GTT GAG Tyr Lys Val Glu TCA TCT Ser Ser 2845 ATT TTG GAA Ile Leu Glu CGG AGG Arg Arg 2850 GAG TAT AAG Glu Tyr Lys TGT ATA Cys Ile 2855 CAC TAG TGC His Cys CAT AGG GCT His Arg Ala 2875 AAT ATA Asn Ile 2860 TAG GTT TTA Val Leu ACA CCC Thr Pro 2865 AAC TTG CCA Asn Leu Pro ATG AAG GAA Met Lys Glu 2870 ATA ATC CAC Ile Ile His TTC TAG TTA TCT Phe Leu Ser TAT TTA TTT GTC TGG Tyr Leu Phe Val Trp 2880
TGA
2885 TGA AAA ATT CCA GCC ATG TCA TTT TTT AGG GGG Lys Ile Pro Ala Met Ser Phe Phe Arg Gly GGA GAA GAA ACT ACA Gly Glu Glu Thr Thr 2900 2890 2895 TTG ATT TTT CCC CCT Leu Ile Phe Pro Pro 2905 AAA AAA Lys Lys 2910 AGC CAT CTC Ser His Leu AGA TTT CAT AGG TAA Arg Phe His Arg 2915
CTT
Leu 2920 GCT TTT CTG TAA Ala Phe Leu AGA AAT GAA AAC GAC Arg Asn Glu Asn Asp 2925 TTC ATA CTT TCT GTC Phe Ile Leu Ser Val 2930 GAT TAT Asp Tyr 2935 AAG TGT ATA Lys Cys Ile CAC TAG His 2940 TGC AAT ATA Cys Asn Ile
TAG
2945 GTT TTA ACA Val Leu Thr CCC AAC TTG CCA Pro Asn Leu Pro 2950 TTT GCT GGT GAA Phe Ala Gly Glu 2965 6816 6864 6912 6960 7008 7056 7104 7152 7200 7248 7296 7344 7392 7440 7488 7536 ATG AAG GAA CAT Met Lys Glu His 2955 AGG GCT TTC Arg Ala Phe TAG TTA TCT TAT TTA Leu Ser Tyr Leu 2960 TAA TCC Ser 2970 AAC TAT Asn Tyr 2985 ACT GAA AAA TTC Thr Glu Lys Phe CAG CCA Gin Pro 2975 TGT CAT TTT Cys His Phe TTA GGG GGG AGA AGA Leu Gly Gly Arg Arg 2980 ATT GAT TTT Ile Asp Phe TCC CCC TAA AAA AAG Ser Pro Lys Lys 2990 CCA TCT CAG ATT CAT Pro Ser Gin Ile His 2995
AGG
Arg 3000 AAC TTG CTT TTC Asn Leu Leu Phe TGT AAA Cys Lys 3005 GAA ATG AAA Glu Met Lys ACG ACT Thr Thr 3010 TCA TAC TTT Ser Tyr Phe CTG CGG Leu Arg 3015 TTA TTT Leu Phe CGC TTA CTT Arg Leu Leu AGC TCG Ser Ser 3020 ATG GAT ATT Met Asp Ile TGT AAG ATG AAT GCC AAA Cys Lys Met Asn Ala Lys 3025 3030 GGC GGG ATT Gly Gly Ile 3035 CAA CCC AGT Gin Pro Ser 3050 TGA TCG TTA TTC Ser Leu Phe ACC TTG TTA TTG Thr Leu Leu Leu 3055
CAA
Gin 3040 ATT TCA TTT GGT Ile Ser Phe Gly TTC TCT AGC AAT Phe Ser Ser Asn 3045 GCA CTG CAA TTT Ala Leu Gin Phe CTT ATT GAT TAA TCA Leu Ile Asp Ser 3060 GGC AGG AGG AAG GAA ACC TTG GCA CAG TAT Gly Arg Arg Lys Glu Thr Leu Ala Gin Tyr 3065 3070 CAA CTT GGT ATG TGC Gin Leu Gly Met Cys 3075
ACA
Thr 3080 WO 99/14314 PCT/AU98/00743 113 TGA TGG ATT Trp Ile TAC ACT GGG Tyr Thr Gly 3085 TGA TTT GGT ACA TAT Phe Gly Thr Tyr 3090 AAT ACC AAG TCA ATT Asn Thr Lys Ser Ile 3095 S TAC CAA ATG Tyr Gin Met GGA ATT GTG Gly Ile Val 3115 GGG AGA Gly Arg 3100 CCA ATA GAG ATG GAG Pro Ile Glu Met Glu 3105 AAA ATC ACA ATC TTA GCT Lys Ile Thr Ile Leu Ala 3110 CTT TTT TTT TGA AAT TTT Leu Phe Phe Asn Phe 3125 GGG AGG TAA TTC Gly Arg Phe TGA ACT Thr 3120 CAT GCT TTA CAT AAT AGT His Ala Leu His Asn Ser 3130 CAA ATG Gin Met 3135 GCT GAC AAA Ala Asp Lys TGT CGT TGT ATG GTT Cys Arg Cys Met Val 3140 CTC TCT ACC TAA ACC GTT AAG GCA GTA AGA GTT TCC CTA CAA GAT CTC Leu Ser Thr Thr Val Lys Ala Val Arg Val Ser Leu Gin Asp Leu 34 3150U 3155 3160 TTT GTT CGT ATA ATT GTA TTT TCT AGA Phe Val Arg Ile Ile Val Phe Ser Arg 3165 GAA AAG TTG CCT TCA Glu Lys Leu Pro Ser 3170 ATT TTG Ile Leu 3175 TGC ACG CGG Cys Thr Arg CAG TAC AGG AAT TGT Gin Tyr Arg Asn Cys 3180 GGT TAT AAA TAT TGA Gly Tyr Lys Tyr 3185 ACC ATC GTT ACT Thr Ile Val Thr 3195 AAT AGG GGG Asn Arg Gly AAC AAT AAG CAC ATT Asn Asn Lys His Ile 3200
TTT
Phe 3205 TAC AGG CTG Tyr Arg Leu 3190 TTA ATA GCA Leu Ile Ala ATC CGA ACC Ile Arg Thr 7584 7632 7680 7728 7776 7824 7872 7920 7968 8016 8064 8112 8160 8208 8256 8304 AAG GCA Lys Ala 3210 ATA AGT Ile Ser 3225 TCA CCC TTG TTC Ser Pro Leu Phe CGT TTC CAA TGA AAT Arg Phe Gin Asn 3215 CAC AGT His Ser 3220 TTT ACA AGT Phe Thr Ser ATG CGT Met Arg 3230 AGA GAG AAA Arg Glu Lys TAA AGT ATC AAC CCG Ser Ile Asn Pro 3235
GCA
Ala 3240 GAA ACA GTT GTT Glu Thr Val Val TCA GGC GCA AAG AGA Ser Gly Ala Lys Arg 3245 AAA GGA AAC GAT ATG Lys Gly Asn Asp Met 3250 CTC TAT Leu Tyr 3255 TAC ATC AAC Tyr Ile Asn CTT TTA Leu Leu 3260 GCA TTT AGG Ala Phe Arg GAC GAC CAG CAT CAT Asp Asp Gin His His 3265 AAT CAA CTG GAG Asn Gin Leu Glu 3275 CGA GGT CAC Arg Gly His CTC CAA Leu Gin 3280 TCT TCT CAG Ser Ser Gin
CAG
Gin 3285 CCC ATC TTC Pro Ile Phe 3270 CCT CAG AGT Pro Gin Ser GGG GTT GGG Gly Val Gly GGT GAC Gly Asp 3290 CTC CCA AGC AAG Leu Pro Ser Lys TGC ATC Cys Ile 3295 AGC ATC CAT Ser Ile His CAT CTG His Leu 3300 CAC ATA His Ile 3305 CCA TGA GCA Pro Ala CAA TCA CCT GAA TTT Gin Ser Pro Glu Phe 3310 GAT GAA TTT TCC TCT Asp Glu Phe Ser Ser 3315
GTT
Val 3320 TAC CTT GCA GCA Tyr Leu Ala Ala GAC CCC TGC CGT ATA Asp Pro Cys Arg Ile 3325 AAT GGT TTT AAA TGA Asn Gly Phe Lys 3330 CAG CAT Gin His 3335 WO 99/14314 PCT/AU98/00743 114 GTT CTT TCA GTT TGA GCA AAA TTT GTG CAA TTG CAA AGA AGC TTT AGA 8352 Val Leu Ser Val Ala Lys Phe Val Gin Leu Gin Arg Ser Phe Arg 3340 3345 3350 ATC ATG TGG AAC ATG CAC TTA CAT TTC ATC TGA CAA TAT AGG AAG GAG 8400 Ile Met Trp Asn Met His Leu His Phe Ile Gin Tyr Arg Lys Glu 3355 3360 3365 AGC CCG ACG TCG CAT GCT CCT CTA GAC TCG AGG AAT TCG CAA GAT TGT 8448 Ser Pro Thr Ser His Ala Pro Leu Asp Ser Arg Asn Ser Gin Asp Cys 3370 3375 3380 CTG TCA AAA GAT TGA GGA AGA GGC AGA TGC GCA ATT TCT TTG TTT GTC 8496 Leu Ser Lys Asp Gly Arg Gly Arg Cys Ala Ile Ser Leu Phe Val 3385 3390 3395 3400 TCA TGG TTT CTC AAG TAA GAC TTA TAT CTG ATC TCT TCA ATT TTT GAG 8544 Ser Trp Phe Leu Lys Asp Leu Tyr Leu Ile Ser Ser Ile Phe Glu 3405 3410 3415 ATT GCC TGT TTT TCA CAA TGG CAT ATG TTG TCA GGT GAA ACA TCC AAT 8592 Ile Ala Cys Phe Ser Gin Trp His Met Leu Ser Gly Glu Thr Ser Asn 3420 3425 3430 CCC AGT ATT AAT AGA GCC AAC ATG AAG GGA TTG CTT ATC TGA GAT ATC 8640 Pro Ser Ile Asn Arg Ala Asn Met Lys Gly Leu Leu Ile Asp Ile 3435 3440 3445 TGC CAA AGT TGA ATT CTT AGA TTC ACC TTC TTC AGT ATT TCA GAC CTT 8688 Cys Gin Ser Ile Leu Arg Phe Thr Phe Phe Ser Ile Ser Asp Leu 3450 3455 3460 CTA AGC ATT TTC ATT TTT TTT TTC AAT TGT TAG GGA GTT CCA ATG TTT 8736 Leu Ser Ile Phe Ile Phe Phe Phe Asn Cys Gly Val Pro Met Phe 3465 3470 3475 3480 TAC ATG GGC GAT GAA TAT GGC CAC ACA AAA GGG GGC AAC AAC AAT ACA 8784 Tyr Met Gly Asp Glu Tyr Gly His Thr Lys Gly Gly Asn Asn Asn Thr 3485 3490 3495 TAC TGC CAT GAT TCT TAT GTC AGT ACA ATT TGG TCA CAT ATT GTT GTT 8832 Tyr Cys His Asp Ser Tyr Val Ser Thr Ile Trp Ser His Ile Val Val 3500 3505 3510 CTA AGT AAC TAT CTT CAA ATC TTT GCA TTC ATC CGT CAT GGC TCT TCT 8880 Leu Ser Asn Tyr Leu Gin Ile Phe Ala Phe Ile Arg His Gly Ser Ser 3515 3520 3525 GTA GGT CAA TTA TTT TCG CTG GGA TAA AAA AGA ACA ATA CTC TGA 
CTT 8928 Val Gly Gin Leu Phe Ser Leu Gly Lys Arg Thr Ile Leu Leu 3530 3535 3540 GCA AAG ATT CTG CTG CCT CAT GAC CAA ATT CCG CAA GTA AGT ATT CCG 8976 Ala Lys Ile Leu Leu Pro His Asp Gin Ile Pro Gin Val Ser Ile Pro 3545 3550 3555 3560 TTG AAT AAT TTC TGT GTA GAA CCA CTG AAG GTG CCT CCA AAC GCT AAG 9024 Leu Asn Asn Phe Cys Val Glu Pro Leu Lys Val Pro Pro Asn Ala Lys 3565 3570 3575 CGA GCA AGG TCA ATT TCA CAC CCT AAT CAA GTT GGT GTT GTC-TAT TTG 9072 Arg Ala Arg Ser Ile Ser His Pro Asn Gin Val Gly Val Val Tyr Leu 3580 3585 3590 WO 99/14314 WO 99/ 4314PCT/AU98/00743 115 TGT ATT TGA TCT Cys Ile Ser 3595 GCT GCA CTG TAG GGA Ala Ala Leu Gly 3600 GTG CGA GGG TCT TGG CCT TGA Val Arg Gly Ser Trp Pro* 3605 GGA CTT TCC AAC GCC CGA Giy Leu Ser Asn Gly Arg 3610 ACG GCT Thr Ala 3615 GCA GTG GCA Ala Val Ala TGG TCA Trp Ser 3620 TCA GCC TGG Ser Ala Trp GAA GCC Glu Ala 3625 TGA TTG GTC Leu Val TGA GAA TAG CCG ATT Glu Pro Ile 3630 CGT TGC CTT TTC CAT Arg Cys Leu Phe His 3635
GGT
G ly 3640 9120 9168 9216 9264 9289 ACA CAT ATA OTT Thr His Ile Val CTG ACA CTT CAC TAT Leu Thr Leu His Tyr 3645 AGT TGT TTT AAA AAA Ser Cys Phe Lys Lys 3650 GAA AAT Giu Asn 3655 TTA ACT CAA AAG TAA ATT ATG GAG A Leu Thr Gin Lys Ile Met Giu 3660
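
The listing above pairs each nucleotide triplet with a three-letter amino acid code and prints no residue under the stop codons TAA, TAG and TGA. A minimal Python sketch of that conceptual translation follows, assuming the standard genetic code. The function name `translate_listing` and the `Xaa` placeholder for unreadable codons are illustrative choices and do not appear in the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate_listing(dna: str) -> str:
    """Translate whitespace-separated DNA codon by codon.

    Stop codons produce no residue, mirroring the listing. Codons with
    characters other than A, C, G, T (for example OCR damage) become
    'Xaa', a placeholder assumed here for illustration.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    # Ignore a trailing partial codon, if any.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3])
        if aa is None:
            residues.append("Xaa")
        elif aa != "*":
            residues.append(THREE_LETTER[aa])
    return " ".join(residues)


print(translate_listing("CCA TGA GAC ATC AAT CCA TAA TTG"))
# → Pro Asp Ile Asn Pro Leu
```

Skipping stop codons rather than halting at the first one matches the way the listing annotates a genomic sequence in which exons are interrupted by introns.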

Claims (46)

1. A nucleic acid sequence encoding an enzyme of the starch biosynthetic pathway in a cereal plant, wherein the enzyme is selected from the group consisting of starch branching enzyme I, starch branching enzyme II, starch soluble synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize.
2. A sequence according to claim 1, wherein the sequence is a genomic DNA or cDNA sequence.
3. A sequence according to claim 1 or claim 2, wherein the sequence is functional in wheat.
4. A sequence according to any one of claims 1 to 3, wherein the sequence is derived from a Triticum species.
5. A sequence according to claim 4, wherein the Triticum species is Triticum tauschii.
6. A sequence according to any one of claims 1 to wherein the sequence encodes starch branching enzyme I or a biologically-active fragment thereof, and wherein the sequence has at least 70% sequence homology with the sequence shown in SEQ ID NO:5 or SEQ ID NO:9.
7. A sequence according to claim 6, wherein the homology is at least
8. A sequence according to any one of claims 1 to wherein the sequence encodes starch branching enzyme II or a biologically-active fragment thereof, and wherein the sequence has at least 70% sequence homology with the sequence shown in SEQ ID
9. A sequence according to claim 8, wherein the homology is at least
10. A sequence according to any one of claims 1 to wherein the sequence encodes soluble starch synthase or a biologically-active fragment thereof, and wherein the sequence has at least 70% sequence homology with the sequence shown in SEQ ID NO:11 or SEQ ID NO:13.
11. A sequence according to claim 10, wherein the homology is at least
12. A sequence according to claim 11, wherein the sequence encodes a 75 kD soluble starch synthase of wheat.
13. A sequence according to claim 12, which encodes an amino acid sequence at least 70% homologous to that shown in SEQ ID NO:14.
14. A sequence according to any one of claims 1 to wherein the sequence encodes debranching enzyme or a biologically-active fragment thereof, and wherein the sequence has at least 70% sequence homology with the sequence shown in SEQ ID NO:17.
15. A sequence according to claim 14, wherein the homology is at least
16. A promoter of an enzyme selected from the group consisting of starch branching enzyme I, starch branching enzyme II, starch soluble synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize.
17. A promoter according to claim 16, wherein the promoter is a starch branching enzyme I promoter or biologically-active fragment thereof, and wherein the promoter sequence has at least 70% sequence homology with the sequence shown in SEQ ID NO:8.
(PCT/AU98/00743, received 21 June 1999)
18. A sequence according to claim 17, wherein the homology is at least
19. A promoter according to claim 16, wherein the promoter is a starch soluble synthase I promoter or biologically-active fragment thereof, and wherein the promoter sequence has at least 70% sequence homology with the sequence shown in SEQ ID
20. A sequence according to claim 19, wherein the homology is at least
21. A nucleic acid construct comprising a nucleic acid sequence encoding an enzyme of the starch biosynthetic pathway in a cereal plant, operably linked to one or more nucleic acid sequences facilitating expression of the nucleic acid sequence in a plant, wherein the enzyme is selected from the group consisting of starch branching enzyme I, starch branching enzyme II, starch soluble synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize, a biologically-active fragment thereof, and that starch branching enzyme II does not have the N-terminal amino acid sequence: AASPGKVLVPDGEDDLASPA.
22. A nucleic acid construct for targeting a gene to the endosperm of a cereal plant, comprising one or more promoter sequences selected from the group consisting of SBE I promoter, SBE II promoter, SSS I promoter, and DBE promoter, operatively linked to a nucleic acid sequence encoding a protein, wherein the expression of the targeted gene in the endosperm of a cereal plant is modified.
AMENDED SHEET (Article 34) (IPEA/AU)
23. A construct according to either claim 21 or claim 22, wherein the promoter or nucleic acid sequence is also operatively linked to one or more additional targeting sequences and/or one or more 3' untranslated sequences.
24. A construct according to claim 23, wherein the nucleic acid encoding the protein is either in the sense or antisense orientation.
25. A construct according to claim 24, wherein the protein is an enzyme of the starch biosynthetic pathway.
26. A construct according to claim 25, wherein the nucleic acid encoding the protein is in the antisense orientation, and the enzyme is selected from the group consisting of GBSS, starch debranching enzyme, SBE II, low molecular weight glutenin, and grain softness protein I.
27. A construct according to claim 25, wherein the nucleic acid encoding the protein is in the sense orientation, and the enzyme is selected from the group consisting of bacterial isoamylase, bacterial glycogen synthase, and wheat high molecular weight glutenin Bx17.
28. A construct according to any one of claims 21 to 27, wherein the plant is a cereal plant.
29. A construct according to claim 28, wherein the cereal plant is either wheat or barley.
30. A construct according to claim 29, wherein the cereal plant is wheat.
31. A construct according to any one of claims 21 to wherein the construct is either a plasmid or a vector.
32. A construct according to claim 31, wherein the plasmid or vector is suitable for use in the transformation of a plant.
33. A construct according to claim 32, wherein the plasmid is selected from the group consisting of those depicted in Figures 22a to 22f.
34. A construct according to claim 32, wherein the vector is a bacterium of the genus Agrobacterium.
35. A construct according to claim 34, wherein the vector is Agrobacterium tumefaciens.
36. A method of modifying the characteristics of starch produced by a plant, comprising the steps of: (a) introducing a nucleic acid sequence encoding an enzyme of the starch biosynthetic pathway into a host plant, and/or (b) introducing an anti-sense nucleic acid sequence directed to a gene encoding an enzyme of the starch biosynthetic pathway into a host plant, wherein the enzyme is selected from the group consisting of starch branching enzyme I, starch branching enzyme II, starch soluble synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize, and wherein if both steps (a) and (b) are used, the enzymes in the two steps are different.
37. A method according to claim 36, wherein the plant is a cereal plant.
38. A method according to claim 37, wherein the cereal plant is wheat or barley.
39. A method of targeting expression of a gene to the endosperm of a cereal plant, comprising the step of transforming the plant with a construct according to any one of claims 21 to
40. A method of modulating the time of expression of a gene in endosperm of a cereal plant, comprising the step of transforming the plant with a construct according to any one of claims 21 to
41. A method according to claim 40, wherein when expression at an early stage following anthesis is desired, the construct comprises either the SBE II, SSS I, or DBE promoter.
42. A method according to claim 40, wherein when expression at a later stage following anthesis is desired, the construct comprises the SBE I promoter.
43. A plant transformed with a construct according to any one of claims 21 to
44. A plant according to claim 43, wherein the plant is a cereal plant.
45. A plant according to claim 44, wherein the cereal plant is wheat or barley.
46. A method of identifying variations in the starch synthesis characteristics of a cereal plant, comprising the step of identifying a variation in nucleic acid sequence in the intron regions of the SBE I, SBE II, SSS I or DBE genes.
47. A method of identifying variations in the starch synthesis characteristics of a cereal plant, comprising the step of identifying a variation in nucleic acid sequence compared to the sequence shown in one or more SEQ ID SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:16, or SEQ ID NO:17.
48. A method according to claim 47, in which a mutation or absence of a SBE I, SBE II, SSS I or DBE gene is detected.
49. A method according to either claim 47 or claim 48, in which the cereal plant is wheat or barley.
50. A product comprising plant material propagated from a plant transformed with a nucleic acid sequence encoding an enzyme of the starch biosynthetic pathway in a cereal plant, operably linked to one or more nucleic acid sequences facilitating expression of the nucleic acid sequence in a plant, wherein the enzyme is selected from the group consisting of starch branching enzyme I, starch branching enzyme II, starch soluble synthase I, and debranching enzyme, with the proviso that the enzyme is not soluble starch synthase I of rice, or starch branching enzyme I of rice or maize, a biologically-active fragment thereof.
51. A product comprising plant material propagated from a plant in which a gene was targeted to the endosperm of a cereal plant, by a nucleic acid construct comprising one or more promoter sequences selected from the group consisting of SBE I promoter, SBE II promoter, SSS I promoter, and DBE promoter, operatively linked to a nucleic acid sequence encoding a protein, wherein the expression of the targeted gene in the endosperm of a cereal plant is modified.
52. A product according to claim 50 or claim 51 wherein the product is a food product.
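
Several of the claims above recite a threshold of "at least 70% sequence homology" with a reference sequence. The claims do not state how homology is to be computed. Purely as an illustration, the Python sketch below measures percent identity between two sequences that have already been aligned to equal length, with gaps written as `-`. The function names, the identity-based measure, and the gap handling are all assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch. This is one of several conventions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy sequences, not taken from the sequence listing.
print(percent_identity("ATGGCTAAGT", "ATGGCTAAGA"))  # → 90.0
print(meets_threshold("ATGGCTAAGT", "TTGGCAAACA"))   # → False
```

Different alignment methods and gap conventions can give different percentages for the same pair of sequences, so any real comparison would need to fix those choices first.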
AU89670/98A 1997-09-12 1998-09-11 Regulation of gene expression in plants Ceased AU727294B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU89670/98A AU727294B2 (en) 1997-09-12 1998-09-11 Regulation of gene expression in plants

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
AUPO9108 1997-09-12
AUPO9108A AUPO910897A0 (en) 1997-09-12 1997-09-12 Regulation of gene expression in plants
AUPP2509 1998-03-20
AUPP2509A AUPP250998A0 (en) 1998-03-20 1998-03-20 Regulation of gene expression in plants
PCT/AU1998/000743 WO1999014314A1 (en) 1997-09-12 1998-09-11 Regulation of gene expression in plants
AU89670/98A AU727294B2 (en) 1997-09-12 1998-09-11 Regulation of gene expression in plants

Publications (2)

Publication Number Publication Date
AU8967098A AU8967098A (en) 1999-04-05
AU727294B2 true AU727294B2 (en) 2000-12-07

Family

ID=27156790

Family Applications (1)

Application Number Title Priority Date Filing Date
AU89670/98A Ceased AU727294B2 (en) 1997-09-12 1998-09-11 Regulation of gene expression in plants

Country Status (1)

Country Link
AU (1) AU727294B2 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU1902895A (en) * 1994-03-25 1995-10-17 Brunob Ii B.V. Method for producing altered starch from potato plants
WO1997004113A2 (en) * 1995-07-14 1997-02-06 Danisco A/S Inhibition of gene expression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NAIR RB ET AL, PLANT SCIENCE, 122:153-163 *

Also Published As

Publication number Publication date
AU8967098A (en) 1999-04-05

Similar Documents

Publication Publication Date Title
CA2303407C (en) Regulation of gene expression in plants
US6794558B1 (en) Nucleic acid module coding for αglucosidase, plants that synthesize modified starch, methods for the production and use of said plants, and modified starch
EP0826061B1 (en) Improvements in or relating to plant starch composition
US6515203B1 (en) Nucleic acid molecules encoding enzymes having fructosyl polymerase activity
US6791010B1 (en) Nucleic acid molecule coding for beta-amylase, plants synthesizing a modified starch, method of production and applications
US7465851B2 (en) Isoforms of starch branching enzyme II (SBE-IIa and SBE-IIb) from wheat
AU8991198A (en) Improvements in or relating to stability of plant starches
HUT77745A (en) New isoamylase gene, compositions containing it and methods for use of isoamylase
CZ20001680A3 (en) Molecules of nucleic acid encoding enzymes exhibiting fructosyltransferase activity and processes for preparing inulin with long chain
AU727294B2 (en) Regulation of gene expression in plants
AU777455B2 (en) Nucleic acid molecules from artichoke (cynara scolymus) encoding enzymes having fructosyl polymerase activity
AU780523B2 (en) Novel genes encoding wheat starch synthases and uses therefor
Kim et al. Expression of Escherichia coli branching enzyme in caryopses of transgenic rice results in amylopectin with an increased degree of branching.
AU2004200536A1 (en) Nucleic acid molecules

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)