A METHOD THAT COMPARES GENOMIC SEQUENCES
FIELD OF THE INVENTION
The invention relates to the field of DNA fingerprinting and genomic polymorphism.
BACKGROUND OF THE INVENTION
Mapping plant genes with molecular markers has many applications in agriculture. These include marking useful genes to assist breeding, QTL (Quantitative Trait Loci) analysis, and map-based cloning of important genes and QTLs. Genetic markers, when closely associated with economically important traits, increase breeding efficiency by allowing earlier selection in smaller populations, as compared with conventional breeding techniques. Genetic markers, when combined with genetic maps, can be employed in marker-assisted selection (MAS) strategies for yield and quality components. As genetic maps become saturated, their value to programs of plant improvement increases. The most interesting applications of molecular markers require reasonably saturated maps (markers < 1 cM apart). Following a decade of initial investment to develop marker technology, even the most demanding and sophisticated applications (QTL mapping, positional cloning) are now materializing in crops such as tomato, maize, rice and lettuce, and in the model plant, Arabidopsis. The dynamic field of plant genome analysis is reaching a point where many of the initial expectations are beginning to materialize. An array of molecular marker techniques (see below) enables "seeing" the genome and following chromosomal segments that carry a desired trait through sexual crosses. It has been demonstrated that QTLs can be mapped, and that many valuable alleles are hidden in exotic germplasm items, even when their phenotype for that trait is inferior. Comparative mapping of genomes from crops that do not cross with each other but are phylogenetically related will allow the mapping of agronomic traits in one crop and the subsequent finding of their homologues in other crops. The various marker techniques involve heritable intra-specific variation in the DNA that can be detected by various means.
RFLPs (Restriction Fragment Length Polymorphisms) have been used since the 1980s to provide genetic markers that are presumably random, co-dominant, and unlimited in number. They are detected on Southern blots using clones of DNA (genomic, cDNA, ESTs etc.) as probes. Their main disadvantages are the technical difficulty of application and high cost.
RAPDs (Random Amplified Polymorphic DNA) are yet another class of frequent polymorphisms. They involve differences in the banding pattern of PCR (Polymerase Chain Reaction) amplification products obtained by applying random primers to genomic DNA samples. Their main advantage over RFLPs is the simple, non-radioactive protocol requiring less DNA. The disadvantages are that most polymorphisms are dominant (band versus no band) and thus less informative in linkage analysis, and that they are less reproducible. Some investigators use RAPDs at an initial screening stage, and then develop SCARs (Sequence Characterized Amplified Regions) of the most promising PCR products. These consist of PCR products produced by longer, more specific primers that are derived by sequencing RAPD bands.
Microsatellites are repetitive sequences that consist of simple repeats of di-, tri-, or tetra-nucleotides interspersed in the genome. Single-locus markers (also called SSRs, Simple Sequence Repeats) were derived from cloned microsatellites by sequencing the flanking DNA and synthesizing specific PCR primers that flank the repeats. SSRs are highly polymorphic, highly reproducible, co-dominant markers that are frequently used in mapping projects, although the initial investment in developing each marker is high. Another marker system based on microsatellites was developed and termed Inter-SSR-PCR (ISSR): here, the microsatellite repeat is used as the amplification primer, and a multibanded, polymorphic pattern is obtained.
Still another technique applied in mapping projects is AFLP (Amplified Fragment Length Polymorphism). Here, subsets of restriction fragments are amplified by PCR from artificially ligated linkers, giving rise to multibanded patterns that are analyzed on sequencing gels. Each technique has its advantages and drawbacks, and the methodology should be selected to suit the botanical taxon and the specific application.
A large percentage of actively expressed human and plant genes are members of families of DNA sequences that show a high degree of sequence similarity. However, the extent of sequence sharing and the organization of family members may vary widely.
Classical gene families are distinguished by members that exhibit a high degree of sequence homology over most of the gene length, or at least over the coding DNA component, a feature which automatically identifies such sequences as being closely related evolutionarily as well as functionally. In some gene families there is particularly pronounced homology within specific, strongly conserved regions of the genes. Such families may include genes possessing large, highly conserved sequence motifs, such as the paired box of PAX genes (≈390 bp) or the homeobox of the homeobox-containing genes (≈180 bp). On the other hand, the sequence similarity over the full length of family members may be so low that the members of some gene families are not obviously related at the DNA sequence level. Nevertheless, they encode gene products that are characterized by a common general function and the presence of short conserved sequence motifs (e.g. the DEAD box RNA helicase gene family).
Gene families can occur as closely clustered genes at specific subchromosomal localization, or as widely dispersed genes.
In many instances the clustered genes possess a higher degree of similarity in the coding sequence than the interspersed genes, and they also demonstrate related patterns of expression.
Gene amplification, the mechanism responsible for the evolution of multigene families, may result from large-scale gene duplication events that involve the whole genome or large chromosomal segments. Selective duplication of specific genes can occur by copy transposition events and also by tandem gene duplication.
The present invention describes a new method for obtaining DNA polymorphism that is related to gene families, and also describes several accompanying novel technologies related to it.
SUMMARY OF THE INVENTION
The present invention is based on a novel method that uses, in PCR, a mix of primers matching conserved regions in the sequence of gene family members for DNA fingerprinting of cells and tissues.
The basic method is supplemented by several complementary methods and techniques.
These methods concern the following subjects:
(1) The use of the gene family conserved primers alone or in combination with microsatellite sequences or with RAPD primers, either in a PCR method or with AFLP.
(2) The use of the gene family conserved primers in combination with primers that match ubiquitous cis-acting regulatory elements.
(3) Coupling between DNA fingerprinting analysis and the gene family expression pattern analysis of cells and tissues.
(4) The use of the gene family conserved primers in micro-array DNA fingerprinting.
(5) Gene family DNA fingerprinting and cancer.
1. The use of the gene family conserved primers in combination with primers of microsatellite sequences, or RAPD, or in combination with a DNA digestion method
The use of primers that match conserved regions in gene families as a tool for detecting polymorphic markers in PCR may be advantageous. This is because the isolated polymorphic marker encodes the sequence of a gene that contributes to, or is responsible for, traits that vary between the compared strains. Thus, either the genetic markers themselves are responsible for economically important traits, or they might turn out to make a significant contribution to the specific trait. For example: (1) kinase genes are associated with resistance to many plant diseases, and (2) various flavors and tastes of plant fruits are related to the cytochrome P450 enzyme system.
Another advantage of finding a specific marker that is a gene family member is that its homologous genes can be identified in other strains and species.
For PCR DNA fingerprinting of a gene family it is preferable to design primers that match the conserved sequences of gene families that have many members dispersed around the genome and in which a high degree of conservation exists between the members, preferably in at least two conserved domains. In both respects the kinase superfamily is an excellent candidate, containing more than 1000 genes in each organism and a very conserved catalytic domain. The homeobox gene family and the MADS box gene family are also good candidates, fulfilling especially the requirement for a high degree of conservation between the various genes, while cytochrome P450 enzymes are much less conserved but are highly dispersed throughout the genome.
In addition, the method might assist in finding true orthologous genes between different species in the milieu of genes that belong to gene families, as will be explained later. According to the method of the invention, for the purpose of DNA fingerprinting analysis, the PCR reaction contains a mix of the gene family conserved primers together with a primer that matches any kind of frequently repeating sequence, e.g. the ISSR. The ability of the gene family primers to generate PCR products in combination with the ISSR primer makes it possible to create a more variable spectrum of PCR fragments than is obtained with the sense and anti-sense primers of the gene family alone. If the mix of primers contains the gene family sense and anti-sense primers together with a repeat primer, the PCR fragments will be of three basic types: (1) PCR products that are produced by the repeat primer alone. (2) PCR products that contain a gene family primer on one end and the repeat primer on the other end, thus containing part of the gene sequence and an adjacent genomic sequence. (3) PCR products that contain gene family primers on both ends, so that the sequence might be of an exon or of two parts of an exon separated by an intron. If the mix of PCR primers contains only the gene family sense primers or only the anti-sense primers along with the repeat primer, the PCR fragments will be of two basic types: (1) Products that are produced by the repeat primer alone. (2) Products that contain one primer of the gene family and one primer that matches the repeat sequence.
In either alternative, in order to differentiate between these fragments and to utilize for the analyses the PCR products that contain at least one gene family primer, it is necessary to perform the following steps:
(1) To perform the PCR fingerprinting separately with the repeat primer alone, with the gene family primers alone, and then with their combinations. (2) To subtract from the fingerprint of the combination of primers the fingerprint produced with the repeat primer alone, thus identifying the relevant fragments that contain part of the sequence of a gene family member.
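The subtraction in step (2) can be sketched as a comparison of band sizes between two lanes. The following is only an illustrative sketch: the function name, the band sizes, and the 5 bp matching tolerance are assumptions introduced here and do not come from the specification.

```python
def subtract_fingerprint(combined, repeat_only, tolerance_bp=5):
    """Return bands of the combined-primer lane that have no counterpart
    (within tolerance_bp) in the repeat-primer-only lane.

    Band sizes are given in base pairs. The tolerance is a hypothetical
    allowance for gel resolution, not a value taken from the text.
    """
    return [
        band
        for band in sorted(combined)
        if all(abs(band - ref) > tolerance_bp for ref in repeat_only)
    ]


# Hypothetical band sizes read from a gel (bp).
combined_lane = [250, 410, 620, 900, 1150]
repeat_lane = [252, 900]

gene_family_bands = subtract_fingerprint(combined_lane, repeat_lane)
print(gene_family_bands)
```

The bands that remain are candidates for fragments carrying at least one gene family primer; in practice they would still need to be confirmed, for example by the tertiary PCR described later.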
Alternatively, it is possible to use fluorescent primers in order to differentiate between relevant and irrelevant products. The different DNA fingerprinting products can be distinguished by fingerprinting a sample with primers labeled with one fluorochrome for the repeat group of primers and with a different fluorochrome for the group of primers that match gene family conserved regions. In this way PCR products that contain different combinations of primer pairs will have different emission wavelengths and can thus be easily differentiated.
Using sets of many primers also makes it impossible to perform direct sequencing of an obtained product, as one does not know which of the many primers was used to generate a specific PCR product. The cloning procedure is also somewhat more complicated, as the amplification of a specific band for cloning would have to be performed with two sets of many primers and thus might become ineffective.
In order to overcome these problems and to find the identity of a specific PCR product, we propose designing the following sets of primers for each of the gene families:
1. Design primary sets of primers: one set that matches the sequences of a first conserved region and a second set that matches a second conserved region of a gene family.
2. Design secondary sets of primers, in which each primer is composed of three parts. 1) At the 3' side of each primer there is a sequence that is identical to the sequence of one of the primary primers. 2) At the 5' side of each primer an additional 18-21 nucleotides are added. The sequence of this addition is the same for all the primers that belong to the set matching the first conserved region, and is different from the sequence added to the second set, which matches the second conserved region; the second set likewise carries its own common sequence. 3) At the extreme 5' end of these secondary primers the restriction site of a non-frequent cutter restriction enzyme is added. The site is common to the primers that belong to one conserved region, and is different from the restriction site added to the primers that match the other conserved region (Figure 1). The 5' parts of the gene family primers described in items 2.2 and 2.3 are together designated the 5' anchor sequence.
3. Design tertiary primers, usually three. The first has the sequence of the 5' anchor sequence of the first set of secondary primers. The second has the sequence of the 5' anchor sequence of the second set of secondary primers. The third is the repeat or RAPD primer with a third restriction site (Figure 1).
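The primer layout described in items 1 to 3 can be sketched as simple string concatenation, written 5' to 3'. This is only an illustrative sketch: the function names and every sequence below are hypothetical placeholders, and the NotI site (GCGGCCGC) is used merely as one example of a non-frequent cutter; the specification does not name the enzymes.

```python
def build_secondary_primers(primary_primers, common_sequence, restriction_site):
    """Assemble secondary primers, 5' -> 3':
    restriction site + common 18-21 nt sequence + primary primer sequence."""
    if not 18 <= len(common_sequence) <= 21:
        raise ValueError("the common sequence should be 18-21 nucleotides long")
    return [restriction_site + common_sequence + p for p in primary_primers]


def build_tertiary_primer(common_sequence, restriction_site):
    """A gene family tertiary primer is the 5' anchor sequence alone
    (restriction site + common sequence)."""
    return restriction_site + common_sequence


# Hypothetical placeholder sequences, for illustration only.
primary_set_a = ["AAACCCGGGTTTAAACC", "AAATCCGGGTTCAAACC"]
common_c = "ACGTACGTACGTACGTACGT"   # 20 nt, shared by the whole set
site_1 = "GCGGCCGC"                 # NotI site, assumed example of a rare cutter

secondary_set_a = build_secondary_primers(primary_set_a, common_c, site_1)
tertiary_a = build_tertiary_primer(common_c, site_1)

for primer in secondary_set_a:
    print(primer)
print(tertiary_a)
```

Because every secondary primer of a set starts with the same anchor, one tertiary primer per set is enough to re-amplify or sequence any product made with that set.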
Three steps of DNA fingerprinting are then performed in the following manner:
a. DNA samples of the compared strains are amplified using a gene family set of primers and a repeat primer.
b. Secondary PCR is performed using the secondary primers to re-amplify the products of the first round of PCR. Alternatively, the primary PCR of item a can be omitted.
c. PCR is performed on either a selected PCR product from the secondary PCR, or on the whole secondary PCR reaction, with a selected combination of tertiary primers.
These primer combinations and PCR steps have several advantages:
If a differential PCR product is isolated, the three-step system makes it possible to verify which primers created it. A selected PCR product is purified from the gel and divided into parts that are amplified using different combinations of tertiary primers. The resulting PCR products reveal the composition of the primers that created the selected primary PCR product. For example, if the tertiary PCR product was obtained only when both gene family matching tertiary primers were used, then the initial PCR product was created with both a sense and an anti-sense primer. The tertiary primers can also serve as sequencing primers and as amplification primers for cloning.
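The inference described above can be sketched as a small consistency check: a product is expected to re-amplify only when the tertiary mix contains a primer for each of its two ends. This is only an illustrative sketch; the labels "A", "B" and "R" (first anchor, second anchor, repeat primer) and the function names are assumptions introduced here.

```python
from itertools import combinations_with_replacement

END_TYPES = ("A", "B", "R")  # first anchor, second anchor, repeat primer


def amplifies(product_ends, tertiary_mix):
    """A product re-amplifies only if both of its ends are covered by the mix."""
    return set(product_ends) <= set(tertiary_mix)


def infer_composition(observations):
    """Return every end-pair consistent with the observed tertiary PCR results.

    observations maps a tuple of tertiary primers to True (band obtained)
    or False (no band).
    """
    return [
        ends
        for ends in combinations_with_replacement(END_TYPES, 2)
        if all(amplifies(ends, mix) == seen for mix, seen in observations.items())
    ]


# Hypothetical results for one gel-purified band.
results = {
    ("A", "B"): True,   # band only with both gene family tertiary primers
    ("A", "R"): False,
    ("B", "R"): False,
    ("R",): False,
}
print(infer_composition(results))
```

With these hypothetical results the only consistent answer is a product flanked by one primer from each gene family set, matching the example given in the text.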
Alternatively, performing tertiary PCR on the whole secondary PCR mix with a subset of the tertiary primers can intensify any selected group of PCR products from the secondary PCR. Thus, it can be performed on the secondary PCR with any combination of tertiary primers (one gene family tertiary primer and a repeat primer, a repeat primer alone, or two gene family tertiary primers), resulting in clear and strong PCR bands exhibiting the desired combination.
In addition, the gene family PCR DNA fingerprinting method can be modified and performed in combination with a DNA digestion method. In brief, in the existing method DNA is digested with restriction endonucleases (generally two), and double-stranded DNA adapters are ligated to the DNA fragments to generate template DNA for PCR amplification.
According to the method of the invention, a modification is suggested. The cleavage step is performed with only one restriction enzyme, which is not a frequent cutter, for example EcoRI. Ligation is then performed with the EcoRI adapter. The PCR reaction of the DNA fingerprinting is then performed with one primer that matches the ligated adapter and the pool of primers that matches the conserved sequences of the selected gene family.
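The single-enzyme cleavage step can be sketched in silico. The sketch below assumes the well-known EcoRI recognition site GAATTC, cut after the first G; the input sequence, the gene family primer site, and the function names are hypothetical placeholders. Only one strand is modeled, and adapter ligation is not simulated.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of the recognition site.

    cut_offset is the position of the cut within the site
    (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments


def fragments_with_primer_site(fragments, primer_site):
    """Keep fragments that contain a (hypothetical) gene family primer site."""
    return [f for f in fragments if primer_site in f]


genomic = "AAGAATTCCCGAATTCTT"          # hypothetical toy sequence
pieces = digest(genomic)
print(pieces)
print(fragments_with_primer_site(pieces, "TCCCG"))
```

Only fragments that carry both a ligated adapter end and a gene family primer site would be expected to amplify in the modified reaction.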
2. The use of the gene family conserved primers in combination with primers that match ubiquitous cis-acting regulatory elements
Other sequences that are widely dispersed in the genome are the ubiquitous cis-acting regulatory elements. For example, in plants the G-box (CACGTG) is such a ubiquitous element. Proteins known as G-box factors bind to G-boxes in a context-specific manner, mediating a wide variety of gene expression patterns. Other highly ubiquitous elements in animals include the GC box, the CAAT box, and the response elements to cAMP, AP2, HNF-1, glucocorticoid and retinoic acid (Table 1).
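How often such an element occurs in a given sequence can be checked with a simple scan. The following is only an illustrative sketch: the function name and the test sequence are hypothetical, and only the G-box core CACGTG quoted above is taken from the text. Because CACGTG is its own reverse complement, scanning one strand is sufficient for this particular element.

```python
import re


def find_motif(sequence, motif="CACGTG"):
    """Return the 0-based start positions of every (possibly overlapping)
    occurrence of the motif in the sequence, ignoring case."""
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() for m in pattern.finditer(sequence.upper())]


promoter = "ttCACGTGaaCACGTG"   # hypothetical toy sequence
print(find_motif(promoter))
```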
The prevalence of these sequences is not arbitrary, and they are more concentrated in the coding regions of the genomes of various organisms. Thus, although they are less common than the classical repetitive sequences, they are still present in high numbers in genomic coding sequences. As a result, primers matching them will be able to amplify DNA segments together with primers that anneal to adjacent gene family conserved sequences.
Performing the PCR using these primers together with the gene family conserved primers has the major advantage that the differential PCR products between the samples will contain information that might be significant not only in regard to a differential gene but also in regard to a differential regulatory element.
Technically, a mix of primers that match regulatory elements is used in place of one set of the gene family conserved primers. In the same manner, the secondary and tertiary PCR sets of primers should follow the scheme of Figure 1, with a mix of primers for the regulatory elements, having the same parts described in Figure 1, replacing one set of the gene family conserved region primers.
It is an option to add the six-nucleotide Sal I restriction site at the 5' end of these regulatory sequence primers, which offers two main advantages:
1. As the DNA fingerprinting might be performed with a pool of primers, it is difficult to perform PCR sequencing of a desired PCR product with that pool. The existence of a restriction site enables the ligation of an adapter to generate a template for further DNA amplification and sequencing.
2. As many of the primers that match the cis-regulatory sequences are too short, the six-nucleotide addition elongates the primers. It was found that in many cases the Sal I sequence, or part of it, complements the sequences of many cis-regulatory elements of plants with a higher degree of alignment than the sequences of many other restriction sites.
3. Coupling of DNA fingerprinting analysis and gene family expression pattern analysis
In a previous method we used sets of gene family conserved primers to characterize gene family expression patterns in various cells and tissues, in order to identify the similarities and dissimilarities between the gene expression patterns under different situations. It is a universal method that, for example, can identify and allow measurement of the expression of genes among the vast and diversified families and species of plants. According to the method, conserved pairs of primers are used in RT-PCR. Among other advantages, this overcomes the obstacle of species diversification. The various sets of primer pairs create gene expression kits for the various gene families. Each kit is composed of carefully planned sense and anti-sense conserved primers cross-reacting with each other, and in each PCR reaction only one pair of primers is used.
According to the method of the invention we offer, for the first time, a coupling between the results obtained by using gene family conserved primers in DNA fingerprinting and the expression pattern analysis. Basically, the method is aimed at identifying the parallel genes responsible for the expression pattern obtained.
According to the method, when a researcher finds that a specific PCR product is interesting in either the DNA fingerprinting analysis or the gene expression pattern analysis, then after sequencing the product from either source, he can identify the same PCR product in the other analysis. In this way a rapid switch can be made from the RNA level to the DNA level and vice versa.
According to the method of the invention, one can identify the identical gene in the two analyses by designing a blocking primer specific to the gene identified and sequenced from either the DNA fingerprinting or the gene expression analysis. In principle the primer can be either a sense or an anti-sense primer, according to the situation. In either case it must match a sequence in close proximity to the primer matching the conserved region, and it is synthesized in such a way that its 3' side cannot prime PCR. This side of the primer is either blocked by a dideoxy residue, or the last two nucleotides of the 3' end are purposely designed not to complement the template so that it cannot prime the PCR, or it is chemically modified by any other method. Thus, it anneals to the DNA and prevents the DNA polymerase from elongating the DNA by physically standing in its way. For example, if it is decided to use a sense-oriented primer to block the extension reaction primed by the upstream sense primer, it is designed to match a sequence in close proximity and upstream to the anti-sense primers of the gene family expression kit. In addition, the blocking primer must be added in high concentrations. As a result, a blocking primer designed against a gene identified in one analysis will block the creation of the corresponding product in the other analysis, thus revealing the desired PCR product.
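One of the blocking options named above, deliberately mismatching the last two 3' nucleotides, can be sketched as follows. This is only an illustrative sketch: the function name, the substitution table, and the input sequence are assumptions introduced here. The table swaps each base for one that would neither pair with the template nor form a G·T wobble pair; the dideoxy and other chemical modifications are not modeled.

```python
# Substitution chosen so the new base mismatches the template base that the
# original base would have paired with (and avoids G.T wobble pairing).
MISMATCH = {"A": "C", "C": "A", "G": "T", "T": "G"}


def blocking_primer(matching_primer, n_mismatch=2):
    """Return the primer (5' -> 3') with its last n_mismatch bases replaced
    by non-complementary bases, so that the 3' end cannot be extended."""
    seq = matching_primer.upper()
    head, tail = seq[:-n_mismatch], seq[-n_mismatch:]
    return head + "".join(MISMATCH[base] for base in tail)


gene_specific = "ACGTACGTAC"   # hypothetical gene-specific sequence
print(blocking_primer(gene_specific))
```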
For example, when an interesting difference is found in gene expression analysis between the same plant in two different situations, the method makes it possible to find the gene's band in the DNA fingerprint of the plant, which results in isolating part of the genomic DNA of this gene. When an interesting difference is found between the DNA fingerprints of two strains of a specific plant, it is possible to identify the PCR reaction expressing that gene, and to screen for its possible differential expression in various situations in the specific plant, or in different organs of that plant. Further, a differential expression pattern of this gene between strains with different DNA fingerprints might explain part of the phenotypic difference between these plant strains.
Finally, the system might help in identifying orthologous genes that are members of a gene family in different species. The following flow charts explain in more detail the necessary steps for obtaining data on differential genes and on orthologous genes between species. For identifying differential genes that are members of gene families as candidate polymorphic markers between different species, the following steps are required:
a. Identifying, within the DNA fingerprinting of several plant or animal species, a PCR fragment of different size in one of the species.
b. Isolating the differential PCR fragment in that species, sequencing it, and designing a blocking primer that matches a unique region in the gene of interest.
c. Repeating the DNA fingerprinting of the several plant or animal species as performed in section a in the presence of the blocking primer, and watching for a differential blocking effect of the blocking primer on the PCR product of the specific species.
d. Performing the gene expression PCR analysis of the various species, in several different plant or animal organs.
e. Performing the same PCR analysis as in section d in the presence of the blocking primer, and detecting that the expression of a differential PCR reaction between the species is now turned off or diminished in the specific species.
The same steps can be performed when differential genes are looked for between compared genotypes.
For example, if we examine the DNA fingerprints of one melon genotype and of one cucumber genotype, both belonging to the Cucurbitaceae family, by performing DNA fingerprinting with primer sets for a specific gene family, both equal-sized and different-sized PCR products are obtained. When we focus on the differential products and isolate the sequences of products that appear in the melon and not in the cucumber, we can design blocking primers to these genes and then investigate the expression patterns of genes in both plants in various organs. If the genes' expression pattern in the melon is such that a specific RT-PCR product that is highly expressed in fruit versus leaf is selectively blocked by the blocking primer, and there is no effect on the same RT-PCR product in the cucumber, this provides a clue to the function of the gene in melon fruit development.
An alternative to the blocking primer is a primer specific to a gene that is differentially expressed between the compared species. According to this method, at the second step the specific primer is used, instead of the blocking primer, together with either the set of sense or the set of anti-sense conserved primers of a gene family, in order to obtain a new differential PCR product typical of one of the compared species.
Still another advantage of using a blocking primer is that all the results previously obtained by PCR can be interpreted relative to the specific situation.
For identifying orthologous genes that are members of gene families between different species, the following steps are required:
a. Identifying, within the PCR DNA fingerprinting of several plant or animal species, PCR fragments having the same size.
b. Isolating a PCR fragment in one species, sequencing it, and designing a blocking primer that matches the sequence of the gene of interest in a region of a moderately conserved motif.
c. Repeating the DNA PCR fingerprinting of the several plant species as performed in section a in the presence of the blocking primer, and watching for a comparable effect of the blocking primer on the PCR fragments in the compared plant or animal species.
d. Performing the gene expression PCR analysis of the various species, in several different plant organs for each of the species.
e. Performing the same PCR reaction analysis as in section d in the presence of the blocking primer, and watching in the various species that the same PCR reactions, which also show the same pattern of expression in the various organs or situations, are turned off or diminished.
4. The use of the gene family conserved primers in micro-array technology
As micro-array technology becomes highly applicable, the use of micro-arrays is another approach to DNA fingerprinting. According to the method of the invention, the following highlights are described.
The application of the technique in micro-arrays is based on PCR products obtained for various genotypes using a set of gene family conserved primers together with a repeat primer. The probes are sequenced and are attached to the appropriate surface. Many sequences are obtained from many genotypes, reflecting the sequence variability. The probes thus distinguish between the various genotypes, as the genotypes vary not only in the size of the products but also in their base composition.
Probes can be obtained from different varieties or genotypes by performing PCR with primer sets from two conserved regions within a gene family. Although these PCR products are more homogeneous in size between the various strains, the base composition of the orthologous genes might be quite different and will thus result in different degrees of annealing of the DNA templates of the various strains to the micro-array probes.
5. Gene family DNA fingerprinting and cancer
The same approach of gene family DNA fingerprinting described in the previous sections can be applied to analyzing the changes that occur in the process of carcinogenesis. It is well known that cancer formation is accompanied by chromosomal changes such as translocations, rearrangements, amplifications, duplications and deletions. These changes can be monitored by various direct assays that stain the chromosomes, and by indirect methods including various DNA fingerprinting assays that inspect for consistent changes in the fingerprints. Applying the gene family DNA fingerprinting method to characterize the different DNA fingerprints of normal versus cancer cells of a tissue has the benefit of detecting typical and consistent changes at the genetic level that might provide an explanation based on a set of related genes.
This is possible for any gene family and might be more relevant to the kinase and G protein families. The method for characterizing the changes of tumor versus normal cells can also be performed using micro-arrays, as described before. It can be done either by using both sets of conserved primers of a gene family in the DNA fingerprinting step, or by using one set of conserved primers and a primer that matches a repeat sequence. Either way, the PCR reactions are performed over a wide range of samples of a specific cancer, including normal cells and various cancer cells of tumors at various cancer stages and grades. The various PCR products obtained from the various cells are then used as probes, by any method of probe production and attachment, and serve as gene family DNA fingerprinting probes in the micro-array.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1: Schematic presentation of three step primer design.
IA. Schematic presentation of a gene with a conserved domain, signified by the rectangular region, with two highly conserved regions A and B.
IB. The primary set of conserved primers A1-A3 represents various primer sequences that match the first conserved region, and the primary set of conserved primers B1-B3 represents various primer sequences that match the second conserved region.
IC. The first secondary set of primers is composed of the A1-A3 sequences, the C sequence, which represents the first common sequence added to all the primers of the set, and RS1, the added restriction site. The second set of secondary primers is composed of the B1-B3 sequences, the D sequence, which represents the second common sequence added to all the primers of the set, and RS2, the added restriction site.
ID. Describes the gene family tertiary primers.
Figure 2: The PCR DNA fingerprinting was performed on two genotypes of corn and one genotype of melon, using various combinations of the ISSR primer and the kinase conserved primers. The products were separated on a 2.5% NuSieve agarose gel. The sequence of the ISSR primer was (AC)8YT, where Y represents the bases C or T. Each triplet of lanes contained, from left to right, the corn genotypes A and B and the melon Pi genotype. Lanes 1-3: kinase group A sense and anti-sense primers. Lanes 4-6: the ISSR primer. Lanes 7-9: ISSR and kinase group B anti-sense primers. Lanes 10-12: ISSR and kinase group B sense primers. Lane 13: marker, pBR digested with AluI.
Clear differences between the two corn genotypes and the melon genotype are demonstrated in all triplets.
Differences are noticed between corn genotypes A and B using the ISSR and kinase group B anti-sense primers, lanes 7 and 8.
Figure 3: The PCR DNA fingerprinting was performed on two genotypes of corn and two genotypes of melon, using various combinations that include a ten-base random primer and the homeobox gene group B conserved primers. The products were separated on a 2.5% NuSieve agarose gel. The sequence of the ten-base random primer was CCACTCACCG. Each group of four lanes contained, from left to right, the corn genotypes A and B and the dul and Pi genotypes of the melon. Lanes 1-4: homeobox group B sense and anti-sense primers. Lanes 5-8: the ten-base random primer. Lanes 9-12: the ten-base random primer and homeobox group B sense primers. Lanes 13-16: the ten-base random primer and homeobox group B anti-sense primers. Lane 17: marker, pBR digested with AluI.
Differences are noticed between corn genotypes A and B using the mix of the ten-base random primer and the homeobox group B anti-sense primers, lanes 13 and 14.
Figure 4: PCR DNA fingerprinting was performed on the two strains of corn and the two strains of melon as in Figure 3, using a mix of a ten base random primer and the kinase group A conserved anti-sense primers. The sequence of the random primer was CGTCACAGAG.
Lane 1: the PBR AluI-digested marker. Lane 2: melon Pi. Lane 3: melon dul. Lane 4: corn strain A. Lane 5: corn strain B. Marked polymorphism appears between the species and also between the genotypes.
Figure 5: Multiple amino acid sequence alignment of the kinase catalytic domains of several genes, from subdomain VIb to subdomain XI. The core amino acid sequences of the primers match catalytic subdomain VIb (region 1) and catalytic subdomain IX (region 2). When the blocking primer is a sense primer, it should be designed to match sequences in close proximity to, and upstream of, the anti-sense primer sequence (region 3). When the blocking primer is an anti-sense primer, it should be designed to match sequences in close proximity to, and downstream of, the sense primer sequence (region 4). The choice of the sequence that the blocking primers match will have a major influence on the specificity of the blocking. For example, blocking primers from the core conserved region of subdomain VII containing the DFG motif (region 5) will block many genes, while primers from regions 1 and 2 will block the expression of fewer genes.
EXAMPLES
Several aspects of the invention will now be demonstrated by way of non-limiting examples.
Materials and methods
The primers used as a prototype in our experiments match sequences within the kinase conserved domain in plants, and conserved sequences within and adjacent to the homeodomain in the plant homeobox genes. The sequences of the primers that match the kinase domain are derived from catalytic subdomain VIb and subdomain IX, and contain the core amino acid motif HRDL/IK for the sense primers and DV/I MWS for the anti-sense primers.
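As an illustration of why several primer sequences may be needed to cover one conserved motif, the following sketch counts how many distinct nucleotide sequences could encode the sense-primer core motif. It rests on two assumptions that are not stated in the text: that HRDL/IK is read as H-R-D-(L or I)-K, and that the standard genetic code applies. The helper names expand_motif and coding_sequences are hypothetical.

```python
from itertools import product

# Number of codons per amino acid in the standard genetic code
# (only the residues needed for this motif are listed).
CODON_COUNT = {"H": 2, "R": 6, "D": 2, "L": 6, "I": 3, "K": 2}

def expand_motif(motif):
    """Expand 'HRDL/IK' into ['HRDLK', 'HRDIK']; 'X/Y' marks alternatives at one position."""
    positions = []
    i = 0
    while i < len(motif):
        options = motif[i]
        i += 1
        # Collect any further alternatives written as '/Y' after the residue.
        while i < len(motif) and motif[i] == "/":
            options += motif[i + 1]
            i += 2
        positions.append(options)
    return ["".join(p) for p in product(*positions)]

def coding_sequences(peptide):
    """Count the nucleotide sequences that can encode the peptide."""
    n = 1
    for aa in peptide:
        n *= CODON_COUNT[aa]
    return n

peptides = expand_motif("HRDL/IK")
for p in peptides:
    print(p, coding_sequences(p))
print("total", sum(coding_sequences(p) for p in peptides))
```

Under these assumptions the five-residue motif alone can be encoded by several hundred different 15-base sequences, which is consistent with using a set of primers, rather than a single primer, for each conserved region.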
The primers were subdivided into two groups:
In the kinase gene family, group A contained the sense primers numbered 3001-3006, 3013-3015, 3032-3033 and the anti-sense primers numbered 3051-3054, 3060-3062, 3081-3082. Group B contained the sense primers numbered 3006-3012, 3015-3018, 3034-3035 and the anti-sense primers numbered 3055-3057, 3063-3066, 3083-3084. Both groups contained an equal number of primers that match the above-described conserved sequences of various plant kinase gene subfamilies: MAPK, ERK, SNF-related kinases, calcium-dependent kinases and shaggy-like kinases. Within each group, similar sequences of the subfamilies were grouped together.
For the homeobox genes, the sense primers match conserved regions upstream of and adjacent to the homeodomain, and the anti-sense primers match conserved motifs within the homeodomain of the homeobox genes. The primers were designed to match conserved sequences in two large gene families: the Knotted-like and the leucine ZIP-like. The core amino acid motifs for the sense primers included ELK and KL/RRL respectively, and the core amino acid motifs for the anti-sense primers included NQRKR and QNRR respectively. In the homeobox gene family, group A contained the sense primers numbered 3201-3206, 3215-3221 and the anti-sense primers numbered 3251-3255, 3260-3264. Group B contained the sense primers numbered 3207-3212, 3222-3228 and the anti-sense primers numbered 3255-3259, 3265-3266, 3274-3275. Both groups contained an equal number of primers that match the above-described conserved sequences of the Knotted-like and the leucine ZIP-like plant homeobox gene subfamilies. Other groups of primers included the ISSR dinucleotide repeat primers and ten base random primers, which are described separately with the relevant figures.
PCR analysis conditions were as follows: PCR mixtures contained 30 ng of plant genomic DNA. Reaction mixtures contained the various reagents in the following final concentrations:
I. 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl2.
II. 0.2 µM of each PCR primer.
III. 0.2 mM of each of the four nucleotides A, C, G and T.
IV. 1.25 units of Taq polymerase.
The total volume of each PCR was 25-50 µL.
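The final concentrations listed above can be converted into absolute amounts per reaction. The sketch below is an illustration only; it uses the identity that a concentration in mM multiplied by a volume in µL gives an amount in nmol. It assumes the magnesium salt is MgCl2, and the names FINAL_CONC_MM, amount_nmol and reaction_amounts are hypothetical.

```python
# Final concentrations from the reaction mixture, expressed in mM.
FINAL_CONC_MM = {
    "Tris-HCl": 10.0,
    "KCl": 50.0,
    "MgCl2": 1.5,           # assumed formula for the magnesium salt
    "each dNTP": 0.2,
    "each primer": 0.0002,  # 0.2 uM expressed in mM
}

def amount_nmol(conc_mM, volume_uL):
    """Amount of substance in nmol (mM x uL = nmol)."""
    return conc_mM * volume_uL

def reaction_amounts(volume_uL):
    """Return the amount of each reagent, in nmol, for one reaction."""
    return {name: amount_nmol(c, volume_uL) for name, c in FINAL_CONC_MM.items()}

# The text gives a total reaction volume of 25-50 uL.
for volume in (25, 50):
    print(f"{volume} uL reaction:")
    for name, nmol in reaction_amounts(volume).items():
        print(f"  {name}: {nmol:g} nmol")
```

For a 25 µL reaction, for example, 0.2 µM of a primer corresponds to 5 pmol and 0.2 mM of a nucleotide corresponds to 5 nmol.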
The amplification program was as follows: 7 min at 94°C, followed by 30-35 cycles of 45 s at 94°C, 60 s at 45°C and 2 min at 72°C, on an Eppendorf thermocycler. PCR products were in most cases separated on 2.5% agarose gels stained with ethidium bromide.
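The amplification program can be written down as data to estimate its run time. The sketch below is illustrative and makes two assumptions: the 7 min step at 94°C is a single initial denaturation, and the three following steps are repeated in every cycle. Ramp times between temperatures are ignored, and the names used are hypothetical.

```python
# Single initial denaturation step (7 min at 94 C), in seconds.
INITIAL_DENATURATION_S = 7 * 60

# Steps repeated in every cycle: (description, duration in seconds).
CYCLE_STEPS_S = [
    ("denaturation, 94 C", 45),
    ("annealing, 45 C", 60),
    ("extension, 72 C", 120),
]

def program_seconds(cycles):
    """Total hold time of the program in seconds, excluding ramp times."""
    per_cycle = sum(duration for _, duration in CYCLE_STEPS_S)
    return INITIAL_DENATURATION_S + cycles * per_cycle

for cycles in (30, 35):
    total = program_seconds(cycles)
    print(f"{cycles} cycles: {total} s ({total / 60:.1f} min)")
```

Under these assumptions the hold times add up to roughly two hours for 30 cycles, and somewhat more for 35 cycles.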
Results
We conducted experiments aimed at checking the variability obtained by PCR when two different sets of sense and anti-sense primers from two gene families are used to amplify DNA from the same plant tissue, and when the same two sets are used to amplify DNA from different plants. In this way we could answer two questions: 1. Does the application of two sets of primers designed to the same core motifs of a gene family result in different DNA patterns for a plant? 2. Can one of the two sets, or both of them, differentiate between different plant species and even between different plant genotypes?
In our experiments these questions were addressed with regard to the use of the kinase or homeobox gene primer mix A or B alone, and in combination with the microsatellite dinucleotide repeat primer or the random ten base primers.
From the results, part of which are summarized in Figures 2-6, several interesting preliminary conclusions can be drawn regarding DNA fingerprinting:
1. The use of a mix of primers that match two conserved regions of a gene family differentiates between various plant species in all instances, and in many cases between the various genotypes.
2. In most instances the use of a mix of either sense or anti-sense primers that match a gene family conserved region, together with an ISSR primer or ten base random primers, resulted in a larger spectrum of PCR products as compared with the mix described in item 1.
3. The mixes described in item 2 could generally differentiate better between various genotypes than the mixes of item 1.
4. As explained above, for each gene family the primers were divided into two groups. These two groups of primers were derived from the same conserved regions in each family, and they also contained representative primers from the same subgroups of genes, though primers with similar sequences were grouped together. The two mixes resulted in different DNA fingerprints for the same plant in all of the cases examined, emphasizing the different annealing properties of the various primers, though all were derived from similar sequences (data not shown).
Table 1
Table 1 contains cis-regulatory sequences, first those of the animal kingdom and then those of the plant kingdom.