This application is a divisional of U.S. Utility application Ser. No. 09/772,134, filed Jan. 29, 2001, herein incorporated by reference in its entirety, which claims priority to U.S. Provisional Application Ser. No. 60/178,811, filed Jan. 28, 2000, herein incorporated by reference in its entirety.
The present invention relates to plant breeding and plant genetics. More particularly, the invention relates to soybean cyst nematode and soybean sudden death syndrome resistance genes, soybean cyst nematode and soybean sudden death syndrome resistant soybean lines, and methods of breeding and engineering the same.
Table of Abbreviations
| Abbreviation | Definition |
| AFLP | amplified fragment length polymorphism |
| BAC | bacterial artificial chromosome |
| bp | base pair |
| Cf | tomato genes for resistance to Cladosporium fulvus |
| FAM | 6-carboxyfluorescein |
| FI | female index of parasitism |
| indel | a nucleotide insertion or deletion |
| MMAS | molecular marker-assisted selection |
| QTL | quantitative trait loci |
| RAPD | random amplified polymorphic DNA |
| RFLP | restriction fragment length polymorphism |
| rhg1 and Rhg4 | genetic loci conferring resistance to Heterodera glycines |
| RIL | recombinant inbred line |
| SCN | soybean cyst nematode |
| SDS | sudden death syndrome |
| SSR | microsatellite |
| TAMRA | 6-carboxy-N,N,N′,N′-tetramethylrhodamine |
| TET | 6-carboxy-4,7,2′,7′-tetrachlorofluorescein |
Soybeans are a major cash crop and investment commodity in North America and elsewhere. Soybean oil is one of the most widely used edible oils, and soybeans are used worldwide both in animal feed and in human food production.
The soybean cyst nematode (SCN), Heterodera glycines, is a widespread pest of soybeans on the American continent. First reported in Japan more than 75 years ago and in North Carolina in 1954, SCN continues to spread toward almost all soybean-cultivated soils. A small plant-parasitic roundworm that attacks the roots of soybeans, it reproduces very quickly, survives in the soil for many years in the absence of a soybean crop, and can cause substantial soybean crop yield losses.
Resistant soybean varieties are an effective tool available for SCN management. There are multiple sources for soybean cyst nematode resistance genes in commercial soybean varieties (PI88788, Peking and PI209332), and several have been used to develop cultivars (Myers & Anand (1991) Euphytica 55:197-201; Rao-Arrelli et al. (1988) Crop Sci 28:650-652). All the described loci involved in the resistance to SCN are reported to be quantitative (Concibido et al. (1997) Crop Sci 37:258-264; Concibido et al. (1996) Theor Appl Genet. 93:234-241; Webb et al. (1995) Theor Appl Genet. 91:574-581; Rao-Arrelli et al. (1992) Crop Sci 32:862-864; Matthews et al. (1991) Soybean Genetics Newsletter; Rao-Arrelli et al., 1988). They differ by their chromosomal position (LG A2, G, B, I, F, J and E) and race of the pathogen against which they confer the resistance (e.g. Race 1, 3, 5 or 14). SCN resistance is simply inherited, but field resistance is oligogenic due to the existence of variation among SCN populations that are described as “races” (Riggs and Schmidt (1988) J Nematol 20:392-395).
One gene, rhg1, provides the major portion of resistance to SCN race 3 across many genotypes derived from Peking (Chang et al. (1997) Crop Sci 37:965-971; Matthews et al. (1998) Theor Appl Genet. 97:1047-1052; Mahalingam et al. (1995) Breed Sci 45:435-445), PI437654 (Prabhu et al. (1999) Crop Sci 39:982-987; Webb et al., 1995), PI88788 (Bell-Johnson et al. (1998) Soybean Genet Newslett 25:115-118; Concibido et al., 1997; Cregan et al. (1999a) Crop Sci 39:1464-1490; Cregan et al. (1999b) Theor Appl Genet. 99:811-818; Cregan et al. (1999c) Theor Appl Genet. 99:918-928), PI209332 (Concibido et al., 1996), or PI90763 (Concibido et al., 1997). A second gene for SCN resistance, Rhg4, provides an equal portion of resistance to SCN race 3 across genotypes derived from Peking (Chang et al., 1997; Matthews et al., 1998; Mahalingam et al., 1995) and PI437654 (Prabhu et al., 1999; Webb et al., 1995) but not PI88788, PI209332 or PI90763 (Concibido et al., 1996; Concibido et al., 1997). Cytological studies suggest PI437654 and Peking derived resistances share mechanisms (pronounced necrosis and cell wall appositions) not seen in PI88788 in response to race 3 (Mahalingam et al. (1996) Genome 39:986-998). These differences in mechanism may derive from distinct alleles at Rhg4, rhg1 and/or other defense associated loci.
DNA molecular markers linked to SCN/SDS resistance loci can be used to develop effective plant breeding strategies. In general, molecular markers are abundant, often co-dominant, and suitable for rapid screening at the seedling stage. Genetic linkage maps of soybean based on RFLP, RAPD, AFLP, and microsatellite markers have been described. See Brown et al. (1987) Principles and Practice of Nematode Control in Crops, pp 179-232, Academic Press, Orlando Fla.; Concibido et al., 1996; Concibido et al., 1997; Mahalingam et al., 1995; Meksem et al. (1999) Theor Appl Genet. 99:1131-1142; Meksem et al. (2000) Theor Appl Genet. 101:747-755; Webb et al., 1995; Weiseman et al. (1992) Theor Appl Genet. 85:136-138; Lark et al. (1993) Theor Appl Genet. 86:901-906; Shoemaker and Specht (1995) Crop Sci 35:436-446; Chang et al., 1997; Keim et al. (1997) Crop Sci 37:537-543.
All such markers have a limit of resistance trait predictability based principally on proximity of the marker to the resistance locus. In some cases, the interpretative value of genetic linkage experiments can be augmented through the simultaneous or serial detection of more than one genetic marker, although this also incurs additional time and resources. Thus, there is a need for a reliable cost-effective method for detecting SCN or SDS resistance using genetic markers. Optimally, a genetic marker comprises a resistance gene.
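The dependence of predictability on marker proximity can be illustrated with a minimal sketch. It assumes a single meiosis, recombination fractions between 0 and 0.5, and no crossover interference; the function names are hypothetical and are not part of the disclosure. It compares the chance that a single linked marker mispredicts the resistance allele with the chance that two flanking markers, both of the donor type, do so.

```python
def _check_fraction(r: float) -> None:
    # A recombination fraction lies between 0 (complete linkage) and 0.5 (no linkage).
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5]")


def single_marker_error(r: float) -> float:
    """Probability that a gamete carrying the marker allele has lost the
    linked resistance allele through one recombination event."""
    _check_fraction(r)
    return r


def flanking_marker_error(r1: float, r2: float) -> float:
    """Probability that the locus lying between two flanking markers is NOT
    donor type, given that both markers are donor type.

    Assumption (illustrative only): crossovers in the two intervals are
    independent, so the only way to lose the locus is a double crossover.
    """
    _check_fraction(r1)
    _check_fraction(r2)
    double_crossover = r1 * r2
    no_crossover = (1.0 - r1) * (1.0 - r2)
    return double_crossover / (double_crossover + no_crossover)


if __name__ == "__main__":
    # Example: one marker at r = 0.05 versus two markers flanking the locus,
    # each at r = 0.05 from it.
    print(single_marker_error(0.05))
    print(flanking_marker_error(0.05, 0.05))
```

Under these assumptions the flanking pair gives a much smaller misprediction probability than either marker alone, which is the sense in which detecting more than one marker adds interpretative value at the cost of additional assays.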
Therefore, it is of particular importance, both to soybean breeders and to farmers, to identify genetic loci for resistance to SCN and SDS. Having knowledge of the loci for resistance to SCN and SDS, those of ordinary skill in the art can breed or engineer SCN and SDS resistant soybeans. Soybean resistance can be further provided to a non-resistant cultivar in combination with other genotypic and phenotypic characteristics required for commercial soybean lines.
The present invention discloses an isolated and purified genetic marker associated with SCN/SDS resistance in soybeans, said marker mapping to linkage group G in the soybean genome. Preferably, the marker has a sequence identical to any one of SEQ ID NOs:1, 3, and 5. Representative corresponding markers associated with SCN/SDS susceptibility are set forth as SEQ ID NOs:2, 4, and 6.
Also disclosed is an isolated and purified genetic marker associated with SCN/SDS resistance in soybeans, said marker mapping to linkage group A2 in the soybean genome. Preferably, the marker has a sequence identical to any one of SEQ ID NOs:7, 9, and 11. Representative corresponding markers associated with SCN/SDS susceptibility are set forth as SEQ ID NOs:8, 10, and 12.
The present invention further provides a plant, or parts thereof, which evidences an SCN/SDS resistance response comprising a genome, homozygous with respect to genetic alleles which are native to a first parent and non-native to a second parent of the plant, wherein said second parent evidences significantly less resistant response to SCN/SDS than said first parent and said improved plant comprises alleles from said first parent that evidences resistance to SCN/SDS in hybrid combination in at least one locus selected from: a locus mapping to linkage group G and mapped by one or more of the markers set forth as SEQ ID NOs:1, 3, and 5, a locus mapping to linkage group A2 and mapped by one or more of the markers set forth as SEQ ID NOs:7, 9, and 11; or combinations thereof, said resistance not significantly less than that of the first parent in the same hybrid combination, and yield characteristics which are not significantly different than those of the second parent in the same hybrid combination.
In another embodiment, a plant of the present invention, or parts thereof, comprises the progeny of a cross between first and second inbred lines, alleles conferring SCN/SDS resistance being present in the homozygous state in the genome of one or the other or both of said first and second inbred lines such that the genome of said first and second inbreds together donate to the hybrid a complement of alleles necessary to confer the SCN/SDS resistance. Further disclosed are hybrid plants derived therefrom.
Also disclosed herein are an isolated and purified biologically active SCN/SDS resistance polypeptide and an isolated and purified nucleic acid molecule encoding the same. Preferably, the polypeptide comprises a soybean SCN/SDS resistance polypeptide. Chimeric genes comprising the isolated and purified nucleic acid molecules encoding a SCN/SDS resistance polypeptide are also provided.
In one embodiment, the nucleic acid molecule encoding a SCN/SDS resistance gene comprises an isolated soybean rhg1 gene that confers SCN/SDS resistance to a non-resistant host organism. The gene is capable of conveying Heterodera glycines-infestation resistance, Fusarium solani-infection resistance, or both Heterodera glycines-infestation resistance and Fusarium solani-infection resistance to a non-resistant plant germplasm, the gene located within a quantitative trait locus mapping to linkage group G and mapped by genetic markers of SEQ ID NOs:1, 3, and 5, said gene located along said quantitative trait locus between said markers. Preferably, the polypeptide comprises (a) a polypeptide encoded by a nucleic acid sequence set forth as SEQ ID NO:13; (b) a polypeptide encoded by a nucleic acid having homology to a DNA sequence set forth as SEQ ID NO:13; (c) a polypeptide encoded by a nucleic acid capable of hybridizing under stringent conditions to a nucleic acid comprising a sequence or the complement of a sequence set forth as SEQ ID NO:13; (d) a polypeptide which is a biologically functional equivalent of a peptide set forth as SEQ ID NO:14; or (e) a polypeptide comprising a fragment of a polypeptide of (a), (b), (c) or (d).
In another embodiment, the nucleic acid molecule encoding a SCN resistance polypeptide comprises an isolated soybean Rhg4 gene that is capable of conveying Heterodera glycines-infestation resistance to a non-resistant plant germplasm, said gene located within a quantitative trait locus mapping to linkage group A2 and mapped by the AFLP markers of SEQ ID NOs:7, 9, and 11, said gene located along said quantitative trait locus between said markers. Preferably, the nucleic acid molecule comprises any one of SEQ ID NOs:16-19.
The present invention further provides an isolated SCN/SDS resistance gene promoter region, or functional portion thereof, comprising an about 90 kb fragment of soybean genomic clone 73P6 between BamHI restriction sites and 21d9 between HindIII restriction sites. The genomic clone is available from the Forrest BAC library described in Meksem et al. (2000) Theor Appl Genet. 101 5/6:747-755, available through Southern Illinois University-Carbondale (Carbondale, Ill.), Texas A&M University BAC center (College Station, Tex.), and Research Genetics (Huntsville, Ala.). Preferably, the isolated promoter region comprises the nucleotide sequence of SEQ ID NO:15 or a sequence substantially similar to SEQ ID NO:15. The SCN/SDS resistance gene promoter region can be operably linked to a heterologous sequence.
A recombinant host cell comprising an isolated and purified nucleic acid molecule of the present invention is also disclosed, as is a transgenic plant having incorporated into its genome an isolated and purified nucleic acid molecule. In one embodiment, the nucleic acid molecule encodes a SCN/SDS resistance polypeptide and is present in said genome in a copy number effective to confer expression in the plant of the SCN/SDS resistance polypeptide. Seeds, parts or progeny of the transgenic plant are also disclosed.
Further provided is a method for detecting a nucleic acid molecule that encodes an SCN/SDS resistance polypeptide in a biological sample comprising nucleic acid material. The method comprises: (a) hybridizing an isolated and purified nucleic acid molecule of the present invention under stringent hybridization conditions to the nucleic acid material of the biological sample, thereby forming a hybridization duplex; and (b) detecting the hybridization duplex. Preferably, the isolated and purified nucleic acid molecule comprises any of SEQ ID NOs:13 and 16-19.
An assay kit for detecting the presence, in biological samples, of an SCN/SDS resistance polypeptide is also disclosed. In one embodiment, the kit comprises a first container that contains a nucleic acid probe identical or complementary to a segment of at least ten contiguous nucleotide bases of a nucleic acid molecule of the present invention, preferably a nucleotide sequence of any one of SEQ ID NOs:13 and 16-19. In another embodiment, the kit comprises a nucleic acid probe or primer identical to any one of SEQ ID NOs:1, 3, 5, 7, 9, and 11, or portion thereof.
A method for identifying soybean sudden death syndrome (SDS) resistance or soybean cyst nematode (SCN) resistance in a soybean plant using a SDS resistance gene, a SCN resistance gene, or DNA segments having homology to a SDS resistance gene or to an SCN resistance gene is also disclosed. In one embodiment, the method comprises: (a) probing nucleic acids obtained from the soybean plant with a probe derived from said SDS resistance gene or from said SCN resistance gene or from said DNA segment having homology to said SDS resistance gene or to said SCN resistance gene; and (b) observing hybridization of said probe to said nucleic acids, the presence of said hybridization indicating SDS or SCN resistance in said soybean plant. In another embodiment, the method comprises (a) detecting a molecular marker linked to a quantitative trait locus associated with SCN/SDS resistance, wherein the molecular marker is the sequence set forth as any one of SEQ ID NOs:1, 3, 5, 7, 9, and 11; and (b) determining the presence of SCN/SDS resistance as detection of the molecular marker and determining the absence of SCN/SDS resistance as failure to detect the molecular marker of (a).
A method of reliably and predictably introgressing SCN/SDS resistance genes into non-resistant soybean germplasm is also disclosed. The method comprises: using one or more nucleic acid markers for marker assisted selection among soybean lines to be used in a soybean breeding program, wherein the nucleic acid markers map to linkage groups G or A2 and wherein the nucleic acid markers are selected from among any of SEQ ID NOs: 1, 3, 5, 7, 9, and 11; and introgressing said resistance gene into said non-resistant soybean germplasm.
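The marker-assisted selection step can be sketched as a simple filter over marker calls. This is a minimal illustration, not the disclosed breeding protocol: the grouping of SEQ ID NOs by linkage group follows the text above, but the call codes ("R", "S", "H"), the data layout, and the rule that every scored marker in a required linkage group must carry the resistant allele are assumptions made for the example.

```python
# Resistance-associated markers by linkage group, as listed in the text.
MARKERS_BY_GROUP = {
    "G": ("SEQ ID NO:1", "SEQ ID NO:3", "SEQ ID NO:5"),
    "A2": ("SEQ ID NO:7", "SEQ ID NO:9", "SEQ ID NO:11"),
}


def select_lines(calls, groups=("G", "A2")):
    """Return the sorted names of lines to carry forward.

    calls: dict mapping a line name to a dict of {marker: call}, where a call
    is "R" (homozygous resistant allele), "S" (homozygous susceptible allele)
    or "H" (heterozygous). These codes are hypothetical.

    A line is kept only if, for every linkage group in `groups`, at least one
    marker was scored and every scored marker in that group is "R".
    """
    selected = []
    for line, marker_calls in calls.items():
        keep = True
        for group in groups:
            scored = [
                marker_calls[m] for m in MARKERS_BY_GROUP[group] if m in marker_calls
            ]
            if not scored or any(call != "R" for call in scored):
                keep = False
                break
        if keep:
            selected.append(line)
    return sorted(selected)


if __name__ == "__main__":
    example = {
        "RIL1": {"SEQ ID NO:1": "R", "SEQ ID NO:7": "R"},
        "RIL2": {"SEQ ID NO:1": "R", "SEQ ID NO:7": "S"},
    }
    print(select_lines(example))
```

Passing `groups=("G",)` restricts the filter to the linkage group G locus alone, which corresponds to selecting on one locus rather than on the combination.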
A soybean plant, or parts thereof, which evidences a SCN/SDS resistance response is also disclosed. The plant comprises a genome, homozygous with respect to genetic alleles which are native to a first parent and non-native to a second parent of the soybean plant, wherein said second parent evidences significantly less resistant response to SCN/SDS than said first parent, and said improved plant comprises alleles from said first parent that evidences resistance to SCN/SDS in hybrid combination of at least one locus selected from: a locus mapping to linkage group G and mapped by one or more of the markers set forth as SEQ ID NOs:1, 3, and 5, a locus mapping to linkage group A2 and mapped by one or more of the markers set forth in SEQ ID NOs:7, 9, and 11; or combinations thereof, said resistance not significantly less than that of the first parent in the same hybrid combination, and yield characteristics which are not significantly different than those of the second parent in the same hybrid combination.
The soybean plant, or parts thereof, can further comprise the progeny of a cross between first and second inbred lines, alleles conferring SCN/SDS resistance being present in a homozygous state in the genome of one or the other or both of said first and second inbred lines such that the genome of said first and second inbreds together donate to the hybrid a complement of alleles necessary to confer the SCN/SDS resistance. Thus, an SCN/SDS resistant hybrid, or parts thereof, formed with the soybean plant is also disclosed, as is a soybean plant, or parts thereof, formed by selfing the SCN/SDS resistant hybrid.
A method of positional cloning of a nucleic acid is also disclosed. The method comprises: (a) identifying a first nucleic acid genetically linked to a SCN/SDS resistance locus, wherein the first nucleic acid maps between two markers selected from SEQ ID NOs:1-12; and (b) cloning the first nucleic acid. Optionally, the first nucleic acid can comprise the rhg1 locus or the Rhg4 locus.
A method for producing an antibody that specifically recognizes a SCN/SDS resistance polypeptide is also disclosed. The method comprises (a) recombinantly or synthetically producing a SCN/SDS resistance polypeptide, or portion thereof; (b) formulating the polypeptide of (a) whereby it is an effective immunogen; (c) administering to an animal the formulation of (b) to generate an immune response in the animal comprising production of antibodies, wherein antibodies are present in the blood serum of the animal; and (d) collecting the blood serum from the animal of (c) comprising antibodies that specifically recognize a SCN/SDS resistance polypeptide. Also provided is an antibody produced by the disclosed method.
Methods for identifying a candidate compound as a modulator of SCN/SDS resistance activity are also disclosed. Such methods include but are not limited to cell-based assays of SCN/SDS resistance gene expression, assays of specific binding to SCN/SDS regulatory elements, and assays of specific binding to SCN/SDS polypeptides. Optionally, the screening methods are adapted to a high-throughput format.
In one embodiment, the method comprises: (a) exposing a cell sample to a candidate compound to be tested, the cell sample containing at least one cell containing a DNA construct comprising a modulatable transcriptional regulatory sequence of an SCN/SDS resistance-encoding nucleic acid and a reporter gene which is capable of producing a detectable signal; (b) evaluating an amount of signal produced in relation to a control sample; and (c) identifying a candidate compound as a modulator of SCN/SDS resistance activity based on the amount of signal produced in relation to a control sample.
The present invention also provides a method for identifying a substance that regulates SCN/SDS resistance gene expression using a chimeric gene that includes an isolated SCN/SDS resistance gene promoter region operably linked to a reporter gene. According to this method, a gene expression system is established that includes the chimeric gene and components required for gene transcription and translation so that reporter gene expression is assayable. To select a substance that regulates SCN/SDS resistance gene expression, the method further provides the steps of using the gene expression system to determine a baseline level of reporter gene expression in the absence of a candidate regulator; providing a plurality of candidate regulators to the gene expression system; and assaying a level of reporter gene expression in the presence of a candidate regulator. A candidate regulator is selected whose presence results in an altered level of reporter gene expression when compared to the baseline level. Preferably, the isolated SCN/SDS resistance gene promoter region used in this method comprises the sequence of SEQ ID NO:15, or functional portion thereof.
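The selection step of such a reporter screen can be sketched as a comparison against the baseline. This is a minimal illustration under stated assumptions: reporter signal is summarized by the arithmetic mean of replicate readings, and "altered" is taken to mean at least a 2-fold increase or decrease relative to baseline. The threshold, the function name, and the compound labels are hypothetical and are not specified in the disclosure.

```python
from statistics import mean


def select_regulators(baseline_reads, candidate_reads, min_fold=2.0):
    """Return {candidate: fold_change} for candidates that alter reporter signal.

    baseline_reads: replicate reporter readings with no candidate present.
    candidate_reads: dict mapping a candidate name to its replicate readings.
    min_fold: assumed cutoff; a candidate is kept if its mean signal is at
        least min_fold times the baseline or at most 1/min_fold times it.
    """
    baseline = mean(baseline_reads)
    if baseline <= 0:
        raise ValueError("baseline reporter signal must be positive")

    hits = {}
    for name, reads in candidate_reads.items():
        fold = mean(reads) / baseline
        if fold >= min_fold or fold <= 1.0 / min_fold:
            hits[name] = fold
    return hits


if __name__ == "__main__":
    hits = select_regulators(
        [100, 110, 90],
        {"compound_A": [310, 290], "compound_B": [105, 95], "compound_C": [40, 50]},
    )
    for name, fold in sorted(hits.items()):
        print(name, round(fold, 2))
```

A practical screen would normally add replicate-based statistics and plate controls; the fold-change cutoff is used here only to make the baseline comparison concrete.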
In another embodiment, the method comprises using an SCN/SDS regulatory sequence to identify a candidate substance that specifically binds to the regulatory sequence. According to the method, a SCN/SDS regulatory gene sequence is exposed to a candidate substance under conditions suitable for binding to a nucleic acid sequence, and a candidate regulator is selected that specifically binds to the SCN/SDS resistance gene promoter region. Preferably, the isolated SCN/SDS resistance gene promoter region used in this method comprises the sequence of SEQ ID NO:15, or functional portion thereof.
In another embodiment, a cell-free assay system is used and comprises: (a) exposing a SCN/SDS polypeptide of the present invention to a candidate compound; (b) assaying binding of the candidate compound to the SCN/SDS polypeptide; and (c) identifying a candidate compound as a putative modulator of SCN/SDS resistance activity based on specific binding of the candidate compound to the SCN/SDS polypeptide. Preferably, the SCN/SDS polypeptide comprises some or all of the amino acids of SEQ ID NO:14.
A method of modulating SCN/SDS resistance in a plant is also disclosed. The method comprises administering to the plant an effective amount of a substance that modulates expression of an SCN/SDS resistance activity-encoding nucleic acid molecule in the plant to thereby modulate SCN/SDS resistance in the plant. Preferably, the substance that modulates expression of an SCN/SDS resistance activity is discovered by a disclosed method of the present invention.
A method for providing a resistance characteristic to a plant is also disclosed. The method comprises introducing to said plant a construct comprising a nucleic acid sequence encoding an SCN/SDS resistance gene product operatively linked to a promoter, wherein production of the SCN/SDS resistance gene product in the plant provides a resistance characteristic to the plant. The construct can further comprise a vector selected from the group consisting of a plasmid vector and a viral vector. The SCN/SDS resistance gene product comprises a protein having an amino acid sequence of SEQ ID NO:14. The nucleic acid sequence comprises the nucleotide sequence of SEQ ID NO:13 or a nucleic acid that is substantially similar to SEQ ID NO:13, and which encodes an SCN/SDS resistance polypeptide.
The resistance characteristic is preferably nematode resistance, fungal resistance or combinations thereof. More preferably, the nematode resistance is H. glycines resistance, even more preferably race 3 H. glycines resistance.
In an alternative embodiment the construct further comprises another nucleic acid molecule encoding a polypeptide that provides an additional desired characteristic to the plant. Optionally, the method further comprises monitoring an insertion point for the construct in the plant genome; and providing for insertion of the construct into the plant genome at a location not associated with the resistance characteristic, the desired characteristic, or both the resistance and the desired characteristic. Preferably, the plant is a soybean plant.
The present invention also provides methods for providing a resistance characteristic to a plant, wherein a combination of genetic and non-genetic techniques is employed. The method comprises introducing to said plant a construct comprising a nucleic acid sequence encoding an SCN/SDS resistance gene product operatively linked to a promoter and provision of a substance that modulates SCN/SDS resistance gene activity, wherein production of the SCN/SDS resistance gene product in the plant, in combination with provision of the SCN/SDS resistance gene modulator, provides a resistance characteristic to the plant.
Accordingly, it is an object of the present invention to provide novel isolated polynucleotides and polypeptides relating to loci underlying resistance to soybean cyst nematode and soybean sudden death syndrome and methods employing same. The object is achieved in whole or in part by the present invention.
An object of the invention having been stated hereinabove, other objects and advantages will become evident as the description proceeds, when taken in connection with the accompanying Drawings and Examples as best described hereinbelow.
FIG. 1 depicts new AFLP genetic markers for SCN/SDS resistance.
FIG. 1A presents genomic sequences of both alleles (resistant Forrest and susceptible Essex) of the converted AFLP markers E ATG M CGA 87 (SEQ ID NOs:1-2); E CTA M AGG 113 (SEQ ID NOs:3-4); E CGG M AGA 116 (SEQ ID NOs:5-6); E CCG M AAC 405 (SEQ ID NOs:7-8); E CCC M ATG 161 (SEQ ID NOs:9-10); and E CCA M AGC 114 (SEQ ID NOs:11-12). The italicized and underlined sequences represent the forward and reverse sequence specific primers used. The bold capital sequences represent the original AFLP restriction site. The bold letters indicate the difference in sequences between the two alleles.
FIG. 1B presents genomic sequences of the two alleles (resistant and susceptible) of the converted E ATG M CGA 87 marker. The italic sequences represent the resistance specific TaqMan™ probe TMA5-RE and the susceptible allele specific probe TMA5-S. The standard font underlined sequences represent the TaqMan™ assay forward and reverse primers; the underlined italic sequence is the ATG4BACF primer used for sequence extension of the E ATG M CGA 87 marker; and the BAC derived extended sequences are in small font capitals.
FIG. 2 depicts AFLPs for selecting SCN/SDS resistance.
FIG. 2A shows PCR amplification products using E ATG M CGA 87 sequence specific primers TMA5 forward and reverse: Lane 1-40 represent 40 RIL DNA, 41 and 42 are the two parents. F: Forrest; E: Essex; 1: resistant allele; 2: susceptible allele; H: heterozygote lines. The PCR products were separated by electrophoresis on a 4% (w/v) Metaphor gel.
FIG. 2B shows a partial AFLP autoradiograph profile of the E CGG M AGA 116 marker. The six selective nucleotides step was replaced by MseI primer M AGAGACT and EcoRI primer E. Lane 7: Essex; Lane 8: Forrest; Lane 1 to 6 and 9 to 20 represent RIL DNA; 1: resistant allele; 2: susceptible allele.
FIG. 2C shows PCR amplification products using E CTA M AGG 113 sequence specific primers CTA forward and reverse: Lane 1-40 represent 40 RIL DNA, 41 and 42 are the two parents. F: Forrest; E: Essex; 1: resistant allele; 2: susceptible allele; H: heterozygote lines. The PCR products were separated by electrophoresis on a 4% (w/v) Metaphor gel.
FIG. 2D shows PCR amplification products using E CCG M AAC 405 sequence specific primers A2D8 forward and reverse: Lane 1-40 represent 40 RIL DNA, 41 and 42 are the two parents. F: Forrest; E: Essex; 1: resistant allele; 2: susceptible allele; H: heterozygote lines. The PCR products were separated by electrophoresis on a 4% (w/v) Metaphor gel.
FIG. 3 depicts a genetic and physical map showing the location of an Rhg4 gene relative to DNA markers. The location of the aspartokinase-homoserine dehydrogenase (AK-HSDH) and the A2D8 marker are indicated as determined by restriction mapping of BAC DNA. The A2D8 sequences for Essex and Forrest alleles are deposited in GenBank as Accession Nos. AF286701 and AF286700, respectively. The l locus (l) position was estimated by relation to BARC-SAT_162 (Cregan et al., 1999c). Genetic mapping shows Rhg4 and A2D8 are both within the interval shown by the horizontal line and within a large insert clone, 100B10, that contains a 140 kbp insert (Zobrist et al. (2000) Soybean Genet Newslett 27:10-15).
FIG. 4 depicts the gene structure of the rhg1 gene and clones derived from Forrest genomic DNA.
FIG. 5 depicts detection of the A2D8 marker polymorphism using the TaqMan™ assay and manual selection of genotypes. Eighty-six individuals from an F5 derived population of recombinant inbred lines from the cross of Essex x Forrest that segregate for resistance to SCN are shown.
FIG. 5A is an image of fluorescent signals viewed under the “dye component” field of the sequence detection software; the A2D8 genotypes were manually selected based on the ratio of FAM and TET signals. Allele 1 homozygous, Forrest type: FAM<<TET. Allele 2 homozygous, Essex type: TET<<FAM. Alleles 1 and 2 heterogeneous, Essex and Forrest type: TET less than 2-fold greater or lesser than FAM. Two selections were used. In the first (TaqMan™ assay 1), the group of genotypes with FAM 6-8 and TET 8-9 was considered susceptible. In the second (TaqMan™ assay 2), this group was considered heterogeneous.
FIG. 5B is a spreadsheet that contains scores (allele designations) for the samples as they were arranged in the 96 well plate. There was no DNA in wells E12, F12 and G12 (negative controls). There was Essex DNA in wells A1, C12 and D12. There was Forrest DNA in wells B2, A12 and B12. The RIL DNA was in wells A3 to H11 in order by row from RIL1-RIL86 except samples E1 (RIL3) and E6 (RIL43) that did not amplify. The RILs resistant to SCN had an index of parasitism FI<10% of the susceptible check resistant lines.
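The ratio rule described for FIG. 5A can be sketched as a small genotype-calling function. This is an illustration only: the figure legend describes manual selection, and it does not quantify "<<". The sketch assumes that a signal ratio of 2-fold or more in either direction counts as homozygous and that anything inside the 2-fold window counts as heterogeneous. The function name and the "no call" label are hypothetical.

```python
def call_a2d8_genotype(fam: float, tet: float, fold: float = 2.0) -> str:
    """Call an A2D8 genotype from FAM and TET end-point signals.

    Returns:
        "1"  allele 1 homozygous, Forrest type (FAM << TET)
        "2"  allele 2 homozygous, Essex type (TET << FAM)
        "H"  heterogeneous (TET within `fold` of FAM)
        "no call" if either signal is not positive (e.g. failed amplification)

    The 2-fold boundary for the homozygous classes is an assumption taken
    from the stated heterogeneous window, not a disclosed cutoff.
    """
    if fam <= 0 or tet <= 0:
        return "no call"
    ratio = tet / fam
    if ratio >= fold:
        return "1"
    if ratio <= 1.0 / fold:
        return "2"
    return "H"


if __name__ == "__main__":
    # Hypothetical (FAM, TET) signal pairs for three wells.
    for fam, tet in [(2.0, 9.0), (9.0, 2.0), (7.0, 8.5)]:
        print(fam, tet, call_a2d8_genotype(fam, tet))
```

Moving the `fold` parameter is one way to represent the two selections described for FIG. 5A, in which the same borderline group of signals was scored differently in TaqMan™ assay 1 and assay 2.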
FIG. 6 depicts detection of the A2D8 marker polymorphism by PCR amplification and gel electrophoresis of soybean genotypes. Seventy-eight individuals from an F5 derived population of recombinant inbred lines from the cross of Essex x Forrest that segregate for resistance to SCN are shown.
FIG. 6A is an image of fluorescent signals viewed under the “dye component” field of the sequence detection software and the A2D8 genotypes were manually selected based on the ratio of FAM and TET signals. Lanes 1 and 42, Essex; Lanes 2 and 41, Forrest; Lanes 3-40, RILs 1-38.
FIG. 6B is a picture of an ethidium-stained gel, showing resolution of gel electrophoresis markers. Lane 42, Essex; Lane 41, Forrest; Lanes 1-40, RILs 39-78. Asterisks indicate disagreements with the TaqMan™ assay 1.
FIG. 7A-B presents the rhg1 gene sequence (SEQ ID NO:13).
FIG. 7C presents the rhg1 polypeptide (SEQ ID NO:14).
FIG. 7D shows sequences producing significant alignments using BLAST analysis.
FIG. 7E-F is an alignment between rhg1 protein (SEQ ID NO:14) and Arabidopsis thaliana hypothetical protein T18N14.120 (GenBank Accession T46070).
Disclosed herein is the identification of AFLP markers that are genetically linked to the SCN/SDS resistance loci of Forrest. Further disclosed are purified and isolated SCN or SDS resistance genes, proximal sequences to SCN/SDS resistance genes, and SCN/SDS resistance-related genes.
The isolated and purified polynucleotide sequences disclosed herein can thus be used in a variety of applications pertaining to breeding and engineering soybeans having SCN and SDS resistance. For example, the isolated polynucleotides disclosed herein can be used in position-based or homology-based cloning of additional SCN/SDS resistance genes, including regulatory elements; in gene structure determination; in studies of genome organization and gene expression; in gene complementation experiments; in the isolation of additional DNA markers for gene manipulation and molecular marker assisted breeding; and in plant transformation and the production of transgenic plants.
The present invention also pertains to a soybean plant that is resistant to soybean cyst nematodes (SCN), and to methods of producing the same. In one embodiment, the method comprises stable transformation of a plant with an rhg1 gene, disclosed herein. In another embodiment, the method comprises introgression in soybean of a trait enabling the plant to resist soybean cyst nematode (SCN) infestation. Additionally, the present invention relates to a method of precise and accurate introgression of the genetic material conferring SCN resistance from one or more parent plants into the progeny.
The present invention also pertains to a soybean plant and methods of producing the same, which is resistant to soybean sudden death syndrome (SDS). In one embodiment, the method comprises stable transformation of a plant with an rhg1 gene, disclosed herein. In another embodiment, the method comprises introgression of the genetic material conferring SDS resistance from one or more parent plants into the progeny with precision and accuracy.
The invention differs from present technology in several regards. In one aspect, the present invention provides the first disclosure of the rhg1 gene sequence, thereby enabling transgenic approaches for providing SCN/SDS resistance. Further, the present invention provides a non-electrophoretic selection assay using nucleotide sequences of SCN/SDS resistance gene alleles. The disclosed nucleotide sequences of SCN/SDS resistance genes and associated genetic markers provide means for easily selecting resistant cultivars, for assembling many resistance genes in a single cultivar, for combining resistance genes in novel combinations, for identifying genes that confer resistance in new cultivars, and for predicting resistance in cultivars. The invention is used to improve selection for SDS and SCN resistance in soybean in breeding programs.
I. Traits
The terms “phenotype” and “trait” each refer to any observable property of an organism, produced by the interaction of the genotype of the organism and the environment. A phenotype can encompass variable expressivity and penetrance of the phenotype. Exemplary phenotypes include but are not limited to a visible phenotype, a physiological phenotype, a susceptibility phenotype, a cellular phenotype, a molecular phenotype, and combinations thereof. Preferably, the phenotype is related to SCN/SDS resistance. The term “susceptibility phenotype” refers to an increased capacity or risk for displaying a phenotype, e.g., a susceptibility to SCN/SDS infection.
The term “complex trait” as used herein refers to a trait that is not inherited as predicted by classical Mendelian genetics. A complex trait results from the interaction of multiple genes, each gene contributing to the phenotype. Complex traits can be continuous or show threshold penetrance. In the field, SCN/SDS resistance is inherited as a complex trait.
The term “quantitative trait” refers to a complex trait that can be assessed quantitatively. Quantitation entails measurement of a trait across a continuous distribution of values. SCN/SDS resistance is a quantitative trait.
The term “SCN/SDS resistance” or “SCN/SDS resistance trait” as used herein refers to a cellular or organismal capacity for resistance to nematode or fungal infection, or both. Preferably, the nematode resistance is Heterodera glycines (the organism that causes SCN in soybeans) resistance, even more preferably race 3 Heterodera glycines resistance. The fungal resistance is preferably Fusarium solani (the organism that causes SDS in soybeans)-infection resistance. SCN resistance can be assayed in the field or in the greenhouse by methods known in the art, including but not limited to determination of an SCN index of parasitism as disclosed in Example 2, Meksem et al. (1999), and U.S. Pat. No. 6,096,944. SDS resistance can be scored by determination of disease incidence, disease severity, and disease index values as disclosed in Hnetkovsky et al. (1996) Crop Sci 36(2):393-400, Njiti et al. (1996) Crop Sci 36:1165-1170; and Matthews et al. (1991).
The term “SCN/SDS resistance” is used herein for convenience to describe traits, transgenic plants, polynucleotides, and polypeptides of the present invention. Therefore, the resistance characteristic conveyed by the polynucleotides and polypeptides of the present invention refers to any resistance characteristic as set forth herein and as would be apparent to one of ordinary skill in the art after reviewing the disclosure of the present invention.
The term “molecular phenotype” refers to a detectable feature of molecules in a cell or organism. Exemplary molecular phenotypes include but are not limited to a presence of a genetic marker nucleotide sequence, a presence of a SCN/SDS resistance gene sequence, a level of gene expression, a splice selection, a level of protein, a protein type, a protein modification, a level of lipid, a lipid type, a lipid modification, a level of carbohydrate, a carbohydrate type, a carbohydrate modification, and combinations thereof. Methods for observing, detecting, and quantitating molecular phenotypes are well known to one skilled in the art. See Sambrook et al., eds. (1989) Molecular Cloning, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; Silhavy et al. (1984) Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; Ausubel et al. (1992) Current Protocols in Molecular Biology, John Wiley and Sons, Inc., New York, N.Y.; Landegren et al. (1988) Science 242:229-237; Bodanszky, et al. (1976) Peptide Synthesis, John Wiley and Sons, Second Edition, New York, N.Y.; Harlow and Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; Ochman et al. (1990) in PCR Protocols: A Guide to Methods and Applications, Innis et al. (eds.), pp. 219-227, Academic Press, San Diego, Calif.; Koduri and Poola (2001) Steroids 66(1):17-23; Regan et al. (2000) Anal Biochem 286(2):265-276; U.S. Pat. Nos. 6,096,555; 5,958,624; and 5,629,158.
II. Genetic Mapping
For genetic mapping, a representative population was generated as in Example 1. To detect genomic regions associated with resistance to SCN and resistance to SDS, the RILs were classified as Essex type or Forrest type for each marker. In some cases, SCN susceptibility and resistance were quantitatively determined according to a SCN female index (FI) of parasitism (Meksem et al., 1999) as described in Example 2. Markers were compared with SCN or SDS response scores by the F-test in analysis of variance (ANOVA) done with SAS (SAS Institute Inc., Cary, N.C., 1988). The probability of association of each marker with each trait was determined and a significant association was declared if P≦0.05 (unless noted otherwise in the text) since the detection of false associations is reduced in isogenic lines (Lander & Botstein (1989) Genetics 121:185-199; Paterson et al. (1990) Genetics 124:735-742).
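By way of non-limiting illustration, the marker-trait F-test described above can be sketched as a one-way ANOVA in which lines are grouped by marker class. The sketch below is an assumption-laden stand-in for the SAS analysis, not a reproduction of it: the function name `one_way_anova_f` and the FI scores are hypothetical, and only the F statistic and its degrees of freedom are computed. Converting F to a P value would additionally require the F distribution from a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    # Variation explained by marker class (between-group sum of squares).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residual variation inside each marker class (within-group sum of squares).
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical female index (FI) scores for lines carrying each marker type.
essex_type_fi = [62.0, 55.0, 71.0, 48.0]
forrest_type_fi = [12.0, 20.0, 9.0, 15.0]

f_stat, df_between, df_within = one_way_anova_f([essex_type_fi, forrest_type_fi])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F relative to the critical value for the stated degrees of freedom would indicate that the marker class explains a substantial part of the variation in the trait score.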
Selected pairs of markers were analyzed for non-additive interactions between the unlinked QTL, either by two-way ANOVA using the general linear model (PROC GLM) procedure (Chang et al. (1996) Crop Sci 36:965-971) or by Epistat (Chase et al. (1997) Theor Appl Genet 94:724-730). Non-additive interactions between markers which were significantly associated with SCN/SDS response were excluded when P≧0.05. Selected groups of markers were analyzed by multi-way ANOVA to estimate joint heritabilities for traits associated with multiple QTL. Joint heritability was determined from the R² term for the joint model in multi-way ANOVA.
Mapmaker-EXP 3.0 (Lander et al. 1987) was used to calculate map distances (cM, Haldane units) between linked markers and to construct a linkage map including traits as genes. The RIL (recombinant inbred line) and F3 self genetic models were used. The log10 of the odds ratio (LOD) for grouping markers was set minimally at 2.0, and maximum distance was set at 30 cM. Conflicts were resolved in favor of the highest LOD score after checking the raw data for errors. Marker order within groups was determined by comparing the likelihood of many map orders. A maximum likelihood map was computed with error detection. Trait data were used for QTL analysis (Webb et al. 1995; Chang et al. 1997). The data were subjected to ANOVA (SAS Institute Inc., Cary, N.C.) with mean separation by LSD (Gomez and Gomez, 1984). Graphs were constructed by Quattro Pro version 5.0 (Novell Inc., Orem, Utah).
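By way of non-limiting illustration, the two quantities named above, map distance in Haldane centimorgans and the LOD score, can be sketched as follows. The sketch assumes a deliberately simple two-point model in which each scored line is counted directly as recombinant or non-recombinant; the RIL and F3 self models used by Mapmaker are more involved and are not reproduced here. The function names and counts are hypothetical.

```python
import math


def haldane_cm(r):
    """Haldane map distance (cM) for a recombination fraction 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def two_point_lod(recombinants, total):
    """LOD = log10 likelihood at the observed r minus log10 likelihood at r = 0.5."""
    r = recombinants / total
    if r >= 0.5:
        return 0.0  # no evidence of linkage under this simple model
    log_linked = 0.0
    if recombinants > 0:
        log_linked += recombinants * math.log10(r)
    log_linked += (total - recombinants) * math.log10(1.0 - r)
    log_unlinked = total * math.log10(0.5)
    return log_linked - log_unlinked


# Hypothetical counts: 10 recombinant lines out of 100 scored for a marker pair.
lod = two_point_lod(10, 100)
distance = haldane_cm(10 / 100)
grouped = lod >= 2.0 and distance <= 30.0  # thresholds stated in the text
print(f"LOD = {lod:.2f}, distance = {distance:.1f} cM, grouped = {grouped}")
```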
III. Nucleotide Sequences of SCN/SDS Resistance Genes and Associated Genetic Markers
The nucleic acid molecules provided by the present invention include the isolated nucleic acid molecules of SEQ ID NOs:1-13 and 15-114, sequences substantially similar to sequences of SEQ ID NOs:1-13 and 15-114, conservative variants thereof, plant-expressible variants thereof, subsequences and elongated sequences thereof, complementary DNA molecules, and corresponding RNA molecules. The present invention also encompasses genes, cDNAs, promoters, chimeric genes, and vectors comprising disclosed SCN/SDS resistance gene and SCN/SDS resistance gene marker nucleic acid sequences.
III.A. General Considerations
The term “nucleic acid molecule” refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar properties as the reference natural nucleic acid. Unless otherwise indicated, a particular nucleotide sequence also implicitly encompasses conservatively modified variants thereof (e.g. degenerate codon substitutions), complementary sequences, subsequences, elongated sequences, as well as the sequence explicitly indicated. The terms “nucleic acid molecule” or “nucleotide sequence” can also be used in place of “gene”, “cDNA”, or “mRNA”. Nucleic acids can be derived from any source, including any organism.
The term “isolated”, as used in the context of a nucleic acid molecule, indicates that the nucleic acid molecule exists apart from its native environment and is not a product of nature. An isolated DNA molecule can exist in a purified form or can exist in a non-native environment such as a transgenic host cell.
The term “purified”, when applied to a nucleic acid, denotes that the nucleic acid is essentially free of other cellular components with which it is associated in the natural state. Preferably, a purified nucleic acid molecule is a homogeneous dry or aqueous solution. The term “purified” denotes that a nucleic acid or protein gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the nucleic acid is at least about 50% pure, more preferably at least about 85% pure, and most preferably at least about 99% pure.
The term “substantially identical”, in the context of two nucleotide or amino acid sequences, can also be defined as two or more sequences or subsequences that have at least 60%, preferably 80%, more preferably 90-95%, and most preferably at least 99% nucleotide or amino acid sequence identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms (described herein below under the heading Nucleotide and Amino Acid Sequence Comparisons) or by visual inspection. Preferably, the substantial identity exists in nucleotide sequences of at least 50 residues, more preferably in nucleotide sequence of at least about 100 residues, more preferably in nucleotide sequences of at least about 150 residues, and most preferably in nucleotide sequences comprising complete coding sequences.
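By way of non-limiting illustration, percent identity between two sequences that have already been aligned can be sketched as below. The alignment step itself is not shown, and the choice of denominator (all aligned columns except those that are gaps in both sequences) is an assumption, since different comparison programs count gapped columns differently. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments differing by one substitution and one gap.
identity = percent_identity("ACGTACGT-C", "ACGTTCGTAC")
print(f"{identity:.1f}% identical; meets the 60% floor: {identity >= 60.0}")
```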
In one aspect, polymorphic sequences can be substantially identical sequences. The term “polymorphic” refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population. An allelic difference can be as small as one base pair.
Another indication that two nucleotide sequences are substantially identical is that the two molecules specifically or substantially hybridize to each other under stringent conditions. In the context of nucleic acid hybridization, two nucleic acid sequences being compared can be designated a “probe” and a “target”. A “probe” is a reference nucleic acid molecule, and a “target” is a test nucleic acid molecule, often found within a heterogeneous population of nucleic acid molecules. “Target sequence” is synonymous with “test sequence”.
A preferred nucleotide sequence employed for hybridization studies or assays includes probe sequences that are complementary to or mimic at least an about 14 to 40 nucleotide sequence of a nucleic acid molecule of the present invention. Preferably, a probe comprises 14 to 20 nucleotides, or even longer where desired, such as 30, 40, 50, 60, 100, 200, 300, or 500 nucleotides or up to the full length of any of SEQ ID NOs:1-13, 15-114. Such fragments can be readily prepared by, for example, directly synthesizing the fragment by chemical synthesis, by application of nucleic acid amplification technology, or by introducing selected sequences into recombinant vectors for recombinant production. The phrase “hybridizing specifically to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex nucleic acid mixture (e.g., total cellular DNA or RNA). The phrase “binds substantially to” refers to complementary hybridization between a probe nucleic acid molecule and a target nucleic acid molecule and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired hybridization. Probe sequences can also hybridize specifically to duplex DNA under certain conditions to form triplex or other higher order DNA complexes. The preparation of such probes and suitable hybridization conditions are well known in the art.
“Stringent hybridization conditions” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization experiments such as Southern and Northern blot analysis are both sequence- and environment-dependent. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology - Hybridization with Nucleic Acid Probes part I chapter 2, Elsevier, New York, N.Y. Generally, highly stringent hybridization and wash conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. Typically, under “stringent conditions” a probe will hybridize specifically to its target subsequence, but to no other sequences.
The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the Tm for a particular probe. An example of stringent hybridization conditions for Southern or Northern Blot analysis of complementary nucleic acids having more than about 100 complementary residues is overnight hybridization in 50% formamide with 1 mg of heparin at 42° C. An example of highly stringent wash conditions is 15 minutes in 0.15 M NaCl at 65° C. An example of stringent wash conditions is 15 minutes in 0.2×SSC buffer at 65° C. (see Sambrook et al., 1989, for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example of medium stringency wash conditions for a duplex of more than about 100 nucleotides is 15 minutes in 1×SSC at 45° C. An example of low stringency wash for a duplex of more than about 100 nucleotides is 15 minutes in 4-6×SSC at 40° C. For short probes (e.g., about 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0-8.3, and the temperature is typically at least about 30° C. Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. In general, a signal to noise ratio of 2-fold (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.
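By way of non-limiting illustration, the relationship between a probe's melting temperature and a stringent working temperature about 5° C. below it can be sketched for short oligonucleotide probes. The sketch uses the Wallace rule of thumb (2° C. per A or T, 4° C. per G or C), which is an assumption introduced here for illustration and is not the method of the present disclosure; it is only a rough estimate for short probes and ignores salt and formamide concentration. The function names and the probe sequence are hypothetical.

```python
def wallace_tm_celsius(probe):
    """Rough Tm estimate for a short DNA probe: 2*(A+T) + 4*(G+C)."""
    probe = probe.upper()
    at_count = probe.count("A") + probe.count("T")
    gc_count = probe.count("G") + probe.count("C")
    if at_count + gc_count != len(probe):
        raise ValueError("probe must contain only A, C, G, and T")
    return 2.0 * at_count + 4.0 * gc_count


def stringent_temperature_celsius(probe, offset=5.0):
    """Working temperature set about `offset` degrees below the estimated Tm."""
    return wallace_tm_celsius(probe) - offset


# Hypothetical 14-nucleotide probe (8 A/T and 6 G/C residues).
probe = "ATGCATGCATGCAT"
print(wallace_tm_celsius(probe), stringent_temperature_celsius(probe))
```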
The following are examples of hybridization and wash conditions that can be used to clone homologous nucleotide sequences that are substantially identical to reference nucleotide sequences of the present invention: a probe nucleotide sequence preferably hybridizes to a target nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50° C. followed by washing in 2×SSC, 0.1% SDS at 50° C.; more preferably, a probe and target sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50° C. followed by washing in 1×SSC, 0.1% SDS at 50° C.; more preferably, a probe and target sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50° C. followed by washing in 0.5×SSC, 0.1% SDS at 50° C.; more preferably, a probe and target sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50° C. followed by washing in 0.1×SSC, 0.1% SDS at 50° C.; more preferably, a probe and target sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50° C. followed by washing in 0.1×SSC, 0.1% SDS at 65° C.
A further indication that two nucleic acid sequences are substantially identical is that proteins encoded by the nucleic acids are substantially identical, share an overall three-dimensional structure, are biologically functional equivalents, or are immunologically cross-reactive. These terms are defined further under the heading SCN/SDS Resistance Polypeptides herein below. Nucleic acid molecules that do not hybridize to each other under stringent conditions are still substantially identical if the corresponding proteins are substantially identical. This can occur, for example, when two nucleotide sequences are significantly degenerate as permitted by the genetic code.
The term “conservatively substituted variants” refers to nucleic acid sequences having degenerate codon substitutions wherein the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acid Res. 19:5081; Ohtsuka et al. (1985) J Biol Chem 260:2605-2608; Rossolini et al. (1994) Mol Cell Probes 8:91-98).
The term “plant-expressible variant” means a substantially similar sequence that has been modified to comprise a coding sequence (nucleotide sequence) that can be efficiently expressed by plant cells, tissue and whole plants. The art understands that a plant-expressible coding sequence has a GC composition consistent with good gene expression in plant cells, a sufficiently low CpG content so that expression of that coding sequence is not restricted by plant cells, and codon usage which is consistent with that of plant genes. Where it is desired that the properties of the plant-expressible SCN/SDS resistance gene are identical to those of the naturally occurring SCN/SDS resistance gene, the plant-expressible homolog will have an identical coding sequence or a substantially identical coding sequence.
The term “subsequence” refers to a sequence of nucleic acids that comprises a part of a longer nucleic acid sequence. An exemplary subsequence is a probe, described herein above, or a primer. The term “primer” as used herein refers to a contiguous sequence comprising about 8 or more deoxyribonucleotides or ribonucleotides, preferably 10-20 nucleotides, and more preferably 20-30 nucleotides of a selected nucleic acid molecule. The primers of the present invention encompass oligonucleotides of sufficient length and appropriate sequence so as to provide initiation of polymerization on a nucleic acid molecule of the present invention.
The term “elongated sequence” refers to an addition of nucleotides (or other analogous molecules) incorporated into the nucleic acid. For example, a polymerase (e.g., a DNA polymerase), e.g., a polymerase that adds sequences at the 3′ terminus of the nucleic acid molecule can be employed to prepare an elongated sequence. In addition, the nucleotide sequence can be combined with other DNA sequences, such as promoters, promoter regions, enhancers, polyadenylation signals, intronic sequences, additional restriction enzyme sites, multiple cloning sites, and other coding segments.
The term “complementary sequence”, as used herein, indicates two nucleotide sequences that comprise anti-parallel nucleotide sequences capable of pairing with one another upon formation of hydrogen bonds between base pairs. As used herein, the term “complementary sequences” means nucleotide sequences which are substantially complementary, as can be assessed by the same nucleotide comparison set forth above, or which are defined as being capable of hybridizing to the nucleic acid segment in question under relatively stringent conditions such as those described herein. A particular example of a complementary nucleic acid segment is an antisense oligonucleotide.
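By way of non-limiting illustration, the anti-parallel pairing described above can be sketched by computing a reverse complement. The sketch handles only perfect Watson-Crick complementarity between unambiguous DNA bases; substantial complementarity with mismatches is not modeled. The function names and the example sequences are hypothetical.

```python
# Translation table for Watson-Crick pairing of unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the anti-parallel strand that pairs with `sequence`."""
    return sequence.translate(_COMPLEMENT)[::-1]


def are_complementary(strand_a, strand_b):
    """True if the two strands pair perfectly in anti-parallel orientation."""
    return strand_a.upper() == reverse_complement(strand_b).upper()


print(reverse_complement("ATGCC"))
print(are_complementary("ATGCC", "GGCAT"))
```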
The present invention further includes vectors comprising the disclosed SCN/SDS resistance gene sequences, including plasmids, cosmids, and viral vectors. The term “vector”, as used herein refers to a DNA molecule having sequences that enable its replication in a compatible host cell. A vector also includes nucleotide sequences to permit ligation of nucleotide sequences within the vector, wherein such nucleotide sequences are also replicated in a compatible host cell. A vector can also mediate recombinant production of an SCN/SDS resistance gene polypeptide, as described further herein below.
Nucleic acids of the present invention can be cloned, synthesized, recombinantly altered, mutagenized, or combinations thereof. Standard recombinant DNA and molecular cloning techniques used to isolate nucleic acids are well known in the art. Exemplary, non-limiting methods are described by Sambrook et al., eds., 1989; by Silhavy et al., 1984; by Ausubel et al., 1992; and by Glover, ed. (1985) DNA Cloning: A Practical Approach, IRL Press, Ltd., Oxford, United Kingdom. Site-specific mutagenesis to create base pair changes, deletions, or small insertions is also well known in the art, as exemplified by publications such as Adelman et al. (1983) DNA 2:183 and Sambrook et al. (1989).
Nucleotide sequences of the present invention can be detected, subcloned, sequenced, and further evaluated by any measure well known in the art using any method usually applied to the detection of a specific DNA sequence including but not limited to dideoxy sequencing, PCR, oligomer restriction (Saiki et al. (1985) Bio/Technology 3:1008-1012), allele-specific oligonucleotide (ASO) probe analysis (Conner et al. (1983) Proc Natl Acad Sci USA 80:278), and oligonucleotide ligation assays (OLAs) (Landegren et al. (1988) Science 241:1007). Molecular techniques for DNA analysis have been reviewed (Landegren et al. (1988) Science 242:229-237).
| Table of Functionally Equivalent Codons | ||||
| Amino Acids | Codons | |||
| Alanine | Ala | A | GCA GCC GCG GCU | |
| Cysteine | Cys | C | UGC UGU | |
| Aspartic Acid | Asp | D | GAC GAU | |
| Glutamic acid | Glu | E | GAA GAG | |
| Phenylalanine | Phe | F | UUC UUU | |
| Glycine | Gly | G | GGA GGC GGG GGU | |
| Histidine | His | H | CAC CAU | |
| Isoleucine | Ile | I | AUA AUC AUU | |
| Lysine | Lys | K | AAA AAG | |
| Leucine | Leu | L | UUA UUG CUA CUC CUG CUU | |
| Methionine | Met | M | AUG | |
| Asparagine | Asn | N | AAC AAU | |
| Proline | Pro | P | CCA CCC CCG CCU | |
| Glutamine | Gln | Q | CAA CAG | |
| Arginine | Arg | R | AGA AGG CGA CGC CGG CGU | |
| Serine | Ser | S | AGC AGU UCA UCC UCG UCU | |
| Threonine | Thr | T | ACA ACC ACG ACU | |
| Valine | Val | V | GUA GUC GUG GUU | |
| Tryptophan | Trp | W | UGG | |
| Tyrosine | Tyr | Y | UAC UAU | |
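By way of non-limiting illustration, the table of functionally equivalent codons can be used to test whether two coding sequences are degenerate variants of one another, that is, whether they encode the same amino acid sequence. The sketch below encodes the standard genetic code for the twenty amino acids listed (serine as AGC, AGU, and the four UCN codons); stop codons are not in the table and are therefore rejected. The dictionary and function names and the example sequences are hypothetical.

```python
# One-letter amino acid code mapped to its functionally equivalent RNA codons.
CODONS_BY_AMINO_ACID = {
    "A": "GCA GCC GCG GCU", "C": "UGC UGU", "D": "GAC GAU", "E": "GAA GAG",
    "F": "UUC UUU", "G": "GGA GGC GGG GGU", "H": "CAC CAU", "I": "AUA AUC AUU",
    "K": "AAA AAG", "L": "UUA UUG CUA CUC CUG CUU", "M": "AUG", "N": "AAC AAU",
    "P": "CCA CCC CCG CCU", "Q": "CAA CAG", "R": "AGA AGG CGA CGC CGG CGU",
    "S": "AGC AGU UCA UCC UCG UCU", "T": "ACA ACC ACG ACU",
    "V": "GUA GUC GUG GUU", "W": "UGG", "Y": "UAC UAU",
}

# Invert the table: codon -> amino acid.
CODON_TO_AMINO_ACID = {
    codon: amino_acid
    for amino_acid, codons in CODONS_BY_AMINO_ACID.items()
    for codon in codons.split()
}


def translate(coding_sequence):
    """Translate an in-frame RNA or DNA coding sequence without stop codons."""
    rna = coding_sequence.upper().replace("T", "U")
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    try:
        return "".join(CODON_TO_AMINO_ACID[rna[i:i + 3]] for i in range(0, len(rna), 3))
    except KeyError as error:
        raise ValueError(f"codon not in table: {error.args[0]}") from None


def are_degenerate_variants(sequence_a, sequence_b):
    """True if both sequences encode the same amino acid sequence."""
    return translate(sequence_a) == translate(sequence_b)


print(translate("AUGGCAUCU"))
print(are_degenerate_variants("GCAUGC", "GCCUGU"))
```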
III.B. Genetic Markers
The term “genetic marker”, as used herein generally refers to a genetic locus, a phenotype conferred by a locus, or a nucleotide sequence residing at a locus, wherein the locus is genetically linked to a trait of interest. The term “genetically linked” as used herein refers to two or more loci that are predictably inherited together during random crossing or intercrossing. Quantitative linkage analysis is further described in the section Genetic Mapping herein above. Preferably, genetically linked loci are less than about 10 cM apart, more preferably less than about 5 cM apart, and even more preferably less than about 1 cM apart. Optimally, the genetic marker and the gene conferring a trait of interest comprise the same or overlapping nucleotide sequence.
An embodiment of the present invention comprises genetic markers associated with SCN resistance and SDS resistance that are isolatable from soybeans, and which are free from total genomic DNA. Disclosed herein are sequences of AFLP markers mapped in soybean to the chromosomal segments carrying rhg1 and SDS loci on molecular linkage group G and the Rhg4 locus on molecular linkage group A2. Representative markers for SCN/SDS resistance are set forth as SEQ ID NOs:1, 3, 5, 7, 9, and 11. Representative corresponding markers for SCN/SDS susceptibility are set forth as SEQ ID NOs:2, 4, 6, 8, 10, and 12.
AFLP bands were obtained as described in Example 3. From each AFLP band, 4-30 clones were sequenced (mean 15.6) depending on the sequence complexity of the originating band. The sequence analysis showed that each AFLP band can be composed of a number of different DNA sequences from fragments of identical size. A mean of 6 sequences per band with a range of 1-15 sequences per band was detected. From a single AFLP band only one sequence corresponded with the original AFLP marker. The other sequences were bands that shared not only the same size within 1-2 bp but also the same selective bases at the EcoRI and MseI sites (100%). Further, some of the cloned sequences from within a band shared between 6 to 15 bp in common to each side (EcoRI and MseI) of the original AFLP polymorphism (about 30% of bands).
To identify polymorphisms within the AFLP, the AFLP sequence was used to design primers to screen the Forrest BamHI BAC library by PCR. For example, E ATG M CGA 87 was a dominant AFLP band in coupling phase with the rhg1 locus, and screening with a E ATG M CGA 87 AFLP band primer yielded a single clone. Two internal primers were designed from the E ATG M CGA 87 resistant allele and DNA from the corresponding BAC was used as template to extend the sequence from the AFLP marker both up and down stream by sequencing. The sequence showed a single 5 bp indel underlay the polymorphic band and no SNPs were present. As used herein, an “indel” refers to a nucleotide insertion or a deletion (FIG. 1B). No additional polymorphisms were detected in about 1,250 bp of flanking sequence.
Sequence comparison of the resistant and susceptible alleles of the co-dominant AFLP marker E CTA M AGG 113 found polymorphisms including both indels and SNPs. There were 4 SNPs within 113 bp and 1 indel (21 bp) (FIG. 1A). Primer sets were designed around the indel site and used to map the genetic position. The genetic position of the identified indel mapped to the region of the original AFLP.
Sequence comparison of the resistant and susceptible alleles of the dominant AFLP marker E CCC M ATG 161 found SNP polymorphism. There were 2 SNPs within 116 bp (FIG. 1A). Primer sets were designed around the SNP site and used to map the genetic position. The genetic position of the identified SNP mapped to the region of the original AFLP.
Sequence comparison of both resistant and susceptible alleles of the dominant AFLP marker E CCA M AGC 114 found SNP polymorphism adjacent to the EcoRI site. There was 1 SNP within 114 bp (FIG. 1A).
Sequence comparison of resistant and susceptible alleles of the co-dominant AFLP marker E CCG M AAC 405 found polymorphisms including both indels and SNPs. There were 2 indels (12 bp and 4 bp) and 4 SNPs within 405 bp (FIG. 1A). The 4 bp indel was two AG repeats in an [AG] 5 complex micro-satellite sequence. Primer sets were designed around both indel sites and used to map the genetic position. In both cases, the genetic position of the identified indel mapped to the region of the original AFLP.
For the AFLP marker E CGG M AGA 116, the polymorphisms were found adjacent to both the EcoRI and MseI restriction sites (FIG. 1A). The six selective nucleotide step was replaced by M AGAGACT and E C . Using this primer set, both the detection of the polymorphism on sequencing gels and the mapping of this sequence to the same location as the original AFLP were successful (FIG. 2B). There was 1 indel (2 bp) and 1 SNP within 116 bp (FIG. 1A). The 2 bp indel was the [A] 2 extension of an [A] 8 repeat. Primer sets were designed around the indel and SNP sites and used to map their genetic positions. In both cases, the genetic position of the identified polymorphism was identical to the region of the original AFLP.
Comparison of both alleles of the AFLP marker E CCG M AAC 405 provided four SNPs, two indels and one SSR. The insertion of [AG] 2 in the [AG] 8 repeat of the resistance allele created a microsatellite polymorphism that was designated SIUC-SAG405 by the present co-inventors. The difference of 4 bp between the two alleles at position 224 bp to 228 bp was enough to discriminate between the resistant and susceptible allele after electrophoresis through a 4% (w/v) MetaPhor® agarose gel. The 12 bp indel at 42 bp to 54 bp was used to design a sequence specific PCR marker (FIG. 2D), and to develop a TaqMan™ assay for the Rhg4 locus. SNPs were also found within E CCG M AAC 405. The substitutions of T at position 327 in the resistant allele to C at position 337 in the susceptible allele, and of A at position 358 in the resistant allele to C at position 366 in the susceptible allele, can also be used for high-throughput SNP-based screening assays.
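By way of non-limiting illustration, the comparison of a resistant and a susceptible allele to locate SNPs and indel columns can be sketched as below, assuming the two allele sequences have already been aligned with gap characters. Each SNP is also labeled as a transition (purine to purine or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine or the reverse). The sequences shown are hypothetical and are not the disclosed marker sequences.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(base_a, base_b):
    """Label a base substitution as a transition or a transversion."""
    pair = {base_a, base_b}
    if len(pair) != 2 or not pair <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def compare_alleles(aligned_resistant, aligned_susceptible, gap="-"):
    """Return (snps, indel_columns) using 1-based alignment column numbers."""
    if len(aligned_resistant) != len(aligned_susceptible):
        raise ValueError("aligned alleles must have equal length")
    snps, indel_columns = [], []
    pairs = zip(aligned_resistant.upper(), aligned_susceptible.upper())
    for column, (res, sus) in enumerate(pairs, start=1):
        if res == sus:
            continue
        if gap in (res, sus):
            indel_columns.append(column)
        else:
            snps.append((column, res, sus, classify_substitution(res, sus)))
    return snps, indel_columns


# Hypothetical aligned alleles: two SNPs and a 2 bp indel.
snps, indel_columns = compare_alleles("ACGTAGAGTA", "ACGC--AGTC")
print(snps)
print(indel_columns)
```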
An indel of 21 bp was responsible for the polymorphism at the E CTA M AGG 113 AFLP locus between Essex and Forrest. PCR based markers were designed to flank the 21 bp indel and shown to be polymorphic. The new marker was named CTA (FIG. 2C).
In the E ATG M CGA 87 marker the insertion of CTTAT to form a tandem repeat in the Forrest allele at position 20 bp to 25 bp created a 5 bp polymorphism that was suitable for marker development. PCR primers were designed to develop a sequence specific PCR assay (FIG. 2A). The new marker was named ATG4. The same indel was used to develop a TaqMan™ probe named TMA5 to discriminate between the two alleles.
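By way of non-limiting illustration, discrimination of two alleles that differ by a tandem duplication of a short motif, such as the CTTAT insertion described above, can be sketched by counting tandem copies of the motif in a sequenced amplicon. The flanking sequences below are hypothetical placeholders and are not the disclosed marker sequence; only the CTTAT motif and the 5 bp size difference are taken from the text.

```python
import re


def max_tandem_copies(sequence, motif):
    """Largest number of back-to-back copies of `motif` found in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = (len(match.group(0)) // len(motif) for match in pattern.finditer(sequence))
    return max(runs, default=0)


def call_allele(sequence, motif="CTTAT"):
    """Call the allele type from the tandem copy number of the motif."""
    copies = max_tandem_copies(sequence.upper(), motif)
    if copies >= 2:
        return "Forrest-type (insertion present)"
    if copies == 1:
        return "Essex-type (no insertion)"
    return "undetermined"


# Hypothetical amplicons sharing the same flanks around the motif.
LEFT_FLANK, RIGHT_FLANK = "GATCCAAGGT", "GGCATTGACC"
essex_like = LEFT_FLANK + "CTTAT" + RIGHT_FLANK
forrest_like = LEFT_FLANK + "CTTATCTTAT" + RIGHT_FLANK

print(call_allele(essex_like), len(essex_like))
print(call_allele(forrest_like), len(forrest_like))
```

The same 5 bp length difference is what a size-based PCR assay would resolve on a gel, whereas a probe-based assay would instead target the junction created by the insertion.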
The genetic markers of the present invention can be used to reliably select SCN/SDS resistance, as described herein.
III.C. SCN/SDS Resistance Genes
The term “gene” refers broadly to any segment of DNA associated with a biological function. A gene encompasses sequences including but not limited to a coding sequence, a promoter region, a cis-regulatory sequence, a non-expressed DNA segment, a non-expressed DNA segment that contributes to gene expression, a DNA segment designed to have desired parameters, or combinations thereof. A gene can be obtained by a variety of methods, including cloning from a biological sample, synthesis based on known or predicted sequence information, and recombinant derivation of an existing sequence.
The term “gene” thus includes an isolated soybean rhg1 and SDS resistance gene as disclosed herein (FIG. 3). The gene is capable of conveying Heterodera glycines-infestation resistance or Fusarium solani-infection resistance to a non-resistant soybean germplasm, the gene located within a quantitative trait locus mapping to linkage group G and mapped by genetic markers of SEQ ID NOs:1-6, said gene located along said quantitative trait locus between said markers. Positional cloning methods were used to isolate genomic sequences in the chromosomal regions of Forrest that confer SCN/SDS resistance, as further described in Example 4. Specifically, rhg1 sequences were derived from BAC clones 21D9 and 73P6 of the Forrest BamHI or HindIII BAC libraries (Meksem et al., 2000). Preferably, the gene comprises the nucleotide sequence set forth as SEQ ID NO:13 (FIG. 7A-B). BLASTP analysis of the conceptual translation of the rhg1 gene (FIG. 7C), set forth as SEQ ID NO:14, shows high homology to the T46070 GenBank entry described as hypothetical protein T18N14.120 from Arabidopsis thaliana (FIG. 7E-F), high homology to the rice Xa21 disease resistance gene encoding a leucine-rich repeat protein, and high homology to the tomato CF-2 gene for resistance to Cladosporium fulvus (FIG. 7D).
The rhg1 sequences disclosed herein can also be used to isolate rhg1 cDNAs according to methods well-known in the art. A representative rhg1 partial cDNA is set forth as SEQ ID NO:122. This segment of the rhg1 gene shows homology to the leucine-rich regions of the Arabidopsis hypothetical protein T18N14.120 (GenBank T46070) and tomato CF-2 resistance genes.
For example, the term “gene” also includes an isolated soybean Rhg4 gene. The gene is capable of conveying Heterodera glycines -infestation resistance to a non-resistant soybean germplasm, said gene located within a quantitative trait locus mapping to linkage group A2 and mapped by the AFLP markers of SEQ ID NOs:6-12, said gene located along said quantitative trait locus between said markers. Preferably, the gene comprises a nucleotide sequence set forth as any one of SEQ ID NOs:16-19.
Genes underlying quantitative traits, or genes with related function, such as disease resistance, are often organized in clusters within the genome (e.g., Staskawicz (1995) Science 268:661-667). In the case of SCN/SDS resistance, previous studies by the co-inventors of the present invention have suggested that the resistance trait in Forrest may be caused by four genes in a cluster with two pairs in close linkage or by a two-gene cluster with each gene displaying pleiotropy (Meksem et al., 1999). Thus, genomic DNA isolated and disclosed herein comprises multiple resistance gene sequences. Additional sequences derived from the SCN/SDS resistance locus are set forth as SEQ ID NOs:20-66. BLASTX analysis of these sequences reveals further homology to known proteins in other organisms, supporting that they comprise new partial gene sequences (Table 1). Of particular interest, BLASTX analysis of the sequences set forth as SEQ ID NOs:67-114 reveals that several of the disclosed sequences have high homology to the T46070 GenBank entry described as hypothetical protein T18N14.120 from Arabidopsis thaliana , high homology to the tomato CF-2 disease resistance genes encoding leucine-rich repeat proteins, and to the tomato CF-9 gene for resistance to Cladosporium fulvus (Table 1).
The present invention also pertains to resistance genes related to rhg1 and Rhg4. Partial cDNAs of additional putative SCN/SDS resistance genes, set forth as SEQ ID NOs:67-114, were identified based on hybridization to rhg1 and Rhg4 sequences, as further described in Example 5. BLASTX analysis of these sequences reveals further homology to known proteins in other organisms, supporting that they comprise new partial gene sequences (Table 2). Of particular interest, BLASTX analysis of the sequences set forth as SEQ ID NOs:67-114 reveals that several of the disclosed sequences have high homology to the T46070 GenBank entry described as hypothetical protein T18N14.120 from Arabidopsis thaliana , high homology to the tomato CF-2 disease resistance genes encoding leucine-rich repeat proteins, and to the tomato CF-9 gene for resistance to Cladosporium fulvus (Table 2). Based on their hybridization to rhg1 and Rhg4 sequences, genes comprising any of SEQ ID NOs:67-114 may also confer resistance to race 3 Heterodera glycines . It will be apparent to one having ordinary skill in the art that the disclosed sequences, or portion thereof, can be used to identify, confirm and/or screen for SDS, SCN and/or other resistance or for loci that confer SDS, SCN and/or other resistance.
| TABLE 1 | ||||||
| SEQ ID NO. | inventor's reference | best BLAST hit (ACCESSION) | Score (bits) | E value | Identities | Positives |
| 20 | III-00_F2-3RCF1900-2450 | T47727 | 230 | 9e−60 | 114/170 (67%) | 134/170 (78%) |
| 21 | III-01_21d9A1, 1A1 | no significant similarity | ||||
| 22 | III-01_21d9A2, 11F11Rlaccase | AC007063 | 97 | 1e−19 | 62/166 (37%) | 92/166 (55%) |
| 23 | III-01_21d9A2, 4A4Mic | no significant similarity | ||||
| 24 | III-01_CMG, smalF1-1F | T46070 | 67 | 4e−13 | 49/147 (33%) | 62/147 (41%) |
| 25 | III-02_21d9A2, 12A12FNaH+hypoth | T00576 | 67 | 2e−10 | 57/188 (30%) | 87/188 (45%) |
| 26 | III-02_F3-1RCF2000-2500 | T46070 | 170 | 7e−42 | 79/105 (75%) | 93/105 (88%) |
| 27 | III-03_21d9A1, 1E1Flaccase | AC007020 | 61 | 1e−08 | 37/65 (56%) | 43/65 (65%) |
| 28 | III-03_21d9A2, 12A12RNaH+hypothet | AC007063 | 116 | 2e−25 | 61/165 (36%) | 95/165 (56%) |
| 29 | III-03_21d9A2, 4B4ESTM | no significant similarity | ||||
| 30 | III-03_21d9A2, 8F8CF1a | T47727 | 187 | 5e−48 | 95/142 (66%) | 106/142 (73%) |
| 31 | III-03_21d9A2, 8F8CFHomol | T47727 | 177 | 5e−45 | 90/132 (68%) | 100/132 (75%) |
| 32 | III-03_CMG, smalF1-3FCF300-1100 | T46070 | 107 | 4e−27 | 67/189 (35%) | 89/189 (46%) |
| 33 | III-03_F3-2R1800-Cterm | T47727 | 201 | 1e−64 | 97/129 (75%) | 113/129 (87%) |
| 34 | III-04_21d9A1, 1E1R | no significant similarity | ||||
| 35 | III-04_21d9A2, 1B1 | no significant similarity | ||||
| 36 | III-04_21d9A2, 6D6mic | no significant similarity | ||||
| 37 | III-05_21d9A1, 1C1GmxLaccase | AB010692 | 153 | 2e−36 | 80/124 (64%) | 90/124 (72%) |
| 38 | III-05_21d9A2, 4C4CFHomol | T46070 | 125 | 6e−28 | 65/106 (61%) | 72/106 (67%) |
| 39 | III-06_21d9A2, 11A11laccasegene | AC007020 | 67 | 3e−12 | 30/49 (61%) | 35/49 (71%) |
| 40 | III-07_21d9A1, 2A2F | no significant similarity | ||||
| 41 | III-08_21d9A1, 2A2R | no significant similarity | ||||
| 42 | III-08_21d9A2, 6F6 | no significant similarity | ||||
| 43 | III-09_21d9A1, 1E1 | no significant similarity | ||||
| 44 | III-09_21d9A1, 2D2FNaH+hypothe | AC007063 | 84 | 9e−17 | 44/127 (34%) | 74/127 (57%) |
| 45 | III-09_21d9A2, 4E4Laccase | AC007020 | 90 | 1e−32 | 43/53 (81%) | 46/53 (86%) |
| 46 | III-09_21d9A2, 9A9 | no significant similarity | ||||
| 47 | III-10_21d9A2, 11C11 | T47325 | 53 | 3e−06 | 45/132 (34%) | 65/132 (49%) |
| 48 | III-10_21d9A2, 11C11hypothetical | T47325 | 53 | 3e−06 | 45/132 (34%) | 65/132 (49%) |
| 49 | III-11_21d9A1, 1F1SatAT | no significant similarity | ||||
| 50 | III-11_21d9A2, 4A4F | no significant similarity | ||||
| 51 | III-11_21d9A2, 4F4SatTA | no significant similarity | ||||
| 52 | III-12_21d9A2, 1F1NaHexchangine | AC007063 | 126 | 3e−28 | 72/181 (39%) | 108/181 (58%) |
| 53 | III-12_21d9A2, 4A4RSatTAGA | no significant similarity | ||||
| 54 | III-13_21d9A1, 1G1NaHexchanHypothe | T00576 | 50 | 2e−05 | 31/83 (37%) | 44/83 (52%) |
| 55 | III-13_21d9A1, 8D8CF500-1000 | T46070 | 84 | 4e−24 | 48/127 (37%) | 66/127 (51%) |
| 56 | III-13_21d9A2, 4B4FSatGAAAA | no significant similarity | ||||
| 57 | III-14_21d9A2, 11E11GmxEST | no significant similarity | ||||
| 58 | III-14_21d9A2, 1G1 | no significant similarity | ||||
| 59 | III-15_21d9A1, 8E8 | no significant similarity | ||||
| 60 | III-15_21d9A2, 4C4FCF1600-1000 | T46070 | 158 | 6e−38 | 99/215 (46%) | 113/215 (52%) |
| 61 | III-15_21d9A2, 9D9NaHlonexch | AC007063 | 64 | 1e−09 | 38/118 (32%) | 59/118 (49%) |
| 62 | III-16_21d9A1, 11D11laccase | CAA74104 | 82 | 4e−17 | 35/49 (71%) | 43/49 (87%) |
| 63 | III-16_21d9A2, 11F11MicSatTA | no significant similarity | ||||
| 64 | III-16_21d9A2, 4C4R300-1000 | T46070 | 110 | 3e−32 | 67/178 (37%) | 86/178 (47%) |
| 65 | III-17_21d9A1, 2A2SatGA | no significant similarity | ||||
| 66 | III-17_21d9A1, 2A2SatTAA | no significant similarity | ||||
| 73 | II-01F2-4RCf1900-2400 | T46070 | 187 | 6e−47 | 99/183 (54%) | 123/183 (67%) |
| TABLE 2 | ||||||
| SEQ ID NO. | inventor's reference | best BLAST hit (ACCESSION) | Score (bits) | E value | Identities | Positives |
| 67 | 3A Cf2 homologues to the +2ORF clone ID: 07d9 | T47727 | 189 | 4e−47 | 103/215 (47%) | 127/215 (58%) |
| 68 | 3B Cf2 homologues to the −2ORF clone ID: 05d7 | T46070 | 148 | 8e−35 | 76/157 (48%) | 98/157 (62%) |
| 69 | 3C Cf2 homologues to the +3 ORF clone ID: 17P9 | T47727 | 200 | 2e−50 | 100/136 (73%) | 113/136 (82%) |
| 70 | 3D Cf2 homologues to the −3ORF clone ID: 06d8 | T46070 | 163 | 2e−39 | 86/179 (48%) | 110/179 (61%) |
| 71 | II-00_F2-3RCF1900-2450 | T47727 | 230 | 9e−60 | 114/170 (67%) | 134/170 (78%) |
| 72 | II-01CMGsmalF1-1F300-1000 | T46070 | 76 | 4e−13 | 49/147 (33%) | 62/147 (41%) |
| 73 | II-01F2-4RCf1900-2400 | T46070 | 187 | 6e−47 | 99/183 (54%) | 123/183 (67%) |
| 74 | II-02F3-1RCF2000-2500 | T46070 | 170 | 7e−42 | 79/105 (75%) | 93/105 (88%) |
| 75 | II-03.21dA2, 8F8CF1-500 | T47727 | 187 | 5e−48 | 95/142 (66%) | 106/142 (73%) |
| 76 | II-03CMG, smalF1-3FCF300-1100 | T46070 | 107 | 4e−27 | 67/189 (35%) | 89/189 (46%) |
| 77 | II-03F3-2R1800-Cterm | T47727 | 201 | 1e−64 | 97/129 (75%) | 113/129 (87%) |
| 78 | II-04.21dA1, 1E1R | no significant similarity | ||||
| 79 | II-05.21dA2, 4C4CFhomol | T46070 | 125 | 6e−28 | 65/106 (61%) | 72/106 (67%) |
| 80 | II-12CFLNO1F-CFNOIF | T46070 | 135 | 2e−33 | 74/165 (44%) | 97/165 (57%) |
| 81 | II-12CFLNO1F-CFLNOIR | T46070 | 273 | 2e−72 | 133/183 (72%) | 156/183 (84%) |
| 82 | II-12CFLNO1F-CFLNNIF | T46070 | 184 | 7e−46 | 91/128 (71%) | 100/128 (78%) |
| 83 | II-12CFLNO1F-CFLNN2F | T46070 | 109 | 3e−24 | 69/189 (36%) | 89/189 (46%) |
| 84 | II-13.21dA1, 8D8CF500-1000 | T46070 | 84 | 4e−24 | 48/127 (37%) | 66/127 (51%) |
| 85 | II-15.21dA2, 4C4FCF1600-1000 | T46070 | 158 | 6e−38 | 99/215 (46%) | 113/215 (52%) |
| 86 | II-29.21dA2, 8F8FCF500upstream | T47727 | 102 | 2e−39 | 56/105 (53%) | 67/105 (63%) |
| 87 | II-30.21d9A2, 12E12ESTMedicago | T47731 | 238 | 6e−62 | 119/163 (73%) | 132/163 (80%) |
| 88 | II-30.21d9A2, 8F8RCFpromoter | no significant similarity | ||||
| 89 | II-30.E2, TetRP1downstreamtoRhg1 | S05434 | 35 | 1.0 | 30/109 (27%) | 49/109 (44%) |
| 90 | II-32.E3, TetRP1CF1115-1249 | no significant similarity | ||||
| 91 | II-Cf homol-01CMGsmalF1-2F | T46070 | 76 | 4e−13 | 49/147 (33%) | 62/147 (41%) |
| 92 | II-Cf homol-CMGsmalF1-2F | T46070 | 125 | 8e−32 | 74/188 (39%) | 95/188 (50%) |
| 93 | II-Cf homol-03CMGsmalF1-3 | T46070 | 105 | 1e−26 | 66/188 (35%) | 88/188 (46%) |
| 94 | II-Cf homol-06CMGsmalF2-2F | T46070 | 123 | 2e−27 | 80/224 (35%) | 105/224 (46%) |
| 95 | II-Cf homol-07CMGsmalF2-3F | T46070 | 123 | 2e−27 | 80/224 (35%) | 105/224 (46%) |
| 96 | II-Cf homol-08CMGsmalF2-4F03 | T46070 | 118 | 6e−29 | 71/183 (38%) | 90/183 (48%) |
| 97 | II-Cf homol-10CMGsmalF3-2F | T46070 | 184 | 7e−46 | 91/128 (71%) | 100/128 (78%) |
| 98 | II-Cf homol-09CMGsmalF3-1F | T46070 | 184 | 6e−46 | 91/128 (71%) | 100/128 (78%) |
| 99 | II-Cf homol-smalF3-3F | T46070 | 265 | 2e−70 | 128/174 (73%) | 151/174 (86%) |
| 100 | II-Cf homol-12CMGsmalF3-4F | T46070 | 184 | 7e−46 | 89/107 (83%) | 97/107 (90%) |
| 101 | II-Cf homol-13CMGsmalF1-1R | T46070 | 279 | 3e−74 | 136/191 (71%) | 159/191 (83%) |
| 102 | II-Cf homol-14CMGsmalF1-2R | T46070 | 261 | 3e−69 | 127/176 (72%) | 148/176 (83%) |
| 103 | II-Cf homol-15CMGsmalF1-3R | T47727 | 246 | 1e−64 | 120/162 (74%) | 140/162 (86%) |
| 104 | II-Cf homol-16CMGsmalF1-4R | T46070 | 263 | 1e−70 | 128/176 (72%) | 149/176 (83%) |
| 105 | II-Cf homol-17CMGsmalF2-1R | T46070 | 268 | 5e−71 | 131/183 (71%) | 155/183 (84%) |
| 106 | II-Cf homol-18CMGsmalF2-2R | T46070 | 244 | 4e−65 | 118/159 (74%) | 137/159 (85%) |
| 107 | II-Cf homol-05F3-4R | T46070 | 187 | 6e−47 | 90/136 (66%) | 111/136 (81%) |
| 108 | II-Cf homol-00F2-3R | T46070 | 224 | 3e−58 | 108/148 (72%) | 127/148 (84%) |
| 109 | II-Cf homol-01F2-4R | T46070 | 187 | 6e−47 | 99/183 (54%) | 123/183 (67%) |
| 110 | II-Cf homol-02F3-1R | T46070 | 170 | 7e−42 | 79/105 (75%) | 93/105 (88%) |
| 111 | II-Cf homol-03F3-2R | T47727 | 202 | 9e−65 | 97/133 (72%) | 11/133 (84%) |
| 112 | II-Cf homol-04F3-3R | T46070 | 128 | 1e−30 | 65/108 (60%) | 72/108 (66%) |
| 113 | II-Cf homol-05CMGsmalF2-F | T46070 | 184 | 6e−46 | 91/128 (71%) | 100/128 (78%) |
| 114 | II-downstream to Rhg1 | no significant similarity | ||||
III.D. SCN/SDS Resistance Gene Promoters
The term “promoter region” defines a nucleotide sequence within a gene that is positioned 5′ to a coding sequence of the same gene and functions to direct transcription of the coding sequence. The promoter region includes a transcriptional start site and at least one cis-regulatory element. The present invention encompasses nucleic acid sequences that comprise a promoter region of an SCN/SDS resistance gene, or functional portion thereof.
The terms “cis-acting regulatory sequence” or “cis-regulatory motif” or “response element”, as used herein, each refer to a nucleotide sequence that enables responsiveness to a regulatory transcription factor. Responsiveness can encompass a decrease or an increase in transcriptional output and is mediated by binding of the transcription factor to the DNA molecule comprising the response element.
The term “transcription factor” generally refers to a protein that modulates gene expression by interaction with the cis-regulatory element and cellular components for transcription, including RNA Polymerase, Transcription Associated Factors (TAFs), chromatin-remodeling proteins, and any other relevant protein that impacts gene transcription.
The term “gene expression” generally refers to the cellular processes by which a biologically active polypeptide is produced from a DNA sequence.
A “functional portion” of a promoter gene fragment is a nucleotide sequence within a promoter region that is required for normal gene transcription. To determine nucleotide sequences that are functional, the expression of a reporter gene is assayed when variably placed under the direction of a promoter region fragment.
Promoter region fragments can be conveniently made by enzymatic digestion of a larger fragment using restriction endonucleases or DNAse I. Preferably, a functional promoter region fragment comprises about 5,000 nucleotides, more preferably about 2,000 nucleotides, more preferably about 1,000 nucleotides, more preferably about 500 nucleotides, even more preferably about 100 nucleotides, and even more preferably about 20 nucleotides.
Within a candidate promoter region or response element, the presence of regulatory proteins bound to a nucleic acid sequence can be detected using a variety of methods well known to those skilled in the art (Ausubel et al., 1992). Briefly, in vivo footprinting assays demonstrate protection of DNA sequences from chemical and enzymatic modification within living or permeabilized cells. Similarly, in vitro footprinting assays show protection of DNA sequences from chemical or enzymatic modification using protein extracts. Nitrocellulose filter-binding assays and gel electrophoresis mobility shift assays (EMSAs) track the presence of radiolabeled regulatory DNA elements based on provision of candidate transcription factors.
The terms “reporter gene” or “marker gene” or “selectable marker” each refer to a heterologous gene encoding a product that is readily observed and/or quantitated. A reporter gene is heterologous in that it originates from a source foreign to an intended host cell or, if from the same source, is modified from its original form. Non-limiting examples of detectable reporter genes that can be operably linked to a transcriptional regulatory region can be found in Brown and PCT International Publication No. WO 97/47763. Preferred reporter genes for transcriptional analyses include the lacZ gene (see, e.g., Rose & Botstein (1983) Meth Enzymol 101:167-180), Green Fluorescent Protein (GFP) (Cubitt et al. (1995) Trends Biochem Sci 20:448-455), luciferase, or chloramphenicol acetyl transferase (CAT). Preferred reporter genes for stable transformation include but are not limited to antibiotic resistance genes. Any suitable reporter and detection method can be used, and it will be appreciated by one of skill in the art that no particular choice is essential to or a limitation of the present invention.
An amount of reporter gene can be assayed by any method for qualitatively or, preferably, quantitatively determining presence or activity of the reporter gene product. The amount of reporter gene expression directed by each test promoter region fragment is compared to an amount of reporter gene expression directed by a control construct comprising the reporter gene in the absence of a promoter region fragment. A promoter region fragment is identified as having promoter activity when there is a significant increase in an amount of reporter gene expression in a test construct as compared to a control construct. The term “significant increase”, as used herein, refers to a quantified change in a measurable quality that is larger than the margin of error inherent in the measurement technique, preferably an increase by about 2-fold or greater relative to a control measurement, more preferably an increase by about 5-fold or greater, and most preferably an increase by about 10-fold or greater.
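The comparison described above can be illustrated with a minimal sketch. The function name, the signal units, and the way the margin of error is supplied are assumptions made for illustration only; the tiers simply restate the about 2-fold, 5-fold, and 10-fold preferences given in the text and are not a prescribed assay protocol.

```python
def promoter_activity_tier(test_signal, control_signal, margin_of_error=0.0):
    """Classify reporter expression of a test construct against a promoterless control.

    All three arguments are hypothetical measurements in the same arbitrary units.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    # An increase counts only if it exceeds the measurement's margin of error.
    if test_signal - control_signal <= margin_of_error:
        return "no significant increase"
    fold = test_signal / control_signal
    if fold >= 10:
        return "most preferred (10-fold or greater)"
    if fold >= 5:
        return "more preferred (5-fold or greater)"
    if fold >= 2:
        return "preferred (2-fold or greater)"
    return "significant (above margin of error)"
```

For instance, a test construct reading 1200 units against a control reading 100 units would fall in the 10-fold tier under these assumptions.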
A representative SCN/SDS resistance gene promoter, the rhg1 promoter, is set forth as SEQ ID NO:15. The rhg1 promoter is useful for directing gene expression of heterologous sequences in vivo or in assays to identify modulators of rhg1 expression, described further herein below.
The present invention further provides an isolated SCN/SDS resistance gene promoter region, or functional portion thereof, comprising an about 90 kb fragment of soybean genomic clone 73P6 between BamHI restriction sites and 21d9 between HindIII restriction sites. The genomic clone is available from the Forrest BAC library described in Meksem et al. (2000) Theor Appl Genet 101(5/6):747-755, available through Southern Illinois University-Carbondale (Carbondale, Ill.), Texas A&M University BAC center (College Station, Tex.), and Research Genetics (Huntsville, Ala.). An isolated SCN/SDS resistance gene promoter region, or functional portion thereof, comprising an about 4.5 kb fragment of soybean genomic clone 21d9A2 8F8 between EcoRI restriction sites is also disclosed.
III.E. Chimeric Genes
The present invention also encompasses chimeric genes comprising the disclosed SCN/SDS resistance gene sequences. The term “chimeric gene”, as used herein, refers to an SCN/SDS resistance gene promoter region operably linked to an open reading frame, wherein the nucleotide sequence created is not naturally occurring. In this regard, the open reading frame is also described as a “heterologous sequence”. The term “chimeric gene” also encompasses a promoter region operably linked to an SCN/SDS resistance gene coding sequence, a nucleotide sequence producing an antisense RNA molecule, an RNA molecule having tertiary structure, such as a hairpin structure, or a double-stranded RNA molecule.
The term “operably linked”, as used herein, refers to a promoter region that is connected to a nucleotide sequence in such a way that the transcription of that nucleotide sequence is controlled and regulated by that promoter region. Techniques for operatively linking a promoter region to a nucleotide sequence are well known in the art.
The terms “heterologous gene”, “heterologous DNA sequence”, “heterologous nucleotide sequence”, “exogenous nucleic acid molecule”, or “exogenous DNA segment”, as used herein, each refer to a sequence that originates from a source foreign to an intended host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified, for example by mutagenesis or by isolation from native cis-regulatory sequences. The terms also include non-naturally occurring multiple copies of a naturally occurring nucleotide sequence. Thus, the terms refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position within the host cell nucleic acid wherein the element is not ordinarily found.
IV. Polypeptide Sequences of SCN/SDS Resistance Proteins
The polypeptides provided by the present invention include the isolated polypeptide of SEQ ID NO:14, fusion proteins comprising SCN/SDS resistance gene amino acid sequences, biologically functional analogs, and polypeptides that cross-react with an antibody that specifically recognizes an SCN/SDS resistance gene polypeptide.
The term “isolated”, as used in the context of a polypeptide, indicates that the polypeptide exists apart from its native environment and is not a product of nature. An isolated polypeptide can exist in a purified form or can exist in a non-native environment such as, for example, in a transgenic host cell.
The term “purified”, when applied to a polypeptide, denotes that the polypeptide is essentially free of other cellular components with which it is associated in the natural state. Preferably, a polypeptide is a homogeneous solid or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A polypeptide that is the predominant species present in a preparation is substantially purified. The term “purified” denotes that a polypeptide gives rise to essentially one band in an electrophoretic gel. Particularly, it means that the polypeptide is at least about 50% pure, more preferably at least about 85% pure, and most preferably at least about 99% pure.
The term “substantially identical”, in the context of two or more polypeptide sequences, refers to polypeptide sequences having about 35%, or 45%, or preferably from 45-55%, or more preferably 55-65%, or most preferably 65% or greater amino acids that are identical or functionally equivalent. Percent “identity” and methods for determining identity are defined herein under the heading Nucleotide and Amino Acid Sequence Comparisons.
Substantially identical polypeptides also encompass two or more polypeptides sharing a conserved three-dimensional structure. Computational methods can be used to compare structural representations, and structural superpositions can be generated and easily tuned to identify similarities around important active sites or ligand binding sites. See Henikoff et al. (2000) Electrophoresis 21(9):1700-1706; Huang et al. (2000) Pac Symp Biocomput 230-241; Saqi et al., 1999; and Barton (1998) Acta Crystallogr D Biol Crystallogr 54:1139-1146.
The term “functionally equivalent” in the context of amino acid sequences is well known in the art and is based on the relative similarity of the amino acid side-chain substituents. See Henikoff and Henikoff (2000) Adv Protein Chem 54:73-97. Relevant factors for consideration include side-chain hydrophobicity, hydrophilicity, charge, and size. For example, arginine, lysine, and histidine are all positively charged residues; alanine, glycine, and serine are all of similar size; and phenylalanine, tryptophan, and tyrosine all have a generally similar shape. By this analysis, described further herein below, arginine, lysine, and histidine; alanine, glycine, and serine; and phenylalanine, tryptophan, and tyrosine are defined herein as biologically functional equivalents.
In making biologically functional equivalent amino acid substitutions, the hydropathic index of amino acids can be considered. Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics. These are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine (+2.5); methionine (+1.9); alanine (+1.8); glycine (−0.4); threonine (−0.7); serine (−0.8); tryptophan (−0.9); tyrosine (−1.3); proline (−1.6); histidine (−3.2); glutamate (−3.5); glutamine (−3.5); aspartate (−3.5); asparagine (−3.5); lysine (−3.9); and arginine (−4.5).
The importance of the hydropathic amino acid index in conferring interactive biological function on a protein is generally understood in the art (Kyte et al. (1982) J Mol Biol 157:105.). It is known that certain amino acids can be substituted for other amino acids having a similar hydropathic index or score and still retain a similar biological activity. In making changes based upon the hydropathic index, the substitution of amino acids whose hydropathic indices are within ±2 of the original value is preferred, those which are within ±1 of the original value are particularly preferred, and those within ±0.5 of the original value are even more particularly preferred.
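As an illustration of the preference tiers just described, the following minimal sketch looks up the hydropathic indices listed above and grades a proposed substitution by the absolute difference between the two values. The function name, the one-letter residue codes, and the tier labels are assumptions introduced for this example.

```python
# Hydropathic indices as listed above, keyed by one-letter amino acid code.
HYDROPATHIC_INDEX = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathic_substitution_tier(original, replacement):
    """Grade a substitution by how close the two hydropathic indices are."""
    # Round to one decimal so floating-point noise cannot shift a boundary case.
    diff = round(abs(HYDROPATHIC_INDEX[original] - HYDROPATHIC_INDEX[replacement]), 1)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return "not preferred"
```

Under this sketch, isoleucine to valine (difference 0.3) falls in the tightest tier, whereas alanine to glycine (difference 2.2) falls outside the ±2 window even though the two residues are of similar size.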
It is also understood in the art that the substitution of like amino acids can be made effectively on the basis of hydrophilicity. U.S. Pat. No. 4,554,101 states that the greatest local average hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino acids, correlates with its immunogenicity and antigenicity, i.e. with a biological property of the protein. It is understood that an amino acid can be substituted for another having a similar hydrophilicity value and still obtain a biologically equivalent protein.
As detailed in U.S. Pat. No. 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0±1); glutamate (+3.0±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (−0.4); proline (−0.5±1); alanine (−0.5); histidine (−0.5); cysteine (−1.0); methionine (−1.3); valine (−1.5); leucine (−1.8); isoleucine (−1.8); tyrosine (−2.3); phenylalanine (−2.5); tryptophan (−3.4).
In making changes based upon similar hydrophilicity values, the substitution of amino acids whose hydrophilicity values are within ±2 of the original value is preferred, those which are within ±1 of the original value are particularly preferred, and those within ±0.5 of the original value are even more particularly preferred.
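The notion of a greatest local average hydrophilicity can be sketched as a sliding-window average over the values listed above. This is an illustrative sketch only: the six-residue window, the use of the central value for entries listed with ±1, the function name, and the demonstration sequence are all assumptions and are not taken from the present disclosure.

```python
# Hydrophilicity values as listed above (central values where a +/-1 range is given).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3,
    "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def greatest_local_hydrophilicity(sequence, window=6):
    """Return (start index, average) of the most hydrophilic window in the sequence."""
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    values = [HYDROPHILICITY[residue] for residue in sequence]
    best_start, best_average = 0, float("-inf")
    for start in range(len(values) - window + 1):
        average = sum(values[start:start + window]) / window
        if average > best_average:  # keep the first window on ties
            best_start, best_average = start, average
    return best_start, best_average


# Hypothetical peptide: a charged stretch flanked by leucines.
print(greatest_local_hydrophilicity("LLLLLLRKDERKLLLLLL"))
```

The window reported by such a scan is a candidate region for the immunogenicity and antigenicity correlation noted above; it does not by itself establish that any particular substitution preserves function.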
The present invention also encompasses SCN/SDS resistance gene polypeptide fragments or functional portions of an SCN/SDS resistance gene polypeptide. Such functional portion need not comprise all or substantially all of the amino acid sequence of a native resistance gene product. The term “functional” includes any biological activity or feature of SCN/SDS resistance gene, including immunogenicity.
The present invention also includes longer sequences comprising an SCN/SDS resistance gene polypeptide, or portion thereof. For example, one or more amino acids can be added to the N-terminal or C-terminal of an SCN/SDS resistance gene polypeptide. Fusion proteins comprising SCN/SDS resistance gene polypeptide sequences are also provided within the scope of the present invention. Methods of preparing such proteins are known in the art.
The present invention also encompasses functional analogs of an SCN/SDS resistance gene polypeptide. Functional analogs share at least one biological function with an SCN/SDS resistance gene polypeptide. An exemplary function is immunogenicity. In the context of amino acid sequence, biologically functional analogs, as used herein, are peptides in which certain, but not most or all, of the amino acids can be substituted. Functional analogs can be created at the level of the corresponding nucleic acid molecule, altering such sequence to encode desired amino acid changes. In one embodiment, changes can be introduced to improve the antigenicity of the protein. In another embodiment, an SCN/SDS resistance gene polypeptide sequence is varied so as to assess the activity of a mutant SCN/SDS resistance gene polypeptide. In still another embodiment, amino acid changes can be made to improve the stability of the polypeptide.
Isolated polypeptides and recombinantly produced polypeptides can be purified and characterized using a variety of standard techniques that are well known to the skilled artisan. See, e.g., Ausubel et al. (1992); Bodanszky et al., 1976; and Zimmer et al. (1993) Peptides , pp. 393-394, ESCOM Science Publishers, B. V.
V. Nucleotide and Amino Acid Sequence Comparisons
The terms “identical” or percent “identity”, in the context of two or more nucleotide or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the sequence comparison algorithms disclosed herein or by visual inspection.
The term “substantially identical” in regards to a nucleotide or polypeptide sequence means that a particular sequence varies from the sequence of a naturally occurring sequence by one or more deletions, substitutions, or additions, the net effect of which is to retain at least some of the biological activity of the natural gene, gene product, or sequence. Such sequences include “mutant” sequences, or sequences wherein the biological activity is altered to some degree but retains at least some of the original biological activity. The term “naturally occurring”, as used herein, is used to describe a composition that can be found in nature as distinct from being artificially produced by man. For example, a protein or nucleotide sequence present in an organism, which can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory, is naturally occurring.
For sequence comparison, typically one sequence is regarded as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer program, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are selected. The sequence comparison algorithm then calculates the percent sequence identity for the designated test sequence(s) relative to the reference sequence, based on the selected program parameters.
Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman (1981) Adv Appl Math 2:482, by the homology alignment algorithm of Needleman & Wunsch (1970) J Mol Biol 48:443, by the search for similarity method of Pearson & Lipman (1988) Proc Natl Acad Sci USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, Madison, Wis.), or by visual inspection. See generally, Ausubel et al. (1992).
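By way of illustration, a global alignment in the manner of Needleman and Wunsch, followed by a percent identity calculation, can be sketched as follows. The scoring values (match +1, mismatch −1, linear gap −1), the function names, and the choice of alignment length as the denominator are assumptions for this example; the packaged programs named above use their own scoring schemes and parameters.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the dynamic-programming matrix.
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(aligned_a, aligned_b):
    """Identical aligned positions as a percentage of the alignment length."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)
```

A call such as `percent_identity(*needleman_wunsch("GATTACA", "GATACA"))` aligns the two hypothetical test strings with a single gap and reports six identities over seven aligned positions.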
A preferred algorithm for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al. (1990) J Mol Biol 215: 403-410. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength W=11, an expectation E=10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. See Henikoff and Henikoff (1992) Proc Natl Acad Sci USA 89:10915.
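The three halting conditions for word-hit extension can be illustrated with a simplified, ungapped, one-directional sketch. This is not the BLAST implementation: the function name, the assumption that the seed is an exact match, the value chosen for X, and the short word length used in the example are all illustrative; only the reward M=5 and penalty N=−4 are taken from the BLASTN defaults stated above.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit to the right without gaps.

    Returns (best cumulative score, length of the best-scoring segment).
    """
    score = word_len * match  # the seed word is assumed to match exactly
    best_score, best_len = score, word_len
    length = word_len
    qi, si = q_start + word_len, s_start + word_len
    # Halting condition 3: stop when the end of either sequence is reached.
    while qi < len(query) and si < len(subject):
        score += match if query[qi] == subject[si] else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        # Halting conditions 1 and 2: fall-off by X from the maximum,
        # or cumulative score at or below zero.
        elif best_score - score >= x_drop or score <= 0:
            break
        qi += 1
        si += 1
    return best_score, best_len
```

With the hypothetical inputs `extend_right("ACGTACGTTTTT", "ACGTACGTGGGG", 0, 0, 4)`, the extension gains score across the second ACGT and then loses it over the mismatched tail, so the best segment reported is the first eight positions with a score of 40. Extension to the left would proceed symmetrically.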
In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences. See, e.g., Karlin and Altschul (1993) Proc Natl Acad Sci USA 90:5873-5877. One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
VI. Method for Detecting a Nucleic Acid Molecule Associated with SCN/SDS Resistance
In another aspect of the invention, a method is provided for detecting a nucleic acid molecule that encodes an SCN/SDS resistance polypeptide. Such methods can be used to detect SCN/SDS resistance gene variants and related resistance gene sequences. The di