Title:
Identification of Gene Associated with Reading Disability and Uses Therefor
Kind Code:
A1


Abstract:
The present invention relates to identification of a human gene, DCDC2 (MIM: 605755), associated with susceptibility for developing reading disability (RD), which is useful in identifying or aiding in identifying individuals at risk for developing RD, as well as for diagnosing or aiding in the diagnosis of RD.



Inventors:
Gruen, Jeffrey R. (Hamden, CT, US)
Meng, Haiying (New Haven, CT, US)
Application Number:
11/662325
Publication Date:
12/25/2008
Filing Date:
09/14/2005
Assignee:
Yale University (New Haven, CT, US)
Primary Class:
Other Classes:
435/6.16, 536/24.31, 536/24.33
International Classes:
C12Q1/68; C07H21/04



Primary Examiner:
BERTAGNA, ANGELA MARIE
Attorney, Agent or Firm:
WOLF GREENFIELD & SACKS, P.C. (BOSTON, MA, US)
Claims:
We claim:

1. An isolated polynucleotide for the detection of a human DCDC2 gene associated with reading disability (dyslexia), in a sample from an individual, comprising a nucleic acid molecule that specifically detects an alteration in the human DCDC2 gene that is associated with reading disability in humans.

2. The polynucleotide of claim 1, wherein the polynucleotide is a probe that hybridizes, under highly stringent conditions, to a mutation in the human DCDC2 gene that is associated with reading disability in humans.

3. The probe of claim 2, wherein the alteration is a deletion in intron 2 of DCDC2.

4. The probe of claim 3, wherein the deletion comprises 2,445 bp in intron 2. The probe of claim 2, wherein the probe is a DNA probe.

5. The probe of claim 5, wherein the probe is from about 8 nucleotides to about 500 nucleotides.

6. The probe of claim 6, wherein the probe is from about 10 nucleotides to about 250 nucleotides.

7. The probe of claim 5, wherein the probe comprises one or more non-natural or modified nucleotides.

8. The probe of claim 8, wherein the one or more non-natural or modified nucleotides are radioactive, fluorescently, or chemically labeled nucleotides.

9. A polynucleotide primer that hybridizes, under highly stringent conditions, adjacent to an alteration in a DCDC2 gene that is associated with susceptibility for developing RD in humans.

10. The polynucleotide primer of claim 10, which hybridizes immediately adjacent to the alteration in the DCDC2 gene.

11. A pair of polynucleotide primers that specifically detect a variant DCDC2 allele that is associated with susceptibility for developing RD in humans, wherein the first polynucleotide primer hybridizes to one side of an alteration in the variant DCDC2 allele and the second polynucleotide primer hybridizes to the other side of the alteration in the DCDC2 allele.

12. A pair of polynucleotide primers that hybridize to a region of DNA that comprises an alteration in a variant DCDC2 gene that is associated with susceptibility for developing RD in humans, wherein the polynucleotide primers hybridize to the region in such a manner that the ends of the hybridized primers proximal to the alteration are from about 100 to about 10,000 nucleotides apart.

13. The pair of polynucleotide primers of claim 11, wherein the alteration is an intronic polymorphic deletion or an allele of dbSTS ID 808238 within the region of DCDC2 that spans the intronic polymorphic deletion.

14. The pair of polynucleotide primers of claim 14, wherein the intronic polymorphic deletion is approximately 2,445 bp.

15. The pair of polynucleotide primers of claim 11, wherein the primers are DNA primers.

16. The pair of polynucleotide primers of claim 16, wherein the primers are each from about 8 nucleotides to about 500 nucleotides.

17. The pair of polynucleotide primers of claim 15, wherein the primers are each from about 10 nucleotides to about 250 nucleotides.

18. The pair of polynucleotide primers of claim 15, wherein the primers comprise one or more non-natural or modified nucleotides.

19. The pair of polynucleotide primers of claim 18, wherein the one or more non-natural or modified nucleotides are radioactive or fluorescently labeled nucleotides.

20. A method of detecting, in a sample obtained from an individual, a variant DCDC2 gene that is associated with susceptibility for developing RD in humans, comprising: (a) combining the sample with a polynucleotide probe that hybridizes, under highly stringent conditions, to an alteration in DCDC2 associated with susceptibility for developing RD in humans, but not to a wild type DCDC2 gene; and (b) determining whether hybridization occurs, wherein the occurrence of hybridization indicates that a variant DCDC2 gene that is associated with susceptibility for developing RD in humans is present in the sample.

21. A method of detecting, in a sample obtained from an individual, a variant DCDC2 gene that is associated with susceptibility for developing RD in humans, comprising: (a) combining the sample with a polynucleotide probe that hybridizes, under highly stringent conditions, to a variant DCDC2 gene that is associated with susceptibility for developing RD in humans, thereby producing a combination; (b) maintaining the combination produced in step (a) under highly stringent hybridization conditions; and (c) comparing hybridization that occurs in the combination with hybridization in a control, wherein the control is a polynucleotide probe that does not bind to a variant DCDC2 gene that is associated with susceptibility for developing RD in humans or binds only to a wild type DCDC2 gene, and the sample is the same type of sample as in (a) and is treated the same as the sample in (a), and wherein the occurrence of hybridization in the combination but not in the control indicates that a variant DCDC2 gene that is associated with susceptibility for developing RD in humans is present in the sample.

22. The method of claim 21, wherein the extent of hybridization is determined in step (c).

23. A method of detecting, in a sample obtained from an individual, a variant DCDC2 gene that is associated with susceptibility for developing RD in humans, comprising: (a) combining a first portion of the sample with a polynucleotide probe that hybridizes, under highly stringent conditions, to an alteration in the variant DCDC2 that is associated with susceptibility for developing RD in humans; (b) combining a second portion of the sample with a polynucleotide probe that hybridizes, under highly stringent conditions, to a wild type DCDC2 gene; and (c) determining whether hybridization occurs, wherein the occurrence of hybridization in the first portion but not in the second portion indicates that the variant DCDC2 gene is present in the sample.

25. The method of claim 24, wherein the alteration is an intronic polymorphic deletion of DCDC2 or an allele of dbSTS 808238 within the region that the deletion spans.

26. The method of claim 25, wherein the intronic polymorphic deletion of DCDC2 is all or a portion of a 2,445 bp deletion in intron 2 of DCDC2.

27. The method of claim 24, wherein the sample comprises cells obtained from blood, tears, saliva, mucus, urine, epidermis, epithelium or eye tissue.

28. The method of claim 24, wherein the polynucleotide probe is a DNA probe.

29. The method of claim 24, wherein the polynucleotide probe is from about 8 nucleotides to about 500 nucleotides.

30. A method of detecting, in a sample obtained from an individual, a variant DCDC2 gene that is associated with susceptibility for developing RD in humans, comprising: (a) combining the sample with a pair of polynucleotide primers, wherein the first polynucleotide primer hybridizes to one side of DNA that is present in a DCDC2 gene associated with susceptibility for developing RD but not present in a DCDC2 gene not associated with susceptibility for developing RD and the second polynucleotide primer hybridizes to the other side of DNA that is present in a DCDC2 gene associated with susceptibility for developing RD, but not present in a DCDC2 gene not associated with susceptibility for developing RD; (b) amplifying DNA in the sample, thereby producing amplified DNA; (c) sequencing amplified DNA; and (d) detecting in the amplified DNA the presence of DNA that is associated with susceptibility for developing RD in humans, whereby a gene that is associated with susceptibility for developing RD in humans is detected.

31. A method of identifying or aiding in identifying an individual at risk for developing RD, comprising assaying a sample obtained from the individual for the presence of a DCDC2 gene that is associated with susceptibility for developing RD in humans, wherein the presence of a DCDC2 gene that is associated with susceptibility for developing RD in humans indicates that the individual is at risk for developing RD.

32. A method of identifying or aiding in identifying an individual at risk for developing RD, comprising: (a) combining a sample obtained from the individual with a polynucleotide probe that hybridizes, under highly stringent conditions, to an alteration in a DCDC2 gene that is associated with susceptibility for developing RD, but does not hybridize to a wild type DCDC2 gene; and (b) determining whether hybridization occurs, wherein the occurrence of hybridization indicates that the individual is at risk for developing RD.

33. A method of identifying or aiding in identifying an individual at risk for developing RD, comprising: (a) obtaining DNA from the individual; (b) sequencing a region of the DNA that comprises an alteration in a DCDC2 gene that is associated with susceptibility for developing RD; and (c) determining whether the alteration is present in the DNA obtained from the individual, wherein the presence of the alteration indicates that the individual is at risk for developing RD.

34. A diagnostic kit for detecting a DCDC2 gene associated with susceptibility for developing RD in a sample from an individual, comprising: (a) at least one container means having disposed therein a polynucleotide probe that hybridizes, under highly stringent conditions, to variant DCDC2 DNA associated with susceptibility for developing RD in humans; and (b) a label and/or instructions for the use of the diagnostic kit in the detection of variant DCDC2 DNA in a sample.

35. A diagnostic kit for detecting variant DCDC2 DNA in a sample from an individual, comprising: (a) at least one container means having disposed therein a polynucleotide primer that hybridizes adjacent to one side of an alteration in variant DCDC2 DNA that is associated with susceptibility for developing RD in humans; and (b) a label and/or instructions for the use of the diagnostic kit in the detection of an alteration in variant DCDC2 DNA in a sample.

36. The diagnostic kit of claim 35, additionally comprising a second polynucleotide primer that hybridizes to the other side of the alteration in the variant DCDC2 DNA.

Description:

RELATED APPLICATIONS

The present application claims the benefit of U.S. Provisional Application No. 60/610,023, filed Sep. 14, 2004, by Jeffrey R. Gruen and Haiying Meng, entitled “DCDC2 Mutations Cause Dyslexia” and U.S. Provisional Application No. 60/685,101, filed May 26, 2005, by Jeffrey R. Gruen and Haiying Meng, entitled “DCDC2 Mutations Cause Dyslexia.” The referenced applications are incorporated herein in their entirety by reference.

FUNDING

This invention was made with United States government support under grant R01 NS43530, awarded by the National Institutes of Health. The United States government has certain rights in the invention.

BACKGROUND OF THE INVENTION

Reading disability (RD), also known as developmental dyslexia or simply dyslexia, is one of the most common complex neurobehavioral disorders, with prevalence rates ranging from 5 to 17 percent (1). It is characterized by an impairment of reading ability in subjects with normal intelligence and adequate educational opportunities. A range of neuroimaging studies, including diffusion tensor and functional magnetic resonance imaging, show that dyslexics have altered brain activation patterns compared to fluent readers when challenged with reading tasks (2). Partial remediation in language processing deficits results in improved reading, ameliorates disrupted function in brain regions associated with phonologic processing, and produces additional compensatory activation in other brain areas (3). These studies also implicate specific brain locations where genes integral to reading and language are expressed, and which likely are altered in RD.

Over the past 30 years clinical studies have shown that up to 50% of children of dyslexic parents, 50% of siblings of dyslexics, and 50% of parents of dyslexic children are affected (4). Estimates of heritability range from 44 to 75% (5). The first RD susceptibility region, DYX1, was reported on chromosome 15 in 1983 (6). Subsequently, loci were described on chromosomes 1, 2p15-16, 3p13, 6p (7-21), 6q, 7q32, 11, 15q21, and 18p11.2. It is still unclear which and/or how many genes contribute to RD and additional information would be useful for developing diagnostic, preventive and therapeutic approaches to this disorder.

SUMMARY OF THE INVENTION

The present invention relates to identification of a human gene, DCDC2 (MIM: 605755), associated with susceptibility for developing reading disability (RD), which is useful in identifying or aiding in identifying individuals at risk for developing RD, as well as for diagnosing or aiding in the diagnosis of RD. Forms of the DCDC2 gene that harbor variations that are associated with susceptibility for developing RD or lead to differences in RD are referred to, interchangeably, herein as DCDC2 variants, variant DCDC2 DNA or variant DCDC2 genes. As described in detail herein, Applicants identified an intronic polymorphic deletion of DCDC2 and alleles of dbSTS ID 808238 within the region that the deletion spans that are in significant disequilibrium with multiple RD traits. DCDC2 in which there is a deletion, such as the intronic polymorphic deletion described herein, and DCDC2 alleles that are associated with RD are examples of DCDC2 variants. The polymorphic deletion encodes tandem repeats of putative brain-related transcription factor binding sites in intron 2 of DCDC2. RT-PCR data show that DCDC2 localizes to the region of the brain where fluent reading occurs and RNAi studies show that down-regulating DCDC2 leads to alteration in neuronal migration, again within the brain regions of interest. Results demonstrate that DCDC2 is a gene correlated with RD.

In summary, Applicants saturated the region of the genome around JA04, which led to the identification of an intronic polymorphic deletion of DCDC2. Alleles of dbSTS ID 808238 within the region that the deletion spans are in significant disequilibrium with multiple RD traits. RT-PCR data suggest that DCDC2 localizes to the region of the brain where fluent reading occurs and RNAi studies show that down-regulating DCDC2 leads to alteration in neuronal migration, again within the brain regions of interest. Applicants' findings support the role of DCDC2 as a gene harboring variations that lead to differences in RD.

Thus, the present invention relates to a human gene associated with susceptibility for developing RD, which is useful in identifying or aiding in identifying individuals at risk for developing RD, as well as for diagnosing or aiding in the diagnosis of RD. It also relates to methods for identifying or aiding in identifying individuals at risk for developing RD; methods for diagnosing or aiding in the diagnosis of RD; polynucleotides (e.g., probes, primers) useful in the methods; diagnostic kits containing such probes or primers; antibodies that bind wild type DCDC2 or altered DCDC2 gene product (e.g., protein); methods of treating or aiding in treating an individual at risk for or suffering from RD and compositions, such as pharmaceutical compositions, useful for treating an individual at risk for or suffering from RD; methods for determining appropriate and, preferably, optimal treatment for individuals, including response to educational interventions, curricula, written materials, tutoring, specialized classes and pharmaceuticals related to pharmacogenetics. The methods and compositions of the present invention can be used alone or in combination with other methods and compositions used for such purposes. For example, a method of diagnosing or aiding in the diagnosis of RD of the present invention can be used in conjunction with testing and behavioral assessments presently used for determining if an individual has RD. The methods of the present invention provide DNA (genetic) diagnostic tests useful in assessing RD in individuals, as well as in populations, such as the general population.

In one embodiment, the present invention provides polynucleotides useful for detecting or aiding in detecting, in a sample, a DCDC2 variant(s). A DCDC2 variant (also referred to as variant DCDC2 DNA or a variant DCDC2 gene) comprises at least one alteration in or difference from wild type DCDC2. The alteration or difference can be any nucleotide polymorphism of a coding region, exon, exon-intron boundary, signal peptide, 5-prime untranslated region, promoter region, enhancer sequence, 3-prime untranslated region or intron that is associated with RD. These polymorphisms include, but are not limited to, those that change the amino acid sequence of the proteins encoded by the DCDC2 gene, produce alternative splice products, create truncated products, introduce a premature stop codon, introduce a cryptic exon, alter the degree of expression to a greater or lesser extent, alter tissue specificity of DCDC2 expression, introduce changes in the tertiary structure of the proteins encoded by DCDC2, introduce changes in the binding affinity or specificity of the proteins expressed by DCDC2 or alter the function of the proteins encoded by DCDC2.

In another embodiment, the present invention provides methods and compositions useful for identifying or aiding in identifying individuals at risk for developing RD. In a further embodiment, the methods and compositions of the invention may be used for the treatment of an individual who has (is suffering from) RD or is at risk for developing RD. The invention also encompasses diagnostic kits for detecting, in a sample from an individual, variant DCDC2 DNA, such as a DCDC2 allele that is correlated with RD in humans. Such kits are useful in identifying or aiding in identifying individuals at risk for developing RD, as well as for diagnosing or aiding in the diagnosis of RD in an individual.

In one embodiment, the invention provides an isolated polynucleotide for the detection of a DCDC2 allele that is correlated with RD in humans, the polynucleotide comprising a nucleic acid molecule that specifically detects variant DCDC2 DNA that is correlated with the occurrence of RD in humans. Isolated polynucleotides are useful for detecting, in a sample from an individual, DCDC2 gene variants that are correlated with RD in humans. In certain embodiments, the isolated polynucleotide is a probe that hybridizes, under highly stringent conditions, to all or a portion of a DCDC2 gene that is correlated with the occurrence of RD in humans (all or a portion of a variant DCDC2 gene). In certain embodiments, the isolated probe hybridizes, under highly stringent conditions, to all or a portion of a DCDC2 gene that is associated with susceptibility for developing RD in humans but does not hybridize to a DCDC2 gene that is not associated with susceptibility for developing RD in humans. In further embodiments, the isolated polynucleotide is a primer that hybridizes, under highly stringent conditions, adjacent, upstream, or downstream to an alteration in a DCDC2 gene that is associated with susceptibility for developing RD in humans. Alternatively, polynucleotides of the present invention can be primers or probes that are useful to identify wild type DCDC2, wild type DCDC2 gene or wild type DCDC2 DNA, as defined herein. Such polynucleotides, for example, recognize or hybridize to all or a portion of wild type DCDC2, wild type DCDC2 gene or wild type DCDC2 DNA.

The polynucleotides described herein (e.g., a polynucleotide probe or a polynucleotide primer) may be a DNA or RNA molecule. The subject polynucleotide may be single-stranded or double-stranded. Polynucleotide probes and primers of the invention may be from about 5 nucleotides to about 3000 nucleotides. In certain embodiments, the polynucleotide probes and primers of the invention are from about 8 nucleotides to about 500 nucleotides. In further embodiments, the polynucleotide probes and primers of the invention are from about 10 to about 250 nucleotides, from about 10 to about 100 nucleotides, from about 10 to about 80 nucleotides, from about 10 to about 50 nucleotides, from about 10 to about 40 nucleotides, from about 10 to about 30 nucleotides, from about 10, 11, 12, 13 or 15 nucleotides to about 20, 21, 22, 23, 24 or 25 nucleotides. The subject polynucleotides may comprise one or more non-natural or modified nucleotides. Non-natural or modified nucleotides include, without limitation, radioactively, fluorescently, or chemically labeled nucleotides, and protein nucleic acids. Included within the scope of the present invention is any polynucleotide useful to identify or detect wild type or variant DCDC2 sequences. Based on the information provided herein, one of ordinary skill in the art can design and produce polynucleotide probes and primers using methods known in the art.
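The nested length ranges recited above can be illustrated with a minimal Python sketch. It is offered only as an illustration: the function name `narrowest_range` is hypothetical, and treating each "about" bound as an exact inclusive limit is an assumption made for simplicity, not something the description specifies.

```python
# Length ranges recited in the description, broadest first (in nucleotides).
# Assumption: each "about" bound is treated as an exact inclusive limit.
LENGTH_RANGES = [
    ("about 5 to about 3000", 5, 3000),
    ("about 8 to about 500", 8, 500),
    ("about 10 to about 250", 10, 250),
    ("about 10 to about 100", 10, 100),
    ("about 10 to about 80", 10, 80),
    ("about 10 to about 50", 10, 50),
    ("about 10 to about 40", 10, 40),
    ("about 10 to about 30", 10, 30),
]


def narrowest_range(oligo):
    """Return the label of the narrowest listed range containing len(oligo).

    Returns None when the oligonucleotide is outside every listed range.
    """
    length = len(oligo)
    match = None
    # The ranges are nested, so the last one that fits is the narrowest.
    for label, low, high in LENGTH_RANGES:
        if low <= length <= high:
            match = label
    return match


if __name__ == "__main__":
    # 20-nucleotide forward primer quoted elsewhere in the description.
    print(narrowest_range("TGTTGAATCCCAGACCACAA"))
```

A 20-nucleotide primer falls in every listed range, so the sketch reports the narrowest one.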

In one embodiment, the polynucleotide primer of the invention hybridizes vicinal to an alteration or difference (nucleotide polymorphism) in a DCDC2 gene that is associated with susceptibility for developing RD in humans. For example, hybridization may occur in such a manner that fewer than 10 nucleotides separate the alteration and the end of the hybridized primer proximal to the alteration. In specific embodiments, hybridization occurs in such a manner that 1-3 nucleotides separate the alteration and the end of the hybridized primer proximal to the alteration. In certain embodiments, the polynucleotide primer hybridizes immediately adjacent to the alteration. In another embodiment, the polynucleotide primer of the invention hybridizes upstream or downstream from an alteration in the DCDC2 gene that is correlated with the occurrence of RD in humans. For example, hybridization may occur in such a manner that the end of the hybridized primer proximal to the alteration is 10, 25, 50, 100, 250, 1000, 5000, or up to 10,000 nucleotides upstream or downstream from an alteration in the DCDC2 gene. The invention described herein also provides a pair of polynucleotide primers that specifically detect a mutation in the DCDC2 gene that is correlated with the occurrence of RD in humans, wherein the first polynucleotide primer hybridizes to one side of an alteration (e.g., one side of the deletion described herein, such as the 5-prime side) and the second polynucleotide primer hybridizes to the other side of the alteration (e.g., the other side of the deletion described herein, such as the 3-prime side). A pair of polynucleotide primers that hybridize to a region of DNA that comprises an alteration in the DCDC2 gene that is associated with susceptibility for developing RD in humans may hybridize to the region in such a manner that the ends of the hybridized primers proximal to the alteration are from about 20 to about 10,000 nucleotides apart.
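The primer placements described above (immediately adjacent, separated by 1-3 nucleotides, separated by fewer than 10 nucleotides, or farther upstream or downstream) can be sketched as a small Python calculation. The function names and the coordinate convention (1-based, inclusive positions on a single reference strand) are assumptions introduced for illustration and do not come from the description.

```python
def gap_to_alteration(primer_start, primer_end, alt_start, alt_end):
    """Count the nucleotides between a hybridized primer and an alteration.

    All coordinates are 1-based and inclusive on the same strand
    (an assumed convention). The gap is measured from the primer end
    proximal to the alteration.
    """
    if primer_end < alt_start:      # primer lies upstream of the alteration
        return alt_start - primer_end - 1
    if primer_start > alt_end:      # primer lies downstream of the alteration
        return primer_start - alt_end - 1
    raise ValueError("primer overlaps the alteration")


def describe_placement(gap):
    """Map a gap size onto the placement categories named in the text."""
    if gap == 0:
        return "immediately adjacent"
    if gap <= 3:
        return "1-3 nucleotides from the alteration"
    if gap < 10:
        return "fewer than 10 nucleotides from the alteration"
    return "upstream or downstream of the alteration"


if __name__ == "__main__":
    # Hypothetical 20-nucleotide primer ending one base before position 101.
    gap = gap_to_alteration(81, 100, 101, 101)
    print(gap, describe_placement(gap))
```

The categories are checked from narrowest to broadest, so a gap of 2 is reported as "1-3 nucleotides" even though it also satisfies "fewer than 10".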

Variants of the DCDC2 gene that predispose an individual to RD may be detected by the methods and compositions described herein. In particular embodiments, variant alleles, such as those depicted in Supplementary Table 3 may be detected. As used herein, the terms “wild type DCDC2”, “wild type DCDC2 gene” and “wild type DCDC2 DNA” refer to DNA that is not associated with susceptibility for developing RD in humans.

In certain aspects, the invention provides a method of detecting, in a sample obtained from an individual, a DCDC2 allele that is associated with susceptibility for developing RD in humans. Such a method may comprise: (a) combining the sample with a polynucleotide probe that hybridizes, under highly stringent conditions, to a DCDC2 allele that is correlated with RD in humans, but does not hybridize to a DCDC2 gene that is not associated with susceptibility for developing RD in humans and (b) determining whether hybridization occurs. The occurrence of hybridization indicates that a DCDC2 gene that is associated with susceptibility for developing RD in humans is present in the sample. Alternatively, the method may comprise: (a) combining the sample with a polynucleotide probe that uses the polymerase chain reaction to amplify, under stringent conditions, a DCDC2 allele that is associated with susceptibility for developing RD in humans, and (b) sequencing the allele, such as by conventional fluorescent tagged dideoxy terminator sequencing, wherein if the allele comprises the sequence of variant DCDC2 DNA, a DCDC2 allele that is associated with susceptibility for developing RD in humans is present in the sample.

Samples used in the methods described herein may comprise cells from the eye, epidermis, epithelium, blood, tears, saliva, mucus, urine, stool, sperm, ova, or any other tissues or bodily fluids from which sufficient DNA or RNA can be obtained. In a specific embodiment, cells obtained from a buccal swab are used. The sample should be sufficiently processed to render DNA or RNA present available for assaying in the methods described herein. For example, samples may be processed such that DNA from the sample is available for amplification by DNA polymerases or other enzymes that increase the total DNA content or for hybridization to another polynucleotide. The processed samples may be crude lysates where available DNA or RNA is not purified from other cellular material, or may be purified to specifically isolate DNA or RNA. Samples may be processed by any means known in the art that renders DNA or RNA available for assaying in the methods described herein. Methods for processing samples may include, without limitation, mechanical, chemical, enzymatic, or molecular means of lysing and/or purifying cells and cell lysates. Processing methods may include chromatographic methods such as ion exchange (e.g., cation and anion), size exclusion, affinity, and hydrophobic interaction chromatography.

In certain other aspects, the invention provides a method of detecting, in a sample obtained from an individual, a variant DCDC2 gene that is associated with susceptibility for developing RD in humans, comprising: (a) combining the sample (referred to as a test sample) with a polynucleotide probe that hybridizes, under stringent conditions, to a DCDC2 gene that is associated with susceptibility for developing RD in humans, thereby producing a combination; (b) maintaining the combination produced in step (a) under stringent hybridization conditions; and (c) comparing hybridization that occurs in the combination with hybridization in a control. The occurrence of hybridization in the combination but not in the control indicates that a DCDC2 gene that correlates with RD is present in the sample. The control is the same as the test sample and is treated the same as the test sample, except that the polynucleotide probe is one that does not bind to a DCDC2 gene that is associated with susceptibility for developing RD in humans. In all embodiments in which a control is used, the control can be assessed prior to, simultaneous with or subsequent to assessment of the test sample. For example, the control can be a previously established reference or standard. The control is typically the same type of sample as the test sample and is treated the same as the test sample, except that it is combined with a polynucleotide that does not hybridize to a DCDC2 gene that is associated with susceptibility for developing RD in humans.

In another embodiment, the invention provides a method of detecting, in a sample obtained from an individual, a DCDC2 gene that is associated with susceptibility for developing RD in humans, comprising: (a) combining a first portion of the sample with a polynucleotide probe that hybridizes, under highly stringent conditions, to a DCDC2 gene that is correlated with RD in humans, but not to a DCDC2 gene that is not correlated with RD in humans; (b) combining a second portion of the sample with a polynucleotide probe that hybridizes, under highly stringent conditions, to a DCDC2 gene that is not correlated with RD in humans, but not to a DCDC2 gene that is correlated with RD in humans; and (c) determining whether hybridization occurs. The occurrence of hybridization in the first portion but not in the second portion indicates that a gene that is correlated with RD is present in the sample.

The present invention also relates to a method of detecting, in a sample obtained from an individual, a DCDC2 gene that is associated with susceptibility for developing RD in humans, comprising: (a) combining the sample with a pair of polynucleotide primers, wherein the first polynucleotide primer hybridizes to one side of DNA (at least one nucleotide) that is present in a DCDC2 gene associated with susceptibility for developing RD but not present in a DCDC2 gene not associated with susceptibility for developing RD and the second polynucleotide primer hybridizes to the other side of DNA (at least one nucleotide) that is present in a DCDC2 gene associated with susceptibility for developing RD, but not present in a DCDC2 gene not associated with susceptibility for developing RD; (b) amplifying DNA in the sample, thereby producing amplified DNA; (c) sequencing amplified DNA; and (d) detecting in the amplified DNA the presence of DNA that is associated with susceptibility for developing RD, whereby a DCDC2 gene that is associated with susceptibility for developing RD in humans is detected. The presence of DNA that is present in a DCDC2 gene associated with susceptibility for developing RD in humans but not present in a DCDC2 gene not associated with susceptibility for developing RD indicates that a DCDC2 gene associated with susceptibility for developing RD in humans is detected in the sample. In one embodiment, one member of the pair of polynucleotide primers hybridizes to one side of DNA and the other member of the pair hybridizes to the other side of DNA in a DCDC2 gene in which there is a deletion of 2,445 bp, as described herein. The deletion is assigned breakpoints 24,433,346 and 24,435,659 (ENSEMBL database version 33 Sep. 2005). In one embodiment, the compound STR, dbSTS ID 808238, is genotyped by sequencing PCR products generated with forward primer (TGTTGAATCCCAGACCACAA) and reverse primer (ATCCCGATGAAATGAAAAGG). 
In further embodiments, the members of the primer pairs each hybridize to specific sequence length variants of Repeat Units 1 through 5 and SNP1 listed in Table 3, thereby distinguishing different DCDC2 variants. For example, a primer pair could be synthesized that specifically and only identifies the presence of allele number 1 in a DNA sample; another primer pair could specifically and only identify allele number 2, and so forth. Any method known in the art for amplifying nucleic acids may be used for the methods described herein. For example, DNA in a sample may be amplified using the polymerase chain reaction, rolling circle amplification, isothermal amplification, strand displacement amplification, multiple strand displacement amplification, multiplex ligation-dependent probe amplification, allele-specific amplification, ligase chain reaction, or by other enzymatic processes. Also, any method known in the art of resolving nucleic acids may be used for the methods described herein, including but not restricted to fluorescence tagged dideoxy sequencing, single base extension, capillary electrophoresis, SNaPshot, SNPlex, Invader assay, TaqMan, light-cycle real time quantitative PCR, allele-specific hybridization, restriction fragment length polymorphism, single stranded conformational polymorphisms, denaturing gradient gel electrophoresis, denaturing high-pressure liquid chromatography, oligo-hybridization, tag-arrays, dideoxy method of Sanger sequencing, MALDI-TOF, Pyrosequencing, and reverse transcriptase mediated oligonucleotide extension.
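As an illustration of how a forward and reverse primer pair delimits an amplified product, the following Python sketch locates the predicted amplicon in a template by exact string matching. It uses the two dbSTS ID 808238 primer sequences quoted above; the function names, the exact-match rule (no mismatches tolerated) and the short template in the usage example are assumptions made for illustration, and the template is synthetic rather than actual DCDC2 sequence.

```python
# Primer sequences quoted in the description for dbSTS ID 808238.
FORWARD = "TGTTGAATCCCAGACCACAA"
REVERSE = "ATCCCGATGAAATGAAAAGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_amplicon(template, forward=FORWARD, reverse=REVERSE):
    """Return the top-strand amplicon bounded by the primer pair, or None.

    Assumption: primers must match the template exactly. The reverse
    primer anneals to the top strand's complement, so its reverse
    complement is searched for downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    site = template.find(reverse_site, start + len(forward))
    if site == -1:
        return None
    return template[start:site + len(reverse_site)]


if __name__ == "__main__":
    # Synthetic template: the 10-base insert is invented, not DCDC2 sequence.
    template = "AC" + FORWARD + "GAGAGGAAGG" + reverse_complement(REVERSE) + "GT"
    amplicon = find_amplicon(template)
    print(len(amplicon))
```

Because the STR alleles differ in length, the length of the returned amplicon would vary from allele to allele; the description itself resolves the alleles by sequencing the PCR products.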

In further embodiments of the present invention useful to detect a DCDC2 gene that is correlated with RD in humans, a set of three primers is used: one universal primer that is shared between two alleles, and two primers that are each unique to one allele. For example, the 2,445 bp deletion was genotyped by allele-specific amplification with a combination of three primers in one reaction: a universal or shared forward primer (AGCCTGCCTACCACAGAGAA), a reverse primer for non-deleted chromosomes (GGAACAACCTCACAGAAATGG), and a reverse primer for deleted chromosomes (TGAAACCCCGTCTCTACTGAA). In this embodiment, the deletion fusion fragment is 225 bp and the non-deleted fragment is 550 bp.
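
Because the two products differ in size, a genotype can be read from the fragment sizes observed in a single reaction. The sketch below shows one way such a call might be made. The function name and the sizing tolerance are hypothetical and chosen only for this example; the 225 bp and 550 bp values are those stated above.

```python
# Expected product sizes quoted in the text for the three-primer assay.
DELETED_BP = 225
NON_DELETED_BP = 550


def call_deletion_genotype(fragment_sizes_bp, tolerance_bp=15):
    """Classify a sample from the band sizes seen in one three-primer reaction.

    tolerance_bp is a hypothetical sizing tolerance, not a value from the text.
    """
    has_deleted = any(abs(s - DELETED_BP) <= tolerance_bp for s in fragment_sizes_bp)
    has_intact = any(abs(s - NON_DELETED_BP) <= tolerance_bp for s in fragment_sizes_bp)
    if has_deleted and has_intact:
        return "heterozygous deletion"
    if has_deleted:
        return "homozygous deletion"
    if has_intact:
        return "homozygous non-deleted"
    return "no call"


print(call_deletion_genotype([228, 547]))
```

A sample yielding neither band is reported as "no call" rather than assigned a genotype, since amplification failure cannot be distinguished from absence of template.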

In other embodiments, the invention provides methods of identifying or aiding in identifying an individual at risk for developing RD. In a specific embodiment, such a method comprises assaying a sample obtained from the individual for the presence of a DCDC2 gene that is associated with susceptibility for developing RD in humans. The presence of a DCDC2 gene associated with susceptibility for developing RD indicates that the individual is at risk for developing RD.

In another specific embodiment, a method of identifying or aiding in identifying an individual at risk for developing RD comprises: (a) combining a sample obtained from the individual with a polynucleotide probe that hybridizes, under stringent conditions such as highly stringent conditions, to a DCDC2 gene that is associated with susceptibility for developing RD in humans, but does not hybridize to a DCDC2 gene that is not associated with susceptibility for developing RD in humans; and (b) determining whether hybridization occurs. The occurrence of hybridization indicates that the individual is at risk for developing RD.

In another embodiment, a method of identifying or aiding in identifying an individual at risk for developing RD, comprises: (a) obtaining DCDC2 DNA from the individual; (b) sequencing DCDC2 DNA obtained in (a); and (c) determining whether DCDC2 DNA sequenced in (b) comprises DNA (one or more nucleotides) that is present in a DCDC2 gene that is associated with susceptibility for developing RD but is not present in a DCDC2 gene not associated with susceptibility for developing RD. The presence of DNA (one or more nucleotides) that is present in a DCDC2 gene associated with susceptibility for developing RD but is not present in a DCDC2 gene not associated with susceptibility for developing RD indicates that the individual is at risk for developing RD.

In another embodiment, the invention provides diagnostic kits useful for detecting a DCDC2 gene that is associated with susceptibility for developing RD in a sample from an individual. A diagnostic kit may comprise, for example: (a) at least one container means having disposed therein a polynucleotide probe that hybridizes, under stringent conditions such as highly stringent conditions, to a DCDC2 gene that is associated with susceptibility for developing RD in humans; and (b) a label and/or instructions for the use of the diagnostic kit in the detection of such a gene in a sample.

In another aspect, a diagnostic kit useful for detecting a DCDC2 gene associated with susceptibility for developing RD in humans in a sample from an individual may comprise, for example: (a) at least one container means having disposed therein a polynucleotide primer that hybridizes to one side of DNA (at least one nucleotide) that is present in a DCDC2 gene associated with susceptibility for developing RD but not present in a DCDC2 gene not associated with susceptibility for developing RD; and (b) a label and/or instructions for the use of the diagnostic kit in the detection of a DCDC2 gene in a sample. The diagnostic kit may additionally comprise a second polynucleotide primer that hybridizes, under highly stringent conditions, to the other side of DNA (at least one nucleotide) that is present in a DCDC2 gene associated with susceptibility for developing RD, but not present in a DCDC2 gene not associated with susceptibility for developing RD.

In certain aspects, the invention provides methods and compositions for treating an individual suffering from RD. For example, if a child is assessed, as described herein, and determined to have a variant DCDC2 gene, such as a DCDC2 gene in which there is a deletion (e.g., a 2,445 bp deletion as described herein), which is associated with susceptibility for developing RD, intervention can be more effectively designed. For example, in the case of a young child shown to have the DCDC2 gene in which the deletion described herein occurs, it might be most effective not to stress reading during the first few years of school, but, rather, to emphasize other skills and maintain the self-esteem of the child. Alternatively, if the child does not show the occurrence of the deletion but, instead, is determined to have, for example, an allele shown in Supplementary Table 3 (e.g., allele 5 or 6), a reading program might be a more effective approach. Another approach to be considered is that of determining whether those with certain alleles, such as those in Supplementary Table 3, respond to presently used drugs, such as phenobarbital, anti-epileptic drugs and drugs used to treat ADHD (gabaneurgic drugs, such as Ritalin), or drugs designed specifically for the purpose.

The methods and compositions described herein for treating a subject suffering from RD may be used for the prophylactic treatment of individuals who have been diagnosed or predicted to be at risk for developing RD. In this case, the composition is administered in an amount and dose that is sufficient to delay, slow, or prevent the onset of RD. Alternatively, the methods and compositions described herein may be used for the therapeutic treatment of individuals who suffer from RD. In this case, the composition is administered in an amount and dose that is sufficient to delay or slow the progression of the condition, totally or partially, or in an amount and dose that is sufficient to reverse the condition.

Antibodies, both monoclonal and polyclonal, that bind, specifically or nonspecifically, to the product of a DCDC2 gene correlated with RD are also the subject of the present invention. These may be shown to be useful for diagnostic purposes whereby the abundance of DCDC2 protein is qualitatively and/or quantitatively assessed in tissues or fluids. Typical applications include, but are not limited to, use of anti-DCDC2 antibodies in a radio-immunoassay test, or ELISA test, or western-blot analysis, among others.

BRIEF DESCRIPTION OF THE DRAWINGS

The file of this patent contains at least one drawing executed in color. Copies of this patent with color drawings will be provided by the Patent and Trademark Office upon request and payment of necessary fee.

FIG. 1a-1c: High density SNP QTDT analysis. FIG. 1a: Evidence for transmission disequilibrium for 147 SNPs as −log10 P value, and plotted against position in the Ensembl human genomic reference sequence. The locations of 18 genes encoded in this region are provided. The vertical lines on the genes are cSNPs. The location of marker JA04 is shown above the gene map. The longest distance between SNPs was 332 kb located at the centromeric end of the region. The shortest distance was 14 bp in exon 1 of MRS2L. There were 20 cSNPs within exons of nine genes, and 12 non-synonymous cSNPs in five genes (DCDC2, MRS2L, GPLD1, KIAA0319 and TTRAP). The average minor allele frequency was 0.28 in the RD probands, not including the five novel private SNPs in MRS2L. FIG. 1b: −log10 P value for 33 SNPs (P<0.1) located within DCDC2, MRS2L, and part of GPLD1. FIG. 1c: Further expansion of a 110 kb region within DCDC2. SNPs labeled with an asterisk (*) are associated with RD phenotypes with P<0.005. C449792 is located within the deleted 2,445 bp in intron 2 of DCDC2 and designated by a triangle (Δ). The heavy vertical black lines represent exons in DCDC2. The hatched rectangles above exons 1 and 2, and above exons 3 through 5 highlight the coding regions for the DCX doublecortin peptide domains.

FIG. 2a-b: LD between pairs of SNPs. Color-coded D′ values for pairs of SNPs are plotted with the GOLD program. FIG. 2a: LD between pairs of SNPs in the 1.5 Mb region. The locations of the 147 SNPs in this region are provided in Supplementary Table 1. Gene and haplotype block depictions on the top are relative to marker number and not actual physical distances. Gene and marker locations on the left are proportional to physical distances. FIG. 2b: Triangular excerpt from lower left corner of 2a with higher resolution of SNPs 19 through 49 covering 180 kb and haplotype blocks A through E in DCDC2. Asterisks (*) indicate SNPs with P<0.005. Block A spanned five SNPs (SNP ID: 21, 22, 23, 24, and 25) and 6.5 kb in intron 8. Block B spanned two SNPs (SNP ID: 26 and 27) and 23 kb in intron 7 including the single marker peak at SNP 26 with IQ. Block C spanned eight SNPs (SNP ID: 32, 33, 34, 35, 36, 37, 38, and 39) and 34.2 kb from intron 2 to intron 7, including the highest single marker peak at SNP 33 with DISC. Block D spanned five SNPs (SNP ID: 42, 43, 44, 45, and 46) and 11.5 kb in intron 2. Block E spanned three SNPs (SNP ID: 47, 49 and 50) and 16 kb from intron 1 to intron 2 and the 5-prime untranslated region including the single marker peak at SNP 49 with DISC. Block F spanned five SNPs (SNP ID: 68, 69, 70, 71, 72) and 5.4 kb, from MRS2L to GPLD1, including the non-synonymous cSNP in MRS2L, SNP 69. Block G spanned three SNPs (SNP ID: 117, 118, and 119) and 34.4 kb including the single marker peak at SNP 117 with PTP. Block H spanned three SNPs (SNP ID: 128, 129, and 130) and 13.5 kb including the single marker peak at SNP 130 with DISC.

FIG. 3: Haplotype-TDT analyses. FBAT results for 12 cognitive phenotypes at haplotype blocks A through H. The locations of the haplotype blocks are presented in FIG. 2. The markers comprising each haplotype block are described in the legend for FIG. 2 and Supplementary Tables 1 and 2a. Evidence for transmission disequilibrium is plotted as −log10 P along the y-axis, for each phenotype represented by tick marks along the x-axis from left to right as: IQ, DISC, PTP, TWR, PWR, WR, PD, OCH, PDL, HCH, OC, and PA. Positive or negative values for −log10 P value reflect the direction of the z-score derived by FBAT, so that z-scores below the population mean are plotted as −log10 P value <0, and vice versa. Dashed lines represent P value <0.5. Haplotypes within each block are numbered 1 through 5 and are represented by different colors. The alleles that define each haplotype are presented in Supplementary Table 2a. Frequencies of each haplotype in the CLDRC cohort are presented in the legend. Blocks A through E span DCDC2.

FIG. 4: RT-PCR results for DCDC2, MRS2L, GPLD1, ALDH5A, KIAA0319, TTRAP, THEM2, and GMN, in 17 areas of anonymous donor human brain regions normalized to thalamus (=1.00).

FIG. 5a-c: In utero RNAi against DCDC2. FIG. 5a: Control transfection of a neutral shRNA vector and eGFP shows normal migration after four days. Most neurons have migrated well away from the ventricular surface (Vent) towards the pial surface (Pia). FIG. 5b: Neurons transfected with an shRNA vector directed against DCDC2 migrate abnormally. FIG. 5c: Cumulative probability plot of the migration distances from the ventricular surface of all transfected eGFP+ cells shown in panels a and b in the two transfection conditions. Scale bar in panels a and b is 100 μm.

FIG. 6 shows the results of an electrophoretic mobility shift assay (EMSA) on EMSA3 and EMSA4. Binding of nuclear proteins to these short double-stranded domains changes their electrophoretic mobility, indicating that the short (20 bp) DNA domains likely bind transcription factors. This suggests that the region can enhance gene expression, i.e., that it is an enhancer.

DETAILED DESCRIPTION OF THE INVENTION

Applicants identified a novel deletion, located in intron 2 of DCDC2, which showed non-Mendelian allele transmission errors in RD families. The genotypes were confirmed by sequencing of PCR products derived from unamplified genomic DNA templates for the families. The deletion was determined to be 2,445 bp. It is, overall, 60% AT and contains a 168 bp purine-rich (98% AG) region. Within the 168 bp purine-rich region is a polymorphic compound short tandem repeat (STR), designated dbSTS ID 808238, which comprises 10 alleles that contain variable copy numbers of (GAGAGGAAGGAAA)n and (GGAA)n repeat units. Analysis identified 131 putative transcription factor binding sites distributed within the 168 bp of the purine-rich region, including four copies each of PEA3 (AGGAAA) and NF-ATp (AGGAAAG) sites in repeat unit 1 of dbSTS ID 808238. Described herein is a gene, and alleles thereof, associated with susceptibility for developing RD. Results described herein provide evidence for five linkage disequilibrium blocks (designated A to E) that span small clusters of SNPs in DCDC2 (FIG. 2b). A haplotype in each of blocks A, C, D and E (located in DCDC2) and in each of blocks F and G (located centromeric of DCDC2) was associated with compromised performance in several reading tasks in the context of preserved IQ.
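
For illustration, the repeat units and binding-site motifs named above can be counted in a candidate allele sequence. The sketch below is a minimal example under stated assumptions: the helper functions are hypothetical, and the toy allele (four copies of the first repeat unit followed by three copies of GGAA) is invented for the example and is not one of the 10 reported alleles.

```python
# Repeat units and motifs as written in the text.
REPEAT_UNIT_1 = "GAGAGGAAGGAAA"
REPEAT_UNIT_2 = "GGAA"
PEA3 = "AGGAAA"
NF_ATP = "AGGAAAG"


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlapping matches."""
    return sum(1 for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i))


def longest_tandem_run(seq: str, unit: str) -> int:
    """Largest number of back-to-back copies of unit found anywhere in seq."""
    best = 0
    for start in range(len(seq)):
        copies, pos = 0, start
        while seq.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
    return best


# Hypothetical allele for demonstration only; not a reported DCDC2 allele.
toy_allele = REPEAT_UNIT_1 * 4 + REPEAT_UNIT_2 * 3

print("repeat unit 1 copies:", longest_tandem_run(toy_allele, REPEAT_UNIT_1))
print("GGAA copies:", longest_tandem_run(toy_allele, REPEAT_UNIT_2))
print("PEA3 sites:", count_overlapping(toy_allele, PEA3))
print("NF-ATp sites:", count_overlapping(toy_allele, NF_ATP))
```

Each copy of GAGAGGAAGGAAA ends in AGGAAA, so in this toy sequence the number of PEA3 motifs tracks the copy number of the first repeat unit.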

Of the reported susceptibility loci, the most widely reproduced is DYX2. However, until the work described herein, only limited information was available about this gene. Reported linkage intervals range widely: 13.4 cM (16.9 Mb) spanning D6S422 (pter) through D6S291 (18), 5 cM (4.8 Mb) spanning D6S464 through D6S258 (17), and 1.8 cM (7.9 Mb) spanning D6S299 through D6S273 (16) (physical distances were previously described (14)). Applicants identified a peak of association with a short tandem repeat (STR) marker, JA04 (NCBI ID: G72384), located in the 5-prime untranslated region of KIAA0319, an uncharacterized gene that is expressed in the brain (7, 11). There are at least 19 genes and two pseudogenes encoded within 1.5 Mb of JA04; most of these are expressed in brain (22).

Applicants' previous study of quantitative transmission disequilibrium test (QTDT)-association used 29 informative STR markers spanning the 10 Mb from D6S1950 through D6S478 (7, 11). This resulted in identification of a peak of total association at JA04 (P=0.0007) with orthographic choice, which is a reading performance task that requires the rapid recognition of a target word versus a phonologically identical background foil that is not a word (e.g., rain, rane; sammon, salmon; see Olson et al, 1989 (23)).

Described herein is investigation of the DYX2 gene and corresponding alleles that create susceptibility for developing RD. To confine an association interval to the smallest possible number of candidate genes, Applicants assembled a high-density marker panel of 147 SNPs covering the 1.5 Mb surrounding JA04. This panel was used to assess single-marker and haplotype transmission disequilibrium with quantitative reading performance assessments in RD families. Quantitative expression studies of eight genes included in the panel were correlated with 18 regions of human brain corresponding to the primary functional reading centers.

As described herein, Applicants saturated the region of the genome around JA04, which led to the identification of an intronic polymorphic deletion of DCDC2. Alleles of dbSTS ID 808238 within the region that the deletion spans are in significant disequilibrium with multiple RD traits. RT-PCR data suggest that DCDC2 localizes to the region of the brain where fluent reading occurs and RNAi studies show that down regulating DCDC2 leads to alteration in neuronal migration, again within the brain regions of interest. These results show that DCDC2 is a gene harboring variation that leads to differences in RD.

Described herein is a human gene associated with susceptibility for developing RD, which is useful in identifying or aiding in identifying individuals at risk for developing RD, as well as for diagnosing or aiding in the diagnosis of RD. Also described are methods for identifying or aiding in identifying individuals at risk for developing RD; methods for diagnosing or aiding in the diagnosis of RD; polynucleotides (e.g., probes, primers) useful in the methods; diagnostic kits containing such probes or primers; antibodies that bind wild type DCDC2 or altered DCDC2 gene product (e.g., protein); methods of treating or aiding in treating an individual at risk for or suffering from RD and compositions, such as pharmaceutical compositions, useful for treating an individual at risk for or suffering from RD; methods for determining appropriate treatment for individuals, including response to educational interventions, curricula, written materials, tutoring, specialized classes and pharmaceuticals related to pharmacogenetics.

In specific embodiments, the present invention provides two DNA screening tests of the DCDC2 gene sequence that identify genetic susceptibility for developing dyslexia: a deletion assay and a DCDC2 haplotype assay spanning exons 5 through 8. These assays provide two methods of assessing the DCDC2 gene sequence to identify genetic susceptibility for developing dyslexia. Currently, there are no DNA diagnostic tests that can reliably predict susceptibility to developing reading disability, or for diagnosing reading disability, or for genetic counseling for predicting the likelihood of passing reading disability to present or future offspring. In more than 500 subjects and controls, Applicants found the susceptibility haplotype and deletion in the same person five times, but only on the same chromosome twice. Since the two assays (deletion and haplotype) describe different mutations rarely found together, combining them will identify approximately 30% of dyslexics, as shown in Example 2 (see table entitled “Identification of dyslexics with combined deletion and (AGCTAGA) haplotype assays”).

Identification of DCDC2 as DYX2 permits further interrogations of the DCDC2 gene sequence for mutations that could cause reading disability. This would involve interrogation of the coding regions of the 10 exons in the public domain (RefSeq: NM_016356) and also putative regulatory sequences and unreported exons located within introns, the five-prime untranslated region, and the three-prime untranslated region. Both the deletion assay and haplotype assay, as described herein, can be used as a tool to screen for susceptibility to develop reading disability in the general population, as a diagnostic tool for a specific genetic subtype of reading disability, and for genetic counseling within families. These assays can also be used to test and ultimately contribute to decisions about specific forms of remediation.

Variant DCDC2 Polynucleotide Probes and Primers

In certain embodiments, the invention provides isolated and/or recombinant polynucleotides that specifically detect an alteration in a DCDC2 gene that is associated with susceptibility for developing RD (in a variant DCDC2 gene). Polynucleotide probes of the invention hybridize to the alteration of interest, and the flanking sequence, in a specific manner and thus typically have a sequence which is fully or partially complementary to the sequence of the alteration and the flanking region. A variety of alterations in a DCDC2 gene associated with susceptibility for developing RD may be detected by the polynucleotides described herein. For example, any nucleotide polymorphism of a coding region, exon, exon-intron boundary, signal peptide, 5-prime untranslated region, promoter region, enhancer sequence, 3-prime untranslated region or intron that is associated with RD can be detected. These polymorphisms include, but are not limited to, those that change the amino acid sequence of the proteins encoded by the DCDC2 gene, produce alternative splice products, create truncated products, introduce a premature stop codon, introduce a cryptic exon, alter the degree of expression to a greater or lesser extent, alter tissue specificity of DCDC2 expression, introduce changes in the tertiary structure of the proteins encoded by DCDC2, introduce changes in the binding affinity or specificity of the proteins expressed by DCDC2, or alter the function of the proteins encoded by DCDC2. In a specific embodiment, the variation in the DCDC2 gene results in a deletion of 2,445 bp, as described herein. The deletion is assigned breakpoints 24,433,346 and 24,435,659 (Ensembl). The subject polynucleotides are further understood to include polynucleotides that are variants of the polynucleotides described herein, as long as the variant polynucleotides maintain their ability to specifically detect a variation in the DCDC2 gene that is associated with susceptibility for developing RD. Variant polynucleotides may include, for example, sequences that differ by one or more nucleotide substitutions, additions or deletions.

In certain embodiments, the isolated polynucleotide is a probe that hybridizes, under stringent conditions, such as highly stringent conditions, to an alteration in the DCDC2 gene that is associated with susceptibility for developing RD. As used herein, the term “hybridization” is used in reference to the pairing of complementary nucleic acids. The term “probe” refers to a polynucleotide that is capable of hybridizing to another nucleic acid of interest. The polynucleotide may be naturally occurring, as in a purified restriction digest, or it may be produced synthetically, recombinantly or by nucleic acid amplification (e.g., PCR amplification).

It is well known in the art how to perform hybridization experiments with nucleic acid molecules. The skilled artisan is familiar with the hybridization conditions required in the present invention and understands readily that appropriate stringency conditions which promote DNA hybridization can be varied. Such hybridization conditions are referred to in standard text books such as Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory (1989); and Current Protocols in Molecular Biology, eds. Ausubel et al., John Wiley & Sons: 1992. Preferred in accordance with the present invention are polynucleotides which are capable of hybridizing to a variation in the DCDC2 gene, or a region of a variant DCDC2 gene, under highly stringent conditions. By highly stringent conditions is meant that no cross-hybridization to unrelated polynucleotides occurs.

Nucleic acid hybridization is affected by such conditions as salt concentration, temperature, organic solvents, base composition, length of the complementary strands, and the number of nucleotide base mismatches between the hybridizing nucleic acids, as will readily be appreciated by those skilled in the art. Stringent temperature conditions will generally include temperatures in excess of 30° C., or may be in excess of 37° C. or 45° C. Stringent salt conditions will ordinarily be less than 1000 mM, or may be less than 500 mM or 200 mM. For example, one could perform the hybridization at 6.0× sodium chloride/sodium citrate (SSC) at about 45° C., followed by a wash of 2.0×SSC at 50° C. For example, the salt concentration in the wash step can be selected from a low stringency of about 2.0×SSC at 50° C. to a high stringency of about 0.2×SSC at 50° C. In addition, the temperature in the wash step can be increased from low stringency conditions at room temperature, about 22° C., to high stringency conditions at about 65° C. Both temperature and salt may be varied, or temperature or salt concentration may be held constant while the other variable is changed. In one embodiment, the invention provides nucleic acids which hybridize under low stringency conditions of 6.0×SSC at room temperature followed by a wash at 2.0×SSC at room temperature. The combination of parameters, however, is much more important than the measure of any single parameter. See, e.g., Wetmur and Davidson, 1968. Probe sequences may also hybridize specifically to duplex DNA under certain conditions to form triplex or higher order DNA complexes. The preparation of such probes and suitable hybridization conditions are well known in the art. One method for obtaining DNA encoding the biosynthetic constructs disclosed herein is by assembly of synthetic oligonucleotides produced in a conventional, automated, oligonucleotide synthesizer.
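
To relate the SSC concentrations in the preceding paragraph to the millimolar salt limits it recites, the SSC multiple can be converted to a sodium chloride concentration. The sketch below assumes the conventional 1×SSC recipe (150 mM sodium chloride, 15 mM sodium citrate) and, as a simplification, counts only the sodium chloride. The function names are hypothetical.

```python
# 1x SSC is conventionally 150 mM NaCl plus 15 mM sodium citrate.
# Simplifying assumption: only the NaCl contribution is counted here.
NACL_MM_PER_1X_SSC = 150.0

# Upper salt limits (mM) that the text gives for stringent conditions.
SALT_LIMITS_MM = (1000, 500, 200)

# (label, SSC multiple, temperature in degrees C) for conditions quoted in the text.
CONDITIONS = [
    ("hybridization", 6.0, 45),
    ("low-stringency wash", 2.0, 50),
    ("high-stringency wash", 0.2, 50),
]


def ssc_to_nacl_mm(fold: float) -> float:
    """NaCl concentration in mM for a given multiple of SSC."""
    return fold * NACL_MM_PER_1X_SSC


def limits_met(fold: float) -> list:
    """Which of the recited salt limits the given SSC multiple falls below."""
    nacl = ssc_to_nacl_mm(fold)
    return [limit for limit in SALT_LIMITS_MM if nacl < limit]


for label, fold, temp_c in CONDITIONS:
    print(f"{label}: {fold}x SSC at {temp_c} C, about {ssc_to_nacl_mm(fold):.0f} mM NaCl, "
          f"below limits {limits_met(fold)}")
```

On this simplified accounting, only the 0.2×SSC wash falls below all three recited limits, consistent with its description as the high-stringency condition.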

A polynucleotide probe or primer used in the present invention may be labeled with any “reporter molecule,” so that it is detectable in any detection system, including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, chemical, and luminescent systems. A polynucleotide probe or primer used in the present invention may further include a quencher moiety that, when placed very close to a label (e.g., a fluorescent label), causes there to be little or no signal from the label. It is not intended that the present invention be limited to any particular detection system or label.

In another embodiment, the isolated polynucleotide of the invention is a primer that hybridizes, under highly stringent conditions, adjacent, upstream, or downstream to an alteration in DCDC2 that is associated with susceptibility for developing RD in humans. For example, a polynucleotide primer of the invention can hybridize adjacent, upstream, or downstream to an alteration in the DCDC2 gene that is associated with susceptibility for developing RD. As used herein, the term “primer” refers to a polynucleotide that is capable of acting as a point of initiation of nucleic acid synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides, an inducing agent such as DNA polymerase, and suitable temperature, pH, and electrolyte concentration). Alternatively, the primer may be capable of ligating to a proximal nucleic acid when placed under conditions in which ligation of two unlinked nucleic acids is induced (i.e., in the presence of a proximal nucleic acid, an inducing agent such as DNA ligase, and suitable temperature, pH, and electrolyte concentration). A polynucleotide primer of the invention may be naturally occurring, as in a purified restriction digest, or may be produced synthetically. The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used. Preferably, the primer is an oligodeoxyribonucleotide. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.

In one embodiment, the invention provides a pair of primers that specifically detect an alteration in the DCDC2 gene that is associated with susceptibility for developing RD. In such a case, the first primer hybridizes upstream from the alteration and a second primer hybridizes downstream from the alteration. It is understood that one of the primers hybridizes to one strand of a region of DNA that comprises an alteration in the DCDC2 gene that is associated with susceptibility for developing RD, and the second primer hybridizes to the complementary strand of a region of DNA that comprises an alteration in the DCDC2 gene that is associated with susceptibility for developing RD. As used herein, the term “region of DNA” refers to a sub-chromosomal length of DNA. In further embodiments, the invention provides a set of three primers useful for distinguishing between two alleles of DCDC2, wherein the first allele is a non-deleted DCDC2 gene and the second allele is a deletion in the DCDC2 gene that is associated with susceptibility for RD. The first primer hybridizes to a nucleotide sequence that is common to both alleles, such as a non-allelic nucleotide sequence that is upstream or downstream of the polymorphic sequence in the DCDC2 gene. A second primer specifically hybridizes to a nucleotide sequence that is unique to a first allele (e.g., a non-deleted DCDC2 gene). A third primer specifically hybridizes to a nucleotide sequence that is unique to the second allele (e.g., a deletion in the DCDC2 gene that is associated with susceptibility for RD). The set of three primers results in the amplification of a region of DNA that is dependent on which DCDC2 allele is present in the sample. Alternatively, two primers out of the set may hybridize to a nucleotide sequence that is common to two alleles of the DCDC2 gene, such as non-allelic nucleotide sequences that are upstream and downstream of a polymorphic sequence in the DCDC2 gene, and a third primer specifically hybridizes to one of the two alleles of the DCDC2 gene.
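
The logic of the three-primer set can be illustrated with a small in silico model. In the sketch below, all sequences are invented toy sequences (none is a DCDC2 sequence), and the choice to place the deletion-specific primer across the fusion junction is an assumption made for the example. The model reports which product, if any, each reverse primer would give with the shared forward primer on each allele.

```python
# Toy sequences only; none of these are DCDC2 sequences.
LEFT = "ATGCCGTTAGCACTGA"          # flank shared by both alleles
MIDDLE = "TTTTCCCCGGGGAAAATCTCTC"  # stand-in for the segment lost in the deletion
RIGHT = "GACTTCAGGCTAACGT"         # flank shared by both alleles

NON_DELETED = LEFT + MIDDLE + RIGHT
DELETED = LEFT + RIGHT

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical primers: one shared forward primer and two allele-specific reverse primers.
SHARED_FORWARD = LEFT[:8]
REVERSE_NON_DELETED = reverse_complement(MIDDLE[4:14])       # anneals inside the segment
REVERSE_DELETED = reverse_complement(LEFT[-5:] + RIGHT[:5])  # spans the fusion junction


def amplicon_length(template: str, forward: str, reverse: str):
    """Product length if both primers find their sites in the right order, else None."""
    start = template.find(forward)
    if start == -1:
        return None
    site = template.find(reverse_complement(reverse), start + len(forward))
    if site == -1:
        return None
    return site + len(reverse) - start


def run_three_primer_assay(template: str) -> dict:
    return {
        "non-deleted product": amplicon_length(template, SHARED_FORWARD, REVERSE_NON_DELETED),
        "deleted product": amplicon_length(template, SHARED_FORWARD, REVERSE_DELETED),
    }


print("non-deleted allele:", run_three_primer_assay(NON_DELETED))
print("deleted allele:", run_three_primer_assay(DELETED))
```

Each allele yields exactly one product in this model, and the two products differ in length, which is what allows a single reaction to distinguish the alleles.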

Detection Assays

The polynucleotides of the invention may be used in any assay that permits detection of a variation in the DCDC2 gene that is associated with susceptibility for developing RD. Such methods may encompass, for example, hybridization-mediated, ligation-mediated, or primer extension-mediated methods of detection. Furthermore, any combination of these methods may be utilized in the invention.

In one embodiment, the polynucleotides of the invention detect an alteration in the DCDC2 gene that is associated with susceptibility for developing RD by amplifying a region of DNA that comprises the alteration. Any method of amplification may be used. In one specific embodiment, a region of DNA comprising the alteration is amplified by using polymerase chain reaction (PCR). PCR in particular has become a research tool of major importance with applications in cloning, analysis of genetic expression, DNA sequencing, genetic mapping, drug discovery, and the like, e.g. Arnheim et al (Ann. Rev. Biochem., 61:131-156 (1992)); Gilliland et al, Proc. Natl. Acad. Sci., 87: 2725-2729 (1990); Bevan et al, PCR Methods and Applications, 1: 222-228 (1992); Green et al, PCR Methods and Applications, 1: 77-90 (1991); Blackwell et al, Science, 250: 1104-1110 (1990). PCR refers to the method of Mullis (See e.g., U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,965,188, herein incorporated by reference), which describes a method for increasing the concentration of a region of DNA, in a mixture of genomic DNA, without cloning or purification. For example, the polynucleotide primers of the invention are combined with a DNA mixture (or any polynucleotide sequence that can be amplified with the polynucleotide primers of the invention), wherein the DNA comprises the DCDC2 gene. The mixture also includes the amplification reagents (e.g., deoxyribonucleotide triphosphates, buffer, etc.) necessary for the thermal cycling reaction. According to standard PCR methods, the mixture undergoes a series of denaturation, primer annealing, and polymerase extension steps to amplify the region of DNA that comprises the variation in the DCDC2 gene. The length of the amplified region of DNA is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. For example, hybridization of the primers may occur such that the ends of the primers proximal to the mutation are separated by 1 to 10,000 base pairs (e.g., 10 base pairs (bp), 50 bp, 200 bp, 500 bp, 1,000 bp, 2,500 bp, 5,000 bp, or 10,000 bp).
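
The relationship between primer placement and product length can be stated as simple arithmetic: the product spans both primers plus the bases separating their proximal ends. The sketch below works through the example separations listed above. The 20-nucleotide primer length and the function names are assumptions made for illustration; the 2,445 bp deletion size is taken from the text.

```python
DELETION_BP = 2445  # size of the intron 2 deletion given in the text
PRIMER_LEN = 20     # hypothetical primer length used only for this example


def amplicon_length(forward_len: int, reverse_len: int, gap_bp: int) -> int:
    """Product length: both primers plus the bases between their proximal ends."""
    return forward_len + gap_bp + reverse_len


def product_sizes_flanking_deletion(forward_len: int, reverse_len: int, gap_bp: int):
    """(non-deleted, deleted) product sizes for primers that flank the whole deletion.

    gap_bp is the separation measured on the non-deleted allele.
    """
    if gap_bp < DELETION_BP:
        raise ValueError("primers must flank the entire deleted segment")
    intact = amplicon_length(forward_len, reverse_len, gap_bp)
    return intact, intact - DELETION_BP


for gap in (10, 50, 200, 500, 1000, 2500, 5000, 10000):
    print(gap, "bp separation ->", amplicon_length(PRIMER_LEN, PRIMER_LEN, gap), "bp product")

print(product_sizes_flanking_deletion(PRIMER_LEN, PRIMER_LEN, 2500))
```

With a single flanking pair, the deleted allele gives a product shorter than the non-deleted product by the size of the deletion, so the two can be resolved by size.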

The invention described herein utilizes standard instrumentation for the amplification and detection of amplified DNA. For example, a wide variety of instrumentation has been developed for carrying out nucleic acid amplifications, particularly PCR, e.g. Johnson et al, U.S. Pat. No. 5,038,852 (computer-controlled thermal cycler); Wittwer et al, Nucleic Acids Research, 17: 4353-4357 (1989) (capillary tube PCR); Hallsby, U.S. Pat. No. 5,187,084 (air-based temperature control); Garner et al, Biotechniques, 14: 112-115 (1993) (high-throughput PCR in 864-well plates); Wilding et al, International application No. PCT/US93/04039 (PCR in micro-machined structures); Schnipelsky et al, European patent application No. 90301061.9 (publ. No. 0381501 A2) (disposable, single use PCR device), and the like. In certain embodiments, the invention described herein utilizes real-time PCR or other methods known in the art such as the Taqman assay.

The amplified DNA may be analyzed by several different methods. Such methods include sequencing of the DNA, determining the size of the fragment by electrophoresis or chromatography, hybridization with a labeled probe, hybridization to a DNA array or microarray, incorporation of biotinylated primers followed by avidin-enzyme conjugate detection, or incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment. In one embodiment, the amplified DNA is analyzed by gel electrophoresis. Methods of gel electrophoresis are well known in the art. See, for example, Current Protocols in Molecular Biology, eds. Ausubel et al., John Wiley & Sons: 1992. The amplified DNA can be visualized, for example, by fluorescent or radioactive means. The DNA may also be transferred to a solid support such as a nitrocellulose membrane and subjected to Southern blotting following gel electrophoresis. In one aspect, the DNA is analyzed by electrophoresis, exposed to ethidium bromide, and visualized under ultraviolet light.

In one aspect, the alteration in the DCDC2 gene that is associated with susceptibility for developing RD is a deletion. The deletion may be detected using any of the polynucleotide primers described herein. For example, a set of three primers may be used to distinguish between an allele of the DCDC2 gene that comprises a deletion and a wild type DCDC2 gene. The set of three primers results in the amplification of a region of DNA whose size depends on which DCDC2 allele is present in the sample.

In another embodiment, the amplified DNA is analyzed by DNA sequencing. DNA sequence determination may be performed by standard methods such as dideoxy chain termination technology and gel-electrophoresis, or by other methods such as by pyrosequencing (Biotage AB, Uppsala, Sweden). The nucleic acid sequence of the amplified DNA can be compared to the nucleic acid sequence of wild type DNA to identify whether a variation in the DCDC2 gene that is associated with susceptibility for developing RD is present.

In another embodiment, the polynucleotides of the invention detect an alteration in the DCDC2 gene that is associated with susceptibility for developing RD by hybridization-mediated methods. In one aspect, a polynucleotide probe hybridizes to an alteration in the DCDC2 gene that is associated with susceptibility for developing RD, and to flanking nucleotides, but not to a wild type DCDC2 gene. The polynucleotide probe may comprise nucleotides that are fluorescently, radioactively, or chemically labeled to facilitate detection of hybridization. Hybridization may be performed and detected by standard methods known in the art, such as by Northern blotting, Southern blotting, fluorescent in situ hybridization (FISH), or by hybridization to polynucleotides on a solid support (e.g., DNA arrays, microarrays, cDNA arrays, or Affymetrix chips). In one specific aspect, the polynucleotide probe is used to hybridize genomic DNA by FISH. FISH can be used, for example, in metaphase cells, to detect a deletion in genomic DNA. Genomic DNA is denatured to separate the complementary strands within the DNA double helix structure. The polynucleotide probe of the invention is then added to the denatured genomic DNA. If an alteration in the DCDC2 gene that is associated with susceptibility for developing RD is present, the probe will hybridize to the genomic DNA. The probe signal (e.g., fluorescence) can then be detected through a fluorescent microscope for the presence or absence of signal. The absence of signal, therefore, indicates the absence of an alteration in the DCDC2 gene that is associated with susceptibility for developing RD. Presence of signal can also be used, in another embodiment, to determine the absence of an alteration in the DCDC2 gene.

In another embodiment, the polynucleotides of the invention detect an alteration in the DCDC2 gene that is associated with susceptibility for developing RD by primer extension with DNA polymerase. In one aspect, a polynucleotide primer of the invention hybridizes immediately adjacent to the alteration. A single base sequencing reaction using labeled dideoxynucleotide terminators may be used to detect the alteration. The presence of an alteration will result in the incorporation of the labeled terminator, whereas the absence of an alteration will not result in the incorporation of the terminator. In another aspect, a polynucleotide primer of the invention hybridizes to an alteration in the DCDC2 gene that is associated with the susceptibility for developing RD. The primer, or a portion thereof, will not hybridize to a wild type DCDC2 gene. The presence of an alteration will result in primer extension, whereas the absence of an alteration will not result in primer extension. The primers and/or nucleotides may further include fluorescent, radioactive, or chemical probes. A primer labeled by primer extension may be detected by measuring the intensity of the extension product, such as by gel electrophoresis, mass spectrometry, or any other method for detecting fluorescent, radioactive, or chemical labels.

In another embodiment, the polynucleotides of the invention detect an alteration in the DCDC2 gene that is associated with susceptibility for developing RD by ligation. In one aspect, a polynucleotide primer of the invention hybridizes to a variation in the DCDC2 gene that is associated with susceptibility for developing RD. The primer, or a portion thereof, will not hybridize to a wild type DCDC2 gene. A second polynucleotide that hybridizes to a region of the DCDC2 gene immediately adjacent to the first primer is also provided. One, or both, of the polynucleotide primers may be fluorescently, radioactively, or chemically labeled. Ligation of the two polynucleotide primers will occur in the presence of DNA ligase if an alteration in the DCDC2 gene that is associated with susceptibility for developing RD is present. Ligation may be detected by gel electrophoresis, mass spectrometry, or by measuring the intensity of fluorescent, radioactive, or chemical labels.

EXAMPLES

The following examples are for illustrative purposes and are not intended to be limiting in any way.

Example 1 Deletion of DCDC2 Gene Sequence

Through marker saturation studies Applicants identified a 2445 base deletion in intron 2 of DCDC2 (24,433,346 through 24,435,659 bp, in the ENSEMBL database version 33, September 2005). ORF Finder (NCBI) identifies two putative open reading frames (potential exons) within the deleted genomic sequence corresponding with 53 amino acids of putative open reading frame: MLIFLSPRGPHNLLICCNIKTDHRIKMANVSERFYLRTEEKCEEVDIVLSHS.

Deletions of the 2445 bases of genomic DNA from this region would also delete these amino acids. Applicants developed a PCR assay, called “DCDC2 24,433,346 through 24,435,659 Deletion Assay” (described in detail below) that specifically and unambiguously identifies persons with this deletion. In their study population of subjects recruited because they have dyslexia, this deletion is present in 17 of 108 severe dyslexics (15.7%, Table immediately below). The control population reflects the frequency of dyslexia in the general population, reportedly 5 to 15%. The deletion is present in 3 of 42 controls (7.1%). The odds of developing dyslexia in a person with this deletion are twice that of a person without the deletion.

TABLE
Allele and population frequencies of the DCDC2 24,433,346-24,435,659 deletion

Measure                Controls(1)    Dyslexia         Severe Dyslexia(2)
Allele Frequency       .036 (3/84)    .073 (28/382)    .079 (17/216)
Population Frequency   .071 (3/42)    .147 (28/191)    .157 (17/108)

(1) Controls not tested and not selected for reading disability. The frequency of dyslexia in controls reflects the 5-15% frequency reported in the general population.
(2) Dyslexics that perform less than two standard deviations (z < 2.0) on at least one of five primary reading disability performance tests: discriminant score, phonemic awareness, phonological decoding, word recognition, or orthographic coding.
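
The frequencies and the "twice the odds" statement above can be recomputed from the carrier counts in the table. The following is a minimal sketch, not Applicants' analysis: the function names are illustrative, and the odds ratio is the plain cross-product ratio with no confidence interval, which matters because only three control carriers were observed.

```python
def frequency(count, total):
    """Proportion of carriers (or carrier alleles) in a group."""
    return count / total


def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Cross-product odds ratio for a 2x2 carrier-by-status table."""
    case_noncarriers = case_total - case_carriers
    control_noncarriers = control_total - control_carriers
    return (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )


# Carrier counts taken from the table: deletion carriers / subjects.
control_freq = frequency(3, 42)      # population frequency in controls
severe_freq = frequency(17, 108)     # population frequency in severe dyslexics
severe_allele_freq = frequency(17, 216)  # 216 = 2 chromosomes x 108 subjects

severe_or = odds_ratio(17, 108, 3, 42)
all_dyslexia_or = odds_ratio(28, 191, 3, 42)

print(round(control_freq, 3), round(severe_freq, 3), round(severe_allele_freq, 3))
print(round(severe_or, 2), round(all_dyslexia_or, 2))
```

Both odds ratios come out slightly above 2, in line with the statement that the odds are about twice those of a person without the deletion.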

DCDC2 24,433,346 through 24,435,659 Deletion Assay

The PCR assay consists of three primers:

Universal Forward Primer: AGCCTGCCTACCACAGAGAA
Deletion Reverse Primer: TGAAACCCCGTCTCTACTGAA
Non-Deletion Reverse Primer: GGAACAACCTCACAGAAATGG

PCR Mixture:

Shared Forward Primer       0.3 μM
Deletion Reverse Primer     0.2 μM
Control Reverse Primer      0.2 μM
Genomic DNA Template        5 ng
10X Taq Polymerase Buffer   1/10 volume
Taq Polymerase              1 Unit

PCR Conditions:

95° C.      15 min   Denature
95° C.      30 sec   Touchdown PCR for 10 cycles
65-57° C.   30 sec   drop 1° C. per cycle
72° C.      60 sec
95° C.      30 sec   30 cycles
56° C.      30 sec
72° C.      60 sec
72° C.      5 min    Extension
4° C.                Storage

Gel Conditions: 1.5% agarose gel

Band Sizes:

486 bp: no deletion
176 bp: 2445 base deletion
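
The two band sizes follow from the primer positions and breakpoints annotated in the Gruenlab reference sequence of Example 10. The sketch below recomputes them under those coordinates; the genotype-calling helper and its size tolerance are hypothetical illustrations and are not part of the assay as described. Because the non-deletion reverse primer anneals inside the deleted interval, it cannot yield a product from a deletion allele.

```python
# Coordinates in the Gruenlab reference sequence (Example 10), 1-based inclusive.
DEL_F_START = 1       # 5' end of the forward primer (bases 1-20)
DEL_C_END = 486       # 5' end of the non-deletion reverse primer (bases 486-466)
DEL_R_END = 2621      # 5' end of the deletion reverse primer (bases 2,621-2,601)
FIRST_DELETED = 88    # breakpoint lies between bases 87 and 88
LAST_DELETED = 2532   # breakpoint lies between bases 2,532 and 2,533


def amplicon_length(fwd_start, rev_end, deleted=0):
    """Length of the product spanning fwd_start..rev_end, minus deleted bases."""
    return rev_end - fwd_start + 1 - deleted


deletion_length = LAST_DELETED - FIRST_DELETED + 1
non_deletion_band = amplicon_length(DEL_F_START, DEL_C_END)
deletion_band = amplicon_length(DEL_F_START, DEL_R_END, deleted=deletion_length)


def call_genotype(observed_bands, tolerance=15):
    """Classify a gel lane from its band sizes in bp.

    The tolerance is a hypothetical allowance for sizing error on an
    agarose gel; it is not a value taken from the source.
    """
    has_wild_type = any(abs(b - non_deletion_band) <= tolerance for b in observed_bands)
    has_deletion = any(abs(b - deletion_band) <= tolerance for b in observed_bands)
    if has_wild_type and has_deletion:
        return "heterozygous deletion"
    if has_deletion:
        return "homozygous deletion"
    if has_wild_type:
        return "no deletion"
    return "no call"


print(deletion_length, non_deletion_band, deletion_band)  # 2445 486 176
```

Adding the 18-base M13-Forward and 22-base M13-Reverse tails used for sequencing gives the 216 and 526 base amplicons listed in Example 10.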

Example 2 A Haplotype Spanning Exons 5 Through 8 Causes Dyslexia

Applicants also developed a haplotype consisting of seven markers spanning DCDC2 that is associated with dyslexia:

DCDC2 Haplotype Assay Spanning Exons 5 Through 8

Marker         Nucleotide   Origin   Location in Ensembl Database   Location in Celera Database   Location in DCDC2
rs2296539      A            NCBI     24,397,408                     25,522,804                    Intron 5
rs2328208      G            NCBI     24,393,548                     25,412,218                    Intron 5
rs807722       C            NCBI     24,387,896                     25,513,291                    Intron 6
C_7454704_10   T            Celera   24,386,848                     25,512,242                    Intron 7
rs807700       A            NCBI     24,382,384                     25,402,536                    Intron 7
C_7454731_10   G            Celera   24,381,770                     25,507,166                    Intron 7
rs793857       A            NCBI     24,353,401                     25,373,988                    Intron 7
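
Read in table order, the seven listed nucleotides spell the haplotype AGCTAGA. The following sketch shows one way a subject's genotypes could be checked against it. It assumes the alleles have already been phased onto each chromosome, which the text above does not describe; the function names and the example subject are hypothetical.

```python
# Marker order and risk alleles as listed in the haplotype table.
RISK_HAPLOTYPE = (
    ("rs2296539", "A"),
    ("rs2328208", "G"),
    ("rs807722", "C"),
    ("C_7454704_10", "T"),
    ("rs807700", "A"),
    ("C_7454731_10", "G"),
    ("rs793857", "A"),
)

RISK_STRING = "".join(allele for _, allele in RISK_HAPLOTYPE)


def haplotype_string(phased_alleles):
    """Concatenate one chromosome's alleles in table order.

    phased_alleles maps marker name -> base for a single chromosome.
    """
    return "".join(phased_alleles[marker] for marker, _ in RISK_HAPLOTYPE)


def carries_risk_haplotype(chromosome_a, chromosome_b):
    """True if either phased chromosome matches the risk haplotype exactly."""
    return RISK_STRING in (
        haplotype_string(chromosome_a),
        haplotype_string(chromosome_b),
    )


# Hypothetical subject: one chromosome carries the risk haplotype, and the
# other differs at rs807722 (G in place of C).
chrom_1 = dict(RISK_HAPLOTYPE)
chrom_2 = dict(RISK_HAPLOTYPE, rs807722="G")

print(RISK_STRING)
print(carries_risk_haplotype(chrom_1, chrom_2))
```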

In the study population of subjects recruited because they have dyslexia, this haplotype is present in 15 of 63 severe dyslexics (23.8%, Table immediately below). The control population reflects the frequency of dyslexia in the general population, reportedly 5 to 15%. The haplotype is present in 3 of 36 controls (8.3%). The odds of developing dyslexia in a person with this haplotype are more than twice that of a person without the haplotype.

TABLE
Haplotype and population frequencies of the DCDC2 exon 5-8 haplotype

Measure                Controls(1)    Dyslexia         Severe Dyslexia(2)
Haplotype Frequency    .039 (3/77)    .112 (55/491)    .118 (15/127)
Population Frequency   .083 (3/36)    .233 (55/236)    .238 (15/63)

(1) Controls not tested and not selected for reading disability. The frequency of dyslexia in controls reflects the 5-15% frequency reported in the general population.
(2) Severe Dyslexics perform less than two standard deviations (z < 2.0) on at least one of five primary reading disability performance tests: discriminant score, phonemic awareness, phonological decoding, word recognition, or orthographic coding.

The haplotype assay consists of five custom markers from the NCBI dbSNP database (rs2296539, rs2328208, rs807722, rs807700, rs793857) made exclusively for Applicants (Assay-by-Design®, ABI), and two proprietary markers (C_7454704_10 and C_7454731_10, Assay-on-Demand®, ABI/Celera).

Custom Markers:

rs2296539
rs2296539_Forward
AGATCCCAAAGTGTCCTATTTGCAT
rs2296539_Reverse
GAAGGAAATTTGTTTTTAACTCAGTCTGGAA
Allele specified primer 1
ACATTTGGAAATGATTTT
Allele specified primer 2
CATTTGGAAGTGATTTT
rs2328208
rs2328208_Forward
TTGCTTTCTATGGGATGCAAATATACCTT
rs2328208_Reverse
GAAAAACACATTTAGATAGGTGTGTCAGG
Allele specified primer 1
CATGGAGGAAGTGACGTT
Allele specified primer 2
CATGGAGGAAATGACGTT
rs807722
rs807722_Forward
CAGTAGCTCTCAGCCATGTATCTG
rs807722_Reverse
GTGAGAGGCTGCAGGTAGTG
Allele specified primer 1
TCTAAAACTTGCATTCTTT
Allele specified primer 2
CTAAAACTTGGATTCTTT
rs807700
rs807700_Forward
CCTTGTGAACGCAAGAAGTATAGTG
rs807700_Reverse
TCAAAGAGACCAGGCCATTTTCT
Allele specified primer 1
CCCTTTCAGTATTCC
Allele specified primer 2
CCCTTTCAATATTCC
rs793857
rs793857_Forward
CCCTTTCTTTTGAGCTCAGCTATGA
rs793857_Reverse
CTTGGCGACAGAGGGAAACT
Allele specified primer 1
CCATCTCAGAAAGTTT
Allele specified primer 2
CCATCTCAAAAAGTTT

PCR Mixture:

40X Assay mix of primers    0.1 μl
Genomic DNA Template        1.6 ng
2X ABI Universal PCR Mix    1.0 μl
Water                       0.1 μl

PCR Conditions:

95° C.   10 min   Denature
92° C.   15 sec
60° C.   60 sec   60 cycles
4° C.             Storage

Allele Resolution:

ABI Prism 7900HT Sequence Detection System

ABI Prism 7900HT Standard Protocol for ABI TaqMan Markers

TABLE
Identification of dyslexics with combined deletion and (AGCTAGA) haplotype assays.

Measure                Controls(1)    Dyslexia(2)      Severe Dyslexia(3)
Population Frequency   .119 (5/42)    .331 (78/236)    .296 (32/108)

(1) Controls not tested and not selected for reading disability. The frequency of dyslexia in controls reflects the 5-15% frequency reported in the general population.
(2) The deletion and associated haplotype were found together in five dyslexic subjects, twice on the same chromosome.
(3) Severe Dyslexics perform less than two standard deviations (z < 2.0) on at least one of five primary reading disability performance tests: discriminant score, phonemic awareness, phonological decoding, word recognition, or orthographic coding.

Example 3 Single-Marker Transmission Disequilibrium

Applicants genotyped a total of 147 SNPs distributed through the 1.5 Mb region surrounding JA04 in 153 nuclear RD families recruited by the Colorado Learning Disabilities Research Center (CLDRC). The strongest QTDT peak was with the DISC phenotype and SNP 33 located in intron 6 of DCDC2 (P=0.0003). Table 1 and FIG. 1 provide the results from a selected subset of the most significant QTDT scores. Results for the entire SNP panel can be found in Supplementary Table 1.

Five SNPs yielded a P value of ≦0.01; two of these were located in DCDC2. Thirty-seven SNPs yielded a P value of ≦0.05; eleven of these were located in DCDC2. Of the 31 SNPs distributed through DCDC2 (average minor allele frequency=0.24), ten were associated with the DISC phenotype (P≦0.05).

Example 4 Intermarker Linkage Disequilibrium

Applicants constructed an intermarker linkage disequilibrium map (FIG. 2a) spanning the 1.5 Mb with graphical overview of linkage disequilibrium (GOLD) and Haploview. There was evidence for five linkage disequilibrium blocks (A to E) spanning small clusters of SNPs in DCDC2 (FIG. 2b). There were three blocks (F to H) centromeric of DCDC2 that corresponded to single marker QTDT peaks.

Example 5 Haplotype Transmission Disequilibrium

All five haplotype blocks in DCDC2 showed significant transmission disequilibrium with reading performance tasks; three of these, A, B, and D, did not contain single marker QTDT peaks. FIG. 3 is a graphic presentation of the haplotype transmission disequilibrium data, which is also provided in tabular form in Supplementary Tables 2a and 2b. A haplotype in each of blocks A, C, D, E, F, and G was associated with compromised performance in several reading tasks in the context of preserved IQ. Haplotype blocks A, C, D, and E were located in DCDC2. There were no haplotypes in block H that showed significant association with any of the cognitive phenotypes.

Example 6 Identification of a Novel Deletion in DCDC2

C449792, located in intron 2 of DCDC2 (FIG. 1), showed non-Mendelian allele transmission errors in ten RD families. To ensure that this was not an artifact of whole genome amplification, Applicants confirmed these initial genotypes by sequencing PCR products derived from unamplified genomic DNA templates for all ten families. Allele transmission from the two flanking SNPs, 41 and 42, was typically Mendelian and initially defined the outer boundaries of a 17 kb region with loss-of-heterozygosity (LOH). To identify the extent of the deletion, Applicants interrogated for LOH by sequencing SNPs within the 17 kb genomic region in RD trios. Additional flanking SNPs limited the deletion to 3,848 bp. Finally, Applicants amplified and sequenced a 1,200 bp fusion fragment in subjects with LOH, which assigned the breakpoints to 24,433,346 and 24,435,659 (ENSEMBL database version 33, September 2005, FIG. 2). Primer walking was used to sequence the non-deleted fragment from the same subjects with LOH. These results confined the deletion to 2,445 bp. Overall, the deletion was 60% AT, and contained a 168 bp purine-rich (98% AG) region.
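
Composition figures such as "60% AT" and "98% AG" are simple base-counting statistics. The sketch below shows how they could be computed for any stretch of sequence. It is an illustration only: the helper names are hypothetical, ambiguity codes such as N or M are excluded from the denominator by assumption, and the example segment is the 40-base line starting at base 1,121 of the Gruenlab reference sequence in Example 10, not the full deletion.

```python
def _unambiguous(seq):
    """Keep only A, C, G, T (case-insensitive); drop spaces and ambiguity codes."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return bases


def at_fraction(seq):
    """Fraction of unambiguous bases that are A or T."""
    bases = _unambiguous(seq)
    return sum(b in "AT" for b in bases) / len(bases)


def purine_fraction(seq):
    """Fraction of unambiguous bases that are purines (A or G)."""
    bases = _unambiguous(seq)
    return sum(b in "AG" for b in bases) / len(bases)


# Bases 1,121-1,160 of the Gruenlab reference sequence (Example 10).
SEGMENT = "GAAGGAAGGA AGGAAGGAAG GAAGAAAGGA AGGAAGGAAA"

print(purine_fraction(SEGMENT))  # 1.0, every base in this line is A or G
print(round(at_fraction(SEGMENT), 2))
```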

Example 7 Identification of a Compound STR in the Deletion in DCDC2

Within the 168 bp purine-rich region was a polymorphic compound STR (dbSTS ID 808238) comprised of 11 alleles containing variable copy numbers of (GAGAGGAAGGAAA)n and (GGAA)n repeat units (Supplementary Table 3). In the CLDRC cohort, some alleles were present only in the parents (five) and others—including the deletion—occurred too infrequently in probands to compute transmission disequilibrium. By combining the deletion and ten minor alleles, QTDT showed a peak of transmission disequilibrium with homonym choice (HCH; P=0.00002, Table 2). TESS (24) comparison to the TRANSFAC database identified 131 putative transcription factor binding sites distributed through the 168 bp of the purine-rich region, including four copies each of PEA3 (AGGAAA) and NF-ATp (AGGAAAG) sites in repeat unit 1 of dbSTS ID 808238. Both transcription factors are expressed in mouse brain. PEA3 is associated with sexual function and peripheral motor neuron arborization (25). NF-ATp mediates rapid embryonic axon extension necessary for forming neuronal connections (26), which would complement the putative function of the doublecortin peptide domains in DCDC2.
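
The PEA3 (AGGAAA) and NF-ATp (AGGAAAG) sites named above are short exact strings, so their occurrences in a sequence can be located with a simple overlapping search. The sketch below is an assumption-laden illustration, not a reimplementation of TESS or TRANSFAC, which score matches against curated binding-site models; it searches the given strand only and will not reproduce the 131 putative sites reported. The example input is the EMSA2 probe sequence from Example 11.

```python
import re

MOTIFS = {
    "PEA3": "AGGAAA",
    "NF-ATp": "AGGAAAG",
}


def find_motif(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) exact match."""
    pattern = "(?=" + re.escape(motif.upper()) + ")"
    return [m.start() for m in re.finditer(pattern, seq.upper())]


def scan(seq):
    """Map each motif name to its list of match positions in seq."""
    return {name: find_motif(seq, motif) for name, motif in MOTIFS.items()}


EMSA2 = "GAGAGGAAGGAAAGAGAGGA"
print(scan(EMSA2))
```

A lookahead pattern is used so that overlapping repeats, which are common in a (GGAA)n tract, are all reported instead of only non-overlapping ones.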

Example 8 Assessment of Expression Levels of Genes in Human Brain Using Quantitative Real Time RT-PCR

FIG. 4 shows the expression levels of eight genes in 17 regions of human brain normalized to thalamus by quantitative real time RT-PCR; thalamus is a region of the brain that has not consistently been implicated in reading. The most variably expressed genes were KIAA0319, MRS2L, and DCDC2. KIAA0319 was most highly expressed in the superior parietal cortex, primary visual cortex, and occipital cortex. MRS2L was most highly expressed in the superior temporal cortex, hypothalamus, and amygdala. DCDC2 was most highly expressed in the entorhinal cortex, inferior temporal cortex, medial temporal cortex, hypothalamus, amygdala, and hippocampus. Expression of TTRAP, THEM2, Geminin, and ALDH5A in the 17 regions of the brain did not differ significantly from thalamus.

Example 9 Determination of a Role for DCDC2

In utero RNAi was used to test for a functional role of DCDC2 in neuronal migration. Co-transfection of plasmid vectors encoding shRNA targeted against DCDC2 sequence in developing neocortex or control scrambled sequence along with an eGFP expression plasmid was performed at gestational day 14 in the rat. This transfection method initially labeled approximately 1% of cells at the surface of the ventricles where new neurons undergo their terminal mitoses. Cells migrate from this surface to the pial surface in four to six days. Migration progress was assessed four days following transfection for the two conditions. As shown in FIG. 5, cells transfected with control plasmids progressed significantly further away from the ventricular surface and towards the pial surface than did cells transfected with a vector targeted against DCDC2. The mean migration distance in matched littermate controls was 606±178 μm and in the DCDC2 shRNA transfection group the mean migration distance was 367±135 μm (n=4, p<0.01).

Example 10 Annotation of Deletion Sequence

GRUENLAB REFERENCE SEQUENCE
SOURCE: Gruenlab reference sequence compiled from ABI files generated Jan. 10, 2005
through Jan. 21, 2005, from a single sub-clone of genomic DNA from a single subject,
NA10848 (CEPH Family 1332). NA10848 DNA was purchased from the Coriell Institute
(Camden, NJ).
ANNOTATIONS:
Location: Intron 2 of DCDC2 (MIM:605755)
Length: 2,837 bases in length
Direction: pter to cen on 6p
Base #1 corresponds to base number 21,571 in clone RP11-95P3 in the NCBI database
(http://www.ncbi.nlm.nih.gov/).
Base #1 corresponds to base number 24,433,259 in ENSEMBL v33-September 2005
(http://www.ensembl.org/Multi/blastview).
Deletion breakpoints: between base #87-88 (pter)
between base #2,532-2,533 (cen)
Flanking sequence: base 1 through base 87 (pter)
base 2,533 through base 2,837 (cen)
Deletion range: 2,445 bases
Deletion primers:
Del_F primer: 5′- tgt aaa acg acg gcc agt AGCCTGCCTACCACAGAGAA -3′
base #1-20 (5-prime to 3-prime)
(lower case sequence is M13-Forward)
Del_R primer: 5′- tca cac agg aaa cag cta tga c TGAAACCCCGTCTCTACTGAA -3′
base #2,621-2,601 (5-prime to 3-prime)
(lower case sequence is M13-Reverse)
Del_C primer: 5′- tca cac agg aaa cag cta tga c GGAACAACCTCACAGAAATGG -3′
base #486-466 (5-prime to 3-prime)
(lower case sequence is M13-Reverse)
Deletion amplicon, Del_F through Del_R
size: 216 bases (including the M13F and M13R ends)
Non-deletion amplicon, Del_F through Del_C
size: 526 bases (including the M13F and M13R ends)
Purine-rich region: 170 bp (1,027 through 1,196)
Compound Short Tandem Repeat, dbSTS ID 808238 (base 1,094 through 1,191)
Repeat Unit 1: (GAGAGGAAGGAAA)n (start base 1,094)
Repeat Unit 2: (GGAA)n (start base 1,120)
SNP1: DelGAAA (start base 1,144)
Repeat Unit 3: (GGAA)n (start base 1,148)
Repeat Unit 4: (GGAA)n (start base 1,168)
Repeat Unit 5: (GGGA)n (start base 1,184)
Comparison of Gruenlab Reference to NCBI sequence:
Position   311   331   379   719   964   1430   1572   1823
Gruenlab   C     M     N     C     T     --     A      -
NCBI       T     A     -     T     -     AT     G      A
Position   2042   2221   2401   2405   2436
Gruenlab   C      T      G      G      A
NCBI       A      C      C      T      G
Gruenlab Reference Sequence (1-2,837):
Del_F
1 AGCCTGCCTA CCACAGAGAA TGCCTTGGAA TCAGAGGTTC
41 CCTGAAGAGA CCCTCTCCTC TTAGAATAAT CCAAAACCAG
81 AATCTCCAGA GCCCCGTGGT CAAAACTAAA ACGTTCCATC
121 TAGGAGTGAG AGAGCACGAT ATCTACTTCC TCACACTTCT
161 CCTCGGTTCT CAAATAAAAG CGCTCACTTA CATTTGCCAT
201 CTTTATTCTG TGATCCGTTT TTATGTTACA GCAAATAAGC
241 AAATTATGAG GTCCTCTGGG CGAAAGGAAA ATCAGCATGG
281 AATGTAAGTT ATTGTGCCAT CTAGAGAAAA CGTGAGAGGC
321 TGGAaGCCTC MATCAACTGT CTTCCTTGAA GAATAACCTA
361 GATCTTGGCT CCCACTGGnC AAAGATGAGT GGGGGTTATT
401 GTCTTCTCTA AGAAACTAAA cGTCCCTCAC ATGCTTGAAG
<---------------
441 ATGTCGCAAG GGAGACCTGA TGGCCCCATT TCTGTGAGGT
-Del_C
481 TGTTCCTCAA AGAAGAATCA AAGATTTCAG TCACATTAGC
521 ATCATCATGT TCTCTTAGTC CAGAATTTTT CAGCAAACAT
561 ATTCCACAAA ATTTTCTGCA AGTTCAGGGT ACATATAGCA
601 GGTGTAGTGG ATTTTTGTTA TGTTTTAATA TAACATACTA
641 GAGAAAATCC AGAACATtCT tCTCCCTCTC TCTTCTTCAT
681 CACATTCACA TCTCAGCCTA TAGAGCAGAG TTTATTCCCT
721 AGTATAATAT CAAGGCCTGT TTTAAAAATA TATATATTAT
761 ACATGTGAAT GAGAAATGAG TCACATTTAT TTTACCATGT
801 CTCTGGTTTT TAAATAAAAT TAAAAGGTTG GGAAACTGTT
841 TTTCAGTGTC ACAACCTCTC TGTTCTTACT ACCATAATAT
881 TTACTTGATA TTATTTCAGT TCTTCCTTCC CCACACCCAT
921 GTTGAATCCC AGACCACAAA CTACTGTAAT TTTTCTTTAT
961 TATTCaACAT ATGTAGGAAT GCAGAATTAA AATTATTGAT
1001 CAAGTTTCAT GCAAAGTTCC AAAACCAAAG AAAGAAAGAA
1041 AGGAAGAGAG GAAAAAAGAG AGAAAGACAG GGAGAAAAAT
[RepeatUnit1]
1081 AAAAAGAAGG AAAGAGAGGA AGGAAAGAGA GGAAGCAAAG
[RepeatUnit2] [SNP1] [RepeatUnit3]
1121 GAAGGAAGGA AGGAAGGAAG GAAGAAAGGA AGGAAGGAAA
[RepeatUnit4] [RepeatUnit5]
1161 CAATGAAGGA AGGAAGGAAG GAAGGGAGGG AGGAAATCAG
1201 ACCTTTTCAT TTCATCGGGA TACCTACCAC CTCTCTTTTT
1241 GACTCAAGCT AATGTTAAAT GTTAAAAAGA GTCTCCATTT
1281 TTAGAATACA CCAACCAATA GAAGGACCCC CCCATGCCCT
1321 AGAGCTCCCT GGATAGTAGA AAATTAGTCA AAAATTTAAA
1361 ATTTACTATA GATGATCCAT AAAATTAAAA ATCATACAAA
1401 GCATGTTAAG AGCTGGGTGA CATATATATT AACTATAAAG
1441 AGAGCAGATA TAGAAAGGAA GCCAACATTT ATCTAGCAGA
1481 AGAAAAAAAC ACCATCATTT GTATCAATAA AAAGCATGTA
1521 TGATGAGCGG GCATGGAGGC TTATGCCTAT AACCCAGCAC
1561 TTTGGGAGGC CAAGGCATGT GGGTCGCTTA AGTCCAAGAG
1601 TTCAAGACCA GCCTGGGCAA CAATGGCAAA AATCCGTCTC
1641 TACTAAAAGT GCAAAAAATT GGCCAGGTGT GGTGGTACAT
1681 GCCTGTAGTC CCAGCTAGTC AGGTGGCTGA AGCAGAAGGA
1721 TTCCCTGAGC CTGGGAGATC GAGGCTGAAG TGAGCCTTGA
1761 TCATGCTACT GCACTCCAGC CTGGGTGACA GAGCGAGACC
1801 CTGTCTCAAA AAAAAAAAAA AATGCATAAA AATGTTCATT
1841 TACATCCTCA TTTAACCCAT ACCATACTGT ATTCTACTTG
1881 CAGTATTTGC TAACTACTCC CCAGATAGAT GGGCTCACTT
1921 TGAGGCCAAG GATTGTGTTC TACCATAATC TCATTCCTTC
1961 AGCACAGCTC AGCACCTGGC AAATTGGAGG CAACAAATGT
2001 CTATGGATCC CTCTGTAACC ATGAACAAGT CAGTCAGGGT
2041 ACCTGCACTG TCAAAACTTA CAATTAACTG GATAGTATGT
2081 ATTTGATGAG GGGAACTGAA TTACAGGGAA ACCTAGGTTA
2121 GGCCAAGTGT TGCTTTCGTC ACCAATTCAC AGTTAAGGAA
2161 ACTGAGGCCA CGGGCCACCC AGCTTAGGAC TTTTGACTAT
2201 AAACCCTGAG ATCTCTCTCc TTTaCATAAG CATTTTGTTT
2241 TCATTGCTGT TGACACTTTG TTAATCTTGC TtACTtAAAA
2281 CTAaTTTCTG CTAATAGCTT CAGGGTCTTT AGCAACTGTC
2321 AGCATGTAAT GTGTCTGCAT TTCATATATA TAATTAGTTT
2361 TCATGGCAAC AGTCCACTTT TAGTCAATCA ACATTATAAA
2401 GTTAGTTATT TATTTATTTA TTTATTTATT TATTGACTGA
2441 TACGGAGTTT TGCTCTTGTT GCCCAGGCTG GAGTACAAGG
2481 GCCCAATCTT GGCTCACTGC AACCTCCGCC TCCCGGGTTC
2521 AAGCAATTCT CCTGCCTCAG CCTCCTGAGT AGCTGGGAaT
2561 TATAGGTGCC CGCCACCACA CCCGGCTAAT TTTTGTATTT
<----------------Del_R
2601 TtCAGTAGAG ACGGGGTTTC ACCATGGCAG CCAGGCTGGT
2641 CTCAAACTCC TCACCTCAGG TGATCCAACT CSCCTCAGCC
2681 TCCCAAAGTG CTGGGATTAC AAGTGTGAGC CACCGCGCCT
2721 GGCAACATTA TAAACTTATA ATGAATTTAT GGAGTGTTAC
2761 TAGTAAACAA AATGAATATT CTTTAAATAA AAAAAATTTC
2801 TAAAAGCCTC TCAAATGTGC TTGTCTTTCT CCTTGCA
Green = flanking sequence
Black = deletion sequence
Red = purine-rich region
NCBI sequence: (21,571- 24,406)
agcctgccta ccacagagaa tgccttggaa tcagaggttc
cctgaagaga ccctctcctc ttagaataat ccaaaaccag
aatctccaga gccccgtggt caaaactaaa acgttccatc
taggagtgag agagcacgat atctacttcc tcacacttct
cctcggttct caaataaaag cgctcactta catttgccat
ctttattctg tgatccgttt ttatgttaca gcaaataagc
aaattatgag gtcctctggg cgaaaggaaa atcagcatgg
aatgtaagtt attgtgccat ctagagaaaa tgtgagaggc
tggaagcctc aatcaactgt cttccttgaa gaataaccta
gatcttggct cccactggca aagatgagtg ggggttattg
tcttctctaa gaaactaaac gtccctcaca tgcttgaaga
tgtcgcaagg gagacctgat ggccccattt ctgtgaggtt
gttcctcaaa gaagaatcaa agatttcagt cacattagca
tcatcatgtt ctcttagtcc agaatttttc agcaaacata
ttccacaaaa ttttctgcaa gttcagggta catatagcag
gtgcagtgga tttttgttat gttttaatat aacatactag
agaaaatcca gaacattctt ctccctctct cttcttcatc
acattcacat ctcagcctat agagcagagt ttattcctta
gtataatatc aaggcctgtt ttaaaaatat atatattata
catgtgaatg agaaatgagt cacatttatt ttaccatgtc
tctggttttt aaataaaatt aaaaggttgg gaaactgttt
ttcagtgtca caacctctct gttcttacta ccataatatt
tacttgatat tatttcagtt cttccttccc cacacccatg
ttgaatccca gaccacaaac tactgtaatt tttctttatt
atcaacatat gtaggaatgc agaattaaaa ttattgatca
agtttcatgc aaagttccaa aaccaaagaa agaaagaaag
gaagagagga aaaaagagag aaagacaggg agaaaaataa
aaagaaggaa agagaggaag gaaagagagg aaggaaagga
aggaaggaag gaaggaagga agaaaggaag gaaggaaaga
atgaaggaag gaaggaagga agggagggag gaaatcagac
cttttcattt catcgggata cctaccacct ctctttttga
ctcaagctaa tgttaaatgt taaaaagagt ctccattttt
agaatacacc aaccaataga aggacccccc catgccctag
agctccctgg atagtagaaa attagtcaaa aatttaaaat
ttactataga tgatccataa aattaaaaat catacaaagc
atgttaagag ctgggtgaca tatatatatt aactataaag
agagcagata tagaaaggaa gccaacattt atctagcaga
agaaaaaaac accatcattt gtatcaataa aaagcatgta
tgatgagcgg gcatggaggc ttatgcctat aacccagcac
tttgggaggc cgaggcatgt gggtcgctta agtccaagag
ttcaagacca gcctgggcaa caatggcaaa aatccgtctc
tactaaaagt gcaaaaaatt ggccaggtgt ggtggtacat
gcctgtagtc ccagctagtc aggtggctga agcagaagga
ttccctgagc ctgggagatc gaggctgaag tgagccttga
tcatgctact gcactccagc ctgggtgaca gagcgagacc
ctgtctcaaa aaaaaaaaaa aaatgcataa aaatgttcat
ttacatcctc atttaaccca taccatactg tattctactt
gcagtatttg ctaactactc cccagataga tgggctcact
ttgaggccaa ggattgtgtt ctaccataat ctcattcctt
cagcacagct cagcacctgg caaattggag gcaacaaatg
tctatggatc cctctgtaac catgaacaag tcagtcaggg
taactgcact gtcaaaactt acaattaact ggatagtatg
tatttgatga ggggaactga attacaggga aacctaggtt
aggccaagtg ttgctttcgt caccaattca cagttaagga
aactgaggcc acgggccacc cagcttagga cttttgacta
taaaccctga gatctctctc ccttacataa gcattttgtt
ttcattgctg ttgacacttt gttaatcttg cttacttaaa
actaatttct gctaatagct tcagggtctt tagcaactgt
cagcatgtaa tgtgtctgca tttcatatat ataattagtt
ttcatggcaa cagtccactt ttagtcaatc aacattataa
acttatttat ttatttattt atttatttat ttattggctg
atacggagtt ttgctcttgt tgcccaggct ggagtacaag
ggcccaatct tggctcactg caacctccgc ctcccgggtt
caagcaattc tcctgcctca gcctcctgag tagctgggat
tataggtgcc cgccaccaca cccggctaat ttttgtattt
tcagtagaga cggggtttca ccatggcagc caggctggtc
tcaaactcct cacctcaggt gatccaactc gcctcagcct
cccaaagtgc tgggattaca agtgtgagcc accgcgcctg
gcaacattat aaacttataa tgaatttatg gagtgttact
agtaaacaaa atgaatattc tttaaataaa aaaaatttct
aaaagcctct caaatgtgct tgtctttctc cttgca
Green = flanking sequence
Black = deletion sequence

Example 11 Functional Effects of the Deletion and Polymorphisms in the Purine-Rich Region of DCDC2 Intron 2

The 170 base pair purine-rich region in intron 2 of DCDC2 (starting at 24,434,282, ENSEMBL database version 33, September 2005) is a unique sequence composed almost exclusively of G and A bases. TESS (24) comparison to the TRANSFAC database identified 131 putative transcription factor binding sites distributed through this region, including four copies each of PEA3 (AGGAAA) and NF-ATp (AGGAAAG) sites in dbSTS ID 808238 (Table 3). Both transcription factors are expressed in mouse brain. PEA3 is associated with sexual function and peripheral motor neuron arborization (25). NF-ATp mediates rapid embryonic axon extension necessary for forming neuronal connections (26), which would complement the putative function of the doublecortin peptide domains in DCDC2. The presence of these binding sites suggests that the purine-rich region likely functions as an enhancer or regulatory region that could modify DCDC2 expression in terms of tissue or cell specificity, developmental timing, or quantity. To show that this region can actually bind transcription factor proteins, short double-stranded oligonucleotide probes, EMSA1, EMSA2, EMSA3, and EMSA4 (positions shown in figure below), were synthesized from the sequence of the purine-rich region and tested for protein binding using the electrophoretic mobility shift assay:

Purine-Rich Region in Intron 2 of DCDC2:

1001 CAAGTTTCAT GCAAAGTTCC AAAACCAAAG AAAGAAAGAA
1041 AGGAAGAGAG GAAAAAAGAG AGAAAGACAG GGAGAAAAAT
↓--------EMSA2-------↓
-------EMSA1-------↓ ↓-------EMSA3--
1081 AAAAAGAAGG AAAGAGAGGA AGGAAAGAGA GGAAGGAAAG
-----↓ ↓--------EMSA4--
1121 GAAGGAAGGA AGGAAGGAAG GAAGAAAGGA AGGAAGGAAA
----↓
1161 GAATGAAGGA AGGAAGGAAG GAAGGGAGGG AGGAAATCAG
(Black bases = deletion sequence)
(Red bases = purine-rich region)
(Underline = repeat units described in Table 3)

EMSA Sequences:

Name    Primer                 Complementary Primer
EMSA1   TAAAAAGAAGGAAAGAGAGG   CCTCTCTTTCCTTCTTTTTA
EMSA2   GAGAGGAAGGAAAGAGAGGA   TCCTCTCTTTCCTTCCTCTC
EMSA3   GAGAGGAAGGAAAGGAAGGA   TCCTTCCTTTCCTTCCTCTC
EMSA4   AAGGAAGGAAGGAAAGAATG   CATTCTTTCCTTCCTTCCTT
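
Each double-stranded EMSA probe is formed by annealing a primer to its complementary primer, so the second sequence in each pair should be the reverse complement of the first. The sketch below checks that relationship for the four pairs listed above; the helper and variable names are illustrative and are not taken from the source.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer / complementary primer pairs as listed under "EMSA Sequences".
EMSA_PROBES = {
    "EMSA1": ("TAAAAAGAAGGAAAGAGAGG", "CCTCTCTTTCCTTCTTTTTA"),
    "EMSA2": ("GAGAGGAAGGAAAGAGAGGA", "TCCTCTCTTTCCTTCCTCTC"),
    "EMSA3": ("GAGAGGAAGGAAAGGAAGGA", "TCCTTCCTTTCCTTCCTCTC"),
    "EMSA4": ("AAGGAAGGAAGGAAAGAATG", "CATTCTTTCCTTCCTTCCTT"),
}

# Collect the names of any pairs that would not anneal end to end.
mismatched = [
    name
    for name, (primer, complementary) in EMSA_PROBES.items()
    if reverse_complement(primer) != complementary
]

print(mismatched)  # [] means all four pairs are exact reverse complements
```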

Electrophoretic Mobility Shift Assay:

In the autoradiograph (FIG. 6), the Oct2A transcription factor recognition sequence (Control, lanes 1, 2, 3), EMSA3 (lanes 4, 5, 6) and EMSA4 (lanes 7, 8, 9) were fluorescently labeled and resolved by non-denaturing polyacrylamide gel electrophoresis. Migration was shifted when human brain nuclear cell lysate, containing transcription factor proteins, was mixed with the labeled probes (Control lane 2, EMSA3 lane 5, and EMSA4 lane 8), showing that, similar to Control, EMSA3 and EMSA4 bind nuclear proteins. Protein binding was then competitively and specifically inhibited by adding unlabeled (“cold”) DNA (Control lane 3, EMSA3 lane 6, and EMSA4 lane 9).

Therefore, the polymorphisms of the purine-rich region, including the 2,445 base deletion, could act by disrupting or modifying DNA-protein interactions, and the specific DCDC2 enhancer-regulatory function encoded in this intron. The result would be a profound effect on DCDC2 expression, which, as shown by the RNAi data (Example 9), would have a significant effect on neuronal migration and ultimately reading ability.

Discussion

Applicants' previous studies showed transmission disequilibrium to JA04. They systematically interrogated the 6p22 DYX2 locus for a candidate gene that could confer susceptibility for RD. Starting with single-marker QTDT analysis they found the strongest peak and concentration of transmission disequilibrium with SNPs in DCDC2. The extent of intermarker linkage disequilibrium clustered through the 1.5 Mb of genomic sequence suggests adequate marker density in this region, and seven haplotype blocks. Blocks spanning DCDC2 also show significant transmission disequilibrium with several quantitative reading phenotypes in the context of preserved IQ, suggesting a specific effect on reading performance and not generalized or global effects on brain function. This fits the definition of the cognitive phenotype for RD and the entry criteria for subject collections; CLDRC subjects have a minimum IQ score of 80.

Reported here are the results from 147 SNP markers, but originally 152 consecutive markers were queued in the high-throughput genotyping strategy. Four markers failed PCR and were dropped from the analysis. A fifth marker, C449792, was flagged for non-Mendelian transmission and was set aside. Only after completion of the single-marker QTDT analysis did Applicants confirm LOH with C449792 in samples not subjected to multiple displacement amplification (MDA) and discover the 2,445 bp deletion in intron 2 of DCDC2, between the exons encoding the two doublecortin domains.

The 2,445 bp deletion, including minor alleles of dbSTS ID 808238, is in strong linkage disequilibrium with reading performance (P=0.00002, Table 2). Furthermore, dbSTS ID 808238 encodes multiple copies of PEA3 and NF-ATp sites that are active in brain. Loss of this entire regulatory region, as would happen with the common large deletion Applicants found in dyslexics, would therefore have profound effects on DCDC2 function. Polymorphisms would disrupt PEA3 and NF-ATp sites, which may explain dyslexia in subjects without the common deletion, or the variation of reading ability due to allelic heterogeneity.

DCDC2 (also called RU2 and KIAA1154, MIM: 605755) is located in the DYX2 locus 500 kb from JA04. The function is unknown but it contains two doublecortin peptide domains that were originally described in the doublecortin gene (DCX, MIM: 300121) encoded on the X chromosome. DCX encodes a cytoplasmic protein that directs neuronal migration by regulating the organization and stability of microtubules, and is mutated in human X-linked lissencephaly (27) and double cortex syndrome. Lissencephaly is a neuronal migration defect that produces profound mental retardation and seizures (28). Double cortex syndrome is caused by arrested migration halfway to the cortex producing a subcortical neuronal band heterotopia or “double cortex.” For both syndromes the large majority of point mutations cluster within the conserved doublecortin peptide motifs of DCX, which are also encoded in DCDC2.

Converging imaging data implicate three left-hemisphere regions that are important for fluent reading: the anterior system in the inferior frontal region, the dorsal parietotemporal system involving the angular, supramarginal, and posterior portions of the superior temporal gyri, and the ventral occipitotemporal system involving portions of the middle temporal and middle occipital gyri (3, 29). Imaging studies of dyslexic adults and children show a disruption of posterior reading systems in parieto-temporal and occipito-temporal regions (30). Yet DCDC2 is highly expressed in the same regions activated by fluent and dyslexic readers, suggesting that dysregulation (attributable to polymorphisms of a regulatory region), and not complete disruption of a protein product participating in axonal guidance and growth, could explain the expression patterns.

These findings are consistent with the hypothesis that dyslexia is associated with subtle changes, like the anecdotal microscopic anomalies reported by Galaburda and colleagues (31), in the migration of neurons in developing neocortex. Similarities in structure and cellular function between DCDC2 and DCX, a gene known to be critical to neuronal migration, further support a hypothesis of impaired neuronal migration. Loss of function of DCX causes severe developmental disruption in neocortex, whereas dyslexia is not characterized by large malformations of neocortex. The DCDC2 alleles that associate with dyslexia, however, would not be expected to be nulls, and so even if DCX and DCDC2 had similarly critical roles in neuronal migration, large malformations would not be an expected phenotype for the described alleles. In addition, a comparison of the results following DCX RNAi (32) with those following DCDC2 RNAi suggests that DCX may be necessary for neuronal migration while DCDC2 may be more modulatory. Unlike DCX RNAi treatment (32), DCDC2 RNAi treatment allows cells to migrate farther and attain typical migratory bipolar morphologies, and does not induce the formation of large sub-cortical band heterotopia. While the RNAi treatment does not exclusively target neurons that populate reading centers, when considered in the context of DCDC2 expression in inferior and medial temporal cortex, it offers a plausible pathophysiologic mechanism for RD due to genetic expression heterogeneity. DCDC2 heterogeneity is also consistent with other pathophysiologic mechanisms. Imaging studies have shown a functional disruption of a more subtle nature, demonstrable only in composite maps of pooled subjects imaged at 1.5 tesla, in areas where heterotopias have not been described. Accordingly, it may be that DCDC2 heterogeneity sensitizes the dyslexic reader to disruption in the development of “a hierarchy of local combination detectors” in the occipito-temporal system, as postulated most recently by Dehaene and colleagues (33).

Previous attempts at transmission disequilibrium mapping with sparse densities of SNP markers in this region—31 SNPs over 10 Mb (34) and 57 SNPs over 5.7 Mb (35)—proved inconclusive. One of these studies, which found significant linkage disequilibrium with markers around the TTRAP gene (35), did not include markers over DCDC2. A recent study covering VMP, DCDC2, KIAA0319, TTRAP, and THEM2 identified maximum association with KIAA0319 (36). Given its specificity of expression in brain and the location of JA04 in the 5-prime untranslated region (22), KIAA0319 is a reasonable candidate, but the reported paucity of polymorphisms in disequilibrium with reading phenotypes (35)—confirmed by sequencing in the CLDRC cohort—made it less attractive. Furthermore, in Applicants' population, transmission disequilibrium was mostly from short haplotypes confined to DCDC2 (blocks A through E), with minimal support for association from single markers within MRS2L, GPLD1, KIAA0319, TTRAP, and THEM2 (Supplementary Table 1). Block F, spanning GPLD1 just telomeric of DCDC2, also has one haplotype in disequilibrium. Haploview and Gold show, however, that the strongest marker in F, C2100443, shares weak intermarker disequilibrium with SNP 33 (D′=0.41 and 0.49 respectively) located in block C, suggesting transmission disequilibrium is due to polymorphisms in DCDC2. No other haplotypes spanning GPLD1 show significant disequilibrium (data not shown). The origin of the transmission disequilibrium from block G is unknown and it spans no recognizable coding sequences. Although it is located within 118 kb of a published peak in THEM2, Applicants found no disequilibrium with any Haploview block on either side of block G or spanning THEM2 (35). Haplotypes within block H, telomeric to G and also void of recognizable coding sequences, do not show significant disequilibrium with RD phenotypes. 
Overall then, conservative estimates of intermarker linkage disequilibrium blocks in this region are relatively short. Therefore, it is unlikely that transmission disequilibrium from DCDC2 in the CLDRC cohort is due to risk alleles of genes located elsewhere in the DYX2 locus.

The brain is a highly intricate organ that requires a complex orchestra of changes and growth to fully develop in humans. Regardless of the pathophysiologic mechanisms, RD is a complex phenotype and several, if not many, genes are involved. Since they are often functionally grouped on chromosomes, it is possible that variations within more than one gene on 6p22 are responsible for interindividual differences in RD, which may be apparent in further studies of additional populations.

Subjects and Methods

The following subjects and methods were used in the work described herein.

CLDRC RD Family Samples

The 536 samples (parents and siblings) consisted of 153 nuclear families collected by the Colorado Learning Disabilities Research Center (CLDRC) (37). Subjects included members of MZ twin pairs (in which case, only one member of the MZ twin pair was used), DZ twin pairs, and nontwin siblings. There were 34 families with one offspring, 94 families with two offspring, 19 families with three offspring, and 6 families with four or five offspring. Predominantly white middle-class families were ascertained from school districts in the state of Colorado, where at least one sibling had a school history of reading problems. Subjects with IQ less than 80 or for whom English was a second language were not included in the initial sample. Subjects with evidence of serious neurological, emotional, or uncorrected sensory deficits were excluded from the present analyses. The average age of the 221 siblings analyzed was 11.55 years, ranging from 8.02 to 18.53 years. The CLDRC cohort was evaluated at the University of Colorado with an extensive battery of psychometric tests described previously (11), consisting of cognitive, language, and reading tasks, and included the intelligence quotient and the Peabody individual achievement test (PIAT). Quantitative-trait data were provided for the following 11 phenotypes. Orthographic coding (OC) is the ability to recognize words' specific orthographic patterns and was measured here with our experimental tests for orthographic choice (OCH) and homonym choice (HCH); a composite score for both tests (i.e., OC composite) was created by averaging the z scores for both tasks. Phonological decoding (PD) is the oral reading of nonwords, which have straightforward pronunciations that are based on their spelling. Phonemic awareness (PA) is the ability to isolate and manipulate abstract subsyllabic sounds in speech; for the present analyses, it was measured with experimental phoneme-transposition (PTP) and phoneme-deletion (PDL) tasks, as well as with a composite score for both tests. Word recognition (WR) was measured with an experimental timed-word-recognition (TWR) task and the untimed standardized PIAT word-recognition (PWR) task, which required subjects to read words aloud; a composite score for both tests was also created. Finally, the discriminant score (DISC) for reading was a weighted composite of the reading recognition, reading comprehension, and spelling subtests of the PIAT. These psychometric tasks have been described in detail elsewhere (17, 23, 37-39). The population average was estimated from the large twin database available at the CLDRC. After age regression and standardization, the phenotypic data for each of the reading tasks formed a continuous distribution of quantitative z scores, which were used in the analyses.
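The composite scores described above are averages of task-level z scores. The following is a minimal sketch of that arithmetic only; the raw scores, population means, and standard deviations are hypothetical placeholders, the function names are illustrative, and the age-regression step mentioned in the text is not reproduced here.

```python
from statistics import mean


def z_score(raw_score, population_mean, population_sd):
    """Standardize a raw task score against an assumed population mean and SD."""
    return (raw_score - population_mean) / population_sd


def composite_z(z_scores):
    """Composite score formed by averaging the z scores of the component tasks."""
    return mean(z_scores)


# Hypothetical values for one subject (not taken from the CLDRC data).
och_z = z_score(30.0, 40.0, 5.0)    # orthographic choice
hch_z = z_score(45.0, 50.0, 10.0)   # homonym choice
oc_composite = composite_z([och_z, hch_z])
print(oc_composite)
```

The same averaging would apply to the PA composite (PTP and PDL) and the WR composite (TWR and PWR).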

RNA Samples

Total RNA samples from 18 areas of adult human brain were purchased from Ambion (see FIG. 4), and were procured from 10 white donors ranging in age from 45 to 79 years, with unknown handedness. RNA samples could not be localized to either the left or right hemispheres. Six donors were male. Seven donors died due to cardiac (e.g. congestive heart failure) or respiratory disease (e.g. respiratory failure), one had liver cancer, one had bladder cancer, and one was listed as unknown.

MDA Amplification

All genomic DNA samples were amplified by MDA (Molecular Staging, Incorporated, New Haven, Conn.) (40). The quality of amplified samples was assessed with two restriction fragment length polymorphisms (RFLPs) by 1% agarose gel electrophoresis; 84% of amplified samples could be genotyped with both 6p22 RFLPs. Deletions identified in amplified DNA were confirmed by resequencing non-amplified samples.

Genotyping

TaqMan Assay-on-Demand® and Assay-by-Design® probes (ABI, Foster City, Calif.) were used to genotype 109 and 39 SNPs respectively. Six SNPs failed web-based primer design for TaqMan and consequently were genotyped by pyrosequencing (Biotage AB, Uppsala, Sweden). The primers for these SNPs are presented in Supplementary Table 4.

Deletion Phenotype

The common 2,445 bp deletion was genotyped by allele-specific amplification with a combination of three primers in one reaction: universal forward primer (AGCCTGCCTACCACAGAGAA), reverse primer for non-deleted chromosomes (GGAACAACCTCACAGAAATGG), and reverse primer for deleted chromosomes (TGAAACCCCGTCTCTACTGAA). Reaction products were resolved on 1.5% agarose gels. The deletion fusion fragment was 176 bp and the non-deleted fragment was 486 bp.
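The three-primer assay yields a 176 bp product from deleted chromosomes and a 486 bp product from non-deleted chromosomes, so a genotype can be read from which bands are present. The sketch below illustrates that logic; the function name and the sizing tolerance are assumptions for illustration and are not part of the described method.

```python
# Expected product sizes (bp) as stated in the text.
DELETION_BP = 176
NON_DELETED_BP = 486


def call_deletion_genotype(band_sizes, tolerance=20):
    """Call a genotype from the band sizes observed in one lane.

    `tolerance` is a hypothetical allowance (bp) for gel sizing error.
    """
    has_deleted = any(abs(b - DELETION_BP) <= tolerance for b in band_sizes)
    has_non_deleted = any(abs(b - NON_DELETED_BP) <= tolerance for b in band_sizes)
    if has_deleted and has_non_deleted:
        return "heterozygous deletion"
    if has_deleted:
        return "homozygous deletion"
    if has_non_deleted:
        return "homozygous non-deleted"
    return "no call"


print(call_deletion_genotype([180, 490]))
```

A lane with neither band returns "no call", which would correspond to a failed reaction rather than a genotype.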

dbSTS ID 808238 Genotype

The compound STR, dbSTS ID 808238, was genotyped by sequencing PCR products generated with forward primer (TGTTGAATCCCAGACCACAA) and reverse primer (ATCCCGATGAAATGAAAAGG). The sequencing method is described below. Sequence trace results were analyzed and alleles assigned with Mutation Surveyor version 2.6 (SoftGenetics, State College, Pa.), by comparing samples to reference traces after alignment.

Error Checking

DNA samples were formatted into two 384-well plates with at least one negative control (no genomic DNA) and two positive controls (CEPH NA10848 and NA10849, Coriell Institute, Camden, N.J.) in each quadrant of each plate. Genetic analyses were performed only on data from plates where the negative control showed negative results and the positive controls showed identical genotypes. Two STR markers from the pseudo-autosomal regions of the sex chromosomes were genotyped to check the sex ID of samples. Data were preprocessed to remove genotype combinations that resulted in Mendelian incompatibilities, to remove low-quality DNA samples, and to detect any pedigree errors. Lastly, all markers with extreme amounts of missing data were removed, to exclude loci where genotyping might have been problematic.
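The Mendelian-incompatibility screen can be illustrated for a single biallelic marker in a parent-offspring trio. This is a minimal sketch of the general idea, not the GAS implementation; the function names and example genotypes are hypothetical.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child's genotype can be formed from one paternal
    and one maternal allele. Genotypes are 2-tuples of allele labels."""
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((paternal, maternal))) == child_sorted
        for paternal, maternal in product(father, mother)
    )


def flag_incompatible(trios):
    """Return the IDs of trios whose genotypes are Mendelian-incompatible."""
    return [
        trio_id
        for trio_id, (father, mother, child) in trios.items()
        if not is_mendelian_consistent(father, mother, child)
    ]


# Hypothetical trios for illustration.
trios = {
    "fam1": (("A", "G"), ("G", "G"), ("A", "G")),
    "fam2": (("A", "A"), ("G", "G"), ("G", "G")),
}
print(flag_incompatible(trios))
```

In the second trio the child appears to carry no paternal allele. An apparent incompatibility of this kind is what a hemizygous deletion can produce, since a parent and child carrying the deletion are each scored as homozygous for their single remaining allele.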

DNA Sequencing

PCR was used to generate 68 amplicons from 26 RD and 6 normal genomic DNA samples from RD sample set 1 for DCDC2, MRS2L, and KIAA0319. Upon completion of thermal cycling, the PCR products were treated with ExoSAP-IT (USB, Cleveland, Ohio) to remove residual dNTPs and primers. DNA sequencing was performed in both forward and reverse directions with Big Dye (ABI) fluorescently labeled dideoxy terminator and the reaction products were resolved by capillary electrophoresis and laser detection on a 3730XL Automated DNA Sequencer (ABI). Sequence alignments and comparisons were made using Phred, Phrap, Polyphred, Consed, and Mutation Surveyor (SoftGenetics, State College, Pa.).

Quantitative Real Time RT-PCR

TaqMan gene expression kits for eight genes in the candidate region (KIAA0319, DCDC2, MRS2L, GPLD1, ALDH5A1, TTRAP, HT012, and GMNN) and six control genes (GAPDH, 18S, β actin, HPRT1, PPIA and PKG1) were purchased from ABI. In the two steps of RT-PCR, RNA samples were first reverse transcribed to cDNA with the High Capacity cDNA Archive Kit (ABI). Real time PCR was then performed with the default SDS condition on the 7900HT (ABI). Each sample was tested in triplicate. To control for genomic DNA contamination, all of the brain RNA templates were subjected to a sham reverse transcription step with random primers and without RT enzyme, followed by PCR with primers from three of the control genes. To identify potential internal controls, six genes, GAPDH, 18S, β actin, HPRT1, PPIA and PKG1, were tested for consistent expression in all 18 brain samples. To compare RT-PCR efficiencies, relative standard expression curves for the eight 6p22 and six control genes were generated. These demonstrated that the efficiencies of target and reference were approximately equal. The comparative CT method, which normalizes expression to an endogenous reference and a calibrator, was used for quantitative relative gene expression.
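The comparative CT calculation is commonly written as a fold change of 2 raised to the negative ΔΔCT, which relies on the approximately equal amplification efficiencies noted above. The sketch below assumes that standard form; the CT values and the function name are hypothetical and are not taken from the reported experiments.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the comparative CT method.

    Each argument is a list of replicate CT values (e.g., triplicates):
    target and reference gene in the sample, then in the calibrator.
    """
    delta_ct_sample = mean(ct_target) - mean(ct_reference)
    delta_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** -delta_delta_ct


# Hypothetical triplicate CT values.
fold = relative_expression(
    ct_target=[24.0, 24.0, 24.0],
    ct_reference=[20.0, 20.0, 20.0],
    ct_target_cal=[26.0, 26.0, 26.0],
    ct_reference_cal=[20.0, 20.0, 20.0],
)
print(fold)
```

Here the target crosses threshold two cycles earlier, relative to the reference, in the sample than in the calibrator, which corresponds to fourfold higher relative expression.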

Statistical Analysis

All data were stored in Microsoft Excel files. Genetic Analysis System (GAS) was used to assess the Mendelian transmission of alleles. Identity-by-descent (IBD) probabilities were estimated with SimWalk2. Applicants used QTDT to simultaneously test for transmission disequilibrium (41) in the presence of linkage by the orthogonal model (-ao) with variance components (-wega), and permutations for exact P values (-m1000−1). Through different modeling within QTDT, Applicants tested for parent-of-origin effects (-ot), the significance of polygenic effects (-weg), evidence for linkage without association (-vega), total association (-at), and population stratification (-ap). Haploview and Gold were used to examine the haplotype structure of the markers, to generate haplotype blocks, and to assess intermarker linkage disequilibrium (LD). Haplotype-TDT was analyzed by FBAT.
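Intermarker LD between two biallelic markers is often summarized as D′, the measure quoted in the Discussion. The sketch below computes D′ from four haplotype frequencies using the standard definition. It assumes the haplotype frequencies are already known, whereas tools that work from genotype data must first estimate them; the function name and example frequencies are illustrative.

```python
def d_prime(f_AB, f_Ab, f_aB, f_ab):
    """Normalized LD coefficient |D'| for two biallelic markers.

    Arguments are the frequencies (or counts) of the four haplotypes.
    """
    total = f_AB + f_Ab + f_aB + f_ab
    f_AB, f_Ab, f_aB, f_ab = (f / total for f in (f_AB, f_Ab, f_aB, f_ab))
    p_A = f_AB + f_Ab            # frequency of allele A at marker 1
    p_B = f_AB + f_aB            # frequency of allele B at marker 2
    d = f_AB - p_A * p_B         # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return abs(d) / d_max if d_max > 0 else 0.0


# Hypothetical haplotype frequencies.
print(d_prime(0.4, 0.1, 0.1, 0.4))
```

A value of 1 indicates that at least one of the four possible haplotypes is absent, and a value of 0 indicates independence of the two markers.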

In Utero RNAi

Plasmids were directly introduced into cells at the cerebral ventricular zone of living rat embryos by in utero electroporation as previously described (32). Cells were co-transfected with pCA-eGFP and DCDC2 shRNA plasmid or control shRNA plasmid. The shRNA plasmid directed against DCDC2 contained the hairpin sequence 5′ cccaccaagcaattccagacaa(aca)ttgtctggaattgcttggtggg 3′ and the control sequence was 5′ cccagtcaaggcattgaattaaa(aca)tttaattcaatgccttgactggg 3′. The sequence was selected by its asymmetry and for absence of any matches to rat genomic sequence in the database. Four days after transfection, rat embryonic brains were fixed with 4% paraformaldehyde and sectioned with a vibratome (Leica VT1000S) at 60-80 μm. eGFP fluorescence was observed, and nuclei were labeled with TO-PRO-3 (Molecular Probes). Images were acquired with a Leica TCS SP2 confocal microscope system (0.5-1.0 μm optical section) and processed using Photoshop 7.0. For cumulative probability migration plots, the distance of each cell (200-1400 in each analysis condition) from the VZ surface was determined 4 days after transfection. Migration distances were determined with automated particle analyses in ImageJ (Wayne Rasband, Research Services Branch, National Institute of Mental Health, Bethesda, Md., USA).
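In each hairpin sequence given above, the segment after the (aca) loop should be the reverse complement of the segment before it, so that the transcript can fold back on itself. The following sketch checks that property for the two sequences as written in the text; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_hairpin(hairpin):
    """Split the 'sense(loop)antisense' notation used in the text."""
    sense, rest = hairpin.split("(")
    loop, antisense = rest.split(")")
    return sense, loop, antisense


def stem_is_complementary(hairpin):
    """True if the antisense arm is the reverse complement of the sense arm."""
    sense, _loop, antisense = parse_hairpin(hairpin)
    return reverse_complement(sense) == antisense


DCDC2_SHRNA = "cccaccaagcaattccagacaa(aca)ttgtctggaattgcttggtggg"
CONTROL_SHRNA = "cccagtcaaggcattgaattaaa(aca)tttaattcaatgccttgactggg"

for name, hairpin in (("DCDC2", DCDC2_SHRNA), ("control", CONTROL_SHRNA)):
    print(name, stem_is_complementary(hairpin))
```

Both sequences pass this check, with a 22-nucleotide stem for the DCDC2 hairpin and a 23-nucleotide stem for the control.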

Data Deposition

The sequences reported herein have been deposited in the dbSTS database (ID 808238).

REFERENCES

  • 1. Shaywitz, S. E. & Shaywitz, B. A. (2003) Pediatrics in Review 24, 147-53.
  • 2. Shaywitz, S. E., Shaywitz, B. A., Pugh, K. R., Fulbright, R. K., Constable, R. T., Mencl, W. E., Shankweiler, D. P., Liberman, A. M., Skudlarski, P., Fletcher, J. M., Katz, L., Marchione, K. E., Lacadie, C., Gatenby, C. & Gore, J. C. (1998) Proc Natl Acad Sci USA 95, 2636-41.
  • 3. Shaywitz, B. A., Shaywitz, S. E., Pugh, K. R., Mencl, W. E., Fulbright, R. K., Skudlarski, P., Constable, R. T., Marchione, K. E., Fletcher, J. M., Lyon, G. R. & Gore, J. C. (2002) Biol Psychiatry 52, 101-10.
  • 4. Finucci, J. M., Guthrie, J. T., Childs, A. L., Abbey, H. & Childs, B. (1976) Ann Hum Genet 40, 1-23.
  • 5. DeFries, J. C., Fulker, D. W. & LaBuda, M. C. (1987) Nature 329, 537-9.
  • 6. Smith, S. D., Kimberling, W. J., Pennington, B. F. & Lubs, H. A. (1983) Science 219, 1345-7.
  • 7. Turic, D., Robinson, L., Duke, M., Morris, D. W., Webb, V., Hamshere, M., Milham, C., Hopkin, E., Pound, K., Fernando, S., Grierson, A., Easton, M., Williams, N., Van Den Bree, M., Chowdhury, R., Gruen, J., Stevenson, J., Krawczak, M., Owen, M. J., O'Donovan, M. C. & Williams, J. (2003) Molecular Psychiatry 8, 176-85.
  • 8. Marino, C., Giorda, R., Vanzin, L., Molteni, M., Lorusso, M. L., Nobile, M., Baschirotto, C., Alda, M. & Battaglia, M. (2003) Eur Child Adolesc Psychiatry 12, 198-202.
  • 9. Grigorenko, E. L., Wood, F. B., Golovyan, L., Meyer, M., Romano, C. & Pauls, D. (2003) American Journal of Medical Genetics 118B, 89-98.
  • 10. Willcutt, E. G., Pennington, B. F., Smith, S. D., Cardon, L. R., Gayan, J., Knopik, V. S., Olson, R. K. & DeFries, J. C. (2002) Am J Med Genet 114, 260-8.
  • 11. Kaplan, D. E., Gayan, J., Ahn, J., Won, T. W., Pauls, D., Olson, R. K., DeFries, J. C., Wood, F., Pennington, B. F., Page, G. P., Smith, S. D. & Gruen, J. R. (2002) Am J Hum Genet 70, 1287-98.
  • 12. Fisher, S. E., Francks, C., Marlow, A. J., MacPhie, I. L., Newbury, D. F., Cardon, L. R., Ishikawa-Brush, Y., Richardson, A. J., Talcott, J. B., Gayan, J., Olson, R. K., Pennington, B. F., Smith, S. D., DeFries, J. C., Stein, J. F. & Monaco, A. P. (2002) Nat Genet 30, 86-91.
  • 13. Barr, C. L., Shulman, R., Wigg, K., Schachar, R., Tannock, R., Roberts, W., Malone, M. & Kennedy, J. L. (2001) Am J Med Genet 105, 250-4.
  • 14. Ahn, J., Won, T. W., Zia, A., Reutter, H., Kaplan, D. E., Sparks, R. & Gruen, J. R. (2001) Genomics 78, 19-29.
  • 15. Petryshen, T. L., Kaplan, B. J., Liu, M. F. & Field, L. L. (2000) Am J Hum Genet 66, 708-14.
  • 16. Grigorenko, E. L., Wood, F. B., Meyer, M. S. & Pauls, D. L. (2000) Am J Hum Genet 66, 715-23.
  • 17. Gayan, J., Smith, S. D., Cherny, S. S., Cardon, L. R., Fulker, D. W., Brower, A. M., Olson, R. K., Pennington, B. F. & DeFries, J. C. (1999) Am J Hum Genet 64, 157-64.
  • 18. Fisher, S. E., Marlow, A. J., Lamb, J., Maestrini, E., Williams, D. F., Richardson, A. J., Weeks, D. E., Stein, J. F. & Monaco, A. P. (1999) Am J Hum Genet 64, 146-56.
  • 19. Field, L. L. & Kaplan, B. J. (1998) Am J Hum Genet 63, 1448-56.
  • 20. Grigorenko, E. L., Wood, F. B., Meyer, M. S., Hart, L. A., Speed, W. C., Shuster, A. & Pauls, D. L. (1997) Am J Hum Genet 60, 27-39.
  • 21. Cardon, L. R., Smith, S. D., Fulker, D. W., Kimberling, W. J., Pennington, B. F. & DeFries, J. C. (1994) Science 266, 276-9.
  • 22. Londin, E. R., Meng, H. & Gruen, J. R. (2003) BMC Genomics 4, 25.
  • 23. Olson, R., Wise, B., Conners, F., Rack, J. & Fulker, D. (1989) Journal of Learning Disabilities 22, 339-348.
  • 24. Schug, J. (2003) in Current Protocols in Bioinformatics, eds. Baxevanis, A., Davison, D., Page, R., Petsko, G., Stein, L. & Stormo, G. (John Wiley & Sons, Inc.
  • 25. Laing, M. A., Coonrod, S., Hinton, B. T., Downie, J. W., Tozer, R., Rudnicki, M. A. & Hassell, J. A. (2000) Molecular & Cellular Biology 20, 9337-45.
  • 26. Graef, I. A., Wang, F., Charron, F., Chen, L., Neilson, J., Tessier-Lavigne, M. & Crabtree, G. R. (2003) Cell 113, 657-70.
  • 27. Dobyns, W. B., Truwit, C. L., Ross, M. E., Matsumoto, N., Pilz, D. T., Ledbetter, D. H., Gleeson, J. G., Walsh, C. A. & Barkovich, A. J. (1999) Neurology 53, 270-7.
  • 28. Dobyns, W. B. & Truwit, C. L. (1995) Neuropediatrics 26, 132-47.
  • 29. Horwitz, B., Rumsey, J. M. & Donohue, B. C. (1998) Proc. Nat. Acad. Sci. USA 95, 8939-8944.
  • 30. Shaywitz, S. E. & Shaywitz, B. A. (2005) Biol Psychiatry 57, 1301-9.
  • 31. Galaburda, A. M., Sherman, G. F., Rosen, G. D., Aboitiz, F. & Geschwind, N. (1985) Ann Neurol 18, 222-33.
  • 32. Bai, J., Ramos, R. L., Ackman, J. B., Thomas, A. M., Lee, R. V. & LoTurco, J. J. (2003) Nat Neurosci 6, 1277-83.
  • 33. Dehaene, S., Cohen, L., Sigman, M. & Vinckier, F. (2005) Trends Cogn Sci 9, 335-41.
  • 34. Deffenbacher, K. E., Kenyon, J. B., Hoover, D. M., Olson, R. K., Pennington, B. F., DeFries, J. C. & Smith, S. D. (2004) Hum Genet 11, 11.
  • 35. Francks, C., Paracchini, S., Smith, S. D., Richardson, A. J., Scerri, T. S., Cardon, L. R., Marlow, A. J., Macphie, I. L., Walter, J., Pennington, B. F., Fisher, S. E., Olson, R. K., Defries, J. C., Stein, J. F. & Monaco, A. P. (2004) Am J Hum Genet 75, 1046-58. Epub 2004 Oct. 22.
  • 36. Cope, N., Harold, D., Hill, G., Moskvina, V., Stevenson, J., Holmans, P., Owen, M. J., O'Donovan, M. C. & Williams, J. (2005) Am J Hum Genet 76, 581-91. Epub 2005 Feb. 16.
  • 37. DeFries, J. C., Filipek, P. A., Fulker, D. W., Olson, R. K., Pennington, B. F., Smith, S. D. & Wise, B. W. (1997) Learning Disabilities: A Multidisciplinary Journal 8, 7-19.
  • 38. DeFries, J. C. & Fulker, D. W. (1985) Behav Genet 15, 467-73.
  • 39. Olson, R. K., Forsberg, H. & Wise, B. (1994) in The varieties of orthographic knowledge I: Theoretical and developmental issues, ed. Berninger, V. W. (Kluwer Academic Publishers, Dordrecht, The Netherlands), pp. 27-71.
  • 40. Dean, F. B., Nelson, J. R., Giesler, T. L. & Lasken, R. S. (2001) Genome Res 11, 1095-9.
  • 41. Abecasis, G. R., Cookson, W. O. & Cardon, L. R. (2000) Eur J Hum Genet 8, 545-51.

TABLE 1
Single-marker QTDT analysis for markers with P value ≦ 0.01
SNP ID | Gene | P values (phenotype columns: DISC, IQ, PTP, HCH) | Ensembl Location | Celera Location
33 | DCDC2 Int 6 | 0.0003 | 24386848 | 25512242
49 | DCDC2 Int 1 | 0.0035 | 24463129 | 25588523
72 | GPLD1 Int 24 | 0.0018 | 24539037 | 25664490
117 | inter-gene | 0.0077 | 24872844 | 25998238
130 | inter-gene | 0.0067, 0.055, 0.0811 | 25022795 | 26142106

TABLE 2
QTDT analysis of the compound STR, dbSTS ID 808238.
Allele | P values (phenotype columns: DISC, IQ, PTP, TWR, PWR, WR, PD, OCH, PDL, HCH, OC, PA)
1 | 0.0478
3 |
4 |
30 (1) | 0.0918, 0.023, 0.0407, 0.0385, 0.00002, 0.0035, 0.085
(1) Allele 30: combined deletion and all remaining minor alleles of dbSTS ID 808238.

Supplementary Methods and Materials

SNP Map

As in other regions of the human genome, the sequences provided by Celera, NCBI, and Ensembl databases had substantial differences. While exon sequences were identical in all three databases there was considerable variation in intron and intergenic sequences and lengths. Consequently the order of 15 SNPs, such as SNPs 27 and 31 in intron 7 of DCDC2, depended on the map source (Supplementary Table 1). For haplotype and intermarker linkage disequilibrium analyses Applicants chose the locations assigned by Ensembl.

Marker Panel

Applicants tested a total of 154 SNP markers spanning 1.5 Mb from JA03 at 24,033,400 bp through D6S2296 at 25,285,800 bp (Ensembl) in the CLDRC RD families. 109 SNPs were from the Celera database (www.celera.com) and 45 SNPs were from the dbSNP database (www.ncbi.nlm.nih.gov/SNP). The marker density was 8.7 kb per SNP. Minor allele frequencies were greater than 5% for cSNPs and greater than 15% for all others.

TaqMan: PCR plates in 384-well configuration were formatted with the Hydra II plus-one liquid handling system (Matrix Technologies, Hudson, N.H.). Reaction volumes were 2 μl with 1.6 ng of template DNA and TaqMan Universal Master Mix without uracil-DNA-glycosylase (ABI). Plates were cycled in the PE 9700 (ABI): initial denaturation step of 10 min at 95° C., followed by 40 cycles of 15 sec at 95° C. and 1 min at 60° C. Fluorescent signals were collected on the 7900HT (ABI) and converted to genotype data by the Sequence Detection System (SDS, ABI).

Pyrosequencing: Primers for pyrosequencing are listed in Supplementary Table 4. A total of 20 μl PCR reaction contained 10 ng of genomic DNA, 0.4 units Hotstart Taq polymerase (Qiagen), 4 pmoles of forward PCR primer, 0.4 pmoles of reverse PCR primer (5′-T3 tag), 3.6 pmoles of biotinylated T3 primer, 2.5 mM MgCl2, and 200 μM dNTPs. Thermal cycling conditions were 15 min at 95° C., followed by 45 cycles (30 sec at 95° C., 45 sec at 56° C., 60 sec at 72° C.), 5 min at 72° C., and a hold at 4° C. Upon completion of PCR, the biotinylated PCR product from the entire reaction was purified by binding to streptavidin-sepharose (Amersham) using the Filter Prep tool according to the standard protocol provided by Pyrosequencing, Inc. The Pyrosequencer software automatically scored each reaction and assigned genotypes.

Genotyping Results

Applicants genotyped a total of 147 SNPs distributed through the 1.5 Mb region surrounding JA04 in 153 nuclear RD families recruited by the Colorado Learning Disabilities Research Center (CLDRC). Origins, locations, and allele frequencies for the entire panel of 147 SNPs are presented in Supplementary Table 1. The overall success rate for genotyping was 90%. The average marker density was one SNP per 10.2 kb. The average marker density within genes was one SNP per 4.8 kb.
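The average marker density quoted above follows from dividing the region length by the number of SNPs. A short check of that arithmetic, taking the nominal 1.5 Mb region size and the 147 SNPs from the text (variable names are illustrative):

```python
REGION_BP = 1_500_000   # nominal size of the region, from the text
N_SNPS = 147            # SNPs genotyped, from the text

density_kb = REGION_BP / N_SNPS / 1000
print(f"one SNP per {density_kb:.1f} kb")
```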

DNA Sequence Analysis

Applicants sequenced PCR products generated from 26 RD and six non-RD samples selected from the CLDRC RD cohort corresponding to 21 exons of KIAA0319 (12.2 kb), 10 exons of DCDC2 (6.7 kb), and 11 exons of MRS2L (1.99 kb). No novel polymorphisms were identified in the exons or reported splice sites of KIAA0319 or DCDC2. Five non-synonymous cSNPs were found in MRS2L (Supplementary Table 1): MRS5, MRS6 (SNP 58), MRS7, and MRS8 were in exon 1, and MRS9 was in exon 11. Four novel cSNPs were also found in the 5-prime untranslated region (MRS1 through MRS4). MRS3 changed the start codon from ATG to ATC. The minor alleles of MRS1(A), MRS3(C), MRS4(G), MRS5(T), and MRS6(T) were transmitted only once in the RD cohort. All nine SNPs in MRS2L were genotyped in the RD families by fluorescent dideoxy sequencing or pyrosequencing.

Websites

Celera: http://www.celera.com
Coriell Institute: http://locus.umdnj.edu/
dbSNP database: http://www.ncbi.nlm.nih.gov/SNP
ENCODE: http://genome.cse.ucsc.edu/ENCODE/
FBAT: http://www.biostat.harvard.edu/~fbat/fbat.htm
GAS (Genetic Analysis System): http://www.hgmp.mrc.ac.uk/Registered/Option/gas.html
GOLD: http://www.sph.umich.edu/csg/abecasis/GOLD/index.html
Haploview: http://www.broad.mit.edu/personal/jcbarret/haplo/
Mutation Surveyor: http://www.softgenetics.com
Phrap, Phred, Consed: http://www.phrap.org/
PolyPhred: http://droog.mbt.washington.edu/PolyPhred.html
Pyrosequencing: http://www.pyrosequencing.com
QTDT: http://www.sph.umich.edu/csg/abecasis/QTDT
SimWalk2: http://watson.hgen.pitt.edu/docs/simwalk2
TESS: http://www.cbil.upenn.edu/tess
TRANSMIT: http://www-gene.cimr.cam.ac.uk/clayton/software/

Supplementary Table 1: Results of QTDT analysis with 147 SNPs.
SNP ID | SNPs | Gene | Haplotype Block | DISC | IQ | PTP | TWR | PWR | WR | PD | OCH | PDL | HCH | OC | PA | Ensembl Location | Celera Location | Allele Freq1 | Allele Freq2
1rs1925432inter-gene0.03540.035523401987245213000.306
2rs1886705inter-gene23487330246056520.481
3rs1001075inter-gene0.04060.060723943737250630590.327
4Cinter-gene24080100252045740.2820.31
2505961
5Cinter-gene24090597252153940.2500.31
2505937
6Cinter-gene24098041252228340.1450.18
2505926
7Cinter-gene0.09240.046124108706252334740.3180.4
2505896
8Cinter-gene0.06924136382252611520.2950.37
210230
9Cinter-gene0.04690.01520.019624177669253029780.3190.34
91009
10Cinter-gene0.085424189348253146570.3900.46
11831124
11Cinter-gene0.04824198476253237840.4190.48
282670
12Cinter-gene0.03320.067424207349253326570.2710.32
7454493
13Cinter-gene24216188253414960.2730.26
266646
14Cinter-gene0.06290.054524220516253458240.4460.46
1809129
15Cinter-gene0.095824227520253528280.2810.24
452337
16CVMP Int 20.092524239314253646340.4820.47
443745
17CVMP 3′UTR0.0350.022824254651253799710.2170.24
11831166
18Cinter-gene24259500253848160.4900.44
9373644
19rs3804320DCDC2 Int 924285932254112540.109
20CDCDC2 Int 90.04580.07590.026824286285254116060.1650.17
7454570
21rs2791971DCDC2 Int 8A24292952254183590.248
22rs2791972DCDC2 Int 8A0.070724293222254185510.208
23CDCDC2 Int 8A0.092424295803254211330.3840.34
113214
24rs3789221DCDC2 Int 8A24297512254228430.091
25rs2027584DCDC2 Int 8A24299475254248060.063
26C7454462DCDC2 Int 7B0.0237 0.07460.02710.04930.01730.04750.01150.050324315179254405080.2550.27
27rs793842DCDC2 Int 7B0.057524332467253531300.345
28rs793837DCDC2 Int 724338193253588720.173
29rs1087287DCDC2 Int 70.015824345269253660350.273
30rs793857DCDC2 Int 724353401253739880.075
31CDCDC2 Int 724381770255071660.2830.26
7454731
32rs807700DCDC2 Int 7C0.039924382384254025360.276
33C7454704DCDC2 Int 6C 24386848255122420.1680.21
34rs807722DCDC2 Int 6C0.048524387896255132910.278
35rs2328208DCDC2 Int 6C24393548254142180.181
36rs2296539DCDC2 Int 5C0.023424397408255228040.287
37C8344981 C0.082824399182255245750.2690.24
38rs707864DCDC2 Int 2C24413829254345130.083
39rs3895346DCDC2 Int 2C0.08124416015254366670.343
40CDCDC2 Int 224418623255440230.0890.17
7454814
41rs4269365DCDC2 Int 20.06724428411254490840.160
CDCDC2 Int 20.02752443360625559004
449792
42CDCDC2 Int 2D0.05724444623255700190.3610.37
7454810
43rs1923168DCDC2 Int 2D24447276254679210.132
44CDCDC2 Int 2D0.026424454035255794300.3980.39
7454798
45rs2753912DCDC2 Int 2D0.022924455603254762530.404
46rs6922023DCDC2 Int 2D24456095254787050.132
47rs2100377DCDC2 Int 2E24461259255866530.3080.32
48rs793719DCDC2 Int 10.056224482866255882610.433
49C7454790DCDC2 Int 1E 24463129255885230.4530.44
50Cinter-geneE0.050824477494256029300.3400.34
2100395
51Cinter-gene24483529256089670.3020.35
7466824
52rs611103inter-gene24490875256119540.327
53MRS1inter-gene24511109256365780.000
54MRS2MRS21, 5′UTR0.085924511186256366550.126
55MRS3MRS21, 5′UTR24511230256366990.000
56MRS4 24511265256367340.004
57MRS5MRS21, Ex. 124511279256357480.009
585960rs2295551MRS7MRS8 0.07972451138724511391245114452563683625636860258369140.0050.0220.131
61rs2273606MRS2L ln 124513336256340710.144
62CNRS2L ln 20.05624513747256392030.3860.31
2100415
63rs3761789MRS2L ln 224523169256486240.129
64rs1772253MRS2L ln 524528213256536870.368
65rs1056283MRS2L ln 90.09320.015824526355256490640.427
66rs13735MRS2L ln 90.08424531629256570830.311
67MRS9MRS2L ln 1024531804256572670.022
68CMRS2L ln 10F24533669258591220.1550.09
12090381
69C9359851 F0.024624538455256639120.2000.22
70CGPLD1 int 24F0.029324538541256639980.3510.3
9359852
71CGPLD1 int 24F0.01624538964256644170.3670.3
2100442
72C2100443GPLD1 int 24F 24539037256644900.3690.3
73rs1042303 24545437256708880.458
74CGPLD1 int 2024545617256710670.1760.17
7454653
75CGPLD1 int 2024546641256720910.3870.44
2100452
76CGPLD1 int 200.0250.07310.029524547579256730290.1570.31
2100460
77CGPLD1 int 170.07340.08260.08224555828256812830.3010.45
2100474
78CGPLD1 int 170.01390.03280.05890.05830.0310.01710.010324556040256814950.1750.18
7454980
79CGPLD1 int 150.04560.09320.0920.038224557618256630730.1430.3
2100479
80CGPLD1 int 1424558383256838360.2080.33
2100480
81CGPLD1 int 1324564285256897390.3300.26
9373740
82CGPLD1 int 1024574700257001520.2840.39
7466744
83CGPLD1 Int 324587761257132020.4050.24
2479643
84CGPLD1 int 324587852257132930.4230.49
2479645
85CGPLD1 5′UTR24597720257231540.3050.38
2479663
86Cinter-gene24599454257248880.2910.47
2479666
87CALDH5A1 int 324613009257384390.4340.49
2479683
88CALDH5A1 int 424622548257479730.132
15922308
89CALDH5A1 int 724831696257571240.2590.3
3073894
90CALDH5A1 int 824639429257648210.441
3073688
91CALDH5A1 3′UTR24842172257875640.3360.38
7468785
92Cinter-gene0.09380.095824552882257782800.2570.29
7466794
93Cinter-gene24853918257793160.4550.48
3073676
94C7468818 0.0650.04560.028224659643257850430.1680.14
95CKIAA0319 Ex 160.0230.01070.01124667260257926510.2660.23
3073687
96CKIAA0319 Int 1424672108257974980.1280.09
3073685
97rs2744550 24672524257979140.005
98CKIAA0319 int 1224676372258017610.4880.42
3073682
99CKIAA0319 int 824686029258114180.0720.13
3070501
100CKIAA0319 int 724687062258124510.0620.13
3073656
101CKIAA0319 int 624688600258139890.067
3073857
102CKIAA0319 int 524690011258154000.3680.43
3073656
103CKIAA0319 int 30.053524692345256177350.3320.37
1691926
104105rs4504469rs4576240 0.0874246968532470445725822263258298570.3160.150
106Cinter-gene24740511258858890.2650
2221340
107Cinter-gene24753399258787760.3030.29
2463872
108Cinter-gene24753589256789660.2780.29
2463870
109Cinter-gene0.091924754800258801630.4850.43
333352
110C16187858 24781252258866150.0410.06
111CTTRAP Ex 624761355258867180.2400.17
7466919
112CTHEM2 int 124775778259011410.2160.35
2463856
113CTHEM2 int 10.088824795744259211400.2370.17
7466950
114CTHEM2 Ex 20.04850.03450.07780.0460.09420.016824806194259315910.3330.29
3248054
115CC6orf62 3′UTR +24813814259392110.3470.48
3248047
116Cinter-gene24829646259550420.2290.23
2140734
117C11830308inter-geneG 0.034524872844259982360.111
118Cinter-geneG0.065924898441260238320.4480.44
2140695
119Cinter-geneG24907257260326480.4380.48
151407
120CC6orf32 int 21 ++24917860260432530.1530.15
11832109
121CC6orf32 int 19 ++24927196260465130.3900.34
152076
122CC6orf32 int 18 ++24935670260549870.3830.36
484656
123CC6orf32 int 14 ++0.05010.08160.051724944140260634570.1770.16
431320
124CC6orf32 int 12 ++24953760260730720.3840.3
11834072
125CC6orf32 int 7 ++0.092324967500260868140.1990.13
371663
126CC6orf32 int 5 ++0.086424976228260955430.3580.34
11198233
127Cinter-gene24988793261081080.3300.42
11198237
128Cinter-geneH25009329261286460.4200.49
15813950
129Cinter-geneH25016624261359410.4370.48
9360070
130C7460841inter-geneH 0.0550.081125022795261421060.2260.21
131Cinter-gene25028725261480360.1140.16
2320908
132Cinter-gene0.02930.060425039268261585770.3160.34
2336471
133Cinter-gene0.089625055520261748260.3810.37
2711470
134Cinter-gene0.067225058808261781120.4240.41
2711477
135Cinter-gene25085110261844160.3320.4
2711487
136Cinter-gene25081087282004040.1570.17
2530807
137Cinter-gene0.011425088590262079080.4280.37
7461306
138Cinter-gene25097132282164530.2790.28
2738571
139Cinter-gene25112704282320250.4180.45
9375211
140Cinter-gene25122023262413350.4430.45
3256976
141Cinter-gene25135267262545740.4110.44
3256996
142Cinter-gene0.06660.071525144959262642490.4310.39
3248665
143Cinter-gene25147997262672870.4290.45
3248675
144Cinter-gene0.047725152693282719840.2240.19
3248685
145rs304257inter-gene0.08170.092125159000282783140.332
146rs215013inter-gene25491223266126640.102
147rs220698inter-gene0.030145915568480381920.315
Note
*: Minor allele frequencies in our RD probands
**: Minor allele frequencies in Caucasians according to Celera
+: C6orf62 is a predicted gene in Ensembl database (Vega).
++: C6orf32 is a predicted gene in Ensembl database (Vega), and is equivalent to KIAA0385 in NCBI and Celera.
DISC: Discriminant Score
PTP: Phoneme Transposition
TWR: Timed Word Recognition
PWR: PIAT Word Recognition
WR: Word Recognition Composite
PD: Phonological Decoding
OCH: Orthographic Choice
PDL: Phoneme Deletion
HCH: Homonym Choice
OC: Orthographic Choice + Homonym Choice
PA: Phoneme Transposition + Phoneme Deletion
: cSNPs that change the amino acid sequence of the corresponding protein.
: Single marker TDT peaks with p <0.01.
: Minor allele frequency in populations other than Caucasian in the Celera database.

SUPPLEMENTARY TABLE 2a
Composition of haplotype blocks.
| Haplotype Block | Haplotype ID | Haplotype | Frequency |
| A: 5 markers; SNP ID: 21, 22, 23, 24, and 25; spanning 6523 bp in Ensembl | 1 | A A A G G | 0.60 |
| A | 2 | G C G G G | 0.25 |
| A | 3 | A A G A T | 0.07 |
| B: 2 markers; SNP ID: 26 and 27; spanning 17287.5 bp in Ensembl | 1 | G C | 0.63 |
| B | 2 | A T | 0.21 |
| B | 3 | G T | 0.15 |
| C: 8 markers; SNP ID: 32, 33, 34, 35, 36, 37, 38, and 39; spanning 33631 bp in Ensembl | 1 | G T G A G A T G | 0.62 |
| C | 2 | A T C G A T T C | 0.12 |
| C | 3 | G T G A G A T C | 0.06 |
| C | 4 | A C C A A T T A | 0.06 |
| C | 5 | A C C G A T A A | 0.05 |
| D: 5 markers; SNP ID: 42, 43, 44, 45, and 46; spanning 11472 bp in Ensembl | 1 | G G G A G | 0.54 |
| D | 2 | C G A T A | 0.15 |
| D | 3 | C A A T G | 0.13 |
| D | 4 | C G A T G | 0.10 |
| E: 3 markers; SNP ID: 47, 49, and 50; spanning 16235 bp in Ensembl | 1 | A G C | 0.53 |
| E | 2 | G A T | 0.30 |
| E | 3 | A A C | 0.11 |
| F: 5 markers; SNP ID: 68, 69, 70, 71, and 72; spanning 5368 bp in Ensembl | 1 | A T A A T | 0.64 |
| F | 2 | A A G G G | 0.22 |
| F | 3 | G T G G G | 0.11 |
| G: 3 markers; SNP ID: 117, 118, and 119; spanning 34413 bp in Ensembl | 1 | C A T | 0.46 |
| G | 2 | C C C | 0.38 |
| G | 3 | A C C | 0.12 |
| H: 3 markers; SNP ID: 128, 129, and 130; spanning 13466 bp in Ensembl | 1 | T A G | 0.38 |
| H | 2 | C G G | 0.31 |
| H | 3 | C G A | 0.21 |

Supplementary Table 2b: Haplotype-TDT results for blocks A through H.
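| Block | Haplotype ID | DISC Z-score | DISC P-value | IQ Z-score | IQ P-value | PTP Z-score | PTP P-value | TWR Z-score | TWR P-value | PWR Z-score | PWR P-value | WR Z-score | WR P-value |
| A | 1 | −1.251 | 0.211 | 2.220 | | −1.768 | 0.077 | −1.675 | 0.094 | −2.256 | | −2.004 | |
| A | 2 | −0.231 | 0.817 | −1.424 | 0.154 | 0.599 | 0.549 | 0.613 | 0.540 | 0.747 | 0.455 | 0.703 | 0.482 |
| A | 3 | 1.752 | 0.080 | −0.915 | 0.360 | 1.569 | 0.117 | 1.349 | 0.177 | 1.822 | 0.068 | 1.646 | 0.100 |
| B | 1 | −1.233 | 0.218 | 1.307 | 0.191 | −0.236 | 0.813 | −1.410 | 0.159 | −1.658 | 0.097 | −1.567 | 0.117 |
| B | 2 | 0.518 | 0.605 | 0.631 | 0.526 | −0.933 | 0.351 | −0.587 | 0.557 | −0.169 | 0.866 | −0.363 | 0.717 |
| B | 3 | 1.140 | 0.254 | −1.735 | 0.083 | 1.359 | 0.174 | 1.905 | 0.057 | 1.928 | | 1.942 | |
| C | 1 | −1.157 | 0.247 | 1.670 | 0.095 | −1.168 | 0.243 | −2.102 | | −1.996 | | −2.086 | |
| C | 2 | 0.471 | 0.638 | −1.042 | 0.297 | 0.354 | 0.723 | 1.258 | 0.208 | 1.016 | 0.309 | 1.147 | 0.251 |
| C | 3 | 1.125 | 0.260 | 0.607 | 0.544 | 0.335 | 0.738 | 0.345 | 0.730 | 0.731 | 0.485 | 0.546 | 0.585 |
| C | 4 | −0.011 | 0.991 | 0.338 | 0.736 | −1.392 | 0.164 | −0.290 | 0.772 | −0.259 | 0.795 | −0.283 | 0.778 |
| C | 5 | −0.386 | 0.700 | −1.057 | 0.291 | −0.252 | 0.801 | 0.095 | 0.925 | 0.185 | 0.853 | 0.144 | 0.886 |
| D | 1 | −1.346 | 0.178 | −0.621 | 0.534 | 0.230 | 0.818 | −0.672 | 0.502 | −0.930 | 0.352 | −0.821 | 0.411 |
| D | 2 | 0.094 | 0.925 | −2.171 | | 0.232 | 0.816 | 1.614 | 0.107 | 1.718 | 0.086 | 1.703 | 0.089 |
| D | 3 | 0.446 | 0.656 | −0.207 | 0.836 | 0.979 | 0.327 | 0.640 | 0.522 | 0.728 | 0.466 | 0.696 | 0.487 |
| D | 4 | 1.454 | 0.146 | 2.080 | | −0.992 | 0.321 | −0.618 | 0.536 | −0.362 | 0.718 | −0.483 | 0.629 |
| E | 1 | 0.059 | 0.953 | 0.020 | 0.984 | 0.984 | 0.325 | 0.151 | 0.860 | −0.241 | 0.809 | −0.059 | 0.953 |
| E | 2 | 1.518 | 0.129 | −0.976 | 0.329 | 1.348 | 0.178 | 1.467 | 0.143 | 2.010 | | 1.822 | 0.068 |
| E | 3 | −0.673 | 0.501 | 1.304 | 0.192 | −2.213 | | −1.303 | 0.192 | −1.454 | 0.146 | −1.417 | 0.157 |
| F | 1 | 1.416 | 0.157 | −0.005 | 0.996 | 1.193 | 0.233 | 0.824 | 0.410 | 0.903 | 0.366 | 0.873 | 0.383 |
| F | 2 | −2.486 | | −0.418 | 0.676 | −1.778 | 0.075 | −1.741 | 0.082 | −1.782 | 0.075 | −1.837 | 0.066 |
| F | 3 | −0.242 | 0.809 | 1.233 | 0.217 | −0.550 | 0.582 | −0.252 | 0.801 | −0.253 | 0.800 | −0.211 | 0.833 |
| G | 1 | −1.518 | 0.129 | 2.210 | | −0.862 | 0.389 | −1.904 | 0.057 | −1.947 | | −2.053 | |
| G | 2 | 0.962 | 0.336 | −1.043 | 0.297 | −0.195 | 0.845 | 1.692 | 0.091 | 1.174 | 0.241 | 1.252 | 0.211 |
| G | 3 | 0.227 | 0.821 | −0.777 | 0.437 | 0.456 | 0.648 | 0.342 | 0.733 | 0.261 | 0.794 | 0.302 | 0.762 |
| H | 1 | 1.351 | 0.177 | 0.803 | 0.422 | −0.113 | 0.910 | 0.026 | 0.979 | 0.614 | 0.539 | 0.348 | 0.728 |
| H | 2 | −1.399 | 0.162 | −0.173 | 0.862 | 0.049 | 0.961 | −0.704 | 0.481 | −0.911 | 0.362 | −0.825 | 0.410 |
| H | 3 | −0.955 | 0.340 | 0.184 | 0.854 | −0.484 | 0.629 | −0.887 | 0.375 | −1.062 | 0.288 | −0.995 | 0.320 |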
Supplementary Table 2b: Haplotype-TDT results for blocks A through H.
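| Block | Haplotype ID | PD Z-score | PD P-value | OCH Z-score | OCH P-value | PDL Z-score | PDL P-value | HCH Z-score | HCH P-value | OC Z-score | OC P-value | PA Z-score | PA P-value |
| A | 1 | −1.588 | 0.112 | −1.340 | 0.180 | −1.738 | 0.082 | −0.960 | 0.337 | −1.400 | 0.161 | −2.039 | |
| A | 2 | 0.902 | 0.367 | 0.742 | 0.458 | 1.384 | 0.186 | 0.768 | 0.442 | 0.719 | 0.472 | 1.074 | 0.283 |
| A | 3 | 1.170 0.242, 0.815 0.415, 1.102 0.270, 1.564 0.118 |
| B | 1 | −1.333 | 0.183 | −0.997 | 0.319 | −1.317 | 0.188 | 0.616 | 0.538 | −0.704 | 0.481 | −1.141 | 0.254 |
| B | 2 | −0.793 | 0.428 | −1.552 | 0.121 | −0.412 | 0.680 | −1.354 | 0.176 | −1.510 | 0.131 | −0.681 | 0.496 |
| B | 3 | 2.286 | | 2.220 | | 1.996 | | 0.664 | 0.507 | 2.020 | | 2.062 | |
| C | 1 | −1.902 | 0.057 | −2.181 | | −1.975 | | −0.930 | 0.352 | −2.137 | | −2.064 | |
| C | 2 | 1.270 | 0.204 | 2.136 | | 1.329 | 0.184 | 0.802 | 0.423 | 1.958 | | 1.197 | 0.231 |
| C | 3 | 0.568 0.570, −0.019 0.985, −0.023 0.981, 0.526 0.599 |
| C | 4 | −1.728 0.084, −1.345 0.179, 0.583 0.560, −1.171 0.241, −0.710 0.478 |
| C | 5 | 0.345 0.730, 0.242 0.806, 0.129 0.897, −0.336 0.736 |
| D | 1 | −0.811 | 0.417 | 0.717 | 0.473 | −0.339 | 0.734 | 0.795 | 0.426 | 0.723 | 0.470 | −0.206 | 0.835 |
| D | 2 | 2.353 | | 2.268 | | 1.464 | 0.143 | −0.257 | 0.798 | 1.754 | 0.079 | 1.392 | 0.164 |
| D | 3 | 0.809 | 0.543 | −0.452 | 0.651 | 0.258 | 0.796 | 1.167 | 0.243 | −0.181 | 0.856 | 0.548 | 0.584 |
| D | 4 | −1.194 | 0.232 | −2.074 | | −0.587 | 0.557 | −1.637 | 0.102 | −2.034 | | −0.821 | 0.412 |
| E | 1 | 0.093 | 0.926 | 0.999 | 0.318 | 1.567 | 0.117 | 1.269 | 0.204 | 0.954 | 0.340 | 1.305 | 0.192 |
| E | 2 | 1.604 | 0.109 | 0.129 | 0.897 | 0.649 | 0.516 | 0.849 | 0.396 | 0.355 | 0.723 | 1.028 | 0.304 |
| E | 3 | −1.302 0.193, −1.081 0.280, −2.625, −0.793 0.428, −2.598 |
| F | 1 | 0.890 | 0.373 | 1.188 | 0.235 | 1.322 | 0.186 | 1.716 | 0.086 | 1.341 | 0.180 | 1.325 | 0.165 |
| F | 2 | −1.681 | 0.093 | −1.470 | 0.142 | −2.297 | | −2.346 | | −1.659 | 0.097 | −2.102 | |
| F | 3 | −0.672 | 0.502 | −0.714 | 0.475 | −0.412 | 0.680 | −0.512 | 0.609 | −0.631 | 0.528 | −0.580 | 0.562 |
| G | 1 | −1.788 | 0.074 | −2.296 | | −2.033 | | −1.722 | 0.085 | −2.455 | | −1.780 | 0.075 |
| G | 2 | 1.593 | 0.111 | 1.000 | 0.317 | 1.401 | 0.161 | 1.249 | 0.212 | 1.304 | 0.192 | 0.830 | 0.407 |
| G | 3 | 0.363 | 0.717 | 0.821 | 0.412 | 0.477 | 0.634 | 0.518 | 0.604 | 0.652 | 0.514 | 0.658 | 0.511 |
| H | 1 | −0.060 | 0.952 | −1.270 | 0.204 | 0.447 | 0.655 | −1.768 | 0.077 | −1.367 | 0.172 | 0.093 | 0.926 |
| H | 2 | −0.417 | 0.677 | −0.439 | 0.660 | −0.079 | 0.937 | 1.126 | 0.260 | 0.217 | 0.828 | 0.155 | 0.877 |
| H | 3 | −0.814 | 0.416 | 0.241 | 0.810 | −0.633 | 0.527 | −0.060 | 0.953 | −0.367 | 0.714 | −0.778 | 0.437 |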
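
The P-values reported alongside the Z-scores in Supplementary Table 2b appear consistent with a two-sided tail probability under a standard normal distribution. The sketch below is an illustration under that assumption, not the authors' stated method; the function name `two_sided_p` is hypothetical. It recomputes P-values for a few Z-scores copied from the table so they can be compared with the reported values.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided tail probability of a standard normal for Z-score z.

    Assumption: the table's P-values are two-sided normal tail areas.
    erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


# (Z-score, reported P-value) pairs copied from Supplementary Table 2b:
# block A haplotypes 1-3 for DISC, and block B haplotype 3 for TWR.
reported = [(-1.251, 0.211), (-0.231, 0.817), (1.752, 0.080), (1.905, 0.057)]

for z, p_reported in reported:
    p_computed = two_sided_p(z)
    print(f"Z = {z:+.3f}  computed p = {p_computed:.3f}  reported p = {p_reported:.3f}")
```

Under the same assumption, the function could also be used to estimate P-values for cells where the table shows a Z-score but no P-value.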

SUPPLEMENTARY TABLE 3
Alleles and frequencies of the compound STR, dbSTS ID 808238.
| Allele | Repeat Unit 1 | Repeat Unit 2 | SNP1 | Repeat Unit 3 | Repeat Unit 4 | Repeat Unit 5 | Allele Freq (1) |
| 1 | (GAGAGGAAGGAAA)2 | (GGAA)7 | | (GGAA)2 | (GGAA)4 | (GGAA)2 | 0.624 |
| 2 | (GAGAGGAAGGAAA)1 | (GGAA)9 | DelGAAA | (GGAA)0 | (GGAA)4 | (GGAA)2 | 0.003 |
| 3 | (GAGAGGAAGGAAA)1 | (GGAA)6 | | (GGAA)2 | (GGAA)4 | (GGAA)2 | 0.060 |
| 4 | (GAGAGGAAGGAAA)2 | (GGAA)6 | | (GGAA)2 | (GGAA)4 | (GGAA)2 | 0.106 |
| 5 | (GAGAGGAAGGAAA)2 | (GGAA)8 | | (GGAA)2 | (GGAA)4 | (GGAA)2 | 0.028 |
| 6 | (GAGAGGAAGGAAA)2 | (GGAA)8 | | (GGAA)2 | (GGAA)3 | (GGAA)2 | 0.039 |
| 7 | (GAGAGGAAGGAAA)2 | (GGAA)8 | | (GGAA)1 | (GGAA)4 | (GGAA)2 | 0.003 |
| 8 | (GAGAGGAAGGAAA)2 | (GGAA)7 | DelGAAA | (GGAA)0 | (GGAA)4 | (GGAA)2 | 0.003 |
| 9 | (GAGAGGAAGGAAA)1 | (GGAA)7 | | (GGAA)2 | (GGAA)4 | (GGAA)2 | 0.005 |
| 10 | (GAGAGGAAGGAAA)2 | (GGAA)4 | | (GGAA)2 | (GGAA)4 | (GGAA)2 | 0.044 |
| 14 | x | x | x | x | x | x | 0.085 |
(1) Frequency among parents of the CLDRC families
Allele 14 is the 2,445 bp deletion
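
As a quick consistency check on Supplementary Table 3, the listed allele frequencies can be summed to confirm that they account for the whole sample. The following is a minimal sketch; the variable names are illustrative, and the only data used are the frequencies copied from the table.

```python
# Allele frequencies copied from Supplementary Table 3, keyed by allele number.
# Allele 14 is the deletion allele; the others are alleles of the compound STR.
allele_freq = {
    1: 0.624, 2: 0.003, 3: 0.060, 4: 0.106, 5: 0.028, 6: 0.039,
    7: 0.003, 8: 0.003, 9: 0.005, 10: 0.044, 14: 0.085,
}

total = sum(allele_freq.values())                 # should be close to 1
most_common = max(allele_freq, key=allele_freq.get)
non_deletion = total - allele_freq[14]            # combined STR allele frequency

print(f"sum of frequencies: {total:.3f}")
print(f"most common allele: {most_common}")
print(f"combined frequency of non-deletion alleles: {non_deletion:.3f}")
```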

SUPPLEMENTARY TABLE 4
Pyrosequencing primers.
MarkerPCR primer 1PCR primer 2Extension Primer
rs503811ATTAACCCTCACTAAAGGGATTCTAATACGACTCACTATAGGGAGAGTTTGAATAGGAAAGGAT
tgtctagaggaatggattctgaccgcattattcaaaagcaagctgtgt
rs1925432ATTAACCCTCACTAAAGGGATTCTAATACGACTCACTATAGGGAGAGATGCAATCAATGGTAAT
tcaattatccaatgggaaagagcatctctaacacaggcaggatg
rs1886705ATTAACCCTCACTAAAGGGATTCTAATACGACTCACTATAGGGAGAACCTGTGCACAGTTTGA
ttgggtgctccttaaaccatttttctgtcctttactctttccctgaa
rs1001075ATTAACCCTCACTAAAGGGATTCTAATACGACTCACTATAGGGAGATGGCTGCTTAACAACCCAATAAAT
ttcaagaataggggaaatgttcatgcttccttaatggctgcttaac
rs1511468ATTAACCCTCACTAAAGGGATTCTAATACGACTCACTATAGGGAGAGGAGACCTCTGCAGATACGTACTA
cattctgttcttggatggagaccgaacccaaacacttgaccaaaag
rs304257ATTAACCCTCACTAAAGGGATTCTAATACGACTCACTATAGGGAGAATCTTCAGCATTGTCAACCTGACC
acttgccaccatcttttgttgttcatggatcttcagcattgtcaac