Title:
Polymorphisms Associated With Age-Related Macular Degeneration And Methods For Evaluating Patient Risk
Kind Code:
A1


Abstract:
The present invention provides for certain polynucleotide sequences that have been correlated to AMD. These polynucleotides are useful as diagnostics, and are preferably used to fabricate an array, useful for screening patient samples. The array is used as part of a laboratory information management system, to store and process additional patient information in addition to the patient's genomic profile. As described herein, the system provides an assessment of the patient's risk for developing AMD, risk for disease progression, and the likelihood of disease prevention based on patient controllable factors.



Inventors:
Seddon, Johanna M. (Boston, MA, US)
Daly, Mark (Arlington, MA, US)
Application Number:
12/988340
Publication Date:
07/21/2011
Filing Date:
04/17/2009
Assignee:
Tufts Medical Center, Inc., (f/k/a/ New England Medical Center Hospitals, Inc.) (Boston, MA, US)
Primary Class:
Other Classes:
435/6.1, 435/6.11, 435/6.19, 506/16, 506/39, 536/23.1
International Classes:
C12Q1/68; C07H21/00; C40B30/00; C40B40/06; C40B60/12



Other References:
dbSNP details for ss42299045 (Jul 17, 2005), from www.ncbi.nlm.nih.gov, page 1.
Cipriani V. et al. European Journal of Human Genetics (2012) 20, 1-2.
van de Ven J.P.H. et al. Molecular Vision 2012; 18:2271-2278.
Goto A. et al. j ocul biol dis inform (2009) 2:164-175.
dbSNP Submitted SNP(ss) Details: ss68907811, from http://www.ncbi.nlm.nih.gov/projects/SNP/snp, 09/11/2012, page 1.
GeneCards output for 'CFI Gene', from http://www.genecards.org, 09/11/2012, pages 1-11.
Pennisi E. Science; Sep 18, 1998; 281, 5384, p.1787-1789.
Hegele R.A. Arterioscler Thromb Vasc Biol. 2002;22:1058-1061.
Juppner H. Bone Vol. 17, No. 2, Supplement, August 1995:39S-42S
Wall J.D. et al. Nature Reviews - Genetics (Aug 2003) vol 4, p.587-597.
Zill P. et al. Molecular Psychiatry (2004) 9, 1030-1036.
Losonczy, G. et al. PLOS ONE, November 2012, Volume 7, Issue 11, e50181, pages 1-8.
NIH News (Dec. 12, 2006) NIH Launches dbGaP, a Database of Genome Wide Association Studies, 2 printed pages from: http://www.nih.gov/news/pr/dec2006/nlm-12.htm
Bahcall O. et al. Nature Genetics, Vol. 39, No. 2, (Feb. 2007) page 149
Primary Examiner:
KAPUSHOC, STEPHEN THOMAS
Attorney, Agent or Firm:
K&L Gates LLP-Boston (STATE STREET FINANCIAL CENTER One Lincoln Street BOSTON MA 02111-2950)
Claims:
What is claimed is:

1. A method for diagnosing age-related macular degeneration or a susceptibility to age-related macular degeneration comprising detecting the presence or absence of a particular allele at a polymorphic site associated with complement factor I, wherein the allele is indicative of age-related macular degeneration or a susceptibility to age-related macular degeneration.

2. The method of claim 1, wherein the polymorphic site is a single nucleotide polymorphism.

3. The method of claim 2, wherein the single nucleotide polymorphism is selected from the group consisting of: rs13117504 (SEQ ID NO:8), wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration; rs10033900 (SEQ ID NO:9), wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration, and the polymorphic site of SEQ ID NO:10, wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration.

4. The method of claim 1, wherein the presence or absence of a particular allele is detected by a hybridization assay.

5. The method of claim 1, wherein the presence or absence of a particular allele is determined using a microarray.

6. The method of claim 1, wherein the presence or absence of a particular allele is determined using an antibody.

7. The method of claim 1, further comprising detecting one or more genetic markers associated with age-related macular degeneration and a gene other than complement factor I.

8. The method of claim 7, wherein the one or more genetic markers are associated with C3, C5 or CFH.

9. A purified polynucleotide comprising the polymorphic site and at least about six or more contiguous nucleotides of one or more of the sequences given as SEQ ID NOS:8, 9 or 10, wherein the minor allele is present at the polymorphic site.

10. A diagnostic array comprising one or more polynucleotide probes that are complementary to a polynucleotide of claim 9.

11. A diagnostic system comprising: a diagnostic array of claim 8, an array reader, an image processor, a database having data records and information records, a processor, and an information output; wherein the system compiles and processes patient data and outputs information relating to the statistical probability of the patient developing AMD.

12. A method of determining the risk of a subject for developing age-related macular degeneration, comprising detecting the presence or absence of one or more alleles at one or more polymorphic sites associated with complement factor I and age-related macular degeneration, wherein the sample is from a subject who is determined to be at risk for developing age-related macular degeneration due to one or more environmental risk factors.

13. The method of claim 12, wherein at least one of the one or more polymorphic sites is a single nucleotide polymorphism.

14. The method of claim 13, wherein the single nucleotide polymorphism is selected from the group consisting of: rs13117504 (SEQ ID NO:8), wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration; rs10033900 (SEQ ID NO:9), wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration, and the polymorphic site of SEQ ID NO:10, wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration.

15. The method of claim 12, wherein the one or more environmental risk factors are selected from the group consisting of: obesity, smoking, vitamin and dietary supplement intake, use of alcohol or drugs, poor diet and a sedentary lifestyle.

Description:

GOVERNMENT SUPPORT

This invention was made with government support under EY011309 awarded by the National Institutes of Health. Additional funding was provided by the National Eye Institute (N01-EY-0-2127) and grant U54 RR020278 from the National Center for Research Resources. The government may have certain rights in the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a plot showing sensitivities and specificities for a variety of risk score cutpoints and ROC curves for prediction of advanced age-related macular degeneration among younger and older age groups.

FIG. 2 shows histograms of advanced age-related macular degeneration risk scores for cases and controls in the original sample (above) and the replication sample (below), based on all genetic variants as well as demographic and environmental variables.

FIG. 3 shows sequences with alleles at polymorphic sites: rs2230199 (SEQ ID NO:1), rs1061170 (SEQ ID NO:2), rs10490924 (SEQ ID NO:3), rs9332739 (SEQ ID NO:4), rs641153 (SEQ ID NO:5), rs1410996 (SEQ ID NO:6) and rs2230203 (SEQ ID NO:7).

FIGS. 4.1 through 4.27 show the complement factor markers that were examined for their association with AMD.

BACKGROUND

Age-related macular degeneration (AMD) is the most common geriatric eye disorder leading to blindness. Macular degeneration is responsible for visual handicap in what is conservatively estimated to be approximately 16 million individuals worldwide. Among the elderly, the overall prevalence is estimated between 5.7% and 30% depending on the definition of early AMD, and its differentiation from features of normal aging, a distinction that remains poorly understood.

Histopathologically, the hallmark of early neovascular AMD is accumulation of extracellular drusen, basal laminar deposit (abnormal material located between the plasma membrane and basal lamina of the retinal pigment epithelium) and basal linear deposit (material located between the basal lamina of the retinal pigment epithelium and the inner collagenous zone of Bruch's membrane). The end stage of AMD is characterized by a complete degeneration of the neurosensory retina and of the underlying retinal pigment epithelium in the macular area. Advanced stages of AMD can be subdivided into geographic atrophy and exudative AMD. Geographic atrophy is characterized by progressive atrophy of the retinal pigment epithelium. In exudative AMD the key phenomenon is the occurrence of choroidal neovascularisation (CNV). Eyes with CNV have varying degrees of reduced visual acuity, depending on location, size, type and age of the neovascular lesion. The development of choroidal neovascular membranes can be considered a late complication in the natural course of the disease, possibly due to tissue disruption (Bruch's membrane) and decompensation of the underlying longstanding processes of AMD.

Many pathophysiological aspects as well as vascular and environmental risk factors are associated with a progression of the disease, but little is known about the etiology of AMD itself as well as about the underlying processes of complications like the occurrence of CNV. Family, twin, segregation, and case-control studies suggest the involvement of genetic factors in the etiology of AMD. The extent of heritability, number of genes involved, and mechanisms underlying phenotypic heterogeneity, however, are unknown. The search for genes and markers related to AMD faces challenges—onset is late in life, and there is usually only one generation available for studies. The parents of patients are often deceased, and the children are too young to manifest the disease. Generally, the heredity of late-onset diseases has been difficult to estimate because of the uncertainties of the diagnosis in previous generations and the inability to diagnose AMD among the children of an affected individual. Even in the absence of the ambiguities in the diagnosis of AMD in previous generations, the late onset of the condition itself, natural death rates, and small family sizes result in underestimation of genetic forms of AMD, and in overestimation of rates of sporadic disease. Moreover, the phenotypic variability is considerable, and it is conceivable that the currently used diagnostic entity of AMD in fact represents a spectrum of underlying conditions with various genetic and environmental factors involved.

Early detection of AMD would reduce the growing societal burden due to AMD by targeting and emphasizing modifiable habits earlier in life and recommending more frequent surveillance for those highly susceptible to the disease. Treatment trials will also benefit from such information when enrolling participants. There remains, therefore, a strong need for improved methods of diagnosing or predicting AMD or a susceptibility to AMD in subjects, as well as for evaluating and developing new methods of treatment.

SUMMARY

Described herein are methods and compositions that allow for improved diagnosis of AMD and susceptibility to AMD. The compositions and methods are directed to the unexpected discovery of genetic markers and causative polymorphisms in genes associated with the complement pathway. These markers and polymorphisms are useful for diagnosing AMD or a susceptibility to AMD, for use as drug targets, for identifying therapeutic agents, and for determining the efficacy of and a subject's responsiveness to a therapeutic treatment.

One embodiment is directed to a method for diagnosing age-related macular degeneration or a susceptibility to age-related macular degeneration comprising detecting the presence or absence of a particular allele at a polymorphic site associated with complement factor I, wherein the allele is indicative of age-related macular degeneration or a susceptibility to age-related macular degeneration. In a particular embodiment, the polymorphic site is a single nucleotide polymorphism (e.g., selected from the group consisting of: rs13117504 (SEQ ID NO:8), wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration; rs10033900 (SEQ ID NO:9), wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration, and the polymorphic site of SEQ ID NO:10, wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration). In a particular embodiment, the presence or absence of a particular allele is detected by a hybridization assay. In a particular embodiment, the presence or absence of a particular allele is determined using a microarray. In a particular embodiment, the presence or absence of a particular allele is determined using an antibody. In a particular embodiment, the method further comprises detecting one or more genetic markers associated with age-related macular degeneration and a gene other than complement factor I (e.g., C3, C5 or CFH).

One embodiment is directed to a purified polynucleotide comprising the polymorphic site and at least about six or more contiguous nucleotides of one or more of the sequences given as SEQ ID NOS:8, 9 or 10, wherein the minor allele is present at the polymorphic site.

One embodiment is directed to a diagnostic array comprising one or more polynucleotide probes that are complementary to a polynucleotide described herein.

One embodiment is directed to a diagnostic system comprising: a diagnostic array described herein, an array reader, an image processor, a database having data records and information records, a processor, and an information output; wherein the system compiles and processes patient data and outputs information relating to the statistical probability of the patient developing AMD.

One embodiment is directed to a method of determining the risk of a subject for developing age-related macular degeneration, comprising detecting the presence or absence of one or more alleles at one or more polymorphic sites associated with complement factor I and age-related macular degeneration, wherein the sample is from a subject who is determined to be at risk for developing age-related macular degeneration due to one or more environmental risk factors. In a particular embodiment, at least one of the one or more polymorphic sites is a single nucleotide polymorphism. In a particular embodiment, the single nucleotide polymorphism is selected from the group consisting of: rs13117504 (SEQ ID NO:8), wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration; rs10033900 (SEQ ID NO:9), wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration, and the polymorphic site of SEQ ID NO:10, wherein the guanine allele is indicative of age-related macular degeneration or susceptibility to age-related macular degeneration. In a particular embodiment, the one or more environmental risk factors are selected from the group consisting of: obesity, smoking, vitamin and dietary supplement intake, use of alcohol or drugs, poor diet and a sedentary lifestyle.

One embodiment is directed to a method of evaluating a treatment for age-related macular degeneration, comprising: determining a level, or an activity, or both said level and said activity, of at least one transcription and/or translation product, or fragment thereof, of SEQ ID NOS:1-10 (Example 3 and FIG. 3), and comparing said level and/or activity to a reference value representing a known disease or health status, thereby evaluating the treatment of AMD.

One embodiment is directed to a method of making a diagnostic array of the invention comprising: applying to a substrate, at a plurality of particular addresses on the substrate, a sample of the individual purified polynucleotide compositions comprising SEQ ID NOS:1-10.

One embodiment is directed to a kit and its use for diagnosis, or prognosis of age-related macular degeneration, or for determination of increased risk of developing AMD, or for monitoring a progression of age-related macular degeneration in a subject, or for monitoring success or failure of a therapeutic treatment of said subject.

DETAILED DESCRIPTION

Described herein is the unexpected discovery that particular alleles at polymorphic sites associated with genes coding for proteins involved in the complement pathway are useful as markers for AMD and susceptibility to AMD. The compositions and methods described herein refer in particular to markers associated with complement factor I (CFI), alone or in combination with other markers associated with AMD or a susceptibility to AMD.

As used herein, “gene” is a term used to describe a genetic element that gives rise to expression products (e.g., pre-mRNA, mRNA and polypeptides). A gene includes regulatory elements and sequences that otherwise appear to have only structural features, e.g., introns and untranslated regions.

The genetic markers are particular “alleles” at “polymorphic sites” associated with particular complement factors, e.g., CFI. A nucleotide position at which more than one nucleotide can be present in a population (either a natural population or a synthetic population, e.g., a library of synthetic molecules), is referred to herein as a “polymorphic site”. Where a polymorphic site is a single nucleotide in length, the site is referred to as a single nucleotide polymorphism (“SNP”). If at a particular chromosomal location, for example, one member of a population has an adenine and another member of the population has a thymine at the same genomic position, then this position is a polymorphic site, and, more specifically, the polymorphic site is a SNP. Polymorphic sites can allow for differences in sequences based on substitutions, insertions or deletions. Each version of the sequence with respect to the polymorphic site is referred to herein as an “allele” of the polymorphic site. Thus, in the previous example, the SNP allows for both an adenine allele and a thymine allele.
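The definition of a polymorphic site given above can be illustrated with a minimal Python sketch. It assumes a set of already aligned, equal-length sequences and reports each position at which more than one nucleotide is observed, together with the count of each allele. The function name and the short sequences are hypothetical and chosen only to mirror the adenine/thymine example; the sketch handles substitutions only, not insertions or deletions.

```python
from collections import Counter


def polymorphic_sites(aligned_sequences):
    """Map each polymorphic position to its allele counts.

    Assumes all sequences are aligned and of equal length, so that
    position i refers to the same genomic position in every sequence.
    """
    sites = {}
    for position, column in enumerate(zip(*aligned_sequences)):
        counts = Counter(column)
        if len(counts) > 1:  # more than one nucleotide seen at this position
            sites[position] = dict(counts)
    return sites


# Hypothetical population sample: one member carries T where the others carry A.
population = ["GATTACA", "GATTACA", "GTTTACA", "GATTACA"]
snps = polymorphic_sites(population)
print(snps)  # {1: {'A': 3, 'T': 1}}
```

Here position 1 is a SNP with an adenine allele and a thymine allele, and every other position is monomorphic in this sample.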

A reference sequence is typically referred to for a particular genetic element, e.g., a gene. Alleles that differ from the reference are referred to as “variant” alleles. The reference sequence, often chosen as the most frequently occurring allele or as the allele conferring a typical phenotype, is sometimes referred to as the “wild-type” allele.

Some variant alleles can include changes that affect a polypeptide, e.g., the polypeptide encoded by a complement pathway gene. These sequence differences, when compared to a reference nucleotide sequence, can include the insertion or deletion of a single nucleotide, or of more than one nucleotide, resulting in a frame shift; the change of at least one nucleotide, resulting in a change in the encoded amino acid; the change of at least one nucleotide, resulting in the generation of a premature stop codon; the deletion of several nucleotides, resulting in a deletion of one or more amino acids encoded by the nucleotides; the insertion of one or several nucleotides, such as by unequal recombination or gene conversion, resulting in an interruption of the coding sequence of a reading frame; duplication of all or a part of a sequence; transposition; or a rearrangement of a nucleotide sequence. Alternatively, a polymorphism associated with AMD or a susceptibility to AMD can be a synonymous change in one or more nucleotides (i.e., a change that does not result in a change to a codon of a complement pathway gene). Such a polymorphism can, for example, alter splice sites, affect the stability or transport of mRNA, or otherwise affect the transcription or translation of the polypeptide. The polypeptide encoded by the reference nucleotide sequence is the “reference” polypeptide with a particular reference amino acid sequence, and polypeptides encoded by variant alleles are referred to as “variant” polypeptides with variant amino acid sequences.

A genetic marker is “associated” with a genetic element or phenotypic trait, for example, if the marker is co-present with the genetic element or phenotypic trait at a frequency that is higher than would be predicted by random assortment of alleles (based on the allele frequencies of the particular population). Association also indicates physical association, e.g., proximity in the genome or presence in a haplotype block, of a marker and a genetic element.

Haplotypes are a combination of genetic markers, e.g., particular alleles at polymorphic sites. The haplotypes described herein are associated with AMD and/or a susceptibility to AMD. Detection of the presence or absence of the haplotypes herein is therefore indicative of AMD, a susceptibility to AMD, or a lack thereof. The haplotypes described herein are a combination of genetic markers, e.g., SNPs and microsatellites. Detecting haplotypes, therefore, can be accomplished by methods known in the art for detecting sequences at polymorphic sites.

The haplotypes and markers disclosed herein are in “linkage disequilibrium” (LD) with preferred complement pathway gene(s), e.g., CFI, and likewise, AMD and complement-associated phenotypes. “Linkage” refers to a higher than expected statistical association of genotypes and/or phenotypes with each other. “LD” refers to a non-random assortment of two genetic elements. If a particular genetic element (e.g., an allele at a polymorphic site), for example, occurs in a population at a frequency of 0.25 and another occurs at a frequency of 0.25, then the predicted occurrence of a person's having both elements is 0.0625 (0.25 × 0.25), assuming a random distribution of the elements. If, however, it is discovered that the two elements occur together at a frequency statistically significantly higher than 0.0625, then the elements are said to be in LD since they tend to be inherited together at a higher frequency than what their independent allele frequencies would predict. Roughly speaking, LD is generally correlated with the frequency of recombination events between the two elements. Allele frequencies can be determined in a population, for example, by genotyping individuals in a population and determining the occurrence of each allele in the population. For populations of diploid individuals, e.g., human populations, individuals will typically have two alleles for each genetic element (e.g., a marker or gene).
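As a rough illustration of the comparison described above, the following Python sketch computes, from a sample of haplotypes, the frequency of each element, the co-occurrence expected under independence (the product of the two frequencies), and the commonly used disequilibrium coefficient D (observed minus expected). The sample counts are hypothetical, and a real analysis would also test whether the departure from expectation is statistically significant.

```python
def ld_summary(haplotypes):
    """Summarize co-occurrence of two genetic elements across haplotypes.

    Each haplotype is a pair of booleans: (carries element A, carries element B).
    Returns (p_a, p_b, observed_ab, expected_ab, d), where d = observed - expected.
    """
    n = len(haplotypes)
    p_a = sum(1 for a, _ in haplotypes if a) / n
    p_b = sum(1 for _, b in haplotypes if b) / n
    observed_ab = sum(1 for a, b in haplotypes if a and b) / n
    expected_ab = p_a * p_b  # prediction under random assortment
    return p_a, p_b, observed_ab, expected_ab, observed_ab - expected_ab


# Hypothetical sample of 16 haplotypes: A and B each occur at frequency 0.25,
# but 3 of the 16 haplotypes carry both elements.
sample = (
    [(True, True)] * 3
    + [(True, False)] * 1
    + [(False, True)] * 1
    + [(False, False)] * 11
)
p_a, p_b, observed_ab, expected_ab, d = ld_summary(sample)
print(p_a, p_b, observed_ab, expected_ab, d)  # 0.25 0.25 0.1875 0.0625 0.125
```

In this sample the two elements are found together three times as often as independence would predict, which is the pattern the text describes as LD.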

The invention is also directed to markers identified in a “haplotype block” or “LD block”. These blocks are defined either by their physical proximity to a genetic element, e.g., a complement pathway gene, or by their “genetic distance” from the element. Other blocks would be apparent to one of skill in the art as genetic regions in LD with the preferred complement pathway gene, e.g., CFI. Markers and haplotypes identified in these blocks, because of their association with AMD and the complement pathway, are encompassed by the invention. One of skill in the art will appreciate regions of chromosomes that recombine infrequently and regions of chromosomes that are “hotspots”, e.g., exhibiting frequent recombination events, are descriptive of LD blocks. Regions of infrequent recombination events bounded by hotspots will form a block that will be maintained during cell division. Thus, identification of a marker associated with a phenotype, wherein the marker is contained within an LD block, identifies the block as associated with the phenotype. Any marker identified within the block can therefore be used to indicate the phenotype.

Additional markers that are in LD with the markers of the invention or haplotypes are referred to herein as “surrogate” markers. Such a surrogate is a marker for another marker or another surrogate marker. Surrogate markers are themselves markers and are indicative of the presence of another marker, which is in turn indicative of either another marker or an associated phenotype.

The Complement Pathway

Complement pathway genes, among other candidates, were selected for evaluation. Tag SNPs were selected across complement factors 3 and 5, with particular inclusion of SNP rs2230199 in C3, which was reported to have p = 2.8×10^-5 in single-marker tests available in the NIH dbGaP database from a genome-wide association study of 400 AMD cases and 200 controls. Several allelic markers were found in close association with CFI (Example 3), notably including those described by SEQ ID NOS:8-10. Genotyping was performed using the Illumina GoldenGate assay and the Sequenom iPLEX system. The study population consisted of 2,172 unrelated Caucasian individuals 60 years of age or older, diagnosed based on ocular examination and fundus photography (1,238 cases of both dry and neovascular (wet) advanced AMD and 934 controls).

The role of epistasis between rs2230199 and five previously reported variants was also evaluated. Specifically, two variants at CFH (rs1061170, SEQ ID NO:2 and rs1410996, SEQ ID NO:6), two variants at the CFB/C2 locus (rs9332739, SEQ ID NO:4 and rs641153, SEQ ID NO:5), and one at the LOC387715/HTRA1 locus (rs10490924, SEQ ID NO:3) (see FIG. 3 for sequences) were established as unequivocally associated with AMD risk in this cohort. Using logistic regression, no statistically significant interaction terms were identified between any pair of these SNPs, the two Factor B rare protective SNPs as a category, or the three haplotypes formed by the two different CFH SNPs. While weak interactions cannot be excluded, this result suggests that despite targeting the same pathway, these variants largely confer risk in an independent, log-additive fashion.
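To make the phrase "independent, log-additive" concrete, the following Python sketch combines per-allele effects by summing their log odds ratios, with no interaction terms, which is equivalent to multiplying the per-allele odds ratios. The marker names and odds ratios are hypothetical placeholders for illustration only; they are not estimates from the study described herein.

```python
import math

# Hypothetical per-allele odds ratios (illustrative values, not study results).
PER_ALLELE_ODDS_RATIO = {"marker_a": 2.0, "marker_b": 0.5, "marker_c": 1.5}


def combined_odds_ratio(allele_counts):
    """Combine markers under a log-additive model without interaction terms.

    allele_counts maps a marker name to the number of copies (0, 1 or 2)
    of the scored allele carried by a diploid individual.
    """
    log_odds = 0.0
    for marker, copies in allele_counts.items():
        if copies not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {marker}: {copies}")
        log_odds += copies * math.log(PER_ALLELE_ODDS_RATIO[marker])
    return math.exp(log_odds)


# Two copies of a risk allele with per-allele odds ratio 2.0 give 2.0 * 2.0.
two_copies = combined_odds_ratio({"marker_a": 2})
# One risk allele (2.0) and one protective allele (0.5) offset each other.
offsetting = combined_odds_ratio({"marker_a": 1, "marker_b": 1})
```

Under this model each additional allele shifts the log odds by a fixed amount regardless of the genotype at the other markers, which is what the absence of significant interaction terms suggests.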

This associated Arg102Gly variant (SEQ ID NO:1) has previously been established as the molecular basis of the two common allotypes of C3: C3F (fast) and C3S (slow) (so named due to a difference in electrophoretic mobility). The C3F variant has been previously reported as associated with other immune-mediated conditions such as IgA nephropathy and glomerular nephritis. The variant has also been reported to influence the long-term success of renal transplants, where C3S homozygote recipients had much better graft survival and function when receiving a donor kidney with a C3F allotype than a matched homozygote C3S donor. More generally, deficiencies in both C3 and complement factor H (CFH) have been associated with the immune-mediated renal damage in membranoproliferative glomerulonephritis (MPGN), and the AMD-associated Y402H variant has also been shown to be significantly associated with MPGN, underscoring a deep connection in the etiology of these two disorders (Li, M. et al., Nat. Genet., 38:1049-1054, 2006).

Case-control association studies for AMD in several genomic regions continued, yielding a SNP just 3′ of Complement Factor I (CFI) on chromosome 4 with significant association (p < 10^-7). Sequencing was performed on coding exons in linkage disequilibrium with the detected association. No obvious functional variation was discovered that could be the proximate cause of the association, suggesting a non-coding regulatory mechanism. Subsequent studies with this marker, alone or in combination with other AMD-associated markers, established the efficacy of detecting specific CFI-associated allelic markers for diagnosing AMD or a susceptibility to AMD.

Diagnostic Gene Array

Polynucleotide arrays provide a high throughput technique that can assay a large number of polynucleotide sequences in a single sample. This technology can be used, for example, as a diagnostic tool to assess the risk potential of developing AMD, e.g., by detecting one or more AMD-associated allelic markers, e.g., markers associated with CFI. Polynucleotide arrays (for example, DNA or RNA arrays), are known in the art for use as diagnostic or screening tools. Such arrays include regions of usually different sequence polynucleotides arranged in a predetermined configuration on a substrate, at defined x and y coordinates. These regions (sometimes referenced as “features”) are positioned at respective locations (“addresses”) on a substrate. The arrays, when exposed to a sample, exhibit an observed binding pattern. This binding pattern can be detected upon interrogating the array. All polynucleotide targets (for example, DNA) in the sample can be labeled, for example, with a suitable label (e.g., a fluorescent compound) that allows for the detection of specific sample-array interactions. The observed binding pattern is indicative of the presence and/or concentration of one or more polynucleotide components of the sample.

Arrays can be fabricated by depositing previously obtained biopolymers onto a substrate, or by in situ synthesis methods. The substrate can be any supporting material to which polynucleotide probes can be attached, including but not limited to glass, nitrocellulose, silicon, and nylon. Polynucleotides can be bound to the substrate by either covalent bonds or by non-specific interactions, such as hydrophobic interactions. The in situ fabrication methods include those described in U.S. Pat. No. 5,449,754 for synthesizing peptide arrays, and in U.S. Pat. No. 6,180,351 and WO 98/41531 and the references cited therein for synthesizing polynucleotide arrays. Further details of fabricating biopolymer arrays are described in U.S. Pat. No. 6,242,266; U.S. Pat. No. 6,232,072; U.S. Pat. No. 6,180,351; U.S. Pat. No. 6,171,797; EP No. 0 799 897; PCT No. WO 97/29212; PCT No. WO 97/27317; EP No. 0 785 280; PCT No. WO 97/02357; U.S. Pat. Nos. 5,593,839; 5,578,832; EP No. 0 728 520; U.S. Pat. No. 5,599,695; EP No. 0 721 016; U.S. Pat. No. 5,556,752; PCT No. WO 95/22058; and U.S. Pat. No. 5,631,734. Other techniques for fabricating biopolymer arrays include known light-directed synthesis techniques. Commercially available polynucleotide arrays, such as Affymetrix GeneChip™, can also be used. Use of the GeneChip™ to detect gene expression has been described (Lockhart, D. et al., Nat. Biotech., 14:1675-1680, 1996; Chee, M. et al., Science, 274:610-614, 1996; Hacia, J. et al., Nat. Genet., 14:441-447, 1996; and Kozal, M. et al., Nat. Med., 2:753-759, 1996). Other types of arrays are known in the art, and are sufficient for developing an AMD diagnostic array of the present invention.

To create the arrays, single-stranded polynucleotide probes, for example, can be spotted onto a substrate in a two-dimensional matrix or array. Each single-stranded polynucleotide probe can comprise at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25 or 30 or more contiguous nucleotides selected from the nucleotide sequences shown in SEQ ID NO:1-10. In array fabrication, the probes formed at each feature are usually expensive. Additionally, sample quantities available for testing are usually also very small and it is therefore desirable to simultaneously test the same sample against a large number of different probes on an array. These conditions make it desirable to produce arrays with large numbers of very small (for example, in the range of tens or one or two hundred microns), closely spaced features (for example many thousands of features).
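By way of illustration only, the following Python sketch enumerates candidate probe sequences of a chosen length that are contiguous within a source sequence and that cover its polymorphic site, and derives the complementary strand for each. The source sequence, the site position and the function names are hypothetical; they are not taken from SEQ ID NOS:1-10, and practical probe design would also weigh factors such as melting temperature and cross-hybridization.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def probes_spanning_site(sequence, site_index, length):
    """Return every contiguous window of `length` nucleotides covering site_index."""
    if not 0 <= site_index < len(sequence):
        raise ValueError("site_index is outside the sequence")
    first_start = max(0, site_index - length + 1)
    last_start = min(site_index, len(sequence) - length)
    return [sequence[s:s + length] for s in range(first_start, last_start + 1)]


def reverse_complement(sequence):
    """Return the reverse complement, i.e. the strand that would hybridize."""
    return sequence.translate(_COMPLEMENT)[::-1]


# Hypothetical 11-nucleotide sequence with the polymorphic site at index 5.
source = "ACGTACGGTCA"
candidates = probes_spanning_site(source, site_index=5, length=6)
complements = [reverse_complement(p) for p in candidates]
print(candidates[0], complements[0])  # ACGTAC GTACGT
```

Each returned window is at least six contiguous nucleotides long and includes the polymorphic position, so a probe complementary to it would distinguish the alleles at that site.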

Samples can be assayed directly for the presence or absence of one or more AMD-associated markers. Samples can also be processed, for example, to isolate nucleic acids or to amplify specific nucleic acids. Tissue samples from a patient suspected of being at risk for developing AMD, for example, can be treated to isolate single-stranded polynucleotides, for example by heating or by chemical denaturation, as is known in the art. The single-stranded polynucleotides in a tissue sample can be labeled and hybridized to the polynucleotide probes on the array. Detectable labels that can be used include but are not limited to radiolabels, biotinylated labels, fluorophores, and chemiluminescent labels. Double-stranded polynucleotides, comprising the labeled sample polynucleotides bound to polynucleotide probes, can be detected once the unbound portion of the sample is washed away. Detection can be visual or with computer assistance. Preferably, after the array has been exposed to a sample, the array is read with a reading apparatus (such as an array “scanner”) that detects the signals (such as a fluorescence pattern) from the array features. Such a reader would have a very fine resolution (for example, in the range of five to twenty microns) for an array having closely spaced features.

The signal image resulting from reading the array can be digitally processed to evaluate which regions of read data belong to a given feature as well as to calculate the total signal strength associated with each of the features. The foregoing steps, separately or collectively, are referred to as “feature extraction” (U.S. Pat. No. 7,206,438, for example, describes an apparatus and method of enhancing feature extraction, e.g., processing one or more detected signal images each acquired from a field of view of an array reader). Using any of the feature extraction techniques so described, detection of hybridization of a patient derived polynucleotide sample with one of the AMD markers on the array given as SEQ ID NO:1-10 identifies that patient as having a genetic risk factor for AMD, as described above.

Also encompassed herein is a system for compiling and processing patient data, and presenting a risk profile for developing AMD. A computer aided medical data exchange system is preferred. The system can be designed to provide high-quality medical care to a patient by facilitating the management of data available to care providers. The care providers include, for example, physicians, surgeons, nurses, clinicians, various specialists, and so forth. It should be noted, however, that while general reference is made to a clinician in the present context, the care providers may also include clerical staff, insurance companies, teachers and students, and so forth. The system provides an interface, which allows the clinicians to exchange data with a data processing system. The data processing system is linked to an integrated knowledge base and a database. The system and the database draw upon data from a range of data resources. The database may be software-based, and includes data access tools for drawing information from the various resources as described below, or coordinating or translating the access of such information. In general, the database will unify raw data into a useable form.

The integrated knowledge base is intended to include one or more repositories of medical-related data. The data itself may relate to patient-specific characteristics as well as to non-patient specific information, as for classes of persons, machines, systems and so forth. Examples of patient-specific clinical data include patient medical histories, patient serum and cellular antioxidant levels, and the identification of past or current environmental, lifestyle and other factors that predispose a patient to develop AMD. These include but are not limited to various risk factors such as obesity, smoking, vitamin and dietary supplement intake, use of alcohol or drugs, poor diet and a sedentary lifestyle.

Use of the present system involves a clinician obtaining a patient sample and evaluating it for the presence of a genetic marker indicating a predisposition for AMD, such as SEQ ID NO:1-10. The clinician or their assistant also obtains appropriate clinical and non-clinical patient information, and inputs it into the system. The system then compiles and processes the data, and provides output information that includes the patient's risk profile for developing AMD. Particular illustrations of this process will depend on the specific information collected and the specific operations of the system, which are believed to be routine given the teachings provided herein.

Described herein are certain allelic markers, e.g., polynucleotide sequences, that have been correlated to AMD, compositions based on these markers, methods for using these markers, and kits and systems for practicing the methods using these markers. These markers are useful as diagnostics for identifying patients who have AMD, are at risk for developing AMD or who have a susceptibility to developing AMD. The markers described herein can be used alone or in conjunction with other diagnostic methods, e.g., methods using other markers or environmental risk factors.

EXEMPLIFICATION

Example 1

Several candidate genes have screened negatively for association with AMD (Haddad, S. et al., Surv. Ophthalmol., 50:306-363, 2006). The list includes TIMP3 (tissue inhibitor of metalloproteinases-3), IMPG2 (the gene encoding the retinal interphotoreceptor matrix (IPM) proteoglycan IPM 200), VMD2 (the bestrophin gene), ELOVL4 (elongation of very long chain fatty acids), RDS (peripherin), EFEMP1 (EGF-containing fibulin-like extracellular matrix), and BMD (bestrophin). One gene has been shown to have variations in the coding regions in patients with AMD, namely, GPR75 (a G protein coupled receptor gene). Others have shown a possible association with the disease in at least one study: PON1 (the paraoxonase gene); SOD2 (manganese superoxide dismutase); APOE (apolipoprotein E), for which the E4 allele has been found to be associated with the disease in some studies and not in others; and CST3 (cystatin C), for which one study has suggested an increased susceptibility for AMD in CST3 B/B homozygotes. There are conflicting reports regarding the role of the ABCR (ABCA4) gene with regard to AMD. Allikmets and colleagues first reported an association with the disease.

Genetic variants associated with AMD have been identified. There is also an association to a region containing several tightly linked genes on chromosome 10 (LOC387715, HTRA1) although the function of those genes and variants is not fully understood. Using the database described herein, a previously unrecognized common, non-coding variant in CFH and other complement factor genes was identified that substantially increases the influence of this locus on AMD, and strongly replicated the associations of four other published common alleles in three genes (p values ranging from 10^-12 to 10^-70), including the first confirmation of the BF/C2 locus.

Complement Pathway is involved in AMD: Genetic variants and environment play a role in AMD development and pathogenesis. Therefore, it is desirable to take both into account when determining an individual's risk. To date, the Y402H variant of complement factor H (CFH) is the most replicated and studied of several variants associated with AMD, conferring an estimated 7-fold increased risk in patients with the homozygous condition. The Y402H single nucleotide polymorphism (SNP) is within the CFH binding site for heparin and C-reactive protein (CRP). Altered binding to these sites can lead to loss of function; e.g., decreased ability to bind to targets and/or interact with CRP, thereby giving rise to excessive complement activation. Because the initiation of complement activation can occur on cell surfaces as well as in the fluid phase, the activation of complement is one of the earliest events that can be detected.

When classical pathway activation occurs through the binding and activation of C1 to antibodies, C4 is cleaved, producing C4a and C4b. C4a is released locally and is circulated. It can be detected by commercially available ELISA kits (e.g., Pharmingen OptEIA) in ng/mL quantities. A similar event occurs when the lectin pathway is activated through binding of mannose binding lectin (MBL) to a carbohydrate-covered bacterial surface and the mannan-binding lectin-associated serine protease (MASP) enzymes cleave C4. C4a thus serves as a marker for activation of both the classical and lectin pathways. Many charged surfaces on microbes or other particulates including aggregates of multiple classes of immunoglobulins have been shown to activate the alternative complement pathway. The first split product released in this pathway is Bb from the cleavage of factor B. Bb can be measured in plasma by a commercial ELISA kit (e.g., Quidel) in μg/mL quantities. As complement pathways can interact with one another, measuring components of each pathway may be important for diagnosis or prediction of complement-associated disease, e.g., AMD.

If activation by any of the pathways continues, C3 is the next major protein to produce measurable fragments. C3 is initially split into two pieces: C3a is a small fragment that has anaphylatoxin activity, interacting through a specific C3a receptor found on many cell types, and C3b is a large fragment that has the property of binding covalently to nearby surfaces or molecules through an active thioester bond. The latter is produced by a conformational change in the molecule when the C3 convertase cleaves it. This covalent attachment leads to permanent deposits of C3b (or its subsequent cleavage fragments) on surfaces in the vicinity of complement activation. These deposits and subsequent cleavage fragments interact with C3 receptors (CR1, CR2, CR3, CR4) that are found on many cell types. This leads to immune adherence and provides a transport mechanism for the clearance of immune complexes, bacteria, viruses or whatever the C3b has become attached to. C5a and C5b-9 (membrane attack complex (MAC)) are markers of the terminal activation pathway as well.

CFH dampens the alternative pathway by three actions: 1) it prevents binding of factor B to C3; 2) it binds to C3bBb (the alternative pathway C3 convertase), displacing the Bb enzymatic subunit; and 3) it provides cofactor activity for factor I (CFI), which can then cleave C3b, producing the inactive form, iC3b. Some iC3b is in the fluid-phase in concentrations normally below 30 μg/mL in plasma, with low variability. When elevated, it may provide an indirect indication that CFH is functioning to inactivate C3b. Inhibition of CFH with antibody reduces the cleavage of C3b to iC3b as measured by Western blot. To determine the function of CFH in inactivating C3b, it would be desirable to measure C3b and iC3b. C3b assays, however, show substantial variability. C3, which reflects certain disease states, is therefore measured, and the ratio of iC3b/C3 is analyzed as another possible indicator of AMD risk.

Factor B provides the enzymatic subunit, Bb, of the C3 convertase, contributing to the amplification loop of the alternative pathway and formation of C5 convertase. Whereas CFH dampens the alternative pathway, properdin stabilizes C3 and C5 convertases of the alternative pathway, thus serving to promote formation of the MAC instead of inactivation of C3b. Whereas variants of CFH increase the risk of AMD, variations in the genes encoding factor B were found to reduce the risk of AMD. Both factors B and C3 are important in the development of laser-induced choroidal neovascularization in mouse models.

In addition to genetic considerations, environmental factors play a role in AMD risk and may affect complement levels. Smoking is an independent risk factor for AMD and has been reported to activate complement and to increase factor B levels. Smokers have been reported to have reduced CFH levels. Plasma levels of CFH are reported to vary widely in the general population (110-615 μg/mL) and the measurement of CFH may not differentiate normal from variant CFH. To identify at-risk patients, therefore, other possible biomarkers associated with AMD are measured, namely biomarkers that may also be affected by environmental factors strongly associated with increased risk of AMD. Based on the pathways, it would be anticipated that iC3b (or iC3b/C3) would be most elevated in non-smokers with the CFH Y402H TT genotype and with low BMI (anticipated to have stage 1), and undetectable in CC smokers with high BMI and with advanced AMD. For CC smokers with stage 1, it would be expected that factor B levels would be lower than in those with advanced AMD (with the possible caveat of patients with protective variants of factor B). Bb, a fragment of factor B produced by activation of the alternative pathway, is a reliable marker of alternative pathway activation. Ratios of Bb to B are informative with respect to the activation rate and extent of the alternative pathway, and analysis of these factors in conjunction with C3 measures provides insight into the processes ongoing in the inflammatory lesions.

Genetic approach to AMD: AMD falls into the category of complex, late-onset diseases similar to type II diabetes, Alzheimer's disease, cardiovascular disease, hypertension, etc. where the genetic contributions do not necessarily manifest with straightforward Mendelian inheritance. Instead, it is presumed that these and other complex diseases are the result of complex interaction between environmental factors and susceptibility of multiple alleles of multiple genes and that these factors only cause disease when, in combination, a threshold of susceptibility is reached. Two major hypotheses are commonly explored to search for these genetic risk factors—the “common disease/common variant hypothesis” (e.g., as suggested by the association of the APOE4 allele with Alzheimer's disease) and the hypothesis that rarer, more penetrant variants at multiple genes may explain the genetic component of multifactorial disease. While there is not general agreement, and limited empirical data, to suggest which hypothesis will bear more fruit in any individual disease, it seems most likely that complex diseases with involvement of many genes may quite naturally have contributions from both common and rare variation.

To detect common, low-penetrance variation, the association study is the design of choice—as made evident by both theoretical considerations and a proven track-record of detecting common genetic variants for multifactorial disease. Common variation has been conclusively determined to play a substantial role in the heritability of AMD. Previous efforts, however, have focused almost exclusively on polymorphisms that are already known to result in changes in the coding and regulatory regions of genes. A limited knowledge of the genome, limited ability to recognize many forms of potentially functional variation from sequence context alone, and lack of true understanding of causal pathways, has therefore limited the ability to apply these techniques (which remain quite costly and unproven). These hurdles have been overcome and recent results indicate that successful identification and replication of low-penetrance alleles can be convincingly achieved.

Plasma biomarkers in the complement system are associated with AMD and AMD progression, and these associations differ according to genotype, controlling for environmental factors.

Baseline plasma levels of the complement factors are measured in patients who are genotyped and phenotyped for AMD to determine if these markers predict risk of AMD given environmental risk factors. The study population includes: 1) Discordant sibling pairs (from families and DZ twins) with one sibling grade 3b, 4, and 5 and one sibling with grade 1 (N=100 pairs, with 200 siblings), and 2) Progressors among the siblings with transition from grades 1-4 to grades 3b, 4, and 5 or grade 4 to 5 over time (total sample 620 of whom 214 have progressed). There will be additional progressors over time and the total sample expected for this aim is approximately 1000 subjects. All subjects have stored plasma samples that have never been thawed, and were collected in a manner that can be used for these lab analyses. Plasma data can be coupled with risk factor data as described above, including smoking, body mass index (BMI) and serum high-sensitivity CRP from a different aliquot of blood drawn on the same day as the proposed plasma complement assays (for the discordant pairs). Serum CRP and plasma complement factors (from aliquots drawn on the same day at baseline) can also be measured for subjects in the progression aspect of the study for the prospective analyses.

Complement assays: CFH, factor B, factor I, C3 and C5 levels are measured primarily with radial immunodiffusion, using polyclonal antisera specific for the components, according to the procedures followed by the Complement Laboratory at NJC. Split products C3a, iC3b, C5a and C4a, along with the terminal complement complex (SC5b-9), are measured by ELISA using kits produced by Pharmingen BD or Quidel. Ratios (iC3b:C3 and C3a:C3) can be calculated with these data. The normal ranges for these components are given in Table 1.

TABLE 1
Component   Normal Range (mean ± 2 standard deviations)
Factor H    160-412 μg/mL
Factor I    29-58 μg/mL
Factor B    127.6-278.5 μg/mL
C3          66-162 mg/dL
C5          55-113 μg/mL
C4          11-39 mg/dL
C3a         98-857 ng/mL
iC3b        0-30.9 μg/mL
Bb          0-0.83 μg/mL
SC5b-9      0-179 ng/mL
C4a         101-745 ng/mL

In the clinical laboratory, anything outside of three standard deviations is considered abnormal. Given that some of the patients may have low native components (C3, FB and C4), the ratios of the levels to the split products are predicted to be more useful than absolute amounts. Comparison of the results from the disease cohorts with the controls is extremely useful for further studies in terms of identifying the appropriate biomarkers for AMD patients. All complement split products are evaluated in specimens that have been collected in EDTA tubes, processed to obtain the EDTA-plasma rapidly after blood collection, and stored frozen in liquid nitrogen freezers. Each specimen is tested for all proteins on the first thaw, since repeated freeze-thaw cycles can produce false positive results.
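By way of illustration, the split-product ratios (iC3b:C3 and C3a:C3) can be computed once the measurements are brought to a common unit, since C3 is reported in mg/dL while iC3b and C3a are reported in μg/mL and ng/mL, respectively. The following is a minimal sketch; the function names are assumptions introduced here, and the input values are simply the upper limits of the normal ranges listed above, used as placeholders rather than patient data.

```python
# 1 mg/dL = 1000 ug per 100 mL = 10 ug/mL
UG_PER_ML_PER_MG_PER_DL = 10.0
# 1 ng/mL = 0.001 ug/mL
UG_PER_ML_PER_NG_PER_ML = 0.001


def ic3b_to_c3_ratio(ic3b_ug_per_ml: float, c3_mg_per_dl: float) -> float:
    """Dimensionless iC3b:C3 ratio after converting C3 to ug/mL."""
    c3_ug_per_ml = c3_mg_per_dl * UG_PER_ML_PER_MG_PER_DL
    return ic3b_ug_per_ml / c3_ug_per_ml


def c3a_to_c3_ratio(c3a_ng_per_ml: float, c3_mg_per_dl: float) -> float:
    """Dimensionless C3a:C3 ratio after converting both values to ug/mL."""
    c3a_ug_per_ml = c3a_ng_per_ml * UG_PER_ML_PER_NG_PER_ML
    c3_ug_per_ml = c3_mg_per_dl * UG_PER_ML_PER_MG_PER_DL
    return c3a_ug_per_ml / c3_ug_per_ml


# Placeholder inputs: upper limits of the listed normal ranges.
ic3b_ratio = ic3b_to_c3_ratio(30.9, 162.0)
c3a_ratio = c3a_to_c3_ratio(857.0, 162.0)
```

Working in a single unit keeps the ratios comparable across subjects whose native C3 levels differ.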

Methods—CFH, factor I, factor B, C5: Radial immunodiffusion is performed by preparing 1% agarose gels containing an appropriate amount of specific antibody for the component to be measured. Wells are cut in the gel and filled with a measured amount of each test serum or plasma, control serum or plasma, and a series of at least three standards with known concentration of the component measured. After incubation of the filled gels for 72 hours at 4° C., the diameter of the precipitin ring formed by combination of the antibody with its antigen (the component being tested) is measured and the area of the precipitin ring is calculated. Using the areas of the rings formed by the standards, the concentrations of the component present in the test samples are calculated by linear regression.
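The calibration step described above can be sketched as follows, by way of example only. The sketch assumes circular precipitin rings (area computed from the measured diameter) and an ordinary least-squares line fitted to concentration as a function of ring area; the standards shown are hypothetical values, and the function names are introduced here for illustration.

```python
import math


def ring_area(diameter_mm: float) -> float:
    """Area of a circular precipitin ring from its measured diameter."""
    return math.pi * (diameter_mm / 2.0) ** 2


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical standards: (known concentration in ug/mL, ring diameter in mm).
standards = [(100.0, 4.0), (200.0, 5.0), (400.0, 6.5)]
areas = [ring_area(d) for _, d in standards]
concentrations = [c for c, _ in standards]
slope, intercept = fit_line(areas, concentrations)


def concentration_from_diameter(diameter_mm: float) -> float:
    """Estimate the concentration of a test sample from its ring diameter."""
    return slope * ring_area(diameter_mm) + intercept


test_sample = concentration_from_diameter(5.5)
```

At least three standards are used, as stated above, so that the fitted line is not determined by only two points.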

C3a, C4a: ELISA method using OptEIA kits from Pharmingen-BD (San Diego).

iC3b, Bb, SC5b-9: these markers are measured using kits from Quidel (San Diego, Calif.). Three controls are run with each set of test samples, and the specimens are all tested in duplicate.

C-reactive protein (CRP) binds to CFH at the CCP7 where the Y402H CFH polymorphism exists. Serum CRP was found to be elevated in patients with AMD compared to controls. CRP may also increase the risk of AMD in patients carrying at least one allele of the CFH variant. While not being bound by a particular theory, it has been proposed that CFH binds CRP and counter-arrests alternative pathway activation induced by damaged tissue.

Analyses: For the case-control comparison, conditional logistic regression was used to determine the likelihood of having advanced AMD given levels of the various complement factors and CRP values within categories of genotype, while assessing and adjusting for pack year history of smoking, BMI and cardiovascular disease. Effect modification between complement factors versus CRP and complement factors versus genotype is also determined. Risk factor data is available within the existing database and analyzed. Additional analyses are performed to assess associations between genotype and complement factors using the general linear model. For progression, Cox regression analysis is applied to assess whether complement levels are associated with AMD progression, controlling for genotype, smoking, BMI, CRP, etc. Interactions and effect modification are assessed to determine if complement factors are more or less related to AMD within certain genotypes, or whether these associations vary depending on smoking status, level of BMI, etc. Power for the discordant pair analyses is adequate to detect an effect size (i.e., mean difference between groups/sd)=0.40 with 80% power based on a comparison of 100 cases and 100 controls. Power is even larger for the progression study where there are 214 progressors out of 620 subjects. Regarding multiple testing, the different complement factors tend to be highly correlated and a Bonferroni type correction would be inappropriate.
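The stated power figure can be checked approximately as follows. This is a rough sketch only: it uses a normal approximation for an unpaired two-group comparison of means and assumes a two-sided significance level of 0.05, neither of which is specified above, and it does not reproduce the conditional logistic regression actually used for the discordant pairs.

```python
import math
from statistics import NormalDist


def two_sample_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison of means.

    effect_size is the mean difference between groups divided by the
    standard deviation; equal group sizes are assumed.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    noncentrality = effect_size * math.sqrt(n_per_group / 2.0)
    # Probability of rejecting in either tail under the alternative.
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)


power = two_sample_power(effect_size=0.40, n_per_group=100)
```

Under these assumptions the result is close to 0.8, in line with the 80% power stated for an effect size of 0.40 with 100 cases and 100 controls.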

Example 2

Prediction Model for Advanced Atrophic and Neovascular Age-Related Macular Degeneration Based on Genetic, Demographic, and Environmental Variables

Context: Six single nucleotide polymorphisms (SNPs) in five genes are associated with age-related macular degeneration (AMD). Described herein are the joint effects of genetic and environmental variables leading to predictive models for potential screening for AMD.

Design, Setting, and Participants: Caucasian participants in the multi-center Age-Related Eye Disease Study with advanced AMD and visual loss (n=509 cases) or no AMD (n=222 controls) were evaluated. Advanced AMD was defined as geographic atrophy or neovascular disease. Risk factors including smoking and BMI were assessed, and DNA specimens were genotyped for the six variants in five genes: CFH, LOC387715/HTRA1, CFB, C2, and C3. Unconditional logistic regression analyses were performed. Receiver operating characteristic (ROC) curves were calculated.

Outcome Measures: Prevalence of advanced dry and neovascular AMD and predictive ability of risk scores based on sensitivity and specificity to discriminate between cases and controls.

Results: CFH Y402H, CFH rs1410996, LOC387715 A69S, C2 E318D, CFB R32Q, and C3 R102H polymorphisms are each independently related to advanced AMD, controlling for demographic factors, smoking, BMI, and vitamin/mineral treatment assignment. Multivariate odds ratios (ORs) were 3.5 (95% confidence interval (CI) 1.7-7.1) for CFH Y402H; 3.7 (95% CI 1.6-8.4) for CFH rs1410996; 25.4 (95% CI 8.6-75.1) for LOC387715 A69S; 0.3 (95% CI 0.1-0.7) for C2 E318D; 0.3 (95% CI 0.1-0.5) for CFB R32Q; and 3.6 (95% CI 1.4-9.4) for C3 R102H, comparing the homozygous risk/protective genotypes to the referent genotypes. Genetic plus environmental risk scores provided C statistics ranging from 0.803 to 0.859, which were replicated in an independent sample of 452 cases and 317 controls.

Six genetic variants, as well as smoking and BMI are independently related to advanced AMD causing visual loss, with excellent predictive power.

Methods: Phenotypic Data

The Age-Related Eye Disease Study (AREDS; Arch. Ophthalmol., 119:1417-1436, 2001; Arch. Ophthalmol., 125:1225-1232; Arch. Ophthalmol., 125:671-679, 2007; Arch. Ophthalmol., 112:533-539, 2005) included a randomized clinical trial to assess the effect of antioxidant and mineral supplements on risk of AMD and cataract, and a longitudinal study of AMD. Based on ocular examination and AREDS reading center photographic grading of fundus photographs, Caucasian participants in this study were divided into two main groups representing the most discordant phenotypes: no AMD with either no drusen or nonextensive small drusen (n=222), or advanced AMD with visual loss (n=509). Non-Caucasians were excluded since the distribution of advanced AMD in that population differs considerably compared with Caucasians. The advanced form of AMD, groups 3 and 4 in the original AREDS classification that include non-central and central atrophy, neovascular disease, as well as visual loss, was then reclassified into the two subtypes as either non-central or central geographic atrophy (n=136) or neovascular disease (n=373), independent of visual acuity level using the Clinical Age-Related Maculopathy Grading System, to determine whether results differed between these two (advanced dry and wet) phenotypes. Another comparison was made between unilateral or bilateral advanced AMD according to the AREDS system. Demographic and risk factor data, including education, smoking history, and BMI, were obtained at the baseline visit from questionnaires and height and weight measurements. Antioxidant status was defined as taking antioxidants (antioxidants alone or antioxidants and zinc) or no antioxidants (placebo or zinc alone) in the clinical trial.

Methods: Genotyping

DNA samples were obtained from the AREDS Genetic Repository. The following six SNPs were evaluated: 1) Complement Factor H (CFH) Y402H (rs1061170) in exon 9 of the CFH gene on chromosome 1q31, a change 1277T>C, resulting in a substitution of histidine for tyrosine at codon 402 of the CFH protein; 2) CFH rs1410996, an independently associated single nucleotide polymorphism (SNP) variant within intron 14 of CFH; 3) LOC387715 A69S (rs10490924 in the LOC387715/HTRA1 region of chromosome 10), a non-synonymous coding SNP variant in exon 1 of LOC387715, resulting in a substitution of the amino acid serine for alanine at codon 69; 4) Complement Factor 2 or C2 E318D (rs9332739), the non-synonymous coding SNP variant in exon 7 of C2 resulting in the amino acid glutamic acid changing to aspartic acid at codon 318; 5) Complement Factor B or CFB R32Q (rs641153), the non-synonymous coding SNP variant in exon 2 of CFB resulting in the amino acid arginine changing to glutamine at codon 32; and 6) Complement Factor 3 or C3 R102H (rs2230199), the non-synonymous coding SNP variant in exon 3 of C3 resulting in the amino acid glycine changing to arginine at codon 102. Genotyping was performed using primer mass extension and MALDI-TOF MS analysis by the MassEXTEND methodology of Sequenom (San Diego, Calif.).

Methods: Statistical Analysis

Individuals with advanced AMD, as well as the separate subtypes of dry, wet and bilateral advanced AMD, were compared to the control group of Caucasian persons with no AMD, with regard to genotype and risk factor data. Multivariate unconditional logistic regression analysis was performed to evaluate the relationships between AMD and all of the genotypes plus various risk factors, controlling for age (70 or older, younger than 70), gender, education (high school or less, more than high school), cigarette smoking (never, past, current), and BMI, which was calculated as the weight in kilograms divided by the square of the height in meters (<25, 25-29.9, and ≥30). The AREDS assignment in the randomized clinical trial was also added to the multivariate model (taking a supplement containing antioxidants or taking study supplements containing no antioxidants). Tests for multiplicative interactions between each of the genotypes versus smoking and BMI, respectively, were calculated using cross product terms according to genotype and the individual risk factors. In addition, similar analyses were performed to assess gene-gene interactions for each combination of genes. Odds ratios and 95% CIs were calculated for each risk factor and within the three genotype groups. Tests for trend for the number of risk alleles for each genetic variant (0, 1, 2) were calculated. Sensitivities and specificities for a variety of risk score cut-points were evaluated to assess the optimal use of the model for individual risk prediction, e.g., sensitivities and specificities of at least 80%. The method for calculation of the AMD risk score based on all genetic, demographic and behavioral factors is explained in Table 2. The areas under the receiver operating characteristic (ROC) curves were obtained separately for the age groups 50-69 and 70+ years.
In addition, an age-adjusted concordant or “C” statistic based on the ROC curves was calculated for different combinations of genes and environmental factors to assess the probability that the risk score based on the group of risk factors in that model from a random case was higher than the corresponding risk score from a random control within the same 10 year age group. To test the reproducibility of the risk prediction model, a separate replication sample consisting of 452 cases and 317 controls was obtained from the AMD study databases using the same grading system based on ocular photographs, and the C statistic was computed using the risk score derived from the original sample. ROC curves were obtained for the replication sample.
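The C statistic as defined above, i.e., the probability that the risk score of a random case exceeds that of a random control, can be sketched as follows. This minimal sketch counts ties as one half and omits the restriction to the same 10 year age group; the function name and the example scores are hypothetical and are not study data.

```python
def c_statistic(case_scores, control_scores):
    """Fraction of case-control pairs in which the case scores higher.

    Tied pairs contribute one half. The result equals the area under
    the ROC curve for the risk score.
    """
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical risk scores for illustration only.
example_cases = [3.5, 4.0, 2.5, 5.0]
example_controls = [1.0, 2.5, 3.0]
auc = c_statistic(example_cases, example_controls)
```

An age-adjusted version would restrict the pairwise comparison to cases and controls in the same age group and then combine the per-group results.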

Results

The mean ages (±SD) of cases and controls were 69.1 (±5.2) and 66.8 (±4.2) respectively. Females comprised 58% of cases and 54% of controls. Table 3 displays the relationship between genotype and covariate data among controls. There were no statistically significant associations between any of the genetic variants and the demographic, behavioral, or treatment variables. There was a non-significant trend toward an association between age and the C3 variant, with a somewhat higher proportion of the younger individuals with one or two risk alleles, or the GC or GG genotypes.

Relationships between pairs of genes were also evaluated. Table 4 displays multivariate adjusted associations between advanced AMD and demographic and behavioral factors controlling for all genetic variants, as well as associations between AMD and genetic factors adjusting for the environmental factors. There were positive associations between the two independent CFH variants and the combined advanced AMD group (Y402H: OR=3.5, 95% CI 1.7-7.1, p trend=0.0003; rs1410996: OR=3.7, p trend=0.0003). There were positive associations between AMD and the LOC387715 A69S variant (OR=25.4, p trend<0.0001) and C3 (OR=3.6, p trend=0.001). There were protective associations for the C2 (OR=0.3, p=0.003) and CFB (OR=0.3, p<0.0001) variants. There were positive independent associations with age (OR=2.8, p<0.0001), current smoking (OR=3.9, p=0.001), and past smoking (OR=1.9, p=0.004). There was a protective effect of higher education (OR=0.6, p=0.01). A borderline positive association with BMI was present (OR=1.5, p=0.11) and no significant association with gender or antioxidant treatment was seen. In general, similar associations between genes and AMD were seen for all subtypes of AMD, including unilateral and bilateral advanced AMD and dry and wet types of advanced AMD, although associations varied slightly for specific types of advanced AMD.

Interactions between each genotype versus smoking (ever/never) and BMI (25+/<25), were evaluated (Table 5). No significant interactions were found between any of the genotypes and smoking or BMI. There was a trend for a smaller effect of BMI on those with genotype CFH Y402H TT and an adverse effect of BMI for those with a risk allele (the CC and CT genotypes). Furthermore, within a given genotype, smoking and higher BMI increased risk of advanced AMD. For example, for the homozygous GG risk genotype for C3, the OR for advanced AMD was 3.3 (1.0-10.9) for “never smokers”, and increased to 9.8 (2.0-47.5) for individuals who had ever smoked, indicating that there are main effects of both smoking and C3 genotype but no interaction effect.
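For reference, the genotype coding that underlies the per-genotype comparisons and the tests for trend (0, 1, or 2 risk alleles) can be sketched as follows. The function name and the dictionary are introduced here for illustration; the risk alleles shown are those indicated in this example (C for CFH Y402H, G for C3).

```python
# Risk alleles as indicated in the text: the CC and CT genotypes carry a
# risk allele for CFH Y402H, and GG is the homozygous risk genotype for C3.
RISK_ALLELES = {"CFH Y402H": "C", "C3": "G"}


def risk_allele_count(genotype: str, risk_allele: str) -> int:
    """Number of copies (0, 1, or 2) of the risk allele in a two-letter genotype."""
    if len(genotype) != 2:
        raise ValueError("genotype must be two alleles, e.g. 'CT'")
    return genotype.upper().count(risk_allele.upper())


cfh_counts = [risk_allele_count(g, RISK_ALLELES["CFH Y402H"]) for g in ("TT", "CT", "CC")]
c3_count = risk_allele_count("GG", RISK_ALLELES["C3"])
```

The resulting count can be entered as an ordinal term in a regression model to test for trend across genotypes.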

Table 6 shows C statistics for models with different combinations of genetic, demographic, and environmental variables. The C statistic for model 1 based on the two previously reported genes, CFH Y402H and LOC387715 A69S (ref), and age, gender, education, and antioxidant treatment was 0.803±0.018. There was a significant improvement in the C statistic upon adding smoking and BMI as additional risk factors in model 2 with a C statistic of 0.822±0.017 (model 1 versus 2, p=0.027). For model 3, the model including all six variants was considered, together with age, gender, education and antioxidant treatment. The C statistic was 0.846±0.016, which was a significant improvement over the corresponding two gene model (model 1 versus 3, p<0.001). When smoking and BMI were added to the basic six genetic variant model 3, the C statistic increased to 0.859±0.015 (model 4), and this was a significant improvement compared with the corresponding two gene model (model 2 versus 4, p=0.001). There was a modest improvement as well with the addition of the environmental variables to the model with the six variants (model 3 versus 4, p=0.037). It should be noted that these C statistics are higher than the Framingham risk score prediction model results for coronary heart disease (CHD).

The AMD risk score was tested in a separate replication sample of 452 cases and 317 controls that were not used in constructing the algorithm. The mean ages (±SD) were 76±6.6 for cases and 72±4.4 for controls, of which 49% and 53% were male, respectively. The C statistic was 0.810±0.016, which indicates excellent discrimination between cases and controls. This C statistic was calculated with adjustment for age, gender, education, smoking and BMI. For this analysis, antioxidant status was assigned as “no” since participants were not taking AREDS supplements at the time of enrollment, and, in a separate analysis, no subjects were consuming high quantities of these antioxidants in their diets. The C statistics for both the original and replication samples are comparable to or exceed the C statistic for the Framingham risk score for prediction of CHD.

The sensitivity and specificity of model 4 were calculated using different cut points to denote potential screen positive criteria separately for each age group, as described in Table 2. The corresponding ROC curves are presented in FIG. 1. A cut point at which both the sensitivity and specificity would be at least 80% was identified for the older age group (a risk score ≧3 is screen positive, <3 is screen negative), which yielded a sensitivity of 83% and a specificity of 82%. Risk prediction for the younger age group was good; for a screen-positive cut point of 2.5, the sensitivity was 76% and the specificity was 78%. The risk prediction was better for the older age group (FIG. 1).
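The cut-point rule described above (score at or above the cut point is screen positive) can be sketched as follows. This is an illustrative helper under stated assumptions: the function name and the example scores are hypothetical, and only the cut point of 3 is taken from the text.

```python
def sensitivity_specificity(case_scores, control_scores, cut_point):
    """Return (sensitivity, specificity) for a screen-positive rule.

    A subject is screen positive when the risk score is >= cut_point.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    true_positives = sum(score >= cut_point for score in case_scores)
    true_negatives = sum(score < cut_point for score in control_scores)
    sensitivity = true_positives / len(case_scores)
    specificity = true_negatives / len(control_scores)
    return sensitivity, specificity


# Hypothetical scores for illustration only (not study data).
example_cases = [3.5, 2.9, 4.1, 3.0]
example_controls = [1.0, 3.2, 2.5, 0.4, 2.9]
print(sensitivity_specificity(example_cases, example_controls, cut_point=3))
```

Sweeping the cut point over the observed range of scores and plotting sensitivity against 1 − specificity traces out an ROC curve of the kind shown in FIG. 1.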

Risk score distributions within a given age group appeared to be substantially different with case scores tending to be higher than controls although there was some overlap. The risk scores for the replication sample according to age and case-control status are seen at the bottom of FIG. 2 and indicate good separation between cases and controls particularly for older individuals.

Described in this example is the independent association of six genetic variants with AMD, adjusting for all of these variants in addition to demographic and behavioral factors. Discrimination between cases and controls is excellent for the overall risk score in both the original and replication samples. The predictive power of this composite of risk factors for advanced AMD, with a C statistic of 0.86 and a replication C statistic of 0.81, is comparable to or better than that of the Framingham risk functions for CHD, in which the C statistics were 0.79 for white men and 0.83 for white women in the Framingham study cohort and somewhat lower in several replication samples. Genetic factors clearly play a major role in AMD as demonstrated by the large and consistent estimates of the effects of the genetic variants on various groups with advanced AMD, including unilateral and bilateral disease, as well as the subtypes of geographic atrophy (dry) and neovascular (wet) advanced AMD. On the other hand, modifiable factors also have an impact. Cigarette smoking increased risk for all genotypes. Risk of advanced AMD increased, for example, from over 3-fold for non-smokers to almost 10-fold for smokers among individuals with the same homozygous C3 risk genotype compared with non-smokers with the non-risk genotype. Higher BMI also contributed to the risk profile for all genotypes.

This study included the evaluation of predictive power based on a large, well characterized population of Caucasian patients with or without advanced AMD from various geographic regions around the U.S. Strengths of the study include the standardized collection of risk factor information, direct measurements of height and weight, and classification of maculopathy by standardized ophthalmologic examinations and grading of fundus photographs. Misclassification was unlikely since grades were assigned without knowledge of risk factors or genotype. Known AMD risk factors, including age, education, BMI, smoking, and treatment assignment, were controlled for in assessing the relationship between genetic variants and advanced AMD. Both the environmental and genetic risk factors were independently associated with AMD when considered simultaneously. Subjects likely represent the typical patient with advanced AMD, and the overall population is similar to others in this age range in terms of smoking and prevalence of obesity, as well as the distribution of the genotypes. This large and well-characterized population provided a unique opportunity to evaluate gene-environment associations and interactions. Furthermore, the biological effects of the genetic variants do not appear to differ in major ways among various Caucasian populations with AMD.

It is unlikely that many individuals without AMD in this elderly age group would progress to advanced disease during the remainder of their lifetime. Thus the potential for misclassification of controls who might ultimately become cases is likely to be small.

These analyses and results indicate the potential for individual risk prediction for AMD. In calculating the risk score, for example, one could estimate “points” from the regression coefficients (Table 2) for smoking (1.3), higher BMI (0.4), and the various genetic variants (ranging from −1.3 to +3.2) to obtain an overall risk for an individual to develop advanced AMD. This could be refined as new genetic and other risk predictors are established. Advantages of knowing such a risk score include the possibility for more targeted education and counseling about known modifiable factors. Screening would identify high risk people who would be encouraged to follow a healthy lifestyle by not smoking, eating vegetables and fish, maintaining a normal weight and getting exercise, and taking AREDS type antioxidant and mineral supplements for those with signs of AMD. All of these factors influence the inflammatory and immune pathways that are involved in the pathogenesis of AMD. Targeting high risk individuals could also lead to heightened awareness and more frequent surveillance and clinical examinations, as well as identification of high risk individuals for inclusion in clinical trials of new therapies.

TABLE 2
Calculation of AMD Risk Score.
The risk score was calculated from the following formula:
S = \sum_{i=1}^{18} \beta_i X_i, where \beta_i and X_i are given as follows:
Variable Name (Xi) | Regression Coeff (βi) | Code | Control Xi | Control βiXi | Case Xi | Case βiXi
Age 70+ | 1.0130 | | 0 | 0 | 0 | 0
Gender | −0.1053 | 1 = m / 0 = f | 1 | −0.11 | 1 | −0.11
Education | −0.5845 | 1 = some college / 0 = high school or less | 1 | −0.58 | 1 | −0.58
Antioxidant Use | 0.2404 | 1 = yes / 0 = no | 0 | 0 | 1 | 0.24
BMI 25-29 | 0.0871 | 1 = yes / 0 = no | 1 | 0.09 | 1 | 0.09
BMI 30+ | 0.4370 | 1 = yes / 0 = no | 0 | 0 | 0 | 0
Current Smoking | 1.3555 | 1 = yes / 0 = no | 0 | 0 | 0 | 0
Past Smoking | 0.6247 | 1 = yes / 0 = no | 1 | 0.62 | 1 | 0.62
CFH:rs1061170 (Y402H) CT | 0.6002 | 1 = yes / 0 = no | 0 | 0 | 1 | 0.60
CFH:rs1061170 (Y402H) CC | 1.2582 | 1 = yes / 0 = no | 0 | 0 | 0 | 0
LOC387715:rs10490924 (A69S) GT | 1.1238 | 1 = yes / 0 = no | 0 | 0 | 0 | 0
LOC387715:rs10490924 (A69S) TT | 3.2343 | 1 = yes / 0 = no | 0 | 0 | 1 | 3.23
C3:rs2230199 (R102H) CG | 0.4879 | 1 = yes / 0 = no | 0 | 0 | 0 | 0
C3:rs2230199 (R102H) GG | 1.2898 | 1 = yes / 0 = no | 0 | 0 | 0 | 0
CFB:rs641153 (R32Q) CT or TT | −1.3453 | 1 = yes / 0 = no | 1 | −1.35 | 0 | 0
C2:rs9332739 (E318D) CT or CC | −1.1830 | 1 = yes / 0 = no | 0 | 0 | 0 | 0
CFH:rs1410996 CT | 0.4989 | 1 = yes / 0 = no | 0 | 0 | 0 | 0
CFH:rs1410996 CC | 1.3004 | 1 = yes / 0 = no | 0 | 0 | 1 | 1.30
Risk Score | | | | −1.32 | | 5.4
*There was a constant of -1.9401 obtained from fitting the logistic model. This constant was not used in calculating risk score, so as to make most of the risk scores positive and easier to understand.
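The Table 2 calculation can be reproduced with a short script. The sketch below uses the regression coefficients listed in Table 2 and, as in the table, omits the logistic constant; the dictionary keys and the function name are hypothetical labels chosen for illustration, and each subject is represented by the set of indicator variables that equal 1.

```python
# Regression coefficients from Table 2 (the keys are illustrative labels).
COEFFICIENTS = {
    "age_70_plus": 1.0130,
    "male": -0.1053,
    "some_college": -0.5845,
    "antioxidant_use": 0.2404,
    "bmi_25_29": 0.0871,
    "bmi_30_plus": 0.4370,
    "current_smoking": 1.3555,
    "past_smoking": 0.6247,
    "cfh_y402h_ct": 0.6002,
    "cfh_y402h_cc": 1.2582,
    "loc387715_a69s_gt": 1.1238,
    "loc387715_a69s_tt": 3.2343,
    "c3_r102h_cg": 0.4879,
    "c3_r102h_gg": 1.2898,
    "cfb_r32q_ct_or_tt": -1.3453,
    "c2_e318d_ct_or_cc": -1.1830,
    "cfh_rs1410996_ct": 0.4989,
    "cfh_rs1410996_cc": 1.3004,
}


def risk_score(present):
    """Sum beta_i * X_i, where X_i = 1 for each name in `present`, else 0."""
    unknown = set(present) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown variables: {sorted(unknown)}")
    return sum(COEFFICIENTS[name] for name in set(present))


# The example control and case from Table 2.
control = {"male", "some_college", "bmi_25_29", "past_smoking",
           "cfb_r32q_ct_or_tt"}
case = {"male", "some_college", "antioxidant_use", "bmi_25_29",
        "past_smoking", "cfh_y402h_ct", "loc387715_a69s_tt",
        "cfh_rs1410996_cc"}

print(round(risk_score(control), 2))  # Table 2 reports -1.32
print(round(risk_score(case), 1))     # Table 2 reports 5.4
```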

TABLE 3
Genotype-Phenotype Associations Among Controls
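Cells give N (%).

Genotype: CFH: rs1061170 (Y402H) TT, CT, CC; LOC387715: rs10490924 (A69S) GG, GT, TT
Variable | CFH TT | CFH CT | CFH CC | LOC387715 GG | LOC387715 GT | LOC387715 TT
Baseline Age
≦70 | 64 (70.3) | 69 (67.6) | 23 (79.3) | 104 (68.9) | 48 (71.6) | 4 (100)
70+ | 27 (29.7) | 33 (32.4) | 6 (20.7) | 47 (31.1) | 19 (28.4) | 0
p (trend) | 0.58 | | | 0.34 | |
Gender
Male | 42 (46.2) | 42 (41.2) | 17 (58.6) | 66 (43.7) | 35 (52.2) | 0
Female | 49 (53.8) | 60 (58.8) | 12 (41.4) | 85 (56.3) | 32 (47.8) | 4 (100)
p (trend) | 0.53 | | | 0.82 | |
Education
High School or Less | 24 (26.4) | 30 (29.4) | 10 (34.5) | 44 (29.1) | 19 (28.4) | 1 (25.0)
College or More | 67 (73.6) | 72 (70.6) | 19 (65.5) | 107 (70.9) | 48 (71.6) | 3 (75.0)
p (trend) | 0.40 | | | 0.86 | |
Smoking Baseline
Never | 43 (47.3) | 60 (58.8) | 11 (37.9) | 81 (53.6) | 30 (44.8) | 3 (75.0)
Past | 42 (46.2) | 38 (37.3) | 17 (58.6) | 62 (41.1) | 34 (50.7) | 1 (25.0)
Current | 6 (6.6) | 4 (3.9) | 1 (3.4) | 8 (5.3) | 3 (4.5) | 0
p (trend) | 0.77 | | | 0.69 | |
BMI Baseline
<25 | 28 (30.8) | 37 (36.3) | 9 (31) | 51 (33.8) | 22 (32.8) | 1 (33.3)
25-29 | 34 (37.4) | 47 (46.1) | 15 (51.7) | 65 (43.0) | 30 (44.8) | 1 (33.3)
≧30 | 29 (31.9) | 18 (17.6) | 5 (17.2) | 35 (23.2) | 15 (22.4) | 1 (33.3)
p (trend) | 0.14 | | | 0.67 | |
Antioxidants
Yes | 37 (40.7) | 50 (49.0) | 11 (37.9) | 68 (45.0) | 28 (41.8) | 2 (50.0)
No | 54 (59.3) | 52 (51.0) | 18 (62.1) | 83 (55.0) | 39 (58.2) | 2 (50.0)
p (trend) | 0.79 | | | 0.77 | |

Genotype: CFH: rs1410996 TT, CT, CC; C2: rs9332739 (E318D) TT, CT or CC
Variable | CFH TT | CFH CT | CFH CC | C2 TT | C2 CT or CC
Baseline Age
≦70 | 30 (75.0) | 78 (66.7) | 48 (73.8) | 140 (71.1) | 16 (64.0)
70+ | 10 (25.0) | 39 (33.3) | 17 (26.2) | 57 (28.9) | 9 (36.0)
p (trend) | 0.93 | | | 0.47 |
Gender
Male | 17 (42.5) | 52 (44.4) | 32 (49.2) | 93 (47.2) | 8 (32.0)
Female | 23 (57.5) | 65 (55.6) | 33 (50.8) | 104 (52.8) | 17 (68.0)
p (trend) | 0.47 | | | 0.15 |
Education
High School or Less | 11 (27.5) | 28 (23.9) | 25 (38.5) | 58 (29.4) | 6 (24.0)
College or More | 29 (72.5) | 89 (76.1) | 40 (61.5) | 139 (70.6) | 19 (76.0)
p (trend) | 0.14 | | | 0.57 |
Smoking Baseline
Never | 18 (45.0) | 66 (56.4) | 30 (46.2) | 101 (51.3) | 13 (52.0)
Past | 19 (47.5) | 46 (39.3) | 32 (49.2) | 85 (43.1) | 12 (48.0)
Current | 3 (7.5) | 5 (4.3) | 3 (4.6) | 11 (5.6) | 0
p (trend) | 0.95 | | | 0.61 |
BMI Baseline
<25 | 14 (35.0) | 41 (35.0) | 19 (29.2) | 70 (35.5) | 4 (16.0)
25-29 | 15 (37.5) | 51 (43.6) | 30 (46.2) | 82 (41.6) | 14 (56.0)
≧30 | 11 (27.5) | 25 (21.4) | 16 (24.6) | 45 (22.8) | 7 (28.0)
p (trend) | 0.74 | | | 0.12 |
Antioxidants
Yes | 17 (42.5) | 56 (47.9) | 25 (38.5) | 84 (42.6) | 14 (56.0)
No | 23 (57.5) | 61 (52.1) | 40 (61.5) | 113 (57.4) | 11 (44.0)
p (trend) | 0.55 | | | 0.21 |

Genotype: CFB: rs641153 (R32Q) CC, CT or TT; C3: rs2230199 (R102H) CC, CG, GG
Variable | CFB CC | CFB CT or TT | C3 CC | C3 CG | C3 GG
Baseline Age
≦70 | 119 (70.4) | 37 (69.8) | 92 (65.7) | 58 (78.4) | 6 (75.0)
70+ | 50 (29.6) | 16 (30.2) | 48 (34.3) | 16 (21.6) | 2 (25.0)
p (trend) | 0.93 | | 0.08 | |
Gender
Male | 76 (45.0) | 25 (47.2) | 66 (47.1) | 34 (45.9) | 1 (12.5)
Female | 93 (55.0) | 28 (52.8) | 74 (52.9) | 40 (54.1) | 7 (87.5)
p (trend) | 0.78 | | 0.23 | |
Education
High School or Less | 47 (27.8) | 17 (32.1) | 46 (32.9) | 16 (21.6) | 2 (25.0)
College or More | 122 (72.2) | 36 (67.9) | 94 (67.1) | 58 (78.4) | 6 (75.0)
p (trend) | 0.55 | | 0.12 | |
Smoking Baseline
Never | 91 (53.8) | 23 (43.4) | 75 (53.6) | 34 (45.9) | 5 (62.5)
Past | 70 (41.4) | 27 (50.9) | 58 (41.4) | 37 (50.0) | 2 (25.0)
Current | 8 (4.7) | 3 (5.7) | 7 (5.0) | 3 (4.1) | 1 (12.5)
p (trend) | 0.22 | | 0.58 | |
BMI Baseline
<25 | 57 (33.7) | 17 (32.1) | 49 (35.0) | 21 (28.4) | 4 (50.0)
25-29 | 72 (42.6) | 24 (45.3) | 59 (42.1) | 36 (48.6) | 1 (12.5)
≧30 | 40 (23.7) | 12 (22.6) | 32 (22.9) | 17 (23.0) | 3 (37.5)
p (trend) | 0.96 | | 0.64 | |
Antioxidants
Yes | 78 (46.2) | 20 (37.7) | 59 (42.1) | 37 (50.0) | 2 (25.0)
No | 91 (53.8) | 33 (62.3) | 81 (57.9) | 37 (50.0) | 6 (75.0)
p (trend) | 0.28 | | 0.76 | |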

TABLE 4
Association Between Advanced AMD and Demographic, Behavioral and Genetic Risk Factors.
Variable | All Advanced AMD OR (95% CI)* | p-value | Unilateral advanced AMD† OR (95% CI) | p-value | Bilateral advanced AMD† OR (95% CI) | p-value | Geographic atrophy‡ OR (95% CI) | p-value
No. Cases/Controls | 509/222 | | 202/222 | | 307/222 | | 136/222 |
Age (yr)
<70 | 1.0 | | 1.0 | | 1.0 | | 1.0 |
≧70 | 2.8 (1.8-4.2) | <0.0001 | 2.3 (1.4-3.8) | 0.001 | 3.7 (2.2-6.2) | <0.0001 | 2.6 (1.5-4.6) | 0.001
Gender
Female | 1.0 | | 1.0 | | 1.0 | | 1.0 |
Male | 0.9 (0.6-1.4) | 0.62 | 1.0 (0.6-1.5) | 0.85 | 0.9 (0.5-1.4) | 0.55 | 1.0 (0.6-1.8) | 0.89
Education
≦High School | 1.0 | | 1.0 | | 1.0 | | 1.0 |
>High School | 0.6 (0.4-0.9) | 0.01 | 0.5 (0.3-0.9) | 0.01 | 0.6 (0.4-1.0) | 0.07 | 0.7 (0.4-1.2) | 0.18
Smoking
Never | 1.0 | | 1.0 | | 1.0 | | 1.0 |
Past | 1.9 (1.2-2.9) | 0.004 | 2.2 (1.3-3.6) | 0.002 | 1.6 (0.9-2.6) | 0.09 | 1.8 (1.0-3.1) | 0.06
Current | 3.9 (1.7-8.9) | 0.001 | 3.7 (1.5-9.6) | 0.01 | 4.0 (1.5-10.7) | 0.01 | 2.7 (0.8-8.9) | 0.11
BMI
<25 | 1.0 | | 1.0 | | 1.0 | | 1.0 |
25-29 | 1.1 (0.7-1.8) | 0.72 | 1.2 (0.7-2.1) | 0.53 | 1.0 (0.6-1.8) | 0.99 | 1.0 (0.5-1.9) | 0.97
30+ | 1.5 (0.9-2.6) | 0.11 | 1.7 (0.9-3.2) | 0.09 | 1.5 (0.8-2.9) | 0.25 | 2.7 (0.8-8.9) | 0.44
Antioxidant
No | 1.0 | | 1.0 | | 1.0 | | 1.0 |
Yes | 1.3 (0.8-1.9) | 0.25 | 1.3 (0.8-2.1) | 0.29 | 1.2 (0.7-2.0) | 0.42 | 1.1 (0.6-1.9) | 0.77

TABLE 5
Interaction Effects of BMI, Smoking, and Genotype on Risk of Advanced AMD.
Variable | BMI <25, OR (95% CI)* | BMI 25+, OR (95% CI)* | P (interaction) | P trend | Smoking Never | Smoking Ever
CFH: rs1061170 (Y402H)
TT | 1.0 | 0.6 (0.3-1.4) | | | 1.0 | 1.6 (0.8-3.4)
CT | 0.9 (0.4-2.0) | 1.6 (0.8-3.3) | 0.035 (CT vs TT) | | 1.3 (0.6-2.7) | 3.6 (1.8-7.4)
CC | 1.8 (0.6-5.2) | 2.8 (1.1-6.9) | 0.14 (CC vs TT) | | 3.5 (1.3-9.1) | 5.1 (2.1-12.3)
 | | | | 0.090 | |
LOC387715: rs10490924 (A69S)
GG | 1.0 | 1.3 (0.7-2.3) | | | 1.0 | 2.5 (1.4-4.3)
GT | 3.3 (1.6-6.9) | 3.9 (2.1-7.2) | 0.81 (GT vs GG) | | 4.2 (2.2-7.8) | 6.0 (3.4-10.8)
TT | 25.9 (3.2-211.1) | 32.1 (8.7-118.3) | 0.96 (TT vs GG) | | 17.4 (4.7-63.5) | 120.4 (15.1-957.2)
 | | | | 0.90 | |
CFH: rs1410996
TT | 1.0 | 2.0 (0.5-8.0) | | | 1.0 | 2.1 (0.6-7.9)
CT | 2.4 (0.7-8.4) | 2.8 (0.8-9.6) | 0.46 (CT vs TT) | | 1.4 (0.4-4.5) | 4.0 (1.3-12.7)
CC | 5.3 (1.4-20.2) | 6.4 (1.8-22.7) | 0.50 (CC vs TT) | | 4.6 (1.4-15.2) | 6.5 (2.0-21.6)
 | | | | 0.65 | |
C2: rs9332739 (E318D)
TT | 1.0 | 1.3 (0.8-2.0) | | | 1.0 | 1.9 (1.3-3.0)
CT or CC | 0.6 (0.1-3.9) | 0.3 (0.1-0.6) | 0.44 (CT-CC vs TT) | | 0.2 (0.05-0.7) | 0.8 (0.3-2.2)
CFB: rs641153 (R32Q)
CC | 1.0 | 1.3 (0.8-2.0) | | | 1.0 | 2.1 (1.3-3.2)
CT or TT | 0.3 (0.1-0.7) | 0.3 (0.1-0.6) | 0.9 (CT-TT vs CC) | | 0.3 (0.1-0.6) | 0.5 (0.2-1.0)
C3: rs2230199 (R102H)
CC | 1.0 | 1.5 (0.9-2.7) | | | 1.0 | 2.2 (1.3-3.8)
CG | 2.4 (1.2-5.1) | 2.1 (1.1-3.9) | 0.21 (CG vs CC) | | 1.9 (1.0-3.6) | 3.3 (1.8-5.9)
GG | 2.5 (0.5-11.1) | 7.2 (1.9-27.2) | 0.51 (GG vs CC) | | 3.3 (1.0-10.9) | 9.8 (2.0-47.5)
 | | | | 0.62 | |
*OR = odds ratio, CI = confidence interval.
ORs adjusted for age (<70, ≧70), gender, education (≦high school, >high school), smoking (never, past, current), BMI (<25, 25-29, 30+), antioxidant treatment (yes, no), and all genetic variants and associated genotypes.

TABLE 6
C Statistics for Advanced AMD Based on Models with Different Combinations
of Genetic and Environmental Variables.
Model | Sample | Genetic Variables | Demographic, Environmental Variables | C Statistic (+/− SE)*
1 | original | CFH Y402H, LOC387715 A69S | Age, gender, education, antioxidant treatment | 0.803 +/− 0.018
2 | original | CFH Y402H, LOC387715 A69S | Age, gender, education, antioxidant treatment, smoking, BMI | 0.822 +/− 0.017
3 | original | CFH Y402H, LOC387715 A69S, CFH rs1410996, C2 E318D, CFB R32Q, C3 R102H | Age, gender, education, antioxidant treatment | 0.846 +/− 0.016
4 | original | CFH Y402H, LOC387715 A69S, CFH rs1410996, C2 E318D, CFB R32Q, C3 R102H | Age, gender, education, antioxidant treatment, smoking, BMI | 0.859 +/− 0.015
4a | replication | CFH Y402H, LOC387715 A69S, CFH rs1410996, C2 E318D, CFB R32Q, C3 R102H | Age, gender, education, antioxidant treatment, smoking, BMI | 0.810 +/− 0.016
*p value (model 1 vs 2, p = 0.027; 1 vs 3 p < 0.001; 2 vs 4, p = 0.001, 3 vs 4, p = 0.037)

Example 3

Variation Near Complement Factor I is Associated with Risk of Advanced AMD

Case-control association studies for AMD in several genomic regions continued, yielding a SNP just 3′ of Complement Factor I (CFI) on chromosome 4 with significant association (p<10⁻⁷). Sequencing was performed on coding exons in linkage disequilibrium with the detected association. No obvious functional variation was discovered that could be the proximate cause of the association, suggesting a non-coding regulatory mechanism.

The association of age-related macular degeneration (AMD) with variants on chromosome 1 (CFH), chromosome 6 (CFB; C2), chromosome 10 (LOC387715/ARMS2), and chromosome 19 (C3) has identified the primary role of the complement pathway in disease pathogenesis (Fisher, S. et al., Hum. Mol. Genet., 14:2257-2264, 2005; Klein, R. et al., Science, 308:385-389, 2005; Haines, J. et al., Science, 308:419-421, 2005; Edwards et al., Science, 421-424, 2005; Hageman, G. et al., Proc. Natl. Acad. Sci. USA, 102:7227-7232, 2005; Rivera, A. et al., Hum. Mol. Genet., 14:3227-3236, 2005; Gold, B. et al., Nat. Genet., 38:458-462, 2006; Maller, J. et al., Nat. Genet., 38:1055-1059, 2006; Maller, J. et al., Nat. Genet., 39:1200-1201, 2007). Following up whole-genome linkage regions with fine-mapping has met with limited success in other complex diseases; however, the effect sizes of the identified risk variants for AMD are dramatically larger than most late-onset disease associations (Science, 316:1331-1336, 2007; Easton, D. et al., Nature, 447:1087-1093, 2007; Samani, N. et al., N. Engl. J. Med., 357:443-453, 2007). In light of these successes, 1,500 SNPs were selected using two different criteria: targeting genes in regions under suggestive linkage peaks from a recent meta-analysis, and targeting genes from the complement pathway that lie outside these regions. Genotyping was performed on 2,053 unrelated individuals using the Illumina GoldenGate assay and Sequenom MassARRAY iPLEX assay as previously described.

Sample: the study population consisted of 2,053 unrelated Caucasian individuals 60 years of age or older diagnosed based on ocular examination and fundus photography (1,228 cases of both dry and neovascular (wet) advanced AMD and 825 controls; Seddon, J. et al., Ophthalmol., 113:260-266, 2006). Informed consent was obtained in writing from all participants, and procedures were approved by the appropriate institutional review boards. This is largely the same sample set, with the same phenotyping criteria, described above. Importantly, this sample has been previously confirmed to show no inflation of case-control association statistics due to population substructure.

SNP Selection: a total of 1,536 SNPs across regions of chromosomes 1, 2, 3, 4, 6 and 16 were genotyped based upon the Fisher et al. bin rank of a meta-analysis of previous whole-genome linkage studies. SNPs were chosen in and around regions of transcription as described by Wiltshire, S. et al. (Eur. J. Hum. Genet., 14:1209-1214, 2006); however, the efficiency of this strategy was improved by selecting only SNPs that tag seven or more other SNPs (super informative SNPs). This SNP selection routine was conducted using Tagger and HapMap data from the CEPH population (Phase II). SNPs with a minor allele frequency >10% were selected using a minimum r² of 0.8. Within this set of SNPs were nine SNPs in and around the CFI region. Another 20 SNPs were chosen to adequately tag the entire region using the same tagging parameters as above. The 29 total SNPs genotyped had very good information coverage (a mean r²=0.966) for the 173 kb region of interest, covering 114 of the 116 HapMap SNPs in the region with a minor allele frequency (MAF) above 5%.

A total of eight SNPs were genotyped across C3 and seven SNPs were genotyped across C5. SNPs were picked using Tagger and HapMap data from the CEPH population (Phase II). SNPs were selected with a minor allele frequency >5% and a minimum r² of 0.8 to other selected SNPs. Thus, the SNPs that were selected should have been broadly representative of regional genetic variation because they were either direct proxies of other SNPs in those areas, or combined to form specific multimarker haplotypes that were correlated with other untyped SNPs in the region.

Genotyping: the 1,536 SNPs were genotyped at the Center for Inherited Disease Research (CIDR) using an Illumina OPA. The follow-up genotyping and sequencing was performed using the Sequenom MassARRAY system for iPLEX assays.

Sequencing: a novel SNP was discovered by sequencing 85 subjects from a subset of our case-control cohort. This SNP (see SEQ ID NO:10) is an A/G SNP, with A being the major allele, and G having an MAF of 7.65%. The SNP of SEQ ID NO:10 is on chromosome 4 just 5′ (113 bp) of CFI's exon 12 according to dbSNP, and is located at coordinate 110 883 313 base pairs on chromosome 4 according to NCBI build 36.1. This SNP has an r² of 0.057, 0.006 and 0.003 with rs13117504, rs10033900, and rs11726949, respectively. Since this novel SNP is not highly correlated with the two most associated SNPs, and its MAF is not close to theirs, it is fairly certain that this new CFI-related SNP is not the causal SNP driving the association in this region.

Analysis: all linkage disequilibrium calculations (e.g., D′ and r²) were performed with Haploview (Barrett, J. et al., Bioinformatics, 21:263-265, 2005). Single-locus and two-marker haplotype association analysis was conducted using logistic regression tests implemented in PLINK (Purcell, S. et al., Am. J. Hum. Genet., 81:559-575, 2007).
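As an illustration of the two linkage disequilibrium measures named above, the sketch below computes |D′| and r² for a pair of biallelic SNPs from the four two-marker haplotype frequencies. It is not Haploview's implementation: it assumes the haplotype frequencies are already known, whereas in practice they are estimated from unphased genotypes. The function name is hypothetical, and the example input uses the unaffected haplotype frequencies for rs13117504|rs10033900 reported in Table 8.

```python
def ld_from_haplotypes(f11, f12, f21, f22):
    """Return (|D'|, r^2) from two-SNP haplotype frequencies.

    f11: allele 1 at both SNPs; f12: allele 1 at SNP 1, allele 2 at SNP 2;
    f21: allele 2 at SNP 1, allele 1 at SNP 2; f22: allele 2 at both SNPs.
    """
    total = f11 + f12 + f21 + f22
    f11, f12, f21, f22 = (f / total for f in (f11, f12, f21, f22))
    p = f11 + f12  # frequency of allele 1 at the first SNP
    q = f11 + f21  # frequency of allele 1 at the second SNP
    d = f11 - p * q  # raw disequilibrium coefficient D
    denominator = p * (1 - p) * q * (1 - q)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic")
    r_squared = d * d / denominator
    if d >= 0:
        d_max = min(p * (1 - q), (1 - p) * q)
    else:
        d_max = min(p * q, (1 - p) * (1 - q))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d_prime, r_squared


# Table 8, unaffected subjects, haplotypes written as rs13117504|rs10033900:
# CT = 40.92%, CC = 13.25%, GT = 4.35%, GC = 41.48%.
# Allele 1 is taken to be C at rs13117504 and T at rs10033900.
d_prime, r_squared = ld_from_haplotypes(0.4092, 0.1325, 0.0435, 0.4148)
print(d_prime, r_squared)
```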

To calculate the percent variance accounted for by any risk alleles, it was assumed that the prevalence of late-stage AMD in this older age group is 5% and that liability is normally distributed in the population, with a mean of 0 and a variance of 1.

Subjects: the methods employed in this study conformed to the tenets of the Declaration of Helsinki and received approval from the appropriate institutional review boards. Informed consent was signed by all participants. For these analyses, unrelated Caucasian individuals with extremely discordant phenotypes were included. Cases were defined as described above.

Statistical Analysis: association testing as well as the other statistical analyses were performed using Haploview (found at the world wide web site, broad.mit.edu/mpg/haploview) and PLINK (found at the web site, pngu.mgh.harvard.edu/~purcell/plink/).

The most significantly associated SNP in the experiment, rs10033900 (p=4.86×10⁻⁶), resides in the chromosome 4 linkage peak and was significant even after a Bonferroni correction for 1409 tests (FIG. 4 shows the full results of the screen). Several nearby SNPs were also associated with p<0.0005, suggesting this association was not due to a sporadic genotyping artifact. Remarkably, this SNP happened to be adjacent to the 3′ UTR of Complement Factor I (CFI).
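The Bonferroni comparison can be checked with simple arithmetic. The sketch below assumes a family-wise significance level of 0.05, which the text does not state, and treats 1409 as the number of tests; the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_value, n_tests, alpha=0.05):
    """True when p_value is below the Bonferroni-corrected threshold.

    alpha=0.05 is an assumed family-wise level, not a value from the text.
    """
    return p_value < bonferroni_threshold(alpha, n_tests)


# p-value for rs10033900 from the text, corrected for 1409 tests.
print(bonferroni_threshold(0.05, 1409))
print(passes_bonferroni(4.86e-6, 1409))
```

Under these assumptions the corrected threshold is roughly 3.5×10⁻⁵, so the reported p-value of 4.86×10⁻⁶ remains significant.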

Given this compelling result, a much higher density of SNPs in this region was genotyped in an expanded panel of samples. The original SNP (rs10033900) remained the most highly associated SNP with a p-value=6.46×10⁻⁸ (OR=0.7056, indicating a protective effect for the C allele) (Table 7).
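An odds ratio below 1 for the C allele means the allele is less common among affected than unaffected subjects. As a rough check, an allelic odds ratio can be approximated from the two allele frequencies alone. This is a sketch and not the PLINK calculation; the function name is hypothetical, and because the Table 7 frequencies are rounded, the result only approximates the reported value.

```python
def allele_odds_ratio(freq_affected, freq_unaffected):
    """Odds of carrying the allele in affected relative to unaffected."""
    for freq in (freq_affected, freq_unaffected):
        if not 0 < freq < 1:
            raise ValueError("frequencies must lie strictly between 0 and 1")
    odds_affected = freq_affected / (1 - freq_affected)
    odds_unaffected = freq_unaffected / (1 - freq_unaffected)
    return odds_affected / odds_unaffected


# rs10033900 C allele (Table 7): 46.0% in affected, 54.7% in unaffected.
print(allele_odds_ratio(0.460, 0.547))  # close to the reported 0.7056
```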

Twenty-nine SNPs across this region were tested for association, conditioning on the most associated SNP, rs10033900. No significant independent associations were identified, although modest residual association was observed at two neighboring, highly correlated SNPs (Table 7). This result suggests that rs10033900 may not be the causal variant but may be highly correlated with it. Therefore, multi-marker haplotype tests were performed in an attempt to refine and isolate the association signal. Two-marker haplotypes were tested using the two SNPs closest to rs10033900, one 5′ (rs13117504) and one 3′ (rs11726949). The two-marker haplotype between rs13117504 and rs10033900 shows a somewhat stronger association to AMD than either SNP alone, with a p-value=1.18×10⁻⁸ (Table 8). None of these three SNPs appears to be functional, although rs11726949 is in intron 11 of CFI. When a search was performed for differences in association between the neovascular (“wet”) and geographic atrophy (“dry”) forms of advanced AMD, only a 0.2% difference in minor allele frequency (46%) was found between the two groups.

To determine whether an obvious functional variant exists that explains this association, all exons within the span of LD defined by HapMap were sequenced (all markers correlated with the associated SNPs at r²>0.35 reside in this haplotype block). This block of LD spans the exonic regions of the 3′ end of CFI and all four exons of phospholipase A(2) Group 12A (PLA2G12A). No SNPs were found in either gene transcript that could statistically explain the association observed at rs10033900. A novel SNP just 5′ of exon 12 in CFI was found, but this SNP does not appear to be in high r² with the associated CFI SNP or haplotype and is therefore not the biological source of association.

The role of epistasis between rs10033900 and rs13117504 and the six variants previously established to be associated with AMD was examined. Using logistic regression, no statistically significant interaction terms between any pair of these SNPs were found. While weak interactions cannot be excluded, this result suggests that despite targeting the same pathway, these variants largely confer risk in an independent, log-additive fashion.

Given the independent action of this new variant, we were able to add it to the multi-locus risk model. Since the individual and combined effects of the AMD-associated variants are additive, the overall proportion of population variance in risk (assuming a prevalence of late-stage AMD in this age group to be 5%) explained by this locus is estimated to be roughly 1% (assuming an underlying normal distribution N(0,1) of liability across the population).

The CFI gene spans 63 kb and contains 13 exons, the first 8 of which encode the heavy chain and the last 5 the light chain, which contains the serine protease domain (Vyse, T. et al., Genomics, 24:90-98, 1994). This serine protease domain is responsible for cleaving and inactivating C4b and C3b (Catterall, C. et al., Biochem. J., 242:849-856, 1987). C3b inactivation by CFI is regulated by Complement Factor H (CFH). CFH acts as a cofactor for CFI-mediated cleavage of C3b and also has decay accelerating activity against the alternative pathway C3 convertase, C3bBb. Membrane cofactor protein (MCP) also acts as a cofactor for CFI-mediated cleavage of C3b, thereby down-regulating the complement cascade (Jha, P. et al., Mol. Immunol., 44:3997-4003, 2004).

Specific alleles associated with AMD or a susceptibility to AMD are as follows:

rs13117504:
(SEQ ID NO: 8)
TGAGATGACCTGACTCCAAGCTTCTCCTAGTTTAGAGGTCTGTCTCAGCG
CTCCTAATTCCAGACTACAGAAGCCAATCTAACTGGTTTAATGAAAAAA
TAGATTTATTCAAAGATATTATGCAGTTCACAGAATTTCCAAAAGAGCCA
GAGAATTGGAGTCTACACATCTGGAAATAACGCTCACAGACCACACTGC
AG[C/G]ACTGCTCCATCAGAGACTACCATGGCCACTAGCATGGGTCCCA
GTCTCCACGTACAGCTTGTACTGCAGAGCCTGGGCACTGGACATTGCTGC
TTCTGTGAGCTCGCCTGAAGGTGGCCAGGGACACAACTCATTGGATGTGG
AGCTCTGCCATCGTCTTATATTTAACCCTGGTTTCAGAGCTCTCTGTCTT
ACAAGAG
rs10033900:
(SEQ ID NO: 9)
AAAAGTACTCCAGTGCTACAAGGTGGGAAACCCAGCATATAGGGCATCC
TCAGCCAATCTAGGAAGGGGGCCCCCAAAAGGGCCAAGCCAGTCTGCAC
AGTGACCTAGCATCTGGCAGCTTCACAGAAAAGTAGACCAGGGTCAAGC
CTTGAAGGGTGAAAGTCAGCCCTCTGAAGCGACTCTATGTGACAGAGAC
CAGG[A/G]ACAGCAGGAGTGAGATGACCTGACTCCAAGCTTCTCCTAGT
TTAGAGGTCTGTCTCAGCGCTCCTAATTCCAGACTACAGAAGCCAATCTA
ACTGGTTTAATGAAAAAATAGATTTATTCAAAGATATTATGCAGTTCACA
GAATTTCCAAAAGAGCCAGAGAATTGGAGTCTACACATCTGGAAATAACG
CTCACAGAC
CFI SNP:
(SEQ ID NO: 10)
TACATCTTGACATCTTGGATAAACCACTTGGCACTTACCTGCACATTCCA
TTTCTTTTTCATAGAAACGATTTCCGTAAAACTTAGAGCAGTTGCTTATT
AGTTTAACTTCACCCCACTGAAGTGAAAAGACTCTTTCGTTATCTAAACA
AAGTGAGAAAGCAAACATTTAGAAGTCACAAATGAGAAATCTAAATAC
ATACTCTCATAACTTAAACCATTGGGATTATGAAAGGGTGTATAGTTTTC
AATATAT[A/G]TAAATTTTTGAGAACTCTTCCTTAGCGTGTTTAATCAT
ATCATATGTCTATTTCTAAAATTGGATGGATAGTCAAGGGGGACTATTGA
ACATATGGTCTGAAGTCACATTCATTTAATAAAGAGCTGGTACCTTAAGT
TGAATGAGATAAATCTTTCCCTTTTGGTTGCAGAACTCAGTCAGGCAATT
GGATAGGAATCAGTGATAAGGTCTTTTCACATAAATACCTCCTATCTTCT
CACAGTCCCTTATTTCC

TABLE 7
29 SNPs Tested Across PLA2G12A and CFI
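CHR | SNP | Position (hg18) | Allele 1 | MAF Affected | MAF Unaffected | Allele 2 | CHISQ | P-Value | Odds Ratio | Conditional rs10033900
4 | rs13101299 | 110787671 | A | 10.0% | 8.9% | G | 1.40 | 0.2371 | 1.138 | 0.1229
4 | rs9990765 | 110788160 | C | 30.1% | 36.5% | T | 18.15 | 2.040E−05 | 0.75 | 0.1135
4 | rs4698774 | 110791792 | G | 27.5% | 27.6% | C | 0.00 | 0.9534 | 0.9958 | 0.7822
4 | rs6830606 | 110793498 | C | 9.7% | 8.5% | T | 1.74 | 0.1874 | 1.158 | 0.8431
4 | rs7690921 | 110798195 | T | 35.4% | 29.5% | A | 15.59 | 7.870E−05 | 1.312 | 0.2237
4 | rs17440280 | 110801549 | C | 37.5% | 36.8% | T | 0.21 | 0.6477 | 1.031 | 0.6611
4 | rs4698779 | 110816366 | G | 12.9% | 13.9% | C | 0.89 | 0.3460 | 0.9159 | 0.2519
4 | rs1800627 | 110833107 | T | 45.8% | 51.4% | C | 12.53 | 4.013E−04 | 0.7981 | 0.7469
4 | rs768063 | 110839905 | A | 3.8% | 3.3% | G | 0.89 | 0.3450 | 1.18 | 0.3177
4 | rs5030535 | 110841576 | G | 34.3% | 40.3% | A | 14.82 | 1.182E−04 | 0.7759 | 0.6100
4 | rs2285714 | 110858259 | T | 46.2% | 38.8% | C | 21.35 | 3.835E−06 | 1.35 | 0.4456
4 | rs2107047 | 110867224 | A | 7.4% | 8.2% | G | 0.82 | 0.3639 | 0.8982 | 0.6670
4 | rs6854876 | 110872889 | C | 40.1% | 47.3% | G | 18.86 | 1.410E−05 | 0.7466 | 0.1183
4 | rs2346841 | 110874089 | A | 6.0% | 6.4% | G | 0.26 | 0.6068 | 0.9312 | 0.4367
4 | rs13117504 | 110878305 | G | 37.7% | 45.8% | C | 26.93 | 2.110E−07 | 0.7154 | 0.02806
4 | rs10033900 | 110878516 | C | 46.0% | 54.7% | T | 29.22 | 6.460E−08 | 0.7056 | NA
4 | rs11726949 | 110884079 | T | 7.9% | 5.0% | C | 11.96 | 5.442E−04 | 1.627 | 0.007167
4 | rs6848178 | 110885620 | T | 42.8% | 46.0% | A | 3.71 | 0.0542 | 0.8787 | 0.4991
4 | rs6822976 | 110885741 | A | 48.4% | 49.0% | G | 0.15 | 0.6975 | 0.9755 | 0.2998
4 | rs9998151 | 110886794 | C | 5.2% | 3.7% | T | 4.32 | 0.0377 | 1.41 | 0.2001
4 | rs4698784 | 110888710 | A | 32.6% | 31.3% | T | 0.72 | 0.3969 | 1.061 | 0.4749
4 | rs11098043 | 110891734 | G | 23.8% | 24.9% | A | 0.55 | 0.4569 | 0.9461 | 0.8048
4 | rs13129180 | 110905862 | T | 26.3% | 27.2% | A | 0.43 | 0.5127 | 0.9519 | 0.8159
4 | rs1000954 | 110929483 | A | 28.9% | 30.6% | G | 1.36 | 0.2432 | 0.9184 | 0.6391
4 | rs4698788 | 110938983 | C | 2.8% | 2.7% | T | 0.04 | 0.8468 | 1.039 | 0.6855
4 | rs7675460 | 110940037 | A | 39.1% | 37.6% | C | 0.84 | 0.3588 | 1.065 | 0.2338
4 | rs4698792 | 110948753 | T | 29.2% | 30.3% | C | 0.61 | 0.4337 | 0.9468 | 0.9426
4 | rs1002989 | 110953291 | C | 8.7% | 5.9% | T | 11.19 | 8.245E−04 | 1.527 | 0.02009
4 | rs4422417 | 110961059 | G | 14.1% | 11.9% | A | 3.95 | 0.0468 | 1.213 | 0.0729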

TABLE 8
Two-marker Haplotype Association Results for most significant SNPs in CFI region
SNPs | Haplotype | Frequency of Affected | Frequency of Unaffected | DF | CHISQ | P-Value | OR
rs13117504|rs10033900 | OMNIBUS | NA | NA | 3 | 34.61 | 1.48E−07 | NA
rs13117504|rs10033900 | CT | 50.07% | 40.92% | 1 | 32.52 | 1.18E−08 | 1.448
rs13117504|rs10033900 | GC | 33.64% | 41.48% | 1 | 25.51 | 4.39E−07 | 0.715
rs13117504|rs10033900 | CC | 12.39% | 13.25% | 1 | 0.6447 | 0.4220 | 0.926
rs13117504|rs10033900 | GT | 3.90% | 4.35% | 1 | 0.5068 | 0.4765 | 0.891

Other Embodiments

Other embodiments will be evident to those of skill in the art. It should be understood that the foregoing detailed description is provided for clarity only and is merely exemplary. The spirit and scope of the present invention are not limited to the above examples, but are encompassed by the following claims. All references cited herein and throughout this specification are hereby incorporated herein by reference in their entirety.