By Sarah Hansen
One of the challenges of genetically complex conditions like autism is that no single gene explains more than a tiny fraction of cases. Instead, the growing consensus is that numerous genes are responsible, and produce their effect by impacting the same biological “pathways.” A 2011 study authored by John P. Hussman and colleagues at the Hussman Institute for Human Genomics identified and prioritized 860 genes that may contribute to autism.
The study implemented a novel statistical method, GWAS-NR (see box), that Hussman developed and validated. GWAS-NR detects associations between elements in extremely large datasets, and Hussman’s team applied the method to genetic data in autism. By checking against the known functions of many of the 860 genes, the team found that a significant number of them cooperate in a single, coherent pathway. That pathway regulates the growth of “neurites,” the extensions from neurons that connect with other neurons to form synapses (the space between neurons, across which brain messages travel) and circuits in the brain. A smaller, but still significant, number of genes are involved with the function of synapses, further suggesting that differences in circuit formation may be central to autism.
Hussman and colleagues then used the results of GWAS-NR to further analyze 837 of the genes detected by the 2011 study. For the new study, the authors sequenced the coding regions of the associated genes (the DNA that tells the cell what protein to build) in 2071 cases of autism and 904 controls. Sequencing produced approximately 30 million “reads,” each 100 base pairs long, for each individual, more than nine billion total base pairs. Eighty-seven percent of target DNA regions were read at least 10 times. This repeated coverage prevents rare sequencing errors from affecting the overall results. While still a massive amount of sequencing, it was much less than sequencing the coding regions of all genes (the “exome”), and much more efficient, because the GWAS-NR results had already given reason to believe these genes are relevant to autism.
Geneticists identify “variants” by comparing the genome of each individual to a single “reference” genome. As a result, variants describe how individuals differ, but don’t necessarily cause damage to the way a given protein functions. The sequencing results showed that unaffected individuals and those with autism did not differ significantly in their total number of genetic variants, so a large number of variants alone is unlikely to cause autism. The team did find that 144 individuals with autism and 26 without had at least one gene with more than one rare, damaging mutation. These double-mutations spanned 47 different genes. Three of them had already been linked to autism, and the fact that so many more individuals with autism had double-mutations suggests that multiple mutations are a risk factor for autism.
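To make the double-mutation comparison concrete, the counts quoted above (144 of 2071 cases, 26 of 904 controls) can be turned into carrier rates and an odds ratio. The sketch below also runs a one-sided Fisher's exact test; that choice of test is an assumption made here for illustration, since this article does not say which test the authors used, and the function names are hypothetical.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def carrier_comparison(case_carriers, n_cases, control_carriers, n_controls):
    """Compare how often carriers appear among cases versus controls."""
    case_rate = case_carriers / n_cases
    control_rate = control_carriers / n_controls

    # Odds of being a carrier in cases divided by the odds in controls.
    odds_ratio = (case_carriers * (n_controls - control_carriers)) / (
        control_carriers * (n_cases - case_carriers)
    )

    # One-sided Fisher's exact test (illustrative choice, not the paper's
    # stated method): probability of seeing at least this many carriers
    # among the cases if carriers were spread at random.
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    log_denominator = log_comb(total, n_cases)
    p_value = sum(
        exp(
            log_comb(carriers, k)
            + log_comb(total - carriers, n_cases - k)
            - log_denominator
        )
        for k in range(case_carriers, min(carriers, n_cases) + 1)
    )
    return case_rate, control_rate, odds_ratio, p_value


# Counts taken from the article text.
result = carrier_comparison(144, 2071, 26, 904)
case_rate, control_rate, odds_ratio, p_value = result
print(f"carrier rate in cases:    {case_rate:.1%}")
print(f"carrier rate in controls: {control_rate:.1%}")
print(f"odds ratio:               {odds_ratio:.2f}")
print(f"one-sided p-value:        {p_value:.2e}")
```

Under these counts, roughly 7 percent of cases and 3 percent of controls carry a double-mutation, which is the imbalance the paragraph above describes.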
The study also found 464 gene variants that cause loss of function (LOF) in the gene’s protein product. Autism cases did not have more LOF variants than controls overall, but when the researchers focused on genes that had already been linked to autism, there were significantly more autism cases with LOF variants (27 cases) than controls (four).
The study also looked at the inheritance pattern of genetic variants that were previously associated with autism or play a role in neuron function. The most exciting finding from that line of research was a mutation in the RBFOX1 gene that was not inherited from either parent, but rather a new (“de novo”) mutation in one individual. RBFOX1 is associated with autism and plays roles in neuron function and cytoskeleton organization, which is important for neuron development. This was the first time researchers have found a non-inherited mutation in RBFOX1.
When the team analyzed the functions of the genes with two mutations in autism cases, it found important themes: neuron function/development and synapse development. Two genes (GRM3 and GRIK4) are involved in the glutamatergic signaling pathway and are believed to contribute to autism, bipolar disorder, and schizophrenia. Another (IL1RAPL2) is important for forming and stabilizing a particular type of brain synapse. Yet another gene (SEMA6A), when mutated in mice, alters brain cell organization and the development of the long, message-passing projections that extend from neurons. PQBP1 is important for the growth of neuron projections, too.
The team found 199 genes with LOF mutations that appeared only in autism cases, and these followed the same themes. Seventeen of these genes were previously associated with autism. FAT1 regulates the ability of a neuron to stretch out from a sphere into its stereotypical branching shape. SYNE1, GRIK2, and NRXN1 are already associated with autism and other neurodevelopmental conditions.
These findings all suggest that genetic changes affecting neuron outgrowth and guidance, and synapse formation and function may be critical risk factors for autism. Also, the fact that no single gene accounted for a large proportion of the autism cases highlights the need to look for the combined effects of multiple, potentially rare, mutations. Further efforts are ongoing both at the Hussman Institute for Human Genomics and the Hussman Institute for Autism to study the biological pathways implicated by these large-scale genetic studies, and to understand their neurological effects, with the goal of improving the lives of those with autism.
Citation:
Griswold AJ, Dueker ND, Van Booven D, Rantus JA, Jaworski JM, Slifer SH, Schmidt MA, Hulme W, Konidari I, Whitehead PL, Cuccaro ML, Martin ER, Haines JL, Gilbert JR, Hussman JP, Pericak-Vance MA. 2015. Targeted massively parallel sequencing of autism spectrum disorder-associated genes in a case control cohort reveals rare loss-of-function risk variants. Molecular Autism. 6:43. DOI: 10.1186/s13229-015-0034-z
Hussman JP, Chung R, Griswold AJ, Jaworski JM, Salyakina D, Ma D, Konidari I, Whitehead PL, Vance JM, Martin ER, Cuccaro ML, Gilbert JR, Haines JL, Pericak-Vance MA. 2011. A noise-reduction GWAS analysis implicates altered regulation of neurite outgrowth and guidance in autism. Molecular Autism. 2:1. http://www.molecularautism.com/content/2/1/1
What is GWAS-NR?
A genome-wide association study with noise reduction, or GWAS-NR, is a modification of the established GWAS genetic screening technique. GWAS is usually used to learn about complex conditions influenced by many genes. It examines hundreds of thousands of individual locations, called single-nucleotide polymorphisms or “SNPs,” in DNA samples from individuals with and without a trait (such as autism, diabetes, or obesity). Each SNP typically comes in one of two possible versions. GWAS looks for versions of SNPs that are more common in individuals with the trait than in those without. Those variants are said to be “associated” with the trait. GWAS-NR takes the analysis one step further to improve accuracy.
Hussman describes it this way: “Imagine taking a million coins, each of them an inch apart, and then flipping all those coins a thousand times. A GWAS analysis would look at all of those coins to see if any of the coins were biased to heads or tails, but the standard analysis looks at each coin separately. In the real world, those coins are locations on a chromosome and are connected by genetic material, so they aren’t independent. GWAS-NR reduces statistical noise by amplifying signals that appear very close to the same location and are replicated in multiple subsets of data.”
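The standard analysis in the coin analogy, which tests each SNP on its own, can be sketched as a chi-square test on a 2x2 table of allele counts. This is a minimal illustration of one common single-SNP test, not the pipeline used in either study; the SNP names and counts are invented for the example.

```python
from math import erfc, sqrt


def allele_chi_square(case_a, case_b, control_a, control_b):
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases and controls; columns are the two versions of the SNP.
    """
    table = [[case_a, case_b], [control_a, control_b]]
    row_totals = [case_a + case_b, control_a + control_b]
    col_totals = [case_a + control_a, case_b + control_b]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if the SNP version were unrelated to the trait.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: (case A, case B, control A, control B).
snps = {
    "snp_balanced": (500, 500, 500, 500),
    "snp_enriched": (600, 400, 500, 500),
}

# Each SNP is tested separately, like inspecting one coin at a time.
results = {name: allele_chi_square(*counts) for name, counts in snps.items()}
for name, (chi2, p_value) in results.items():
    print(f"{name}: chi2={chi2:.2f}, p={p_value:.2e}")
```

With hundreds of thousands of SNPs tested this way, some will look biased by chance alone, which is the statistical noise the quote refers to.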
GWAS-NR takes account of information from nearby “flanking” SNPs, and also compares the significance of a genetic variant across several separate datasets and a combined dataset consisting of all the data taken together. GWAS-NR prioritizes SNPs that show the greatest association with the trait of interest (e.g. autism) across all datasets. Because it emphasizes signals that are replicated both within a dataset and across multiple datasets, the method reduces the likelihood of “false positives,” so researchers can be more confident and more cost-efficient in follow-up studies. In simulations, GWAS-NR did a better job of identifying relevant genes than two standard methods (joint analysis and Fisher’s method).
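For readers curious about the comparison methods, Fisher’s method is a textbook way to combine p-values for the same SNP from several independent datasets. The sketch below shows that comparator only; it is not GWAS-NR, whose details are in the 2011 paper, and the p-values are hypothetical.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Combine independent p-values using Fisher's method.

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom when all k null hypotheses are true.
    """
    k = len(p_values)
    statistic = -2.0 * sum(log(p) for p in p_values)

    # For an even number of degrees of freedom (2k), the chi-square
    # upper-tail probability has a closed form:
    #   exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half / i
        series += term
    return statistic, exp(-half) * series


# Hypothetical p-values for one SNP in three separate datasets.
replicated = [0.01, 0.02, 0.03]
statistic, combined = fisher_combined_p(replicated)
print(f"statistic={statistic:.2f}, combined p={combined:.2e}")
```

A signal that is modest in each dataset but replicated in all of them yields a combined p-value smaller than any single one, which is the same intuition behind favoring replicated signals.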