The unsolvable problem with the GWAS/heritability/gene-centric paradigm

J.P. Smith
Jun 13, 2019

Ever wonder why the results of genome-wide association studies (GWAS), other gene association studies, twin studies, genome-wide complex trait analysis (GCTA), and other studies that try to measure how “genetically inherited” a complex trait is are so inconsistent? Maybe you’ve heard that they don’t really tell us about actual biological processes, which is true, but that doesn’t explain why they fail to do so. The answer is more fundamental than you might think, and the only way to deal with it is to reject the gene-centered paradigm that all “heritability” studies are based on. What follows is a comment Evan Charney posted on an article he wrote in 2013. It’s such a thorough and devastating refutation that I couldn’t help but share it, so I have copied and pasted it below the break.

For all of its presumed statistical sophistication, GCTA is based, FOUNDATIONALLY, upon a late 19th and early 20th century paradigm of “genes,” that in light of advances in molecular genetics over the past quarter century, has no scientific validity. Discussing how to reconcile the findings of twin studies with the findings of GCTA is like debating how to fit yet another planetary epicycle into the Copernican view of the solar system.

To defend this assertion adequately would require much more space than I have here (for a more extended discussion, I suggest my own “Behavior genetics and postgenomics” [5]). All that I will do here is list some fundamental scientific developments that do not, that CANNOT, coexist with the crude conception of genes and the genotype-phenotype relation that underlies the entire concept of heritability as expressed in twin studies, GCTA, behavior genetics, and all allied attempts to partition and quantify the effects of “genes” v. “environment” on phenotypic variation.
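For readers who have not met it, the partition being criticized here is usually written as a decomposition of phenotypic variance. The sketch below is the standard textbook form, not something stated in the comment itself. The symbols V_P (phenotypic variance), V_G (genetic variance), V_E (environmental variance), and H^2 (broad-sense heritability) are conventional notation introduced here for illustration, and the additive form assumes no gene-environment covariance or interaction.

```latex
V_P = V_G + V_E, \qquad H^2 = \frac{V_G}{V_P}
```

The developments listed below are aimed at the assumption that V_G and V_E can be cleanly separated in the first place.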

GCTA, like GWAS, uses blood samples or cheek swabs to identify and then scan “the human genome.” The problem with this is that persons do not have “a genome.” And I am not even referring to mitochondrial DNA, which is inherited in a non-Mendelian manner and is simply ignored because it cannot fit into the simplistic, reductionist model upon which all heritability studies are based. What I am referring to is the fact that persons do not possess a single NUCLEAR genome. There is now overwhelming scientific evidence that the normal human condition is one of SOMATIC MOSAICISM, different DNA sequences (or “genomes”) in different cells and tissues of the body, a phenomenon that appears to be particularly prevalent in the human brain [6–18].

Conservative estimates place the overall percentage of aneuploid neural cells in the normal adult brain at an astonishing 10% (aneuploidy is a form of somatic mosaicism characterized by variable numbers of whole chromosomes, greater or less than 2 in a cell), involving monosomy, trisomy, polyploidy (greater than four chromosomes), and uniparental disomy (two copies of a chromosome from one parent) [19, 20]. Given an estimated 100 billion neurons in the adult brain, this yields a rough (conservative) estimate of 10 billion neurons and 100–500 billion glial cells (neural cells that do not transmit electrical impulses but play an essential role in neuronal structure and function) with one or another form of chromosomal aneuploidy. It is estimated that roughly 28% of embryonic neural precursor cells exhibit chromosomal aneuploidy in one form or another [14]. Mature aneuploid neurons are functionally active and integrated into brain circuitry, showing distant axonal connections. One likely result of this is neuronal signaling differences caused by altered gene expression, as documented in mammalian neural cells.
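The neuron figure above is simple arithmetic, and a minimal sketch makes it easy to check. The two input numbers come straight from the paragraph; the function name and constants are hypothetical names chosen for illustration, and integer math is used so the result is exact.

```python
# Back-of-the-envelope check of the neuron estimate quoted above.
# Inputs are the figures given in the text; names are illustrative only.

NEURONS_IN_ADULT_BRAIN = 100_000_000_000  # "an estimated 100 billion neurons"
ANEUPLOID_PERCENT = 10                    # "an astonishing 10%"


def aneuploid_cells(total_cells: int, percent: int) -> int:
    """Return the number of cells expected to be aneuploid at the given percentage."""
    return total_cells * percent // 100


aneuploid_neurons = aneuploid_cells(NEURONS_IN_ADULT_BRAIN, ANEUPLOID_PERCENT)
print(f"{aneuploid_neurons:,} aneuploid neurons")
```

The paragraph does not give a total glial cell count, so the glial range is not recomputed here.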

Suppose we are interested in the “heritability” of a psychological trait or “intelligence.” What, precisely, does the analysis of SNPs in DNA taken from blood or cheek cells tell us about the DNA in different regions of the brain, given widespread somatic mosaicism? (A further point is that because these processes are STOCHASTIC, MZ twins are discordant for DNA variation that results from somatic mosaicism [21]).

DNA is not the sole biological agent of inheritance. Persons inherit, in a non-Mendelian manner, epigenetic markings in the form of histone modifications and DNA methylation, non-coding RNAs including microRNAs and long non-coding RNAs, mitochondria and mitochondrial DNA, “maternal” oocytic messenger RNAs, and nucleoli (to name a few) [22–24] all of which play critical roles in every aspect of human development, phenotype formation, and phenotypic variation.

How are all of these non-DNA, non-Mendelian inherited elements to be incorporated into “gene-based” heritability estimates? The prevailing solution to this problem is to ignore them. Furthermore, one cannot separate the “effect” of inherited DNA from the effect of all of the inherited elements just cited, because without these elements, DNA has no effect whatsoever (for more on this, see below).

With all but a few exceptions (so-called monogenic disorders and oligogenic disorders with a predisposing allele) the “presence” of a “gene,” whether one or one thousand, indicates very little about the “effect” of the “gene” because “gene effects” are the result of agents external to the gene, namely, the cell and all of the cellular machinery that turns genes on and off, transcribes genes, translates genes, and utilizes gene products. If a “gene” is epigenetically silenced, it cannot be transcribed, so the presence of the same “gene” in any two individuals does not tell us whether it can have any phenotypic effect. Furthermore, the presence of a given “gene” does not tell us what protein will be transcribed from it. There are estimated to be over 100,000 proteins in the human body — and the number may be significantly higher — yet approximately 25–30,000 genes in the human genome. Alternative splicing (AS) allows multiple transcripts to be produced from a single segment of DNA (“gene”) and, consequently, multiple proteins. AS is estimated to occur in 95% of human genes, and can result in numerous proteins being synthesized from the same “gene.” These “isoforms” can exert radically different and even opposed physiological effects [25–29].
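To make the protein-to-gene mismatch concrete, here is a rough sketch using the figures in the paragraph above. The first function just divides the quoted protein count by the quoted gene counts. The second is a deliberately simplified toy model, an assumption that is not in the original comment: it treats each optional ("cassette") exon as independently included or skipped, which gives an upper bound of 2^n transcript variants for n such exons. Real splicing is far more constrained, so treat the second number as an illustration of how quickly combinations grow, not as a biological estimate.

```python
# Rough illustration of the protein/gene mismatch and of how alternative
# splicing can multiply products per "gene". Function names are hypothetical.


def proteins_per_gene(protein_count: int, gene_count: int) -> float:
    """Average number of distinct proteins per gene implied by the two counts."""
    return protein_count / gene_count


def max_isoforms(optional_exons: int) -> int:
    """Toy upper bound: each optional exon is independently included or skipped."""
    return 2 ** optional_exons


# Figures quoted in the text: over 100,000 proteins, roughly 25,000-30,000 genes.
low = proteins_per_gene(100_000, 30_000)
high = proteins_per_gene(100_000, 25_000)
print(f"at least {low:.1f} to {high:.1f} proteins per gene on average")

# Under the toy model, a gene with 5 optional exons allows up to 32 variants.
print(max_isoforms(5))
```

Even the lower ratio means the bare presence of a “gene” cannot identify which product the cell will make from it.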

The supposed dichotomy between “nature” and “nurture,” “genes” and “environment,” is an anachronism about as scientifically sound as Aristotle’s distinction between the sublunar world of change and the immutable heavens. Is the supposition supposed to be that everything “external” to the DNA sequence is “environment”? Is the chromatin, for example, in which the DNA is wrapped and which, by changing configuration in response to a variety of inputs, determines the extent to which any given segment of DNA is accessible to transcription factors and capable of being transcribed, part of the “environment”? Is the oocytic cytoplasm that contains “maternal” messenger RNAs that turn the zygotic genome on and that control early zygotic development prior to the activation of the embryonic genome, the “environment”?

Heritability studies assume that the DNA sequence (which DNA sequence?) does not change. As noted, DNA sequence changes during embryogenesis result in somatic mosaicism. What is more, retrotransposons or jumping genes, mobile segments of DNA that copy and paste themselves at various sites in the DNA sequence, changing DNA sequence and content, remain active throughout life in those parts of the brain that continue to generate new neurons (the hippocampus and the caudate nucleus) [30–35].

If behavior results from the accumulated activity of the embodied brain, if the DNA sequences vary in different neurons as a result of stochastic processes during development, if the activity of the DNA sequences varies in different cells due to epigenetic differences, if these epigenetic differences vary during the developmental process and in different environments, if the DNA sequence continues to change in those parts of the brain that undergo neurogenesis throughout life as a result of the activity of retrotransposons, if the activity of these retrotransposons is itself regulated by the epigenome which in turn is highly responsive to environmental inputs, if the protein translated from a given segment of this DNA at any given time is determined not by the segment of the DNA itself but by the mechanisms of the cell which determine which isoform to transcribe in response to innumerable environmental inputs (and on and on), then how on earth are the common SNPs from DNA samples from blood cells of thousands of unrelated persons supposed to tell us, e.g., how much of the variation in depression, or “intelligence” (in a population) is “due to” “genes” and how much to “environment”?

HUMANS POSSESS FEWER GENES THAN CORN. One of the surprising findings of the Genome Project was that the human genome contains an estimated 20,000 protein coding “genes,” less than maize (i.e., corn), which contains over 32,000 protein-coding “genes” [36], and close in number to the nematode, with approximately 19,000. And many “genes” appear to be preserved across species. Surely, the distinctive properties of the human brain and human behavior are the result of something other than what we have less of than corn. Yet GCTA tells us that common SNPs, SNPs that we most likely share with corn and nematodes, account for 35% of the heritability of “intelligence.”

Finally, an explanation as to why I have put the words “gene” and “genes” in quotes throughout. Over one hundred years after the basic rules of heredity were established, the gene is undergoing an identity crisis. Indeed the question ‘‘what is a gene?’’ has been much debated in recent years [37–39]. I have already mentioned the fact that alternative splicing entails that potentially thousands of different proteins can be transcribed from the same gene. But what is this gene? Not, clearly, a segment of DNA “coded” for the production of a particular protein. But “it” is likely not a segment of DNA at all. Proteins are produced by the use (by the cell) of various introns and exons, start sites and stop sites, and promoters that are by no means necessarily contiguous segments of DNA (as represented, for example, by an SNP) [40].

Practitioners of GCTA, twin studies, and behavioral genetics are either 1) completely unaware of advances in molecular genetics, 2) choose to ignore them because they cannot fit within their antiquated conception of genes and the genotype-phenotype relationship, or 3) believe that they are COMPATABLE [sic] with their underlying assumptions and methodologies. I am yet to see anything approaching a scientific defense of 3.

References

1. Henn BM, Hon L, Macpherson JM, et al. Cryptic distant relatives are common in both isolated and cosmopolitan genetic samples. PLoS One 2012;7(4):e34267-e67 doi: 10.1371/journal.pone.0034267.
2. Vinkhuyzen AA, Pedersen NL, Yang J, et al. Common SNPs explain some of the variation in the personality dimensions of neuroticism and extraversion. Transl Psychiatry 2012;17(2):27
3. Trzaskowski M, Dale PS, Plomin R. No Genetic Influence for Childhood Behavior Problems From DNA Analysis. Journal of the American Academy of Child & Adolescent Psychiatry 2013;52(10):1048–56.e3 doi: http://dx.doi.org/10.1016/j.jaac.2013.07.016.
4. Viding E, Price TS, Jaffee SR, et al. Genetics of callous-unemotional behavior in children. PLoS One 2013;8(7):e65789 doi: 10.1371/journal.pone.0065789.
5. Charney E. Behavior genetics and postgenomics. Behav Brain Sci 2012;35:1–80 doi: 10.1017/S0140525X11002226.
6. Abyzov A, Mariani J, Palejev D, et al. Somatic copy number mosaicism in human skin revealed by induced pluripotent stem cells. Nature 2012;492(7429):438–42 doi: 10.1038/nature11629.
7. Kano H, Godoy I, Courtney C, et al. L1 retrotransposition occurs mainly in embryogenesis and creates somatic mosaicism. Genes Dev 2009;23:1303–12
8. Baillie JK, Barnett MW, Upton KR, et al. Somatic retrotransposition alters the genetic landscape of the human brain. Nature 2011;479:534–37
9. Vitullo P, Sciamanna I, Baiocchi M, et al. LINE-1 retrotransposon copies are amplified during murine early embryo development. Molecular reproduction and development 2012;79(2):118–27 doi: 10.1002/mrd.22003.
10. De S. Somatic mosaicism in healthy human tissues. Trends in genetics: TIG 2011;27(6):217–23 doi: 10.1016/j.tig.2011.03.002.
11. Faulkner GJ. Retrotransposons: mobile and mutagenic from conception to death. FEBS letters 2011;585(11):1589–94 doi: 10.1016/j.febslet.2011.03.061.
12. Frank SA. Somatic evolutionary genomics: Mutations during development cause highly variable genetic mosaicism with risk of cancer and neurodegeneration. PNAS 2009;107:1725–30 doi: 10.1073/pnas.0909343106.
13. Iourov IY, Vorsanova SG, Liehr T, et al. Aneuploidy in the normal, Alzheimer’s disease and ataxia-telangiectasia brain: differential expression and pathological meaning. Neurobiology of disease 2009;34(2):212–20 doi: 10.1016/j.nbd.2009.01.003.
14. Iourov IY, Vorsanova SG, Yurov YB. Detection of Aneuploidy in Neural Stem Cells of the Developing and Adult Human Brain. 2008;4(2):36–42
15. Martin SL. Jumping-gene roulette. Nature 2009;460(August)
16. Mkrtchyan H, Gross M, Hinreiner S, et al. Early embryonic chromosome instability results in stable mosaic pattern in human tissues. PLoS One 2010;5(3):e9591-e91 doi: 10.1371/journal.pone.0009591.
17. Muotri AR, Zhao C, Marchetto MCN, et al. Environmental influence on L1 retrotransposons in the adult hippocampus. Hippocampus 2009;19(10):1002–07 doi: 10.1002/hipo.20564.
18. Sgaramella V, Astolfi PA. Somatic genome variations interact with environment, genome and epigenome in the determination of the phenotype: A paradigm shift in genomics? 2010;9:470–73 doi: 10.1016/j.dnarep.2009.11.011.
19. Rehen SK, Yung YC, McCreight MP, et al. Constitutional aneuploidy in the normal human brain. J Neurosci 2005;25(9):2176–80 doi: 10.1523/jneurosci.4560-04.2005.
20. Iourov IY, Vorsanova SG, Yurov YB. Chromosomal variations in mammalian neuronal cells: known facts and attractive hypotheses. Int Rev Cytol 2006;249:143–91
21. Piotrowski A, Bruder CE, Andersson R, et al. Somatic mosaicism for copy number variation in differentiated human tissues. Hum Mutat 2008;29:1118–24
22. Charney E. Cytoplasmic inheritance redux. Adv Child Dev Behav 2013;44:225–55
23. Evsikov AV, Graber JH, Brockman JM, et al. Cracking the egg: molecular dynamics and evolutionary aspects of the transition from the fully grown oocyte to embryo. Genes & development 2006;20(19):2713–27 doi: 10.1101/gad.1471006.
24. Schier AF. The maternal-zygotic transition: death and birth of RNAs. Science 2007;316(5823):406–7 doi: 10.1126/science.1140693.
25. Grabowski P. Alternative splicing takes shape during neuronal development. Curr Opin Genet Dev 2011;21:388–94
26. Black DL. Mechanisms of alternative pre-messenger RNA splicing. Annu Rev Biochem 2003;72:291–336
27. Blekhman R, Marioni JC, Zumbo P, et al. Sex-specific and lineage-specific alternative splicing in primates. Genome Res 2010;20:180–89
28. Calarco JA, Xing Y, Caceres M, et al. Global analysis of alternative splicing differences between humans and chimpanzees. Gene Dev 2007;21:2963–75
29. Perriman RJ, Ares M. Alternative splicing variability: exactly how similar are two identical cells? Molecular systems biology 2011;7(505):505–05 doi: 10.1038/msb.2011.44.
30. Perrat PN, DasGupta S, Wang J, et al. Transposition-driven genomic heterogeneity in the Drosophila brain. Science (New York, NY) 2013;340(6128):91–95 doi: 10.1126/science.1231965.
31. Cowley M, Oakey RJ. Transposable elements re-wire and fine-tune the transcriptome. PLoS genetics 2013;9(1):e1003234-e34 doi: 10.1371/journal.pgen.1003234.
32. Thomas CA, Paquola ACM, Muotri AR. LINE-1 Retrotransposition in the Nervous System. Annual review of cell and developmental biology 2012;28:555–73 doi: 10.1146/annurev-cellbio-101011-155822.
33. Iskow RC, McCabe MT, Mills RE, et al. Natural mutagenesis of human genomes by endogenous retrotransposons. Cell 2010;141(7):1253–61 doi: 10.1016/j.cell.2010.05.020.
34. Muotri AR, Marchetto MCN, Coufal NG, et al. L1 retrotransposition in neurons is modulated by MeCP2. Nature 2010;468(7322):443–46 doi: 10.1038/nature09544.
35. Muotri AR, Marchetto MCN, Zhao C, et al. Environmental influence on L1 retrotransposons in the adult hippocampus. Hippocampus 2009;19:1002–07 doi: 10.1002/hipo.20564.
36. Schnable PS, Ware D, Fulton RS, et al. The B73 maize genome: complexity, diversity, and dynamics. Science 2009;326:1112–15
37. Mercer TR, Mattick JS. Understanding the regulatory and transcriptional complexity of the genome through structure. Genome Res 2013;23(7):1081–8 doi: 10.1101/gr.156612.113.
38. Mudge JM, Frankish A, Harrow J. Functional transcriptomics in the post-ENCODE era. Genome Research 2013 doi: 10.1101/gr.161315.113.
39. Brosius J. The fragmented gene. Ann N Y Acad Sci 2009;1178:186–93 doi: 10.1111/j.1749-6632.2009.05004.x.
40. Mudge JM, Frankish A, Harrow J. Functional transcriptomics in the post-ENCODE era. Genome Res 2013;23(12):1961–73 doi: 10.1101/gr.161315.113.
