Social misperceptions and oversimplifications of genetics

During the latter half of the 20th century, the fields of genetics and molecular biology matured greatly, significantly increasing understanding of biological heredity. As with other complex and evolving fields of knowledge, public awareness of these advances has come primarily through the mass media, and a number of social misperceptions and oversimplifications of genetics have arisen. Popular misconceptions include the following ideas:
 * 1) Every aspect of the biology of an organism can be predicted from its genes
 * 2) Single genes code for specific anatomical or behavioural features
 * 3) Genes are a blueprint of an organism's form and behaviour
 * 4) Genes are uninterrupted sections of DNA that each code for only a single protein

Genetic determinism
While there are many examples of animals that display well-defined, genetically programmed behaviour, these examples have been extrapolated into a popular misconception that all patterns of behaviour, and more generally the phenotype, are rigidly genetically determined. There is good evidence that some basic aspects of human behaviour, such as circadian rhythms, are genetically based, but it is clear that many other aspects are not.

In the first place, much phenotypic variability does not stem from genetics. For example:
 * 1) Epigenetic inheritance. In the widest definition this includes all biological inheritance mechanisms that do not change or involve the genome. In a narrower definition it excludes biological phenomena such as the effects of prions and maternal antibodies, which are also inherited and have clear survival implications.
 * 2) Learning from experience. This is obviously a very important feature of humans, but there is also considerable evidence of learned behaviour in other animal species (both vertebrates and invertebrates). There are even reports of learned behaviour in Drosophila larvae.

Extreme examples of this emphasis on a strictly genetic basis for all behaviour can be found in the commercial world in statements from CEOs such as: "acquisitions are in our DNA", and "I don't think you're going to see us as a company that produces premium content. It's not in our DNA".

A gene for X
In the early years of genetics it was suggested that there might be "a gene for" a wide range of particular characteristics. This was partly because the examples studied from Mendel onwards inevitably focused on genes whose effects could be readily identified; partly because it was easier to teach science that way; and partly because the mathematics of evolutionary dynamics is simpler if there is a simple mapping between genes and phenotypic characteristics.

These factors have led to the general perception that there "is a gene for" arbitrary traits, leading to controversy in particular cases such as the purported "gay gene". However, in light of the known complexities of gene expression networks (and phenomena such as epigenetics), it is clear that instances where a single gene "codes for" a single, discernible phenotypic effect are rare, and that media presentations of "a gene for X" grossly oversimplify the vast majority of situations.

Genes as a blueprint
It is widely believed that genes provide a "blueprint" for the body in much the same way that architectural or mechanical engineering blueprints describe buildings or machines. At a superficial level, genes and conventional blueprints share the property of being low-dimensional (genes are organised as a one-dimensional string of nucleotides; blueprints are typically two-dimensional drawings on paper) while containing information about fully three-dimensional structures. However, this view ignores the fundamental differences between genes and blueprints in the nature of the mapping from the low-order information to the high-order object.

In the case of biological systems, a long and complicated chain of interactions separates genetic information from macroscopic structures and functions. The following simplified diagram of causality illustrates this:


 * Genes → Gene expression → Proteins → Metabolic pathways → Sub-cellular structures → Cells → Tissues → Organs → Organisms

Even at the small scale, the relationship between genes and proteins (once thought of as "one gene, one polypeptide") is known to be complicated, with approximately five proteins in the human body for each gene. More significantly, the causal chains from genes to functionality are not separate or isolated but are entangled together, most obviously in metabolic pathways (such as the Calvin and citric acid cycles) which link a succession of enzymes (and, thus, gene products) to form a coherent biochemical system. Furthermore, information flow in the chain is not exclusively one-way. While the central dogma of molecular biology describes how information cannot be passed back to inheritable genetic information, the other causal arrows in this chain can be bidirectional, with complex feedbacks ultimately regulating gene expression.
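The effect of such feedback can be sketched numerically. The toy model below is an illustration only, not a description of any real gene: it assumes a single protein whose production rate is either constant (no feedback) or repressed by the protein's own concentration, and all names and parameter values (`k`, `d`, `K`, `n`) are hypothetical choices made for the example.

```python
def simulate(feedback, k=10.0, d=1.0, K=1.0, n=2, dt=0.01, steps=5000):
    """Integrate dp/dt = production - d*p with a simple Euler scheme.

    feedback=False: production is the constant k (one-way causation).
    feedback=True:  production is k / (1 + (p/K)**n), so the protein
                    represses the expression of its own gene.
    All parameters are illustrative assumptions, not measured values.
    """
    p = 0.0  # protein concentration, arbitrary units
    for _ in range(steps):
        repression = 1.0 + (p / K) ** n if feedback else 1.0
        production = k / repression
        p += dt * (production - d * p)
    return p


open_loop = simulate(feedback=False)   # settles near k/d
closed_loop = simulate(feedback=True)  # settles where k/(1+p**2) == p
print(round(open_loop, 3), round(closed_loop, 3))
```

With these assumed parameters the two runs settle at roughly 10 and 2 units respectively, even though the "gene" (the value of `k`) is identical in both. The final level depends on the regulatory loop as well as on the coding information, which is one reason a simple gene-to-outcome mapping breaks down.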

Far from being a simple, linear mapping, the relationship between genotype and phenotype is complex and not straightforward to deconvolute. Rather than describing genetic information as a blueprint, some have suggested that a more appropriate analogy is that of a recipe for cooking, where a collection of ingredients is combined via a set of instructions to form an emergent structure, such as a cake, that is not described explicitly in the recipe itself.

Genes as words
It is popularly supposed that a gene is "a linear sequence of nucleotides along a segment of DNA that provides the coded instructions for synthesis of RNA" and even some current medical dictionaries define a gene as "a hereditary unit that occupies a specific location on a chromosome, determines a particular characteristic in an organism by directing the formation of a specific protein, and is capable of replicating itself at each cell division".

In fact, as the diagram illustrates schematically, the gene is a much more complicated and elusive concept. A reasonable modern definition of a gene is "a locatable region of genomic sequence, corresponding to a unit of inheritance, which is associated with regulatory regions, transcribed regions and/or other functional sequence regions". One of the major complicating factors is that the exons which code for the proteins are often separated by many introns, which used to be called "junk DNA" but appear to have various as-yet-ill-understood purposes. The exons can be joined in different combinations (splice variants) to produce different proteins. For example, the gene called Dscam in Drosophila has 110 introns and tens of thousands of possible splice variants.
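The combinatorial effect of alternative splicing can be sketched with a short calculation. The example below is a minimal illustration rather than a model of any real splicing mechanism: it assumes a gene whose variable exons fall into clusters, with exactly one alternative chosen from each cluster. The exon labels in `toy_gene` are hypothetical, and the cluster sizes used for Dscam (12, 48, 33 and 2 alternatives) are the commonly cited figures, not values taken from the text above.

```python
from itertools import product
from math import prod


def count_isoforms(cluster_sizes):
    """Number of distinct transcripts if one exon is picked per cluster."""
    return prod(cluster_sizes)


def enumerate_isoforms(clusters):
    """List every transcript as a string of the chosen exon labels."""
    return ["-".join(choice) for choice in product(*clusters)]


# Hypothetical toy gene: three clusters of mutually exclusive exons.
toy_gene = [["4a", "4b"], ["6a", "6b", "6c"], ["9a", "9b"]]
toy_isoforms = enumerate_isoforms(toy_gene)
print(len(toy_isoforms), toy_isoforms[:3])

# Commonly cited cluster sizes for Drosophila Dscam (an assumption here).
dscam_cluster_sizes = [12, 48, 33, 2]
dscam_isoforms = count_isoforms(dscam_cluster_sizes)
print(dscam_isoforms)
```

Because the number of variants is the product of the cluster sizes, a handful of modest clusters is enough to reach tens of thousands of potential proteins from a single gene.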

This kind of misperception is perpetuated when mainstream media report that an organism's genome has been "deciphered" when they mean simply that it has been sequenced.

A related misconception is that the sole function of genes is to code for proteins, with the non-coding remainder being "junk DNA". However, it now appears that, although protein-coding DNA makes up barely 2% of the human genome, about 80% of the bases in the genome may be expressed, so the term "junk DNA" may be a misnomer.