Figure: Jet in Carina, WFC3 IR [Courtesy NASA]
In spite of these exhilarating developments, some writers, principally of the creationist and intelligent design schools, prefer instead a highly combative approach to science, particularly to traditional topics such as geology and evolution. One widely used line by such writers is that certain features of biology are so unlikely, according to simple back-of-the-envelope probability calculations, that they could never have been produced by a purely natural, "random" evolutionary process, even assuming millions of years of geologic history. Thus the entirety of evolutionary theory must be false. Why don't scientists see the light?
For example, some writers equate the theory of evolution to the absurd suggestion that monkeys randomly typing at a typewriter could compose a selection from the works of Shakespeare, or that an explosion in an aerospace equipment yard could produce a working airliner [Dembski1998; Foster1991; Hoyle1981; Lennox2009]. More recent studies of this genre argue that functional biology operates on an exceedingly small subset of the space of all possible DNA sequences, and that any changes to the "computer program" of biology are, like changes to human computer programs, almost certain to make the organism non-functional [Axe2017; Marks2017].
One specific creationist-intelligent design argument addresses the human alpha-globin molecule, a component of hemoglobin that performs a key oxygen transfer function in blood. These writers argue that since alpha-globin is a protein chain based on a sequence of 141 amino acids, and since there are 20 different amino acids common in living systems, the probability of selecting human alpha-globin "at random" is one in 20^141 or one in 10^183 (i.e., a one followed by 183 zeroes). This probability is so tiny, so they argue, that even after millions of years of random molecular trials, no human alpha-globin protein molecule would ever appear, thus refuting the hypothesis of human evolution [Foster1991, pg. 79-83; Hoyle1981, pg. 1-20; Lennox2009, pg. 163-173].
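The arithmetic behind this figure (though not the reasoning built on it) is easy to check. The following is a minimal Python sketch, assuming only the two numbers quoted above: 20 amino acids and a chain of 141 residues. The variable names are illustrative choices, not anything from the cited sources.

```python
import math

ALPHABET_SIZE = 20   # amino acids common in living systems (from the text)
CHAIN_LENGTH = 141   # residues in human alpha-globin (from the text)

# Work in logarithms: log10(20^141) = 141 * log10(20).
log10_outcomes = CHAIN_LENGTH * math.log10(ALPHABET_SIZE)
exponent = math.floor(log10_outcomes)
mantissa = 10 ** (log10_outcomes - exponent)

print(f"20^141 is about {mantissa:.2f} x 10^{exponent}")
```

The count confirms the order of magnitude quoted by these writers. It says nothing about whether treating every sequence as equally likely is a valid model.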
As this chart indicates, the key initial steps in the chain of reasoning are to carefully define the physical phenomenon in question and to formulate an accurate mathematical model of this phenomenon. In the case of probability calculations, one must carefully address these questions: What exactly is the phenomenon being modeled? What exactly is the probability space (the set of all possible outcomes)? How exactly is probability to be measured? Is each possible event presumed to have the same probability? If so, why can this be assumed? Are certain events assumed to be independent? If so, why can this be assumed? If these and other questions are not carefully addressed, then it matters not in the slightest how sophisticated the mathematical calculations are -- the chain of inference is broken, and any conclusion is almost certainly invalid.
One overriding lesson of probability and statistics, when rigorously applied, is that seemingly improbable "coincidences" can and do happen. For instance, a common classroom exercise is to inquire how likely it is, in a class say of 30 students, that two or more of the students have the same birthday. Most people presume this is rather unlikely, but the correct probability is 70.6%; in general, it is more likely than not to happen whenever the class has 23 or more students. For additional details on fallacies in probability and statistics, including examples of how seemingly "improbable" events can happen, see [Hand2014; Mlodinow2009; Pinker2021].
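The birthday figures can be reproduced with a few lines of Python. This is a minimal sketch under the usual textbook assumptions, which are not spelled out in the text: 365 equally likely birthdays, no leap days, and independent students. The function name is an illustrative choice.

```python
def shared_birthday_probability(group_size, days=365):
    """Probability that at least two of `group_size` people share a birthday."""
    p_all_distinct = 1.0
    for k in range(group_size):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

p30 = shared_birthday_probability(30)
# Smallest class size for which a shared birthday is more likely than not.
threshold = next(n for n in range(1, 366)
                 if shared_birthday_probability(n) > 0.5)

print(f"class of 30: {p30:.1%}; more likely than not from {threshold} students")
```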
More generally, the alpha-globin argument, along with almost all other creationist-intelligent design probability arguments, is a clear instance of the classic post-hoc probability fallacy, namely reckoning the probability of a single event after the fact and then claiming that the event is remarkably improbable. Post-hoc probability arguments, which are a form of confirmation bias reasoning, have long plagued scientific research work, ranging from physics, chemistry, biology and medicine to psychology, economics and finance. Most disciplines now take strong steps to avoid such arguments in published literature [Pinker2021; Bailey2014]. As one illustration of this problem, suppose one was dealt the following 13-card hand from a standard 52-card well-shuffled deck, in order:
If one calculates the probability of this specific event, after the fact, assuming that all card deals are equally likely, the result is approximately one in 4 x 10^21 (i.e., roughly one in 4 billion trillion). This probability is so tiny that even if every adult human on Earth were to conduct this experiment 1,000 times, it is still exceedingly unlikely (less than one in a billion chance) that this specific hand, in order, would be dealt. But it was. So does this microscopic probability constitute evidence that the original card deal event occurred outside the realm of natural law? Of course not. In reality, there is nothing particularly remarkable about this specific hand at all. If this particular ordered hand had been specified beforehand, then perhaps the outcome would have been remarkable. But it was not specified beforehand, so this is a clear instance of fallacious post-hoc reasoning.
Note that the probability figure calculated above could be made much more extreme by merely dealing more cards. For example, if one were to deal all 52 cards from a well-shuffled deck, record the result, and then calculate, after the fact, the probability of this particular ordered deal, the result is approximately one in 8 x 10^67. This figure is so remote that if every planet around every star in the observable universe were each populated with ten billion sentient beings, and if each of these beings had been repeatedly shuffling and dealing 52-card hands since the big bang 13.8 billion years ago, it is still exceedingly unlikely that this particular ordered hand of cards would ever be dealt. But it was. Any argument based on post-hoc probability reckoning is effectively nullified.
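Both card-deal counts can be computed exactly. This is a minimal Python sketch, assuming a standard 52-card deck in which every ordered deal is equally likely, as the text does. It relies on `math.perm`, which requires Python 3.8 or later.

```python
import math

# Ordered 13-card hands: 52 * 51 * ... * 40 = 52! / 39!
ordered_13_card_deals = math.perm(52, 13)

# Ordered deals of the whole deck: 52!
ordered_full_deck_deals = math.factorial(52)

print(f"13-card ordered deals: {ordered_13_card_deals:.2e}")
print(f"52-card ordered deals: {ordered_full_deck_deals:.2e}")
```

Every individual deal has the same tiny probability, so any hand that is actually dealt looks equally "impossible" when its probability is reckoned after the fact.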
We should add here that at least in the card-deal examples, the relevant probability can be reliably calculated. By contrast, implicit in the creationist-intelligent design alpha-globin argument is the assumption that every instance of the space of 141-long amino acid chains is equally probable, so that the probability of any given outcome is merely the reciprocal of the total number of combinatorial possibilities. But no justification is provided for this assumption, and, given the complexity of molecular biology, it is utterly false -- some amino acid sequences might be relatively more likely to emerge, while vast numbers of other sequences might not be biologically possible at all. At our current level of scientific understanding, no empirically defensible probability figure can be specified for these phenomena. As an illustration, consider the following snowflake, one among many shown in [Bentley1962]:
It is straightforward, by analysis say of a digital image, to record the presence or absence of ice at each point say in a 256x256 pixel grid (256x256 = 65,536 pixels). One might then argue, along the lines of the alpha-globin argument, that the "probability" of this structure is one in 2^65,536, or roughly one in 2 x 10^19,728. So does this staggeringly remote probability figure constitute evidence that this particular snowflake was "designed" or appeared in violation of natural law? Hardly. As with the alpha-globin argument, some snowflake patterns are relatively more common, while many others are not physically possible at all. Any "probability" figure for the outcome of a complex physical process, based mainly on enumerating combinatorial possibilities rather than real empirical data, has no credibility.
Finally, most anti-evolution probability arguments (certainly including the alpha-globin example) fail to recognize that the process of natural biological evolution is not really a "random" process. Evolution certainly has some random aspects, notably mutations and genetic events during reproduction. But the all-important process of natural selection, acting in a competitive landscape and with numerous complicated environmental pressures, is anything but random. This strongly directional nature of natural selection, which is the essence of evolution, by itself invalidates most anti-evolution probability arguments.
With regard to alpha-globin, it is worth noting that heme, the key oxygen-carrying component of hemoglobin, is remarkably similar to chlorophyll, the molecule behind photosynthesis. The principal difference is that heme has a central iron atom, whereas chlorophyll has a central magnesium atom; otherwise they are virtually identical. This similarity can hardly be a coincidence, and in fact researchers have concluded since at least 1980, based on both functional and biochemical evidence, that these two biomolecules "have arisen in the course of evolution from a common origin" [Hendry1980; Hardison2012]. Here is a diagram of the two molecules [from MasterOrganicChemistry.com]:
In summary, the probability arguments that have been promoted in the creationist and intelligent design literature are riddled with severe errors, most notably the fact that they are clear instances of the post-hoc probability fallacy -- vacuous arithmetic exercises with no basis in real empirical biology. Such arguments would never be accepted in a rigorously peer-reviewed journal in applied probability or biology, not because of their implication for evolution (pro or con), but because this type of reasoning is clearly invalid. For additional discussion, see [Musgrave1998; Rosenhouse2016; Rosenhouse2018]. A newly published book by Jason Rosenhouse, which discusses these issues in significantly greater detail, is [Rosenhouse2022].
For example, as mentioned above, some critics have equated natural biological evolution to the absurd suggestion that some monkeys typing randomly at a keyboard could generate a passage of Shakespeare. Others have argued that any changes to a "computer program" would surely render the program unusable. But these too are fallacious arguments, since they ignore the all-important process of natural selection. As a single example, a 2009 study by the present author exhibited results of a computer program simulating natural evolution, which "evolved" segments of English text very much akin to actual passages from Charles Dickens. In many instances, a class of college students were unable to distinguish the computer-generated text segments from real text segments taken from Dickens' Great Expectations. See English-text for details.
Another example is the recent rise of "genetic algorithms" and "evolutionary computing," namely computer programs that mimic the process of biological evolution to produce novel solutions to scientific and engineering problems. As a single example, in 2017 Google researchers generated 1000 image recognition algorithms, each of which was trained using state-of-the-art deep neural networks to recognize a selected set of images. They then used an array of 250 computers, each running two algorithms, to identify an image. Only the algorithm that scored higher proceeded to the next iteration, where it was changed somewhat, mimicking mutations in biological evolution. The researchers found that their scheme could achieve accuracies as high as 94.6%, better than human efforts [Gershgorn2017]. In another Google-funded research project, researchers programmed a computer with only the rules of Go, together with an evolution-style "deep learning" algorithm, and then had the program play games against itself. Within a few days it had advanced to the point that it defeated an earlier Google program 100 games to zero. This earlier program, in turn, had previously defeated the world's champion human Go player [Greenmeier2017]. As a third example, researchers in Spain have developed software that employs an evolution-like strategy to directly infer scientific results from raw data, in some cases more successfully than the best human efforts [Wood2022].
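The pairwise "higher scorer advances, then mutates" loop can be illustrated with a toy program. This Python sketch is not the Google researchers' method. The bit-string genomes, the bit-counting `fitness` function, the mutation rate and all names are assumptions chosen for brevity, and the score is a simple stand-in for task performance such as recognition accuracy.

```python
import random

def fitness(genome):
    """Toy score: the number of 1-bits (a stand-in for recognition accuracy)."""
    return sum(genome)

def mutate(genome, rng, rate=0.02):
    """Copy a genome, flipping each bit with a small probability."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]

def evolve(pop_size=50, genome_len=64, rounds=5000, seed=0):
    """Run pairwise contests; return (best initial score, best final score)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    best_initial = max(fitness(g) for g in population)
    for _ in range(rounds):
        winner, loser = rng.sample(range(pop_size), 2)
        if fitness(population[winner]) < fitness(population[loser]):
            winner, loser = loser, winner
        # The higher scorer survives; a mutated copy of it replaces the loser.
        population[loser] = mutate(population[winner], rng)
    best_final = max(fitness(g) for g in population)
    return best_initial, best_final

best_initial, best_final = evolve()
print(f"best score: {best_initial} at the start, {best_final} at the end")
```

In this variant the winner is kept unchanged alongside its mutated copy, so the best score in the population can never decrease. Mutation supplies random variation, while the contests supply the non-random, directional component.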
Some creationist writers have attempted to dismiss examples of genetic algorithms and evolutionary computing, claiming that such programs always include a target that nullifies their result. But this is not true, certainly not for the above-mentioned examples [Thomas2006; Thomas2010].
Here it is instructive to consider transposons or "jumping genes," namely sections of DNA that have been "copied" from one part of an organism's genome and "pasted" seemingly at random in other locations. The human genome, for example, has over four million individual transposons in over 800 families [Mills2007]. In most cases transposons do no harm, because they "land" in an unused section of DNA; and some transposons have subsequently adopted biological functionality (although for the purposes of this discussion it does not matter in the least whether or not they have biological functionality). But because they are distinctive and inherited, they serve as excellent markers for genetic studies. Indeed, transposons have been used to classify a large number of vertebrate species into a family tree, with a result that is virtually identical to what biologists had earlier reckoned based only on physical features and biological functions [Rogers2011, pg. 25-31, 86-92]. As just one example, consider the following table, where columns labeled ABCDE denote five blocks of transposons, and x and o denote that the block is present or absent in the genome [Rogers2011, pg. 89].
                                   Transposon blocks
                    Species        A  B  C  D  E
       /---------   Human          o  x  x  x  x
      /----------   Bonobo         x  x  x  x  x
     / \---------   Chimp          x  x  x  x  x
    /------------   Gorilla        o  o  x  x  x
-----|------------  Orangutan      o  o  o  x  x
    \------------   Gibbon         o  o  o  o  o
It is clear from these data that our closest primate relatives are chimpanzees and bonobos. As another example, here is a classification of four cetaceans (ocean mammals) based on transposon data [Rogers2011, pg. 27]:
                                   Transposon blocks
             Species               A B C D E F G H I J K L M N O P
    /------  Bottlenose dolphin    x x x x x x x x x x x x x x x x
   /\------  Narwhal whale         x x x x x x x x x x x x x x x x
---|-------  Sperm whale           x x x x x o o o o o o o o o o o
   \-------  Humpback whale        x x o o o o o o o o o o o o o o

Other examples could be listed, encompassing an even broader range of species [Rogers2011, pg. 25-31, 86-92].
Needless to say, these data, which all but scream "descent from common ancestors," are highly problematic for creationists and others who hold that the individual species were separately created without common biological ancestry. Transposons typically are several thousand DNA base pair letters long, but, since there are often some disagreements from species to species, let us be very conservative and say only 1000 base pair letters long. Then for two species to share even one transposon starting at the same spot, presumably only due to random single-letter mutations since creation, the probability (according to the creationist hypothesis) is one in 4^1000 or roughly one in 10^600. For 16 such common transposons, the chances are one in 4^16000 or roughly one in 10^9600 (and individual species typically have many thousands of transposons).
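These exponents follow from the same kind of counting. Here is a minimal Python sketch using the conservative figures above: four DNA letters, 1000 letters per transposon, and 16 shared transposons. The constant names are illustrative, and the calculation inherits the simplified assumption that each letter is independent and equally likely.

```python
import math

DNA_LETTERS = 4           # A, C, G, T
TRANSPOSON_LENGTH = 1000  # conservative length used in the text
SHARED_TRANSPOSONS = 16   # shared blocks considered in the text

# log10(4^1000) = 1000 * log10(4); exponents add across transposons.
log10_one = TRANSPOSON_LENGTH * math.log10(DNA_LETTERS)
log10_all = SHARED_TRANSPOSONS * log10_one

print(f"one shared transposon: one in about 10^{log10_one:.0f}")
print(f"{SHARED_TRANSPOSONS} shared transposons: one in about 10^{log10_all:.0f}")
```

The unrounded exponents are slightly larger than the round figures quoted above, which only strengthens the point being made.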
But this is not all, because we have not yet considered the fact that in each diagram above, or in other tables of real biological transposon data, there is a clear hierarchical relationship. This is by no means assured, and in fact is quite improbable -- for almost all tables of "random" data, there is no hierarchical pattern, and no way to rearrange the rows to be in a hierarchical pattern. For example, in a computer run programmed by the present author, each column of the above cetacean table was pseudorandomly shuffled (thus maintaining the same number of x and o in each column), and the program checked whether the rows of the resulting table could be rearranged to be in a hierarchical order. There were no successes in 10,000,000 trials. As a second experiment, a 4 x 16 table of pseudorandom data (with a 50-50 chance of x or o) was generated, and then the program attempted to rearrange the rows to be in a hierarchical pattern as before. There were only three successes in 10,000,000 trials.
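An experiment of this kind can be sketched in Python as follows. The exact test used in the runs described above is not specified, so this sketch substitutes a standard criterion as an assumption: a table counts as hierarchical if, for every pair of columns, the sets of species carrying an x are either nested or disjoint. That criterion does not depend on row order, so no explicit rearrangement of rows is needed. All function names are hypothetical, and the trial counts are kept small so the script runs quickly.

```python
import random
from itertools import combinations

# Cetacean table from the text: one string per species, columns A..P.
CETACEANS = {
    "Bottlenose dolphin": "x" * 16,
    "Narwhal whale":      "x" * 16,
    "Sperm whale":        "x" * 5 + "o" * 11,
    "Humpback whale":     "x" * 2 + "o" * 14,
}

def is_hierarchical(rows):
    """Assumed criterion: every pair of columns is nested or disjoint."""
    carriers = [frozenset(r for r, row in enumerate(rows) if row[c] == "x")
                for c in range(len(rows[0]))]
    for a, b in combinations(carriers, 2):
        if a & b and not (a <= b or b <= a):
            return False  # the two columns overlap without nesting
    return True

def shuffle_columns(rows, rng):
    """Shuffle each column independently, preserving its count of x and o."""
    columns = [list(col) for col in zip(*rows)]
    for col in columns:
        rng.shuffle(col)
    return ["".join(row) for row in zip(*columns)]

def random_table(n_rows, n_cols, rng):
    """Table with an independent 50-50 chance of x or o in each cell."""
    return ["".join(rng.choice("xo") for _ in range(n_cols))
            for _ in range(n_rows)]

def count_successes(make_table, trials):
    """Count how many generated tables pass the hierarchy test."""
    return sum(is_hierarchical(make_table()) for _ in range(trials))

rng = random.Random(0)
table = list(CETACEANS.values())
shuffled_hits = count_successes(lambda: shuffle_columns(table, rng), 10_000)
random_hits = count_successes(lambda: random_table(4, 16, rng), 10_000)
print(f"column-shuffled cetacean table: {shuffled_hits} of 10,000 hierarchical")
print(f"random 4 x 16 table: {random_hits} of 10,000 hierarchical")
```

Because the criterion here is an assumption, the counts it produces need not match the figures reported above. The qualitative point is the same: the real table passes the test, while shuffled or random tables rarely do.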
These calculations are simplified and informal; more careful reckonings can be done, and one can vary the underlying assumptions. But one way or the other, it is clear that the hypothesis of separate creation of individual species does not resolve any probability paradoxes; instead it enormously magnifies them.
The only other possibility, from a strict creationist worldview, is to propose that a supreme being separately created individual species with hundreds of thousands of transposons already in place, essentially just as we see them today. But since this hypothesis can be crafted to match any set of DNA evidence, it fails to be falsifiable. More importantly, it merely replaces a scientific failure (the inability of the independent creation model to explain, using natural scientific laws, the vast phylogenetic patterns in transposon data) with a theological disaster (why did a truth-loving supreme being fill the genomes of the biological kingdom with vast amounts of misleading DNA evidence, if "descent from common ancestors" is not the conclusion we are to draw?) [Rogers2011, pg. 89]. Indeed, with regard to the discomfort some have about evolution, the creationist-intelligent design alternative of separate creation is arguably much worse, both scientifically and theologically.
However, the back-of-the-envelope probability arguments that have appeared in the creationist-intelligent design literature do not help unravel these profound questions, because these arguments are riddled with severe errors that would disqualify them from peer-reviewed journals in the evolutionary biology field. These difficulties include:
Perhaps at some time in the distant future, a super-powerful computer will be able to simulate with convincing fidelity the multi-billion-year biological history of the Earth, in the same way that scientists today attempt to simulate, in a much more modest scope, the Earth's weather and climate. Then, after thousands of such simulations have been performed, researchers might obtain some meaningful statistics on the chances involved in the origin of life on Earth, or in the formation of some class of biological structures such as alpha-globin. Perhaps also researchers will eventually reconstruct, in the laboratory, additional key biomolecular steps involved in the origin of life. And perhaps one day researchers will even discover forms of life on other planets, and eventually, after thousands of such life forms have been catalogued, they may be able to empirically assess the probability of the origin of life or of specific biomolecules.
Until that time, the probability calculations that appear in creationist-intelligent design literature should be viewed with great skepticism, to say the least. After all, does anyone really believe that the entire edifice of modern evolutionary biology can be felled by a few simple back-of-the-envelope probability calculations that do not involve real empirical data? Common sense says otherwise. As mathematician Jason Rosenhouse writes [Rosenhouse2018],
When biologists ascribe to evolution the ability to craft information-rich genomes, they are neither speculating nor guessing. The basic components of evolutionary theory are empirical facts. Genes really do mutate, sometimes leading to new functionalities. The process of gene duplication with subsequent divergence leads to the creation of information by any reasonable definition of the terms. Selection can string small variations together into directional change. On a small scale, this has all been observed. And if small increases in information are an empirical reality on human timescales, then what abstract principle of mathematics is going to rule out much larger increases on geological scales?
Then here come the ID [intelligent design] folks, full of swagger and bravado. They say the accumulated empirical evidence must yield before their back-of-the-envelope probability calculations and abstract mathematical modeling. Evolution should be abandoned in favor of the new theory of intelligent design. This theory states, in its entirety, that an intelligent agent of unspecified motives and abilities did something at some point in natural history. Not very useful.
In a larger context, one must question whether highly technical issues such as biomolecular structures or calculations of probabilities have any place in a discussion of modern philosophy or theology. As noted above, the natural universe, as revealed by the latest state-of-the-art scientific research, is far larger, far more exotic, far more astounding in complexity and beauty, and, at the same time, far more bound by elegant laws, than any previous generation could ever have imagined. And the fact that we humans, working together across races, economic classes and national boundaries, can uncover these laws, via diligent application of the scientific method, is a tribute to the highest potential of the human spirit.