[Image: Triplet Arp 274. Courtesy NASA]
One creationist-intelligent design argument goes like this: the human alpha-globin molecule, a component of hemoglobin that performs a key oxygen transfer function, is a protein chain based on a sequence of 141 amino acids. There are 20 different amino acids common in living systems, so the number of potential chains of length 141 is 20^141, which is roughly 10^183 (i.e., a one followed by 183 zeroes). These writers argue that this figure is so enormous that even after billions of years of random molecular trials, no human alpha-globin protein molecule would ever appear "at random," and thus the hypothesis that human alpha-globin arose by an evolutionary process is decisively refuted [Foster1991, pg. 79-83; Hoyle1981, pg. 1-20; Lennox2009, pg. 163-173].
To illustrate the difficulties with probability arguments, mathematics teachers often ask their class (let's say it has 30 students) whether they think it is likely that two or more persons in the class have exactly the same birthday. Most students say that it is highly unlikely, thinking that the chance that two people have the same particular birthday is 1/365, and so 30 times this amount is only 30/365. But this argument is fallacious, since, for example, in a class of 30 students there are 435 pairs of students. When the probability calculation is done correctly for the case of 30 students [it is equal to 1 - (364/365 x 363/365 x ... x 336/365)], one obtains 70.6%. In general, if there are 23 or more students in the class, then the chance that two or more have the same birthday is greater than 50%.
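The birthday calculation is easy to check numerically. The following is a minimal sketch, assuming 365 equally likely birthdays and no leap years; the function name `shared_birthday_probability` is chosen here purely for illustration.

```python
import math


def shared_birthday_probability(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes all `days` birthdays are equally likely (no leap years).
    """
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    # Number of distinct pairs in a class of 30 students.
    print("pairs among 30 students:", math.comb(30, 2))
    for n in (22, 23, 30):
        print(f"n = {n}: {shared_birthday_probability(n):.3f}")
```

The product inside the loop is the same one quoted in the text (365/365 x 364/365 x ... x 336/365 for 30 students), so the result for n = 30 can be compared directly with the 70.6% figure.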
For numerous other examples of how seemingly improbable "coincidences" can happen, see [Hand2014].
A calculation such as this can be refined further, taking into account other features of alpha-globin and its related biochemistry. Some of these calculations produce probability values even more extreme than the above. But do any of these calculations really matter? The main problem is that all such calculations, whether done accurately or not, suffer from the fatal fallacy of presuming that a structure such as human alpha-globin arose by a single all-at-once random trial event. But generating a molecule "at random" in a single shot is decidedly not the scientific hypothesis in question -- this is a creationist theory, not a scientific theory. Instead, available evidence from hundreds of published studies on the topic has demonstrated that alpha-globin arose as the end product of a long sequence of intermediate steps, each of which was biologically useful in an earlier context. See, for example, the survey article [Hardison2001], which cites 144 papers on the topic of hemoglobin evolution (note: this reference is now 17 years out of date -- many more have been published since then).
In short, the creationist-intelligent design argument claiming that scientists assert an all-at-once "at random" creation of various biomolecules, and then asserting that this is probabilistically impossible, is a classic "straw man" fallacy. Scientists do not believe this, so this line of argumentation is completely invalid. In other words, it does not matter how good or how bad the mathematics used in the analysis is, if the underlying model is a fundamentally invalid description of the phenomenon in question. Any simplistic probability calculation of evolution that does not take into account the step-by-step process by which the structure came to be is almost certainly fallacious and can easily mislead [Musgrave1998; Rosenhouse2018].
What's more, such calculations completely ignore the atomic-level biochemical processes involved, which often exhibit strong affinities for certain types of highly ordered structures. For example, molecular self-assembly occurs in DNA molecule duplication every time a cell divides. If we were to compute the chances of the formation of a human DNA molecule during meiosis, using a simple-minded probability calculation similar to that mentioned above, the result would be something on the order of one in 10^1,000,000,000, which is far, far beyond the possibility of "random" assemblage. Yet this process occurs many times every day in the human body and in every other plant and animal species.
The fallacy here, once again, is presuming an all-at-once random assembly of molecules. Instead, snowflakes, like biological organisms, are formed as the product of a long series of steps acting under well-known physical laws, and the outcomes of such processes very sensitively depend on the starting conditions and numerous environmental parameters. It is thus folly to presume that one can correctly reckon the chances of a given outcome by means of superficial probability calculations that ignore the processes by which they formed.
For example, a 2009 study by the present author exhibited results of a computer program simulating natural evolution, which "evolved" segments of English text very much akin to actual passages from Charles Dickens. In many instances, a class of college students were unable to distinguish the computer-generated text segments from real text segments taken from Dickens' Great Expectations. See English-text for details.
Closely related are advances in artificial intelligence, in which a set of computer programs "compete" to produce a superior program. One notable example is the 2016 defeat of the world's top Go player by a computer program named AlphaGo, developed by DeepMind (a subsidiary of Alphabet, Google's parent company), in an event that surprised observers who had not expected this for decades, if ever. Then in 2017, DeepMind announced even more remarkable results: their researchers had started from scratch, programming a computer with only the rules of Go, together with a "deep learning" algorithm, and then had the program play games against itself. Within a few days it had advanced to the point that it defeated the earlier champion-beating AlphaGo program 100 games to zero. After one month, the program's rating was as far above the world champion as the world champion was above a typical amateur [Greenmeier2017].
Here it is instructive to consider transposons or "jumping genes," namely sections of DNA that have been "copied" from one part of an organism's genome and "pasted" seemingly at random in other locations. The human genome, for example, has over four million individual transposons in over 800 families [Mills2007]. In most cases transposons do no harm, because they "land" in an unused section of DNA, but because they are inherited they serve as excellent markers for genetic studies. Indeed, transposons have been used to classify a large number of vertebrate species into a family tree, with a result that is virtually identical to what biologists had earlier reckoned based only on physical features and biological functions [Rogers2011, pg. 25-31, 86-92]. As just one example, consider the following table, where columns labeled ABCDE denote five blocks of transposons, and x and o denote that the block is present or absent in the genome [Rogers2011, pg. 89].
                              Transposon blocks
                   Species    A B C D E
        /--------- Human      o x x x x
       /---------- Bonobo     x x x x x
      / \--------- Chimp      x x x x x
     /------------ Gorilla    o o x x x
-----|------------ Orangutan  o o o x x
     \------------ Gibbon     o o o o o

It is clear from these data that our closest primate relatives are chimpanzees and bonobos. As another example, here is a classification of four cetaceans (ocean mammals) based on transposon data [Rogers2011, pg. 27]:
                                Transposon blocks
            Species             A B C D E F G H I J K L M N O P
    /------ Bottlenose dolphin  x x x x x x x x x x x x x x x x
   /\------ Narwhal whale       x x x x x x x x x x x x x x x x
---|------- Sperm whale         x x x x x o o o o o o o o o o o
   \------- Humpback whale      x x o o o o o o o o o o o o o o

Other examples could be listed, encompassing an even broader range of species [Rogers2011, pg. 25-31, 86-92].
Needless to say, these data, which all but scream "descent from common ancestors," are highly problematic for creationists and others who hold that the individual species were separately created without common biological ancestry. Transposons typically are several thousand DNA base pair letters long, but, since there are often some disagreements from species to species, let us be very conservative and say only 1000 base pair letters long. Then for two species to share even one transposon starting at the same spot, presumably only due to random mutations since creation, the probability (according to the creationist hypothesis) is one in 4^1000 or roughly one in 10^600. For 16 such common transposons, the chances are one in 4^16000 or roughly one in 10^9600. What's more, as mentioned above, an individual species typically has at least several hundred thousand such transposons. Including even part of these in the reckoning would hugely multiply these odds.
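The powers of ten quoted in this article can be checked with logarithms, without ever forming the enormous numbers themselves. This is a minimal sketch; the helper name `log10_of_power` is an illustrative assumption, and the three cases are the alpha-globin sequence count and the two transposon figures from the text.

```python
import math


def log10_of_power(base, exponent):
    """Return log10(base ** exponent) using exponent * log10(base)."""
    return exponent * math.log10(base)


if __name__ == "__main__":
    cases = [
        ("alpha-globin chains (20 amino acids, length 141)", 20, 141),
        ("one shared 1000-letter transposon (4 bases)", 4, 1000),
        ("16 shared 1000-letter transposons (4 bases)", 4, 16000),
    ]
    for label, base, exponent in cases:
        power_of_ten = log10_of_power(base, exponent)
        print(f"{label}: {base}^{exponent} is about 10^{power_of_ten:.1f}")
```

The exponents come out near 183.4, 602 and 9633, in line with the rounded figures of 10^183, 10^600 and 10^9600 used in the text.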
But this is not all, because we have not yet considered the fact that in each diagram above, or in other tables of real biological transposon data, there is a clear hierarchical relationship. This is by no means assured, and in fact is quite improbable -- for almost all tables of "random" data, there is no hierarchical pattern, and no way to rearrange the rows to be in a hierarchical pattern. For example, in a computer run programmed by the present author, each column of the above cetacean table was pseudorandomly shuffled (thus maintaining the same number of x and o in each column), and the program checked whether the rows of the resulting table could be rearranged to be in a hierarchical order. There were no successes in 10,000,000 trials. As a second experiment, a 4 x 16 table of pseudorandom data (with a 50-50 chance of x or o) was generated, and then the program attempted to rearrange the rows to be in a hierarchical pattern as before. There were only three successes in 10,000,000 trials.
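The two experiments just described can be sketched in a few lines. The author's program and its exact test for "hierarchical" are not given here, so the following rests on a labeled assumption: a table counts as hierarchical if its rows can be reordered into a staircase, which is the same as requiring that the sets of species marked x in each column are nested (any two columns' sets are related by inclusion). All function names are illustrative, and the trial count is far smaller than the 10,000,000 quoted above.

```python
import random

# Cetacean table from the text: one string per species, columns A..P.
CETACEANS = [
    "x" * 16,             # Bottlenose dolphin
    "x" * 16,             # Narwhal whale
    "x" * 5 + "o" * 11,   # Sperm whale
    "x" * 2 + "o" * 14,   # Humpback whale
]


def column_sets(rows):
    """For each column, the set of row indices marked 'x'."""
    n_cols = len(rows[0])
    return [frozenset(i for i, row in enumerate(rows) if row[j] == "x")
            for j in range(n_cols)]


def is_hierarchical(rows):
    """ASSUMED criterion: the distinct column sets form a chain under inclusion.

    If they do, sorting rows by how many columns contain them yields a staircase.
    """
    distinct = sorted(set(column_sets(rows)), key=len)
    return all(a <= b for a, b in zip(distinct, distinct[1:]))


def shuffle_columns(rows, rng):
    """Shuffle each column independently, keeping its count of x and o."""
    cols = [list(col) for col in zip(*rows)]
    for col in cols:
        rng.shuffle(col)
    return ["".join(row) for row in zip(*cols)]


def random_table(rng, n_rows=4, n_cols=16):
    """Table with an independent 50-50 chance of x or o in every cell."""
    return ["".join(rng.choice("xo") for _ in range(n_cols))
            for _ in range(n_rows)]


def count_successes(make_table, trials, rng):
    """Number of generated tables that pass the hierarchical test."""
    return sum(is_hierarchical(make_table(rng)) for _ in range(trials))


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so runs are repeatable
    trials = 100_000         # the text reports runs of 10,000,000
    shuffled = count_successes(lambda r: shuffle_columns(CETACEANS, r), trials, rng)
    uniform = count_successes(random_table, trials, rng)
    print("column-shuffled cetacean table:", shuffled, "successes in", trials)
    print("50-50 random 4 x 16 table:", uniform, "successes in", trials)
```

Checking nestedness of the column sets avoids trying all row orderings explicitly, which keeps each trial cheap. A different definition of "hierarchical" (for example, allowing branching groups rather than a single staircase) would give different success counts.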
Like the calculations mentioned earlier, these calculations are simplified and informal; more careful reckonings can be done, and one can vary the underlying assumptions. But, again, do the fine details of the calculations really matter? One way or the other, it is clear that the creationist hypothesis of separate creation does not resolve any probability paradoxes; instead it enormously magnifies them. The only other possibility, from a strict creationist worldview, is to posit that a supreme being separately created species with hundreds of thousands of transposons already in place, essentially just as we see them today.
But this merely replaces a scientific disaster (the utter failure of the creationist model to explain the vast phylogenetic patterns in transposon data) with a theological disaster (why did a truth-loving supreme being fill the genomes of the entire biological kingdom with vast amounts of misleading DNA evidence, all pointing unambiguously to an evolutionary descent from common ancestors, if that is not the conclusion we are to draw?). Indeed, with regard to the discomfort some have about evolution, the creationist alternative of separate creation is arguably far worse, both scientifically and theologically.
However, arguments based on probability, statistics or information theory that have appeared in the creationist-intelligent design literature do not help unravel these questions, because these arguments have serious fallacies:
Perhaps at some time in the distant future, a super-powerful computer will be able to simulate with convincing fidelity the multi-billion-year biological history of the Earth, in the same way that scientists today attempt to simulate (in a much more modest scope) the Earth's weather and climate. Then, after thousands of such simulations have been performed, with different starting conditions, we might obtain some meaningful statistics on the chances involved in the origin of life, or in the formation of some class of biological structures such as hemoglobin, or in the rise of intelligent creatures such as ourselves.
Until that time, probability calculations that appear in creationist-intelligent design literature and elsewhere should be viewed with great skepticism, to say the least. As mathematician Jason Rosenhouse writes [Rosenhouse2018],
When biologists ascribe to evolution the ability to craft information-rich genomes, they are neither speculating nor guessing. The basic components of evolutionary theory are empirical facts. Genes really do mutate, sometimes leading to new functionalities. The process of gene duplication with subsequent divergence leads to the creation of information by any reasonable definition of the terms. Selection can string small variations together into directional change. On a small scale, this has all been observed. And if small increases in information are an empirical reality on human timescales, then what abstract principle of mathematics is going to rule out much larger increases on geological scales?
Then here come the ID [intelligent design] folks, full of swagger and bravado. They say the accumulated empirical evidence must yield before their back-of-the-envelope probability calculations and abstract mathematical modeling. Evolution should be abandoned in favor of the new theory of intelligent design. This theory states, in its entirety, that an intelligent agent of unspecified motives and abilities did something at some point in natural history. Not very useful.
In a larger context, one has to question whether highly technical issues such as calculations of probabilities have any place in a discussion of religion. Why attempt to "prove" God with probability, particularly when there are very serious questions as to whether such reasoning is valid? One is reminded of a passage in the New Testament: "For if the trumpet gives an uncertain sound, who shall prepare himself for the battle?" [1 Cor. 14:8]. It makes far more sense to leave such matters to peer-reviewed scientific research.
See DNA, English text, Information theory, Origin and Deceiver for additional related discussion.