Kevin L. O'Brien Responds to Dr Royal Truman

Dr. Royal Truman is a long-winded fellow, but in essence his essay comes down to two points: there is no indisputable evidence that mutation has ever produced an increase in information and there is no mechanism by which coded information can arise by chance. While I am certainly no expert in either information theory or abiogenesis, I shall nonetheless put on my best John Adams demeanor and pontificate dogmatically on these issues.

The first point is set up deliberately so that it can never be refuted. Any possible evidence will always be under dispute (at the very least by Dr. Truman), and the definition of information he has adopted demands that information be intelligently derived, so no mutation could ever produce an increase in information by definition, evidence notwithstanding. In my opinion, the best way to deal with this kind of argument is to counter with a definition of information that not only permits, but actually demands that mutations increase information.

This is where the Brooks/Wiley/Collier theory comes in. BWC theory demands three things of biological information: it must be physically real (both material rather than abstract and able to change the system), it must have nontrivial differences between microstates and macrostates, and its macrostates cannot define the microstates. A microstate is a state that might be occupied by a particular entity. For example, if you have a system consisting of a population of organisms, each organism is an entity and each genotype class possessed by that population could be a microstate. A macrostate is the distribution of the entities of a system over the microstates available to the system. For example, in our hypothetical population, the macrostate would be how many organisms possess which classes of genotypes available in that population.

Since BWC theory treats biological information as a closed system, it can be described by partial entropy functions, and since the entropy of a physical system is directly related to the macrostate of that system, these partial entropy functions can be used to measure the information content of the system, which in turn is a measure of its configurational complexity.

One such partial entropy function defines the maximum possible configurations of microstates, assuming an equal probability that any entity can occupy a particular microstate. This is known as the phase space, and is defined as

Hmax = N k log M

where Hmax is maximal entropy (and thus maximal information), N is the number of entities in the system, M is the number of microstates available to that system and k is a conversion factor that expresses the result in bits. For base-10 logarithms, k = 1/log 2, or approximately 3.322. If the resulting macrostate of the system allows an equiprobable distribution of all entities over all available microstates, then the entropy of the macrostate would be equal to Hmax. If, however, the macrostate is constrained so that there are different probabilities that an entity will occupy particular microstates, then the partial entropy function that measures the macrostate is
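As a quick sanity check on the formula above, here is a minimal Python sketch (the function name h_max is mine, chosen for illustration) that evaluates Hmax with base-10 logarithms and the exact bit-conversion factor k = 1/log 2:

```python
import math

K = 1 / math.log10(2)  # ~3.3219, converts base-10 logs to bits

def h_max(n_entities, n_microstates):
    """Phase-space (maximal) entropy in bits: Hmax = N k log M."""
    return n_entities * K * math.log10(n_microstates)

# 10 organisms, 3 genotype classes:
print(round(h_max(10, 3)))  # 16
```

Note that N k log M with this k is identical to N log2 M, which is why the factor converts the result into bits.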

Hobs = -N k sum(i) pi log pi

where Hobs is the entropy of the actual (observed) macrostate, sum(i) is the sum over all microstates available to the system and pi is the probability that a random entity will occupy microstate i. When every microstate is equally probable (pi = 1/M), Hobs reduces to Hmax.
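To make the comparison concrete, here is a hedged Python sketch (the function name h_obs and the sample distribution are mine) that computes the observed entropy of a constrained macrostate and shows that it falls below Hmax, while an equiprobable distribution recovers it:

```python
import math

K = 1 / math.log10(2)  # converts base-10 logs to bits

def h_obs(counts):
    """Observed entropy in bits: Hobs = -N k sum_i p_i log p_i,
    summed over the occupied microstates."""
    n = sum(counts)
    return -n * K * sum((c / n) * math.log10(c / n) for c in counts if c > 0)

# 10 organisms distributed unevenly over 3 genotype classes:
print(round(h_obs([5, 3, 2]), 1))  # 14.9 bits, below Hmax = 10 log2(3) ~ 15.8
# An equiprobable distribution (3 organisms, 3 classes) recovers Hmax:
print(round(h_obs([1, 1, 1]), 2))  # 4.75 bits = 3 log2(3)
```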

Going back to our hypothetical example, let's say we have a population of 10 organisms, and let's concentrate on only one locus. If only one allele (A) can occupy that locus, then the population has only one genotype class (AA), so its maximal entropy for that locus will be

Hmax = 10 k log 1 = 0 bits.

Now, let's assume that A duplicates and one copy mutates to a second allele, B. The population now has three genotype classes for that locus -- AA, AB, BB -- and the maximal entropy is now

Hmax = 10 k log 3 = 16 bits.

Now, let's assume that A duplicates again, and that one copy mutates into a third allele, C. There are now 6 genotype classes for that locus -- AA, AB, AC, BB, BC, CC -- and the maximal entropy is now

Hmax = 10 k log 6 = 26 bits.

Another duplication/mutation will give us 4 alleles with 10 genotype classes and a maximum entropy of Hmax = 33 bits. Another duplication/mutation will give us 5 alleles with 15 genotype classes and Hmax = 39 bits. Finally, another duplication/mutation will give us 6 alleles with 21 genotype classes and Hmax = 44 bits.
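The whole series can be reproduced in a few lines of Python. This is only a sketch of the arithmetic above: for a alleles at a diploid locus there are a(a+1)/2 genotype classes, and Hmax = N k log M with N = 10 and k = 1/log 2:

```python
import math

N = 10  # organisms in the hypothetical population

for alleles in range(1, 7):
    classes = alleles * (alleles + 1) // 2  # diploid genotype classes
    h = N * math.log2(classes)              # Hmax = N k log M, k = 1/log10(2)
    print(f"{alleles} allele(s): {classes:2d} classes, Hmax = {round(h)} bits")
```

Running this prints the sequence 0, 16, 26, 33, 39 and 44 bits, matching the worked example: each new allele enlarges the phase space, and Hmax rises accordingly.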

As you can see from this simple example, maximal entropy, and thus maximal information, increases with each mutation, as long as each mutation creates a new allele. And we haven't even addressed the question of whether the mutations, or their genotype classes, are beneficial, harmful or neutral. In this case, that distinction doesn't matter, because we have defined phase space simply by how many genotype classes there are, not by how many fit phenotypes are generated. In any event, the only way a mutation can decrease information in this case is by eliminating an allele, and while that does happen, it happens less frequently during an evolutionary scenario than the creation of new alleles.

It also doesn't matter whether the result of the mutation adds "information" to the new allele, deletes "information" or simply reworks it, because information in BWC theory is defined as a physical hierarchical array, not as a coded message. In other words, it makes no difference whether the new gene is longer or shorter than the old gene it was duplicated from, whether it is the result of a point mutation or a frameshift, or any number of other characteristics creationists usually point to when trying to describe "information"; all that matters is that it is one more allele, which in turn increases genotypic phase space, and information is defined in terms of genotypic phase space in this example.

To put it in more general terms, BWC theory defines biological information in terms of genetic phase space. As such, any mutation that increases genetic phase space, in any way, increases information, regardless of whether the actual change looks like an increase or a decrease in "information". So a mutation that removes a nucleotide from a gene sequence may appear to decrease "information", but if the mutation increases genetic phase space the result is an overall increase in information. Since an increase in genetic phase space represents an increase in information capacity, which in biological terms means an increase in diversity potential, it doesn't matter whether the mutation adds a new gene, deletes an existing gene, duplicates an existing gene without altering the copy, or reworks an existing gene: if the result is an increase in potential diversity, information has also increased. I leave it to people who know evolutionary biology better than I do to select examples of reworked or deleted genes that increase potential diversity.

The second problem is also defined in a way that makes refutation difficult, but the best way to refute it is to describe proteinoid microsphere research. Briefly, proteinoids are made by the thermal copolymerization of amino acids in a selective, nonrandom fashion. The amino acids already possess the structural and chemical information necessary to allow them to selectively polymerize into nonrandom, catalytically active polymers. In other words, proteinoids contain specified coded information, but the coding is based on the physicochemical nature of the amino acids. Microspheres are then able to use proteinoids as templates to make polynucleotides, which are themselves nonrandom because specific three-nucleotide sequences bind preferentially to specific amino acids. So the specified coded information in the proteinoids can be passed on to the polynucleotides. Finally, microspheres can create polypeptides using polynucleotides as templates, in a reversal of the process that formed the polynucleotides in the first place. As such, the specified coded information in the polynucleotides can be passed on to the polypeptides. Once microspheres could make their own polypeptides from polynucleotide templates, natural selection could take over and evolve a more efficient genetic coding and translation system.

In any event, the coding system used in modern cells is derived from the coding system inherent in the physicochemical nature of the amino acids.

Kevin L. O'Brien