Contains

2617 words

2617 words

Category

Other

Other

ABSTRACT: For the scientific community intelligent design represents creationism's latest grasp at scientific legitimacy. Accordingly, intelligent design is viewed as yet another ill-conceived attempt by creationists to straightjacket science within a religious ideology. But in fact intelligent design can be formulated as a scientific theory having empirical consequences and devoid of religious commitments. Intelligent design can be unpacked as a theory of information. Within such a theory, information becomes a reliable indicator of design as well as a proper object for scientific investigation. In my paper I shall (1) show how information can be reliably detected and measured, and (2) formulate a conservation law that governs the origin and flow of information. My broad conclusion is that information is not reducible to natural causes, and that the origin of information is best sought in intelligent causes. Intelligent design thereby becomes a theory for detecting and measuring information, explaining its origin, and tracing its flow.

BIOSKETCH: Bill Dembski has a Ph.D. in mathematics from the University of Chicago, a Ph.D. in philosophy from the University of Illinois at Chicago, and an M.Div. from Princeton Theological Seminary. Bill has done post-doctoral work at MIT, University of Chicago, Northwestern, Princeton, Cambridge, and Notre Dame. He has been a National Science Foundation doctoral and post-doctoral fellow. His publications range from mathematics to philosophy to theology. His monograph The Design Inference will appear with Cambridge University Press in 1998. In it he describes the logic whereby rational agents infer intelligent causes. He is working with Stephen Meyer and Paul Nelson on a book entitled Uncommon Descent, which seeks to reestablish the legitimacy and fruitfulness of design within biology.

1. INFORMATION

In Steps Towards Life Manfred Eigen (1992, p. 12) identifies what he regards as the central problem facing origins-of-life research: "Our task is to find an algorithm, a natural law that leads to the origin of information." Eigen is only half right. To determine how life began, it is indeed necessary to understand the origin of information. Even so, neither algorithms nor natural laws are capable of producing information. The great myth of modern evolutionary biology is that information can be gotten on the cheap without recourse to intelligence. It is this myth I seek to dispel, but to do so I shall need to give an account of information. No one disputes that there is such a thing as information. As Keith Devlin (1991, p. 1) remarks, "Our very lives depend upon it, upon its gathering, storage, manipulation, transmission, security, and so on. Huge amounts of money change hands in exchange for information. People talk about it all the time. Lives are lost in its pursuit. Vast commercial empires are created in order to manufacture equipment to handle it." But what exactly is information? The burden of this paper is to answer this question, presenting an account of information that is relevant to biology.

What then is information? The fundamental intuition underlying information is not, as is sometimes thought, the transmission of signals across a communication channel, but rather, the actualization of one possibility to the exclusion of others. As Fred Dretske (1981, p. 4) puts it, "Information theory identifies the amount of information associated with, or generated by, the occurrence of an event (or the realization of a state of affairs) with the reduction in uncertainty, the elimination of possibilities, represented by that event or state of affairs." To be sure, whenever signals are transmitted across a communication channel, one possibility is actualized to the exclusion of others, namely, the signal that was transmitted to the exclusion of those that weren't. But this is only a special case. Information in the first instance presupposes not some medium of communication, but contingency. Robert Stalnaker (1984, p. 85) makes this point clearly: "Content requires contingency. To learn something, to acquire information, is to rule out possibilities. To understand the information conveyed in a communication is to know what possibilities would be excluded by its truth." For there to be information, there must be a multiplicity of distinct possibilities any one of which might happen. When one of these possibilities does happen and the others are ruled out, information becomes actualized. Indeed, information in its most general sense can be defined as the actualization of one possibility to the exclusion of others (observe that this definition encompasses both syntactic and semantic information).

This way of defining information may seem counterintuitive since we often speak of the information inherent in possibilities that are never actualized. Thus we may speak of the information inherent in flipping one-hundred heads in a row with a fair coin even if this event never happens. There is no difficulty here. In counterfactual situations the definition of information needs to be applied counterfactually. Thus to consider the information inherent in flipping one-hundred heads in a row with a fair coin, we treat this event/possibility as though it were actualized. Information needs to referenced not just to the actual world, but also cross-referenced with all possible worlds.

2. COMPLEX INFORMATION

How does our definition of information apply to biology, and to science more generally? To render information a useful concept for science we need to do two things: first, show how to measure information; second, introduce a crucial distinction-the distinction between specified and unspecified information. First, let us show how to measure information. In measuring information it is not enough to count the number of possibilities that were excluded, and offer this number as the relevant measure of information. The problem is that a simple enumeration of excluded possibilities tells us nothing about how those possibilities were individuated in the first place. Consider, for instance, the following individuation of poker hands:

(i) A royal flush.

(ii) Everything else.

To learn that something other than a royal flush was dealt (i.e., possibility (ii)) is clearly to acquire less information than to learn that a royal flush was dealt (i.e., possibility (i)). Yet if our measure of information is simply an enumeration of excluded possibilities, the same numerical value must be assigned in both instances since in both instances a single possibility is excluded.

It follows, therefore, that how we measure information needs to be independent of whatever procedure we use to individuate the possibilities under consideration. And the way to do this is not simply to count possibilities, but to assign probabilities to these possibilities. For a thoroughly shuffled deck of cards, the probability of being dealt a royal flush (i.e., possibility (i)) is approximately .000002 whereas the probability of being dealt anything other than a royal flush (i.e., possibility (ii)) is approximately .999998. Probabilities by themselves, however, are not information measures. Although probabilities properly distinguish possibilities according to the information they contain, nonetheless probabilities remain an inconvenient way of measuring information. There are two reasons for this. First, the scaling and directionality of the numbers assigned by probabilities needs to be recalibrated. We are clearly acquiring more information when we learn someone was dealt a royal flush than when we learn someone wasn't dealt a royal flush. And yet the probability of being dealt a royal flush (i.e., .000002) is minuscule compared to the probability of being dealt something other than a royal flush (i.e., .999998). Smaller probabilities signify more information, not less.

The second reason probabilities are inconvenient for measuring information is that they are multiplicative rather than additive. If I learn that Alice was dealt a royal flush playing poker at Caesar's Palace and that Bob was dealt a royal flush playing poker at the Mirage, the probability that both Alice and Bob were dealt royal flushes is the product of the individual probabilities. Nonetheless, it is convenient for information to be measured additively so that the measure of information assigned to Alice and Bob jointly being dealt royal flushes equals the measure of information assigned to Alice being dealt a royal flush plus the measure of information assigned to Bob being dealt a royal flush.

Now there is an obvious way to transform probabilities which circumvents both these difficulties, and that is to apply a negative logarithm to the probabilities. Applying a negative logarithm assigns the more information to the less probability and, because the logarithm of a product is the sum of the logarithms, transforms multiplicative probability measures into additive information measures. What's more, in deference to communication theorists, it is customary to use the logarithm to the base 2. The rationale for this choice of logarithmic base is as follows. The most convenient way for communication theorists to measure information is in bits. Any message sent across a communication channel can be viewed as a string of 0's and 1's. For instance, the ASCII code uses strings of eight 0's and 1's to represent the characters on a typewriter, with whole words and sentences in turn represented as strings of such character strings. In like manner all communication may be reduced to the transmission of sequences of 0's and 1's. Given this reduction, the obvious way for communication theorists to measure information is in number of bits transmitted across a communication channel. And since the negative logarithm to the base 2 of a probability corresponds to the average number of bits needed to identify an event of that probability, the logarithm to the base 2 is the canonical logarithm for communication theorists. Thus we define the measure of information in an event of probability p as -log2p (see Shannon and Weaver, 1949, p. 32; Hamming, 1986; or indeed any mathematical introduction to information theory).

What about the additivity of this information measure? Recall the example of Alice being dealt a royal flush playing poker at Caesar's Palace and that Bob being dealt a royal flush playing poker at the Mirage. Let's call the first event A and the second B. Since randomly dealt poker hands are probabilistically independent, the probability of A and B taken jointly equals the product of the probabilities of A and B taken individually. Symbolically, P(A&B) = P(A)xP(B). Given our logarithmic definition of information we therefore define the amount of information in an event E as I(E) =def -log2P(E). It then follows that P(A&B) = P(A)xP(B) if and only if I(A&B) = I(A)+I(B). Since in the example of Alice and Bob P(A) = P(B) = .000002, I(A) = I(B) = 19, and I(A&B) = I(A)+I(B) = 19 + 19 = 38. Thus the amount of information inherent in Alice and Bob jointly obtaining royal flushes is 38 bits.

Since lots of events are probabilistically independent, information measures exhibit lots of additivity. But since lots of events are also correlated, information measures exhibit lots of non-additivity as well. In the case of Alice and Bob, Alice being dealt a royal flush is probabilistically independent of Bob being dealt a royal flush, and so the amount of information in Alice and Bob both being dealt royal flushes equals the sum of the individual amounts of information. But consider now a different example. Alice and Bob together toss a coin five times. Alice observes the first four tosses but is distracted, and so misses the fifth toss. On the other hand, Bob misses the first toss, but observes the last four tosses. Let's say the actual sequence of tosses is 11001 (1 = heads, 0 = tails). Thus Alice observes 1100* and Bob observes *1001. Let A denote the first observation, B the second. It follows that the amount of information in A&B is the amount of information in the completed sequence 11001, namely, 5 bits. On the other hand, the amount of information in A alone is the amount of information in the incomplete sequence 1100*, namely 4 bits. Similarly, the amount of information in B alone is the amount of information in the incomplete sequence *1001, also 4 bits. This time information doesn't add up: 5 = I(A&B) _ I(A)+I(B) = 4+4 = 8.

Here A and B are correlated. Alice knows all but the last bit of information in the completed sequence 11001. Thus when Bob gives her the incomplete sequence *1001, all Alice really learns is the last bit in this sequence. Similarly, Bob knows all but the first bit of information in the completed sequence 11001. Thus when Alice gives him the incomplete sequence 1100*, all Bob really learns is the first bit in this sequence. What appears to be four bits of information actually ends up being only one bit of information once Alice and Bob factor in the prior information they possess about the completed sequence 11001. If we introduce the idea of conditional information, this is just to say that 5 = I(A&B) = I(A)+I(B|A) = 4+1. I(B|A), the conditional information of B given A, is the amount of information in Bob's observation once Alice's observation is taken into account. And this, as we just saw, is 1 bit.

I(B|A), like I(A&B), I(A), and I(B), can be represented as the negative logarithm to the base two of a probability, only this time the probability under the logarithm is a conditional as opposed to an unconditional probability. By definition I(B|A) =def -log2P(B|A), where P(B|A) is the conditional probability of B given A. But since P(B|A) =def P(A&B)/P(A), and since the logarithm of a quotient is the difference of the logarithms, log2P(B|A) = log2P(A&B) - log2P(A), and so -log2P(B|A) = -log2P(A&B) + log2P(A), which is just I(B|A) = I(A&B) - I(A). This last equation is equivalent to

(*) I(A&B) = I(A)+I(B|A)

Formula (*) holds with full generality, reducing to I(A&B) = I(A)+I(B) when A and B are probabilistically independent (in which case P(B|A) = P(B) and thus I(B|A) = I(B)).

Formula (*) asserts that the information in both A and B jointly is the information in A plus the information in B that is not in A. Its point, therefore, is to spell out how much additional information B contributes to A. As such, this formula places tight constraints on the generation of new information. Does, for instance, a computer program, call it A, by outputting some data, call the data B, generate new information? Computer programs are fully deterministic, and so B is fully determined by A. It follows that P(B|A) = 1, and thus I(B|A) = 0 (the logarithm of 1 is always 0). From Formula (*) it therefore follows that I(A&B) = I(A), and therefore that the amount of information in A and B jointly is no more than the amount of information in A by itself.

For an example in the same spirit consider that there is no more information in two copies of Shakespeare's Hamlet than in a single copy. This is of course patently obvious, and any formal account of information had better agree. To see that our formal account does indeed agree, let A denote the printing of the first copy of Hamlet, and B the printing of the second copy. Once A is given, B is entirely determined. Indeed, the correlation between A and B is perfect. Probabilistically this ...

Dissecting Education Think about how much of your life is spent trying to learn all you can and make yourself better prepared for the ?real world.? We start schooling at age five or six. Kindergarten is about finger paints and learning the alphabet. Before we know it, we are standing in front of our class and parents accepting a high scho...

_John and Julie, your two best friends, have just read an article about the death penalty. It explains the reasons why death by lethal injection is a legitimate punishment for certain crimes. As Julie reads the article, she strongly agrees with what the author has to say. 'An eye for an eye, a tooth for a tooth,' she imagines. Without examin...

A "meme" is an idea which can be passed from person to person, the basic unit of cultural transmission. Memes are often analogized to genes, as self replicating information patterns; or to viruses, organic or computer, which prosper by infecting new hosts (in this case human minds) with copies of themselves. The term was invented by Richard Dawkin...

The Bushmen of the Kalahari Desert live in a culture much different from ours. They live with no crime, no law, and no police. In their culture nothing is bad nor evil and they are relatively gentle people. They have no sense of ownership and their values are sharing and kindness to all. The Bushmen live by no time and go by no calendars....

Freud's Framework of Dreams The purpose of dreams is to sleep. Dreams also represent wish fulfillment. A dream is the result of a mediation or compromise between these two opposing forces. Freud defines the manifest content of dreams as that which an individual is dreaming in a particular dream, its primarily apparent subject matter. In co...