"Let the whole outside world consist of a long paper tape…"
John von Neumann 1948
Thought experiments are the metaphysician's primary tool. Limited only by logic, we can test metaphysical hypotheses at any possible world we can imagine. But imagination has its limits.
Can we imagine two possible worlds, each entirely empty except for a single object, the difference being only that one object is stationary whereas its counterpart moves?
In possible worlds containing only two particles perpetually moving relative to each other but never colliding, must there be a determinate answer to what would happen if they did collide?
If the actual world were changed at the moment of the Big Bang just enough so that Romney would have won in 2012, would his opponent still have been Obama?
Much can hang on how we answer questions like these but answers are not easy. The possible worlds in question seem either too poorly furnished or too cluttered with bewildering detail for imagination or argument to get a grip.
In this post I want to introduce a new device for thinking about metaphysical problems: a kind of philosophical workbench on which we can test competing metaphysical theories. I call these devices "Turing Worlds".
In this post I am not going to use Turing Worlds to advance any very radical metaphysical thesis. Instead I'm going to show how some well-known problems play out in these worlds. Think of it as an exercise in calibrating a new tool against known quantities.
Turing Worlds
Roughly: a Turing World is a possible world whose workings are entirely comprehensible as the operation of a Turing Machine.
I will assume all readers have got the gist of Turing machines. Here is a particularly nice example.
The video shows an actual machine. That is, it shows a machine embedded in the actual world and governed by actual physics. This requires a lot of hidden electronics behind the scenes, which we would have to understand to really understand how the machine works. It is a superficially simple machine with complex inner workings, embedded in our complex world.
To build a Turing world we want to abstract away as much of the complexity as we can. So, to a first approximation, we can think of a Turing world -- as von Neumann instructs -- as a tape: infinite in length, divided into cells, each cell occupied by a zero or a one.
Figure 1
At any given moment there will be one cell whose contents are directly relevant to what happens next. Intuitively, this is the cell that is under the scanner but we can do without the apparatus of the scanner itself: in my illustration I have substituted a blue glow.
The scanner position changes over time; moving left or right relative to the tape but never moving more than one cell at a time. Before the scanner moves it prints either a "1" or "0" on the selected cell. Sometimes this changes the contents of that cell, sometimes not. Sometimes the scanner prints a symbol but does not move.
Figure 1 depicts the sort of world von Neumann described but -- with all due respect to that great man -- there must be more to a computable world than what we see here. In a Turing machine what happens next doesn't just depend on what symbol is under the scanner; it also depends upon the machine's computational state. Computational states are features of the world that vary over time in systematic ways that can result in something different happening even when the same symbol is at the scanner.
To build Turing worlds we will have to give them computational states. It is easy to imagine what the computational state of a Turing world might be-- think of the kind of hidden machinery that drives the machine in the video. It is far less obvious what a computational state metaphysically must be. For the purposes of this post I am going to assume that the computational state of a world at any moment must at the very least correspond to some intrinsic property of something in that world at that time; some property that changes over time.
Staying as neutral as I can about what those properties might be, I'll build this into my illustrations of Turing worlds with a simple label, as in Figure 2, illustrating worlds in states A and B.
Figure 2
With states in the picture we can describe what happens in a Turing world in terms of the transition rules of that world. Here is a table of transition rules for a very simple Turing world.
Table 1
The state diagram on the right describes the same transitions as the table on the left. The machine/world depicted here has two states, 'A' and 'B'. The table tells us, for example, that if the machine is in state 'A' when a '1' is under the scanner, it will do '0→B': that is, it will print a '0', move the tape to the right, and move to state B. The '↓' symbol means the tape does not move.
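For readers who like to see the gears turn, here is a minimal sketch, in Python, of the kind of stepper that animates a Turing world. The encoding is mine, and since I can't reproduce all of Table 1 here, the table below is a stand-in: only the ('A', 1) entry is the one described above; the other three entries are illustrative assumptions.

```python
# A transition table maps (state, scanned_symbol) to a triple
# (symbol_to_print, tape_move, next_state), where tape_move is 1 when
# the tape slides one cell to the right and 0 when it stays put ('↓').

# Stand-in table: only the ('A', 1) entry comes from the description
# of Table 1 above; the rest are hypothetical.
TABLE = {
    ('A', 0): (1, 0, 'B'),   # print '1', tape stays, go to B
    ('A', 1): (0, 1, 'B'),   # '0→B': print '0', tape right, go to B
    ('B', 0): (1, 0, 'A'),   # print '1', tape stays, go to A
    ('B', 1): (1, 1, 'A'),   # print '1', tape right, go to A
}

def step(table, tape, pos, state):
    """Apply one transition: print on the scanned cell, move the tape,
    change state. The tape is a dict from cell index to symbol; cells
    never written hold '0'. Indices grow to the right, and a rightward
    tape move puts the scanner over the next cell to the left."""
    print_sym, move, next_state = table[(state, tape.get(pos, 0))]
    tape = dict(tape)                 # copy so earlier moments survive
    tape[pos] = print_sym
    return tape, pos - move, next_state

def run(table, tape, pos, state, steps):
    """Return the history [(tape, pos, state), ...] of a run."""
    history = [(tape, pos, state)]
    for _ in range(steps):
        tape, pos, state = step(table, tape, pos, state)
        history.append((tape, pos, state))
    return history
```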
The transition table for an actual Turing machine like the one in the video gives an abstract description of its workings, but the reason the actual machine instantiates one table rather than another -- indeed, the reason it works at all -- depends on its underlying machinery. Tinkering with that machinery can change the table that describes it.
Which brings us to the signal difference between a Turing World and a world that merely has a Turing machine in it. I stipulate as a matter of definition:
(D1) The transition table that describes a Turing World must be entailed by the physical laws of that world.
One way of framing this requirement would be to say that the transition rules of a Turing world are its laws. I myself am happy with this way of putting things, but this formulation might be contentious.
There are metaphysical theories that require laws, properly so called, to be rooted in particular kinds of intrinsic properties of things, or of universals, or in relations between universals. Whether or not that is so is a question that I think Turing Worlds might help us figure out. But that is a good reason not to beg such questions from the outset. Better, instead, to say, more cautiously, that the transition table of a Turing machine describes certain regularities and stipulate— as (D1) does — that a Turing World corresponding to a particular transition table will be a world where those regularities are nomological. That leaves us free to argue later on about precisely what 'nomological' might mean.
While (D1) is neutral on the question of whether Turing transition rules could be the basic laws of any world, I will continue to depict the ontology of Turing worlds in the austere manner of Figure 2 and talk as if their transition tables exhaust their physics. If this leads to disagreement down the line we will have learned something about the upshots of our different assumptions about the laws.
I take it as uncontroversial that there can be, and in fact are, physical processes which are (a) describable in computational terms and which (b) proceed as they do as a matter of physical law. The encoding of proteins by DNA is an obvious example.
With these caveats in place we should all be able to agree that Turing Worlds are metaphysically possible. They are real possible worlds in whatever sense possibilities are "real". That means that generalizations over possible worlds of the sort that metaphysicians are wont to give ought to apply to Turing Worlds, and we can test metaphysical hypotheses by seeing if we can construct Turing Worlds that refute them.
Laws and Initial Conditions
Consider then the world w0 which is governed by the laws of Table 1.
The illustration shows the evolution of this world over time, centering on the position of the scanner. To make the figure easier to read I've made the freshly printed numerals a little darker (think of it as the ink still being wet) and I've added arrows to show the motion of the tape. These are for illustration only and needn't be thought of as part of the world itself. (Not so the states labeled in the circles, which, as I've said, I take to be intrinsic properties of the world itself.)
In the illustration we see only the first few moments in the life of w0 but it is clear enough what its future will be: every second step the scanner will add a '1' to its right, ad infinitum. This world is a machine for converting zeros to ones. That this is so is the upshot of the laws of w0 (described in Table 1), but also of its initial conditions. If we use the same laws but different initial conditions, we get very different results.
Where w0 stutters out ones to the right of the scanner, w1 streams out zeros ad infinitum; a massive difference arising from the smallest change in initial conditions.
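With the stepper sketched above -- and the same caveat that its table is a stand-in, not necessarily Table 1 itself -- the point is easy to see on screen: run the same laws from two initial tapes differing in a single cell and watch what streams out past the scanner.

```python
# Same laws, different initial conditions: one run starts with '0'
# under the scanner, the other with '1'; every other cell holds '0'.
for label, start in (("w0-style", 0), ("w1-style", 1)):
    tape, pos, state = run(TABLE, {0: start}, 0, 'A', 12)[-1]
    # Show the stretch of tape that has streamed out past the scanner.
    print(label, state, [tape.get(pos + o, 0) for o in range(1, 9)])
```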
Worlds w0 and w1 illustrate how different worlds can be even if they have the same laws. They also exhibit the difference between laws and mere regularities. At w0 and w1 both (R1) and (R2) hold without exception and forevermore.
(R1) A '1' at the scanner is always followed by a '0' at the scanner.
(R2) A '0' at the scanner is always followed by a '1' at the scanner.
But look at the first few moments of w3, which is also governed by Table 1.
(R1) fails at w3. (R2) is preserved, as it will be for any world governed by Table 1. (R2) is nomological given those laws, but (R1) is only an accidental regularity of w0 and w1.
Counterfactuals
Now I have just been speaking subjunctively about Turing Worlds. I said that w1 is what w0 would look like if its initial conditions had been different in a certain way. Because the workings of Turing worlds are so simple, the upshots were clear. That clarity makes Turing worlds a particularly good venue for testing more complicated questions about the nature of counterfactual reasoning.
All of us -- at least all of us who are prepared to talk about "possible worlds" -- agree about how to evaluate counterfactuals of the form:
if ANTecedent were the case then CONsequent would have been the case.
ANT > CON
We suppose that there is a way of ordering worlds by comparative similarity against which we may say:
(C) ANT > CON is true at w0 iff either no world makes ANT true, or there is at least one world at which (ANT & CON) is true that is more similar to w0 than any world at which (ANT & ~CON) is true.
Which we can gloss in the standard way by saying ANT > CON is true if CON is true at "the closest" ANT world.
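As a toy, (C) can even be coded up, given a finite set of candidate worlds and some closeness score. (Where the score comes from is, of course, the whole question; the function below just takes it as given.)

```python
def would(ant, con, worlds, closeness):
    """Toy evaluation of ANT > CON over a finite set of candidate
    worlds. `ant` and `con` are predicates on worlds; `closeness(w)`
    is larger for worlds more similar to the world of evaluation.
    On a finite set, (C) comes to: true iff there are no ANT worlds
    at all, or CON holds at every maximally close ANT world."""
    ant_worlds = [w for w in worlds if ant(w)]
    if not ant_worlds:
        return True                      # vacuously true
    best = max(closeness(w) for w in ant_worlds)
    return all(con(w) for w in ant_worlds if closeness(w) == best)
```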
(C) itself, as I said, is not controversial. Controversy ensues as soon as we try to say what the standards of similarity should be. Here are a couple of plausible measures.
MINIMIZE MIRACLES
w1 is closer to w0 than w2 if fewer and/or smaller miracles (according to the laws of w0) distinguish w1 from w0 than distinguish w2 from w0.
MAXIMIZE OVERLAP
w1 is closer to w0 than w2 if w1 matches w0 in matters of particular fact over a larger spatio-temporal region than w2.
These are not the only possible candidates and even these are not mutually exclusive. Different measures of closeness can conflict and when they do a theory of counterfactuals must tell us which measure should dominate.
Turing worlds can help us sort this out. Since we know the laws and everything is digitized, we will have no trouble identifying, and even counting, miracles, as the sketch below shows.
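Given the stepper from earlier -- and, again, its stand-in table -- the accounting is mechanical: a miracle is just a step of a recorded history that disagrees with what the table dictates.

```python
def count_miracles(table, history):
    """Count the steps of a history that violate the table's rules:
    moments where the next configuration is not the one the lawful
    `step` function would have produced."""
    return sum(step(table, *before) != after
               for before, after in zip(history, history[1:]))
```

Before we weigh miracles against overlap, though, we will have to think a little bit more about what counts as similarity in matters of particular fact.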
Similarity at a Time
People sometimes worry that in speaking of "similarity" metaphysicians are appealing to a notion which is vague, subjective or arbitrary. It need not be so at Turing worlds. At any given moment, Turing worlds can differ in only two respects: in the distribution of symbols on the tape or in their computational state. Thanks to Information Theory we have a variety of rigorous ways to characterize the similarity of strings of symbols. For our present purposes, the simplest measure will do: we can just count the number of places at which the two world-tapes hold different symbols (the "Hamming distance").
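Here is a minimal sketch, restricted to a finite window of the infinite tape and lining the two worlds up by their scanner positions (both choices are mine; nothing in the definition forces them):

```python
def hamming(world_a, world_b, window=100):
    """Count the cells, indexed by offset from each world's scanner,
    at which two world-tapes hold different symbols, within a finite
    window of the (infinite) tape."""
    (tape_a, pos_a, _), (tape_b, pos_b, _) = world_a, world_b
    return sum(tape_a.get(pos_a + o, 0) != tape_b.get(pos_b + o, 0)
               for o in range(-window, window + 1))
```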
Dissimilarities resulting from differences in computational states and their upshots are more complicated. Compare these three worlds.
Figure 3
The first two worlds of Fig 3 are alike in respect of the contents of the tape but differ in computational state. The second two agree in computational state but differ in the contents of the tape. Which is most similar to which?
The answer, of course, is that it is going to depend on what a difference in state amounts to and that will depend upon what the world's underlying machinery looks like.
In the case of the Turing machine in the video above, the computational state of the machine we can see is probably a matter of the distribution of electrons in a register of the hidden CPU that makes the whole thing work. A difference in that state would involve a much smaller change in the world than erasing and redrawing a one or a zero on the visible tape.
On the other hand, we can also imagine machines whose underlying machinery is such that a change of state involves grosser changes. Consider, for example, the wonderfully ingenious machine in the video below.
In this actual machine it seems to me that, ceteris paribus, the amount of difference made by a change in computational state and by a change in the contents of the tape is exactly the same: in either case it comes down to the position of a single ball bearing in a metal grid.
And I think we could go farther. It seems to me that there are good reasons to think that for any given tape and table we could always construct background machinery such that changes of state always make for more dissimilarity than swapping symbols on the tape.
To see this, reflect that any description of a world in terms of tape and table may be satisfied by any number of worlds with underlying differences. For example, the hidden machinery that makes a world like w0 tick could be another Turing machine -- perhaps a universal Turing machine -- running its own program on a hidden tape of its own. That would make the states and transitions of Table 1 virtualizations of comparable but more complicated states and transitions in the underlying machine. If we chose our background machine carefully enough, I conjecture that we could always build one such that the relevant changes in virtual state required changing more than one symbol on the tape of the underlying machine.
In other words, I conjecture that among any set of worlds describable by a particular tape and table there will be worlds which satisfy this constraint:
(D2) Turing worlds that differ only in the occurrence of a '1' or a '0' in a single cell at a given time are more similar, at that time, than worlds that differ in any other respect.
Which I stipulate as a second defining condition on Turing worlds (and on similarity). The effect of this assumption is to allow us to treat the first two worlds in Figure 3 as differing more from one another than the bottom two do.
Criteria of Similarity
This stipulation about how to measure similarity of Turing worlds at particular times does not, I think, beg any outstanding questions about counterfactuals. But neither does it resolve them. The controversial questions are about how to measure counterfactually relevant similarity across time.
Here, for example, is a way of measuring similarity that some have found plausible:
PRESERVE THE LAWS
In comparing worlds for similarity:
It is of first importance to Minimize Miracles.
It is of second importance to Maximize Overlap.
The minimum number of miracles is no miracles at all. So, on this account, if ANT is false in the actual world -- and the actual world is deterministic -- then the closest ANT world would be one in which everything that happened was in accordance with our laws but in which the initial conditions were just different enough to make ANT true.
The standard argument against PRESERVE THE LAWS is epistemological. Suppose the counterfactual in question is:
(1) If Romney had won in 2012, he would have raised the debt ceiling.
According to PRESERVE THE LAWS, to decide if (1) is true we should look for worlds which are just different enough from ours at the moment of the Big Bang to bring it about that Romney wins, and observe his economic policies there. But while reasonable people can disagree about the truth of (1), no sensible person could claim to have any idea how the entire history of the universe could have been changed to lawfully bring about a Romney victory.
That is a good argument, but Turing worlds let us make a different one. In a Turing world we can know exactly what would have happened had initial conditions been different so we can test PRESERVE THE LAWS directly.
In our diagrams we can identify any position on the tape by its offset o (plus or minus) from the scanner at a time t, and so refer to any cell with the pair [t,o]. Let us write 'V(t,o)' to say that at time t the cell offset by o from the scanner has the symbol '1'. We will write '~V(t,o)' to mean that cell [t,o] contains a '0'.
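The notation translates directly into code against the histories produced by the run() sketch above (I index times from 1 to match the [t,o] notation):

```python
def V(history, t, o):
    """True iff at time t the cell offset o from the scanner holds '1'."""
    tape, pos, _state = history[t - 1]       # times are 1-indexed
    return tape.get(pos + o, 0) == 1
```

So we can say the difference between w0 and w1 results from the fact that: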
~V(1,0) at w0
V(1,0) at w1.
Since the only difference between w0 and w1 is a single symbol, (D2) guarantees us that no world is closer to w0 than w1. So it seems then that w1 verifies:
(2) V(1,0) > ~V(3,3)

(2) comes out true because ~V(3,3) is the case at w1 and we regard w1 as the way things would have gone at w0 if its initial cell had been different.
This conclusion is consistent with PRESERVE THE LAWS since w1 has no miracles relative to w0. But counterfactual (2) is a special case: its antecedent is about the initial conditions of w1. What about a case in which the antecedent is tied to a later time? Say:
(3) ~V(2,1) > ~V(3,0)
Well, we already know that there is at least one world at which nothing miraculous happens and at which cell [2,1] is '0': it is w1. At w1 cell [2,1] is a '0' and so is cell [3,0]. According to PRESERVE THE LAWS, then, we should conclude that (3) is true.
But we don't. To see this, remember that in a Turing world the contents of any one cell can make a difference to the future contents of any other only when that cell passes under the scanner. The rules of Table 1 never move the tape to the left. So once a cell has passed the scanner it will never pass under it again, and subsequent changes to it cannot make a difference to any other cell. Cell [2,1] is to the right of the scanner, so its having different contents at t2 wouldn't make any difference to [3,0]. Which is another way of saying that (3) is false at w0.
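We can even check the key step mechanically, using the stepper and stand-in table from earlier: flip the cell one square to the right of the scanner at t2 into a '1', run the laws forward, and confirm that the scanned symbol at t3 is untouched. (Remember that the table, and hence this check, is an illustrative stand-in, not Table 1 itself.)

```python
h = run(TABLE, {0: 0}, 0, 'A', 3)                 # a lawful history
tape2, pos2, state2 = h[1]                        # configuration at t2
tape2 = dict(tape2)
tape2[pos2 + 1] = 1 - tape2.get(pos2 + 1, 0)      # the flip at [2,1]
h_flipped = run(TABLE, tape2, pos2, state2, 1)    # evolve one step
print(V(h, 3, 0) == V(h_flipped, 2, 0))           # True: [3,0] unaffected
```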
So PRESERVE THE LAWS must be wrong.
Lewis
In his seminal work on counterfactuals, David Lewis seemed to be arguing for this standard of similarity:
PRESERVE THE FACTS
In comparing worlds for similarity
It is of first importance to Maximize Overlap.
It is of second importance to Minimize Miracles.
Critics objected that this wouldn't work because it seemed to entail the falsity of more or less every counterfactual of the form,
Small difference in facts at t1 > Big difference in facts at t2.
These would almost always turn out false on PRESERVE THE FACTS, since worlds in which the difference in the facts at t2 is small will always be more like the actual world than worlds in which the subsequent difference is big. And yet we do think that there are many occasions when a small difference in fact would make for large differences later on.
Kit Fine offered the example of Richard Nixon in the darkest hour of his presidency. Assuming the nuclear launch system was in perfect working order at the time, it seems correct to say:
(4) If Nixon had pushed the button, there would have been a holocaust.

That is to say, we think the world closest to ours at which the button was pushed would be very, very different from the actual world. Yet consider a world at which Nixon did push the button but then some small miracle interfered with the machinery and kept the missiles from flying. Such a world has more miracles in it than one in which the missiles fly (it will take one miracle to get him to push the button and another to nullify its effects). But PRESERVE THE FACTS says that minimizing miracles is of secondary importance to maximizing overlap; and the future of the world at which the mechanism miraculously fails is vastly more similar to the actual world than one laid waste by nuclear war.
So PRESERVE THE FACTS would tell us (4) was false and that is the wrong answer.
One way Lewis might have answered Fine's objection would be to revise PRESERVE THE FACTS in some way that would rule out worlds with miracles after the antecedent was (miraculously) made true. Something like this has become the Standard Model; the one that most philosophers use when they want to put the logic of counterfactuals to serious work.
THE STANDARD MODEL
In comparing worlds for similarity:
It is of first importance to Minimize Miracles after (or just before) the time ANT becomes true.
It is of second importance to Maximize Overlap up to (or just before) the time ANT becomes true.
Jonathan Bennett has argued persuasively that this is, in fact, the criterion that comes closest to the way folks actually go about assessing counterfactuals.
But Lewis rejected this solution. He didn't want to build an explicit reference to time into the analysis of counterfactuals because he wanted an account of counterfactuals that would allow him to analyze event causation in counterfactual terms.
So far as we can tell, causation only operates forwards in time, but this seems to be a contingent fact. If causation is rooted in counterfactual dependence, then it had better be a merely contingent fact that the future depends counterfactually on the past in ways that the past doesn't depend on the future. But building an explicit distinction between events prior to and after the antecedent into the analysis of counterfactuals threatens to make the direction of counterfactual dependence non-contingent and hence (arguably) would defeat the hope of explaining causation in counterfactual terms.
So in "Counterfactual Dependence and Time's Arrow", Lewis proposed a different, temporally neutral, criterion.
NO BIG MIRACLES
In comparing worlds for similarity
It is of the first importance to avoid Big Miracles.
It is of the second importance to Maximize Overlap.
It is of the third importance to Preserve the Laws.
It is of the fourth importance to maximize the region of Approximate Similarity.
This, Lewis argued, answered Fine's objection because once Nixon pushed the button it would take a Big Miracle to erase the effects of his act. True, it might not take much to stop the missiles from flying -- a single burnt fuse might do -- but then, among many other things, that burnt fuse, and the heat and light its burnout generated, would be left around forevermore as traces of the first and second miracles; traces making for a perpetual difference to many aspects of the universe. Perhaps not differences that we care about, but differences nevertheless and maybe, down the road, very big differences. So to preserve exact overlap -- to erase all traces of the button pushing -- there would have to be many other miraculous happenings to cover all those traces, and all those miracles, Lewis argued, would add up to a big miracle.
Lewis thought that in a deterministic world it would always require a big "convergence" miracle to cover the traces of any divergence. He called this the "Asymmetry of Miracles": at "worlds like ours" a small miracle can produce massive divergence, but it will always take a large miracle to produce convergence.
In response, Lewis's critics have constructed ingenious counterexamples, set in worlds very much like our own, to argue that this need not be so. But we don't need to be at all ingenious to see that the Asymmetry of Miracles thesis doesn't hold at Turing Worlds.
We saw in comparing w0 with w1 that replacing the initial zero with a one under the scanner could make a very large difference: the difference between an infinite run of '0's versus an infinite run of '1's. So look at w4.
w4 illustrates, we can all agree, what would have happened at w0 if cell [2,0] had been '1'. You can think of the appearance of this symbol on this square as Nixon's pushing the button, and the string of zeros that ensues as the resulting holocaust. After time t2, w4 diverges drastically from w0.
And yet it would not have taken a big miracle to produce convergence. Thus w5.
At w5 a miracle at t3 -- no bigger than the one at t2 -- puts things back on track. The trace of the t3 miracle persists forever, but the area of overlapping perfect similarity between w5 and w0 will also expand forever, so that w5 overlaps w0 to an infinitely larger extent than w4 does.
So we can use Turing worlds to show that convergence miracles need not be "big". More interestingly, we can show how to get convergence with no miracles at all.
Bennett Worlds
Take a look at w6.
In w6, as in w4, cell [2,0] contains a '1' where w0 has a '0'. But w6's future is almost exactly the same as that of w0, and w6 contains no miracles, big or small. World w6 does not perfectly match either the future or the past of w0 -- it has different initial conditions -- but in respect of the size of the region of perfect overlap, that being infinitely large, it beats w4 for similarity by Lewis's measure. Moreover, w6 differs from w0 in only one cell after t2, making it (according to D2) at least as similar to w0, from that time forward, as any other world.
World w6 is an example of what Lewis called a "Bennett World". Jonathan Bennett had challenged NO BIG MIRACLES by asking why there couldn't be worlds with convergence but no miracles. Thus, think of the Nixon example: might there not be a world with no miracles but with a past just different enough from ours to ensure that Nixon lawfully pushed the button and such that the difference also lawfully ensured that the button pushing made little or no difference to the subsequent course of events?
Lewis thought not. His argument turned on a principle Bennett called 'Amplify':
(Amplify) Two determinist worlds that are somewhat like ours and are slightly different at some time become greatly unalike at later times.
w6 illustrates that Amplify is false at Turing worlds. w6 and w0 end up more alike after t3 than before.
So why do we count w4 as closer to w0 than w6? Why do we think w4 describes how things would have gone if V(2,0) had been true? The answer seems to be that w4 has greater overlap -- perfect identity -- with w0 than w6 does before t2, never mind what comes after. Which is to say that we take Preserving the Facts before the antecedent to be more important than preserving them after.
This is what the Standard Model assumes though, of course, to put things this way appeals to precisely the temporal element Lewis was trying to avoid.
Schaffer
Jonathan Schaffer has proposed that Lewis's hopes for an atemporal account of similarity might be saved from counterexamples involving convergence miracles by supplementing it with an explicit appeal to causal independence (whatever wreck this might make of the hope of analyzing causation in terms of counterfactual dependence).
Schaffer's Causal Model
In comparing worlds for similarity
It is of the first importance to avoid Big Miracles.
It is of the second importance to Maximize Overlap of those regions causally independent of whether or not the antecedent obtains.
It is of the third importance to Preserve the Laws.
It is of the fourth importance to maximize Approximate Similarity in those regions causally independent of whether or not the antecedent obtains.
Schaffer's thought was that if we ignore those many matters of fact that causally depend on Nixon's pushing the button we will exclude worlds with convergence miracles because they will restore similarity only in the matters of fact we are ignoring but still make for extra differences elsewhere.
World w6 shows how Bennett worlds can thwart Schaffer's strategy. When Schaffer talks about what ranges of fact "causally depend" on the truth of ANT, he is not assuming, for example, that the fact that Nixon did not push the button was somehow, all by itself, causally sufficient for, e.g., Gerald Ford's swearing in -- facts about Ford's history and actions also played a causal role. Neither does Schaffer think that pushing the button would have been sufficient, all by itself, to cause a holocaust -- the missiles had to fly, to detonate, and so on. What Schaffer's test invites us to set aside are the causal consequences of ANT assuming that all other causally relevant facts are held constant, that is: assuming no other miracles. He assumes, in other words, that any change in those other facts would require additional miracles, and that changes sufficient to produce convergence would require big miracles.
That assumption ignores the possibility of Bennett Worlds. As world w6 illustrates, at a Bennett World non-miraculous differences can swamp the causal difference made by ANT. When we take Bennett worlds into account, the region causally dependent on the truth of the antecedent looks very different. Figure 4 marks with 'X's the region that depends only on the contents of [2,0]. The value of every blank square is causally independent of the value of [2,0] if we are allowed to tinker appropriately with the initial conditions. That is, the values of all the other cells can be changed or preserved -- whatever the value of [2,0] -- by changing initial conditions. If, as Schaffer's criterion requires, we ignore the differences in the "X"d-out region, we still find that w6 is more similar to w0 than w4. Indeed, ignoring that region, w6 differs in only one cell from w0 at every moment after t2.
So why, nonetheless, don't we count w6 as more similar to w0 than w4? That is, why do we suppose that w6 does not show how things would have been if the contents of cell [2,0] had been different?
Again, the obvious answer lies in the differences between w6 and w0 before t2, the time the antecedent becomes true. The identity of w4 and w0 before t2 weighs more heavily than their differences after. That, again, is precisely what the Standard Model predicts.
Worlds like ours?
So far our discussion has been an argument for the Standard Model. I warned you ahead of time not to expect much philosophical novelty in this post. I aim to sell you on the methodological value of Turing Worlds, not any particular thesis about counterfactuals.
In fact, I do not think the Standard Model is correct and I think philosophers’ standard approaches to all the topics we have been considering need to be rethought. I plan to make those arguments in future posts, but many of those arguments will be staged on Turing Worlds and I want to close this post by considering objections to this new apparatus.
In "Counterfactual Dependence and Time's Arrow" Lewis explicitly restricts his claims to "worlds like ours". I expect that some readers have been wanting to protest that Turing Worlds are not sufficiently like ours to be relevant to the issues at hand.
My first response is to point out that this objection has no force at all unless the person who makes it is prepared to say in what relevant respects Turing worlds are different from the worlds that Lewis took as his domain.
As I read Lewis, the relevant features he had in mind were the direction of causation and the direction of counterfactual dependence. But in those respects the Turing worlds we considered are like the actual world: in them causation is temporally forward and the future counterfactually depends on the past in ways that the past does not depend on the future.
Of course, the Turing worlds we considered were deterministic and, it seems, the actual world is not. But we can describe probabilistic Turing machines and all of the arguments we have offered here can be recast mutatis mutandis to similar effect. That too I will leave to a later post.
One complaint may be that Turing Worlds are just too simple. It is very easy to see that principles like Lewis's "Amplify" are false at Turing worlds because the causal downstream of any miracle is narrow and easily blocked by swapping an appropriate zero for a one. The ease of this counter-example is likely to provoke muttering in some circles about "the benefits of theft over honest toil". But muttering is not argument. Simplification is only oversimplification when something is left out. If the objector can explain what metaphysically relevant feature of the actual world is left out of Turing worlds we will all have learned something.
Of course, as a methodological principle, when we are doing metaphysics we should refrain from talking about worlds that are befuddlingly different from our own. Are there worlds where rain puddles are sentient? I don't know. I don't understand the question and wouldn't know how to argue with someone who thought he had a clear grip on the truth conditions of counterfactuals like "if rain puddles were sentient then …."
But Turing worlds are not like that. Indeed, there is a sense in which we understand the workings of Turing Worlds better than we understand the workings of our own. We understand their laws well enough to predict and explain everything that happens in them. And they are "realistic" in the sense that we can build real-world models of them (and post videos of them on YouTube!), and in the real-world models of w0 all the counterfactuals we have asserted about w0 are actually true.
And now is the appropriate place to recall Wheeler's famous speculation that the actual world is a Turing world.
" . . . one enormous difference separates the computer and the universe--chance. In principle, the output of a computer is precisely determined by the input . . . . Chance plays no role. In the universe, by contrast, chance plays a dominant role. The laws of physics tell us only what may happen. Actual measurement tells us what is happening (or what did happen). Despite this difference, it is not unreasonable to imagine that information sits at the core of physics, just as it sits at the core of a computer. Trying to wrap my brain around this idea of information theory as the basis of existence, I came up with the phrase "it from bit." The universe and all that it contains ("it") may arise from the myriad yes-no choices of measurement (the "bits"). Niels Bohr wrestled for most of his life with the question of how acts of measurement (or "registration") may affect reality. It is registration--whether by a person or a device or a piece of mica (anything that can preserve a record)--that changes potentiality into actuality. I build only a little on the structure of Bohr's thinking when I suggest that we may never understand this strange thing, the quantum, until we understand how information may underlie reality. Information may not be just what we learn about the world. It may be what makes the world." John Wheeler 1998
And Wheeler's "It From Bit" should remind us that there is another sense in which Turing Worlds are not at all simple. Thanks to figures like Turing, Church, Shannon and von Neumann we know that some of the most daunting questions of physics and mathematics can be addressed in computational terms. Information theory, I contend, holds the same promise for metaphysics.
We saw how Turing worlds allow us to replace loose talk about miracles "big" and "small" with a strict accounting of zeros and ones. But that barely begins to tap the ways in which Information Theory can enable us to think rigorously about notions like similarity, simplicity and order.
At Turing worlds we can, at our whim, reverse the direction of time's arrow.
We can track individual token cells, or their contents, across time and across worlds, to explore questions about their identity across time and across worlds.
Thanks to the work of figures like Kolmogorov and Solomonoff we can quantify the entropy and order of Turing Worlds at particular times and over time. We can speak of the simplicity of laws or circumstances without waving our hands.
We can give objective and a priori content to the idea of "prior probability" and -- imagining ourselves inhabitants of a Turing world -- operate Bayes' Theorem on objective and subjective priors. (In future posts I'll be advocating computational epistemology as well as computational metaphysics.)
Contemporary physics increasingly looks to Information Theory to plumb its deepest mysteries. Why not metaphysics?