…the question, What are the laws of nature? may be stated thus: What are the fewest and simplest assumptions, which being granted, the whole existing order of nature would result? Another mode of stating it would be thus: What are the fewest general propositions from which all the uniformities which exist in the universe might be deductively inferred? A System of Logic J.S.Mill, 1843
"... if we knew everything, we should still want to systematize our knowledge as a deductive system, and the general axioms in that system would be the fundamental laws of nature." Universals of Law and of Fact Frank Ramsey, 1928
"a contingent generalization is a law of nature if and only if it appears as a theorem (or axiom) in each of the true deductive systems that achieves a best combination of simplicity and strength" Counterfactuals ,p73 David Lewis 1973
The quotes from Mill, Ramsey and Lewis above express what philosophers call the "Best System Analysis" (BSA) of laws (cf. "Laws of Nature", Stanford Encyclopedia of Philosophy).
BSA is probably the most widely accepted answer we have to the question, "What is it to be a Law of nature?" Even so, it is notoriously fraught with unanswered questions. A short list:
Can the account properly distinguish accidental from nomological regularities?
Can it explain the connection between the laws of nature, counterfactuals and dispositions?
Why should we count only generalizations as laws, given that many scientific principles do not obviously take this form? Can't singular statements describing, say, fundamental constants be laws too?
What is the connection between this deductive account of laws and our inductive methods of discovering them?
How does BSA accommodate the existence of probabilistic laws?
What do Mill and Lewis mean when they speak of "simplicity"? Isn't simplicity in the eye of the beholder? If so, how can it be anything but a subjective matter what the laws of nature are? Or, if simplicity is just a measure of the shortness of our sentences, doesn't that make lawhood a matter of what language we happen to speak?
And, anyway, why should we think that the laws of nature must be simple in any sense?
In this post I want to raise different and, I think, more fundamental problems for BSA and provide an alternative theory of lawhood based on Algorithmic Information Theory (AIT). This new theory is precisely captured in the theorem of AIT that appears above. Don't worry if you don't understand it just now. AIT is a recent development and a novelty to most philosophers. Before we are done, I hope to have explained to you what this equation means and to have convinced you of its deep significance.
What is the Best System Theory?
As a first step we need to clarify what the Best System Analysis is. Though the theory is widely accepted it is also widely misunderstood. This is mostly Lewis's fault. Lewis introduces his version of BSA by saying:
Take all the deductive systems whose theorems are true. Some are stronger, more informative than others. These virtues compete: An uninformative system can be very simple, an unsystematized compendium of miscellaneous information can be very informative. The best system is one that strikes as good a balance as truth will allow between simplicity and strength. How good a balance that is will depend on how kind nature is. A regularity is a law if it is a theorem of the best system. David Lewis, Papers in Metaphysics and Epistemology, 1983, pp. 41-2
and this is how the theory is generally glossed. Thus the SEP dutifully reports:
Some true deductive systems will be stronger than others; some will be simpler than others. These two virtues, strength and simplicity, compete. (It is easy to make a system stronger by sacrificing simplicity: include all the truths as axioms. It is easy to make a system simple by sacrificing strength: have just the axiom that 2 + 2 = 4.) According to Lewis, the laws of nature belong to all the true deductive systems with a best combination of simplicity and strength. Carroll, John W., "Laws of Nature", The Stanford Encyclopedia of Philosophy (Spring 2012 Edition), Edward N. Zalta (ed.)
Problems intrude when we ask exactly what is meant by 'strength' here.
When we speak of the "strength" of a candidate set of axioms in a deductive system we can mean one of two things.
Strength-1: How many theorems can be formally derived from the axioms.
Strength-2: How much information is expressed by the axioms.
The first notion, strength-1, is syntactic. We measure it by counting sentences: some set of sentences A1 is a stronger axiom set than A2 if every theorem (formally) deducible from A2 is deducible from A1 but not vice versa.
Strength-2 is a semantic property: we measure it by counting worlds. If A2 is true at every world at which A1 is true, but not vice versa then A1 is more informative than A2. The idea is that the fewer worlds at which a sentence is true the more informative it is.
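The semantic notion can be made concrete with a toy model. The sketch below is only an illustration under assumptions that are not in the text: worlds are modelled as truth-value assignments to three atomic sentences (the names `p`, `q`, `r` and the helper functions are hypothetical), and a sentence is represented by a Python predicate over worlds.

```python
from itertools import product

ATOMS = ("p", "q", "r")  # hypothetical atomic sentences

# A "world" is one assignment of truth values to the atoms (8 worlds in all).
WORLDS = [dict(zip(ATOMS, values))
          for values in product([True, False], repeat=len(ATOMS))]

def worlds_where(sentence):
    """Return the indices of the worlds at which `sentence` is true."""
    return {i for i, w in enumerate(WORLDS) if sentence(w)}

def more_informative(a1, a2):
    """A1 is stronger-2 than A2 if A1's worlds are a proper subset of A2's."""
    return worlds_where(a1) < worlds_where(a2)

a1 = lambda w: w["p"] and w["q"]   # true at 2 of the 8 worlds
a2 = lambda w: w["p"]              # true at 4 of the 8 worlds

print(len(worlds_where(a1)), len(worlds_where(a2)))
print(more_informative(a1, a2))
print(more_informative(a2, a1))
```

Note that the comparison uses the proper-subset test, so two sentences true at different, non-nested sets of worlds are simply not ranked against each other.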
Lewis clearly intended this second, semantic, reading of 'strength' in the passage just quoted and this is how he is generally understood. Thus Ned Hall tells us:
Lewis takes it that there is some canonical scheme for representing facts about the world. Then any correct representation that makes use of this scheme will have two features: First, it will have a degree of informativeness, determined purely by which possible worlds the representation rules out. So it automatically follows that if one correct representation rules out more possible worlds than a second (i.e., every world in which the second is true is one in which the first is true, but not vice versa), then the first is more informative. There are thus maximally informative representations, made so by being true only in the actual world. Second, it will have a degree of simplicity, determined by broadly syntactic features of the representation. These two factors of simplicity and informativeness then determine an ordering (presumably, partial) among all the correct representations there are, in terms of how well each one balances simplicity and informativeness. Lewis's hope is that the nature of our world will yield a clear winner. Ned Hall, "Humean Reductionism About Laws of Nature", p. 12
The problem is that Lewis had no sooner told this simplicity vs. strength-2 story than he immediately refuted it.
We face an obvious problem. Different ways to express the same content, using different vocabulary, will differ in simplicity. The problem can be put in two ways, depending on whether we take our systems as consisting of propositions (classes of worlds) or as consisting of interpreted sentences. In the first case, the problem is that a single system has different degrees of simplicity relative to different linguistic formulations. In the second case, the problem is that equivalent systems, strictly implying the very same regularities, may differ in their simplicity. In fact, the content of any system whatever may be formulated very simply indeed. Given system S, let F be a predicate that applies to all and only things at worlds where S holds. Take F as primitive, and axiomatise S (or an equivalent thereof) by the single axiom (∀x)Fx. If utter simplicity is so easily attained, the ideal theory may as well be as strong as possible. Simplicity and strength needn't be traded off. Then the ideal theory will include (its simple axiom will strictly imply) all truths, and a fortiori all regularities. Then, after all, every regularity will be a law. That must be wrong. David Lewis, Papers in Metaphysics and Epistemology, 1983, pp. 41-2
This means that simplicity and strength-2 do not conflict and don't need to be traded off. Lewis's argument demonstrates that for any possible world there is a possible language (a "canonical scheme") in which there is a very short one-sentence axiom which entails all the truths about the world. Yet we do not want to say that that sentence expresses a law, not just because that would make every regularity a law but also because it would make every truth an immediate consequence of a law, so that every truth would be nomologically necessary.
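A rough computational analogue of Lewis's trick, offered as a sketch rather than as part of his argument: Python's `zlib` lets sender and receiver agree on a preset dictionary (`zdict`) in advance. If that shared "vocabulary" is allowed to contain the whole book, the message that conveys the book becomes tiny, much as the single axiom (∀x)Fx does once F is taken as primitive. The sample `book` bytes and all variable names are hypothetical.

```python
import random
import zlib

# A hypothetical 2000-byte "book" with no internal pattern for zlib to exploit.
rng = random.Random(0)
book = bytes(rng.randrange(256) for _ in range(2000))

def deflate(data, zdict=None):
    """Compress `data`, optionally against a dictionary shared in advance."""
    c = zlib.compressobj(zdict=zdict) if zdict is not None else zlib.compressobj()
    return c.compress(data) + c.flush()

plain = deflate(book)               # ordinary vocabulary
rigged = deflate(book, zdict=book)  # "perverse" vocabulary: the book is built in

# The receiver must hold the same dictionary to recover the book.
d = zlib.decompressobj(zlib.MAX_WBITS, book)
restored = d.decompress(rigged) + d.flush()

print(len(book), len(plain), len(rigged))
print(restored == book)
```

On this analogy the saving is only apparent, since the dictionary itself has to be supplied before the short message can be read.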
What to do? Lewis says
The remedy, of course, is not to tolerate such a perverse choice of primitive vocabulary. We should ask how candidate systems compare in simplicity when each is formulated in the simplest eligible way; or, if we count different formulations as different systems, we should dismiss the ineligible ones from candidacy. An appropriate standard of eligibility is not far to seek: let the primitive vocabulary that appears in the axioms refer only to perfectly natural properties. David Lewis, Papers in Metaphysics and Epistemology, 1983, pp. 41-2
This abandons the semantic criterion of strength in favor of the syntactic one. When we look for laws we are no longer looking for Hall's "maximally informative propositions"; instead we are looking for propositions expressed by sentences that a) belong to a particular kind of language: a language whose lexicon includes names for perfectly "natural properties" (hereinafter an "N-language"; cf. Bird, Alexander and Tobin, Emma, "Natural Kinds", The Stanford Encyclopedia of Philosophy (Winter 2012 Edition), Edward N. Zalta (ed.)), b) have the form of generalizations in that N-language, and c) belong to the simplest sets of sentences (axioms) from which the maximum number of true sentences of that language can be formally deduced.
Note that this does not require that we identify the laws with these sentences, nor does it require that we speak an N-language in order to express laws. We can still regard the laws as language-transcendent propositions and hope to express them in, say, English, provided we can believe that there are some English sentences (however unsimple they might be) that are synonymous with the law-expressing sentences in some N-language. Still, Lewis's theory is "linguistic" in the sense that it requires us to say that what makes such a proposition a law is that some sentence expressing it in some N-language appears in the axioms which best combine the genuinely competing virtues of simplicity and strength-1 in that language. Lewis certainly hoped that strength-1 in such a language would correlate with strength-2, but that hope was not part of his analysis.
Failure to grasp this point is, all by itself, the source of considerable confusion in the literature on laws, but we need not dwell on that just now. (For example, Ned Hall thinks that the central problem for Lewis's analysis is how to reconcile its competing demands of simplicity and informativeness. That problem goes away, or at least looks very different, when one notices that these demands don't compete and that Lewis, in any case, doesn't demand that the laws be maximally informative in the relevant sense.) What is more relevant is that understanding BSA in this way should not at all detract from its intuitive appeal.
That appeal is nicely captured by Helen Beebee's retelling.
So the idea is something like this. Suppose God wanted us to learn all the facts there are to be learned. (The Ramsey-Lewis view is not an epistemological thesis but I'm putting it this way for the sake of the story.) He decides to give us a book, God's Big Book of Facts, so that we might come to learn its contents and thereby learn every particular matter of fact there is. As a first draft, God just lists all the particular matters of fact there are. But the first draft turns out to be an impossibly long and unwieldy manuscript, and very hard to make any sense of; it's just a long list of everything that's ever happened and will ever happen. We couldn't even come close to learning a big list of independent facts like that. Luckily, however (or so we hope), God has a way of making the list rather more comprehensible to our feeble, finite minds: he can axiomatize the list. That is, he can write down some universal generalizations with the help of which we can derive some elements of the list from others. This will have the benefit of making God's Big Book of Facts a good deal shorter and also a good deal easier to get our rather limited brains around.
For instance, suppose that all the facts in God's Big Book satisfy f=ma. Then God can write down f=ma at the beginning of the book, under the heading "Axioms", and cut down his hopelessly long list of particular matters of fact: whenever he sees facts about an object's mass and acceleration, say, he can cross out the extra fact about its force, since this fact follows from the others together with the axiom f=ma. And so on. God, in his benevolence, wants the list of particular matters of fact to be as short as possible, that is, he wants the axioms to be as strong as possible; but he also wants the list of axioms to be as short as possible: he wants the deductive system (the axioms and theorems) to be as simple as possible. The virtues of strength and simplicity conflict with each other to some extent; God's job is to strike the best balance. And the contingent generalizations that figure in the deductive closure of the axiomatic system which strikes the best balance are the laws of nature. Helen Beebee, "The Non-Governing Conception of Laws of Nature", Philosophy and Phenomenological Research, Vol. LXI, No. 3, November 2000, pp. 574-5
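Beebee's crossing-out procedure can be sketched in a few lines. Everything here is illustrative: the three sample objects, their values, and the names `first_draft`, `axiomatized` and `derive_force` are made up for the example.

```python
# First draft of the Big Book: every particular fact listed separately.
# The sample values are hypothetical and chosen to satisfy f = m * a.
first_draft = [
    {"object": "apple", "mass": 2.0, "acceleration": 3.0, "force": 6.0},
    {"object": "cart", "mass": 10.0, "acceleration": 0.5, "force": 5.0},
    {"object": "stone", "mass": 4.0, "acceleration": 2.5, "force": 10.0},
]

def derive_force(entry):
    """The axiom f = ma, used as a rule for recovering a crossed-out fact."""
    return entry["mass"] * entry["acceleration"]

# Second draft: state the axiom once and cross out every force entry.
axiomatized = [{k: v for k, v in e.items() if k != "force"} for e in first_draft]

# A reader holding the axiom can rebuild the first draft exactly.
rebuilt = [dict(e, force=derive_force(e)) for e in axiomatized]

facts_before = sum(len(e) - 1 for e in first_draft)      # exclude the object label
facts_after = sum(len(e) - 1 for e in axiomatized) + 1   # + 1 for the axiom itself
print(facts_before, facts_after)
print(rebuilt == first_draft)
```

With only three objects the saving is small; the axiom costs one entry however long the list is, so the saving grows with the number of objects.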
This gets Lewis's picture exactly right, provided that we stipulate that God writes his Big Book in an N-language and assume that it is a language we can understand. (I do not suggest that Beebee was unaware of this point.) This is important because, for this story to get God writing things that look like laws, we must understand that God's problem is not just how to write a shorter book that gives us all the facts. If God's only task were to say the same thing in fewer words, to convey the same information, his best strategy would be to stop writing in N-language and speak the language Lewis described: the one with the single super-predicate "F". Written in that language the book of facts is just one sentence long, viz. "(∀x)Fx". The problem with this is not that we frail humans couldn't understand that language. The problem is that even if we could, neither we nor God would regard this axiom as a law of nature, because that would make every truth nomologically necessary.
The task Lewis sets God is not just to express all the facts but to express them in N-language. He must tell the truth about the world by telling us which N-Language sentences are true and he must tell us that as simply and concisely as possible.
As Beebee points out, the device of axiomatization provides one way of doing that. By writing a few general N-language sentences as axioms, God gives us a way to deduce the truth of many others; allowing Him to leave those others out and make the Big Book correspondingly shorter.
Axiomatization does the job of telling us which N-language sentences are true using fewer sentences. But if that is the job that laws of nature do, we now need to ask whether there is any other, better way of doing that job.
Let us begin by noticing that Beebee's story is, in more ways than one, a bit old fashioned. Why would God try giving us this big unwieldy manuscript? Surely God would know that we all have computers now. He ought to make the Big Book an ebook and send us the file!
Of course, there would still be problems of size. How could he send it to us? Even the digitized Big eBook of Facts would be a very big file, certainly too big to send to us by e-mail.
God's problem here is one we've all faced even with the small books and documents we write ourselves. How do you get that 10 megabyte chapter to the publisher when the email system only lets you send 5 megabyte attachments? Axiomatizing your documents would be one way to shorten them. But that is not what we do. Instead we compress them. We put them in a ".zip" file. (Nerds who hold that God speaks Linux will insist that it should be a ".gz" file. As we shall see below, it makes no difference.)
Compressing books, or any other kind of data, is a way of squeezing a lot of bits into fewer bits. That 10 megabyte chapter may get turned into a 1 megabyte zip file. Anyone who has the right decompression program can "unzip" that file and get the whole 10 megabyte document back again. Of course they will have to have the unzipping program but, remarkably, the decompressing program may itself be quite short. So even if the mail system won't let you send more than 5 megabytes at a time you may be able to send the zip of your 10 megabyte chapter and the unzip program in the same email. And the same short programs that zip and unzip your chapter can, of course, be used to compress and decompress documents of any size.
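The round trip just described can be tried directly with Python's standard `zlib` module, which implements the DEFLATE method that zip and gzip files commonly use. This is a minimal sketch: the `chapter` text is a stand-in, and the sizes printed depend entirely on the input.

```python
import zlib

# A stand-in for the chapter: repetitive English text (hypothetical content).
chapter = ("All swans are white. This swan is white. That swan is white. " * 500).encode("utf-8")

zipped = zlib.compress(chapter, 9)   # level 9 asks for the best compression
unzipped = zlib.decompress(zipped)

print(len(chapter), "bytes before,", len(zipped), "bytes after")
print(unzipped == chapter)  # the whole document comes back, byte for byte
```

The same two calls work unchanged on inputs of any size that fit in memory, which is the sense in which one short program serves for every document.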
To understand what data compression is we are going to have to learn a bit about Algorithmic Information Theory. But before we start on that we should pause to take inventory of what data compression is not. We commonly say that the zip file contains "all the data" that was in the original, uncompressed document. As we shall see, there is a rigorous sense of "data" in which this is true but there are important senses in which it isn't:
It is not that a zip file "says" everything that was in the original document but in fewer or shorter words. That can happen: translating, say, the Kritik der reinen Vernunft into English may give you a shorter book. But compression is not a matter of translating a document into some other, less prolix, language.
Nor is a zip file anything like an abridgment or a summary of the original. Abridgment requires throwing some information away, compression need not. More importantly, an abridgment or summary of a document still says something. A zip file does not.
A zip of an English document will likely contain no sentences in any language. The zip of a document is not true in the same possible worlds as the original document. A zip file isn't true or false at all. Indeed, for reasons we shall explore below, the better the compression algorithm you use, the more the contents of the compressed document will approximate purely random data.
Nor, as we shall soon see, is there any sense in which a zip file is a collection of axioms from which the decompressing program "deduces" the contents of the compressed document.
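The claim that well-compressed output approximates random data can be checked roughly. The sketch below assumes that byte-level Shannon entropy is an adequate proxy for "randomness" (it is a crude one), and the sample document and the helper name `byte_entropy` are hypothetical. It measures entropy before and after compression, and then tries to compress the compressed output a second time.

```python
import math
import random
import zlib
from collections import Counter

def byte_entropy(data):
    """Shannon entropy of the byte frequencies, in bits per byte (max 8.0)."""
    counts = Counter(data)
    total = len(data)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

# Hypothetical "document": 20,000 words drawn from a tiny vocabulary.
rng = random.Random(1)
words = ["swan", "white", "all", "is", "this", "that", "black", "not"]
document = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")

once = zlib.compress(document, 9)
twice = zlib.compress(once, 9)   # compressing the already-compressed bytes

print(round(byte_entropy(document), 2), round(byte_entropy(once), 2))
print(len(document), len(once), len(twice))
```

The second pass gains essentially nothing, which is what one would expect if the first pass had already removed the regularities this compressor can detect.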
In any case, having come thus far it should start to seem very peculiar that Best System theorists should ever have thought that the "Best System" would be a deductive one….
Well, notice first that a deductive system is only one particular kind of formal system. A formal system is a collection of rules for manipulating symbols, that is, meaningful objects. What makes a system "formal" is that its rules do not mention those symbols' meanings but instead describe physical manipulations of symbols keyed only to those symbols' intrinsic physical properties.
Austerely conceived, a deductive system consists of a set of seed symbol strings (the axioms), and a set of transformation rules that describe how to transform and augment those seed strings to produce new strings (the theorems). What makes such a system "deductive" is that there is some interpretive scheme, some way of assigning the strings meanings, such that a) the axiom and theorem strings can all be interpreted as sentences, the sorts of things that are true or false; and b) the rules for transforming strings into strings turn out to be truth preserving.
Deductive systems are a vanishingly small subset of all the formal systems there are. Not all strings of symbols are sentences; manipulations of symbols, even of sentences, do not always produce sentences, and manipulations that do map sentences onto other sentences are not all truth preserving. There are formal systems that generate sentences from seed strings that are not sentences: for example, a grammar is a formal system that starts off with words as its seeds and describes rules for building sentences from them; its "theorems" comprise all the sentences of a language, true and false.
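The grammar example can be made concrete with a toy formal system. The sketch below uses a made-up three-rule grammar, written in the familiar top-down rewriting style (the symbols `S`, `NP`, `VP` and the word list are assumptions, not anything from the text). It derives every string the rules allow by pure symbol manipulation, with no regard to whether the resulting sentences are true.

```python
from itertools import product

# Rewrite rules: a symbol on the left may be replaced by any sequence on the right.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["swans"], ["ravens"]],
    "VP": [["are", "white"], ["are", "black"]],
}

def derive(symbol):
    """Return every word sequence derivable from `symbol` under RULES."""
    if symbol not in RULES:          # a plain word: nothing left to rewrite
        return [[symbol]]
    results = []
    for expansion in RULES[symbol]:
        # Expand each part, then combine the alternatives in order.
        for parts in product(*(derive(s) for s in expansion)):
            results.append([word for part in parts for word in part])
    return results

theorems = sorted(" ".join(words) for words in derive("S"))
for sentence in theorems:
    print(sentence)
```

The four "theorems" include "swans are black" alongside "ravens are black": the rules generate sentences, but nothing in them tracks truth.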
What is special about deductive systems— what has given them their central place in Western thought— is that they are a means of proof. The transformations that lead from axioms to theorems can be interpreted as arguments in which the theorems appear as conclusions; valid arguments, if the system's transformations are truth preserving; sound arguments, if the system's axioms are true.
In this way, deductive systems serve as a kind of epistemological amplifier: they allow us to extend our confidence in the truth of a few seed sentences to an equal confidence in the truth of the many— perhaps infinitely many— theorems of the system. This in turn has seemed important because there seem to be some sentences about whose truth we can be certain without proof. These sentences serve as "axioms" in the formal sense because they are "axiomatic" in the informal sense: they, or the propositions they express, are held to be "obvious", "self-evident", "apodictic", "self-justifying", "a priori", "analytic" ... .
This variety of labels reflects an even larger number of competing philosophical theories about why these sentences enjoy this special epistemic status. Is it because they express propositions we learned in a prior life? Is it because they express relations between ideas, or concepts, or meanings? Or is it that they are the deliverances of a special non-sensory faculty of Reason? Never mind. So long as we can agree that we somehow know that these special sentences are true we can use deductive systems to extend that knowledge to an infinity of other sentences which would otherwise not be "obvious", "self-evident", etc.
Now, what is peculiar about the idea that the Laws of Nature get their status by being— or being expressed by— axioms in formal deductive systems is that no one, at least no one nowadays, thinks that the laws of nature are axiomatic. No one thinks that they are "obvious", "self-evident", "analytic" or anything of the sort. No one thinks that we have a special way/grounds/justification/warrant for believing in the laws of nature which we can use to prove the truth of their consequences. Quite the opposite: in the sort of deductive system Ramsey and Lewis envisage, our confidence in the truth of the axioms will rest entirely on our confidence in the truth of the theorems— that is on matters of particular fact that constitute the data for the theory the laws embody.
Remember that Beebee's God doesn't axiomatize His book because He needs to convince us that what it says is true (We already believe Him. He's God!). He axiomatizes only to make the book shorter. Of course, in real life, we don't have a book from God. But we can write volumes and volumes of sentences that record our observations of particular matters of fact and we are altogether more confident in the truth of these observations (e.g. that this swan is white, and that one) than we are of any generalization over them (that all swans are white). Of course, we might think that the fact that a statement like "f=ma" could occur in a tight axiomatization of all our observation sentences is a kind of argument for its truth. And so it is. But it is not a deductive argument.
Formal deduction does two things: it gives us a way of building many sentences from a few sentences and it gives us a way of building arguments for the truth of those sentences. But if we already know what sentences we want to build and are confident in their truth —because we find them in God's Big Book, or because they are direct reports of empirical data—then the argument building won't give us any epistemic advance. In that case we might as well look for other ways of building those sentences which are more efficient because they are not constrained by deduction's requirement that they start from sentences or use transformations that preserve truth.
What we are looking for is not the simplest way to describe the world that makes God's Big Book true. What we are looking for is the simplest description of the book itself. What we are looking for is a data compression algorithm.
The word 'data' as we shall understand it here and in all that follows, does not denote facts about the world but rather sets of sentences, conceived as strings of symbols that may or may not express truths or, indeed, may express nothing at all. A data compression algorithm provides a formal way of describing those strings using shorter strings; "formal", again, in the sense that it ignores what, if anything, those strings might mean.
Your file zipping program operates a compression algorithm that doesn't care if the chapter you feed it is true or false. It can't tell if it is zipping sentences from God's Big Book of Facts or the Big Book of All Empirical Observations or a stream of random gibberish. Nevertheless, I am going to try to convince you that by understanding data compression we can understand something important, and deep, about the laws of nature.
"Objection: How can thinking about the mere formal manipulation of meaningless symbols, tell us why the laws of nature are true?"
Answer: It cannot. But remember that the Best Systems theory doesn't purport to tell us that either. The question that theory is supposed to answer is why— among all the sentences we think are true for whatever reason—we single out particular sentences for the label 'Laws'. More generally, why do we speak of some sentences as nomologically necessary and others as nomologically contingent? And remember that Lewis's answer is that what distinguishes the Law sentences from others is not that they express some special kind of proposition knowable in a special way, but rather that the propositions they express are expressible by sentences which have particular formal properties—simplicity and strength-1— in a special kind of language: an N-language.
The core of Lewis's idea was surely this: if we describe the world in a language whose basic vocabulary describes the "natural" properties, if we speak a language that "carves nature at its joints", then whatever order and system we find in the world will be reflected in the words we use to describe it. Insofar as we speak such a language and we speak truly, regularities in reality will be reflected in regularities in the data, that is, in the sentences that report them. The order of the world, at least such order as we can ever hope to articulate or fathom, will be reflected in the forms of our descriptions of it.
As we shall see, data compression works precisely by exploiting system and regularity (in rigorously definable senses of these terms) in the data. Purely random data is incompressible. Your chapter shrinks when you zip it because it is not random data. Its compressibility is testimony to the underlying orderliness of your thoughts as expressed in your words. Likewise, how compressible God's Big Book is will reflect the orderliness of His thoughts. That "orderliness", that system of the world, is, I will argue, what we are talking about when we talk about the laws of Nature.
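The contrast between orderly and disorderly data is easy to check empirically. In this sketch the two sample inputs are assumptions chosen for illustration. A general-purpose compressor such as zlib detects only some kinds of regularity, so its output length is just an upper bound on how short a description could be.

```python
import random
import zlib

N = 50_000

# Orderly data: one regularity repeated throughout (hypothetical example text).
unit = b"Every swan observed so far is white. "
orderly = (unit * (N // len(unit) + 1))[:N]

# Disorderly-looking data: pseudo-random bytes from a seeded generator.
rng = random.Random(42)
noise = bytes(rng.randrange(256) for _ in range(N))

for name, data in (("orderly", orderly), ("noise", noise)):
    packed = zlib.compress(data, 9)
    print(name, len(data), "->", len(packed), f"(ratio {len(packed) / len(data):.3f})")
```

One caveat about the example itself: the `noise` bytes are produced by a short seeded program, so they are not random in the strict sense; zlib simply cannot find the regularity that generated them.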