The logical view: probability as objective degree of belief

The view of probability as an objective degree of belief was developed in the early 20th century by people such as Harold Jeffreys and the young John Maynard Keynes (Keynes 1921), and was later adopted by the influential philosopher of science Rudolf Carnap (Carnap 1950). This view is not widely held these days, either by statisticians or philosophers, though there seems to be something of a recent revival (see for example Williamson 2005).

Probability theory is, in this view, seen as an extension of logic: using deductive logic we can tell what follows logically from the premises. If our premises are that “all cats are mammals” and “Tibbles is a cat”, then it follows logically that “Tibbles is a mammal”.
When there is no logical certainty, we can at least, using probability theory, assess the probability that something follows from the premises. This is sometimes called “inductive logic”. If our premises are merely that “Tibbles is a cat” and “all cats we have seen so far have been mammals”, then it only follows that “Tibbles is probably a mammal” (since it is always possible that the next cat we observe turns out to be a robot). The conclusion is only partially entailed by the premises.

The probability of a statement is then defined as this degree of partial entailment: the degree to which it is entailed by the premises. This also means that a probability is always a probability given some premises. This is written as [math]P(h|e)[/math], the probability of [math]h[/math] given [math]e[/math]; in the cat example, this is the probability of the hypothesis (all cats are mammals) given the evidence (all cats we have seen so far have been mammals).

We can use this definition to work out how we can learn from experience, as in the cat example. To work out the probability we should assign to the hypothesis given the evidence we have for it (and our background assumptions), [math]P(h|e)[/math], we need to know the probability of [math]h[/math] (given only our background assumptions), the probability of [math]e[/math] (again, given the background assumptions) and the probability [math]P(e|h)[/math], i.e. the probability that if [math]h[/math] were true, [math]e[/math] would follow (in the cat example this is simply 1: if the hypothesis were true, then [math]e[/math] would certainly follow). We can then work out [math]P(h|e)[/math] using a formula named after the 18th century mathematician Thomas Bayes. The logical interpretation is therefore also often called “objective Bayesianism”.
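
The formula in question is Bayes' theorem, [math]P(h|e) = P(e|h) \, P(h) / P(e)[/math]. Here is a minimal sketch of the calculation in Python; the values chosen for [math]P(h)[/math] and [math]P(e)[/math] are purely illustrative assumptions, not figures from the example.

```python
# Bayes' theorem: P(h|e) = P(e|h) * P(h) / P(e)
def posterior(p_h, p_e_given_h, p_e):
    """Probability of the hypothesis h given the evidence e."""
    return p_e_given_h * p_h / p_e

# Cat example: if "all cats are mammals" were true, the evidence
# "all cats we have seen so far have been mammals" would certainly follow, so P(e|h) = 1.
# The values of P(h) and P(e) below are purely illustrative assumptions.
p_h = 0.5
p_e = 0.8
print(posterior(p_h, 1.0, p_e))  # 0.625: the evidence raises the probability of the hypothesis
```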

The actual working out of the probability may take some effort, but according to the logical interpretation it is at least possible to arrive at an objective (i.e. logically entailed) probability in which it is rational to believe. (Here we run into a possible confusion with the words objective and subjective: probability under this interpretation is subjective in the sense that it is a person's degree of belief rather than a feature of the world. It is, however, objective in the sense that there is only one degree of belief which may be rationally held.)

The problems with this scheme arise in determining the other probabilities. In particular, when there is absolutely no other evidence which we can use to determine a probability, we have to find some other reasonable way of assigning one. In the classical definition of probability it is assumed that, if there is no information to suggest otherwise, the different outcomes are given equal weight. Thus a die has six sides, so the probability of getting any one side is simply 1/6. Similarly, the logical interpretation stipulates that if there is no other information, we should give each of a range of outcomes equal probability. Keynes called this the “principle of indifference”.
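
A minimal sketch of the indifference assignment for the die example; the same function works for any finite, exhaustive set of outcomes.

```python
from fractions import Fraction

def indifference(outcomes):
    """Assign equal probability to each outcome when there is no evidence favouring any of them."""
    return {outcome: Fraction(1, len(outcomes)) for outcome in outcomes}

print(indifference([1, 2, 3, 4, 5, 6]))  # each side of the die gets 1/6
```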

This principle does, however, give rise to some paradoxes, which for many philosophers is reason enough to abandon the logical interpretation, as none of the proposed solutions seem completely convincing. Suppose we blindly take a book from a library, but we don't know its colour. We have as much reason to suppose it is red as that it is black, so we assign the probability that it is red as ½. However, we also have as much reason to suppose it is yellow as that it is black, and therefore the probability of it being yellow is also ½. This is clearly not right, as the probabilities should all add up to one, yet the probabilities we assigned for red, black and yellow (and any other colour, using similar reasoning) add up to more than one.
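
A minimal sketch of how the pairwise reasoning breaks down, using a hypothetical list of five colours.

```python
from fractions import Fraction

# Pairwise reasoning: for each colour we have no more reason to expect it than black,
# so each is (naively) assigned probability 1/2.
colours = ["red", "black", "yellow", "green", "blue"]
pairwise = {colour: Fraction(1, 2) for colour in colours}

print(sum(pairwise.values()))  # 5/2: these assignments cannot all be probabilities of one draw
```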

One possible solution is to assign equal probabilities to all the possible colours: if there are five possible colours, then the probability of the random book being black is 1/5. This solution works well when the colours are discrete and their number is known, say when we draw Smarties or Lego bricks at random. In the example of the books, however, the number of colours is not known: the colour spectrum is not discrete, and there can be arguments, for example, over whether Prussian Blue is a shade or a colour in its own right. But if there is supposed to be only one rationally possible assessment of a probability, it shouldn't be affected by such arguments over how to divide up the spectrum. There are, however, worse problems with the principle of indifference. Sometimes, to assess a probability, there are several different ways in which we can assume indifference, and these lead to different results.
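
A minimal sketch of that partition problem: the probability assigned to “blue” changes depending on whether Prussian Blue is counted as a colour in its own right (both colour lists here are hypothetical).

```python
from fractions import Fraction

# Two equally defensible ways of carving up the colour spectrum (hypothetical lists):
coarse = ["red", "black", "yellow", "green", "blue"]
fine = ["red", "black", "yellow", "green", "blue", "prussian blue"]

# Indifference assigns 1/n to each colour in whichever partition we happen to choose.
print(Fraction(1, len(coarse)))  # 1/5 for "blue" under the coarse partition
print(Fraction(1, len(fine)))    # 1/6 for "blue" under the fine partition
```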

The “book paradox” is described by Keynes himself (Keynes 1921, chapter 4) and is covered in Gillies (2000). Further, more intractable, paradoxes are discussed by Gillies as well.

Comments

Keynes and Good note that prior probabilities, P(h), often don't make sense, for the reasons you cite, but one can always restrict hypotheses to statistical ones so that the likelihoods, P(e|h), do make sense and are objective. Thus an option is to reserve the term 'objective' for those things that are objective, and have the rest as imprecise or undefined. I am developing some notes on this and related issues at djmarsay.wordpress.com.

Your remarks on the principle of indifference, and the so-called "paradoxes" it generates, while common, are based on a confusion between conditional and marginal probability. There are two conditions which must be satisfied for the principle of indifference to apply: 1) the evidence must give no reason to think one event more likely than another; 2) the events being compared must exhaust the possibilities under consideration. Condition 1) is satisfied in comparing red and black books (so long as we have no other information), but condition 2) certainly is not, as you point out when adding yellow to the problem (or any other conceivable colour). This is usually the source of the paradoxes - failure to recognise that condition 2) needs to be fulfilled for the principle of indifference to give the "right answer". It is also why the assignment of 1/6 to the sides of a die seems much more appropriate - they exhaust the possibilities in addition to being exclusive. So we cannot apply the principle of indifference to the marginal probability of red in this case; but we can apply it to the conditional probability of red, given that we only have red or black. If you read the probability in the same manner as you read a logical implication, this becomes obvious. We have "if the only possible colours of books are red and black, and there is no other evidence, then the probability of seeing black on the next draw is 1/2". If the premise is not satisfied, the probability is meaningless, in the same way that the logical implication "A implies B" says nothing about B when A is false. This is usually the hard part of using probability as extended logic: setting up the hypothesis space. Nothing in the theory tells you which set of possibilities to use, so in a sense this view is the hardest to "get started" on calculations.
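
A minimal sketch of the conditional reading described in this comment, under the hypothetical premise that red and black exhaust the possibilities.

```python
from fractions import Fraction

def conditional_indifference(possible_colours):
    """P(colour | the possible colours are exactly this set), by indifference."""
    return {colour: Fraction(1, len(possible_colours)) for colour in possible_colours}

# Conditional on the premise that red and black exhaust the possibilities:
print(conditional_indifference(["red", "black"])["red"])  # 1/2
# Without a premise that exhausts the possibilities, the marginal probability of "red"
# is simply not determined by the principle of indifference.
```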
