Every event is a result of causes: but the multitude of forces and the variety of collocations being immeasurably great, the overwhelming majority of events occurring about the same time are only related by Causation so remotely that the connection cannot be followed. Whilst my pen moves along the paper, a cab rattles down the street, bells in the neighbouring steeple chime the quarter, a girl in the next house is practising her scales, and throughout the world innumerable events are happening which may never happen together again; so that should any one of them recur, we have no reason to expect the others to recur along with it.

On the other hand, many things are now happening together, or coinciding, that will do so, for assignable reasons, again and again; thousands of men are leaving the City, who leave at the same hour five days a week. But this is not chance; it is causal coincidence due to the custom of business in this country, as determined by our latitude and longitude and other circumstances. No doubt the above chance coincidences—writing, cab-rattling, chimes, scales, etc.—are causally connected at some point of past time. They were predetermined by the condition of the world ten minutes ago; and that was due to earlier conditions, one behind the other, even to the formation of the planet. But whatever connection there may have been, we have no such knowledge of it as to be able to deduce the coincidence, or calculate its recurrence. Hence Chance is defined by Mill to be: Coincidence giving no ground to infer uniformity.

Still, some chance coincidences do recur according to laws of their own: I say some, but it may be all. If the world is finite, the possible combinations of its elements are exhaustible; and, in time, whatever conditions of the world have concurred will concur again, and in the same relation to former conditions. This writing, that cab, those chimes, those scales will coincide again; the Argonautic expedition, and the Trojan war, and all our other troubles will be renewed.
But let us consider some more manageable instance, such as the throwing of dice. Every one who has played much with dice knows that double sixes are sometimes thrown, and sometimes double aces. Such coincidences do not happen once and only once; they occur again and again, and a great number of trials will show that, though their recurrence has not the regularity of causation, it approximates to a certain average frequency.

Instead of speaking of the 'throwing of the die' and its 'turning up ace' as two events, the former is called 'the event' and the latter 'the way of its happening.' And these expressions may easily be extended to cover relations of distinct events; as when two men shoot at a mark and we desire to represent the probability of both hitting the bull's eye together, each shot may count as an event (denominator) and the coincidence of 'bull's-eyes' as the way of its happening (numerator). It is also common to speak of probability as a proportion between the number of times an event happens in a certain way and the whole number of events.

On the subjective view, again, probability is taken to measure the quantity of our belief in an event. To this view it may be objected, (a) that belief cannot by itself be satisfactorily measured. No one will maintain that belief, merely as a state of mind, always has a definite numerical value of which one is conscious, as 1/100 or 1/10. Let anybody mix a number of letters in a bag, knowing nothing of them except that one of them is X, and then draw them one by one, endeavouring each time to estimate the value of his belief that the next will be X; can he say that his belief in the drawing of X next time regularly increases as the number of letters left decreases? If not, we see that (b) belief does not uniformly correspond with the state of the facts. If in such a trial as that proposed above we really wish to draw X, as when looking for something in a number of boxes, how common it is, after a few failures, to feel quite hopeless and to say: "Oh, of course it will be in the last." For belief is subject to hope and fear, temperament, passion, and prejudice, and not merely to rational considerations.
And it is useless to appeal to 'the Wise Man,' the purely rational judge of probability, unless he is producible. Or, if it be said that belief is a short cut to the evaluation of experience, because it is the resultant of all past experience, we may reply that this is not true. For one striking experience may sway belief far more than many fainter ones that are better evidence of the general course of events.

But (c) at any rate, if Probability is to be connected with Inductive Logic, it must rest upon the same ground, namely, observation. Induction, in any particular case, is not content with beliefs or opinions, but aims at testing, verifying or correcting them by appealing to the facts; and Probability has the same object and the same basis. In some cases, indeed, the conditions of an event are supposed to be mathematically predetermined, as in tossing a penny, throwing dice, dealing cards. In throwing a die, the ways of happening are six; in tossing a penny only two, head and tail: and we usually assume that the odds with a die are fairly 5 to 1 against ace, whilst with a penny 'the betting is even' on head or tail. Still, this assumption rests upon another, that the die is perfectly fair, or that the head and tail of a penny are exactly alike; and this is never strictly true. With an ordinary die or penny, a very great number of trials would, no doubt, give an average approximating to 1/6 or 1/2; yet might always leave a certain excess one way or the other, which would itself become more definite as the trials went on; thus showing that the die or penny did not satisfy the mathematical hypothesis. Buffon is said to have tossed a coin 4040 times, obtaining 2048 heads and 1992 tails; a pupil of De Morgan tossed 4092 times, obtaining 2048 heads and 2044 tails.
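Buffon's trial can be imitated in a few lines. The sketch below is only a simulation under stated assumptions: a pseudo-random 'coin', with `p_head` as a hypothetical bias. A fair coin's average approaches 1/2, though any finite run leaves some excess; a slightly unfair coin shows an excess that becomes more definite as the trials go on.

```python
import random

random.seed(4040)

def head_fraction(n_tosses, p_head=0.5):
    """Toss a (possibly biased) coin n_tosses times; return the fraction of heads."""
    heads = sum(1 for _ in range(n_tosses) if random.random() < p_head)
    return heads / n_tosses

# A fair coin, tossed as often as Buffon's: close to 1/2, rarely exactly 1/2.
print(head_fraction(4040))
# A coin biased 51:49 toward heads: the excess one way persists and sharpens.
print(head_fraction(100000, p_head=0.51))
```

The point of the second call is the one made in the text: a great number of trials does not merely average out the excess, it makes the excess more definite, and so betrays a die or penny that fails the mathematical hypothesis.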
There are other important cases in which probability is estimated and numerically expressed, although statistical evidence directly bearing upon the point in question cannot be obtained; as in betting upon a race; or in the prices of stocks and shares, which are supposed to represent the probability of their paying, or continuing to pay, a certain rate of interest. But the judgment of experts in such matters is certainly based upon experience; and great pains are taken to make the evidence as definite as possible by comparing records of speed, or by financial estimates; though something must still be allowed for reports of the condition of horses, or of the prospects of war, harvests, etc. However, where statistical evidence is obtainable, no one dreams of estimating probability by the quantity of his belief. Insurance offices, dealing with fire, shipwreck, death, accident, etc., prepare elaborate statistics of these events, and regulate their rates accordingly. Apart from statistics, at what rate ought the lives of men aged 40 to be insured, in order to leave a profit of 5 per cent. upon £1000 payable at each man's death? Is 'quantity of belief' a sufficient basis for doing this sum? We may suppose some one to object that "by this relative standard even empirical laws cannot be called 'only probable' as long as we 'know no exception to them'; for that is all that can be said for the boasted law of causation; and that, accordingly, we can frame no fraction to represent their probability. That 'all swans are white' was at one time, from this point of view, not probable but certain; though we now know it to be false. 
It would have been an indecorum to call it only probable as long as no other-coloured swan had been discovered; not merely because the quantity of belief amounted to certainty, but because the number of events (seeing a swan) and the number of their happenings in a certain way (being white) were equal, and therefore the evidence amounted to 1 or certainty." But, in fact, such an empirical law is only probable; and the estimate of its probability must be based on the number of times that similar laws have been found liable to exceptions. Albinism is of frequent occurrence; and it is common to find closely allied varieties of animals differing in colour. Had the evidence been duly weighed, it could never have seemed more than probable that 'all swans are white.' But what law, approaching the comprehensiveness of the law of causation, presents any exceptions? Supposing evidence to be ultimately nothing but accumulated experience, the amount of it in favour of causation is incomparably greater than the most that has ever been advanced to show the probability of any other kind of event; and every relation of events which is shown to have the marks of causation obtains the support of that incomparably greater body of evidence. Hence the only way in which causation can be called probable, for us, is by considering it as the upward limit (1) to which the probability of every empirical law approaches.

But, further, a merely empirical statistical law will only be true as long as the causes influencing the event remain the same. A die may be found to turn ace once in six throws, on the average, in close accordance with mathematical theory; but if we load it on that facet the results will be very different. So it is with the expectation of life, or fire, or shipwreck. The increased virulence of some epidemic such as influenza, an outbreak of anarchic incendiarism, a moral epidemic of over-loading ships, may deceive the hopes of insurance offices.
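The loaded-die point can be shown by simulation. The sketch below uses hypothetical facet weights (here the weight is put toward the six, which makes the ace turn up less than once in six throws); the particular figures are assumptions, not the text's.

```python
import random

random.seed(6)

def ace_rate(n_throws, weights):
    """Empirical frequency of ace for a die with the given facet weights."""
    faces = [1, 2, 3, 4, 5, 6]
    throws = random.choices(faces, weights=weights, k=n_throws)
    return throws.count(1) / n_throws

fair = [1, 1, 1, 1, 1, 1]     # a fair die: ace about once in six throws
loaded = [1, 1, 1, 1, 1, 3]   # weighted toward six: ace now about 1 in 8
print(ace_rate(60000, fair))
print(ace_rate(60000, loaded))
```

The empirical law 'ace once in six throws' holds only while the causes influencing the event remain the same; alter the die and the statistics alter with it, which is the moral drawn above for expectations of life, fire, or shipwreck.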
Hence we see, again, that probability depends upon causation, not causation upon probability. That uncertainty of an event which arises not from ignorance of the law of its cause, but from our not knowing whether the cause itself does or does not occur at any particular time, is Contingency.

Fig. 11 [a symmetrical curve of deviations on either side of the average].

Here o is the average event, and oy represents the number of average events. Along ox, in either direction, deviations are measured. At p the amount of error or deviation is op; and the number of such deviations is represented by the line or ordinate pa. At s the deviation is os; and the number of such deviations is expressed by sb. As the deviations grow greater, the number of them grows less. On the other side of o, toward -x, at distances op', os' (equal to op, os) the lines p'a', s'b' represent the numbers of those errors (equal to pa, sb). If o is the average height of the adult men of a nation, (say) 5 ft. 6 in., s' and s may stand for 5 ft. and 6 ft.; and the ordinates then show how many men there are of each stature.

That some such figure as that given above describes a frequent characteristic of an average with the deviations from it, may be shown in two ways: (1) By arranging the statistical results of any homogeneous class of measurements; when it is often found that they do, in fact, approximately conform to the figure; that very many events are near the average; that errors are symmetrically distributed on either side, and that the greater errors are the rarer. (2) By mathematical demonstration based upon the supposition that each of the events in question is influenced, more or less, by a number of unknown conditions common to them all, and that these conditions are independent of one another.
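The supposition in (2) can be imitated numerically. In the hypothetical model sketched below (an assumption for illustration, not the book's own computation), each event is the sum of twenty small independent conditions, each as likely to add as to subtract a unit; counting how often each total deviation occurs reproduces the shape of the figure.

```python
import random
from collections import Counter

random.seed(11)

def deviation(n_conditions=20):
    """One event: the net effect of many small independent conditions."""
    return sum(random.choice((-1, 1)) for _ in range(n_conditions))

# Tabulate 20,000 such events by their deviation from the average (zero).
counts = Counter(deviation() for _ in range(20000))
for d in range(-6, 7, 2):
    print(d, counts[d])
```

Very many events fall at or near the average, the counts on the two sides are roughly symmetrical, and the greater deviations are the rarer, which is just the characteristic the figure describes.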
For then, in rare cases, all the conditions will operate favourably in one way, and the men will be tall; or in the opposite way, and the men will be short; in more numerous cases, many of the conditions will operate in one direction, and will be partially cancelled by a few opposing them; whilst in still more cases opposed conditions will approximately balance one another and produce the average event or something near it. The results will then conform to the above figure.

From the above assumption it follows that the symmetrical curve describes only a 'homogeneous class' of measurements; that is, a class no portion of which is much influenced by conditions peculiar to itself. If the class is not homogeneous, because some portion of it is subject to peculiar conditions, the curve will show a hump on one side or the other. Suppose we are tabulating the ages at which Englishmen die who have reached the age of 20: we may find that the greatest number die at 39 (19 years being the average expectation of life at 20); that as far as that age the curve upwards is regular; and that beyond the age of 39 it begins to descend regularly, but that on approaching 45 it bulges out some way before resuming its regular descent—thus:

Fig. 12 [the curve with a hump on one side of the average].

Such a hump in the curve might be due to the presence of a considerable body of teetotalers, whose longevity was increased by the peculiar condition of abstaining from alcohol, and whose average age was 45, 6 years more than the average for common men. Again, if the group we are measuring be subject to selection (such as British soldiers, for which profession all volunteers below a certain height are rejected), the curve will be cut off abruptly on that side—thus:

Fig. 13 [the truncated curve].

If, above a certain height, volunteers are also rejected, the curve will fall abruptly on both sides. The average is supposed to be 5 ft. 8 in. The distribution of events is described by 'some such curve' as that given in Fig.
11; but different groups of events may present figures in which the slopes of the curves are very different, namely, more or less steep; and if the curve is very steep, the figure runs into a peak; whereas, if the curve is gradual, the figure is comparatively flat. In the latter case, where the figure is flat, fewer events will closely cluster about the average, and the deviations will be greater.

Suppose that we know nothing of a given event except that it belongs to a certain class or series: what can we venture to infer of it from our knowledge of the series? Let the event be the cephalic index of an Englishman. The cephalic index is the breadth of a skull multiplied by 100 and divided by the length of it; e.g. if a skull is 8 in. long and 6 in. broad, (6×100)/8 = 75. We know that the average English skull has an index of 78; and, the average being the commonest case, an index of 78 is the most probable inference about the given skull.

In such cases as heights of men or skull measurements, where great numbers of specimens exist, the average will be actually presented by many of them; but if we take a small group, such as the measurements of a college class, it may happen that the average height (say, 5 ft. 8 in.) is not the actual height of any one man. Even then there will generally be a closer cluster of the actual heights about that number than about any other. Still, with very few cases before us, it may be better to take the median than the average. The median is that event on either side of which there are equal numbers of deviations. One advantage of this procedure is that it may save time and trouble. To find approximately the average height of a class, arrange the men in order of height, take the middle one and measure him. A further advantage is that the median is less disturbed than the average by one or two extravagant deviations.

To make a single measurement of a phenomenon does not give one much confidence. Another measurement is made; and then, if there is no opportunity for more, one takes the mean or average of the two. But why? For the result may certainly be worse than the first measurement.
Suppose that the events I am measuring are in fact fairly described by Fig. 11, although (at the outset) I know nothing about them; and that my first measurement gives p, and my second s; the average of them is worse than p. Still, being yet ignorant of the distribution of these events, I do rightly in taking the average. For, as it happens, ¾ of the events lie to the left of p; so that if the first trial gives p, then the average of p and any subsequent trial that fell nearer than (say) s' on the opposite side, would be better than p; and since deviations greater than s' are rare, the chances are nearly 3 to 1 that the taking of an average will improve the observation. Only if the first trial give o, or fall within a little more than ½ op on either side of o, will the chances be against any improvement by trying again and taking an average. Since, therefore, we cannot know the position of our first trial in relation to o, it is always prudent to try again and take the average; and the more trials we can make and average, the better is the result. The average of a number of observations is called a "Reduced Observation." We may have reason to believe that some of our measurements are better than others, because they have been made with better instruments, or under more favourable conditions; and such measurements may then be allowed proportionately greater weight in striking the average.

Not only in such a practical affair as insurance, but in matters purely scientific, the minute and subtle peculiarities of individuals have important consequences. Each man has a certain cast of mind, character, physique, giving a distinctive turn to all his actions even when he tries to be normal. In every employment this determines his Personal Equation, or average deviation from the normal. The term Personal Equation is used chiefly in connection with scientific observation, as in Astronomy. Each observer is liable to be a little wrong, and this error has to be allowed for and his observations corrected accordingly.
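The prudence of taking a Reduced Observation can itself be checked by simulation. The sketch below is an assumption-laden toy: observations of a known true value, scattered symmetrically about it (Gaussian errors standing in for the accidental conditions of Fig. 11).

```python
import random

random.seed(42)

true_value = 100.0

def measure():
    # One observation, subject to a symmetrical accidental error
    # (hypothetical instrument with unit standard error).
    return true_value + random.gauss(0, 1)

# A single measurement, against the average of a hundred trials.
single = measure()
reduced = sum(measure() for _ in range(100)) / 100

print(abs(single - true_value))   # error of one trial
print(abs(reduced - true_value))  # error of the Reduced Observation
```

Any one further trial may indeed make the average worse than the first measurement; but over many trials the average errs far less than a single observation, which is why it is always prudent to try again and average.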
The use of the term 'expectation,' and of examples drawn from insurance and gambling, may convey the notion that probability relates entirely to future events; but if based on laws and causes, it can have no reference to point of time. As long as conditions are the same, events will be the same, whether we consider uniformities or averages. We may therefore draw probable inferences concerning the past as well as the future, subject to the same hypothesis, that the causes affecting the events in question were the same and similarly combined. On the other hand, if we know that conditions bearing on the subject of investigation have changed since statistics were collected, or were different at some time previous to the collection of evidence, every probable inference based on those statistics must be corrected by allowing for the altered conditions, whether we desire to reason forwards or backwards in time.

(1) If two events or causes cannot concur, the probability of one or the other occurring is the sum of their separate probabilities: thus the chance of throwing with a die either ace or six is 1/6 + 1/6 = 1/3.

(2) If two events are independent, having neither connection nor repugnance, the probability of their concurring is found by multiplying together the separate probabilities of each occurring. If in walking down a certain street I meet A once in four times, and B once in three times, I ought (by mere chance) to meet both once in twelve times: for in twelve occasions I meet B four times; and once in every four of those occasions I may expect to meet A. This is a very important rule in scientific investigation, since it enables us to detect the presence of causation. For if the coincidence of two events is more or less frequent than it would be if they were entirely independent, there is either connection or repugnance between them. If, e.g., in walking down the street I meet both A and B oftener than once in twelve times, they may be engaged in similar business, calling them from their offices at about the same hour.
If I meet them both less often than once in twelve times, they may belong to the same office, where one acts as a substitute for the other. Similarly, if in a multitude of throws a die turns six oftener than once in six times, it is not a fair one: that is, there is a cause favouring the turning of six. If of 20,000 people 500 see apparitions and 100 have friends murdered, the chance of any man having both experiences is 1/8000; but if each lives on the average 300,000 hours, the chance of both events occurring in the same hour is 1/2,400,000,000. If the two events occur in the same hour oftener than this, there is more than a chance coincidence. The more minute a cause of connection or repugnance between events, the longer must be the series of trials or instances required to disclose it.

(3) The rule for calculating the probability of a dependent event is the same as the above; for the concurrence of two independent events is itself dependent upon each of them occurring. My meeting with both A and B in the street is dependent on my walking there and on my meeting one of them. Similarly, if A is sometimes a cause of B (though liable to be frustrated), and B sometimes of C (C and B having no causes independent of B and A respectively), the occurrence of C is dependent on that of B, and that again on the occurrence of A. Hence we may state the rule: If two events are dependent each on another, so that if the first occur the second may (or may not) occur, and if the second occur the third may; whilst the third never occurs without the second, nor the second without the first; then the probability that, if the first occur, the third will also, is found by multiplying together the fractions expressing the probability that the first is a mark of the second and the second of the third. Upon this principle the value of hearsay evidence or tradition deteriorates; and so, generally, does the cogency of any argument based upon the combination of approximate generalisations dependent on one another, or "self-infirmative."
If there are two witnesses, A and B, of whom A saw an event, whilst B only heard A relate it (and is therefore dependent on A), what credit is due to B's recital? Suppose the probability of each man's being correct as to what he says he saw, or heard, is 3/4: then (3/4 × 3/4 = 9/16) the probability that B's story is true is a little more than 1/2. For if in 16 attestations A is wrong 4 times, B can only be right in 3/4 of the remainder, or 9 times in 16. Again, if we have the Approximate Generalisation, 'Most attempts to reduce wages are met by strikes,' and a further generalisation dependent upon it, the probability of the combined inference is found in the same way, by multiplying together the fractions expressing the several generalisations.

Of course this method of calculation cannot be quantitatively applied if no statistics are obtainable, as in the testimony of witnesses; and even if an average numerical value could be attached to the evidence of a certain class of witnesses, it would be absurd to apply it to the evidence of any particular member of the class without taking account of his education, interest in the case, prejudice, or general capacity. Still, the numerical illustration of the rapid deterioration of hearsay evidence, when less than quite veracious, puts us on our guard against rumour. To retail rumour may be as bad as to invent an original lie.

(4) If an event may coincide with two or more other independent events, the probability that they will together be a sign of it is found by multiplying together the fractions representing the improbability that each is a sign of it, and subtracting the product from unity. This is the rule for estimating the cogency of circumstantial evidence and analogical evidence; or, generally, for combining approximate generalisations "self-corroboratively." If, for example, each of two independent circumstances, A and B, indicates a probability of 6 to 1 in favour of a certain event, then, taking 1 to represent certainty, 1 − 6/7 = 1/7 is the improbability of the event notwithstanding either circumstance; and 1/7 × 1/7 = 1/49 is the improbability of the event notwithstanding both. Therefore the probability of the event is 48 to 1.
The matter may be plainer if put thus: A's indication is right 6 times in 7, or 42 in 49; in the remaining 7 times in 49, B's indication will be right 6 times. Therefore, together they will be right 48 times in 49. If each of two witnesses is truthful 6 times in 7, one or the other will be truthful 48 times in 49; but they will not both be truthful oftener than 36 times in 49 (6/7 × 6/7 = 36/49).

If in an analogical argument there were 8 points of comparison, 5 for and 3 against a certain inference, and the probability raised by each point could be quantified, the total value of the evidence might be estimated by doing similar sums for and against, and subtracting the unfavourable from the favourable total. When approximate generalisations that have not been precisely quantified combine their evidence, the cogency of the argument increases in the same way, though it cannot be made so definite. If it be true that most poets are irritable, and also that most invalids are irritable, a still greater proportion will be irritable of those who are both invalids and poets.

On the whole, from the discussion of probabilities there emerge four principal cautions as to their use: Not to make a pedantic parade of numerical probability, where the numbers have not been ascertained; Not to trust to our feeling of what is likely, if statistics can be obtained; Not to apply an average probability to special classes or individuals without inquiring whether they correspond to the average type; and Not to trust to the empirical probability of events, if their causes can be discovered and made the basis of reasoning, which the empirical probability may then be used to verify.

The reader who wishes to pursue this subject further should read a work to which the foregoing chapter is greatly indebted, Dr. Venn's Logic of Chance.
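The numerical illustrations given under rules (1) to (4) above can be checked mechanically. The sketch below reworks the text's own figures with exact fractions; it is a verification of the arithmetic, not an addition to the doctrine.

```python
from fractions import Fraction

# Rule (2): meeting A once in four walks and B once in three.
p_A, p_B = Fraction(1, 4), Fraction(1, 3)
assert p_A * p_B == Fraction(1, 12)            # both together: once in twelve

# The apparition example: 500 in 20,000 see apparitions, 100 in 20,000
# have friends murdered, and a life averages 300,000 hours.
p_both = Fraction(500, 20000) * Fraction(100, 20000)
assert p_both == Fraction(1, 8000)
assert p_both / 300000 == Fraction(1, 2400000000)

# Rule (3), self-infirmative: hearsay at one remove, each reporter
# correct 3 times in 4.
assert Fraction(3, 4) ** 2 == Fraction(9, 16)  # a little more than 1/2

# Rule (4), self-corroborative: two independent circumstances, each
# giving odds of 6 to 1 in favour of the event.
improbability = (1 - Fraction(6, 7)) ** 2      # 1/49
assert 1 - improbability == Fraction(48, 49)   # odds of 48 to 1

# The two witnesses: one or other truthful 48 times in 49, but both
# together no oftener than 36 times in 49.
assert Fraction(6, 7) ** 2 == Fraction(36, 49)

print("all checks pass")
```

Every assertion passes, confirming that the fractions quoted in the text are consistent with the rules they illustrate.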