One of the best ways to clear up notions of what the functions are which schools should develop and improve is to get measures of them. If any given knowledge or skill or power or ideal exists, it exists in some amount. A series of amounts of it, varying from less to more, defines the ability itself in a way that no general verbal description can do. Thus, a series of weights, 1 lb., 2 lb., 3 lb., 4 lb., etc., helps to tell us what we mean by weight. By finding a series of words like only, smoke, another, pretty, answer, tailor, circus, telephone, saucy, and beginning, which are spelled correctly by known and decreasing percentages of children of the same age, or of the same school grade, we know better what we mean by 'spelling-difficulty.' Indeed, until we can measure the efficiency and improvement of a function, we are likely to be vague and loose in our ideas of what the function is.

A SAMPLE MEASUREMENT OF AN ARITHMETICAL ABILITY: THE ABILITY TO ADD INTEGERS

Consider first, as a sample, the measurement of ability to add integers. The following were the examples used in the measurements made by Stone ['08]:
The scoring was as follows: Credit of 1 for each column added correctly. Stone combined measures of other abilities with this in a total score for amount done correctly in 12 minutes. Stone also scored the correctness of the additions in certain work in multiplication. Courtis uses a sheet of twenty-four tasks or 'examples,' each consisting of the addition of nine three-place numbers as shown below. Eight minutes is allowed. He scores the amount done by the number of examples, and also scores the number of examples done correctly, but does not suggest any combination of these two into a general-efficiency score.
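The two Courtis scores described above, amount done and amount done correctly, can be sketched in a few lines. This is only an illustration under stated assumptions, not Courtis's own procedure: the function name `score_courtis`, the use of `None` for an example the pupil did not reach, and the sample sheet are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def score_courtis(answers: Sequence[Optional[int]],
                  key: Sequence[int]) -> Tuple[int, int]:
    """Return (attempts, rights) for one pupil's sheet.

    answers[i] is the pupil's answer to example i, or None if the
    example was not reached in the time allowed (an assumed encoding).
    """
    # Amount done: every example for which an answer was written.
    attempts = sum(1 for a in answers if a is not None)
    # Amount correct: written answers that agree with the key.
    rights = sum(1 for a, k in zip(answers, key)
                 if a is not None and a == k)
    return attempts, rights


# Hypothetical sheet of four examples; the pupil reached three of them.
key = [4821, 5307, 4466, 5012]
answers = [4821, 5317, 4466, None]
attempts, rights = score_courtis(answers, key)
print(attempts, rights)  # 3 2
```

The two numbers are returned side by side and deliberately not merged, since the text notes that Courtis suggests no combination of them into a single efficiency score.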
The author long ago proposed that pupils be measured also with series like a to g shown below, in which the difficulty increases step by step.
Woody ['16] has constructed his well-known tests on this principle, though he uses only one example at each step of difficulty instead of eight or ten as suggested above. His test, so far as addition of integers goes, is:—

SERIES A. ADDITION SCALE (in part)
By Clifford Woody
In his original report, Woody gives no scheme for scoring an individual, wisely assuming that, with so few samples at each degree of difficulty, a pupil's score would be too unreliable for individual diagnosis. The test is reliable for a class; and for a class Woody used the degree of difficulty such that a stated fraction of the class can do the work correctly, if twenty minutes is allowed for the thirty-eight examples of the entire test. The measurement of even so simple a matter as the efficiency of a pupil's responses to these tests in adding integers is really rather complex. There is first of all the problem of combining speed and accuracy into some single estimate. Stone gives no credit for a column unless it is correctly added. Courtis evades the difficulty by reporting both number done and number correct. The author's scheme, which gives specified weights to speed and accuracy at each step of the series, involves a rather intricate computation. This difficulty of equating speed and accuracy in adding means precisely that we have inadequate notions of what the ability is that the elementary school should improve. Until, for example, we have decided whether, for a given group of pupils, fifteen Courtis attempts with ten right, is or is not a better achievement than ten Courtis attempts with nine right, we have not decided just what the business of the teacher of addition is, in the case of that group of pupils. There is also the difficulty of comparing results when short and long columns are used. Correctness with a short column, say of five figures, testifies to knowledge of the process and to the power to do four successive single additions without error. Correctness with a long column, say of ten digits, testifies to knowledge of the process and to the power to do nine successive single additions without error. 
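The comparison of short and long columns can be made concrete with a little arithmetic. The sketch below rests on assumptions that are not in the text: every single addition is right with the same probability `p_single`, errors are independent of one another, and a column counts as right only when every single addition in it is right (so compensating errors are ignored). The function name is hypothetical.

```python
def column_success_rate(p_single: float, figures: int) -> float:
    """Chance that a column of `figures` numbers is added correctly.

    A column of n figures needs n - 1 successive single additions,
    so under the independence assumption the chance is p ** (n - 1).
    """
    if figures < 1:
        raise ValueError("a column needs at least one figure")
    single_additions = figures - 1
    return p_single ** single_additions


# Compare a five-figure column (4 additions) with a ten-figure
# column (9 additions) at several levels of single-addition accuracy.
for p in (0.99, 0.95, 0.90):
    short = column_success_rate(p, 5)
    long_ = column_success_rate(p, 10)
    print(f"p={p:.2f}  five figures: {short:.2f}  ten figures: {long_:.2f}")
```

On these assumptions a pupil who is right in 95 of 100 single additions would get about 81 per cent of five-figure columns right but only about 63 per cent of ten-figure columns, which is why scores from the two kinds of test are not directly comparable.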
Now if a pupil's precision was such that on the average he [...]

Further, in the case of a column of whatever size, the result as ordinarily scored does not distinguish between one, two, three, or more (up to the limit) errors in the single additions. Yet, obviously, a pupil who, adding with ten-digit columns, has half of his answer-figures wrong, probably often makes two or more errors within a column, whereas a pupil who has only one column-answer in ten wrong, probably almost never makes more than one error within a column. A short-column test is then advisable as a means of interpreting the results of a long-column test.

Finally, the choice of a short-column or of a long-column test is indicative of the measurer's notion of the kind of efficiency the world properly demands of the school. Twenty years ago the author would have been readier to accept a long-column test than he now is. In the world at large, long-column addition is being more and more done by machine, though it persists still in great frequency in the bookkeeping of weekly and monthly accounts in local groceries, butcher shops, and the like.

The search for a measure of ability to add thus puts the problem of speed versus precision, and of short-column versus long-column additions clearly before us. The latter [...]

It may be said further that the measurement of ability to add gives the scientific student a shock by the lack of precision found everywhere in schools. Of what value is it to a graduate of the elementary school to be able to add with examples like those of the Courtis test, getting only eight out of ten right? Nobody would pay a computer for that ability. The pupil could not keep his own accounts with it. The supposed disciplinary value of habits of precision runs the risk of turning negative in such a case. It appears, at least to the author, imperative that checking should be taught and required until a pupil can add single columns of ten digits with not over one wrong answer in twenty columns.
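The standard just proposed, not over one wrong answer in twenty ten-digit columns, can be turned into a rough requirement on the single additions. The calculation below is a sketch only. It assumes that errors in single additions are independent and equally likely, that a column is right only when all nine single additions are right, and it leaves out the effect of checking; none of these assumptions comes from the text, and the function name is hypothetical.

```python
def required_single_accuracy(column_accuracy: float, figures: int) -> float:
    """Single-addition accuracy p needed so that p ** (figures - 1)
    equals the desired column accuracy (independence assumed)."""
    if figures < 2:
        raise ValueError("need at least two figures to have an addition")
    return column_accuracy ** (1.0 / (figures - 1))


# Proposed standard: 19 of 20 ten-digit columns right.
target = 19 / 20
p = required_single_accuracy(target, 10)
print(f"required single-addition accuracy: {p:.4f}")
print(f"single additions per error, on average: {1 / (1 - p):.0f}")
```

Under these assumptions the pupil must be right in more than 99 of every 100 single additions, or make roughly one slip in every 175 or so, before the column standard is met without checking.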
Speed is useful, especially indirectly as an indication of control of the separate higher-decade additions, but the social demand for addition below a certain standard of precision is nil, and its disciplinary value is nil or negative. This will be made a matter of further study later.

MEASUREMENTS OF ABILITIES IN COMPUTATION

Measurements of these abilities may be of two sorts—(1) of the speed and accuracy shown in doing one same sort of task, as illustrated by the Courtis test for addition shown on page 28; and (2) of how hard a task can be done perfectly (or with some specified precision) within a certain assigned time or less, as illustrated by the author's rough test for addition shown on pages 28 and 29, and by the Woody tests, when extended to include alternative forms.

The Courtis tests, originated as an improvement on the Stone tests and since elaborated by the persistent devotion of their author, are a standard instrument of the first sort for measuring the so-called 'fundamental' arithmetical [...]

Tests of the second sort are the Woody tests, which include operations with integers, common and decimal fractions, and denominate numbers, the Ballou test for common fractions ['16], and the "Ladder" exercises of the Thorndike arithmetics. Some of these are shown on pages 36 to 41.

Courtis Test
Arithmetic. Test No. 1. Addition
Series B

You will be given eight minutes to find the answers to as many of these addition examples as possible. Write the answers on this paper directly underneath the examples. You are not expected to be able to do them all. You will be marked for both speed and accuracy, but it is more important to have your answers right than to try a great many examples.
and sixteen more addition examples of nine three-place numbers.

Courtis Test
Arithmetic. Test No. 2. Subtraction
Series B

You will be given four minutes to find the answers to as many of these subtraction examples as possible. Write the answers on this paper directly underneath the examples. You are not [...]

and twenty more tasks of the same sort.

Courtis Test
Arithmetic. Test No. 3. Multiplication
Series B

You will be given six minutes to work as many of these multiplication examples as possible. You are not expected to be able to do them all. Do your work directly on this paper; use no other. You will be marked for both speed and accuracy, but it is more important to get correct answers than to try a large number of examples.

and twenty more multiplication examples of the same sort.

Courtis Test
Arithmetic. Test No. 4. Division
Series B

You will be given eight minutes to work as many of these division examples as possible. You are not expected to be able to do them all. Do your work directly on this paper; use no other. You will be marked for both speed and accuracy, but it is more important to get correct answers than to try a large number of examples.

and twenty more division examples of the same sort.

SERIES B. MULTIPLICATION SCALE
By Clifford Woody
SERIES B. DIVISION SCALE
Ballou Test Addition of Fractions
An Addition Ladder [Thorndike, '17, III, 5] Begin at the bottom of the ladder. See if you can climb to the top without making a mistake. Be sure to copy the numbers correctly.
A Subtraction Ladder [Thorndike, '17, III, 11]
An Average Ladder [Thorndike, '17, III, 132] Find the average of the quantities on each line. Begin with Step 1. Climb to the top without making a mistake. Be sure to copy the numbers correctly. Extend the division to two decimal places if necessary.
As such tests are widened to cover the whole task of the elementary school in respect to arithmetic, and accepted by competent authorities as adequate measures of achievement in computing, they will give, as has been said, a working definition of the task. The reader will observe, for example, that work such as the following, though still found in many textbooks and classrooms, does not, in general, appear in the modern tests and scales.

Reduce the following improper fractions to mixed numbers:—
19/13   43/21   176/25   198/14

Reduce to integral or mixed numbers:—
61381/37   2134/67   413/413   697/225

Simplify:—
3/4 of 8/9 of 3/5 of 15/22

Reduce to lowest terms:—
357/527   264/312   492/779   418/874   854/1769
30/735   44/242   77/847   18/243   96/224

Find differences:—
Square:—
2/3   4/5   5/7   6/9   10/11   12/13
2/7   15/16   19/20   17/18   25/30   41/53

Multiply:—
2/11 × 33   32 × 3/14   39 × 2/13
60 × 11/28   77 × 4/11   63 × 2/27

MEASUREMENTS OF ABILITY IN APPLIED ARITHMETIC: THE SOLUTION OF PROBLEMS

Stone ['08] measured achievement with the following problems, fifteen minutes being the time allowed.

"Solve as many of the following problems as you have time for; work them in order as numbered:

1. If you buy 2 tablets at 7 cents each and a book for 65 cents, how much change should you receive from a two-dollar bill?
2. John sold 4 Saturday Evening Posts at 5 cents each. He kept 1/2 the money and with the other 1/2 he bought Sunday papers at 2 cents each. How many did he buy?
3. If James had 4 times as much money as George, he would have $16. How much money has George?
4. How many pencils can you buy for 50 cents at the rate of 2 for 5 cents?
5. The uniforms for a baseball nine cost $2.50 each. The shoes cost $2 a pair. What was the total cost of uniforms and shoes for the nine?
6. In the schools of a certain city there are 2200 pupils; 1/2 are in the primary grades, 1/4 in the grammar grades, 1/8 in the high school, and the rest in the night school. How many pupils are there in the night school?
7. If 3½ tons of coal cost $21, what will 5½ tons cost?
8. A news dealer bought some magazines for $1. He sold them for $1.20, gaining 5 cents on each magazine. How many magazines were there?
9. A girl spent 1/8 of her money for car fare, and three times as much for clothes. Half of what she had left was 80 cents. How much money did she have at first?
10. Two girls receive $2.10 for making buttonholes. One makes 42, the other 28. How shall they divide the money?
11. Mr. Brown paid one third of the cost of a building; Mr. Johnson paid 1/2 the cost. Mr. Johnson received $500 more annual rent than Mr. Brown. How much did each receive?
12. A freight train left Albany for New York at 6 o'clock. An express left on the same track at 8 o'clock. It went at the rate of 40 miles an hour. At what time of day will it overtake the freight train if the freight train stops after it has gone 56 miles?"

The criteria he had in mind in selecting the problems were as follows:—

"The main purpose of the reasoning test is the determination of the ability of VI A children to reason in arithmetic. To this end, the problems, as selected and arranged, are meant to embody the following conditions:—

1. Situations equally concrete to all VI A children.
2. Graduated difficulties.
3. The omission of [...]

The test is purposely so long that only very rarely did any pupil fully complete it in the fifteen minute limit."

Credits were given of 1, for each of the first five problems, 1.4, 1.2, and 1.6 respectively for problems 6, 7, and 8, and of 2 for each of the others.

Courtis sought to improve the Stone test of problem-solving, replacing it by the two tests reproduced below.

ARITHMETIC—Test No. 6. Speed Test—Reasoning

Do not work the following examples. Read each example through, make up your mind what operation you would use if you were going to work it, then write the name of the operation selected in the blank space after the example. Use the following abbreviations:—"Add." for addition, "Sub." for subtraction, "Mul." for multiplication, and "Div." for division.
(Two more similar problems follow.)

Test 6 and Test 8 are from the Courtis Standard Test. Used by permission of S. A. Courtis.

ARITHMETIC—Test No. 8. Reasoning

In the blank space below, work as many of the following examples as possible in the time allowed. Work them in order as numbered, entering each answer in the "answer" column before commencing a new example. Do not work on any other paper.
These proposed measures of ability to apply arithmetic illustrate very nicely the differences of opinion concerning what applied arithmetic and arithmetical reasoning should be. The thinker who emphasizes the fact that in life out of school the situation demanding quantitative treatment is usually real rather than described, will condemn a test all of whose constituents are described problems. Unless we are excessively hopeful concerning the transfer of ideas of method and procedure from one mental function to another we shall protest against the artificiality of No. 3 of the Stone series, and of the entire Courtis Test 8 except No. 4. The Courtis speed-reasoning test (No. 6) is a striking example of the mixture of ability to understand quantitative relations with the ability to understand words. Consider these five, for example, in comparison with the revised versions attached.

1. The children of a school gave a sleigh-ride party. There were 9 sleighs, and each sleigh held 30 children. How many children were there in the party?
Revision. If one sleigh holds 30 children, 9 sleighs hold .... children.

2. Two school-girls played a number-game. The score of the girl that lost was 57 points and she was beaten by 16 points. What was the score of the girl that won?
Revision. Mary and Nell played a game. Mary had a score of 57. Nell beat Mary by 16. Nell had a score of ....

3. A girl counted the automobiles that passed a school. The total was 60 in two hours. If the girl saw 27 pass the first hour how many did she see the second?
Revision. In two hours a girl saw 60 automobiles. She saw 27 the first hour. She saw .... the second hour.

4. On a playground there were five equal groups of children each playing a different game. If there were 75 children all together, how many were there in each group?
Revision. 75 pounds of salt just filled five boxes. The boxes were exactly alike. There were .... pounds in a box.

5. A teacher weighed all the children in a certain grade. One girl weighed 70 pounds. Her older sister was 49 pounds heavier. How many pounds did the sister weigh?
Revision. Mary weighs 70 lb. Jane weighs 49 pounds more than Mary. Jane weighs .... pounds.

The distinction between a problem described as clearly and simply as possible and the same problem put awkwardly or in ill-known words or willfully obscured should be regarded; and as a rule measurements of ability to apply arithmetic should eschew all needless obscurity or purely linguistic difficulty. For example,

A boy bought a two-cent stamp. He gave the man in the store 10 cents. The right change was .... cents.

is better as a test than

If a boy, purchasing a two-cent stamp, gave a ten-cent stamp in payment, what change should he be expected to receive in return?

The distinction between the description of a bona fide problem that a human being might be called on to solve out of school and the description of imaginary possibilities or puzzles should also be considered. Nos. 3 and 9 of Stone are bad because to frame the problems one must first know the answers, so that in reality there could never be any point in solving them. It is probably safe to say that nobody in the world ever did or ever will or ever should find the number of apples in a box by the task of No. 4 of the Courtis Test 8.

This attaches no blame to Dr. Stone or to Mr. Courtis. Until very recently we were all so used to the artificial problems of the traditional sort that we did not expect anything better; and so blind to the language demands of described problems that we did not see their very great influence. Courtis himself has been active in reform and has pointed out ('13, p. 4 f.) the defects in his Tests 6 and 8.

"Tests Nos. 6 and 8, the so-called reasoning tests, have proved the least satisfactory of the series.
The judgments of various teachers and superintendents as to the inequalities of the units in any one test, and of the differences between the different editions of the same test, have proved the need of investigating these questions. Tests of adults in many lines of commercial work have yielded in many cases lower scores than those of the average eighth grade children. At the same time the scores of certain individuals of marked ability have been high, and there appears to be a general relation between ability in these tests and accuracy in the abstract work. The most significant facts, however, have been the difficulties experienced by teachers in attempting to remedy the defects in reasoning. It is certain that the tests measure abilities of value but the abilities are probably not what they seem to be. In an attempt to measure the value of different units, for instance, as many problems as possible were constructed based upon a single situation. Twenty-one varieties were secured by varying the relative form of the question and the relative position of the different phrases. One of these proved nineteen times as hard as another as measured by the number of mistakes made by the children; yet the cause of the difference was merely the changes in the phrasing. This and other facts of the same kind seem to show that Tests 6 and 8 measure mainly the ability to read."

The scientific measurement of the abilities and achievements concerned with applied arithmetic or problem-solving is thus a matter for the future. In the case of described problems a beginning has been made in the series which form a part of the National Intelligence Tests ['20], one of which is shown on page 49 f. In the case of problems with real situations, nothing in systematic form is yet available.

Systematic tests and scales, besides defining the abil[...]

National Intelligence Tests. TEST 1

Find the answers as quickly as you can. Begin here
From National Intelligence Tests by National Research Council.