
yule theory in statistics, Essays (university) of Statistics

Central University of South Bihar Statistics

notes for yules theory student help guide

Typology: Essays (university)

2016/2017

Uploaded on 11/16/2017

vasu-chanda 🇮🇳




VOLUME II FEBRUARY, 1903 No. 2

NOTES ON THE THEORY OF ASSOCIATION OF ATTRIBUTES IN STATISTICS.

BY G. UDNY YULE.

CONTENTS.

Introductory
Notation; terminology; tabulation, etc.
Consistence and inference
Association
On the theory of complete independence of a series of Attributes
On the fallacies that may be caused by the mixing of distinct records

THE simplest possible form of statistical classification is "division" (as the logicians term it) "by dichotomy," i.e. the sorting of the objects or individuals observed into one or other of two mutually exclusive classes according as they do or do not possess some character or attribute; as one may divide men into sane and insane, the members of a species of plants into hairy and glabrous, or the members of a race of animals into males and females. The mere fact that we do employ such a classification in any case must not of course be held to imply a natural and clearly defined boundary between the two classes; e.g. sanity and insanity, hairiness and glabrousness, may pass into each other by such fine gradations that judgments may differ as to the class in which a given individual should be entered. The judgment must however be finally decisive; intermediates not being classed as such even when observed.

The theory of statistics of this kind is of a good deal of importance, not merely because they are of a fairly common type (the statistics of hybridisation experiments given by the followers of Mendel may be cited as recent examples) but because the ideas and conceptions required in such theory form a useful introduction to the more complex and less purely logical theory of variables. The classical writings on the subject are those of De Morgan*, Boole† and Jevons‡, the method and notation of the latter being used in the following Notes, the first three sections of which are an abstract of the two memoirs referred to below§.

* Formal Logic, chap. VIII., "On the Numerically Definite Syllogism," 1847.
† Analysis of Logic, 1847. Laws of Thought, 1854.
‡ "On a General System of Numerically Definite Reasoning," Memoirs of the Manchester Literary and Philosophical Society, 1870. Reprinted in Pure Logic and other Minor Works, Macmillan, 1890.
§ "On the Association of Attributes in Statistics," Phil. Trans. A, Vol. 194 (1900), p. 257. "On the theory of Consistence of Logical Class Frequencies," Phil. Trans. A, Vol. 197 (1901), p. 91.


Notation; terminology; relations between the class frequencies; tabulation.

The notation used is as follows*:

N = total number of observations,
(A) = no. of objects or individuals possessing attribute A,
(α) = no. of objects or individuals not possessing attribute A,
(AB) = no. of objects or individuals possessing both attributes A and B,
(Aβ) = no. of objects or individuals possessing attribute A but not B,
(αB) = no. of objects or individuals possessing attribute B but not A,
(αβ) = no. of objects or individuals not possessing either attribute A or B,

and so on for as many attributes as are specified. A class specified by n attributes in this notation may be termed a class of the nth order. The attributes denoted by English capitals may be termed positive attributes, and their contraries, denoted by the Greek letters, negative attributes. If two classes are such that every attribute in the one is the negative or contrary of the corresponding attribute in the other they may be termed contrary classes, and their frequencies contrary frequencies; (AB) and (αβ), (ABγ) and (αβC) are for instance pairs of contraries.

If the complete series of frequencies arrived at by noting n attributes is being tabulated, frequencies of the same order should be kept together. Those of the same order are best arranged by taking separately the set or "aggregate" of frequencies derivable from each positive class by substituting negatives for one or more of the positive attributes. Thus the frequencies for the case of three attributes may conveniently be tabulated in the order:

Order 0. N
Order 1. (A), (α) : (B), (β) : (C), (γ)
Order 2. (AB), (Aβ), (αB), (αβ) : (AC), (Aγ), (αC), (αγ) : (BC), (Bγ), (βC), (βγ)
Order 3. (ABC), (αBC), (AβC), (ABγ), (αβC), (αBγ), (Aβγ), (αβγ)

But since all frequencies are used non-exclusively, (A) denoting the frequency of objects possessing the attribute A with or without others and so forth, the frequency of any class can always be expressed in terms of the frequencies of classes of higher order; that is to say we have

$$
\begin{aligned}
N &= (A) + (\alpha) = \text{etc.} \\
  &= (AB) + (A\beta) + (\alpha B) + (\alpha\beta) = \text{etc.} \\
(A) &= (AB) + (A\beta) = \text{etc.} \\
    &= (ABC) + (AB\gamma) + (A\beta C) + (A\beta\gamma) = \text{etc.}
\end{aligned}
\tag{1}
$$

* I have substituted small Greek letters for Jevons' italics. Italics are rather troublesome when speaking, as one has to spell out a group like AbcDE, "big A, little b, little c, big D, big E." It is simpler to say AβγDE. The Greek become more troublesome when many letters are wanted, owing to the non-correspondence of the alphabets, but this is not often of consequence.
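The non-exclusive use of frequencies can be illustrated with a minimal Python sketch. It assumes three attributes and a made-up table of third-order (ultimate) frequencies; the function name `class_frequency` and the counts are hypothetical, chosen only for illustration. Any lower-order frequency is obtained by summing the ultimate frequencies compatible with it.

```python
ATTRIBUTES = ("A", "B", "C")

def class_frequency(ultimate, spec):
    """Frequency of a class such as (A-beta), given as {"A": True, "B": False}.

    `ultimate` maps (A present?, B present?, C present?) to a count.
    Attributes not named in `spec` are unspecified, so both their positive
    and negative cases are summed (frequencies are non-exclusive).
    """
    total = 0
    for key, count in ultimate.items():
        if all(key[ATTRIBUTES.index(name)] == present
               for name, present in spec.items()):
            total += count
    return total

# Hypothetical third-order counts, keyed by (A, B, C) presence.
ultimate = {
    (True, True, True): 12, (False, True, True): 8,
    (True, False, True): 10, (True, True, False): 15,
    (False, False, True): 20, (False, True, False): 5,
    (True, False, False): 9, (False, False, False): 21,
}

N = class_frequency(ultimate, {})                        # order 0
A = class_frequency(ultimate, {"A": True})               # order 1
AB = class_frequency(ultimate, {"A": True, "B": True})   # order 2
A_beta = class_frequency(ultimate, {"A": True, "B": False})

# (A) = (AB) + (A-beta), as in the relations between class frequencies.
assert A == AB + A_beta
```

Storing only the ultimate frequencies is one possible design; the positive-class frequencies would serve equally well as the stored data.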


Consistence and Inference.

Although the positive-class frequencies (including N under that heading) are all independent in the sense that no single one can be expressed in terms of the others, they are nevertheless subject to certain limiting conditions if they are to be self-consistent, i.e. such as might have been observed in one and the same field of observation or "universe," to use the convenient term of the logicians. Consider the case of three attributes, for example. It is evident that we must have

$$
\begin{array}{lll}
(AB) & \not< 0 & \text{as } (AB) \text{ must not be negative} \\
     & \not< (A) + (B) - N & \text{as } (\alpha\beta) \text{ must not be negative} \\
     & \not> (A) & \text{as } (A\beta) \text{ must not be negative} \\
     & \not> (B) & \text{as } (\alpha B) \text{ must not be negative}
\end{array}
\tag{4}
$$

and similar conditions must hold for (AC) and (BC). But these are not the only conditions that must hold. The second-order frequencies must not only be such as not to imply negative values for the frequencies of other classes of their own aggregates, but also must not imply negative values for any of the third-order frequencies. Expanding all the third-order frequencies in terms of the frequencies of positive classes, and putting the resulting expansion ≮ 0, we have

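$$
\begin{array}{llll}
 & & \text{or the frequency given below will be negative} & \\
(ABC) & \not< 0 & (ABC) & [1] \\
      & \not< (AB) + (AC) - (A) & (A\beta\gamma) & [2] \\
      & \not< (AB) + (BC) - (B) & (\alpha B\gamma) & [3] \\
      & \not< (AC) + (BC) - (C) & (\alpha\beta C) & [4] \\
      & \not> (AB) & (AB\gamma) & [5] \\
      & \not> (AC) & (A\beta C) & [6] \\
      & \not> (BC) & (\alpha BC) & [7] \\
      & \not> (AB) + (AC) + (BC) - (A) - (B) - (C) + N & (\alpha\beta\gamma) & [8]
\end{array}
$$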

But if any one of the minor limits [1] to [4] be greater than any one of the major limits [5] to [8] these conditions are impossible of fulfilment. There are four minor limits to be compared with four major limits, or sixteen comparisons in all to be made; but the majority of these, twelve in all, only lead back to conditions of the form (4). The four comparisons of expansions due to contrary frequencies alone lead to new conditions, viz.

$$
\begin{aligned}
(AB) + (AC) + (BC) &\not< (A) + (B) + (C) - N \\
(AB) + (AC) - (BC) &\not> (A) \\
(AB) - (AC) + (BC) &\not> (B) \\
-(AB) + (AC) + (BC) &\not> (C)
\end{aligned}
$$
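The comparison of minor and major limits can be sketched in Python. This is only an illustration under stated assumptions: the function names `abc_limits` and `is_consistent` and the numerical values are hypothetical, and the limits are the eight expressions [1] to [8] for the three-attribute case.

```python
def abc_limits(N, A, B, C, AB, AC, BC):
    """Return (greatest minor limit, least major limit) for (ABC)."""
    minor = [0, AB + AC - A, AB + BC - B, AC + BC - C]          # [1] to [4]
    major = [AB, AC, BC, AB + AC + BC - A - B - C + N]          # [5] to [8]
    return max(minor), min(major)

def is_consistent(N, A, B, C, AB, AC, BC):
    """True when no minor limit exceeds any major limit."""
    lower, upper = abc_limits(N, A, B, C, AB, AC, BC)
    return lower <= upper

# A consistent set of frequencies: (ABC) may lie anywhere from 7 to 20.
print(abc_limits(100, 46, 40, 50, 27, 22, 20))    # (7, 20)

# Each pair below satisfies the second-order conditions taken alone,
# yet the three together are inconsistent: (AB) + (AC) - (BC) exceeds (A).
print(is_consistent(100, 50, 50, 50, 40, 40, 5))  # False
```

Comparing the greatest minor limit with the least major limit covers all sixteen comparisons at once, including the twelve that only reproduce the second-order conditions.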


These conditions give limits to any one of the three frequencies (AB), (AC) and (BC) in terms of the other two and the frequencies of the first order, i.e. enable us to infer limits to the one class-frequency in terms of the others. It will very usually happen in practical statistical cases that the limits so obtained are valueless, lying outside those given by the simpler conditions (4), but that is merely because in practice the values of the assigned frequencies, e.g. (AB) and (AC), seldom approach sufficiently closely to their limiting values to render inference possible.

Association.

Two attributes, A and B, are usually defined to be independent, within any given field of observation or "universe," when the chance of finding them together is the product of the chances of finding either of them separately. The physical meaning of the definition seems rather clearer in a different form of statement, viz. if we define A and B to be independent when the proportion of A's amongst the B's of the given universe is the same as in that universe at large. If for instance the question were put "What is the test for independence of small-pox attack and vaccination?", the natural reply would be "The percentage of vaccinated amongst the attacked should be the same as in the general population" or "The percentage of attacked amongst the vaccinated should be the same as in the general population." The two definitions are of course identical in effect, and permit of the same simple symbolical expression in our notation; the criterion of independence of A and B is in fact

$$(AB) = \frac{(A)(B)}{N}. \tag{7}$$

In this equation the attributes specifying the universe are understood, not expressed. If all objects or individuals in the universe are to possess an attribute or series of attributes K it may be written

$$(ABK) = \frac{(AK)(BK)}{(K)}.$$

An equation of such form must be recognised as the criterion of independence for A and B within the universe K. As I have shewn in the first memoir referred to in note §, p. 121, if the relation (7) hold good, the three similar relations for the remaining frequencies of the "aggregate" (i.e. the set of frequencies obtained by substituting their contraries α, β for A or B or both) must also hold, viz.

$$(A\beta) = \frac{(A)(\beta)}{N}, \qquad (\alpha B) = \frac{(\alpha)(B)}{N}, \qquad (\alpha\beta) = \frac{(\alpha)(\beta)}{N}.$$
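That the criterion need only be tested on one frequency of the aggregate can be checked numerically. The sketch below is illustrative only: the function names and the counts are hypothetical, and the comparison is done in integers (multiplying through by N) to avoid rounding.

```python
def aggregate(N, A, B, AB):
    """The four second-order frequencies of the AB aggregate."""
    return {"AB": AB, "Aβ": A - AB, "αB": B - AB, "αβ": N - A - B + AB}

def independence_holds(N, A, B, AB):
    """For each class of the aggregate, test frequency * N == product of first-order frequencies."""
    alpha, beta = N - A, N - B
    table = aggregate(N, A, B, AB)
    products = {"AB": A * B, "Aβ": A * beta, "αB": alpha * B, "αβ": alpha * beta}
    return {name: table[name] * N == products[name] for name in table}

# (AB) = (A)(B)/N = 40 * 30 / 100 = 12: the criterion holds for all four classes.
print(independence_holds(100, 40, 30, 12))

# Any other value of (AB) breaks it for all four classes together.
print(independence_holds(100, 40, 30, 15))
```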


This point is frequently forgotten. In an investigation as to the inheritance of deaf-mutism in America*, for instance, only the offspring of deaf-mutes were observed, and the argument consequently breaks down on page after page into conjectural statements as to points on which the editor has no information, e.g. the proportion of deaf-mutes amongst the children of normals.

The difference of (AB)/(A) from (B)/N and of (AB)/(B) from (A)/N are of course not, as a rule, the same, and it would be useful and convenient to measure the "association" by some more symmetrical method: a "coefficient of association" ranging between ± 1 like the coefficient of correlation. In the first memoir referred to in note §, p. 121, such a coefficient, of empirical form, was suggested, but that portion of the memoir should now be read in connection with a later memoir by Professor Pearson†.

On the theory of complete independence of a series of Attributes.

The tests for independence are by no means simple when the number of attributes is more than two. Under what circumstances should we say that a series of attributes ABCD... were completely independent? I believe not a few statisticians would reply at once "if the chance of finding them together were equal to the product of the chances of finding them separately," yet such a reply would be in error. The mere result

$$\frac{(ABCD\ldots)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}\cdot\frac{(C)}{N}\cdot\frac{(D)}{N}\cdots \tag{9}$$

does not in general give any information as to the independence or otherwise of the attributes concerned. If the attributes are known to be completely independent then certainly the relation (9) holds good, but the converse is not true. "Equations of independence" of the form (9) must be shewn to hold for more than one class of any aggregate, of an order higher than the second, before the complete independence of the attributes can be inferred.
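A small numerical counterexample may make the point concrete. The table below is hypothetical (eight observations, three attributes) and the helper `freq` is introduced only for this sketch: relation (9) holds for the class (ABC), yet A and B are plainly associated in the universe at large.

```python
# Hypothetical ultimate frequencies keyed by (A, B, C) presence; N = 8.
ultimate = {
    (True, True, True): 1, (True, True, False): 2,
    (True, False, True): 1, (True, False, False): 0,
    (False, True, True): 1, (False, True, False): 0,
    (False, False, True): 1, (False, False, False): 2,
}

def freq(table, a=None, b=None, c=None):
    """Sum the counts compatible with the specified attributes (None = unspecified)."""
    wanted = (a, b, c)
    return sum(count for key, count in table.items()
               if all(w is None or w == k for w, k in zip(wanted, key)))

N = freq(ultimate)
A, B, C = freq(ultimate, a=True), freq(ultimate, b=True), freq(ultimate, c=True)
ABC = freq(ultimate, a=True, b=True, c=True)
AB = freq(ultimate, a=True, b=True)

# Relation (9) for the single class (ABC): (ABC) * N^2 == (A)(B)(C).
print(ABC * N**2 == A * B * C)   # True

# But A and B are not independent: (AB) * N != (A)(B).
print(AB * N == A * B)           # False
```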

From the physical point of view complete independence can only be said to subsist for a series of attributes ABCD... within a given universe, when every pair of such attributes exhibits independence not only within the universe at large but also in every sub-universe specified by one or more of the remaining attributes of the series, or their contraries. Thus three attributes A, B, C are completely independent within a given universe if AB, AC and BC are independent within that universe and also

AB independent within the universes C and γ,
AC independent within the universes B and β,
BC independent within the universes A and α.

* Marriages of the Deaf in America, ed. by E. A. Fay. Volta Bureau, Washington, 1898.
† Phil. Trans. Vol. 195, p. 16.


If a series of attributes are completely independent according to this definition, relations of the form (9) must hold for the frequency of every class of every possible order. Take the class-frequency (ABCD) of the fourth order for instance. A and B are, by the terms of the definition, independent within the universe CD. Therefore

$$(ABCD) = \frac{(ACD)(BCD)}{(CD)}.$$

But A and C, and also B and C, are independent within the universe D. Therefore the fraction on the right is equal to

$$\frac{1}{(CD)}\cdot\frac{(AD)(CD)}{(D)}\cdot\frac{(BD)(CD)}{(D)} = \frac{(AD)(BD)(CD)}{(D)^2}.$$

But again AD, BD, CD are each independent within the universe at large; therefore finally

$$\frac{(ABCD)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}\cdot\frac{(C)}{N}\cdot\frac{(D)}{N}.$$

Any other frequency can be reduced step by step in precisely the same way.

Now consider the converse problem. The total frequency N is given and also the n frequencies (A), (B), (C), etc. In how many of the ultimate frequencies (ABCD...MN), (αBCD...MN), etc. must "relations of independence" of the form

$$\frac{(ABCD\ldots MN)}{N} = \frac{(A)}{N}\cdot\frac{(B)}{N}\cdot\frac{(C)}{N}\cdots\frac{(M)}{N}\cdot\frac{(N)}{N}$$

hold good, in order that complete independence of the attributes may be inferred? The answer is suggested at once by the following consideration. The number of ultimate frequencies (frequencies of order n) is 2^n; the number of frequencies given is n + 1. If then all but n + 1 of the ultimate frequencies are given in terms of the equations of independence, the remaining frequencies are determinate; either these determinate values must be those that would be given by equations of independence, or a state of complete independence must be impossible.

Suppose all the ultimate class-frequencies to have been tested and found to be given by the equations of independence, with the exception of the negative class (αβγδ...μν) and the n classes with one positive attribute (Aβγδ...μν), (αBγδ...μν), etc. Take any one of these untested class-frequencies, (Aβγδ...μν), and we have for example

$$
\begin{aligned}
(A\beta\gamma\delta\ldots\mu\nu) = (A) &- (ABCD\ldots MN) \\
 &- (ABCD\ldots M\nu) - \text{other terms with one negative} \\
 &- (ABCD\ldots\mu\nu) - \text{other terms with two negatives} \\
 &\;\;\vdots \\
 &- (AB\gamma\delta\ldots\mu\nu) - \text{other terms with } n-2 \text{ negatives.}
\end{aligned}
$$


Replacing (C) by N − (γ) and regrouping in similar pairs of terms containing (D) and (δ) this will become

[equation not legible in the source]

and continuing the same process until all the frequencies (D), (E), ... (M), (N) are eliminated, i.e. n − 1 times altogether,

$$(A\beta\gamma\delta\ldots\mu\nu) = \frac{(A)(\beta)(\gamma)(\delta)\cdots(\mu)(\nu)}{N^{\,n-1}}.$$

That is to say the theorem must be true quite generally: "A series of n attributes ABC...MN are completely independent if the relations of independence are proved to hold for (2^n − n − 1) of the 2^n ultimate frequencies; such relations must then hold for the remaining n + 1 frequencies also." If the ultimate frequencies are only given by the relations of independence in n cases or less, independence may exist for certain pairs of attributes in certain universes but not in general. The mere fact of the relation holding for one class, e.g.

$$(ABC\ldots MN) = \frac{(A)(B)(C)\cdots(M)(N)}{N^{\,n-1}},$$

implies nothing, in striking contrast to the simple case of two attributes, where 2^n − n − 1 = 1 and only the one class-frequency need be tested in order to see if independence exists. In the case of three attributes the number of third-order classes is eight, of which four must be tested in order to be certain that complete independence exists. In the case of four attributes there are sixteen fourth-order classes of which eleven must be tested, and so on.
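The counts quoted above (one class for two attributes, four for three, eleven for four) follow from 2^n − (n + 1). A short Python sketch, with hypothetical function names, also confirms that the same number arises from listing the positive classes of the second and higher orders:

```python
from itertools import combinations

def classes_to_test(n):
    """Ultimate frequencies that must satisfy the relation of independence: 2^n - (n + 1)."""
    return 2**n - n - 1

def positive_classes_from_order_two(attributes):
    """All positive classes of order 2 and above, e.g. AB, AC, BC, ABC."""
    return ["".join(group)
            for order in range(2, len(attributes) + 1)
            for group in combinations(attributes, order)]

for n in (2, 3, 4):
    print(n, classes_to_test(n))   # 2 1, then 3 4, then 4 11

print(positive_classes_from_order_two("ABC"))   # ['AB', 'AC', 'BC', 'ABC']
```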

I have dealt with the problem hitherto on the assumption that only the first-order and the nth order frequencies were given, and that the frequencies of intermediate orders were unknown, or at least uncalculated, for of course the frequencies of all lower orders may be expressed in terms of those of the nth order. If however the frequencies of all orders may be supposed known, the above result may be thrown into a somewhat interesting form. It will be remembered that the frequency of any class of any order may be expressed in terms of the frequencies of the positive classes [(A) (AB) (AC) (ABC) etc.] of its own and lower orders. Then complete independence exists for a series of attributes if the criterion of independence hold for all the positive-class frequencies up to that of the nth order. If we have for instance

$$(ABCD\ldots MN) = \frac{1}{N^{\,n-1}}\,(A)(B)(C)(D)\cdots(M)(N)$$

and also

$$(BCD\ldots MN) = \frac{1}{N^{\,n-2}}\,(B)(C)(D)\cdots(M)(N)$$

we must have

$$
\begin{aligned}
(\alpha BCD\ldots MN) &= (BCD\ldots MN) - (ABCD\ldots MN) \\
 &= \frac{1}{N^{\,n-1}}\,(B)(C)(D)\cdots(M)(N)\,\{N - (A)\} \\
 &= \frac{1}{N^{\,n-1}}\,(\alpha)(B)(C)(D)\cdots(M)(N)
\end{aligned}
$$

and so on. The number of class-frequencies to be tested in order to demonstrate the existence of complete independence is, of course, the same as before, viz.

$$2^n - n - 1.$$

It should be noted as a consequence of these results that the definition of "complete independence" given on p. 127 is redundant in its terms. It is quite true that if complete independence subsist for a series of attributes every possible pair must exhibit independence in every possible sub-universe as well as in the universe at large, but it is not necessary to apply the criterion of independence to all these possible cases. In the case of three attributes for instance the criterion of independence need only be applied to four frequencies, as we have just seen, in order to demonstrate complete independence; it cannot then be necessary, as suggested by the definition, to test nine different associations, viz.

|AB|, |AC|, |BC|,
|AB|C|, |AC|B|, |BC|A|,
|AB|γ|, |AC|β|, |BC|α|,

in the notation of my memoir on Association (an expression like |AB|C| specifying "the association between A and B in the universe of C's"). It is in fact only necessary to test |AB|, |AC|, |BC|, and |AB|C| (or one of the other three partial associations in positive universes). If these are zero, the remaining associations must be zero also; for we are given

$$(ABC) = \frac{1}{(C)}\,(AC)(BC) = \frac{1}{N^2}\,(A)(B)(C) = \frac{1}{(B)}\,(AB)(BC) = \frac{1}{(A)}\,(AB)(AC),$$

i.e. |AC|B|, |BC|A|, etc. are zero. Quite generally, it is only necessary, if the testing be supposed to proceed from the second order classes upwards, to test one of all the possible partial associations corresponding to each positive class. If there be four attributes ABCD, the six total associations |AB|, |AC|, |AD|, |BC|, ...


negative value we cannot be sure that nevertheless |AB|C| and |AB|γ| are not both zero. Some given attribute might, for instance, be inherited neither in the male line nor the female line; yet a mixed record might exhibit a considerable apparent inheritance. Suppose for instance that 50 % of the fathers and of the sons exhibit the attribute, but only 10 % of the mothers and daughters. Then if there be no inheritance in either line of descent the record must give (approximately)

fathers with attribute      and sons with attribute           25 %
fathers with attribute      and sons without attribute        25 %
fathers without attribute   and sons with attribute           25 %
fathers without attribute   and sons without attribute        25 %

mothers with attribute      and daughters with attribute       1 %
mothers with attribute      and daughters without attribute    9 %
mothers without attribute   and daughters with attribute       9 %
mothers without attribute   and daughters without attribute   81 %

If these two records be mixed in equal proportions we get

parents with attribute      and offspring with attribute      13 %
parents with attribute      and offspring without attribute   17 %
parents without attribute   and offspring with attribute      17 %
parents without attribute   and offspring without attribute   53 %

Here 13/30 = 43⅓ % of the offspring of parents with the attribute possess the attribute themselves, but only 30 % of offspring in general, i.e. there is quite a large but illusory inheritance created simply by the mixture of the two distinct records. A similar illusory association, that is to say an association to which the most obvious physical meaning must not be assigned, may very probably occur in any other case in which different records are pooled together or in which only one record is made of a lot of heterogeneous material.
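The arithmetic of the mixed record can be reproduced with a short Python sketch. The function name `independent_record` is hypothetical; exact fractions are used so that 13/30 comes out without rounding.

```python
from fractions import Fraction

def independent_record(p_parent, p_offspring):
    """Proportions of the four classes when parent and offspring attribute are independent."""
    return {
        ("with", "with"): p_parent * p_offspring,
        ("with", "without"): p_parent * (1 - p_offspring),
        ("without", "with"): (1 - p_parent) * p_offspring,
        ("without", "without"): (1 - p_parent) * (1 - p_offspring),
    }

# 50 % of fathers and sons, 10 % of mothers and daughters exhibit the attribute.
male = independent_record(Fraction(1, 2), Fraction(1, 2))
female = independent_record(Fraction(1, 10), Fraction(1, 10))

# Mix the two records in equal proportions.
pooled = {cell: (male[cell] + female[cell]) / 2 for cell in male}

parents_with = pooled[("with", "with")] + pooled[("with", "without")]
offspring_with = pooled[("with", "with")] + pooled[("without", "with")]
rate_given_parent_with = pooled[("with", "with")] / parents_with

print(pooled[("with", "with")])    # 13/100
print(rate_given_parent_with)      # 13/30
print(offspring_with)              # 3/10
```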

Consider the case quite generally. Given that |AB|C| and |AB|γ| are both zero, find the value of (AB). From the data we have at once

$$(AB\gamma) = \frac{(A\gamma)(B\gamma)}{(\gamma)} = \frac{[(A) - (AC)]\,[(B) - (BC)]}{N - (C)},$$

$$(ABC) = \frac{(AC)(BC)}{(C)}.$$

Adding

$$(AB) = \frac{N(AC)(BC) - (A)(C)(BC) - (B)(C)(AC) + (A)(B)(C)}{(C)\,[N - (C)]}.$$


Write

$$(AB)_0 = \frac{1}{N}(A)(B), \qquad (AC)_0 = \frac{1}{N}(A)(C), \qquad (BC)_0 = \frac{1}{N}(B)(C),$$

subtract (AB)₀ from both sides of the above equation, simplify, and we have

$$(AB) - (AB)_0 = \frac{N\,[(AC) - (AC)_0]\,[(BC) - (BC)_0]}{(C)\,[N - (C)]}.$$

That is to say, there will be apparent association between A and B in the universe at large unless either A or B is independent of C. Thus, in the imaginary case of inheritance given above, if A and B stand for the presence of the attribute in the parents and the offspring respectively, and C for the male sex, we find a positive association between A and B in the universe at large (the pooled results) because A and B are both positively associated with C, i.e. the males of both generations possess the attribute more frequently than the females. The "parents with attribute" are mostly males; as we have only noted offspring of the same sex as the parents, their offspring must be mostly males in the same proportion, and therefore more liable to the attribute than the mostly-female offspring of "parents without attribute." It follows obviously that if we had found no inheritance to exist in any one of the four possible lines of descent (male-male, male-female, female-male, and female-female), no fictitious inheritance could have been introduced by the pooling of the four records. The pooling of the two records for the crossed-sex lines would give rise to a fictitious negative inheritance (disinheritance) cancelling the positive inheritance created by the pooling of the records for the same-sex lines. I leave it to the reader to verify these statements by following out the arithmetical example just given should he so desire.

The fallacy might lead to seriously misleading results in several cases where mixtures of the two sexes occur. Suppose for instance experiments were being made with some new antitoxin on patients of both sexes. There would nearly always be a difference between the case-rates of mortality for the two. If the female cases terminated fatally with the greater frequency and the antitoxin were administered most often to the males, a fictitious association between "antitoxin" and "cure" would be created at once. The general expression for (AB) − (AB)₀ shews how it may be avoided; it is only necessary to administer the antitoxin to the same proportion of patients of both sexes. This should be kept constantly in mind as an essential rule in such experiments if it is desired to make the most use of the results.

The fictitious association caused by mixing records finds its counterpart in the spurious correlation to which the same process may give rise in the case of continuous variables, a case to which attention was drawn and which was fully discussed by Professor Pearson in a recent memoir*. If two separate records, for each of which the correlation is zero, be pooled together, a spurious correlation will necessarily be created unless the mean of one of the variables, at least, be the same in the two cases.
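The general expression for the excess of (AB) over its independence value, and the rule of administering the antitoxin to the same proportion of both sexes, can both be checked numerically. The sketch below is illustrative only: the function names and the antitoxin figures are hypothetical, while the inheritance figures correspond to the earlier example taken on 100 father-son and 100 mother-daughter pairs.

```python
from fractions import Fraction

def pooled_ab(N, A, B, C, AC, BC):
    """(AB) when A and B are independent within both the universe C and its contrary."""
    within_c = Fraction(AC * BC, C)
    within_gamma = Fraction((A - AC) * (B - BC), N - C)
    return within_c + within_gamma

def excess_by_formula(N, A, B, C, AC, BC):
    """N [(AC) - (AC)0] [(BC) - (BC)0] / ((C) [N - (C)])."""
    AC0 = Fraction(A * C, N)
    BC0 = Fraction(B * C, N)
    return N * (AC - AC0) * (BC - BC0) / (C * (N - C))

# Inheritance example: A = parent with attribute, B = offspring with attribute, C = male.
N, A, B, C, AC, BC = 200, 60, 60, 100, 50, 50
excess = pooled_ab(N, A, B, C, AC, BC) - Fraction(A * B, N)
print(excess, excess_by_formula(N, A, B, C, AC, BC))   # 8 8

# Hypothetical antitoxin trial: A = treated, B = cured, C = male.
# Treating 40 % of each sex makes (AC) equal to (AC)0, so no fictitious association arises
# even though the cure rate differs between the sexes.
N, A, B, C, AC, BC = 200, 80, 120, 100, 40, 70
print(pooled_ab(N, A, B, C, AC, BC) - Fraction(A * B, N))   # 0
```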

* Phil. Trans. A, Vol. 192, p. 277.