DATA DEMO INVOLVING STRATIFIED SAMPLES
=======================================                    10/10/05

DATA from Example 4.3, Table 4.2

> ACLS.fr <- data.frame(matrix(c(9100, 1950, 5500, 10850, 2100, 5500, 9000,
       915, 633, 658, 855, 667, 833, 824,
       636, 451, 481, 611, 493, 575, 588,
       38, 27, 18, 19, 36, 13, 26), ncol=4,
       dimnames=list(c("LIT","CLAS","PHIL","HIST","LING",
          "POLT","SOCY"),
          c("Members","Mailed","Returned","Female.Pct"))))

## Stratified estimate of % Females, assuming SAME among responders
##    as non-responders

> attach(ACLS.fr)
> pstr <- sum(Members*Female.Pct)/sum(Members)
      ## ans 24.65%

## SE taking percentages as fractions, and numbers returned as
##    sample-sizes:

> SEpstr <- sqrt(sum((1-Returned/Members)*(Members/44000)^2*
       (Female.Pct/100)*(1-Female.Pct/100)/(Returned-1)))
> SEpstr
[1] 0.007118622

## We can ask: suppose the same total number of returns were
##   distributed equally [resp. proportionately to members]
##   among the 7 Professional Societies, and the observed
##   fractions female were the same, then how would the SE change?

> nalt1 <- round(sum(Returned)/7)
> nalt2 <- round(sum(Returned)*Members/44000)
> rbind(Equal=nalt1, PropSize=nalt2, Actual=Returned)
         [,1] [,2] [,3] [,4] [,5] [,6] [,7]
Equal     548  548  548  548  548  548  548
PropSize  793  170  479  946  183  479  784
Actual    636  451  481  611  493  575  588

## SE under Actual allocation was .0071.
## Under Equal Allocation:

> sqrt(sum((1-nalt1/Members)*(Members/44000)^2*
       (Female.Pct/100)*(1-Female.Pct/100)/(nalt1-1)))
[1] 0.00743646

## Under Proportional Allocation:

> sqrt(sum((1-nalt2/Members)*(Members/44000)^2*
       (Female.Pct/100)*(1-Female.Pct/100)/(nalt2-1)))
[1] 0.006522492

## Now let's find the optimal allocation and associated SE.

> Shsq <- Female.Pct/100*(1-Female.Pct/100)
> nopt <- round(sum(Returned)*Members*sqrt(Shsq)/
       sum(Members*sqrt(Shsq)))
> nopt
[1] 918 180 439 884 209 384 820
> sqrt(sum((1-nopt/Members)*(Members/44000)^2*Shsq/(nopt-1)))
[1] 0.006474661

### So proportional allocation was already close to optimal!
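
## The same SE expression is typed out once per allocation above.
## As a minimal sketch (not part of the original session), it could be
## wrapped in a function and applied to all four allocations at once.
## The names strat.se and neyman.alloc are hypothetical helpers
## introduced here; the data are re-entered as plain vectors so the
## block runs on its own, without attach(). The allocation proportional
## to Members*sqrt(Shsq) is the one usually called Neyman allocation.

```r
## Stratum sizes, returns, and fraction female (from Table 4.2 above)
N     <- c(LIT=9100, CLAS=1950, PHIL=5500, HIST=10850,
           LING=2100, POLT=5500, SOCY=9000)
n.ret <- c(636, 451, 481, 611, 493, 575, 588)
p     <- c(38, 27, 18, 19, 36, 13, 26)/100

## SE of the stratified proportion for per-stratum sample sizes n,
## with finite-population correction (same formula as in the session)
strat.se <- function(n, N, p) {
  W <- N/sum(N)                              # stratum weights
  sqrt(sum((1 - n/N) * W^2 * p*(1-p)/(n-1)))
}

## Allocation proportional to N_h * S_h, taking S_h^2 = p_h(1-p_h)
neyman.alloc <- function(ntot, N, p) {
  S <- sqrt(p*(1-p))
  round(ntot * N*S / sum(N*S))
}

ntot   <- sum(n.ret)
allocs <- list(Actual   = n.ret,
               Equal    = rep(round(ntot/length(N)), length(N)),
               PropSize = round(ntot*N/sum(N)),
               Optimal  = neyman.alloc(ntot, N, p))

## One SE per allocation scheme
se <- sapply(allocs, strat.se, N = N, p = p)
print(round(se, 6))
```

## Using sum(N) in place of the hard-coded 44000 keeps the function
## usable if the membership counts change.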
---------------------------------------------------------

NB. In this Example: all one can actually estimate is the
    proportion of females among those who WOULD respond
    if asked.