Optimal Allocation
Optimal allocation is a procedure for dividing the sample among the strata in a stratified sample survey. The allocation procedure is called “optimal” because in a particular survey sampling design (stratified simple random sampling) it produces the smallest variance for estimating a population mean and total (using the standard stratified estimator) given a fixed budget or sample size.
A sample survey collects data from a population in order to estimate population characteristics. A stratified sample selects separate samples from subgroups (called “strata”) of the population and can often increase the accuracy of survey results. To implement stratified sampling, it must be possible to divide the population, at least implicitly, into strata before sampling. Given a budget that allows gathering data on n subjects or a budget amount $B, one must decide how to allocate the data-gathering resources among the strata. Three factors typically affect the distribution of resources to the strata: (1) the population size, (2) the variability of values, and (3) the per-unit cost of data collection in each stratum. Special interest in the characteristics of particular strata can also affect the allocation.
In a stratified simple random sample, a sample of size $n_h$ is selected from stratum (subpopulation) $h$, which has a population size of $N_h$ ($h = 1, 2, \ldots, H$). The standard estimator of the population total is
$$\hat{T} = \sum_h N_h \bar{y}_h,$$
where $\bar{y}_h$ is the mean (arithmetic average) of the sample values in stratum $h$ and $\Sigma$ denotes summation across strata $h = 1, 2, \ldots, H$. The variance of the estimator is
$$V(\hat{T}) = \sum_h N_h^2 \left(1 - \frac{n_h}{N_h}\right) \frac{S_h^2}{n_h},$$
where $S_h^2$ is the variance of the values in stratum $h$. If the rate of sampling is small in all strata, then (ignoring the finite population correction terms $1 - n_h/N_h$) the variance is approximately
$$V(\hat{T}) \approx \sum_h \frac{N_h^2 S_h^2}{n_h}.$$
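The estimator and its variance can be sketched in a few lines of Python. This is a minimal illustration, not a survey-analysis package: the function names, the two-stratum data, and the population sizes are all hypothetical, and the sample variance in each stratum is used as a stand-in for the unknown population variance in that stratum.

```python
from statistics import mean, variance

def stratified_total(samples, pop_sizes):
    """Estimate the population total as the sum over strata of N_h * ybar_h."""
    return sum(N * mean(y) for y, N in zip(samples, pop_sizes))

def stratified_total_variance(samples, pop_sizes, use_fpc=True):
    """Estimate the variance of the stratified total.

    Uses sum_h N_h^2 * (1 - n_h/N_h) * s_h^2 / n_h, where s_h^2 is the
    sample variance in stratum h (an assumption: it stands in for S_h^2).
    With use_fpc=False the finite population correction is dropped.
    """
    total = 0.0
    for y, N in zip(samples, pop_sizes):
        n = len(y)
        fpc = (1 - n / N) if use_fpc else 1.0
        total += N**2 * fpc * variance(y) / n
    return total

# Hypothetical data: sampled values in two strata and the stratum sizes.
samples = [[10, 12, 14], [20, 25, 30, 25]]
pop_sizes = [300, 400]

t_hat = stratified_total(samples, pop_sizes)
v_fpc = stratified_total_variance(samples, pop_sizes)
v_approx = stratified_total_variance(samples, pop_sizes, use_fpc=False)
```

Because each correction term lies between 0 and 1, dropping it can only increase the computed variance, so `v_approx` is never smaller than `v_fpc`.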
Suppose the cost to collect data from one element (person, unit, etc.) in stratum $h$ is $C_h$. If there is a budget of $B$, then the entire budget is spent when
$$B = \sum_h C_h n_h.$$
Then the variance (ignoring the finite population correction terms) of the estimated population total is minimized when the sample size in stratum $h$ is
$$n_h = n \, \frac{N_h S_h / \sqrt{C_h}}{\sum_g N_g S_g / \sqrt{C_g}},$$
where the summation in the denominator is over all strata, $S_h$ is the standard deviation (square root of the variance) of the values in stratum $h$, and $n$ is the total sample size. This formula implies that one should sample more in large subpopulations (strata), more in strata with large variances, and more in strata with small per-unit cost. If the per-unit costs of data collection are the same in all strata, then the optimal allocation in stratum $h$ is
$$n_h = n \, \frac{N_h S_h}{\sum_g N_g S_g}.$$
If in addition variances (and standard deviations) are constant across strata, then
$$n_h = n \, \frac{N_h}{\sum_g N_g},$$
which is the allocation known as proportional allocation to strata. If the $n_h$ are not integers, then one must round them to integers for sample selection. Rounding does not necessarily move every value to its closest integer, because the rounded stratum sizes must still sum to the total sample size $n$.
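The allocation and rounding steps can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and the three-stratum inputs are hypothetical, and largest-remainder rounding is only one reasonable way to make the integer sizes sum to the total, not a rule prescribed by the entry. The sketch also does not cap a stratum's sample size at its population size, which a production routine would need to handle.

```python
from math import floor, sqrt

def optimal_allocation(n, pop_sizes, std_devs, unit_costs=None):
    """Real-valued n_h proportional to N_h * S_h / sqrt(C_h), summing to n."""
    if unit_costs is None:
        # Equal costs: the cost term cancels out of the formula.
        unit_costs = [1.0] * len(pop_sizes)
    weights = [N * S / sqrt(C)
               for N, S, C in zip(pop_sizes, std_devs, unit_costs)]
    total = sum(weights)
    return [n * w / total for w in weights]

def round_to_total(allocation, n):
    """Largest-remainder rounding so the integer sizes sum to n."""
    sizes = [floor(a) for a in allocation]
    shortfall = n - sum(sizes)
    # Give the leftover units to the strata with the largest fractional parts.
    by_remainder = sorted(range(len(allocation)),
                          key=lambda h: allocation[h] - sizes[h],
                          reverse=True)
    for h in by_remainder[:shortfall]:
        sizes[h] += 1
    return sizes

# Hypothetical inputs for three strata.
pop_sizes = [500, 300, 200]    # N_h
std_devs = [2.0, 4.0, 6.0]     # S_h
unit_costs = [1.0, 4.0, 9.0]   # C_h

with_costs = round_to_total(
    optimal_allocation(101, pop_sizes, std_devs, unit_costs), 101)
equal_costs = round_to_total(
    optimal_allocation(101, pop_sizes, std_devs), 101)
proportional = round_to_total(
    optimal_allocation(101, pop_sizes, [1.0, 1.0, 1.0]), 101)
```

Passing constant standard deviations and no costs reduces the same function to proportional allocation, which mirrors how the three formulas above nest inside one another.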
Suppose one wanted to collect data on students at a large public university. Questions of interest could be hours worked per week; amount of money spent per semester on textbooks; amount of time spent eating at restaurants in a week; number of trips to the airport in a semester; and whether or not friends smoke cigarettes. The students selected for the survey could be contacted via their university email addresses and asked to complete an online Web survey. A sample survey can be preferable to contacting every student because, with a sample, greater effort can often be made to encourage response and check data quality. Administrative records contain college year designations (first, second, third, fourth) for each student in the target population; college years can be used as strata. Suppose the total sample size is allowed to be 1,600 students. Equal allocation to strata would sample 400 students from each year. Table 1 presents allocations of students to the four strata based on total enrollments by college year; these numbers are similar to 2006 enrollment at Iowa State University. The hypothetical variable being considered is hours worked per week. It is assumed that students in higher years have more variable employment situations than students in earlier years, hence the increasing standard deviation. It also is assumed that more attempts are needed to contact students in later years than in earlier years. As can be seen in the table, the stratum of fourth-year students receives the largest sample ($n_4 = 731$), whereas the stratum of first-year students receives the smallest ($n_1 = 224$).
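To see why the choice of allocation matters, the following sketch compares the approximate variance of the estimated total under three allocations of 1,600 students. The enrollments and standard deviations are invented for illustration and are not the Table 1 figures. Per-unit costs are assumed equal across strata so that all three allocations can be compared at the same total sample size; the equal-cost optimal allocation is commonly called Neyman allocation. Sample sizes are left unrounded for simplicity.

```python
def approx_variance(pop_sizes, std_devs, sample_sizes):
    """Approximate variance of the estimated total, fpc ignored:
    sum over strata of N_h^2 * S_h^2 / n_h."""
    return sum((N * S) ** 2 / n
               for N, S, n in zip(pop_sizes, std_devs, sample_sizes))

# Invented values for four college-year strata; NOT the Table 1 figures.
pop_sizes = [5000, 5000, 5500, 6500]   # hypothetical enrollments N_h
std_devs = [6.0, 8.0, 10.0, 12.0]      # hypothetical S_h, hours per week
n = 1600

equal = [n / len(pop_sizes)] * len(pop_sizes)
proportional = [n * N / sum(pop_sizes) for N in pop_sizes]
ns_products = [N * S for N, S in zip(pop_sizes, std_devs)]
neyman = [n * w / sum(ns_products) for w in ns_products]

for name, alloc in [("equal", equal),
                    ("proportional", proportional),
                    ("Neyman", neyman)]:
    v = approx_variance(pop_sizes, std_devs, alloc)
    print(f"{name:>12}: approximate variance {v:,.0f}")
```

Under these assumptions the Neyman allocation should report the smallest variance of the three, since it minimizes the approximate variance for a fixed total sample size.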
...