Raking Survey Data (a.k.a. Sample Balancing)

A survey sample may cover segments of the target population in proportions that do not match the proportions of those segments in the population itself. The differences may arise, for example, from sampling fluctuations, from nonresponse, or because the sample design was not able to cover the entire target population. In such situations one can often bring the sample into closer agreement with the population by adjusting the sampling weights of the cases in the sample so that the marginal totals of the adjusted weights on specified characteristics agree with the corresponding totals for the population. This operation is known as raking or sample balancing, and the population totals are usually referred to as control totals.
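
As an illustration only, the adjustment described above can be sketched in a few lines of Python. This is not the code of any released macro; the function names (margin_totals, rake), the variables (sex, age), the five respondents, and the control totals are all made up for the example. The sketch assumes every category named in the control totals appears at least once in the sample and that the control totals of each variable sum to the same population size.

```python
from collections import defaultdict


def margin_totals(rows, weights, var):
    """Sum the weights within each category of one variable."""
    totals = defaultdict(float)
    for row, w in zip(rows, weights):
        totals[row[var]] += w
    return totals


def rake(rows, weights, controls, max_iter=100, tol=1e-8):
    """Adjust weights until every weighted margin matches its control total.

    rows     -- one dict per respondent, mapping variable -> category
    weights  -- starting weights (for example, the design weights)
    controls -- {variable: {category: population total}}
    """
    w = list(weights)
    for _ in range(max_iter):
        # One sweep: scale the weights to match each margin in turn.
        for var, targets in controls.items():
            totals = margin_totals(rows, w, var)
            w = [wi * targets[row[var]] / totals[row[var]]
                 for row, wi in zip(rows, w)]
        # Largest relative gap between any weighted margin and its control.
        gap = max(
            abs(margin_totals(rows, w, var)[cat] - target) / target
            for var, targets in controls.items()
            for cat, target in targets.items()
        )
        if gap < tol:
            break
    return w


# Hypothetical sample of five respondents with equal starting weights.
rows = [
    {"sex": "F", "age": "young"},
    {"sex": "F", "age": "old"},
    {"sex": "M", "age": "young"},
    {"sex": "M", "age": "old"},
    {"sex": "M", "age": "old"},
]
design_weights = [1.0] * len(rows)

# Hypothetical control totals for a population of 100.
controls = {
    "sex": {"F": 52.0, "M": 48.0},
    "age": {"young": 40.0, "old": 60.0},
}

raked = rake(rows, design_weights, controls)
```

After the call, the raked weights of the two female respondents sum to the control total of 52, and the weights of the two young respondents sum to 40, up to the chosen tolerance.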

Raking assigns a weight to each survey respondent such that the weighted distribution of the sample agrees very closely with the control totals for two or more control variables. For example, in household surveys the control variables are typically sample design and socio-demographic variables. Raking is an iterative process that uses the sample design weight as the starting weight and terminates when the convergence criterion is met. The resulting final weights may, however, exhibit considerable variability, with some sampling units having extremely low or high weights relative to most of the other sampling units. This variability inflates the sampling variances of the survey estimates. To combat this problem, we enhanced the previously released IHB Raking macro by adding two weight trimming options that are implemented during the iterative process itself, allowing one to achieve convergence while controlling the highest and lowest weight values.
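
The idea of trimming inside the iteration can be sketched as follows. This is a minimal Python illustration, not the IHB Raking macro or its actual trimming options, which are not described here. It assumes the simplest possible rule: after each margin adjustment, every weight is clamped to a fixed interval [low, high]. The function names, the data, and the bounds are hypothetical. With this simple rule convergence is not guaranteed, so the function also reports whether the convergence criterion was met.

```python
from collections import defaultdict


def margin_totals(rows, weights, var):
    """Sum the weights within each category of one variable."""
    totals = defaultdict(float)
    for row, w in zip(rows, weights):
        totals[row[var]] += w
    return totals


def rake_trimmed(rows, weights, controls, low, high, max_iter=500, tol=1e-6):
    """Rake with weights clamped to [low, high] after every adjustment.

    Returns (weights, converged). The clamping rule is an assumption
    made for this sketch, not the rule used by any particular macro.
    """
    w = list(weights)
    for _ in range(max_iter):
        for var, targets in controls.items():
            totals = margin_totals(rows, w, var)
            w = [wi * targets[row[var]] / totals[row[var]]
                 for row, wi in zip(rows, w)]
            # Trim inside the loop so later adjustments can compensate.
            w = [min(max(wi, low), high) for wi in w]
        # Check the margins on the trimmed weights at the end of the sweep.
        gap = max(
            abs(margin_totals(rows, w, var)[cat] - target) / target
            for var, targets in controls.items()
            for cat, target in targets.items()
        )
        if gap < tol:
            return w, True
    return w, False


# Hypothetical sample, starting weights, and control totals.
rows = [
    {"sex": "F", "age": "young"},
    {"sex": "F", "age": "old"},
    {"sex": "M", "age": "young"},
    {"sex": "M", "age": "old"},
    {"sex": "M", "age": "old"},
]
design_weights = [1.0] * len(rows)
controls = {
    "sex": {"F": 52.0, "M": 48.0},
    "age": {"young": 40.0, "old": 60.0},
}

# Hypothetical bounds on the final weights.
trimmed, converged = rake_trimmed(rows, design_weights, controls,
                                  low=10.0, high=26.5)
```

Because the clamp is applied as the last step of every adjustment, the returned weights always lie within the bounds. Whether the margins are also matched depends on the data and on how tight the bounds are, which is why the converged flag should be checked before the weights are used.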

Programming questions should be addressed to David Izrael.