Development of a Longitudinally Consistent Dataset for the Health and Retirement Study

Published: 2004

Project ID: UM04-10

Abstract

Imputation for missing data is critical to the development of high-quality microdata sets. For the most part, imputations are developed from the relationship between the response to a survey variable and the characteristics of the respondent; e.g., imputing an asset value for respondents whose asset value is missing involves creating a pool of observations with known asset values for a set of respondents with characteristics similar to x, and then selecting one of the pooled observations as the imputed value of the asset for respondent x. In cases where the microdata set is longitudinal, more information is available to estimate imputed values than will be the case where the data are cross-sectional. In a longitudinal dataset we have the value of the missing variable from a prior wave or waves, and that value plays a major role in the imputation process.