The harmonization of international census microdata for demographic research: the IPUMS project
Discussant: Valérie Golaz (INED/CEPED). Session in English.
In recent years, international comparative population research
has expanded steadily as more and better demographic data have
become available. Scholars are increasingly able to compare not
just aggregated summary data, but individual-level microdata,
across countries. Yet this promising area of research is hindered
by uneven access to data, differing survey instruments, inconsistent
documentation, and incompatible technical specifications. Even when
individual researchers can overcome these difficulties, the result
is substantial duplication of effort and limited reproducibility of
results. The Integrated Public Use Microdata Series (IPUMS) is
designed to address these issues for world population census data.
The data series contains 111 national census samples from 35
countries, and the project has agreements with 40 additional national
statistical offices. The data are consistently coded and documented
across countries, and a web-based extraction system allows users to
pool selected variables and censuses for downloading and analysis.
Since the project began ten years ago, the IPUMS research team has
been forced to confront numerous technical and intellectual
challenges related to international harmonization. This paper
describes some of the key innovations, the reasoning behind them,
and the implications for research. We also discuss some persistent
difficulties that are specific to the census samples and, to some
degree, inherent in all harmonization efforts. Finally, we present some
examples of research results that suggest the possibilities and
perils of cross-national research using harmonized data.