There has been a spectacular increase in the availability and quality of data from developing countries in recent years. Many of these datasets are either in the public domain or can be obtained at modest cost from the data collection agency. This page is intended as a resource to help locate those data. The International Household Survey Network provides links to documentation for many of these and other data. Some of the data are available at ICPSR, the World Bank Microdata Catalog and the Harvard Dataverse Network.
There has also been a huge investment in methods that seek to isolate exogenous variation in programs, policies, opportunities and constraints faced by individuals, households, families, firms and communities. Randomized controlled trials have played a central role in this work. The AEA maintains a Registry of RCTs, which is an extremely valuable resource for investigators to register their RCTs and to learn about on-going or completed RCTs. Many authors also post their collected data to Dataverse, so please search there too.
The Indonesia Family Life Survey (IFLS) is an on-going longitudinal survey of individuals, households, families, communities and facilities. The first wave was conducted in 1993/4 and included interviews with 7,224 households in 321 communities in 13 provinces in Indonesia. The survey is representative of about 83% of the Indonesian population. The second wave, in 1997/8, was followed by a survey of 25% of the sample enumeration areas in 1998. IFLS3 was conducted in 2000 and IFLS4 was conducted in 2007/08.
The surveys contain retrospective histories about, for example, employment, marriage, fertility and migration over the life course of each respondent. The surveys also include household consumption, assets, self-reported health status and a battery of health measures (including anthropometrics, hemoglobin, blood pressure, lung capacity and time to stand from a sitting position). In 2007, cholesterol and dry blood spots were added.
Public domain data and documentation are available on the web.
The Malaysian Family Life Surveys (MFLS), conducted in 1976/7 and 1988, also contain extensive histories on employment, marriage, fertility and migration. Respondents in the first wave were followed in the second wave, when a refreshment sample was also added. MFLS1 (1976/77) and MFLS2 (1988) are in the public domain.
The Matlab Health and Socioeconomic Survey (MHSS) was conducted in 1996 and covers the same area as the Matlab Demographic Surveillance System. The data are in the public domain. A resurvey is underway.
Mexican Family Life Survey (MxFLS)
is an on-going nationally representative longitudinal survey of individuals, households, families and communities. The first wave was conducted in 2002. The first follow-up was completed in 2005. The second follow-up was conducted in 2009/10. In addition to consumption, income, wealth, employment, marriage and fertility, the survey contains a module on crime and victimization as well as migration histories. Respondents are followed if they move and interviewed in their new location. This includes people who move to the U.S. and those who return to Mexico. The survey also records assets and self-reported health status, and collects biomarkers, including a battery of health measures and dry blood spots. Data from the first two waves collected in Mexico are in the public domain.
The Guatemalan Survey of Family Health (EGSF) is a single cross-section survey which was conducted in rural communities in 4 of Guatemala’s 22 departments. The survey was fielded in 1995. The data are publicly available.
University of North Carolina Surveys
Conducted by a team of researchers from the United States and the Philippines, the Cebu Longitudinal Health and Nutrition Survey is an ongoing study of a cohort of Filipino women who gave birth between May 1, 1983 and April 30, 1984 and have been re-interviewed periodically since then. The data are available at UNC.
The China Health and Nutrition Survey was conducted in 1989 and 1991 in 8 provinces in China and provides a wealth of detailed information on health and nutrition of adults and children including physical examinations. These data are available at UNC.
The Nang Rong projects represent a major data collection effort that was started in 1984 with a census of households in 51 villages. The villages were resurveyed in 1988 and again in 1994/95. New entrants were interviewed and a subsample of out migrants were followed. These data are available at UNC.
The Russian Longitudinal Monitoring Survey is an on-going panel survey of households in Russia that began in 1992. These data are available at UNC.
University of Washington CSDE Vietnam Research Projects
The Vietnam Life History Survey is a collaboration between the University of Washington, the Institute of Sociology and the Institute of Social Sciences in Vietnam. The survey collects data from about 100 households in two urban and two rural areas in Vietnam. The data are available at CSDE at UW.
The Vietnam Longitudinal Survey is a collaboration between Professor Charles Hirschman, University of Washington, and the Institute of Sociology in Vietnam. The survey collects detailed demographic information from all adult respondents in over 1,800 households in one area of Vietnam. The data are available at CSDE at UW.
Rural Economic and Demographic Survey (REDS)
The National Council of Applied Economic Research has been surveying households and villages since the late 1960s as part of REDS. Some of the respondents have been interviewed in several rounds, yielding a panel spanning 30 years. The raw data from the 1969, 1982 and 1999 waves are available on Andrew Foster’s web site. Foster provides an overview of the files here.
State-level data from India compiled by the Economic Organisation and Public Policy Programme at the LSE are available here. Topics covered include:
- land reform
- media and political agency
- labor regulation
- quality of life
- economic reforms
India Agriculture and Climate Data Set
The database provides district level data on agriculture and climate in India from 1957/58 through 1986/87. The dataset includes information on
- Area planted, production and farm harvest prices for five major and fifteen minor crops.
- Areas under irrigated and high-yielding varieties (HYV) for major crops.
- Data on agricultural inputs, such as fertilizers, bullocks and tractors, in both quantity and price terms
- Agricultural labor, cultivators, wages and factory earnings, rural population and literacy proportion.
- Meteorological station level climate data (average climate over 30 year period)
- Soil data
The dataset was compiled by Apurva Sanghi, K.S. Kavi Kumar, and James W. McKinsey, of the World Bank and draws on work by James McKinsey and Robert Evenson of Yale University.
For more information, click here. The data and documentation are available here. A note on converting the files to STATA written by Gareth Nellis is here.
National Sample Survey Organisation
The National Sample Survey Organisation (NSSO) of India has a long tradition of conducting high quality surveys. NSSO carries out socio-economic surveys, undertakes field work for the Annual Survey of Industries and follow-up surveys of Economic Census, sample checks on area enumeration and crop estimation surveys and prepares the urban frames useful in drawing of urban samples, besides collection of price data from rural and urban sectors.
The data are available for purchase on CD.
China Health and Retirement Longitudinal Study (CHARLS)
The China Health and Retirement Longitudinal Study (CHARLS) is patterned after the Health and Retirement Study (HRS) in the US. Pilot data were collected in 2008 in two provinces: Zhejiang and Gansu (the richest and poorest provinces). One person aged 45 and over was randomly chosen in each household with an age-eligible person, and that person and his or her spouse were interviewed. The sample is representative of people 45 and over in these two provinces in China. This sample contains data on 1,570 households and just under 2,700 individuals. Data are available here. The first nationally representative wave of CHARLS will be fielded in 2011 and the second in 2013.
The Mexican Health and Aging Study (MHAS) is a prospective longitudinal survey of older adults (born before 1951) and their spouses. The first wave was conducted in 2001 and interviewed almost 10,000 adults and 5,000 spouses. The first follow-up was completed in 2003. The project is a collaboration of researchers at the Universities of Pennsylvania, Maryland and Wisconsin with INEGI in Mexico. It is directed by Beth Soldo.
SABE (Salud, Bienestar y Envejecimiento) is a series of comparable cross-national surveys on health and aging organized as a cooperative venture among researchers in Argentina, Barbados, Brazil, Chile, Cuba, Mexico and Uruguay. The goal of the project is to describe health, cognitive achievement and access to health care among people age 60 and older, with a special focus on people over 80 years old. Professor Alberto Palloni is the PI of the project, which has been funded by PAHO and the NIA.
Colombian Familias en Accion
Familias en Accion is a poverty alleviation program in Colombia. Data are available here. Evaluation of the program is described at the Center for the Evaluation of Development Policies at IFS.
Learning and Education Achievement in Punjab Schools
The Learning and Education Achievement in Punjab Schools (LEAPS) Project is a multi-year project initiated by researchers at Harvard University, Pomona College, and the World Bank that attempts to capture and track changes in the educational universe at the primary level (up to grade 5) in 112 villages in Pakistan. The main component of the project is a set of extensive surveys designed and conducted by the LEAPS team, with care being taken to be representative of the various actors in the educational market. The data consist of questionnaires administered to all 823 primary schools (public, private, NGO) in the 112 villages, to over 800 teachers (with basic information on 5,000 teachers), 1,800 households and 6,000 school children, and achievement tests of 12,000 class 3 children in Mathematics, English, and Urdu. All children, households, schools and teachers are matched and then followed over three additional (annual) rounds of surveys, for a complete 4-year panel.
The first round of data from these surveys and related documentation are now publicly available for researchers at: www.leapsproject.org. The website also provides related information (questionnaires for all rounds, preliminary papers, and a LEAPS report that highlights findings from the first round).
South African DataFirst Data Archive
DataFirst, a research unit at the University of Cape Town, is a web portal for South African census and survey data as well as metadata and all research output based on these data. The catalogue of downloadable datasets is here.
Living Standards Measurement Studies (LSMS)
Since 1980, the World Bank has been collecting multi-purpose household survey data in several countries under the Living Standards Measurement Study umbrella. That site contains information about the project, lists the countries included in the project and describes how data may be accessed. For some of the surveys, the data are available on the web.
- 1995 Azerbaijan Survey of Living Conditions
- 1995 Bulgaria Integrated Household Survey
- 1985 Cote d’Ivoire Living Standards Survey
- Jamaica 1988-98
- Kyrgyz Republic
- Pakistan 1991
- Papua New Guinea
- Peru 1985, 1990, 1991
- 1992 Russian Longitudinal Monitoring Survey (available from a site at UNC).
- 1994 South African Integrated Household Survey
- 1993-1994 Tanzania Human Resource Development Survey
- Vietnam 1992-3
The Rural Income Generating Activities (RIGA) database
The RIGA project, a collaborative effort of FAO, the World Bank and American University in Washington, DC, aims to promote the understanding of the roles, relationships and synergies of on-farm and off-farm income generating activities for rural households. Building on existing household living standards surveys, the project has developed methodologically consistent, internationally comparable income data that are now available free of charge from the project’s website.
The database contains cross-country comparable indicators of household-level income for 26 surveys representing 16 countries across Africa, Asia, Eastern Europe and Latin America, making it a valuable resource for researchers and analysts in the development field. The surveys are both cross-sectional and panel, and currently run from 1992 through 2005; more surveys will be added to the database as they become available. While the RIGA project focuses mainly on the analysis of rural issues, the dataset contains information on both urban and rural income sources.
Find out more about the RIGA project: http://www.fao.org/es/ESA/riga/
Learn how to access the data: http://www.fao.org/es/ESA/riga/english/form_en.htm
Access the RIGA project publications: http://www.fao.org/es/ESA/riga/english/pubs_en.htm
Descriptions of evaluations conducted by the Abdul Latif Jameel Poverty Action Lab are available from the J-PAL evaluations page. Data underlying these evaluations are available from the data pages at Harvard Dataverse.
IFPRI has conducted several very innovative surveys in African and Asian countries. Many of these surveys are available for research purposes. See their home page and click on datasets.
Townsend Thai Project
and associated Thai databases are described here. The Townsend Thai project began in 1997 with a relatively large cross-section survey. Annual resurveys have been conducted and a monthly survey was initiated in August 1998.
This is an integrated longitudinal farm production and consumption survey conducted by Christopher Udry and Markus Goldstein (Yale University). Data may be downloaded from here.
collects longitudinal socio-demographic data in Kenya and Malawi under the direction of Susan Watkins and Jere Behrman. Data are available for downloading here.
NIDS is a nationally representative panel study that examines income, consumption and expenditure of households over time in South Africa. The baseline survey was conducted in 2008 and the first follow-up was conducted in 2010. The data will throw light on matters such as the coping strategies deployed in response to shocks and unexpected events, whether negative or positive, such as a death in the family or an unemployed relative obtaining a job.
In addition to income and expenditure dynamics, study themes include the determinants of changes in poverty and well-being; household composition and structure; fertility and mortality; migrancy and migrant strategies; labour market participation and economic activity; human capital formation, health and education; vulnerability and social capital. See the NIDS web page for details.
The Langeberg integrated household survey was conducted by a consortium of South African and American universities along with government and non-government agencies in South Africa. Data may be requested by sending an email. See their web page for details.
in five Asian countries collected detailed information on the status of women and their husbands in conjunction with fertility choices. Data collected in Malaysia, Pakistan, Philippines and Thailand in 1993/1994 are available for downloading here.
is housed at the Economic Growth Center at Yale University and distributes:
- Bicol longitudinal surveys, Philippines
- ICRISAT India village level study
- ICRISAT Burkina Faso farm production survey
Professor Doug Massey and collaborators have collected several waves of surveys on migration from central Mexico with special sub-samples of Mexicans living in Chicago. The data can be obtained from the MMP web site or by contacting Kristin Espinosa at the University of Pennsylvania. Her e-mail address is email@example.com.
The Latin American Migration Project (LAMP) is an extension of the Mexican Migration Project (MMP). The project is directed by Professor Doug Massey who, with his collaborators, has collected data in Puerto Rico, the Dominican Republic, Nicaragua, Costa Rica and Peru. Data are available here.
collects fertility and health surveys carried out in Central America. Data from Belize, Guatemala, El Salvador, Honduras, Nicaragua, Costa Rica and Panama are included in the collection.
TAPS is an annual panel data set covering the period 2002 through 2006 that follows a native Amazonian horticultural and foraging society experiencing rapid integration with the rest of the world. The study has been tracking about 1,500 native Amazonians in about 250 households of 13 villages along the Maniqui River, Department of Beni, Bolivia, and has introduced agricultural development projects. TAPS surveys take place every year during June-August. The first five years of data, 2002-2006, are now available to the public in STATA. To request access to the 2002-2006 panel data set and its documentation, go to the following web site: http://people.brandeis.edu/~rgodoy/research/pgs/panel.html or contact Ricardo Godoy, (781) 736-2784, firstname.lastname@example.org
The World Fertility Surveys (WFS) were conducted in 41 countries during the 1970s and early 1980s. The data are all in the public domain and available at the Office of Population Research at Princeton University. This is a very good site to find out about data on fertility, including the Chinese In-Depth Fertility Surveys.
Countries for which World Fertility Surveys are available include:
Benin; Cameroon, 1978; Cote d’Ivoire, 1980-81; Egypt, 1980; Ghana, 1979-80; Kenya, 1977-1978; Lesotho, 1977; Mauritania, 1981; Morocco, 1980; Nigeria, 1981-82; Rwanda, 1983; Senegal, 1978; Sudan (North), 1978-79; Tunisia, 1978;
Colombia, 1976; Costa Rica, 1976; Dominican Republic, 1975 and 1980; Ecuador, 1979-80; Guyana, 1975; Haiti, 1977; Jamaica, 1975-76; Mexico, 1976-77; Panama, 1975-76; Paraguay, 1979; Peru, 1977-78; Trinidad & Tobago, 1977; Venezuela, 1977;
Bangladesh, 1975-76; Fiji, 1974; Indonesia, 1976; Jordan, 1976; Korea, Republic of, 1974; Malaysia, 1974; Nepal, 1976; Pakistan, 1975; Philippines, 1978; Sri Lanka, 1975; Syria, 1978; Thailand, 1975; Turkey, 1978; Yemen Arab Republic, 1979;
More recent fertility, mortality and health data are available from the Demographic and Health Surveys (DHS). DHS has been conducting national sample surveys of population and maternal and child health in many developing countries since the 1980s. Data are currently collected under the umbrella of the Measure project, which is administered by Macro International. Data have been collected in four waves:
- DHS-I (1986-90)
- DHS-II (1991-1992)
- DHS-III (1993-1997)
- Measure (1998-present)
See the Measure DHS website for a list of countries that have been surveyed.
The Centers for Disease Control (CDC) assists countries throughout the world in the development, implementation and analysis of national reproductive health surveys.
Provides a listing of many household surveys conducted across Africa.
Firm level data collected by The World Bank in collaboration with the Centre for the Study of African Economies, Oxford University, and several Government Statistical Agencies may be downloaded from this site.
CSAE faculty have collected firm level data in several African countries. Data from Ghana, Ethiopia and Tanzania, and from a comparative study in Cameroon, Ghana, Kenya and Zimbabwe, are available from the CSAE web site. Some of these data are also available on the World Bank web site.