photo credit: ILRI/Stevie Mann

It sounds a little like the popular fairytale. A researcher searches out datasets only to find them too small or too incomplete, too this or not enough that...the perfect fit of appropriate scale, detail and reliability eluding her (or his) desires. The researcher begins to sympathize and thinks to herself, maybe Goldilocks wasn’t so spoiled after all.

For example, household survey data from sub-Saharan Africa are rich in detail, but their sample sizes are usually too small for any kind of fine-level disaggregation. Census data have large enough sample sizes but don’t always include the variables you need (such as income and consumption). In a recent paper we (Azzarri & Cross) fuse survey and census data to find a combination of detail and size that is, in the words of Goldilocks, just right.

Small Area Estimation (SAE) techniques that integrate census and survey data à la Elbers et al (2003) take advantage of the detail in the survey and the comprehensive coverage of the census. This combination allows researchers to analyze variables at a finer spatial disaggregation than would be possible with the survey alone, and so to draw more spatially detailed maps. While SAE has been predominantly used to estimate income and food consumption more precisely across a landscape, the technique could also benefit analysis of livestock numbers and income from livestock activities, enhancing location-specific knowledge for policy targeting.

The method behind SAE is relatively straightforward. Take livestock income in Uganda, for example. First, comparable variables are selected from the survey and census. They must be similar in mean, standard deviation and frequency distribution at the national level, as well as in collection method and definition. Second, an estimation model of livestock income is fitted on the survey and used to predict the information missing from the census: livestock income at the local level (see Elbers et al (2003) for more details).
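To make the two steps concrete, here is a minimal sketch in Python. It is not the model from the paper: it uses plain ordinary least squares as a stand-in for the estimation approach described in Elbers et al (2003), and all of the data, covariate names (herd size, household size) and coefficient values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data. None of these numbers come from the Uganda study.
n_survey, n_census, n_subcounties = 500, 20_000, 40

def make_covariates(n):
    """Covariates assumed to be available in BOTH survey and census."""
    herd_size = rng.poisson(3, size=n).astype(float)
    household_size = rng.integers(1, 10, size=n).astype(float)
    # The leading column of ones is the intercept.
    return np.column_stack([np.ones(n), herd_size, household_size])

X_survey = make_covariates(n_survey)
X_census = make_covariates(n_census)

# Each census household belongs to a sub-county (assigned evenly here).
subcounty = np.arange(n_census) % n_subcounties

# Log livestock income is observed only in the survey.
true_beta = np.array([1.0, 0.5, 0.1])  # illustrative values
log_income_survey = X_survey @ true_beta + rng.normal(0.0, 0.3, n_survey)

# Step 1: fit the estimation model on the survey (ordinary least squares).
beta_hat, *_ = np.linalg.lstsq(X_survey, log_income_survey, rcond=None)

# Step 2: use the fitted model to predict the missing variable in the census.
log_income_census_pred = X_census @ beta_hat

# Step 3: average the predictions within each sub-county for mapping.
counts = np.bincount(subcounty, minlength=n_subcounties)
sums = np.bincount(subcounty, weights=log_income_census_pred,
                   minlength=n_subcounties)
subcounty_mean = sums / counts

print("estimated coefficients:", np.round(beta_hat, 2))
print("number of sub-county estimates:", subcounty_mean.size)
```

The point of the sketch is only the workflow: the model is estimated where the outcome is observed (the survey) and applied where only the covariates are observed (the census), which is why the covariates have to be comparable across the two sources.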

But first, the method needs validation using available livestock data. We tested the method with survey and census data from Uganda, including the 2009/10 Uganda National Panel Survey and the 2008 Uganda National Livestock Census. The maps below (figure 1) show the actual densities of large ruminants (number of livestock per square kilometer) from both survey and census, and the densities predicted to the census using SAE.

Figure 1. Density of Large Ruminants: Survey (left), Census (middle) and Predicted to Census (right)

One obvious result of the analysis is that survey-to-census prediction makes it possible to draw more spatially disaggregated maps than the survey alone allows. The survey shows what appear to be homogeneous regions, but once the data are disaggregated to the sub-county level through the census, a more detailed and scattered picture emerges. Second, the density range is wider in the census than in the survey, because the survey distribution consists of only four values, one for each region, each an average of the sub-county values within that region. Third, and foremost, from a policy perspective the census map is more meaningful for targeting purposes.
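The second point is simple arithmetic: an average can never fall outside the values it averages, so four regional means must span a range no wider than the sub-county values behind them. A small Python illustration follows, using made-up densities rather than the Uganda data; the number of sub-counties per region is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sub-county densities (animals per square km), not the Uganda data.
n_regions, per_region = 4, 50
region = np.repeat(np.arange(n_regions), per_region)
density = rng.gamma(shape=2.0, scale=10.0, size=n_regions * per_region)

# Survey-style picture: one value per region, the mean of its sub-counties.
region_mean = np.bincount(region, weights=density) / np.bincount(region)

subcounty_range = density.max() - density.min()
region_range = region_mean.max() - region_mean.min()

print(f"range across sub-counties: {subcounty_range:.1f}")
print(f"range across region means: {region_range:.1f}")
```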

We then fitted an estimation model of livestock income on the survey and used it to predict income at the local level. The maps below (figure 2) show the share of income from livestock as measured in the survey and as predicted to the census using SAE.

Figure 2. Share of Income from Livestock: Survey (left) and Predicted to Census (right)

Given the quandary of relying on either survey or census data alone, there is a clear need to combine the two for more spatially specific analyses. Fitting accurate prediction models opens the concrete possibility of combining multi-topic household surveys with specialized databases to estimate the contribution of livestock to household livelihoods. SAE has been successfully used for targeting poverty programs in many countries worldwide, and the present work shows that it could also work well as a tool for informing livestock policy.

So if you find yourself feeling like Goldilocks with survey data that’s too small and census data that’s too sketchy, this data combination technique is juuust right for your needs.

The original paper this article is based on is not yet publicly available for distribution.

Citation

HarvestChoice, 2012. "Solving Goldilocks’s Quandary." International Food Policy Research Institute, Washington, DC, and University of Minnesota, St. Paul, MN. Available online at http://harvestchoice.org/node/5264.

Jul 23, 2012