We identify TFP by estimating a Cobb-Douglas production function with inputs measured per hectare, implicitly imposing constant returns to scale on the production technology. In such a setting, the inclusion of a measure of farm size as an explanatory variable identifies any relationship between farm size and TFP.
The Mexican Family Life Survey (MxFLS) is a longitudinal survey of Mexican households, representative of the Mexican population at the national, urban, and rural levels. The MxFLS is a rich source of data for this analysis, as controlling for unobservable farm and community level characteristics using fixed effects is potentially important for determining the farm size – productivity relationship. Further, the decade-long span of the surveys allows for a careful analysis of how the size-productivity relationship has evolved in the wake of NAFTA and contemporaneous reforms affecting the Mexican agricultural sector. The three survey rounds – 2002, 2005-06, and 2009-12 – tracked a broad range of individual, family, and community characteristics for the 8,437 initial households. The second and third waves of the survey successfully re-interviewed 90% and 94% of first wave households, respectively. Individuals from the first wave formed new households at annual rates of 3.6% and 4.7% between the first and second and the second and third waves, with 83% of newly formed households being re-interviewed in the third survey wave. While not representative of the Mexican agricultural sector per se, the MxFLS is representative of both rural and non-rural Mexican households. As such, the use of the dataset to study Mexican agriculture has the important caveat that it under-represents the larger commercial agricultural operations to the degree that they are not family farms.
A comparison with the 2007 Agricultural Census reveals that in both the census and the MxFLS fewer than 5% of farms are larger than 50 ha. However, it is important to note that these “large” farms are not necessarily the same as those in the census because they are family-run farms and do not include corporate-run, commercial agricultural operations. In comparison to the 2007 census, the MxFLS over-represents farms smaller than 2 ha and under-represents farms between 20 ha and 50 ha. This is true for each survey wave, highlighting that while the MxFLS is not representative of the Mexican agricultural sector in its entirety, it is appropriate for studying household farms in Mexico.
We employ a farm level analysis using all MxFLS households engaged in agricultural production. A plot-level analysis is not feasible because agricultural input data is recorded at the household level and is therefore not plot specific. However, as we are primarily concerned with documenting the farm size – productivity relationship in Mexico and how it has changed over time, and are less concerned with fully explaining its determinants, a farm level analysis will suffice. Households in the MxFLS move in and out of agricultural production between survey waves. An unbalanced panel is constructed through two stages of restricting the MxFLS data: first, cross-sections of households with complete farm data are identified and cleaned to eliminate outliers, and second, the unbalanced panel is formed out of all households that appear in two or more MxFLS survey waves. In Table 2.1, all households using plots for agricultural production in a given survey wave are referred to as agricultural households, whereas all households with plot size and output data for all non-fallow plots are referred to as complete farms.
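The two-stage restriction described above can be sketched as follows. This is a minimal illustration and not the authors' code: the record layout, the household identifiers, and the function name `build_unbalanced_panel` are all hypothetical, the numbers are invented, and the outlier-cleaning step is omitted because the text does not specify its rules.

```python
from collections import defaultdict

# Hypothetical records: (household_id, wave, farm_size_ha, output_index).
# None marks a missing value; all numbers are invented for illustration.
records = [
    ("h1", 2002, 2.5, 10.0),
    ("h1", 2005, 2.5, 12.0),
    ("h2", 2002, 0.4, None),   # incomplete: no output data in this wave
    ("h2", 2009, 0.4, 3.0),
    ("h3", 2005, 8.0, 40.0),
    ("h3", 2009, 8.0, 44.0),
    ("h4", 2002, 1.0, 5.0),    # complete, but observed in one wave only
]


def build_unbalanced_panel(rows, min_waves=2):
    """Keep complete farm-wave rows, then households seen in >= min_waves waves."""
    # Stage 1: cross-sections of complete farms (outlier cleaning omitted).
    complete = [r for r in rows if r[2] is not None and r[3] is not None]
    # Stage 2: count distinct waves per household among the complete rows.
    waves = defaultdict(set)
    for hh, wave, _, _ in complete:
        waves[hh].add(wave)
    keep = {hh for hh, seen in waves.items() if len(seen) >= min_waves}
    return [r for r in complete if r[0] in keep]


panel = build_unbalanced_panel(records)
panel_ids = sorted({r[0] for r in panel})
```

In this toy example only households `h1` and `h3` survive both stages: `h2` has complete data in a single wave and `h4` appears only once.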
The intermediate group, farms with farm size data, includes all farms with complete farm size data but not necessarily complete production data – this less restricted dataset increases the sample size at the expense of potentially introducing some measurement error, and is an alternative treatment of the data that is pursued below. Lastly, the number of farms in the panel includes the number of households with complete farm data in two or more of the survey years. These restrictions on the data leave us with a sample of 566 farms reappearing in two or more survey years. Table 2.2 describes these farms according to the combination of survey years in which they appear.
Farms are classified into one of seven farm size groups, as shown below in Table 2.3. The distribution of farms across these bins is roughly constant over time and across treatments of the data, although the share of farms between 0 and 0.5 ha is falling over time while the share of farms between 0.5 and 1 ha is increasing. Importantly, with the exception of the share of farms between 0.5 and 1 ha in 2002, the distribution does not change in any notable way as we restrict the cross section to form the panel, an indication that use of the panel has not introduced bias along this dimension. Farm sizes in the sample vary considerably, from less than one hundredth of a hectare to 45,000 hectares. The median farm size in the panel is 2.5, 2.1, and 3.0 hectares in 2002, 2005, and 2009, respectively, with mean farm sizes of 101, 232, and 218 hectares. Around 75 percent of farms utilize only one plot for production in any given year.
The preferred measure of agricultural output is a Fisher quantity index that includes all crop and livestock production for each farm in the MxFLS panel. Crop prices from the Food and Agriculture Organization of the United Nations are used to aggregate crop output.
Together with a measure of the value of livestock production, an output index is constructed as detailed in Appendix B.1. The MxFLS offers data on five agricultural inputs other than land: physical capital, draft animals, purchased intermediate inputs, family labor, and non-family labor.
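For readers unfamiliar with the Fisher quantity index mentioned above, the sketch below shows the textbook bilateral formula: the geometric mean of the Laspeyres and Paasche quantity indices. It is only an illustration. The actual construction, including the choice of reference basket and the treatment of livestock, is the one detailed in Appendix B.1; the crop names, prices, and quantities here are invented, and the function name is hypothetical.

```python
import math


def fisher_quantity_index(p0, q0, p1, q1):
    """Fisher quantity index of q1 relative to q0.

    p0, q0: prices and quantities in the reference situation.
    p1, q1: prices and quantities in the comparison situation.
    All four dicts are assumed to share the same commodity keys.
    """
    # Laspeyres quantity index: both baskets valued at reference prices.
    laspeyres = sum(p0[k] * q1[k] for k in p0) / sum(p0[k] * q0[k] for k in p0)
    # Paasche quantity index: both baskets valued at comparison prices.
    paasche = sum(p1[k] * q1[k] for k in p1) / sum(p1[k] * q0[k] for k in p1)
    return math.sqrt(laspeyres * paasche)


# Invented two-crop example.
p0 = {"maize": 2.0, "beans": 5.0}
q0 = {"maize": 10.0, "beans": 4.0}
p1 = {"maize": 3.0, "beans": 4.0}
q1 = {"maize": 12.0, "beans": 5.0}

index = fisher_quantity_index(p0, q0, p1, q1)
```

Here the Laspeyres index is 49/40 and the Paasche index is 56/46, so the Fisher index lies between the two. Comparing a basket with itself returns exactly 1.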
Physical capital is measured as the value of tractors and other machines and equipment owned, and draft animals as the value of horses, donkeys, and mules owned, by each household in each survey year, deflated to 2002 values. Purchased intermediate inputs are measured using reported expenditures on each of nine agricultural inputs over the course of the previous year, again deflated to 2002 values. An index of family labor is constructed using household members’ time use and employment data in the MxFLS, and is an estimate of annual hours worked on the farm by all household members. In contrast, the non-family labor index is a measure of the number of non-household individuals that worked on each farm in each year, measured in workers and not labor hours. Appendix B.2 provides a detailed discussion of the source and construction of the family labor and non-family labor indices, including a set of alternative family labor indices. Table 2.4 shows the share of panel households using the different input categories in each year, with purchased intermediate inputs shown both collectively and further disaggregated into their nine components. For every input category, at least some households, and in some cases a majority, report zero use. This is expected, as farms in the sample are expected to span a range from low technology subsistence agriculture to more modern and input intensive operations. Furthermore, many inputs may be substitutes for each other, and farms can access these inputs by owning them or by purchasing them in factor markets. Tractor services, for example, may be substituted for with draft animals. Households can either own some combination of these capital stocks or purchase their services from the market. We follow Battese to estimate production functions with observations having zero inputs.
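The text does not spell out the zero-input procedure, so the following is a sketch of the dummy-variable approach commonly attributed to Battese, stated here as an assumption about what is meant: for each input, a dummy equal to one flags zero use, and the log term is computed on max(x, dummy), so that it equals zero for non-users and ln(x) for users. Both the dummy and the log term then enter the log-linear production function. The function name `battese_transform` is hypothetical.

```python
import math


def battese_transform(x):
    """Return (dummy, log_term) for one non-negative input value x.

    dummy is 1 when the input is not used (x == 0) and 0 otherwise.
    log_term is ln(max(x, dummy)): 0.0 for non-users, ln(x) for users.
    """
    if x < 0:
        raise ValueError("input quantities must be non-negative")
    dummy = 1 if x == 0 else 0
    return dummy, math.log(max(x, dummy))


# Invented per-hectare input values for three farms.
transformed = [battese_transform(x) for x in (0.0, 1.0, 7.5)]
```

The point of the design is that zero-input observations are kept in the sample rather than dropped or replaced by an arbitrary small constant, while the dummy lets the intercept differ for farms that do not use the input.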
Of principal importance is any relationship between inputs per hectare and farm size, as systematic relationships between input intensity and farm size potentially drive a wedge between the farm size – land productivity and farm size – total factor productivity relationships. We calculate the correlation coefficients between logged input per hectare and logged farm size for those farms with non-zero usage of each input. These correlations are shown in Table 2.5. Conditional on using the input, the intensity of all inputs used declines with farm size, emphasizing the importance of moving from partial measures of productivity to a comprehensive measure such as TFP.
The vast majority of plots are either privately owned property or are part of an ejido – a piece of communally held land where plots are farmed by designated households. It is commonly accepted that ejidos are less productive than privately held farms, although there is little empirical evidence comparing the TFP of these farms using micro data. At least 91% of privately held plots in the MxFLS have some form of formal documentation in any given year, while just 75-84% of MxFLS ejido properties do. Privately held plots primarily have a formal deed or title to the land as documentation, whereas ejido plots primarily have a certificate of ejido status or agricultural rights.
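Returning to the input-intensity correlations reported in Table 2.5, the calculation can be sketched as below: a Pearson correlation between ln(input per hectare) and ln(farm size), restricted to farms with non-zero use of the input. The data are invented and the function name is hypothetical; this is an illustration of the calculation as described, not the authors' code.

```python
import math


def log_log_correlation(inputs, areas):
    """Pearson correlation of ln(input/ha) and ln(area) over farms using the input."""
    # Drop farms with zero use of the input (and guard against zero area).
    pairs = [
        (math.log(x / a), math.log(a))
        for x, a in zip(inputs, areas)
        if x > 0 and a > 0
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two farms using the input")
    mean_u = sum(u for u, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in pairs)
    s_uu = sum((u - mean_u) ** 2 for u, _ in pairs)
    s_vv = sum((v - mean_v) ** 2 for _, v in pairs)
    return s_uv / math.sqrt(s_uu * s_vv)


# Invented example: total input use and farm area (ha) for four farms.
inputs = [10.0, 0.0, 30.0, 50.0]   # the second farm does not use the input
areas = [1.0, 2.0, 10.0, 100.0]

r = log_log_correlation(inputs, areas)
```

In this toy example input use per hectare falls as area rises, so the correlation is negative, which is the pattern the text reports for every input conditional on use.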
Formal documentation of property rights is important for accessing credit and is expected to be positively correlated with TFP. How property rights are formally documented matters, however, as a certificate of ejido status is often not acceptable to private financial institutions for use as collateral whereas formal deeds are. We control for both separately in the core empirical analysis. Because ejidos may function differently than privately owned parcels, we control for ejido status. Ejido farms make up 58% of the panel, and the ejido status of farms does not change for almost all farms in the panel.
Panel farms are located in 92 distinct communities and are grouped into five regions in Mexico: the North, Center, Pacific, South, and Gulf. In the first survey wave, 26% of panel farms are in Northern states where agriculture is characterized by having larger commercial farms with greater importance of the commercial production of maize. In comparison, 50% of first wave farms are in Southern and Central states where agriculture is characterized by more traditional, smallholder maize producers and the commercial production of fruits and edible vegetables. In tests of heterogeneity, we introduce regional interactions with farm size in estimations of equation , allowing the farm size – TFP relationship to vary across agricultural regions.
Additional household level controls are grouped into two broad categories: variables describing agricultural practices that are mostly endogenous, and demographic variables that are largely exogenous. Household demographic variables are based on predetermined characteristics of the household head. The panel farms predominantly have male, married, and Spanish speaking heads of household, with little difference across farm sizes or ejido status.
Table B.3.5 in Appendix B.3 shows that farms larger than about 5 ha appear to be less likely to have an indigenous household head and more likely to have a literate household head than do smaller farms. Literacy is just one way to measure educational attainment of the household head, and it captures a rather low bar. We measure the education of the household head by creating indicator variables for the highest level of formal schooling attended, from no formal education to elementary school, secondary school, high school, or college education. With little variation across survey years, Table B.3.6 in Appendix B.3 shows educational attainment by farm size for 2002 only, showing that a majority of farms have household heads with no more than an elementary school education, while almost one quarter of the panel’s household heads have no formal education at all.
The following variables describing agricultural practices of farms are potentially endogenous, and for this reason are not included in the base specifications. They are introduced to shed light on potential channels affecting TFP and the farm size – TFP relationship. Any farm that does not bring any of its crop to market is classified as a subsistence farm, identifying farms that may behave differently than those that do. There is little difference in the prevalence of subsistence farming between ejido and non-ejido farms. As shown in Table B.3.1 in Appendix B.3, subsistence farming decreases with farm size, as expected. We calculate the share of each farm’s crop that is marketed – on average, those farms in the sample that do participate in the market sell around 75% of their production. This appears relatively constant across farm size bins. Alongside subsistence farming practices, Table B.3.1 in Appendix B.3 shows the share of farms engaged in monocropping.