The extract solution contains Griffithsin as well as the host and viral protein impurities

To ensure consistency of the nutrient solution, all water was assumed to be treated by reverse osmosis, with solution monitoring for proper pH and dissolved solids content. The three phases of plant growth require a total batch time of 38 days in the upstream portion of the facility. Due to the protracted and continuous nature of plant cultivation, the upstream portion of the facility contains multiple concurrent batches staggered at different stages of growth. When one batch graduates to the next step of production, the trays containing the batch’s biomass are cycled out and the corresponding rack space is immediately filled with a new rotation of trays. We divided the 38-day growth period into 11 concurrent batch periods, with one batch ready to enter downstream purification every 3.44 days. Table 1 is a summary of the number of plants, trays and batches that comprise the upstream facility at any given moment. For model building, batch schedules were calculated under the initial assumption of 24/7 operation for 330 days per year. Plant uptake of nutrients and growth were assumed to be linear, reaching 15 g fresh weight (FW) per plant at viral inoculation and then increasing in mass to reach 40 g FW per plant at harvest. A 5% failure rate of TMV inoculation was assumed. The Griffithsin expression rate was fixed at 0.52 g/kg FW of harvested biomass, with a downstream recovery of 70%, based on pilot-scale results. Additionally, nutrient solution demand was assumed to match observed biomass growth rates, assuming that for each kilogram of nutrient solution, 0.5 kilogram goes into biomass and the remainder is considered aqueous waste. The materials used, quantities and source are summarized in Supplementary Table 1 in Supplementary Material, together with clarifying comments and references that were used to assist in the calculations. Using the inputs shown in Supplementary Table 1, the upstream and downstream processes were modeled in SuperPro.

The results generated by the software for the upstream operations are shown in Figure 1, with scheduling shown in the equipment occupancy chart in Figure 2. The following descriptions elaborate on the schema presented in each figure. Griffithsin recovery and purification was modeled as a batch process in a facility with an available operating time of 330 days a year, 24 h a day and 7 days a week. Each year, a total of 95 batches produce 20 kg of purified Griffithsin API. Since the recovery and purification process only takes 1.6 days, the downstream facility has a significant down time of 2.78 days between batches. Overall, each batch requires 39.6 days from seed planting to formulating the final product, with 38 days upstream and 1.6 days downstream. In Figure 1, the upstream processes are dictated by 11 concurrent batches, with each batch 3.44 days apart from the next. A batch basis of 3.44 days was chosen to decrease equipment idle time and thereby increase downstream equipment utilization efficiency. Despite the 39.6-day batch period and a 332-day operating year, in the model the batch time upstream was reduced to approximately 38 days and the operating year was increased to 365 days to reach the desired 95 batches per year. This was done because SuperPro reproduces uniform results for each year. The goal of the upstream process operations is to produce sufficient biomass to enable isolation of 20 kg Griffithsin per annum. The modeling results show that each batch would produce 578 kg of biomass containing 300 g of Griffithsin, assuming an expression yield of 0.52 g API/kg FW biomass. Because induction was modeled using infection with recombinant TMV vector, the three main phases in upstream are germination, pre-inoculation, and post-inoculation. The durations of the phases in the model are 21 days, 3 days, and 14 days, respectively. Each batch of N. benthamiana plants goes through a germination phase of 21 days, and the germination room is designed with a capacity to grow the 86,700 plants necessary to reach the production goal.
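The batch arithmetic in this passage can be re-derived with a short calculation. The sketch below is not part of the SuperPro model; it only reproduces the quoted figures, using the 70% downstream recovery stated earlier. It assumes (our assumption, not the authors' stated method) that the number of batches resident in each phase is the phase duration divided by the 3.44-day stagger, rounded to the nearest whole batch. All variable names are illustrative.

```python
# Figures quoted in the text; names are illustrative, not SuperPro identifiers.
BATCH_INTERVAL_DAYS = 3.44
PHASE_DAYS = {"germination": 21, "pre-inoculation": 3, "post-inoculation": 14}
BATCHES_PER_YEAR = 95
BIOMASS_KG_PER_BATCH = 578.0
EXPRESSION_G_PER_KG_FW = 0.52
DOWNSTREAM_RECOVERY = 0.70

# Assumption: batches resident in a phase = phase length / stagger, rounded.
concurrent = {
    phase: round(days / BATCH_INTERVAL_DAYS) for phase, days in PHASE_DAYS.items()
}
total_concurrent = sum(concurrent.values())

# Per-batch and annual Griffithsin mass balance.
crude_g_per_batch = BIOMASS_KG_PER_BATCH * EXPRESSION_G_PER_KG_FW
purified_g_per_batch = crude_g_per_batch * DOWNSTREAM_RECOVERY
annual_kg = BATCHES_PER_YEAR * purified_g_per_batch / 1000.0

print("concurrent batches by phase:", concurrent, "total:", total_concurrent)
print("crude g/batch:", round(crude_g_per_batch, 1))
print("purified g/batch:", round(purified_g_per_batch, 1))
print("annual kg:", round(annual_kg, 2))
```

Under these assumptions the phase counts sum to the 11 concurrent batches quoted above, and 95 batches per year give an annual output close to the 20 kg target.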

This step of the process uses 90 germination trays, each holding about 960 plants, distributed among 6 batches in the germination room. At 21 days after germination, the N. benthamiana plants are transplanted to a lower density to enable further growth. Seedlings from one germination tray are transplanted into three grow trays, meaning that each batch occupies three times as many trays in pre- and post-inoculation as in germination. The plant density is 646 plants per m² in the germination trays and 215 plants per m² after transplantation. In practice, during transplantation each plant will spend only a few minutes away from its growth environment to minimize transplant shock and undue stress. In the model, the overall time was overestimated at 3 h to accommodate other necessary procedures, such as moving the plants back to the tray stacks. The transplanted trays are relocated for ∼3 days to pre-inoculation rooms that are designed to accommodate the increased area from transplanting. The pre-inoculation room contains 1 batch of 45 trays with 320 plants per tray. Recombinant TMV for inoculation is produced in and isolated from N. benthamiana. The plant growth model is the same as for the rest of the N. benthamiana plants. By using infected plants and the purification model defined by Leberman, 4 mg of pure TMV per gram of infected plant material can be recovered. Each batch is equivalent to 14,450 plants distributed on 45 trays. Less than 1 microgram of TMV virion is needed to inoculate each plant. Thus, approximately 14.5 mg of TMV is needed per batch, and the necessary amount of TMV to inoculate a batch can be produced from a single N. benthamiana plant. Multiple batches of TMV solution can be made simultaneously and stored at −20°C. TMV production can be done at lab scale, and equipment, labor and material costs are negligible compared to the overall cost of plant maintenance.
The isolated TMV is incorporated into a diatomaceous earth buffer solution at a concentration of 10 micrograms per 2.5 mL; the solution contains 1% by volume diatomaceous earth and 2% by volume sodium/potassium-based buffer.
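The tray counts and the TMV requirement described above follow from a few multiplications, sketched here as a consistency check. The variable names are illustrative. The 40 g mass of the TMV source plant is an assumption: it is taken from the harvest weight stated earlier, on the basis that the text says the same growth model applies. The 1 microgram per plant figure is used as an upper bound, since the text says "less than 1 microgram".

```python
# Figures quoted in the text; names are illustrative only.
PLANTS_PER_BATCH = 14_450
GERMINATION_BATCHES = 6
GERMINATION_TRAYS = 90
GROW_TRAYS_PER_GERMINATION_TRAY = 3

# Germination room capacity and tray loading.
germination_plants = PLANTS_PER_BATCH * GERMINATION_BATCHES  # 86,700 plants
plants_per_germination_tray = germination_plants / GERMINATION_TRAYS  # "about 960"
germination_trays_per_batch = GERMINATION_TRAYS // GERMINATION_BATCHES

# After transplanting, one germination tray becomes three grow trays.
grow_trays_per_batch = germination_trays_per_batch * GROW_TRAYS_PER_GERMINATION_TRAY
plants_per_grow_tray = PLANTS_PER_BATCH / grow_trays_per_batch  # "320" in the text

# TMV needed per batch versus TMV recoverable from one infected plant.
TMV_UG_PER_PLANT = 1.0          # upper bound; the text says "less than 1 microgram"
TMV_MG_PER_G_INFECTED = 4.0     # recovery quoted from the Leberman purification
ASSUMED_SOURCE_PLANT_G = 40.0   # assumption: source plant harvested at 40 g FW
tmv_needed_mg = PLANTS_PER_BATCH * TMV_UG_PER_PLANT / 1000.0
tmv_from_one_plant_mg = TMV_MG_PER_G_INFECTED * ASSUMED_SOURCE_PLANT_G

print("germination plants:", germination_plants)
print("grow trays per batch:", grow_trays_per_batch)
print("TMV needed (mg):", tmv_needed_mg)
print("TMV from one plant (mg):", tmv_from_one_plant_mg)
```

Under these assumptions a single infected plant yields roughly ten times the TMV needed for one batch, which is consistent with the claim that one plant is sufficient.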

The selected inoculation volume of 2.5 mL is a safe middle value from the range suggested in the literature. In the model, the estimated mixing and transfer time for the solution is 1 h, which starts at the beginning of post-inoculation, so the plants and solution enter the same stage together. A forklift is used to transport the plants into the inoculation room. The plants are inoculated with the diatomaceous earth buffer solution described above using a high-velocity spray. Inoculation machines are often custom made and consist of a conveyor traveling through an enclosed cylinder equipped with high-pressure spray nozzles aimed at the plants’ aerial structures. Once the inoculation is complete, the trays are conveyed to the post-inoculation growth room, which is similar in design to the pre-inoculation growth room, the main difference being its size. The post-inoculation room contains 4 batches at any given time for a total of 180 trays with 320 plants per tray. The scheduling of 3 batches is summarized in the equipment occupancy chart in Figure 2. As shown, seeding, germination, transplant, pre-inoculation, inoculation, and post-inoculation occur sequentially, and the batches are staggered by 3.44 days.

The downstream unit operations developed in SuperPro are shown in Figure 3, with scheduling summarized in the equipment occupancy chart shown in Figure 4. The following descriptions elaborate on the schema presented in each figure. At the end of each 3.44-day growing rotation cycle upstream, one batch of N. benthamiana plants is ready to be transferred to downstream processing. This is done by placing each tray of plants onto a conveyor system which leads them to the first phase of downstream operations. The matured plants are first harvested for the green biomass, from which the majority of Griffithsin can be recovered with a single extraction.
Additional Griffithsin could be recovered from fibrous material by reprocessing and from roots; however, reprocessing was not included in this model. The automated harvester processes the 578 kilograms of biomass at a rate of 193 kilograms of biomass per hour. With an operational buffer time of 1 h, this process is thus expected to take 4 h. As the biomass is processed by the harvester, it is directly fed into a shredder, which further comminutes the biomass to improve Griffithsin recovery. The shredder operates at a capacity of 193 kg of harvested biomass per hour for 2.8 h. The shredded biomass is then mixed with an extraction buffer in a buffer addition tank. For every kilogram of plant material, 1 L of extraction buffer is added. Thus, for 578 kg of N. benthamiana in a batch, approximately 578 L of extraction buffer are added. The resultant solid-liquid mixture has a total volume of about 1,135 L and is sent through a screw press, which is represented as a generic box in the model. The screw press separates the solid-liquid slurry, leaving a main process fluid stream of plant extract and a waste stream of biomass. A loss of about 12% of the original starting Griffithsin was modeled, assuming it to be non-liberated from the homogenized biomass. The removal of the biomass leaves a main process stream that contains about 585 L. To facilitate the aggregation of proteinaceous impurities, the extract solution is transferred into a mixing tank and heated to 55°C for 15 min. The mixture is passively cooled and simultaneously transferred out of the tank and fed into the first 0.3 µm plate-and-frame filter. The extract solution is filter-pressed at 25–30 psig to remove the aggregated protein impurities.

Filtering has a process time of 1 h and requires a filter area of 3 m² to handle the 590 kg/batch of the process stream. At this stage, the process loses a further 8% of the Griffithsin but removes all the RuBisCO and 87% of the TMV coat protein impurities. The filtrate from this step is transferred to a second mixing and storage tank, mixed with bentonite clay and magnesium chloride, and stored at 4°C for a 12-h period. This stage is the bottleneck operation for the downstream process. After the 12-h incubation, the solution is filtered through a second 0.3 µm filter press and a 0.2 µm inline sterilizing filter. These operations remove the remaining protein impurities, leaving a Griffithsin extract with greater than 99% purity but at the cost of losing 6% of the Griffithsin. The second plate-and-frame filter has a filter area of about 3 m² and will process all of the extract in 1 h. There is approximately 222 g of Griffithsin per batch at the end of the filtration phase. Following the filtration steps, the Griffithsin extract solution is collected in a storage tank and further purified using an AxiChrom column with Capto MMC resin to remove residual color and potential non-proteinaceous impurities. To accommodate the 222 g of Griffithsin in solution, 4.9 L of MMC bed resin is needed at a 45 mg/mL binding capacity. The order of operations for this chromatography step is: equilibrate, load, wash, elute, and regenerate. In total, chromatography requires 10 h, with the load step taking the longest, at 8 h, because approximately 600 L of solution are processed. Chromatography is necessary to decolorize the extract, at the expense of losing 4% of the Griffithsin, giving a remaining Griffithsin mass of 210 g per batch. The 10 L of eluant process fluid is sent through a viral clearance filter and transferred into a pool/storage tank. Subsequently, the extract is sent through an ultrafiltration/diafiltration cycle to remove salts introduced in the chromatography column.
After ultrafiltration, the product is transferred into a storage tank to be mixed with the final formulation components. The concentrated Griffithsin is diluted to give a concentration of 10 g/L Griffithsin in 10 mM Na2HPO4, 2.0 mM KH2PO4, 2.7 mM KCl and 137 mM NaCl at pH 7.4. The final volume of the DS is 21 L per batch. As shown by Figure 4, each batch in the downstream requires 39 h of process time which includes all SIP and CIP operations.
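The downstream mass balance described above can be traced step by step. The sketch below is illustrative only and is not taken from the SuperPro model. It assumes that each quoted loss is a percentage of the starting 300 g of Griffithsin, so that losses add rather than compound. That reading is an inference: it is the one that reproduces the 222 g and 210 g figures in the text. Step labels and variable names are hypothetical.

```python
# Starting Griffithsin per batch and per-step losses quoted in the text.
START_G = 300.0
STEP_LOSSES = [
    ("screw press", 0.12),
    ("first filter press", 0.08),
    ("bentonite step and second filtration", 0.06),
    ("chromatography", 0.04),
]


def remaining_after(n_steps):
    """Griffithsin (g) left after the first n_steps.

    Assumption: each loss is a fraction of the STARTING mass (additive).
    """
    lost_fraction = sum(loss for _, loss in STEP_LOSSES[:n_steps])
    return START_G * (1.0 - lost_fraction)


after_filtration_g = remaining_after(3)
after_chromatography_g = remaining_after(4)

# Resin volume: 1 mg/mL equals 1 g/L, so grams / (mg/mL) gives litres.
BINDING_CAPACITY_MG_PER_ML = 45.0
resin_volume_l = after_filtration_g / BINDING_CAPACITY_MG_PER_ML

# Final drug substance volume at the formulated concentration.
FINAL_CONC_G_PER_L = 10.0
ds_volume_l = after_chromatography_g / FINAL_CONC_G_PER_L

# Harvest time: biomass / harvester rate, plus the 1 h operational buffer.
harvest_hours = 578.0 / 193.0 + 1.0

print("after filtration (g):", round(after_filtration_g, 1))
print("after chromatography (g):", round(after_chromatography_g, 1))
print("resin volume (L):", round(resin_volume_l, 2))
print("DS volume (L):", round(ds_volume_l, 1))
print("harvest time (h):", round(harvest_hours, 2))
```

With additive losses the overall recovery is 70%, matching the downstream recovery assumed for the model. Compounding the same percentages would leave more than 222 g after filtration, which is why the additive reading is used here.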

The estimated productivity gaps in GLW are an order of magnitude larger than our estimates

The shift out of agriculture and into other more “modern” sectors has long been viewed as central to economic development. This structural transformation was a focus of influential early scholarship, with the issue even stretching back to Soviet debates over whether to “squeeze” farmer surplus to hasten industrialization. A more recent macroeconomic empirical literature has revived interest in these issues, often using data from national accounts. This body of work has documented several important patterns that help shed light on the sources of income differences across countries. First, it shows that the share of labor in the agricultural sector correlates strongly and negatively with levels of per capita income: most workers in the poorest countries work in agriculture, while only a small share do in wealthy countries. Importantly, while income per worker is only moderately larger for non-agricultural workers in wealthy countries relative to poor countries, agricultural workers are many times more productive in rich countries. This creates a double disadvantage for poor countries: agricultural work tends to be far less productive in low-income countries, yet the workforce is concentrated in this sector. Studies that explore the closely related gap between the urban and rural sectors reach similar conclusions.

Several recent studies have examined the extent to which these productivity gaps across sectors can reasonably be viewed as causal impacts rather than mainly reflecting worker selection. By a causal impact of sector, we mean that a given worker employed in the non-agricultural sector is more productive than the same worker employed in the agricultural sector. In contrast, worker selection would reflect differences driven by the fact that workers of varying ability and skill levels are concentrated in particular sectors.

This paper seeks to disentangle these two competing explanations by estimating sectoral wage gaps using unusually long-run individual-level panel data from two low-income countries, Indonesia and Kenya. If there are causal impacts of sector, the large share of the workforce employed in the agricultural sector in low-income countries could be viewed as a form of input misallocation along the lines of Hsieh and Klenow and of Restuccia and Rogerson. The resolution of this econometric identification issue, namely, distinguishing causal effects from selection, is not solely of scholarly interest: the existence of causal sectoral productivity gaps would imply that the movement of population out of rural agricultural jobs and into other sectors could durably raise living standards in low-income countries, narrowing cross-country differences. The existence of large causal sectoral productivity gaps also raises questions about the nature of the frictions that limit individual movement into more productive employment, and the public policies that might promote such moves or hinder them.

Gollin, Lagakos, and Waugh (hereafter GLW) and Young are two important recent studies that explore this identification issue. GLW examine labor productivity gaps in non-agricultural employment versus agriculture using a combination of national accounts and repeated cross-sectional data from micro-surveys, and document a roughly three-fold average productivity gap across sectors. In their main contribution, GLW show that accounting for differences in hours worked and average worker schooling attainment across sectors, thus partially addressing worker selection, reduces the average estimated agricultural productivity gap by a third, from roughly 3 to 2.
They also find that agricultural productivity gaps and per capita consumption gaps based on household data remain large but tend to be somewhat smaller than those estimated using national labor surveys, possibly in part due to differences in how each source measures economic activity. GLW remain agnostic regarding the causal interpretation of the large agricultural productivity gaps that they estimate. If individual schooling captures the most important dimensions of worker skill and thus largely addresses selection, GLW’s estimates would imply that the causal impact of moving workers from agriculture to the non-agricultural sector in low-income countries would be to roughly double productivity, a large effect.

Of course, to the extent that educational attainment alone fails to capture all aspects of individual human capital, controlling for it would not fully account for selection. Young examines the related question of urban-rural differences in consumption, rather than productivity, and similarly finds large cross-sectional gaps. Using Demographic and Health Surveys that have retrospective information on individual birth district, Young shows that rural-born individuals with more years of schooling than average in their sector are more likely to move to urban areas, while urban-born individuals with less schooling tend to move to rural areas. Young makes sense of this pattern through a model which assumes that there is more demand for skilled labor in urban areas, shows that this could generate two-way flows of the kind he documents, and argues that he can fully explain urban-rural consumption gaps once he accounts for sorting by education.

The current study directly examines the issue of whether measured productivity gaps are causal or mainly driven by selection using long-term individual-level longitudinal data on worker productivity. Use of these data allows us to account for individual fixed effects, capturing all time-invariant dimensions of worker heterogeneity, not just educational attainment. We focus on two country cases, Indonesia and Kenya, that have long-term panel micro data sets with relatively large sample sizes, rich measures of earnings in both the formal and informal sectors, and high rates of respondent tracking over time. The datasets, the Indonesia Family Life Survey and the Kenya Life Panel Survey, are described in greater detail below. For both countries, we start by characterizing the nature of selective migration between non-agricultural and agricultural economic sectors, and between urban and rural residence.
Like Young, we show that individuals born in rural areas who attain more schooling are significantly more likely to migrate to urban areas and are also more likely to hold non-agricultural employment, while those born in urban areas with less schooling are more likely to move to rural areas and into agriculture.

We exploit the unusual richness of our data, in particular the existence of measures of cognitive ability, to show that those of higher ability in both Indonesia and Kenya are far more likely to move into urban and non-agricultural sectors, even conditional on educational attainment. This is a strong indication that conditioning on completed schooling is insufficient to fully capture differences in average worker skill levels across sectors. We next estimate sectoral productivity differences, and show that treating the data as a repeated cross-section generates large estimated sectoral productivity gaps, echoing the results in existing work. In our main finding, we show that the inclusion of individual fixed effects reduces estimated sectoral productivity gaps by over 80 percent. This pattern is consistent with the bulk of the measured productivity gaps between sectors being driven by worker selection rather than causal impacts.

Specifically, we first reproduce the differences documented by GLW for Indonesia and Kenya, presenting both the unconditional gaps as well as adjusted gaps that account for worker labor hours and education. These are large for both countries, with raw gaps of around 130 log points, implying roughly a doubling of productivity in the non-agricultural sector. When we treat our data as a series of repeated cross-sections, the gaps remain large, at 60 to 80 log points. These are somewhat smaller than GLW’s main estimates, though recall that GLW’s estimates using household survey data also tend to be smaller. Conditioning on individual demographic characteristics as well as hours worked and educational attainment narrows the gap, but it remains large at between 30 and 60 log points. Finally, including individual fixed effects reduces the agricultural productivity gap in wages to 4.7 log points in Indonesia and to 13.4 log points in Kenya, and neither effect is statistically significant.
Analogous estimates show that productivity gaps between urban and rural areas are also reduced substantially, to zero in Indonesia and 13.2 log points in Kenya. We obtain similar results for the gap in per capita consumption levels across sectors in Indonesia, where consumption data are available. This is useful since consumption measures may better capture living standards in less developed economies than earnings measures, given widespread informal economic activity. Furthermore, we show that the small estimated productivity gap is not simply a short-run effect by demonstrating that gaps do not emerge even up to five years after an individual moves to an urban area. We also find that productivity gaps are no larger even when considering only moves to the largest cities in Indonesia and Kenya.
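The mechanism behind the main finding, that a pooled cross-sectional comparison overstates the sectoral gap when higher-ability workers sort into non-agriculture while a within-worker (fixed effects) comparison does not, can be illustrated with simulated data. The sketch below is a toy example under stated assumptions, not the paper's data or specification. The true gap of 0.10 log points, the noise levels, and the selection rule are all hypothetical choices, and ability is assumed to be fixed over time.

```python
import random


def simulate_panel(n_workers=2000, n_periods=4, true_gap=0.10, seed=0):
    """Simulate (worker, non-agricultural dummy, log wage) rows.

    Assumption: time-invariant ability raises wages AND the chance of
    working in non-agriculture, which creates selection bias.
    """
    rng = random.Random(seed)
    rows = []
    for worker in range(n_workers):
        ability = rng.gauss(0.0, 1.0)
        for _ in range(n_periods):
            nonag = 1.0 if ability + rng.gauss(0.0, 1.0) > 0 else 0.0
            log_wage = true_gap * nonag + ability + rng.gauss(0.0, 0.5)
            rows.append((worker, nonag, log_wage))
    return rows


def slope(xs, ys):
    """OLS slope of ys on xs with an intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def pooled_gap(rows):
    """Treat the panel as a repeated cross-section (no worker effects)."""
    return slope([d for _, d, _ in rows], [y for _, _, y in rows])


def fixed_effects_gap(rows):
    """Within estimator: demean sector and wage by worker, then regress."""
    by_worker = {}
    for worker, d, y in rows:
        by_worker.setdefault(worker, []).append((d, y))
    d_dev, y_dev = [], []
    for obs in by_worker.values():
        mean_d = sum(d for d, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for d, y in obs:
            d_dev.append(d - mean_d)
            y_dev.append(y - mean_y)
    # Workers who never switch sector contribute zeros to both sums.
    return sum(d * y for d, y in zip(d_dev, y_dev)) / sum(d * d for d in d_dev)


rows = simulate_panel()
print("pooled gap (log points):", round(pooled_gap(rows), 3))
print("fixed-effects gap (log points):", round(fixed_effects_gap(rows), 3))
```

In this simulation the pooled estimate is many times the true gap, while the within estimate is close to it. Note that only workers observed in both sectors contribute to the within estimate, so it is identified from movers alone.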

Our methodological approach is related to Hendricks and Schoellman, who use panel data on the earnings of international migrants to the United States, including on their home country earnings. Mirroring our main results, the inclusion of individual fixed effects in their case greatly reduces the return to international migration. Similarly, McKenzie et al. show that cross-sectional estimates of the returns to international immigration exceed those using individual panel data or those derived from a randomized lottery. Bryan et al. estimate positive gains in consumption in the sending households of individuals randomly induced to migrate within Bangladesh, although no significant gains in total earnings. Bazzi et al. argue that cross-sectional estimates of productivity differences across rural areas within Indonesia are likely to overstate estimates derived from panel data using movers. Other related studies on the nature of selective migration include Chiquiar and Hanson, Yang, Beegle et al., Kleemans, and Rubalcava et al., among others.

A limitation of the current study is that we focus on two countries, in contrast to the scores of countries in GLW and Young. This is due to the relative scarcity of long-run individual panel data sets in low-income countries that contain the rich measures necessary for our analysis. That said, the finding of broadly similar patterns in both countries, each with large populations in two different world regions, suggests some generalizability. Another important issue relates to the local nature of our estimates, namely, the fact that the fixed effects estimates are derived from movers, those with productivity observations in both the non-agricultural and agricultural sectors. It is possible that productivity gains could be different among non-movers, an issue we discuss in Section 2 below.
There we argue that, to the extent that typical Roy model conditions hold and those with the largest net benefits are more likely to move, selection will most likely produce an upward bias, leading our estimates to be upper bounds on the true causal impact of moving between sectors. However, absent additional knowledge about the correlation between individual preferences, credit constraints, and unobserved productivity shocks, it is in principle possible that selection could bias our estimates downward instead. Similarly, it is possible that very long-run and even inter-generational “exposure” to a sector could persistently change individual productivity due to skill acquisition, and this opens up the possibility that selection and causal impacts are both important. We return to these important issues of interpretation in the conclusion, including ways to reconcile our estimates with existing empirical findings.

The paper is organized as follows. Section 2 presents a conceptual framework for estimating sectoral productivity gaps, and relates it to the core econometric issue of disentangling causal impacts from worker selection. Section 3 describes the two datasets; characterizes the distinctions between the non-agricultural and agricultural sectors, and urban vs. rural areas; and presents evidence on individual selection between sectors. Section 4 contains the main empirical results on productivity gaps, as well as the dispersion of labor productivity across individuals by sector, consumption gaps, dynamic effects up to five years after migration, and effects in big cities versus other urban areas. The final section presents alternative interpretations of the results, and concludes.

We present a development accounting framework to disentangle explanations for the aggregate productivity gap across sectors.
We consider both observable and unobservable components of human capital, and whether intrinsic worker preferences for sector may bias direct measurement of the productivity gap. A standard model suggests that worker selection is most likely to bias sectoral productivity gaps upward when estimated among those moving into non-agriculture but lead to a downward bias when estimated among those moving into agriculture.