Kenya 2017LR Yield Gap Analysis. 1. Introduction. January 2017 Agriculture Research Team


1. Introduction

Variability in key agronomic variables such as rainfall and soil fertility means that farmers within the same geographic area often experience widely different yield outcomes. Region- or site-specific agronomic practices can address this problem, yet most farmers in Africa currently only have access to blanket agronomic recommendations developed at the national level or, at best, for broad agro-ecological zones for production of the most common crops.

One Acre Fund's integrated bundle of financing, distributing, and training on the use of modern agricultural inputs and techniques has delivered strong results to date: for example, farmers enrolled in One Acre Fund programs in 2015 experienced on average a 40% increase in yield. Most One Acre Fund farmers currently receive blanket agronomic recommendations developed at a national level. Delivering one-size-fits-all product and training bundles has clear advantages in terms of operational and logistical simplicity. However, variability in key agronomic variables means that farmers in different areas can experience vastly different yield outcomes when using the same products and methods.

As Figure 1 illustrates, One Acre Fund's Kenya program delivers great impact for farmers overall. In the 2017 long rains season (2017LR), in most program districts ~75% or more of One Acre Fund clients achieved higher yields than the control median. However, it is also clear that the products and trainings One Acre Fund currently offers farmers have highly variable results both between and within districts. For example, in Ndalu and Cherangany districts over 50% of One Acre Fund clients achieved lower yields than the control median, and in 14 districts more than 25% of clients had yields below the control median.

Figure 1 One Acre Fund program impact on maize yields in 2017LR, Kenya. The red line represents the median yield for control farmers. The boxes represent the central 50% of observations (the top and bottom of each box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.
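The quantities drawn in Figure 1 can be reproduced from raw yield observations. The following is a minimal sketch only: the yield values are invented for illustration, the function names are hypothetical, and the quartile method (`inclusive`) is an assumption, since the report does not state which convention was used.

```python
from statistics import median, quantiles


def box_summary(yields):
    """Return the quartiles, IQR and 1.5 x IQR outliers for one group of yields."""
    q1, q2, q3 = quantiles(yields, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [y for y in yields if y < lower_fence or y > upper_fence]
    return {"q1": q1, "median": q2, "q3": q3, "iqr": iqr, "outliers": outliers}


def share_below(yields, threshold):
    """Fraction of observations strictly below a threshold (e.g. the control median)."""
    return sum(1 for y in yields if y < threshold) / len(yields)


# Hypothetical yields in t/ha for one district (not data from this study).
client_yields = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 20.0]
control_yields = [1.0, 2.0, 3.0, 4.0, 5.0]

summary = box_summary(client_yields)
control_median = median(control_yields)
below = share_below(client_yields, control_median)
```

With these invented numbers, the single value of 20 t/ha lies beyond the upper fence and would be drawn as a dot, and two of the nine client fields fall below the control median.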

One Acre Fund is currently conducting yield gap analyses for all its country programs. This includes integrating crop management, weather, and soil datasets to identify the most important agronomic levers for different regions of program operations, and developing hypotheses on how to optimise agronomy at a sub-national scale through more locally appropriate product and practice recommendations. These hypotheses will then be tested in farmer field trials in future seasons. This report presents progress to date in analysing yield gaps in Kenya, using data from the 2017LR season.

2. Methods

2.1 Data Collection

In the 2017 long rains season, One Acre Fund surveyed maize management practices and yields of 3,565 randomly selected farmers across 32 districts in the Western and Nyanza regions of Kenya (Figure 2.1). All survey participants were smallholders farming less than 4 hectares. Half of respondents were One Acre Fund clients. The other half of the sample were non-clients, selected by asking each One Acre Fund farmer surveyed to recommend a neighbour who might be interested in becoming a One Acre Fund client, but had not yet done so (farmers who had been One Acre Fund clients in the past were excluded). For One Acre Fund clients, two fields were surveyed: one which had been planted using One Acre Fund inputs, and another which had been planted using inputs from other sources. For non-clients, one field was surveyed. In total, 4,576 fields were surveyed. The surveys were conducted over the months of June, July and August, as farmers were harvesting their maize crops.

Figure 2.1 Study areas. The legend shows the minimum number of fields sampled in polygons of that color.

Yields were measured using two randomly placed crop cut plots of 40 m² in each field surveyed. Farmers were asked a series of questions about the attributes of their field (e.g. soil depth and fertility, distance from home, weed pressure), their agronomic practices (e.g. crop rotation, planting date, weeding, variety selection, soil amendments and fertilizers), and the incidence and severity of any biotic or abiotic crop stresses experienced during the season (e.g. drought, Maize streak virus, Fall armyworm). Where possible, agronomic practices were directly measured (e.g. planting density and inter- and intra-row spacing).

The above data were integrated with third-party data from NASA and from aWhere, a subscription service providing weather data derived from remote sensing. aWhere uses satellite imagery validated with local weather observations to generate global weather predictions and forecasts. One Acre Fund chose aWhere due to the strength of their predictions and data coverage for One Acre Fund's East African operating areas. Normalized difference vegetation index (NDVI) and Land Surface Temperature (LST) data were obtained from NASA. NDVI is a measure of plant health derived from satellite imagery, and LST refers to the temperature of the land itself rather than the ambient air above it (which is what most temperature records measure).

A link to the full list of data variables assembled for this analysis is provided in the Appendix. If you would like to access the field data collected by One Acre Fund for this study, please send a request to ag research@oneacrefund.org.

2.2 Definition of agro-climatic zones

Agro-climatic zones (ACZs) reflect environmental conditions conducive to different types of plant growth. ACZs can help simplify what is otherwise a highly variable landscape into a discrete number of broadly similar agronomic management zones.
One Acre Fund is exploring the potential to use ACZs to strike a balance between tailoring practices to local conditions and reasonable levels of operational and logistical complexity. The most recent publicly available ACZ maps for western Kenya were produced by Wageningen University researchers in the 1970s. In order to gain a more up-to-date sense of which locations in One Acre Fund's operating areas could potentially share key agronomic management characteristics, seasonal rainfall averages from 2010 to 2016 inclusive from aWhere were combined with altitude measurements for all One Acre Fund distribution sites. The ACZ definitions for One Acre Fund's Kenya program areas could be improved in future iterations through inclusion of constructed variables such as growing degree days and P/PET (precipitation divided by potential evapotranspiration). Likewise, inclusion of data characterising regional variations in edaphic qualities would enable the development of more nuanced, agro-ecologically defined management zones.

K-means clustering was used to find natural groupings of sites in the data. K-means is an unsupervised algorithm: it groups sites using only their rainfall and altitude values, without reference to any labelled outcome. It does not itself determine how many clusters exist in the data. It will find as many clusters as it is instructed to find, and ensure that points within a cluster are as similar to each other, and as different from points in other clusters, as possible. Figure 2.2 shows ACZ definitions of 2 to 6 clusters (zones). Because the number of clusters must be supplied by the analyst, it is ultimately necessary to use domain knowledge of One Acre Fund's operations and operating area to determine which number of clusters most appropriately captures meaningful agronomic variation balanced with attainable operational and logistical complexity. This decision-making process can however be informed by understanding variability in yield drivers between zones, and understanding how this varies according to the number of clusters used.

For the purposes of 2018LR trials, an ACZ definition of 3 zones (Figure 2.2 B) will be used, as this seems to offer the greatest opportunity to maximise differences between ACZs within current operational constraints. In the longer term it is possible that a larger number of ACZs could be used. It is also possible that different numbers of ACZs might be used for different agronomic levers. For example, it might be appropriate to use 3 ACZs to define locally optimal variety recommendations, but necessary to use 5 ACZs to define locally optimal fertilizer recommendations.
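The clustering step described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the procedure used for this report: the site values are invented, the variables are standardised before clustering (the report does not say whether this was done), and a deterministic farthest-point initialisation is used in place of the random starts found in most k-means implementations.

```python
from statistics import mean, pstdev


def standardise(rows):
    """Rescale each column to mean 0 and standard deviation 1 so that
    rainfall (mm) and altitude (m) contribute on a comparable scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against zero variance
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; k must be chosen by the analyst."""
    # Assumption: farthest-point initialisation, chosen here for determinism.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each site joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(mean(dim) for dim in zip(*members))
    return labels


# Hypothetical sites: (seasonal rainfall in mm, altitude in m).
sites = [(1000, 1200), (1050, 1250), (1020, 1220),
         (1600, 1900), (1650, 1950), (1580, 1880)]
labels = kmeans(standardise(sites), k=2)
```

Re-running `kmeans` with `k` set to each value from 2 to 6 would produce a family of candidate zone definitions analogous to those compared in Figure 2.2.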

Figure 2.2 Agro-climatic zone definitions. A: 2 ACZs, B: 3 ACZs, C: 4 ACZs, D: 5 ACZs, E: 6 ACZs.

2.3 Yield Gap Analysis

One Acre Fund has chosen to use machine learning approaches to model yield and identify testable hypotheses. Modern machine learning methods were designed for working with large datasets, to quickly separate the signal (the data useful for answering our question) from the noise (the data not useful for our investigation). Standard analytical methods that model yield as a linear function often fail to adequately describe the complex relationships between yield and its drivers. One Acre Fund's yield gap analyses instead utilise a tree-based modeling approach called Random Forest.

The Random Forest yields two key outputs to aid in interpretation and hypothesis formation: variable importance plots and partial dependence plots. Variable importance plots measure the contribution of each variable to model performance. Functionally, this relates to the contribution of the variable toward the construction of the decision trees in the model. Variable importance informs variable selection: the filtering of which variables are most meaningful to model performance. The more important the variable, the more critical it is to overall model performance; the more critical to model performance, the more important a variable is in explaining maize yield gaps. This filtering addresses a problem present in modeling yield as a linear function, where we have more variables than we can realistically include in a model. Variable importance measures show which variables are most important for determining yield out of all available variables. While variable importance plots are valuable tools in identifying which agronomic levers will be most useful in closing yield gaps, it is critical to note that they do not provide any explanation of the functional form of the underlying random forest and extreme gradient boosting models, meaning that there could be interactions between variables that are not apparent in the variable importance plot.
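To make the idea of "contribution to model performance" concrete, the sketch below computes permutation importance: the increase in prediction error when one variable's values are shuffled. This is a related, model-agnostic technique and is not necessarily the importance measure used in this report, which is described as relating to tree construction. The `toy_predict` function is a hypothetical stand-in for a fitted model, and the variable names and data are invented.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted yields."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each variable is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(r) for r in rows])
    importances = {}
    for name in rows[0]:
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            # Copy each row, replacing only the variable being tested.
            permuted = [dict(r, **{name: v}) for r, v in zip(rows, shuffled)]
            increases.append(mse(y, [predict(r) for r in permuted]) - baseline)
        importances[name] = sum(increases) / n_repeats
    return importances


def toy_predict(row):
    """Hypothetical fitted model: yield depends on N rate, not on distance."""
    return 1.0 + 0.04 * row["n_kg_ha"]


rows = [
    {"n_kg_ha": float(n), "distance_m": float(d)}
    for n, d in zip(range(0, 80, 10), [50, 400, 120, 900, 30, 250, 600, 75])
]
y = [toy_predict(r) for r in rows]
imp = permutation_importance(toy_predict, rows, y)
```

In this toy case shuffling `distance_m` leaves the error unchanged, so its importance is zero, while shuffling `n_kg_ha` degrades the predictions. As the text notes, a ranking of this kind says nothing about the shape of the relationship or about interactions.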
Partial dependence plots (PDPs) illustrate how maize yield changes with changes in one specific variable, holding all other variables at their average values. These graphs are helpful signals of the likely functional relationship between a single variable and maize yield; however, due to likely interactions in the modeling process, these plots should be interpreted as providing qualitative indications of the relationship between X and Y. PDPs can be interpreted as identifying quantified differences in yield due to individual variables while keeping in mind (i) the broader context of variable interactions in the modelling process and (ii) that the model does not establish causal relationships. In this study PDPs are used to provide insight into the functional relationship between model features and yield, to compare these to expectations of the relative importance of different agronomic variables, and to evaluate the potential magnitude of yield change suggested by the relationship.

Counter-intuitive results in the PDPs are not necessarily an indication that the model or underlying data are unreliable, but do nonetheless warrant further consideration or investigation of why the data are presenting in agronomically unexpected ways. Accordingly, while counter-intuitive results should not be immediately dismissed, agronomically inexplicable findings regarding yield drivers should not be accepted at face value. The primary value of using a Random Forest modelling approach is that it enables rapid distillation of a large number of potentially explanatory variables down to a prioritized few which can be studied in detail.

3. Results

3.1 One Acre Fund program yield impact

As in previous seasons, high variability in maize yields between districts was observed, highlighting the need for locally optimised agronomy (Figure 3.1). In nine districts, mean yields produced using One Acre Fund inputs were below 3 t/ha. In twenty-one districts mean yields in fields using One Acre Fund inputs were over 3 t/ha, of which mean yields in nine districts ranged from ~4 to 4.5 t/ha. Defining yield potential as the top of the interquartile range, sixteen districts had yield potentials associated with use of One Acre Fund inputs in excess of 5 t/ha, of which seven had yield potentials in excess of 6 t/ha. However, seven other districts had yield potentials associated with use of One Acre Fund inputs of less than 4 t/ha. Similarly, high variability in yield outcomes within districts was observed, with most having interquartile ranges of ~2 t/ha or more.

Figure 3.1 Maize yields produced using inputs purchased from One Acre Fund. The boxes represent the central 50% of observations (the top and bottom of each box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

3.2 Random forest model performance and key explanatory variables

The random forest model explains 30% of yield variation [1]. This means that 70% of variation in yields is not explained by the data currently in the model. Figure 3.2 displays Variable Importance Plots (VIPs) listing the top 40 variables in terms of predictive importance for yield and illustrating the relative importance of their respective contributions to yield prediction model performance. The most important yield predictors can be grouped into six categories, summarised here in rough order of importance:

- Location, as indicated by district and region
- Weather variables of land surface temperature, growing degree days, P/PET (precipitation divided by potential evapotranspiration) and farmer-reported drought and flooding
- Fertilizer application rates and, to a lesser extent, farmer-reported soil fertility ratings and soil depth
- Field characteristics such as size, distance from the home, slope and terracing
- Biotic stresses and their management: pest and disease incidence and severity (particularly Fall armyworm), striga, farmer-reported weed pressure ratings, number of weedings, and pesticide application
- Historical field management: crops grown and fertilizer applied in the previous season, and number of cultivations before planting

[1] This performance metric is model performance out of sample, i.e. on data not used to train the model. Predictive performance on data the model has not seen is an objective measure of model performance.
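The out-of-sample metric in the footnote can be illustrated as follows, assuming the "30% of yield variation explained" figure refers to the coefficient of determination (R²) computed on held-out fields; the report does not name the metric explicitly. The yields and predictions below are invented.

```python
def r_squared(y_true, y_pred):
    """Share of variance in observed yields explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total
    return 1 - ss_res / ss_tot


# Hypothetical held-out fields: observed yields (t/ha) and the predictions
# from a model that was trained without seeing these fields.
held_out_yields = [2.0, 3.0, 4.0, 5.0]
held_out_predictions = [3.0, 3.0, 4.0, 4.0]

explained = r_squared(held_out_yields, held_out_predictions)
unexplained = 1 - explained
```

In this invented example the predictions explain 60% of held-out yield variation and leave 40% unexplained, which is the same accounting as the 30% and 70% split reported above.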

Figure 3.2 Variable Importance Plots. A shows the top 1 to 20 variables in terms of explanatory importance for yield; B shows the variables ranked 21 to 40 in terms of their explanatory importance. Note that close attention should be paid to the x-axis scale: variables in B are significantly less important than those in A.

The emergence of these groups of variables as being of particular importance is as expected. That district and region have emerged as being of particularly high importance confirms the need for more local optimisation of agronomic products and guidance. Agronomic levers that are somewhat conspicuous by their absence from the lists of the most important variables are seed variety and planting density. Note however that the interactions of these two variables with others would in fact have contributed to the importance of the variables listed above.

Several options are available to respond to these yield drivers through locally optimised agronomy. For example, through more regionally appropriate maize variety selection, it may be possible to reduce yield variability arising from variability in temperature, growing degree days and P/PET. Yield variability associated with P/PET might also be reduced through regional optimisation of planting density recommendations, and by adjusting fertilizer application rates to better account for regional differences in nitrogen leaching rate arising from differences in rainfall and soil texture. Yield gaps arising from biotic stresses might be reduced by more regionally targeted trainings on appropriate crop rotation, weed management and pesticide use. These opportunities are explored further below.

As discussed in section 2.2, the most appropriate number of ACZs for One Acre Fund to use to optimize agronomy may vary by practice. In reviewing yield gaps by agronomic practice below, configurations of two to six ACZs are considered.

3.3 Fertilizer application

The figures in this section display nitrogen (N) and phosphorus (P) response curves for ACZ classifications of 2 to 6 ACZs. The yield values presented here are measured, not predicted, and therefore to some extent reflect underlying variation in other agronomic practices such as variety selection and planting density. Clearly visible in the figures are virtually solid lines of yield measurements at N application rates of ~55 kg/ha, and P application rates of ~25 kg/ha.
These data are from fields receiving fertilizer applications that align with One Acre Fund recommendations of 50 kg DAP and 50 kg CAN per acre (50 kg per acre is 124 kg per hectare). Fields receiving these application rates may therefore have been relatively more likely to have (i) received fertilizer applications in previous seasons, (ii) been planted using hybrid seed provided by One Acre Fund rather than saved OPV seed, and (iii) been planted following One Acre Fund planting density recommendations, which, while potentially not optimal for all areas, may in many cases be an improvement on typical planting practices nonetheless. Dramatic changes in the N response curve observed immediately above or below an application rate of ~55 kg N/ha should therefore be interpreted with some caution, and likewise for P response curves either side of ~25 kg P/ha.

The table "Estimated yield gaps associated with N and P application by ACZ" shows the gaps in yield between those achieved using current One Acre Fund recommendations and the agronomic optimum, here defined as the yield achieved at the maximum fertilizer application rate for which there is reasonably high confidence of obtaining higher yields than at lower rates. This was determined by identifying the application rate ~5 kg below where the lower confidence bound of the response curve was observed to peak or level off. An exception was made for N application in ACZ 2 of 6, where this point (55 kg N/ha) was observed to deliver zero additional yield compared to an application rate of 30 kg/ha, so the lower rate was used. Yield gaps by ACZ for N application range from 0 to 0.85 t/ha, or 0 to 24% of current mean yields using current One Acre Fund recommendations. Yield gaps by ACZ for P application are relatively smaller, ranging from 0 to 0.45 t/ha, or 0 to 13% of current mean yields using current One Acre Fund recommendations.
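The rule for locating the agronomic optimum is described above as a visual reading of the response curves. One possible way to automate it is sketched below. The curve values are invented, and the tolerance used to decide that the lower confidence bound has "levelled off" is a hypothetical parameter that does not come from the report.

```python
def agronomic_optimum_rate(rates, lower_bound, offset=5.0, tol=0.01):
    """Return the rate `offset` kg/ha below the first point at which the
    lower confidence bound stops rising by more than `tol` t/ha."""
    for i in range(len(rates) - 1):
        if lower_bound[i + 1] - lower_bound[i] <= tol:
            return rates[i] - offset
    # The bound is still rising at the highest observed rate.
    return rates[-1] - offset


# Hypothetical N response curve: rates in kg N/ha and the lower 95%
# confidence bound of mean yield in t/ha.
n_rates = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
lower_ci = [1.5, 1.9, 2.3, 2.6, 2.8, 2.9, 2.95, 2.95, 2.9]

optimum = agronomic_optimum_rate(n_rates, lower_ci)
```

Here the lower bound stops rising at 60 kg N/ha, so the rule returns 55 kg N/ha. A real application would still need the kind of manual exception described above for ACZ 2 of 6.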
Assuming a 3 ACZ program configuration (with geographical distribution of clients as in 2017LR), optimised N application could increase average yields by 17% for ~30% of clients, and by 11% for ~40% of clients. Similarly, optimised P application could increase average yields by 6 to 7% for 70% of clients.

The table "Fertilizer application rates needed to close N and P yield gaps by ACZ" shows the fertilizer application rates needed to close these yield gaps. These were calculated by first determining the diammonium phosphate (DAP) application rate necessary to close the yield gap arising from insufficient P, and then determining the calcium ammonium nitrate (CAN) application rate necessary to close the yield gap arising from insufficient N, accounting for the N already supplied by DAP. Closing yield gaps appears to require increasing DAP applications in some ACZs by 20 to 80%, but in several ACZ definitions increasing DAP application rates surprisingly appears unlikely to increase yields. Closing yield gaps would in most cases require increasing CAN applications by %, but in a few instances CAN rates could actually be reduced from the current One Acre Fund recommendation if DAP rates are sufficiently increased to meet P demand. It should be acknowledged that these rate recommendations are somewhat simplistic, as timing of N supply and fertilizer package sizes would also need consideration.

For all ACZ clustering approaches except #4, three different DAP application rates are required to address variation in P requirements between ACZs. For CAN application, depending on the clustering approach, 2 to 6 different application rates would be needed to address variation in N requirement. Note that these figures are based on analysis of agronomic optimums (defined above); economic optimum yields, which maximise farmers' profits, may require lower fertilizer application rates. Further analysis incorporating fertilizer and grain prices is required to determine the economic optimum application rates.

Table Estimated yield gaps associated with N and P application by ACZ
Columns: # ACZs | ACZ # | Yield at N application level (t/ha): One Acre Fund recommendation, Agronomic optimum | Estimated N yield gap: t/ha, % | Yield at P application level (t/ha): One Acre Fund recommendation, Agronomic optimum | Estimated P yield gap: t/ha, %

Rather than optimising agronomy by seeking to maximise yields and overall profitability as per the yield gaps identified above, an alternative approach could be to consider how to enable resource-poor farmers to take advantage of the fertilizer application rates with the highest returns on investment (i.e. those rates that are on the relatively steeper slopes of the response curve) on the largest possible area of their land. Regardless of the number of ACZs used, for all N response curves an initial inflection point can be observed at a rate of ~30 kg N/ha. Similarly, in most P response curves there is an initial reduction in the rate of yield response once application rates of about kg P/ha are reached. This indicates that many farmers might be able to maximise their return on investment in fertilizer from One Acre Fund by applying it at half the recommended rate, to twice as much land.

Recommendations
- Conduct further analyses incorporating fertilizer and grain prices to estimate economic optimum (i.e. profit-maximising) DAP and CAN (or urea) application rates
- Run fertilizer application rate trials across 3 ACZs, in order to identify the rates most likely to increase farmers' net income by ACZ, comparing:
  1. Current One Acre Fund recommendations
  2. Estimated profit-maximising fertilizer application rates (note that there may be some uncertainty associated with these estimates, so it may be necessary to trial more than one rate)
  3. Estimated ROI-maximising fertilizer application rates (half of current One Acre Fund recommendations)

Table Fertilizer application rates needed to close N and P yield gaps by ACZ
Columns: # ACZs | ACZ # | Nutrient addition required for agronomic optimum (kg/ha): N, P | Fertilizer rates to reach agronomic optimum (kg/ha): DAP, CAN | Change from current 1AF recommendation (kg/acre): DAP, CAN
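The DAP-then-CAN calculation described in this section can be sketched as below. The nutrient contents are assumptions that are not stated in the report: DAP is taken as 18% N and 46% P2O5 (converted to elemental P with the factor 0.4364), and CAN as 26% N (some CAN products are 27% N). Function and constant names are hypothetical.

```python
HA_PER_ACRE = 0.40468564   # 1 acre expressed in hectares

# Assumed nutrient contents (fractions by weight); not taken from the report.
DAP_N = 0.18               # nitrogen in DAP
DAP_P = 0.46 * 0.4364      # P2O5 in DAP, converted to elemental P
CAN_N = 0.26               # nitrogen in CAN


def kg_per_acre_to_kg_per_ha(rate):
    """Convert an application rate from kg/acre to kg/ha."""
    return rate / HA_PER_ACRE


def nutrients_supplied(dap_kg_ha, can_kg_ha):
    """Elemental N and P (kg/ha) supplied by given DAP and CAN rates."""
    n = dap_kg_ha * DAP_N + can_kg_ha * CAN_N
    p = dap_kg_ha * DAP_P
    return n, p


def product_rates(n_target, p_target):
    """DAP rate that meets the P target, then the CAN rate that supplies
    whatever N the DAP has not already provided (never negative)."""
    dap = p_target / DAP_P
    n_from_dap = dap * DAP_N
    can = max(0.0, (n_target - n_from_dap) / CAN_N)
    return dap, can


# Current recommendation: 50 kg DAP and 50 kg CAN per acre.
rec = kg_per_acre_to_kg_per_ha(50)
n_now, p_now = nutrients_supplied(rec, rec)
dap_back, can_back = product_rates(n_now, p_now)
```

Under these assumed contents, the current recommendation supplies roughly 54 kg N/ha and 25 kg P/ha, consistent with the ~55 kg N/ha and ~25 kg P/ha at which the measurements cluster in the response curves. Because DAP also carries N, raising the DAP rate to meet a higher P target lowers the CAN rate needed for a given N target, which is why CAN rates could fall in some zones.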

Figure Yield response to nitrogen fertilizer application rates using two ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to nitrogen fertilizer application rates using three ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to nitrogen fertilizer application rates using four ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to nitrogen fertilizer application rates using five ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to nitrogen fertilizer application rates using six ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to phosphorus fertilizer application rates using two ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to phosphorus fertilizer application rates using three ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to phosphorus fertilizer application rates using four ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to phosphorus fertilizer application rates using five ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to phosphorus fertilizer application rates using six ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

3.4 Variety selection

The figures in this section illustrate variation in yields by maize variety maturity category across ACZs. Maturity categories were defined as:
- Early maturity within months
- Medium maturity within 4.5 to 5.5 months
- Late maturity within months

Yield performance was evaluated in terms of maturity categories because One Acre Fund is unlikely to restrict variety offerings to only the highest yielding for an area; farmer preference considerations also need to be taken into account, as does the need for diversity. There may however be potential to encourage farmers to select varieties from maturity categories that are best suited to their location. In any case, sample sizes for certain varieties within certain ACZs are in several instances too small to permit meaningful comparison with other varieties.

Table 3.4 shows yield gaps associated with variety selection by ACZ. Yield gaps are here determined to be the difference in mean yields where the differences between maturity categories are statistically significant. Only two ACZs (2 of 4, and 1 of 6) are found to have no statistically significant yield gap associated with selection of varieties from different maturity categories (Table 3.4). Both of these are the zones within their group with the lowest mean annual rainfall (~1000 mm). In these ACZs it may be sensible to encourage farmers to grow early maturity varieties, as there is no clear benefit associated with growing later maturity varieties, which would be at greater risk from moisture stress during the latter part of relatively drier cropping seasons. Use of earlier maturity varieties in these zones would also facilitate more productive intercropping, and greater flexibility in planting date, allowing multiple plantings in a season to spread risk of exposure to biotic or abiotic stress, and late plantings when the onset of rains is delayed.
For all other ACZs, there are mean yield gaps of t/ha, or 25 to 65%, associated with selecting varieties of early rather than medium or late maturity. In ACZ definitions with 2, 3 and 4 zones, any yield increases associated with selection of late maturity varieties instead of those of medium maturity were insignificant, indicating that farmers would generally be best advised to grow medium maturity varieties, enabling them to maximise yields in normal seasons and minimize risks (as outlined in the above discussion of earlier maturity varieties). Similarly, in zones 3 and 5 of the 5 ACZ definition, and zone 6 of the 6 ACZ definition, there is no significant yield benefit associated with planting late maturity varieties rather than medium maturity varieties. However, in zones 1 to 3 of the 5 ACZ definition, and 2 to 5 of the 6 ACZ definition, only late maturity varieties deliver any significant yield benefit over early maturity. Any risks associated with a shift towards medium and late maturity varieties could be mitigated by developing a stronger focus on identifying high-yielding, drought-tolerant medium and late maturing varieties with strong farmer preference attributes.

Overall it seems that in most areas, the greatest potential for optimising agronomy through variety selection lies in promoting medium and late varieties rather than early varieties. However, it would be prudent to first gain a better understanding of why a large proportion of farmers are selecting early varieties despite their lower grain productivity. A more nuanced approach may be appropriate if some farmers are deliberately selecting early varieties to use for intercropping, or to mitigate crop stress risks.

Recommendations
- Prioritise drought-tolerant medium and late maturity varieties over early maturity varieties as drought risk mitigating options in One Acre Fund's variety research pipeline.
- Use focus groups and surveys to better understand the reasons farmers are choosing early maturity varieties despite their lower productivity.
- In most areas, consider promoting already scaled medium and late maturity varieties over early varieties (e.g. through restricting seed catalog options, bundling, or field officer incentives), while still giving farmers reasonable flexibility to mitigate risk through selection of earlier maturity varieties if they choose.

Table 3.4 Yield gaps associated with variety maturity category by ACZ. Where differences between maturity categories are not statistically significant, yield gap values are gray and italicised.
Columns: # ACZs | ACZ # | Yield by maturity category (t/ha): Early, Medium, Late | Yield gap associated with variety maturity category: Medium instead of Early (t/ha, %), Late instead of Early (t/ha, %), Late instead of Medium (t/ha, %)

Figure Yields of different variety maturity categories using two ACZs. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations.

Figure Yields of different variety maturity categories using three ACZs. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

Figure Yields of different variety maturity categories using four ACZs. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

Figure Yields of different variety maturity categories using five ACZs. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

Figure Yields of different variety maturity categories using six ACZs. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

3.5 Planting density

The figures in this section show mean planting density response curves for ACZ designations of 2 to 6 ACZs. The yield values presented in these figures are measured, not predicted, and therefore to some extent reflect underlying variation in other agronomic practices, such as variety selection and fertilizer application. Yields are likely to be relatively higher at densities close to One Acre Fund's recommendation (53,333 plants/ha) because farmers following this recommendation may be relatively more likely to be using fertilizer and hybrid seed. The densities reported here were measured ~1 month after planting and may therefore be slightly lower than the exact densities used at planting, e.g. due to germination losses. Likewise, it is likely that actual plant densities at harvest would have been somewhat lower.

Table 3.5 shows the gaps in yield between those achieved using current One Acre Fund plant density recommendations (53,333 plants/ha, based on a plant spacing of 75 x 25 cm), and the estimated optimum, determined by identifying the seed rate at which the lower confidence bound of the response curve reaches its peak. In 4 of the ACZs planting density appears to be optimal. In a few ACZs yields could be slightly increased by increasing plant density, but in most areas the opportunities for optimising agronomy relate to reducing plant density. The potential yield gains associated with optimised density are very small, ranging from 0 to 6% across ACZ designations.
However, in many cases it seems that these yield gains could be achieved with seed rate reductions of up to 16%. So while opportunities to increase yield through improved planting density are relatively limited, it may be possible to increase profitability through reduced seed rates. For example, many ACZ designations share an optimal density of ~45,000 plants/ha. By adjusting the One Acre Fund spacing recommendation for these areas to 75 x 30 cm (44,444 plants/ha), yields could be slightly increased or roughly maintained while using ~17% less seed. Reducing density by increasing spacing recommendations in this way should also result in relatively higher yields under drought conditions. Note, however, that the only yield considered here is that of grain, so differences in biomass production associated with different densities have not been taken into account. Reduced densities could well result in reduced production of biomass, which is of value for mulching, composting or use as a livestock feed. Also, reducing density would reduce the shading effect of the crop, potentially resulting in greater weed pressure. While optimum density for grain production varies from year to year, and between varieties, the balance of evidence suggests there may be an opportunity for One Acre Fund farmers to achieve meaningful increases in maize profitability by adjusting their planting densities.

Recommendations

To validate the estimated optimum densities in Table xx, conduct trials across 3 ACZs to compare yields and profitability using the following density treatments:
1. Current One Acre Fund recommendation of 53,333 plants/ha (75 x 25 cm spacing)
2. Lower density: [A] 44,444 plants/ha using 75 x 30 cm spacing, or [B] 47,619 plants/ha using 70 x 30 cm spacing
3. Higher density: [A] 61,538 plants/ha using 65 x 25 cm spacing, or [B] 57,143 plants/ha using 70 x 25 cm spacing
4. Alternative spacing for achieving a density close to the estimated optimum for the given ACZ, e.g. for ACZ 1 (of the 3 ACZ definition) lower density option [A] could be treatment 2, and option [B] could be treatment 4.

These trials would ideally be conducted for a range of early to late maturity varieties. This may not be possible in a single season due to operational constraints, which will generally mean a maximum of 4 treatments per farmer. As late maturity varieties seem to be optimal for most areas, it would make sense to prioritise these in the density trial.

Table 3.5 Yield gaps associated with planting density across ACZ definitions
# ACZs ACZ # Estimated optimum density Estimated optimum density Yield One Acre Fund recommendation Yield gap t/ha % % change in seed rate
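The plant densities quoted for each spacing in this section follow directly from the spacing geometry. The sketch below reproduces that arithmetic. It assumes one plant per planting station and that seed rate scales in proportion to plant density; the function names are illustrative and are not taken from the report.

```python
CM2_PER_HECTARE = 100_000_000  # 10,000 m2 x 10,000 cm2 per m2


def plants_per_hectare(row_spacing_cm, plant_spacing_cm):
    """Plant density implied by a rectangular spacing, one plant per station."""
    return CM2_PER_HECTARE / (row_spacing_cm * plant_spacing_cm)


def seed_rate_change_pct(row_spacing_cm, plant_spacing_cm, base=(75, 25)):
    """Percent change in seed rate relative to the current 75 x 25 cm spacing."""
    base_density = plants_per_hectare(*base)
    new_density = plants_per_hectare(row_spacing_cm, plant_spacing_cm)
    return (new_density / base_density - 1) * 100


# Spacings named in the trial recommendations (row x plant, in cm).
spacings = [(75, 25), (75, 30), (70, 30), (65, 25), (70, 25)]
for row, plant in spacings:
    density = plants_per_hectare(row, plant)
    change = seed_rate_change_pct(row, plant)
    print(f"{row} x {plant} cm: {density:,.0f} plants/ha ({change:+.1f}% seed)")
```

Under these assumptions, moving from 75 x 25 cm to 75 x 30 cm lowers the seed rate by about one sixth, consistent with the ~17% figure quoted in the text.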

Figure Yield response to planting density using two ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to planting density using three ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.
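The rule used in section 3.5 to estimate the optimum density (the density at which the lower confidence bound of the response curve peaks) can be sketched as below. The curve points are hypothetical and are not read from the figures. The way the yield gap is computed here, as mean yield at the optimum minus mean yield at the recommended density, is an assumption about how Table 3.5 was built.

```python
def optimum_density(densities, lower_bounds):
    """Return the density at which the lower confidence bound is highest."""
    if not densities or len(densities) != len(lower_bounds):
        raise ValueError("densities and lower_bounds must be equal-length, non-empty")
    best_index = max(range(len(densities)), key=lambda i: lower_bounds[i])
    return densities[best_index]


def yield_gap(densities, mean_yields, optimum, recommended):
    """Yield gap (t/ha, %) between the optimum and the recommended density."""
    at_optimum = mean_yields[densities.index(optimum)]
    at_recommended = mean_yields[densities.index(recommended)]
    gap = at_optimum - at_recommended
    return gap, gap / at_recommended * 100


# Hypothetical response curve for one ACZ (plants/ha and t/ha).
densities = [35_000, 40_000, 45_000, 50_000, 53_333, 60_000]
mean_yields = [2.90, 3.10, 3.25, 3.22, 3.20, 3.00]
lower_bounds = [2.60, 2.90, 3.10, 3.05, 3.00, 2.70]

RECOMMENDED = 53_333  # current One Acre Fund recommendation, plants/ha
optimum = optimum_density(densities, lower_bounds)
gap_t_ha, gap_pct = yield_gap(densities, mean_yields, optimum, RECOMMENDED)
seed_change_pct = (optimum / RECOMMENDED - 1) * 100

print(f"optimum density: {optimum:,} plants/ha")
print(f"yield gap: {gap_t_ha:.2f} t/ha ({gap_pct:.1f}%)")
print(f"change in seed rate: {seed_change_pct:+.1f}%")
```

Picking the peak of the lower bound rather than of the mean curve is the more conservative choice: it favours densities where the estimated response is both high and well supported by the data.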

Figure Yield response to planting density using four ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to planting density using five ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Figure Yield response to planting density using six ACZs. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

3.6 Biotic and abiotic stresses

Weeds

There are no clear differences in weed pressure between ACZs, regardless of the number of ACZs used. This is perhaps to be expected, especially given that the data represent subjective and qualitative farmer judgements of whether weed pressure in a field was low, medium or high. It is reasonable to expect that farmer perceptions of what constitutes low or high weed pressure might vary geographically. Depending on the ACZ, increasing frequency of weeding could increase yields by t/ha, or around 10 to 50%. Using a 3 ACZ definition, in 2017LR increased frequency of weeding increased yields by up to:
0.4 t/ha (13%) in ACZ 1 (representing 40% of clients in 2017LR)
1.52 t/ha (49%) in ACZ 2 (representing 30% of clients in 2017LR)
0.77 t/ha (29%) in ACZ 3 (representing 30% of clients in 2017LR)

The precise yield response to increased weeding will be influenced by various factors including the timeliness of weeding, weed pressure within a field, whether or not a farmer is intercropping, and rainfall within a given season.

Presence of striga in fields decreased median yields by ~0.7 t/ha relative to uninfected fields (Figure 3.4.1). Yields in Homa Bay, Nambele, Butere, Kakamega South, Teso, Ndalu, Gucha and Belgut districts were particularly affected by striga, and there are opportunities here to optimise agronomy through use of Imazapyr resistant herbicide coated maize seed, the Fusarium oxysporum f.sp. strigae fungal pathogen, or Desmodium uncinatum intercrops.
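The three ACZ-level weeding figures given above can be combined into a single client-weighted value to give a rough sense of scale across the program. This is illustrative arithmetic only and is not a result stated in the report. Because the inputs are "up to" gains, the weighted figure is an upper bound, and it assumes every client in an ACZ could realise that ACZ's maximum gain.

```python
# Maximum yield gain from more frequent weeding (t/ha) and share of 2017LR
# clients per ACZ, as quoted in the text for the 3 ACZ definition.
weeding_gains = {
    "ACZ 1": {"gain_t_ha": 0.40, "client_share": 0.40},
    "ACZ 2": {"gain_t_ha": 1.52, "client_share": 0.30},
    "ACZ 3": {"gain_t_ha": 0.77, "client_share": 0.30},
}


def client_weighted_gain(gains):
    """Average the per-ACZ gains, weighting each by its share of clients."""
    total_share = sum(zone["client_share"] for zone in gains.values())
    weighted_sum = sum(
        zone["gain_t_ha"] * zone["client_share"] for zone in gains.values()
    )
    return weighted_sum / total_share


weighted_gain = client_weighted_gain(weeding_gains)
print(f"client-weighted upper bound: {weighted_gain:.2f} t/ha")
```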

Figure Yield response to striga infection at the program level. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

Pests

Yield reductions of up to ~0.8 t/ha were associated with increases in farmer reported ratings of the impact of pests on their crop (Figure 3.6.2). Pesticide application was associated with yield increases of around t/ha across different ACZs. There were no clear differences between ACZs in the yield gap associated with either pest impact or pesticide use.

Figure Yield effect of farmer reported ratings of pest impact on their crop at the program level. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

The data available for most individual pests are insufficient to enable identification of any ACZ or program level opportunities to increase yield through improved pest management. The only pest whose presence in fields was consistently identified in any area was Fall armyworm (FAW), which caused yield reductions of up to ~0.6 t/ha (~18%) across the program (Figure 3.6.3). It is reasonable to expect that Fall armyworm will continue to be a risk in future seasons, though it is not currently possible to predict in which areas this risk will be highest in order to locally optimise product offerings or trainings. The data available are insufficiently detailed to enable identification of any other pests whose improved management might lead to meaningful yield gains.

Figure Yield effect of Fall armyworm at the program level. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

Flooding

Median yields from fields which flooded during the season were ~0.5 t/ha lower than those from fields that did not (Figure 3.6.4). There may be opportunities to optimise agronomy in fields prone to flooding through practices that improve infiltration, for example planting trees, increasing SOM levels, and training farmers on approaches to avoid compaction.

Drought

Median yields from fields which experienced a period of drought during the season were ~0.25 t/ha lower than median yields in fields that did not (Figure 3.6.5). In some drought prone areas there may be opportunities to optimise agronomy through increasing soil organic matter (SOM) levels, improved variety selection, planting density, planting timing, and adoption of conservation agriculture approaches such as mulching.

Figure Yield effect of farmer reported incidence of flooding affecting their crop during the season. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

Figure Yield effect of farmer reported incidence of drought affecting their crop during the season. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

Recommendations

Make weed management a high priority topic for farmer trainings, emphasising that increasing the number of weedings from one to three can increase yields by up to 50%, and making clear that timeliness of weeding is important.
Conduct trials to evaluate the impact of (i) 3 versus 2 weedings and (ii) use of leaf count to align weeding timings with the most critical periods (between V3 and V8). Suggested trial treatments are:

1. Two weedings before topdress, with timing determined at the farmer's discretion
2. Three weedings before topdress, with timing determined at the farmer's discretion
3. Two weedings: first at V3, second at V5
4. Three weedings: first at V3, second at V5, third at V7

Conduct trials to understand the potential for herbicide use to improve farmers' maize profits.
Conduct trials on striga management using Imazapyr resistant herbicide coated maize seed, Fusarium oxysporum f.sp. strigae fungal pathogen, or Desmodium uncinatum intercrops.
Develop organisational capacity to accurately monitor, map and respond to pest and disease issues both within season (e.g. through targeted pesticide distributions) and over the longer term (e.g. through identification of resistant varieties for zones where chronic pest or disease problems are identified).
Make soil organic matter management a high priority topic for Product Innovations and farmer trainings.

3.7 Field context

Plot size

As Figure illustrates, yields per unit area tend to increase as plot sizes increase up to 1 acre in size, and level off thereafter. This may result in relatively lower yields in areas with relatively smaller field sizes. There is no obvious agronomic cause for this, but a possible explanation is that farmers are accustomed to determining input use on a 1 acre basis, and struggle to use their inputs appropriately on smaller land sizes. An alternative explanation is that farmers tend to reserve use of more expensive inputs for their relatively larger fields. A further possible contributing factor is that One Acre Fund clients are encouraged to plant in groups, and may choose to plant their largest fields on the days they are being assisted by their fellow group members. This may result in the planting of larger fields being more compliant with One Acre Fund recommendations.
It is also possible that larger fields are planted relatively earlier in the planting window, so crops in larger fields receive a relatively larger volume of rainfall over the season.

Figure Influence of field size (acres) on yield. The blue line shows the mean response and the dark gray area around it shows the 95% confidence bounds.

3.7.2 Field distance from farmer's homestead

The Variable Importance Plots (Figure 3.2) indicate that field distance from the farmer's homestead is one of the most important variables influencing yield. Although PDPs do not reveal the most agronomically intuitive direct relationship between field distance from the homestead and yield, we can nonetheless be sure that its interaction with other variables means it is a variable worthy of consideration in evaluating how to increase yield. Several previous studies of smallholder agriculture in East Africa have demonstrated a tendency for soil fertility to decline with increasing distance from the home. These gradients tend to arise from organic matter inputs such as compost and manure being preferentially applied to fields conveniently closer to the homestead. Fields which do not receive these organic inputs will tend to become less fertile, and less fertilizer responsive, over time. This means that soils with similar inherent fertility may over time develop strongly heterogeneous fertility and responses to nutrient applications. This poses a significant challenge to the development of appropriate blanket recommendations for all farms and fields within even a relatively small area. There may be opportunities in the short term to optimise agronomy by encouraging farmers to ensure they apply organic matter inputs more equally to all of their fields. However, this could result in yield reductions in fields closer to home, because in many cases there is insufficient organic matter available to be applied to all fields. In the longer term, it may be necessary for One Acre Fund to develop fertilizer recommendations for different types of field within an area.
Field distance from the home may also influence whether or not weedings and topdressing are carried out, and the timeliness of those activities.

Slope

The Variable Importance Plots (Figure 3.2) indicate that slope steepness is one of the most important variables influencing yield. As with field distance from the homestead, although PDPs do not reveal the most agronomically intuitive direct relationship between slope steepness and yield, we can still be sure that its interaction with other variables means it is a variable worthy of consideration in evaluating how to increase yield. The importance of slope may relate to its interactions with native soil fertility (which is reduced over time due to erosion) and also nutrient loss (particularly of fertilizer) through runoff. There may be opportunities to optimise agronomy in sloping fields through runoff and erosion control measures such as contour cultivation and planting of trees or fodder banks.

Recommendations

Use light touch surveys or focus groups to evaluate farmer understanding of how to use One Acre Fund inputs on fields less than 1 acre in size.
Conduct further research (beginning with a review of existing literature) to determine how farmers can most effectively maximise the ROI of inputs they purchase from One Acre Fund, over different time periods. For example: is it better for farmers to apply fertilizer to their most fertile fields and perhaps achieve more immediate yield benefits, or is it better to apply fertilizers to their less fertile fields in order to achieve better yields on those fields in the longer term? In the long term, are net yields likely to be increased more by spreading limited compost or manure resources across all fields, or by focussing application on a small area?
Encourage adoption of soil management practices that limit runoff and erosion.

3.8 Field preparation

As Figure 3.8 illustrates, yield potentials slightly increase as the number of field ploughings or cultivations prior to planting increases from 0 to 2. Besides eliminating weeds before planting and creating a tilth enabling good seed to soil contact, ploughing and cultivation have several impacts on physical and chemical attributes of the soil. Breaking up of large soil clods and aggregates can create a better environment for root development and improve rainfall infiltration. Ploughing and cultivation also increase soil aeration, resulting in increased mineralization of soil organic matter, which leads to increased nutrient availability, at least in the short term. In the longer term more numerous ploughings and cultivations could result in reduced soil organic matter, which could result in yield reductions.

Figure 3.8 Yield effect of number of ploughings or cultivations before planting. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

Recommendation

Given the yield value of moderate ploughing, One Acre Fund should continue to encourage farmers to utilize this field preparation activity, while mitigating the risks of SOM loss through increased compost application and on farm biomass production.

3.9 Field management in preceding season

As expected, maize yields tended to be higher in fields which had been used to grow beans in the previous season, and lower in fields which had been used to grow maize in the previous season (Figures and 3.9.2). This can be attributed to relatively lower soil N status in maize fields that had not been rotated with legumes. Yields also tended to be relatively higher in fields which had received fertilizer applications in the previous season (Figure 3.9.3). This is likely due to a residual effect of fertilizer applied in the previous season, particularly P, which is slowly released to other crops in subsequent seasons if not taken up by the crop to which it is applied.

Recommendation

Make appropriate crop rotation a high priority topic for farmer trainings.

Figure Yield effect of whether beans were grown in the previous season. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.

Figure Yield effect of whether maize was grown in the previous season. The boxes represent the central 50% of observations (the top and bottom of this box represent the interquartile range, IQR) and the horizontal lines within the boxes show the median values. The whiskers show the upper and lower 25% of observations, and the dots represent outliers which are more than 1.5 x IQR from the central 50% of values.