SI Files prepared for Land Availability for Biofuel Production


Ximing Cai (1, corresponding author), Xiao Zhang (1), and Dingbao Wang (2)

(1) Ven Te Chow Hydrosystems Laboratory, Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801; Tel: (217) ; xmcai@illinois.edu
(2) Department of Civil, Environmental, and Construction Engineering, University of Central Florida, Orlando

SI-I: Additional Figures and Tables, including five tables (Tables S1-S5) and two figures (Figures S1-S2)
SI-II: Data Sources and Fuzzy Logic Modeling, including text and seven figures (Figures S3-S9)

The SI comprises 17 pages in total.

Appendix I: Additional Figures and Tables

Table S1: Global datasets used in this study

Factor | Database | Resolution | Source | Description
Soil | Harmonized World Soil Database (HWSD) | 30 arcsecond | FAO/IIASA (5) | The HWSD was developed for climate change impact assessment and for the FAO/IIASA Global Agro-ecological Assessment study. See Table S3 for the sixteen soil properties.
Topography | Global Terrain Slope (GTS) | 30 arcsecond | Fischer et al. (6) | GTS was compiled using elevation data from the Shuttle Radar Topography Mission (SRTM) with 3 arcsecond resolution. GTS includes eight slope classes: 0-0.5%, 0.5-2%, 2-5%, 5-10%, 10-15%, 15-30%, 30-45%, and >45%.
Soil temperature regime (STR) | USDA-NRCS | 2 arcminute | NRCS (7) | STR uses sixteen indices (1-16): ocean, inland water body, ice, Hypergelic, Pergelic, Gelic, Cryic, Frigid, Mesic, Thermic, Hyperthermic, Megathermic, Isomesic, Isothermic, Isohyperthermic, and Isomegathermic.
Temperature | Climatic Research Unit | 2 arcdegree | New et al. (8) | Mean monthly temperature during [years missing in source]
Precipitation | Climatic Research Unit | 2 arcdegree | New et al. (8) | Mean monthly precipitation during [years missing in source]
Land cover | IGBP | 30 arcsecond | Biradar et al. (9) | IGBP includes the following land cover types: forest, shrubland, savanna, grassland, cropland, cropland/natural vegetation mosaic, wetland, urban and built-up, snow and ice, barren or sparsely vegetated, and water bodies.

Note: A grid cell size of 30 arcseconds is approximately 1 km x 1 km at the equator.

Table S2: Modeled land area with regular (R) productivity on different land covers, in million hectares (mha)

Columns (regions): Africa, China, Europe, India, South America, United States
Rows (land covers): Cropland, Crop/veg. mixing, Forest, Shrubland, Savanna, Grassland, Wetland, Urban
[Numeric values missing in source]

Table S3: Soil properties from the HWSD

Category | Index | Soil property
Topsoil | 1 | Soil organic matter
Topsoil | 2 | Bulk density
Topsoil | 3 | Clay content
Topsoil | 5 | pH
Topsoil | 6 | Sodium adsorption ratio
Topsoil | 7 | Carbonates
Topsoil | 8 | Gypsum
Topsoil | 9 | Cation-exchange capacity
Soil profile | 10 | Depth to restrictive layer
Soil profile | 11 | Available water capacity in the root zone
Subsoil water features | 12 | Permeability
Subsoil toxicity | 13 | Sodium adsorption ratio
Subsoil toxicity | 14 | Electrical conductivity
Subsoil reaction | 15 | Cation-exchange capacity
Subsoil reaction | 16 | pH

Table S4: Rules used to determine regular and marginal productivity land for the continental United States

Land productivity | Rule index | Soil | AI | STR | Slope
Regular | 1 | R | R | R | R
Marginal | 1 | M | R | R | R
Marginal | 2 | R | L | R | R
Marginal | 3 | R | M | R | R

Table S5: The four outcomes of comparing the modeled land with the land cover map

Land use | Modeled as regular | Modeled as marginal/low
Cropland | Hit | Miss
Non-cropland | False alarm | Correct rejection

Figure S1: Air temperature correlations with soil temperature regime

Figure S2: Global 30-year average humidity index for [years missing in source]

Appendix II: Data Sources and Fuzzy Logic Modeling

Global data sources

We collected and processed global datasets at the best available resolution for this study (Table S1). More details on soil properties are presented in Table S3. Each soil property is assigned a rating between zero and one. For each of the five categories shown in Table S3, the category rating is the average of the ratings of the properties belonging to that category. The overall soil productivity rating is the product of the category ratings, with a higher weight given to low values, which reflects the impact of one or more limiting factors on potential soil productivity (1). The soil ratings are scaled by 1000 and then rounded to integers.

The global aridity index is calculated as the ratio of precipitation to potential evapotranspiration (PET). PET is calculated with the equation of Thornthwaite (1948), which in its standard form (PET in mm per month) reads

$$PET = 16 \left(\frac{L}{12}\right) \left(\frac{N}{30}\right) \left(\frac{10\,t}{I}\right)^{\alpha}$$

where L is the day length (in hours) and N is the number of days in each month, both of which are used as adjustments; I is the heat index; and t is the mean monthly temperature in degrees Celsius. In the standard Thornthwaite formulation, the exponent α is calculated from the following equation:

$$\alpha = 6.75 \times 10^{-7} I^{3} - 7.71 \times 10^{-5} I^{2} + 1.792 \times 10^{-2} I + 0.49239$$

In this work, we use the correlation between air temperature and soil temperature regime to represent the soil temperature factor. A statistical analysis of air temperature against soil temperature regime shows that, among the 14 land classes (indices 3-16), classes 3-5 correspond to temperatures below 265 K, classes 6-8 fall within the range 265-280 K, and classes 9-16 correspond to temperatures above 280 K. These three groups of soil temperature regimes correspond to three suitability functions. Thus, annual mean air temperature, instead of soil temperature regime, is used directly in the fuzzy modeling to avoid the errors and uncertainties that may arise in converting air temperature to soil temperature regime. The STR map is also not used because it is for 2000, and climate is likely to have changed from [year missing in source] to 200[digit missing in source].

Fuzzy logic modeling

Because of the errors in the global datasets and the uncertainty in land productivity rating, we adopt a fuzzy-logic modeling method composed of three steps: fuzzification, fuzzy rule inference, and defuzzification.

Fuzzification converts the quantitative values of a factor into linguistic terms through membership functions assigned to each factor identified for the assessment of land productivity, including soil, slope, soil temperature and moisture regimes, and precipitation.

Fuzzy rule inference defines the rules needed to model the final productivity of each piece of land. It requires two subprocedures: aggregation and composition. Aggregation identifies the controlling factor within each factor combination; composition integrates all the rules to identify the most likely membership for each land productivity category (low, marginal, and regular productivity). These procedures have been tested by Joss et al. (2), who applied the fuzzy-logic modeling approach to assess land suitability for the afforestation of hybrid poplar across the Prairie Provinces of Canada.

In this paper, the membership functions and the rules for land productivity categorization are calibrated through a learning process that compares the modeled cropland output with a remotely sensed land cover map. The premise of the learning process is that cropland usually belongs to regular productivity land. Figures S3 and S4 show the procedures of the combined fuzzy logic modeling and learning process conducted in this study.
Figure S3 describes the first learning procedure, which computes the statistics for each possible rule. Figure S4 shows the procedure of land productivity assessment by fuzzy logic modeling, including the learning procedure that compares the modeled regular productivity land with the cropland. The initial membership functions and inference rules are adjusted by comparing the modeled regular productivity land with the cropland from land use and land cover maps so that they match as closely as possible.

The fuzzy-logic modeling of land productivity is applied to the seven countries/regions selected for land productivity assessment: Africa, Brazil, China, Europe, India, South America (excluding Brazil), and the continental United States. The membership functions and inference rules vary among these regions.

Fuzzification

In this study, land productivity is evaluated by four factors (soil, slope, soil temperature regime, and moisture as represented by the aridity index; see Table S4 and Figures S5-S8), denoted $V_k^j$ in Figure S3, where j is the factor index (j = 1, ..., 4) and k is the grid cell index (k = 1, 2, ..., $N_c$), with $N_c$ the total number of cells in a study region. The values of the factors are transformed into three linguistic variables in terms of land productivity: low (L), marginal (M), and regular (R). To determine the values of the three linguistic variables, a membership function is defined for each factor, and the membership values associated with L, M, and R are denoted $\mu_k^j(L)$, $\mu_k^j(M)$, and $\mu_k^j(R)$, respectively. Triangular and trapezoidal membership functions are commonly used in engineering applications (3). The membership function starts from a reasonable initial assessment and is finalized by the calibration procedure shown in Figure S3 and discussed later.

To determine the initial membership function, the maximum and minimum values of $V_k^j$ over all grid cells are obtained, i.e., $VMAX^j = \max(V_k^j : k = 1, 2, \ldots, N_c)$ and $VMIN^j = \min(V_k^j : k = 1, 2, \ldots, N_c)$. The highest membership values for L and R are located at $VMIN^j$ and $VMAX^j$, respectively; the highest value for M is located at the middle of the range, i.e., $(VMAX^j + VMIN^j)/2$, except for precipitation. For example, Figures S5-S8 show the adjusted membership functions of the four factors for the continental United States.
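
The initial membership functions described above can be illustrated with a minimal Python sketch. The text fixes only where each function peaks (L at the minimum, M at the midpoint, R at the maximum); the triangle widths used here, and the names `triangular` and `initial_memberships`, are assumptions made for illustration rather than the authors' calibrated functions.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 1 at peak, falling linearly to 0 at left and right."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def initial_memberships(value, vmin, vmax):
    """Initial L/M/R membership values for one factor value.

    Assumed shapes: L peaks at vmin and reaches 0 at the midpoint,
    M peaks at the midpoint and reaches 0 at vmin and vmax,
    R peaks at vmax and reaches 0 at the midpoint.
    """
    mid = (vmin + vmax) / 2.0
    return {
        "L": triangular(value, vmin - (mid - vmin), vmin, mid),
        "M": triangular(value, vmin, mid, vmax),
        "R": triangular(value, mid, vmax, vmax + (vmax - mid)),
    }


# Hypothetical soil rating of 250 on the 0-1000 scale used for soil ratings.
print(initial_memberships(250, 0, 1000))
```

With these assumed shapes a value halfway between the minimum and the midpoint belongs equally to L and M and not at all to R; the calibrated functions in Figures S5-S8 need not be symmetric.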

Fuzzy rule inference

Fuzzy inference is the process of mapping a set of inputs to outputs through a set of fuzzy rules. The essential procedure of fuzzy logic modeling is to partition the rules into subsets for R, M, and L. The number of rules needed for the fuzzy inference step depends on the number of linguistic variables (3 in this study) and the number of evaluation factors (4 in this study). There are 81 (= $3^4$) combinations of the four factors over the three linguistic variables, which means that altogether there are 81 rules to apply. Construction of the rules is usually based on expert knowledge and relevant literature.

The basic form of a fuzzy rule is "if ... then", with a condition on the input and a specific conclusion for the output. The "if" part is referred to as the predicate (premise, antecedent), which combines the subsets of input factors. The "then" part gives the outcome based on the predicate, i.e., the land productivity category in this paper. The input subsets within the premise are most often combined with linguistic conjunctions such as AND and OR. In this study, only AND is used to combine the four factors, for example:

IF soil = regular suitability (R) AND slope = R AND HI = R AND T = R THEN land productivity = R

Rule aggregation determines the aggregated membership value over all inputs of the IF part of a rule. Several fuzzy operations (such as min, max, and min-max) can be used to aggregate the membership values of the multiple preconditions. In this study, the min operator is used, as shown in Figure S3 and expressed below:

$$\mu_k^i = \min\left(\mu_k^j(*) : j = 1, 2, \ldots, 4\right) \qquad (1)$$

where * represents R, M, or L; $\mu_k^i$ is the membership value that represents the outcome of rule i for cell k; and $\mu_k^j(*)$ is the membership value of factor j in the predicate for cell k. For example, for the rule RRRR, $\mu_k^{N_r} = \min(\mu_k^1(R), \mu_k^2(R), \mu_k^3(R), \mu_k^4(R))$.

Given all the possible rules, the next step is to identify the production rules applied to the linguistic output variables, i.e., to find the subset of rules for R: $\mu_k^{N_r}$ ($N_r = 1, 2, \ldots, N_R$), for M: $\mu_k^{N_m}$ ($N_m = 1, 2, \ldots, N_M$), and for L: $\mu_k^{N_l}$ ($N_l = 1, 2, \ldots, N_L$), where $N_R + N_M + N_L = 81$.
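
The rule enumeration and the min aggregation of Eq. 1 can be sketched in a few lines of Python. The factor order follows the column order of Table S4 (soil, AI, STR, slope); the membership values for the example cell are hypothetical, and the names `ALL_RULES` and `aggregate` are introduced here only for illustration.

```python
from itertools import product

FACTORS = ("soil", "AI", "STR", "slope")  # factor order as in Table S4
LEVELS = ("R", "M", "L")

# Each rule assigns one level to each factor, e.g. ("R", "M", "R", "R") is RMRR.
ALL_RULES = list(product(LEVELS, repeat=len(FACTORS)))


def aggregate(rule, memberships):
    """Eq. 1: minimum over the factors of the membership in the rule's level."""
    return min(memberships[f][lvl] for f, lvl in zip(FACTORS, rule))


# Hypothetical membership values for a single grid cell.
cell = {
    "soil":  {"R": 0.8, "M": 0.2, "L": 0.0},
    "AI":    {"R": 0.6, "M": 0.4, "L": 0.0},
    "STR":   {"R": 1.0, "M": 0.0, "L": 0.0},
    "slope": {"R": 0.9, "M": 0.1, "L": 0.0},
}

activations = {"".join(rule): aggregate(rule, cell) for rule in ALL_RULES}
print(len(ALL_RULES))        # 81
print(activations["RRRR"])   # 0.6
print(activations["RMRR"])   # 0.4
```

Because the rule value is a minimum, a single weak factor caps it: here the aridity index limits RRRR to 0.6 even though the other three factors score higher.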

As shown in Figure S3, the initial production rules for the three categories are determined from statistics on each rule through the learning procedures described below. First, we calculate the total number of cells ($N_{hit}^i$) that are covered by cropland in the land cover map and have an aggregated membership value $\mu_k^i > 0.6$. The values of $\mu_k^i$ are calculated using Eq. 1 with the initial membership functions. Rules with a large number of hits are added to the production rule set for R ($\mu_k^{N_r}$), those with a moderate number of hits are added to the production rule set for M ($\mu_k^{N_m}$), and the rest belong to the production rule set for L ($\mu_k^{N_l}$). For example, the ratio $N_{hit}/N_c$ is 47% for the rule RRRR and 12% for the rule RMRR in the continental United States. As mentioned earlier, the partitioned rule subsets may also be adjusted when comparing the modeled regular productivity land with the cropland from the land use map. As an example, Table S4 shows the rules for regular and marginal productivity land for the continental United States. Note that the remaining rules (77 = 81 - 4) apply to low productivity land.

Once the partitioned subsets of rules are determined for each category of then-part outcomes (i.e., R, M, and L), the rule composition procedure combines all the rules in a particular rule subset. The fuzzy operator MAX is used for rule composition. For a particular spatial region, as shown in Figure S4, the membership value for marginal productivity land is:

$$\mu_M = \max\left(\mu_k^1, \ldots, \mu_k^{N_M}\right) \qquad (2)$$

where $\mu_M$ is the membership value for marginal productivity land and $N_M$ is the total number of rules that define marginal productivity land. Similarly, the membership values for regular and low productivity land are computed with the max operator. Thus three values, $\mu_R$, $\mu_M$, and $\mu_L$, are computed for each grid cell.

Defuzzification

Defuzzification converts the membership values obtained after rule composition into a single crisp value. Several methods have been presented in the literature, such as Center-of-Gravity (CoG) and Center-of-Maximum (CoM) (4, 2).
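
The rule composition of Eq. 2, whose three outputs feed the defuzzification step, can be sketched as follows. The rule partition is the one listed in Table S4 for the continental United States, with rule letters ordered soil, AI, STR, slope; the aggregated membership values are hypothetical, and `compose` is a name introduced for illustration.

```python
# Rule partition for the continental United States as listed in Table S4;
# every rule not listed there is treated as a low-productivity (L) rule.
REGULAR_RULES = {"RRRR"}
MARGINAL_RULES = {"MRRR", "RLRR", "RMRR"}


def compose(activations):
    """Eq. 2 applied to each category: maximum over the rules in its subset."""
    composed = {"R": 0.0, "M": 0.0, "L": 0.0}
    for rule, value in activations.items():
        if rule in REGULAR_RULES:
            category = "R"
        elif rule in MARGINAL_RULES:
            category = "M"
        else:
            category = "L"
        composed[category] = max(composed[category], value)
    return composed


# Hypothetical aggregated membership values (Eq. 1) for one grid cell;
# rules that are not listed are assumed to have a value of 0.
example = {"RRRR": 0.6, "RMRR": 0.4, "MRRR": 0.2, "MMRR": 0.2}
print(compose(example))
```

For this cell the sketch yields one value per category, which is the triple ($\mu_R$, $\mu_M$, $\mu_L$) that defuzzification then reduces to a single number.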
The CoM, one of the most frequently referenced methods in the literature, is used in this study to compute a weighted average of the land productivity over all land categories (i.e., R, M, and L), which is used as the overall land productivity indicator for each land cell. The land productivity is represented by the defuzzification membership functions of the land productivity categories, ranging from 0 (lowest productivity) to 1 (highest productivity); for example, Figure S9 shows the defuzzification membership function for the U.S. A sub-range is specified for each category (R, M, and L), i.e., 0-0.5 for L, 0.1-0.9 for M, and [range missing in source] for R. From these curves, we identify the land productivity associated with the maximum membership value (1.0), e.g., 0.5 for M and 0.95 for R (the membership value is 1.0 over 0.9-1.0, and the mean, 0.95, is used). Using the membership values from the composition procedure ($\mu_R$, $\mu_M$, and $\mu_L$) as weights, the final land productivity is calculated as

$$P = 0.05\,\mu_L + 0.5\,\mu_M + 0.95\,\mu_R$$

For example, for a cell with $\mu_L = 0$, $\mu_M = 0.3$, and $\mu_R = 0.7$, the overall land productivity is P = 0.05 x 0 + 0.5 x 0.3 + 0.95 x 0.7 = 0.815, or about 0.82.

Once the overall land productivity is obtained, thresholds are used to classify all cells into three categories: R, M, and L. For example, for the U.S., a cell whose overall land productivity is between 0.55 and 0.7 is classified as M, above 0.7 as R, and below 0.55 as L. The thresholds are also adjusted through the learning procedures. The final outputs are the maps for R, M, and L, showing the area percentage of R, M, and L for each grid cell (Figure S1).

It should be noted that the output maps follow the data format of the slope dataset, which gives, for each grid cell, the area percentage in each of the eight slope classes (Table S1). The fuzzy logic modeling procedures described above are run for each slope class, and the area percentages of R, M, and L are obtained for each slope class. After looping through the eight slope classes, the accumulated area percentages of R, M, and L are computed for each cell.

Learning procedures of the modeling parameters

As mentioned earlier, the premise of the learning process is that cropland is usually highly productive compared with other land uses, even though some cropland belongs to marginal productivity land.
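
Looking back at the defuzzification step, its arithmetic can be checked with a short Python sketch. It reads the U.S. coefficients in the text as 0.05, 0.5, and 0.95 and the thresholds as 0.55 and 0.7, which is the reading under which the worked example gives about 0.82; how a value falling exactly on a threshold is classified is an assumption, as are the names `defuzzify` and `classify`.

```python
# Productivity at the peak of each defuzzification membership function,
# read from the text for the U.S. (L: 0.05, M: 0.5, R: 0.95).
PEAKS = {"L": 0.05, "M": 0.5, "R": 0.95}

# U.S. thresholds read from the text; they are adjusted during learning.
M_LOWER, R_LOWER = 0.55, 0.7


def defuzzify(mu):
    """Weighted sum of peak productivities, weighted by composed memberships."""
    return sum(PEAKS[c] * mu[c] for c in PEAKS)


def classify(p):
    """Map overall productivity to R, M, or L (boundary handling is assumed)."""
    if p > R_LOWER:
        return "R"
    if p >= M_LOWER:
        return "M"
    return "L"


p = defuzzify({"L": 0.0, "M": 0.3, "R": 0.7})
print(f"{p:.3f}", classify(p))
```

The cell from the worked example evaluates to 0.815, which the text reports as 0.82, and it falls in the regular class under the U.S. thresholds.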
Based on this premise, the learning procedure compares the modeled land with regular productivity against the cropland from the land cover map. As shown in Table S5, there are four possible outcomes of the comparison:

(1) if a cell is cropland according to the land cover map and a high percentage of its area (e.g., larger than 80%) is modeled as R, the outcome is a "hit";
(2) if a cell is cropland but a low percentage of its area (e.g., less than 20%) is modeled as R, the outcome is a "miss";
(3) if a cell is non-cropland but a high percentage of its area is modeled as R, the outcome is a "false alarm";
(4) if a cell is non-cropland and a low percentage of its area is modeled as R, the outcome is a "correct rejection".

We compile statistics on the four types of outcomes over all cells in a country or region, and the results inform the learning procedures.

Learning procedures are conducted for the assignment of membership functions and the determination of rules, the two critical procedures in the fuzzy-logic modeling. The learning process includes two procedures: 1) the initialization of the production rule sets for regular (R): $\mu_k^{N_r}$ ($N_r = 1, 2, \ldots, N_R$), marginal (M): $\mu_k^{N_m}$ ($N_m = 1, 2, \ldots, N_M$), and low (L): $\mu_k^{N_l}$ ($N_l = 1, 2, \ldots, N_L$); 2) the adjustment of the membership functions, the partitioning of rule subsets, and the setting of the thresholds for the overall productivity. As discussed earlier, the membership functions are initialized as simple symmetric triangular functions, and the partitioning of rule subsets is initialized from the statistics ($N_{hit}^i$) on each rule i. In procedure 1), for rule i, one is added to $N_{hit}^i$ if the cell is cropland and the aggregated membership value $\mu_k^i$ is larger than a threshold. The value of $N_{hit}^i$ is also used as guidance for adjusting the partitioning of rules into subsets (R, M, and L) in procedure 2).
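
The four comparison outcomes of Table S5 can be tallied with a minimal Python sketch. The 80% and 20% cut-offs are the examples quoted in the text; the text does not say how cells between the two cut-offs are treated, so leaving them uncounted is an assumption, and the cell data and the name `outcome` are hypothetical.

```python
from collections import Counter

HIGH, LOW = 80.0, 20.0  # example cut-offs from the text (percent of cell area)


def outcome(is_cropland, pct_regular):
    """Return the Table S5 outcome, or None when the share lies between cut-offs."""
    if pct_regular > HIGH:
        return "hit" if is_cropland else "false alarm"
    if pct_regular < LOW:
        return "miss" if is_cropland else "correct rejection"
    return None  # assumption: such cells are left out of the statistics


# Hypothetical cells: (cropland in the land cover map?, % of area modeled as R).
cells = [(True, 92.0), (True, 10.0), (False, 85.0), (False, 5.0), (True, 50.0)]

results = (outcome(is_crop, pct) for is_crop, pct in cells)
counts = Counter(r for r in results if r is not None)
print(counts)
```

Totals of this kind, computed over all cells of a region, are the statistics the learning procedure tries to improve when it adjusts membership functions, rule subsets, and thresholds.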

We follow two objectives to adjust the membership functions and rule subsets: (1) for the modeled regular productivity land, maximizing the total number of hits ($N_{hit}$) while minimizing the total number of misses ($N_{miss}$) and the total number of false alarms ($N_{false\_alarm}$); (2) for the modeled marginal productivity land, adjusting and verifying the modeled land against the land cover map using empirical rules, for example, that marginal land cannot be barren land or water surface (e.g., lakes). To achieve these objectives, the value of $N_{hit}^i$ from the rule initialization procedure (Figure S3) is taken as a guide, and a trial-and-error procedure is conducted to switch rules between R and M, or between M and L. Rules with a high value of $N_{hit}^i$ are candidates for subset R. Rules with a high value of $N_{hit}^i$ that are not assigned to subset R are placed in subset M; the remaining rules are placed in subset L. If switching rules does not produce the expected results, the fuzzification or defuzzification membership functions are modified to achieve the objectives. As shown in Figure S4, at the end of each iteration (fuzzification, fuzzy rule inference, and defuzzification), if the comparison of the modeled regular productivity land with the cropland identified from the land use map is acceptable, we stop the adjustment and the land productivity is identified; otherwise, the membership functions, the partitioning of rule subsets, and the thresholds for the overall productivity are adjusted and the procedures are re-run.

References

1. Olson GL, McQuaid BF, Easterling KN, Scheyer JM (1996) Methods for Assessing Soil Quality 49:
2. Joss BN, Hall RJ, Sidders DM, Keddy TJ (2008) Environ. Monit. Assess. 141:
3. Pedrycz W (1994) Fuzzy Sets Syst. 64:
4. Driankov D, Hellendoorn H, Reinfrank M (1996) An Introduction to Fuzzy Control, Springer-Verlag New York, Inc., New York, NY.
5. FAO/IIASA/ISRIC/ISSCAS/JRC (2009) Harmonized World Soil Database (version 1.1). FAO, Rome, Italy and IIASA, Laxenburg, Austria.
6. Fischer G, Nachtergaele F, Prieler S, Velthuizen HTv, Verelst L, Wiberg D (2008) Global Agro-ecological Zones Assessment for Agriculture (GAEZ 2008), IIASA, Laxenburg, Austria and FAO, Rome, Italy.
7. Natural Resources Conservation Service (NRCS) (2001) Soil Climate Map, USDA-NRCS, Soil Survey Division, World Soil Resources, Washington, D.C.
8. New M, Hulme M, Jones PD (2000) Journal of Climate 13:
9. Biradar CM, Thenkabail PS, Turral H, Noojipady P, Li YJ, Velpuri M, Dheeravath V, Vithanage J, Schull M, Cai XL, Murali KG, Rishiraj D (2009) Int. J. Appl. Earth Obs. Geoinf. 11:
10. Thornthwaite CW (1948) An approach toward a rational classification of climate. Geographical Review 38:

Figure S3: Initialization of rule sets for H, M, and L

Figure S4: Fuzzy logic modeling framework for land productivity assessment

Figure S5: Membership function of soil for the continental United States (x-axis: soil ratings; curves: low, marginal, regular)

Figure S6: Membership function of slope for the continental United States (x-axis: slope classes)

Figure S7: Membership function of soil temperature regimes for the continental United States

Figure S8: Membership function of aridity index for the continental United States

Figure S9: Defuzzification