* Author to whom correspondence should be addressed; Tel.: ; Fax:

Size: px
Start display at page:

Download "* Author to whom correspondence should be addressed; Tel.: ; Fax:"

Transcription

1 Remote Sens. 2012, 4, ; doi: /rs Article OPEN ACCESS Remote Sensing ISSN Utilizing a Mlti-Sorce Forest Inventory Techniqe, MODIS Data and Landsat TM Images in the Prodction of Forest Cover and Volme Maps for the Terai Physiographic Zone in Nepal Eero Minonen 1, *, Heikki Parikka 1, Yam P. Pokharel 2, Sahas M. Shrestha 2 and Kalle Eerikäinen Finnish Forest Research Institte, Joens Unit, P.O. Box 68, FI Joens, Finland; s: heikki.parikka@metla.fi (H.P.); kalle.eerikainen@metla.fi (K.E.) Department of Forest Research and Srvey, G.P.O. 3339, Babarmahal, Kathmand, Nepal; s: yampokharel@yahoo.com (Y.P.), sahasman2011@hotmail.com (S.S.) * Athor to whom correspondence shold be addressed; eero.minonen@metla.fi; Tel.: ; Fax: Received: 15 October 2012; in revised form: 4 December 2012 / Accepted: 5 December 2012 / Pblished: 10 December 2012 Abstract: An approach based on the nearest neighbors techniqes is presented for prodcing thematic maps of forest cover (forest/non-forest) and total stand volme for the Terai region in sothern Nepal. To create the forest cover map, we sed a combination of Landsat TM satellite data and visal interpretation data, i.e., a sample grid of visal interpretation plots for which we obtained the land se classification according to the FAO standard. These visal interpretation plots together with the field plots for volme mapping originate from an operative forest inventory project, i.e., the Forest Resorce Assessment of Nepal (FRA Nepal) project. The field plots were also sed in checking the classification accracy. MODIS satellite data were sed as a reference in a local correction approach condcted for the relative calibration of Landsat TM images. This stdy applied a non-parametric k-nearest neighbor techniqe (k-nn) to the forest cover and volme mapping. A tree height prediction approach based on a nonlinear, mixed-effects (NLME) modeling procedre is presented in the Appendix. The MODIS image data performed well as reference data for the calibration approach applied to make the Landsat image mosaic. The agreement between the forest cover map and the field observed vales of forest cover was sbstantial in Western Terai (KHAT 0.745) and strong in Eastern Terai (KHAT 0.825). The forest cover and volme maps that were estimated sing the k-nn method and the inventory data from the FRA Nepal project are already appropriate and valable data

2 Remote Sens. 2012, for research prposes and for the planning of forthcoming forest inventories. Adaptation of the methods and techniqes was carried ot sing Open Sorce software tools. Keywords: data imptation; GIS; k-nearest neighbors; mapping forest variables; nonlinear mixed-effects model; Open Sorce software; Landsat; MODIS 1. Introdction A general aim of forest inventory projects is to form a basis for decision-making and sstainable se of forest resorces in the form of objective statistics at the national and regional levels and for administrative nits. Conventionally, the statistics are obtained for pre-defined areas for the species-specific measres of the total volme of living and dead trees and merchantable volme characteristics. Qestions and isses raised by the United Nations Framework Convention on Climate Change (UNFCCC) throgh its Redcing Emissions from Deforestation and Forest Degradation (REDD) program have increased the need for forest resorce inventories and widened the application of their reslts. Contries already having sitable national forest inventory (NFI) frameworks can harness their NFIs to spport also the assessment of green hose gas (GHG) emissions. When developing contries, sch as Nepal, for instance, are addressing this kind of task, technical and methodological spport provided for assessing the forest resorces may become even more important than before [1]. De to the repetition reqirements and temporal and geographical consistency conditions, the possibilities for tilizing remote sensing technology together with field inventory data are increasingly gaining the interest of the organizations responsible for the forest inventory activities related to the REDD mechanism (e.g., [2,3]). The operative inventory project Forest Resorce Assessment of Nepal (FRA Nepal) belongs to the Finnish Government s development aid toolkit. The FRA Nepal project is an example of an action that aims to develop cost-efficient inventory techniqes based on information obtained from both conventional field inventories and remote sensing materials. Biometrical modeling is also needed for inventory calclations that make se of remote sensing: for example, the calclation system for the FRA Nepal project consists of models for height or volme by tree species, which are necessary for impting the vales missing from the field data. Obtaining data from the tree level to the stand or plot level, or to the regional, level is a process by which inventory calclations and reporting modles are developed. To pt all of this together, a forest inventory database is needed. To combine field sample plot observations, digital map data and satellite image data, for classification or prediction prposes, mlti-sorce forest inventory (MSFI) techniqes have been developed [4,5]. Even if the pre inventory reslts are directly derived from the characteristics measred by the field sample plots, a mlti-sorce approach aiming at the wall-to-wall mapping of forest characteristics reqires remote sensing data. The necessary inpt data, i.e., the grond-trth data, comes from the inventory calclation system and from the Forest Inventory Database. Tools for processing GIS and image data are frther tilized, for example, when classifying satellite image pixels and performing overlaying analysis.

3 Remote Sens. 2012, A representative and large enogh field sample for MSFI is a prereqisite, so that the estimation reslts are of a satisfactory qality and accrate enogh. However, if only a small or very sparse sample of field plots is available, it mst be possible to not only tilize plots on the same image, bt also on neighboring images. For the FRA Nepal project, at minimm, it wold be necessary to have a relative calibration procedre that wold make it possible to combine several satellite images with respect to the target inventory areas. In or stdy case in Terai region, the main factors affecting the availability and the sability of Landsat TM images are clodiness and seasonal variations. If clod-free images are available, one alternative for relative calibration is then to adjst the radiometric properties of a sbject image to those of a reference image [6 11]. Seasonal variation effects between images can partly be decreased by aiming to se images from the same time of year [6]. For tackling the problems cased by clodiness, gap-filling approaches have been presented, where several images are sed to cover the gaps in a selected image [12]. The aim of this stdy was to prodce thematic maps of forest cover (forest/non-forest) and total stand volme for the Terai region in sothern Nepal sing a non-parametric, k-nearest neighbor techniqe. The pre-processing of Landsat TM images was condcted by sing MODIS satellite data as a reference in a local correction approach. Mixed modeling was tilized for the generalization of sample tree characteristics that was condcted as a part of calclating the stand volmes by sample plots (see Appendix). The approach sed to create the forest cover map was based on a combination of Landsat TM satellite data and visal interpretation data, i.e., a sample grid of visal interpretation plots by which we obtained the land se classification according to the FAO standard. For the volme mapping procedre, we tilized Landsat TM data, the forest cover map and the total stand volmes obtained by the field measred sample plots. 2. Materials 2.1. Stdy Area The target area of the stdy, i.e., the Terai region, is located approximately between N and E. The Terai region is the sothernmost of the five physiographic zones of Nepal and comprises 14% of the contry s land (Figre 1). In contrast to the other for physiographic zones, the terrain in this sbtropical lowland platea is topographically less complex, and its elevation varies from 60 to 330 m above sea level [13]. In this stdy, the western Terai comprises the two leftmost sb-regions of Terai, whereas the eastern Terai consists only of the rightmost sb-region of Terai (Figre 1) Image Materials Three Landsat TM satellite images from the western part of Terai and for images from the eastern part of Terai were inclded in two image mosaics, one covering western Terai in the UTM zone of 44 and another covering eastern Terai in the UTM zone of 45 (Table 1). MODIS srface reflectance prodcts provide an estimate of the srface spectral reflectance as it wold be measred at grond level in the absence of atmospheric scattering or absorption. The

4 Remote Sens. 2012, MODIS Srface Reflectance prodct MYD09A1 is a combination of the best possible observations in an eight-day time period at a 500-m resoltion at the sinsoidal projection [14]. MYD09A1 material from the h25v06 and h24v06 sinsoidal tiles were tilized as image mosaic reference materials. The eight-day time interval of both MODIS prodcts was from 10 Febrary 2010 to 17 Febrary The MODIS s data coverage cold be fond from a time interval close to the dates for the Landsat images, which provided a qite reasonable starting point for image mosaic bilding and image interpretation prposes. One Landsat TM image from the north-western side of Terai was taken in 2011, ths representing a difference of jst one year from the time-point of the MODIS data. Figre 1. Map of Nepal with borders of the physiographic zones. The stdy area, i.e., Terai, is denoted by the grey-fill color. Table 1. Information abot the Landsat TM satellite imagery sed in this stdy. Path and row information is given in the WRS-2 system. Area Date Path Row Western Terai Eastern Terai

5 Remote Sens. 2012, The MODIS Reprojection Tool [15] was sed to transform the MODIS data into a UTM zone44 and zone45 coordinate system with wgs84 datm and 500-m resoltion. All other image processing operations were carried ot sing Open Sorce tools and the tilities of GDAL, Qantm GIS and GRASS GIS [16 18] Two-Phase Sampling and Visal Interpretation Data The setp of the ongoing national forest inventory was designed for the FRA Nepal project and formed the basis for this stdy. In the FRA Nepal project, a stratified systematic clster sampling design with a two-phase clster sampling method was applied [19,20]. In order to locate the sampling nits, i.e., the sample plots within sqare clsters, throghot the whole contry, a 4 km by 4 km grid was created. Each clster consisted of six sample plots in two parallel lines in a North-Soth direction; the sample plots were 150 m apart and the distance between the North-Soth lines was 300 m (see Figre 2(a)). The sampling of clsters with the sample plots for the field measrements was condcted in two phases [20]. Dring the first phase, each clster of six plots along the 4 km by 4 km grid covering the entire contry was visally classified according to land se class (1 = Forest area (n = 1,603), 2 = Other wooded area (n = 131), 3 = Agricltral area with tree cover (n = 796), 4 = Agricltral area withot tree cover (n = 4,168), 5 = Bilt-p area with tree cover (n = 202), 6 = Bilt-p area withot tree cover (n = 119), 7 = Roads (n = 3), 8 = Other area (n = 229), 9 = Water area (n = 282), where n is the nmber of observations in Terai) and accessibility. Technically, the interpretation was implemented sing Google Earth, a virtal globe and satellite imagery viewer [21] consisting of freely available satellite images and additional data layers, inclding RapidEye imagery, topographical maps and the sampling grid for Nepal. In this stdy, the visal interpretation data for the Terai physiographic zone comprised altogether 7,533 plotwise observations within 1,281 clsters. Dring the second phase, the selection of clsters for field measrements was done by strata. The first stratm contained the clsters having at least one plot in the forest area and the second stratm contained the clsters with no forest area plots in the 1 st phase sample. Dring this sampling process, more weight was given to the clsters comprising plots on forest area. Only the for-corner plots of the clster in a sqare with a side length of 300 m were sed in these analyses (see Figre 2(a)) Field Data The field data for this stdy were collected from the plots sampled for field measrements between 2010 and 2011 in the FRA Nepal project. The field-measred inventory data were finally obtained from 217 sample plots within 56 clsters on forest land (171 plots) and non-forest land (46 plots). The type of sample plot employed in the FRA Nepal project is a Concentric Circlar Sample Plot (CCSP), which is recommended for inventories of natral forests that are large in size and characterized by many old trees and a wide variety of tree species (see [20,22]). In the FRA Nepal project, the CCSPs consist of for circlar plots having radii of 20, 15, 8 and 4 m and diameter-atbreast-height threshold ranges of 30.0, , and cm, respectively. Since all

6 Remote Sens. 2012, the FRA Nepal sample plots have been established on a permanent basis, the positions of the trees were obtained via bearing and distance measrements (Figre 2(b)). Figre 2. Layot of the inventory clster (a) and Concentric Circlar Sample Plot (CCSP) (b) sed in the FRA Nepal project. In the case of Terai, one clster comprises for CCSPs (1, 3, 4 and 6) in a sqare with 300 m sides, i.e., plots nmber 2 and 5 of the basic clster, which are sed in the visal interpretation (first stage sample) and are exclded from the field inventory (second stage sample). In (b), the symbols r 1,..., r 4 are for the radii of the for circlar plots (4, 8, 15 and 20 m, respectively) within the CCSP. (a) (b)

7 Remote Sens. 2012, Both the diameter at breast height and the tree species were measred for every tallied tree, whereas every fifth tree was treated as a sample tree, for which the total tree height was also measred. General stand characteristics and plot-specific variables, sch as the land se class, forest type, aspect, x and y coordinates obtained for the center point sing a GPS device, soil depth, main site type, forest type, development stats and origin of the forest were also collected. Descriptive statistics for the sample plot data collected from Terai are given in Table 2. Table 2. Descriptive statistics for the CCSP data, (n = 217): D g is the basal area weighted mean diameter at breast height (cm), H g is the basal area weighted mean height (m), N is the nmber of living trees (ha 1 ), G is the stand basal area of living trees (m 2 ha 1 ) and V is the stand volme of living trees (m 3 ha 1 ). Variable Minimm Median Mean Standard Deviation Maximm D g H g N G V The existing tree-level volme models developed by Sharma and Pkkala [23] for Nepalese tree species are applicable when the species, diameter at breast height and total height, measred or predicted, are known. The logarithmically linearized allometric models developed by Sharma and Pkkala [23] were available for a total of 21 individal tree species and two additional tree species grops. Heights were needed for all tally trees in order to apply these existing stem volme models based on the height of the tree and its diameter at breast height (see Sharma and Pkkala [23]). Therefore, it was necessary to develop a height generalization model with species-specific parameters designed for the inventory data from Terai. The height prediction approach based on a nonlinear, mixed-effects (NLME) modeling procedre (see, e.g., Pinheiro and Bates [24]) is presented in the Appendix. The heights of the tally trees (see Appendix) were predicted first. Stem volmes were thereafter predicted for all the tallied trees sing species-specific volme fnctions of Sharma and Pkkala [23]. Finally, total stand volmes by sample plots were calclated (Table 2). 3. Methods 3.1. Bilding a Satellite Image Mosaic The FAO NAFORMA project has prodced a techniqe for creating image mosaics for the prposes of forest inventory planning [8]. A MODIS (MYD09A1) image, i.e., an atmospherically corrected image with a 500-m pixel size, was sed as a reference for atmospherically correcting the Landsat images. The basic principle was to match the mean and the variance of the data in both images by taking into accont the difference in the pixel sizes [8]. In the techniqe developed by Tomppo et al. [8], simple linear mapping was calclated for each band and the correction was calclated separately for each Landsat

8 Remote Sens. 2012, image when making the mosaics. The averaging was sed for the Landsat data to accont for the larger resoltion in the MODIS data. The approach developed by Tomppo et al. [8], which they initially sed in Tanzania, is close to that of the one developed by Tominen and Pekkarinen [7] for local radiometric correction to assist with the mlti-sorce forest inventory. In their stdy, the idea was to adjst the BRDF-affected intensities in digitized aerial photography to match the local intensities of the reference imagery, i.e., the Landsat 5 TM satellite data. Ths, the reference imagery needs to be independent of the BRDF. The local calibration was based on adjsting the mean vales in a correction nit, for example, a moving circle with a radis of 40 m. Using a moving window, an adaptive correction based on the reference image digital nmbers became possible; likewise, the parameters for the method were defined empirically. In the case of Terai, approaches developed by both Tomppo et al. [8] and Tominen and Pekkarinen [7] were combined to create a local correction model. The separate Landsat TM images were corrected by applying the correction fnction below (see Eqation ) and by sing MODIS image data as the nderlying reference image. The objectives of the procedre presented here are as follows: to match the mean and the variance of the data in both images by taking into accont the differences in pixel sizes [8] and to apply the correction locally [7] by sing raster map algebra and a moving window approach to calclate the parameter vales for the model. The correction fnction (see Tomppo et al. [8]) applied for a pixel (x, y) was as follows: fˆ ( x, y) a ( x, y) f ( x, y) b ( x, y) i i where fˆ i ( x, y) is the corrected data for the pixel (x, y) in band i, f i (x, y) is the ncorrected data in band i, and a i (x, y) and b i (x, y) are the parameters compted for the given pixel (x, y) in band i. To calclate the parameters for the correction fnction (Eqation ), averaging was first sed to transform the Landsat image data to match the resoltion of the MODIS reference image. The common pixel size of the images after averaging is later denoted by c. This stdy sed the c-vale of 500 m. The raster maps avg, sd, avg Mj and sd Mj which represent the mean and standard Li Li deviation vales of the Landsat TM data and the MODIS data in the compatible spectral wavelength bands i and j, respectively were thereafter calclated sing a moving window approach with a window size of w w pixels. Here, a window neighborhood was tilized (w = 21), representing approximately a 10 km neighborhood for each pixel with the size of c. The raster maps with a pixel size of c were compted for both image datasets: target images (Landsat TM) and reference images (MODIS). Parameters a i and b i in the correction fnction (Eqation ) were calclated for each pixel (x, y) as follows: a ( x, y) sd ( x, y) /sd ( x, y) i Mj and b ( x, y) avg ( x, y) avg ( x, y) a ( x, y) i Mj where i denotes a Landsat TM band and j denotes a MODIS band that is compatible with i. This stdy sed the MODIS bands 3, 4, 1, 2, 6 and 7 as compatible bands for the Landsat TM bands 1, 2, 3, 4, 5 and 7. i Li Li i i

9 Remote Sens. 2012, Finally, the correction model (Eqation ) was applied in the original Landsat TM image data to prodce locally corrected Landsat data in a MODIS reflectance vale scale and at the original 30-m resoltion. The coefficients a i and b i (see Eqation ) of the correction fnction (Eqation ) were local, being based on statistics from the neighborhood of each pixel (x, y). As a reslt, the Landsat TM pixel vales were rescaled according to the local distribtion parameters (mean and standard deviation) of the pixel vales in the reference image bands. In the correction approach of this stdy, each Landsat TM image had to first be processed separately, after which a mosaic of the corrected Landsat TM images was created. This mosaic was sed as inpt for the forest/non-forest classification and the wall-to-wall map generation of a forest variable (stand volme, m 3 ha 1 ). The resoltion (c) of the raster maps representing the mean and standard deviation and the size of the moving window (w) are ser-defined parameters. Here, we sed a resoltion corresponding to that of the reference MODIS image, i.e., averaging was sed only for Landsat images, as mentioned above. The size of the moving window (w) determines the flexibility of local correction. However, the window size (w) shold be large enogh to enable valid comptations of the mean and standard deviation. In this case, it was not possible to condct an exhastive search for the resoltion and neighborhood size (w), and therefore, we evalated the reslt by checking the visal appearance of the reslting image mosaic. A good-qality reference image withot any extreme pixel vales is needed for a sccessfl reslt Nearest Neighbors Techniqes A non-parametric, mlti-sorce forest inventory method based on the k-nearest neighbor (k-nn) estimation (see Tomppo [4], Tomppo et al. [5]), an approach recently reviewed by McRoberts [25], was applied in the prodction of forest cover and volme maps in the Terai region. In this stdy, the poplation nits (the target set of pixels) were the pixels in the satellite image. The satellite image bands were sed as the ancillary variables (featre variables) with observations for all nits of the poplation. The forest variables with observations only available for a sample (the reference set of pixels) were denoted as response variables. In practice, the reference set is bilt by qerying the satellite image in the locations of sample plots, whereas the aim of the nearest neighbors approach is to impte the response variable vales to the target set elements. The classification was based on a pixel-by-pixel analysis, where the nearest neighbors for a target pixel among all the reference pixels (the pixels covering the center point of a sample plot) were determined sing weighted Eclidean distance in spectral featre space (see Tokola [26]; for this in matrix form, see, e.g., McRoberts [25]): n b 2 d ij [ ph ( bih b jh )], (3) h1 where n b is the nmber of bands, i is the target set element for which a prediction is soght and j is a reference set element, b ih and b jh are spectral band vales for the pixels i and j on band h, respectively, and p h is the empirical parameter for band h.

10 Remote Sens. 2012, For the target pixel i, the k-nearest neighbors (i.e., a set of k reference pixels to which the Eclidean distance in spectral featre space from the target pixel i is smallest, K(i)={ j 1 (i),, j k (i)}) were soght. In the case of categorical variables, the mode vale of a response variable from among the k-nearest neighbors was the predicted class for a target pixel. In the case of continos response variables, on the contrary, the weight w ij of each neighbor j K(i) was determined to be inversely proportional to its distance to target pixel i: t ij w d / d, (4) ij t ij jk ( i) where t is a ser-defined parameter (t 0). In this stdy, a small positive nmber was given for zero distances and a vale t = 2 was sed in Eqation (4). It follows that for target pixel i, w 1. A k-nearest neighbor prediction ( ŷ i ) for target pixel i was calclated as follows: ij jk ( i) y ˆ w y, i ij jk ( i) where y j is the observed vale of the response variable in reference pixel j K(i). In forest cover mapping, the set of spectral featres contained vales of the image mosaic bands from 1 to 6, corresponding to the Landsat TM image bands from 1 to 5 and 7, and we sed the Eclidean distance as the measre of distance (the band weight (p h ) was set to 1 for all bands). The mode vale of a response variable, i.e., the FAO land se class, among the k-nearest neighbors was the predicted class for the given target pixel. We selected pixels from the Landsat TM mosaic from Terai as the set of target pixels sing the physiographic zone raster map as a mask in the GRASS GIS package. The reference set consisted of the pixels covered by the center points of the 1st phase plots, where the observed vale (FAO land se class) was known based on the visal interpretation. In order to clarify forest classification, the plots belonging to the land se categories Agricltral area with tree cover or Bilt-p area with tree cover were dropped ot from the reference set, whereas the category Roads, having a very small nmber of observations, was combined with the category Bilt-p land withot tree cover. The post-processing phase of the classification inclded three steps. First, we applied a spatial 3 3 mode filter to redce the salt-and-pepper effect in the pixel-by-pixel classification. Second, we constrcted a forest/non-forest raster map (denoted as a forest cover raster map later) by reclassifying non-forest categories into a single class. Third, we converted the reslting forest cover raster map into a vector format and exclded forest segments with an area of less than 0.5 ha Validation of Reslts For the category variable, i.e., forest cover, the vale for k was selected by examining the classification accracy of the forest cover predictions obtained sing different vales of k. For classification accracy, indicators based on the confsion matrices were sed [27] and inclded: prodcer s accracy (FPA, forest), ser s accracy (FUA, forest), (3) overall accracy (OA) and (4) the Kappa statistic (KHAT); the KHAT was compted becase the OA reslts are too optimistic if the proportion of a single class is high (see Hyvönen and Anttila [28]). j (5)

11 Remote Sens. 2012, The Monte Carlo techniqe, inclding the random sampling of test samples, was applied when selecting k, i.e., the nmber of neighbors. For each k {1,3,5,...,15}, 5,000 test samples were randomly selected from the entire reference element set. Each sample contained 1,000 elements (1st phase sample plots of visal inspection), which were classified as test material, whereas all the remaining elements not inclded in the sample comprised the reference. The aforementioned classification accracy indicators were calclated for each test sample. As a reslt, an empirical distribtion of each test indicator with a varying vale of k was prodced. This approach was applied in the two parts of Terai separately. At the end, the forest cover delineations derived from the visal interpretation and the forest cover map were also checked by their classification accracy sing the field observed vales for the field sample plots. A leave-one-ot analysis was condcted, and common cross-validation criteria, i.e., RMSE and bias, were calclated for the continos variable, i.e., the total stand volme, in the reference set (e.g., Tokola [26]; Katila and Tomppo [29]). In this stdy, the RMSEs and biases were determined as follows: RMSE n i1 ( y yˆ ) i i 2 n (6) n bias ( y yˆ ) n, (7) i1 where y i is the observed vale, ŷ i the predicted vale of the given characteristic, and n is the nmber of observations. The relative, i.e., percent, RMSEs (RMSE % ) and biases (bias % ) were calclated by dividing the absolte RMSEs and biases by the means of the respective vales from the observations (y) and mltiplying the reslting qotients by 100. Cross-validation and featre selection were carried ot sing a genetic algorithm-based approach (e.g., Haapanen and Tominen [30]) that has been compiled in with the genalg package of R software [31]. The vale for k was simltaneosly determined in the featre selection and weight search. Several rns were made in order to find the best combination of featres and parameter vale for k for the volme estimation. The parameter vales of poplation size, nmber of iterations, elitism and mtation chance that we sed for the algorithm were 500, 1,000, 40% and 0.05, respectively, and the goal was to minimize the sm of RMSE and bias obtained for the stand volme. The featres that we tested for the model inclded band vales (b 1,,b 6 ) in the Landsat TM mosaic (referring to TM wavelength bands 1,..,5 and 7) together with the ratios of the band vales and the vale of the Normalized Difference Vegetation Index (NDVI). When creating thematic wall-to-wall maps for growing stock volme, we selected a set of target pixels sing the forest cover map as a mask in the GRASS GIS package. The reference set consisted of the pixels covered by the center points of the field plots (Table 2). For composing the thematic map of volme, we classified the volme predictions for the pixels into classes of 50 m 3 ha 1. i i

12 Remote Sens. 2012, Reslts 4.1. Forest Cover Mapping An example of an original Landsat TM image sbset and the reslting image mosaic is shown in Figre 3(a c). After inspecting the overall appearance of the image mosaic, it proved sable as inpt for the forest cover and volme mapping based on k-nearest neighbor techniqes. Figre 3. (a) Sbsets of two original Landsat TM images from a region in Eastern Terai (Feb for the image on the left and Jan for the image on the right), a composite of TM bands 3, 2 and 1 (rgb). (b) A sbset of the corrected Landsat TM image mosaic, a composite of TM bands 3, 2 and 1 (rgb). (c) A sbset of the corrected Landsat TM image mosaic, a composite of TM bands 5, 3 and 2 (rgb). (Coordinate reference system: WGS 84/ UTM zone 45N). (a) (b)

13 Remote Sens. 2012, Figre 3. Cont. (c) Kappa statistics (Figre 4) show that in forest cover mapping, sing vales of k greater than 5 does not improve the qality of the estimation reslt. For this reason, we set the vale of k at 5 in forest cover mapping for both parts of Terai. Figre 4. Distribtions of KHAT statistic vales at different k vales in the western (pper figre) and eastern (lower figre) parts of Terai. In the boxplot, the grey region is the area between the 1st and 3rd qartiles (inclding 50% of the observations), the thick line is the median and the oter lines extend to the data point, which is no more than 1.5 times the area between the 1st and 3rd qartiles away from the box. The most extreme individal observations are also plotted [31].

14 Remote Sens. 2012, Other accracy measres (Table 3) also spport the decision made regarding the vale of k and indicate that the accracy is qite good in both parts of Terai; both the FUA and FPA are more than 85%, even in the case of k = 1. Table 3. Statistics (mean and standard deviation (sd)) by the for indicators of classification accracy (ICA): the ser s accracy of forest class (FUA), the prodcer s accracy of forest class (FPA), (3) overall accracy (OA) and (4) Kappa statistic (KHAT) based on the test samples for each vale of k. Area Western Terai Eastern Terai ICA Vale of k FUA mean sd FPA mean sd OA mean sd KHAT mean sd FUA mean sd FPA mean sd OA mean sd KHAT mean sd Table 4 shows the confsion matrices of the forest cover delineations based on visal interpretation and the forest cover map against the one based on the observed vales for the field sample plots (n = 217). The accracy of the delineation of forest cover by visal interpretation is very good in both parts of Terai. In eastern Terai, the forest cover mapping based on the k-nn techniqes appears to have the same accracy level as the visal interpretation, bt in western Terai, the ser s accracy in the class Non-forest is slightly lower compared to other categories. The confsion matrices between the forest cover map classification and the visal interpretation were also created (Table 5). In this examination, based on the set of field plot points only, the Kappa statistic shows a better agreement between the forest cover map and visal interpretation in eastern Terai compared to the western Terai, where the agreement can still be characterized as sbstantial [32]. A forest cover map for the entire Terai physiographic zone made by combining the reslts from the western and eastern parts of Terai is shown in Figre 5.

15 Remote Sens. 2012, Table 4. Confsion matrices between the forest cover delineations derived from the visal interpretation (Visal), the forest cover map and the field observations (Obs.). (UA = User s accracy, PA = prodcer s accracy, OA = overall accracy, KHAT = Kappa statistic). Visal Interpretation Forest Cover Map Area Obs. Forest Non-Forest Total PA, % Forest Non-Forest Total PA, % Western Terai Eastern Terai Forest Non-Forest Total UA, % OA, % KHAT Forest Non-Forest Total UA, % OA, % KHAT Table 5. Confsion matrices between the forest cover map and the forest cover delineation derived from the visal interpretation (Visal). (UA = User s accracy, PA = prodcer s accracy, OA = overall accracy, KHAT = Kappa statistic). Area Western Terai Eastern Terai Forest Cover Map Visal Forest Non-Forest Total PA, % Forest Non-Forest Total UA, % OA, % 87.1 KHAT Forest Non-Forest Total UA, % OA, % 94.8 KHAT Thematic Map of a Forest Variable: Volme The following featres (spectral band vales or ratios) were inclded in the model for the estimation of total stand volme: b 1, b 2, b 4 /b 3 and b 5 /b 3. The band weights (i.e., the parameters p h in Eqation (3)) for the for variables in the weighted spectral Eclidean distance were as follows: , , and , respectively. It was noted that several rns prodced qite similar reslts. We sed the vales of 4 and 2 for parameters k and t, respectively, of which the former was obtained from the genetic algorithm. The RMSE and bias vales obtained for the stand volme

16 Remote Sens. 2012, were 85.4 m 3 ha 1 (62.0%) and m 3 ha 1 (slight overestimation effect), respectively. The boxplot [31] of residals for the volme prediction categories of 50 m 3 ha 1 show the presence of some extreme observations, whereas the median of residals is close to zero in all categories (Figre 6). Figre 5. A forest cover map (forest: bright green color) for the Terai physiograhic zone prodced by the k-nn techniqe and overlaid on a composite of MODIS image bands 1, 4 and 3 (rgb). Figre 6. Volme residals in volme prediction categories of 50 m 3 ha 1.

17 Remote Sens. 2012, Plotting the means for the ordered grops of observations against the means for the predictions for the same ordered grops (see McRoberts [33]) reveals that volme predictions sffer from averaging: the large volmes were nderestimated and the small volmes were overestimated (Figre 7). Figre 7. Mean observation verss mean prediction for volme (m 3 ha 1 ). The observations have been sorted and groped into grops of 20 observations at the minimm based on the observed volme (m 3 ha 1 ). 5. Discssion In the case of the Terai region, it was not possible to apply the nearest neighbor s techniqes for each Landsat image separately, becase the nmber of plots was inadeqate. This meant that we needed to apply relative calibration for the Landsat images, despite the fact that there cold be qite a long amont of time between the dates for the available good-qality images. For this prpose, we applied a pre-processing approach based on sing a reference image covering the whole region. For aiming at a seamless reslt, the potential overlapping area of the neighboring target images (Landsat TM images in or case) needs to be tilized when the raster maps representing means and standard deviations are compted. By detecting pixels having extreme vales (otlier pixels) in the images and exclding them from this stage, one can improve the reslt of this local approach. Depending on image materials, this method also allows one to apply a pixel size c different than the original pixel size in the reference image. A robst regression approach wold offer another empirically oriented approach for a relative calibration of image materials. Regressing the images for the selected reference image wold make it possible to concrrently se neighboring images. In the case of the geographically wide Terai region,

18 Remote Sens. 2012, the robst regression approach was less applicable. However, in a mlti-temporal case, a regression approach for relative radiometric calibration cold possibly be applied when detecting changes in landscape, sch as cttings (e.g., [11,34,35]). The agreement between the k-nn forest cover classification and the visal interpretation in Terai was strong, since the reslting vale of KHAT was greater than 0.80 (see, e.g., Congalton [27] and Green [32]). The OA vales were also high, which can at least be partly explained by the high proportion of the non-forest category. It is worth noting that we did not examine the effect of the post-processing phase (steps 1 to 3) when calclating the accracy measres in the test samples for the selected vale of k. Also, the classification accracy of the forest cover map evalated against field observed vales showed that the agreement between these forest cover delineations is sbstantial (western Terai) or strong (eastern Terai). The visal land se class interpretation process was condcted sing Google Earth satellite imagery viewer [21], where the backgrond imagery originated from the period between 2003 and 2010, together with additional RapidEye imagery from the year De to shaded areas, haze and clodiness, the visal interpretation work proved difficlt in the case of some RapidEye images, and then, the interpretation was spported sing satellite imagery available in Google Earth. Therefore, the time difference between these image materials cold make a sorce of ncertainty to the interpretation process itself. In this stdy, an nwanted mixing of the forest class and the land se classes with some tree cover was tackled by dropping ot the plots in classes Agricltral area with tree cover and Bilt-p area with tree cover from the reference set. For classification accracy checking, a correct matching of the locations between field plots and the visal interpretation plots on the image is also crcial. Unfortnately, information to check this location accracy was not available. Besides RMSE and bias, robst measres for the qality of the k-nn-based estimation in terms of featre selection and cross-validation cold be sefl, especially in cases where the nmber of plots is small, which corresponds to this stdy (n = 217). For instance, in the k-nearest neighbor analysis for stand volme, the most extreme observations may also have affected the model parameter search, becase the nmber of field plots was qite small. Therefore, the diagnostic approach sggested by McRoberts [33] for detecting otliers and inflential observations cold be very important for or work in the ftre. The k-nearest neighbor techniqe, applied to the mapping of forest cover and stand volme in the Terai region of Nepal, is a straightforward procedre that has been efficiently tilized in Finland (see, e.g., Tomppo et al. [5] and Tomppo [4]). Moreover, this techniqe was also applied earlier by Tokola et al. [36] for classifying land se and for estimating timber volme and biomass in Nepal. This was, however, one of the first forest mapping stdies completely condcted sing Open Sorce software tools and free packages. In this respect, it was natral that sbstantial efforts were needed to incorporate appropriate data processing techniqes and sitable procedres and, especially, that the techniqes and procedres were compatible for processing the remote sensing and inventory data. The forest cover map now available for Terai (Figre 5) is one sorce of information that can be tilized when estimating above-grond forest volmes and biomasses based on the LiDAR data as part of the ongoing srvey condcted by the FRA Nepal project in the central parts of Terai. In the ftre, similar forest cover maps will also be needed for other physiographic zones beyond Terai, where the LiDAR-oriented estimations of forest attribtes will be condcted. Later, experience will show if the

19 Remote Sens. 2012, forest cover maps prove sefl in other fields of forestry. Potential applications in Nepalese conditions inclde inventory planning and monitoring practices. The forest cover and volme maps (Figre 8) estimated sing the k-nn method and inventory data from the FRA Nepal project are already appropriate and valable data for research prposes and for planning forthcoming forest inventories when testing optimal inventory designs (see [37]). Figre 8. At left: forest cover map (vector bondaries) and a field sample plot clster, (plotnr = sample plot nmber; zone = UTM zone nmber; ba_ha = basal area, m 2 ha 1 ; vol_ha = volme, m 3 ha 1 ; tmcl1 tmcl6 = pixel vales in the Landsat TM mosaic bands 1 6). At right: A thematic map of the volme. An example from eastern Terai (backgrond: a composite of Landsat TM mosaic bands 5, 3 and 2 (rgb)). (Coordinate reference system: WGS 84/UTM zone 45N). Spatially explicit biomass maps cold also form a basis for the forthcoming greenhose gas inventory, i.e., the REDD-related estimation of gross primary prodction and soil carbon change. The k-nn-based forest biomass mapping techniqe, which corresponds to that of stand volme mapping, has already been demonstrated by Tominen et al. [38] in Finnish conditions. One shared featre between the approach by Tominen et al. [38] and the one in this stdy has to do with the dataefficient, mixed-effects modeling-based generalization of sample tree characteristics. This stdy, however, sed an NLME model for generalizing the sample tree heights (see Appendix), whereas the model tilized by Tominen et al. [38] made se of an LME model provided by Eerikäinen [39]. Adapting the methods and techniqes and applying the Open Sorce software tools presented here reqires capacity bilding in Nepal: development work and edcation measres have already been lanched by the ICI project, which is an inter-instittional development cooperation project between Nepalese, Vietnamese and Finnish governmental instittes and that was initiated with the spport of the Ministry for Foreign Affairs (MFA) of Finland. The development activities of the project have

20 Remote Sens. 2012, been designed and implemented with a special focs on hman capacity development within the governmental forest research organizations participating in the project. Special emphasis has therefore been given to hands-on training periods and workshops and to disseminating information on forest inventory-related techniqes and procedres. Of these objectives, the latter covers the goal of the present stdy: to increase and share knowledge abot the existing techniqes and procedres and to make it easier to adapt them to the conditions in, for instance, Nepal. Local expertise is always reqired for the most pivotal part of the forest inventory, that is to say, the work condcted in the field and, at the moment, by the FRA Nepal project in Nepal. Throgh scientific work and collaboration, however, it is possible to achieve more diverse and detailed reslts in terms of conventional statistics and advanced electronic maps of the forest attribtes of interest. This stdy is also one example of how to improve the se of field inventory data and exploit them together with additional remote sensing data in order to satisfy the new reporting reqirements set for large-scale forest inventories. 6. Conclsions In this stdy, we introdced an approach for a MODIS-based relative calibration of Landsat TM images to enable the se of a mosaic of several Landsat TM images in a k-nearest neighbor (k-nn) estimation. The presented approach for relative calibration combined aspects presented earlier by Tominen and Pekkarinen [7] and Tomppo et al. [8]. The method relies only on image data (see [7]) and is easy to implement. Optionally, the reference image cold have been converted to a reflectance scale, bt for the k-nn estimation method that was not necessary. A prereqisite for the sccess of the local correction approach for the relative calibration of images is that the target images and the good-qality reference image material need to be temporally and seasonally close to each other. However, more stdies are needed on selecting the parameters for the approach when sing image resoltion scales other than the ones sed in the Terai case, which tilized Landsat TM and MODIS. The k-nearest neighbor techniqe was very applicable to the forest cover mapping in Terai sing visal interpretation plots as a reference material. There was a strong agreement (KHAT > 0.80) between the forest cover delineations based on visal interpretation and field observations. The agreement between the forest cover delineation in the forest cover map and the one derived from the field observed vales was sbstantial in Western Terai (KHAT 0.745) and strong in Eastern Terai (KHAT 0.825). One featre related to the development of the mlti-sorce techniqe and, especially, to its application to different geographical conditions in Nepal in the ftre has to do with sing digital elevation models (DEMs). In the case of Terai, i.e., within the lowlands of Nepal, no radiometric corrections were condcted sing DEMs (e.g., Tomppo et al. [5]; Tokola et al. [36]), neither was the DEM-based moving geographical vertical reference area sed in the k-nn method (see Katila and Tomppo [29]). It is therefore recommended that the significance of DEM be tested in other physiographic vegetation zones north of Terai, where the topography is very montainos compared to the sothernmost region of Nepal.

21 Remote Sens. 2012, Acknowledgments This stdy was condcted at the Finnish Forest Research Institte s (Metla) Regional Unit in Joens, Finland and the Department of Forest Research and Srvey (DFRS) in Babarmahal, Kathmand, Nepal. It was partly fnded by the Ministry for Foreign Affairs of Finland throgh the inter-instittional development cooperation project entitled Improving Research Capacity of Forest Resorce Information Technology in Vietnam and Nepal (contract nmber: HELM431-14, intervention code: ). We are gratefl to these instittions for their resorces and fnding. Spport given by the Forest Resorce Assessment of Nepal project is also grateflly acknowledged. Many thanks are also de to Jho Pitkänen for his help in developing the calclation procedres for this stdy. References 1. Acharya, K.P., Dangi, R.B., Tripathi, D.M., Bshley, B.R., Bhandary, R.R., Bhattarai, B., Eds.; Ready for REDD? Taking Stock of Experience, Opportnities and Challenges in Nepal; Nepal Foresters Association: Kathmand, Nepal, Pearson, T.; Walker, S.; Brown, S. Sorcebook for Land Use, Land-Use Change and Forestry Projects; Winrock International and the BioCarbon Fnd of the World Bank: Washington, DC, USA, Available online: winrock-biocarbon_fnd_sorcebook-compressed.pdf (accessed on 30 May 2012). 3. GOFC-GOLD. A Sorcebook of Methods and Procedres for Monitoring and Reporting Anthropogenic Greenhose Gas Emissions and Removals Cased by Deforestation, Gains and Losses of Carbon Stocks in Forests Remaining Forests, and Forestation; GOFC-GOLD Report version COP17-1; GOFC-GOLD Project Office, Natral Resorces Canada: Edmonton, AB, Canada, Available online: (accessed on 20 September 2012). 4. Tomppo, E. The Finnish Mlti-Sorce National Forest Inventory Small Area Estimation and Map Prodction. In Forest Inventory. Methodology and Applications; Managing Forest Ecosystems; Kangas, A., Maltamo, M., Eds.; Springer: Dordrecht, The Netherlands, 2006; Volme 10; pp Tomppo, E.; Haakana, M.; Katila, M.; Peräsaari, J. Mlti-Sorce National Forest Inventory Methods and Applications; Managing Forest Ecosystems, Springer Science+Bsiness Media: New York, NY, USA, 2008; Volme Coppin, P.; Jonckheere, I.; Nackaerts, K.; Mys, B.; Lambin, E. Digital change detection methods in ecosystem monitoring: A review. Int. J. Remote Sens. 2004, 25, Tominen, S.; Pekkarinen, A. Local radiometric correction of digital aerial photographs for mlti sorce forest inventory. Remote Sens. Environ. 2004, 89, Tomppo, E.; Katila, M.; Mäkisara, K.; Peräsaari J.; Malimbwi, R.; Chamya, N.; Otieno, J.; Dalsgaard, S.; Leppänen, M. A Report to the Food and Agricltre Organization of the United Nations (FAO) in Spport of Sampling Stdy for National Forestry Resorces Monitoring and Assessment (NAFORMA) in Tanzania; FAO: Rome, Italy, 10 March Available online: (accessed on 24 Agst 2012).